Some software will become more fragile - not mine


This blogpost is mostly a rant.

In a perfect opening note to this blog, as I was sitting down to write it, I was faced with this message in my terminal, right after typing the hugo new command:

❯ hugo new posts/software-will-get-more-fragile-over-time.md
WARN  deprecated: module.mounts.excludeFiles was deprecated in Hugo v0.153.0 and will be removed in a future release. Replaced by the simpler 'files' setting, see https://gohugo.io/configuration/module/#files

Another warning I’ll have to fix. Throw it onto the pile. OK, let’s run the Hugo server, then?

ERROR error building site: render: [en v1.0.0 guest] failed to render pages: render of "/" failed: "/Users/shay/Desktop/code/blog/themes/hermit-fork/layouts/index.html:41:73": execute of template failed: template: index.html:41:73: executing "main" at <.Site.Params.author.name>: can't evaluate field name in type string
render of "/Users/shay/Desktop/code/blog/content/levels/hooks-1.md" failed: "/Users/shay/Desktop/code/blog/themes/hermit-fork/layouts/_default/single.html:39:3": execute of template failed: template: single.html:39:3: executing "footer" at <partialCached "footer.html" .>: error calling partialCached: "/Users/shay/Desktop/code/blog/themes/hermit-fork/layouts/partials/footer.html:2:74": execute of template failed: template: _partials/footer.html:2:74: executing "_partials/footer.html" at <.Site.Params.author.name>: can't evaluate field name in type string

Ugh… OK, let’s quickly fix it. Moved the name for some reason. Now what?

ERROR The "twitter", "tweet", and "twitter_simple" shortcodes were deprecated in v0.142.0 and removed in v0.156.0. Please use the "x" shortcode instead.
ERROR The "gist" shortcode was deprecated in v0.143.0 and removed in v0.156.0.

Is software changing faster?

I don’t have any empirical stats to back this up. But yeah, it definitely feels like the software I’m using is rapidly changing. At my current role at Opsin Security and as a Bay Area techie, I sorta have to stay on the “cutting edge” of software. Keeping up with Claude releases. Doomscrolling tech Twitter. Reading new Substacks. And man… Things sure seem to be moving fast. What’s that saying?

Move fast and make sure to stay backwards compatible.

No, that’s not it.

Anyways. I feel like software is changing faster. And it feels like that to people around me as well.

What sort of change?

More features, maybe? AI ✨ stuff ✨ all over the place? Security updates because of supply chain vulns. Deprecated language versions, deprecated image layers, closed-down startups that were open core and now that core is hollow and sad.

Of course I want to be secure and update. But keeping up with the rate of change, and justifying it, is getting harder.

A web software project is usually made up of a TON of layers, a house of cards with some code on top. Why do so many of them break so much? Because of non-backward compatible changes. These changes are usually considered “for the better”!

ARE THEY? Most of them feel like either enshittification or updates that are just updates of updates. Or some pedantic change that I don’t care about but someone thought was extremely important; I can guarantee you it’s not.

There are great new features that I’m happy to use, performance improvements, interesting ideas! There are. But not all of it.

Why is my software getting so fragile?

So, this rant blogpost came because a project of mine with 6+ years of uptime stopped working recently. A user who wanted to play the git CTF approached me and informed me that the server was down. Huh? This was running perfectly for 6+ years, other than one unsuccessful DDoS attack 4 years ago! What happened?

Here’s the timeline:

  • I set up the server around 2020.
  • Around a year ago, I started getting deprecation notices for the EC2 server in my inbox. It’s Amazon Linux 1, which is “no longer supported”.
  • In January, the server died. AWS claimed it was a hardware failure.

So far? OK. I understand them wanting to sunset support for an operating system that was created in 2010. I understand how much work the security patching is. And the hardware failure? Big disappointment, but “the cloud” is just someone else’s computer. And computers sometimes turn off.

When I went to fix the server, though… I didn’t exactly remember how to deploy it. It was 6 years ago! Luckily, past Shay was smart and wrote future Shay some documentation.

The documentation is available on GitHub, but here’s the important snippet:

## Build

### Ansible

Using Ansible, you can build and deploy the game server from nothing.

```bash
cd build/ansible
sed -i 's/ctf.mrnice.dev/your.server.com/g' hosts
ansible-playbook -v -i hosts build.yaml
```

Make sure that you have Ansible configured correctly with your SSH keys.
[Here's the docs](https://docs.ansible.com/ansible/latest/inventory_guide/connection_details.html).

> Note: Remember to expose 22 to your IP. If you're like me with AWS EC2, you
> need to add a rule to the security group. Like this:
>
> `aws ec2 authorize-security-group-ingress --group-id PUT_HERE --protocol tcp --port 22 --cidr "$(curl -s https://wtfismyip.com/json | jq -r '.YourFuckingIPAddress')/32"`

All I wanted was for these commands to Just Work™. I tested them. Documented them. So why not?

But of course…

❯ ansible-playbook -v -i hosts2 build.yaml
zsh: /Users/shay/Library/Python/3.11/bin/ansible-playbook: bad interpreter: /opt/homebrew/opt/python@3.11/bin/python3.11: no such file or directory
No config file found; using defaults

PLAY [ctfservers] *****************************************************************************************************************************************************************************************************

TASK [Gathering Facts] ************************************************************************************************************************************************************************************************
[WARNING]: Host 'ctf' is using the discovered Python interpreter at '/usr/bin/python3', but future installation of another Python interpreter could cause a different interpreter to be discovered. See https://docs.ansible.com/ansible-core/2.20/reference_appendices/interpreter_discovery.html for more information.
[ERROR]: Task failed: Action failed: The following modules failed to execute: ansible.legacy.setup.

Task failed: Action failed.

<<< caused by >>>

The following modules failed to execute: ansible.legacy.setup.

+--[ Sub-Event 1 of 1 ]---
|
| Module result deserialization failed: No start of json char found See stdout/stderr for the returned output.
|
+--[ End Sub-Event ]---

fatal: [ctf]: FAILED! => {"ansible_facts": {}, "changed": false, "failed_modules": {"ansible.legacy.setup": {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3"}, "exception": "(traceback unavailable)", "failed": true, "module_stderr": "Shared connection to ctf.mrnice.dev closed.\r\n", "module_stdout": "  File \"/home/ec2-user/.ansible/tmp/ansible-tmp-1774149309.036684-55442-55961845761351/AnsiballZ_setup.py\", line 3\r\n    from __future__ import annotations\r\n    ^\r\nSyntaxError: future feature annotations is not defined\r\n", "msg": "Module result deserialization failed: No start of json char found", "rc": 1}}, "msg": "The following modules failed to execute: ansible.legacy.setup."}

The problem was that the Ansible version I was on (which I was on because it got auto-updated when I was running brew update at some point) doesn’t work anymore with Python 3.6: the from __future__ import annotations line in the traceback above needs Python 3.7 or newer. And of course, I can’t install a newer version of Python on my Amazon Linux 1!

Fuckin’ Python again.

The fix

Thank god for Charlie Marsh! uvx --from 'ansible-core==2.16.*' ansible-playbook -vvv -i hosts2 build.yaml worked. Locked ansible to a version that worked with the python3 version I had installed on the EC2 server. And it all booted back to life. But man, for 15 minutes there, I was VERY frustrated.

Why are you talking about this?

As software engineers we now, because of AI Agents, have the capability to introduce a lot more AI-generated code than we used to. We can send more Slack messages, write more blogs, generate more slop AI brainrot videos, draft longer emails. Some software engineers take that to the extreme. We are so focused on running fast we don’t mind breaking really important things like trust.

But. If our software will be consumed by anyone else, and we are not careful, this will mean pain for them.

Living in the Bay Area and working on AI security, it seems like EVERYONE is pulling the OTHER way. Nobody cares about building things that last because “software is disposable” and “the SaaSpocalypse happened” and other nonsense. As devs it will be very easy to agree to this. It’s so easy to just generate more and more and more code. Everybody’s encouraging you to do it!

Maybe I’m a luddite, but here’s what I’ll try to do differently from now on:

What I will TRY to do differently, and I implore you to try as well

Don’t spit out cryptic error messages that send my users researching. It’s possible to test old versions with new ones, catch the error, understand it, and translate it? Do it.

Note that I’m not saying “test every software against every version ever”. But if you know someone used your library in 2020, it’s very reasonable that they will use it in 2026.
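To make the "catch the error, understand it, and translate it" idea concrete, here is a minimal sketch in Python. It is not how Ansible works internally; the function names and the hint wording are made up for illustration. The only thing taken from the real world is the error signature, copied from the traceback earlier in this post.

```python
from typing import Optional

# Hypothetical table mapping known low-level failure signatures to hints.
# The signature below is copied from the traceback earlier in this post;
# the wording of the hint is an assumption for illustration.
KNOWN_FAILURES = [
    (
        "SyntaxError: future feature annotations is not defined",
        "The Python on the target host is older than 3.7, but this tool "
        "version needs 3.7 or newer there. Upgrade the remote Python or "
        "pin an older version of the tool.",
    ),
]


def explain_failure(raw_output: str) -> Optional[str]:
    """Return a human-readable hint for a known failure, else None."""
    for signature, hint in KNOWN_FAILURES:
        if signature in raw_output:
            return hint
    return None


def report(raw_output: str) -> str:
    """Prefer the translated hint, but always keep the raw output too."""
    hint = explain_failure(raw_output)
    if hint is None:
        return raw_output
    return f"{hint}\n\nOriginal error:\n{raw_output}"
```

The point of report() is that the user sees the translated hint first and the raw output second, so nothing is hidden from the people who do want to dig.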

Next time I need to reach for a tool for a project that should last longer than a month, I’ll assume the project will last 10 years. And I’ll imagine - how can I reboot this and make sure it Just Work™s?

In other words, I’m never leaving postgres for another DB.

I will think harder about my software OVER TIME. How can I protect my software from the rotting winds of change it’s bound to brush up against? How can I make it resilient to other developers working 10x as hard producing more code? Software was already rotting fast, but if everyone’s a 10x coder then software might rot 100x faster because everybody’s importing everybody else’s slopware.

This is a useful thing to add to every architecture session or DDR template. For every 3rd party dependency, how can we protect against breakage? And for people who consider the component we’re building: how can we make this component not be a pain in their ass?

Most of these documents only describe the software in SPACE (data structures, classes, deployment charts). Very few describe the software in TIME. Fewer of those describe it over a long enough time horizon to take into consideration non-backwards-compatible changes in 3rd party libraries and tools.

Prefer tools that are statically built to scripts and interpreted languages. And prefer languages and tools that respect backwards compatibility. It’s just not worth my time to go with the “easy” option.

Or in short, Go >>> Python.

Maybe I’ll stop upgrading everything. Only upgrade libraries and tools when I need a feature, or when it’s critical infrastructure where I’m contractually obligated to do so for security purposes, or the upgrades are ALWAYS easy.

This is actually harder than not upgrading.

Another important lesson: Fewer dependencies and tools? I think so. As the Go proverbs state, “A little copying is better than a little dependency.”

Although, honestly, I don’t know what I’ll replace Ansible with that will be better for maintainability.

Freeze the dev environment. When working on a project, don’t just state it depends on “ansible” - state which version.

Yes yes I know that there are workspace management tools and devcontainers and all that. They will get deprecated as well. Fewer tools, not more.
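As a rough sketch of what "state which version" could look like without reaching for yet another tool, here is a tiny pin check in plain Python. Everything here is hypothetical: the PINS table, the function names, and the deliberately simplified matching (exact versions and a trailing ".*" only, not full version-specifier rules). The ansible-core pin mirrors the uvx command from earlier in the post.

```python
def matches_pin(installed: str, pin: str) -> bool:
    """Check a dotted version against a pin such as '2.16.*' or '2.16.3'.

    Only exact segments and a trailing '*' wildcard are handled; this is
    not a full implementation of Python's version specifier rules.
    """
    installed_parts = installed.split(".")
    pin_parts = pin.split(".")
    if pin_parts[-1] == "*":
        prefix = pin_parts[:-1]
        return installed_parts[: len(prefix)] == prefix
    return installed_parts == pin_parts


# Hypothetical pins for this project; the ansible-core pin mirrors the
# uvx command from earlier in the post.
PINS = {"ansible-core": "2.16.*"}


def check_environment(installed_versions: dict) -> list:
    """Return a list of human-readable mismatches (empty means all good)."""
    problems = []
    for name, pin in PINS.items():
        found = installed_versions.get(name)
        if found is None:
            problems.append(f"{name} is not installed; expected {pin}")
        elif not matches_pin(found, pin):
            problems.append(f"{name} {found} does not match pin {pin}")
    return problems
```

In practice the installed_versions mapping could probably be filled from importlib.metadata.version() in the standard library, and the check could run before the deploy starts, so a mismatch fails loudly in plain words instead of halfway through a playbook.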

Closing thoughts

I’m happy my server still has users playing it. The project being alive is a good thing.

This was a good exercise. For most developers it’s hard to see projects on a 5-10 year scale, because:

  • Most projects don’t last that long
  • Most developers aren’t that experienced
  • Most people don’t write retros

So I’m glad I documented this moment of frustration to try and learn from it.

But man, can things Just Work™ sometimes?


Shay Nehmad

musings, ai


2026-03-22 22:43 -0700