I think this works in simple domains. After working in big tech for a while, I am still shocked by the required complexity. Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
Anyone proclaiming simplicity just hasn't worked at scale. Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.
A classic, Chesterton's Fence:
"There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, 'I don't see the use of this; let us clear it away.' To which the more intelligent type of reformer will do well to answer: 'If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.'"
This is the classic misunderstanding where software engineers can't seem to communicate well with each other.
We can even just look at the title here: Do the simplest thing POSSIBLE.
You can't escape complexity when a problem is complex. You could certainly still complicate it even more than necessary, though. Nowhere in this article is it saying you can avoid complexity altogether, but that many of us tend to over-complicate problems for no good reason.
> We can even just look at the title here: Do the simplest thing POSSIBLE.
I think the nuance here is that "the simplest thing possible" is not always the "best solution". As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database. At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
Complexity is more than just the code or the infrastructure; it needs to run the entire gamut of the solution. That includes looking at the incidental complexity that goes into scaling, operating, maintaining, and migrating (if a temporary "too simple but fast to get going" stack was chosen).
Measure twice, cut once. Understand what you are trying to build, and work out a way to get there in stages that provide business value at each step. Easier said than done.
Edit: Replies seem to be getting hung up over the "DB" reference. This is meant to be a hypothetical where the reader infers a scenario of a technology that "can solve all problems, but is not necessarily the best solution". Substitute for "writing files to the file system" if you prefer.
Right, and again this is reading too much into it. The simplest thing possible does not mean the best solution. If your solution that worked really well yesterday no longer scales today, it's no longer the correct solution and will require a more complex one.
But sometimes it IS better to think a few steps ahead, rather than building a new system from scratch every time things scale up. It's not always easy to upgrade things incrementally: just look at IPv4 vs IPv6
IPv6 is arguably a good example of what happens when you don't do the simplest thing possible. What we really needed was a bigger IP address space. What we got was a whole bunch of other crap. If we had literally expanded IPv4 by a couple of octets at the end (with compatible routing), would we be there now?
I agree with thinking a few steps ahead. It is particularly useful in case of complex problems or foundational systems.
Also maybe simplicity is sometimes achieved AFTER complexity, anyway. I think the article means a solution that works now... target good enough rather than perfect. And the C2 wiki (1) has a subtitle '(if you're not sure what to do yet)'. In a related C2 wiki entry (2) Ward Cunningham says: Do the easiest thing that could possibly work, and then pound it into the simplest thing that could possibly work.
IME a lot of complexity is due to integration (in addition to things like scalability, availability, ease of operations, etc.) If I can keep interfaces and data exchange formats simple (independent, minimal, etc.) then I can refactor individual systems separately.
Yes, sometimes. But how can you know beforehand? It's clear in hindsight, for sure.
The most fundamental issue I have witnessed with these things is that people have a very hard time taking a balanced view.
For this specific problem, should we invest in a more robust solution which takes longer to build or should we just build a scrappy version and then scale later?
There is no right or wrong. It depends heavily on the context.
But, some people, especially developers I am afraid, only have one answer for every situation.
It can be hard enough to fix things when some surprise happens. Unwinding complicated "future proof" things on top of that is even worse. The simpler something is, the less you hopefully have to throw away when you inevitably have to.
Is the simplest thing possible still the DB?
Yes, that's why Google spent a decent amount of resources building out Spanner: for many business domains, even at hyperscale, it's still the DB.
> At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
Don't worry, the second half of the title has this covered:
> ... that could possibly work
In the scenario you've described, the technology is not working, in the complete sense including business requirements of reasonable operating costs.
Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc. I'd tend to agree, but with the caveat that you should feel free to break the rule so long as you're doing it consciously. But none of that implies that you should end up in the situation you described.
> Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc.
This is where I am arguing nuance. These decisions are contextual; and the superficially more complicated solution may be solving inherent complexity in the problem space that only provides benefit over a time period.
As an example, some team might decide to forgo a database and read/write directly to the file system. This may enable a release in less time and that might be the right decision in certain contexts. Or it could be a terrible decision as the externalised costs begin to manifest and the business fails because of loss of customer trust.
My point is that you cannot only look at what is right in front of you, you also need to tactically plan ahead. In the big org context, you also need to strategically plan ahead.
My favourite example of this from my own career... automating timesheet -> payroll processing in a unionized environment. As we're converting the collective bargaining agreement into code, we discover that there are a pair of rules that seem contradictory. Go talk to someone in the payroll department to try to figure out how it's handled. Get an answer that makes decent sense, but have a bit of a lingering doubt about the interpretation. Talk to someone else in the same department... they tell us the alternative interpretation.
Bring the problem back to our primary contact and they've got no clue what to do. They're on like year 2 of a 7 year contract and they've just discovered that their payroll department has been interpreting the ambiguous rules somewhat randomly. No one wants to commit to an interpretation without a memorandum of understanding from the union, and no one wants to start the process of negotiating that MoU because it's going to mean backdating 2 years of payroll for an unknown number of employees, who may have been affected by it one month but not the next, depending on who processed their paystub that month.
I have worked at too many companies where the effort spent not using a simple database was an exponential drag on everything.
Hell, I just spent a week doing something which should've taken 5 minutes because rather than a settings database, someone has just been maintaining a giant ball of copy+pasted Terraform code instead.
The distinction you make is known to me as natural complexity (the base level due to the nature of the domain) and accidental complexity (that which is added unnecessarily on top of it).
Your definition rubs up against what a UX designer taught me years ago, which is that simple and complex are one spectrum, similar to but different from easy and hard.
Often, simple is confused for easy, and complex for hard. However, simple interfaces can hide a lot of information in unintuitive ways, while complex interfaces can present more information and options up front.
> We can even just look at the title here: Do the simplest thing POSSIBLE.
I think you're focusing on weasel words to avoid addressing the actual problem raised by OP, which is the elephant in the room.
Your limited understanding of the problem domain doesn't mean the problem has a simple or even simpler solution. It just means you failed to understand the needs and tradeoffs that led to complexity. Unwittingly, this misunderstanding originates even more complexity.
Listen, there are many types of complexity. Among which there is complexity intrinsic to the problem domain, but there is also accidental complexity that's needlessly created by tradeoffs and failures in analysis and even execution.
If you replace an existing solution with a solution which you believe is simpler, odds are you will have to scramble to address the impacts of all tradeoffs and oversights in your analysis. Addressing those represents complexity as well, complexity created by your solution.
Imagine a web service that has autoscaling rules based on request rates and computational limits. You might look at request patterns and say that this is far too complex: you can just manually scale the system with enough room to handle your average load, and when required you can just click a button and rescale it to meet demand. Awesome work, you simplified your system. Except your system, like all web services, experiences seasonal request patterns. Now you have schedules and meetings and even incidents that wake up your team in the middle of the night. Your pager fires because a feature was released and you didn't quite scale the service to accommodate the new peak load. So now your simple system requires a fair degree of hand-holding to work with any semblance of reliability. Is this not a form of complexity as well? Yes, yes it is. You didn't eliminate complexity; you only shifted it to another place. You saw complexity in autoscaling rules and believed you eliminated it by replacing them with manual scaling, but the complexity just moved somewhere else. Why? Because it's intrinsic to the problem domain, and requiring more manual work to tackle that complexity introduces more accidental complexity than what is required to address the issue.
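To make the trade-off concrete, here is a rough sketch of the kind of rule being replaced. Everything in it is an assumption for illustration: the per-replica capacity, the replica bounds, and the hourly request rates are made-up numbers, and `desired_replicas` is a hypothetical helper, not any real autoscaler's API.

```python
import math


def desired_replicas(request_rate: float, per_replica_capacity: float,
                     min_replicas: int = 2, max_replicas: int = 50) -> int:
    """Replicas needed to serve request_rate, clamped to the allowed range."""
    needed = math.ceil(request_rate / per_replica_capacity)
    return max(min_replicas, min(max_replicas, needed))


# Hypothetical requests/second over four periods, with one seasonal peak.
rates = [800, 1200, 4500, 900]
capacity = 500  # assumed requests/second a single replica can serve

# "Manual" sizing: pick a fixed count that covers the average load.
manual = desired_replicas(sum(rates) / len(rates), capacity)

for rate in rates:
    auto = desired_replicas(rate, capacity)
    shortfall = max(0, auto - manual)
    print(f"rate={rate:>5} auto={auto} manual={manual} shortfall={shortfall}")
```

Under these assumed numbers the fixed size covers three of the four periods and falls short during the peak, which is the period where someone gets paged to click the button.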
I remember reviewing some code of an engineer I was managing at a FAANG. Noticed an edge case. Pointed out I thought if/when that hit, it was going to cause an alarm that would page on-call. He suggested it might be OK to ship because it was "about a one in a million chance of being hit". The service involved did 500,000 TPS. "So, just 30 times a minute, then?"
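The arithmetic behind that reply, as a quick sketch (the function name is made up; the figures are the ones quoted above):

```python
def expected_hits_per_minute(tps: int, one_in: int) -> float:
    """Expected occurrences per minute of an event that hits once per `one_in` requests."""
    requests_per_minute = tps * 60
    return requests_per_minute / one_in


hits = expected_hits_per_minute(tps=500_000, one_in=1_000_000)
print(f"{hits:.0f} expected hits per minute")
```

At 500,000 TPS the service sees 30 million requests a minute, so a one-in-a-million case is expected about 30 times a minute.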
And you're right about the amount of engineering that goes into solving problems. One service adjacent to my patch was more than a decade old. Was on a low TPS but critical path for a key business problem. Had not been touched in years. Hadn't caused a single page in that decade, just trudged along, really solidly well engineered service. Somebody suggested we re-write it in a modern architecture and language (it was a kind of mini-monolith in a now unfashionable language). Engineering managers and principals all vetoed that, thank goodness - would have been 5+ years of pain for zero upside.
Accidental complexity is a thing, YAGNI is a thing, tech-debt-caused complexity is a thing, "I'm a foo programmer, let me write bar code like it's foo" is a thing. I don't know that it's all high-quality, needed complexity.
I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.
Complexity is a learned engineering approach - it takes practice to learn to do it another way. So if all you see is complex solutions how would you learn otherwise?
> I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.
I have worked at scale. I have found examples where simple solutions prevail due to inertia and an inability or unwillingness to acknowledge that the simple solutions failed to adequately address the requirements. The accidental complexity created by those simple solutions is downplayed because acknowledging it would require reevaluating the simple solution, and thus run books and operations and maintenance become part of your daily work because that's how the system is. And changing it would be too costly.
Yep- this is why it's a silly comment to make. Now we are where we are if we didn't qualify the conversation as being for "big scale engineers" only.
How did those replacements go? Or were you just hoping for the opportunity?
You are not wrong, but the source of the problem may not be the domain but poor software design.
If the software base is full of gotchas and unintended side-effects then the source of the problem is in unclean separation of concerns and tight coupling. Of course, at some point refactoring just becomes an almost insurmountable task, and if the culture of the company does not change, more crap will be added before even one of your refactorings lands.
Believe me, it's possible to solve complex problems by clean separation of concerns and composability of simple components. It's very hard to do well, though, so lots of programmers don't even try. That's where you need strict ownership of seniors (who must also subscribe to this point of view).
> If the software base is full of gotchas and unintended side-effects then the source of the problem is in unclean separation of concerns and tight coupling.
Do you know how you get such a system? When you start with a simple system and instead of redesigning it to reflect the complexity you just keep the simple system working while extending it to shoehorn the features it needs to meet the requirements.
We get this all the time, especially when junior developers join a team. Inexperienced developers are the first ones complaining about how things are too complex for what they do. More often than not that just reflects opinionated approaches to problem domains they are yet to understand. Because all problems are simple once you ignore all constraints and requirements.
> then the source of the problem is in unclean separation of concerns and tight coupling
Sometimes the problem is in the edges (the way the separate concerns interact), not in the nodes. This may arise, for example, where an operation/interaction between components isn't idempotent because the need for it to be never came up.
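As an illustration of the idempotency point, here is a minimal sketch of one common approach: the caller supplies a key, and a repeated call with the same key is a no-op. `Ledger` and its methods are hypothetical names, and a real system would persist the seen keys rather than hold them in memory.

```python
class Ledger:
    """Toy account where a retried credit with the same key is applied once."""

    def __init__(self) -> None:
        self.balance = 0
        # idempotency key -> balance right after that operation was applied
        self._seen: dict[str, int] = {}

    def credit(self, key: str, amount: int) -> int:
        if key in self._seen:
            # Retry of an operation we already applied: return the old result.
            return self._seen[key]
        self.balance += amount
        self._seen[key] = self.balance
        return self.balance


ledger = Ledger()
ledger.credit("order-17", 10)
ledger.credit("order-17", 10)  # e.g. the other component retried after a timeout
ledger.credit("order-18", 5)
print(ledger.balance)
```

Without the key check, the retry would have credited the account twice, and neither component on its own would look wrong.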
What, you mean like creating a transaction where if one component does something then the second component fails, the first one should revert?
Again, wrong design. Like I said, it's very difficult to do well. Consider alternate architecture: one component adds the bulk data to request, the second component modifies it and adds other data, then the data is sent to transaction manager that commits or fails the operation, notifying both components of the result.
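A rough sketch of that alternate architecture, under stated assumptions: the components only edit an in-memory request, a single transaction manager decides whether to apply it, and both components are told the outcome. All class and function names here are made up for illustration, and the "store" is a plain dict standing in for a database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Request:
    """Accumulates proposed changes; nothing is persisted until commit."""
    data: Dict[str, object] = field(default_factory=dict)


class Component:
    """A participant that edits the request and later hears the outcome."""

    def __init__(self, name: str, contribute: Callable[[Request], None]):
        self.name = name
        self._contribute = contribute
        self.outcomes: List[bool] = []

    def contribute(self, request: Request) -> None:
        self._contribute(request)

    def notify(self, committed: bool) -> None:
        self.outcomes.append(committed)


class TransactionManager:
    """The single place that commits or fails the combined request."""

    def __init__(self, store: Dict[str, object],
                 is_valid: Callable[[Dict[str, object]], bool]):
        self.store = store
        self.is_valid = is_valid

    def run(self, components: List[Component]) -> bool:
        request = Request()
        for component in components:
            component.contribute(request)
        committed = self.is_valid(request.data)
        if committed:
            self.store.update(request.data)  # the only write to the store
        for component in components:
            component.notify(committed)
        return committed


def add_bulk(request: Request) -> None:
    request.data["items"] = [3, -1, 4]


def clean_and_total(request: Request) -> None:
    # Modifies the first component's data, then adds its own.
    items = [i for i in request.data["items"] if i > 0]
    request.data["items"] = items
    request.data["total"] = sum(items)


store: Dict[str, object] = {}
first = Component("first", add_bulk)
second = Component("second", clean_and_total)
manager = TransactionManager(store, is_valid=lambda d: d.get("total", 0) > 0)
ok = manager.run([first, second])
print(ok, store)
```

Because neither component writes to the store directly, a failed validation leaves nothing to revert; the cost is that every write path has to go through the manager.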
Now, if the first component is one k8s container already writing to the database and second is then trying to modify the database, rearchitecting that could be a major pain. So, I understand that it's difficult to do after the fact. Yet, if it's not done that way, the problem will just become bigger and bigger. In the long run, it would make more sense to rearchitect as soon as you see such a situation.
He has worked there for 2 years at staff level. This is the same as me (staff SWE with more YoE than this guy, in a lot more varied roles) professing that all the things implemented at my new company, which scans 1 billion objects a day, are simple, because I didn't fucking write them.
The guy is full of shit.
Look at his other blog spam
The formula for prioritizing is literally this simple:
Am I working on the most important thing right now?
If not, drop what I'm doing and go do that
Utter trash.
Look at his CV. Tiny (but impactful) features ///building on existing infrastructure which has already provably scaled to millions and likely has never seen beneath what is a rest api and a react front end///
I know this type. I AM him. Exaggerating my way through roles saying the right things through self promotion at the right times.
> I've also written Python and C in production
Absolute mistruth. A single line edit to existing applications/a pet project CGI server.
This is EXACTLY what I do.
Appreciate the hustle, but don't assume "because github + writes blog = knows things"
I personally know and have (tangentially) worked with the guy and none of what you've said is true.
> Look at his CV. Tiny (but impactful) features ///building on existing infrastructure which has already provably scaled to millions and likely has never seen beneath what is a rest api and a react front end///
Off the top of my head he wrote the socket monitoring infrastructure for Zendesk's unicorn workers, for example.
I certainly don't agree with everything Sean says and admit that "picking the most important work" is a naive thing to say in most scenarios.
But writing Python in production is trivial. Why would anyone lie about that? C is different OTOH. But just because you do a single config change and get paid for that doesn't mean it's true for everyone.
Also, staff at GitHub requires a certain bar of excellence. So I wouldn't blindly dismiss everything just out of spite.
Though in my previous job, a huge amount of complexity was due to failed, abandoned, or incomplete attempts to refactor/improve systems, and I frequently wondered, if such things had been disallowed, how much simpler the systems we inherited would have been.
This isn't to say you should never try to refactor or improve things, but make sure that it's going to work for 100% of your use cases, that you're budgeted to finish what you start, and that it can be done iteratively with the result of each step being an improvement on the previous.
Every refactor attempt starts with the intention of 100% coverage.
No one can predict how efficacious that attempt will be from the get-go. Often, people eventually find out that their assumptions were too naive or they don't have enough budget to push it to completion.
Successful refactoring attempts start small and don't try to change the universe in a single pass.
Sure, but do some due diligence. I just say that because I've seen a couple cases where someone does a hack week project that introduces some new approach that "makes things so much cleaner". But then after spending a couple months productionizing it and rolling out the first couple iterations to prod amid much fanfare, it becomes evident that while it makes some things easier (oftentimes things that weren't all that hard to begin with), it makes other things a lot harder. So then you're stuck: do you keep pushing even though it's a net negative, do you roll back and lose all that work, or do you stall and leave a two-headed system?
In most of these cases, a few days up front exploring edge cases would have identified the problems and likely would have red lighted the project before it started. It can make you feel like a party pooper when everyone is excited about the new approach, but I think it's important that a few people on the team are tasked with identifying these edge cases before greenlighting the project. Also, maybe productionize your easiest case first, just to get things going, but then do your hardest case second, to really see if the benefits are there, and designate a go/rollback decision point in your schedule.
Of course, such problems can come up in any project, but from what I've seen they tend to be more catastrophic in refactoring/rearchitecting projects. If nothing else, because while unforeseen difficulties can be hacked around for new feature launches, hacking around problems completely defeats the purpose of a refactoring project.
The problem isn't refactoring, it's that it was failed, abandoned, or incomplete.
And that's usually because the person or small group that began the refactor wasn't given the time and resources to do it, uninterested or unknowledgeable people hijacked and overcomplicated the process, and others blocked it from happening. What would have taken the initial team a few weeks to complete successfully, with a little help and cooperation from others and without being pulled in 10 different directions to fight other fires, instead dragged on for months. After tons of time and money were spent on people mucking it up instead of fixing it, the refactor got abandoned, a million dollars was wasted, and the system as a whole was worse than it was before.
At least half the time, the complexity comes from the system itself, echoes of the organizational structure, infrastructure, and not the requirements or problem domain; so this advice will/should be valid more often than not.
I was one of the original engineers of DFP at Google and we built the systems that send billions of ads to billions of users a day.
The complexity comes from the fact that at scale, the state space of any problem domain is thoroughly (maybe totally) explored very rapidly.
That's a way bigger problem than system complexity, and pretty much any system complexity is usually the result of edge cases that need to be solved, rather than bad architecture, infrastructure or organisational issues. Those problems are only significant at smaller, inexperienced companies; by the time you are post-scale (if the company survives that long), state space exploration in implementation (features, security, non-stop operations) is where the complexity is.
My rule on edge cases is: It's OK to not handle an edge case if you know what's going to happen in that case and you've decided to accept that behavior because it's not worth doing something different. It's not OK to fail to handle an edge case because you just didn't want to think about it, which quite often is what the argument for not handling it boils down to. (Then there are the edge cases you didn't handle because you didn't know they existed, which are a whole other tragicomedy.)
Not directly related to the article we're discussing here, but, based on your experience, you might be the ideal kind of person to answer this.
At the scale you are mentioning, even "simple" solutions must be very sophisticated and nuanced. How does this transformation happen naturally from an engineer at a startup where any mainstream language + Postgres covers all your needs, to someone who can build something at Google scale?
Let's disregard the grokking of system design interview books and assume that system design interviews do look at real skills instead of learning common buzzwords.
Demonstration of capability will get you hired, capability comes only through practice.
I built a hobby system for anonymously monitoring BitTorrent by scraping the DHT, in doing this, I learned how to build a little cluster, how to handle 30,000 writes a second (which I used Cassandra for - this was new to me at the time) then build simple analytics on it to measure demand for different media.
Then my interview was just talking about this system, how the data flowed, where it can be improved, how is redundancy handled, the system consisted of about 10 different microservices so I pulled the code up for each one and I showed them.
Interested in astronomy? Build a system to track every star/comet. Interested in weather? Do SOTA predictions. Interested in geography? Process the open source global gravity maps. Interested in trading? Build a data aggregator for a niche.
It doesn't really matter whether whatever you build is the best in the world or not. The fact that you built something, practiced scaling it with whatever limited resources you have, were disciplined enough to take it to completion, and didn't get stuck down some rabbit hole endlessly re-architecting stuff that doesn't matter: this is what they're looking for. Good judgement, discipline, experience.
Also, attitude is important, like really, really important. Some cynical ranter is not going to get hired over the "that's cool, I can do that!" person, even if the cynical ranter has greater engineering skills. Genuine enthusiasm and genuine curiosity are infectious.
If it's a legacy system, then it lives at the edges. The edges are everything.
I wish I could remember or find the proof, but in a multi-dimensional space, as the number of dimensions rises, the highest probability is for points to be located near the edges of the system -- with the limit being that they can be treated as if they all live at the edges. This is true for real systems too -- the users have found all of the limits but avoid working past them.
The system that optimally accommodates all of the edges at once is the old system.
You don't need a complicated proof, just assume a distribution in some very high number of dimensions, with each sample drawing a random value from the distribution for each dimension. If you have ~300 dimensions then statistically at least one dimension will be ~3SD from the mean, i.e. "on the edge," and as long as any one dimension is close to an edge, we define a point as being "near the edge."
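A quick check of that estimate, assuming (this is not stated above) that each dimension is an independent standard normal, and reading "~3SD from the mean" as more than 3 standard deviations in either direction:

```python
import math


def prob_beyond(k_sd: float) -> float:
    """Two-sided tail probability P(|Z| > k_sd) for a standard normal Z."""
    return math.erfc(k_sd / math.sqrt(2))


def prob_any_dimension_beyond(dims: int, k_sd: float) -> float:
    """Chance that at least one of `dims` independent coordinates exceeds k_sd."""
    return 1 - (1 - prob_beyond(k_sd)) ** dims


p_tail = prob_beyond(3)
print(f"per-dimension tail probability: {p_tail:.4f}")
print(f"expected dimensions beyond 3SD (of 300): {300 * p_tail:.2f}")
print(f"P(at least one of 300 beyond 3SD): {prob_any_dimension_beyond(300, 3):.2f}")
```

Under those assumptions the expected count of "edge" dimensions is a little under one, and the chance of at least one is a bit over half, so "at least one" holds more often than not rather than with certainty. The probability climbs quickly as dimensions are added.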
It's not really meaningful though, at high dimensions you want to consider centrality metrics.
When the domain is complex, it's even MORE important that the individual components be simple with clean interfaces between them. If everything is too intertwined, you lose the ability to make changes or add new functionality without accidentally breaking something else.
As for Chesterton's Fence, you have the causality backwards. You should not build a fence or gate before you have a need for it. However, when you encounter an existing fence or gate, assume there must have been a very good reason for building it in the first place.
This is where John Gall's Systemantics comes into play, "A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system."
Obviously a bit hyperbolic, but matches my experience.
> Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.
The amount of knowledge required to first generate the codebase, that is now missing for the rewrite, is the elephant in the room for rewrites. That's a decade of decision making, business rules changing, knowledge leaving when people depart etc.
Much like your example, if you think all the information is in the codebase then you should go away and start talking to the business stakeholders until you understand the scope of what you don't currently know.
> Anyone proclaiming simplicity just hasn't worked at scale.
Most projects don't operate at scale. And before "at scale", simple, rewritable code will always evolve better, because it's less dense, and less spread out.
There is indeed a balance between the simplest code, and the gradual abstractions needed to maintain code.
I worked with startups, small and medium sized businesses, and with a larger US airline. Engineering complexity is through the roof when it doesn't have to be. It didn't have to be on any of the projects I've seen and worked on.
Now if you're an engineer in some mega corp, things could be very different, but you're talking about the 1% there. If not less.
every complex domain and "at scale" is just a bunch of simple things in disguise... our industry is just terrible in general about breaking things down. we sort of know this so we came up with shit things like "microservices", but spend sufficient time in the industry (almost three decades for me) and you won't find a single place that has a microservices architecture that you haven't wished was a monolith :) we are just terrible at this... there is no complex domain, it is just a good excuse we use to justify things
The problem with this is no one can agree about what "at scale" means.
Like yes, everyone knows that if you want to index the whole internet and have tens of thousands of searches a second there are unique challenges and you need some crazy complexity. But if you have a system that has 10 transactions a second...you probably don't. The simple thing will probably work just fine. And the vast majority of systems will never get that busy.
Computers are fast now! One powerful server (with a second powerful server, just in case) can do a lot.
Yeah, we do 100k ML inferences per second. It's not a single server, but the architecture isn't much more complicated than that.
With today's computers, indexing the entire internet and serving 100k QPS also isn't really that demanding architecturally. The vast majority of current implementation complexity exists for reasons other than necessity.
Yep, vertical scaling goes a long way. But it's not compute where the bottleneck for scale lies, rather in the resiliency & availability.
So although a single server goes a long way, to hit that sweet 99.999% SLA, people scale horizontally way before hitting the maximum compute capacity of a single machine. HA makes everything way more difficult to operate and reason about.
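For a sense of how tight that target is, a small sketch of the yearly downtime budget implied by an availability figure (the helper name is made up, and a 365-day year is assumed):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down and still meet the target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} minutes/year")
```

Five nines leaves a bit over five minutes a year, which is hard to reconcile with a single machine that ever needs a reboot, a kernel patch, or a hardware swap.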
I had an engineering boss who used this as a mantra (he is now an SVP of engineering at Spotify and we worked together at Comcast)
I think the unspoken part here is "let's start with..."
It doesn't mean you won't have to "do all the things" so much as let's start with too little so we don't waste time doing things we end up not needing.
Once you aggregate all the simple things you may end up with a complex behemoth but hopefully you didn't spend too much time on fruitless paths getting there.
The point is to not overengineer. This is not about ignoring scale, or not considering edge cases. Don't engineer for scale that you don't even know is necessary if that complicates the code. Do the simplest thing that meets the current requirements, but write the code in such a way that more features, scale etc. can be added without disrupting dependencies.
First of all, I don't disagree. Just wanted to add that "the simple thing" is often not the obvious thing to do, and only becomes apparent after working on it for a while. Oftentimes, when you dive into a set of adjacent functionality, you discover that it barely even works, and does not actually do nearly all the things you thought it did.
Yes. The simple thing is not necessarily the obvious thing or the most immediately salient thing. First explore the problem-solution space thoroughly, THEN choose the simple thing.
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
Is this really because the single problem is inherently difficult, or because you're trying to solve more than one problem (scope creep) due to a fear of losing revenue? I think a lot of complexity stems from trying to group disparate problems as if they can have a single solution. If you're willing to live with a smaller customer base, then simple solutions are everywhere.
If you want simple solutions and a large customer base, that probably requires R&D.
I am deep in one such corporate complexity, yet I constantly see an ocean of items that could have been done in a much simpler and more robust way.
Simple stuff has tons of long-term advantages and benefits - it's easy to ramp up new folks on it compared to some over-abstracted hypercomplex system built because some lead dev wanted to try new shiny stuff for their CV or out of boredom. It's easy to debug, migrate, evolve and just generally maintain, something pure devs often don't care much for unless they become more senior.
Complex optimizations are for sure required for extreme performance or massive public web but that's not the bulk of global IT work done out there.
This could also point to the solution of cutting down the complexity of "big tech". So much of that complexity isn't there because it solves problems; it just keeps people employed.
This is a horrifically cynical take and I wish it would stop. I doubt very seriously there is any meaningfully sized collection of engineers who introduce things "just to keep themselves employed," to say nothing of having to now advance that perspective into a full blown conspiracy because code review is also a thing
What is far more likely is the proverbial "JS framework problem:" gah, this technology that I read about (or encounter) is too complex, I just want 1/10th that I understand from casually reading about it, so we should replace it with this simple thing. Oh, right, plus this one other thing that solves a problem. Oh, plus this other thing that solves this other problem. Gah, this thing is too complex!
I don't agree with the phrasing, but there is certainly a ton of complexity introduced because of engineers who are trying to be promoted or otherwise maintain their image of being capable of solving complex problems (through complex solutions).
It's not the same as introducing complexity to keep yourself employed, but the result is the same and so is the cause - incentive structures aren't aligned at most companies to solve problems simply and move on.
I realized that I should have asked for an example of "too complex" because I may not be following the arguments because my definition of a thing that is "too complex" almost certainly doesn't align with someone else's. In fact, I'd bet that if you rounded up 10 users from this site and polled them for something they thought was "too complex" the intersection would be a very, very small set of things
I'd recommend reading Bullshit Jobs by David Graeber. Most jobs in most organisations have an incentive structure for an individual to keep themselves employed rather than to actually solve problems.
I'm with you that the world in general is filled with bullshit jobs, but I do not subscribe to the perspective of wholesale bullshit jobs in the cited "big tech," since in general I do not think that jobs which have meaningful ways to measure them easily fall into bullshit. Maybe middle managers?
Do you reckon the KPIs and performance indicators used in big tech count as meaningful ways to measure performance? Wouldn't someone implementing a complex resume-driven project score highly on these measurements, despite a simpler solution being correct? I am not sure that job-hopping every 18 months to maximise TC (i.e. optimise against your incentives) is a great way to learn about long-term design and organisational implications.
I'm not saying that these jobs are bullshit in the same way that a VP of box-ticking is, just that it's not a conspiracy that a cathedral based on 'design-doc culture' might produce incentives that result in people who focus on maximising their performance on these fiscally rewarding dot points, rather than actualising their innate belief in performant and maintainable systems.
I work at a start-up so if my code doesn't run we don't get paid. This motivates me to write it well.
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
You're doing it wrong. More likely than not.
> Anyone proclaiming simplicity just hasnt worked at scale. Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.
Or, you're just used to excusing complexity because your environment rewards complexity and "big things".
Simple is not necessarily easy. Actually simple can be way harder to think of and push for, because people are so used to complexity.
Yes. Massive scale and operations may make things harder but seeking simplicity is still the right choice and "working in big tech" is not a particularly hard or rare credential on HN. Try an actual argument instead of an appeal to self authority.
It's a shame he doesn't give the origin of this expression in programming. It comes from Ward Cunningham (inventor of the wiki) in his work with Kent Beck. In an interview a few years back on Dr. Dobb's, he stated that as the two of them were coding together in the late 80s, they would regularly remind each other of the principle. Eventually, it became a staple of their talks and writing.
They were cognizant of the limitations that are touched on in this article. The example they gave was of coming to a closed door. The simplest thing might be to turn the handle. But if the door is locked, then the simplest thing might be to find the key. But if you know the key is lost, the simplest thing might be to break down the door, and so on. Finding the simplest thing is not always simple, as the article states.
IIRC, they were aware that this approach would leave a patchwork of technical debt (a term coined by Cunningham), but the priority on getting code working overrode that concern at least in the short term. This article would have done well to at least touch on the technical debt aspect, IMHO.
Just to add I think the same applies in business and in life. So if you've got managers who have this vision and cascade it down then things become easier on the ground / the design.
I heard the expression from a colleague who made it his mantra and manifesto, and had no idea who it came from originally! Perhaps the highest honor for an expression's originator is for it to be so ubiquitous that no one knows he said it.
This sounds a lot like the apocryphal Einstein quote
> Everything should be made as simple as possible, but not simpler.
And I found a similar quote from Aquinas
> If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments where one suffices
Not apocryphal. The article is referenced and discussed in this interview with Kent Beck[0]. As you see, the link goes directly to Dr. Dobb's--although the page is down. Search for "Cunningham" and the first hit takes you right to the conversation.
It's interesting you gave that example. Before my first use of a wiki I was on a team that used Lotus Notes and did project organization in a team folder. I loved that Notes would highlight which documents had been updated since the last time I read them.
In the next project, that team used a wiki. It's simpler. But, the fact it didn't tell me which documents had been updated effectively made it useless. People typed new project designs into the wiki but no one saw them since they couldn't, at a glance, know which of the hundreds of pages had been updated since they last read them.
I came here to say this! Ward taught me this when I paired with him every day when we worked together. It's his, dare I say, mantra when starting a new feature.
Principles like these deserve to be part of curriculum for undergrad courses. People are incorrectly trained to go for some ideal forms at the cost of complexity and fragility. Every approach, idea should be put to ruthless cost-benefit comparison, without any regard to who is proposing it, or how it sounds.
Education and training sometimes enforces prejudices, rules and stigmas that evade inspection of the subject matter in raw form.
Preference to idealism probably emerged from peace times which have no struggle. Someone would be obsessed with perfectness of a sculpture only when they don't need to hunt for the next meal. The real world runs on minimal, conservative, durable and robust approaches.
One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is? And how can you be sure that it "could possibly work"?
Yesterday I had a problem with my XLSX importer (which I wrote myself--don't ask why). It turned out that I had neglected to handle XML namespaces properly because Excel always exported files with a default namespace.
Then I got a file that added a namespace to all elements and my importer instantly broke.
For example, Excel always outputs <cell ...> whereas this file has <x:cell ...>.
The "simplest thing that could possibly work" was to remove the namespace prefix and just assume that we don't have conflicting names.
But I didn't feel right about doing that. Yes, it probably would have worked fine, but I worried that I was leaving a landmine for future me.
So instead I spent 4 hours re-writing all the parsing code to handle namespaces correctly.
Whether or not you agree with my choice here, my point is that doing "the simplest thing that could possibly work" is not that easy. But it does get easier the more experience you have. Of course, by then, you probably don't need this advice.
> One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is?
I think the author kind of mentions this: "Figuring out the simplest solution requires considering many different approaches. In other words, it requires doing engineering."
Agreed! The author is clearly an experienced and talented software engineer.
But the irony, in my opinion, is that experienced engineers don't need this advice (they are already "doing engineering"), but junior engineers can't use this advice because they don't have the experience to know what the "simplest thing" is.
Still, the advice is useful as a mantra: to remind us of things we already know but, in the heat of the moment, sometimes forget.
I like this. I had a rule of three: figure out three qualitatively different ways to solve the problem - different in kind, not just in choice of tools. Once you have three you start to understand the trade-offs. And you can come up with others quite easily.
We attempt to address this problem at work with an extra caveat to never add code "in the wrong direction" -- so it's fine (usually preferable) to have a partial implementation, as long as it's heading in the direction we'd like the more complete implementation to go in. Basically "KISS, but no hacks".
Just curious, how would that be applied to the XLSX namespace problem example given? If the full fix is to implement namespacing, what would the KISS approach be in the right direction?
This avoids the endless whack-a-mole that you get with a partial solution such as "assume namespaces are superfluous", which you almost certainly will eventually discover weren't optional.
Or some other hapless person using your terrible code will discover it at 2am, sitting alone in the office building, while desperately trying to do something mission-critical such as using a "simple" XML export tool to cut over ten thousand users from one Novell system to another so that the citizens of the state have a functioning government in the morning.
Ask me how I know that kind of "probably won't happen" thing will, actually, happen.
I gauge it as "the simplest thing to transition". Most of the time, it's easier to transition a single service that doesn't rely on a big number of complex abstractions or extra infrastructure, even if it's at the expense of some clutter or a bit of redundancy. The new owner can step through the code and see what's going on without having to work backward to understand the abstractions or coordination of services or whatever else.
Of course plenty of times there'll be some abstractions that make the code easier to follow, even at the expense of logic locality. And other times where extra infrastructure is really necessary to improve reliability, or when your in-memory counter hack gets more requirements and replacing it with a dedicated rate limiter lets you delete all that complexity. And in those cases, by all means, add the abstractions or infrastructural pieces as needed.
But in all such cases, I try to ask myself, if I need to hand off this project afterward, which approach is going to make things easiest to explain?
Note that my perception of this has changed over time. Long ago, I was very much in the camp of "simple" meaning: make everything as terse as possible, put everything in its own service, never write code when a piece of infrastructure could do it, decouple everything to the maximum extent, make everything config-based. Ironically, I remember imagining how delighted the new owners would be to receive such a well-factored thing that was almost no code at all; just abstraction upon abstraction upon event upon abstraction that fit together perfectly via some config file. Of course, the transition was a complete fail, as they didn't care enough to grok how all the pieces were designed to fit together, and within a month, they'd broken just about every abstraction I'd built into it, and it was a pain for anybody to work with.
Since then, I've kept things simpler, only using abstractions and extra infra where it'd be weird not to, and always thinking what's going to be the easiest thing to transition. And even though I'm not necessarily transitioning a ton of stuff, it's generally easier to ramp up teams or onboard new hires or debug problems when the code just does what it says. And it's nice because when a need for a new abstraction becomes apparent, you don't have to go back and undo the old one first.
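For anyone wondering what an "in-memory counter hack" might look like before it earns a dedicated rate limiter, here is one possible sketch. The class name, the fixed-window strategy and the injectable `clock` parameter are all assumptions made for illustration, not anything described in the comment above.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` calls per key in each `window_seconds` window.

    State lives in this process only: it is lost on restart and not shared
    across servers, which is where a dedicated limiter starts to pay off.
    """

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # (key, window index) -> calls seen in that window.
        # Old windows are never evicted here; a longer-lived version must prune.
        self.counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        """Record one call for `key`; return False once the window is full."""
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=2, window_seconds=60)
    print([limiter.allow("client-1") for _ in range(3)])
```

It is easy to step through and easy to hand over, which fits the "simplest thing to transition" test. Once requirements like multiple servers, persistence or burst handling show up, the comments in the class are the list of reasons to swap it for real infrastructure.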
I think most commentators here are missing the point that doing the "simplest" thing doesn't mean doing the hackiest, quickest thing.
The simplest thing can be very difficult to do. It requires thought and an understanding of the system, which is what he says at the very beginning. But I think most people read the headline and just started spewing personal grievances.
> I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.
That's a good way of putting it. The advice essentially boils down to "do the right thing, don't do the wrong thing". Which is good (if common sense) advice, but doesn't practically really help with making decisions.
It's the same for AI vibecoding. The more experience you have, the easier it is to keep the agent on the right path. Same for identifying which tasks to use an agent for vs doing yourself.
Don't confuse sloppy with simple. Parsing XML with regex[1] (or a non-namespace-compliant XML parser) is not simple. It's messy, verbose, error-prone, and not in any way idiomatic or simple.
If you had just used a compliant XML parser as intended, you might not even have noticed that different encodings of namespaces were even occurring in the files! It just "doesn't register" when you let the parser handle this for you, in the same sense that if you parse HTML (or XML) properly, then you won't notice all of the &amp; and &lt; encodings either. Or CDATA. Or Unicode escapes. Or anything else for that matter that you may not even be aware of.
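As a small illustration of that point, here is a sketch using Python's standard `xml.etree.ElementTree`. The namespace URI and element names are hypothetical stand-ins, not the real spreadsheet schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, for illustration only.
NS = "http://example.com/sheet"

# Same data, two spellings: a default namespace vs. an explicit "x:" prefix.
DEFAULT_NS_DOC = f'<row xmlns="{NS}"><cell>1 &lt; 2 &amp; 3</cell></row>'
PREFIXED_DOC = f'<x:row xmlns:x="{NS}"><x:cell>1 &lt; 2 &amp; 3</x:cell></x:row>'


def cell_texts(xml_text: str) -> list[str]:
    """Return the text of every <cell> in the NS namespace, whatever the prefix."""
    root = ET.fromstring(xml_text)
    # ElementTree expands each tag to "{namespace-uri}localname",
    # so the prefix chosen by the producer never reaches this code.
    return [cell.text or "" for cell in root.iter(f"{{{NS}}}cell")]


if __name__ == "__main__":
    print(cell_texts(DEFAULT_NS_DOC))
    print(cell_texts(PREFIXED_DOC))
```

Both documents yield the same list, with the entities already decoded to `1 < 2 & 3`. The calling code only ever names the namespace URI, so a producer switching from `<cell>` to `<x:cell>` changes nothing downstream.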
You may be a few more steps away from making an XLSX importer work robustly. Did you read the spec? The container format supports splitting single documents into multiple (internal) files to support incremental saves of huge files. That can trip developers in the worst way, because you test with tiny files, but XLSX-handling custom code tends to be used to bulk import large files, which will occasionally use this splitting. You'll lose huge blocks of data in production, silently! That's not fun (or simple) to troubleshoot.
The fast, happy path is to start with something like System.IO.Packaging [2], which is the built-in .NET library for the Open Packaging Conventions (OPC) container format, the underlying container format of all Office Open XML (OOXML) formats. Use the built-in XML parser, which handles namespaces very well. Then the only annoyance is that OOXML formats have two groups of namespaces that they can use, the Microsoft ones and the Open "standardised" ones.
Parsing XML is relatively trivial--I'd never use regex, of course, but a basic recursive descent parser can do it pretty easily. I mean, the whole point of XML is that it's supposed to be easy to parse and generate!
Namespaces add a wrinkle, but it wasn't that hard to add. And I was able to add namespace aliasing in my API to handle the two separate "standard" namespaces that you're talking about.
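For readers curious what namespace aliasing could look like, here is one possible sketch on top of Python's `xml.etree.ElementTree`. It is not the commenter's API. It assumes the two families in question are the Transitional and Strict SpreadsheetML namespaces, and the URIs are written as I understand them, so check them against the spec before relying on them. `ALIASES`, `canonical_tag` and `parse_canonical` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Assumed URIs for the Transitional and Strict SpreadsheetML main namespaces;
# verify against the OOXML spec before relying on them.
TRANSITIONAL = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
STRICT = "http://purl.oclc.org/ooxml/spreadsheetml/main"

# Alias table: every known spelling maps to one canonical namespace.
ALIASES = {STRICT: TRANSITIONAL}


def canonical_tag(tag: str) -> str:
    """Rewrite "{uri}local" so aliased URIs collapse to the canonical one."""
    if not tag.startswith("{"):
        return tag  # element is in no namespace
    uri, local = tag[1:].split("}", 1)
    return f"{{{ALIASES.get(uri, uri)}}}{local}"


def parse_canonical(xml_text: str) -> ET.Element:
    """Parse, then normalise every element tag in place."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if isinstance(element.tag, str):  # skip any non-element nodes
            element.tag = canonical_tag(element.tag)
    return root


if __name__ == "__main__":
    doc = f'<x:worksheet xmlns:x="{STRICT}"><x:sheetData/></x:worksheet>'
    root = parse_canonical(doc)
    # Lookups can now use the canonical namespace only.
    print(root.find(f"{{{TRANSITIONAL}}}sheetData") is not None)
```

Normalising once at parse time keeps the rest of the importer unaware that two namespace families exist. This sketch only rewrites element tags; namespaced attributes would need the same treatment.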
But you're right about OPC/OOXML--those are massive specs and even the tiny slice that I'm handling has been error-prone. I haven't dealt with multiple internal files, so that's a future bug waiting for me. The good news is I'm building a nice library of test files for my regression tests!
It really isn't, and rolling your own parser is the diametric opposite of the "do the simplest thing" philosophy.
The XML v1.1 spec is 126 KB of text, and that doesn't even include XML Namespaces, which is a separate spec with 25 KB of text.
XML is only "simple" in the sense of being well-defined, which makes interoperability simple, in some sense. Contrast this with ill-defined or implementation-defined text formats, where it's decidedly not simple to write an interoperable parser.
As an end-user of XML, the simplest thing is to use an off-the-shelf XML parser, one that's had the bugs beaten out of it by millions of users.
There are very few programming languages out there that don't have a convenient, full-featured XML parser library ready to use.
One of the biggest, evergreen arguments I've had in my career revolves around the definition of "works".
"Just because it works doesn't mean it isn't broken" is an aphorism that seems to click for people who are also handy in the physical world, but that many software developers think doesn't sound right. Every handyman has at some time used a busted tool to make a repair. They know they should get a new one, and many will make an excuse to do so at the next opportunity (hardware store trip, or sale). Maybe 8 out of ten.
In software it's probably more like 1 out of ten who will put in the equivalent effort.
One of the worst periods of my career was at a company that had a team who liked to build prototypes. They would write a hasty proof-of-concept and then their boss would parade it in front of the executives. It would be deployed somewhere and connected to a little database so it technically "worked" when they tried it.
Then the executives would be stunned that it was done so quickly. The prototype team would pass it off to another team and then move on to the next prototype.
The team that took over would open the project and discover that it was really a proof of concept, not a working site. They wouldn't include basic things like security, validation, error messages, or any of the hundred things that a real working product requires before you can put it online.
So the team that now owned it would often have to restart entirely, building it within the structures used by the rest of our products. The executives would be angry because they saw it "work" with their own eyes and thought the deployment team was just complicating things.
The worst case of this I ran into, the "maintenance" team discovered some of the interactions were demo stubs. Nothing actually happened except the test data looked like the state transition worked.
Those are the worst because you don't have done criteria you can reasonably write down. It's whenever QA stops finding fakes in the code, plus a couple months for stragglers you might have missed.
I generally agree, except if the program is a one-time program meant to generate a single output and then you throw it away.
Until recently I would say such programs are extremely rare, but now AI makes this pretty easy. Want to do some complicated project-wide edit? I sometimes get AI to write me a one-off script to do it. I don't even need to read the script, just check the output and throw it away.
But I'm nitpicking, I do agree with it 99% of the time.
By the time you've done something five times, it's probably part of your actual process, and you should start treating it as normal instead of exceptional. Even if admitting so feels like a failure.
So I staple something together that works for the exact situation, then start removing the footguns I'm likely to hit, then I start shopping it to other people I see eye to eye with, fix the footguns they run into. Then we start trying to make it into an actual project, and end game is for it to be a mandatory part of our process once the late adopters start to get onboard.
I remember once having to make a SOAP call that just wasn't connecting for some reason, but another end point on the same service was working, which made no sense. We tried calling the working endpoint right before calling the busted endpoint just for kicks, and that actually functioned. Still to this day makes no sense at all to me, we ended up moving off of soap eventually, but that code was in there until we did.
I hate the days when you are trying to fix a bug in a block of code and as you write pinning tests you realize that the code has always been broken and you cannot understand why it ever got the right answer. You've let the magic smoke out and you cannot put it back without fixing the problem. At some point you have to stop trying because you understand perfectly well how it should work and you need to move on to other things.
Those conversations are an important part of the job. You can, for example, agree that something works in the sense that it is currently possible to use it to obtain a desired output, while simultaneously failing to work in various ways: It might fail to do so reliably, or it might only be able to do so at great cost.
On a recent project I fixed our deployment and our hotfix process and it fundamentally changed the scope of epics the team would tackle. Up to that point we were violating the first principle of Continuous: if it's painful, do it until it isn't. So we would barely deploy more often than we were contractually (both in the legal and internal cultural sense) obligated to do, and that meant people were very conservative about refactoring code that could lead to regressions, because the turnaround time on a failing feature toggle was a fixed tempo. You could turn a toggle on to analyze the impact but then you had to wait until the next deployment to test your fixes. Excruciating, with a high deviation for estimates.
With a hotfix process that actually worked, people would make two or three times as many iterations, to the point we had to start coordinating to keep people from tripping over each other. And as a consequence old nasty tech debt was being fixed in every epic instead of once a year. It was a profound change.
And as is often the case, as the author I saw more benefit than most. I scooped a two-year, two-man effort to improve response time by myself in three months, making a raft of small changes instead of a giant architectural shift. About twenty percent of the things I tried got backed out because they didn't improve speed and didn't make the code cleaner either. I could do that because the tooling wasn't broken.
The definition of 'works' depends on whether my employer wants to spend its resources (the time I'm working) on fixing it.
If they want to use those resources to prioritize quality, I'll prioritize quality. If they don't, and they just want me to hit some metric and tick a box, I'm happy to do that too.
You get what you measure. I'm happy to give my opinion on what they should measure, but I am not the one making that call.
They'll never prioritize the work that keeps the wheels on. You have to learn not to ask and bake it into the cost of new feature work. It's non-negotiable or it never happens.
My second lead role, the CTO and the engineering manager thought I could walk on water and so I had considerable leeway to change things I thought needed changing.
So one of the first things I did was collectively save the team about 40 hours of code-build-test time per week. Which is really underselling it because what I actually did was both build a CI pipeline at a time nobody knew what "CI" meant, and increase the number of cycles you could reliably get through without staying late from 4 to 5 cycles per day. A >20% improvement in iterations per day and a net reduction in errors. That was the job where I learned the dangers of pushing code after 3:30pm. Everyone rationalizes that the error they saw was a glitch or someone else's bug, and they push and then come in to find the early birds are mad at them. So better to finish what we now call deep work early and do lighter stuff once you're tired.
Edit: those changes also facilitated us scaling the team to over twice the size of any project I'd worked on before or for some time after, though the EM deserves equal credit for that feat.
Then they fired the EM and Peter Principled by far the worst manager I've ever worked for (fuck you Mike, everyone hated your guts), and all he wanted to know was why I was getting fewer features implemented. Because I'm making everyone else faster. Speaking of broken, the biggest performance bottleneck in the entire app was his fault. He didn't follow the advice I gave him back when he was working in our query system. Discovering it took hiring an Oracle DB contractor (those are always exorbitant). Fixing it after it shipped was a giant pain (as to why I didn't catch his corner cutting, I was tagged in by another lead who was triple booked, and when I tagged back out he unfortunately didn't follow up sufficiently on the things I prescribed).
When I'm on the fence about some (technical) decision, I use a "razor": if all options seem equal, go with whichever is the simpler one. The results are ok so far and it has been great for reducing my brain-energy spent on pontification and early optimization too far ahead.
I liked the post, but these kinds of articles make sense mostly to people who've already been through the trenches, view the advice from their seasoned PoV and apply it accordingly. People without such experience who follow it to the letter just because it's written can have surprises ahead.
> A lot of engineers design by trying to think of the "ideal" system: something well-factored, near-infinitely scalable, elegantly distributed, and so on.
Was it Donald Knuth who said "premature optimization is the root of all evil"?
This article made this point very well, especially regarding the obsession with "scaling" in the SaaS world.
I've seen thousands and thousands of developer hours completely wasted, because developers were forced to massively overcomplicate greenfield code in anticipation of some entirely hypothetical future scaling requirement which either never materialized (95% of the time) or which did appear but in such a different form that the original solution only got in the way (remaining 5%).
John Ousterhout's Philosophy of Software Design makes the case for simplicity in a book-length form. I really like how he emphasizes the importance of design simplicity for the maintainability of software; this is where I've seen it matter the most in practice.
My current company is in that 5% part right now. Tremendous effort invested into the system, everyone involved was very proud of themselves. Unfortunately the way we actually needed to scale was almost completely untouched by any of this architecture astronomy, so we have both a terrifically complicated system - very difficult to change things without potential breakage or regression - AND it doesn't scale at all.
I don't mind, I don't blame people for not predicting the future - it's a tough game. But god the hubris and attitude we put up with until the crows came home to roost.
I would be very cautious about giving advice like this to my team. Making a thing simple is actually very hard, and many who hear the words may just equate "simple thing" with "first thing that comes to mind", which may eventually turn into a nightmare of complexity.
Generally speaking, when I hear people say this, it's a huge red flag. Really, any time anyone puts forth any kind of broad proclamation about how software development should be done, my hackles go up. Either they don't know what they're talking about, they're full of shit, or both. The only reasonable thing to conclude after lots of experience with software development is that it's hard and requires care and deliberation. There is no one-size-fits-all advice. What I want to see is people who are open-minded and thoughtful.
Simplicity (meaning the inverse of complexity) is usually the most important factor when considering two possible ways of doing something with software. And this is because it has to be conceived of, pitched to, agreed upon, built, and maintained by humans.
Unfortunately, simplicity is complicated. The median engineer in industry is not a reliable judge of which of two designs is less complex.
Further, "simplicity" as an argument has become something people can parrot. So now it's a knee-jerk fallback when a coworker challenges them about the approach they are taking. They quickly say "This is simpler" in response to a much longer, more sincere, and more correct argument. Ideally the team leader would help suss out what's going on, but increasingly the team lead is a less than competent manager, and simplicity is too complicated a topic for them to give a reliable signal. They prefer not to ruffle feathers and let whoever is doing the work make the call; the team bears the complexity.
"Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better."
Yes, and when it's time to implement something by default, you always choose "your optimal". If you have two options that solve the problem equally well, you always choose the simplest, among other things because it's shorter.
What you really learn over time, and what's more useful, is to think along these lines: don't try to solve problems that don't exist yet.
This is a mantraic, cool headline but useless. The article doesn't develop it properly either in my opinion.
"real mastery often involves learning when to do less, not more. The fight between an ambitious novice and an old master is a well-worn cliche in martial arts movies: the novice is a blur of motion, flipping and spinning. The master is mostly still. But somehow the novice's attacks never seem to quite connect, and the master's eventual attack is decisive".
I was initially annoyed at parts of the article, but it does point out that "hacks" often add hidden complexity that isn't simple, so there is some clarity about the tradeoff.
Now the problem with the headline and repeating it is that when "just do a simple thing" becomes mandated from management (technical or not), there comes a certain stress about trying to keep it simple, and if you try running with it for complex problems you easily end up with those hacks that become innate knowledge that's hard to transfer instead of a good design (that seemed complex upfront).
Conversely, I think a lot of "needless complexity" comes from badly planned projects where people, bitten by having to continuously add hacks to handle wild requirements, easily end up overdesigning something to catch them, only to end up with no more complexity in that area and then playing catchup with the next area needing ugly hacks (to then try to design that area that stabilized and the cycle repeats).
This is why as developers we do need to inject ourselves into meetings (however boring they are) where things that do land up on our desks are decided.
It’s Rich Hickey’s “Simple made Easy” all over again. “Simple” is not the easy path. Simple (or simplex, unbraided) describes an end product with very little interleaving of components. Simplicity is elegant. It takes a lot of hard work to achieve a simple end product.
I see your point, but, taken to the extreme, all it leaves us with is "everything is a trade-off" or "there's no free lunch".
Some generalizations are necessary to formalize the experience we have accumulated in the industry and teach newcomers.
The obvious problem is that, for some strange reason, lots of concepts and patterns that may be useful when applied carefully become a cult (think clean architecture and clean code), which eventually only makes the industry worse.
For example, clean architecture/ports and adapters/hexagonal/whatever, as I see it, is a very sane and pragmatic idea in general. But somehow, all battles are around how to name folders.
I mean, I think I agree more with this sentiment than most. These overly general statements tend to not have much nuance, and do little to incorporate context.
But also keep in mind the audience: the kinds of people who are tempted to use J2EE (at the time) with event sourcing and Semantic Web, etc.
This is really a counterbalance to that: let's not add sophistication and complexity by default. We really are better off when we bias towards the simpler solutions vs one that's overly complex. It's like what Dan McKinley was talking about with "Choose Boring Technology". And of course that's true (by and large), but many in our industry act like the opposite is the case - that you get rewarded for flexing how novel you can make something.
I've spent much of my career unwinding the bad ideas of overly clever devs. Sometimes that clever dev was me!
So yes ... it's an overly general statement that shouldn't need to be said, and yet it's still useful given the tendency of many to over-engineer and use unnecessarily sophisticated approaches when simpler ones would suffice.
I don't think I would go far enough to say that it's generally a red flag...
I see people adding unnecessary complexity to things all the time and advocate for keeping things simple on a daily basis probably. Otherwise designers and product managers and customers and architects will let their mind naturally add complexity to solutions which is unnecessary.
I completely disagree with this being a red flag. It would be a huge green flag for me. The easiest thing to do is to create a complex system, making a simple one is difficult.
Did you read the article? It’s mostly about the nuance of how to apply this philosophy in practice, not a pithy one-size-fits-all statement about all software engineering.
It seems like a lot of people think that the first draft/prototype/whatever has to be perfect.
It doesn't. It never is. It can't be.
My favorite example of this was the Moon shot. Each step was learning how to do just that one step. Mercury was just about getting into orbit, not easy even now with SpaceX though they are standing on the shoulders of those giants. Then Gemini for multiple people and orbital maneuvering (that experience gained them lots of learning) and then Apollo 8 was still a dress rehearsal even though they flew around the Moon.
Each step HAD to be simple because complexity weighed too much. But each of those simple steps was still wildly complex.
Every time I would dive in and code up something that I thought was easy, it would blow up in some weird way. I have found that doing each step individually and getting it right might sound like going really slow, but it was smoother, so it was faster in the end because I wasn't chasing bugs in all the places, but just one.
As someone who has built 0-1 systems at multiple startups (Seed to Series C), I’ve settled on one principle above all else:
“Simple is robust”
It’s easy to over-design a system up front, and even easier to over-design improvements to said system.
Customer requirements are continually evolving, and you can never really predict what the future requirements will be (even if it feels like you can).
Breaking down the principle, it’s not just that a simple system is less error prone; it’s just as important that a simple architecture is easier to change in the future.
Should you plan for X, Y, and Z?
Yes, but counterintuitively, by keeping doors open for the future and building “the simplest thing that could possibly work.”
Complexity adds constraints, and these limitations make the stack more brittle over time, even when planned with the best intentions.
> real mastery often involves learning when to do less, not more
Really love and agree with this, and (shameless plug?) I think really aligns with a way of working I (and some colleagues) have been working on: https://delivervaluedaily.dev/
It nails the value of keeping things simple, and I think the link to cognitive load deserves even more emphasis. In most of the cases, simplifying means reducing the cognitive load. Sometimes consolidating pieces with DRY, sometimes using design patterns, sometimes decomposing things into services...
Every extra detail or workaround increases the number of things you need to keep in your head, not just when building the system, but every time you come back to maintain or extend it. "Simple systems have fewer 'moving pieces': fewer things you have to think about when you're working with them."
Simplicity isn't just about getting the job done quickly; it's about making sure future you (or someone else) can actually understand and safely change the system later. Reducing cognitive load with simplicity pays off long after the job is done.
It's important to understand the difference between easy and simple. It's easy to add complexity, it can sometimes be hard to keep things simple.
But with that in mind, I do agree that a lot of systems are more complex than they need to be. I like to keep things simple.
Of course scalability adds complexity, and sometimes you need that. But you don't always need that, and making things scalable that don't need to be, makes them harder to understand and maintain.
“Everything should be made as simple as possible, but not simpler.”
As someone who has strived for this from early on, the problem the article overlooks is not knowing some of these various technologies everyone is talking about, because I never felt I needed them. Am I missing something I need, but just ignorant, or is that just needless complexity that a lot of people fall for?
I don’t want to test these things out to learn them in actual projects, as I’d be adding needless complexity to systems for my own selfish ends of learning these things. I worked with someone who did this and it was a nightmare. However, without a real project, I find it’s hard to really learn something well and find the sharp edges.
Yes, and I (nearly) live this nightmare. I have someone higher up in the food chain who is fascinated with every new piece of software they find, that MIGHT be useful. We are often tasked with "looking at it, and seeing if it would be useful".
Yeah, let me shoehorn that fishing trip into my schedule without a charge number, along with the one from last week...
I was the go-to guy for this under my former boss, but he let me do pretty much whatever I wanted, so it usually wasn’t an issue to not work on anything else while playing around with new stuff.
Though there was a time when he wanted me to onboard my simple little internal website to a big complicated CICD system, just so we could see how it worked and if it would be useful for other stuff. It wouldn’t have been useful for anything else, and I already had a script that would deploy updates to my site that was simple, fast, and reliable. I simply ignored every request to look into that.
Other times I could tell him his idea wouldn’t work, and he would say “ok” and walk away. That was that. This accounted for about 30% of what he came to me with.
Implement the simplest thing that works, maybe even by hand at first, instead of adding the tool that does "the whole thing" when you don't need "the whole thing".
Eventually you might start adding more things to it because of needs you haven't anticipated, do it.
If you find yourself building the tool that does "the whole thing" but worse, then now you know that you could actually use the tool that does "the whole thing".
Did you waste time not using the tool right from the start? That's almost a philosophical question. Now you know what you need, you had the chance to avoid it if it turned out you didn't, and maybe 9 times out of 10 you will be right.
Such a familiar feeling. Articles similar to this one make lots of sense to me and I do try to embrace simplicity and not optimize prematurely, but very often I have no idea whether it's the praised simplicity and pragmatism or just a lack of experience and skills.
I agree with the spirit of the article, but I think the definition of "simple" has been inverted by modern cloud infrastructure. The examples create a false choice between a "simple but unscalable" system and a "complex but scalable" one. That is rarely the trade-off today.
The in-memory rate-limiting example is a perfect case study. An in-memory solution is only simple for a single server. The moment you scale to two, the logic breaks and your effective rate limit becomes N × limit. You've accidentally created a distributed state problem, which is a much harder issue to solve. That isn't simple.
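To make the "N × limit" effect concrete, here is a rough simulation rather than anything from the article. It assumes a fixed-window counter per process and a round-robin load balancer; `InMemoryLimiter` and `accepted_requests` are hypothetical names invented for this sketch.

```python
from itertools import cycle


class InMemoryLimiter:
    """Fixed-window counter held in one process's memory (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = {}  # user id -> requests seen in the current window

    def allow(self, user):
        seen = self.counts.get(user, 0)
        if seen >= self.limit:
            return False
        self.counts[user] = seen + 1
        return True


def accepted_requests(instances, limit, attempts, user="alice"):
    """Send `attempts` requests for one user through a round-robin balancer."""
    # Assumption: every instance keeps its own private limiter state.
    limiters = [InMemoryLimiter(limit) for _ in range(instances)]
    balancer = cycle(limiters)
    return sum(next(balancer).allow(user) for _ in range(attempts))


one = accepted_requests(instances=1, limit=10, attempts=100)
three = accepted_requests(instances=3, limit=10, attempts=100)
print(one, three)  # prints: 10 30
```

With one instance the user gets 10 requests through; with three instances the same user gets 30, even though nobody changed the configured limit.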
Compare that to using a managed service like DynamoDB or ElastiCache. It provides a single source of truth that works correctly for one node or a thousand. By the author's own definition that "simple systems are stable" and require less ongoing work, the managed service is the fundamentally simpler choice. It eliminates problems like data loss on restart and the need to reason about distributed state.
Perhaps the definition of "the simplest thing" has just evolved. In 2025, it's often not about avoiding external dependencies. You will often save time by leveraging battle-tested managed services that handle complexity and scale on your behalf.
I don't think this is particular to cloud infrastructure. Even on a single server you could make the same argument about using flat file vs sqlite vs postgres for storage. Yes, there is a lot of powerful and reusable software, both managed and unmanaged, with good abstractions and great power to weight ratios where you pay a very small complexity cost for an incredible amount of capability. Such is the nature of software.
But all of it comes with tradeoffs and you have to apply judgement. Just as it would be foolish to write almost anything these days in assembly, I think it would be almost as foolish to just default to a managed Amazon service because it scales without considering whether A) you actually need that scale and B) there are other considerations as to why that service might not be the best technical fit (in particular, I've heard regrets due to overzealous adoption of DynamoDB on more than one occasion).
You make a good point about experience. I've noticed an interesting paradox there.
The engineers who most aggressively advocate for bespoke solutions in the name of "simplicity" often have the least experience with their managed equivalents, which can lead to the regrets you mentioned. Conversely, many engineers who only know how to use managed services would struggle to build the simple, self-contained solution the author describes. True judgment requires experience with both worlds.
This is also why I think asking "do we actually need this scale?" is often the wrong question; it requires predicting the future. Since most solutions work at a small scale, a better framework for making a trade-off is:
* Scalability: Will this work at a higher scale if we need it to?
* Operations: What is the on-call and maintenance load?
* Implementation: How much new code and configuration is needed?
For these questions, managed services frequently have a clear advantage. The main caveat is cost-at-scale, but that’s a moot point in the context of the article's argument.
> A lot of engineers design by trying to think of the “ideal” system: something well-factored, near-infinitely scalable, elegantly distributed, and so on.
> Instead, spend that time understanding the current system deeply, then do the simplest thing that could possibly work.
I'd argue that a fair amount of the former results in the ability to do the latter.
There's a substantial amount of wisdom that goes into designing "simple" systems (simple to understand when reading the code). Just as there's a substantial amount of wisdom that goes into making "simple" changes to those systems.
IMO, the most important thing about this sort of advice (and maybe most advice) is to treat it as a "generally useful heuristic, subject to refinement based on judgment" and not as an "ironclad, immutable law of the kingdom, any transgression from which, will be severely punished".
Sure, try to keep things simple. Unless it doesn't make sense. Then make them less simple. Will you get it wrong sometimes? Yes. Does it matter? Not really. You'll be wrong sometimes no matter what you do, unless you are, in fact, the Flying Spaghetti Monster. You're not, so just accept some failures from time to time and - most importantly - reflect on them, try to learn from them, and expect to be better next time.
Until you get enough experience for your own good judgment, you need some rules of thumb and guidelines from more experienced peers.
As long as you understand that everything is a trade-off and, unfortunately, that the modern field is based on subjective opinions of popular and not necessarily competent people, you will be fine.
I wholeheartedly agree with this. The challenge is perception though. Many managers will see a simple solution to a complex problem and dock you for not doing real engineering, whereas a huge convoluted mess to solve a simple problem (or non-problem) gets you promoted. And in design interviews, "I'd probably implement a counter in memory" would be the last time you ever heard from that company.
This is good advice but it can be difficult to define what simple means. The only technical way I was able to make sense of it is by targeting reducing code entropy and scopes (Inspired by how language models try to minimize Solomonoff/Kolmogorov entropy).
"It’s fun to decouple your service into two pieces so they can be scaled independently (I have seen this happen maybe ten times, and I have seen them actually be usefully scaled independently maybe once)."
Same, or reliability-tiered separately. But in both aspects I more frequently see the resulting system to be more expensive and less reliable.
Looking over the threads running here, it is interesting how differently the article's title/point is being taken, filtered through different commenter situations and experiences.
I don't see it as a blind prescription.
It doesn't imply that choosing what is simple, will be simple. Or that simplest, will be simple. Or that this is a process uniquely immune from problems or tradeoffs.
Just a reminder to never forget to aim for simplest.
A tautological cookie fortune, of something important we often functionally forget or slide on.
There is a lot of wisdom in recognizing and repeating the most important "mantras of the obvious". And listening to them reformulated, in other ways, by other people.
The greatest craftbeings never stop revisiting the basics.
I think a lot of engineers think, “I thought of a complicated set of abstract ideas that mean nothing to anyone else but will demonstrate my superior intellect” and then make that. It took a lot of self control to not curse.
First of all, simplicity is the hardest thing there is. You have to first make something complex, and then strip away everything that isn't necessary. You won't even know how to do that properly until you've designed the thing multiple times and found all the flaws and things you actually need.
Second, you will often have wildly different contexts.
- Is this thing controlling nuclear reactors? Okay, so safety is paramount. That means it can be complex, even inefficient, as long as it's safe. It doesn't need to be simple. It would be great if it was, but it's not really necessary.
- Is the thing just a script to loop over some input and send an alert for a non-production thing? Then it doesn't really matter how you do it, just get it done and move on to the next thing.
- Is this a product for customers intended to solve a problem for them, and there's multiple competitors in the space, and they're all kind of bad? Okay, so simplicity might actually be a competitive advantage.
Third, "the simplest thing that could possibly work" leaves a lot of money on the table. Want to make a TV show that is "the simplest thing that could possibly work"? Get an iPhone and record 3 people in an empty room saying lines. Publish a new episode every week. That is technically a TV show - but it would probably not get many views. Critics saying that you have "the simplest show" is probably not gonna put money in your pocket.
You want a grand design principle that always applies? Here's one: "Design for what you need in the near future, get it done on time and under budget, and also if you have the time, try to make it work well."
> First of all, simplicity is the hardest thing there is. You have to first make something complex, and then strip away everything that isn't necessary.
I don't follow. I've made simple things many times without having to make a complex thing first.
That would make you a genius, so congrats :-) It's more likely that you thought it was simple, but it actually wasn't. Was it actually the least-complex thing that works? Or was it just a thing that worked, and it didn't seem complex? Because those are two different things.
> Want to make a TV show that is "the simplest thing that could possibly work"? Get an iPhone and record 3 people in an empty room saying lines. Publish a new episode every week.
You just described a podcast. It did work for many (obviously it failed for many as well). That's an excellent example of why one should start with the simplest thing that could possibly work. Probably better than the OP's examples.
Podcasts aren't usually scripted episodic content. Some are, but those tend to involve a lot more sound production, and they aren't filmed. If you tried to make filmed scripted episodic content the way you make a podcast, it would be terrible. The exception is improv comedy.
I get what you're saying, but you're also attempting to design the perfect system without any hindsight, which is impossible.
The beauty of this approach is that you don't design anything you don't need. The requirements will change, and the design will change. If you didn't write much in the first place, it's easy.
But designing "the simplest thing that could possibly work" may make it harder than necessary to modify later. (and any time the requirements change after it's built, you're inching towards a big ball of mud, so this whole idea should be reviled whenever possible)
An example is databases. People design their database schemas in incredibly simplistic ways, and then regret it later when the predictable stuff most people need doesn't work with the old schema, and you can't even just add columns, but you have to modify existing ones. Avoid the nightmare by making it reasonably extensible from the start. It may not be "the simplest thing that could possibly work", but it is often useful and doesn't cost you anything extra.
Just as much as people say "don't prematurely optimize", they should also say "don't prematurely make it total crap".
We once rebuilt an old system and went with the simplest thing that could work. It ran great for the first few weeks, but then all kinds of edge cases started creeping in. We ended up spending more time patching things up.
> design the best system for what your requirements actually look like right now
this is the key practical advice. when you start designing for hypothetical use cases that may never happen you are opening up an infinite scope of design complexity. setting hard specifications for what you actually need and building that simplifies the design process, at least, and if you start with that kind of mindset one can hope that it carries over to the implementation.
the simplest things always win because simple is repeatable. not every simple thing wins (many are not useful or have defects) but the winners are always simple.
All too often I see this mantra used to justify doing the “easiest thing that could possibly work.” Which in my experience is not the same thing. They can overlap, and often upstream simplicity can create a downstream effect of ease for consumers. Simplicity often requires some real effort to execute well.
From https://nshipster.com/uncertainty/ recently: "Working in software, the most annoying part of reaching Senior level is having to say “it depends” all the time. Much more fun getting to say “let’s ship it and iterate” as Staff or “that won’t scale” as a Principal."
IIUC, author is a Staff SWE, so this tracks.
See also "Worse is better" which has been debated a million times by now.
Simplicity is brittle. Perhaps we should take a page from nature. Nothing in nature is simple, and yet it has built some of the most robust systems (by far) that we know of.
On the meta level, the simplest thing that could possibly work is usually paying someone else to do it.
Alas, you do not have infinite money. But you can earn money by becoming this person for other people.
The catch 22 is most people aren't going to hire the guy who bills himself as the guy who does the simplest thing that could possibly work. It turns out the complexities actually are often there for good reason. It's much more valuable to pay someone who has the ability to trade simplicity off for other desirable things.
If I was running a business and I could hire someone that I knew did good work, and did the simplest thing that could possibly work (and it actually worked!) - then I would absolutely do that as soon as possible.
"It turns out the complexities actually are often there for good reason" - if they're necessary, then it gets folded into the "could possibly work" part.
The vast majority of complexities I've seen in my career did not have to be there. But then you run into Chesterton's Fence - if you're going to remove something you think is unnecessary complexity, you better be damn sure you're right.
The real question is how AI tooling is going to change this. Will the AI be smart enough to realize the unnecessary bits, or are you just going to layer increasingly more levels of crap on top? My bet is it's mostly the latter, for quite a long time.
"Will the AI be smart enough to realize the unnecessary bits, or are you just going to layer increasingly more levels of crap on top? My bet is it's mostly the latter, for quite a long time."
Dev cycles will feel no different to anyone working on a legacy product, in that case.
Useful principle. But… (sorry to make a simple phrase more complex) the notion just scratches the surface of complexity management.
I appreciate “Philosophy of Software Design” by Ousterhout. I recently read that while rebuilding a text editor. Mind blowing experience. There is a lot of opportunity to more tightly encapsulate logic, to more clearly abstract a system, to keep a system simple yet powerful and extensible. I believe I became twice as good of a developer just by reading a chapter a day and sticking with the workflow.
I always felt software is like physics: Given a problem domain, you should use the simplest model of your domain that meets your requirements.
As in physics, your model will be wrong, but it should be useful. The smaller it is (in terms of information), the easier it is to expand if and when you need it.
> You should do that too! Suppose you’ve got a Golang application that you want to add some kind of rate limiting to...Actually, are you sure your edge proxy doesn’t support rate limiting already? Could you just write a couple of lines in a config file instead of implementing the feature at all?
As I'm doing the simplest thing that could possibly work, I do not have an edge proxy.
Of course, the author doesn't mean _that_ kind of simplicity. There are always hidden assumptions about which pieces of complexity are assumed, and don't count against your complexity budget.
this *tactical* style of development is the same thing propounded by TDD folks. there is no design, just a weirdly glued together mishmash of things that just happen to work.
i am (fwiw once again) not against unit-testing, that is almost always needed.
Not arguing against this article but aren't all these ideas already well known in the industry? Start with an MVP, don't optimize prematurely, avoid writing brittle code (systems, really), abstract implementation details away when possible, and KISS.
This is consistent with Gall's law, which says complex systems that work can only be achieved by building complexity over simple systems. Complex systems built from scratch do not work. So build the simplest system that works and then keep adding complexity to it based on requirements.
It's a pithy philosophy if you already know what it means to "work". You probably don't, especially if your system is human facing. Figuring out what "works" means is almost as difficult as building things in the first place. You may as well commit to building it twice [0].
Very much agree for the type of software I've worked on my whole career. I've seen way more time and energy wasted by people trying to predict the future than fixing bugs. In practice I think it's common to realize something didn't "possibly work" until after it's already deployed, but keeping things simple makes it easy to fix. So this advice also ends up basically being "move fast break things".
A term coined by consultants who don’t understand an industry who basically say “do the least possible thing that will work” because they don’t understand the domain and don’t understand what requirements are often non-negotiable table stakes of complexity you need to compete.
It reminded me of a Martin Fowler post where he was showing implementation of discounts in some system and advocating to just hard code the first discount in the method (literally getDiscount() {return 0.5}).
Even the most shallow analysis would show this was incredibly stupid and short sighted.
But this was the state of the art, or so we were told.
See also Ward Cunningham trying and failing to solve Sudoku using TDD.
The reality is most business domains are actually complex, and the one who best tackles that complexity up front can take home all the marbles.
I think this article actually expresses a dangerous, risk-prone approach to problem solving, and one which ultimately causes more problems than the ones it solves.
The risk is misunderstanding the problems they are solving, and ignoring all the constraints that drove the need for some key design traits that were in place to actually solve the problem (i.e., complexity)
Take the following example from the article:
> You should do that too! Suppose you’ve got a Golang application that you want to add some kind of rate limiting to. What’s the simplest thing that could possibly work? Your first idea might be to add some kind of persistent storage (say, Redis) to track per-user request counts with a leaky-bucket algorithm. That would work! But do you need a whole new piece of infrastructure?
Let's ignore the mistake of describing Redis as persistent storage. The whole reason why rate limiting data is offloaded to a dedicated service is that you want to enforce rate limiting across all instances of an API. Thus all instances update request counts on a shared data store to account for all traffic hitting across all instances regardless of how many they might be. This data store needs to be very fast to minimize micro services tax and is ephemeral. Hence why a memory cache is often used.
And why do "per-user request counts in memory" not work? Because you enforce rate-limiting to prevent brownouts and ultimately denials of service triggered in your backing services. Each request that hits your API typically triggers additional requests to internal services such as memory stores, querying engines, etc. Your external facing instances are scaled to meet external load, but they also create load to internal services. You enforce rate-limiting to prevent unexpected high request rates to generate enough load to hit bottlenecks in internal services which can't or won't scale. If you enforce rate limits per instance, scaling horizontally will inadvertently lift your rate limits as well and thus allow for brownouts, thus defeating the whole purpose of introducing rate limiting.
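A minimal sketch of the shared-counter arrangement argued for above, with heavy assumptions: `SharedCounterStore` is an in-process dictionary standing in for whatever network cache you would really use, `ApiInstance` and `accepted` are hypothetical names, and window expiry is left out to keep it short.

```python
class SharedCounterStore:
    """In-process stand-in for a shared cache (assumption: a real one is a network service)."""

    def __init__(self):
        self.counts = {}

    def increment(self, key):
        # Add one to the key's counter and return the new running total.
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class ApiInstance:
    """One API process; it holds no rate-limit state of its own."""

    def __init__(self, store, limit):
        self.store = store
        self.limit = limit

    def allow(self, user):
        # Count first, then compare: every instance sees the same total.
        return self.store.increment(user) <= self.limit


def accepted(instances, limit, attempts, user="alice"):
    """Spread `attempts` requests for one user across the fleet, round-robin."""
    store = SharedCounterStore()
    fleet = [ApiInstance(store, limit) for _ in range(instances)]
    return sum(fleet[i % instances].allow(user) for i in range(attempts))


print(accepted(instances=1, limit=10, attempts=100))  # prints: 10
print(accepted(instances=5, limit=10, attempts=100))  # prints: 10
```

Because the count lives in one place, adding instances does not raise the effective limit, which is the property the backing services depend on.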
Also, leaky bucket algorithms are employed to allow traffic bursts but still prevent abuse. This is a very mundane scenario that happens on pretty much all services consumed by client apps. Once an app is launched, they typically do authentication flows and fetch data required in app starts and get data, etc. After app inits the app is back to baseline request rates. If you have a system that runs more than a single API instance, requests are spread over instances by a load balancer. This means a user's request can be routed to any instance at an unspecified proportion. So how do you prevent abuse while still allowing these bursts to take place? Do you scale your services to handle peak loads 24/7 to accommodate request bursts from all your active users at any given moment? Or do you allow for momentary bursts spread across all instances, regardless of what instances they hit?
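For readers who have not seen one, a leaky bucket used as a meter can be sketched roughly like this. It is an illustration under assumptions, not the article's or any library's implementation: the `LeakyBucket` class is hypothetical, it tracks a single user, and the clock is passed in as `now` (seconds) so the behaviour is easy to follow.

```python
class LeakyBucket:
    """Leaky bucket as a meter: each request adds 1, the level drains over time."""

    def __init__(self, capacity, leak_per_second):
        self.capacity = capacity                # how large a burst is tolerated
        self.leak_per_second = leak_per_second  # sustained rate allowed
        self.level = 0.0
        self.last = 0.0

    def allow(self, now):
        # Drain whatever leaked out since the previous request.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_per_second)
        self.last = now
        if self.level + 1 > self.capacity:
            return False  # bucket would overflow: reject
        self.level += 1
        return True


bucket = LeakyBucket(capacity=5, leak_per_second=1.0)

# App start: 8 requests arrive at the same instant.
burst = [bucket.allow(now=0.0) for _ in range(8)]

# Later: back to baseline, one request per second.
steady = [bucket.allow(now=float(t)) for t in range(10, 15)]

print(burst)   # prints: [True, True, True, True, True, False, False, False]
print(steady)  # prints: [True, True, True, True, True]
```

The start-up burst is absorbed up to the bucket's capacity and the excess is rejected, while the steady baseline passes untouched. For this to hold across several API instances, the bucket level would have to live in shared storage, which is the point being made above.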
Sometimes a problem can be simple. Sometimes it can be made too simple, but you accept the occasional outage. But sometimes you can't afford frequent outages and you understand a small change, like putting up a memory cache instance, is all it takes to eliminate failure modes.
And what changed in the analysis to understand that your simple solution is no solution at all? Only your understanding of the problem domain.
using unicorn as a positive example is, well, a pretty negative signal
unicorn, i.e. CGI, i.e. process-per-request, became anachronistic, gosh, more than 20 years ago at this point!
at least, if you're serving any kind of meaningful load -- a bash script in a while loop can serve 100RPS on an ec2.micro, that's (hopefully) not what anyone is talking about
It's great advice for an individual or a work item. It's great advice for teams where untamed programmers run rampant. It's good advice when teams are sane.
But, it's terrible for 2025's median software team. I know that isn't OP's intention. But inevitably, all good advice falls prey to misinterpretation.
In contemporary software orgs, build fast is the norm. PMs, managers, sales & leadership want Engg to build something that works, with haste. In this imagination, simple = fast. Let me say this again. No, you cannot convince them otherwise. To the median org, SIMPLE = FAST. The org always chooses the fastest option, well past the point of diminishing returns. Now once something exists, product managers and org leaders will push to deploy it and sales teams will sell dreams around it. Now you have an albatross around your neck. Welcome to life as a fire fighter.
For the health of a company and the oncall's sanity, Engg must tug at the rope in the opposite direction from (perceived) simplicity. The negotiated middle ground can get close to the 'simple' that OP proposes. But today, if Engg starts off with 'simple', they will be rewarded with a 'demo on life support'. At a time when vibe coding is trivial, I fear things will get worse before they get better.
All that being said, OP's intended advice is one that I personally live by. But often, simple is slow. So, I keep it to myself.
Another way I like to think about this is finding 'closeable' contexts to work in; that is, abstractions that are compact and logically consistent enough that you can close them out and take them on their external interface without always knowing the inner details. Metaphorically, your system can be a bunch of closed boxes that you can then treat as boxes, rather than a bunch of open boxes whose contents are spilling out and into each other. Think 'shipping containers' instead of longshoremen throwing loose cargo into your boat.
If you can do this regularly, you can keep the _effective_ cognitive size of the system small even as each closed box might be quite complex internally.
> when I asked [KentBeck], "What's the simplest thing that could possibly work?" I wasn't even sure. I wasn't asking, "What do you know would work?" I was asking, "What's possible? What is the simplest thing we could say in code, so that we'll be talking about something that's on the screen, instead of something that's ill-formed in our mind?" I was saying, "Once we get something on the screen, we can look at it. If it needs to be more, we can make it more.
I wanted to like this article and there's some things in there to agree with but ultimately it's a very uninteresting take with a very unconvincing rate limiting example.
> System design requires competence with a lot of different tools: app servers, proxies, databases, caches, queues, and so on.
Yes! This is where I see so many systems go wrong. Complex software engineering paving over a lack of understanding of the underlying components.
> As they gain familiarity with these tools, junior engineers naturally want to use them.
Hell yea! Understanding how Kafka works so you don't build some crazy queue semantics on it. Understanding the difference between headless and ClusterIP services in Kubernetes so you don't have to build a software solution to the etcd problems you're having.
> However, as with many skills, real mastery often involves learning when to do less, not more. The fight between an ambitious novice and an old master is a well-worn cliche in martial arts movies
Wait what? Surely you mean doing more by writing less code. Are you now saying that learning and using these well tested, well maintained, and well understood components is amateurish?
I appreciate the sentiment, and absolutely think it should be kept in mind more.
But of course, it runs afoul of reality a lot of the time.
I recently got annoyed that the Windows Task scheduler just sometimes... Doesn't fucking work. Tasks don't run, and you get a crazy mystery error code. Never seen anything like it. Drives me nuts. Goddamned Microsoft making broken shit!
I mostly write Powershell scripts for automating my system, so I figure I'll make a task scheduler which uses the C# PSHost to run the scripts, and keep the task configuration in a SQLite database. Use a simple tray app with a windows form and EFCore for SQLite to read and write the task configuration. Didn't take too long, works great. I am again happy, and even get better logging out of failure conditions, since I can trap the whole error stream instead of an exit code.
My wife is starting a business, and I start to think about using the business to also have a software business piece to it. Maybe use the task scheduler as a component in a larger suite of remote management stuff for my wife's business which we sell to similar businesses.
Well. For it to be ready, it's got to be reliable and secure. Have to implement some checks to wait if the database is locked, no biggie. Oh, but what happens if another user running the tray icon somehow can lock the database, I've got to work out how to notify the other user that the database is locked... Also now tasks need to support running as a different user than the service is running under. Now I have to store those credentials somewhere, because they can't go into the SQLite DB. DPAPI stores keys in the user profile, so all this means now I have to implement support for alternative users in the installer (and do so securely, again with the DPAPI, storing the service's credentials).
I've just added a lot of complexity to what should be a pretty simple application, and with it some failure modes. Paying customers want new features, which add more complexity and more concern cases.
This is normal software entropy, and to some extent it's unavoidable.
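As an illustration of the "wait if the database is locked" check described above, here is a minimal sketch. It uses Python's `sqlite3` module rather than the C#/EF Core stack from the comment, and the function name, retry count, and backoff values are assumptions.

```python
import sqlite3
import time


def execute_with_retry(conn, sql, params=(), attempts=5, backoff=0.1):
    """Run one statement, retrying while SQLite reports the database as locked.

    Hypothetical helper: errors other than lock contention are re-raised
    immediately, and the last attempt re-raises whatever it hits.
    """
    for attempt in range(attempts):
        try:
            with conn:  # commits on success, rolls back on an exception
                return conn.execute(sql, params).fetchall()
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            # Exponential backoff before trying again.
            time.sleep(backoff * 2 ** attempt)
```

Even this small helper brings its own decisions with it (how long to wait, what to tell the user when the retries run out), which is the kind of creeping complexity the comment describes.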
Had to make an account for this, but this is almost verbatim what the article is talking about:
Your wife is making a business, and you want to write some code to help.
Then suddenly your requirements balloon to multiple concurrent users, needing to have a system tray icon and then also the ability to take this code and sell it to other people. Wow this project is suddenly complex!
This is just "I need to be able to scale infinitely" written in different words. The complexity comes from wanting a ton of things before they're actually needed (with the wrinkle of wanting to use some previously written scheduler for this project).
Mostly I agree with what the author is saying. But there is a clear distinction between the simplest “system” and doing the simplest thing.
The simplest thing to do is almost always the easiest, but knowing what the easiest thing to do is can be a lot trickier; see JavaScript frameworks.
But I think I disagree with the author’s second axiom:
“2. Simple systems are less internally-connected.”
Creating interfaces is more complex than not, even if it leads to a cleaner design because of interface boundaries. At the least, creating those boundaries adds complexity, and I don’t mean “more effort”. I mean it in the sense that creating functions is more complex than calling “goto”. And it took decades to invent the mechanism needed to call functions, which is probably the next most simple thing.
However, using call stacks and named pointers and memory separation (functions) leads to vastly improved simplicity of the system as the system as a whole grows in complexity.
So in fact, using your own in-memory rate limiter may be a simpler implementation than using Redis, but it also violates the second principle (using clear interfaces leads to simpler systems).
And it turns the author’s first premise, that Gunicorn is simpler than Puma, on its head, because Puma does the equivalent of building its own rate limiter: managing its own memory and using threads instead of processes.
And Gunicorn does the equivalent of using Redis: externalizing the complexity.
What Gunicorn did was simpler to implement (because it relies on an existing isolated architecture: Unix processes and files), but it means it has greater complexity (if you take into account that it needs that whole system to work).
However that system is a brilliant set of reductions in complexity itself, but it runs up against limitations and performance at some point.
Puma takes on itself more complexity to make administering the server less complex and more performant under load. Also, because it is, in a sense, reinventing the wheel, it lacks the distillation of simplicity that is Unix.
So, less internally connected systems are easier to expand and maintain and interface boundaries lead to less complex systems as a whole, but are not, in themselves, less complex.
Limitations in the system that cause performance problems (like Unix processes and function calls) are not necessarily “more simple than can possibly work”, but the implementations of those abstractions are not perfect and could be improved.
Sometimes it’s not clear where to push the complexity, and sometimes it’s not clear what the right abstraction level is; but mostly it’s about making do with the existing architecture you have, and not having the time or resources to fix it. Until the complexity at your level reaches a point that it’s worth adding complexity at a higher level due to being unable to add the right amount of complexity at a lower level.
This is the advice I've been unsuccessfully trying to drill into the heads of developers at a large organisation. Unfortunately, it turns out that the "simplest thing" can be banged out in a couple of days -- mere hours with an AI -- and that just isn't compatible with a career that is made up of 6-month contracting stints. It's much, much more lucrative to drag out every project over years and keep collecting that day-rate.
Many "industry best-practices" seen in this light are make-work, a technique for expanding simple things to fill the time to keep oneself employed.
For example, the current practice of dependency injection with interfaces, services, factories, and related indirections[1] is a wonderful time waster because it can be so easily defended.
"WHAT IF we need to switch from MySQL to Oracle DB one day?" Sure, that... could happen! It won't, but it could.
[1] No! You haven't created an abstraction! You've just done the same thing, but indirectly. You've created a proxy, not a pattern. A waste of your own time and the CPU's time.
Nooo you can't use deterministic math, you gotta use kalman filters and particle filters and factor graphs and spend three months tweaking parameters to get a 5% improvement! /s
Like alright in some situations that's the only thing that could possibly work, but shoving that complexity into every state estimator without even having a way to figure out the actual covariance of the input data is a bit much. Something that behaves in an expected way and works reliably with less accuracy beats a very complex system that occasionally breaks completely imo.
Claude is great at this! If you avoid refactoring at all costs and put everything as close as possible to the relevant code, you maximise the chance it works and minimise pesky DRY complexities.
I think this works in simple domains. After working in big tech for a while, I am still shocked by the required complexity. Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
Anyone proclaiming simplicity just hasn't worked at scale. Even rewrites that have a decade-old code base to be inspired by often fail due to the sheer amount of things to consider.
A classic, Chesterton's Fence:
"There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”"
This is the classic misunderstanding where software engineers can't seem to communicate well with each other.
We can even just look at the title here: Do the simplest thing POSSIBLE.
You can't escape complexity when a problem is complex. You could certainly still complicate it even more than necessary, though. Nowhere in this article is it saying you can avoid complexity altogether, but that many of us tend to over-complicate problems for no good reason.
> We can even just look at the title here: Do the simplest thing POSSIBLE.
I think the nuance here is that “the simplest thing possible” is not always the “best solution”. As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database. At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
Complexity is more than just the code or the infrastructure; it needs to run the entire gamut of the solution. That includes looking at the incidental complexity that goes into scaling, operating, maintaining, and migrating (if a temporary “too simple but fast to get going” stack was chosen).
Measure twice, cut once. Understand what you are trying to build, and work out a way to get there in stages that provide business value at each step. Easier said than done.
Edit: Replies seem to be getting hung up over the “DB” reference. This is meant to be a hypothetical where the reader infers a scenario of a technology that “can solve all problems, but is not necessarily the best solution”. Substitute for “writing files to the file system” if you prefer.
Right, and again this is reading too much into it. The simplest thing possible does not mean the best solution. If your solution that worked really well yesterday no longer scales today, it's no longer the correct solution and will require a more complex one.
But sometimes it IS better to think a few steps ahead, rather than building a new system from scratch every time things scale up. It's not always easy to upgrade things incrementally: just look at IPv4 vs IPv6
IPv6 is arguably a good example of what happens when you don't do the simplest thing possible. What we really needed was a bigger IP address space. What we got was a whole bunch of other crap. If we had literally expanded IPv4 by a couple of octets at the end (with compatible routing), would we be there now?
I agree with thinking a few steps ahead. It is particularly useful in case of complex problems or foundational systems.
Also maybe simplicity is sometimes achieved AFTER complexity, anyway. I think the article means a solution that works now... target good enough rather than perfect. And the C2 wiki (1) has a subtitle '(if you're not sure what to do yet)'. In a related C2 wiki entry (2) Ward Cunningham says: Do the easiest thing that could possibly work, and then pound it into the simplest thing that could possibly work.
IME a lot of complexity is due to integration (in addition to things like scalability, availability, ease of operations, etc.) If I can keep interfaces and data exchange formats simple (independent, minimal, etc.) then I can refactor individual systems separately.
1. https://wiki.c2.com/?DoTheSimplestThingThatCouldPossiblyWork
2. https://wiki.c2.com/?SimplestOrEasiest
>But sometimes it IS better to think a few steps ahead
The trouble is by the time you get there you will discover the problem isn't what you expected and it will all have been wasted effort.
https://en.wikipedia.org/wiki/You_aren't_gonna_need_it
Yes sometimes. But how can you know beforehand? It’s clear in hindsight, for sure.
The most fundamental issue I have witnessed with these things is that people have a very hard time taking a balanced view.
For this specific problem, should we invest in a more robust solution which takes longer to build or should we just build a scrappy version and then scale later?
There is no right or wrong. It depends heavily on the context.
But, some people, especially developers I am afraid, only have one answer for every situation.
It can be hard enough to fix things when some surprise happens. Unwinding complicated “future proof” things on top of that is even worse. The simpler something is, the less you hopefully have to throw away when you inevitably have to.
Is the simplest thing possible still the DB? Yes, that's why Google spent a decent amount of resources building out Spanner: for many business domains, even at hyperscale, it's still the DB.
> At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?
Don't worry, the second half of the title has this covered:
> ... that could possibly work
In the scenario you've described, the technology is not working, in the complete sense including business requirements of reasonable operating costs.
Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc. I'd tend to agree, but with the caveat that you should feel free to break the rule so long as you're doing it consciously. But none of that implies that you should end up in the situation you described.
> Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc.
This is where I am arguing nuance. These decisions are contextual; and the superficially more complicated solution may be solving inherent complexity in the problem space that only provides benefit over a time period.
As an example, some team might decide to forgo a database and read/write directly to the file system. This may enable a release in less time and that might be the right decision in certain contexts. Or it could be a terrible decision as the externalised costs begin to manifest and the business fails because of loss of customer trust.
My point is that you cannot only look at what is right in front of you, you also need to tactically plan ahead. In the big org context, you also need to strategically plan ahead.
Consider for example, computerizing a currently-manual process. And the 80/20 rule.
Do you handle one "everything is perfect" happy path, and use a manual exception process for odd things?
Do you handle "most" cases, which is more tech work but shrinks the number of people you need handling one-off things?
Or do you try to computerize everything no matter how rare?
My favourite example of this from my own career... automating timesheet -> payroll processing in a unionized environment. As we're converting the collective bargaining agreement into code, we discover that there are a pair of rules that seem contradictory. Go talk to someone in the payroll department to try to figure out how it's handled. Get an answer that makes decent sense, but have a bit of a lingering doubt about the interpretation. Talk to someone else in the same department... they tell us the alternative interpretation.
Bring the problem back to our primary contact and they've got no clue what to do. They're on like year 2 of a 7 year contract and they've just discovered that their payroll department has been interpreting the ambiguous rules somewhat randomly. No one wants to commit to an interpretation without a memorandum of understanding from the union, and no one wants to start the process of negotiating that MoU because it's going to mean backdating 2 years of payroll for an unknown number of employees, who may have been affected by it one month but not the next, depending on who processed their paystub that month.
That was fun :D
> I think the nuance here is that “the simplest thing possible” is not always the “best solution”.
The programmer's mind is the faithful ally of the perfect in its war waged against the good enough.
The "best" solution for most people that have a problem is the one they can use right now.
I have worked at too many companies where the effort spent not using a simple database was an exponential drag on everything.
Hell I just spent a week doing something which should've taken 5 minutes because rather than a settings database, someone has just been maintaining a giant ball of copy+pasted Terraform code instead.
Yes. I like to distinguish between “complex” (by nature) and “complicated” (by design)
The distinction you make is known to me as natural complexity (the base level due to the nature of the domain) and accidental complexity (that which is added unnecessarily on top of it).
Your definition rubs up against what a UX designer taught me years ago, which is that simple and complex are one spectrum, similar to but different from easy and hard.
Often, simple is confused for easy, and complex for hard. However, simple interfaces can hide a lot of information in unintuitive ways, while complex interfaces can present more information and options up front.
> We can even just look at the title here: Do the simplest thing POSSIBLE.
I think you're focusing on weasel words to avoid addressing the actual problem raised by OP, which is the elephant in the room.
Your limited understanding of the problem domain doesn't mean the problem has a simple or even simpler solution. It just means you failed to understand the needs and tradeoffs that led to complexity. Unwittingly, this misunderstanding originates even more complexity.
Listen, there are many types of complexity. Among which there is complexity intrinsic to the problem domain, but there is also accidental complexity that's needlessly created by tradeoffs and failures in analysis and even execution.
If you replace an existing solution with a solution which you believe is simpler, odds are you will have to scramble to address the impacts of all tradeoffs and oversights in your analysis. Addressing those represents complexity as well, complexity created by your solution.
Imagine a web service that has autoscaling rules based on request rates and computational limits. You might look at request patterns and say that this is far too complex, you can just manually scale the system with enough room to handle your average load, and when required you can just click a button and rescale it to meet demand. Awesome work, you simplified your system. Except your system, like all web services, experiences seasonal request patterns. Now you have schedules and meetings and even incidents that wake up your team in the middle of the night. Your pager fires because a feature was released and you didn't quite scale the service to accommodate the new peak load. So now your simple system requires a fair degree of hand holding to work with any semblance of reliability. Is this not a form of complexity as well? Yes, yes it is. You didn't eliminate complexity, you only shifted it to another place. You saw complexity in autoscaling rules and believed you eliminated that complexity by replacing it with manual scaling, but you only ended up shifting that complexity somewhere else. Why? Because it's intrinsic to the problem domain, and requiring more manual work to tackle that complexity introduces more accidental complexity than what is required to address the issue.
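A toy comparison of the two approaches described above, as a sketch only. The hourly request rates, the per-instance capacity, and the function name are all made-up assumptions, and the rule is a plain "load divided by capacity" calculation rather than any particular cloud provider's autoscaler.

```python
import math


def desired_instances(request_rate, per_instance_rps,
                      min_instances=2, max_instances=50):
    """Instances needed for the observed request rate, clamped to limits."""
    needed = math.ceil(request_rate / per_instance_rps)
    return max(min_instances, min(max_instances, needed))


# Hypothetical hourly request rates (requests/second) with one seasonal peak.
hourly_rates = [200, 300, 1200, 400]
PER_INSTANCE_RPS = 100  # assumed capacity of a single instance

# Manual scaling: size once for the average load and leave it alone.
average = sum(hourly_rates) / len(hourly_rates)
fixed = desired_instances(average, PER_INSTANCE_RPS)

# Rule-based scaling: re-evaluate the same formula every hour.
scaled = [desired_instances(rate, PER_INSTANCE_RPS) for rate in hourly_rates]

# Hours in which the fixed fleet cannot absorb the load (someone gets paged).
overloaded_hours = [rate for rate in hourly_rates
                    if rate > fixed * PER_INSTANCE_RPS]
```

With these numbers the fleet sized for the average cannot serve the peak hour, which is the point at which the "simple" manual process turns into schedules, meetings, and pages.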
I remember reviewing some code of an engineer I was managing at a FAANG. Noticed an edge case. Pointed out I thought if/when that hit, it was going to cause an alarm that would page on-call. He suggested it might be OK to ship because it was "about a one in a million chance of being hit". The service involved did 500,000 TPS. "So, just 30 times a minute, then?"
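The arithmetic behind that reply, spelled out as a short sketch (the variable names are just for illustration):

```python
tps = 500_000            # requests per second handled by the service
odds = 1_000_000         # "one in a million" chance per request

expected_per_second = tps / odds          # 0.5 hits per second
expected_per_minute = tps * 60 / odds     # 30 hits per minute
```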
And you're right about the amount of engineering that goes into solving problems. One service adjacent to my patch was more than a decade old. Was on a low TPS but critical path for a key business problem. Had not been touched in years. Hadn't caused a single page in that decade, just trudged along, really solidly well engineered service. Somebody suggested we re-write it in a modern architecture and language (it was a kind of mini-monolith in a now unfashionable language). Engineering managers and principals all vetoed that, thank goodness - would have been 5+ years of pain for zero upside.
Accidental complexity is a thing, YAGNI is a thing, tech debt caused complexity is a thing, I’m a foo programmer let me write bar code like it’s foo is a thing. I don’t know if it's all high quality needed
I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.
Complexity is a learned engineering approach - it takes practice to learn to do it another way. So if all you see is complex solutions how would you learn otherwise?
> I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.
I have worked at scale. I have found examples where simple solutions prevail due to inertia and an inability or unwillingness to acknowledge that the simple solutions failed to adequately address the requirements. The accidental complexity created by those simple solutions is downplayed, as addressing it would require reevaluating the simple solution, and thus run books and operations and maintenance are required as part of your daily operations because that's how the system is. And changing it would be too costly.
Let's not fool ourselves.
> I have worked at scale
Yep, this is why it’s a silly comment to make. Now we are where we are if we didn’t qualify the conversation as being for “big scale engineers” only.
How did those replacements go? Or were you just hoping for the opportunity?
You are not wrong, but the source of the problem may not be the domain but poor software design.
If the software base is full of gotchas and unintended side-effects then the source of the problem is in unclean separation of concerns and tight coupling. Of course, at some point refactoring just becomes an almost insurmountable task, and if the culture of the company does not change more crap will be added before even one of your refactorings land.
Believe me, it's possible to solve complex problems by clean separation of concerns and composability of simple components. It's very hard to do well, though, so lots of programmers don't even try. That's where you need strict ownership of seniors (who must also subscribe to this point of view).
> If the software base is full of gotchas and unintended side-effects then the source of the problem is in unclean separation of concerns and tight coupling.
Do you know how you get such a system? When you start with a simple system and instead of redesigning it to reflect the complexity you just keep the simple system working while extending it to shoehorn the features it needs to meet the requirements.
We get this all the time, especially when junior developers join a team. Inexperienced developers are the first ones complaining about how things are too complex for what they do. More often than not that just reflects opinionated approaches to problem domains they are yet to understand. Because all problems are simple once you ignore all constraints and requirements.
> then the source of the problem is in unclean separation of concerns and tight coupling
Sometimes the problem is in the edges (the way the separate concerns interact), not in the nodes. This may arise, for example, where an operation/interaction between components isn't idempotent because the need for it to be idempotent never came up.
What, you mean like creating a transaction where if one component does something then the second component fails, the first one should revert?
Again, wrong design. Like I said, it's very difficult to do well. Consider alternate architecture: one component adds the bulk data to request, the second component modifies it and adds other data, then the data is sent to transaction manager that commits or fails the operation, notifying both components of the result.
Now, if the first component is one k8s container already writing to the database and second is then trying to modify the database, rearchitecting that could be a major pain. So, I understand that it's difficult to do after the fact. Yet, if it's not done that way, the problem will just become bigger and bigger. In the long run, it would make more sense to rearchitect as soon as you see such a situation.
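A minimal sketch of the alternate architecture described above, assuming a single SQLite database and hypothetical component and table names (`orders`, `audit`). Neither component writes to the database itself; each only adds statements to a shared request, and a transaction manager commits or fails the whole request and notifies both.

```python
import sqlite3


class TransactionManager:
    """Applies every statement in a request atomically, then reports the result."""

    def __init__(self, conn):
        self.conn = conn
        self.listeners = []

    def subscribe(self, listener):
        # Components register a callback that receives True or False.
        self.listeners.append(listener)

    def commit(self, request):
        try:
            with self.conn:  # one transaction: all statements or none
                for sql, params in request:
                    self.conn.execute(sql, params)
            ok = True
        except sqlite3.Error:
            ok = False
        for listener in self.listeners:
            listener(ok)
        return ok


def first_component(request, order_id, payload):
    # Adds the bulk data to the request without touching the database.
    request.append(("INSERT INTO orders (id, payload) VALUES (?, ?)",
                    (order_id, payload)))


def second_component(request, order_id, note):
    # Adds its own data to the same request.
    request.append(("INSERT INTO audit (order_id, note) VALUES (?, ?)",
                    (order_id, note)))
```

If the second component's statement fails, the first component's insert is rolled back with it, so there is nothing to revert by hand.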
The author is a staff engineer at GitHub. I don't think they haven't worked at scale
He has worked there for 2 years at staff level. This is the same as me (staff swe with more YoE than this guy in a lot more varied roles) professing about how all the things that are implemented are simple at my new company, which scans 1 billion objects a day, because I didn’t fucking write them
The guy is full of shit.
Look at his other blog spam
The formula for prioritizing is literally this simple: Am I working on the most important thing right now? If not, drop what I’m doing and go do that
Utter trash.
Look at his CV. Tiny (but impactful) features ///building on existing infrastructure which has already provably scaled to millions and likely has never seen beneath what is a rest api and a react front end///
I know this type. I AM him. Exaggerating my way through roles saying the right things through self promotion at the right times.
> I’ve also written Python and C in production
Absolute mistruth. A single line edit to existing applications/a pet project CGI server.
This is EXACTLY what I do.
Appreciate the hustle, but don’t assume “because github + writes blog = knows things”
I personally know and have (tangentially) worked with the guy and none of what you’ve said is true.
> Look at his CV. Tiny (but impactful) features ///building on existing infrastructure which has already provably scaled to millions and likely has never seen beneath what is a rest api and a react front end///
Off the top of my head he wrote the socket monitoring infrastructure for Zendesk’s unicorn workers, for example.
> Zendesk
I DON'T know the guy, but Zendesk isn't a flex IMO
Man, who hurt you?
I certainly don’t agree with everything Sean says and admit that “picking the most important work” is a naive thing to say in most scenarios.
But writing Python in production is trivial. Why would anyone lie about that? C is different OTOH. But just because you do a single config change and get paid for that doesn’t mean it’s true for everyone.
Also, staff at GitHub requires a certain bar of excellence. So I wouldn’t blindly dismiss everything just out of spite.
Though in my previous job, a huge amount of complexity was due to failed, abandoned, or incomplete attempts to refactor/improve systems, and I frequently wondered, if such things had been disallowed, how much simpler the systems we inherited would have been.
This isn't to say you should never try to refactor or improve things, but make sure that it's going to work for 100% of your use cases, that you're budgeted to finish what you start, and that it can be done iteratively with the result of each step being an improvement on the previous.
Every refactor attempt starts with the intention of 100% coverage.
No one can predict how efficacious that attempt will be from the get-go. Eventually, often people find out that their assumptions were too naive or they don’t have enough budget to push it to completion.
Successful refactoring attempts start small and don’t try to change the universe in a single pass.
Sure, but do some due diligence. I just say that because I've seen a couple cases where someone does a hack week project that introduces some new approach that "makes things so much cleaner". But then after spending a couple months productionizing it and rolling out the first couple iterations to prod amid much fanfare, it becomes evident that while it makes some things easier (oftentimes things that weren't all that hard to begin with), it makes other things a lot harder. So then you're stuck: do you keep pushing even though it's a net negative, do you roll back and lose all that work, or do you stall and leave a two-headed system?
In most of these cases, a few days up front exploring edge cases would have identified the problems and likely would have red lighted the project before it started. It can make you feel like a party pooper when everyone is excited about the new approach, but I think it's important that a few people on the team are tasked with identifying these edge cases before greenlighting the project. Also, maybe productionize your easiest case first, just to get things going, but then do your hardest case second, to really see if the benefits are there, and designate a go/rollback decision point in your schedule.
Of course, such problems can come up in any project, but from what I've seen they tend to be more catastrophic in refactoring/rearchitecting projects. If nothing else, because while unforeseen difficulties can be hacked around for new feature launches, hacking around problems completely defeats the purpose of a refactoring project.
The problem isn't refactoring; it's that the refactor was failed, abandoned, or incomplete.
And that's usually because the person or small group that began the refactor wasn't given the time and resources to do it, uninterested or unknowledgeable people hijacked and overcomplicated the process, and others blocked it from happening. What would have taken the initial team a few weeks to complete successfully, with a little help and cooperation from others and without being pulled in 10 different directions to fight other fires, instead dragged on for months and months, burning tons of time and money on people mucking it up instead of fixing it. The refactor got abandoned, a million dollars was wasted, and the system as a whole was worse than it was before.
At least half the time, the complexity comes from the system itself, echoes of the organizational structure, infrastructure, and not the requirements or problem domain; so this advice will/should be valid more often than not.
Right, but you can't expect perfect implementation; as the complexity of the business needs grows, so does the accidental complexity.
> the organizational structure, infrastructure
Those are things that matter and can't be brushed away though.
What Conway's law describes is also optimization of the software to match the shape in which it can be developed and maintained with the least friction.
Same for infra: complexity induced by it shouldn't be simplified unless you also simplify/abstract the infra first.
Conway wasn't prescribing a goal, he was describing a problem.
I was one of the original engineers of DFP at Google and we built the systems that send billions of ads to billions of users a day.
The complexity comes from the fact that at scale, the state space of any problem domain is thoroughly (maybe totally) explored very rapidly.
That's a way bigger problem than system complexity, and pretty much any system complexity is usually the result of edge cases that need to be solved, rather than bad architecture, infrastructure or organisational issues. Those problems are only significant at smaller, inexperienced companies; by the time you are post-scale (if the company survives that long), state space exploration in implementation (features, security, non-stop operations) is where the complexity is.
My rule on edge cases is: It's OK to not handle an edge case if you know what's going to happen in that case and you've decided to accept that behavior because it's not worth doing something different. It's not OK to fail to handle an edge case because you just didn't want to think about it, which quite often is what the argument for not handling it boils down to. (Then there are the edge cases you didn't handle because you didn't know they existed, which are a whole other tragicomedy.)
Not directly related to the article we're discussing here, but, based on your experience, you might be the ideal kind of person to answer this.
At the scale you are mentioning, even "simple" solutions must be very sophisticated and nuanced. How does this transformation happen naturally from an engineer at a startup where any mainstream language + Postgres covers all your needs, to someone who can build something at Google scale?
Let's disregard the grokking of system design interview books and assume that system design interviews do look at real skills instead of learning common buzzwords.
Demonstration of capability will get you hired, capability comes only through practice.
I built a hobby system for anonymously monitoring BitTorrent by scraping the DHT. In doing this, I learned how to build a little cluster and how to handle 30,000 writes a second (which I used Cassandra for; this was new to me at the time), then built simple analytics on it to measure demand for different media.
Then my interview was just talking about this system: how the data flowed, where it could be improved, how redundancy was handled. The system consisted of about 10 different microservices, so I pulled the code up for each one and showed them.
Interested in astronomy? Build a system to track every star/comet. Interested in weather? Do SOTA predictions. Interested in geography? Process the open source global gravity maps. Interested in trading? Build a data aggregator for a niche.
It doesn't really matter whether whatever you build is "the best in the world" or not. The fact that you built something, practiced scaling it with whatever limited resources you have, were disciplined enough to take it to completion, and didn't get stuck down some rabbit hole endlessly re-architecting stuff that doesn't matter: this is what they're looking for. Good judgement, discipline, experience.
Also attitude is important, like really, really important. Some cynical ranter is not going to get hired over the "that's cool, I can do that!" person, even if the cynical ranter has greater engineering skills; genuine enthusiasm and genuine curiosity are infectious.
If it's a legacy system, then it lives at the edges. The edges are everything.
I wish I could remember or find the proof, but in a multi-dimensional space, as the number of dimensions rises, the highest probability is for points to be located near the edges of the system -- with the limit being that they can be treated as if they all live at the edges. This is true for real systems too -- the users have found all of the limits but avoid working past them.
The system that optimally accommodates all of the edges at once is the old system.
You don't need a complicated proof, just assume a distribution in some very high number of dimensions, with samples from that distribution having randomly generated values from the distribution for each dimension. If you have ~300 dimensions, then you'd statistically expect about one dimension to be ~3SD from the mean, i.e. "on the edge," and as long as any one dimension is close to an edge, we define a point as being "near the edge."
It's not really meaningful though, at high dimensions you want to consider centrality metrics.
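To put a number on the ~300-dimension argument above, here is a minimal sketch. It assumes every coordinate is an independent standard normal (my assumption, not something stated in the thread) and treats "on the edge" as more than 3 standard deviations from the mean. The function names are made up for illustration.

```python
import math
import random


def prob_any_beyond(dims: int, sd: float = 3.0) -> float:
    """Probability that at least one of `dims` independent standard normal
    coordinates lies more than `sd` standard deviations from the mean."""
    # Two-sided tail probability for a single coordinate: P(|z| > sd).
    p_one = math.erfc(sd / math.sqrt(2))
    return 1.0 - (1.0 - p_one) ** dims


def simulate(dims: int, sd: float = 3.0, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(abs(rng.gauss(0.0, 1.0)) > sd for _ in range(dims)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for d in (10, 300, 1000):
        print(d, round(prob_any_beyond(d), 3))
```

Under those assumptions, a bit more than half of the points in 300 dimensions have at least one coordinate past 3SD, and the share keeps climbing as dimensions are added, which matches the "everything lives near an edge" intuition.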
When the domain is complex, it's even MORE important that the individual components be simple with clean interfaces between them. If everything is too intertwined, you lose the ability to make changes or add new functionality without accidentally breaking something else.
As for Chesterton's Fence, you have the causality backwards. You should not build a fence or gate before you have a need for it. However, when you encounter an existing fence or gate, assume there must have been a very good reason for building it in the first place.
This is where John Gall's Systemantics comes into play, "A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system."
Obviously a bit hyperbolic, but matches my experience.
> Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.
The amount of knowledge required to first generate the codebase, that is now missing for the rewrite, is the elephant in the room for rewrites. That's a decade of decision making, business rules changing, knowledge leaving when people depart etc.
Much like your example, if you think all the information is in the codebase then you should go away and start talking to the business stakeholders until you understand the scope of what you don't currently know.
> Anyone proclaiming simplicity just hasnt worked at scale.
Most projects don't operate at scale. And before "at scale", simple, rewritable code will always evolve better, because it's less dense, and less spread out.
There is indeed a balance between the simplest code, and the gradual abstractions needed to maintain code.
I worked with startups, small and medium sized businesses, and with a larger US airline. Engineering complexity is through the roof, when it doesn't have to be. Not on any of the projects I've seen and worked on.
Now if you're an engineer in some mega corp, things could be very different, but you're talking about the 1% there. If not less.
every complex domain and "at scale" is just a bunch of simple things in disguise… our industry is just terrible in general about breaking things down. we sort of know this so we came up with shit things like "microservices" but you spend sufficient time in the industry (almost three decades for me) and you won't find a single place that has microservices architecture that you haven't wished was a monolith :) we are just terrible at this… there is no complex domain, it is just a good excuse we use to justify things
Oh boy, this is the best example of "I have been doing it the same way for 30 years" I have ever seen in the world wild web
Google and Amazon were doing things at roughly the same scale* 20 years ago on slower hardware and less of it.
* They might be serving twice as much (but definitely not ten times as much) as they were in 2005 but mostly that scales horizontally very easily.
The problem with this is no one can agree about what "at scale" means.
Like yes, everyone knows that if you want to index the whole internet and have tens of thousands of searches a second there are unique challenges and you need some crazy complexity. But if you have a system that has 10 transactions a second...you probably don't. The simple thing will probably work just fine. And the vast majority of systems will never get that busy.
Computers are fast now! One powerful server (with a second powerful server, just in case) can do a lot.
Yeah, we do 100k ML inferences per second. It's not a single server, but the architecture isn't much more complicated than that.
With today's computers, indexing the entire internet and serving 100k QPS also isn't really that demanding architecturally. The vast majority of current implementation complexity exists for reasons other than necessity.
Yep, vertical scaling goes a long way. But it's not compute where the bottleneck for scale lies, but rather resiliency and availability.
So although a single server goes a long way, to hit that sweet 99.999% SLA, people scale horizontally way before hitting the maximum compute capacity of a single machine. HA makes everything way more difficult to operate and reason about.
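For a sense of why a "five nines" target pushes people toward redundancy so early, here is a rough sketch of the downtime budget each availability level allows. It assumes a 365-day year and that all downtime counts against the SLA; the function name is hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days * 24 * 60


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min/year")
```

At 99.999% the budget is a little over five minutes a year, which is hard to meet with one machine plus a standby that needs manual failover.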
I had an engineering boss who used this as a mantra (he is now an SVP of engineering at Spotify and we worked together at Comcast)
I think the unspoken part here is "let's start with…"
It doesn't mean you won't have to "do all the things" so much as let's start with too little so we don't waste time doing things we end up not needing.
Once you aggregate all the simple things you may end up with a complex behemoth but hopefully you didn't spend too much time on fruitless paths getting there.
The point is to not overengineer. This is not about ignoring scale, or not considering edge cases. Don't engineer for scale that you don't even know is necessary if that complicates the code. Do the simplest thing that meets the current requirements, but write the code in such a way that more features, scale etc. can be added without disrupting dependencies.
See also: Google engineering practices: https://google.github.io/eng-practices/review/reviewer/looki...
And also: https://goomics.net/316
First of all, I don't disagree. Just wanted to add that "the simple thing" is often not the obvious thing to do, and only becomes apparent after working on it for a while. Oftentimes, when you dive into a set of adjacent functionality, you discover that it barely even works, and does not actually do nearly all the things you thought it did.
Yes. The simple thing is not necessarily the obvious thing or the most immediately salient thing. First explore the problem-solution space thoroughly, THEN choose the simple thing
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
Is this really because the single problem is inherently difficult, or because you're trying to solve more than one problem (scope creep) due to a fear of losing revenue? I think a lot of complexity stems from trying to group disparate problems as if they can have a single solution. If you're willing to live with a smaller customer base, then simple solutions are everywhere.
If you want simple solutions and a large customer base, that probably requires R&D.
I am deep in one such corporate complexity, yet I constantly see an ocean of items that could have been done in a much simpler and more robust way.
Simple stuff has tons of long-term advantages and benefits: it's easy to ramp up new folks on it compared to some over-abstracted hypercomplex system built because some lead dev wanted to try new shiny stuff for their CV or out of boredom. It's easy to debug, migrate, evolve and just generally maintain, something pure devs often don't care much for until they become more senior.
Complex optimizations are for sure required for extreme performance or massive public web but that's not the bulk of global IT work done out there.
This could also point to the solution of cutting down the complexity of "big tech". So much of that complexity isn't necessary because it solves problems, it just keeps people employed.
This is a horrifically cynical take and I wish it would stop. I doubt very seriously there is any meaningfully sized collection of engineers who introduce things "just to keep themselves employed," to say nothing of having to now advance that perspective into a full blown conspiracy because code review is also a thing
What is far more likely is the proverbial "JS framework problem:" gah, this technology that I read about (or encounter) is too complex, I just want 1/10th that I understand from casually reading about it, so we should replace it with this simple thing. Oh, right, plus this one other thing that solves a problem. Oh, plus this other thing that solves this other problem. Gah, this thing is too complex!
I don't agree with the phrasing, but there is certainly a ton of complexity introduced because of engineers who are trying to be promoted or otherwise maintain their image of being capable of solving complex problems (through complex solutions).
It's not the same as introducing complexity to keep yourself employed, but the result is the same and so is the cause - incentive structures aren't aligned at most companies to solve problems simply and move on.
I realized that I should have asked for an example of "too complex" because I may not be following the arguments because my definition of a thing that is "too complex" almost certainly doesn't align with someone else's. In fact, I'd bet that if you rounded up 10 users from this site and polled them for something they thought was "too complex" the intersection would be a very, very small set of things
I'd recommend reading Bullshit Jobs by David Graeber. Most jobs in most organisations have an incentive structure for an individual to keep themselves employed rather than to actually solve problems.
He's an anarchist so it's not a surprise that he's grinding out the same old tropes about organizations
I'm with you that the world in general is filled with bullshit jobs, but I do not subscribe to the perspective of wholesale bullshit jobs in the cited "big tech," since in general I do not think that jobs which have meaningful ways to measure them easily fall into bullshit. Maybe middle managers?
Do you reckon the KPIs and performance indicators used in big tech count as meaningful ways to measure performance? Wouldn't someone implementing a complex resume-driven project score highly on these measurements, despite a simpler solution being correct? I am not sure that job-hopping every 18 months to maximise TC (i.e. optimise against your incentives) is a great way to learn about long-term design and organisational implications.
I'm not saying that these jobs are bullshit in the same way that a VP of box-ticking is, just that it's not a conspiracy that a cathedral based on 'design-doc culture' might produce incentives that result in people who focus on maximising their performance on these fiscally rewarding dot points, rather than actualising their innate belief in performant and maintainable systems.
I work at a start-up so if my code doesn't run we don't get paid. This motivates me to write it well.
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.
You're doing it wrong. More likely than not.
> Anyone proclaiming simplicity just hasnt worked at scale. Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.
Or, you're just used to excusing complexity because your environment rewards complexity and "big things".
Simple is not necessarily easy. Actually simple can be way harder to think of and push for, because people are so used to complexity.
Yes. Massive scale and operations may make things harder, but seeking simplicity is still the right choice, and "working in big tech" is not a particularly hard or rare credential on HN. Try an actual argument instead of an appeal to self authority.
>> Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider
A rewrite of a decade old code base is not the simplest thing that could possibly work.
That principle is valid also outside the software engineering domain. Except for German engineers...
It's a shame he doesn't give the origin of this expression in programming. It comes from Ward Cunningham (inventor of the wiki) in his work with Kent Beck. In an interview a few years back on Dr. Dobb's, he stated that as the two of them were coding together in the late 80s, they would regularly remind each other of the principle. Eventually, it became a staple of their talks and writing.
They were cognizant of the limitations that are touched on in this article. The example they gave was of coming to a closed door. The simplest thing might be to turn the handle. But if the door is locked, then the simplest thing might be to find the key. But if you know the key is lost, the simplest thing might be to break down the door, and so on. Finding the simplest thing is not always simple, as the article states
IIRC, they were aware that this approach would leave a patchwork of technical debt (a term coined by Cunningham), but the priority on getting code working overrode that concern at least in the short term. This article would have done well to at least touch on the technical debt aspect, IMHO.
Kent Beck went on to formalize Extreme Programming, which is a collection of practices for allowing simple systems to evolve as requirements change.
Here's the Extreme Programming manifesto from that time. Very similar sentiment from around two decades ago.
http://www.extremeprogramming.org/rules/simple.html
Just to add, I think the same applies in business and in life. So if you've got managers who have this vision and cascade it down then things become easier on the ground / the design.
I heard the expression from a colleague who made it his mantra and manifesto, and had no idea who it came from originally! Perhaps the highest honor for an expression's originator is for it to be so ubiquitous that no one knows he said it.
This sounds a lot like the apocryphal Einstein quote
> Everything should be made as simple as possible, but not simpler.
And I found a similar quote from Aquinas
> If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments where one suffices
(Aquinas, [BW], p. 129).
Not apocryphal. The article is referenced and discussed in this interview with Kent Beck[0]. As you see, the link goes directly to Dr. Dobb's--although the page is down. Search for "Cunningham" and the first hit takes you right to the conversation.
[0] https://blogs.oracle.com/javamagazine/post/interview-with-ke...
> inventor of the wiki
It's interesting you gave that example. Before my first use of a wiki I was on a team that used Lotus Notes and did project organization in a team folder. I loved that Notes would highlight which documents had been updated since the last time I read them.
In the next project, that team used a wiki. It's simpler. But, the fact it didn't tell me which documents had been updated effectively made it useless. People typed new project designs into the wiki but no one saw them since they couldn't, at a glance, know which of the hundreds of pages had been updated since they last read them.
It was too simple
FWIW, showing recent changes is a very common wiki feature.
Here's the page for my local makerspace's wiki, which runs on mediawiki:
https://bloominglabs.org/Special:RecentChanges?hidebots=1&li...
A terse diff like that, on a separate page, is probably not what they were referring to.
I came here to say this! Ward taught me this when I paired with him every day when we worked together. It's his, dare I say, mantra when starting a new feature.
> It's a shame he doesn't give the origin of this expression in programming.
It is possible the OP came to this conclusion without knowing about Ward Cunningham?
This should be the top comment.
Principles like these deserve to be part of curriculum for undergrad courses. People are incorrectly trained to go for some ideal forms at the cost of complexity and fragility. Every approach, idea should be put to ruthless cost-benefit comparison, without any regard to who is proposing it, or how it sounds.
Education and training sometimes enforces prejudices, rules and stigmas that evade inspection of the subject matter in raw form.
Preference to idealism probably emerged from peace times which have no struggle. Someone would be obsessed with perfectness of a sculpture only when they don't need to hunt for the next meal. The real world runs on minimal, conservative, durable and robust approaches.
One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is? And how can you be sure that it "could possibly work"?
Yesterday I had a problem with my XLSX importer (which I wrote myself--don't ask why). It turned out that I had neglected to handle XML namespaces properly because Excel always exported files with a default namespace.
Then I got a file that added a namespace to all elements and my importer instantly broke.
For example, Excel always outputs <cell ...> whereas this file has <x:cell ...>.
The "simplest thing that could possibly work" was to remove the namespace prefix and just assume that we don't have conflicting names.
But I didn't feel right about doing that. Yes, it probably would have worked fine, but I worried that I was leaving a landmine for future me.
So instead I spent 4 hours re-writing all the parsing code to handle namespaces correctly.
Whether or not you agree with my choice here, my point is that doing "the simplest thing that could possibly work" is not that easy. But it does get easier the more experience you have. Of course, by then, you probably don't need this advice.
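For readers who haven't hit the `<cell>` vs `<x:cell>` problem, here is a minimal sketch of the difference using Python's standard `xml.etree.ElementTree`. The namespace URI and the two tiny documents are made up for illustration; they are not real XLSX content, and this is not the importer described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, for illustration only.
NS = "urn:example:sheet"

# Same logical document, once with a default namespace and once with a prefix.
DEFAULT_NS_DOC = (
    f'<sheet xmlns="{NS}"><cell r="A1">1</cell><cell r="B1">2</cell></sheet>'
)
PREFIXED_DOC = (
    f'<x:sheet xmlns:x="{NS}">'
    f'<x:cell r="A1">1</x:cell><x:cell r="B1">2</x:cell>'
    f"</x:sheet>"
)


def cell_values(xml_text: str) -> list:
    """Return the text of every <cell> in namespace NS, whatever prefix is used."""
    root = ET.fromstring(xml_text)
    # ElementTree expands names to "{namespace-uri}localname", so the prefix
    # chosen by the producer never appears in the lookup.
    return [cell.text for cell in root.iter(f"{{{NS}}}cell")]


if __name__ == "__main__":
    print(cell_values(DEFAULT_NS_DOC))
    print(cell_values(PREFIXED_DOC))
```

Matching on the namespace URI rather than the prefix means both spellings are handled by the same code path, and an element named `cell` from some other namespace would not be picked up by accident.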
> One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is?
I think the author kind of mentions this: "Figuring out the simplest solution requires considering many different approaches. In other words, it requires doing engineering."
Agreed! The author is clearly an experienced and talented software engineer.
But the irony, in my opinion, is that experienced engineers don't need this advice (they are already "doing engineering"), but junior engineers can't use this advice because they don't have the experience to know what the "simplest thing" is.
Still, the advice is useful as a mantra: to remind us of things we already know but, in the heat of the moment, sometimes forget.
I like this. I had a rule of three: figure out three qualitatively different ways to solve the problem - different in kind, not just in choice of tools. Once you have three you start to understand the trade-offs. And you can come up with others quite easily.
I like that as a process. Seeing the trade-offs is the key. I argue that engineering is all about trade-offs.
We attempt to address this problem at work with an extra caveat to never add code "in the wrong direction" -- so it's fine (usually preferable) to have a partial implementation, as long as it's heading in the direction we'd like the more complete implementation to go in. Basically "KISS, but no hacks".
Just curious, how would that be applied to the XLSX namespace problem example given? If the full fix is to implement namespacing, what would the KISS approach be in the right direction?
Use an off-the-shelf parser that handles namespaces. And escapes. And CData. And everything else you haven't thought of: https://stackoverflow.com/questions/701166/can-you-provide-s...
This avoids the endless whack-a-mole that you get with a partial solution such as "assume namespaces are superfluous", which you almost certainly will eventually discover weren't optional.
Or some other hapless person using your terrible code will discover it at 2am, sitting alone in the office building while desperately trying to do something mission critical, such as using a "simple" XML export tool to cut over ten thousand users from one Novell system to another so that the citizens of the state have a functioning government in the morning.
Ask me how I know that kind of "probably won't happen" thing will, actually, happen.
I really like this as a guideline.
I gauge it as "the simplest thing to transition". Most of the time, it's easier to transition a single service that doesn't rely on a big number of complex abstractions or extra infrastructure, even if it's at the expense of some clutter or a bit of redundancy. The new owner can step through the code and see what's going on without having to work backward to understand the abstractions or coordination of services or whatever else.
Of course plenty of times there'll be some abstractions that make the code easier to follow, even at the expense of logic locality. And other times where extra infrastructure is really necessary to improve reliability, or when your in-memory counter hack gets more requirements and replacing it with a dedicated rate limiter lets you delete all that complexity. And in those cases, by all means, add the abstractions or infrastructural pieces as needed.
But in all such cases, I try to ask myself, if I need to hand off this project afterward, which approach is going to make things easiest to explain?
Note that my perception of this has changed over time. Long ago, I was very much in the camp of "simple" meaning: make everything as terse as possible, put everything in its own service, never write code when a piece of infrastructure could do it, decouple everything to the maximum extent, make everything config-based. Ironically, I remember imagining how delighted the new owners would be to receive such a well-factored thing that was almost no code at all; just abstraction upon abstraction upon event upon abstraction that fit together perfectly via some config file. Of course, transition was a complete fail, as they didn't care enough to grok how all the pieces were designed to fit together, and within a month, they'd broken just about every abstraction I'd built into it, and it was a pain for anybody to work with.
Since then, I've kept things simpler, only using abstractions and extra infra where it'd be weird not to, and always thinking what's going to be the easiest thing to transition. And even though I'm not necessarily transitioning a ton of stuff, it's generally easier to ramp up teams or onboard new hires or debug problems when the code just does what it says. And it's nice because when a need for a new abstraction becomes apparent, you don't have to go back and undo the old one first.
I think most commentators here are missing the point that doing the "simplest" thing doesn't mean doing the hackiest, quickest thing.
The simplest thing can be very difficult to do. It requires thought and an understanding of the system, which is what he says at the very beginning. But I think most people read the headline and just started spewing personal grievances.
My point is exactly that "the simplest thing can be very difficult to do". You need to be an experienced engineer to apply this advice.
But an experienced engineer already knows this!
I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.
> I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.
That's a good way of putting it. The advice essentially boils down to "do the right thing, don't do the wrong thing". Which is good (if common sense) advice, but doesn't practically really help with making decisions.
Yes, but this is meaningless advice.
The best solution is the simplest.
The quickest? No, the simplest; sometimes that takes longer.
So definitely not a complex solution? No, sometimes complexity is required; it's the simplest solution possible given your constraints.
Soo… basically, the advice is "pick the right solution".
Sometimes that will be quick. Sometimes slow. Sometimes complex. Sometimes config, Sometimes distributed.
It depends.
But the correct solution will be the simplest one.
It's just: "solve your problems using good solutions not bad ones"
…and that is indeed both good and totally useless advice.
It's the same for AI vibecoding. The more experience you have, the easier it is to keep the agent on the right path. Same for identifying which tasks to use an agent for vs doing yourself.
Don't confuse sloppy with simple. Parsing XML with regex[1] (or a non-namespace-compliant XML parser) is not simple. It's messy, verbose, error-prone, and not in any way idiomatic or simple.
If you had just used a compliant XML parser as intended, you might not even have noticed that different encodings of namespaces were even occurring in the files! It just "doesn't register" when you let the parser handle this for you, in the same sense that if you parse HTML (or XML) properly, then you won't notice all of the &amp; and &lt; encodings either. Or CDATA. Or Unicode escapes. Or anything else for that matter that you may not even be aware of.
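As a small illustration of encodings that "don't register" with a real parser, here is a sketch using Python's standard `xml.etree.ElementTree` (chosen only because it is widely available; the comment above is about .NET). The sample document is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample: an entity-escaped value, a CDATA section, and a numeric
# character reference, each in its own <c> element.
DOC = (
    "<row>"
    "<c>Tom &amp; Jerry &lt;3</c>"
    "<c><![CDATA[a < b && c]]></c>"
    "<c>&#x263A;</c>"
    "</row>"
)

# The parser hands back plain decoded strings; the caller never sees the
# escaping scheme that the producer happened to use.
values = [c.text for c in ET.fromstring(DOC).iter("c")]

if __name__ == "__main__":
    for value in values:
        print(repr(value))
```

A hand-rolled or regex-based reader would need separate code for each of these three cases, and would silently return the raw escaped text for any case it missed.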
You may be a few more steps away from making an XLSX importer work robustly. Did you read the spec? The container format supports splitting single documents into multiple (internal) files to support incremental saves of huge files. That can trip developers in the worst way, because you test with tiny files, but XLSX-handling custom code tends to be used to bulk import large files, which will occasionally use this splitting. You'll lose huge blocks of data in production, silently! That's not fun (or simple) to troubleshoot.
The fast, happy path is to start with something like System.IO.Packaging [2], which is the built-in .NET library for the Open Packaging Conventions (OPC) container format, the underlying container format of all Office Open XML (OOXML) formats. Use the built-in XML parser, which handles namespaces very well. Then the only annoyance is that OOXML formats have two groups of namespaces that they can use, the Microsoft ones and the Open "standardised" ones.
[1] Famously! https://stackoverflow.com/questions/8577060/why-is-it-such-a...
[2] https://learn.microsoft.com/en-us/dotnet/api/system.io.packa...
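To make the "doesn't register" point concrete, here is a minimal sketch in Python rather than .NET, using only the standard-library `xml.etree.ElementTree`. The namespace URI `urn:example:sheet` and the element names are made up for illustration and are not real OOXML names. The two documents use different prefixes and different escaping, yet the calling code sees the same data:

```python
import xml.etree.ElementTree as ET

# Two serializations of the same data (hypothetical namespace and element names):
# DOC_A uses a prefix and entity escapes, DOC_B a default namespace and CDATA.
DOC_A = '<s:row xmlns:s="urn:example:sheet"><s:c>A &amp; B &lt; C</s:c></s:row>'
DOC_B = '<row xmlns="urn:example:sheet"><c><![CDATA[A & B < C]]></c></row>'


def cells(xml_text):
    """Return the text of every <c> child of the root, matched by namespace URI."""
    # The prefix "x" is local to this lookup; it need not match the file's prefix.
    ns = {"x": "urn:example:sheet"}
    root = ET.fromstring(xml_text)
    return [c.text for c in root.findall("x:c", ns)]


print(cells(DOC_A))                  # ['A & B < C']
print(cells(DOC_A) == cells(DOC_B))  # True
```

The lookup is keyed on the namespace URI, so the prefix the file happened to use, and whether it used entities or CDATA, never reaches the calling code.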
Parsing XML is relatively trivial--I'd never use regex, of course, but a basic recursive descent parser can do it pretty easily. I mean, the whole point of XML is that it's supposed to be easy to parse and generate!
Namespaces add a wrinkle, but it wasn't that hard to add. And I was able to add namespace aliasing in my API to handle the two separate "standard" namespaces that you're talking about.
But you're right about OPC/OOXML--those are massive specs and even the tiny slice that I'm handling has been error-prone. I haven't dealt with multiple internal files, so that's a future bug waiting for me. The good news is I'm building a nice library of test files for my regression tests!
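One way the namespace aliasing mentioned above might look, sketched in Python on top of the standard-library parser rather than a hand-written one. The URIs are placeholders standing in for the two namespace families, and `normalize` and `cell_values` are hypothetical helper names, not anyone's actual API. Only element tags are rewritten here; namespaced attributes would need the same treatment:

```python
import xml.etree.ElementTree as ET

# Placeholder URIs (assumptions for illustration, not the real OOXML namespaces).
CANONICAL = "urn:example:sheet"
ALIASES = {"urn:example:sheet-strict": CANONICAL}


def normalize(root):
    """Rewrite aliased namespace URIs in element tags to the canonical URI, in place."""
    for el in root.iter():
        # ElementTree stores qualified tags as "{uri}localname".
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            uri, local = el.tag[1:].split("}", 1)
            el.tag = "{%s}%s" % (ALIASES.get(uri, uri), local)
    return root


def cell_values(xml_text):
    """Return the text of every <c> element, whichever namespace family the file used."""
    root = normalize(ET.fromstring(xml_text))
    return [c.text for c in root.iter("{%s}c" % CANONICAL)]


DOC_OLD = '<row xmlns="urn:example:sheet"><c>42</c></row>'
DOC_NEW = '<r:row xmlns:r="urn:example:sheet-strict"><r:c>42</r:c></r:row>'
print(cell_values(DOC_OLD) == cell_values(DOC_NEW))  # True
```

Normalizing once after parsing keeps the rest of the importer unaware that two namespace families exist.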
> Parsing XML is relatively trivial
It really isn't, and rolling your own parser is the diametric opposite of the "do the simplest thing" philosophy.
The XML v1.1 spec is 126 KB of text, and that doesn't even include XML Namespaces, which is a separate spec with 25 KB of text.
XML is only "simple" in the sense of being well-defined, which makes interoperability simple, in some sense. Contrast this with ill-defined or implementation-defined text formats, where it's decidedly not simple to write an interoperable parser.
As an end-user of XML, the simplest thing is to use an off-the-shelf XML parser, one that's had the bugs beaten out of it by millions of users.
There are very few programming languages out there that don't have a convenient, full-featured XML parser library ready to use.
Well we can agree that most people shouldn't implement their own XML parser.
One of the biggest, evergreen arguments I've had in my career revolves around the definition of "works".
"Just because it works doesn't mean it isn't broken" is an aphorism that seems to click for people who are also handy in the physical world, but that many software developers think doesn't sound right. Every handyman has at some time used a busted tool to make a repair. They know they should get a new one, and many will make an excuse to do so at the next opportunity (hardware store trip, or sale). Maybe 8 out of ten.
In software it's probably more like 1 out of ten who will do the equivalent effort.
One of the worst periods of my career was at a company that had a team who liked to build prototypes. They would write a hasty proof-of-concept and then their boss would parade it in front of the executives. It would be deployed somewhere and connected to a little database so it technically "worked" when they tried it.
Then the executives would be stunned that it was done so quickly. The prototype team would pass it off to another team and then move on to the next prototype.
The team that took over would open the project and discover that it was really a proof of concept, not a working site. They wouldn't include basic things like security, validation, error messages, or any of the hundred things that a real working product requires before you can put it online.
So the team that now owned it would often have to restart entirely, building it within the structures used by the rest of our products. The executives would be angry because they saw it "work" with their own eyes and thought the deployment team was just complicating things.
The worst case of this I ran into, the "maintenance" team discovered some of the interactions were demo stubs. Nothing actually happened except the test data looked like the state transition worked.
Those are the worst because you don't have done criteria you can reasonably write down. It's whenever QA stops finding fakes in the code, plus a couple months for stragglers you might have missed.
Thousand-yard stare flashback: "Why is this taking months to make this work? John and James built us the first version in 2 weeks"
https://youtu.be/W8ILPghbqdY
From somewhere around the net, I found this quote:
> It's not enough for a program to work - it has to work for the right reasons
I guess that's basically the same statement, from a different angle.
I generally agree, except if the program is a one-time program meant to generate a single output and then you throw it away.
Until recently I would say such programs are extremely rare, but now AI makes this pretty easy. Want to do some complicated project-wide edit? I sometimes get AI to write me a one-off script to do it. I don't even need to read the script, just check the output and throw it away.
But I'm nitpicking, I do agree with it 99% of the time.
I often write those sorts of tools iteratively.
By the time you've done something five times, it's probably part of your actual process, and you should start treating it as normal instead of exceptional. Even if admitting so feels like a failure.
So I staple something together that works for the exact situation, then start removing the footguns I'm likely to hit, then I start shopping it to other people I see eye to eye with and fix the footguns they run into. Then we start trying to make it into an actual project, and the end game is for it to be a mandatory part of our process once the late adopters start to get onboard.
I remember once having to make a SOAP call that just wasn't connecting for some reason, but another endpoint on the same service was working, which made no sense. We tried calling the working endpoint right before calling the busted endpoint just for kicks, and that actually functioned. It still makes no sense at all to me to this day. We ended up moving off SOAP eventually, but that code stayed in there until we did.
I hate the days when you are trying to fix a bug in a block of code and as you write pinning tests you realize that the code has always been broken and you cannot understand why it ever got the right answer. You've let the magic smoke out and you cannot put it back without fixing the problem. At some point you have to stop trying because you understand perfectly well how it should work and you need to move on to other things.
Those conversations are an important part of the job. You can, for example, agree that something works in the sense that it is currently possible to use it to obtain a desired output, while simultaneously failing to work in various ways: It might fail to do so reliably, or it might only be able to do so at great cost.
It's a frustrating argument to lose.
On a recent project I fixed our deployment and our hotfix process and it fundamentally changed the scope of epics the team would tackle. Up to that point we were violating the first principle of Continuous: if it's painful, do it until it isn't. So we would barely deploy more often than we were contractually (both in the legal and internal cultural sense) obligated to do, and that meant people were very conservative about refactoring code that could lead to regressions, because the turnaround time on a failing feature toggle was a fixed tempo. You could turn a toggle on to analyze the impact but then you had to wait until the next deployment to test your fixes. Excruciating, with a high deviation for estimates.
With a hotfix process that actually worked, people would make two or three times as many iterations, to the point we had to start coordinating to keep people from tripping over each other. And as a consequence old nasty tech debt was being fixed in every epic instead of once a year. It was a profound change.
And as is often the case, as the author I saw more benefit than most. I scooped a two-year, two-man effort to improve response time by myself in three months, making a raft of small changes instead of a giant architectural shift. About twenty percent of the things I tried got backed out because they didn't improve speed and didn't make the code cleaner either. I could do that because the tooling wasn't broken.
The definition of 'works' depends on whether my employer wants to spend its resources (the time I'm working) on fixing it.
If they want to use those resources to prioritize quality, I'll prioritize quality. If they don't, and they just want me to hit some metric and tick a box, I'm happy to do that too.
You get what you measure. I'm happy to give my opinion on what they should measure, but I am not the one making that call.
They'll never prioritize the work that keeps the wheels on. You have to learn not to ask and bake it into the cost of new feature work. It's non-negotiable or it never happens.
My second lead role, the CTO and the engineering manager thought I could walk on water and so I had considerable leeway to change things I thought needed changing.
So one of the first things I did was collectively save the team about 40 hours of code-build-test time per week. Which is really underselling it, because what I actually did was both build a CI pipeline at a time when nobody knew what "CI" meant, and increase the number of cycles you could reliably get through without staying late from 4 to 5 per day. A >20% improvement in iterations per day and a net reduction in errors. That was the job where I learned the dangers of pushing code after 3:30pm. Everyone rationalizes that the error they saw was a glitch or someone else's bug, and they push and then come in to find the early birds are mad at them. So better to finish what we now call deep work early and do lighter stuff once you're tired.
Edit: those changes also facilitated us scaling the team to over twice the size of any project I'd worked on before or for some time after, though the EM deserves equal credit for that feat.
Then they fired the EM and Peter Principled by far the worst manager I've ever worked for (fuck you Mike, everyone hated your guts), and all he wanted to know was why I was getting fewer features implemented. Because I'm making everyone else faster. Speaking of broken, the biggest performance bottleneck in the entire app was his fault. He didn't follow the advice I gave him back when he was working in our query system. Discovering it took hiring an Oracle DB contractor (those are always exorbitant). Fixing it after it shipped was a giant pain (as to why I didn't catch his corner cutting, I was tagged in by another lead who was triple booked, and when I tagged back out he unfortunately didn't follow up sufficiently on the things I prescribed).
> "Just because it works doesn't mean it isn't broken."
Meanwhile all the people writing agentic LLM systems: "Hold my beer"
When I'm on the fence about some (technical) decision, I use a "razor": if all options seem equal, go with whichever is the simpler one. The results are ok so far and it has been great for reducing my brain-energy spent on pontification and early optimization too far ahead.
I liked the post, but these kinds of articles mostly make sense to people who've already been through the trenches, view the advice from their seasoned point of view, and apply it accordingly. People without such experience who follow it to the letter just because it's written can have surprises ahead.
> A lot of engineers design by trying to think of the "ideal" system: something well-factored, near-infinitely scalable, elegantly distributed, and so on.
Was it Donald Knuth who said "premature optimization is the root of all evil"?
This article made this point very well, especially regarding the obsession with "scaling" in the SaaS world.
I've seen thousands and thousands of developer hours completely wasted, because developers were forced to massively overcomplicate greenfield code in anticipation of some entirely hypothetical future scaling requirement which either never materialized (95% of the time) or which did appear but in such a different form that the original solution only got in the way (remaining 5%).
John Ousterhout's A Philosophy of Software Design makes the case for simplicity in book-length form. I really like how he emphasizes the importance of design simplicity for the maintainability of software; this is where I've seen it matter the most in practice.
My current company is in that 5% part right now. Tremendous effort invested into the system, everyone involved was very proud of themselves. Unfortunately the way we actually needed to scale was almost completely untouched by any of this architecture astronomy, so we have both a terrifically complicated system - very difficult to change things without potential breakage or regression - AND it doesn't scale at all.
I don't mind, I don't blame people for not predicting the future - it's a tough game. But god the hubris and attitude we put up with until the crows came home to roost.
I would be very cautious about giving advice like this to my team. Making a thing simple is actually very hard, and many who hear the words may just equate "simple thing" with "first thing that comes to mind", which may eventually turn into a nightmare of complexity.
Generally speaking, when I hear people say this, it's a huge red flag. Really, any time anyone puts forth any kind of broad proclamation about how software development should be done, my hackles go up. Either they don't know what they're talking about, they're full of shit, or both. The only reasonable thing to conclude after lots of experience with software development is that it's hard and requires care and deliberation. There is no one-size-fits-all advice. What I want to see is people who are open-minded and thoughtful.
Simplicity (meaning the inverse of complexity) is usually the most important factor when considering two possible ways of doing something with software. And this is because it has to be conceived of, pitched to, agreed upon, built, and maintained by humans.
Unfortunately, simplicity is complicated. The median engineer in industry is not a reliable judge of which of two designs is less complex.
Further, "simplicity" as an argument has become something people can parrot. So now it's a knee-jerk fallback when a coworker challenges them about the approach they are taking. They quickly say "This is simpler" in response to a much longer, more sincere, and more correct argument. Ideally the team leader would help suss out what's going on, but increasingly the team lead is a less-than-competent manager, and simplicity is too complicated a topic for them to give a reliable signal. They prefer not to ruffle feathers and let whoever is doing the work make the call; the team bears the complexity.
"Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better."
- Dijkstra
Yes, and when it's time to implement something by default, you always choose "your optimal". If you have two options that solve the problem equally well, you always choose the simplest, among other things because it's shorter.
What you really learn over time, and what is more useful, is to think along these lines: don't try to solve problems that don't exist yet.
This is a mantraic, cool headline but useless. The article doesn't develop it properly either in my opinion.
From the article....
"real mastery often involves learning when to do less, not more. The fight between an ambitious novice and an old master is a well-worn cliche in martial arts movies: the novice is a blur of motion, flipping and spinning. The master is mostly still. But somehow the novice's attacks never seem to quite connect, and the master's eventual attack is decisive".
I was initially annoyed at parts of the article, but it does point out that "hacks" often add hidden complexity that isn't simple, so there is clarity about the tradeoff.
Now the problem with the headline and repeating it is that when "just do a simple thing" becomes mandated from management (technical or not), there comes a certain stress about trying to keep it simple, and if you try running with it for complex problems you easily end up with those hacks that become innate knowledge that's hard to transfer, instead of a good design (that seemed complex upfront).
Conversely, I think a lot of "needless complexity" comes from badly planned projects where people being bitten by having to continuously add hacks to handle wild requirements easily end up overdesigning something to catch them, only to end up with no more complexity in that area and then playing catchup with the next area needing ugly hacks (to then try to design that area that stabilized and the cycle repeats).
This is why as developers we do need to inject ourselves into meetings (however boring they are) where things that do land up on our desks are decided.
It's Rich Hickey's "Simple Made Easy" all over again. "Simple" is not the easy path. Simple (or simplex, unbraided) describes an end product with very little interleaving of components. Simplicity is elegant. It takes a lot of hard work to achieve a simple end product.
I see your point, but, taken to the extreme, all it leaves us with is "everything is a trade-off" or "there's no free lunch".
Some generalizations are necessary to formalize the experience we have accumulated in the industry and teach newcomers.
The obvious problem is that, for some strange reason, lots of concepts and patterns that may be useful when applied carefully become a cult (think clean architecture and clean code), which eventually only makes the industry worse.
For example, clean architecture/ports and adapters/hexagonal/whatever, as I see it, is a very sane and pragmatic idea in general. But somehow, all battles are around how to name folders.
I mean, I think I agree more with this sentiment than most. These overly general statements tend to not have much nuance, and do little to incorporate context.
But also keep in mind the audience: the kinds of people who are tempted to use J2EE (at the time) with event sourcing and Semantic Web, etc.
This is really a counterbalance to that: let's not add sophistication and complexity by default. We really are better off when we bias towards the simpler solutions vs one that's overly complex. It's like what Dan McKinley was talking about with "Choose Boring Technology". And of course that's true (by and large), but many in our industry act like the opposite is the case - that you get rewarded for flexing how novel you can make something.
I've spent much of my career unwinding the bad ideas of overly clever devs. Sometimes that clever dev was me!
So yes ... it's an overly general statement that shouldn't need to be said, and yet it's still useful given the tendency of many to over-engineer and use unnecessarily sophisticated approaches when simpler ones would suffice.
I don't think I would go far enough to say that it's generally a red flag...
I see people adding unnecessary complexity to things all the time and advocate for keeping things simple on a daily basis probably. Otherwise designers and product managers and customers and architects will let their mind naturally add complexity to solutions which is unnecessary.
I completely disagree with this being a red flag. It would be a huge green flag for me. The easiest thing to do is to create a complex system, making a simple one is difficult.
Did you read the article? It's mostly about the nuance of how to apply this philosophy in practice, not a pithy one-size-fits-all statement about all software engineering.
It seems like a lot of people think that the first draft/prototype/whatever has to be perfect.
It doesn't. It never is. It can't be.
My favorite example of this was the Moon shot. Each step was learning how to do just that one step. Mercury was just about getting into orbit, not easy even now with SpaceX though they are standing on the shoulders of those giants. Then Gemini for multiple people and orbital maneuvering (that experience gained them lots of learning) and then Apollo 8 was still a dress rehearsal even though they flew around the Moon.
Each step HAD to be simple because complexity weighed too much. But each of those simple steps was still wildly complex.
Every time I would dive in and code up something that I thought was easy, it would blow up in some weird way. I have found that doing each step individually and getting it right might sound like going really slow, but it was smoother, so it was faster in the end, because I wasn't chasing bugs in all the places, just one.
As someone who has built 0-1 systems at multiple startups (Seed to Series C), I've settled on one principle above all else:
"Simple is robust"
It's easy to over-design a system up front, and even easier to over-design improvements to said system.
Customer requirements are continually evolving, and you can never really predict what the future requirements will be (even if it feels like you can).
Breaking down the principle, it's not just that a simple system is less error prone, it's just as important that a simple architecture is easier to change in the future.
Should you plan for X, Y, and Z?
Yes, but counterintuitively, by keeping doors open for the future and building "the simplest thing that could possibly work."
Complexity adds constraints, these limitations make the stack more brittle over time, even when planned with the best intentions.
> real mastery often involves learning when to do less, not more
Really love and agree with this, and (shameless plug?) I think really aligns with a way of working I (and some colleagues) have been working on: https://delivervaluedaily.dev/
It nails the value of keeping things simple, and I think the link to cognitive load deserves even more emphasis. In most of the cases, simplifying means reducing the cognitive load. Sometimes consolidating pieces with DRY, sometimes using design patterns, sometimes decomposing things into services...
Every extra detail or workaround increases the number of things you need to keep in your head, not just when building the system, but every time you come back to maintain or extend it. "Simple systems have fewer 'moving pieces': fewer things you have to think about when you're working with them."
Simplicity isn't just about getting the job done quickly; it's about making sure future you (or someone else) can actually understand and safely change the system later. Reducing cognitive load with simplicity pays off long after the job is done.
It's important to understand the difference between easy and simple. It's easy to add complexity, it can sometimes be hard to keep things simple.
But with that in mind, I do agree that a lot of systems are more complex than they need to be. I like to keep things simple.
Of course scalability adds complexity, and sometimes you need that. But you don't always need that, and making things scalable that don't need to be, makes them harder to understand and maintain.
Obligatory reference to Simple Made Easy https://www.infoq.com/presentations/Simple-Made-Easy/
I watch this talk about once per year to remind myself to eschew complexity.
"Everything should be made as simple as possible, but not simpler."
As someone who has strived for this from early on, the problem the article overlooks is not knowing some of the various technologies everyone is talking about, because I never felt I needed them. Am I missing something I need and just ignorant of it, or is it needless complexity that a lot of people fall for?
I don't want to test these things out to learn them in actual projects, as I'd be adding needless complexity to systems for my own selfish ends of learning these things. I worked with someone who did this and it was a nightmare. However, without a real project, I find it's hard to really learn something well and find the sharp edges.
Yes, and I (nearly) live this nightmare. I have someone higher up in the food chain who is fascinated with every new piece of software they find, that MIGHT be useful. We are often tasked with "looking at it, and seeing if it would be useful".
Yeah, let me shoehorn that fishing trip into my schedule without a charge number, along with the one from last week...
I was the go-to guy for this under my former boss, but he let me do pretty much whatever I wanted, so it usually wasn't an issue to not work on anything else while playing around with new stuff.
Though there was a time when he wanted me to onboard my simple little internal website to a big complicated CICD system, just so we could see how it worked and if it would be useful for other stuff. It wouldn't have been useful for anything else, and I already had a script that would deploy updates to my site that was simple, fast, and reliable. I simply ignored every request to look into that.
Other times I could tell him his idea wouldn't work, and he would say "ok" and walk away. That was that. This accounted for about 30% of what he came to me with.
Does he ask you to "figure out how to implement AI"?
That is what my boss asks us to do =p
Implement the simplest thing that works, maybe even by hand at first, instead of adding the tool that does "the whole thing" when you don't need "the whole thing".
Eventually you might start adding more things to it because of needs you haven't anticipated, do it.
If you find yourself building the tool that does "the whole thing" but worse, then now you know that you could actually use the tool that does "the whole thing".
Did you waste time not using the tool right from the start? That's almost a philosophical question. Now you know what you need, you had the chance to avoid it if it turned out you didn't, and maybe 9 times out of 10 you will be right.
This is indeed a vexing issue. I feel it often. It's this feeling that leads to resume-driven development which I really work hard to avoid.
Such a familiar feeling. Articles similar to this one make lots of sense to me, and I do try to embrace simplicity and not optimize prematurely, but very often I have no idea whether it's the praised simplicity and pragmatism or just a lack of experience and skills.
I agree with the spirit of the article, but I think the definition of "simple" has been inverted by modern cloud infrastructure. The examples create a false choice between a "simple but unscalable" system and a "complex but scalable" one. That is rarely the trade-off today.
The in-memory rate-limiting example is a perfect case study. An in-memory solution is only simple for a single server. The moment you scale to two, the logic breaks and your effective rate limit becomes N × limit. You've accidentally created a distributed state problem, which is a much harder issue to solve. That isn't simple.
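A toy simulation of that effect, as a rough sketch only: `InMemoryLimiter` and `accepted` are hypothetical names, the limiter is a bare fixed-window counter, and round-robin load balancing is assumed. Each server enforces the limit against its own private counter, so one client gets roughly N times the intended allowance:

```python
class InMemoryLimiter:
    """Fixed-window request counter held in one process's memory."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = {}  # (client, window) -> requests accepted so far

    def allow(self, client, window):
        key = (client, window)
        if self.counts.get(key, 0) >= self.limit:
            return False
        self.counts[key] = self.counts.get(key, 0) + 1
        return True


def accepted(num_servers, limit, requests):
    """Send `requests` from one client within one window, round-robin across servers."""
    servers = [InMemoryLimiter(limit) for _ in range(num_servers)]
    return sum(
        servers[i % num_servers].allow("client-a", window=0)
        for i in range(requests)
    )


print(accepted(1, 10, 100))  # 10
print(accepted(3, 10, 100))  # 30
```

Nothing in the limiter itself changed between the two runs; only the number of processes holding independent state did.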
Compare that to using a managed service like DynamoDB or ElastiCache. It provides a single source of truth that works correctly for one node or a thousand. By the author's own definition that "simple systems are stable" and require less ongoing work, the managed service is the fundamentally simpler choice. It eliminates problems like data loss on restart and the need to reason about distributed state.
Perhaps the definition of "the simplest thing" has just evolved. In 2025, it's often not about avoiding external dependencies. You will often save time by leveraging battle-tested managed services that handle complexity and scale on your behalf.
I don't think this is particular to cloud infrastructure. Even on a single server you could make the same argument about using flat file vs sqlite vs postgres for storage. Yes, there is a lot of powerful and reusable software, both managed and unmanaged, with good abstractions and great power to weight ratios where you pay a very small complexity cost for an incredible amount of capability. Such is the nature of software.
But all of it comes with tradeoffs and you have to apply judgement. Just as it would be foolish to write almost anything these days in assembly, I think it would be almost as foolish to just default to a managed Amazon service because it scales without considering whether A) you actually need that scale and B) there are other considerations as to why that service might not be the best technical fit (in particular, I've heard regrets due to overzealous adoption of DynamoDB on more than one occasion).
You make a good point about experience. I've noticed an interesting paradox there.
The engineers who most aggressively advocate for bespoke solutions in the name of "simplicity" often have the least experience with their managed equivalents, which can lead to the regrets you mentioned. Conversely, many engineers who only know how to use managed services would struggle to build the simple, self-contained solution the author describes. True judgment requires experience with both worlds.
This is also why I think asking "do we actually need this scale?" is often the wrong question; it requires predicting the future. Since most solutions work at a small scale, a better framework for making a trade-off is:
* Scalability: Will this work at a higher scale if we need it to?
* Operations: What is the on-call and maintenance load?
* Implementation: How much new code and configuration is needed?
For these questions, managed services frequently have a clear advantage. The main caveat is cost-at-scale, but that's a moot point in the context of the article's argument.
> A lot of engineers design by trying to think of the "ideal" system: something well-factored, near-infinitely scalable, elegantly distributed, and so on.
> Instead, spend that time understanding the current system deeply, then do the simplest thing that could possibly work.
I'd argue that a fair amount of the former results in the ability to do the latter.
There's a substantial amount of wisdom that goes into designing "simple" systems (simple to understand when reading the code). Just as there's a substantial amount of wisdom that goes into making "simple" changes to those systems.
IMO, the most important thing about this sort of advice (and maybe most advice) is to treat it as a "generally useful heuristic, subject to refinement based on judgment" and not as an "ironclad, immutable law of the kingdom, any transgression from which, will be severely punished".
Sure, try to keep things simple. Unless it doesn't make sense. Then make them less simple. Will you get it wrong sometimes? Yes. Does it matter? Not really. You'll be wrong sometimes no matter what you do, unless you are, in fact, the Flying Spaghetti Monster. You're not, so just accept some failures from time to time and - most importantly - reflect on them, try to learn from them, and expect to be better next time.
Until you get enough experience for your own good judgment, you need some rules of thumb and guidelines from more experienced peers.
As long as you understand that everything is a trade-off and, unfortunately, that the modern field is based on subjective opinions of popular and not necessarily competent people, you will be fine.
I wholeheartedly agree with this. The challenge is perception though. Many managers will see a simple solution to a complex problem and dock you for not doing real engineering, whereas a huge convoluted mess to solve a simple problem (or non-problem) gets you promoted. And in design interviews, "I'd probably implement a counter in memory" would be the last time you ever heard from that company.
> Do the simplest thing that could possibly work ...
That advice these days surely means having an LLM vibecode a mess of something?
Is such obvious and unquantifiable advice actually useful?
This is good advice but it can be difficult to define what simple means. The only technical way I was able to make sense of it is by targeting reducing code entropy and scopes (Inspired by how language models try to minimize Solomonoff/Kolmogorov entropy).
https://benoitessiambre.com/entropy.html https://benoitessiambre.com/integration.html
There is a sign on a wall at Apple that reads: "Simplify. Simplify. Simplify." (with the first two struck out) https://www.forbes.com/sites/kenmakovsky/2012/12/06/inside-a...
"It's fun to decouple your service into two pieces so they can be scaled independently (I have seen this happen maybe ten times, and I have seen them actually be usefully scaled independently maybe once)."
Same, or reliability-tiered separately. But in both aspects I more frequently see the resulting system to be more expensive and less reliable.
Looking over the threads running here, it is interesting how differently the article's title/point is being taken, filtered through different commenter situations and experiences.
I don't see it as a blind prescription.
It doesn't imply that choosing what is simple, will be simple. Or that simplest, will be simple. Or that this is a process uniquely immune from problems or tradeoffs.
Just a reminder to never forget to aim for simplest.
A tautological cookie fortune, of something important we often functionally forget or slide on.
There is a lot of wisdom in recognizing and repeating the most important "mantras of the obvious". And listening to them reformulated, in other ways, by other people.
The greatest craftbeings never stop revisiting the basics.
I think a lot of engineers think, "I thought of a complicated set of abstract ideas that mean nothing to anyone else but will demonstrate my superior intellect" and then make that. It took a lot of self control to not curse.
Hard, hard disagree.
First of all, simplicity is the hardest thing there is. You have to first make something complex, and then strip away everything that isn't necessary. You won't even know how to do that properly until you've designed the thing multiple times and found all the flaws and things you actually need.
Second, you will often have wildly different contexts.
- Is this thing controlling nuclear reactors? Okay, so safety is paramount. That means it can be complex, even inefficient, as long as it's safe. It doesn't need to be simple. It would be great if it was, but it's not really necessary.
- Is the thing just a script to loop over some input and send an alert for a non-production thing? Then it doesn't really matter how you do it, just get it done and move on to the next thing.
- Is this a product for customers intended to solve a problem for them, and there's multiple competitors in the space, and they're all kind of bad? Okay, so simplicity might actually be a competitive advantage.
Third, "the simplest thing that could possibly work" leaves a lot of money on the table. Want to make a TV show that is "the simplest thing that could possibly work"? Get an iPhone and record 3 people in an empty room saying lines. Publish a new episode every week. That is technically a TV show - but it would probably not get many views. Critics saying that you have "the simplest show" is probably not gonna put money in your pocket.
You want a grand design principle that always applies? Here's one: "Design for what you need in the near future, get it done on time and under budget, and also if you have the time, try to make it work well."
> First of all, simplicity is the hardest thing there is. You have to first make something complex, and then strip away everything that isn't necessary.
I don't follow. I've made simple things many times without having to make a complex thing first.
That would make you a genius, so congrats :-) It's more likely that you thought it was simple, but it actually wasn't. Was it actually the least-complex thing that works? Or was it just a thing that worked, and it didn't seem complex? Because those are two different things.
> Want to make a TV show that is "the simplest thing that could possibly work"? Get an iPhone and record 3 people in an empty room saying lines. Publish a new episode every week.
You just described a podcast. It did work for many (obviously it failed for many as well). That's an excellent example of why one should start with the simplest thing that could possibly work. Probably better than the OP's examples.
Podcasts aren't usually scripted episodic content. Some are, but those tend to involve a lot more sound production, and they aren't filmed. If you tried to make filmed scripted episodic content the way you make a podcast, it would be terrible. The exception is improv comedy.
I get what you're saying, but you're also attempting to design the perfect system without any hindsight, which is impossible.
The beauty of this approach is that you don't design anything you don't need. The requirements will change, and the design will change. If you didn't write much in the first place, it's easy.
But designing "the simplest thing that could possibly work" may make it harder than necessary to modify later. (and any time the requirements change after it's built, you're inching towards a big ball of mud, so this whole idea should be reviled whenever possible)
An example is databases. People design their database schemas in incredibly simplistic ways, and then regret it later when the predictable stuff most people need doesn't work with the old schema, and you can't even just add columns, but you have to modify existing ones. Avoid the nightmare by making it reasonably extensible from the start. It may not be "the simplest thing that could possibly work", but it is often useful and doesn't cost you anything extra.
Just as much as people say "don't prematurely optimize", they should also say "don't prematurely make it total crap".
We once rebuilt an old system and went with the simplest thing that could work. It ran great for the first few weeks, but then all kinds of edge cases started creeping in. We ended up spending more time patching things up.
> design the best system for what your requirements actually look like right now
this is the key practical advice. when you start designing for hypothetical use cases that may never happen you are opening up an infinite scope of design complexity. setting hard specifications for what you actually need and building that simplifies the design process, at least, and if you start with that kind of mindset one can hope that it carries over to the implementation.
the simplest things always win because simple is repeatable. not every simple thing wins (many are not useful or have defects) but the winners are always simple.
> > design the best system for what your requirements actually look like right now
But don't forget to ask your manager if they want to be prepared for future scenarios A, B, or C.
And write down their answer for later reference.
All too often I see this mantra used to justify doing the "easiest thing that could possibly work." Which in my experience is not the same thing. They can overlap, and often upstream simplicity can create a downstream effect of ease for consumers. Simplicity often requires some real effort to execute well.
if only designing simple systems was as easy as it sounds.
it takes maybe 3 to 5 rewrites before you truly grasp a problem.
From https://nshipster.com/uncertainty/ recently: "Working in software, the most annoying part of reaching Senior level is having to say 'it depends' all the time. Much more fun getting to say 'let's ship it and iterate' as Staff or 'that won't scale' as a Principal."
IIUC, author is a Staff SWE, so this tracks.
See also "Worse is better" which has been debated a million times by now.
Simplicity is brittle. Perhaps we should take a page from nature. Nothing in nature is simple, and yet it has built some of the most robust systems (by far) that we know of.
Can your software run for millions of years?
On the meta level, the simplest thing that could possibly work is usually paying someone else to do it.
Alas, you do not have infinite money. But you can earn money by becoming this person for other people.
The catch 22 is most people aren't going to hire the guy who bills himself as the guy who does the simplest thing that could possibly work. It turns out the complexities actually are often there for good reason. It's much more valuable to pay someone who has the ability to trade simplicity off for other desirable things.
If I was running a business and I could hire someone that I knew did good work, and did the simplest thing that could possibly work (and it actually worked!) - then I would absolutely do that as soon as possible.
"It turns out the complexities actually are often there for good reason" - if they're necessary, then it gets folded into the "could possibly work" part.
The vast majority of complexities I've seen in my career did not have to be there. But then you run into Chesterton's Fence - if you're going to remove something you think is unnecessary complexity, you better be damn sure you're right.
The real question is how AI tooling is going to change this. Will the AI be smart enough to realize the unnecessary bits, or are you just going to layer increasingly more levels of crap on top? My bet is it's mostly the latter, for quite a long time.
"Will the AI be smart enough to realize the unnecessary bits, or are you just going to layer increasingly more levels of crap on top? My bet is it's mostly the latter, for quite a long time."
Dev cycles will feel no different to anyone working on a legacy product, in that case.
Useful principle. But… (sorry to make a simple phrase more complex) the notion just scratches the surface of complexity management.
I appreciate "Philosophy of Software Design" by Ousterhout. I recently read that while rebuilding a text editor. Mind blowing experience. There is a lot of opportunity to more tightly encapsulate logic, to more clearly abstract a system, to keep a system simple yet powerful and extensible. I believe I became twice as good of a developer just by reading a chapter a day and sticking with the workflow.
Great advice.
I always felt software is like physics: Given a problem domain, you should use the simplest model of your domain that meets your requirements.
As in physics, your model will be wrong, but it should be useful. The smaller it is (in terms of information), the easier it is to expand if and when you need it.
it won't work when every single PRD now has the word "extensible". i think overcomplexity often comes from requirements/ business usecases first
> You should do that too! Suppose you've got a Golang application that you want to add some kind of rate limiting to...Actually, are you sure your edge proxy doesn't support rate limiting already? Could you just write a couple of lines in a config file instead of implementing the feature at all?
As I'm doing the simplest thing that could possibly work, I do not have an edge proxy.
Of course, the author doesn't mean _that_ kind of simplicity. There are always hidden assumptions about which pieces of complexity are assumed, and don't count against your complexity budget.
This just kicks the can down the road. What is "simple"? What does "works" mean?
I don't think the author (or anyone else) could come up with term definitions that would satisfy everyone.
... and nevertheless at the end of the article, the author does offer their understanding of the terms
hard disagree (fwiw).
this *tactical* style of development is the same thing propounded by TDD folks. there is no design, just a weirdly glued together mishmash of things that just happen to work.
i am (fwiw once again) not against unit-testing, that is almost always needed.
Not arguing against this article but aren't all these ideas already well known in the industry? Start with an MVP, don't optimize prematurely, avoid writing brittle code (systems, really), abstract implementation details away when possible, and KISS.
This is consistent with Gall's law, which says that complex systems that work can only be achieved by building complexity on top of simple systems. Complex systems built from scratch do not work. So build the simplest system that works and then keep adding complexity to it based on requirements.
until? what's the threshold here? does complexity have a final boss?
It's a pithy philosophy if you already know what it means to "work". You probably don't, especially if your system is human facing. Figuring out what "works" means is almost as difficult as building things in the first place. You may as well commit to building it twice [0].
[0] https://ratfactor.com/cards/build-it-twice
XP
Very much agree for the type of software I've worked on my whole career. I've seen way more time and energy wasted by people trying to predict the future than fixing bugs. In practice I think it's common to realize something didn't "possibly work" until after it's already deployed, but keeping things simple makes it easy to fix. So this advice also ends up basically being "move fast break things".
You know what taught me this the best? Watching Mythbusters.
Time and time again they build amazingly complex machines, and those just fail to perform better than a rubber band and bubble gum.
eh.. there were series of clips named something like 'Industrial JP' showing the multiaxis (like 6 to 12 axis) spring coil forming machines working
This stuff just cannot be reimplemented that simply and be expected to work.
The music was also quite good imo.
I get it.
YAGNI.
A term coined by consultants who don't understand an industry who basically say "do the least possible thing that will work" because they don't understand the domain and don't understand what requirements are often non-negotiable table stakes of complexity you need to compete.
It reminded me of a Martin Fowler post where he was showing implementation of discounts in some system and advocating to just hard code the first discount in the method (literally getDiscount() {return 0.5}).
Even the most shallow analysis would show this was incredibly stupid and short sighted.
But this was the state of the art, or so we were told.
See also Ward Cunningham trying and failing to solve Sudoku using TDD.
The reality is most business domains are actually complex, and the one who best tackles that complexity up front can take home all the marbles.
Undoubtedly, Fowler suggested this as an incomplete step in the development process, essentially to get the code to compile.
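For illustration, the "hard-code it first" step being argued about might look like the sketch below. This is a hypothetical reconstruction, not Fowler's actual code: the function names, the `customer_tier` parameter, and the tier rates are all invented for the example. The point is only that the constant is a placeholder until a second requirement forces a more general shape.

```python
def get_discount_v1() -> float:
    # Step 1: the only known requirement is "the discount is 50%",
    # so a constant is enough to make the first test pass.
    return 0.5


# Hypothetical tiers and rates, invented for this sketch.
DISCOUNT_BY_TIER = {"gold": 0.5, "silver": 0.2}


def get_discount_v2(customer_tier: str) -> float:
    # Step 2: a second requirement (a different rate for another tier)
    # is what justifies replacing the constant with a lookup.
    # Unknown tiers get no discount.
    return DISCOUNT_BY_TIER.get(customer_tier, 0.0)
```

Whether step 1 is a reasonable stepping stone or a short-sighted shortcut is exactly the disagreement in this sub-thread; the sketch takes no side.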
Incredible that we can tar both complexity and simplicity with the brush of "consultant BS."
Before you write a parser, try a regex. (But sometimes you really do need a parser.)
I would argue that regexes are often more complex than simple parsers.
That's where the familiarity factor steps in.
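As a rough illustration of the trade-off in this sub-thread, here is the same small job done both ways in Python: splitting a `key=value; key=value` line into a dict. The input format, the pattern, and the `parse_pairs` helper are assumptions invented for the example, not anything from the article.

```python
import re

LINE = "retries=3; timeout=2.5; mode=fast"

# Regex version: one pattern, terse, but the reader has to decode it.
PAIR = re.compile(r"\s*(\w+)\s*=\s*([^;]+?)\s*(?:;|$)")
via_regex = dict(PAIR.findall(LINE))


def parse_pairs(text: str) -> dict:
    """Hand-written version: more lines, but each step is spelled out."""
    result = {}
    for chunk in text.split(";"):
        if not chunk.strip():
            continue  # tolerate a trailing ';'
        key, sep, value = chunk.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {chunk!r}")
        result[key.strip()] = value.strip()
    return result


via_parser = parse_pairs(LINE)
```

Both produce the same dict for this input. Which one counts as "simpler" probably depends on how fluent the reader is in regex, which is the familiarity point made above.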
The problem is sometimes you don't get the right feedback to know when to stop building.
I think this article actually expresses a dangerous, risk-prone approach to problem solving, and one which ultimately causes more problems than the ones it solves.
The risk is misunderstanding the problems they are solving, and ignoring all the constraints that drove the need for some key design traits that were in place to actually solve the problem (i.e., complexity)
Take the following example from the article:
> You should do that too! Suppose you've got a Golang application that you want to add some kind of rate limiting to. What's the simplest thing that could possibly work? Your first idea might be to add some kind of persistent storage (say, Redis) to track per-user request counts with a leaky-bucket algorithm. That would work! But do you need a whole new piece of infrastructure?
Let's ignore the mistake of describing Redis as persistent storage. The whole reason why rate limiting data is offloaded to a dedicated service is that you want to enforce rate limiting across all instances of an API. Thus all instances update request counts on a shared data store to account for all traffic hitting across all instances regardless of how many they might be. This data store needs to be very fast to minimize micro services tax and is ephemeral. Hence why a memory cache is often used.
And why do "per-user request counts in memory" not work? Because you enforce rate-limiting to prevent brownouts and ultimately denials of service triggered in your backing services. Each request that hits your API typically triggers additional requests to internal services such as memory stores, querying engines, etc. Your external facing instances are scaled to meet external load, but they also create load on internal services. You enforce rate-limiting to prevent unexpectedly high request rates from generating enough load to hit bottlenecks in internal services which can't or won't scale. If you enforce rate limits per instance, scaling horizontally will inadvertently lift your rate limits as well and thus allow for brownouts, defeating the whole purpose of introducing rate limiting.
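A small simulation makes the scaling effect concrete. It is a sketch under assumptions, not anyone's production limiter: `FixedWindowCounter`, `accepted_requests`, the round-robin balancer, and the limit of 100 are all hypothetical, and a plain Python object stands in for the shared store (Redis or similar). A fixed-window counter is used instead of a leaky bucket only to keep the example short.

```python
from collections import defaultdict
from itertools import cycle


class FixedWindowCounter:
    """Per-user request counter for a single time window (illustrative only)."""

    def __init__(self, limit_per_window: int) -> None:
        self.limit = limit_per_window
        self.counts = defaultdict(int)

    def allow(self, user: str) -> bool:
        if self.counts[user] >= self.limit:
            return False
        self.counts[user] += 1
        return True


def accepted_requests(instances: int, limit: int, requests: int,
                      shared_store: bool = False) -> int:
    """Send `requests` from one user through a round-robin balancer, in one window."""
    if shared_store:
        # Every instance consults the same counter (stand-in for a shared cache).
        store = FixedWindowCounter(limit)
        limiters = [store] * instances
    else:
        # Every instance keeps its own in-memory counter.
        limiters = [FixedWindowCounter(limit) for _ in range(instances)]
    balancer = cycle(limiters)
    return sum(next(balancer).allow("user-1") for _ in range(requests))


single = accepted_requests(instances=1, limit=100, requests=1000)
per_instance = accepted_requests(instances=4, limit=100, requests=1000)
shared = accepted_requests(instances=4, limit=100, requests=1000, shared_store=True)
```

With one instance the user gets 100 requests through. With four independent in-memory counters the same user gets 400, so the effective limit grew with the instance count. With one shared counter it is back to 100.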
Also, leaky bucket algorithms are employed to allow traffic bursts but still prevent abuse. This is a very mundane scenario that happens on pretty much all services consumed by client apps. Once an app is launched, it typically runs authentication flows, fetches the data required at startup, and so on. After the app has initialized, it is back to baseline request rates. If you have a system that runs more than a single API instance, requests are spread over instances by a load balancer. This means a user's requests can be routed to any instance in an unspecified proportion. So how do you prevent abuse while still allowing these bursts to take place? Do you scale your services to handle peak loads 24/7 to accommodate request bursts from all your active users at any given moment? Or do you allow for momentary bursts spread across all instances, regardless of what instances they hit?
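For reference, a leaky bucket used as a meter can be sketched in a few lines. The class name, the parameters, and the explicit `now` argument (passed in so the behaviour is deterministic) are assumptions for this example. The state lives in one process here, so on its own it has exactly the per-instance weakness described above; the comment's argument is that the bucket level would need to live in a store shared by every instance.

```python
class LeakyBucket:
    """Leaky bucket as a meter: each request adds one unit, the level drains over time."""

    def __init__(self, capacity: float, leak_per_second: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.level = 0.0
        self.last = now

    def allow(self, now: float) -> bool:
        # Drain whatever leaked out since the previous call.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_per_second)
        self.last = now
        # Reject if one more unit would overflow the bucket.
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True


bucket = LeakyBucket(capacity=5, leak_per_second=1)
burst = [bucket.allow(now=0.0) for _ in range(7)]  # app-start burst at t=0
later = bucket.allow(now=2.0)                      # two seconds later
```

The first five requests of the burst are accepted and the next two are rejected; two seconds later the bucket has drained enough to accept traffic again. That is the "allow bursts, cap the sustained rate" behaviour the paragraph describes.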
Sometimes a problem can be simple. Sometimes it can be made too simple, but you accept the occasional outage. But sometimes you can't afford frequent outages and you understand a small change, like putting up a memory cache instance, is all it takes to eliminate failure modes.
And what changed in the analysis to understand that your simple solution is no solution at all? Only your understanding of the problem domain.
The author should actually design a successful large software system and try again.
Honestly, this is how all computing works.
For example, chips just barely work.
If they work too well, you could shrink the chip until it barely works making it cheaper or faster or use less power.
That said, although this exercise is kind of interesting - like playing jenga - it might not be fun or satisfying.
better faster cheaper - sometimes you need to choose better.
A related concept: https://en.wikipedia.org/wiki/Poka-yoke
Also: https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it
You aren't gonna need it
using unicorn as a positive example is, well, a pretty negative signal
unicorn, i.e. CGI, i.e. process-per-request, became anachronistic, gosh, more than 20 years ago at this point!
at least, if you're serving any kind of meaningful load -- a bash script in a while loop can serve 100RPS on an ec2.micro, that's (hopefully) not what anyone is talking about
It's great advice for an individual or a work item. It's great advice for teams where untamed programmers run rampant. It's good advice when teams are sane.
But, it's terrible for 2025's median software team. I know that isn't OP's intention. But inevitably, all good advice falls prey to misinterpretation.
In contemporary software orgs, build fast is the norm. PMs, managers, sales & leadership want Engg to build something that works, with haste. In this imagination, simple = fast. Let me say this again. No, you cannot convince them otherwise. To the median org, SIMPLE = FAST. The org always chooses the fastest option, well past the point of diminishing returns. Now once something exists, product managers and org leaders will push to deploy it and sales teams will sell dreams around it. Now you have an albatross around your neck. Welcome to life as a fire fighter.
For the health of a company and the oncall's sanity, Engg must tug at the rope in the opposite direction from (perceived) simplicity. The negotiated middle ground can get close to the 'simple' that OP proposes. But today, if Engg starts off with 'simple', they will be rewarded with a 'demo on life support'. At a time when vibe coding is trivial, I fear things will get worse before they get better.
All that being said, OP's intended advice is one that I personally live by. But often, simple is slow. So, I keep it to myself.
Another way I like to think about this is finding 'closeable' contexts to work in; that is, abstractions that are compact and logically consistent enough that you can close them out and take them on their external interface without always knowing the inner details. Metaphorically, your system can be a bunch of closed boxes that you can then treat as boxes, rather than a bunch of open boxes whose contents are spilling out and into each other. Think 'shipping containers' instead of longshoremen throwing loose cargo into your boat.
If you can do this regularly, you can keep the _effective_ cognitive size of the system small even as each closed box might be quite complex internally.
A few notes:
1) Sometimes the simplest thing is still extremely complex
2) The simplest thing that works is often very hard to find
See also https://wiki.c2.com/?DoTheSimplestThingThatCouldPossiblyWork
> when I asked [KentBeck], "What's the simplest thing that could possibly work?" I wasn't even sure. I wasn't asking, "What do you know would work?" I was asking, "What's possible? What is the simplest thing we could say in code, so that we'll be talking about something that's on the screen, instead of something that's ill-formed in our mind?" I was saying, "Once we get something on the screen, we can look at it. If it needs to be more, we can make it more.
I wanted to like this article and there's some things in there to agree with but ultimately it's a very uninteresting take with a very unconvincing rate limiting example.
> System design requires competence with a lot of different tools: app servers, proxies, databases, caches, queues, and so on.
Yes! This is where I see so many systems go wrong. Complex software engineering paving over a lack of understanding of the underlying components.
> As they gain familiarity with these tools, junior engineers naturally want to use them.
Hell yea! Understanding how kafka works so you don't build some crazy queue semantics on it. Understanding the difference between headless and clusterIP services in kubernetes so you don't have to build a software solution to the etcd problems you're having.
> However, as with many skills, real mastery often involves learning when to do less, not more. The fight between an ambitious novice and an old master is a well-worn cliche in martial arts movies
Wait what? Surely you mean doing more by writing less code. Are you now saying that learning and using these well tested, well maintained, and well understood components is amateurish?
"When in doubt, use brute force." --Ken Thompson
I appreciate the sentiment, and absolutely think it should be kept in mind more.
But of course, it runs afoul of reality a lot of the time.
I recently got annoyed that the Windows Task scheduler just sometimes... Doesn't fucking work. Tasks don't run, and you get a crazy mystery error code. Never seen anything like it. Drives me nuts. Goddamned Microsoft making broken shit!
I mostly write Powershell scripts for automating my system, so I figure I'll make a task scheduler which uses the C# PSHost to run the scripts, and keep the task configuration in a SQLite database. Use a simple tray app with a windows form and EFCore for SQLite to read and write the task configuration. Didn't take too long, works great. I am again happy, and even get better logging out of failure conditions, since I can trap the whole error stream instead of an exit code.
My wife is starting a business, and I start to think about using the business to also have a software business piece to it. Maybe use the task scheduler as a component in a larger suite of remote management stuff for my wife's business which we sell to similar businesses.
Well. For it to be ready, it's got to be reliable and secure. Have to implement some checks to wait if the database is locked, no biggie. Oh, but what happens if another user running the tray icon somehow can lock the database, I've got to work out how to notify the other user that the database is locked... Also now tasks need to support running as a different user than the service is running under. Now I have to store those credentials somewhere, because they can't go into the SQLite DB. DPAPI stores keys in the user profile, so all this means now I have to implement support for alternative users in the installer (and do so securely, again with the DPAPI, storing the service's credentials).
I've just added a lot of complexity to what should be a pretty simple application, and with it some failure modes. Paying customers want new features, which add more complexity and more concern cases.
This is normal software entropy, and to some extent it's unavoidable.
Had to make an account for this, but this is almost verbatim what the article is talking about:
Your wife is making a business, and you want to write some code to help.
Then suddenly your requirements balloon to multiple concurrent users, needing to have a system tray icon and then also the ability to take this code and sell it to other people. Wow this project is suddenly complex!
This is just "I need to be able to scale infinitely" written in different words. The complexity comes from wanting a ton of things before they're actually needed (with the wrinkle of wanting to use some previously written scheduler for this project).
One of my favorite movie quotes is from Scotty in Star Trek: "The more they overthink the plumbing, the easier it is to stop up the drain."
Complexity is sometimes necessary, but it always creates more ways things can break.
i want to hear an input from winapi team on this.
Ockham's Software Architecture...
I call it the "Ditt-Kuh-Pow", the Dumbest Thing That Could Possibly Work.
Said that in a telephone call one time, and the guy leading that was all "I'm mildly disturbed that you had a verbalization for that."
Get it out. Make it work. Love your work.
Mostly I agree with what the author is saying. But there is a clear distinction between the simplest "system" and doing the simplest thing.
The simplest thing to do is almost always the easiest, but knowing what the easiest thing to do actually is can be a lot trickier - see Javascript frameworks.
But I think I disagree with the author's second axiom:
"2. Simple systems are less internally-connected."
Creating interfaces is more complex than not, even if it leads to a cleaner design because of interface boundaries. At the least, creating those boundaries adds complexity, and I don't mean "more effort". I mean it in the sense that creating functions is more complex than calling "goto". And it took decades to invent the mechanism needed to call functions - which is probably the next most simple thing.
However, using call stacks and named pointers and memory separation (functions) leads to vastly improved simplicity of the system as the system as a whole grows in complexity.
So in fact, using your own in-memory rate limiter may be a simpler implementation than using Redis, but it also violates the second principle (using clear interfaces leads to simpler systems).
And it overturns the author's first premise, that Gunicorn is simpler than Puma, because Puma does the equivalent of building their own rate limiter: managing its own memory and using threads instead of processes.
And Gunicorn does the equivalent of using Redis â externalizing the complexity.
What Gunicorn did was simpler to implement (because it relies on an existing isolated architecture - Unix processes and files) but means it has a greater complexity (if you take into account that it needs that whole system to work).
However that system is a brilliant set of reductions in complexity itself, but it runs up against limitations and performance at some point.
Puma takes on itself more complexity to make administering the server less complex and more performant under load. Also, because it is, in a sense, reinventing the wheel, it lacks the distillation of simplicity that is Unix.
So, less internally connected systems are easier to expand and maintain, and interface boundaries lead to less complex systems as a whole, but are not, in themselves, less complex.
Limitations in the system that cause performance problems (like Unix processes and function calls) are not necessarily "more simple than can possibly work" - but the implementations of those abstractions are not perfect and could be improved.
Sometimes it's not clear where to push the complexity, and sometimes it's not clear what the right abstraction level is; but mostly it's about making do with the existing architecture you have, and not having the time or resources to fix it. Until the complexity at your level reaches a point that it's worth adding complexity at a higher level due to being unable to add the right amount of complexity at a lower level.
This is the advice I've been unsuccessfully trying to drill into the heads of developers at a large organisation. Unfortunately, it turns out that the "simplest thing" can be banged out in a couple of days -- mere hours with an AI -- and that just isn't compatible with a career that is made up of 6-month contracting stints. It's much, much more lucrative to drag out every project over years and keep collecting that day-rate.
Many "industry best-practices" seen in this light are make-work, a technique for expanding simple things to fill the time to keep oneself employed.
For example, the current practice of dependency injection with interfaces, services, factories, and related indirections[1] is a wonderful time waster because it can be so easily defended.
"WHAT IF we need to switch from MySQL to Oracle DB one day?" Sure, that... could happen! It won't, but it could.
[1] No! You haven't created an abstraction! You've just done the same thing, but indirectly. You've created a proxy, not a pattern. A waste of your own time and the CPU's time.
This is also good advice for personal projects - want to ship stuff? Just do what works, nobody cares!
wonderful piece.
applies to the narrative
'unfuck' anything
as well. any industry and any
'behavioral lock in'
and so on.
Nooo you can't use deterministic math, you gotta use kalman filters and particle filters and factor graphs and spend three months tweaking parameters to get a 5% improvement! /s
Like alright in some situations that's the only thing that could possibly work, but shoving that complexity into every state estimator without even having a way to figure out the actual covariance of the input data is a bit much. Something that behaves in an expected way and works reliably with less accuracy beats a very complex system that occasionally breaks completely imo.
Claude is great at this! If you avoid refactoring at all costs and put everything as close as possible to the relevant code, you maximise the chance it works and minimise pesky DRY complexities.
Don't bother with SSL, it adds complexity.
Don't add passwords, just "password" is fine. Password policies add complexity.
For services that require passwords just create a shared spreadsheet for everyone.
/s
Isn't reading the article before posting comments considered cool anymore?