Velocity's Edge Podcast S1E6 - Thomas Dullien & Chris Swan on Decision Records

Date posted

September 17, 2025

Written by

EPSD

Most engineering leaders think institutional knowledge loss is an inevitable cost of growth. They see departing employees take critical context with them—why certain processes exist, what problems they solve, how trade-offs were evaluated—and assume the solution involves better handoff documentation or knowledge transfer sessions. But as EPSD Advisory Board members Thomas Dullien and Chris Swan learned through building and scaling organizations, the biggest risk isn’t losing people; it’s losing the reasoning behind the decisions those people made.

The difference between organizations that scale smoothly and those that constantly rehash the same choices isn’t primarily about retaining talent longer. It’s about capturing decision context before it walks out the door. When teams inherit processes or systems without understanding their origins, they either follow them blindly like cargo cults or abandon them entirely—often recreating the same expensive mistakes that led to those processes in the first place.

“At the moment you find yourself repeating something for the second or third time, it’s time to put it in writing,” Thomas explains.

Decision records aren’t just about documentation. They’re about preserving the why behind important choices and recognizing a fundamental truth: institutional memory is strategic infrastructure, not administrative overhead.

In this episode of Velocity’s Edge, Thomas, Chris, and host Nick Selby explore why decision records have become crucial for scaling teams. They tackle essential questions: How do you quantify the ROI of maintaining decision records when your time is already stretched thin? Why might documenting decisions actually accelerate execution rather than create bureaucratic drag? How do decision records help both technical architecture and business operations?

The conversation reveals why the most resilient organizations aren’t necessarily those with the lowest turnover—they’re the ones that understand how to capture decision context so future teams can make informed choices about what to change and what to preserve.

Thomas Dullien, also known as Halvar Flake, is a security and efficiency expert with deep expertise in reverse engineering, vulnerability research, and cloud economics. He founded a malware analysis company acquired by Google, where he later contributed to research on Rowhammer. He also co-founded a firm focused on system-wide performance profiling, later acquired by Elastic. His work explores the intersection of computing efficiency, economics, and sustainability.

Chris Swan is a technology leader specializing in cloud, security, and software architecture. He is an Engineer at Atsign, working on privacy-focused solutions that put users in control of their data. Previously, he held CTO and R&D leadership roles at DXC Technology, UBS, and Credit Suisse. Chris is also a Google Developer Expert in Dart and co-hosts the Tech Debt Burndown Podcast.

As in all our episodes, we speak in plain, executive-summary business terms, framing complex business and technology strategic challenges in context, using language that makes them more accessible and actionable.

Listen here, download it from Apple Podcasts, Spotify, or find it wherever you get your podcasts.

Episode Information
Season 1, Episode 6
Length: 26 minutes, 11 seconds
Host: Nick Selby
Guests: Thomas Dullien, Chris Swan
Recorded: VOXPOD Podcast Studios, Parsons Green, London
Engineer: Dayna Ruka
Editor: Dayna Ruka, Jeet Vasani

Episode Transcript

Nick Selby: It’s Velocity’s Edge podcast. I’m Nick Selby. We don’t do self-promotion or ads on this podcast, but there’s a story about an internal process we did as I was founding EPSD.

Thomas was the first person I called to be on our advisory board, and I just love how his mind works. He’s set up all these successful companies, which he might talk about. But what I really liked was he has set up companies in niche areas doing very niche things.

We were talking about the things that he would do as an advisor to the company. The first thing that he said was, “I’m not really as concerned about the decisions that you make as with the fact that you write them down. You document the context of the decisions that you’ve made so that people in the future can understand where these things came from: what was the intent? And they don’t have to sit there and wonder, what on earth were we thinking?”

And then the second person I talked to was Chris Swan, who’s also here today. We also do the Tech Debt Burndown podcast. And right after this conversation with Thomas, Chris said almost the same thing about architecture decision records, making a decision registry. So I want to ask the two of you guys if you can please introduce yourselves, and then we can jump into the conversation. So, “Halvar,” Thomas, can you please introduce yourself?

Thomas Dullien: Sure. Thank you. My name is Thomas. I used to do security work for a very long time under the name of Halvar, and I, over the years, have done two companies. One which was doing specialized reverse-engineering tooling, which got acquired by Google in 2011, and another one that did computational efficiency tooling that was acquired by Elastic in 2021. And I’ve hence had the experience of running small companies, but also working in very large companies.

I was at Google while they expanded from like 12,500 people, or something like this, to 100-something thousand, so you see the growth in organizations. From these experiences, I’ve got strong opinions about what works and what doesn’t work in organizations.

I’m also, for almost familial reasons, very interested in how organizations thrive and fail. My father did a lot of matrix organization research and consulting back in the day, and he used to joke that the only places he’s ever seen a matrix organization work is when nobody was aware there is one. So, it’s really quite fascinating how large organizations succeed or fail, and perhaps it’s a bit weird that as a mathematician I’m quite interested in this. But in the end, when your ambition exceeds what you can do individually, you will need to deal with organizations and then you’ll deal with all the pathologies of them, so it’s best to start studying them.

NS: Wow. Okay. And Chris Swan, as I said, we do the Tech Debt Burndown podcast together, and this is probably going to be replayed on Tech Debt Burndown. Chris, can you introduce yourself?

Chris Swan: So hi, I’m Chris Swan. I am an advisor with Nick at EPSD, but I spend most of my time at Atsign, where I’m an engineer focused on progressive delivery. I think where the decision thing comes in, is this has been a very conscious thing for us at Atsign that, partly because we run a flat organization and don’t have a traditional hierarchical management structure, we very consciously chose to have a mechanism by which we could communicate our decisions to each other and get collaboration around how we make decisions.

That manifests itself in two places. On the tech side of the house, we make use of architecture decision records. People can see that if they go into our GitHub, to the at_protocol repo, there’s a folder there called “decisions.” Pretty much all of the decisions we make on the tech side are there in the open, because we’re mostly working on open source software.

Then, on the more admin side of the company, we have decision records. That’s essentially the vehicle by which money gets allocated. If you want to do anything which is going to involve spending money, then you need to bring people along with you and persuade them of that. That takes the form of a decision record, which is just a Google Doc where we have a template, and you fill it out. We encourage dissent on that, so that we’ve got a healthy tension around how decisions get made.

That also then gives us a mechanism to go back and understand how we arrived at things. That applies both to the ADRs and the decision records, and lets us sometimes revisit topics and go, “Well, what were we thinking at the time, and what’s changed about the landscape around us, and how do we then adapt?”

NS: You just said that’s because you’re an open source software company, and I’m struggling to think of another open source project or company that’s actually publishing their decision records. Is this becoming more common?

CS: It is. We link to a resource, which is also hosted on GitHub, which is all about ADRs. So it explains ADRs, and where they’ve come from, and how people are using them. Part of that is a page where it’s the list of organizations using ADRs, and if you click into that, then you can see them out there in the wild.

NS: It sounds exhausting. So my next question is really going to be about ROI, and why do you want to do it? And don’t get me wrong– I’ve become a convert to this. I think it’s really great.

So, the definition here, right? Decision records are a strategic imperative once your leadership teams understand that your institutional knowledge will walk out the door with departing employees.

We’ve been talking quite a bit about how in a lot of scale-ups, the people who are running the things aren’t the people who built them. The executives who are running things aren’t the ones who founded the company. Once that institutional knowledge leaves, it can get really distracting, with regular battles over why we’re doing certain things. Both of you guys have done this from day one.

Thomas, I remember we had a conversation just when you had founded that other company, and across the board, some of the decisions that you made were… it was really inspiring how you were both thinking things through and then writing it down. Can you talk a little bit about how you’ve quantified the return on investment of maintaining these records versus the time that you spend doing it?

TD: I mean, quantified is a strong word in the sense that that would mean attaching dollars and cents to the decision. But I think the first thing is, even when starting a startup, you know that your time is going to become the bottleneck as one of the founders. So everything that is important to you needs to be transmissible to others. At the moment you find yourself repeating something for the second or third time, it is time to put it in writing, because everything else will not scale– even less so if you’re a fully remote organization.

My first company was on site. I mean, that was 2004. Second company was fully remote in COVID. And funnily enough, the first time all employees met was post-acquisition at an all-hands of the company that had acquired us, because we had planned on meeting every quarter, and then COVID happened, so we never met. We still managed to have a very coherent and strong company culture and decision making, partially by being very focused on writing things down.

In some sense you’re almost forced, if you have a remote work component or significant remote work component, to be more writing-oriented. Then everything else flows a little bit naturally from this. When you’re in the office every day, it’s much easier to transmit your values and your culture by essentially just talking to everybody, but even that hits its limits, right?

As companies scale, you are further and further away from everybody, and you have less and less influence on everything that’s happening. As a founder, you set culture, first and foremost, and then culture is also what you do, and what you reward. So you need to write these things down, like what is important to you. Then when you make decisions, you also try to tie that back to the culture parts.

You can argue that Amazon, during the Bezos days, had a very strong culture of tying decisions back to the fundamental values that Jeff Bezos had declared. Similarly, even Google, which was a much more chaotic and less structured place than Amazon, the values articulated by the founders are often used as tie breakers in decision making. When there’s multiple solutions, you ask, “Okay, which one of these solutions most closely aligns with our values?” And then you go after that.

And regarding the ROI, the real issue is that processes and organizations, the way you do things are to some extent also “scar tissue.” For almost every process there is a bad incident that happened that hurt, and that’s why the process was created in the first place. Except that most organizations don’t document how they got hurt. Right? And then two employee generations later, which in the current culture is what, four years? Five years? Nobody knows anymore what that process is for, and that can have only two results. Either you blindly follow a process that you don’t understand and don’t see the value of as a form of cargo cult, or you ignore it, and then you get hurt again.

And you can’t make a differentiated decision on “Is the thing that this process was trying to protect us from still a threat today?” And “Is the benefit I get from shortcutting this process worth the risk?” You can’t make an informed risk decision if you don’t document why the heck that process exists. Then you end up in these situations where… you go into any large organization, and they want to bring in somebody as a part-time contractor, and all of a sudden you’re in a six-month vendor onboarding process.

In some sense, a healthy organization is capable of executing things that are a net benefit to the organization, and the organization itself does not act as an impediment to do something that is beneficial to the org. There’s going to be rules and so forth so people don’t just do crazy stuff, but in general, a healthy organization should be able to do something that is obviously beneficial. But realistically, once you’re in a big organization, how often do you run into something where something that’s clearly beneficial cannot be done because of existing processes?

When we started optimyze… the initial idea for optimyze, for the efficiency work, was we tried to sell to companies. We’re going to essentially reduce your cloud bill against a cut of the savings. We’re going to bring in a bunch of experts, we’re going to look at the way that you’re spending compute, and we’ll help you reduce your cloud bill for free. All we want is a small fraction of the money we save you. So this is a net positive financially, and you have very little risk, and it looks great on paper.

It’s a very engineer-typical business idea, which neglects the fact that accounting cannot figure out how to account for something that is net positive, but that comes out of a cost center. And legal cannot figure out how to deal with a contractor that will create money, in some sense. The contracting isn’t set up to do it. Accounting doesn’t know how to do it. And unless you have a long-established relationship with somebody that has enormous pull at the company, ideally the CEO, there’s no way you’re going to pull it through, because you need somebody that is authorized to overrule all the processes. And you can’t.

When good things that should happen don’t happen.

At a smaller scale, when Google acquired my first company, we had a product that was being sold, and Google was offering through Google Checkout at the time, the ability to charge customers money, but Google themselves were organizationally incapable of lifting a pen to write an invoice if it didn’t involve like, $50,000. Google could offer the service of charging other people’s credit cards, but Google themselves couldn’t do it in any sensible way, and then accounting couldn’t figure out why security, which is a cost center, would all of a sudden generate revenue somewhere.

This is just an example of the things that happen as things scale up, and the inability to make informed risk decisions when you can’t tell why the rule is there in the first place.

NS: I see this so often.

Chris, we’ve talked about this quite a bit. Does it even affect the speed of execution? I think it actually might even make you execute faster.

CS: Yeah, I think it can do. If I think of my own experience of this, it’s actually very similar to what Thomas has just been describing. An organization that is globally dispersed across locations and time zones, never had an office, so there’s no notion of remote because remote from what? There was never anything to be remote from.

I think an important part of what’s going on here is, if we think about an “oral communication culture” versus a “written communication culture,” the exercise of writing often makes you clarify your knowledge of something. For me, this is one of the reasons why I write blog posts: I might think I understand something, and when I sit down and try and write about it, I realize I didn’t actually properly understand it. To crank out the blog post, I’ve forced myself to properly understand it. Sometimes the beneficiary of that is future me, because I’ll come back to the problem having forgotten that I even took it on before, and there’s my blog post sat there to remind me. I think some of these decision records serve exactly the same purpose.

But this is where I’m going to unpin my grenade for today and throw it into the room. A former colleague yesterday showed me a use of decision records that I had not really conceived, and which I think might end up vastly driving the adoption of them.

At the moment it feels like one of those practices which, if we’re talking DORA or something like that, is probably evidence of elite ways of working that, along with trunk-based development and continuous delivery and everything else, you’re going to find architecture decision records.

I think this might be the thing that drives it into the mainstream: my former colleague, Doug, has been doing some work with AI coding tools. As part of providing context to the AI coding tool, he created a set of ADRs about what he wanted it to build, the decisions that he’d made, and that then drove the shape of what was being built by the AI coding tool.

TD: I agree entirely. If you use something like Claude Code or Gemini CLI, you’re supposed to provide a Markdown file that tells the AI assistant the rules for the repository. I think one of the risks with documenting your decisions is, in the past, people still had to have the time to actually read them. You would enable future people to read this, but only if they cared enough.

If you look at what AI… there’s many, many flaws with current AI systems still, but they are on a fairly impressive curve. And one thing that has gotten dramatically better is information discovery, in the sense that you can ask an AI, “Hey, where else in literature has something like this been discussed?” And with search grounding, it’ll sometimes hallucinate things, but by and large, you’ll get a much better search engine that can contextualize and summarize documents. And for any of these decision records, if you hook them up to retrieval-augmented generation with an AI, you can then ask natural language queries about decisions and then refer to the original documents when you don’t trust what the AI has summarized. But the discoverability of the reasons is much augmented by these things.

It’s one of the most amusing corollaries from the advent of AI, that a lot of the things that I like from my engineering, anyhow– like having a design doc of what should be done beforehand, discussing the design doc, then going on to implementation and checking the design in, ideally to the repository, so people can refer back to it, but also just having a style guide, having your values documented, like what solutions do you prefer?– all of these things are helpful when you onboard a new person remotely, but they’re also extremely helpful when you let an AI agent loose on your code.

The value of unit tests is so enormous with these AI agents, because they have many stupid ideas that they try, and if you don’t tell them, “Please make sure that all the unit tests pass and the output is quite identical to what we had beforehand,” they will do something really stupid and declare victory. And then you’re involved and you need to weed through all the stupid ideas. But what helps your engineers to onboard themselves remotely helps an AI work better, and that’s actually exciting, because I feel vindicated in some sense: you do this work, and now technology comes along that leverages the work that you’ve done to make further decisions.
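To make the idea of querying decision records concrete, here is a minimal sketch of only the retrieval step. It is not retrieval-augmented generation as an AI tool would implement it: it ranks records by plain word overlap with the question, and a real setup would more likely use embeddings and pass the retrieved text to a model. The file names, record contents, stop-word list, and the function names `tokenize` and `rank_records` are all hypothetical.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list so common words do not dominate the score.
STOP_WORDS = {"a", "an", "the", "do", "we", "have", "why", "is", "of",
              "to", "and", "for", "what", "our"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def rank_records(question: str, records: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the records sharing the most words with the question."""
    q = Counter(tokenize(question))
    scores = {}
    for name, body in records.items():
        words = Counter(tokenize(body))
        # Score = number of overlapping word occurrences.
        scores[name] = sum(min(q[w], words[w]) for w in q)
    ranked = sorted(scores, key=lambda n: scores[n], reverse=True)
    # Records with no overlap at all are left out.
    return [n for n in ranked if scores[n] > 0][:top_k]


# Hypothetical records; in practice these might be files in a decisions folder.
records = {
    "vendor-onboarding.txt": (
        "Why: a contractor was once given access before legal review. "
        "Decision: every vendor goes through an onboarding process "
        "with security and legal checks."
    ),
    "interview-process.txt": (
        "Why: the interview should resemble the actual work. "
        "Decision: candidates complete a realistic work sample "
        "as part of the interview process."
    ),
}

print(rank_records("Why do we have a vendor onboarding process?", records, top_k=1))
```

The function returns file names and not a summary, which matches the point made above: the original document stays available to check whenever the AI's summary is not trusted.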

NS: I was thinking of Tracy Bannon’s comment about AI being that teenage apprentice in the woodshop, and what you just said brings that back. It’s one of my favorite things that I’ve heard, and I heard that through Chris.

We can make AI less stupid. Can we use decision records to impart ideas about how we do things to non-technical executives, especially people in things that influence but might not interact with engineering, like revenue and sales? How can those things help? Do you guys have any use cases there?

CS: I think absolutely, yes. I talked already about how we’ve got essentially two classes of decision record. One, the architecture decision records in engineering. But the other is the more general decision records around running the business. And that’s absolutely part of the executive focus.

I think this then becomes a central pillar of the culture of the organization. That, first of all, we’re intentional about basically everything that’s non-trivial– and that’s a surprising thing.

We talk about “vibe coding” a lot at the moment, but we’ve been doing a lot of “vibe orging” since time immemorial. And actually, if you step away from vibe orging and say, “Right, we’re going to be intentional about all of the non-trivial things that we do, and hence we’re going to write them down, and we’re going to debate them, and we will understand the pros and cons, and we will occasionally revisit this if we need to.” Then it’s just a much more thoughtful way of going about things.

I think this brings us into some of the practices that Simon Wardley discusses with his approaches to mapping. I think mapping is a structure by which you get intentional consensus around things. It starts to push aside the, “I’ve got the highest paid person’s opinion, and therefore you have to do what I tell you,” in favor of “We’ve got a bunch of smart people in the room, and hopefully we’ve only hired smart people, and collectively we’ve decided that this is the best way to go.”

TD: There’s a part where the documenting that a process has happened is almost as important as, or sometimes more important than, the output of the process.

One thing I find particularly hilarious is when you look at seed-stage pitch decks for startups; they all have some sort of calculation of financials at the end, and everybody knows they’re all fiction. There’s the Excel sheet with the financials for a seed-stage startup. No investor has any expectation of that actually being what happens. But you need that artifact, and it needs to look somewhat plausible, not to deceive anybody, just to show somebody sensible has sat down, and the person that sat down and did this is not entirely delusional.

The output itself has zero predictive value. It literally only serves as documentation. Somebody thought about this, and that person wasn’t openly crazy.

CS: I think when you get into doing due diligence on those types of things, you very quickly get into “What are the underlying assumptions here, and are they good assumptions?” And you might also be doing a sensitivity analysis of, “If I adjust these cash flows by these deltas, where does that actually take me?” And you can then hone in on “Is this a great big dollop of wishful thinking, or can I actually relate this to some conceivable real world?” All of those are useful things to do.

I think where those pitch decks hit reality is when people are doing that analysis, which any mature investment organization is likely to be doing. And, of course, the flip side of that is we can see, particularly, mergers and acquisitions going horribly wrong.
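The sensitivity analysis Chris describes can be sketched in a few lines. This is an illustration only: the cash flows, the 10% discount rate, the deltas, and the function names `npv` and `sensitivity` are all hypothetical, and real due diligence would vary many more assumptions than a single scaling factor.

```python
def npv(cash_flows: list[float], discount_rate: float) -> float:
    """Net present value, treating the first cash flow as happening at time zero."""
    return sum(cf / (1 + discount_rate) ** t for t, cf in enumerate(cash_flows))


def sensitivity(cash_flows: list[float], discount_rate: float,
                deltas: list[float]) -> dict[float, float]:
    """NPV after scaling every cash flow except the first by (1 + delta)."""
    results = {}
    for delta in deltas:
        adjusted = [cash_flows[0]] + [cf * (1 + delta) for cf in cash_flows[1:]]
        results[delta] = npv(adjusted, discount_rate)
    return results


# Hypothetical figures: an up-front investment followed by three years of inflows.
base = [-1000.0, 400.0, 500.0, 600.0]
for delta, value in sensitivity(base, 0.10, [-0.5, -0.2, 0.0, 0.2]).items():
    print(f"{delta:+.0%} on projected inflows -> NPV {value:,.0f}")
```

If a modest cut to the projected inflows turns the result negative, the plan leans heavily on its optimistic assumptions, which is the "wishful thinking" test described above.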

TD: Sorry, did you say murders or mergers?

NS: (Laughs) Yes.

CS: I was about to say, they go horribly wrong when somebody (normally a senior executive) pins their reputation on “We’re going to make this thing happen, and we’re going to make it happen by buying this company.” And that then often becomes a blinder to everything that could be wrong with the financials.

NS: Yeah.

Can you give us your lightweight decision record template? If you were talking to a friend who’s a CEO and they just want to start a company, or you’re talking to a friend who is a CEO and they already have a company, what are the things that they should do tomorrow to start this process? How do they do it?

TD: When you create a process for your company, any process, you document what your reasons are– in any way– free form text! If you want to have a template, you can have templates. The reality is any template will have to evolve over time anyhow. I think this may be the big benefit of the modern natural language processing tools: that you can deal with the fact that your data is going to be unstructured. Just write down what are your motivations, why you’re doing this.

When we did the optimyze interview process, we essentially started from “The interview process should closely resemble the actual work. The actual work looks like this. So we structure the interview process like that.” And then you go with it.

Another useful thing to document is when you make your decision, you document the decision and you also document what new data would make you change your mind.

NS: Oh, nice.

TD: And once you get into the habit of it, it’s– if you’re mean you could say this is journaling for managers– but it is journaling for managers. But it’s also useful.

CS: I think Thomas just hit upon the key thing, which is start with “Why?” Why are we even thinking about this? And then you get into the what’s and where’s and who’s and alternatives that were considered, and all of that stuff, and that can all be fairly easily templated. Sometimes that consideration of the other things that we could have done is really important, because you’re at a fork in the road or potentially a complete branch structure. How do you pick one particular path?

And other times, it’s obvious that you think at that moment you should be heading in a particular direction, and it seems like the only path that makes sense, but that whole thing of explaining the context of the decision in order to then be able to go back and reevaluate. And this is also where it relates to the scar tissue. As you put things in place that deal with the scars, knowing whether the circumstances that led to the scar are going to still be true for future you, or future organization, down the line is really important.
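Pulling together what Thomas and Chris describe (start with why, record the decision, note the alternatives, and write down what new data would change your mind), a lightweight record might look like the sketch below. It is one possible shape under those assumptions, not the template used at Atsign or optimyze. The class name `DecisionRecord`, its field names, and the example alternative, revisit condition, and date are hypothetical; only the interview-process reasoning comes from the conversation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    """One decision, kept as free-form text under a few light headings."""
    title: str
    why: str        # motivation: why are we even thinking about this?
    decision: str   # what was chosen
    alternatives: list[str] = field(default_factory=list)  # other paths considered
    revisit_if: list[str] = field(default_factory=list)    # new data that would change our mind
    decided_on: date = field(default_factory=date.today)

    def to_text(self) -> str:
        """Render the record as plain text for a shared doc or a repo file."""
        lines = [
            f"Decision: {self.title}",
            f"Date: {self.decided_on.isoformat()}",
            "",
            "Why:",
            self.why,
            "",
            "What we decided:",
            self.decision,
            "",
            "Alternatives considered:",
        ]
        lines += [f"- {a}" for a in self.alternatives] or ["- (none recorded)"]
        lines += ["", "Revisit if:"]
        lines += [f"- {r}" for r in self.revisit_if] or ["- (not stated)"]
        return "\n".join(lines)


record = DecisionRecord(
    title="Interview process mirrors the actual work",
    why="The interview process should closely resemble the actual work.",
    decision="Structure interviews around tasks that look like the day-to-day job.",
    alternatives=["Generic whiteboard interviews"],                          # hypothetical
    revisit_if=["Interview results stop predicting on-the-job performance"],  # hypothetical
    decided_on=date(2025, 1, 15),                                            # hypothetical
)
print(record.to_text())
```

Every field is free text, in line with Thomas's point that any template will evolve over time; the headings only make sure the "why" and the "revisit if" are not forgotten.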

NS: I want to thank you both, Thomas and Chris. I’m Nick Selby. This has been Velocity’s Edge podcast.