Shared constants across programming languages
Pondering boundaries and dependencies; bugs that exist and need not
A recurring bug
At work we recently had a little bug that caught my eye. It’s a bug that I’ve seen before, and I’ll see it again. It looked a little bit like this:1
bucket = get_experiment_assignment()
if bucket == "variant_1":
    return True
return False
Don’t spend much time trying to spot it. It slipped past this code’s writer and their reviewers. It survived some time before anyone traced back a problem to this part of the code.
The bug is that “variant_1” should be “variant”.
See, you couldn’t have known that. People here didn’t. The string isn’t even always “variant”. Sometimes it is indeed “variant_1”, or “variant1”, or “variant_one”.
Regardless, you might have picked up on the code smell of hard-coding a string deep in the code instead of using a named variable. Let’s update that:
variant_bucket = "variant_1"
# 300 lines of code in between
bucket = get_experiment_assignment()
if bucket == variant_bucket:
    return True
return False
Okay, that’s clearly not going to fix the problem.2 We need the variable to be defined elsewhere, somewhere that the other end of this system also sees it. That way we can’t get it wrong: if we refer to the wrong name then our code won’t even compile (if in a compiled language) or won’t run without triggering errors (in an interpreted language).3
import constants
bucket = get_experiment_assignment()
if bucket == constants.variant_bucket:
    return True
return False
At this point, we shouldn’t even send strings around instead of using a low-memory constant. The content of the string is irrelevant. constants.variant_bucket could be an enum in a language that supports them.
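As a minimal sketch of that idea in Python (the names Bucket and is_variant are illustrative, not from our actual code):

```python
from enum import Enum

class Bucket(Enum):
    CONTROL = "control"
    VARIANT = "variant_1"

def is_variant(bucket):
    # A misspelled member, like Bucket.VARINT, raises AttributeError the
    # moment the code runs; a misspelled raw string would just silently
    # compare unequal.
    return bucket is Bucket.VARIANT
```

The string payload lives in exactly one place, and every comparison goes through a name the runtime actually checks.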
This is starting to look better. But does it always solve the problem?
Constants across languages
What happens if the two sides of this system, which I’ll call server and client, are in two very different pieces of code?4 Different programming languages, even. How do you share constants across repositories and across programming languages?
I searched for how other people answered this problem. If you squint you can look at this as a configuration problem, for which some people use git submodules or build configuration services. More directly, there’s at least one person who put the effort into defining a cross-language constant library. Those options have their drawbacks, including new dependencies and complexity, and not all approaches provide the compile-time checks that can really minimize bugs.
Thankfully, most of us are already using something across our projects with the functionality for shared constants (and more): either Protobuf or Thrift. While Protobuf and Thrift have differences, within the scope of what we’re solving here they act similarly and you can treat them interchangeably, as the Google and Facebook versions of the same thing. Both are specification languages that come with tools to automatically generate code in any of a large number of programming languages. They are typically used for defining data structures and APIs. They efficiently serialize data for sharing across the network or through files, such that a reader in any language knows how to unpack the data regardless of the writer’s language. Vastly different pieces of code can communicate with each other and structure their data in compatible formats, across languages, using Protobuf or Thrift.
Since constants in Protobuf/Thrift generate native variables for each programming language, it is impossible to typo the variable name without fundamentally breaking the code and hitting compile-time or run-time errors.5 Defining constants may be such a simplified use case for Protobuf/Thrift that it isn’t an obvious one. But it’s a powerful way to prevent some of these bugs, and an easy one if both ends of the system already use the same Protobuf/Thrift definitions.
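For illustration, the shared definition might look something like this hypothetical experiment.proto (the file and enum names are mine, not from our codebase):

```protobuf
// experiment.proto — a hypothetical shared definition.
// protoc generates a native enum in each target language from this.
syntax = "proto3";

enum Bucket {
  CONTROL = 0;    // proto3 requires the first enum value to be zero
  VARIANT_1 = 1;
}
```

In the generated Python module this surfaces as experiment_pb2.VARIANT_1; misspelling it fails at import time in Python and at compile time in compiled languages, instead of silently comparing unequal.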
But this still isn’t stopping our bugs.
Barriers to using Protobuf or Thrift
We use one repository for Thrift definitions. To define shared constants in Thrift we need to review and merge a change to that repository, then we need another change to regenerate the language-specific Thrift wrapper in the server’s code. Then we’ll write a change to actually set the appropriate constant in the server. The client code then will similarly have two changes, one to regenerate the Thrift wrapper and one to use the constant. We’re up to five code reviews. Without using Thrift as the intermediary, we had only two code reviews: One for server code that sets strings to communicate with its clients, and one in client code to check a value against various strings and act accordingly.
Using Thrift involved more work, needing more reviews and deployments. Sometimes that overhead has prevented adoption.
In our case, we had yet another obstacle: the server-side definition of this constant isn’t in a language supported by Thrift or Protobuf. It’s a language that doesn’t even natively support shared variables. It’s yaml. Which we could use as an alternative to Protobuf/Thrift for constants, but without functionality for code generation and compile-time checks. We would need more wrapper code around using yaml configurations, and still we could fall prey to our typos.
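To make that concrete, the shared file might look something like this (hypothetical layout and key names):

```yaml
# shared_constants.yaml — read by both server and client wrappers
experiment:
  variant_bucket: "variant_1"
```

Nothing at build time checks that a consumer spells the key variant_bucket correctly; a typo in the lookup, or in the value it’s compared against, only surfaces when that code path actually runs.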
We need to design solutions that work with the easiest path, because the easiest path becomes the default. The riskiest moments are not the ones where programmers are rigorously following optional, harder paths. We need a solution that works when people are at their most error-prone.
Automated testing is only a partial solution
Note that unit testing won’t solve the problem, because our unit tests also won’t have a way to ensure consistency across languages and repositories. The client’s unit test will mock the server input, and if the client’s programmer writes the string wrong in their code then they’ll write it wrong in the test too. If we’re lucky, sometimes the act of writing a test will trigger them to check the correct value.
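Here is a sketch of that failure mode, with illustrative names: the test author repeats the typo in the mock, so the test passes while production misbehaves.

```python
from unittest import mock

def get_experiment_assignment():
    return "variant"  # what the real server actually sends

def is_variant_assigned():
    bucket = get_experiment_assignment()
    return bucket == "variant_1"  # the bug: wrong string

def test_is_variant_assigned():
    # The mock repeats the same wrong string as the code under test,
    # so this assertion passes and the suite stays green.
    with mock.patch(f"{__name__}.get_experiment_assignment",
                    return_value="variant_1"):
        assert is_variant_assigned()
```

The unit test is green, yet against the real server response is_variant_assigned() returns False.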
We would need good integration testing. The client needs to call the server and then act on its data. We need to be able to coordinate both client and server in automated integration testing, which is hard. The testing needs to be thorough: we need to test both paths of the if condition and verify that we have the right behavior in both. Measuring code coverage in integration tests can help. A downside of thoroughness is slowness, as each incremental integration test extends runtimes for testing and pre-deployment checks.
This entire bug class doesn’t have to exist
Before working at a typical microservices-oriented, multi-language, multi-repo startup, I worked at Google. Google famously has a monorepo, where all code lives in one giant repository.6 They aren’t the only large company with this approach, but they’re the one I’m deeply familiar with.
There’s a trope at Google that everything is a Protobuf, and it’s true. Nobody at Google would ever write the code snippets above. It would seem foreign to them. They would instinctively reach for a Protobuf. Google makes this easier for developers than the multi-repo approach I described earlier. At Google you would typically write this in three code changes: One to define the proto file, one to write server code using it, and one to write client code using it. The build tooling knows that if you’re referring to a proto definition then it needs to rebuild proto wrappers when the proto file changes. Googlers—that being the name for Google employees, of course—might even write this as one commit if the change is trivial enough, across Protobuf code and the two programming languages at the same time. Their IDEs can identify usage of protos easily, in ways that your IDE might not since your repositories aren’t connected together.
Why does Google use a monorepo? They wrote a paper on it, and the biggest reason is included right in the subtitle: “a common source of truth” [1].7 As they expand later, "if one team wants to depend on another team’s code, it can depend on it directly". Being able to directly refer to or use other code is incredibly powerful. Code at Google has positive scaling properties through easy reuse and references.
The monorepo versus multi-repo debate is fascinating. The magic of how Google can pull off a monorepo is in their build tooling. The code you change, and its affected code, will build and deploy. The rest won’t. You also never actually download the entire repository. Google engineered themselves out of every one of the monorepo problems described in this excellent short talk by Aimee Lucido at Uber. Google gets the best of both worlds, of easy uniformity and code reuse across disparate codebases, while having flexible integration and deployment. Their solution can be better on all dimensions not as a property of monorepos specifically, but because they have an incredible number of people building and maintaining their tooling.8 Multi-repo architectures could also add tighter coupling between various repos and act a bit like Google’s monorepo, and surely if Google had started with a multi-repo setup they would have evolved it to have many of the good properties of their monorepo.
Google can invest billions in their build tooling.9 The rest of us have to choose between harder trade-offs. Most of us have chosen code isolation and independence along repository boundaries, at the cost of harder coordination across those boundaries. In searching for how others have approached the problem, many accept that they’ll have duplicate constants in multiple parts of their code.
What I like about Google’s monorepo approach is that they execute it well enough that the inherent complexity is usually unobserved by the wider body of engineers, leaving them with a simple way to avoid this bug class by default. Google’s design scales well to a complex aggregate system of services and dependencies, and it also scales well to a complex aggregate system of tens of thousands of engineers. At Google’s size, it’s been a good investment.
Many of the rest of us exist in a middle space, where our companies are large enough to use a multitude of languages and repositories, with intricate dependencies, yet with tooling that is at least a step behind that growth.
That means I’m stuck with this bug class, where we use strings to define behavior and then inconsistently spell those strings. We get it right almost every time, but we do it often enough that a slow trickle of bugs gets through. Then I feel déjà vu all over again.
In my next post I’ll describe one attempt to limit the problem, and (spoiler alert) it’s not a migration to a monorepo.
[1] Potvin, R., & Levenberg, J. (2016). Why Google stores billions of lines of code in a single repository. Communications of the ACM, 59(7), 78–87.
1. I’m writing this in Python even though our relevant code is in different languages. Python may be a lingua franca for programming examples.
2. And it didn’t: this example is actually a little bit closer to our actual code, which did have many lines in between string definition and use. That’s not exactly better.
3. In an interpreted language we might not catch it before deployment, if we don’t have full coverage in testing. But even the most cursory of unit tests should protect us.
4. In my case, both “client” and “server” are typically services in a microservices architecture.
5. It is still possible to refer to a different variable, unfortunately. But at least it will have to match the type (in a typed and compiled language), and furthermore we won’t have errors from simple typing mistakes the way we can with raw strings.
6. With exceptions including Android and Chrome.
9. Literally, I imagine, or at least on that order of magnitude. Consider the size of their tooling teams, their salary and stock compensation, and the age of Google. For good measure, throw in some other costs such as compute resources.