When is it okay to use short variable names?

MP 74: Naming things is hard, but we can get better at it.

I’ve been doing a lot of refactoring work lately. I expect new people to start contributing to the project I’m working on as it approaches its first stable release; with that in mind, one focus for this refactoring work is making the code more readable to people who aren’t already familiar with the overall codebase.

Recently I was working on a section of code that uses several list comprehensions. I rarely use single-letter variable names, and was surprised to find a number of places where single-letter names made the code more readable.[1]

Allowed modifications

The code I’m working on modifies the user’s project so it’s ready for deployment to a remote host. Here’s the main form of the command that people run when using this project:

$ python manage.py simple_deploy --platform <platform-name>

This command makes configuration changes to the user’s project based on the platform name they specify. Users can then run their platform’s deploy or push command, and they should have a working deployment of their project.

Ideally, the user should have a clean git status before running this code; all the changes made to their project should be contained in a single commit. This makes it easy to see what configuration changes were required for the target platform. It also allows the user to easily roll their project back to a clean state if they decide they don’t like any of the changes that were made.

When users run the simple_deploy command, it runs git status in the background before making any changes. If the output indicates the presence of uncommitted changes, the command exits with a message asking the user to commit their existing changes and then run the command again.
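Here’s a rough sketch of what that kind of check could look like. This isn’t the project’s actual implementation: it assumes the output format of git status --porcelain, and the function names parse_porcelain() and get_modified_paths() are hypothetical names used only for illustration.

```python
import subprocess
from pathlib import Path


def parse_porcelain(output):
    """Return the paths listed in `git status --porcelain` (v1) output.

    Each line looks like "XY path", where XY is a two-character status code.
    Renames ("XY old -> new") and quoted paths are not handled in this sketch.
    """
    return [Path(line[3:]) for line in output.splitlines() if line.strip()]


def get_modified_paths(project_root="."):
    """Run git status in project_root and return the paths it reports."""
    cmd = ["git", "status", "--porcelain"]
    result = subprocess.run(
        cmd, cwd=project_root, capture_output=True, text=True, check=True
    )
    return parse_porcelain(result.stdout)
```

If the list that comes back is empty, the working tree is clean. Anything else needs a closer look before the project is modified.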

Terminal window showing a run of `$ python manage.py simple_deploy`
Running simple_deploy with uncommitted changes generates an error message. You can override this behavior using the --ignore-unclean-git flag.

However, there are some uncommitted changes that are acceptable, which shouldn’t block the code’s execution. For example, the user may have added django-simple-deploy to their project’s requirements. Or, they may have run the command once and fixed an issue that blocked configuration. Running the command creates a log directory, and that directory is added to .gitignore. We don’t want to block execution based on these kinds of changes.

Examining changed files

One of the simplest ways to check for uncommitted changes is to see which files have been changed. Here are two lines of code from a function that checks whether it’s okay to proceed with modifying the user’s project:

if any([path.name not in allowed_modifications for path in modified_paths]):
    return False

This code looks for any modified file that’s unrelated to a simple_deploy run. If any such files exist, the function returns False, indicating it’s not okay to proceed.
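To see those two lines in context, here’s a minimal sketch of what a surrounding function might look like. The function name check_status_okay(), the final return True, and the sample file names are all assumptions made for illustration; the real function likely does more than this.

```python
from pathlib import Path


def check_status_okay(modified_paths, allowed_modifications):
    """Return True if every modified file is an allowed modification."""
    if any([path.name not in allowed_modifications for path in modified_paths]):
        return False
    return True


# Sample values, chosen only for this example.
allowed_modifications = [".gitignore", "requirements.txt"]
modified_paths = [Path("requirements.txt"), Path("blog/settings.py")]

# settings.py is not an allowed modification, so this prints False.
print(check_status_okay(modified_paths, allowed_modifications))
```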

Significant and insignificant names

Let’s look at just the list comprehension in this code:

[path.name not in allowed_modifications for path in modified_paths]

We’re thinking about how to name things, so let’s write down all the names used here:

path
allowed_modifications
path
modified_paths

Two of these names are defined outside the comprehension: allowed_modifications and modified_paths. The other name, path, is only used inside the comprehension.[2]

Here’s a version of the comprehension that de-emphasizes the name path:

[p.name not in allowed_modifications for p in modified_paths]

We don’t often use single-letter variable names, because they lack context. But in a comprehension, all the context is contained in a single line. Using the name p here emphasizes a few things:

  • It’s the name attribute of the path that we’re focusing on;
  • We’re looking for names in allowed_modifications;
  • The paths we’re examining are coming from modified_paths.

These are exactly the things that I want to call the reader’s attention to, if they’re unfamiliar with this codebase.

This is especially noticeable if we make the opposite kind of change, to a more verbose set of names:

[modified_path.name not in allowed_modifications for modified_path in modified_paths]

This is a common way to name things if we’re accustomed to using plural names for lists, and then using the singular version of that name in the opening line of a for loop:

for modified_path in modified_paths:
    ...

In the context of a full loop, where the first line is less busy than a comprehension, this naming approach works. That’s especially true if the block that follows has any degree of complexity.
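For comparison, here’s a sketch of a similar check written as a full loop, where the longer name has room to breathe. The function name find_unexpected_changes() and the idea of collecting the offending paths are assumptions for illustration, not code from the project.

```python
from pathlib import Path


def find_unexpected_changes(modified_paths, allowed_modifications):
    """Return the modified paths that aren't allowed modifications."""
    unexpected_paths = []
    for modified_path in modified_paths:
        # The longer name stays readable across several lines of logic.
        if modified_path.name in allowed_modifications:
            continue
        unexpected_paths.append(modified_path)
    return unexpected_paths
```

Here the loop variable is used on three separate lines, so a descriptive name helps the reader keep track of what’s being examined.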

Coming back to any()

If it’s not clear what this code does, focus on the first expression in the comprehension, p.name not in allowed_modifications:

[p.name not in allowed_modifications for p in modified_paths]

That expression will always evaluate to True or False, once for each path in modified_paths. So we’ll end up with a list like this:

[False, False, True, False]

Wrapping any() around this list returns True if any of the values in the list are True, and False otherwise:

>>> any([False, False, True, False])
True

The original code is a little hard to reason about out of context. If any of the modified files are not in the list of allowed modifications, the function returns False, indicating it’s not okay to proceed in configuring the user’s project. If all of the modified files are in that list, it will return True and we can proceed.
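A small worked example may make this easier to follow. The file names below are sample values chosen for illustration. The all() version at the end is just another way to read the same logic, not something the project necessarily uses.

```python
from pathlib import Path

# Sample values, chosen only for this example.
allowed_modifications = [".gitignore", "requirements.txt"]
modified_paths = [
    Path(".gitignore"),
    Path("blog/settings.py"),
    Path("requirements.txt"),
]

# One True/False value per modified path.
flags = [p.name not in allowed_modifications for p in modified_paths]
print(flags)  # [False, True, False]

# True if at least one modified file is not an allowed modification.
has_unexpected_changes = any(flags)
print(has_unexpected_changes)  # True

# The same question asked the other way around:
# is every modified file an allowed modification?
all_changes_allowed = all(p.name in allowed_modifications for p in modified_paths)
print(all_changes_allowed)  # False
```

The two results are always opposites of each other, so which form reads better depends on whether the surrounding code is looking for a reason to stop or a reason to proceed.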

Conclusions

Naming things really is hard. I think when people say that, we often think about times when it was hard to come up with a descriptive name for an abstract concept we were working with. But many times there are smaller naming decisions that affect how readable our code is, especially to people who aren’t very familiar with the overall codebase.

When you’re writing a comprehension, consider using short names for the variables that only exist inside the comprehension. They should make sense to people reading your code, and draw attention to the more significant names that exist outside the comprehension itself. If you recognize other situations where a variable is only used in a single line, or in an otherwise isolated context, consider using short names there as well.
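As a sketch of what those other situations might look like, here are two cases where a name lives and dies on a single line: a lambda used as a sort key, and a generator expression passed straight to a function. The sample paths are assumptions for illustration.

```python
from pathlib import Path

# Sample values, chosen only for this example.
modified_paths = [
    Path("blog/settings.py"),
    Path(".gitignore"),
    Path("requirements.txt"),
]

# p exists only inside the lambda, so a short name keeps the focus
# on modified_paths and on the name attribute we're sorting by.
sorted_paths = sorted(modified_paths, key=lambda p: p.name)

# p exists only inside the generator expression.
longest_name_length = max(len(p.name) for p in modified_paths)
```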

Don’t go overboard. For example, single-letter variable names in a standard for loop will probably make your code less readable.


  1. The subtitle of this post is a play on an old programming joke that most readers have probably heard many times before. If you’re not familiar with this joke, here’s one variation:

    There are two hard things in programming: naming things, cache invalidation, and off-by-one errors.

    If you haven’t heard this before, I’m happy to be the first to share it with you. :)

  2. Note that name in path.name is also used in the comprehension, but it’s not a variable name that we have control over. There’s no meaningful way to change the name of name.