Using `git clean`
MP 131: It seems like Git is always more capable than I realized.
I've been using Git for almost 20 years now, and I still discover new ways to use it more effectively. I really appreciate a tool that has more to offer every time I use a new workflow.
In my current work, I found myself running this set of commands frequently:
    $ git reset --hard <commit-hash>
    $ git status
    <multiple unstaged files and dirs>
    $ rm -rf new_dir
    $ rm new_file_1 new_file_2
    $ git status
    nothing to commit, working tree clean
Running `git reset` brings the project back to its original state, but doesn't get rid of newly-created files and directories. I've been using `git status` to see these files and directories, and then removing them manually. After doing this enough times, it occurred to me there's probably a command that does this automatically. It turns out there is: `git clean`, which removes untracked files from your project.
Developing a project that acts on another project
I'm currently focused on making the 1.0 release of django-simple-deploy, and I use Git differently in this project than in most other projects I've worked on. In most projects, I only use Git to manage the codebase of the actual project. But django-simple-deploy is a project that acts on another user's project. That means during development work I'm managing two codebases: the core project itself, and the sample project that django-simple-deploy is acting on.
Here's a typical workflow:
- Open all the necessary files and resources for django-simple-deploy.
- Open a sample project that django-simple-deploy can act on.
- Run the `deploy` command in the sample project.
- Make changes to django-simple-deploy.
- Re-run the `deploy` command in the sample project.
That last bullet point is key; every time I run `deploy`, it modifies the sample project. It modifies existing files, and creates new files and directories. Before running `deploy` a second time, I need to reset the sample project to its original state.
A real-world work session
Here's what it looks like in an actual terminal session. First, we check the status:
    test-project$ git status
    nothing to commit, working tree clean
Okay, there's a clean status. We can run the `deploy` command against a clean version of the project:
    test-project$ python manage.py deploy --unit-testing
    Configuring project for deployment...
    ...
    Your project is now configured for deployment on Fly.io.
Great! The `deploy` command ran without any obvious errors. Let's see how it modified the project:
    test-project$ git status
    Changes not staged for commit:
        modified:   .gitignore
        modified:   blog/settings.py
        modified:   requirements.txt
    Untracked files:
        .dockerignore
        Dockerfile
        fly.toml
    no changes added to commit
This is exactly what I hope to see after running `deploy` in the default configuration-only mode. It's modified some existing files, and created a few new platform-specific files as well.
When a user runs `deploy` they follow this up with their platform's deployment command, such as `fly deploy`. But I'm doing development work, so I want to reset this project to its original state in order to run the `deploy` command again.
Here's what I've always done to reset the project:
    test-project$ git reset --hard HEAD
    HEAD is now at d474e3b Added simple_deploy to INSTALLED_APPS.
    test-project$ git status
    On branch main
    Untracked files:
        .dockerignore
        Dockerfile
        fly.toml
        simple_deploy_logs/
    nothing added to commit but untracked files present
    test-project$ rm .dockerignore Dockerfile fly.toml
    test-project$ rm -rf simple_deploy_logs
    test-project$ git status
    nothing to commit, working tree clean
I run `reset` to undo changes made to existing files. But this still leaves the untracked files and directories. One of the modified files was .gitignore, which was updated to ignore a new log directory. Resetting reverts that change too, so the log directory is no longer ignored and now shows up in the list of untracked files as well.
To fully reset the project back to its original state, I run `rm` to remove each of the new files, and then `rm -rf` to get rid of the new log directory. More than just a bunch of typing, this is time-consuming because I'm reading the status in order to know what to remove. This list is different for each platform, and it can be different depending on what kinds of bugs are showing up in the deploy script.
`git clean` to the rescue
That last step is much simpler using `git clean`:
    test-project$ git reset --hard HEAD
    HEAD is now at d474e3b Added simple_deploy to INSTALLED_APPS.
    test-project$ git clean -fd
    Removing .dockerignore
    Removing Dockerfile
    Removing fly.toml
    Removing simple_deploy_logs/
    test-project$ git status
    nothing to commit, working tree clean
When `git clean` runs, it identifies all the untracked files and directories, and removes them. This is a much simpler workflow. It's not just less typing, it's less thinking. It works the same for every run of the `deploy` command. It doesn't matter which deployment platform is being targeted, and it doesn't matter if there was a bug in the deploy script that made a different set of files and changes.
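If you run this pair of commands often, they can be wrapped in a small shell function. Here's a minimal sketch; the name `reset_project` is hypothetical, not something Git or django-simple-deploy provides, and it assumes you're inside the repository you want to reset.

```shell
# Hypothetical helper: restore the current Git working tree to the state of HEAD.
# Destructive: discards uncommitted changes and deletes untracked files.
reset_project() {
    # Refuse to run when the current directory is not part of a Git repository.
    git rev-parse --is-inside-work-tree >/dev/null 2>&1 || {
        echo "reset_project: not inside a Git repository" >&2
        return 1
    }

    # Undo changes to tracked files, then remove untracked
    # files (-f) and untracked directories (-d).
    git reset --hard HEAD && git clean -fd
}
```

Defining the function in a shell startup file, or sourcing it into the current session, makes `reset_project` available as a single command. Files that are still ignored after the reset are left alone, because `git clean -fd` only removes untracked files that aren't ignored.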
This is especially helpful for integration testing. The integration tests for django-simple-deploy make a temporary copy of the sample project, and reset it after each test module runs. The resetting process was especially verbose, with a growing list of files to check for and delete if necessary. All that code was replaced with a single call to `clean` in the test script.
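As a rough illustration of that idea, here's a sketch of a reset step for a test harness that shells out to Git. This is not the code from django-simple-deploy; the function name `reset_test_project` and the choice to verify the result with `git status --porcelain` are assumptions made for the example.

```shell
# Hypothetical harness helper: reset the project at the given path to HEAD.
# Returns non-zero if a Git command fails or the working tree is still not clean.
reset_test_project() {
    project_dir="$1"

    # -C runs each Git command as if it had been started in project_dir.
    git -C "$project_dir" reset --hard --quiet HEAD || return 1
    git -C "$project_dir" clean -fd --quiet || return 1

    # Empty porcelain output means nothing is modified and nothing is untracked.
    [ -z "$(git -C "$project_dir" status --porcelain)" ]
}
```

A harness could call `reset_test_project "$tmp_project"` after each test module and treat a non-zero return as a failed reset, with no list of expected files to maintain.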

`git clean` made part of the integration testing flow much simpler as well. Commits that remove more code than they add are so nice to see!
A closer look at `git clean`
Here's the short description of `git clean` from the documentation:
    git-clean - Remove untracked files from the working tree
It can be helpful to see the output without any flags, in the context we've been using:
    $ git clean
    fatal: clean.requireForce defaults to true and neither -i, -n, nor -f given; refusing to clean
This is great! `clean` is a destructive command, and it refuses to act without more specificity. We have to choose a mode: `-i` for interactive, `-n` for a dry run, or `-f` to force removal of untracked files.
Let's make a dry run:
    $ git clean -n
    Would remove .dockerignore
    Would remove Dockerfile
    Would remove fly.toml
The phrase dry run is common in CLI work; it shows what changes would happen if you ran the actual command. Here we can see exactly which files would be removed if we run `git clean -f`.
You might notice that the log directory is not shown here. The `-d` flag tells `clean` to remove untracked directories as well. Let's do a dry run, including this flag:
    $ git clean -d -n
    Would remove .dockerignore
    Would remove Dockerfile
    Would remove fly.toml
    Would remove simple_deploy_logs/
Including the `-d` flag adds the log directory to the list of resources to be removed.
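If you'd like to try these flags without risking a real project, the comparison can be reproduced in a throwaway repository. This is a minimal sketch: the file and directory names (`README.md`, `new_file.txt`, `logs/`) and the commit identity are made up for the example, and it assumes `mktemp` and `git` are available.

```shell
# Build a throwaway repository so nothing real is at risk.
scratch_dir=$(mktemp -d)
cd "$scratch_dir" || exit 1
git init -q .

# One tracked file, committed with a throwaway identity.
echo "hello" > README.md
git add README.md
git -c user.name="Example" -c user.email="example@example.com" \
    -c commit.gpgsign=false commit -q -m "Initial commit"

# One untracked file and one untracked directory.
echo "scratch" > new_file.txt
mkdir logs
echo "log entry" > logs/app.log

# Dry run without -d: untracked directories are skipped.
files_only=$(git clean -n)
# Dry run with -d: untracked directories are included as well.
with_dirs=$(git clean -d -n)

printf 'git clean -n:\n%s\n\n' "$files_only"
printf 'git clean -d -n:\n%s\n' "$with_dirs"
```

Both commands are dry runs, so nothing is deleted; the scratch directory can be removed by hand when you're done.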
Conclusions
Some people, through a variety of experiences, have learned most or all of what Git can do. But I think many of us have just learned the pieces that are most relevant to our own work. If you find yourself using multiple commands and a tedious workflow to manage the files and directories in your project, make some time to find out if Git can be used in a way that makes your workflow more efficient and less error-prone. There's a fair chance it can.