Exploring recent Python repositories, part 2

MP 35: Looking at projects that haven't earned many stars yet.

In the last post we looked at some of the most popular Python projects on GitHub that were created in the last year. In this follow-up post we’ll look at some of the newest projects that have gotten just a tiny bit of attention.

It’s always interesting to see what people are doing with Python. You might want to reach out and collaborate on one of these projects, or you might want to try building a similar project on your own. If nothing else, surveying other people’s work like this can give you a better idea of the range of things people are doing with Python.

The main idea of this post is to search for projects that were created in the last few weeks and have gained a small number of stars. This filters out the huge number of projects that have no stars, which are too numerous to look through meaningfully. It highlights the kinds of projects that got some notice from other programmers soon after they were pushed.

Note: Most of the projects that meet the criteria considered in this post have not been vetted at all. Before downloading, cloning, or running any of the projects described below, make your own assessment of the project. Most are probably not malicious, but new projects can have unintended effects when they are run on different systems, or used in ways the creator did not anticipate. Also, look for a license file before deciding whether to borrow from or contribute to one of these projects.

Simplifying queries

Most of the code for this post is the same as last week’s; you can see the full notebook for this week’s post here. I added a few terms to the function that prunes AI-focused projects from the query results.
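As a rough idea of what that kind of pruning can look like, here is a minimal sketch. It is not the notebook's actual function: the name prune_ai_repos and the term list are assumptions for illustration, and it only checks each repository's name and description, which are standard fields in GitHub's search results.

```python
# Hypothetical term list; the real terms live in the author's notebook.
AI_TERMS = ("gpt", "llama", "chat", "llm", "diffusion")


def prune_ai_repos(repo_dicts, terms=AI_TERMS):
    """Return the repos whose name and description mention none of the terms."""
    kept = []
    for repo in repo_dicts:
        # GitHub returns None for a missing description, so fall back to "".
        name = repo.get("name") or ""
        description = repo.get("description") or ""
        text = f"{name} {description}".lower()
        if not any(term in text for term in terms):
            kept.append(repo)
    print(f"Keeping {len(kept)} of {len(repo_dicts)} repos.")
    return kept


# Made-up sample data, just to exercise the function.
sample_repos = [
    {"name": "py-factory", "description": "A Pygame game"},
    {"name": "my-llm-tool", "description": None},
    {"name": "helper", "description": "ChatGPT wrapper"},
]
kept_repos = prune_ai_repos(sample_repos)
```

This is plain substring matching, so a term like "chat" would also drop a project described as a "chatter" library. That's a reasonable tradeoff for a quick survey, but it's worth knowing about.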

To make it easier to run multiple queries with a variety of date and star ranges, I also wrapped the query-building code in a function:

def get_repos(start_date, end_date, star_range):
    """Get repos matching given conditions."""
    url = "https://api.github.com/search/repositories"
    url += f"?q=language:python+stars:{star_range}"
    url += "+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion"
    url += f"+created:{start_date}..{end_date}"
    url += "&sort=stars&order=desc"
    url += "&per_page=100&page=1"

    repo_dicts = run_query(url)
    pruned_repos = prune_repos(repo_dicts)
    summarize_repos(pruned_repos)

This function takes a start and end date of the form '2023-06-15', and a star range of the form '3..10'. It inserts these values into the query URL and runs the query. You could also add one more parameter to this function, to make it easy to request additional pages of results.
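Here's one way that extra parameter might look. This is only a sketch: build_query_url is a hypothetical helper that isn't in the notebook, and it just builds and returns the URL so you can see the effect of the page argument without making a network request.

```python
def build_query_url(start_date, end_date, star_range, page=1):
    """Build the search URL for one page of results (hypothetical helper)."""
    url = "https://api.github.com/search/repositories"
    url += f"?q=language:python+stars:{star_range}"
    url += "+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion"
    url += f"+created:{start_date}..{end_date}"
    url += "&sort=stars&order=desc"
    # The only change from get_repos(): the page number is no longer fixed at 1.
    url += f"&per_page=100&page={page}"
    return url


# Request the second page of 100 results for the same query.
second_page_url = build_query_url('2023-05-15', '2023-06-15', '3..25', page=2)
print(second_page_url)
```

Because page defaults to 1, existing calls that pass only the dates and star range would keep working as before.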

We can now run queries with a single line of code:

get_repos('2023-05-15', '2023-06-15', '3..25')

This will show a summary of all the projects created in a recent one-month period that have gained between 3 and 25 stars.

The volume of new projects pushed to GitHub

Before digging into the actual projects of interest, let’s run a query that doesn’t require any stars, just to see how many Python projects are being pushed to GitHub on an ongoing basis:

get_repos('2023-05-15', '2023-06-15', '0')

This query looks for all Python projects created in a one-month period, with 0 stars. There are a lot:

Total repositories: 296033

People push almost 300,000 Python repositories to GitHub every month that never get a single star![1] That should give you some sense of how hard it is to get noticed, and how difficult the problem of building a responsive Search API is at GitHub’s scale.

Projects with just a few stars

I’m sure there are many good projects that don’t have any stars, but there are so many it’s almost impossible to look through them meaningfully. Let’s look at the projects that were created in the last month, and have between 3 and 10 stars. These are probably projects that a few people noticed initially, but which haven’t been able to attract sustained interest:

get_repos('2023-05-15', '2023-06-15', '3..10')

There are over 2,200 projects that meet these criteria:

Query URL: https://api.github.com/...
Status code: 200
Total repositories: 2225
Complete results: True
Repositories returned: 100
Keeping 54 of 100 repos.
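Most of those summary lines correspond to fields in the JSON that GitHub's search endpoint returns: total_count, incomplete_results, and items. The sketch below shows that mapping. The function name summarize_response is an assumption and is not the notebook's code, and the sample dictionary is made up so the example runs without a network request.

```python
def summarize_response(response_dict):
    """Print summary lines for a parsed search response, and return its items."""
    repo_dicts = response_dict["items"]
    print(f"Total repositories: {response_dict['total_count']}")
    # GitHub reports incomplete_results, so "complete" is its negation.
    print(f"Complete results: {not response_dict['incomplete_results']}")
    print(f"Repositories returned: {len(repo_dicts)}")
    return repo_dicts


# Made-up stand-in for the JSON body of a real response.
sample_response = {
    "total_count": 2,
    "incomplete_results": False,
    "items": [{"name": "example-a"}, {"name": "example-b"}],
}
returned_repos = summarize_response(sample_response)
```

Note that total_count is the number of matching repositories overall, while items holds only the current page. That's why the output above can report thousands of repositories but return only 100.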

Of the 100 projects returned, 54 are not focused on AI. Here are some of the projects that stood out to me from this set:

AbletonAutoColor

I’ve started to learn piano recently, and I’ve become interested in synthesizers as well. That has me looking at DAWs, so this project caught my eye. It looks like it gives a different color to every track in an Ableton project, based on the track’s name.

bluesky-feed-generator

I haven’t joined Bluesky, but I’ve certainly been watching what happens as aging social media giants self-implode and drive their users elsewhere. If you’re a Bluesky user, you might be interested in how people are building on that platform. It looks like this project chooses which posts to put into a user’s feed.

CTKThemeMaker

I’ve used Tkinter occasionally, and I have certainly found the default styles functional but plain. This project appears to help people build custom styles for Tkinter, and the one screenshot in the README does look better than Tkinter’s default theme. Interestingly, I also came across ctk_theme_builder in an earlier query when drafting this post. Maybe there’s a push to update Tkinter themes lately?

A narrower timeframe

If you look at all the projects over a month-long period with a range of 3 to 10 stars, all the projects that are returned in the first page of results have 9 or 10 stars, because the results are sorted by stars in descending order. I was curious to see what kinds of projects would have fewer stars. One way to find these projects is to narrow the timeframe even further:

get_repos('2023-06-10', '2023-06-15', '3..10')

In this 5-day range, 301 projects matched the query, and 49 of the first 100 projects pass the AI pruning filter. The projects that make the cut have 5-10 stars. Five stars seems to be the lower bound for finding interesting projects this way, because the 5-star results include a bunch of repositories with no descriptions. Those projects might be interesting, but it’s hard to tell without visiting each project to look at its README.
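If you wanted to apply this narrowing across a whole month, you could split the month into short windows and run one query per window. The sketch below only generates the date pairs. date_windows is a hypothetical helper, and the non-overlapping windows assume that the created: range in the query includes both of its end dates.

```python
from datetime import date, timedelta


def date_windows(start, end, days=5):
    """Yield (start_date, end_date) ISO strings covering start..end inclusive."""
    one_day = timedelta(days=1)
    window_start = start
    while window_start <= end:
        # Each window spans `days` calendar days, clipped to the overall end.
        window_end = min(window_start + timedelta(days=days) - one_day, end)
        yield window_start.isoformat(), window_end.isoformat()
        window_start = window_end + one_day


windows = list(date_windows(date(2023, 5, 15), date(2023, 6, 15)))
for start_date, end_date in windows:
    # Each pair could be passed to get_repos(start_date, end_date, '3..10').
    print(start_date, end_date)
```

Each pair is already in the '2023-06-15' format that get_repos() expects. If you run many queries in a row like this, keep the Search API's rate limits in mind.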

py-factory

I found one interesting project in this query. py-factory is a Pygame project, inspired by the popular game Factorio. Pygame is a fantastic framework for learning about how games are developed. If you’re interested in Pygame or like Factorio, this might be an interesting project to take a look at. There’s also a short demo available on YouTube.

Screenshot of py-factory game running.
py-factory runs on my system. I don’t know how to play Factorio though, so I had no idea what I was doing.

Other interesting projects

In the course of writing this post, I played around with a number of queries that aren’t worth describing specifically, but led to a few more interesting projects.

pydoclint

I never used to like code linting and formatting tools, for a variety of reasons. But consistency across codebases and across projects is really beneficial, so I’ve become a convert. This linter focuses on verifying that what’s listed in a docstring matches what’s in the actual function signature.

snapsht

This project attempts to automate taking screenshots of browser windows when the page contents extend beyond the window borders. I’ve had trouble with this recently, and the next time I run into that situation I’d be curious to see whether snapsht can help.

python-image-tools

This is a simple wrapper for Pillow that focuses on downloading and modifying images from URLs. If that’s something you do often, this may simplify your workflow somewhat.

Conclusions

Exploring the kinds of projects that people are working on is a good thing to do from time to time. I hope you’ve found something interesting in the projects and approaches discussed in these last two posts.

If you’ve found an interesting project in your own explorations, or you have a project of your own to share, please feel free to mention it in the comments.

Resources

You can find the code files from this post in the mostly_python GitHub repository.


  1. Remember, this only includes the repositories with exactly 0 stars. If you expand that range, you’ll see well over 300k repositories.