OOP in Python, part 19: When do you need OOP?

MP 66: And when do you not need to use OOP?

Note: This post is part of a series about OOP in Python. The previous post showed how to use composition and inheritance together in one project. The next post reflects on this series as a whole, pointing out the highlights and most significant takeaways.


Once people start to understand how to use OOP concepts, two important questions come up:

Now that I know how to use classes, how do I know when to use them?

When is it okay to write code that doesn’t involve OOP?

These are great questions. Sometimes people learn how to write classes, and then think everything should be written as a class. A lot of this comes down to experience, but there are some general principles that can help you know when to structure something as a class, and when to use simpler structures such as a collection of functions.

In this post I’ll share an example from my own work where I chose specifically not to use OOP, and one where I decided to reimplement non-OOP code using a class-based structure.

Simple procedural scripts don’t need OOP

A procedural script is one that proceeds from one instruction to the next, with little or no redirection to other parts of the script. Many small scripts that we write are procedural in nature.

Substack hosts weekly Office Hours for writers, and I like to participate sometimes. It often takes place in a single comment thread, and the comments stack up quickly. It’s one of those online events where showing up as soon as it starts is really helpful. The problem though, is that it doesn’t start at a consistent time.

To address this, I wrote a script that polls the Office Hours page once a minute for up to 90 minutes, and opens the page when a new session is posted. Here’s the entire script:

import os, sys, time
import requests

url = "https://on.substack.com/s/office-hours"

for attempt_num in range(90):
    r = requests.get(url)
    print(f"Attempt #{attempt_num}. Status: {r.status_code}")

    if " mins ago<" in r.text:
        cmd = f'open -a "Google Chrome" {url}'
        os.system(cmd)
        print("  Found a new post!")
        sys.exit()
    else:
        print("  No new post found. Waiting one minute...")
        time.sleep(60)

This script defines a URL and then starts a for loop, which runs 90 times. It fetches the page, and shows a message that an attempt to retrieve the page was made. Each Office Hours page has a timestamp, which reads something like “Oct 26” for an older post, or “23 mins ago” for a fresh post. When the phrase “ mins ago” appears on the requested page, this script opens that page in a new Chrome window. (A style rule formats the timestamp using uppercase, but the source text uses lowercase.)

html card titled "Writer Office Hours", with a description and a timestamp. The timestamp reads "23 MINS AGO".
One of the elements on the Substack Office Hours page. New sessions are identifiable by the phrase “_ MINS AGO” in the timestamp.

I usually start this script while I’m doing other work, and a new browser window appears whenever the Office Hours thread starts. If I’m not working at my computer, I just put my laptop somewhere I can see it and then start the script. When I see a browser window open, I check if that day’s conversation is relevant to my work.

No need for OOP

There’s really no need to do anything more with this script. I’m the only one who uses it, and it does exactly what I need it to. There’s a certain kind of Office Hours that the script doesn’t pick up, but that’s a themed event that I’m not interested in. If I become interested in that later, I’ll build this out a little further to handle that case as well. But that would most likely be an elif block, not any OOP-related code.

How would you use OOP here?

You could convert this short script to use a class-based approach. Here’s what that might look like:

class OOMonitor:

    def __init__(self, url, target_string, num_attempts=90):
        self.url = url
        self.target_string = target_string
        self.num_attempts = num_attempts

    def start_monitoring(self):
        for attempt_num in range(self.num_attempts):
            r = requests.get(self.url)
            ...

if __name__ == '__main__':
    url = "https://on.substack.com/s/office-hours"
    target_string = " min ago<"

    monitor = OOMonitor(url, target_string)
    monitor.start_monitoring()

The __init__() method accepts a URL and a string to look for on the page. You can optionally set the number of times to load the page. The start_monitoring() method runs the loop that loads the page and looks for the string.

Making this change would allow you to use the tool to watch any page, not just the Office Hours page. It also creates a nice interface for the monitoring tool, and it creates a better structure for extending the project. For example you could add a method to end the monitoring, you could set flexible delays between reloading the page, and you could define any actions you want to take place when the string is found.

Practicing OOP

Even if you’re not looking to extend a small script like this, you can convert smaller scripts to a class-based structure for the sake of practicing OOP. This is really good practice, because it gets you out of doing exercises that other people have designed, and pushes you to think about different ways of structuring your own code.

If you have a small procedural or function-based script that already works, try reimplementing it using a class-based approach. You’ll know your conversion was successful if the OOP version has the same functionality as the original script.1

Too many functions sharing data are a problem

There’s nothing wrong with avoiding OOP entirely and structuring your code as a set of functions that you call as needed. In many cases, this kind of approach is easier to reason about than a class-based approach. A function-based approach can break down, however, when your functions start to share too much data.

This issue often shows up as a growing list of arguments in your function definitions and calls. When multiple functions take a long list of arguments, it quickly becomes harder to reason about the program’s behavior.

A function-based game

When I first learned to use Pygame, all the resources I came across used a function-based approach to building games. So when I wrote up the game project in the first edition of Python Crash Course, I used that same approach. Alien Invasion is a simple 2D space-shooter game with a ship at the bottom of the screen, and a fleet of aliens filling most of the screen.

ship at bottom of screen firing at a screen full of alien ships, with a scoreboard at the top
The finished version of Alien Invasion, a classic 2D space shooter game.

This approach is perfectly reasonable in the early stages of a game. Leaving out some code, here’s the main game loop at a point where the ship can be moved back and forth across the screen, and it can fire bullets that move up the screen:

import game_functions as gf

def run_game():
    ...
    # Start the main game loop.
    while True:
        gf.check_events(ai_settings, screen, ship, bullets)
        ship.update()
        gf.update_bullets(bullets)
        gf.update_screen(ai_settings, screen, ship, bullets)

run_game()

Most of the game’s logic is implemented in a set of functions, stored in the module game_fuctions.py. There’s one function in the main file, called run_game(). To start a game, you call this function.

The run_game() function starts the main game loop. This loop checks for game events such as button presses and mouse movement. It updates game elements such as the ship and any bullets that have been fired. Finally, it updates the screen on every pass through the loop. This is a pretty typical game loop, and it’s reasonably straightforward to understand the implementation by examining each function that’s called from the main loop.

There are some signs here already that the functions are doing too much work. We haven’t even dealt with aliens yet, and two functions, check_events() and update_screen(), already need four arguments to do their work. Also, these two functions each use the same set of arguments. If they were part of the same class, they would have access to most (or all) of those elements through the self object.

A growing problem

By the time the game was complete, the list of arguments in the game loop had grown significantly:

def run_game():
    ...
    while True:
        gf.check_events(ai_settings, screen, stats, sb,
            play_button, ship, aliens, bullets)
        
        if stats.game_active:
            ship.update()
            gf.update_bullets(ai_settings, screen, stats,
                sb, ship, aliens, bullets)
            gf.update_aliens(ai_settings, screen, stats,
                sb, ship, aliens, bullets)
        
        gf.update_screen(ai_settings, screen, stats, sb,
            ship, aliens, bullets, play_button)

Most of the main game functions, such as check_events() and update_screen(), needed access to all of the significant resources in the game: the overall game settings, the main screen object, the game’s statistics, the scoreboard, the ship, the group of aliens, the group of bullets, and the main Play button.

Embarrassing to look back on, but it worked

I’m a little embarrassed to look back on this code. I kept it like this because I was running up against a publishing deadline, and I knew the game worked as it was written. I didn't have time to rewrite it using a different approach, and I was much more comfortable releasing a somewhat clunky implementation that I was confident in, than a more elegant implementation that I had rushed out the door.

Programming is often about finding a balance between the best implementation you can come up with, and putting your work in front of other people.

Looking back, that was the right decision at the time. I got some questions over the years when this version of the project was in circulation, but it worked for most people. Programming, and writing, is often about finding a balance between the best implementation you can come up with, and putting your work in front of other people.

Where it broke down

The questions I got from readers who worked through this version of the project clarify exactly how a function-based approach breaks down, and becomes less readable. By far, the most common question related to error messages like this:

$ python alien_invasion.py
Traceback (most recent call last):
  ...
  File "/.../game_functions.py", line 46, in check_play_button
    if button_clicked and not stats.game_active:
                              ^^^^^^^^^^^^^^^^^
AttributeError: 'Scoreboard' object has no attribute 'game_active'

This is a really hard issue for someone new to programming to sort out. There’s a problem with the Scoreboard object, and something about game_active. Python’s improved error messages don’t help much here either; this is pointing at a stats.game_active reference that would be fairly hard for many people to troubleshoot.

The root problem is this function call:

    while True:
        gf.check_events(ai_settings, screen, sb, stats,
            play_button, ship, aliens, bullets)

The two arguments sb and stats are swapped. The stats object is being sent to the parameter that expects a Scoreboard object, and the sb object is being matched with the parameter that expects a GameStats object.

This is especially problematic because it’s an easy mistake to make. In a long list of arguments, it’s easy to make an ordering mistake, or simply lose track of what each argument is used for.2

Classes to the rescue

When it was time to write the second edition of the book, I was quite happy to have the chance to revisit this project. The main focus of that revision was to refactor the project using a class-based approach. My goal was to eliminate the long list of arguments in most function calls.

Here’s the basic structure of the revised project, again at the point where you can move the ship from side to side and fire bullets:

class AlienInvasion:

    def __init__(self):
        pygame.init()
        self.settings = Settings()
        self.screen = pygame.display.set_mode(...)
        self.ship = Ship(self)

    def run_game(self):
        while True:
            self._check_events()
            self._update_screen()

    def _check_events(self):
        ....

    def _update_screen(self):
        ...

if __name__ == '__main__':
    ai = AlienInvasion()
    ai.run_game()

In this implementation, the overall game is represented as a class called AlienInvasion. There are other classes involved, such as Settings and Ship. But instances of those classes are attributes of the overall game, such as self.settings and self.ship. Those objects are available to every method in the class, so they no longer need to be passed around as arguments between functions.

I could tell the refactoring work was likely to be a significant improvement, because even in this early stage of the game’s development the only argument in any function call is self.

The final version is still clean

I was really happy to see the final version of the game loop, at the end of the project:

class AlienInvasion:
    ...

    def run_game(self):
        """Start the main loop for the game."""
        while True:
            self._check_events()

            if self.stats.game_active:
                self.ship.update()
                self._update_bullets()
                self._update_aliens()

            self._update_screen()

Even at its most complex point in development, the methods _check_events() and _update_screen() don’t need any arguments.

Fewer questions, but different questions

It’s interesting to look back at code from different versions of the book. Python Crash Course has been read enough times that if something in the book doesn’t work, I definitely hear about it. People have asked far fewer questions about the class-based version of the game than they asked about the function-based approach.

I do get some questions, however. The most common question now relates to the initial state of the project. The end state of the project is much simpler in the class-based version. But one of the tradeoffs is an increase in abstraction and complexity in the first listing that draws a stationary ship to the screen.

Here’s the structure of that listing:

class AlienInvasion:

    def __init__(self):
        ...
        self.ship = Ship(self)

There are three references to self here. The first is received by __init__(). This is the usage of self that most people see when they first learn about OOP in Python.3 The second self is used to create an attribute, self.ship. But the third self is being passed to Ship. This last self is an instance of AlienInvasion; we’re passing a reference to the overall game object when we make a new Ship instance.

Here’s the first part of Ship:

class Ship:
 
    def __init__(self, ai_game):
        self.screen = ai_game.screen
        ...

For people new to OOP, it gets even more confusing here because we’ve got another self in Ship’s __init__() method. This self is a reference to Ship. The second argument in Ship’s __init__() method is called ai_game, and it receives the reference to the overall game object that was passed from the __init__() method in AlienInvasion.

This is really confusing if you’re new to OOP, and you want to fully understand what’s happening. I ended up writing a more detailed explanation than what’s offered in the book for anyone who did want a deeper understanding of this structure.

The “Software Architect” role

Restructuring this project gave me a little insight into what it’s like to be in the Software Architect role. My understanding of that job is that your main role is to help shape the overall implementation of a project, so that people can work efficiently on different parts of the project. A good architecture allows all the separate parts of a project to be developed fairly independently, while still working together in a coherent and efficient way.

Developing the class-based approach to Alien Invasion required a fairly good understanding of a number of things:

  • The root issues of an overly complex function-based approach;

  • The benefits that OOP implementation offers, specifically as it relates to how information is shared between parts of a class;

  • The role of composition in OOP;4

  • The different ways that self can be used in a class.

The structure that emerged from thinking about issues with the function-based approach was really interesting. It was more complex than most people new to OOP could come up with, but once that structure was in place the overall game became much easier to work with. If you accept the initial complexity, it’s fairly straightforward to see how all the parts such as_ check_events(), _update_screen() work. This extends to later methods such as _check_bullet_alien_collisions().

The balance between simplicity and complexity in this example is part of why people have enjoyed Python Crash Course over the years. It gives them an efficient introduction to the basics of Python, and then shows them a variety of ways those basic concepts can be used to build out more complex projects. This is also why the code in the book isn’t boring to me, even after all these years. Even though the code in the book is at the beginner to early intermediate level, the ideas that come up in developing and maintaining good examples are quite interesting.

Unexpected benefits

There were a number of unexpected benefits beyond eliminating the need to pass so many arguments around. It’s really interesting to have the overall game represented as a single object. I haven’t done this, but you could create an “arcade” that presents a screen where people can choose from a number of different games. When they click one, you create an instance of that game and call that game’s run_game() method.

You can also write “AI” players for any game that has an overall class structure. I wrote up a relatively short tutorial about how to create automated players for the Alien Invasion game. This is an interesting extension of the project, because it really highlights some of the benefits of OOP, and the ways it can be used.

I recently discovered one more benefit of using a class-based approach in this particular project. I wanted to update the test suite that makes sure all the code in the book works with each new release of Python, and all the libraries used in the book. I was able to develop a test whose structure parallels the structure used when automating gameplay. Basically, the test suite creates an automated player that clears one screen, and then makes some assertions about the state of the game after finishing one level. It’s a pretty cool to test to run, because instead of just printing out a pass or fail, a game window appears and you watch the test suite play through an accelerated version of the first level of the game.

Conclusions

Object-oriented programming has been around for decades, and people are still finding strengths and weaknesses in the different ways we use it. It takes time and experience to know when to use it, and how to use it to best serve your needs.

If you recognize the need for a class-based approach when you start a new project, go ahead and start with that kind of implementation. But if it’s not clear whether you’ll need to use OOP, or you’re unclear about how to structure your project as a class, start with a set of functions that carry out the tasks you know you need to implement.

Once you’ve done so, arguments that are common to several functions are indicative of the attributes you’ll probably need. When you re-implement your project as a class, most of the functions you’ve written will become methods in your class. The refactoring needed for that conversion usually involves replacing arguments in the function with attributes from the class.

Don’t be shy at all about copying structures you’ve seen from similar projects. It often takes an experienced developer a nontrivial amount of work and time to develop good structures; you should borrow from what they’ve come up with. Also, these days I wouldn’t hesitate to ask an AI assistant for some ideas about how to structure a project. I wouldn’t expect it to provide a perfect implementation, but the tools are already good enough to help steer us toward working implementations that we can then refine.

Resources

You can find the code files from this post in the mostly_python GitHub repository.


  1. For even better practice, write a small test suite for your original approach. Then write your class-based version, and make sure it passes all the tests. This is a great way to emphasize the importance of starting with tests that verify your project’s functionality, rather than its implementation.

    For more on this topic, see MP #45, Don’t start with unit tests.

  2. This can be improved slightly by using keyword arguments:

    def run_game():
        ...
        while True:
            gf.check_events(
                ai_settings=ai_settings,
                screen=screen,
                stats=stats,
                sb=sb,
                play_button=play_button,
                ship=ship,
                aliens=aliens,
                bullets=bullets,
            )

    This addresses the ordering issue, because order doesn’t matter when using keyword arguments. But it makes for even longer function calls, and it doesn’t address the fact that the same arguments are being passed back and forth between a number of related functions.

  3. For more about self, see MP #37.

  4. If it’s not clear how composition is used in the listings shown here, consider this excerpt:

    class AlienInvasion:
    
        def __init__(self):
            ...
            self.settings = Settings()
            ...
            self.ship = Ship(self)
            ...

    There are three classes here. The main class is AlienInvasion. But AlienInvasion is composed of a number of different things, including instances of at least two other classes: Settings and Ship.

    When a class has attributes that are instances of another class, we call that composition. For more about composition, see MP #63.