Julian's Blog - Type hints and Modern Python

JulianHysi 20 August, 2023


What I had in mind when I started writing this article was something a bit different. As typing is one of my most used Python modules now, I thought of compiling an overview of its most useful features and tricks. While that's still something I intend to do, and it will probably be part 2 of the current article, something struck me while writing the initial draft. Typing in Python has come a long way, and has inadvertently become an integral part of the daily workflow for many of us. It has drastically changed the development experience, whether in writing, reading or analyzing code. And that's not without controversy, which is remarkable given that the Python community isn't particularly known for drama and strongly-held opinions. Before I explain how we got here, and where I stand on this, let's take a trip down memory lane based on my personal history of writing Python code.

My history with typing

Type annotations, in the sense of a standard way to describe types (PEP 484 and the typing module), were introduced to Python in the 3.5 release in September 2015; the bare syntax for annotating function parameters had existed since Python 3.0, but without an agreed meaning. They're also often referred to as "type hints", and even though the two concepts are not identical (you can practice hinting even by placing comments or naming your object my_list), we'll use the two terms interchangeably for the rest of the article. Long story short, they are just hints you can use to indicate what a certain object is intended to be, without affecting code execution.
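To make "without affecting code execution" concrete, here is a minimal sketch. The function name `greet` is made up for illustration; the point is only that the interpreter stores the hints as metadata and never checks them.

```python
def greet(name: str) -> str:
    # The hints describe intent; the interpreter does not enforce them.
    return f"Hello, {name}!"


# The annotations are stored as plain metadata on the function object.
stored_hints = greet.__annotations__

# Passing an int where a str is hinted still runs without any error.
greeting = greet(42)

print(stored_hints)
print(greeting)
```

Nothing complains about the `42` at runtime; catching that mismatch is left to external tools that read the annotations.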

A few months later, in 2016, I got my first hands-on experience with Python. I was at university at the time, and Python was merely a fun new hobby to me. I know what you're thinking; everything is a hobby at that age! But no, really, all our "academic coding" was done in Java (no regrets on that). I would only use Python to solve small coding problems and build simple CLI games. It goes without saying that I had no interest in type annotations back then; as a matter of fact, I didn't even know Python supported them. On the contrary, the freedom that Python provided (as opposed to the verbose and strict Java code I was used to) was probably the main attraction of the language for me at the time.

Fast forward to 2018. I'm using Python much more now, and it's probably my main back-end language at this point. I'm building bigger projects, and even using it for my diploma thesis. I know what type annotations are, but I'm still not using them much. In my defense, almost nobody around me was either. When I'd very rarely come across a Python snippet with annotations in it, my eyes would simply skip over them (and perhaps I'd even get mildly annoyed by the noise or "impurity" of the code at hand). Truth be told, this new feature of the language hadn't yet gained traction, perhaps also because it was still limited in terms of capabilities, and not yet enhanced to the degree it is today.

From mid-2018 onwards, I was coding "professionally". This doesn't mean the quality of my code suddenly went up (spoiler alert: there's no shortcut to experience), but I did feel responsible for writing more robust code now that I was getting paid to do it. In my first 2 years, Python was not my main back-end language (unfortunately so, I must say), but when I did get to write Python code, I started adding annotations. I was still an "occasional annotator" in that I would only hint the most important function definitions or objects, but I was starting to see the value in it. I was...starting to believe. As the years went by, my adoption rate increased, but up until last year, I would still not annotate the majority of the code that I wrote.

Return to the present moment. Annotations are quite widespread. And if you're using the more "modern flavour" of Python (like I currently am), that is a stack of FastAPI, pydantic, mypy and so on, type hints are everywhere. Your Python codebase is covered in them to the point it feels like it's written in a completely different language (to the dislike of many). To hint or not to hint...I'm part of this world now, so I've had the chance to sleep on that dilemma for a while. I'll share my two cents later on throughout the article.

Why share this backstory, you might wonder? Other than to practice my storytelling skills of course, it's because I think many Python developers went through more or less the same stages, unless they were already veterans when type annotations came out, and were equipped with a different kind of technical vision. And most importantly, the history paints a clear trend for the future of Python. Whether you like it or not, type hints will be ubiquitous, if they aren't already. Even if you decide to use them moderately in your own code, you will need to deal with heavily-typed code in other projects written by other people. Part 2 of this article will try to cover what you need to know to be set in that regard. The rest of this article will focus on why this is a polarizing topic at the moment, and how I see things.

Cons of typing

Let's start with what to me are the disadvantages of adding type annotations to your code.

First of all, one of the purposes of type annotations, and a major one I believe, is to serve as a reading aid. The developer who writes the code can leverage annotations to express intent, and another developer who reads the code can better understand that a certain parameter of a function is supposed to be, let's say, a list. But sometimes annotations, and especially an overuse of them, lead to the opposite: the code is harder to read! When a function has many parameters, and they're all complex or composed data structures as opposed to simple data types like str or int, that makes for a lot of visual noise. Sure, this can be mitigated by creating custom types and/or pydantic models, so that list[dict[str, int]] is replaced by a meaningful name. But not everyone is doing that, and while you can argue that's not a shortcoming of type annotations per se, the reality is that it's making some Python code out there harder to read. And I think that for simple enough functions (which is something all functions should strive towards), there's often less cognitive load if I just mentally infer what the types are by looking at the function definition rather than reading a typed signature. Sometimes things are that obvious already, and annotations only add clutter.

Consider the following code:
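A minimal sketch of such a function, without any annotations. The name `notify_users`, the dictionary keys and the return value are illustrative assumptions, not taken from a real codebase.

```python
def notify_users(users, notification):
    # Hypothetical helper: pair each active user's email with the title
    # of the notification that should be delivered to them.
    delivered = []
    for user in users:
        if user["active"]:
            delivered.append((user["email"], notification["title"]))
    return delivered


users = [
    {"email": "ada@example.com", "active": True},
    {"email": "bob@example.com", "active": False},
]
print(notify_users(users, {"title": "Welcome"}))
```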

In this example, even without type hints, most developers can infer that users is likely some kind of iterable (probably a list) of user objects and notification is an object or a string. Now let's look at the annotated version of it:
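A sketch of what a fully annotated signature for such a function could look like when no classes or custom types exist for the data. The name `notify_users` and the exact nested types are assumptions chosen to illustrate the noise, and the built-in generics used here require Python 3.9 or later.

```python
from typing import Optional, Union


def notify_users(
    users: list[dict[str, Union[str, int, bool, list[str]]]],
    notification: dict[str, Union[str, int, dict[str, Optional[str]]]],
) -> list[tuple[str, str]]:
    # Same hypothetical behavior as an unannotated version would have:
    # pair each active user's email with the notification title.
    delivered: list[tuple[str, str]] = []
    for user in users:
        if user["active"]:
            delivered.append((str(user["email"]), str(notification["title"])))
    return delivered


users = [
    {"email": "ada@example.com", "active": True},
    {"email": "bob@example.com", "active": False},
]
print(notify_users(users, {"title": "Welcome"}))
```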

Here, even though the type hints provide more detailed information about the expected structure of the parameters, they introduce significant visual noise. This can make the function signature harder to read at a quick glance, especially if a developer is just trying to understand the functionality and not the exact type details. It's somewhat of an extreme example, because you would likely have User and Notification classes which would remove the need for the custom types. But I think the point still stands. You've probably come across overly annotated code in the wild, where classes don't exist for the data types at hand, and instead of extracting custom types, everything goes into the function signature. That's a pain to read!
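For comparison, a sketch of the "extract custom types" mitigation using plain type aliases. The alias names (`UserRecord`, `NotificationPayload`, `Delivery`) and the function itself are hypothetical; the nested types are the same kind of structure one might otherwise spell out inline.

```python
from typing import Optional, Union

# Hypothetical aliases: the nesting is written once, under a meaningful name.
UserRecord = dict[str, Union[str, int, bool, list[str]]]
NotificationPayload = dict[str, Union[str, int, dict[str, Optional[str]]]]
Delivery = tuple[str, str]


def notify_users(
    users: list[UserRecord], notification: NotificationPayload
) -> list[Delivery]:
    # Pair each active user's email with the notification title.
    delivered: list[Delivery] = []
    for user in users:
        if user["active"]:
            delivered.append((str(user["email"]), str(notification["title"])))
    return delivered


users: list[UserRecord] = [
    {"email": "ada@example.com", "active": True},
    {"email": "bob@example.com", "active": False},
]
print(notify_users(users, {"title": "Welcome"}))
```

The type information is unchanged, but the signature now fits on a couple of lines and the names carry the meaning.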

A second major disadvantage I would list is that type hints can be misleading. Something can be typed as str but actually be a list. Not only that, but the type can actually change throughout the execution lifecycle. Yes, Python supports that; the my_var identifier can point to a str object on one line, and a tuple on another. But I'd argue the first example is bad typing, and the second one is bad coding. And type hints are, well, just hints after all. Nobody is trying to sell them as something more than that. So one can argue this is, again, not something inherently wrong with type annotations themselves. But given our cognitive biases, we often perceive a hint as a stronger assertion than it truly is. We tend to believe it, especially if we're coming from a strictly-typed language. Therefore, type hints can give a false sense of security about the state and robustness of your code.
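Both situations can be sketched in a few lines. The function `count_characters` and the variable names are made up for illustration.

```python
def count_characters(text: str) -> int:
    # Hinted as str, but len() works on any sized object.
    return len(text)


# Bad typing: the hint promises a str, yet the caller passes a list.
# Nothing fails; the function silently counts list items, not characters.
tags = ["python", "typing"]
item_count = count_characters(tags)

# Bad coding: the same name is rebound to objects of different types.
my_var = "hello"
type_before = type(my_var).__name__
my_var = ("hello", "world")
type_after = type(my_var).__name__

print(item_count, type_before, type_after)
```

The code runs to completion, which is exactly the problem: a reader who trusts the `str` hint has no runtime signal that it is wrong.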

The third, and probably my biggest complaint, and this is something I don't hear being talked about enough: type annotations, or a frequent usage of them, force the developer to think in terms of types and not behavior. Duck typing is a major theme in Python, heavily supported and encouraged, and type annotations subtly lead you to think in the other direction. I know what you're saying; nobody forces you to annotate everything as a list when there's Iterable, Sequence, Sized etc. available. But when you're thinking of what to type something as, you're probably thinking of types, not behavior. And while this may say more about developers than the language, in practice abstract types are underused and concrete types are vastly overused. But there's a deeper caveat to this, beyond type hinting: if you're annotating something as list when it could very well be Iterable, you're probably also writing the code in a way that only works with lists, needlessly restricting its scope and utility.
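A small sketch of the difference, with made-up function names. The first version only asks for the behavior it needs (iteration); the second is written against list-specific operations and breaks on anything else.

```python
from collections.abc import Iterable


def total_length(words: Iterable[str]) -> int:
    # Only requires that `words` can be iterated over.
    return sum(len(word) for word in words)


def total_length_list_only(words: list[str]) -> int:
    # Written with list operations in mind: needs len() and indexing.
    total = 0
    for index in range(len(words)):
        total += len(words[index])
    return total


from_list = total_length(["ab", "cde"])
from_tuple = total_length(("ab", "cde"))
from_generator = total_length(word for word in ["ab", "cde"])

try:
    total_length_list_only(word for word in ["ab", "cde"])
    list_only_failed = False
except TypeError:
    # Generators support neither len() nor indexing.
    list_only_failed = True

print(from_list, from_tuple, from_generator, list_only_failed)
```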

Lastly, some developers complain that annotated code goes against the vibe of Python. As mentioned earlier, Python's simplicity, freedom and laconic way of writing was a major selling point for me. It would be dishonest of me to argue against this point. Some of the complainers suggest that if you want types in your code, you should just use a statically typed language (there's no shortage of those), where you also get the performance and safety benefits. I don't think things are ever that simple, and there's a lot of nuance to that discourse (one does not switch languages just like that), but it's nevertheless a good point against type hints.

Pros of typing

Firstly, and we already mentioned this earlier, annotations serve as a reading aid, and they do a good job at this. The false sense of security and developer mistakes/misuse aside, when I'm quickly glancing at a piece of code to understand the inputs and outputs, type annotations are helpful more often than not. There's no denying that.

Secondly, type hints make working in a large codebase easier. I still have reservations about using type hints for standalone scripts and one-off snippets, but in a large codebase, they do make it easier to understand how different entities are related to each other and how/where each of them comes into play.

Another big plus is that annotations help your IDE help you more (funny how that works)! At least in PyCharm, I've noticed that auto-completion is much better when the code is annotated, and there are more helpful warnings as well.

Moreover, you can actually get functionality out of type hints! If you're using FastAPI and pydantic, you get data parsing and validation for free just by typing things out. As an example, if you hint that the request body for your endpoint should comply with a certain pydantic model, then a bad payload is rejected automatically without you having to do anything about it. That's a massive gain in productivity, among other things. And by the way, this is a rare case of type annotations actually altering the behavior of your code. Don't get confused by that; to the Python interpreter the annotations still don't mean much. But there are frameworks and libraries that have been written to make use of the annotations in a more impactful way.
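To show the underlying mechanism, here is a toy validator built only on the standard library. It is not how pydantic or FastAPI are implemented; `validate_payload`, `PayloadError` and `CreateUser` are hypothetical names, and the sketch only handles plain classes such as str and int as field types. The idea it illustrates is that a library can read annotations at runtime (here via `typing.get_type_hints`) and act on them.

```python
from typing import get_type_hints


class PayloadError(ValueError):
    """Raised by the toy validator when a payload does not match the hints."""


def validate_payload(model: type, payload: dict) -> dict:
    # Read the class-level annotations at runtime and check each field.
    hints = get_type_hints(model)
    errors = []
    for field, expected in hints.items():
        if field not in payload:
            errors.append(f"{field}: missing")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    if errors:
        raise PayloadError("; ".join(errors))
    # Keep only the declared fields, dropping anything unexpected.
    return {field: payload[field] for field in hints}


class CreateUser:
    # Annotations only; the validator is what gives them an effect.
    name: str
    age: int


good = validate_payload(CreateUser, {"name": "Ada", "age": 36})

try:
    validate_payload(CreateUser, {"name": "Ada", "age": "thirty-six"})
    bad_error = ""
except PayloadError as error:
    bad_error = str(error)

print(good)
print(bad_error)
```

Real libraries do far more (coercion, nested models, detailed error reports), but the starting point is the same: the hints are ordinary data that code can inspect.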

Lastly, there's the benefit of having static code analyzers like mypy run against your code. I personally cannot overstate the benefit of this, and for that reason, mypy gets its own section down below.

A game of assumptions

Sometimes, I refrain from testing my code until I've finished writing it. In those cases, I don't execute the code halfway through to iteratively refine it. I complete the solution end to end, and only when I'm fully convinced it does what I expect it to do, do I execute it. This is a good exercise in facing the assumptions I hold to be true. When we code, we often assume that some stdlib or third-party code does what we think it does, or that the language itself behaves in a certain way. And sometimes we're wrong about it. That's okay, because writing code is usually an iterative process, so we just mindlessly go "Ah, yes, my bad", fix the small issue, and run the code again. We repeat this process until everything works (or at least we think it does). There's nothing wrong with this, but in my experience it's a very valuable practice to sometimes do the opposite: finish the work, then test it. Sometimes the code executes fine, and I must say: oh what a joy that is. But sometimes it doesn't, and here's where things get interesting. When you're purposefully addressing every detail and trying to write perfectly functional code, and you don't succeed at that, it means you made some wrong assumptions along the way. Wrong assumptions are rooted in a lack of understanding, and sometimes that lack of understanding is about some critical component or aspect of the tool you're using. So I don't think you should just quickly and subconsciously grab the first solution from Stack Overflow, apply the fix, and move on with your day. I think moments like that call for a pause and thoughtful reflection, and perhaps a deeper dive into the underlying knowledge gaps. And you cannot do that unless you deliberately exercise this technique every now and then.

But why am I sharing my weird habits? Well, that's because there's this tool called mypy that humbles me in this regard. Mypy analyzes your codebase, and based on the annotations you've added, along with its own inference magic, it catches type-related errors early in the development process. It has caught possible issues that even my slow, immersive coding experience could not. These are issues that would probably have reached production otherwise. "Isn't that just indicative of a weak automated test suite?" you may say. Maybe, but I'm the one writing the tests too. If I didn't think about some edge case while coding, it's likely that I would miss it when writing the tests as well. And, assuming the changes (along with the bug) have to go through a Pull Request first, the reviewer is just another colleague of mine. They're probably just as likely to overlook the issue. There's simply no other way to put it; mypy has bailed me out on several occasions. It's not perfect, and sometimes it's even annoying (I run it in strict mode), but it's an impressive, game-changing tool.
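A typical example of the kind of issue a type checker catches is a forgotten None case. The sketch below uses hypothetical names (`User`, `find_user`, `email_of`) and shows the corrected code; the comment marks the spot where the unguarded version would be flagged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    email: str


def find_user(users: list[User], name: str) -> Optional[User]:
    # Returns None when nobody matches: an edge case that is easy to forget.
    for user in users:
        if user.name == name:
            return user
    return None


def email_of(users: list[User], name: str) -> str:
    user = find_user(users, name)
    # Writing `return user.email` right here, without the check below, is
    # what mypy flags (error code `union-attr`), because `user` may be None.
    if user is None:
        raise LookupError(f"no user named {name!r}")
    return user.email


users = [User("ada", "ada@example.com")]
print(email_of(users, "ada"))
```

Without the guard, the happy path works and a test that only looks up existing users passes; the failure only shows up at runtime, as an AttributeError, the first time a name is missing.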

Conclusion

Where do I stand in all of this? As you might have picked up by now, I'm certainly in favor of using type annotations. As long as you acknowledge the shortcomings, and develop with care and mindfulness, I think the benefits significantly outweigh the cons. I am currently using annotations a lot, and I think this is only going to increase. I've seen the quality of my code go up by margins that cannot be ignored. In the next article, I will share what I would consider the typing basics (and a bit beyond that) that every Python developer needs to know in 2023.
