Duck Typing and Python
"If it walks like a duck and it quacks like a duck, then it is a duck".
If you've been in this business for some time, then you have definitely heard the phrase. It is used to describe the programming practice of Duck Typing.
But what is Duck Typing, how does it relate to Python, and why is it important ? This article will cover that.
What is Duck Typing ?
Every piece of code generally performs an operation. It does something. In programming, we have objects and behaviors. By objects, I mean also variables of a primitive data type, such as integers or booleans. Well, in Python everything is an object (that's a topic for another day), but I needed to clarify that, since most of what I'm about to say regarding Duck Typing applies to other programming languages too.
So, objects and behaviors. As you already know, not every behavior can be applied to every object. For example, it makes no sense to turn a boolean value into uppercase, to find the square root of a string, or to sort an integer. Different kind of objects have different sets of behavior that encompass them. Using the wrong behavior will most likely result in an error (TypeError in Python).
Most of the time, we already know what behaviors the objects at hand support. Which is why type errors don't occur that often. But when we're not sure, we might try to error-proof the code with the following paradigm: "If object is X then do Y on object". This is the opposite of duck typing. We're checking whether object is specifically an instance of X (or of a child, grandchild...of X). We are not checking whether it quacks, we're checking if it is a duck.
In essence, Duck Typing is the programming practice of checking for behavior, not for type. Translated into the pseudocode language of the example above, it goes like "If object can do Y, then let it do Y". This is important, because on many occasions we don't really care about the type. Suppose we have a series of values, and would like to sort them. We don't care if the object containing the series of values is a list, or a tuple. Our only concern is whether this object can be sorted or not. In the context of getting our series of values sorted, type doesn't matter as long as the sort operation can happen. We don't care what the object is, but what it can be done to it or what it can do.
But how do we know what an object can do ? That's simple; through the methods it implements. Remember that an object has attributes, which serve as properties, and methods, which define behavior.
What's so special about Python ?
Duck Typing is not exclusive to Python. You can practice it in many other languages as well. But it's extra important in Python, because it is ubiquitous in almost every part of the language. In Python, we have many behavior-oriented concepts such as
- Sequence
- Iterable
- Iterator
- Callable
- Decorator
- context manager
- file-like object etc
These concepts are a trademark of the set of methods they implement. In Python, behavior is usually defined through dunder methods (i.e. __next__()
). You probably have often had to define dunder methods during writing Python code. Python internals make more sense once you grasp Duck Typing.
Python examples
In Python, there's the concept of a Sequence. To be a Sequence, you must implement certain operations, such as having a length, being sliceable, indexable, reverse-indexable, comparable with other sequences, checkable for membership etc.
Lists, tuples and strings are all sequences, since they all support the above-mentioned behaviors. Therefore, when we want to perform a slicing, and we don't know the exact type of the object we're dealing with, checking if the type is string closes the door to other kind of sequences, where the slicing would have worked too. It narrows the scope of our solution.
Yes, we can performs 3 checks at once (whether it's a list, tuple or string), but wouldn't it be better to simply check for behavior instead ? There could be many other types in Python which are sequences. Checking for all the types isn't practical. After all, what we really need to know is whether the object is sliceable, and that behavior is provided by the __getitem__()
method, regardless of type.
Iterables are a much narrower concept: anything we can loop over! That means str, list, tuple, set, dict, files, generators, zip objects etc all are iterables. Even iterators are iterables themselves. To build your own custom iterable, you can simply implement __iter__()
and make it return an iterator, or implement __getitem__()
under certain conditions. If the object at hand doesn't implement either, you can rest assured it's not something you can iterate over.
Benefits of Duck Typing
A big selling point of Duck Typing, as mentioned earlier, is the ability to make your code more generic. You perform an operation, in whatever object that operation is 'legal', without restricting your solution to a specific type. This way, you open up your code to more use cases.
Another benefit is improved code readability. Duck Typing expresses intent. You check whether an object can be sliced because you intend to slice it. It's immediately obvious to someone else reading the code! Checking whether the object is a list instance doesn't give many clues, or in fact, gives too many.
Duck Typing isn't always possible
There are times when you you will want to perform a certain operation, whose behavior is defined by only 1 specific type. An example of that is checking if an object's value is an all-uppercase word. This operation expects an exact str
instance, not a string-like object. Duck Typing is not applicable here. We're not dealing with a generic behavior implemented by many types, but something that is only defined by the str
class with its isupper()
method.
Another, perhaps obvious example, is exception handling. Every exception has an exact type, and often when we're writing error handling code, we handle different types of exceptions differently. The type matters! As a matter of fact, it's a best practice to try to be as specific as possible here, so as to avoid Pokemon Exception Handling (don't catch them all!).
EAFP
EAFP is a principle which states that it's easier to ask for forgiveness than permission. While that may be bad life advice in general, in programming it's something you should consider.
EAFP is somewhat related to Duck Typing. I'd say it complements it. But what exactly is it in programming terms ?
If we're practicing Duck Typing, we're checking for behavior, not type. But if we add EAFP to the mix, we don't check for anything. We simply assume behavior, and handle the appropriate error in case our assumption doesn't hold.
In our slicing example from earlier, we would simply assume that the object can be sliced and proceed with the operation. If it can't be sliced, it will raise a TypeError, for which we should have written code that handles it. This way, we're not asking for permission whether it can be sliced, but only for 'forgiveness' in case it can't.
A great benefit of EAFP is that the handling block gets executed only when things go wrong, whereas a traditional if statement check gets executed every time, even though most of the time the check might pass. Other than the performance benefit, EAFP makes the code more explicit, especially in terms of how it could fail. Also, EAFP helps avoid race condition, because there is no gap between the time of check to time of use.
Keep in mind that EAFP is a broad concept; it can be applied outside a Duck Typing context, and outside of Python too. Also, there are cases when you don't need to practice EAFP because Python has you covered. Example: dict.get('key')
will return None if the key doesn't exist, instead of raising an exception. And there might be cases where you're better off with an if statement. EAFP isn't a one-size-fits-all.
If your code calls type()
, isinstance()
, or issubclass()
a lot, you're not practicing Duck Typing nor EAFP. While that's not inherently wrong, you don't really need type checking that often, so take a closer inspection.