Julian's Blog - Logical Operators in Python

JulianHysi 25 June, 2023

Logical Operators in Python

We all know what logical operators are and how to use them. After all, they're one of the fundamental building blocks of even the simplest pieces of software. You learn about them when you first learn how to code, and it sticks with you forever. They change very little from language to language, and there's no incentive to really dive deeper into them.
In this article, we will go over some lesser-known details about logical operators in Python, as well as the caveats and implications that come with them.

Let's get some basic terminology out of the way first. In the following line of code:

>>> x and y

Here x and y are the operands, the keyword and is the operator, and the whole line is the operation, or the boolean expression (I will use the two terms interchangeably from now on). A boolean context is any place where a value gets reduced to a boolean: the condition of an if or while statement, an operand of a logical operator, or a call to bool().
Python has three logical operators: and, or and not. When used in a boolean expression, they evaluate in the same way as in other popular languages: and needs both operands to be truthy, or needs (at least) one of them to be truthy and not needs the operand to be falsy in order to evaluate as true (and vice versa). Nothing special here. We will also mention the logical xor operation at the end of the article, but for the moment let's just dive into these three operators, as they're what's being used the vast majority of the time.

Return values

Where Python is a bit different is that the and and or operators don't actually return the boolean evaluation of the expression, but the value of one of the operands! This goes hand in hand with short-circuiting, which in this context basically means stopping the evaluation of the expression halfway through if a decision about it can already be made. It's an efficiency technique adopted by many languages: why evaluate the 2nd operand in an and expression if the first one is falsy? It's a pretty basic and rather obvious performance trick. In Python, however, whether or not they short-circuit, the and and or operators have a return value that is not necessarily a boolean one. Although it does evaluate to a boolean value in a boolean context, it's not necessarily a boolean instance (i.e. one of the True or False objects). Let's drive this home with a quick example:
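A minimal sketch of such an example, assuming two illustrative name variables (first_name and last_name are made up for this snippet):

```python
# Hypothetical values, chosen only for illustration.
first_name = "John"
last_name = "Doe"

# Both operands are truthy, so `and` hands back the second operand itself.
result = first_name and last_name
print(result)
```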

That will not print "True" to the screen, but "Doe"! Of course, as mentioned above, "Doe" evaluates to True in a boolean context, but it's not a boolean value per se, and if the code that depends on it expects or assumes it to be, this can lead to subtle bugs and unexpected behavior. Following is the return value table, for all four combinations, for both and and or operators:

x	y	x and y	x or y
truthy	truthy	y	x
truthy	falsy	y	x
falsy	truthy	x	y
falsy	falsy	x	y

Alright, that's a lot... do we have to memorize that as well, just like the corresponding truth tables? Well, these operators are used very frequently, so in my opinion, it's best to memorize them, or to keep the table above somewhere in your cheat sheets. A neat trick I use to memorize them is this:

- and returns the first falsy operand if there is one, otherwise the 2nd operand
- or returns the first truthy operand if there is one, otherwise the 2nd operand
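If you would rather check the return values than trust your memory, a small sketch like the following prints them for all four combinations. The operand values are arbitrary stand-ins; any truthy and falsy objects would do.

```python
# Illustrative operands: two truthy strings, a falsy string and a falsy int.
truthy_x, truthy_y = "x", "y"
falsy_x, falsy_y = "", 0

combos = [
    (truthy_x, truthy_y),
    (truthy_x, falsy_y),
    (falsy_x, truthy_y),
    (falsy_x, falsy_y),
]

# Each row holds: x, y, the value of `x and y`, the value of `x or y`.
rows = [(x, y, x and y, x or y) for x, y in combos]

for x, y, and_result, or_result in rows:
    print(f"{x!r:5} {y!r:5} and -> {and_result!r:5} or -> {or_result!r}")
```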

Let's reiterate this one last time so that it clicks: when using logical operators and and or in Python, the returned value is actually one of the operands, not necessarily a boolean True or False value. If the operand happens to be a literal boolean instance though, it would be returned as such. Put simply, while the result of these operations can indeed be interpreted as either True or False in a boolean context, the actual value returned could be any Python object, depending on the values of the operands at hand.

The not operator is different, as it always returns a boolean value, either True or False. If x is truthy, not x is False, and if x is falsy, not x is True. In other words, not is the only one of the three operators whose return value is always the boolean result itself, never the operand.
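A quick sketch of this, using a handful of arbitrary sample values:

```python
# Arbitrary mix of truthy and falsy objects, for illustration only.
values = ["Doe", "", [], [0], 0, 42, None]

negations = [not value for value in values]
print(negations)

# Every result is a real bool instance, whatever the operand's type was.
print(all(isinstance(n, bool) for n in negations))

# Negating twice is one way to get the plain boolean value of an object.
double_negation = not not "Doe"
print(double_negation)
```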

Short circuiting

The short-circuiting mentioned above implies the following: in an and expression, the 2nd operand will not be evaluated at all if the 1st one is falsy. And in the or expression, the 2nd operand will not be evaluated at all if the 1st is truthy. This makes a lot of sense, as for both these operators, the 1st operand can in these cases definitively determine the result, so there's no need to keep going. This behavior of Python allows us to do something like this:
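A minimal sketch of that pattern, assuming my_list is a plain list of numbers (the name largest is made up for illustration):

```python
my_list = []  # could just as well be something like [3, 1, 2]

# max() is only called when my_list is truthy, i.e. non-empty.
largest = my_list and max(my_list)
print(largest)
```

Note that for an empty list, largest ends up being the empty list itself rather than None or False, since and returns its 1st operand when that operand is falsy.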

In the code above, assume that my_list is a list we received, that can potentially be empty. Calling max() on an empty sequence raises ValueError, but the code above will not, because in the case of an empty list, we would short-circuit. We could of course also do it like this and still be safe:
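One possible form of that more explicit version, again only a sketch (falling back to None is an assumption made here, not a requirement):

```python
my_list = []  # could just as well be something like [3, 1, 2]

# Guard the call explicitly instead of relying on short-circuiting.
if my_list:
    largest = max(my_list)
else:
    largest = None  # assumed fallback for the empty case
print(largest)
```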

but the short-circuiting version arguably looks cleaner and is more readable.

Note that you can also leverage short-circuiting for performance decisions, particularly when one side of the expression is more computationally expensive than the other. In an and expression, you should put as the 1st operand the one which is falsy more often and/or is less costly to compute:

>>> is_weekday and costly_database_query()

So if it's not a weekday, which is very cheap to compute, then we don't issue the heavy query to the database. With or, the operand that is more often truthy, and/or less costly to compute, should come first:

>>> is_cache_valid or costly_query()

In summary, and this applies to both and and or operators, if both operands are more or less equally expensive to compute, then the first one should be the operand that is more likely to have a value which determines the expression (and thus allows for short-circuiting). And if both operands are more or less as likely to determine the outcome, then the cheapest one should come first. Under most circumstances, this would be nothing more than a micro-optimization, but there are situations where it could come in handy and make a significant difference.
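To make the effect visible, here is a small sketch that records which operands actually get evaluated. The check() helper and the "cheap"/"costly" labels are hypothetical stand-ins for real computations.

```python
calls = []

def check(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value

# and: a falsy 1st operand decides the result, the 2nd is never evaluated.
and_result = check("cheap", False) and check("costly", True)
and_calls = list(calls)

calls.clear()

# or: a truthy 1st operand decides the result, the 2nd is never evaluated.
or_result = check("cheap", True) or check("costly", False)
or_calls = list(calls)

print(and_result, and_calls)
print(or_result, or_calls)
```

In both cases only the "cheap" operand shows up in the recorded calls, which is exactly the saving that the ordering advice is after.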

Operator precedence

Worth mentioning is also the precedence of logical operators, which goes as follows: not, and and then or, in that order. Take a look at the following example:
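A minimal version of such an expression, assuming x is truthy while y and z are falsy (the concrete values are illustrative):

```python
x, y, z = 1, 0, 0  # illustrative values: x truthy, y and z falsy

result = x and not y or z
print(result)

# The same expression with the implicit grouping spelled out.
grouped = (x and (not y)) or z
print(grouped)
```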

- not y is evaluated first, and it's True (since y is falsy)
- the and operation runs next, now reduced to x and True, and since x is truthy, it evaluates to True as well
- lastly, there's True or z, which is True despite z being falsy

If you want to impose a different grouping, you have to use parentheses. Keep in mind that parentheses only change how the operands are grouped: Python still evaluates the operands from left to right, and a parenthesized group is skipped entirely if the operand to its left already short-circuits the expression. Grouping with parentheses is a common math convention, and it's embraced by many programming languages for all sorts of operations, not just logical ones. It's an especially important one to remember when you are short-circuiting to avoid an exception or for performance benefits, as an unexpected order of execution could make things worse.
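The sketch below shows parentheses changing the result of an expression, and also records which operands Python actually evaluates. The values and the operand() helper are made up for illustration.

```python
x, y, z = 0, 0, 1  # illustrative values: x and y falsy, z truthy

default_grouping = x and y or z      # parsed as (x and y) or z
explicit_grouping = x and (y or z)   # parentheses regroup the operands
print(default_grouping, explicit_grouping)

evaluated = []

def operand(name, value):
    """Record that this operand was evaluated, then return its value."""
    evaluated.append(name)
    return value

# x is falsy, so the parenthesized group on the right is never evaluated.
traced = operand("x", 0) and (operand("y", 0) or operand("z", 1))
print(traced, evaluated)
```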

Logical xor

Finally, let's talk about xor, one of my personal favorites. It stands for exclusive or, and as the name suggests, xor needs exactly one of the two operands to be true. It could be either the 1st or the 2nd; it doesn't matter. But not both. As you may pick up, it's a less frequently needed operator, and because of that (and a few other interesting language design decisions), Python and many other languages do not offer a native logical xor operator at all. In Python, there are only three "true" logical operators: the ones mentioned at the top of and throughout the article.

We can, of course, simulate the xor operation by combining the other three operators in parentheses as below:

>>> xor_result = (x and not y) or (not x and y)

or using the inequality operator (less code, but harder to read?):

>>> xor_result = bool(x) != bool(y)

Everything mentioned so far about logical operators, return values, short-circuiting and precedence still holds true; in fact, it's all what the interpreter sees when evaluating that expression. It has no idea we are trying to compose a logical xor there.
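One consequence is that the composed version can return an operand rather than a boolean, while the inequality version always returns a bool. A small sketch, wrapping both in hypothetical helper functions for easier comparison:

```python
def xor_composed(x, y):
    """The and/or/not composition: may return an operand, not a bool."""
    return (x and not y) or (not x and y)

def xor_inequality(x, y):
    """The inequality form: always returns True or False."""
    return bool(x) != bool(y)

# Illustrative operand pairs covering all four truthy/falsy combinations.
pairs = [("a", ""), ("", "b"), ("a", "b"), ("", 0)]

composed = [xor_composed(x, y) for x, y in pairs]
inequality = [xor_inequality(x, y) for x, y in pairs]

print(composed)
print(inequality)
```

Both versions agree in a boolean context, but only the second one is safe to store or compare as an actual True or False.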

However, the above syntax isn't very readable, especially when the operands themselves are boolean expressions or function calls, so you'll often see Python programmers prefer the `^` (caret) syntax, which is actually the bitwise xor operator:

>>> xor_result = x ^ y

When used with boolean values, ^ practically becomes a logical xor. However, it's still a bitwise operator, not a logical one, so be careful! When used with non-boolean objects, it can lead to unexpected results:
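A sketch of such a case, assuming a = 60 and b = 13 (values picked here because their bit patterns produce the result discussed next):

```python
a = 60  # 0011 1100 in binary (assumed value)
b = 13  # 0000 1101 in binary (assumed value)

result = a ^ b
swapped = b ^ a
print(result)
print(swapped)

# Show the bit pattern of the result, padded to 8 digits.
bits = format(result, "08b")
print(bits)
```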

When a is bitwise xor-ed with b, it yields 0011 0001, which is 49 in decimal. The above code always outputs 49, regardless of the order of the operands. The order is not the issue here, but perhaps you can start to see what is. As the result of a logical operation, 49 is meaningless, so when xor-ing this way we must not assign the result to a variable or base decisions on it, unless of course we really want a bitwise operation here, in which case the return value is exactly what we are interested in.

But there's a bigger, hidden trap here, often hard to notice. This statement:

>>> bool(a ^ b)

perhaps to your astonishment, evaluates to True! That's because the bitwise xor expression evaluates to 49 first, which is truthy in a boolean context. A logical xor should evaluate to False when both operands are truthy, so to make things work, you need to adjust to this:

>>> bool(a) ^ bool(b)

And many object types do not support the bitwise operation at all, raising a TypeError: integers and booleans do, but strings, lists, floats and most other objects don't. (Sets do accept ^, but there it computes the symmetric difference, which is not a logical xor either.) Hence, the more verbose syntax above will often be your only option. But in any case, you have to pay close attention to the data types involved. An unexpected value that changes the execution flow of your application is a lot more dangerous than a TypeError exception that can be quickly spotted and fixed without compromising the integrity of your processes.
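The sketch below pulls these last points together. The values of a and b and the logical_xor() helper are assumptions made for illustration; the helper is just the inequality form from earlier wrapped in a function.

```python
a, b = 60, 13  # assumed values: both truthy integers

# Converting after the bitwise xor gives the wrong logical answer.
wrong = bool(a ^ b)
# Converting each operand first gives the logical xor.
right = bool(a) ^ bool(b)
print(wrong, right)

def logical_xor(x, y):
    """Hypothetical helper: truthiness-based xor that always returns a bool."""
    return bool(x) != bool(y)

print(logical_xor("John", ""))
print(logical_xor("John", "Doe"))

# Strings do not implement ^ at all, so this raises instead of misbehaving.
try:
    "John" ^ "Doe"
    bitwise_supported = True
except TypeError as exc:
    bitwise_supported = False
    print(f"TypeError: {exc}")
```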

Conclusion

Well, that's about it. Logical operators aren't the most exciting topic to talk about, but I think it's necessary to be aware of these seemingly subtle but crucial details that can make your Python application behave unpredictably or inefficiently, leaving you baffled at what's going on.
