Except the Unexpected

No, that title isn’t a typo. Predictably, yesterday’s post drew out an opposing view, and I’m very glad for it. While I haven’t changed my mind on the subject, Cedric did raise some reasonable points that I neglected in my original post. Maybe it’s just that I’m growing tired of posting every day, but I didn’t adequately explain my views, and for that I apologize. Only for not explaining, though, not for the views themselves. Today, I’ll try to be more detailed in my thoughts on the subject, and offer more recommendations than just “don’t return None” and “embrace exceptions”.

Also, I will apologize for the length of this post. I hope it helps, though.

Defining “exception”

A quick dictionary search for the word “exception” provides numerous options for defining the word. Removing useless definitions (“the act of excepting or the fact of being excepted”) and more specialized uses (criticism and legal usage), we’re left with a few variations on a theme:

The recurring theme here is that exceptions generally require a rule from which to deviate. But we’re programmers here, so what kind of rules are we talking about? In a nutshell, the rule is whatever action a piece of code is expected to perform. This might be defined by a function’s name, documentation, usage examples, or other communication, but it will be specific to each piece of code.

Any time you write a function, you’re designating a purpose for it, a task it should perform. That task, however simple or complicated, however blunt or subtle, is its rule. Deviations from that rule are exceptions. (To throw in more tongue-twisters for fun, consider this: exceptions violate expectations; they are excepted from what’s expected.)

However, just defining the concept of an exception doesn’t really do much for anybody. It’s like one of those zen sayings that doesn’t make any sense unless you’re expressly looking for meaning in the universe. And even then, it doesn’t really provide an answer, it just makes you appreciate the question. So how should we apply the concept of exceptions to programming? Well, exactly how it works will depend on the rule, and thus on each function itself.

So, it seems to me that the best designs focus on defining the rule, rather than the exceptions. Any required exceptions will be defined naturally as a side-effect of having a well-defined rule in place.
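Here’s a rough sketch of what I mean. The function parse_port is hypothetical, made up purely for illustration. Its rule is stated once, in the docstring, and the exceptions aren’t designed separately; they fall out of that rule.

```python
def parse_port(text):
    """Rule: return the TCP port number written in ``text``,
    an integer from 1 to 65535."""
    port = int(text)  # int() raises ValueError if text isn't an integer
    if not 1 <= port <= 65535:
        # An integer, but not one the rule covers.
        raise ValueError("port out of range: %d" % port)
    return port
```

Nothing here returns None to signal a problem. Anything the rule doesn’t cover surfaces as a ValueError at the point where the rule was broken.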

Defining a rule

When you write a function, you’re defining a task, or set of tasks, that the function will perform (some systems may make this definition more formal by designing by contract, but that’s beyond the scope of this article). Exactly how it’s performed is irrelevant to this discussion, and since rules will be different for different functions, the real key here is just to make sure that you do at least consciously define a rule for what the function will do. A few examples:

First, consider the dictionary. By using Python’s standard dictionary syntax, x[i], you’re implicitly calling __getitem__. So when using this syntax, the rule specifies that the key supplied must match a value. If this isn’t the case, it’s an exception to the rule, and Python reacts accordingly by raising a KeyError. If you supply a value for the key that can’t be used as a key (such as a list; try x[[]]), it can’t even try to look it up as a key, so that’s another exception: TypeError. These cases aren’t covered by the rule, so they’re considered to be exceptions.
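A minimal sketch of those two exceptions, using a made-up prices dictionary as sample data:

```python
prices = {"apple": 0.50, "pear": 0.75}  # hypothetical sample data

# The key matches a value, so the rule is satisfied.
apple_price = prices["apple"]

# No value matches this key: an exception to the rule.
try:
    prices["kiwi"]
except KeyError as exc:
    first_failure = type(exc).__name__

# A list can't be hashed, so the lookup can't even be attempted.
try:
    prices[[]]
except TypeError as exc:
    second_failure = type(exc).__name__

print(apple_price, first_failure, second_failure)
```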

Dictionaries provide another option, however, which allows for keys to be missing. This is a different function, and thus a different rule. By including “if such a value is present” in its rule, the get method must handle the inverse of that condition. If an appropriate value isn’t present, it’s no longer an exception, but an anticipated aspect of the rule, and it handles this by returning None. *gasp* Yes, this violates my previous post, and that’s why I agreed with Cedric that I should clarify my point. This situation isn’t evil, because returning None is appropriate within the rule defined for the function.

It’s also important to note that calling get instead of __getitem__ is a choice you have to make; it’s not automatic. The “standard” dictionary access technique uses __getitem__, with get being generally reserved for more specialized situations which need to follow a different rule.
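To put the two rules side by side, here’s a small sketch (again with made-up sample data):

```python
prices = {"apple": 0.50}  # hypothetical sample data

# Direct access follows the strict rule: the key must be present.
try:
    prices["kiwi"]
    strict_result = "found"
except KeyError:
    strict_result = "KeyError"

# get() follows the looser rule: return the value if such a value is present.
lenient_result = prices.get("kiwi")     # None, and that's within the rule
with_default = prices.get("kiwi", 0.0)  # or name your own fallback
```

The caller picks which rule applies by picking which function to call, so a None coming back from get is something the caller explicitly asked for.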

In the case of the get method of Django’s model managers, you’ll notice the rule is very complex, and thus has many potential points of failure. Just going by the rule I laid out above, here are the ways it could go wrong:

Of course, there are more things that can go wrong, but those are mostly implementation details or things that are out of Django’s control. Those listed above are based solely on the rule I provided, and of course, that rule is probably a bit oversimplified as well.
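Django’s ORM won’t fit in a blog post, so here’s a toy stand-in to make the idea concrete. ToyManager and its sample rows are invented for this sketch, and the rule in its docstring is a deliberate simplification. The two exception classes only borrow the names DoesNotExist and MultipleObjectsReturned from the exceptions Django raises in these situations; none of this is Django’s actual code.

```python
class DoesNotExist(Exception):
    """No row matched the lookup."""


class MultipleObjectsReturned(Exception):
    """More than one row matched the lookup."""


class ToyManager:
    """Hypothetical in-memory stand-in for a model manager."""

    def __init__(self, rows):
        self.rows = list(rows)  # each row is a plain dict

    def get(self, **lookup):
        """Rule: return the one row whose fields equal the lookup."""
        matches = [
            row for row in self.rows
            if all(row.get(field) == value for field, value in lookup.items())
        ]
        if not matches:
            raise DoesNotExist(lookup)  # the rule promised a row
        if len(matches) > 1:
            raise MultipleObjectsReturned(lookup)  # the rule promised one row
        return matches[0]


articles = ToyManager([
    {"id": 1, "author": "marty"},
    {"id": 2, "author": "marty"},
    {"id": 3, "author": "cedric"},
])
```

With this sample data, articles.get(id=3) satisfies the rule, while articles.get(author="marty") and articles.get(author="nobody") each break it in a different way, and each gets its own exception.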

Some words of advice

So when should your rules include provisions for None? How inclusive should your rules be? How should you convey the nature of these rules to programmers? These are all good questions, and while I don’t pretend to have a perfect answer (in fact, I doubt there is one), I’ll offer some advice based on my own experiences.

Conclusion (finally!)

As with anything else, I won’t pretend to be perfect in this regard myself. I came from a PHP background, and I wrote some of my Python code before I fully understood some of these philosophical concepts. But once you know better, you should always try to be a better programmer, and this is one way I feel we could all be better programmers.

Comments

  1. At 4:10 p.m. on Nov 21, 2007, Jesse Kuhnert said ...

    This one's considerably more subjective, but I think it's still good advice. Remember the example of the dictionary. The "standard" (most-used, most-documented) tactic is direct access, x[i], which will raise an exception if the key isn't present. The None-returning variation, x.get(i), is the alternative, available if necessary, but not used by default. This separation and priority helps make sure that programmers make a conscious decision to deal with a function that returns None, rather than it being an unexpected side-effect of a function that didn't stipulate that in its rules.

    I'm not sure where this quote came from in the context of your blog entry - but here it seems they still don't explain ~why~ - in the case of the dictionary - throwing an exception is a good idea. Maybe for some of your other cases sure - certainly for things that perform a search of some sort and return a list / array the return should be an empty structure.

    Did I miss the big eureka where this became clear ?

  2. At 5:23 p.m. on Nov 21, 2007, saluk said ...

    Here is an example where using the get method can lead to very bad problems:

    car = autos.get("Toyota")
    trafficlist.store(car)
    trafficlist.start_simulation()

    #Somewhere in the simulation
    carA.region.inside(carB.region)

    Traceback (most recent call last):
    AttributeError: 'NoneType' object has no attribute 'list'


    That traceback tells you that something is None which shouldn't be, and it might not be obvious where that None came from. If you remember, you will do this to the above code:

    car = autos.get("Toyota")
    if car is not None:
        trafficlist.store(car) #The important addition
    trafficlist.start_simulation()

    However, this is a very easy test to forget about. The thing about not returning None is that it will throw the error at the proper place in the code and remind you (and anyone working on the code after you) what is really going on.

    "Oh, I accidentally capitalized Toyota when it is stored as toyota..."

    I actually use the "if x is not None" a lot in my code though. I think it's a bit of a toss-up, but I generally agree with the OP that exceptions will help prevent a certain class of runtime error. Returning None is like disguising a trap - it looks pretty but eventually you'll forget where the trap was and get bit :) Better to spring the trap early so that the actual problem can be dealt with.

  3. At 10:03 p.m. on Nov 21, 2007, talios said ...

    I just posted my own followup on this to http://www.talios.com/a_little_head_trauma_returning_none_is_evil.htm about how Smalltalk deals with this using closures and the Nil object pattern.

  4. At 3:56 a.m. on Nov 22, 2007, Juri said ...

    One family of languages where I never run into this problem is ML and descendants such as Scala, Nemerle, etc. Applies to Haskell too, I guess, but I've never used it.

    It's one of those times I appreciate a strong type system: if you return an option[Type] (or whatever it's called in your language), the compiler won't let you access the value without pattern matching that checks whether the value is None or an instance of Type.

  5. At 8:58 a.m. on Nov 22, 2007, Michael B said ...

    If you remember, you will do this to the above code:

    car = autos.get("Toyota")
    if car is not None:
        trafficlist.store(car) #The important addition
    trafficlist.start_simulation()

    How does it make sense here to be able to store a null value? Sorry, 'none'. If the result is one object, and just one, returning none/null is fine. If the result can be one or more objects, return an empty collection (or container).

    So check preconditions, and again use the documentation to specify the contract (it can make sense to allow null values to be added to a collection in distinct cases, but here it doesn't).

    If you're told to archive some object but you're getting 'none', aren't you gonna yell "hey, what the hell am I supposed to archive here?" (smells like an exception)

  6. At 1:05 p.m. on Nov 22, 2007, nara said ...

    Juri's point above is a good one: the net effect is that the function should be consistent about returning one type of value (except in throwing exceptions, of course). Thus, when returning a list object, it makes sense to return the empty list instead of None. Haskell, and of course Java et al., make you explicitly define the input and output types of the function, and do static type checking. With Python, you can return a variety of types from one function, instead of just one, often leading to confusing or disastrous results.

    Another point: writing unittests could help better describe what the rule for a function is.

  7. At 9:13 p.m. on Nov 22, 2007, Leo Soto M. said ...

    Hey Marty, very good points.

    Regarding the pythonic-mindset vs java-mindset, I think that dynamic language programmers are more used to failing fast, because such languages don't make any attempt to forbid unexpected behaviour.

    On the other hand, when using static languages, people somewhat trust the compiler to find and disallow many unexpected conditions. So, I think that allowing nulls everywhere is a Java fault.

    It's not an excuse, as every Java programmer should know its shortcomings, and act accordingly. But I couldn't resist the chance to bash Java ;)

  8. At 10:30 a.m. on Nov 24, 2007, Jason McBrayer said ...

    I think it's a cultural difference more than (strictly speaking) a language difference. I often hear from Java programmers that you should only raise exceptions for things that are unexpected, and (this is the cultural difference) failure is usually not unexpected. That is, you don't raise an exception if something is failing in a common way that might be expected, such as an ORM not finding a matching record in a database.

    That's completely contrary to the practice of Python programmers, who tend to follow the Samurai Principle. Checking for invalid return values (such as null) is very common in Java, very uncommon in Python, because returning invalid values is generally not culturally acceptable for Python programmers.

This particular article was posted on Wednesday, November 21, 2007, and has received 8 comments.

It was preceded by Returning None is Evil and followed by Dynamic Functions.
