Some Thoughts on Error Handling

Errors are a major part of software development, and so is error handling. Sadly, not much has been written about best practices for error handling. Most languages favor one error handling mechanism and apply it to all cases, and developers tend either to ignore errors or to use whatever mechanism is at hand. This article goes a little deeper into the ways errors can be handled, and I share some thoughts on which error cases are best handled by which method. In development it is not clearly defined what an error is. For the scope of these thoughts I define it as something happening that the developer did not intend. Errors in this sense are not bugs.

Kinds of Errors

There are very different kinds of errors which developers encounter. Often error handling is only discussed in general but not in context.

  • User behavior
    • Wide field, handled by UX and out of scope here
  • Business Errors, for example
    • x not found
    • x not allowed
    • x not valid -> Validation
    • can’t book
    • can’t pay with this CC
  • Wrong API Usage
    • Dividing by zero
    • parameter x is not an email
  • Infrastructure problems
    • DB could not be synced
    • File could not be written

Possibilities to Handle Errors

There are many ways to handle errors. The most common are:

Null

One error signaling variant often used in Java is to return null. If a value can be found, it is returned; otherwise the method returns null. This is often used when a value could not be retrieved from a data structure, database or file. The problem with returning null is that the compiler does not force developers to write error handling code, which often leads to NullPointerExceptions (NPEs).
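As a minimal Java sketch of this problem: the class NullLookup, its USERS map and its method names are hypothetical and chosen only for illustration. The unsafe variant compiles without any warning, and the null check in the safe variant is entirely voluntary.

```java
import java.util.HashMap;
import java.util.Map;

class NullLookup {
    // Hypothetical in-memory "database" of user names keyed by id.
    private static final Map<Integer, String> USERS = new HashMap<>();
    static {
        USERS.put(1, "alice");
    }

    // Returns the name, or null when no user has this id.
    static String findName(int id) {
        return USERS.get(id);
    }

    // Compiles fine, but throws NullPointerException for unknown ids.
    static int unsafeNameLength(int id) {
        return findName(id).length();
    }

    // The null check is voluntary; nothing in the signature asks for it.
    static int safeNameLength(int id) {
        String name = findName(id);
        return name == null ? 0 : name.length();
    }
}
```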

Error codes

In C code one often sees a function that has no value of its own to return hand back an error code instead. One example is to return 0 to signal that everything went as expected and -1 to signal an error. The same problems as with returning null apply.

Magic value

If a method needs to return a value and return an error code, there is a variation on the former method. The method will return -1 if there was an error, and the business value otherwise. One example is indexOf in Java that will return the index of a character in a String and -1 if the String does not contain the character.
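A small Java sketch of how the magic value can leak: String.indexOf and String.substring are the real JDK methods, while the class MagicValue and its two helper methods are hypothetical names for this example.

```java
class MagicValue {
    // Forgets to check for -1: for input without '@' this calls
    // substring(0, -1), which throws StringIndexOutOfBoundsException.
    static String localPartUnchecked(String email) {
        return email.substring(0, email.indexOf('@'));
    }

    // Checks the magic value explicitly and reports the offending input.
    static String localPartChecked(String email) {
        int at = email.indexOf('@');
        if (at == -1) {
            throw new IllegalArgumentException("Not an email address: " + email);
        }
        return email.substring(0, at);
    }
}
```

Nothing in the signature of indexOf tells the caller that -1 needs special treatment, so the unchecked variant compiles just as happily as the checked one.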

Option Monad

In functional languages it’s common to express the ‘Not found/Not there/Optional’ case not with null but with a special type. In Haskell this type is Maybe and in Scala Option is used. A method returns Option[T] with the error case returning None and the success case returning Some[T] for example Some[Person]. The benefit is that error handling can be delayed and with map the error case can be ignored as long as possible. Also errors can be composed.

I’ve written more about Optional and nullable Types in the past.

The benefit is even clearer when we call a method that also returns Option.

Compare this with a cascade of null checks
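A minimal sketch of this contrast in Java, using java.util.Optional as a stand-in for Scala’s Option. The Person and Address classes and the "UNKNOWN" default are assumptions made up for this example.

```java
import java.util.Optional;

class OptionalChain {
    static final class Address {
        final String cityOrNull;
        Address(String cityOrNull) { this.cityOrNull = cityOrNull; }
        Optional<String> city() { return Optional.ofNullable(cityOrNull); }
    }

    static final class Person {
        final Address addressOrNull;
        Person(Address addressOrNull) { this.addressOrNull = addressOrNull; }
        Optional<Address> address() { return Optional.ofNullable(addressOrNull); }
    }

    // Optional style: each step may be empty; the default is chosen once, at the end.
    static String cityOf(Optional<Person> person) {
        return person.flatMap(Person::address)
                     .flatMap(Address::city)
                     .map(String::toUpperCase)
                     .orElse("UNKNOWN");
    }

    // null style: the same logic as a cascade of null checks.
    static String cityOfNullable(Person person) {
        if (person != null) {
            if (person.addressOrNull != null) {
                if (person.addressOrNull.cityOrNull != null) {
                    return person.addressOrNull.cityOrNull.toUpperCase();
                }
            }
        }
        return "UNKNOWN";
    }
}
```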

As the composition of errors into one result occurs often, Scala has a special construct called for-comprehension.

Error handling is fail fast in this case, which means the first error determines the result. Errors are not accumulated.

Validation with Either or ‘Or’

Either in Scala is a type whose value is one of two other types. It is used in different contexts, and error handling is a common one. In this case Either represents either an error or a success.

One can then decide to handle the error in place or delay the error handling code. One way is to use match and case on Failure and Success, another way is to use fold:
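The idea of fold can be sketched in Java, which has no built-in Either. SimpleEither below is a hypothetical, hand-rolled type written only for this illustration, not Scala’s Either API; AgeParser and its messages are assumptions as well.

```java
import java.util.function.Function;

// Hypothetical minimal Either: left carries an error, right carries a result.
abstract class SimpleEither<L, R> {
    // fold takes one function per case and returns whichever result applies.
    abstract <T> T fold(Function<? super L, ? extends T> onLeft,
                        Function<? super R, ? extends T> onRight);

    static <L, R> SimpleEither<L, R> left(final L error) {
        return new SimpleEither<L, R>() {
            @Override
            <T> T fold(Function<? super L, ? extends T> onLeft,
                       Function<? super R, ? extends T> onRight) {
                return onLeft.apply(error);
            }
        };
    }

    static <L, R> SimpleEither<L, R> right(final R value) {
        return new SimpleEither<L, R>() {
            @Override
            <T> T fold(Function<? super L, ? extends T> onLeft,
                       Function<? super R, ? extends T> onRight) {
                return onRight.apply(value);
            }
        };
    }
}

class AgeParser {
    // Returns an error message on the left or the parsed age on the right.
    static SimpleEither<String, Integer> parseAge(String input) {
        int age;
        try {
            age = Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            return SimpleEither.left("Not a number: " + input);
        }
        if (age < 0) {
            return SimpleEither.left("Age must not be negative: " + age);
        }
        return SimpleEither.right(age);
    }

    // The error is handled only here, where the caller decides what to show.
    static String describe(String input) {
        return parseAge(input).fold(
                error -> "Error: " + error,
                age -> "Age is " + age);
    }
}
```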

It is interesting how one Haskell user writes about Either[String,T]:

I can’t really recommend using this method in any circumstance. If you don’t need to distinguish errors, you should have used Maybe.

Scalactic has a variation on Either that is specific to error and validation handling. It is used with Scala infix notation and written as String Or ErrorMessage, which is the same as Or[String,ErrorMessage]. The difference from Either is that the API names are more clearly targeted at error handling. Scalactic also extends the validation concept with accumulation.

Validation of more than one input parameter can be done with Scala for-comprehensions. This returns either a person or an ErrorMessage. The mode is fail fast, so the ErrorMessage is from the first validation that fails.

Often one wants to accumulate errors. For this case Scalactic provides One and Many, the two subtypes of Every. Each validation method returns a One, and those get accumulated into an Every.
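The difference between fail fast and accumulation can be sketched in plain Java. This is not Scalactic’s API; the class PersonValidation, its methods and its messages are hypothetical, and a List of messages stands in for Every.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class PersonValidation {
    // Each validator returns an error message, or empty when the input is fine.
    static Optional<String> validateName(String name) {
        if (name == null || name.trim().isEmpty()) {
            return Optional.of("name must not be empty");
        }
        return Optional.empty();
    }

    static Optional<String> validateAge(int age) {
        if (age < 0) {
            return Optional.of("age must not be negative: " + age);
        }
        return Optional.empty();
    }

    // Fail fast: the first failing validation determines the result.
    static Optional<String> firstError(String name, int age) {
        Optional<String> nameError = validateName(name);
        return nameError.isPresent() ? nameError : validateAge(age);
    }

    // Accumulating: every validation runs and all messages are collected.
    static List<String> allErrors(String name, int age) {
        List<String> errors = new ArrayList<>();
        validateName(name).ifPresent(errors::add);
        validateAge(age).ifPresent(errors::add);
        return errors;
    }
}
```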

Checked exceptions

With checked exceptions a method declares with throws what error states it has and what developers should handle. The thrown exceptions are then handled on the call site.

Often exceptions are not caught, but the code is hardened against them, for example to clean up resources:

In this case the finally block is executed even if an exception was thrown. The exception bubbles up, but locally all resources are cleaned up.
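A minimal Java sketch of this pattern: the class FirstLine and its method are hypothetical names, while BufferedReader and IOException are the real JDK types. The method declares the checked IOException instead of catching it and uses finally only for cleanup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

class FirstLine {
    // The throws clause makes IOException part of the method's API:
    // callers must either catch it or declare it themselves.
    static String readFirstLine(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        try {
            return reader.readLine();
        } finally {
            // Runs on success and on failure; the exception still bubbles up.
            reader.close();
        }
    }
}
```

Since Java 7 the same cleanup can also be written with try-with-resources; the explicit finally is kept here because it is the construct the text describes.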

Unchecked exceptions

Unchecked exceptions are like checked exceptions but are not declared as part of the method API.

Unchecked exceptions are also often used with finally.
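A sketch of the same hardening with an unchecked exception: the Account class is hypothetical, while ReentrantLock and IllegalStateException are real JDK types. The method signature has no throws clause, so callers get no hint from the compiler, but the finally block still releases the lock.

```java
import java.util.concurrent.locks.ReentrantLock;

class Account {
    final ReentrantLock lock = new ReentrantLock();
    private int balance;

    Account(int balance) {
        this.balance = balance;
    }

    // No throws clause: IllegalStateException is unchecked.
    void withdraw(int amount) {
        lock.lock();
        try {
            if (amount > balance) {
                throw new IllegalStateException(
                        "Insufficient funds: balance=" + balance + ", requested=" + amount);
            }
            balance -= amount;
        } finally {
            // The lock is released whether or not the exception was thrown.
            lock.unlock();
        }
    }

    int balance() {
        return balance;
    }
}
```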

Exception handling with Monads

As shown above, error handling with Either or Option has several benefits. Error handling can be deferred and errors can be accumulated or composed. Scala introduced a new type Try that wraps exceptions. Here both parseInt methods can throw exceptions which are wrapped into a Try monad.

With this mechanism we can defer error handling to a different place in our code.

Try also supports recover to handle exceptions.

With recover one can also act on the type of exception:

With Try, exceptions can be transformed into monads with all the benefits of accumulation, aggregation and composition.

Evaluating and Comparing Error Handling

Magic values, null and error codes

Magic values, error codes and null are problematic. They have many downsides, and their only benefit is slightly less code. The major downside in statically typed languages is that these mechanisms are not type checked, so correct usage cannot be enforced at compile time. Developers also cannot see these error states by looking at a method signature.

With more recent error handling mechanisms like exceptions, Option and Either/Validation available, it is best to avoid null, magic values and error codes when designing error signaling.

Throws with Try vs. Either

If exceptions can be transformed to monads with Try, let’s take a look at three methods:

Which one looks cleaner? The ‘Or’ version looks nice, although it cheats a little because it does not really declare which errors it can produce. It is more comparable to throws Exception, which no one would use.

Checked and unchecked exceptions

Checked exceptions were planned as an improvement in Java over error codes in C, with the goals that developers could not ignore error handling and that APIs make clear which error states they expose. Being new and shiny, the concept was applied to everything, so exceptions proliferated across Java code bases.

If you only give me a hammer, everything looks like a nail. Sometimes this can be a good thing, as Eric Kidd wrote some years ago about the myriad ways to signal errors in Haskell:

But I’d be just as happy if we could standardize on two or three of the above whenever possible!

But with Java it is the opposite. Not only do Java developers have limited means to express error states, but Sun also made no great effort to explain when to use checked or unchecked exceptions, nor did it give good examples of exception usage. On the contrary, Sun often used exceptions in the wrong way.

Bad Example Used in Sun Best Practices

Even when well meant, there were problems. As a side note, I found a bad example of writing exceptions in a best practice guide from 2002:

The advice to rethrow programming or environment bugs (the config file is not there) as runtime exceptions is good practice, but the “missing file” message is a bad example. It is much better to include the filename in the error, as in “Missing file on startup: ” + configFileName; otherwise the error is hard to diagnose. One could dismiss this as ‘example code’, but for decades Java developers were tortured with NumberFormatExceptions that did not show their input and ClassCastExceptions that did not show what should have been cast to what.
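A sketch of what a more diagnosable version could look like in Java. ConfigLoader and loadConfig are hypothetical names; the message wording is adapted slightly because an IOException can also mean the file exists but is unreadable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ConfigLoader {
    // Rethrows the checked IOException as an unchecked exception,
    // keeping both the file name and the original cause for diagnosis.
    static String loadConfig(String configFileName) {
        Path path = Paths.get(configFileName);
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException(
                    "Missing or unreadable file on startup: " + configFileName, e);
        }
    }
}
```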

But while giving partially good advice, Sun set bad examples. The worst checked exception usage is probably in the Appendable interface.

Appendable is implemented both by FileWriter and by StringBuffer. Although IOException may make sense for FileWriter, it does not make any sense for StringBuffer. The problem here is not the usage of a checked exception, but the wrong abstraction. The attempt to abstract over IO operations (write to a file) and memory operations (append to a String) was misguided and failed. This often comes from the tendency of developers to over-abstract. In this case it would have been better to have two interfaces, as appending to a file is not the same as appending to a String. Another problem is IOException itself: it is not specific enough. I can see where this comes from, as specific abstractions are hard to version (see below) and hard to abstract over. In this case it was misguided, as one can neither react properly to the problem (was the IOException due to a temporary or a permanent problem?) nor easily see which errors are thrown. So the IOException forces only slightly better error handling than an unchecked exception would in this case.
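The effect on calling code can be sketched in a few lines of Java. Appendable.append is declared with throws IOException in the JDK; the class AppendableDemo and its methods are hypothetical. Code written against the interface has to deal with an IOException even when the target is a StringBuffer that performs no I/O.

```java
import java.io.IOException;

class AppendableDemo {
    // Written against the interface, so IOException must be declared or caught.
    static void writeGreeting(Appendable out, String name) throws IOException {
        out.append("Hello, ").append(name);
    }

    static String greeting(String name) {
        StringBuffer buffer = new StringBuffer();
        try {
            writeGreeting(buffer, name);
        } catch (IOException e) {
            // Forced by the interface; a StringBuffer performs no I/O.
            throw new AssertionError("StringBuffer does not perform I/O", e);
        }
        return buffer.toString();
    }
}
```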

The problem with unchecked exceptions is simple: They are not handled. If you look in code bases – try it – of languages that only support unchecked exceptions, you will barely find error handling code.

The pros and cons of checked exceptions are touched on in Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems:

Overall, we found that the developers are good at anticipating possible errors. In all but one case, the errors were checked by the developers. The only case where developers did not check the error was an unchecked error system call return in Redis. This is different from the characteristics observed in previous studies on file system bugs [24, 41, 55], where many errors weren’t even checked. This difference is likely because (i) the Java compiler forces developers to catch all the checked exceptions; and (ii) a variety of errors are expected to occur in large distributed systems, and the developers program more defensively. However, we found they were often simply sloppy in handling these errors. [Emphasis by me]

The most influential view on checked exceptions is that of Anders Hejlsberg, lead architect of C#. Although he calls checked exceptions “a wonderful feature”, he has concerns about versioning and scalability. The versioning problem is that adding a checked exception to a method’s throws clause breaks existing callers:

Let’s say I create a method foo that declares it throws exceptions A, B, and C. In version two of foo, I want to add a bunch of features, and now foo might throw exception D. It is a breaking change for me to add D to the throws clause of that method, because existing caller of that method will almost certainly not handle that exception.

Isn’t that also a problem with unchecked exceptions, as a new method throws new unchecked exceptions?

No, because in a lot of cases, people don’t care. They’re not going to handle any of these exceptions. There’s a bottom level exception handler around their message loop. That handler is just going to bring up a dialog that says what went wrong and continue. The programmers protect their code by writing try finally’s everywhere, so they’ll back out correctly if an exception occurs, but they’re not actually interested in handling the exceptions.

So for an application with a main message loop, like a GUI app or a request-based web app, in his view it is better to have unchecked exceptions that bubble up.
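The bottom level handler he describes can be sketched in Java as follows. MessageLoop and its run method are hypothetical; collecting failure texts in a list stands in for the dialog or log entry a real application would produce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class MessageLoop {
    // Processes every message; a failure in one message is reported
    // and the loop continues with the next one.
    static List<String> run(List<String> messages, Consumer<String> handler) {
        List<String> failures = new ArrayList<>();
        for (String message : messages) {
            try {
                handler.accept(message);
            } catch (RuntimeException e) {
                // Single bottom level handler for everything that bubbled up.
                failures.add(message + ": " + e.getMessage());
            }
        }
        return failures;
    }
}
```

The handlers themselves contain no error handling code at all; they only need try/finally where they hold resources.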

The second problem that Anders sees with checked exceptions is scalability.

In the small, checked exceptions are very enticing. With a little example, you can show that you’ve actually checked that you caught the FileNotFoundException, and isn’t that great? Well, that’s fine when you’re just calling one API. The trouble begins when you start building big systems where you’re talking to four or five different subsystems. Each subsystem throws four to ten exceptions. Now, each time you walk up the ladder of aggregation, you have this exponential hierarchy below you of exceptions you have to deal with. You end up having to declare 40 exceptions that you might throw. And once you aggregate that with another subsystem you’ve got 80 exceptions in your throws clause. It just balloons out of control.

This concern is also reflected by Eric Kidd:

This approach will also fail if we start mixing libraries, because each library will define its own set of errors, and we’ll need to write code which converts them all to our preferred error type.

I can see where Anders is coming from, but only if you let checked exceptions bubble up, which from my point of view is an anti-pattern (though not for unchecked exceptions). Exceptions are part of the domain language. If you use checked exceptions, each layer, subsystem or domain needs its own exceptions. For example

So each level aggregates and abstracts exceptions.
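Such layering can be sketched in Java roughly as follows. All class names (StorageException, BookingException, BookingStore, BookingService) are hypothetical, and writeRow is a stand-in that always fails so the translation is visible.

```java
import java.io.IOException;

// Infrastructure layer exception (hypothetical).
class StorageException extends Exception {
    StorageException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Domain layer exception (hypothetical).
class BookingException extends Exception {
    BookingException(String message, Throwable cause) {
        super(message, cause);
    }
}

class BookingStore {
    // Translates the low level IOException into the storage layer's own type.
    void save(String bookingId) throws StorageException {
        try {
            writeRow(bookingId);
        } catch (IOException e) {
            throw new StorageException("Could not store booking " + bookingId, e);
        }
    }

    // Stand-in for real I/O; always fails in this sketch.
    private void writeRow(String bookingId) throws IOException {
        throw new IOException("disk full");
    }
}

class BookingService {
    private final BookingStore store = new BookingStore();

    // Callers see only the domain exception, never IOException or StorageException.
    void book(String bookingId) throws BookingException {
        try {
            store.save(bookingId);
        } catch (StorageException e) {
            throw new BookingException("Can't book " + bookingId, e);
        }
    }
}
```

Each throws clause stays at one exception per layer, and the original cause remains reachable through getCause for diagnosis.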

What to use when

This part is very subjective and I’m sure you differ. Nevertheless I think it’s a good idea to suggest a mapping of error handling methods to error cases.

Unchecked exceptions are therefore best for
* things that should not happen because you should have checked (division by zero)
* things you can’t or should not handle
* things that can bubble up
* things that can blow up, where I then just use a default

Unchecked exceptions are also good for everything orthogonal to your application like authorization and authentication.

Checked exceptions should not be used on user level or domain code with a main message loop as in web applications.

For infrastructure code where exceptions need to be handled and can’t bubble up, checked exceptions should be used. Because of the problems with checked exceptions (versioning, and too many of them overwhelming developers) they should be used only sparingly. I’d say about 10% as often as the Java API uses them.

For validations and business errors it’s best to use Either, Scalaz validations or Scalactic’s Or.

For data that is not there or could not be retrieved, it’s best to use Option.

If you have ideas how to make a language safer, what bugs you’ve seen often or general feedback, reply to @codemonkeyism