Comparing Tagged Types and Design by Contract

I’m working on a new mainstream programming language that prevents more bugs than current mainstream languages. One idea it will incorporate are Tagged Types. I wrote about them recently in Real Life Bug Prevention in Scala. Tagged Types can be used to solve different problems. Often they are used as marker types. In the context of this blog posts they work as contracts on other types therefor I call them contract types (CT) from here on. An existing Scala library which implements those with lots of functionality is Refined – good talk about Refined.

Design by Contract for Preventing Bugs

Design by Contract (DbC) is a an idea to prevent bugs by specifying constraints on data and constraints on how data relates to other data. The idea of DbC has been floating around for decades, but hasn’t made it into mainstream. The main reason I haven’t used DbC is because validation happens at runtime instead of compile time.

The idea is to express pre-conditions, post-conditions and invariants for method calls and classes. Pre-conditions are checked before the execution of a method, the post-conditions are checked at the end of the method, invariants are checked before and after execution of the method.

The canonical DbC example is a Stack (Source)

Contract Type Example

Lets start with a simpler example. We have a method that takes Integers and returns Integers.

The method has the additional constraint that the argument needs to be a positive integer. In some languages there are types for positive integers (or more likely non-negative including zero) but in Scala we use Int.

With a hypothetical Design by Contract framework (like iContract in Java) we could express the constraint with

The DbC framework creates checks during compilation for our constraint which are executed at runtime. If the constraint is violated, an exception is thrown. This is equivalent to using require in Scala, but makes the contract explicit and part of the documented API. With require in Scala this would look like

Can this be expressed with contract types? Yes, we supply an Int @@ Positive to the method instead of Int. This way the method can be sure that the argument is a positive Integer.

How does a developer get the required type to call this method? One way is to have a factory method to create the required type from Int with a contract type Positive tagged on. Developers supply an Int to positive(). If the argument is positive, the method returns Some[Int @@ Positive] otherwise it returns None. The result can then be supplied to our method.

This way the caller of a method needs to prove that data satisfies a certain constraint, in this case Positive. If the caller can’t supply data that satisfies a constraint, he can’t call the method. The responsibility to provide correct data lies with the caller, not the implementor of the method. This is a huge shift in development responsibility from

to

although the change in code looks rather insignificant.

Needing a method for conversion looks cumbersome, but this conversion only happens at the interfacing between two modules or systems and so is rare.

Generation of contract typed values ties neatly into validation, for example with Scalactic

Positive Performance Side Effect

If the state of constraints is unclear, constraints are checked more than once. For example a system might take a String as input and then checks in many places if the String is empty. With

the constraint is checked once and the contract type removes all constraint checks in other parts of the system. Sometimes even duplicated work is removed. Often different parts of a system need data in some form, for example lowercase Strings.

With constract types the String is converted once and not several times in different parts of the system. This removes work and increases performance.

Examining Contract Types

We should look at more examples of contract types

Sometimes the difference between tagged types that add semantic meaning to data and contract types is blurry. I would consider

not a contract type though. It’s more difficult with

and even more difficult with

While the first looks to be a contract type, the second looks more like a tagged type.

Benefits and Drawbacks

Contract Types

  • Correctness is checked at compile time, not runtime
  • Very few additional code needed
  • The burden is on the caller to prove an argument is correct

Design by Contract

  • Can fail at runtime
  • Design by contract is more powerful

Unit Testing

  • Very powerful
  • Additional (and more) code needed
  • Code seperate from main code
  • Run at compile time
  • Different testing mind set

The main benefit of contract types compared to DbC are that they are ensured at the call site instead of in the method. Also they are visible to the caller during development. More powerful systems for specifications exist, like TLA+. These also have the benefit to prove correctness at comile time but have the drawback of a much higher complexity. This is fine if you write a distributed database but overkill for web frontends to databases. I consider contract types something between regular types, DbC and dependent types like in Agda or Idris where it is possible for types to depend on data.

Conclusion

Contract types solve a special problem. Often types in mainstream languages do not express enough about the data they type, for example that a String is lowercase or a User is logged in. Interfaces in mainstream languages therefor have implicit requirements about the data they receive which easily leads to bugs. Contract types like Int @@ Positive express the requirement as a type. Although this can be solved by inheritance, interfaces or type classes, all of these concepts are harder to understand and read than contract types. There is more to be said in another blog post.

Real Life Bug Prevention in Scala

Using Scala for the last 5 years to run eventsofa I’ve became a better programmer creating less bugs. Most of the code is Java++ Scala compared to more canonical functional Haskell-like Scala. So I’m interested how Scala used in this way prevents bugs that I would make in Java, Python or Ruby.

There have been two things that made a huge impact, Options and tagged types.

Lets explore both concepts with a small example:

Option vs. Null

How does one signal that there is no invoice or there is no user? In Java this is often done with Null or throwing some kind of InvoiceNotFoundException. With Nulls for error handling (Some Thoughts on Error Handling) code looks like this

Code with exceptions would look similar. This code is noisy and because developers are not forced to check for Null, sometimes developers do not test for error conditions signaled with Null. The result are hard to find NullPointerExceptions (I also blame you Java, for not printing more information in NPEs, take a look at Elm!)

In Scala one would use Option type to signal that there is no user, with returning Some(User) if a user was found and None if there wasn’t.

Code using Option does not need to check for the result type of the method call and split the code path depending on success and error as with Null. One code path is used for both the error and the success case. Error handling can be deferred to a later point and errors can be accumulated.

or

For cleaner syntax Scala provide for sugaring which expands to the code above, but is easier to read

In Scala it’s easy to compose Options compared to composing Nulls – which don’t compose. Also the code is easier to read with less noise, especially the for version is easy to understand even with more complex dependencies. Code that is easy to read and easy to understand results in less misunderstandings and less bugs. This way error handling is improved as developers do write more and better error handling code. From my experience with Scala the usage of Option instead of Null or Exceptions leads to a lot less lazyness bugs.

Tagged Types

The usage of String for accountNumber and userId is problematic.

For developers it’s hard to understand how these Strings look like or how to get or create a correct one. There might be different userIds in different String formats and it is easy to plug the wrong userId formatted String into a method. The same object might have different names in different method signatures like id, userId, or user. A little bit more nuanced the problems shines in

If you find this code, your first thought is “What is percentage?”. To represent 11.5% the Float value could be 11.5 or 0.115. This has biten me in the past when I’ve assumed 10% percent for a coupon was represented by 0.10. As a result millions of users got mails with coupons with a value 0.10% because the developer of that method represented 10% with 10.0.

Where this also rears its head is with money.

First Once and for all: Do not use double for money, second is this a Netto or Brutto amount? A bug I’ve seen several times is when variables representing netto amounts are plugged into methods expecting brutto amounts.

The classic approach would be to create abstract data types (ADTs), classes, case classes or data types classes (Value Objects).

The downside with this is (Never, never, never use String in Java (or at least less often :-)) and the reason developers don’t use this often enough: developers need to write more code plus Percentage is harder to understand, to reuse and to calculated with than Float. I can plug in Float into more methods than Percentage.

In Scala I’m using Tagged Types instead of case classes her

These have several benefits. It’s easy to see that Percentage is a Float, creation can be documented (10% represented by 0.10) and controlled plus the API is easier to read. All of this leads to less misunderstandings and from my experience therefor to less bugs.

Other examples are

Tagged types have made my code much better than before.

How about other Scala features to prevent bugs? As I am writing server side, request isolated code, immutable data structures didn’t help to prevent concurrency bugs in the last years. My only concurrency is fire and forget usage of actors for logging, database side effects, sending mails etc. Limited concurrency is used to get data from different data sources, neither of these use cases share data though.

Sealed traits have helped with checking if switch statements are exhaustive, especially in larger code bases this helps when adding new features. Case classes prevented hashcode bugs as they generate correct hashcode methods by themselves.

All in all my Scala code has a reduced level of bugs due to some Scala features compared to my code in Java, Ruby or Python. Any new language I am using would need to support both Option and tagged types.