Where to put input validation in Typescript?

You can leave Feedback and follow me on Twitter.

In a web application we often wonder where to put validation logic. Frequently it is uncoordinated and scattered across several places: Javascript validation in the browser, input validation in the web or API controllers, input validation in the service layer, and input validation at the database layer. This blog post proposes using tag types to reduce input validation and to clearly define where to put and handle it. Some of the benefits could be achieved with custom types, but those increase the lines of code, and many developers don't like having hundreds of small classes. As an example language we use Typescript.

Our example application is an online shop. We have a controller method with an additional parameter that specifies a rebate as a percentage. The percentage

  • should be an integer, we do not support 20.3%
  • should be between 0 and 100, instead of 0 and 1
  • should be below 80 in this case for a coupon, as we do not want to give 100% rebates

I've chosen this example because I once created a bug where, due to bad documentation, I assumed a percentage Double value was between 0 and 1 (as a mathematician would) while the developer who wrote the code took it to be between 0 and 100 (as a marketing person and perhaps some product managers would).

Let's look at some code

We use a string parameter with the percentage of the coupon. The first step is to validate that it's a number. Then we call buy and decreaseInventory. The buy method can fail with either validation errors (not a percentage) or business errors such as out of stock. We need to handle both error types here. If there are none, we can call decreaseInventory.
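The flow described above can be sketched as follows. This is a minimal illustration, not the original listing: the error classes, the checkPercentage helper, the toy stock counter and the status strings are all assumptions made for this example.

```typescript
// Hypothetical error types for this sketch.
class ValidationError extends Error {}
class OutOfStockError extends Error {}

// Service-layer validation of the three constraints on the percentage.
function checkPercentage(percentage: number): void {
  if (!Number.isInteger(percentage)) throw new ValidationError("not an integer");
  if (percentage < 0 || percentage > 100) throw new ValidationError("not in 0..100");
  if (percentage >= 80) throw new ValidationError("coupon must be below 80");
}

let stock = 1; // toy inventory with a single item

function buy(percentage: number): number {
  checkPercentage(percentage);
  if (stock <= 0) throw new OutOfStockError("out of stock");
  return (100 * (100 - percentage)) / 100; // price of a 100-unit item after the rebate
}

function decreaseInventory(percentage: number): void {
  checkPercentage(percentage); // the same validation, duplicated
  stock -= 1;
}

// Controller: the raw parameter arrives as a string.
function buyController(rawPercentage: string): string {
  const percentage = Number(rawPercentage);
  if (rawPercentage.trim() === "" || Number.isNaN(percentage)) {
    return "400 not a number";
  }
  try {
    const price = buy(percentage);
    decreaseInventory(percentage);
    return `200 paid ${price}`;
  } catch (e) {
    // Validation errors and business errors are handled in the same place.
    if (e instanceof ValidationError) return `400 ${e.message}`;
    if (e instanceof OutOfStockError) return `409 ${e.message}`;
    throw e;
  }
}
```

Note how the controller has to know about both kinds of failure, and how both service functions repeat the same checks.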

The buy method needs input validation to make sure the given parameter has the correct syntax, the correct semantics and valid values. The same goes for the decreaseInventory method.

Exception handling in Javascript isn't very good. It adds boilerplate and makes the execution flow confusing. In version 2 of our example we replace exceptions with an error result. We also push input validation to the edges, into the web framework, by making the parameter a number instead of a string.
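A sketch of what such a second version might look like is shown here. The Result type, the validatePercentage helper and the itemsInStock parameter are assumptions for this illustration and are not taken from the original code.

```typescript
// Hypothetical result type: a value on success, a message on failure.
type Result<T> = { kind: "ok"; value: T } | { kind: "error"; message: string };

// The three constraints, still checked inside the service layer.
function validatePercentage(percentage: number): string | null {
  if (!Number.isInteger(percentage)) return "not an integer";
  if (percentage < 0 || percentage > 100) return "not in 0..100";
  if (percentage >= 80) return "coupon must be below 80";
  return null;
}

function buy(itemsInStock: number, percentage: number): Result<number> {
  const problem = validatePercentage(percentage);
  if (problem !== null) return { kind: "error", message: problem };
  if (itemsInStock <= 0) return { kind: "error", message: "out of stock" };
  return { kind: "ok", value: 100 - percentage }; // price of a 100-unit item
}

function decreaseInventory(itemsInStock: number, percentage: number): Result<number> {
  const problem = validatePercentage(percentage); // duplicated validation
  if (problem !== null) return { kind: "error", message: problem };
  return { kind: "ok", value: itemsInStock - 1 };
}

// Controller: the web framework is assumed to have parsed the number already.
function buyController(itemsInStock: number, percentage: number): string {
  const bought = buy(itemsInStock, percentage);
  if (bought.kind === "error") return `failed: ${bought.message}`;
  const inventory = decreaseInventory(itemsInStock, percentage);
  if (inventory.kind === "error") return `failed: ${inventory.message}`;
  return `paid ${bought.value}, ${inventory.value} left`;
}
```

The try/catch is gone, but validation errors and business errors still travel through the same result, and every service function still validates its input.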

This has less exception boilerplate than the first example, but we still have validation logic inside our functions.

To correctly call the buy function we need to read the documentation about the format of percentage. After reading about percentage, as a good coder we do not push invalid data to a function, so we validate it first, adding more validation logic.

This leads to duplicated validation logic with all the problems of code duplication. More code leads to more bugs. Changes need to be made in several places, and forgetting one place leads to bugs. More code also makes the code harder to read and understand.

The code duplication arises because the validation we have done is forgotten once the value is just a number. We validate its constraints (0-100, smaller than 80, integer) but then forget this validation, so it needs to be done again in the service layer in buy and decreaseInventory. If we have a database layer, we need to validate there a third time.

With tag types we can 'store' these constraints. A tag type refines a type with syntactic or semantic constraints.

Here Positive is a constraint on the parameter r, which is of type number. In Typescript, tag types are based on intersection types.
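As a minimal sketch of the mechanics: a tag can be modelled as an interface with a marker property that never exists at runtime, and the tagged type is the intersection of the base type and the tag. The squareRoot function, the isPositive guard and the marker property name are hypothetical and only serve this illustration.

```typescript
// The marker property exists only in the type system, never on the value.
interface Positive { readonly __positive: true }

// Type guard: the one place where the constraint is actually checked.
function isPositive(n: number): n is number & Positive {
  return n > 0;
}

// The signature documents and enforces the constraint on r.
function squareRoot(r: number & Positive): number {
  return Math.sqrt(r);
}

function safeSquareRoot(input: number): number | null {
  // Inside the true branch the compiler treats input as number & Positive.
  return isPositive(input) ? squareRoot(input) : null;
}
```

Calling squareRoot with a plain number is rejected by the compiler, while a value that passed the guard is accepted without a cast.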

I've written a tag type library for Typescript called taghiro which makes creating tag types easy and already brings dozens of predefined and useful constraints. In Scala one can use refined. Our example rewritten with tag types from taghiro looks like this:
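The following sketch shows the shape such a rewrite could take. To stay self-contained it defines local stand-ins for the tags and guards instead of importing them, so the names Percentage, LessThan80, isPercentage and isLessThan80 are assumptions here and should not be read as taghiro's actual API.

```typescript
// Local stand-in tags (hypothetical, not imported from a library).
interface Percentage { readonly __percentage: true } // integer in 0..100
interface LessThan80 { readonly __lessThan80: true }
type CouponPercentage = number & Percentage & LessThan80;

function isPercentage(n: number): n is number & Percentage {
  return Number.isInteger(n) && n >= 0 && n <= 100;
}

function isLessThan80<T extends number>(n: T): n is T & LessThan80 {
  return n < 80;
}

type BuyResult = { kind: "ok"; price: number } | { kind: "outOfStock" };

// Service layer: no validation code, only business logic and business errors.
function buy(itemsInStock: number, percentage: CouponPercentage): BuyResult {
  if (itemsInStock <= 0) return { kind: "outOfStock" };
  return { kind: "ok", price: 100 - percentage }; // price of a 100-unit item
}

// Controller: validation happens once, before the call.
function buyController(itemsInStock: number, percentage: number): string {
  if (!isPercentage(percentage)) return "400 invalid percentage";
  if (!isLessThan80(percentage)) return "400 invalid percentage";
  // Here percentage has been narrowed to number & Percentage & LessThan80.
  const result = buy(itemsInStock, percentage);
  return result.kind === "ok" ? `200 paid ${result.price}` : "409 out of stock";
}
```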

This code has many benefits. First we need to put validation logic only in one place. The service layer looks much cleaner without the validation logic. One can also directly see the parameter constraints from the method signature. percentage needs to be a number, follow the Percentage constraints and be less than 80.

Second, validation error handling and business error handling are separated into two places: validation handling before we call a method, business error handling after we have called it.

A developer is forced to do validation first and can't lazily push faulty data into the service layer, which prevents potential bugs. The compiler makes sure at compile time that the parameter fulfils the right constraints and that validation logic is in place. The caller needs to ensure those constraints, and the service layer is not required to validate the input from the caller.

A small side note: if you are familiar with Java, in the early days Java developers had to deal with three main errors: Null Pointer Exceptions (NPE), Class Cast Exceptions (CCE) and Illegal Argument Exceptions (IAE). NPEs have been mostly solved by Optional. CCEs have been solved by Generics. Tag types solve the third major error class, IAEs.

Using tag types reduces boilerplate, reduces code duplication, pushes validation to the edges and makes code more readable and easier to understand. taghiro is open source and MIT licensed.

Tag Types in Typescript with taghiro


Typescript has brought types in Javascript to the mainstream. With TS it's easier to write correct code and code that's easier to understand.

Tag types go one step further. With them you can tag other types and express refined types with richer constraints. I've written about them before in Scala here and here. For example, NotZero is a tag type that prevents b from being 0 in this example.
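A small sketch of the NotZero idea, assuming a hypothetical divide function with a divisor named b; the tag and guard are defined locally for illustration rather than taken from a library.

```typescript
interface NotZero { readonly __notZero: true }

function isNotZero(n: number): n is number & NotZero {
  return n !== 0;
}

// b can never be 0 here, so the function needs no check of its own.
function divide(a: number, b: number & NotZero): number {
  return a / b;
}

function safeDivide(a: number, b: number): number | null {
  return isNotZero(b) ? divide(a, b) : null;
}
```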

I've written a tag type library in Typescript with a number of ready to use tag types to make your code richer and cleaner. It's called taghiro and can be found on GitHub. Its mission is to prevent bugs and make code more readable to developers.

Ready to use tag types

Some examples of the many ready to use tag types included in taghiro are

  • MinSize
  • NonEmpty
  • Sorted
  • Positive
  • UpperCase

Tag types can easily be used with Typescript Intersection types and type guards. Suppose we have a function that lists all product categories and takes a category as a parameter. In our system all categories have to be uppercase, e.g. ELECTRONICS.

To make sure the parameter is uppercase we declare it as string & UpperCase.

Developers can easily see the constraint on category: it must be uppercase. Compare this to a method signature where the constraint is only in the documentation. There, failure to adhere to the constraint is not detected at compile time but at runtime, and depending on the error handling in listProductCategory the system might crash.

A method with tag types can be used with type guards. Type guards in Typescript ensure a variable has a certain type. After the check Typescript assumes the variable has the corresponding type without casting.

isUpperCase is defined by taghiro
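The pieces above could fit together roughly like this. The sketch defines its own isUpperCase and UpperCase so that it runs on its own; the product data, the return type of listProductCategory and the handleRequest caller are likewise assumptions for this illustration.

```typescript
// Local stand-ins so the sketch is self-contained.
interface UpperCase { readonly __upperCase: true }

function isUpperCase(s: string): s is string & UpperCase {
  return s === s.toUpperCase();
}

// Hypothetical product data.
const products: Array<{ title: string; category: string }> = [
  { title: "Laptop", category: "ELECTRONICS" },
  { title: "Novel", category: "BOOKS" },
  { title: "Phone", category: "ELECTRONICS" },
];

// The constraint is visible in the signature; no checks are needed inside.
function listProductCategory(category: string & UpperCase): string[] {
  return products.filter((p) => p.category === category).map((p) => p.title);
}

// The caller decides what to do when the constraint does not hold.
function handleRequest(category: string): string[] | "invalid category" {
  if (isUpperCase(category)) {
    // No cast needed: category is string & UpperCase in this branch.
    return listProductCategory(category);
  }
  return "invalid category";
}
```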

Handling the case that category is not uppercase is now the responsibility of the caller, who is much better equipped to handle it because there is more context. In most systems the transformation from string to string & UpperCase happens at the edges, while inside the system every method uses string & UpperCase and needs no more error handling code.

Now on to a second complex example. Suppose we want to write an API to send emails.

Here several things could go wrong. First, the to array could be empty. Second, html could be empty, contain no HTML at all, or contain unsafe HTML. With tag types we can make sure the parameters are safe.

Now the caller needs to ensure that the parameters satisfy the tag types.
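A sketch of how a caller might satisfy the tag types, assuming a hypothetical sendEmail that only requires both parameters to be non-empty. NonEmpty and isNonEmpty are defined locally here and are not meant to mirror any library's exact signatures.

```typescript
interface NonEmpty { readonly __nonEmpty: true }

// Works for anything with a length, so it covers both arrays and strings.
function isNonEmpty<T extends { length: number }>(value: T): value is T & NonEmpty {
  return value.length > 0;
}

// Stand-in for the real sending logic; it only reports what it would do.
function sendEmail(to: Array<string> & NonEmpty, html: string & NonEmpty): string {
  return `sent ${html.length} characters to ${to.join(", ")}`;
}

// Caller at the edge of the system: checks first, then calls.
function trySendEmail(to: string[], html: string): string {
  if (!isNonEmpty(to)) return "no receivers";
  if (!isNonEmpty(html)) return "empty body";
  return sendEmail(to, html);
}
```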

Another way would be to use custom types for the parameters. This has the drawback in Typescript that it doesn't prevent using the wrong type.
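To illustrate the drawback, the sketch below reads "custom types" as plain type aliases, which is an assumption about what the paragraph means. The alias names and the sendEmail body are hypothetical.

```typescript
// Assumption: "custom types" is read here as plain type aliases.
type ReceiverList = Array<string>;
type Html = string;

function sendEmail(to: ReceiverList, html: Html): string {
  return `to=${to.length} body=${html}`;
}

// Aliases are only new names for the same type, so an empty array and an
// unchecked string are accepted without any compiler complaint.
const nobody: string[] = [];
const unchecked: string = "just text, never validated";
const outcome = sendEmail(nobody, unchecked);
```

Because Typescript compares types structurally, the aliases document intent but do not stop unvalidated values from reaching the function.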

Or we could use ReceiverList and Html classes, which is the usual way to use OO and then use the same type guards to check for NonEmpty.

The downside here is you need more classes and in a large project this leads to hundreds of smaller helper data classes. These need to be maintained and kept in your mind when you develop new parts of the system or change existing parts. One other drawback is that you can't put these values into methods that take string and Array<string> while you can do this with the tag types.

But the major downside is the OO argument of encapsulation. ReceiverList encapsulates Array<string>. This indirection is meant to make systems easier to understand, while in reality it adds another layer you need to be aware of. Array<string & Email> & NonEmpty can be understood by everyone new to the project without looking into more classes. Compare this to a ReceiverList constructor that takes a parameter named theEmails, where we still don't know, without looking at the documentation, the constraints on theEmails (non-empty and being emails).

Custom tags

With taghiro you can write your own Tags. By leveraging libraries for checking emails and HTML we can easily implement Email and SafeHtml to make the method even safer.
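A custom Email tag could be sketched as follows. The regular expression is a deliberately simplistic stand-in for a proper email validation library, and the names Email, isEmail and allEmails are chosen for this example only.

```typescript
interface Email { readonly __email: true }

// Simplistic stand-in check; a real project would delegate to a
// dedicated email validation library instead of this pattern.
function isEmail(s: string): s is string & Email {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s);
}

// Lifts the element guard to a whole array.
function allEmails(values: string[]): values is Array<string & Email> {
  return values.every(isEmail);
}

// Only accepts receivers that have been checked.
function countReceivers(to: Array<string & Email>): number {
  return to.length;
}

function checkedCount(raw: string[]): number | null {
  return allEmails(raw) ? countReceivers(raw) : null;
}
```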

Custom tag types

Besides using the supplied tag types, it's advised to write your own. Tag types can be used to define custom domain concepts. One example is ids. Here is an example based on string Uuid ids.

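This way it's impossible to put the wrong ids into a method. A findCustomer method whose parameter is declared as CustomerId will only take a CustomerId. Compare this to a findCustomer method that takes a plain string, where in complex projects it's easy to put the wrong id into the findCustomer method, producing a hard to find bug.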

One can define a custom Tag type to define more than one id tag.
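One possible way to express several id tags with a single generic tag is sketched below. The Id interface, the asCustomerId and asProductId helpers and the return value of findCustomer are assumptions made for this illustration.

```typescript
// One generic tag, parameterised by the kind of id it marks.
interface Id<Name extends string> { readonly __id: Name }

type CustomerId = string & Id<"Customer">;
type ProductId = string & Id<"Product">;

const uuidPattern =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Ids are created at the edge of the system, after a format check.
function asCustomerId(raw: string): CustomerId | null {
  return uuidPattern.test(raw) ? (raw as CustomerId) : null;
}

function asProductId(raw: string): ProductId | null {
  return uuidPattern.test(raw) ? (raw as ProductId) : null;
}

// Passing a ProductId or a plain string here is a compile-time error.
function findCustomer(id: CustomerId): string {
  return `customer ${id}`;
}
```

The two id types are both strings at runtime, but their marker properties differ, so the compiler keeps them apart.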

Tag types are an easy way to build on the already excellent type system of Typescript. taghiro supplies ready to use tag types so you will need to write less code. And it will help you write fewer bugs.

taghiro is Open Source and licensed under an MIT license.

Reordering Futures in Java Streams

Futures are an interesting concept in concurrent and parallel programming. I've written about them several times, the last time in Advanced Futures With Play Framework. Streams are a concept introduced with Java 8 to operate on a stream of objects and to transform and filter them. Combining Futures and Streams in Java sometimes isn't easy. Some time ago a problem arose in code where slow futures in a stream were blocking other objects in the stream from being processed. Usually you'd use parallel streams in this case, but for operational reasons parallel streams had the wrong thread characteristics. Therefore I've thought about ways to unblock streams.

Let's start with a simple example. We have a simple asynchronous API call and a Stream of items, and we want to call an external API, potentially with IO, which takes some time:

We iterate over the stream and make the call. The call returns a CompletableFuture, so our stream becomes a Stream of Futures. In the end we collect the items of the stream into a List with a stream collector, in this case Collectors.toList(). As a minor challenge, the collect call returns List<CompletableFuture<Integer>>, but we would like to have CompletableFuture<List<Integer>> for easier consumption. Luckily the futures library from Spotify provides an allAsList call which turns a List of Futures into a Future of List.

When running the code it outputs

The stream processes all items without blocking (4x Stream 1), then all API calls finish (Done) and afterwards the results are printed as 301, 1001, 102, 21. As the stream operates on Futures which do not block, our code runs immediately after creation of the stream without waiting for the stream processing to end. Only when we call join() in the last line does our code block and wait until all futures have finished.

Streams support not only mapping but also filtering. We now want to filter on the values of futures. As our Stream needs to deal with the value of a future, we need to wait for the future to complete and pull the value from the Future context into the Stream context with f.join():

Running this code is not as nice as before. The stream processes one item, then blocks until the call finishes, then processes the next item.

As can be seen from the output, the longer running task (1000) blocks (join) the processing of further stream items. Most often this is solved by using parallel streams. A parallel version would look like this

As expected, the output again processes all items; the 1000ms API call no longer blocks the processing of other items:

If we want more control over when and how many threads are used, and we also want a single-threaded Stream, we can implement a custom stream collector. With a custom collector, we can reorder the futures of the stream and move those that have completed to the front. Our custom collector is called FutureReorderCollector. It gathers all Futures and creates a new stream with a new ordering. Finished futures are moved forward and running futures are kept back.

Output now is as before, although we are not using parallel streams and have a long running (1000ms) API call.

The long running future is moved to the end of the stream. Stream execution is blocked only as much as needed.

Our collector needs an executor to process those Futures. An optimized version could work with the executor already used by the futures. But in this case we provide a custom ExecutorService which we need to shut down after we finished work.

Implementation of the custom collector is simple. It collects all futures into a custom FutureCompletionService and creates a new Stream in the finisher with Stream.generate(f::take).limit(f.size()).

The collector uses a custom completion service, where we add futures to a JDK CompletionService and block on polling finished futures in take. One implementation detail is the counting of submitted tasks with an AtomicInteger. This count is used to limit the Stream generated in FutureReorderCollector.

Streams and futures are powerful concepts in Java. Sometimes it's not easy to combine them. With this code you can implement your own collector to change how streams process their items.