
Better Configuration Files

Over the years I have seen many configuration files, and most of them were unusable. There are many reasons for that. These are the main points I’ve learned from looking at large configurations:

1. Values

Often configuration files use the wrong values. Developers tend to use true/false for switching options on and off.

track-users = true

The configuration is easier to understand when one uses on/off instead of true/false, because it’s closer to the domain:

track-users = on

Sometimes there are double negations like

no-track-users = true

which easily leads to wrong configurations and is harder to understand. Negated names also lead to inconsistencies like

track-users = on
no-paypal = off
creditcard = on

Do not use double negation; stick with positive switches where possible.
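As a minimal sketch of this rule, a loader could accept only on/off and refuse negated switch names. The function name parse_switch and the "no-" prefix check are my assumptions for illustration, not part of any particular configuration library.

```python
# Hypothetical helper: accepts only on/off and rejects negated switch names.
SWITCH_VALUES = {"on": True, "off": False}


def parse_switch(key: str, raw: str) -> bool:
    """Return True for 'on' and False for 'off'; raise ValueError otherwise."""
    if key.startswith("no-"):
        # Assumption: negated switches are recognisable by a "no-" prefix.
        raise ValueError(f"negated switch name {key!r}: use a positive name instead")
    value = raw.strip().lower()
    if value not in SWITCH_VALUES:
        raise ValueError(f"{key}: expected 'on' or 'off', got {raw!r}")
    return SWITCH_VALUES[value]
```

With this, track-users = on parses to True, while no-paypal = off is rejected before it can confuse anyone.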

2. Lists and Types

I’m a friend of typing data. This also goes for configuration values. Your configuration system should support simple types like Strings, Numbers and Lists.

backends = ['backend1', 'backend2', 'backend3']
port = 8080
name = "TestDB"

Then a configuration checker can check the values for typos, like

port = 808x0
backends = ['backend1' 'backend2', 'backend3']
// backends = backend1 backend2, backend3
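To illustrate how typed values let a checker catch these typos, here is a small sketch of a value parser for the three types above. The name parse_value is hypothetical, and the parser is deliberately simplified: strings may not contain quotes or commas.

```python
import re

# A quoted string without embedded quotes (a simplification for this sketch).
_QUOTED = re.compile(r"""(['"])([^'"]*)\1""")


def _parse_string(text: str) -> str:
    match = _QUOTED.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse value: {text!r}")
    return match.group(2)


def parse_value(raw: str):
    """Parse a Number, a quoted String, or a List of quoted Strings."""
    text = raw.strip()
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        if not inner:
            return []
        # Each comma-separated item must be exactly one quoted string,
        # so a missing comma between two items is reported as an error.
        return [_parse_string(item.strip()) for item in inner.split(",")]
    if re.fullmatch(r"-?\d+", text):
        return int(text)
    return _parse_string(text)
```

Both broken lines above (808x0 and the list with the missing comma) raise a ValueError instead of silently becoming strings.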

Lists make configuration files easier to read and shorter. If you have no List support, configurations tend to look like this:

backend.1 = backend1
backend.2 = backend2
backend.3 = backend3
...
currency = dollar
currency = euro

Compare this to the clean List code above.

3. Descriptive

Configuration options should be descriptive. Often they tend to use lots of true/false configuration switches.

database = true
redis = false
cassandra = false

If possible, it’s better to replace a group of true/false switches with one option that names the choice:

// DATABASE, REDIS, CASSANDRA
database = REDIS
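On the application side, such an option maps naturally to an enumeration. A rough sketch, where the names Storage and parse_storage are my own and only the three option names come from the example above:

```python
from enum import Enum


class Storage(Enum):
    """The allowed values of the option (hypothetical type name)."""
    DATABASE = "DATABASE"
    REDIS = "REDIS"
    CASSANDRA = "CASSANDRA"


def parse_storage(raw: str) -> Storage:
    """Look up the option by name and list the allowed values on failure."""
    try:
        return Storage[raw.strip()]
    except KeyError:
        allowed = ", ".join(member.name for member in Storage)
        raise ValueError(f"unknown storage {raw!r}; allowed: {allowed}") from None
```

Unlike three independent switches, this cannot end up with two backends enabled at once or with none at all.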

4. Hierarchical configurations

Flat configurations are hard to read. To group configuration keys with flat configurations you need to repeat yourself with namespace values. I know many do not like mixing property style and XML style, but I’ve learned that this works really well for configurations (YAML is another hierarchical option).

<database>
    server = 127.0.0.1
    port = 8080
    name = "TestDB"
</database>

is more readable and clear than

database.server = 127.0.0.1
database.port = 8080
database.name = "TestDB"

or even

ordermanagement.database.server = 127.0.0.1
ordermanagement.database.port = 8080
ordermanagement.database.name = "TestDB"
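If you are stuck with a flat format, the hierarchy can at least be recovered after loading. A minimal sketch, assuming the flat keys are already in a dictionary; the function name nest is hypothetical:

```python
def nest(flat: dict) -> dict:
    """Turn dotted keys like 'database.port' into nested dictionaries."""
    tree: dict = {}
    for dotted_key, value in flat.items():
        *groups, leaf = dotted_key.split(".")
        node = tree
        for group in groups:
            node = node.setdefault(group, {})
            if not isinstance(node, dict):
                raise ValueError(f"{dotted_key}: {group!r} is already a plain value")
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{dotted_key}: {leaf!r} is already a group")
        node[leaf] = value
    return tree
```

The application can then work with one database group, even though the file repeats the prefix on every line.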

5. Application configuration, server instances and environments

The configuration for one of your application instances is a combination of the application configuration, the environment the instance runs in, and the server instance.

// application code
database-url = jdbc://$DATABASE.server//$DB
callback-url = http://$HOST/callback

Both database-url and callback-url are application configuration values; $DB depends on the environment (development, production, test) and $HOST depends on the server instance. Environment-specific values can easily be kept in separate configuration files, e.g. in production, development and test directories. This lets you put all configurations into your source control system.
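One possible way to combine the three layers is plain variable substitution, sketched here with string.Template from the Python standard library. The function name resolve, the key database-name and the example values are assumptions for illustration; dotted references such as $DATABASE.server are left out of this sketch.

```python
from string import Template


def resolve(application: dict, environment: dict, instance: dict) -> dict:
    """Fill $VARIABLES in application values from environment and instance."""
    # Instance values win over environment values if both define a name.
    variables = {**environment, **instance}
    # substitute() raises KeyError when a variable has no value,
    # so unresolved placeholders never reach the running application.
    return {
        key: Template(value).substitute(variables)
        for key, value in application.items()
    }


application = {
    "callback-url": "http://$HOST/callback",
    "database-name": "$DB",  # simplified stand-in for the database-url above
}
production = {"DB": "ProductionDB"}    # hypothetical environment values
server_one = {"HOST": "shop1.example"}  # hypothetical instance values
resolved = resolve(application, production, server_one)
```

The application file stays identical across all environments; only the small environment and instance dictionaries differ.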

6. Inversion of control and DRY

Your configuration framework should support inversion of control or dependency injection; I’m not sure what to call it. Do not define the database server IP in every client. Define it in the database server configuration and inject/reference it into every client configuration.

// database server config
host = $HOST

and the client

// client config
database.server = $DATABASE.host

instead of

// client config
database.server = 127.0.0.1

The client does not declare the database server; at startup it determines the runtime config by getting the values from the database server. This is most easily solved with a configuration service; it’s harder to solve with configuration files.
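With configuration files, a rough approximation is to resolve references like $DATABASE.host against the other configurations at load time. This is only a sketch under my own assumptions: the function name inject, the reference syntax accepted by the regular expression, and the in-memory services dictionary are all hypothetical.

```python
import re

# Matches references of the form $SERVICE.key, e.g. $DATABASE.host.
_REFERENCE = re.compile(r"\$([A-Z_]+)\.([A-Za-z0-9_-]+)")


def inject(client: dict, services: dict) -> dict:
    """Replace $SERVICE.key references with values from other configs."""

    def lookup(match: re.Match) -> str:
        service, key = match.group(1), match.group(2)
        try:
            return services[service][key]
        except KeyError:
            raise ValueError(f"unresolved reference ${service}.{key}") from None

    return {key: _REFERENCE.sub(lookup, value) for key, value in client.items()}


# The address is defined once, in the database server's own configuration.
services = {"DATABASE": {"host": "127.0.0.1"}}
client = inject({"database.server": "$DATABASE.host"}, services)
```

Moving the database then means changing one value in one place, and every client picks it up on the next start.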

7. Write a configuration checker

Of utmost importance is to write a configuration checker. Many production problems arise from wrong configurations. Apache, for example, lets you test the configuration before you restart the server by running

apachectl configtest

The checker should check for typos, unresolved dependencies, unresolved variables, types (String, Number, List, IP, …) and mandatory values.
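A very small checker along these lines could look like the following sketch. The schema format, the helper names and the example keys are my assumptions; a real checker would also need to cover dependencies between values.

```python
import ipaddress
from typing import Any


def is_ip(value: Any) -> bool:
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def is_number(value: Any) -> bool:
    return isinstance(value, int) and not isinstance(value, bool)


def is_string(value: Any) -> bool:
    return isinstance(value, str)


def is_list(value: Any) -> bool:
    return isinstance(value, list)


# Hypothetical schema: key -> (validator, mandatory?)
SCHEMA = {
    "server": (is_ip, True),
    "port": (is_number, True),
    "name": (is_string, True),
    "backends": (is_list, False),
}


def check(config: dict) -> list:
    """Return a list of human-readable problems; empty means the config is fine."""
    problems = [f"unknown key {key!r} (typo?)" for key in config if key not in SCHEMA]
    for key, (is_valid, mandatory) in SCHEMA.items():
        if key not in config:
            if mandatory:
                problems.append(f"missing mandatory key {key!r}")
            continue
        value = config[key]
        if isinstance(value, str) and "$" in value:
            problems.append(f"unresolved variable in {key!r}: {value!r}")
        elif not is_valid(value):
            problems.append(f"wrong type or format for {key!r}: {value!r}")
    return problems
```

Collecting all problems in a list, instead of stopping at the first one, lets an operator fix a broken file in one pass before the restart.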

Conclusion

Writing configuration in the wrong way leads to maintenance and production problems. Do not neglect the configuration of your application; you need to put thought into it or it will turn into an unmanageable mess.


About the author: Stephan Schmidt has more than 15 years of internet technology experience and 10 years of experience in agile. He was head of development, consultant and CTO and is a speaker, author and blog writer. He specializes in organizing and optimizing software development, helping companies by increasing productivity with lean software development and agile methodologies. All views are only his own.


Comments

Great posting Stephan. I know more examples for bad configuration files than for good ones, especially when it comes to the naming of variables. I don’t agree with your proposed list syntax though, because this can easily lead to syntax errors. I personally prefer a more lightweight syntax without brackets and/or with whitespaces as a delimiter between list items.

@Stefan: Thanks. If you have bad examples, post a comment :-) Yes, naming is a problem too. Developers often think for only 5 seconds about the name of a configuration property, which is too short and leads to bad names.

@Stephan Really nice post. I have used Groovy scripts (in a Java project) for complex configuration stuff, which worked out well for me. I also blogged about it here:

http://charsequence.blogspot.com/2010/01/replacing-application-properties-with.html

I agree with pretty much everything. This is why I usually find that configuration files written as internal DSLs in a language with great DSL support (Scala, Ruby, Clojure, Groovy, and possibly Haskell) can follow all these points, with a few other advantages: importing, templating and hierarchical configurations (for instance, the dev configuration is the same as the production configuration, but uses a local database instead of a remote one).

The configuration checker is a great tool as well. But for small to medium-sized configuration files written as an internal DSL, just showing or searching the configuration might suffice. Actually, such a tool is great for big configuration files as well.

Pavlo

Great read! Another suggestion from my experience: if you are in the luxury to configure for a scripting language, consider configuration in the same language first and avoid additional formats like YAML etc. You stay “native”, without any media disruptions. The so-called “environizing” – the process of injecting concrete values instead of environment dependent placeholders, could be a problem, though, if your placeholders can’t be easily replaced through a bunch of shell scripts the integration guys would use to modify the values for each stage. But that must be a real exotic scripting language :)

Extra format parsing isn’t necessary if you already have a system which parses scripts. I wouldn’t have done this with JSP since it’s on top of something much bigger / static, but I’ve seen good configurations that worked with PHP that way since there is not much underneath. This is my point.

Cedric

Great points all around, Stephan (and I would really like to see YAML used more for configuration files that need to be read and edited by humans).

@Stephan: Probably one of the best examples of human-unfriendly configuration files is the Sendmail MTA. As far as I remember, you have to learn the m4 macro language to be able to configure your MTA. Then there is the Asterisk PBX, which is quite modular, but you need to read a good amount of documentation to be able to configure it properly, especially if you have to debug a problem in your extensions.conf. The Nagios monitoring system is also a nice example: you have to learn some basic concepts first (every piece of configuration is an object, you have to set up your templates first, nrpe has nothing to do with the scheduler core, and so on) to configure your system properly.

To be fair: Those are heavy and complex systems. They offer a good amount of flexibility and the price is a steep learning curve. All these systems have less featureful, no-frills alternatives to choose from.

On the other hand, I believe that simple, common things should be easy to configure and complex, more uncommon setups have to be possible.

When it is hard to get a small example configuration to work, I tend to mistrust the software in general.

I think you forgot about two important aspects that make configuration of systems harder: updates (new versions) and customer specific configurations (to product versions).

You then have to define configurations incrementally, supporting renaming, deleting and changing of values (if you do not want to make the configuration redundant).

I am totally against putting IP addresses in config files. I don’t even like cross-referential hostnames.

A technique I like is to create and use DNS aliases based on the functionality you are using (e.g., database, web services). It’s a good idea to qualify those aliases too (e.g., billing-db, crm-openid, forums-smtp).


ron

Most important, IMHO, is to split the configuration file into several configuration files, each responsible for a different concern, and to allow injecting values between them. This way it’s modular, easier to keep DRY, and updates are made easier.

In my multi-module project, each module has its configuration files, and on bootstrap the runtime collects them and injects values between them, creating one big environment object.
Changing the hosts is easy and done in one place only. http://code.google.com/p/reflections/wiki/UseCases

This is a great post. Two comments:

(1) Sometimes there is a need for multi-layers of configuration in order to support sensible defaults. For instance, in a desktop application, the general defaults can be read from the installation directory, user defaults can be read from the user’s home directory, and the actual configuration can be read from the current directory.

The configuration module just needs to load these three files (in order) and overwrite old values with new ones.

(2) I am not sure about “Client does not declare the database server, the client determines at startup the runtime config by getting the values from the database server.” (your 6th point). The problem I see is that of bootstrapping: in order to retrieve the configuration from the server, the client needs to know the address of this server, so I think it will be really difficult to have the IP address of the server injected into the clients.
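The layered defaults from point (1) can be sketched in a few lines, assuming each file has already been loaded into a dictionary. The function name layer and the example keys are hypothetical.

```python
def layer(*layers: dict) -> dict:
    """Merge configuration layers in order; later layers overwrite earlier ones."""
    merged: dict = {}
    for values in layers:
        merged.update(values)
    return merged


installation_defaults = {"theme": "light", "autosave": "on"}
user_defaults = {"theme": "dark"}
current_directory = {"autosave": "off"}

effective = layer(installation_defaults, user_defaults, current_directory)
```

Here the user’s theme and the project’s autosave setting win, and everything else falls back to the installation defaults.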

Det

Great post. +1 in all points.

In my company we use a proprietary build system based on Ant. It is “proprietary” in that our projects don’t use the build.xml directly but call a script that calls a very complex Ant build construct.

The configuration is done in project-specific property files using the typical Java group-by-dot-chains style you mention.

In the beginning it was OK, but after many years of growing development and changing requirements we now have a really weird and complex configuration language.

It would have been better if they had started with your tips in mind back then.

BTW: I will add one point:

Conditionals!

It is a bad experience if you try to set different properties without effect until you learn that the whole group is ignored due to some other funny named (and not obviously related) property flag, which is checked inside the application black box.

Stefan,
lots of good points. May I add

* Log the entire configuration as read, with values replaced, at your app’s startup. That way you always know which config applies to the behavior and where values were not properly replaced.
* Make sure all config is read at the startup of all modules (no lazy config init/validation [file exists, can be opened, …]); otherwise it is scattered over the whole log, and rarely exercised features fail late and require a re-config/restart.
* Add a type indication to your names (which you did naturally), such as
** thread-count instead of threads
** max-thread-count instead of max-threads
** server-url instead of server

K


What people wrote somewhere else:

Most of the points are covered by Configgy RT @codemonkeyism Just blogged “Better Configuration Files” http://t.co/SdF9EXp

This comment was originally posted on Twitter

