Readable configuration files

From time to time you have to develop software that must be configured for each run, or that runs from a predefined configuration. Normally there should be a UI which the user can use to create such a configuration and hand it over to the software. But if the effort of creating a user-friendly UI is too high, or the user does not want one, it is essential to choose a configuration format that can be edited easily. In my current work, I have to provide a configuration format that sets up a travel demand simulation, including all required input data.

Wikipedia lists several formats that can be used to serialise the data. As I have worked with XML in the past and, like Jeff Atwood, got lost in all those tags, I decided to take a look at alternative formats such as YAML. It was designed to be easily readable, and so far I can agree with that.

As stated above, I need to define a configuration format for a travel demand simulation. Historically, our tool uses matrices for costs and travel time as input. Those matrices can differ by time of day and travel mode, e.g. for travelling by car you can have one which is valid between 0 AM and 1 AM and another one which is valid between 1 AM and 3 AM. If you want to go by bus, there are different matrices again, e.g. one valid between 0 AM and 2 AM and one valid between 2 AM and 3 AM. So you can have several matrices per travel mode. Using YAML, one can specify the input files in a data-driven approach using nested sets of key-value pairs.

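  - mode: car
    matrices:   # key name is an assumption; the text only speaks of "matrices"
      - from: 0
        to: 1
        path: path/to/0-to-1.file
      - from: 1
        to: 3
        path: path/to/1-to-3.file
  - mode: bus
    matrices:   # same assumed key name as above
      - from: 0
        to: 2
        path: path/to/0-to-2.file
      - from: 2
        to: 3
        path: path/to/2-to-3.file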

This is the classical way of specifying configuration data, as it was done in XML, except that only one parameter or key-value pair is stated per line. As stated above, YAML was designed to provide a more human-readable serialisation format. So we can push the serialisation of the configuration a bit further.

travelTimeUsing:
  car:
    between:
      0 to 1: path/to/0-to-1.file
      1 to 3: path/to/1-to-3.file
  bus:
    between:
      0 to 2: path/to/0-to-2.file
      2 to 3: path/to/2-to-3.file

In this case, the data structure is based on maps, where car and bus are keys for travel modes and 0 to 1, 1 to 3, 0 to 2 and 2 to 3 are keys for the time spans, each mapped to its matrix file. Read from the outermost key inwards, the keys answer the question: where will the travel time using a car between 0 AM and 1 AM be read from?

travelTimeUsing: car: between: 0 to 1: path/to/0-to-1.file

Ignoring punctuation, this can be read as:

Travel time using car between 0 to 1 path/to/0-to-1.file.

To build a better-sounding sentence, one could also add "will be read from" after 0 to 1, which results in:

Travel time using car between 0 to 1 will be read from path/to/0-to-1.file.

But I think this last addition inflates the configuration with too much boilerplate text.
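To illustrate how a program might consume the map-based form, here is a minimal Python sketch. It assumes the YAML file has already been parsed into nested dictionaries (which is what common YAML loaders return for maps), so the dictionary below stands in for the loaded file. The function names `parse_time_span` and `matrix_path` are hypothetical, and treating a time span as half-open (start included, end excluded) is my assumption, not something the format prescribes.

```python
# Parsed form of the map-based configuration: nested dicts with string keys.
# A plain YAML key such as "0 to 1" is loaded as the string "0 to 1".
config = {
    "travelTimeUsing": {
        "car": {"between": {
            "0 to 1": "path/to/0-to-1.file",
            "1 to 3": "path/to/1-to-3.file",
        }},
        "bus": {"between": {
            "0 to 2": "path/to/0-to-2.file",
            "2 to 3": "path/to/2-to-3.file",
        }},
    }
}


def parse_time_span(key):
    """Turn a key such as '0 to 1' into the integer pair (0, 1)."""
    start, end = key.split(" to ")
    return int(start), int(end)


def matrix_path(config, mode, hour):
    """Return the matrix file valid for `mode` at `hour`.

    Assumption: a span covers start <= hour < end.
    """
    spans = config["travelTimeUsing"][mode]["between"]
    for key, path in spans.items():
        start, end = parse_time_span(key)
        if start <= hour < end:
            return path
    raise KeyError(f"no matrix for mode {mode!r} at hour {hour}")


print(matrix_path(config, "car", 0))  # path/to/0-to-1.file
print(matrix_path(config, "bus", 1))  # path/to/0-to-2.file
```

The readable keys cost only one small parsing step (`parse_time_span`) on the program side, which seems a fair price for a file that reads almost like a sentence.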

In my case, this serialisation format did not take a lot of time to develop, and my users were quite happy with it. They can use normal text editors to change their configurations. Some text editors additionally support them with syntax highlighting or folding of YAML files.

Using text editors to change the configuration instead of a dedicated UI or XML has a drawback: I could not find a mechanism for YAML that is comparable to XML Schema and supported by editors. So you have to know the syntax of the configuration. In my case, my users could write their configuration files after looking at a small example. Nonetheless, sooner or later we may need to develop a UI to configure simulations and improve the user experience. Until then, we have a quite readable configuration format.
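Without schema support in the editor, one option is to let the program check a configuration when it is loaded and report problems in plain language. The following Python sketch shows the idea for the map-based format with the keys travelTimeUsing and between. It again works on the already parsed dictionaries; the function name `validate` and the particular checks (readable time spans, non-empty paths, no overlapping spans) are illustrative assumptions, not a description of our tool.

```python
def validate(config):
    """Collect human-readable problems instead of stopping at the first one."""
    errors = []
    modes = config.get("travelTimeUsing")
    if not isinstance(modes, dict):
        return ["missing top-level key 'travelTimeUsing'"]
    for mode, entry in modes.items():
        spans = entry.get("between") if isinstance(entry, dict) else None
        if not isinstance(spans, dict):
            errors.append(f"{mode}: missing 'between' map")
            continue
        parsed = []
        for key, path in spans.items():
            try:
                # A key must look like '<start> to <end>' with whole hours.
                start, end = (int(part) for part in str(key).split(" to "))
            except ValueError:
                errors.append(f"{mode}: cannot read time span {key!r}")
                continue
            if start >= end:
                errors.append(f"{mode}: time span {key!r} is empty or reversed")
            if not isinstance(path, str) or not path:
                errors.append(f"{mode}: time span {key!r} has no file path")
            parsed.append((start, end))
        # Neighbouring spans (sorted by start hour) must not overlap.
        parsed.sort()
        for (_, prev_end), (next_start, _) in zip(parsed, parsed[1:]):
            if next_start < prev_end:
                errors.append(f"{mode}: time spans overlap at hour {next_start}")
    return errors


good = {"travelTimeUsing": {"car": {"between": {
    "0 to 1": "path/to/0-to-1.file",
    "1 to 3": "path/to/1-to-3.file",
}}}}
bad = {"travelTimeUsing": {"car": {"between": {
    "0 to 2": "path/to/0-to-2.file",
    "1 to 3": "",
    "x to 3": "path/to/x-to-3.file",
}}}}

print(validate(good))  # []
for problem in validate(bad):
    print(problem)
```

Messages of this kind can partly replace what a schema-aware editor would flag while typing, although the user only sees them once the simulation is started.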