Introduction
In this context, “configuration” means the set of initial parameter settings that a computer program reads at run-time.
Traditionally, configuration data has been provided mostly in plain-text format, but in recent years structured standards have gained more traction. This is mostly because, in modern distributed environments, configurations are often provided via a Configuration Management System (CMS), transmitted over the network and consumed by multiple applications or by multiple instances of the same application. Additionally, some configuration parameters are often determined automatically by the CMS instead of being entered manually by humans.
A Human and Machine-Readable Configuration Format
In the aforementioned scenario, with the exception of the program command-line properties, the JavaScript Object Notation (JSON) format seems the perfect candidate:
- It is a standard open lightweight data-interchange format;
- It can be read from a local file or from a remote configuration provider;
- It is often the first format choice of popular Configuration Management Systems;
- Along with XML, it is the main format for data interchange used on the modern web;
- It is easy for humans to read and write;
- It is easy for machines to parse and generate;
- It supports all the basic data types (numbers, strings, booleans, arrays and hashes);
- It is developer-friendly, as it can be generated and parsed from almost any programming language;
- Popular databases (e.g. MongoDB, MySQL, PostgreSQL) can store the JSON format natively.
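As a small illustration of the parse-and-generate points above, the following sketch round-trips a configuration fragment through Python's standard json module. The "logLevel" and "totalRequests" fields are taken from the example later in this article; "verbose", "tags" and "tls" are hypothetical fields added only to show boolean, array and hash values.

```python
import json

# A configuration fragment as it might arrive from a file or a remote provider.
raw = (
    '{"logLevel": "info", "totalRequests": 1000, "verbose": false, '
    '"tags": ["a", "b"], "tls": {"enabled": true}}'
)

# JSON text -> native Python types.
config = json.loads(raw)

# Each JSON basic type maps onto a native type (str, int, bool, list, dict).
types = {key: type(value).__name__ for key, value in config.items()}

# Native types -> JSON text again, indented so that humans can read it.
pretty = json.dumps(config, indent=2, sort_keys=True)
```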
Configuration entry points
In a given software program the configuration parameters can be provided in different ways:
- Command line properties:
They are very useful during development, testing and debugging phases.
They can overwrite any initial parameter loaded using an alternative method (e.g. a configuration file).
They should follow the Unix convention of a single dash followed by one letter, or two dashes followed by the full parameter name.
For example:
./myprog -l DEBUG
./myprog --logLevel=DEBUG
- Environment variables:
They can be used to pass initial parameters to a program that runs inside a Docker container.
For instance, they can tell the program where to retrieve the full configuration:
docker run --detach=true --net="host" --tty=true \
--env="MYPROG_REMOTECONFIGPROVIDER=consul" \
--env="MYPROG_REMOTECONFIGENDPOINT=127.0.0.1:8500" \
--env="MYPROG_REMOTECONFIGPATH=/config/myprog" \
--env="MYPROG_REMOTECONFIGSECRETKEYRING=" \
owner/myprog:latest
- Local Configuration files:
They can be used to provide either the final or default configuration parameters.
For example:
{
"remoteConfigProvider" : "",
"remoteConfigEndpoint" : "",
"remoteConfigPath" : "",
"remoteConfigSecretKeyring" : "",
"logLevel" : "info",
"serverAddress" : ":8081",
"totalRequests" : 1000,
"concurrentRequests" : 10,
"testTimeout" : 30
}
- External configuration services:
Services like Consul or etcd can be used to provide configuration parameters in a distributed system.
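The first two entry points can be read with very little code. The following is a minimal sketch in Python, assuming the "-l" / "--logLevel" option and the MYPROG_ prefix used in the examples above; the "-s" / "--serverAddress" option and the function names are hypothetical and chosen only for illustration.

```python
import argparse


def parse_args(argv):
    """Parse the command line; options that are not given stay None,
    so they can be told apart from values set by other entry points."""
    parser = argparse.ArgumentParser(prog="myprog")
    parser.add_argument("-l", "--logLevel", dest="logLevel", default=None)
    # Hypothetical second option, to show that the pattern repeats.
    parser.add_argument("-s", "--serverAddress", dest="serverAddress", default=None)
    return parser.parse_args(argv)


def read_env(environ, prefix="MYPROG_"):
    """Collect the variables that start with the prefix,
    keyed by their name without the prefix."""
    return {
        name[len(prefix):]: value
        for name, value in environ.items()
        if name.startswith(prefix)
    }
```

In a real program, parse_args would typically receive sys.argv[1:] and read_env would receive os.environ; passing them in explicitly keeps both functions easy to test.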
Configuration Loading Strategy
Different entry points can be used during development, debugging, testing or deployment. Even different deployment environments may use different configuration entry points, case by case.
To get the maximum flexibility, the different configuration entry points can be coordinated in the following sequence, where the first five steps are listed from the lowest priority to the highest and the last step validates the result:
- In the “myprog” program the configuration parameters are defined as a data structure that can be easily mapped to and from a JSON object, and they are initialized with constant default values;
- The program attempts to load the local “config.json” configuration file and, as soon as one is found, overwrites the values previously set. The configuration file is searched in the following directories, in order:
  - ./ : the current program folder;
  - config/ : a “config” sub-directory of the current directory;
  - ~/.myprog/ : a “.myprog” directory inside the user home folder;
  - /etc/myprog/ : a “myprog” directory inside the “/etc” folder.
- The program attempts to load the environment variables that define the remote configuration system and, if found, overwrites the corresponding configuration parameters:
  - MYPROG_REMOTECONFIGPROVIDER → remoteConfigProvider
  - MYPROG_REMOTECONFIGENDPOINT → remoteConfigEndpoint
  - MYPROG_REMOTECONFIGPATH → remoteConfigPath
  - MYPROG_REMOTECONFIGSECRETKEYRING → remoteConfigSecretKeyring
- If the remoteConfigX parameters are not empty, the program attempts to load the configuration data from the remote configuration system (e.g. Consul or etcd), overwriting any previous value. To react to configuration changes, this process can be repeated at run-time at regular intervals and/or when a signal is sent to the program.
- Any specified command line property overwrites the corresponding configuration parameter.
- The configuration parameters are checked and sanitized.
It should now be evident how this sequence offers the flexibility required to configure the same program in very different development, testing or deployment scenarios.
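The sequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, DEFAULTS holds just a subset of the parameters, load_remote is a placeholder that does not contact Consul or etcd, and the remote step is triggered when both the provider and the endpoint are set.

```python
import json
import os

DEFAULTS = {
    "remoteConfigProvider": "",
    "remoteConfigEndpoint": "",
    "remoteConfigPath": "",
    "remoteConfigSecretKeyring": "",
    "logLevel": "info",
    "serverAddress": ":8081",
}

SEARCH_DIRS = ["./", "config/", "~/.myprog/", "/etc/myprog/"]


def load_local(dirs=SEARCH_DIRS, name="config.json"):
    """Return the contents of the first config file found, or {} if none exists."""
    for directory in dirs:
        path = os.path.join(os.path.expanduser(directory), name)
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                return json.load(handle)
    return {}


def load_env(environ, prefix="MYPROG_"):
    """Map MYPROG_REMOTECONFIGX variables onto the remoteConfigX keys."""
    by_upper = {k.upper(): k for k in DEFAULTS if k.startswith("remoteConfig")}
    found = {}
    for name, value in environ.items():
        key = by_upper.get(name[len(prefix):]) if name.startswith(prefix) else None
        if key is not None:
            found[key] = value
    return found


def load_remote(cfg):
    """Placeholder: a real program would query the remote provider here."""
    return {}


def load_config(cli=None, environ=os.environ, dirs=SEARCH_DIRS, remote=load_remote):
    cfg = dict(DEFAULTS)                       # 1. constant defaults
    cfg.update(load_local(dirs))               # 2. local config.json
    cfg.update(load_env(environ))              # 3. environment variables
    # Assumption: provider and endpoint are enough to attempt a remote load.
    if cfg["remoteConfigProvider"] and cfg["remoteConfigEndpoint"]:
        cfg.update(remote(cfg))                # 4. remote configuration
    # 5. command line: only options that were actually given (not None)
    cfg.update({k: v for k, v in (cli or {}).items() if v is not None})
    if not cfg["serverAddress"]:               # 6. minimal sanity check
        raise ValueError("serverAddress must not be empty")
    return cfg
```

Because every step is a plain dictionary update, a later step wins only for the keys it actually provides, which is what gives each entry point its priority.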
Common Configuration Parameters
In any configuration it is very useful to use consistent names for fields that refer to the same functionality. In this way it is easy to set up a Configuration Management System for multiple programs at once.
For example:
- remoteConfigProvider : remote configuration source (e.g. “consul”, “etcd”);
- remoteConfigEndpoint : remote configuration URL (ip:port);
- remoteConfigPath : remote configuration path where to search for the configuration file (e.g. “/config/myprog”);
- remoteConfigSecretKeyring : path to the OpenPGP secret keyring used to decrypt the remote configuration data (e.g. “/etc/myprog/configkey.gpg”); if empty, a non-secure connection is used instead;
- logLevel : log category classified using the syslog severity levels (EMERGENCY, ALERT, CRITICAL, ERROR, WARNING, NOTICE, INFO, DEBUG);
- serverAddress : HTTP API URL (ip:port) or just (:port).
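One possible way to share these common fields between programs is a small data structure that maps to and from JSON and knows how to check itself. The sketch below is an assumption-laden illustration: the class name CommonConfig, its methods and the exact validation rules are hypothetical, and the log level is normalized to upper case so that both “info” and “INFO” are accepted.

```python
import json
from dataclasses import asdict, dataclass, fields

SYSLOG_LEVELS = (
    "EMERGENCY", "ALERT", "CRITICAL", "ERROR",
    "WARNING", "NOTICE", "INFO", "DEBUG",
)


@dataclass
class CommonConfig:
    remoteConfigProvider: str = ""
    remoteConfigEndpoint: str = ""
    remoteConfigPath: str = ""
    remoteConfigSecretKeyring: str = ""
    logLevel: str = "INFO"
    serverAddress: str = ":8081"

    @classmethod
    def from_json(cls, text):
        """Build a config from JSON text, ignoring keys this structure does not know."""
        known = {f.name for f in fields(cls)}
        data = {k: v for k, v in json.loads(text).items() if k in known}
        return cls(**data)

    def to_json(self):
        """Serialize the config back to indented JSON text."""
        return json.dumps(asdict(self), indent=2)

    def sanitize(self):
        """Normalize the log level and reject values outside the expected shapes."""
        self.logLevel = self.logLevel.strip().upper()
        if self.logLevel not in SYSLOG_LEVELS:
            raise ValueError(f"unknown logLevel: {self.logLevel!r}")
        _host, sep, port = self.serverAddress.rpartition(":")
        if not sep or not port.isdigit():
            raise ValueError(f"serverAddress must be ip:port or :port, got {self.serverAddress!r}")
        return self
```

A program-specific configuration could then embed or extend this structure and add its own fields, such as “totalRequests” in the earlier example.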