Introduction
Logs are used to record events taking place in the execution of a software program in order to provide an audit trail that can be used to understand the activity and diagnose problems.
In modern distributed environments the logs produced by multiple applications, or by multiple instances of the same application, are often collected and processed in a single place. A typical stack involves Elasticsearch, Logstash and Kibana (ELK). In these environments the first-level consumer of the log messages is no longer a human but a machine, mostly because humans cannot process information at the massive rates generated by concurrent and distributed systems. This poses new challenges for traditional plain-text log formats designed for human consumption:
- They require custom parsing scripts;
- Adding a new field can break most of the custom parsing scripts;
- The information is often reduced to avoid overload, but this leaves out useful data;
- It is difficult to aggregate or query logs with different formats.
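To illustrate the first two points, consider how little work a structured log line requires from a consumer. A minimal Python sketch (the log line and its fields are invented for illustration):

```python
import json

# A structured log line can be parsed with any standard JSON library,
# with no custom parsing script to write or maintain.
line = '{"level":"ERROR","msg":"disk full","ext_disk":"/dev/sda1"}'
event = json.loads(line)

print(event["level"])  # fields are accessed by name, not by position
# A newly added field (here "ext_disk") does not break existing
# consumers: they simply ignore keys they do not know about.
```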
Having a machine-readable structured log format, instead of a plain-text one, makes it possible to trace operations across multiple machines and different software systems. At the same time, it is important that the format remains human-readable.
A Human and Machine-Readable Log Format
In the aforementioned scenario the JavaScript Object Notation (JSON) format seems the perfect candidate:
- It is an open lightweight data-interchange standard format;
- Along with XML, it is one of the main formats used for data interchange on the modern web;
- It is easy for humans to read and write;
- It is easy for machines to parse and generate;
- It supports all the basic data types (i.e. numbers, strings, booleans, arrays and objects);
- It is developer-friendly, as it can be generated and parsed from almost any programming language;
- Popular databases (e.g. MongoDB, MySQL, PostgreSQL) can store the JSON format natively, so it is possible to aggregate the JSON data in a service that gives powerful reporting, searching, and insights.
In JSON any piece of information can be atomically stored as a key-value pair, so we can log anything that adds value when aggregated, charted or further analysed.
Using JSON we can write logs for machines to process and use the tools around those logs to transform them into something that is consumable by a human.
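One way to make machine-oriented logs consumable by a human is a thin rendering layer over the JSON lines. A minimal sketch (the `human_view` helper and the sample lines are invented for illustration):

```python
import json

def human_view(stream):
    """Render machine-oriented JSON log lines as short human-readable rows."""
    for line in stream:
        event = json.loads(line)
        # Only a few fields matter for a quick human scan.
        yield "{datetime} [{level}] {msg}".format(**event)

# Example: transform two raw JSON log lines into a readable view.
raw = [
    '{"datetime":"2016-10-06T14:56:48Z","level":"INFO","msg":"started"}',
    '{"datetime":"2016-10-06T14:57:02Z","level":"ERROR","msg":"disk full"}',
]
for row in human_view(raw):
    print(row)
```

In a real deployment this role is typically played by tools such as Kibana, but the principle is the same: the machine-readable record is the source of truth, and the human view is derived from it.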
Common Log Fields
In any log message it is very useful to have a set of common fields that are critical to understanding what, when and where the logged event happened. These fields can be automatically populated by a common software library, method or function, so there is no need to manually enter them in the code at each log call.
Every log message should include:
level
: Log category, classified using the syslog severity levels defined in RFC 3164:
- EMERGENCY: system is unusable
- ALERT: action must be taken immediately
- CRITICAL: critical conditions
- ERROR: error conditions
- WARNING: warning conditions
- NOTICE: normal but significant condition
- INFO: informational messages
- DEBUG: debug-level messages
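RFC 3164 also assigns each severity level a numeric code, where a lower number means a more severe condition; this makes threshold-based filtering straightforward. A small sketch (the `at_least` helper is invented for illustration):

```python
# Syslog severity codes from RFC 3164: lower number = more severe.
SEVERITY = {
    "EMERGENCY": 0,
    "ALERT": 1,
    "CRITICAL": 2,
    "ERROR": 3,
    "WARNING": 4,
    "NOTICE": 5,
    "INFO": 6,
    "DEBUG": 7,
}

def at_least(level, threshold):
    """True if `level` is at least as severe as `threshold`."""
    return SEVERITY[level] <= SEVERITY[threshold]

print(at_least("ERROR", "WARNING"))  # → True
print(at_least("DEBUG", "INFO"))     # → False
```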
hostname
: Host name of the machine running the program;
program
: Name of the program;
version
: Semantic version of the program;
release
: Program release number (build number);
datetime
: Human-readable UTC date and time when the event occurred (RFC 3339 format);
timestamp
: Machine-readable UTC timestamp in nanoseconds since the epoch;
msg
: Actual log message in plain English.
To avoid conflicts between the standard fields and extra fields, any extra field is prefixed with “ext_”.
Log Example:
{
  "level": "INFO",
  "hostname": "server0001",
  "program": "myprog",
  "version": "1.2.3",
  "release": "17",
  "datetime": "2016-10-06T14:56:48Z",
  "timestamp": 1475765808084372773,
  "msg": "Example message",
  "ext_custom": 123
}
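A log entry like the one above can be produced by a small helper that fills in the common fields automatically. A minimal Python sketch, assuming hypothetical `PROGRAM`, `VERSION` and `RELEASE` constants that would normally come from the build system:

```python
import json
import socket
import time

# Hypothetical program metadata, normally injected at build time.
PROGRAM = "myprog"
VERSION = "1.2.3"
RELEASE = "17"

def log(level, msg, **extra):
    """Build one JSON log line, auto-populating the common fields."""
    now = time.time()
    entry = {
        "level": level,
        "hostname": socket.gethostname(),
        "program": PROGRAM,
        "version": VERSION,
        "release": RELEASE,
        # Human-readable UTC date and time (RFC 3339 format).
        "datetime": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
        # Machine-readable UTC timestamp in nanoseconds since the epoch.
        "timestamp": int(now * 1e9),
        "msg": msg,
    }
    # Extra fields are prefixed with "ext_" so they cannot clash
    # with the standard field names.
    for key, value in extra.items():
        entry["ext_" + key] = value
    return json.dumps(entry)

print(log("INFO", "Example message", custom=123))
```

Because the common fields are filled in by the helper, each call site only has to supply the severity level, the message and any extra context.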