- You have data that you want to log.
- You shove that data into a string that also carries the time, severity, and other metadata.
- Your log gets written to a file
- If you have centralised logging (as you should), a log shipper (such as Logstash) reads the file and forwards it to the log server.
- On the log server the log lines are parsed to extract the separate fields again, so that they can be indexed and searched (see the sketch after this list).
- Profit! (usually not)
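To make that round trip concrete, here is a minimal Python sketch of steps 1, 2 and 5. The field names and log format are made up for illustration; the point is that structure is thrown away when formatting and has to be reverse-engineered when parsing.

```python
import re
from datetime import datetime, timezone

# Structured data we actually care about.
event = {"user": "alice", "action": "login", "duration_ms": 42}

# Steps 1-2: flatten it into a log line (hypothetical format).
line = "%s INFO user=%s action=%s duration_ms=%d" % (
    datetime.now(timezone.utc).isoformat(),
    event["user"],
    event["action"],
    event["duration_ms"],
)

# Step 5: on the log server a parser (e.g. a grok-style regex) has to
# recover the fields from the string before they can be indexed.
pattern = re.compile(
    r"(?P<timestamp>\S+) (?P<severity>\w+) "
    r"user=(?P<user>\S+) action=(?P<action>\S+) duration_ms=(?P<duration_ms>\d+)"
)
fields = pattern.match(line).groupdict()
print(fields)  # the structure is back, but only because the regex matches exactly
```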
Enter GELF
Graylog has introduced a wonderful logging format called GELF (Graylog Extended Log Format). It is a simple JSON-based format with a few mandatory fields and a protocol for shipping the logs around. The end result is that you start with a data structure, log it as it is, and it ends up indexed exactly like that. Docker, for example, has built-in support for GELF: you get container names, IDs, image names and so on as separate fields.
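For illustration, this is roughly what a GELF message looks like when built and sent by hand over UDP with only the Python standard library. The host name, port, and custom fields are placeholders; the mandatory fields (`version`, `host`, `short_message`) come from the GELF spec, and custom fields are prefixed with an underscore.

```python
import json
import socket
import time

# A GELF message is plain JSON: a few mandatory fields plus any number
# of custom fields prefixed with "_".
message = {
    "version": "1.1",
    "host": "web-01",                 # placeholder host name
    "short_message": "user logged in",
    "timestamp": time.time(),
    "level": 6,                       # syslog-style severity (6 = informational)
    "_user": "alice",                 # custom fields keep their structure
    "_action": "login",
    "_duration_ms": 42,
}

# Ship it to a Graylog GELF UDP input (address and port are assumptions).
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(json.dumps(message).encode("utf-8"), ("graylog.example.com", 12201))
```

With Docker the same thing happens per container via the built-in driver, along the lines of `docker run --log-driver gelf --log-opt gelf-address=udp://graylog.example.com:12201 ...` (the address is a placeholder).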
We've also tried the GELF logging modules for Python and the Apache HTTP Server. A module for Log4J2 is also available, although we haven't tried it yet.
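As a rough sketch of what the Python side can look like, assuming the graypy package (one of the available GELF modules) and its GELFUDPHandler class, with placeholder host and port:

```python
import logging

import graypy  # assumed GELF module; typically installed with `pip install graypy`

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)

# Send records to a Graylog GELF UDP input (host and port are placeholders).
logger.addHandler(graypy.GELFUDPHandler("graylog.example.com", 12201))

# Values passed via `extra` should end up as separate indexed fields in Graylog.
logger.info("user logged in", extra={"user": "alice", "duration_ms": 42})
```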