Creating Usable Log Messages: Getting Timestamps Right

While the timestamp is probably the most trivial thing a log message needs to have, it’s surprising how many different ways the use of a timestamp can go wrong in a log output. In the Creating usable logs series, we discuss the different aspects and essential elements of the log messages. This post covers timestamps - the first post of the series was about log levels.

Logging is a critical aspect of application development, providing invaluable insights into an application's behavior. Beyond aiding in debugging and troubleshooting, logs play a pivotal role in bolstering an application's security.

Essential elements of a log timestamp

Every timestamp in your logs should contain the following information:

Date
Time
Timezone offset (and not timezone name!)
A fraction of the second

Who uses the log timestamp?

Logs are both processed by computers and are read by humans as well. Logs are sometimes stored in fancy observability applications, in other cases they are just in plain text files.

In all cases, the timestamp is always used by both humans and computers. So the critical requirements of a log timestamp are as follows:

Contains all the essential information (date, time, timezone offset, fraction of a second)
Easily processed by computers (when we have logs processed by tools)
Readable by humans (on the UI we expect humans to use, which might be a text file)

We use different timestamps depending on the output media. We might want to use a UNIX timestamp when producing the log intended for log aggregation, and a more readable representation when using a text file used by humans.

If a log output serves both purposes, for example, a text file used by operators as well as collected by your logging agent, err on the side of humans and use a more readable timestamp.

How many timestamps does a log record have?

When collected, a log message may have multiple timestamps:

The time it was generated by the application (this is the most accurate at least as long as the clock source is good).
The time it was received by the log collection mechanism (there’s a slight delay here, but the log collection infrastructure may have a better clock source).
The time it was finally written to disk (this is useful to measure the delay between the creation of a log message and its commitment to disk).

Common errors in log timestamps

Sometimes crucial information is missing from the timestamps. For example:

BSD syslog timestamps don't include the year
Some messages use only two digits for years
Timestamps may lack timezone information

Locales are easy to do wrong:

Don’t ever localize timestamps in log messages
Don't use names only for things like months, even in textual timestamps. If you must absolutely must, use English names, but it's better to use numeric values.
No AM/PM!

Things to do

Timezone:

Use GMT everywhere to make things easier.
Use timezone offsets and not timezone names! (For example, -04:00 instead of EST.)
Even GMT needs to be explicit in the timestamp, for example, include the "Z" in ISO8601.

Granularity:

Use at least millisecond granularity. Microseconds are useful, nanoseconds are probably an overkill.
More granularity in timestamps allows more parallel processing of log messages, as the ordering can be restored simply by sorting on the timestamp value.

Precision:

Timestamps are only as good as the clock that generates them. Although a completely synchronous clock is not needed across your entire infrastructure, something like NTP is quite useful both for producing and aggregating logs.

Which timestamp format to use

The best way is to use the ISO8601 format: YYYY-MM-DDTHH:MM:SS.sss+ZZ:ZZ

The reasons:

Easy to read for both humans and computers
All timestamps are of the same length (so easy to cut them out)
Easy to search for ranges even with grep
Orders lexically
It’s a standard

Why avoid UNIX timestamps for log files?

UNIX timestamps are unreadable to humans
Timezone is not explicit, only implied (GMT), but sometimes incorrectly
Granular formats are not standard, although often represented as a decimal fraction (e.g. 1702335956.123542)
Text processing tools like grep cannot handle this format

There’s really no reason to use anything other than ISO8601, except if you are 100% certain that a specific output is never used by humans.

Conclusion

Timestamps play a crucial role in log messages, serving as a fundamental element for both human readability and computer processing. A well-crafted timestamp includes date, time, timezone offset, and a fraction of a second. Striking a balance between human readability and machine processing is essential, especially when logs serve dual purposes.

We recommend the ISO8601 (YYYY-MM-DDTHH:MM:SS.sss+ZZ:ZZ) timestamp format, as it balances readability, consistency, and searchability for both humans and computers. Avoid UNIX timestamps, which lack readability and explicit timezone information, unless you are certain that the output is exclusively for machine consumption.

If you want to learn more about how we manage logs, check out our product page.