Making sense of logs is not an easy task. Log management solutions, such as Sematext Cloud, accept data from multiple sources. Those sources can have different log event structures and provide different levels of granularity. They may not follow common logging good practices, which makes it hard to extract meaning from them.
Because of that, it is important that the applications we develop follow best practices. One of those is keeping log levels meaningful. That allows the person who reads the logs and tries to make sense of them to understand the importance of each message they see in the text files or in one of those awesome observability tools out there.
What Is a Logging Level?
A log level or log severity is a piece of information telling how important a given log message is. It is a simple, yet very powerful way of distinguishing log events from each other. If the log levels are used properly in your application, all you need to do is look at the severity first. It will tell you whether you can continue sleeping during the on-call night or whether you need to jump out of bed right away and hit another personal best in running between your bedroom and the laptop in the living room.
You can think of the log levels as a way to separate the critical information about your system state from the information that is purely informative. Log levels help reduce both information noise and alert fatigue.
The History of Log Levels
Before describing the log levels themselves, it is good to know where they come from. It all started with Syslog. In the 80s, Sendmail, a mailer daemon project developed by Eric Allman, required a logging solution. This is how Syslog was born. It was rapidly adopted by other applications in the Unix-like ecosystem and became a standard. By the way, at Sematext we support the Syslog format in Sematext Cloud Logs.
Syslog came with the idea of severity levels, which are now defined in the Syslog standard. From most to least severe, they are: Emergency, Alert, Critical, Error, Warning, Notice, Informational, and Debug.
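As a rough illustration, the severities of the Syslog standard (RFC 5424) can be written down together with their numeric codes, where a lower number means a more severe event. The class and helper names in this sketch are hypothetical; only the codes 0 to 7 and the level names come from the standard.

```python
from enum import IntEnum


class SyslogSeverity(IntEnum):
    """Numeric severity codes from the Syslog standard (RFC 5424)."""

    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def is_at_least(severity: SyslogSeverity, threshold: SyslogSeverity) -> bool:
    """Return True when `severity` is as severe as `threshold` or more so.

    In Syslog a smaller number means a more severe event,
    so the comparison is inverted compared to what you might expect.
    """
    return severity <= threshold


# A CRITICAL event passes an ERROR threshold, a NOTICE does not pass WARNING.
print(is_at_least(SyslogSeverity.CRITICAL, SyslogSeverity.ERROR))
print(is_at_least(SyslogSeverity.NOTICE, SyslogSeverity.WARNING))
```

Keep the inverted ordering in mind when mapping Syslog severities to a logging framework, because many frameworks use larger numbers for more severe levels.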
After the 80s, programming languages evolved and different logging frameworks were introduced. Nowadays most programming languages have at least one logging framework, allowing you to save data in various formats like JSON. In most cases you can ship data to different destinations like a text file, Syslog, or Elasticsearch. But apart from the format and the possible destinations, there is one thing that is common to the majority of them: the level of the log event.
Log Level Hierarchy: What Are the Most Common Logging Levels & How to Choose Them?
In most logging frameworks you will encounter all or some of the following log levels: TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
The names of some of those give you a hint about what they are for. However, let’s discuss each of them in greater detail.
TRACE – the most fine-grained information only used in rare cases where you need the full visibility of what is happening in your application and inside the third-party libraries that you use. You can expect the TRACE logging level to be very verbose. You can use it for example to annotate each step in the algorithm or each individual query with parameters in your code.
DEBUG – less granular than the TRACE level, but still more than you will need in everyday use. The DEBUG log level should be used for information that may be needed for diagnosing issues and troubleshooting, or when running the application in a test environment to make sure everything is running correctly.
INFO – the standard log level indicating that something happened, the application entered a certain state, etc. For example, a controller of your authorization API may log at the INFO level which user requested authorization and whether the authorization was successful. Information logged at the INFO level should be purely informative, and not looking at it on a regular basis shouldn’t result in missing anything important.
WARN – the log level that indicates that something unexpected happened in the application, a problem, or a situation that might disturb one of the processes. But that doesn’t mean that the application failed. The WARN level should be used in situations that are unexpected, but the code can continue the work. For example, a parsing error occurred that resulted in a certain document not being processed.
ERROR – the log level that should be used when the application hits an issue preventing one or more functionalities from working properly. The ERROR log level can be used when one of the payment systems is not available but there is still an option to check out the basket in the e-commerce application, or when your social media login option is not working for some reason.
FATAL – the log level that tells you the application encountered an event or entered a state in which one of the crucial business functionalities is no longer working. A FATAL log level may be used when the application is not able to connect to a crucial data store like a database, or when all the payment systems are unavailable and users can’t check out their baskets in your e-commerce application.
To summarize what we know about each of the logging levels:
|Log Level|Importance|
|Fatal|One or more key business functionalities are not working and the whole system doesn’t fulfill its business purpose.|
|Error|One or more functionalities are not working, preventing parts of the application from working correctly.|
|Warn|Unexpected behavior happened inside the application, but it is continuing its work and the key business features are operating as expected.|
|Info|An event happened; the event is purely informative and can be ignored during normal operations.|
|Debug|A log level used for events considered to be useful during software debugging when more granular information is needed.|
|Trace|A log level describing events that show the step-by-step execution of your code. They can be ignored during standard operation, but may be useful during extended debugging sessions.|
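To make the levels more concrete, here is a minimal sketch that emits one event per level using Python's standard `logging` module. Some assumptions: Python names these levels WARNING and CRITICAL rather than WARN and FATAL (`logging.FATAL` is an alias of `logging.CRITICAL`), and it has no built-in TRACE level, so the sketch registers one with a hypothetical numeric value of 5. The logger name and all messages are made up for illustration.

```python
import io
import logging

# Python has no TRACE level; register one below DEBUG (10).
# The value 5 is an arbitrary choice made for this sketch.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

# Write to an in-memory stream so the output is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("shop.checkout")  # hypothetical logger name
logger.setLevel(TRACE)      # let every level through
logger.addHandler(handler)
logger.propagate = False    # keep the events out of the root logger

logger.log(TRACE, "entering calculate_total(basket_id=%s)", 42)
logger.debug("basket 42 contains 3 items")
logger.info("user alice checked out basket 42")
logger.warning("could not parse coupon 'SPRING', skipping it")
logger.error("payment provider A is unavailable, using provider B")
logger.critical("cannot connect to the orders database")

output = stream.getvalue()
print(output, end="")
```

Each line of the output starts with the level name, which is exactly the piece of information a person or a log analysis tool can later filter and alert on.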
Making Sense of Log Levels
The next question you are probably already asking yourself is how to make use of log levels in your log events. There are two main things to think of, especially when using a log analysis solution like Sematext Logs.
First of all, filtering. You can filter your logs to show only those with a given log level. For example, internally at Sematext, we use the severity name. Yes, we do like Syslog, and in fact we even support the Syslog format when shipping logs to Sematext Cloud. So when I need to look for errors in our logs, I can easily do that by filtering on severity.
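Outside of any particular tool, the same idea can be sketched in a few lines of Python. This assumes log events that are already parsed into dictionaries carrying a `severity` field; the field name, the helper function, and the sample events are all hypothetical.

```python
# Hypothetical, already parsed log events.
events = [
    {"severity": "info", "message": "user alice logged in"},
    {"severity": "error", "message": "payment provider A timed out"},
    {"severity": "warn", "message": "coupon could not be parsed"},
    {"severity": "error", "message": "social login failed"},
]


def filter_by_severity(events, severity):
    """Return only the events whose severity matches, ignoring case."""
    wanted = severity.lower()
    return [e for e in events if e.get("severity", "").lower() == wanted]


errors = filter_by_severity(events, "ERROR")
for event in errors:
    print(event["message"])
```

The filter only works if the application assigned the levels consistently in the first place, which is why the choice of level matters at the moment the event is logged.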
The second thing is the alerting functionality. Alerting is a very powerful tool, but it can be overwhelming. Alerts should be actionable. That means an alert should be raised only when someone needs to take action on it, for example, removing files from a device to avoid hitting 100% disk utilization.
If your application logs use the log levels well, you can create alerts on all CRITICAL log level messages (CRITICAL is the Syslog-style name for the level we called FATAL above). As we already mentioned, this level means that a crucial part of the application is not working and the business functionality is not being delivered, so a total failure. You can potentially create alerts on ERROR log level messages as well, but that highly depends on the functionality. For example, in Sematext Cloud, we can easily create an alert for any log event that has a CRITICAL log level.
What’s more, we can easily connect various notification hooks to such an alert to be notified using our preferred software.
But keep in mind: use the log levels wisely. Giving a CRITICAL log level to a WARN-level log event will result in an alert and notifications to the people who are on call. If such situations are not actually close to being critical, your alerts will soon start being ignored, which is just a step away from ignoring real CRITICAL situations and having the system in a non-working state for hours.
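The principle of alerting only on the most severe level can also be sketched at the application side with Python's standard `logging` module. This is not how Sematext Cloud alerts work internally; it is a minimal illustration in which `AlertHandler`, the logger name, and the `sent` list standing in for a real notification hook are all hypothetical.

```python
import logging


class AlertHandler(logging.Handler):
    """Passes CRITICAL (and more severe) records to a notification callback."""

    def __init__(self, notify):
        # The handler level makes the logging module drop anything
        # less severe than CRITICAL before emit() is called.
        super().__init__(level=logging.CRITICAL)
        self._notify = notify

    def emit(self, record):
        self._notify(f"[{record.levelname}] {record.getMessage()}")


sent = []  # stands in for a real hook such as a chat or pager integration

logger = logging.getLogger("shop.payments")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(AlertHandler(sent.append))

logger.warning("retrying payment provider A")
logger.error("payment provider A is unavailable")
logger.critical("all payment providers are unavailable")

print(sent)
```

Only the last event reaches the callback. If the warning had been logged as critical, it would have produced a notification too, which is exactly the kind of noise that leads to alerts being ignored.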
How Do Log Levels Work?
You may ask yourself a question: how do the log levels work? Well, on their own they don’t work at all. They are labels that serve an informational purpose. They are there, in the log message, to say how important a given event is. Should you jump out of your cozy bed right away to fight a critical issue, or can you leave it till morning, knowing the world will not end in the few hours before the patch is ready and deployed?
When combined with a logging framework, log levels let you easily limit the information that is written to the log destination of your choice. When you set a log level in your logging framework, all less important levels are effectively ignored. So if you set the root logging level to WARN, you will only get log events with the WARN, ERROR, and FATAL levels.
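Here is a minimal sketch of that threshold behavior using Python's standard `logging` module, where WARN and FATAL are spelled WARNING and CRITICAL. The logger name `shop.api` is hypothetical; it has no level of its own, so it inherits the level of the root logger.

```python
import io
import logging

# Capture the written events in memory, printing only the level name.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)  # the root logging level acts as the threshold

# No level is set on this logger, so the root level applies to it.
app = logging.getLogger("shop.api")

for level in (logging.DEBUG, logging.INFO, logging.WARNING,
              logging.ERROR, logging.CRITICAL):
    app.log(level, "event at level %s", logging.getLevelName(level))

written = stream.getvalue().split()
print(written)
```

The DEBUG and INFO events are dropped before they reach the destination, so lowering the threshold later is the only way to get them; they are not stored anywhere in the meantime.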
When combined with a log analysis solution, your log level can play an informational role, but it can also work as a filter, effectively narrowing down the information you are looking for.
Choosing the right log level when developing an application is crucial. It may not be visible while coding, but it will certainly benefit everyone who uses the logs when looking for issues, troubleshooting, creating alerts, or just looking into them regularly. The ease of filtering, alerting, and finding the right information is extremely important when dealing with the growing volume of log messages that today’s systems produce. Although products like Sematext Cloud Logs help you deal with that, we also need to think about what we ship to them. As they say: garbage in, garbage out. Keep that in mind and make your logs useful. As we have just discussed, one of the key steps toward that is using proper log levels. Good luck!