Keeping your web application running smoothly is crucial for the success of your business. When customers run into performance issues, both the perceived reliability of your product and customer satisfaction suffer. This can increase churn, which in turn costs you revenue. As a Site Reliability Engineer (SRE) or DevOps professional, your job is to keep the product reliable for end users.
Ensuring product reliability comes down to monitoring your systems in real time, pinpointing the root cause of a problem when it happens, and notifying the teams responsible for solving it.
Here are a few examples of why web applications may run slow:
- Scheduled maintenance jobs that are causing application slowness
- Network traffic spikes that are causing servers to overload
- Network latency and slow server performance
- Poorly written application code, outdated packages, database issues like slow running queries, wrong usage of indexes, and so on
- Updates in your systems
How to solve application slowness:
One approach is to detect exactly when the slowness started, which gives you a starting point. You can then check whether any updates were made to your systems just before the issues began, and decide whether to roll back the update or make sure all system components are compatible with the new version.
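To make the idea of "detecting when the slowness started" concrete, here is a minimal Python sketch. It is not how Sematext detects anomalies; it assumes you have exported a time-ordered list of response time samples, and the function name, the baseline size, the 2x factor, and the "three slow samples in a row" rule are all illustrative choices.

```python
from statistics import median


def find_slowdown_start(samples, baseline_size=5, factor=2.0, sustain=3):
    """Return the timestamp at which response times first stay above
    factor * baseline median for `sustain` consecutive samples, or None.

    samples: list of (timestamp, response_time_ms) tuples in time order.
    All parameters are illustrative defaults, not recommended values.
    """
    if len(samples) < baseline_size + sustain:
        return None  # not enough data to build a baseline and confirm a run

    # Use the median of the first samples as the "normal" response time.
    baseline = median(rt for _, rt in samples[:baseline_size])
    limit = baseline * factor

    run_start = None
    run_length = 0
    for ts, rt in samples[baseline_size:]:
        if rt > limit:
            if run_length == 0:
                run_start = ts  # remember where this slow run began
            run_length += 1
            if run_length >= sustain:
                return run_start
        else:
            run_length = 0  # a single fast sample resets the run
    return None


# Hypothetical samples: one isolated slow request at 10:06,
# then a sustained slowdown starting at 10:08.
samples = [
    ("10:00", 120), ("10:01", 130), ("10:02", 125), ("10:03", 118),
    ("10:04", 122), ("10:05", 127), ("10:06", 410), ("10:07", 119),
    ("10:08", 390), ("10:09", 450), ("10:10", 470),
]
print(find_slowdown_start(samples))  # prints 10:08
```

Requiring several consecutive slow samples keeps a single outlier (the 10:06 request above) from being reported as the start of the problem.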
Another solution is scaling up your systems, but this is not always the right approach. The source of the problem might be something else, and if you blindly scale up every time you hit a performance problem, you might be spending money unnecessarily.
That’s why it is essential to figure out the root cause first.
For any type of application slowness, you need monitoring tools to investigate the issue, find the source of the problem, and assign the right teams to solve it in real time. In this post, we will show you how to do that with Sematext.
Troubleshooting Slow Web Applications with Sematext
The first and most important thing is realizing you have a problem. To catch it, you want to monitor your client endpoints, check response times, and get alerted when they exceed a certain threshold or deviate from the norm. You also need to make sure each call returns a successful response code and, if it does not, capture the error returned to the client. With Sematext Synthetics you can actively monitor APIs, Web URLs, websites, and user journeys/click flows from multiple locations around the globe.
You can call each of your endpoints at a certain interval, monitor availability, average response time, extract metrics from responses, chart them, and define alert rules.
Create an endpoint monitor
With Sematext Synthetics, you can monitor your endpoints using HTTP Monitors or Browser Monitors. HTTP Monitors check the availability and performance metrics of a single URL, while Browser Monitors let you write User Journey Scripts that simulate user behavior.
To create a monitor, first, create a Synthetics App and choose the type of the monitor.
You will be asked to enter an endpoint URL or write a User Journey Script, and select the time interval and locations to run the monitor. HTTP Monitors have optional request settings such as Authentication, Header, Body, Query Params, and Cookies.
The next step is to configure alert rules so you get notified through various notification hooks when something goes wrong. Alert conditions can be configured on the response fields and metrics, and they are evaluated for every run result. All conditions must pass for a run to be declared as passing. If any condition fails, the run fails and the monitor is marked as failing. Check out our documentation to see the supported condition types for HTTP and Browser Monitors as well as the metrics available for both.
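The "all conditions must pass" rule can be illustrated with a short Python sketch. This is only a model of the logic described above, not Sematext's condition format: the `(field, operator, expected)` tuples and the field names `status_code` and `response_time_ms` are hypothetical.

```python
import operator

# Comparison operators a condition may use in this sketch.
OPERATORS = {
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
}


def evaluate_run(result, conditions):
    """Evaluate every condition against one run result.

    Returns (passed, failed), where `failed` lists each failing condition
    together with the actual value observed. A missing field counts as a failure.
    """
    failed = []
    for field, op, expected in conditions:
        actual = result.get(field)
        if actual is None or not OPERATORS[op](actual, expected):
            failed.append((field, op, expected, actual))
    # The run passes only when no condition failed.
    return (not failed, failed)


# Hypothetical conditions and run results.
conditions = [("status_code", "==", 200), ("response_time_ms", "<", 500)]
ok_run = {"status_code": 200, "response_time_ms": 180}
slow_run = {"status_code": 200, "response_time_ms": 940}

print(evaluate_run(ok_run, conditions))
print(evaluate_run(slow_run, conditions))
```

Returning the failing conditions along with the verdict mirrors what you look for in a failed run: which condition caused the failure and what value triggered it.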
After you create your monitors, you will see the list of all your monitors along with their last 25 run results, and availability information.
By clicking on a single monitor, you can see the response times and network timings per location, along with the result of each run at the interval you set.
By clicking on a single run, you can see the details, understand how your website performed at a specific time, and check the cause of the failure. The run details differ depending on the monitor type.
Run details of HTTP Monitors
The details include which condition caused the failure, the response body and headers, and the request configuration parameters. You will see the DNS lookup time, Socket connect time, TLS handshake time, Time to first byte, and HTTP download time of the run result within the details.
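Once you have the timing breakdown of a run, a quick way to read it is to ask which phase dominates the total. The Python sketch below assumes you have copied the five timings into a dictionary; the key names and the numbers are made up for illustration and are not an export format of the product.

```python
def slowest_phase(timings):
    """Return (phase_name, share_of_total) for the largest timing phase.

    timings: dict mapping a phase name to its duration in milliseconds.
    """
    total = sum(timings.values())
    name = max(timings, key=timings.get)
    share = timings[name] / total if total else 0.0
    return name, round(share, 2)


# Hypothetical timings (in ms) for a single HTTP Monitor run.
run = {
    "dns_lookup": 12,
    "socket_connect": 25,
    "tls_handshake": 48,
    "time_to_first_byte": 610,
    "http_download": 35,
}
print(slowest_phase(run))  # prints ('time_to_first_byte', 0.84)
```

As a rule of thumb, a dominant time to first byte usually points at server-side processing, while a dominant DNS lookup or connect time points at the network path rather than the application code.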
Run details of Browser Monitors
The details include which condition caused the failure, logs produced when running the User Journey Script along with resources used, and the waterfall diagram. You will also see the response time, transfer size, time to first byte, and web vital metrics of the run result within details.
The resources tab contains charts broken down by each resource type hit within the user journey script. It allows you to monitor the performance of each resource used in your website and detect which resources are causing the slowness of your application.
The Browser monitor navigates to the specified URL, after which it loads the page and all its content, making additional requests as a standard browser would to fetch all the required resources. The waterfall diagram lists each of the resources fetched when loading the page specified in the User Journey script and shows the response time of each of these resources. Thus, you can see all the details and determine which part of the application is causing the slowness.
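The same reading of the resources tab and the waterfall can be sketched in a few lines of Python. The example assumes a list of fetched resources with a URL, a resource type, and a duration; this structure and its field names are hypothetical, chosen only to show the grouping and ranking.

```python
from collections import defaultdict


def summarize_resources(entries, top_n=2):
    """Return (total duration per resource type, URLs of the slowest resources)."""
    by_type = defaultdict(int)
    for entry in entries:
        by_type[entry["type"]] += entry["duration_ms"]
    # Rank individual resources by how long they took to load.
    slowest = sorted(entries, key=lambda e: e["duration_ms"], reverse=True)[:top_n]
    return dict(by_type), [e["url"] for e in slowest]


# Hypothetical waterfall entries for one page load.
entries = [
    {"url": "/index.html", "type": "document", "duration_ms": 180},
    {"url": "/app.js", "type": "script", "duration_ms": 920},
    {"url": "/vendor.js", "type": "script", "duration_ms": 310},
    {"url": "/hero.png", "type": "image", "duration_ms": 450},
    {"url": "/style.css", "type": "stylesheet", "duration_ms": 60},
]
totals, slowest = summarize_resources(entries)
print(totals)
print(slowest)
```

Here scripts account for most of the load time, and `/app.js` is the first resource worth investigating.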
Dig deeper into troubleshooting with Sematext
Sematext Cloud is an all-in-one platform, meaning that on top of monitoring your endpoints and websites, you can track the usage of referenced resources as well as the metrics and logs of the services and hosts running these endpoints. Seeing all of this in a single solution helps you correlate data and detect interdependencies between components when you work with a distributed application.
Imagine you detect an issue in your endpoint; to drill down to the root cause you might want to check the metrics of a service that is running your application.
When you start receiving alerts from your endpoints based on the conditions you’ve set, you can check the resource usage metrics of the server hosting your application. See whether the slowdown is caused by an unexpected increase in network traffic. You can check CPU and RAM usage for the same period, see whether it is just a temporary spike or has been going on for a while, and decide whether to scale up your systems.
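The distinction between a temporary spike and a sustained increase can be expressed as a simple rule. The Python sketch below is one possible heuristic, not a built-in feature: the 80% threshold and the "60% of samples over the threshold" cutoff are arbitrary assumptions you would tune for your own systems.

```python
def classify_load(cpu_percent, threshold=80.0, sustained_fraction=0.6):
    """Classify a window of CPU samples as 'normal', 'spike', or 'sustained'.

    cpu_percent: list of CPU usage samples (0-100) for the window under review.
    """
    if not cpu_percent:
        return "normal"
    over = sum(1 for value in cpu_percent if value > threshold)
    if over == 0:
        return "normal"
    # Most of the window above the threshold suggests sustained pressure.
    if over / len(cpu_percent) >= sustained_fraction:
        return "sustained"
    return "spike"


print(classify_load([35, 40, 95, 42, 38]))      # prints spike
print(classify_load([85, 90, 88, 92, 79, 91]))  # prints sustained
```

A "sustained" result is a stronger argument for scaling up than a "spike", which may resolve on its own or point to a scheduled job.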
You can use the Split Screen correlation and data-pivoting feature to compare failed runs with application metrics to avoid switching contexts when you are troubleshooting.
Once you make sure the network traffic and resource usage of your systems are normal, you can look at database logs and the performance of the servers, containers, Kubernetes pods, and other components running your application. See if there is a spike in error logs, check the error messages, and address the issue. You can also check the query rates for your database and detect if anything looks abnormal. Review your indexes and queries, and optimize your CRUD operations.
Another key thing to monitor is deployment events. Problems with applications often occur after deployments. You can ship deployment events and find out the starting point of an issue with Sematext Events.
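Correlating an issue with deployments boils down to finding the most recent deployment before the slowness began. The Python sketch below shows that lookup on a plain list of events; the timestamps and labels are invented, and the code does not call any Sematext API.

```python
from bisect import bisect_right
from datetime import datetime


def last_deploy_before(deploys, onset):
    """Return the latest (time, label) deployment at or before `onset`, or None.

    deploys: list of (datetime, label) tuples sorted by time.
    """
    times = [t for t, _ in deploys]
    index = bisect_right(times, onset)
    return deploys[index - 1] if index else None


# Hypothetical deployment events and slowness onset.
deploys = [
    (datetime(2024, 1, 10, 9, 0), "api v1.4.0"),
    (datetime(2024, 1, 10, 14, 30), "api v1.4.1"),
    (datetime(2024, 1, 11, 8, 15), "api v1.5.0"),
]
onset = datetime(2024, 1, 10, 15, 5)

suspect = last_deploy_before(deploys, onset)
if suspect:
    deployed_at, label = suspect
    print(f"{label} was deployed {onset - deployed_at} before the slowdown")
```

A short gap between a deployment and the onset does not prove causation, but it tells you which change to review or roll back first.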
Once you figure out the root cause, you can assign the right team to address the issue and compare your performance metrics and logs afterward to make sure the problem is solved. Sematext Cloud also notifies you when metrics you’ve defined in alert rules are back to normal and everything is up and running again.
In conclusion, Sematext Cloud provides a powerful way of troubleshooting web application slowness in real time, from the starting point down to the root cause. With the ability to extract metrics from your endpoints, create charts, set up alert rules, and combine these with the metrics and logs of the services hosting your applications, you gain full-stack visibility into your whole infrastructure. This lets you take proactive measures to resolve any issues that arise.