When I’m using any application, I want it to be fast. Fast applications make me happy. And I’m sure it’s the same with everybody. If you own any application that works with the internet, it’s essential to focus on server performance. There are a lot of things to take care of to have a well-performing server. But to improve something, you need to know where it stands in the first place. Therefore, monitoring your server is the first step to increasing your server’s performance.
When talking about what to monitor to understand a server’s performance, there are many metrics that might come up. We’ll focus on just one in this post: server CPU usage. We’ll start by explaining what server CPU usage is, and then we’ll discuss why and how to monitor it.
What Does Server CPU Usage Mean?
Everything that happens on the server is a task for the system. Each task breaks down into processes that the server executes. Processes vary in complexity and in how quickly they can be completed, so the CPU spends a different amount of time on each one. In other words, the CPU is being used to execute the process. CPU usage is the percentage of time that the CPU spends executing its tasks.
When your system is up and running, a CPU can be in three states: idle, busy, or waiting for I/O.
When the CPU is in the idle state, it’s doing nothing. It’s just waiting for a task to be assigned to it. You’re unlikely to see the CPU stay idle for long on a single-processor system. Even if you aren’t running any application or doing anything, the CPU is still used to execute operating system tasks. But if your system has multiple processors, you could see one or more of them idle. This could be because only one or a few processors are being used and the others have nothing to do.
The opposite of the idle state is the busy state, which is when the CPU is being used to execute a process. A CPU is responsible for different types of tasks. During the process of executing tasks, a CPU does four things:
- Fetch: The instructions to complete the task are stored in the memory. The CPU has to get these instructions from the memory to understand what to do.
- Decode: The instructions arrive as binary machine code. Programs written in languages such as C, Java, or Python are first translated into machine code by a compiler or interpreter, because the CPU can’t understand those languages directly. The CPU then decodes each machine instruction to work out which operation to perform and on what data.
- Execute: This is when the CPU is actually doing the task.
- Store: Once the CPU completes the task, it has to report the outcome. The results of the execution are written back to memory.
Waiting for I/O is the third state, in which the CPU is not completely idle but is not busy, either. While executing a process, a CPU might have to output some data to another component or process. Or it might have to wait for data from another process or component. In that case, it waits for the I/O operation to complete so that it can resume executing its task.
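To make the three states concrete, here is a minimal sketch that splits elapsed CPU time into idle, busy, and waiting for I/O. It assumes a Linux server, where the first line of /proc/stat holds cumulative tick counters in the order shown in FIELDS. The two snapshot lines are made-up values for illustration, and the helper names are hypothetical.

```python
# Order of the first eight counters on the aggregate "cpu" line of Linux's /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def state_shares(before, after):
    """Percentage of elapsed CPU time spent idle, busy, and waiting for I/O."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        # No ticks elapsed between the snapshots, so there is nothing to report.
        return {"idle": 0.0, "busy": 0.0, "iowait": 0.0}
    idle = delta["idle"]
    iowait = delta["iowait"]
    busy = total - idle - iowait  # everything that is neither idle nor I/O wait
    return {
        "idle": 100.0 * idle / total,
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * iowait / total,
    }


# Two illustrative snapshots taken some time apart (not real measurements).
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
second = parse_cpu_line("cpu 250 0 100 1500 150 0 0 0")
print(state_shares(first, second))
```

On a real Linux machine, the two lines would come from reading the first line of /proc/stat twice, a second or so apart.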
CPU usage measures how much of the time the CPU spends in a non-idle state. Although CPU usage is measured as a percentage, you could see CPU usage values greater than 100%. This doesn’t necessarily mean that the CPU is overloaded. For example, you would see CPU usage greater than 100% in multiprocessor systems when a tool reports the sum of the usage percentages of the individual CPUs. So, if you have two processors and one of them has a CPU usage of 60% and the other has a CPU usage of 50%, the total CPU usage value would be 110%.
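The arithmetic for the two-processor example can be sketched as follows. The function names are hypothetical, and the average is included only to show how the same readings look when normalized back to a 0 to 100% scale.

```python
def total_cpu_usage(per_cpu_percentages):
    """Sum of the individual CPU usage percentages (can exceed 100)."""
    return sum(per_cpu_percentages)


def average_cpu_usage(per_cpu_percentages):
    """Usage averaged across CPUs, which always stays between 0 and 100."""
    return total_cpu_usage(per_cpu_percentages) / len(per_cpu_percentages)


readings = [60, 50]  # the two processors from the example above
print(total_cpu_usage(readings))    # summed view
print(average_cpu_usage(readings))  # per-processor average
```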
What Causes High CPU Usage?
Finding the core of the problem helps you implement better solutions for high CPU usage. There can be various reasons for high CPU usage, depending on the use case. Here are the most common causes.
Processes With High CPU Requirement
Some programs require a lot of CPU resources. If you try to run a high-end video game on a low-specification system, CPU usage will go through the roof. Similarly, a server can run a single demanding process, or several processes that together use a lot of CPU. On servers, simulations and the many services needed to keep the server running can cause high CPU usage.
Processes can be divided into two main categories: system processes and application processes. System processes are the processes that are needed in order to keep your system running. Application processes are the processes that you’d use for a specific purpose. When these processes keep running in the background, they eat up CPU resources.
A lot of application processes keep running in the background even after you close the application window. This is less likely to happen on a server, because a server is usually maintained and cleaned up periodically so that it runs only the processes it needs, but it’s still possible.
Malware (malicious software) is a program that malicious actors use to attack your system or to perform illegitimate actions on it. To stay hidden, malware often doesn’t use much CPU at first, but once it starts performing malicious actions, it can cause high CPU usage. I witnessed an incident where malware had been introduced onto a server a week earlier, and one day it started transferring all the sensitive data from the server to cloud storage. The malware didn’t use much CPU while it was setting things up and finding critical data. But once it found everything it needed, there was a huge spike in CPU usage due to the data transfer.
It’s common to have custom code running on servers for specific tasks. If that code is not optimized, it might end up using a lot of CPU resources. Unoptimized loops and recursion are some of the most common reasons that such code causes high CPU usage.
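As a small illustration of unoptimized recursion, the sketch below compares a naive recursive Fibonacci function with a cached version. The example is hypothetical and is not taken from any particular server workload. The calls list exists only to count how many times the naive version runs.

```python
from functools import lru_cache


def fib_naive(n, calls):
    """Naive recursion: recomputes the same subproblems over and over."""
    calls[0] += 1  # count every invocation so the wasted work is visible
    if n < 2:
        return n
    return fib_naive(n - 1, calls) + fib_naive(n - 2, calls)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same recursion, but each distinct n is computed only once."""
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


calls = [0]
print(fib_naive(20, calls), "computed with", calls[0], "calls")
print(fib_cached(20), "computed with", fib_cached.cache_info().misses, "distinct calls")
```

Both versions return the same result, but the naive one makes over twenty thousand calls for n = 20, while the cached one evaluates each value of n only once.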
Common Fixes For High CPU Usage
Based on the common causes above, let’s look at some of the common fixes for high CPU usage.
A system restart is a solution for most computer problems. How would that help in reducing CPU usage? A restart closes all the background processes that were started by some action but that you no longer need. It also clears zombie processes and terminates processes that are running indefinitely due to a malfunction. So restarting your system cleans out all (or at least most) of the unnecessary processes and reduces CPU usage.
End Background Processes
Restarting the system might not be feasible, especially for servers. So, you will have to handle unnecessary processes without restarting the system. To do this, you can list all the running processes, identify which ones are unnecessary, and end them. On Windows, you can do this from Task Manager.
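On a Unix-like server without a graphical task manager, the same "list, identify, end" routine can be sketched in a few lines. This is a minimal sketch, assuming process data in the three-column layout produced by the command ps -eo pid,pcpu,comm. The sample text and process names below are made up, and top_cpu_processes is a hypothetical helper.

```python
def top_cpu_processes(ps_output, limit=3):
    """Return the (pid, cpu_percent, command) rows with the highest CPU share."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, pcpu, command = line.split(None, 2)
        rows.append((int(pid), float(pcpu), command.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)  # busiest first
    return rows[:limit]


# Illustrative output in the layout of `ps -eo pid,pcpu,comm` (not real data).
SAMPLE = """\
  PID %CPU COMMAND
    1  0.0 systemd
  812 42.5 report_job
  955  3.1 nginx
 1203 87.0 stuck_worker
"""

for pid, cpu, command in top_cpu_processes(SAMPLE, limit=2):
    print(pid, cpu, command)
```

On a live system, the text could come from subprocess.run(["ps", "-eo", "pid,pcpu,comm"], capture_output=True, text=True).stdout. Once you have confirmed that a process is unnecessary, os.kill(pid, signal.SIGTERM) asks it to exit.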
Another fix is to deal with startup processes. Some processes are configured to start as soon as your system boots. Some of these are necessary and some are not. You can identify the unnecessary ones and disable them from starting automatically. In Windows, you can find them under the “Startup” tab in Task Manager.
Malware is dangerous, not just in terms of CPU usage but in general. Some malware causes high CPU usage, and anti-malware software could be your best defense against it. Modern anti-malware software is smart and advanced. It can not only identify malware on your system and delete or quarantine it, but it can also prevent malware from getting onto your system in the first place. So by using anti-malware, you not only fix high CPU usage caused by malware but also prevent it.
If you identify unoptimized code as the reason for high CPU usage, then you need to optimize it. You will have to analyze the code, find out which part is consuming CPU resources, and optimize that part.
If you’ve fixed all the other problems and optimized everything and still see high CPU usage, then your CPU requirement really is high. In such cases, there’s no other option than to upgrade your CPU resources.
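One way to find the expensive part of a Python program is to run it under a profiler. The sketch below uses the standard library’s cProfile and pstats modules. The functions handle_request and wasteful_lookup are hypothetical stand-ins for your own code, and the list sizes are arbitrary.

```python
import cProfile
import io
import pstats


def wasteful_lookup(items, targets):
    """Hypothetical hot spot: scans the whole list for every target."""
    return sum(1 for target in targets if target in items)


def handle_request():
    """Hypothetical entry point that happens to call the hot spot."""
    items = list(range(2000))
    targets = list(range(0, 4000, 2))
    return wasteful_lookup(items, targets)


def profile_report(func, top=5):
    """Run func under cProfile and return the most expensive calls as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)  # rank by cumulative time
    return buffer.getvalue()


print(profile_report(handle_request))
```

In the report, the functions with the largest cumulative time are the first candidates for optimization. Here, turning items into a set would remove the repeated list scans.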
Why Should You Monitor Server CPU Usage?
A server can have two types of jobs to complete on a high level: a job from the user and a job within the system. A job from the user is when a user has requested a service or data from the server. Let’s say you go to YouTube and search for something. YouTube’s server has to fetch all the videos relevant to your search and send it as a response to your request. This transaction of data uses CPU time.
A server has tasks other than just responding to users. It goes without saying that its CPU is being used to run the operating system and web services. But apart from this, servers can be used to run some scripts to process data. One common example is running Ansible playbooks. These playbooks can execute tasks even without the user having to intervene.
Managing Resources and Tasks
If your CPU usage is low, then you’re wasting your resources. And if your CPU usage is very high, the system slows down and might start lagging. You don’t want either of these to happen, which is why it’s important to monitor server CPU usage. Monitoring server CPU usage helps you understand how much your CPU is being used. This helps you decide whether you can give your CPU more jobs or whether you need to add resources to meet demand.
Monitoring server CPU usage helps you manage resources as well as optimize tasks. There are cases where programs take more time than they ideally should. When you see that CPU usage is high even while executing a short, simple program, you know that there’s something wrong with your program. Often, programs are inefficient due to poor programming practice. Monitoring CPU usage can help you find such instances and optimize the code. Hence, monitoring CPU usage is like killing two birds with one stone.
Now that we’re done with the “what” and “why” of CPU usage and learned how beneficial monitoring server CPU usage can be, let’s get to the “how” part.
How to Monitor Server CPU Usage
CPU capacity isn’t like money that you can save up over a period of time and then spend when you want to. Unused CPU is a wasted resource. That’s why you want server CPU usage to be as high as it can go without slowing the system down. To keep it there, you might have to do some tweaking and keep updating the system. To know what to do, you need to know the current status of server CPU usage.
You can check your server CPU usage with a single command. If you have a Windows operating system, open your command prompt and run this command:
wmic cpu get loadpercentage
You should see the CPU usage percentage printed under a LoadPercentage heading.
Based on your requirements, you can query CPU usage periodically, store the data somewhere, and look at it later. Or you can stream CPU usage into a dashboard and monitor it at any time.
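Periodic querying and storing can be sketched as a small loop that writes timestamped readings to a CSV file. This is a minimal sketch: record_samples is a hypothetical helper, and read_usage stands for whatever function returns the current CPU percentage on your system. The demo uses a fake reader with made-up values so that it runs anywhere.

```python
import csv
import io
import time
from datetime import datetime, timezone


def record_samples(read_usage, out_file, count, interval_seconds):
    """Call read_usage() count times and write timestamped rows as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(["timestamp", "cpu_percent"])
    for i in range(count):
        timestamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([timestamp, read_usage()])
        if i < count - 1:
            time.sleep(interval_seconds)  # wait before taking the next sample


# Fake reader for the demo; a real one would query the operating system.
fake_values = iter([12, 18, 95])
buffer = io.StringIO()
record_samples(lambda: next(fake_values), buffer, count=3, interval_seconds=0)
print(buffer.getvalue())
```

On Windows, read_usage could wrap the wmic command shown above and parse the number from its output. Passing an open file instead of the in-memory buffer keeps the readings for later analysis.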
Most operating systems come with built-in tools and utilities for CPU usage monitoring. But these tools have limitations, and they’re usually too basic. Today, we need advanced capabilities to work with monitoring data, and that calls for advanced monitoring tools. Almost every service is online now, and servers play an important role in making these services available. You cannot afford to wait a couple of hours to learn that your server is down due to high CPU usage. You need to continuously monitor CPU usage and other server metrics.
Checking server CPU usage at one point in time is not of much use. You need a solution to monitor server CPU usage continuously. Almost every server operating system comes with a performance monitor where you can see the CPU usage at any given time. When monitoring server CPU usage, you should look for spikes, as sudden changes in CPU usage indicate that something out of the ordinary has happened.
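Spotting spikes in a series of readings can be sketched as a comparison against a recent average. This is a minimal sketch: find_spikes, the five-sample window, and the 30-point threshold are all assumptions chosen for illustration, and only upward jumps are flagged. The readings are made up.

```python
from statistics import mean


def find_spikes(samples, window=5, threshold=30.0):
    """Return (index, value) pairs that exceed the recent average by threshold."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])  # average of the previous samples
        if samples[i] - baseline > threshold:
            spikes.append((i, samples[i]))
    return spikes


readings = [12, 15, 11, 14, 13, 12, 95, 90, 14, 13]  # illustrative CPU percentages
print(find_spikes(readings))
```

A real monitoring setup would tune the window and threshold to the server’s normal load and would usually raise an alert rather than print.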
Along with monitoring CPU usage in real time, you want to store this data so that you can visualize and analyze it at different intervals. To get the most out of CPU usage statistics, you should collect CPU usage data at regular intervals (preferably in real time) by generating logs, and you should visualize it in the form of graphs. We’ve just spoken about one metric, but don’t forget that server CPU usage is not the only thing to monitor. To get everything you need in one place, use Scalyr.
Scalyr is a fast and simple log management and analytics platform. It provides support for different sources, live tailing, log parsers, real-time alerts, and a lot more. To make monitoring server metrics simple, I suggest you try Scalyr and see for yourself how easy monitoring can be.