Here are 10 terminal commands you can run whenever one of your servers has a performance issue. Use them to find out what is causing the high load: check how heavily each system resource is used, and look for errors and saturation metrics.

Some of these commands (mpstat, pidstat, iostat and sar) come from the sysstat package, so make sure it is installed. On Ubuntu, run:

[sourcecode]$ sudo apt-get install sysstat[/sourcecode]

1. uptime

[sourcecode]$ uptime
09:18:20 up 6 days, 8:48, 1 user, load average: 2,30, 1,04, 1,26[/sourcecode]

This command quickly shows you the system load. There are three load averages: 2,30 over the last minute, 1,04 over the last 5 minutes and 1,26 over the last 15 minutes. Comparing them gives you an idea of how the load has changed over time. In the example above, the 1-minute average (2,30) is about twice the 5-minute average (1,04), so the load has risen recently, but you still need other tools such as vmstat to find out what is actually using the system resources.
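
A load average is easier to judge when you compare it with the number of CPUs. The sketch below divides a load value by a CPU count; the helper name load_per_cpu is made up for this example, and the idea that a result above 1 means more demand than CPUs is a common rule of thumb rather than a hard limit.

```shell
# Sketch only: load_per_cpu is a hypothetical helper, not a standard tool.
load_per_cpu() (
    # Accept "2,30" as well as "2.30", since some locales print a decimal comma.
    load=$(printf '%s' "$1" | tr ',' '.')
    cpus="$2"
    LC_ALL=C awk -v l="$load" -v c="$cpus" 'BEGIN { printf "%.2f\n", l / c }'
)

# On a live Linux system: /proc/loadavg always uses a decimal point.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

For the example output above, load_per_cpu 2,30 2 gives 1.15 on a 2-CPU machine, which would suggest slightly more runnable work than the CPUs can handle.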

2. dmesg | tail

[sourcecode]$ dmesg | tail
[ 34.445277] vboxpci: IOMMU not found (not registered)
[ 34.975134] init: plymouth-upstart-bridge main process ended, respawning
[ 34.980011] init: plymouth-upstart-bridge main process (1698) terminated with status 1
[ 34.980028] init: plymouth-upstart-bridge main process ended, respawning
[ 38.559482] EXT4-fs (sda5): recovery complete
[ 38.581487] EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)
[ 38.601262] init: plymouth-stop pre-start process (1848) terminated with status 1
[107356.417426] perf samples too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[115114.143288] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
[176310.990027] hub 2-1:1.0: port 5 disabled by hub (EMI?), re-enabling...[/sourcecode]

This command shows the last 10 messages from the kernel ring buffer; it's always a good idea to check them for errors.
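
When the last 10 lines are not enough, you can filter the whole buffer for suspicious messages. This is a minimal sketch: kernel_problems is a hypothetical helper name, and the list of patterns is an assumption chosen for illustration, so adjust it to the messages you care about.

```shell
# Sketch only: the pattern list below is illustrative, not a complete catalogue.
kernel_problems() (
    # Case-insensitive match; "|| true" keeps the exit status 0 when nothing matches.
    grep -i -E 'error|fail|out of memory|oom|segfault' || true
)

# Live usage (dmesg may need root on some systems):
#   dmesg | kernel_problems | tail
```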

3. vmstat 1

[sourcecode]$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 21764 268096 170760 881224 0 0 1 19 32 19 1 0 98 1 0
0 0 21764 268096 170760 881240 0 0 0 0 47 101 0 0 100 0 0
0 0 21764 268096 170760 881292 0 0 0 0 38 82 0 0 100 0 0[/sourcecode]

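vmstat (short for virtual memory statistics) prints a summary line every second. The first line of output shows averages since boot, not the last second. Columns to check: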

  • r: Number of processes running on CPU and waiting for a turn. This provides a better signal than load averages for determining CPU saturation, as it does not include I/O. To interpret: an “r” value greater than the CPU count is saturation.
  • free: Free memory in kilobytes. If there are too many digits to count, you have enough free memory. The “free -m” command, included as command 7, better explains the state of free memory.
  • si, so: Swap-ins and swap-outs. If these are non-zero, you’re out of memory.
  • us, sy, id, wa, st: These are breakdowns of CPU time, on average across all CPUs. They are user time, system time (kernel), idle, wait I/O, and stolen time (by other guests, or with Xen, the guest’s own isolated driver domain).
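
The two checks above (r greater than the CPU count, and non-zero si or so) can be automated. The following sketch assumes the default vmstat column layout shown in the example, with r in column 1 and si and so in columns 7 and 8; vmstat_flags is a hypothetical helper name.

```shell
# Sketch only: assumes the default vmstat layout (r = column 1, si = 7, so = 8).
vmstat_flags() (
    cpus="$1"
    awk -v cpus="$cpus" '
        $1 ~ /^[0-9]+$/ {    # data rows only; both header rows start with text
            if ($1 > cpus)        print "line " NR ": r=" $1 " exceeds " cpus " CPUs"
            if ($7 > 0 || $8 > 0) print "line " NR ": swapping (si=" $7 ", so=" $8 ")"
        }'
)

# Live usage: five one-second samples, checked against the CPU count.
#   vmstat 1 5 | vmstat_flags "$(nproc)"
```

Keep in mind that the first data row reports averages since boot, so a flag on that row describes the machine's history rather than the current second.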

4. mpstat -P ALL 1

[sourcecode]$ mpstat -P ALL 1
Linux 3.13.0-37-generic (linuxmint) 12/03/2015 _x86_64_ (4 CPU)

09:54:22 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
09:54:23 AM all 6,53 0,00 8,09 0,00 0,00 0,00 0,00 0,00 0,00 85,38
09:54:23 AM 0 4,12 0,00 11,34 0,00 0,00 0,00 0,00 0,00 0,00 84,54
09:54:23 AM 1 16,49 0,00 2,06 0,00 0,00 0,00 0,00 0,00 0,00 81,44
09:54:23 AM 2 2,11 0,00 4,21 0,00 0,00 0,00 0,00 0,00 0,00 93,68
09:54:23 AM 3 3,19 0,00 14,89 0,00 0,00 0,00 0,00 0,00 0,00 81,91[/sourcecode]

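This command prints a CPU time breakdown for each CPU, so you can check for an imbalanced load. A single busy CPU while the others sit idle can point to a single-threaded application.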

5. pidstat 1

[sourcecode]$ pidstat 1
Linux 3.13.0-68-generic (linuxlove) 12/03/2015 _x86_64_ (2 CPU)

10:11:46 AM UID PID %usr %system %guest %CPU CPU Command
10:11:47 AM 33 810 20.00 0.00 0.00 20.00 1 /usr/sbin/apach
10:11:47 AM 105 1078 0.00 1.00 0.00 1.00 1 mysqld
10:11:48 AM 0 7 0.00 1.00 0.00 1.00 0 rcu_sched
10:11:48 AM 0 41 0.00 1.00 0.00 1.00 1 khugepaged[/sourcecode]

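The pidstat command shows per-process CPU usage, so you can identify the processes that use the most CPU time. The %CPU column is the total across all CPUs, so a multi-threaded process can exceed 100%.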

6. iostat -xz 1

[sourcecode]$ iostat -xz 1
Linux 3.13.0-68-generic (linuxlove) 12/04/2015 _x86_64_ (2 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
0.61 0.01 0.07 1.04 0.00 98.27

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.02 2.47 0.13 3.79 2.51 39.54 21.49 0.04 9.42 13.13 9.30 5.34 2.09
dm-0 0.00 0.00 0.12 6.13 2.41 39.40 13.38 0.09 13.63 14.97 13.61 3.34 2.09
dm-1 0.00 0.00 0.00 0.01 0.01 0.04 8.00 0.00 69.10 18.33 77.87 1.78 0.00[/sourcecode]

This is a great tool for understanding block devices (disks), both the workload applied and the resulting performance. Look for:

  • r/s, w/s, rkB/s, wkB/s: These are the delivered reads, writes, read Kbytes, and write Kbytes per second to the device. Use these for workload characterization. A performance problem may simply be due to an excessive load applied.
  • await: The average time for the I/O in milliseconds. This is the time that the application suffers, as it includes both time queued and time being serviced. Larger than expected average times can be an indicator of device saturation, or device problems.
  • avgqu-sz: The average number of requests issued to the device. Values greater than 1 can be evidence of saturation (although devices can typically operate on requests in parallel, especially virtual devices which front multiple back-end disks.)
  • %util: Device utilization. This is really a busy percent, showing the time each second that the device was doing work. Values greater than 60% typically lead to poor performance (which should be seen in await), although it depends on the device. Values close to 100% usually indicate saturation.
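
To pick out busy devices from a long iostat listing, you can filter on the %util column. This is a sketch under a few assumptions: busy_devices is a hypothetical helper name, the 60 used in the usage line is the rule of thumb from the list above, and the column is located by its header name because its position differs between sysstat versions.

```shell
# Sketch only: prints the device name and %util for rows above a threshold.
busy_devices() (
    limit="$1"
    # Convert decimal commas to points first, in case the locale prints "2,09".
    tr ',' '.' | awk -v limit="$limit" '
        $1 ~ /^Device/ {    # header row: find the %util column by name
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 > limit { print $1, $col }
    '
)

# Live usage: three one-second reports, showing devices above 60% busy.
#   iostat -xz 1 3 | busy_devices 60
```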

7. free -m

[sourcecode]$ free -m
total used free shared buffers cached
Mem: 7887 5953 1934 227 117 1777
-/+ buffers/cache: 4059 3828
Swap: 3996 30 3966[/sourcecode]

This command is pretty straightforward, except for the buffers/cache values. Linux uses otherwise idle memory for buffers and the file cache, and gives it back quickly when applications need it. That memory is counted as "used" on the Mem line, so the "-/+ buffers/cache" line shows the real picture: in the example above, applications use 4059 MB and 3828 MB is effectively available.
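
The "-/+ buffers/cache" line can be recomputed from the Mem line, which helps when reading the numbers. The sketch below uses the values from the example output; the two function names are hypothetical. Newer versions of free drop this line and print an "available" column instead, so the arithmetic applies to the older layout shown here.

```shell
# Sketch only: reproduces the "-/+ buffers/cache" line from the Mem line (values in MB).
real_used_mb() (
    # Arguments: used, buffers, cached
    echo $(( $1 - $2 - $3 ))
)
real_free_mb() (
    # Arguments: free, buffers, cached
    echo $(( $1 + $2 + $3 ))
)

real_used_mb 5953 117 1777   # prints 4059
real_free_mb 1934 117 1777   # prints 3828
```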

8. sar -n DEV 1

[sourcecode]$ sar -n DEV 1
Linux 3.13.0-37-generic (linuxmint) 12/04/2015 _x86_64_ (4 CPU)

12:18:16 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:18:17 AM eth0 1551,00 1145,00 2108,68 76,19 0,00 0,00 0,00 1,73
12:18:17 AM lo 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00

12:18:17 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:18:18 AM eth0 1694,00 1306,00 2293,33 84,98 0,00 0,00 0,00 1,88
12:18:18 AM lo 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00[/sourcecode]

This command prints the throughput of each network interface (rxkB/s and txkB/s); use it to check the network workload and to see whether an interface is getting close to its limit.

9. sar -n TCP,ETCP 1

[sourcecode]$ sar -n TCP,ETCP 1
Linux 3.13.0-37-generic (linuxmint) 12/04/2015 _x86_64_ (4 CPU)

12:21:36 AM active/s passive/s iseg/s oseg/s
12:21:37 AM 3,00 0,00 237,00 152,00

12:21:36 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
12:21:37 AM 0,00 1,00 9,00 0,00 0,00[/sourcecode]

Use this command to check some TCP metrics:

  • active/s: Number of locally-initiated TCP connections per second (e.g., via connect()).
  • passive/s: Number of remotely-initiated TCP connections per second (e.g., via accept()).
  • retrans/s: Number of TCP retransmits per second.

10. top

The top command gathers much of the information from the commands above, but it has a downside: its screen keeps refreshing, which makes it hard to follow an issue over time, whereas vmstat and pidstat provide rolling output.
