This is a quick email explaining the output of the "top" tool, a utility that lists the running processes on a UNIX system along with statistics on their memory and CPU consumption.
I've attached to this email an image of some example top output from NGI. I've made it an image just in case people's mail clients decide to render it in a variable-pitch font, destroying readability.
I'll dismantle the lines one by one, from the top. I'm sure a lot of this is already known to you all, or is obvious, but it may be worth going through it all just in case, as there's a lot of confusion about some of the data here. As such, this email is a little longer than I had originally intended.
If anyone has any questions, feel free to contact me directly, or reply to this posting on the mailing list.
Line 1: time, uptime, logged in users, load averages.
- time: Current local time.
- uptime: Time since the system booted.
- logged in users: Number of interactive sessions (console, ssh, etc).
- load averages: There are three load average numbers, and these are the average "load" of the system over 1 minute, 5 minutes, and 15 minutes. Load is calculated by adding 1 for any process or thread that is one of:
  - Running, or runnable as soon as CPU resources are available (R-state)
  - Waiting on IO (D-state)
In the attached image, the instantaneous load is well north of 20, but an instantaneous figure jumps around too much to be useful, which is why the numbers are averaged.
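To make the load figures a little more concrete, here is a minimal Python sketch that picks apart a line in the format Linux exposes in /proc/loadavg. The function name parse_loadavg and the sample line are invented for this illustration; the numbers are not taken from the attached image.

```python
def parse_loadavg(line):
    """Split a /proc/loadavg-style line into named fields."""
    fields = line.split()
    # First three fields: 1, 5 and 15 minute load averages.
    one, five, fifteen = (float(x) for x in fields[:3])
    # Fourth field: "runnable/total" scheduling entities at this instant.
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "runnable_now": runnable,
        "total_tasks": total,
    }


# Hypothetical sample line, not real NGI data.
sample = "3.20 2.75 1.90 4/312 10234"
info = parse_loadavg(sample)
print(info)
```

On a Linux system, the same function should work on the text read from /proc/loadavg instead of the sample string.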
Line 2: Task statistics
- Running tasks (or more accurately, runnable, as you can only run as many tasks as you have hardware threads).
- Sleeping tasks, which are not currently requesting any CPU or IO resources at all (perhaps waiting on a socket, for example).
- Stopped tasks, which have been sent SIGSTOP to temporarily remove them from the scheduler's task list.
- Zombie tasks. These have died (naturally or otherwise), but their parent process has yet to call wait() on them to collect their exit status.
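For the curious, the state of each task can be read from the third field of /proc/<pid>/stat on Linux. The following is a rough Python sketch of tallying states from lines in that format. The helper name state_from_stat and the sample lines are made up for the example, and the lines are truncated to the first few fields.

```python
from collections import Counter


def state_from_stat(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat-style line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so find the LAST ')' before splitting the rest.
    rest = stat_line[stat_line.rfind(")") + 1:]
    return rest.split()[0]


# Hypothetical, truncated sample lines: pid (comm) state ppid ...
samples = [
    "1 (systemd) S 0 1 1 0",
    "412 (my worker (v2)) R 1 412 412 0",
    "413 (backup) D 1 413 413 0",
    "500 (child) Z 412 412 412 0",
    "501 (paused) T 412 412 412 0",
]

counts = Counter(state_from_stat(s) for s in samples)
print(counts)
```

This only counts the raw state letters; how top groups them into its four summary figures is not reproduced here.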
Line 3: Total userland usage, kernel/system usage, "nice" process usage, idle time, IO wait, hardware interrupt time, software interrupt time, stolen time.
The important thing to remember is that this is all real time, not CPU time, and the figures don't always add up. They are all expressed as percentages of available CPU time used since top last refreshed.
- Total userland usage: The amount of available CPU time used by userland processes.
- Kernel/system usage: The amount of available CPU time used by Linux itself, including system calls made on behalf of processes.
- "Nice" process usage: The amount of available CPU time used by low-priority processes (manipulated by the 'nice' command).
- Idle time: The amount of available CPU time in which the CPU had nothing to do.
- IO wait: The amount of available CPU time spent idle while waiting for outstanding IO to complete.
- Hardware interrupt time: The amount of available CPU time used servicing hardware interrupts.
- Software interrupt time: The amount of available CPU time used servicing software interrupts (softirqs: work deferred from hardware interrupt handlers). System calls are not counted here; they fall under kernel/system usage.
- Stolen time: Estimate of time stolen by the hypervisor when running the system in a VM. This should always be zero on NGI.
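As an illustration of "percentage since the last refresh", here is a simplified Python sketch that derives the eight figures from two snapshots of the aggregate "cpu" line in /proc/stat. It assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the sample counters are invented, and top's own accounting may differ in detail.

```python
# Short labels as top prints them, in /proc/stat field order.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    # parts[0] is the label "cpu"; the next eight values are the counters.
    return dict(zip(FIELDS, (int(x) for x in parts[1:9])))


def cpu_percentages(prev, curr):
    """Percentage of elapsed ticks spent in each category between snapshots."""
    deltas = {k: curr[k] - prev[k] for k in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Hypothetical snapshots taken one refresh apart.
before = parse_cpu_line("cpu 1000 50 300 8000 100 10 40 0 0 0")
after = parse_cpu_line("cpu 1600 50 500 8100 180 10 60 0 0 0")
pct = cpu_percentages(before, after)
print(pct)
```

Because this sketch divides by the sum of the deltas, its results always total 100; it is only meant to show the idea of working from differences between two samples.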
Line 4: Total hardware memory, total used by processes and kernel, unused memory, kernel buffers.
- Total hardware memory is what is wired to the CPU and not reserved for other parts of the system. In this case, 4GB minus the memory used by PCI and the GPU.
- Total used is what is consumed by userland processes (i.e. applications) and by the kernel.
- Unused memory is spare.
- Buffers is essentially memory used by Linux itself, as temporary storage for block device IO.
Line 5: Total swap, used swap, free swap, cache memory
- Total/used/free swap is hopefully self-explanatory.
- Cache memory is trickier. This is *physical* memory that has been used to cache the contents of block devices (eMMC, SSD, USB sticks, etc). It is not "used" in a traditional sense; the instant a userland process requires more RAM and there is no unused RAM, this RAM is instantly raided to satisfy the request.
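The point about cache can be put into numbers. Here is a small Python sketch, under the assumption that the figures come from text in the format of /proc/meminfo, which adds unused memory and cache together to estimate how much RAM a process could obtain before swap is needed. The function name parse_meminfo and the sample values are invented for this example.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (in kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        values[key.strip()] = int(rest.split()[0])
    return values


# Hypothetical sample, not real NGI data.
sample = """\
MemTotal:        4000000 kB
MemFree:          200000 kB
Buffers:          100000 kB
Cached:          1500000 kB
SwapTotal:       2000000 kB
SwapFree:        2000000 kB
"""

mem = parse_meminfo(sample)
# Unused RAM plus cache that can be raided on demand.
obtainable_kb = mem["MemFree"] + mem["Cached"]
swap_used_kb = mem["SwapTotal"] - mem["SwapFree"]
print(obtainable_kb, swap_used_kb)
```

With these made-up figures, only 200000 kB is strictly unused, yet roughly 1700000 kB could be handed to a process on request.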
Now comes a table of processes and statistics. The columns are as follows:
- PID: Process identifier. Typically a 16 bit number, 0 is forbidden, 1 is 'init', which is the parent of last resort. NGI uses systemd as init.
- USER: The user the process is running as.
- PR: The process's priority as the scheduler sees it. For normal processes this is 20 plus the niceness, so it runs from 0 to 39, where the *lower* the number, the higher the priority. There is also an 'RT' real-time priority which trumps all. It is not truly "real time", however.
- NI: Niceness. This is an offset applied to the normal priority, running from -20 (highest priority) to 19 (lowest). It is normally zero.
- VIRT: Virtual memory size. This is the size of the process's address space. This doesn't mean it is all memory used, however: some may be maps of files on disc, or may be as yet unused and not have had real RAM allocated to back it.
- RES: Resident size. This is how much physical RAM the process actually has allocated to it. Memory that has been swapped out is not counted.
- SHR: Sharable memory. This is the amount of memory being used that could possibly be shared with other processes. This includes shared memory used for IPC, as well as memory maps of files on disc (such as the executable itself, shared libraries, etc).
- S: State. S = sleeping, R = runnable, D = blocked waiting for IO, T = stopped, Z = waiting for parent process to collect corpse/exit status.
- CPU: Percentage of CPU used by this process since the last refresh. Note that 100% is 100% of one thread. On a quad-core system, a multi-threaded process could reach the high 300s.
- MEM: Percentage of physical RAM that this process is using, calculated from its RES, described above.
- TIME: CPU seconds used by this process since it started. What this means is that if it were to use a whole CPU thread for a second (or 50% of a thread for two seconds), 1 will be added.
- COMMAND: Title of process.
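The relationship between the CPU and TIME columns is simple arithmetic, sketched below in Python. The function names are invented for illustration and this is not how top itself is implemented.

```python
def cpu_percent(cpu_seconds_before, cpu_seconds_after, wall_seconds):
    """CPU column: CPU seconds consumed per wall-clock second, as a percentage.

    100 means one hardware thread fully busy for the whole interval.
    """
    return 100.0 * (cpu_seconds_after - cpu_seconds_before) / wall_seconds


def cpu_seconds_added(percent, wall_seconds):
    """TIME column growth over an interval at a steady CPU percentage."""
    return percent / 100.0 * wall_seconds


# 50% of a thread for two seconds adds one CPU second to TIME.
half_thread = cpu_seconds_added(50.0, 2.0)

# A process that gained 1 CPU second over a 2 second refresh shows 50%.
shown = cpu_percent(10.0, 11.0, 2.0)

# Four threads fully busy for 2 seconds: 8 CPU seconds, shown as 400%.
four_threads = cpu_percent(0.0, 8.0, 2.0)

print(half_thread, shown, four_threads)
```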
Note: This document is licensed under CC-BY-SA and was originally created by Codethink Ltd.