Using the top command, you can easily find out how much CPU each job uses. However, time that tasks spend waiting on I/O also shows up in the system's numbers: top reports it as %wa, and on Linux the load average counts tasks blocked in uninterruptible sleep. So in some cases the CPU is not actually busy, yet you still see a high load on the system, because some processes are simply blocked on I/O.

How do you identify these processes?

The approach I'm going to explain today is to find the processes that are in the 'D' state. You may wonder, what's that?

Process state codes

Before we get started, let's review the process state codes as listed in the ps man page:

       Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process.
       D    Uninterruptible sleep (usually IO)
       R    Running or runnable (on run queue)
       S    Interruptible sleep (waiting for an event to complete)
       T    Stopped, either by a job control signal or because it is being traced.
       W    paging (not valid since the 2.6.xx kernel)
       X    dead (should never be seen)
       Z    Defunct ("zombie") process, terminated but not reaped by its parent.

       For BSD formats and when the stat keyword is used, additional characters may be displayed:
       <    high-priority (not nice to other users)
       N    low-priority (nice to other users)
       L    has pages locked into memory (for real-time and custom IO)
       s    is a session leader
       l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
       +    is in the foreground process group
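
The same state letter can also be read straight from /proc, without going through ps. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper name proc_state is made up for this example.

```shell
#!/bin/sh
# proc_state (hypothetical helper): print the one-letter state code from a
# /proc/<pid>/status style file. Defaults to the calling process itself.
proc_state() {
    awk '$1 == "State:" { print $2 }' "${1:-/proc/self/status}"
}

# Show the state of the current shell, if /proc is available.
[ -r "/proc/$$/status" ] && proc_state "/proc/$$/status"
```

Pass any other status file to check a different process, for example proc_state /proc/9324/status.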

Identifying processes in 'D' state with the ps command

Processes in 'D' state are in uninterruptible sleep. The ps aux command shows the process state in the 8th column (STAT), so here is one way to quickly find the processes that are in 'D' state:

# ps aux | awk '$8 ~ /D/  { print $0 }'
root      9324  0.0  0.0   8316   436 ?        D<   Apr22   0:00 /sbin/blkid -o udev -p /dev/dm-0
root     11917  0.0  0.0  15016   832 ?        D<   May06   0:00 /sbin/kpartx -a -p p /dev/dm-0
root     12864  0.0  0.0  15016   696 ?        D<   Apr22   0:00 /sbin/kpartx -a -p p /dev/dm-0
root     13210  0.0  0.0  15016   696 ?        D<   Apr22   0:00 /sbin/kpartx -a -p p /dev/dm-0
root     15799  0.0  0.0   8316   508 ?        D<   May06  14:29 /sbin/blkid -o udev -p /dev/dm-0
root     25825  0.0  0.0   8316   504 ?        D<   May06  14:16 /sbin/blkid -o udev -p /dev/dm-0

In the output above, you can see that several processes are blocked on the same device, dm-0, so the next step is to look into that device and see what is going on there.
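
To get a hint of what a blocked process is waiting for, ps can also print the kernel wait channel. This is only a sketch, assuming the procps version of ps (the wchan:32 column width is a procps feature); the function names filter_blocked and list_blocked are invented here. On some kernels the wait channel may show up as "-" for unprivileged users.

```shell
#!/bin/sh
# filter_blocked (hypothetical helper): keep the header line plus every row
# whose second column (STAT) starts with D.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# list_blocked (hypothetical helper): PID, state, wait channel and command
# line of every process currently in uninterruptible sleep.
list_blocked() {
    ps -eo pid,stat,wchan:32,args | filter_blocked
}

list_blocked
```

If nothing is blocked at that moment, only the header line is printed.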

To check whether the processes stay blocked on I/O, repeat the command every second, either with watch:

watch -d -n 1 "(ps aux | awk '\$8 ~ /D/ { print \$0 }')"

or with a plain shell loop:

while true; do date; ps aux | awk '$8 ~ /D/  { print $0 }'; sleep 1; done
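
Watching the output scroll by makes it hard to tell a process that is stuck from one that just happened to be doing I/O when it was sampled. One possible variation is to take a few samples and count how often each PID was seen in 'D' state. This is a rough sketch; the names sample_blocked, tally and SAMPLES are made up for the example.

```shell
#!/bin/sh
# sample_blocked (hypothetical helper): print "PID COMMAND" for every process
# in D state right now. Separate -o options keep the header suppressed.
sample_blocked() {
    ps -e -o pid= -o stat= -o comm= | awk '$2 ~ /^D/ { print $1, $3 }'
}

# tally (hypothetical helper): count identical lines, most frequent first.
tally() {
    sort | uniq -c | sort -rn
}

# Take SAMPLES snapshots, one second apart (default 5), then summarize.
SAMPLES=${SAMPLES:-5}
i=0
while [ "$i" -lt "$SAMPLES" ]; do
    sample_blocked
    sleep 1
    i=$((i + 1))
done | tally
```

A PID whose count equals the number of samples was blocked every time it was checked, which makes it a good candidate for a closer look.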


In the meantime, the iostat command should show a high %iowait:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.05    0.00    0.05   74.91    0.00   24.99
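
If iostat (part of the sysstat package) is not installed, a similar figure can be derived from /proc/stat. The sketch below assumes the Linux /proc/stat layout, where the aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq and steal in that order; read_cpu and iowait_pct are hypothetical helper names. Note that iostat's first report covers the time since boot, so the numbers will not match exactly.

```shell
#!/bin/sh
# read_cpu (hypothetical helper): print "<iowait> <total>" in jiffies from the
# aggregate "cpu" line. Fields 2..9 are user through steal; iowait is field 6.
read_cpu() {
    awk '$1 == "cpu" { t = 0; for (i = 2; i <= 9; i++) t += $i; print $6, t }' \
        "${1:-/proc/stat}"
}

# iowait_pct (hypothetical helper): share of elapsed jiffies spent in iowait
# between two read_cpu samples, as a percentage.
iowait_pct() {
    awk -v w1="$1" -v t1="$2" -v w2="$3" -v t2="$4" 'BEGIN {
        d = t2 - t1
        if (d > 0) printf "%.2f\n", 100 * (w2 - w1) / d
        else print "0.00"
    }'
}

if [ -r /proc/stat ]; then
    first=$(read_cpu)
    sleep 1
    second=$(read_cpu)
    # Word splitting is intentional: each sample is "<iowait> <total>".
    iowait_pct $first $second
fi
```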