		Using the Linux Kernel Latency Histograms


This document gives a short explanation how to enable, configure and use
latency histograms. Latency histograms are primarily relevant in the
context of real-time enabled kernels (CONFIG_PREEMPT/CONFIG_PREEMPT_RT)
and are used in the quality management of the Linux real-time
capabilities.


* Purpose of latency histograms

A latency histogram continuously accumulates the frequencies of latency
data. There are two types of histograms
- potential sources of latencies
- effective latencies


* Potential sources of latencies

Potential sources of latencies are code segments where interrupts,
preemption or both are disabled (aka critical sections). To create
histograms of potential sources of latency, the kernel stores the time
stamp at the start of a critical section, determines the time elapsed
when the end of the section is reached, and increments the frequency
counter of that latency value - irrespective of whether any concurrently
running process is affected by latency or not.
- Configuration items (in the Kernel hacking/Tracers submenu)
  CONFIG_INTERRUPT_OFF_LATENCY
  CONFIG_PREEMPT_OFF_LATENCY


* Effective latencies

Effective latencies are actually occuring during wakeup of a process. To
determine effective latencies, the kernel stores the time stamp when a
process is scheduled to be woken up, and determines the duration of the
wakeup time shortly before control is passed over to this process. Note
that the apparent latency in user space may be considerably longer,
since
i) interrupts may be disabled preventing the scheduler from initiating
the wakeup mechanism, and
ii) the process may be interrupted after control is passed over to it
but before user space execution takes place.
- Configuration item (in the Kernel hacking/Tracers submenu)
  CONFIG_WAKEUP_LATENCY


* Usage

The interface to the administration of the latency histograms is located
in the debugfs file system. To mount it, either enter

mount -t sysfs nodev /sys
mount -t debugfs nodev /sys/kernel/debug

from shell command line level, or add

nodev	/sys			sysfs	defaults	0 0
nodev	/sys/kernel/debug	debugfs	defaults	0 0

to the file /etc/fstab. All latency histogram related files are
available in the directory /sys/kernel/debug/tracing/latency_hist. A
particular histogram type is enabled by writing non-zero to the related
variable in the /sys/kernel/debug/tracing/latency_hist/enable directory.
Select "preemptirqsoff" for the histograms of potential sources of
latencies and "wakeup" for histograms of effective latencies. The
histogram data - one per CPU - are available in the files

/sys/kernel/debug/tracing/latency_hist/preemptoff/CPUx
/sys/kernel/debug/tracing/latency_hist/irqsoff/CPUx
/sys/kernel/debug/tracing/latency_hist/preemptirqsoff/CPUx
/sys/kernel/debug/tracing/latency_hist/wakeup/CPUx.

The histograms are reset by writing non-zero to the file "reset" in a
particular latency directory. To reset all latency data, use

#!/bin/sh

HISTDIR=/sys/kernel/debug/tracing/latency_hist

if test -d $HISTDIR
then
  cd $HISTDIR
  for i in */reset
  do
    echo 1 >$i
  done
fi


* Data format

Latency data are stored with a resolution of one microsecond. The
maximum latency is 10,240 microseconds. The data are only valid, if the
overflow register is empty. Every output line contains the latency in
microseconds in the first row and the number of samples in the second
row. To display only lines with a positive latency count, use, for
example,

grep -v " 0$" /sys/kernel/debug/tracing/latency_hist/preemptoff/CPU0

#Minimum latency: 0 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 25 microseconds.
#Total samples: 3104770694
#There are 0 samples greater or equal than 10240 microseconds
#usecs	         samples
    0	      2984486876
    1	        49843506
    2	        58219047
    3	         5348126
    4	         2187960
    5	         3388262
    6	          959289
    7	          208294
    8	           40420
    9	            4485
   10	           14918
   11	           18340
   12	           25052
   13	           19455
   14	            5602
   15	             969
   16	              47
   17	              18
   18	              14
   19	               1
   20	               3
   21	               2
   22	               5
   23	               2
   25	               1


* Wakeup latency of a selected process

To only collect wakeup latency data of a particular process, write the
PID of the requested process to

/sys/kernel/debug/tracing/latency_hist/wakeup/pid.

PIDs are not considered, if this variable is set to 0.


* Details of the process with the highest wakeup latency so far

Selected data of the process that suffered from the highest wakeup
latency that occurred in a particular CPU are available in the file

/sys/kernel/debug/tracing/latency_hist/wakeup/max_latency-CPUx.

The format of the data is
<PID> <Priority> <Latency> <Command>

These data are also reset when the wakeup histogram ist reset.
