Here are a few tips on how to check huge page usage.

Check system-wide THP usage

Run the following command to check system-wide THP usage:

# grep AnonHugePages /proc/meminfo 
AnonHugePages:   2603008 kB

Or

# grep HugePages /proc/meminfo
AnonHugePages:   2603008 kB
HugePages_Total:    4096
HugePages_Free:     4096
HugePages_Rsvd:        0
HugePages_Surp:        0

Note: Red Hat Enterprise Linux 6.2 and later expose additional THP counters in /proc/vmstat:

# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 2018
thp_fault_alloc 7302
thp_fault_fallback 0
thp_collapse_alloc 401
thp_collapse_alloc_failed 0
thp_split 21
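
The two fault counters above can be combined into a single figure. The following is a minimal sketch, assuming the counter names shown in the output above; the function name thp_fallback_pct is made up for this example. It prints the share of THP page faults that fell back to normal-sized pages, and accepts an optional file path so it can be tried on a saved copy of /proc/vmstat.

```shell
# Report how often a THP page fault fell back to normal-sized pages.
# $1 (optional): file to read instead of /proc/vmstat.
thp_fallback_pct() {
    awk '
        $1 == "thp_fault_alloc"    { alloc = $2 }
        $1 == "thp_fault_fallback" { fallback = $2 }
        END {
            total = alloc + fallback
            if (total == 0) { print "no THP faults recorded"; exit }
            printf "%.1f%% of %d THP faults fell back\n", 100 * fallback / total, total
        }
    ' "${1:-/proc/vmstat}"
}
```

Calling thp_fallback_pct with no argument reads the live counters. A share that keeps rising usually suggests the kernel is struggling to find contiguous memory for huge pages.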

Check THP usage per process

Run the following command to monitor which processes are using THP:

# grep -e AnonHugePages /proc/*/smaps | awk '{ if ($2 > 4) print $0 }' | awk -F "/" '{ print $0; system("ps -fp " $3) }'
/proc/7519/smaps:AnonHugePages:    305152 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7519     1  1 08:53 ?        00:00:48 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name rhel7 -S -machine pc-i440fx-1.6,accel=kvm,usb=of
/proc/7610/smaps:AnonHugePages:    491520 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7610     1  2 08:53 ?        00:01:30 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name util6vm -S -machine pc-i440fx-1.6,accel=kvm,usb=
/proc/7788/smaps:AnonHugePages:    389120 kB
UID        PID  PPID  C STIME TTY          TIME CMD
qemu      7788     1  1 08:54 ?        00:00:55 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name rhel64eus -S -machine pc-i440fx-1.6,accel=kvm,us
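
The command above prints one line per mapping, so a process with several THP-backed mappings appears several times. As a rough alternative, the sketch below sums AnonHugePages per process and sorts the totals. The function name thp_by_pid and its optional proc-root argument are assumptions made for this example; reading other users' smaps files normally requires root.

```shell
# Print "<AnonHugePages kB> <PID>" per process, largest total first.
# $1 (optional): proc root to scan instead of /proc.
thp_by_pid() {
    proc_root="${1:-/proc}"
    for smaps in "$proc_root"/[0-9]*/smaps; do
        [ -r "$smaps" ] || continue
        # Extract the PID from .../<pid>/smaps
        pid=${smaps%/smaps}
        pid=${pid##*/}
        # Sum every AnonHugePages line of this process; skip zero totals.
        awk -v pid="$pid" '
            $1 == "AnonHugePages:" { kb += $2 }
            END { if (kb > 0) printf "%d %s\n", kb, pid }
        ' "$smaps" 2>/dev/null
    done | sort -rn
}
```

Running thp_by_pid | head -5 would list the five largest THP users; each PID can then be passed to ps -fp as in the command above.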

How to tell if Explicit HugePages is enabled or disabled

There can be two types of HugePages in the system: Explicit Huge Pages, which are allocated explicitly through the vm.nr_hugepages sysctl parameter, and Transparent Huge Pages, which are allocated automatically by the kernel. See below for how to tell whether Explicit HugePages is enabled or disabled.

  • Explicit HugePages DISABLED:

    • If the value of HugePages_Total is "0", it means HugePages is disabled on the system:

      # grep -i HugePages_Total /proc/meminfo 
      HugePages_Total:       0
      
    • Similarly, if the value in the /proc/sys/vm/nr_hugepages file or the vm.nr_hugepages sysctl parameter is "0", it means HugePages is disabled on the system:

      # cat /proc/sys/vm/nr_hugepages 
      0
      # sysctl vm.nr_hugepages
      vm.nr_hugepages = 0
      
  • Explicit HugePages ENABLED:

    • If the value of HugePages_Total is greater than "0", it means HugePages is enabled on the system:

      # grep -i HugePages_Total /proc/meminfo 
      HugePages_Total:       1024
      
    • Similarly, if the value in the /proc/sys/vm/nr_hugepages file or the vm.nr_hugepages sysctl parameter is greater than "0", it means HugePages is enabled on the system:

      # cat /proc/sys/vm/nr_hugepages 
      1024
      # sysctl vm.nr_hugepages
      vm.nr_hugepages = 1024
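
The checks above can be folded into one helper that also reports how much memory the explicit pool reserves. This is only a sketch: the function name hugepages_status is made up for this example, and it relies on the Hugepagesize field that /proc/meminfo reports alongside the HugePages_* counters.

```shell
# Summarise the explicit hugepage pool.
# $1 (optional): file to read instead of /proc/meminfo.
hugepages_status() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free = $2 }
        $1 == "Hugepagesize:"    { size_kb = $2 }
        END {
            if (total + 0 == 0) { print "explicit hugepages: disabled"; exit }
            # Reserved memory = number of pages x page size (kB), shown in MB.
            printf "explicit hugepages: enabled, %d of %d pages free, %d MB reserved\n",
                   free, total, total * size_kb / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

With a pool of 4096 pages of 2048 kB each, the helper reports 8192 MB reserved, memory that is unavailable for normal allocations even while the pages are free.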
      

Notes

  • RHEL 6 disables THP by default on systems with less than 1 GB of RAM. Refer to Red Hat Bug 618444, "disable transparent hugepages by default on small systems", for more information.
  • Disadvantages of using explicit hugepages (libhugetlbfs): Using hugetlbfs requires significant work from both application developers and system administrators; explicit hugepages must be set aside at boot time, and applications must map them explicitly. The process is fiddly enough that use of hugetlbfs is restricted to those who really care and who have the time to work through it. Hugetlbfs is often seen as a feature for large, proprietary database management systems and little else.
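
To confirm whether THP is actually active on a given system, including a small one where it may have been disabled by default, the current mode can be read from sysfs. The sketch below assumes the mainline path /sys/kernel/mm/transparent_hugepage/enabled and the RHEL 6 path /sys/kernel/mm/redhat_transparent_hugepage/enabled; the function name thp_mode is made up for this example. The kernel marks the active mode with square brackets, for example "[always] madvise never".

```shell
# Print the active THP mode (the bracketed word in the "enabled" file).
# Extra file paths given as arguments are tried before the default locations.
thp_mode() {
    for f in "$@" \
             /sys/kernel/mm/transparent_hugepage/enabled \
             /sys/kernel/mm/redhat_transparent_hugepage/enabled; do
        if [ -r "$f" ]; then
            # Keep only the text between the square brackets.
            sed 's/.*\[\(.*\)\].*/\1/' "$f"
            return 0
        fi
    done
    echo "THP control file not found"
    return 1
}
```

A result of never means the kernel will not hand out transparent huge pages, so AnonHugePages should stay at 0 kB.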

Determining whether page fault latency is due to huge pages use

Huge pages reduce the number of TLB updates required to access large regions of memory, which lowers the overall cost of TLB updates, but they increase the cost and latency of other operations. When a user-space application is given a range of addresses for a memory allocation, the assignment of a physical page is deferred until the first time the page is accessed. To prevent information leakage from the previous user of the page, the kernel writes zeros to the entire page. For a 4096-byte page this is a relatively short operation that takes only a couple of microseconds. Huge pages on x86 are 2 MB in size, 512 times larger than a normal page, so zeroing one may take hundreds of microseconds and affect latency-sensitive code. Below is a simple SystemTap command-line script that shows which applications have huge pages zeroed out and how long those operations take. It runs until Ctrl-C is pressed.

stap  -e 'global huge_clear probe kernel.function("clear_huge_page").return {
  huge_clear [execname(), pid()] <<< (gettimeofday_us() - @entry(gettimeofday_us()))}'

Below is a run of the above SystemTap clear huge page script. The output is sorted from the executable name and process with the most huge page clears to the one with the fewest. The @count is the number of times that process encountered a huge page clear operation. The remaining statistics are wall-clock times in microseconds: @min and @max are the minimum and maximum time to clear a page, and @sum is the total time. In the example below, the ld process 17050 spent a total of 1924 microseconds clearing huge pages, and each clear took 128 microseconds on average.

#  stap  -e 'global huge_clear probe kernel.function("clear_huge_page").return {
  huge_clear [execname(), pid()] <<< (gettimeofday_us() - @entry(gettimeofday_us()))}'

The system may attempt to save memory by using the same physical page for multiple processes. When one of the processes attempts to modify the contents of the page, a new copy of the page must be made. This Copy-On-Write (COW) operation for a huge page can be observed with a script very similar to the one that watches for huge pages being zeroed out. Below is the script to watch for Copy-On-Write on huge pages; it outputs data in a similar format.

stap  -e 'global huge_cow probe kernel.function("copy_user_huge_page").return {
  huge_cow [execname(), pid()] <<< (gettimeofday_us() - @entry(gettimeofday_us()))}'

Determining whether huge page split and collapse operations are affecting performance

Because some portions of the kernel code only work with normal-sized pages, the kernel may convert a huge page into a set of normal-sized pages using a split operation. You can identify whether split operations are occurring with the following SystemTap script:

stap -e 'probe kernel.function("split_huge_page") { printf("%s: %s(%d)\n", pp(), execname(), pid());}'

Below is an example run of the script showing which processes are performing split huge page operations. In this case the same virtualized guest machine (qemu-system-x86_64) has some huge page splits.

# stap -e 'probe kernel.function("split_huge_page") { printf("%s: %s(%d)\n", pp(), execname(), pid());}'

The inverse of the huge page split operation is the huge page collapse operation, which converts a set of normal-sized pages into a single huge page. Covering a range of addresses with fewer TLB entries is desirable, but the conversion is expensive: the system must find a candidate set of pages to group together and then copy all the memory from the possibly scattered normal-sized pages into a single huge page. The khugepaged kernel thread searches for candidate pages to collapse into a single huge page. Even when khugepaged does not succeed in converting normal-sized pages into huge pages, it may still consume processor time searching for candidates. You can see whether the khugepaged kernel thread is taking a significant amount of processor time with:

top -p `pidof khugepaged`
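
The counters that khugepaged publishes in sysfs give a complementary view of how much work it has done. The following is a sketch under stated assumptions: it uses the mainline directory /sys/kernel/mm/transparent_hugepage/khugepaged (on RHEL 6 the parent directory is named redhat_transparent_hugepage), and the function name khugepaged_stats is made up for this example.

```shell
# Print selected khugepaged counters and tunables.
# $1 (optional): directory to read instead of the default sysfs location.
khugepaged_stats() {
    dir="${1:-/sys/kernel/mm/transparent_hugepage/khugepaged}"
    for name in pages_collapsed full_scans pages_to_scan scan_sleep_millisecs; do
        # Skip any file this kernel does not provide.
        [ -r "$dir/$name" ] && printf '%s: %s\n' "$name" "$(cat "$dir/$name")"
    done
    return 0
}
```

Comparing two readings taken some time apart shows whether full_scans keeps growing while pages_collapsed stays flat, which would match the case of khugepaged scanning without finding pages to collapse.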

If you want to see when huge page collapse operations occur, the following script reports each time khugepaged collapses normal-sized pages into a huge page:

stap -e 'probe kernel.function("collapse_huge_page") {  printf("%-25s: %s (%d) collapse_huge_page\n", tz_ctime(gettimeofday_s()), execname(), pid())}'

The above one-line script will generate output like the following:

$ stap -e 'probe kernel.function("collapse_huge_page") {  printf("%-25s: %s (%d) collapse_huge_page\n", ctime(gettimeofday_s()), execname(), pid())}'