The Impact of Linux Transparent Huge Pages on Oracle

2016-10-27 14:56 | Original | Oracle troubleshooting
Author: dave


1 About Transparent Huge Pages

Two articles on the Red Hat website describe THP:

https://access.redhat.com/solutions/46111

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-transhuge.html


Starting with RedHat6, RedHat7, OL6, OL7, SLES11 and UEK2 kernels, Transparent HugePages are implemented and enabled (default) in an attempt to improve the memory management.  Transparent HugePages are similar to the HugePages that have been available in previous Linux releases.  The main difference is that the Transparent HugePages are set up dynamically at run time by the khugepaged thread in kernel while the regular HugePages had to be preallocated at the boot up time.

In short: from the RedHat 6, OEL 6, SLES 11 and UEK2 kernels onward, Transparent HugePages (THP) are enabled by default to improve memory management performance. THP is similar to regular HugePages; the main difference is that THP is set up dynamically at run time, so its configuration takes effect without a reboot.


For more on HugePages, see the following link:

Linux HugePages configuration and its relationship to Oracle performance

http://www.cndba.cn/dave/article/310


Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned. The main kernel address space itself is mapped with hugepages, reducing TLB pressure from kernel code. For general information on Hugepages.

The kernel will always attempt to satisfy a memory allocation using hugepages. If no hugepages are available (due to non availability of physically continuous memory for example) the kernel will fall back to the regular 4KB pages. THP are also swappable (unlike hugetlbfs). This is achieved by breaking the huge page to smaller 4KB pages, which are then swapped out normally.

But to use hugepages effectively, the kernel must find physically continuous areas of memory big enough to satisfy the request, and also properly aligned. For this, a khugepaged kernel thread has been added. This thread will occasionally attempt to substitute smaller pages being used currently with a hugepage allocation, thus maximizing THP usage.

In userland, no modifications to the applications are necessary (hence transparent). But there are ways to optimize its use. For applications that want to use hugepages, use of posix_memalign() can also help ensure that large allocations are aligned to huge page (2MB) boundaries.

Also, THP is only enabled for anonymous memory regions. There are plans to add support for tmpfs and page cache. THP tunables are found in the /sys tree under /sys/kernel/mm/redhat_transparent_hugepage.

The values for /sys/kernel/mm/redhat_transparent_hugepage/enabled can be one of the following:

always   -  always use THP

madvise  -  use THP only in memory regions that request it with madvise(MADV_HUGEPAGE)

never    -  disable THP

khugepaged will be automatically started when transparent_hugepage/enabled is set to "always" or "madvise", and it'll be automatically shut down if it's set to "never". The redhat_transparent_hugepage/defrag parameter takes the same values and it controls whether the kernel should make aggressive use of memory compaction to make more hugepages available.

2 Checking and Disabling THP


THP is enabled when the value selected in the following file is always:

[root@www.cndba.cn ~]# cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
[always] madvise never
[root@www.cndba.cn ~]#
[root@www.cndba.cn ~]# grep AnonHugePages /proc/meminfo 
AnonHugePages:    348160 kB
Any value greater than 0 here means that THP is enabled and in use.
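The bracketed word in the enabled file is the active mode, so the check above can be scripted. The following is a minimal sketch, not part of the original article: the function name thp_mode is hypothetical, and the optional path argument is an assumption added so the function can be pointed at another file (for example on kernels that use a different sysfs path).

```shell
# Print the active THP mode, i.e. the word shown in [brackets].
# $1 (optional): path to the "enabled" file; defaults to the RHEL 6 path
# used in this article.
thp_mode() {
    enabled_file="${1:-/sys/kernel/mm/redhat_transparent_hugepage/enabled}"
    # Keep only the text between "[" and "]".
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$enabled_file"
}
```

Calling thp_mode with no argument on the system shown above would print always, since the file contains "[always] madvise never".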

On Red Hat Enterprise Linux 6.2 and later, THP can be monitored with the following command:

[root@www.cndba.cn ~]# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 170
thp_fault_alloc 18566
thp_fault_fallback 110
thp_collapse_alloc 185
thp_collapse_alloc_failed 32
thp_split 221
[root@www.cndba.cn ~]#
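One way to read these counters is to compare thp_fault_fallback (page faults that could not get a huge page and fell back to small pages) with thp_fault_alloc (faults that did get one). The sketch below is an illustration rather than anything from the original article; the function name thp_fallback_pct and its optional file argument are assumptions.

```shell
# Print the percentage of THP page faults that fell back to small pages.
# $1 (optional): file in /proc/vmstat format; defaults to /proc/vmstat.
thp_fallback_pct() {
    vmstat_file="${1:-/proc/vmstat}"
    awk '
        $1 == "thp_fault_alloc"    { alloc = $2 }
        $1 == "thp_fault_fallback" { fallback = $2 }
        END {
            total = alloc + fallback
            if (total > 0) printf "%.2f\n", 100 * fallback / total
            else print "n/a"   # counters missing or both zero
        }' "$vmstat_file"
}
```

With the sample counters above (18566 allocations, 110 fallbacks) this prints 0.59. A steadily rising percentage would suggest that memory is becoming too fragmented to supply huge pages on demand.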

To see which processes are using THP:

[root@www.cndba.cn ~]# grep -e AnonHugePages  /proc/*/smaps | awk  '{ if($2>4) print $0} ' |  awk -F "/"  '{print $0; system("ps -fp " $3)} '
/proc/2645/smaps:AnonHugePages:      2048 kB
UID        PID  PPID  C STIME TTY          TIME CMD
grid      2645  2644  0 Oct25 ?        00:00:09 /u01/gridsoft/12.1.0/opmn/bin/ons -d
/proc/2780/smaps:AnonHugePages:     14336 kB
UID        PID  PPID  C STIME TTY          TIME CMD
root      2780     1  0 Oct25 ?        00:01:53 /usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin/java -Djava.util.logging.config.file=/www/t
/proc/2780/smaps:AnonHugePages:     14336 kB
UID        PID  PPID  C STIME TTY          TIME CMD
root      2780     1  0 Oct25 ?        00:01:53 /usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin/java -Djava.util.logging.config.file=/www/t
/proc/2780/smaps:AnonHugePages:     38912 kB
UID        PID  PPID  C STIME TTY          TIME CMD
root      2780     1  0 Oct25 ?        00:01:53 /usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin/java -Djava.util.logging.config.file=/www/t
/proc/2780/smaps:AnonHugePages:      6144 kB
UID        PID  PPID  C STIME TTY          TIME CMD
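The pipeline above prints one entry per memory mapping, so a process with several huge-page mappings appears several times. As a possible variation, the sketch below sums AnonHugePages per PID instead. The function name thp_by_pid and the optional root-directory argument are assumptions introduced for illustration.

```shell
# Print "PID total_kB" for every process with THP-backed memory,
# largest consumer first.
# $1 (optional): directory laid out like /proc; defaults to /proc.
thp_by_pid() {
    proc_root="${1:-/proc}"
    # -H forces the file name prefix even when only one file matches.
    grep -H AnonHugePages "$proc_root"/[0-9]*/smaps 2>/dev/null |
        awk '
            {
                # $1 looks like ".../<pid>/smaps:AnonHugePages:"
                n = split($1, part, "/")
                kb[part[n - 1]] += $2
            }
            END {
                for (pid in kb)
                    if (kb[pid] > 0) print pid, kb[pid]
            }' |
        sort -k2,2nr
}
```

Run it as root so that the smaps files of other users' processes are readable; each PID can then be passed to ps -fp as in the original command.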

Disabling THP at OS boot:

Add transparent_hugepage=never to the kernel line in grub.conf. This change takes effect only after the OS is rebooted.

[root@www.cndba.cn ~]# cat  /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,1)
#          kernel /vmlinuz-version ro root=/dev/sda4
#          initrd /initrd-[generic-]version.img
#boot=/dev/sda1
device (hd0) HD(1,800,3e8000,ad383463-7239-443a-83c6-7b8c6539a458)
default=0
timeout=5
splashimage=(hd0,1)/grub/splash.xpm.gz
hiddenmenu
title CentOS 6 (2.6.32-573.el6.x86_64)
root (hd0,1)
kernel /vmlinuz-2.6.32-573.el6.x86_64 ro root=UUID=65b6fe1a-6897-4a16-9cf6-e8dfcc89b7ce rd_NO_LUKS rd_NO_LVM.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet transparent_hugepage=never
initrd /initramfs-2.6.32-573.el6.x86_64.img
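After the reboot, it can be worth confirming that the kernel really was started with the new parameter. The sketch below checks the running kernel's command line; the function name thp_disabled_at_boot and its optional file argument are assumptions for illustration, not something the original article provides.

```shell
# Succeed (exit status 0) if the kernel command line contains
# transparent_hugepage=never as a separate parameter.
# $1 (optional): file holding the command line; defaults to /proc/cmdline.
thp_disabled_at_boot() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line, then require an exact line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'transparent_hugepage=never'
}
```

A typical use would be: thp_disabled_at_boot && echo "THP disabled at boot".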

Disabling THP at run time:

Run the following commands to disable THP immediately, with no OS reboot required.

[root@www.cndba.cn ~]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
[root@www.cndba.cn ~]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
[root@www.cndba.cn ~]# cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always madvise [never]

However, this setting is lost when the OS reboots.
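If the grub.conf change is not an option, one common workaround (not covered in the original article) is to reapply the run-time setting from a boot script. The sketch below wraps the two echo commands in a hypothetical helper named disable_thp. As an assumption, it also checks /sys/kernel/mm/transparent_hugepage, the path used by upstream kernels, so the same function works where the redhat_ prefix is absent.

```shell
# Write "never" to the THP enabled and defrag controls.
# $1 (optional): parent directory of the THP sysfs directory;
# defaults to /sys/kernel/mm.
disable_thp() {
    sys_root="${1:-/sys/kernel/mm}"
    for dir in "$sys_root/redhat_transparent_hugepage" \
               "$sys_root/transparent_hugepage"; do
        # Only touch the directory that exists on this kernel.
        if [ -d "$dir" ]; then
            echo never > "$dir/enabled"
            echo never > "$dir/defrag"
        fi
    done
}
```

The function must run as root. Calling it from a script that runs at boot, such as /etc/rc.local on systems that still use one, would restore the setting after each restart.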



3 Oracle and THP

According to MOS Doc ID 1557478.1:

Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.


The ocssd.log may show some of the threads are blocked (but this does not show all the time):

2013-05-01 14:30:45.255: [    CSSD][224204544]clssscMonitorThreads clssnmvKillBlockThread not scheduled for 7500 msecs

2013-05-01 14:30:46.945: [    CSSD][224204544]clssscMonitorThreads clssnmvWorkerThread not scheduled for 8030 msecs
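To look for such stalls in an existing log, the messages can be filtered by how long the thread went unscheduled. This is a minimal sketch that assumes the line format shown above; the function name cssd_stalls and the default threshold of 5000 ms are hypothetical choices, not values from Oracle.

```shell
# Print "thread_name milliseconds" for each CSSD thread that was not
# scheduled for at least the given time.
# $1: path to ocssd.log   $2 (optional): threshold in ms, default 5000
cssd_stalls() {
    log_file="$1"
    threshold_ms="${2:-5000}"
    awk -v limit="$threshold_ms" '
        /clssscMonitorThreads/ && /not scheduled for/ {
            # Locate "... <thread> not scheduled for <N> msecs".
            for (i = 4; i < NF; i++)
                if ($i == "for" && $(i + 2) == "msecs" && $(i + 1) + 0 >= limit)
                    print $(i - 3), $(i + 1)
        }' "$log_file"
}
```

For example, cssd_stalls ocssd.log 8000 applied to the two lines above would report only clssnmvWorkerThread, which was stalled for 8030 ms.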


Because THP can cause node reboots, Oracle strongly recommends disabling it. See the previous section for the steps.



