Configuring HugePages for Oracle Database

Overview

As the amount of memory available on systems grows, and the amount of memory needed by the database grows with it, the traditional 4KB page size used on most Linux systems is becoming too small. The more memory you allocate, the more pages have to be managed, which means more work for the kernel. With HugePages you can increase the page size from the typical 4KB to something like 2MB. For the same amount of RAM, the OS then has 512 times fewer pages to manage (2MB / 4KB = 512). In addition, HugePages are pinned in memory and can’t be swapped to disk, which avoids possible swap I/O. Another key benefit I’ve read about is that HugePages are managed via a global PageTable rather than every process having its own PageTable, which also reduces the amount of memory needed.
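
To put numbers on the page-count difference, here is a small sketch of the arithmetic. The 48 GB region is the SGA size used in the example later in this post; the function name pages_for is made up for illustration.

```shell
# Number of pages needed to map a memory region.
# $1 = region size in GB, $2 = page size in KB
pages_for() {
  echo $(( $1 * 1024 * 1024 / $2 ))
}

pages_for 48 4      # regular 4 KB pages for a 48 GB region
pages_for 48 2048   # 2 MB huge pages for the same region
```

The first call prints 12582912 and the second prints 24576, which is the factor of 512.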

The following are my notes and steps for setting up HugePages. Thank you to the many people I talked with, especially to the following people who provided key insight into the workings of HugePages and configuring it: Andrew Kerber, Mark Bobak, and Yong Huang.

Performance Improvement?

I didn’t have enough time to dig in and really look at differences in the database performance itself; however, I compared the average load on the servers before and after and found that following HugePages implementation the average load was about 87.5% of what it was previously. I’m sure results will vary!

REFERENCES

MOS 361323.1 – HugePages on Linux
MOS 361468.1 – HugePages on 64-bit Linux
MOS 401749.1 – Script to calculate number of huge pages
MOS 361468.1 – Troubleshooting Huge Pages
http://dbakerber.wordpress.com/
http://yong321.freeshell.org/oranotes/HugePages.txt

Important Notes

HugePages and AMM (Automatic Memory Management) are not compatible (maybe in 12c?). Oracle is very clear on this matter.
As of 11.2.0.2 there is a parameter, USE_LARGE_PAGES, that provides very useful information in the alert log (use this).
Oracle makes clear mention that incorrect configuration of HugePages can lead to many problems (from Oracle):
- hugepages not used
- poor db performance
- running out of memory or excessive swapping
- db instance does not start
- crucial system services not working (e.g. CRS)
In the first example, I go through the steps in order, so it involves two reboots, which is more than necessary. I have a short version later that needs only one reboot.
If you’re running RAC you can make these changes on a single node first to ensure success without risking crashing your whole RAC. I haven’t tried it, but I think you could actually run your RAC nodes with some in regular shared memory and some in HugePages, though I wouldn’t suggest it.
My example is on Oracle 11.2.0.3.0. Some things may vary on other versions (such as the existence of the USE_LARGE_PAGES parameter).

Planning

Check how much RAM you have available on the system. HugePages live in physical RAM and are pinned in it, so you want to know how much RAM you have available and how much you’re willing, or need, to give to the databases. If you’re running AMM you will need to switch to ASMM, so take a look at what you think you will want for SGA_TARGET and PGA_AGGREGATE_TARGET (look at v$sga_target_advice and v$pga_target_advice).
You will reboot your server, so plan on having maintenance time to do this.

Steps Involved

This is a quick list of what you will need to do (depending on your configuration):

If you’re using AMM (i.e. MEMORY_TARGET), first make sure your kernel is configured correctly to support changing your database to ASMM (i.e. SGA_TARGET).
If you’re using AMM, change to ASMM.
Configure HugePages
Set the Oracle USE_LARGE_PAGES setting (not required, but a good idea)

These can all be done in one step with a reboot. First I’ll go through each individual step.

My example

For my example I’m working on a RAC system. This gave me the opportunity to work on one server at a time. I am working with 128GB RAM and, based on system statistics, opted to start with a 48GB SGA and 16GB PGA. I also have only one database on the servers. My work below is based on those criteria.

Step 1: Configure shared memory

If you are using ASMM instead of AMM you don’t need to do this since you should already have your shared memory settings correct. If you’re using AMM then you are using shared memory from /dev/shm, and you will want to get these settings correct before switching to ASMM.

These are the recommendations I used to determine how to configure shared memory on the server:

/etc/sysctl.conf

kernel.shmmax – set to the largest SGA on your server plus 1G
kernel.shmall – set to the sum of all SGAs on the server divided by the page size (as reported by getconf PAGESIZE)

/etc/security/limits.conf

oracle soft memlock – set to slightly less than total RAM on server (in KB)
oracle hard memlock – set to slightly less than total RAM on server (in KB)

So, for my system

RAM = 128GB = 132093152 kB
SGA = 48GB; however, to allow for possible growth and given I have 128GB total, I’m going to use 64GB for my numbers
PGA = 16GB
shmmax = 64GB + 1GB = 65GB = 69793218560 bytes
shmall = 1 SGA @ 64GB = 64GB / 4096 bytes per page = 16777216 pages
oracle soft memlock = slightly less than 132093152 = 130000000 (KB)
oracle hard memlock = oracle soft memlock = 130000000 (KB)
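
The shmmax and shmall numbers above can be reproduced with a few lines of shell arithmetic. This is only a sketch of the calculation under the assumptions of my example (one SGA sized at 64GB, 4096-byte pages); the variable names are made up for illustration.

```shell
# Assumed inputs, taken from the worked example above.
sga_gb=64          # largest SGA on the server, with headroom
page_size=4096     # on a live system use: $(getconf PAGESIZE)

gb=$(( 1024 * 1024 * 1024 ))
shmmax=$(( (sga_gb + 1) * gb ))         # largest SGA plus 1 GB, in bytes
shmall=$(( sga_gb * gb / page_size ))   # sum of all SGAs, in pages

echo "kernel.shmmax = $shmmax"
echo "kernel.shmall = $shmall"
```

With more than one database on the server, shmmax would still use the largest single SGA, while shmall would use the sum of all of them.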

Make these changes to /etc/security/limits.conf and /etc/sysctl.conf and reboot your server. I recommend doing this even if you think you can set them without a reboot. That way you don’t find months later after an outage that it wasn’t done correctly.

Once the server is up again you can verify the settings with the commands below. Run ulimit as the oracle user and look at the "max locked memory" line.

cat /proc/sys/kernel/shmall
cat /proc/sys/kernel/shmmax 
ulimit -a

Step 2: Change from AMM to ASMM

Once you have shared memory configured to support your SGAs, you need to switch from AMM to ASMM if you’re running AMM. Remember, Oracle will probably not support you if you run AMM and HugePages together.

Note: If you’re running RAC, I’d recommend using the sid='<node>' clause and trying one node out first before mucking around with the whole thing.

For my example here is what I used:

alter system set sga_target=48G scope=spfile sid='*';
alter system set sga_max_size=48G scope=spfile sid='*';
alter system set pga_aggregate_target=16G scope=spfile sid='*';
alter system reset memory_max_target scope=spfile sid='*';
alter system reset memory_target scope=spfile sid='*';

Bounce just one instance to make sure it comes up successfully, then check the parameters:

sqlplus / as sysdba
show parameter memory  -- should be unset (i.e. 0)
show parameter ga      -- should match your settings above

If you run into issues like:

ORA-00843: Parameter not taking MEMORY_MAX_TARGET into account
ORA-00849: SGA_TARGET 51539607552 cannot be set to more than MEMORY_MAX_TARGET 34359738368.

and you are sure you set things correctly, then check your local pfile. I often find that the pfile has parameters set directly in it in addition to the reference to the spfile. If so, remove everything but the spfile reference (or at least the parameters above).

If all is good, bounce the whole database.

Step 3: Configure HugePages on servers

Once you’re running with ASMM you should go ahead and get HugePages configured. Oracle has a script (in Note 401749.1) that will determine what they recommend for your HugePages configuration. Run this script:

->./hugepage_settings.sh
...
Recommended setting: vm.nr_hugepages = 24580

If you want to figure this out yourself, please reference Andrew Kerber’s blog on HugePages.
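
For a rough idea of where a number like 24580 can come from, here is a sketch. It is not Oracle's script. It assumes that each shared memory segment at least one huge page in size is counted as the whole huge pages it contains plus one spare, and that smaller segments are ignored. The function name estimate_hugepages is made up for illustration, and the segment sizes are the ones from the ipcs listing further down in this post.

```shell
# Hypothetical helper: estimate vm.nr_hugepages from segment sizes.
# $1 = huge page size in KB; remaining arguments = segment sizes in bytes
estimate_hugepages() {
  hpg_bytes=$(( $1 * 1024 ))
  shift
  total=0
  for seg in "$@"; do
    pages=$(( seg / hpg_bytes ))
    # Segments smaller than one huge page are skipped;
    # every other segment gets one spare page.
    if [ "$pages" -gt 0 ]; then
      total=$(( total + pages + 1 ))
    fi
  done
  echo "$total"
}

estimate_hugepages 2048 4096 4096 4096 268435456 51271172096 2097152
```

For those six segments and a 2048 kB huge page size this prints 24580.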

Next add the following to /etc/sysctl.conf

vm.nr_hugepages=24580

and reboot your server.

Now verify that the hugepages kernel setting is correct

cat /proc/sys/vm/nr_hugepages

Hopefully your instance(s) will start successfully and you can now check to make sure HugePages are being used. To do this do the following:

->grep Huge /proc/meminfo
HugePages_Total: 24580
HugePages_Free:  16212
HugePages_Rsvd:  16209
Hugepagesize:    2048 kB

To make sure that the configuration is valid, “the HugePages_Free value should be smaller than HugePages_Total and there should be some HugePages_Rsvd. The sum of Hugepages_Free and HugePages_Rsvd may be smaller than your total combined SGA as instances allocate pages dynamically and proactively as needed.” I’ve seen conflicting descriptions of these counters. Going by the kernel’s definitions, HugePages_Total - HugePages_Free is the number of pages actually allocated so far, and HugePages_Rsvd is the number of pages that are reserved but not yet allocated. The pages in use by your instance(s) are therefore Total - Free + Rsvd, which for the output above is 24580 - 16212 + 16209 = 24577.
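
A small sketch of that check, assuming pages in use are computed as HugePages_Total minus HugePages_Free plus HugePages_Rsvd. The function name hugepages_in_use is made up for illustration, and the sample file just holds the values shown above so the result can be checked by hand.

```shell
# Hypothetical helper: huge pages allocated or reserved.
# $1 = meminfo-style file (defaults to /proc/meminfo)
hugepages_in_use() {
  awk '/^HugePages_Total:/ {t = $2}
       /^HugePages_Free:/  {f = $2}
       /^HugePages_Rsvd:/  {r = $2}
       END {print t - f + r}' "${1:-/proc/meminfo}"
}

# Sample input copied from the grep output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HugePages_Total: 24580
HugePages_Free:  16212
HugePages_Rsvd:  16209
Hugepagesize:    2048 kB
EOF

hugepages_in_use "$sample"   # prints 24577
rm -f "$sample"
```

On a live server, calling hugepages_in_use with no argument reads /proc/meminfo directly.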

You can also look at the output of the “ipcs” command. I’m not convinced it is the best way to verify, but you will see the HugePages-backed segments if you know what to look for.

-> ipcs
------ Shared Memory Segments --------
key        shmid      owner     perms      bytes       nattch     status
0x00000000 29097988   oracle    640        4096        0
0x00000000 29130757   oracle    640        4096        0
0x0e2f3c64 29163526   oracle    640        4096        0
0x00000000 29655047   oracle    640        268435456   59
0x00000000 29687816   oracle    640        51271172096 59
0x2894b058 29720585   oracle    640        2097152     59

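The bytes column is in bytes, so 51271172096/1024/1024/1024 = 47.75 GB. Adding the other two segments that have processes attached (268435456 and 2097152 bytes) gives 48 GB plus 2 MB, or 24577 pages of 2 MB each, which is my 48G SGA.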

To actually see the HugePages being allocated and deallocated, run the following useful shell script from Yong’s page. While it is running, shut down and start up your database.

while :
 do
  for i in $(grep ^Huge /proc/meminfo | head -3 | awk '{print $2}');
  do
   echo -n "$i "
  done
  echo ""
  sleep 5
 done

Step 4: Configure USE_LARGE_PAGES

Once you have it running using hugepages and have verified HugePages are being used you should turn on the USE_LARGE_PAGES parameter in the database. This will give you information in the alert log on HugePage use to help confirm it is being used.

alter system set use_large_pages=only scope=spfile sid='*';

Restart your instance and if things are good you should see something like this:

Starting ORACLE instance (normal)
****************** Large Pages Information *****************
Parameter use_large_pages = ONLY
Total Shared Global Region in Large Pages = 48 GB (100%)
Large Pages used by this instance: 24577 (48 GB)
Large Pages unused system wide = 3 (6144 KB) (alloc incr 128 MB)
Large Pages configured system wide = 24580 (48 GB)
Large Page size = 2048 KB
***********************************************************

The following example shows that all is not well:

Starting ORACLE instance (normal)
****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = 0 KB (0%)
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 128 MB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB
RECOMMENDATION:
Total Shared Global Region size is 48 GB. For optimal performance,
prior to the next instance restart increase the number
of unused Large Pages by atleast 24577 2048 KB Large Pages (48 GB)
system wide to get 100% of the Shared
Global Region allocated with Large pages
****************** Large Pages Information *****************
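
If you want to check this without reading the alert log by eye, the percentage can be pulled out of the banner. This is a sketch that assumes the banner line looks like the two examples above; the function name large_page_pct is made up for illustration, and the path to your alert log is something you need to supply yourself.

```shell
# Hypothetical helper: print the large-page percentage from the most
# recent "Large Pages Information" banner in an alert log.
# $1 = path to the alert log
large_page_pct() {
  grep 'Total Shared Global Region in Large Pages' "$1" \
    | tail -n 1 \
    | sed 's/.*(\([0-9]*\)%).*/\1/'
}
```

Called as large_page_pct /path/to/alert.log, it prints 100 for the good example above and 0 for the bad one. Anything under 100 means part of the SGA is still in regular pages.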

All in one reboot

Configuration

/etc/security/limits.conf
 oracle soft memlock 130000000
 oracle hard memlock 130000000
/etc/sysctl.conf
 kernel.shmall=16777216
 kernel.shmmax=69793218560
 vm.nr_hugepages=24580
sqlplus / as sysdba
 alter system set sga_target=48G scope=spfile sid='*';
 alter system set sga_max_size=48G scope=spfile sid='*';
 alter system set pga_aggregate_target=16G scope=spfile sid='*';
 alter system reset memory_max_target scope=spfile sid='*';
 alter system reset memory_target scope=spfile sid='*';
 alter system set use_large_pages=only scope=spfile sid='*';

reboot server

Use the same verification procedures as above.

5 responses to “Configuring HugePages for Oracle Database”

Deepa Telagam | 2012/11/09 at 3:35 pm

Excellent!
edo | 2013/05/02 at 2:25 am

A small improvement to the shell script for watching allocation/deallocation:
while :
do
awk '/^HugePages_/ {all = all $2 "\t"} END {print all}' /proc/meminfo
sleep 5
done
Rahul Dixit | 2013/08/06 at 7:50 pm

Hi, the total number of huge pages should be equal to or a bit more than the SGA, right? Not the SGA+PGA, since the PGA is not shared…
Regards,
RD.
- tinky2jed | 2013/08/07 at 4:17 pm
  
  It needs to be enough for all the SGAs running on the server, preferably with a little extra to be safe. As long as it is enough to contain them all, you are fine.