Overview
As the amount of memory available on systems grows, and the amount of memory needed by the database grows with it, the traditional 4KB page size used on most Linux systems is becoming too small. As the total memory allocated increases, the number of pages that must be managed also increases, which means more work for the kernel. With HugePages you can increase the typical 4KB page size to something like 2MB. This means that for the same amount of RAM your OS will have 512 times fewer pages to manage. In addition, with HugePages the pages are pinned in memory and can’t be swapped to disk, thus avoiding possible disk writes. Another key benefit I’ve read about is that HugePages are managed via a global PageTable rather than every process having its own PageTable, which also reduces the amount of memory needed.
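To put a number on the page-count reduction, here is a minimal shell sketch. The 48 GiB region is only an illustration (it happens to be the SGA size used later in this post), and the helper name pages_needed is mine, not something provided by the kernel or Oracle.

```shell
# pages_needed BYTES PAGE_SIZE_BYTES
# Prints how many pages of the given size are needed to map BYTES (rounded up).
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

region=$(( 48 * 1024 * 1024 * 1024 ))   # 48 GiB, an illustrative SGA size

small=$(pages_needed "$region" 4096)                    # 4 KB pages
huge=$(pages_needed "$region" $(( 2 * 1024 * 1024 )))   # 2 MB pages

echo "4 KB pages: $small"
echo "2 MB pages: $huge"
echo "ratio:      $(( small / huge ))"
```

For a 48 GiB region this works out to 12,582,912 small pages versus 24,576 huge pages, a factor of 512.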
The following are my notes and steps for setting up HugePages. Thank you to the many people I talked with, especially to the following people who provided key insight into the workings of HugePages and configuring it: Andrew Kerber, Mark Bobak, and Yong Huang.
Performance Improvement?
I didn’t have enough time to dig in and really look at differences in the database performance itself; however, I compared the average load on the servers before and after and found that following HugePages implementation the average load was about 87.5% of what it was previously. I’m sure results will vary!
References
- MOS 361323.1: HugePages on Linux
- MOS 361468.1: HugePages on 64-bit Linux
- MOS 401749.1: Script to calculate number of huge pages
- MOS 361468.1: Troubleshooting Huge Pages
- http://dbakerber.wordpress.com/
- http://yong321.freeshell.org/oranotes/HugePages.txt
Important Notes
- HugePages and AMM (Automatic Memory Management) are not compatible (maybe in 12c?). Oracle is very clear on this matter.
- As of 11.2.0.2 there is a parameter, USE_LARGE_PAGES, that provides very useful information in the alert log (use it).
- Oracle makes clear that incorrect configuration of HugePages can lead to many problems (from Oracle):
  - HugePages not used
  - poor database performance
  - running out of memory or excessive swapping
  - database instance does not start
  - crucial system services not working (e.g. CRS)
- In the first example, I go through the steps in order. As such, it has two reboots, which are not strictly necessary. I have a short version later that has one reboot.
- If you’re running RAC you can make these changes on a single node to ensure success without risking crashing your whole RAC. I haven’t tried it, but I think you could actually run your RAC nodes with some in regular shared memory and some in HugePages, though I wouldn’t suggest it.
- My example is on Oracle 11.2.0.3.0. Some things may vary on other versions (such as the existence of the USE_LARGE_PAGES parameter).
Planning
- Check how much RAM you have available on the system. HugePages live in physical RAM and are pinned in it, so you want to know how much RAM you have available and how much you’re willing, or need, to give to the databases. If you’re running AMM you will need to switch to ASMM, so take a look at what you think you will want for SGA_TARGET and PGA_AGGREGATE_TARGET (look at v$sga_target_advice and v$pga_target_advice).
- You will reboot your server, so plan on having maintenance time to do this.
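As a rough planning aid, the sketch below reads MemTotal and shows how much RAM would be left for the OS and everything else after a planned SGA and PGA. The helper names meminfo_kb and ram_left_kb are hypothetical, and the example call uses the figures from my own system described in this post.

```shell
# meminfo_kb FIELD [FILE]
# Prints the value (in kB) of FIELD from a /proc/meminfo-style file.
meminfo_kb() {
    awk -v f="$1:" '$1 == f { print $2 }' "${2:-/proc/meminfo}"
}

# ram_left_kb TOTAL_KB SGA_GB PGA_GB
# Prints the RAM (in kB) left once the planned SGA and PGA are subtracted.
ram_left_kb() {
    echo $(( $1 - ($2 + $3) * 1024 * 1024 ))
}

# Example with the numbers from this post: 132093152 kB RAM, 48G SGA, 16G PGA.
ram_left_kb 132093152 48 16
# On a live server you could use: ram_left_kb "$(meminfo_kb MemTotal)" 48 16
```

With those inputs roughly 62GB stays free, which is why I felt comfortable leaving headroom for SGA growth.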
Steps Involved
This is a quick list of what you will need to do (depending on your configuration):
- If you’re using AMM (i.e. MEMORY_TARGET), first make sure your kernel is configured correctly to support changing your database to ASMM (i.e. SGA_TARGET).
- If you’re using AMM, change to ASMM.
- Configure HugePages
- Set the Oracle USE_LARGE_PAGES setting (not required, but a good idea)
These can all be done in one step with a reboot. First I’ll go through each individual step.
My example
For my example I’m working on a RAC system. This gave me the opportunity to work on one server at a time. I am working with 128GB RAM and, based on system statistics, opted to start with a 48GB SGA and 16GB PGA. I also have only one database on the servers. My work below is based on those criteria.
Step 1: Configure shared memory
If you are using ASMM instead of AMM you don’t need to do this since you should already have your shared memory settings correct. If you’re using AMM then you are using shared memory from /dev/shm, and you will want to get these settings correct before switching to ASMM.
These are the recommendations I used to determine how to configure shared memory on the server:
/etc/sysctl.conf
- kernel.shmmax – set to the largest SGA on your server plus 1G (in bytes)
- kernel.shmall – set to the sum of all SGAs on the server divided by the page size (as reported by getconf PAGESIZE)
/etc/security/limits.conf
- oracle soft memlock – set to slightly less than total RAM on server (in KB)
- oracle hard memlock – set to slightly less than total RAM on server (in KB)
So, for my system
- RAM = 128GB = 132093152 kB
- SGA = 48GB – however, to allow for possible growth and given I have 128GB total, I’m going to use 64G for my numbers
- PGA = 16GB
- shmmax = 64GB + 1GB = 65GB = 69793218560 bytes
- shmall = 1 SGA @ 64GB = 64GB / 4096 = 16,777,216 pages
- oracle soft memlock = slightly less than 132093152 = 130000000
- oracle hard memlock = oracle soft memlock = 130000000
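The arithmetic above can be reproduced with a few lines of shell. This is only a sketch of the rules of thumb listed in this step; the function names calc_shmmax and calc_shmall are my own, and the 64 passed in is the padded SGA figure I chose for my system.

```shell
GIB=$(( 1024 * 1024 * 1024 ))

# calc_shmmax LARGEST_SGA_GB
# Prints kernel.shmmax in bytes: the largest SGA plus 1G.
calc_shmmax() {
    echo $(( ($1 + 1) * GIB ))
}

# calc_shmall TOTAL_SGA_GB PAGE_SIZE_BYTES
# Prints kernel.shmall in pages: the sum of all SGAs divided by the page size.
calc_shmall() {
    echo $(( $1 * GIB / $2 ))
}

page_size=$(getconf PAGESIZE)   # typically 4096 on x86_64 Linux
echo "kernel.shmmax = $(calc_shmmax 64)"
echo "kernel.shmall = $(calc_shmall 64 "$page_size")"
```

On a system with 4KB pages this prints the same 69793218560 and 16777216 values used above.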
Make these changes to /etc/security/limits.conf and /etc/sysctl.conf and reboot your server. I recommend doing this even if you think you can set them without a reboot. That way you don’t find months later after an outage that it wasn’t done correctly.
Once the server is up again you can verify the settings with
cat /proc/sys/kernel/shmall
cat /proc/sys/kernel/shmmax
ulimit -a
Step 2: Change from AMM to ASMM
Once you have shared memory configured to support your SGAs, you need to switch from AMM to ASMM if you’re running AMM. Remember, Oracle will probably not support you if you run AMM and HugePages together.
Note: If you’re running RAC, I’d recommend using the sid='<node>' clause and trying one node out first before mucking around with the whole thing.
For my example here is what I used:
alter system set sga_target=48G scope=spfile sid='*';
alter system set sga_max_size=48G scope=spfile sid='*';
alter system set pga_aggregate_target=16G scope=spfile sid='*';
alter system reset memory_max_target scope=spfile sid='*';
alter system reset memory_target scope=spfile sid='*';
Bounce just one instance to make sure it comes up successfully, then check the parameters:
sqlplus / as sysdba
show parameter memory -- should be unset (i.e. 0)
show parameter ga -- should match your settings above
If you run into issues like:
ORA-00843: Parameter not taking MEMORY_MAX_TARGET into account
ORA-00849: SGA_TARGET 51539607552 cannot be set to more than MEMORY_MAX_TARGET 34359738368.
and you are sure you set things correctly, then check your local pfile. I often find that the pfile has parameters set directly in it in addition to the reference to the spfile. If so, remove everything but the spfile reference (or at least the parameters above).
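A quick way to spot such stray settings is to list everything in the pfile that is not a comment, a blank line, or the spfile reference. The sketch below assumes the usual init<SID>.ora layout; the function name extra_pfile_params and the default path and SID are hypothetical and need adjusting for your system.

```shell
# extra_pfile_params FILE
# Prints every line of FILE that is not blank, not a comment, and not the
# spfile reference. No output means the pfile only points at the spfile.
extra_pfile_params() {
    grep -Eiv '^[[:space:]]*(#.*|spfile[[:space:]]*=.*)?$' "$1" || true
}

# Hypothetical location; adjust ORACLE_HOME and ORACLE_SID for your system.
pfile="${ORACLE_HOME:-/u01/app/oracle/product/11.2.0/dbhome_1}/dbs/init${ORACLE_SID:-orcl1}.ora"
if [ -f "$pfile" ]; then
    extra_pfile_params "$pfile"
fi
```

Any line it prints (for example a leftover memory_target entry) is a candidate for removal.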
If all is good, bounce the whole database.
Step 3: Configure HugePages on servers
Once you’re running with ASMM you should go ahead and get HugePages configured. Oracle has a script (in Note 401749.1) that will determine what they recommend for your HugePages configuration. Run this script:
->./hugepage_settings.sh
...
Recommended setting: vm.nr_hugepages = 24580
If you want to figure this out yourself, please reference Andrew Kerber’s blog on HugePages.
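For a rough idea of what such a calculation looks like, here is a minimal sketch that turns a list of shared memory segment sizes into a huge page count. The per-segment rule (whole pages plus one spare, ignoring segments smaller than one huge page) is my assumption about a reasonable way to do it, not a transcription of Oracle's script, and the function name hugepages_for_segments is hypothetical. The segment sizes in the example are the ones from the ipcs listing in this post.

```shell
# hugepages_for_segments HUGEPAGE_KB SEGMENT_BYTES...
# For each segment at least one huge page in size, count the whole huge
# pages it contains plus one spare, and print the total.
hugepages_for_segments() {
    hp_bytes=$(( $1 * 1024 ))
    shift
    total=0
    for seg in "$@"; do
        pages=$(( seg / hp_bytes ))
        if [ "$pages" -gt 0 ]; then
            total=$(( total + pages + 1 ))
        fi
    done
    echo "$total"
}

# Segment sizes (bytes) from the ipcs listing in this post, 2048 kB huge pages.
hugepages_for_segments 2048 4096 4096 4096 268435456 51271172096 2097152
```

With those inputs the sketch arrives at 24580, the same value the script recommended on my system.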
Next add the following to /etc/sysctl.conf
vm.nr_hugepages=24580
and reboot your server.
Now verify that the hugepages kernel setting is correct
cat /proc/sys/vm/nr_hugepages
Hopefully your instance(s) will start successfully and you can now check to make sure HugePages are being used. To do this do the following:
->grep Huge /proc/meminfo
HugePages_Total: 24580
HugePages_Free: 16212
HugePages_Rsvd: 16209
Hugepagesize: 2048 kB
To make sure that the configuration is valid, “the HugePages_Free value should be smaller than HugePages_Total and there should be some HugePages_Rsvd. The sum of Hugepages_Free and HugePages_Rsvd may be smaller than your total combined SGA as instances allocate pages dynamically and proactively as needed.” I’ve seen the counters described in conflicting ways. As I understand the kernel documentation, HugePages_Rsvd is the number of pages that have been reserved (promised to a process) but not yet allocated, and HugePages_Total minus HugePages_Free is how many pages are actually allocated.
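The check described above can be scripted. The following is a sketch only: the function name hugepages_status and the verdict strings are made up for illustration, and the rule it applies is just the one quoted above (Free smaller than Total, and some pages reserved).

```shell
# hugepages_status [FILE]
# Reads HugePages_Total/Free/Rsvd from a /proc/meminfo-style file and prints
# the number of pages allocated (Total - Free) plus a simple verdict.
hugepages_status() {
    awk '
        /^HugePages_Total:/ { total = $2 + 0 }
        /^HugePages_Free:/  { free  = $2 + 0 }
        /^HugePages_Rsvd:/  { rsvd  = $2 + 0 }
        END {
            printf "allocated=%d reserved=%d ", total - free, rsvd
            if (total > 0 && free < total && rsvd > 0)
                print "verdict=looks-ok"
            else
                print "verdict=check-config"
        }
    ' "${1:-/proc/meminfo}"
}

# On a live server, read the real counters.
if [ -r /proc/meminfo ]; then
    hugepages_status
fi
```

Fed the numbers from my system, it reports 8368 pages allocated and 16209 reserved.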
You can also look at the output of the “ipcs” command. I’m not convinced it is the best way to verify, but you will see the huge pages if you know what to look for.
-> ipcs
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes        nattch     status
0x00000000 29097988   oracle     640        4096         0
0x00000000 29130757   oracle     640        4096         0
0x0e2f3c64 29163526   oracle     640        4096         0
0x00000000 29655047   oracle     640        268435456    59
0x00000000 29687816   oracle     640        51271172096  59
0x2894b058 29720585   oracle     640        2097152      59
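I could be wrong, but the bytes column is in bytes, so 51271172096/1024/1024/1024 = 47.75, i.e. just under 48GB. Adding the 268435456-byte (256MB) and 2097152-byte (2MB) segments brings the total to exactly 24577 pages of 2MB, which would seem to be my 48G SGA.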
To actually see the HugePages being allocated and deallocated, run the following useful shell script from Yong’s page. While it is running, shut down and start up your database.
while :
do
  for i in $(grep ^Huge /proc/meminfo | head -3 | awk '{print $2}'); do
    echo -n "$i "
  done
  echo ""
  sleep 5
done
Step 4: Configure USE_LARGE_PAGES
Once you have it running using hugepages and have verified HugePages are being used you should turn on the USE_LARGE_PAGES parameter in the database. This will give you information in the alert log on HugePage use to help confirm it is being used.
alter system set use_large_pages=only scope=spfile sid='*';
Restart your instance and if things are good you should see something like this:
Starting ORACLE instance (normal)
****************** Large Pages Information *****************
Parameter use_large_pages = ONLY
Total Shared Global Region in Large Pages = 48 GB (100%)
Large Pages used by this instance: 24577 (48 GB)
Large Pages unused system wide = 3 (6144 KB) (alloc incr 128 MB)
Large Pages configured system wide = 24580 (48 GB)
Large Page size = 2048 KB
***********************************************************
The following example shows that all is not well:
Starting ORACLE instance (normal)
****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = 0 KB (0%)
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 128 MB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB
RECOMMENDATION:
Total Shared Global Region size is 48 GB. For optimal performance, prior to the next instance restart increase the number of unused Large Pages by atleast 24577 2048 KB Large Pages (48 GB) system wide to get 100% of the Shared Global Region allocated with Large pages
****************** Large Pages Information *****************
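If you restart often, it can be handy to pull just the percentage out of the alert log instead of reading the whole banner. The sketch below relies only on the "Total Shared Global Region in Large Pages" line shown in the two examples above; the function name large_pages_pct is hypothetical, and the alert log path in the comment is an assumption that will differ on your system.

```shell
# large_pages_pct ALERT_LOG
# Prints the percentage from the most recent
# "Total Shared Global Region in Large Pages = ... (NN%)" line.
large_pages_pct() {
    sed -n 's/.*Total Shared Global Region in Large Pages = .*(\([0-9][0-9]*\)%).*/\1/p' "$1" | tail -n 1
}

# Example (hypothetical path):
# large_pages_pct /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/alert_orcl1.log
```

Anything below 100 means at least part of the SGA did not land in HugePages.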
All in one reboot
Configuration
/etc/security/limits.conf
oracle soft memlock 130000000
oracle hard memlock 130000000
/etc/sysctl.conf
kernel.shmall=16777216
kernel.shmmax=69793218560
vm.nr_hugepages=24580
sqlplus / as sysdba
alter system set sga_target=48G scope=spfile sid='*';
alter system set sga_max_size=48G scope=spfile sid='*';
alter system set pga_aggregate_target=16G scope=spfile sid='*';
alter system reset memory_max_target scope=spfile sid='*';
alter system reset memory_target scope=spfile sid='*';
alter system set use_large_pages=only scope=spfile sid='*';
Reboot the server.
Use the same verification procedures as above.
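Before the single reboot, it is worth a quick sanity check that the settings agree with each other. One check I would suggest, on the assumption that the oracle user must be allowed to lock at least as much memory as the HugePages pool holds, is sketched below. The function name memlock_covers_hugepages is hypothetical; the numbers are the ones from the configuration above.

```shell
# memlock_covers_hugepages MEMLOCK_KB NR_HUGEPAGES HUGEPAGE_KB
# Succeeds (exit 0) if the memlock limit is at least as large as the pool.
memlock_covers_hugepages() {
    pool_kb=$(( $2 * $3 ))
    [ "$1" -ge "$pool_kb" ]
}

if memlock_covers_hugepages 130000000 24580 2048; then
    echo "memlock is large enough for the HugePages pool"
else
    echo "memlock is too small for the HugePages pool"
fi
```

Here the pool is 24580 x 2048 kB = 50339840 kB, comfortably below the 130000000 kB memlock limit.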
Excellent!
A little improvement for the shell script that watches allocation/deallocation:
while :
do
awk '/^HugePages_/ {all = all $2 "\t"} END {print all}' /proc/meminfo
sleep 5
done
Hi, the total number of HugePages should be equal to or a bit more than the SGA, right? Not the SGA+PGA, since the PGA is not shared?
Regards,
RD.
It needs to be enough for all SGAs running on the server, preferably a little more to be safe. The PGA does not use HugePages, so it does not need to be counted.