How to correct the error "Too many files open"

Environment

  • Red Hat Enterprise Linux (RHEL) 5, 6, 7

Issue

  • How to correct the error "Too many open files"
  • Error during login "Too many open files" and the session gets terminated automatically.

Resolution

  • This error is generated when the number of files opened by a user or by the whole system exceeds the configured open file limit.

SYSTEM Wide settings

  • To see the system-wide limit on open files, use the following command:
        # cat /proc/sys/fs/file-max
    
  • This value is the maximum number of file handles that all processes running on the system can hold open at once. By default it scales with the amount of RAM in the system. As a rough guideline, it is approximately 100,000 files per GB of RAM: about 400,000 files on a 4 GB machine and 1,600,000 on a 16 GB machine.
  • To change the system-wide maximum, edit /etc/sysctl.conf as root and add the following line to the end of the file:
        fs.file-max = 495000
    
    Note: The above example sets the maximum number of open files to 495,000. The setting in /etc/sysctl.conf is applied when the system boots.
  • To apply the change to the running system immediately, issue the following command:
        # sysctl -p
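
To judge whether fs.file-max actually needs raising, it can help to compare it with the number of handles currently in use. The kernel exposes both in /proc/sys/fs/file-nr, which holds three fields: allocated handles, allocated but unused handles, and the maximum. The following is a minimal sketch; the helper name file_handle_usage is made up for illustration and is not a standard command.

```shell
# Sketch: report system-wide file handle usage as a percentage of fs.file-max.
# file_handle_usage is a hypothetical helper, not a standard command.
file_handle_usage() {
    # Arguments mirror the three fields of /proc/sys/fs/file-nr:
    #   $1 allocated handles, $2 allocated-but-unused handles, $3 maximum
    in_use=$(( $1 - $2 ))
    echo "in_use=${in_use} max=$3 percent=$(( in_use * 100 / $3 ))"
}

# On a live system, feed the helper from the kernel's counters.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused max < /proc/sys/fs/file-nr
    file_handle_usage "$allocated" "$unused" "$max"
fi
```

A percentage that stays close to 100 suggests the system-wide limit is the bottleneck. A low percentage points toward a per-user or per-process limit instead.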
    

Per USER Settings

  • To see the setting for maximum open files for a user, as root issue the following commands:
        # su - <user>
        $ ulimit -n
    
  • The default is usually 1024. If a specific user needs more, set it as root in the /etc/security/limits.conf file:
        user - nofile 2048
    
    This will set the maximum open files for the specific "user" to 2048 files.
** WARNING **
The limits module that sets these values first reads /etc/security/limits.conf and then reads each file matching /etc/security/limits.d/*.conf. This means that any change made in /etc/security/limits.conf may be overridden by a later file. Review these files as a set to ensure they meet your requirements.
  • To raise the limit for all users, edit /etc/security/limits.conf as root and add the following:
        * - nofile 2048
    
  • This sets the maximum open files for ALL users to 2048 files. The setting is applied when a new login session is opened, so users must log out and back in (or the system must be rebooted) for it to become active.
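
Because a later file can override an earlier one, it is useful to see every nofile entry in one place. The following is a rough sketch that lists the uncommented nofile lines from /etc/security/limits.conf and /etc/security/limits.d/*.conf. The function name show_nofile_entries and its optional directory argument are assumptions made for illustration. Setting LC_ALL=C is intended to make the glob sort in "C" locale order, which is how the limits.conf documentation describes the parse order.

```shell
# Sketch: list nofile entries from limits.conf and then limits.d/*.conf.
# show_nofile_entries is a hypothetical helper; the directory argument exists
# only so the function can be pointed at a copy of /etc/security.
show_nofile_entries() (
    # The parentheses run the body in a subshell, so LC_ALL stays local.
    LC_ALL=C
    dir=${1:-/etc/security}
    for f in "$dir/limits.conf" "$dir"/limits.d/*.conf; do
        [ -f "$f" ] || continue
        # Print "file:line" for every uncommented line that sets nofile.
        grep -H -E '^[^#]*[[:space:]]nofile[[:space:]]' "$f" || true
    done
)

show_nofile_entries
```

When the same user or wildcard appears more than once, the entry printed last is the one most likely to win, so that is the line to check first.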

NFS errors

  • On NFS-mounted file systems, the error may appear on the client side but need to be corrected on the NFS server side. For example, an application accessing files over NFS may report an error like this on the client:
java.io.FileNotFoundException: /apps/jenkins/jobs/TCE_FWS/builds/2013-03-21_04-01-34/archive/release/fws.war (Too many open files)
The solution is to increase the open file limit on the NFS server. Here it was doubled from the default value in /etc/security/limits.conf:
        * - nofile 2048

Difference between hard limit and soft limit

  • The "nofile" item has two possible limit values: hard and soft. The hard limit is the maximum value a soft limit may have, and the soft limit is the limit actively enforced at that time. Normal users can lower a hard limit but cannot raise it, and a soft limit cannot be set higher than the hard limit. Only root may raise hard limits.
  • Both types of limits must be set before the change in the maximum number of open files will take effect. By using the "-" character, both hard and soft limits are set simultaneously.
  • For more details see the article All about resource limits: ulimit, pam_limits.so, /etc/limits.conf, and /etc/limits.d/
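
The relationship between the two limits can be observed from any shell with the ulimit builtin, where -S selects the soft limit and -H the hard limit. The following is a small sketch, not part of the original procedure. The helper soft_after_lowering is a made-up name, and 256 is an arbitrary example value assumed to be below the hard limit.

```shell
# Sketch: inspect soft and hard nofile limits, then lower the soft limit
# in a subshell so the calling shell is left untouched.
# soft_after_lowering is a hypothetical helper for illustration.
soft_after_lowering() (
    # The parentheses run the body in a subshell, so the caller keeps its limits.
    ulimit -Sn "$1" && ulimit -Sn
)

echo "current: soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
echo "soft limit inside subshell after lowering: $(soft_after_lowering 256)"
echo "caller afterwards: soft=$(ulimit -Sn)"
```

In /etc/security/limits.conf the same distinction is written with the keywords soft and hard in the second column in place of "-".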

Diagnostic Steps

The following script can be placed into /etc/cron.hourly/ or /etc/cron.daily/ to check whether a service has logged "Too many open files". When the message is found, it saves the lsof output for the service user that you set.
  • Set the SERVICE_USER and LOG_FILE variables to configure this script.
#!/bin/bash
# Service account to inspect and the service log to search.
SERVICE_USER=activemq
LOG_FILE=/var/log/activemq/activemq.log

# lsof output is saved here the first time the error is seen.
LIMITS_LOG=/var/log/${SERVICE_USER}_filelimits

echo "Log File: $LOG_FILE"
echo "Limit File: $LIMITS_LOG"
echo "Service User: $SERVICE_USER"

# Do nothing if an earlier run already captured the open files.
if [[ -e "$LIMITS_LOG" ]]; then
    logger "Too many open files cron job ran, however we already found file limits in the log."
    exit 0
else
    logger "Too many open files cron job ran."
fi

echo "grepping $LOG_FILE for \"Too many open files\""
if grep -qs "Too many open files" "$LOG_FILE"; then
    # Record every file currently opened by the service user.
    lsof -u "$SERVICE_USER" > "$LIMITS_LOG"
    logger "Too many open files cron job ran, and found a file limit was logged."
fi
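
Limits are inherited when a process starts, so a long-running service may still carry an older limit than a fresh login shell reports. As a complementary check to the cron script, the sketch below counts the descriptors one process has open and prints the limit in effect for it, using /proc/<pid>/fd and /proc/<pid>/limits. The helper name fd_count is hypothetical, and reading another user's /proc/<pid>/fd generally requires root.

```shell
# Sketch: count the file descriptors a process has open and show its limit.
# fd_count is a hypothetical helper; it prints 0 if the PID cannot be read.
fd_count() {
    # Each entry in /proc/<pid>/fd is one open descriptor.
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

pid=$$   # the current shell; substitute the PID of the affected service
echo "open descriptors: $(fd_count "$pid")"

# "Max open files" shows the soft and hard limits in effect for that process.
if [ -r "/proc/$pid/limits" ]; then
    grep "Max open files" "/proc/$pid/limits"
fi
```

If the count is close to the soft limit shown, that process is the one running out of descriptors.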
