Eric Dumazet 7e360c38ab fs: allow for more than 2^31 files
Andrew,

Could you please review this patch, you probably are the right guy to
take it, because it crosses fs and net trees.

Note : /proc/sys/fs/file-nr is a read-only file, so this patch doesnt
depend on previous patch (sysctl: fix min/max handling in
__do_proc_doulongvec_minmax())

Thanks !

[PATCH V4] fs: allow for more than 2^31 files

Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :

<quote>

We were seeing a failure which prevented boot.  The kernel was incapable
of creating either a named pipe or unix domain socket.  This comes down
to a common kernel function called unix_create1() which does:

        atomic_inc(&unix_nr_socks);
        if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
                goto out;

The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().

        n = (mempages * (PAGE_SIZE / 1024)) / 10;
        files_stat.max_files = n;

In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000).  That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.

</quote>

Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of
atomic_t.

get_max_files() is changed to return an unsigned long.
get_nr_files() is changed to return a long.

unix_nr_socks is changed from atomic_t to atomic_long_t, while not
strictly needed to address Robin problem.

Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968

After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704     0       2147483648

Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-25 21:18:20 -04:00
..
2010-09-09 20:48:37 +02:00
2010-08-09 16:48:42 -04:00
2010-07-14 11:29:46 +02:00
2010-07-28 09:58:19 -04:00
2010-08-11 00:28:20 -04:00
2009-09-18 09:48:52 -07:00
2010-09-09 21:07:09 +02:00
2010-09-22 17:22:39 -07:00
2010-10-01 10:50:58 -07:00
2010-06-29 10:07:09 +02:00
2010-10-18 18:44:26 +02:00
2010-08-11 23:04:20 +09:30
2010-08-19 17:18:02 -07:00
2010-08-20 08:55:00 -07:00
2010-05-11 12:01:10 -07:00
2010-10-07 09:41:25 +02:00
2010-10-25 21:18:20 -04:00
2010-07-27 12:40:54 +02:00
2010-03-06 11:26:23 -08:00
2009-09-23 18:13:10 -07:00
2010-05-10 08:48:39 +02:00
2009-06-18 13:03:55 -07:00