Age | Commit message (Collapse) | Author |
|
Thanks to Matthew Dodd for this bug report:
A file label issue while running SELinux in MLS mode provoked the
following bug, which is a result of use before init on a 'struct list_head'.
In nfsd4_list_rec_dir() if the call to dentry_open() fails the 'goto
out' skips INIT_LIST_HEAD() which results in the normally improbable
case where list_entry() returns NULL.
Trace follows.
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
SELinux: Context unconfined_t:object_r:var_lib_nfs_t:s0 is not valid
(left unmapped).
type=1400 audit(1227298063.609:282): avc: denied { read } for
pid=1890 comm="rpc.nfsd" name="v4recovery" dev=dm-0 ino=148726
scontext=system_u:system_r:nfsd_t:s0-s15:c0.c1023
tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=dir
BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c050894e>] list_del+0x6/0x60
*pde = 0d9ce067 *pte = 00000000
Oops: 0000 [#1] SMP
Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4
sunrpc ipv6 dm_multipath scsi_dh ppdev parport_pc sg parport floppy
ata_piix pata_acpi ata_generic libata pcnet32 i2c_piix4 mii pcspkr
i2c_core dm_snapshot dm_zero dm_mirror dm_log dm_mod BusLogic sd_mod
scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last
unloaded: microcode]
Pid: 1890, comm: rpc.nfsd Not tainted (2.6.27.5-37.fc9.i686 #1)
EIP: 0060:[<c050894e>] EFLAGS: 00010217 CPU: 0
EIP is at list_del+0x6/0x60
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: cd99e480
ESI: cf9caed8 EDI: 00000000 EBP: cf9caebc ESP: cf9caeb8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rpc.nfsd (pid: 1890, ti=cf9ca000 task=cf4de580 task.ti=cf9ca000)
Stack: 00000000 cf9caef0 d0a9f139 c0496d04 d0a9f217 fffffff3 00000000
00000000
00000000 00000000 cf32b220 00000000 00000008 00000801 cf9caefc
d0a9f193
00000000 cf9caf08 d0a9b6ea 00000000 cf9caf1c d0a874f2 cf9c3004
00000008
Call Trace:
[<d0a9f139>] ? nfsd4_list_rec_dir+0xf3/0x13a [nfsd]
[<c0496d04>] ? do_path_lookup+0x12d/0x175
[<d0a9f217>] ? load_recdir+0x0/0x26 [nfsd]
[<d0a9f193>] ? nfsd4_recdir_load+0x13/0x34 [nfsd]
[<d0a9b6ea>] ? nfs4_state_start+0x2a/0xc5 [nfsd]
[<d0a874f2>] ? nfsd_svc+0x51/0xff [nfsd]
[<d0a87f2d>] ? write_svc+0x0/0x1e [nfsd]
[<d0a87f48>] ? write_svc+0x1b/0x1e [nfsd]
[<d0a87854>] ? nfsctl_transaction_write+0x3a/0x61 [nfsd]
[<c04b6a4e>] ? sys_nfsservctl+0x116/0x154
[<c04975c1>] ? putname+0x24/0x2f
[<c04975c1>] ? putname+0x24/0x2f
[<c048d49f>] ? do_sys_open+0xad/0xb7
[<c048d337>] ? filp_close+0x50/0x5a
[<c048d4eb>] ? sys_open+0x1e/0x26
[<c0403cca>] ? syscall_call+0x7/0xb
[<c064007b>] ? init_cyrix+0x185/0x490
=======================
Code: 75 e1 8b 53 08 8d 4b 04 8d 46 04 e8 75 00 00 00 8b 53 10 8d 4b 0c
8d 46 0c e8 67 00 00 00 5b 5e 5f 5d c3 90 90 55 89 e5 53 89 c3 <8b> 40
04 8b 00 39 d8 74 16 50 53 68 3e d6 6f c0 6a 30 68 78 d6
EIP: [<c050894e>] list_del+0x6/0x60 SS:ESP 0068:cf9caeb8
---[ end trace a89c4ad091c4ad53 ]---
Cc: Matthew N. Dodd <Matthew.Dodd@spart.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This takes care of all of the direct callers of vfs_mknod().
Since a few of these cases also handle normal file creation
as well, this also covers some calls to vfs_create().
So that we don't have to make three mnt_want/drop_write()
calls inside of the switch statement, we move some of its
logic outside of the switch and into a helper function
suggested by Christoph.
This also encapsulates a fix for mknod(S_IFREG) that Miklos
found.
[AV: merged mkdir handling, added missing nfsd pieces]
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Elevate the write count during the vfs_rmdir() and vfs_unlink().
[AV: merged rmdir and unlink parts, added missing pieces in nfsd]
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* Add path_put() functions for releasing a reference to the dentry and
vfsmount of a struct path in the right order
* Switch from path_release(nd) to path_put(&nd->path)
* Rename dput_path() to path_put_conditional()
[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.
Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:
without patch series:
text data bss dec hex filename
5321639 858418 715768 6895825 6938d1 vmlinux
with patch series:
text data bss dec hex filename
5320026 858418 715768 6894212 693284 vmlinux
This patch:
Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Not architecture specific code should not #include <asm/scatterlist.h>.
This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.18-1.2724.lockdepPAE #1
> ---------------------------------------------
> nfsd/6884 is trying to acquire lock:
> (&inode->i_mutex){--..}, at: [<c04811e5>] vfs_rmdir+0x73/0xf4
>
> but task is already holding lock:
> (&inode->i_mutex){--..}, at: [<f8dfa621>]
> nfsd4_clear_clid_dir+0x1f/0x3d [nfsd]
>
> other info that might help us debug this:
> 3 locks held by nfsd/6884:
> #0: (hash_sem){----}, at: [<f8de05eb>] nfsd+0x181/0x2ea [nfsd]
> #1: (client_mutex){--..}, at: [<f8df6d19>]
> nfsd4_setclientid_confirm+0x3b/0x2cf [nfsd]
> #2: (&inode->i_mutex){--..}, at: [<f8dfa621>]
> nfsd4_clear_clid_dir+0x1f/0x3d [nfsd]
>
> stack backtrace:
> [<c040524d>] dump_trace+0x69/0x1af
> [<c04053ab>] show_trace_log_lvl+0x18/0x2c
> [<c040595f>] show_trace+0xf/0x11
> [<c0405a53>] dump_stack+0x15/0x17
> [<c043ca7a>] __lock_acquire+0x110/0x9b6
> [<c043d91e>] lock_acquire+0x5c/0x7a
> [<c061a41b>] __mutex_lock_slowpath+0xde/0x234
> [<c04811e5>] vfs_rmdir+0x73/0xf4
> [<f8dfa62b>] nfsd4_clear_clid_dir+0x29/0x3d [nfsd]
> [<f8dfa733>] nfsd4_remove_clid_dir+0xb8/0xf8 [nfsd]
> [<f8df6e90>] nfsd4_setclientid_confirm+0x1b2/0x2cf [nfsd]
> [<f8def19a>] nfsd4_proc_compound+0x137a/0x166c [nfsd]
> [<f8de00d5>] nfsd_dispatch+0xc5/0x180 [nfsd]
> [<f8d09d83>] svc_process+0x3bd/0x631 [sunrpc]
> [<f8de0604>] nfsd+0x19a/0x2ea [nfsd]
> [<c0404e27>] kernel_thread_helper+0x7/0x10
> DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> Leftover inexact backtrace:
> =======================
Some nesting annotation to the nfsd4 recovery code.
The vfs operations called will take dentry->d_inode->i_mutex.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
When I was performing some operations on NFS, I got below error on server
side.
=============================================
[ INFO: possible recursive locking detected ]
2.6.19-prep #1
---------------------------------------------
nfsd4/3525 is trying to acquire lock:
(&inode->i_mutex){--..}, at: [<c0611e5a>] mutex_lock+0x21/0x24
but task is already holding lock:
(&inode->i_mutex){--..}, at: [<c0611e5a>] mutex_lock+0x21/0x24
other info that might help us debug this:
2 locks held by nfsd4/3525:
#0: (client_mutex){--..}, at: [<c0611e5a>] mutex_lock+0x21/0x24
#1: (&inode->i_mutex){--..}, at: [<c0611e5a>] mutex_lock+0x21/0x24
stack backtrace:
[<c04051ed>] show_trace_log_lvl+0x58/0x16a
[<c04057fa>] show_trace+0xd/0x10
[<c0405913>] dump_stack+0x19/0x1b
[<c043b6f1>] __lock_acquire+0x778/0x99c
[<c043be86>] lock_acquire+0x4b/0x6d
[<c0611ceb>] __mutex_lock_slowpath+0xbc/0x20a
[<c0611e5a>] mutex_lock+0x21/0x24
[<c047fd7e>] vfs_rmdir+0x76/0xf8
[<f94b7ce9>] nfsd4_clear_clid_dir+0x2c/0x41 [nfsd]
[<f94b7de9>] nfsd4_remove_clid_dir+0xb1/0xe8 [nfsd]
[<f94b307b>] laundromat_main+0x9b/0x1c3 [nfsd]
[<c04333d6>] run_workqueue+0x7a/0xbb
[<c0433d0b>] worker_thread+0xd2/0x107
[<c0436285>] kthread+0xc3/0xf2
[<c0402005>] kernel_thread_helper+0x5/0xb
===================================================================
Cause for this problem was,2 successive mutex_lock calls on 2 diffrent inodes ,as shown below
static int
nfsd4_clear_clid_dir(struct dentry *dir, struct dentry *dentry)
{
int status;
/* For now this directory should already be empty, but we empty it of
* any regular files anyway, just in case the directory was created by
* a kernel from the future.... */
nfsd4_list_rec_dir(dentry, nfsd4_remove_clid_file);
mutex_lock(&dir->d_inode->i_mutex);
status = vfs_rmdir(dir->d_inode, dentry);
...
int vfs_rmdir(struct inode *dir, struct dentry *dentry)
{
int error = may_delete(dir, dentry, 1);
if (error)
return error;
if (!dir->i_op || !dir->i_op->rmdir)
return -EPERM;
DQUOT_INIT(dir);
mutex_lock(&dentry->d_inode->i_mutex);
...
So I have developed the patch to overcome this problem.
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
These patches make the kernel pass 64-bit inode numbers internally when
communicating to userspace, even on a 32-bit system. They are required
because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
for example. The 64-bit inode numbers are then propagated to userspace
automatically where the arch supports it.
Problems have been seen with userspace (eg: ld.so) using the 64-bit inode
number returned by stat64() or getdents64() to differentiate files, and
failing because the 64-bit inode number space was compressed to 32-bits, and
so overlaps occur.
This patch:
Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit
inode number so that 64-bit inode numbers can be passed back to userspace.
The stat functions then returns the full 64-bit inode number where
available and where possible. If it is not possible to represent the inode
number supplied by the filesystem in the field provided by userspace, then
error EOVERFLOW will be issued.
Similarly, the getdents/readdir functions now pass the full 64-bit inode
number to userspace where possible, returning EOVERFLOW instead when a
directory entry is encountered that can't be properly represented.
Note that this means that some inodes will not be stat'able on a 32-bit
system with old libraries where they were before - but it does mean that
there will be no ambiguity over what a 32-bit inode number refers to.
Note similarly that directory scans may be cut short with an error on a
32-bit system with old libraries where the scan would work before for the
same reasons.
It is judged unlikely that this situation will occur because modern glibc
uses 64-bit capable versions of stat and getdents class functions
exclusively, and that older systems are unlikely to encounter
unrepresentable inode numbers anyway.
[akpm: alpha build fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch converts all remaining crypto_digest users to use the new
crypto_hash interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Make sure we get a directory when we look up the recovery directory.
Thanks to Christoph Hellwig for the bug report.
Based on feedback from Christoph and others, we may remove the need for this
lookup and just pass in a file descriptor from userspace instead, and/or
completely move the directory handling to userspace. For now we're just
fixing the obvious bugs.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
We should be opening this directory RDONLY, not RDWR.
Thanks to Christoph Hellwig for the bug report.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.
Modified-by: Ingo Molnar <mingo@elte.hu>
(finished the conversion)
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Since the patch to add a NULL short-circuit to crypto_free_tfm() went in,
there's no longer any need for callers of that function to check for NULL.
This patch removes the redundant NULL checks and also a few similar checks
for NULL before calls to kfree() that I ran into while doing the
crypto_free_tfm bits.
I've succesfuly compile tested this patch, and a kernel with the patch
applied boots and runs just fine.
When I posted the patch to LKML (and other lists/people on Cc) it drew the
following comments :
J. Bruce Fields commented
"I've no problem with the auth_gss or nfsv4 bits.--b."
Sridhar Samudrala said
"sctp change looks fine."
Herbert Xu signed off on the patch.
So, I guess this is ready to be dropped into -mm and eventually mainline.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch goes through the current users of the crypto layer and sets
CRYPTO_TFM_REQ_MAY_SLEEP at crypto_alloc_tfm() where all crypto operations
are performed in process context.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure we don't try to delete client recovery directories multiple times;
fixes some spurious error messages.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Oops, this lookup_one_len needs the i_sem.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
We need to fsync the recovery directory after writing to it, but we weren't
doing this correctly. (For example, we weren't taking the i_sem when calling
->fsync().)
Just reuse the existing nfsd fsync code instead.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Set the recovery directory via /proc/fs/nfsd/nfs4recoverydir.
It may be changed any time, but is used only on startup.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds the code to create and remove client subdirectories from the
recovery directory, as described in the previous patch comment.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
NFSv4 clients are required to know what state they have on the server so that
they can reclaim it on server reboot. However, it is possible for
pathalogical combinations of server reboots and network partitions to leave a
client in a state where it cannot know whether it has lost its state on the
server.
For this reason, rfc3530 requires that we store some information about clients
to stable storage.
So we maintain a directory /var/lib/nfs/v4recovery with a subdirectory for
each client with active state. We leave open the possibility of including
files underneath each such subdirectory with information about the client, but
for now the subdirectories are empty.
We create a client subdirectory whenever a client makes its first non-reclaim
open_confirm.
We remove a client subdirectory whenever either
a) its lease expires, or
b) the grace period ends without it reclaiming anything.
When handling reclaims, we allow the reclaim if and only if the client doing
the reclaim has a subdirectory.
This patch adds just the code to scan the recovery directory on nfsd startup.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
For the purposes of reboot recovery we keep a directory with subdirectories
each having a name that is the ascii hex representation of the md5 sum of a
client identifier for an active client.
This adds the code to calculate that name. We also use it for the purposes of
comparing clients, so if someone ever manages to find two client names that
are md5 collisions, then we'll return clid_inuse to the second.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|