summaryrefslogtreecommitdiff
path: root/fs/nfsd
AgeCommit message (Collapse)Author
2017-03-06NFSD: Fix a null reference case in find_or_create_lock_stateid()Kinglong Mee
[ Upstream commit d19fb70dd68c4e960e2ac09b0b9c79dfdeefa726 ] nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid(). If nfsd doesn't go through init_lock_stateid() and put stateid at end, there is a NULL reference to .sc_free when calling nfs4_put_stid(ns). This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid(). Cc: stable@vger.kernel.org Fixes: 356a95ece7aa "nfsd: clean up races in lock stateid searching..." Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-12-23fs: Give dentry to inode_change_ok() instead of inodeJan Kara
[ Upstream commit 31051c85b5e2aaaf6315f74c72a732673632a905 ] inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. References: CVE-2015-1350 Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Philipp Hahn <hahn@univention.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-12-23nfsd: Disable NFSv2 timestamp workaround for NFSv3+Andreas Gruenbacher
NFSv2 can set the atime and/or mtime of a file to specific timestamps but not to the server's current time. To implement the equivalent of utimes("file", NULL), it uses a heuristic. NFSv3 and later do support setting the atime and/or mtime to the server's current time directly. The NFSv2 heuristic is still enabled, and causes timestamps to be set wrong sometimes. Fix this by moving the heuristic into the NFSv2 specific code. We can leave it out of the create code path: the owner can always set timestamps arbitrarily, and the workaround would never trigger. References: CVE-2015-1350 Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Philipp Hahn <hahn@univention.de> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-08-19nfsd: check permissions when setting ACLsBen Hutchings
[ Upstream commit 999653786df6954a31044528ac3f7a5dadca08f4 ] Use set_posix_acl, which includes proper permission checks, instead of calling ->set_acl directly. Without this anyone may be able to grant themselves permissions to a file by setting the ACL. Lock the inode to make the new checks atomic with respect to set_acl. (Also, nfsd was the only caller of set_acl not locking the inode, so I suspect this may fix other races.) This also simplifies the code, and ensures our ACLs are checked by posix_acl_valid. The permission checks and the inode locking were lost with commit 4ac7249e, which changed nfsd to use the set_acl inode operation directly instead of going through xattr handlers. Reported-by: David Sinquin <david@sinquin.eu> [agreunba@redhat.com: use set_posix_acl] Fixes: 4ac7249e Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
2016-07-10nfsd4/rpc: move backchannel create logic into rpc codeJ. Bruce Fields
[ Upstream commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 ] Also simplify the logic a bit. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Trond Myklebust <trondmy@primarydata.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-04-18nfsd: fix deadlock secinfo+readdir compoundJ. Bruce Fields
[ Upstream commit 2f6fc056e899bd0144a08da5cacaecbe8997cd74 ] nfsd_lookup_dentry exits with the parent filehandle locked. fh_put also unlocks if necessary (nfsd filehandle locking is probably too lenient), so it gets unlocked eventually, but if the following op in the compound needs to lock it again, we can deadlock. A fuzzer ran into this; normal clients don't send a secinfo followed by a readdir in the same compound. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-04-18nfsd4: fix bad bounds checkingJ. Bruce Fields
[ Upstream commit 4aed9c46afb80164401143aa0fdcfe3798baa9d5 ] A number of spots in the xdr decoding follow a pattern like n = be32_to_cpup(p++); READ_BUF(n + 4); where n is a u32. The only bounds checking is done in READ_BUF itself, but since it's checking (n + 4), it won't catch cases where n is very large, (u32)(-4) or higher. I'm not sure exactly what the consequences are, but we've seen crashes soon after. Instead, just break these up into two READ_BUF()s. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-12-14nfsd: eliminate sending duplicate and repeated delegationsAndrew Elble
commit 34ed9872e745fa56f10e9bef2cf3d2336c6c8816 upstream. We've observed the nfsd server in a state where there are multiple delegations on the same nfs4_file for the same client. The nfs client does attempt to DELEGRETURN these when they are presented to it - but apparently under some (unknown) circumstances the client does not manage to return all of them. This leads to the eventual attempt to CB_RECALL more than one delegation with the same nfs filehandle to the same client. The first recall will succeed, but the next recall will fail with NFS4ERR_BADHANDLE. This leads to the server having delegations on cl_revoked that the client has no way to FREE or DELEGRETURN, with resulting inability to recover. The state manager on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED, and the state manager on the client will be looping unable to satisfy the server. List discussion also reports a race between OPEN and DELEGRETURN that will be avoided by only sending the delegation once to the client. This is also logically in accordance with RFC5561 9.1.1 and 10.2. So, let's: 1.) Not hand out duplicate delegations. 2.) Only send them to the client once. RFC 5561: 9.1.1: "Delegations and layouts, on the other hand, are not associated with a specific owner but are associated with the client as a whole (identified by a client ID)." 10.2: "...the stateid for a delegation is associated with a client ID and may be used on behalf of all the open-owners for the given client. A delegation is made to the client as a whole and not to any specific process or thread of control within it." Reported-by: Eric Meddaugh <etmsys@rit.edu> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-12-14nfsd: serialize state seqid morphing operationsJeff Layton
commit 35a92fe8770ce54c5eb275cd76128645bea2d200 upstream. Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were running in parallel. The server would receive the OPEN_DOWNGRADE first and check its seqid, but then an OPEN would race in and bump it. The OPEN_DOWNGRADE would then complete and bump the seqid again. The result was that the OPEN_DOWNGRADE would be applied after the OPEN, even though it should have been rejected since the seqid changed. The only recourse we have here I think is to serialize operations that bump the seqid in a stateid, particularly when we're given a seqid in the call. To address this, we add a new rw_semaphore to the nfs4_ol_stateid struct. We do a down_write prior to checking the seqid after looking up the stateid to ensure that nothing else is going to bump it while we're operating on it. In the case of OPEN, we do a down_read, as the call doesn't contain a seqid. Those can run in parallel -- we just need to serialize them when there is a concurrent OPEN_DOWNGRADE or CLOSE. LOCK and LOCKU however always take the write lock as there is no opportunity for parallelizing those. Reported-and-Tested-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-27nfsd/blocklayout: accept any minlengthChristoph Hellwig
commit 8c3ad9cb7343dc5f61b8cf3cdbe1016c5e7c2c8b upstream. Recent Linux clients have started to send GETLAYOUT requests with minlength less than blocksize. Servers aren't really allowed to impose this kind of restriction on layouts; see RFC 5661 section 18.43.3 for details. This has been observed to cause indefinite hangs on fsx runs on some clients. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29nfsd: ensure that delegation stateid hash references are only put onceJeff Layton
commit 3fcbbd244ed1d20dc0eb7d48d729503992fa9b7d upstream. It's possible that a DELEGRETURN could race with (e.g.) client expiry, in which case we could end up putting the delegation hash reference more than once. Have unhash_delegation_locked return a bool that indicates whether it was already unhashed. In the case of destroy_delegation we only conditionally put the hash reference if that returns true. The other callers of unhash_delegation_locked call it while walking list_heads that shouldn't yet be detached. If we find that it doesn't return true in those cases, then throw a WARN_ON as that indicates that we have a partially hashed delegation, and that something is likely very wrong. Tested-by: Andrew W Elble <aweits@rit.edu> Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29nfsd: ensure that the ol stateid hash reference is only put onceJeff Layton
commit e85687393f3ee0a77ccca016f903d1558bb69258 upstream. When an open or lock stateid is hashed, we take an extra reference to it. When we unhash it, we drop that reference. The code however does not properly account for the case where we have two callers concurrently trying to unhash the stateid. This can lead to list corruption and the hash reference being put more than once. Fix this by having unhash_ol_stateid use list_del_init on the st_perfile list_head, and then testing to see if that list_head is empty before releasing the hash reference. This means that some of the unhashing wrappers now become bool return functions so we can test to see whether the stateid was unhashed before we put the reference. Reported-by: Andrew W Elble <aweits@rit.edu> Tested-by: Andrew W Elble <aweits@rit.edu> Reported-by: Anna Schumaker <Anna.Schumaker@netapp.com> Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bugKinglong Mee
commit 6896f15aabde505b35888039af93d1d182a0108a upstream. Currently we'll respond correctly to a request for either FS_LAYOUT_TYPES or LAYOUT_TYPES, but not to a request for both attributes simultaneously. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16nfsd: do nfs4_check_fh in nfs4_check_file instead of nfs4_check_olstateidJeff Layton
commit 8fcd461db7c09337b6d2e22d25eb411123f379e3 upstream. Currently, preprocess_stateid_op calls nfs4_check_olstateid which verifies that the open stateid corresponds to the current filehandle in the call by calling nfs4_check_fh. If the stateid is a NFS4_DELEG_STID however, then no such check is done. This could cause incorrect enforcement of permissions, because the nfsd_permission() call in nfs4_check_file uses current the current filehandle, but any subsequent IO operation will use the file descriptor in the stateid. Move the call to nfs4_check_fh into nfs4_check_file instead so that it can be done for all stateid types. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> [bfields: moved fh check to avoid NULL deref in special stateid case] Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16nfsd: refactor nfs4_preprocess_stateid_opChristoph Hellwig
commit a0649b2d3fffb1cde8745568c767f3a55a3462bc upstream. Split out two self contained helpers to make the function more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystemKinglong Mee
commit c2227a39a078473115910512aa0f8d53bd915e60 upstream. On an absent filesystem (one served by another server), we need to be able to handle requests for certain attributest (like fs_locations, so the client can find out which server does have the filesystem), but others we can't. We forgot to take that into account when adding another attribute bitmask work for the SECURITY_LABEL attribute. There an export entry with the "refer" option can result in: [ 88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249! [ 88.414828] invalid opcode: 0000 [#1] SMP [ 88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd] [ 88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1 [ 88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000 [ 88.419729] RIP: 0010:[<ffffffffa04b3c10>] [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.420376] RSP: 0000:ffff8800785db998 EFLAGS: 00010206 [ 88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980 [ 88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000 [ 88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000 [ 88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a [ 88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980 [ 88.424295] FS: 0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000 [ 88.424944] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0 [ 88.426285] Stack: [ 88.426921] ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0 [ 88.427585] ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0 [ 88.428228] ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980 [ 88.428877] Call Trace: [ 88.429527] [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd] [ 88.430168] [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0 [ 88.430807] [<ffffffff8123833e>] ? d_lookup+0x2e/0x60 [ 88.431449] [<ffffffff81236133>] ? dput+0x33/0x230 [ 88.432097] [<ffffffff8123f214>] ? mntput+0x24/0x40 [ 88.432719] [<ffffffff812272b2>] ? path_put+0x22/0x30 [ 88.433340] [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd] [ 88.433954] [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd] [ 88.434601] [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd] [ 88.435172] [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd] [ 88.435710] [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd] [ 88.436447] [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd] [ 88.437011] [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd] [ 88.437566] [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd] [ 88.438157] [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd] [ 88.438680] [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc] [ 88.439192] [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc] [ 88.439694] [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd] [ 88.440194] [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 88.440697] [<ffffffff810bb728>] kthread+0xd8/0xf0 [ 88.441260] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.441762] [<ffffffff81789e58>] ret_from_fork+0x58/0x90 [ 88.442322] [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180 [ 88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe [ 88.444052] RIP [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd] [ 88.444658] RSP <ffff8800785db998> [ 88.445232] ---[ end trace 6cb9d0487d94a29f ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-04nfsd: skip CB_NULL probes for 4.1 or laterChristoph Hellwig
With sessions in v4.1 or later we don't need to manually probe the backchannel connection, so we can declare it up instantly after setting up the RPC client. Note that we really should split nfsd4_run_cb_work in the long run, this is just the least intrusive fix for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd: fix callback restartsChristoph Hellwig
Checking the rpc_client pointer is not a reliable way to detect backchannel changes: cl_cb_client is changed only after shutting down the rpc client, so the condition cl_cb_client = tk_client will always be true. Check the RPC_TASK_KILLED flag instead, and rewrite the code to avoid the buggy cl_callbacks list and fix the lifetime rules due to double calls of the ->prepare callback operations method for this retry case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd: split transport vs operation errors for callbacksChristoph Hellwig
We must only increment the sequence id if the client has seen and responded to a request. If we failed to deliver it to the client we must resend with the same sequence id. So just like the client track errors at the transport level differently from those returned in the XDR. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd: fix pNFS return on close semanticsSachin Bhamare
For the sake of forgetful clients, the server should return the layouts to the file system on 'last close' of a file (assuming that there are no delegations outstanding to that particular client) or on delegreturn (assuming that there are no opens on a file from that particular client). In theory the information is all there in current data structures, but it's not efficiently available; nfs4_file->fi_ref includes references on the file across all clients, but we need a per-(client, file) count. Walking through lots of stateid's to calculate this on each close or delegreturn would be painful. This patch introduces infrastructure to maintain per-client opens and delegation counters on a per-file basis. [hch: ported to the mainline pNFS support, merged various fixes from Jeff] Signed-off-by: Sachin Bhamare <sachin.bhamare@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd: fix the check for confirmed openowner in nfs4_preprocess_stateid_opChristoph Hellwig
If we find a non-confirmed openowner we jump to exit the function, but do not set an error value. Fix this by factoring out a helper to do the check and properly set the error from nfsd4_validate_stateid. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04nfsd/blocklayout: pretend we can send deviceid notificationsChristoph Hellwig
Commit df52699e4fcef ("NFSv4.1: Don't cache deviceids that have no notifications") causes the Linux NFS client to stop caching deviceid's unless a server pretends to support deviceid notifications. While this behavior is stupid and the language around this area in rfc5661 is a mess carified by an errata that I submittted, Trond insists on this behavior. Not caching deviceids degrades block layout performance massively as a GETDEVICEINFO is fairly expensive. So add this hack to make the Linux client happy again. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-24Merge branch 'for-4.1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "A quiet cycle this time; this is basically entirely bugfixes. The few that aren't cc'd to stable are cleanup or seemed unlikely to affect anyone much" * 'for-4.1' of git://linux-nfs.org/~bfields/linux: uapi: Remove kernel internal declaration nfsd: fix nsfd startup race triggering BUG_ON nfsd: eliminate NFSD_DEBUG nfsd4: fix READ permission checking nfsd4: disallow SEEK with special stateids nfsd4: disallow ALLOCATE with special stateids nfsd: add NFSEXP_PNFS to the exflags array nfsd: Remove duplicate macro define for max sec label length nfsd: allow setting acls with unenforceable DENYs nfsd: NFSD_FAULT_INJECTION depends on DEBUG_FS nfsd: remove unused status arg to nfsd4_cleanup_open_state nfsd: remove bogus setting of status in nfsd4_process_open2 NFSD: Use correct reply size calculating function NFSD: Using path_equal() for checking two paths
2015-04-21nfsd: fix nsfd startup race triggering BUG_ONGiuseppe Cantavenera
nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...) in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id. The following was observed on a MIPS 32-core processor: kernel: Call Trace: kernel: [<ffffffffc00bc5e4>] rpc_pipefs_event+0x7c/0x158 [nfsd] kernel: [<ffffffff8017a2a0>] notifier_call_chain+0x70/0xb8 kernel: [<ffffffff8017a4e4>] __blocking_notifier_call_chain+0x4c/0x70 kernel: [<ffffffff8053aff8>] rpc_fill_super+0xf8/0x1a0 kernel: [<ffffffff8022204c>] mount_ns+0xb4/0xf0 kernel: [<ffffffff80222b48>] mount_fs+0x50/0x1f8 kernel: [<ffffffff8023dc00>] vfs_kern_mount+0x58/0xf0 kernel: [<ffffffff802404ac>] do_mount+0x27c/0xa28 kernel: [<ffffffff80240cf0>] SyS_mount+0x98/0xe8 kernel: [<ffffffff80135d24>] handle_sys64+0x44/0x68 kernel: kernel: Code: 0040f809 00000000 2e020001 <00020336> 3c12c00d 3c02801a de100000 6442eb98 0040f809 kernel: ---[ end trace 7471374335809536 ]--- Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before registering rpc_pipefs_event(...) with the notifier chain. Signed-off-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com> Signed-off-by: Lorenzo Restelli <lorenzo.restelli.ext@nokia.com> Reviewed-by: Kinlong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21nfsd: eliminate NFSD_DEBUGMark Salter
Commit f895b252d4edf ("sunrpc: eliminate RPC_DEBUG") introduced use of IS_ENABLED() in a uapi header which leads to a build failure for userspace apps trying to use <linux/nfsd/debug.h>: linux/nfsd/debug.h:18:15: error: missing binary operator before token "(" #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) ^ Since this was only used to define NFSD_DEBUG if CONFIG_SUNRPC_DEBUG is enabled, replace instances of NFSD_DEBUG with CONFIG_SUNRPC_DEBUG. Cc: stable@vger.kernel.org Fixes: f895b252d4edf "sunrpc: eliminate RPC_DEBUG" Signed-off-by: Mark Salter <msalter@redhat.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21nfsd4: fix READ permission checkingJ. Bruce Fields
In the case we already have a struct file (derived from a stateid), we still need to do permission-checking; otherwise an unauthorized user could gain access to a file by sniffing or guessing somebody else's stateid. Cc: stable@vger.kernel.org Fixes: dc97618ddda9 "nfsd4: separate splice and readv cases" Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21nfsd4: disallow SEEK with special stateidsJ. Bruce Fields
If the client uses a special stateid then we'll pass a NULL file to vfs_llseek. Fixes: 24bab491220f " NFSD: Implement SEEK" Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: stable@vger.kernel.org Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21nfsd4: disallow ALLOCATE with special stateidsJ. Bruce Fields
vfs_fallocate will hit a NULL dereference if the client tries an ALLOCATE or DEALLOCATE with a special stateid. Fix that. (We also depend on the open to have broken any conflicting leases or delegations for us.) (If it turns out we need to allow special stateid's then we could do a temporary open here in the special-stateid case, as we do for read and write. For now I'm assuming it's not necessary.) Fixes: 95d871f03cae "nfsd: Add ALLOCATE support" Cc: stable@vger.kernel.org Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge second patchbomb from Andrew Morton: - the rest of MM - various misc bits - add ability to run /sbin/reboot at reboot time - printk/vsprintf changes - fiddle with seq_printf() return value * akpm: (114 commits) parisc: remove use of seq_printf return value lru_cache: remove use of seq_printf return value tracing: remove use of seq_printf return value cgroup: remove use of seq_printf return value proc: remove use of seq_printf return value s390: remove use of seq_printf return value cris fasttimer: remove use of seq_printf return value cris: remove use of seq_printf return value openrisc: remove use of seq_printf return value ARM: plat-pxa: remove use of seq_printf return value nios2: cpuinfo: remove use of seq_printf return value microblaze: mb: remove use of seq_printf return value ipc: remove use of seq_printf return value rtc: remove use of seq_printf return value power: wakeup: remove use of seq_printf return value x86: mtrr: if: remove use of seq_printf return value linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK MAINTAINERS: CREDITS: remove Stefano Brivio from B43 .mailmap: add Ricardo Ribalda CREDITS: add Ricardo Ribalda Delgado ...
2015-04-15kernel: conditionally support non-root users, groups and capabilitiesIulia Manda
There are a lot of embedded systems that run most or all of their functionality in init, running as root:root. For these systems, supporting multiple users is not necessary. This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for non-root users, non-root groups, and capabilities optional. It is enabled under CONFIG_EXPERT menu. When this symbol is not defined, UID and GID are zero in any possible case and processes always have all capabilities. The following syscalls are compiled out: setuid, setregid, setgid, setreuid, setresuid, getresuid, setresgid, getresgid, setgroups, getgroups, setfsuid, setfsgid, capget, capset. Also, groups.c is compiled out completely. In kernel/capability.c, capable function was moved in order to avoid adding two ifdef blocks. This change saves about 25 KB on a defconfig build. The most minimal kernels have total text sizes in the high hundreds of kB rather than low MB. (The 25k goes down a bit with allnoconfig, but not that much. The kernel was booted in Qemu. All the common functionalities work. Adding users/groups is not possible, failing with -ENOSYS. Bloat-o-meter output: add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650) [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Iulia Manda <iulia.manda21@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-03nfsd: add NFSEXP_PNFS to the exflags arrayChristoph Hellwig
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-03locks: change lm_get_owner and lm_put_owner prototypesJeff Layton
The current prototypes for these operations are somewhat awkward as they deal with fl_owners but take struct file_lock arguments. In the future, we'll want to be able to take references without necessarily dealing with a struct file_lock. Change them to take fl_owner_t arguments instead and have the callers deal with assigning the values to the file_lock structs. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-03-31nfsd: Remove duplicate macro define for max sec label lengthKinglong Mee
NFS4_MAXLABELLEN has defined for sec label max length, use it directly. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31nfsd: allow setting acls with unenforceable DENYsJ. Bruce Fields
We've been refusing ACLs that DENY permissions that we can't effectively deny. (For example, we can't deny permission to read attributes.) Andreas points out that any DENY of Window's "read", "write", or "modify" permissions would trigger this. That would be annoying. So maybe we should be a little less paranoid, and ignore entirely the permissions that are meaningless to us. Reported-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31nfsd: NFSD_FAULT_INJECTION depends on DEBUG_FSChengyu Song
NFSD_FAULT_INJECTION depends on DEBUG_FS, otherwise the debugfs_create_* interface may return unexpected error -ENODEV, and cause system crash. Signed-off-by: Chengyu Song <csong84@gatech.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31nfsd: remove unused status arg to nfsd4_cleanup_open_stateJeff Layton
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31nfsd: remove bogus setting of status in nfsd4_process_open2Jeff Layton
status is always reset after this (and it doesn't make much sense there anyway). Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31NFSD: Use correct reply size calculating functionKinglong Mee
ALLOCATE/DEALLOCATE only reply one status value to client, so, using nfsd4_only_status_rsize for reply size calculating. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31NFSD: Using path_equal() for checking two pathsKinglong Mee
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-30nfsd: require an explicit option to enable pNFSChristoph Hellwig
Turns out sending out layouts to any client is a bad idea if they can't get at the storage device, so require explicit admin action to enable pNFS. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25NFSD: Fix bad update of layout in nfsd4_return_file_layoutKinglong Mee
With return layout as, (seg is return layout, lo is record layout) seg->offset <= lo->offset and layout_end(seg) < layout_end(lo), nfsd should update lo's offset to seg's end, and, seg->offset > lo->offset and layout_end(seg) >= layout_end(lo), nfsd should update lo's end to seg's offset. Fixes: 9cf514ccfa ("nfsd: implement pNFS operations") Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25NFSD: Take care the return value from nfsd4_encode_stateidKinglong Mee
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25NFSD: Printk blocklayout length and offset as format 0x%llxKinglong Mee
When testing pnfs with nfsd_debug on, nfsd print a negative number of layout length and foff in nfsd4_block_proc_layoutget as, "GET: -xxxx:-xxx 2" Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25nfsd: return correct lockowner when there is a race on hash insertJ. Bruce Fields
alloc_init_lock_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. Noticed by inspection after Jeff Layton fixed the same bug for open owners. Depending on client behavior, this one may be trickier to trigger in practice. Fixes: c58c6610ec24 "nfsd: Protect adding/removing lock owners using client_lock" Cc: <stable@vger.kernel.org> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Acked-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25nfsd: return correct openowner when there is a race to put one in the hashJeff Layton
alloc_init_open_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. In commit 7ffb588086e9, we changed it so that we allocate and initialize an openowner, and then check to see if a matching one got stuffed into the hashtable in the meantime. If it did, then we free the one we just allocated and take a reference on the one already there. There is a bug here though. The code will then return the pointer to the one that was allocated (and has now been freed). This wasn't evident before as this race almost never occurred. The Linux kernel client used to serialize requests for a single openowner. That has changed now with v4.0 kernels, and this race can now easily occur. Fixes: 7ffb588086e9 Cc: <stable@vger.kernel.org> # v3.17+ Cc: Trond Myklebust <trond.myklebust@primarydata.com> Reported-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20NFSD: Put exports after nfsd4_layout_verify failKinglong Mee
Fix commit 9cf514ccfa (nfsd: implement pNFS operations). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20NFSD: Error out when register_shrinker() failKinglong Mee
If register_shrinker() failed, nfsd will cause a NULL pointer access as, [ 9250.875465] nfsd: last server has exited, flushing export cache [ 9251.427270] BUG: unable to handle kernel NULL pointer dereference at (null) [ 9251.427393] IP: [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.427579] PGD 13e4d067 PUD 13e4c067 PMD 0 [ 9251.427633] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 9251.427706] Modules linked in: ip6t_rpfilter ip6t_REJECT bnep bluetooth xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs xfs microcode ppdev serio_raw pcspkr xor libcrc32c raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core nfsd(OE-) auth_rpcgss nfs_acl lockd sunrpc(E) ata_generic pata_acpi [ 9251.428240] CPU: 0 PID: 1557 Comm: rmmod Tainted: G OE 3.16.0-rc2+ #22 [ 9251.428366] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 9251.428496] task: ffff880000849540 ti: ffff8800136f4000 task.ti: ffff8800136f4000 [ 9251.428593] RIP: 0010:[<ffffffff8136fc29>] [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.428696] RSP: 0018:ffff8800136f7ea0 EFLAGS: 00010207 [ 9251.428751] RAX: 0000000000000000 RBX: ffffffffa0116d48 RCX: dead000000200200 [ 9251.428814] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0116d48 [ 9251.428876] RBP: ffff8800136f7ea0 R08: ffff8800136f4000 R09: 0000000000000001 [ 9251.428939] R10: 8080808080808080 R11: 0000000000000000 R12: ffffffffa011a5a0 [ 9251.429002] R13: 0000000000000800 R14: 0000000000000000 R15: 00000000018ac090 [ 9251.429064] FS: 00007fb9acef0740(0000) GS:ffff88003fa00000(0000) knlGS:0000000000000000 [ 9251.429164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9251.429221] CR2: 0000000000000000 CR3: 0000000031a17000 CR4: 00000000001407f0 [ 9251.429306] Stack: [ 9251.429410] ffff8800136f7eb8 ffffffff8136fcdd ffffffffa0116d20 ffff8800136f7ed0 [ 9251.429511] ffffffff8118a0f2 0000000000000000 ffff8800136f7ee0 ffffffffa00eb765 [ 9251.429610] ffff8800136f7ef0 ffffffffa010e93c ffff8800136f7f78 ffffffff81104ac2 [ 9251.429709] Call Trace: [ 9251.429755] [<ffffffff8136fcdd>] list_del+0xd/0x30 [ 9251.429896] [<ffffffff8118a0f2>] unregister_shrinker+0x22/0x40 [ 9251.430037] [<ffffffffa00eb765>] nfsd_reply_cache_shutdown+0x15/0x90 [nfsd] [ 9251.430106] [<ffffffffa010e93c>] exit_nfsd+0x9/0x6cd [nfsd] [ 9251.430192] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 9251.430280] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 9251.430395] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b [ 9251.430457] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08 [ 9251.430691] RIP [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0 [ 9251.430755] RSP <ffff8800136f7ea0> [ 9251.430805] CR2: 0000000000000000 [ 9251.431033] ---[ end trace 080f3050d082b4ea ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20NFSD: Take care the return value from nfsd4_decode_stateidKinglong Mee
Return status after nfsd4_decode_stateid failed. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>