Commit Graph

2080 Commits

Author SHA1 Message Date
J. Bruce Fields
0979292bfa svcrpc: fix failures to handle -1 uid's
As of f025adf191 "sunrpc: Properly decode
kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
(0xffff) uid or gid would fail with a badcred error.

Commit afe3c3fd53 "svcrpc: fix failures to
handle -1 uid's and gid's" fixed part of the problem, but overlooked the
gid upcall--the kernel can request supplementary gid's for the -1 uid,
but mountd's attempt write a response will get -EINVAL.

Symptoms were nfsd failing to reply to the first attempt to use a newly
negotiated krb5 context.

Reported-by: Sven Geggus <lists@fuchsschwanzdomain.de>
Tested-by: Sven Geggus <lists@fuchsschwanzdomain.de>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08 17:27:23 -04:00
Linus Torvalds
7f0ef0267e Merge branch 'akpm' (updates from Andrew Morton)
Merge first patch-bomb from Andrew Morton:
 - various misc bits
 - I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
   distracted.  There has been quite a bit of activity.
 - About half the MM queue
 - Some backlight bits
 - Various lib/ updates
 - checkpatch updates
 - zillions more little rtc patches
 - ptrace
 - signals
 - exec
 - procfs
 - rapidio
 - nbd
 - aoe
 - pps
 - memstick
 - tools/testing/selftests updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
  tools/testing/selftests: don't assume the x bit is set on scripts
  selftests: add .gitignore for kcmp
  selftests: fix clean target in kcmp Makefile
  selftests: add .gitignore for vm
  selftests: add hugetlbfstest
  self-test: fix make clean
  selftests: exit 1 on failure
  kernel/resource.c: remove the unneeded assignment in function __find_resource
  aio: fix wrong comment in aio_complete()
  drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
  drivers/memstick/host/r592.c: convert to module_pci_driver
  drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
  pps-gpio: add device-tree binding and support
  drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
  drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
  drivers/parport/share.c: use kzalloc
  Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
  aoe: update internal version number to v83
  aoe: update copyright date
  aoe: perform I/O completions in parallel
  ...
2013-07-03 17:12:13 -07:00
Kees Cook
f170168b9a drivers: avoid parsing names as kthread_run() format strings
Calling kthread_run with a single name parameter causes it to be handled
as a format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:41 -07:00
Linus Torvalds
f991fae5c6 Power management and ACPI updates for 3.11-rc1
- Hotplug changes allowing device hot-removal operations to fail
   gracefully (instead of crashing the kernel) if they cannot be
   carried out completely.  From Rafael J Wysocki and Toshi Kani.
 
 - Freezer update from Colin Cross and Mandeep Singh Baines targeted
   at making the freezing of tasks a bit less heavy weight operation.
 
 - cpufreq resume fix from Srivatsa S Bhat for a regression introduced
   during the 3.10 cycle causing some cpufreq sysfs attributes to
   return wrong values to user space after resume.
 
 - New freqdomain_cpus sysfs attribute for the acpi-cpufreq driver to
   provide information previously available via related_cpus from
   Lan Tianyu.
 
 - cpufreq fixes and cleanups from Viresh Kumar, Jacob Shin,
   Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia, Arnd Bergmann, and
   Tang Yuantian.
 
 - Fix for an ACPICA regression causing suspend/resume issues to
   appear on some systems introduced during the 3.4 development cycle
   from Lv Zheng.
 
 - ACPICA fixes and cleanups from Bob Moore, Tomasz Nowicki, Lv Zheng,
   Chao Guan, and Zhang Rui.
 
 - New cupidle driver for Xilinx Zynq processors from Michal Simek.
 
 - cpuidle fixes and cleanups from Daniel Lezcano.
 
 - Changes to make suspend/resume work correctly in Xen guests from
   Konrad Rzeszutek Wilk.
 
 - ACPI device power management fixes and cleanups from Fengguang Wu
   and Rafael J Wysocki.
 
 - ACPI documentation updates from Lv Zheng, Aaron Lu and Hanjun Guo.
 
 - Fix for the IA-64 issue that was the reason for reverting commit
   9f29ab1 and updates of the ACPI scan code from Rafael J Wysocki.
 
 - Mechanism for adding CMOS RTC address space handlers from Lan Tianyu
   (to allow some EC-related breakage to be fixed on some systems).
 
 - Spec-compliant implementation of acpi_os_get_timer() from
   Mika Westerberg.
 
 - Modification of do_acpi_find_child() to execute _STA in order to
   to avoid situations in which a pointer to a disabled device object
   is returned instead of an enabled one with the same _ADR value.
   From Jeff Wu.
 
 - Intel BayTrail PCH (Platform Controller Hub) support for the ACPI
   Intel Low-Power Subsystems (LPSS) driver and modificaions of that
   driver to work around a couple of known BIOS issues from
   Mika Westerberg and Heikki Krogerus.
 
 - EC driver fix from Vasiliy Kulikov to make it use get_user() and
   put_user() instead of dereferencing user space pointers blindly.
 
 - Assorted ACPI code cleanups from Bjorn Helgaas, Nicholas Mazzuca and
   Toshi Kani.
 
 - Modification of the "runtime idle" helper routine to take the return
   values of the callbacks executed by it into account and to call
   rpm_suspend() if they return 0, which allows some code bloat
   reduction to be done, from Rafael J Wysocki and Alan Stern.
 
 - New trace points for PM QoS from Sahara <keun-o.park@windriver.com>.
 
 - PM QoS documentation update from Lan Tianyu.
 
 - Assorted core PM code cleanups and changes from Bernie Thompson,
   Bjorn Helgaas, Julius Werner, and Shuah Khan.
 
 - New devfreq driver for the Exynos5-bus device from Abhilash Kesavan.
 
 - Minor devfreq cleanups, fixes and MAINTAINERS update from
   MyungJoo Ham, Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and
   Wei Yongjun.
 
 - OMAP Adaptive Voltage Scaling (AVS) SmartReflex voltage control
   driver updates from Andrii Tseglytskyi and Nishanth Menon.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0ZNOAAoJEKhOf7ml8uNsDLYP/0EU4rmvw0TWTITfp6RS1KDE
 9GwBn96ZR4Q5bJd9gBCTPSqhHOYMqxWEUp99sn/M2wehG1pk/jw5LO56+2IhM3UZ
 g1HDcJ7te2nVT/iXsKiAGTVhU9Rk0aYwoVSknwk27qpIBGxW9w/s5tLX8pY3Q3Zq
 wL/7aTPjyL+PFFFEaxgH7qLqsl3DhbtYW5AriUBTkXout/tJ4eO1b7MNBncLDh8X
 VQ/0DNCKE95VEJfkO4rk9RKUyVp9GDn0i+HXCD/FS4IA5oYzePdVdNDmXf7g+swe
 CGlTZq8pB+oBpDiHl4lxzbNrKQjRNbGnDUkoRcWqn0nAw56xK+vmYnWJhW99gQ/I
 fKnvxeLca5po1aiqmC4VSJxZIatFZqLrZAI4dzoCLWY+bGeTnCKmj0/F8ytFnZA2
 8IuLLs7/dFOaHXV/pKmpg6FAlFa9CPxoqRFoyqb4M0GjEarADyalXUWsPtG+6xCp
 R/p0CISpwk+guKZR/qPhL7M654S7SHrPwd2DPF0KgGsvk+G2GhoB8EzvD8BVp98Z
 9siCGCdgKQfJQVI6R0k9aFmn/4gRQIAgyPhkhv9tqULUUkiaXki+/t8kPfnb8O/d
 zep+CA57E2G8MYLkDJfpFeKS7GpPD6TIdgFdGmOUC0Y6sl9iTdiw4yTx8O2JM37z
 rHBZfYGkJBrbGRu+Q1gs
 =VBBq
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "This time the total number of ACPI commits is slightly greater than
  the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
  remains the most active patch submitter.

  To me, the most significant change is the addition of offline/online
  device operations to the driver core (with the Greg's blessing) and
  the related modifications of the ACPI core hotplug code.  Next are the
  freezer updates from Colin Cross that should make the freezing of
  tasks a bit less heavy weight.

  We also have a couple of regression fixes, a number of fixes for
  issues that have not been identified as regressions, two new drivers
  and a bunch of cleanups all over.

  Highlights:

   - Hotplug changes to support graceful hot-removal failures.

     It sometimes is necessary to fail device hot-removal operations
     gracefully if they cannot be carried out completely.  For example,
     if memory from a memory module being hot-removed has been allocated
     for the kernel's own use and cannot be moved elsewhere, it's
     desirable to fail the hot-removal operation in a graceful way
     rather than to crash the kernel, but currenty a success or a kernel
     crash are the only possible outcomes of an attempted memory
     hot-removal.  Needless to say, that is not a very attractive
     alternative and it had to be addressed.

     However, in order to make it work for memory, I first had to make
     it work for CPUs and for this purpose I needed to modify the ACPI
     processor driver.  It's been split into two parts, a resident one
     handling the low-level initialization/cleanup and a modular one
     playing the actual driver's role (but it binds to the CPU system
     device objects rather than to the ACPI device objects representing
     processors).  That's been sort of like a live brain surgery on a
     patient who's riding a bike.

     So this is a little scary, but since we found and fixed a couple of
     regressions it caused to happen during the early linux-next testing
     (a month ago), nobody has complained.

     As a bonus we remove some duplicated ACPI hotplug code, because the
     ACPI-based CPU hotplug is now going to use the common ACPI hotplug
     code.

   - Lighter weight freezing of tasks.

     These changes from Colin Cross and Mandeep Singh Baines are
     targeted at making the freezing of tasks a bit less heavy weight
     operation.  They reduce the number of tasks woken up every time
     during the freezing, by using the observation that the freezer
     simply doesn't need to wake up some of them and wait for them all
     to call refrigerator().  The time needed for the freezer to decide
     to report a failure is reduced too.

     Also reintroduced is the check causing a lockdep warining to
     trigger when try_to_freeze() is called with locks held (which is
     generally unsafe and shouldn't happen).

   - cpufreq updates

     First off, a commit from Srivatsa S Bhat fixes a resume regression
     introduced during the 3.10 cycle causing some cpufreq sysfs
     attributes to return wrong values to user space after resume.  The
     fix is kind of fresh, but also it's pretty obvious once Srivatsa
     has identified the root cause.

     Second, we have a new freqdomain_cpus sysfs attribute for the
     acpi-cpufreq driver to provide information previously available via
     related_cpus.  From Lan Tianyu.

     Finally, we fix a number of issues, mostly related to the
     CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
     up some code.  The majority of changes from Viresh Kumar with bits
     from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
     Arnd Bergmann, and Tang Yuantian.

   - ACPICA update

     A usual bunch of updates from the ACPICA upstream.

     During the 3.4 cycle we introduced support for ACPI 5 extended
     sleep registers, but they are only supposed to be used if the
     HW-reduced mode bit is set in the FADT flags and the code attempted
     to use them without checking that bit.  That caused suspend/resume
     regressions to happen on some systems.  Fix from Lv Zheng causes
     those registers to be used only if the HW-reduced mode bit is set.

     Apart from this some other ACPICA bugs are fixed and code cleanups
     are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
     Zhang Rui.

   - cpuidle updates

     New driver for Xilinx Zynq processors is added by Michal Simek.

     Multidriver support simplification, addition of some missing
     kerneldoc comments and Kconfig-related fixes come from Daniel
     Lezcano.

   - ACPI power management updates

     Changes to make suspend/resume work correctly in Xen guests from
     Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
     cleanups and fixes of the ACPI device power state selection
     routine.

   - ACPI documentation updates

     Some previously missing pieces of ACPI documentation are added by
     Lv Zheng and Aaron Lu (hopefully, that will help people to
     uderstand how the ACPI subsystem works) and one outdated doc is
     updated by Hanjun Guo.

   - Assorted ACPI updates

     We finally nailed down the IA-64 issue that was the reason for
     reverting commit 9f29ab11dd ("ACPI / scan: do not match drivers
     against objects having scan handlers"), so we can fix it and move
     the ACPI scan handler check added to the ACPI video driver back to
     the core.

     A mechanism for adding CMOS RTC address space handlers is
     introduced by Lan Tianyu to allow some EC-related breakage to be
     fixed on some systems.

     A spec-compliant implementation of acpi_os_get_timer() is added by
     Mika Westerberg.

     The evaluation of _STA is added to do_acpi_find_child() to avoid
     situations in which a pointer to a disabled device object is
     returned instead of an enabled one with the same _ADR value.  From
     Jeff Wu.

     Intel BayTrail PCH (Platform Controller Hub) support is added to
     the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
     driver is modified to work around a couple of known BIOS issues.
     Changes from Mika Westerberg and Heikki Krogerus.

     The EC driver is fixed by Vasiliy Kulikov to use get_user() and
     put_user() instead of dereferencing user space pointers blindly.

     Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
     Kani.

   - Assorted power management updates

     The "runtime idle" helper routine is changed to take the return
     values of the callbacks executed by it into account and to call
     rpm_suspend() if they return 0, which allows us to reduce the
     overall code bloat a bit (by dropping some code that's not
     necessary any more after that modification).

     The runtime PM documentation is updated by Alan Stern (to reflect
     the "runtime idle" behavior change).

     New trace points for PM QoS are added by Sahara
     (<keun-o.park@windriver.com>).

     PM QoS documentation is updated by Lan Tianyu.

     Code cleanups are made and minor issues are addressed by Bernie
     Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.

   - devfreq updates

     New driver for the Exynos5-bus device from Abhilash Kesavan.

     Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
     Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.

   - OMAP power management updates

     Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
     updates from Andrii Tseglytskyi and Nishanth Menon."

* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
  cpufreq: Fix cpufreq regression after suspend/resume
  ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
  PM / Sleep: Warn about system time after resume with pm_trace
  cpufreq: don't leave stale policy pointer in cdbs->cur_policy
  acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
  cpufreq: make sure frequency transitions are serialized
  ACPI: implement acpi_os_get_timer() according the spec
  ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
  ACPI: Add CMOS RTC Operation Region handler support
  ACPI / processor: Drop unused variable from processor_perflib.c
  cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
  ...
2013-07-03 14:35:40 -07:00
NeilBrown
0bebc633f1 sunrpc: Don't schedule an upcall on a replaced cache entry.
When a cache entry is replaced, the "expiry_time" get set to
zero by a call to "cache_fresh_locked(..., 0)" at the end of
"sunrpc_cache_update".

This low expiry time makes cache_check() think that the 'refresh_age'
is negative, so the 'age' is comparatively large and a refresh is
triggered.
However refreshing a replaced entry it pointless, it cannot achieve
anything useful.

So teach cache_check to ignore a low refresh_age when expiry_time
is zero.

Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
NeilBrown
7715cde868 net/sunrpc: xpt_auth_cache should be ignored when expired.
commit d202cce896
    sunrpc: never return expired entries in sunrpc_cache_lookup

moved the 'entry is expired' test from cache_check to
sunrpc_cache_lookup, so that it happened early and some races could
safely be ignored.

However the ip_map (in svcauth_unix.c) has a separate single-item
cache which allows quick lookup without locking.  An entry in this
case would not be subject to the expiry test and so could be used
well after it has expired.

This is not normally a big problem because the first time it is used
after it is expired an up-call will be scheduled to refresh the entry
(if it hasn't been scheduled already) and the old entry will then
be invalidated.  So on the second attempt to use it after it has
expired, ip_map_cached_get will discard it.

However that is subtle and not ideal, so replace the "!cache_valid"
test with "cache_is_expired".
In doing this we drop the test on the "CACHE_VALID" bit.  This is
unnecessary as the bit is never cleared, and an entry will only
be cached if the bit is set.

Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
NeilBrown
013920eb5d sunrpc/cache: ensure items removed from cache do not have pending upcalls.
It is possible for a race to set CACHE_PENDING after cache_clean()
has removed a cache entry from the cache.
If CACHE_PENDING is still set when the entry is finally 'put',
the cache_dequeue() will never happen and we can leak memory.

So set a new flag 'CACHE_CLEANED' when we remove something from
the cache, and don't queue any upcall if it is set.

If CACHE_PENDING is set before CACHE_CLEANED, the call that
cache_clean() makes to cache_fresh_unlocked() will free memory
as needed.  If CACHE_PENDING is set after CACHE_CLEANED, the
test in sunrpc_cache_pipe_upcall will ensure that the memory
is not allocated.

Reported-by: <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
NeilBrown
2a1c7f53fd sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
cache_fresh_unlocked() is called when a cache entry
has been updated and ensures that if there were any
pending upcalls, they are cleared.

So every time we update a cache entry, we should call this,
and this should be the only way that we try to clear
pending calls (that sort of uniformity makes code sooo much
easier to read).

try_to_negate_entry() will (possibly) mark an entry as
negative.  If it doesn't, it is because the entry already
is VALID.
So the entry will be valid on exit, so it is appropriate to
call cache_fresh_unlocked().
So tidy up try_to_negate_entry() to do that, and remove
partial open-coded cache_fresh_unlocked() from the one
call-site of try_to_negate_entry().

In the other branch of the 'switch(cache_make_upcall())',
we again have a partial open-coded version of cache_fresh_unlocked().
Replace that with a real call.

And again in cache_clean(), use a real call to cache_fresh_unlocked().

These call sites might previously have called
cache_revisit_request() if CACHE_PENDING wasn't set.
This is never necessary because cache_revisit_request() can
only do anything if the item is in the cache_defer_hash,
However any time that an item is added to the cache_defer_hash
(setup_deferral), the code immediately tests CACHE_PENDING,
and removes the entry again if it is clear.  So all other
places we only need to 'cache_revisit_request' if we've
just cleared CACHE_PENDING.

Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown  <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
NeilBrown
f9e1aedc6c sunrpc/cache: remove races with queuing an upcall.
We currently queue an upcall after setting CACHE_PENDING,
and dequeue after clearing CACHE_PENDING.
So a request should only be present when CACHE_PENDING is set.

However we don't combine the test and the enqueue/dequeue in
a protected region, so it is possible (if unlikely) for a race
to result in a request being queued without CACHE_PENDING set,
or a request to be absent despite CACHE_PENDING.

So: include a test for CACHE_PENDING inside the regions of
enqueue and dequeue where queue_lock is held, and abort
the operation if the value is not as expected.

Also remove the early 'return' from cache_dequeue() to ensure that it
always removes all entries: As there is no locking between setting
CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
inconceivable for some other thread to clear CACHE_PENDING and then
someone else to set it and call sunrpc_cache_pipe_upcall, both before
the original threads completed the call.

With this, it perfectly safe and correct to:
 - call cache_dequeue() if and only if we have just
   cleared CACHE_PENDING
 - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
   if and only if we have just set CACHE_PENDING.

Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:42:53 -04:00
J. Bruce Fields
1f691b07c5 svcrpc: don't error out on small tcp fragment
Though clients we care about mostly don't do this, it is possible for
rpc requests to be sent in multiple fragments.  Here we have a sanity
check to ensure that the final received rpc isn't too small--except that
the number we're actually checking is the length of just the final
fragment, not of the whole rpc.  So a perfectly legal rpc that's
unluckily fragmented could cause the server to close the connection
here.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:04 -04:00
J. Bruce Fields
cf3aa02cb4 svcrpc: fix handling of too-short rpc's
If we detect that an rpc is too short, we abort and close the
connection.  Except, there's a bug here: we're leaving sk_datalen
nonzero without leaving any pages in the sk_pages array.  The most
likely result of the inconsistency is a subsequent crash in
svc_tcp_clear_pages.

Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:04 -04:00
J. Bruce Fields
0dc1531aca svcrpc: store gss mech in svc_cred
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
4423406391 svcrpc: introduce init_svc_cred
Common helper to zero out fields of the svc_cred.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
0de934936b Merge branch 'for-3.10' into 'for-3.11'
Merge bugfixes into my for-3.11 branch.
2013-07-01 17:22:24 -04:00
Al Viro
e77e430033 more open-coded file_inode() calls
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:21 +04:00
Stanislav Kinsbursky
4f6bb246f6 SUNRPC: PipeFS MOUNT notification optimization for dying clients
Not need to create pipes for dying client. So just skip them.

Note: we can safely dereference the client structure, because notification
caller is holding sn->pipefs_sb_lock.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28 15:42:26 -04:00
Stanislav Kinsbursky
e73f4cc051 SUNRPC: split client creation routine into setup and registration
This helper moves all "registration" code to the new rpc_client_register()
helper.
This helper will be used later in the series to synchronize against PipeFS
MOUNT/UMOUNT events.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28 15:42:16 -04:00
Stanislav Kinsbursky
adb6fa7ffe SUNRPC: fix races on PipeFS UMOUNT notifications
CPU#0                                   CPU#1
-----------------------------           -----------------------------
rpc_kill_sb
sn->pipefs_sb = NULL                    rpc_release_client
(UMOUNT_EVENT)                          rpc_free_auth
rpc_pipefs_event
rpc_get_client_for_event
!atomic_inc_not_zero(cl_count)
<skip the client>
                                        atomic_inc(cl_count)
                                        rpc_free_client
                                        rpc_clnt_remove_pipedir
                                        <skip client dir removing>

To fix this, this patch does the following:

1) Calls RPC_PIPEFS_UMOUNT notification with sn->pipefs_sb_lock being held.
2) Removes SUNRPC client from the list AFTER pipes destroying.
3) Doesn't hold RPC client on notification: if client in the list, then it
can't be destroyed while sn->pipefs_sb_lock in hold by notification caller.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28 15:42:02 -04:00
Stanislav Kinsbursky
384816051c SUNRPC: fix races on PipeFS MOUNT notifications
Below are races, when RPC client can be created without PiepFS dentries

CPU#0					CPU#1
-----------------------------		-----------------------------
rpc_new_client				rpc_fill_super
rpc_setup_pipedir
mutex_lock(&sn->pipefs_sb_lock)
rpc_get_sb_net == NULL
(no per-net PipeFS superblock)
					sn->pipefs_sb = sb;
					notifier_call_chain(MOUNT)
					(client is not in the list)
rpc_register_client
(client without pipes dentries)

To fix this patch:
1) makes PipeFS mount notification call with pipefs_sb_lock being held.
2) releases pipefs_sb_lock on new SUNRPC client creation only after
registration.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28 15:41:18 -04:00
Rafael J. Wysocki
207bc1181b Merge branch 'freezer'
* freezer:
  af_unix: use freezable blocking calls in read
  sigtimedwait: use freezable blocking call
  nanosleep: use freezable blocking call
  futex: use freezable blocking call
  select: use freezable blocking call
  epoll: use freezable blocking call
  binder: use freezable blocking calls
  freezer: add new freezable helpers using freezer_do_not_count()
  freezer: convert freezable helpers to static inline where possible
  freezer: convert freezable helpers to freezer_do_not_count()
  freezer: skip waking up tasks with PF_FREEZER_SKIP set
  freezer: shorten freezer sleep time using exponential backoff
  lockdep: check that no locks held at freeze time
  lockdep: remove task argument from debug_check_no_locks_held
  freezer: add unsafe versions of freezable helpers for CIFS
  freezer: add unsafe versions of freezable helpers for NFS
2013-06-28 13:00:53 +02:00
Jeff Layton
e401452d92 rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
We had a report of a reproducible WARNING:

[ 1360.039358] ------------[ cut here ]------------
[ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
[ 1360.049880] Hardware name: HP Z200 Workstation
[ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
auth_rpcgss]
[ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G         I --------------   3.9.0-0.55.el7.x86_64 #1
[ 1360.116771] Call Trace:
[ 1360.119219]  [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0
[ 1360.125208]  [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20
[ 1360.131025]  [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0
[ 1360.136159]  [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
[ 1360.143710]  [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
[ 1360.150311]  [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4]
[ 1360.156475]  [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4]
[ 1360.162902]  [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0
[ 1360.169062]  [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60
[ 1360.174615]  [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0
[ 1360.179826]  [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
[ 1360.187635]  [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
[ 1360.194493]  [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs]
[ 1360.200666]  [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
[ 1360.207624]  [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4]
[ 1360.214222]  [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4]
[ 1360.220644]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.225691]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.231348]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.236822]  [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[ 1360.243246]  [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
[ 1360.249410]  [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs]
[ 1360.255659]  [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs]
[ 1360.261650]  [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1360.268074]  [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs]
[ 1360.274406]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.279443]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.285088]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.290556]  [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00
[ 1360.295677]  [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50
[ 1360.301405]  [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170
[ 1360.307479]  [<ffffffff811ba7e3>] sys_mount+0x83/0xc0
[ 1360.312515]  [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b
[ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---

The problem is that we're ending up in __rpc_lookup_create_exclusive
with a negative dentry that already has d_op set. A little debugging
has shown that when we hit this, the d_ops are already set to
simple_dentry_operations.

I believe that what's happening is that during a mount, idmapd is racing
in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
Before that dentry reference is released, the kernel races in to create
that file and finds the new negative dentry, which already has the
d_op set.

This patch just avoids setting the d_op if it's already set.
simple_dentry_operations and rpc_dentry_operations are functionally
equivalent so it shouldn't matter which one it's set to.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-18 13:46:50 -04:00
Joe Perches
fe2c6338fd net: Convert uses of typedef ctl_table to struct ctl_table
Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
  xargs perl -p -i -e '\
	sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
        s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:36:09 -07:00
Trond Myklebust
9ec2ef53b9 SUNRPC: Remove redundant call to rpc_set_running() in __rpc_execute()
The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable()
once we get past the test for out_of_line_wait_on_bit() returning
ERESTARTSYS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:40 -04:00
Trond Myklebust
0053a8e65c SUNRPC: Remove unused function rpc_queue_empty
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:39 -04:00
Trond Myklebust
a76580fbf0 SUNRPC: Fix a potential race in rpc_execute
If the rpc_task is asynchronous, it could theoretically finish executing
on the workqueue it was assigned by rpc_make_runnable() before we get
round to testing RPC_IS_ASYNC() in rpc_execute.

In practice, however, all the existing callers hold a reference to the
rpc_task, so this can't happen today...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:38 -04:00
Linus Torvalds
4203afc3fb Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "A couple minor fixes for the (new to 3.10) gss-proxy code.

  And one regression from user-namespace changes.  (XBMC clients were
  doing something admittedly weird--sending -1 gid's--but something that
  we used to allow.)"

* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix failures to handle -1 uid's and gid's
  svcrpc: implement O_NONBLOCK behavior for use-gss-proxy
  svcauth_gss: fix error code in use_gss_proxy()
2013-05-31 09:48:56 +09:00
J. Bruce Fields
afe3c3fd53 svcrpc: fix failures to handle -1 uid's and gid's
As of f025adf191 "sunrpc: Properly decode
kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
(0xffff) uid or gid would fail with a badcred error.

Reported symptoms were xmbc clients failing on upgrade of the NFS
server; examination of the network trace showed them sending -1 as the
gid.

Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-29 10:37:47 -04:00
J. Bruce Fields
b161c14440 svcrpc: implement O_NONBLOCK behavior for use-gss-proxy
Somebody noticed LTP was complaining about O_NONBLOCK opens of
/proc/net/rpc/use-gss-proxy succeeding and then a following read
hanging.

I'm not convinced LTP really has any business opening random proc files
and expecting them to behave a certain way.  Maybe this isn't really a
bug.

But in any case the O_NONBLOCK behavior could be useful for someone that
wants to test whether gss-proxy is up without waiting.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-28 16:46:51 -04:00
Trond Myklebust
a3c3cac5d3 SUNRPC: Prevent an rpc_task wakeup race
The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to
be careful about ordering the calls to rpc_test_and_set_running(task) and
rpc_clear_queued(task). If we get the order wrong, then we may end up
testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped
and changed the state of the rpc_task.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-05-22 14:55:32 -04:00
chaoting fan
b6040f9706 sunrpc: the cache_detail in cache_is_valid is unused any more
The cache_detail(*detail) in function cache_is_valid is not used any
more.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-21 11:02:02 -04:00
Trond Myklebust
2aed8b476f SUNRPC: Convert auth_gss pipe detection to work in namespaces
This seems to have been overlooked when we did the namespace
conversion. If a container is running a legacy version of rpc.gssd
then it will be disrupted if the global 'pipe_version' is set by a
container running the new version of rpc.gssd.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-16 06:17:54 -07:00
Trond Myklebust
abfdbd53a4 SUNRPC: Faster detection if gssd is actually running
Recent changes to the NFS security flavour negotiation mean that
we have a stronger dependency on rpc.gssd. If the latter is not
running, because the user failed to start it, then we time out
and mark the container as not having an instance. We then
use that information to time out faster the next time.

If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
then we mark the container as having an rpc.gssd instance.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-16 06:15:41 -07:00
Trond Myklebust
d36ccb9cec SUNRPC: Fix a bug in gss_create_upcall
If wait_event_interruptible_timeout() is successful, it returns
the number of seconds remaining until the timeout. In that
case, we should be retrying the upcall.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-15 10:49:58 -07:00
J. Bruce Fields
2fccbd9cc0 sunrpc: server back channel needs no rpcbind method
XPRT_BOUND is set on server backchannel xprts by xs_setup_bc_tcp()
(using xprt_set_bound()), and is never cleared, so ->rpcbind() will
never need to be called.

Reported-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-15 09:27:02 -04:00
Dan Carpenter
625cdd78d1 svcauth_gss: fix error code in use_gss_proxy()
This should return zero on success and -EBUSY on error so the type
needs to be int instead of bool.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-12 14:56:30 -04:00
Colin Cross
416ad3c9c0 freezer: add unsafe versions of freezable helpers for NFS
NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a).  NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend.  Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.

In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock.  Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:21 +02:00
Linus Torvalds
2dbd3cac87 Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Small fixes for two bugs and two warnings"

* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
  SUNRPC: fix decoding of optional gss-proxy xdr fields
  SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
  nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
2013-05-10 09:28:55 -07:00
Linus Torvalds
8cbc95ee74 More NFS client bugfixes for 3.10
- Ensure that we match the 'sec=' mount flavour against the server list
 - Fix the NFSv4 byte range locking in the presence of delegations
 - Ensure that we conform to the NFSv4.1 spec w.r.t. freeing lock stateids
 - Fix a pNFS data server connection race
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRit1yAAoJEGcL54qWCgDyD9EQAKgb37dXhGt7OXBRBP4EY/T8
 xJZ2tmdDZ6etLFJVftqCv05hBvyfilPLK0E9zg/zW/kvkKxYQ/fykvpzBR/+Q7KF
 quOmjDHLhDTXBnXzPg1HEoeTaXI2/a8CdjpxxEkthD4+FaKlyCXM+EFtA9orT9ZI
 oM+aNaqEzTjoQyryTFMcHxAvsrqjnZBa0MT6Fh45HaLaijV7CdDWoj6gjy6Lc3Al
 4wHeT8QrZTp/NfIN16uykFZjeWwul4N9upu+CI2V8ZDMEit6JDYX4sl5tB41PzYW
 audDBcu0waSqoVQ2mJ5OHoYGZf0wopMUFaAst+tn0pQvwWUfTjD8XtO8uOgeMNoz
 2S+XxUC2qhSMszwNBVSmwe2LtSAyHiw32Md4hqkLYDH2c7tk8bJPKDXZJACBzJS7
 O1aMmOgWar8+nmzvmXFeU804SxBykV1V8UgtXWp5IwC36V0HAYnM5xtHwXBR7HWe
 lnuVHVdux7ySeAyrs2aMdKk7SAw5OC//WW8qoEF5USDEIljeoBzA+IYu9n91Hg2b
 ufnsyxumGJ6dZ0iU2nJVoLagRaZcm6kOhnxcegMpb9IH2+RLCQNef09lj2iklm2j
 mJA4o2lkVEHOswg/NwKn/I4ho8tbNNb8v//S5KiqrYhiiqZhOzu3RRtFeZi91iac
 P/g+hPzfuGnmwcoCEUSa
 =5zpc
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull more NFS client bugfixes from Trond Myklebust:

 - Ensure that we match the 'sec=' mount flavour against the server list

 - Fix the NFSv4 byte range locking in the presence of delegations

 - Ensure that we conform to the NFSv4.1 spec w.r.t.  freeing lock
   stateids

 - Fix a pNFS data server connection race

* tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS4.1 Fix data server connection race
  NFSv3: match sec= flavor against server list
  NFSv4.1: Ensure that we free the lock stateid on the server
  NFSv4: Convert nfs41_free_stateid to use an asynchronous RPC call
  SUNRPC: Don't spam syslog with "Pseudoflavor not found" messages
  NFSv4.x: Fix handling of partially delegated locks
2013-05-09 10:24:54 -07:00
J. Bruce Fields
fb43f11c66 SUNRPC: fix decoding of optional gss-proxy xdr fields
The current code works, but sort of by accident: it obviously didn't
intend the error return to be interpreted as "true".

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-07 17:45:20 -04:00
Geert Uytterhoeven
9fd40c5a66 SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
net/sunrpc/auth_gss/gss_rpc_xdr.c: In function ‘gssx_dec_option_array’:
net/sunrpc/auth_gss/gss_rpc_xdr.c:258: warning: ‘creds’ may be used uninitialized in this function

Return early if count is zero, to make it clearer to the compiler (and the
casual reviewer) that no more processing is done.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-06 08:54:06 -04:00
Linus Torvalds
1db772216f Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Highlights include:

   - Some more DRC cleanup and performance work from Jeff Layton

   - A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
     server using credentials from Active Directory often fail due to
     limitations of the svcgssd upcall interface.  This replacement
     lifts those limitations.  The existing upcall is still supported
     for backwards compatibility.

   - More NFSv4.1 support: at this point, if a user with a current
     client who upgrades from 4.0 to 4.1 should see no regressions.  In
     theory we do everything a 4.1 server is required to do.  Patches
     for a couple minor exceptions are ready for 3.11, and with those
     and some more testing I'd like to turn 4.1 on by default in 3.11."

Fix up semantic conflict as per Stephen Rothwell and linux-next:

Commit 030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication") adds two new users of "PDE(inode)->data", but we're
supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad
("procfs: new helper - PDE_DATA(inode)").

The old PDE() macro is no longer available since commit c30480b92c
("proc: Make the PROC_I() and PDE() macros internal to procfs")

* 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
  NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
  NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
  nfsd: make symbol nfsd_reply_cache_shrinker static
  svcauth_gss: fix error return code in rsc_parse()
  nfsd4: don't remap EISDIR errors in rename
  svcrpc: fix gss-proxy to respect user namespaces
  SUNRPC: gssp_procedures[] can be static
  SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
  nfsd4: better error return to indicate SSV non-support
  nfsd: fix EXDEV checking in rename
  SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
  SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
  SUNRPC: conditionally return endtime from import_sec_context
  SUNRPC: allow disabling idle timeout
  SUNRPC: attempt AF_LOCAL connect on setup
  nfsd: Decode and send 64bit time values
  nfsd4: put_client_renew_locked can be static
  nfsd4: remove unused macro
  nfsd4: remove some useless code
  nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
  ...
2013-05-03 10:59:39 -07:00
Trond Myklebust
9b1d75b755 SUNRPC: Don't spam syslog with "Pseudoflavor not found" messages
Just convert those messages to dprintk()s so that they can be used
when debugging.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-03 12:19:33 -04:00
Linus Torvalds
20b4fb4852 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor->index to label things, not PDE->name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
2013-05-01 17:51:54 -07:00
Wei Yongjun
1eb6d6223a svcauth_gss: fix error return code in rsc_parse()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-30 18:14:15 -04:00
Linus Torvalds
8728f986fe NFS client bugfixes and cleanups for 3.10
- NLM: stable fix for NFSv2/v3 blocking locks
 - NFSv4.x: stable fixes for the delegation recall error handling code
 - NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck Lever
 - SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck
 - NFSv4.x assorted state management and reboot recovery bugfixes
 - NFSv4.1: In cases where we have already looked up a file, and hold a
   valid filehandle, use the new open-by-filehandle operation instead of
   opening by name.
 - Allow the NFSv4.1 callback thread to freeze
 - NFSv4.x: ensure that file unlock waits for readahead to complete
 - NFSv4.1: ensure that the RPC layer doesn't override the NFS session
   table size negotiation by limiting the number of slots.
 - NFSv4.x: Fix SETATTR spec compatibility issues
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRfu+cAAoJEGcL54qWCgDylxkP/24cXOLHMKMYnIab0cQYIW2m
 SQgADGE+MqgTVlWjGVWublVMDY1R51iINsksAjxMtXYt50FdBJEqV2uxIGi4VnbR
 nR9eppqQ6vk5e6r5+aZyVmWKoLFnJ4MBF6OpPUZB5mf8iH/fiixmSYLvseopPdDj
 bjHwCxg+xEgew5EhQF/xqkEfkAp2NN84xUksTWb9uDIW2c3SJweY/ZVR2Zsqpugm
 oqYVtrSLvNKqINQG8OP10s+mRWULwoqapF+kEHlxNbRy26C0zlbXPaneSgYzqHsY
 OyRkAT7uJJqStYlqdW7k+DhyNMB+T33WAGJpWQlfJGYk5d/n0rtBJDVo0hfhCSQr
 VkOXiO9J08NMbelCu4+0CJii7h5GCaqpuJEEmNL6AlF/TJVkIQJuRaG2+WDmEtO2
 oYd4UfXlAbUuts1SW7u/yyN/yrjVTm1tZYRBqn2VJdqh1s8dMxEWPct2Yn314mpS
 ODAnbDkEhtWlc9cloSRnwKec5WcxMZb19IJeK9ZvHm7PfIu/QHtj6Ren8s1//bZI
 OMQxC/Vf/wcjMdNtr7QdMNxWG1aK8DL9mYP5XwCrkZ560LIrtxmhyqYeoGAfgO5u
 +K/gKmQwjsaPhEa8jbP2/wI0II9yKPWj/fVwqhbhqaBUx5GA2iAKcdpPP6JAMAti
 +PXkLTtkyrIgSNwzl63S
 =Hgot
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes and cleanups from Trond Myklebust:

 - NLM: stable fix for NFSv2/v3 blocking locks

 - NFSv4.x: stable fixes for the delegation recall error handling code

 - NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck
   Lever

 - SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck

 - NFSv4.x assorted state management and reboot recovery bugfixes

 - NFSv4.1: In cases where we have already looked up a file, and hold a
   valid filehandle, use the new open-by-filehandle operation instead of
   opening by name.

 - Allow the NFSv4.1 callback thread to freeze

 - NFSv4.x: ensure that file unlock waits for readahead to complete

 - NFSv4.1: ensure that the RPC layer doesn't override the NFS session
   table size negotiation by limiting the number of slots.

 - NFSv4.x: Fix SETATTR spec compatibility issues

* tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (67 commits)
  NFSv4: Warn once about servers that incorrectly apply open mode to setattr
  NFSv4: Servers should only check SETATTR stateid open mode on size change
  NFSv4: Don't recheck permissions on open in case of recovery cached open
  NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes
  NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle
  NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
  NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
  LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot
  NFSv4: Ensure the LOCK call cannot use the delegation stateid
  NFSv4: Use the open stateid if the delegation has the wrong mode
  nfs: Send atime and mtime as a 64bit value
  NFSv4: Record the OPEN create mode used in the nfs4_opendata structure
  NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports
  SUNRPC: Allow rpc_create() to request that TCP slots be unlimited
  SUNRPC: Fix a livelock problem in the xprt->backlog queue
  NFSv4: Fix handling of revoked delegations by setattr
  NFSv4 release the sequence id in the return on close case
  nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks
  NFS: Ensure that NFS file unlock waits for readahead to complete
  NFS: Add functionality to allow waiting on all outstanding reads to complete
  ...
2013-04-30 11:28:08 -07:00
Akinobu Mita
c86d2ddec7 net/sunrpc: rename random32() to prandom_u32()
Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:43 -07:00
Andy Shevchenko
2e0fb404c8 lib, net: make isodigit() public and use it
There are at least two users of isodigit().  Let's make it a public
function of ctype.h.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:19 -07:00
J. Bruce Fields
d28fcc830c svcrpc: fix gss-proxy to respect user namespaces
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-29 18:21:29 -04:00
Fengguang Wu
6278b62aa8 SUNRPC: gssp_procedures[] can be static
Cc: Simo Sorce <simo@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2013-04-29 17:19:48 -04:00
J. Bruce Fields
0ff3bab530 SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
Though I wonder whether we should really just depend on CONFIG_PROC_FS
at some point.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
2013-04-29 17:16:26 -04:00
J. Bruce Fields
b1df763723 Merge branch 'nfs-for-next' of git://linux-nfs.org/~trondmy/nfs-2.6 into for-3.10
Note conflict: Chuck's patches modified (and made static)
gss_mech_get_by_OID, which is still needed by gss-proxy patches.

The conflict resolution is a bit minimal; we may want some more cleanup.
2013-04-29 16:23:34 -04:00
Simo Sorce
030d794bf4 SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
The main advantge of this new upcall mechanism is that it can handle
big tickets as seen in Kerberos implementations where tickets carry
authorization data like the MS-PAC buffer with AD or the Posix Authorization
Data being discussed in IETF on the krbwg working group.

The Gssproxy program is used to perform the accept_sec_context call on the
kernel's behalf. The code is changed to also pass the input buffer straight
to upcall mechanism to avoid allocating and copying many pages as tokens can
be as big (potentially more in future) as 64KiB.

Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, negotiation api]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:28 -04:00
Simo Sorce
1d658336b0 SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
This patch implements a sunrpc client to use the services of the gssproxy
userspace daemon.

In particular it allows to perform calls in user space using an RPC
call instead of custom hand-coded upcall/downcall messages.

Currently only accept_sec_context is implemented as that is all is needed for
the server case.

File server modules like NFS and CIFS can use full gssapi services this way,
once init_sec_context is also implemented.

For the NFS server case this code allow to lift the limit of max 2k krb5
tickets. This limit is prevents legitimate kerberos deployments from using krb5
authentication with the Linux NFS server as they have normally ticket that are
many kilobytes large.

It will also allow to lift the limitation on the size of the credential set
(uid,gid,gids) passed down from user space for users that have very many groups
associated. Currently the downcall mechanism used by rpc.svcgssd is limited
to around 2k secondary groups of the 65k allowed by kernel structures.

Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, concurrent upcalls, misc. fixes and cleanup]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:27 -04:00
Simo Sorce
400f26b542 SUNRPC: conditionally return endtime from import_sec_context
We expose this parameter for a future caller.
It will be used to extract the endtime from the gss-proxy upcall mechanism,
in order to set the rsc cache expiration time.

Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:27 -04:00
J. Bruce Fields
33d90ac058 SUNRPC: allow disabling idle timeout
In the gss-proxy case we don't want to have to reconnect at random--we
want to connect only on gss-proxy startup when we can steal gss-proxy's
context to do the connect in the right namespace.

So, provide a flag that allows the rpc_create caller to turn off the
idle timeout.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:26 -04:00
J. Bruce Fields
7073ea8799 SUNRPC: attempt AF_LOCAL connect on setup
In the gss-proxy case, setup time is when I know I'll have the right
namespace for the connect.

In other cases, it might be useful to get any connection errors
earlier--though actually in practice it doesn't make any difference for
rpcbind.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 11:41:25 -04:00
J. Bruce Fields
c85b03ab20 Merge Trond's nfs-for-next
Merging Trond's nfs-for-next branch, mainly to get
b7993cebb8 "SUNRPC: Allow rpc_create() to
request that TCP slots be unlimited", which a small piece of the
gss-proxy work depends on.
2013-04-26 11:37:43 -04:00
Trond Myklebust
b0212b84fb Merge branch 'bugfixes' into linux-next
Fix up a conflict between the linux-next branch and mainline.
Conflicts:
	fs/nfs/nfs4proc.c
2013-04-23 15:52:14 -04:00
Trond Myklebust
bd1d421abc Merge branch 'rpcsec_gss-from_cel' into linux-next
* rpcsec_gss-from_cel: (21 commits)
  NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
  NFSv4: Don't clear the machine cred when client establish returns EACCES
  NFSv4: Fix issues in nfs4_discover_server_trunking
  NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
  NFS: Use server-recommended security flavor by default (NFSv3)
  SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
  NFS: Use "krb5i" to establish NFSv4 state whenever possible
  NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
  NFS: Use static list of security flavors during root FH lookup recovery
  NFS: Avoid PUTROOTFH when managing leases
  NFS: Clean up nfs4_proc_get_rootfh
  NFS: Handle missing rpc.gssd when looking up root FH
  SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
  SUNRPC: Make gss_mech_get() static
  SUNRPC: Refactor nfsd4_do_encode_secinfo()
  SUNRPC: Consider qop when looking up pseudoflavors
  SUNRPC: Load GSS kernel module by OID
  SUNRPC: Introduce rpcauth_get_pseudoflavor()
  SUNRPC: Define rpcsec_gss_info structure
  NFS: Remove unneeded forward declaration
  ...
2013-04-23 15:40:40 -04:00
Trond Myklebust
b7993cebb8 SUNRPC: Allow rpc_create() to request that TCP slots be unlimited
This is mainly for use by NFSv4.1, where the session negotiation
ultimately wants to decide how many RPC slots we can fill.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-14 12:26:03 -04:00
Trond Myklebust
ba60eb25ff SUNRPC: Fix a livelock problem in the xprt->backlog queue
This patch ensures that we throttle new RPC requests if there are
requests already waiting in the xprt->backlog queue. The reason for
doing this is to fix livelock issues that can occur when an existing
(high priority) task is waiting in the backlog queue, gets woken up
by xprt_free_slot(), but a new task then steals the slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-14 12:26:02 -04:00
Linus Torvalds
f94eeb423b NFS client bugfixes for Linux 3.9
- Stable fix for memory corruption issues in nfs4[01]_walk_client_list
 - Stable fix for an Oopsable bug in rpc_clone_client
 - Another state manager deadlock in the NFSv4 open code
 - Memory leaks in nfs4_discover_server_trunking and rpc_new_client
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRZYu9AAoJEGcL54qWCgDySfwP/R2IdO2nfRzmDCPtvD6pPg8T
 l8Gf97Z/8A3g6WwfvmKNt48D1fKnhAcOaKTZQIZuZePAjI/Yy74DFMof6paiDmsO
 8hMcZgvunZotPwmBmhIwmLOxDYgbpdizDBlITsimnUQLrv78bMw2F/cNCcThYgTI
 Q4sNpZsl4kk1nmOYK/tGBCCkq6mIQhc95QeQPgnl2B/NozpZiIqgzrpWpSWMofn2
 cuSLiuEdmpCdJbgQaPEjSWf+doo/nBn720+Xj2RjmLhTTnWUtAsouElAdMs96Jjz
 cEhSll3nLIygr1xdFF7CD8qFjpbtg/YNhKw3HBCFAgHjrAjr+a3N+eHQOz9QQ6W4
 5OL3Mj0VEkvMrK1Sy76smynQJMJhrsn852Zo2wK2mCp+mHNZlBlML529Y4PJy2Ba
 Up4MteIaOTpKGSnBdzWmqPqro9glqlhrUk/o3XipCzIziWC8yDYjl2J9Ez8B7Ren
 uzvBeevYRX9AmQlmZUAPvx8+xVqA6cr0X2q8/6PqPnrNXP6Ff8+rm6gvH4VozyzJ
 qd/r7Bf1ozFXxoKQOztSiGjI5YiBp4DRXycR5td6eF3nZJipmbxY+WKllhaAakn6
 UY2NsGX2zfxkJMltqd2/xRmHtN+Eif1Uoo35pvzNxzBtPsRxBMIiPhGLglQu98Yj
 2NuwfT4//UNfS6JlBe6E
 =kBf2
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - fix for memory corruption issues in nfs4[01]_walk_client_list (stable)
 - fix for an Oopsable bug in rpc_clone_client (stable)
 - another state manager deadlock in the NFSv4 open code
 - memory leaks in nfs4_discover_server_trunking and rpc_new_client

* tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Fix another potential state manager deadlock
  SUNRPC: Fix a potential memory leak in rpc_new_client
  NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
  NFSv4: Fix a memory leak in nfs4_discover_server_trunking
  SUNRPC: Remove extra xprt_put()
2013-04-10 09:00:51 -07:00
Al Viro
d9dda78bad procfs: new helper - PDE_DATA(inode)
The only part of proc_dir_entry the code outside of fs/proc
really cares about is PDE(inode)->data.  Provide a helper
for that; static inline for now, eventually will be moved
to fs/proc, along with the knowledge of struct proc_dir_entry
layout.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 14:13:32 -04:00
Paul Bolle
cb60718747 sunrpc: drop "select NETVM"
The Kconfig entry for SUNRPC_SWAP selects NETVM. That select statement
was added in commit a564b8f039 ("nfs:
enable swap on NFS"). But there's no Kconfig symbol NETVM. It apparently
was only in used in development versions of the swap over nfs
functionality but never entered mainline. Anyhow, it is a nop and can
safely be dropped.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:52 -04:00
Trond Myklebust
f05c124a70 SUNRPC: Fix a potential memory leak in rpc_new_client
If the call to rpciod_up() fails, we currently leak a reference to the
struct rpc_xprt.
As part of the fix, we also remove the redundant check for xprt!=NULL.
This is already taken care of by the callers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:02:14 -04:00
Chuck Lever
a58e0be6f6 SUNRPC: Remove extra xprt_put()
While testing error cases where rpc_new_client() fails, I saw
some oopses.

If rpc_new_client() fails, it already invokes xprt_put().  Thus
__rpc_clone_client() does not need to invoke it again.

Introduced by commit 1b63a751 "SUNRPC: Refactor rpc_clone_client()"
Fri Sep 14, 2012.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org [>=3.7]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 16:58:14 -04:00
Chuck Lever
1c74a244fc SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
RPC_AUTH_MAXFLAVOR is an invalid flavor, on purpose.  Don't allow
any processing whatsoever if a caller passes it to rpcauth_create()
or rpcauth_get_gssinfo().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-04 17:01:00 -04:00
Alexey Khoroshilov
a7823c797b SUNRPC/cache: add module_put() on error path in cache_open()
If kmalloc() fails in cache_open(), module cd->owner left locked.
The patch adds module_put(cd->owner) on this path.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 15:32:32 -04:00
Chuck Lever
5007220b87 SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
Clean up: Reduce the symbol table footprint for auth_rpcgss.ko by
removing exported symbols for functions that are no longer used
outside of auth_rpcgss.ko.

The remaining two EXPORTs in gss_mech_switch.c get documenting
comments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:41 -04:00
Chuck Lever
6599c0acae SUNRPC: Make gss_mech_get() static
gss_mech_get() is no longer used outside of gss_mech_switch.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:39 -04:00
Chuck Lever
a77c806fb9 SUNRPC: Refactor nfsd4_do_encode_secinfo()
Clean up.  This matches a similar API for the client side, and
keeps ULP fingers out the of the GSS mech switch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:33 -04:00
Chuck Lever
83523d083a SUNRPC: Consider qop when looking up pseudoflavors
The NFSv4 SECINFO operation returns a list of security flavors that
the server supports for a particular share.  An NFSv4 client is
supposed to pick a pseudoflavor it supports that corresponds to one
of the flavors returned by the server.

GSS flavors in this list have a GSS tuple that identify a specific
GSS pseudoflavor.

Currently our client ignores the GSS tuple's "qop" value.  A
matching pseudoflavor is chosen based only on the OID and service
value.

So far this omission has not had much effect on Linux.  The NFSv4
protocol currently supports only one qop value: GSS_C_QOP_DEFAULT,
also known as zero.

However, if an NFSv4 server happens to return something other than
zero in the qop field, our client won't notice.  This could cause
the client to behave in incorrect ways that could have security
implications.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:24 -04:00
Chuck Lever
f783288f0c SUNRPC: Load GSS kernel module by OID
The current GSS mech switch can find and load GSS pseudoflavor
modules by name ("krb5") or pseudoflavor number ("390003"), but
cannot find GSS modules by GSS tuple:

  [ "1.2.840.113554.1.2.2", GSS_C_QOP_DEFAULT, RPC_GSS_SVC_NONE ]

This is important when dealing with a SECINFO request.  A SECINFO
reply contains a list of flavors the server supports for the
requested export, but GSS flavors also have a GSS tuple that maps
to a pseudoflavor (like 390003 for krb5).

If the GSS module that supports the OID in the tuple is not loaded,
our client is not able to load that module dynamically to support
that pseudoflavor.

Add a way for the GSS mech switch to load GSS pseudoflavor support
by OID before searching for the pseudoflavor that matches the OID
and service.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:18 -04:00
Chuck Lever
9568c5e9a6 SUNRPC: Introduce rpcauth_get_pseudoflavor()
A SECINFO reply may contain flavors whose kernel module is not
yet loaded by the client's kernel.  A new RPC client API, called
rpcauth_get_pseudoflavor(), is introduced to do proper checking
for support of a security flavor.

When this API is invoked, the RPC client now tries to load the
module for each flavor first before performing the "is this
supported?" check.  This means if a module is available on the
client, but has not been loaded yet, it will be loaded and
registered automatically when the SECINFO reply is processed.

The new API can take a full GSS tuple (OID, QoP, and service).
Previously only the OID and service were considered.

nfs_find_best_sec() is updated to verify all flavors requested in a
SECINFO reply, including AUTH_NULL and AUTH_UNIX.  Previously these
two flavors were simply assumed to be supported without consulting
the RPC client.

Note that the replaced version of nfs_find_best_sec() can return
RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an
unsupported "service" value.  nfs_find_best_sec() now returns
RPC_AUTH_UNIX in this case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:07 -04:00
Chuck Lever
fb15b26f8b SUNRPC: Define rpcsec_gss_info structure
The NFSv4 SECINFO procedure returns a list of security flavors.  Any
GSS flavor also has a GSS tuple containing an OID, a quality-of-
protection value, and a service value, which specifies a particular
GSS pseudoflavor.

For simplicity and efficiency, I'd like to return each GSS tuple
from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC
client.

Define a data structure that is visible to both the NFS client and
the RPC client.  Take structure and field names from the relevant
standards to avoid confusion.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:42:56 -04:00
Chuck Lever
71afa85e79 SUNRPC: Missing module alias for auth_rpcgss.ko
Commit f344f6df "SUNRPC: Auto-load RPC authentication kernel
modules", Mon Mar 20 13:44:08 2006, adds a request_module() call
in rpcauth_create() to auto-load RPC security modules when a ULP
tries to create a credential of that flavor.

In rpcauth_create(), the name of the module to load is built like
this:

	request_module("rpc-auth-%u", flavor);

This means that for, say, RPC_AUTH_GSS, request_module() is looking
for a module or alias called "rpc-auth-6".

The GSS module is named "auth_rpcgss", and commit f344f6df does not
add any new module aliases.  There is also no such alias provided in
/etc/modprobe.d on my system (Fedora 16).  Without this alias, the
GSS module is not loaded on demand.

This is used by rpcauth_create().  The pseudoflavor_to_flavor() call
can return RPC_AUTH_GSS, which is passed to request_module().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:42:27 -04:00
Linus Torvalds
5d538483ea NFS client bugfixes for Linux 3.9
- Fix an NFSv4 idmapper regression
 - Fix an Oops in the pNFS blocks client
 - Fix up various issues with pNFS layoutcommit
 - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRUedyAAoJEGcL54qWCgDyar0P/2pTT/yxX8ejTu5DmY7e4PYJ
 jhPG2AEqY/yMLn9GvB375VIs1L8tuY50+3NFhWZFjyNbEU3GV+5Y+kPpBtAgYiSI
 VyIXiJ/xMtXdYJMYuE/nh5jbcqJsHwGjpcIaSd5BuWzQUaoUYvLulxWd4QN8mmaT
 5SuzmgV+7WIqV6RjlaYF82srcOKAjwemcrfRkCNzzJr6aT39gH2YdYFbDaTr7qhU
 fw0x3QlI7887vSNQcfaGbC1+jr6oe8wRCneOR0tceU/8bcj6zlUDk5HxqSOc28mA
 jUQieoVRggcM4s5DFpNcuwW6qCPZOmzv/OFD6oqnhyyonPOrue+7zaoujZmGNmjx
 dT2V/jQehanYD25WpDO8OyFXUeYE4x9bgHKsszhBTwr4x5D8ceEJ1sugcOPiTTxu
 tflbbuWbt+BguvXp4p8QayUj0V2cplM/nOovWyUG+BH46sz3Dtv46NOgJeO2a29g
 T6jayxmKCxvtPKtG0j34BzLngiKabZTSEhFms6Qarp9lwWvHWrR9KWGuDBNvy1Ts
 GMBN8P6Ib40yVi6Pwlj5Jpy6yLKVklHtJQpactr63AZmYrF4bBBSom+MWAh3X1iO
 QtF0x9Z1bBkXY2Q/u+3vWMxQtEPeW+pSiloj8aiceFAt33zKM+1bLofDhEw0s2fI
 wJEHYsGyGtDQINgP0v1e
 =OPbZ
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix an NFSv4 idmapper regression
 - Fix an Oops in the pNFS blocks client
 - Fix up various issues with pNFS layoutcommit
 - Ensure correct read ordering of variables in
   rpc_wake_up_task_queue_locked

* tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
  NFSv4.1: Add a helper pnfs_commit_and_return_layout
  NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn
  NFSv4.1: Fix a race in pNFS layoutcommit
  pnfs-block: removing DM device maybe cause oops when call dev_remove
  NFSv4: Fix the string length returned by the idmapper
2013-03-26 14:23:45 -07:00
Trond Myklebust
3ed5e2a2c3 SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasks
In the case of a SOFTCONN rpc task, we really want to ensure that it
reports errors like ENETUNREACH back to the caller. Currently, only
some of these errors are being reported back (connect errors are not),
and they are being converted by the RPC layer into EIO.

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:10 -04:00
Trond Myklebust
1166fde6a9 SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-03-25 11:23:40 -04:00
Linus Torvalds
aea8b5d1e5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace bugfixes from Eric Biederman:
 "This tree includes a partial revert for "fs: Limit sys_mount to only
  request filesystem modules." When I added the new style module aliases
  to the filesystems I deleted the old ones.  A bad move.  It turns out
  that distributions like Arch linux use module aliases when
  constructing ramdisks.  Which meant ultimately that an ext3 filesystem
  mounted with ext4 would not result in the ext4 module being put into
  the ramdisk.

  The other change in this tree adds a handful of filesystem module
  alias I simply failed to add the first time.  Which inconvinienced a
  few folks using cifs.

  I don't want to inconvinience folks any longer than I have to so here
  are these trivial fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  fs: Readd the fs module aliases.
  fs: Limit sys_mount to only request filesystem modules. (Part 3)
2013-03-13 15:47:50 -07:00
Eric W. Biederman
fa7614ddd6 fs: Readd the fs module aliases.
I had assumed that the only use of module aliases for filesystems
prior to "fs: Limit sys_mount to only request filesystem modules."
was in request_module.  It turns out I was wrong.  At least mkinitcpio
in Arch linux uses these aliases.

So readd the preexising aliases, to keep from breaking userspace.

Userspace eventually will have to follow and use the same aliases the
kernel does.  So at some point we may be delete these aliases without
problems.  However that day is not today.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-12 18:55:21 -07:00
Linus Torvalds
5b22b1848b Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
 "Some minor fallout from the user-namespace work broke most krb5 mounts
  to nfsd, and I screwed up a change to the AF_LOCAL rpc code."

* 'for-3.9' of git://linux-nfs.org/~bfields/linux:
  sunrpc: don't attempt to cancel unitialized work
  nfsd: fix krb5 handling of anonymous principals
2013-03-12 09:20:58 -07:00
J. Bruce Fields
190b1ecf25 sunrpc: don't attempt to cancel unitialized work
As of dc107402ae "SUNRPC: make AF_LOCAL connect synchronous", we no longer initialize connect_worker in the
AF_LOCAL case, resulting in warnings like:

    WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() Hardware name: Bochs
    ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x20
    Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd sunrpc
    Pid: 4816, comm: nfsd Tainted: G        W    3.8.0-rc2-00049-gdc10740 #801
    Call Trace:
     [<ffffffff8156ec00>] ? free_obj_work+0x60/0xa0
     [<ffffffff81046aaf>] warn_slowpath_common+0x7f/0xc0
     [<ffffffff81046ba6>] warn_slowpath_fmt+0x46/0x50
     [<ffffffff8156eccc>] debug_print_object+0x8c/0xb0
     [<ffffffff81055030>] ? timer_debug_hint+0x10/0x10
     [<ffffffff8156f7e3>] debug_object_assert_init+0xe3/0x120
     [<ffffffff81057ebb>] del_timer+0x2b/0x80
     [<ffffffff8109c4e6>] ? mark_held_locks+0x86/0x110
     [<ffffffff81065a29>] try_to_grab_pending+0xd9/0x150
     [<ffffffff81065b57>] __cancel_work_timer+0x27/0xc0
     [<ffffffff81065c03>] cancel_delayed_work_sync+0x13/0x20
     [<ffffffffa0007067>] xs_destroy+0x27/0x80 [sunrpc]
     [<ffffffffa00040d8>] xprt_destroy+0x78/0xa0 [sunrpc]
     [<ffffffffa0006241>] xprt_put+0x21/0x30 [sunrpc]
     [<ffffffffa00030cf>] rpc_free_client+0x10f/0x1a0 [sunrpc]
     [<ffffffffa0002ff3>] ? rpc_free_client+0x33/0x1a0 [sunrpc]
     [<ffffffffa0002f7e>] rpc_release_client+0x6e/0xb0 [sunrpc]
     [<ffffffffa000325d>] rpc_shutdown_client+0xfd/0x1b0 [sunrpc]
     [<ffffffffa0017196>] rpcb_put_local+0x106/0x130 [sunrpc]
    ...

Acked-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-03-09 12:43:42 -05:00
J. Bruce Fields
3c34ae11fa nfsd: fix krb5 handling of anonymous principals
krb5 mounts started failing as of
683428fae8 "sunrpc: Update svcgss xdr
handle to rpsec_contect cache".

The problem is that mounts are usually done with some host principal
which isn't normally mapped to any user, in which case svcgssd passes
down uid -1, which the kernel is then expected to map to the
export-specific anonymous uid or gid.

The new uid_valid/gid_valid checks were therefore causing that downcall
to fail.

(Note the regression may not have been seen with older userspace that
tended to map unknown principals to an anonymous id on their own rather
than leaving it to the kernel.)

Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-03-06 10:11:08 -05:00
Eric W. Biederman
7f78e03513 fs: Limit sys_mount to only request filesystem modules.
Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.

A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.

Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems trivially
making things safer with no real cost.

Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives.  Allowing simple, safe,
well understood work-arounds to known problematic software.

This also addresses a rare but unfortunate problem where the filesystem
name is not the same as it's module name and module auto-loading
would not work.  While writing this patch I saw a handful of such
cases.  The most significant being autofs that lives in the module
autofs4.

This is relevant to user namespaces because we can reach the request
module in get_fs_type() without having any special permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.

After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module.  The common pattern in the kernel is to call request_module()
without regards to the users permissions.  In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep.
Which means there is not much attack surface exposed by loading a
filesytem module unless the filesystem is mounted.  In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03 19:36:31 -08:00
Linus Torvalds
8d05b3771d NFS client bugfixes for Linux 3.9
- Don't allow NFS silly-renamed files to be deleted
 - Don't start the retransmission timer when out of socket space
 - Fix a couple of pnfs-related Oopses.
 - Fix one more NFSv4 state recovery deadlock
 - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMpMhAAoJEGcL54qWCgDy4BMP/0Zl7Ei7x9bJSb1C1lpPSo5p
 Lr9XoHLYqhPcAwKUXQfgM5IkC69bE62bD5esmdDqkgZYqnmGE0E4LG6MsbsMmvzk
 yug5WOxmjOFee7Bdpd8B86Z0qsa4l2TkQu2h9G3zE36P2rPKQaNzpteIjhis5UEQ
 EfNyLoBdFcuUSh4ztMVZOzbAyDcbNfsyl03XVmlv+Qn/o0l42Zjth0qwOP60bjuM
 zJF1CkHi5NLbXEhmOev9mA6UYz6zWRbiA/Yu92pomtXVDtOtzWpUniBIcf/S1ZH/
 V8Gj6bWdHHyFCa2PjhY1/QdLBOPRPdxpAAJk+q48AKmzyiOU6g3lIHBp5ai1WZNI
 1C+SYxABE/EJgq9SoQYGqq6SUiolrFulqnFHXF0jHF+ifdjoHjSRmpGQAoyoZ0k1
 aSl+Ojqx7QHibJd8GZBavWc3upRDzhHDRRB3tkQCENi+hryBZxEyeS2Z54NmBRUN
 tsOuyac6rtknZdD8Do4DMt9uc9u1DWicaiZbLfkP1VL1Angh6NKSA7qbmH6giLBS
 9Y+DPcIk5e34uKQ21WTxFydGD+SMg0EMnOmfr6EYXWEHBhKNYVR+cHyH0mAF6RzX
 enU2g0H2m+3vUQqajPUP0DV/eLGtdsvWvMjiskc3KX90CWfHmV2C8GFSxjV2OkT1
 vG1KFrICO6DR2943Udit
 =FMtb
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "We've just concluded another Connectathon interoperability testing
  week, and so here are the fixes for the bugs that were discovered:

   - Don't allow NFS silly-renamed files to be deleted
   - Don't start the retransmission timer when out of socket space
   - Fix a couple of pnfs-related Oopses.
   - Fix one more NFSv4 state recovery deadlock
   - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"

* tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: One line comment fix
  NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
  SUNRPC: add call to get configured timeout
  PNFS: set the default DS timeout to 60 seconds
  NFSv4: Fix another open/open_recovery deadlock
  nfs: don't allow nfs_find_actor to match inodes of the wrong type
  NFSv4.1: Hold reference to layout hdr in layoutget
  pnfs: fix resend_to_mds for directio
  SUNRPC: Don't start the retransmission timer when out of socket space
  NFS: Don't allow NFS silly-renamed files to be deleted, no signal
2013-03-02 16:46:07 -08:00
Trond Myklebust
512e4b291c SUNRPC: One line comment fix
Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-02 15:54:11 -08:00
Linus Torvalds
b6669737d3 Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Miscellaneous bugfixes, plus:

   - An overhaul of the DRC cache by Jeff Layton.  The main effect is
     just to make it larger.  This decreases the chances of intermittent
     errors especially in the UDP case.  But we'll need to watch for any
     reports of performance regressions.

   - Containerized nfsd: with some limitations, we now support
     per-container nfs-service, thanks to extensive work from Stanislav
     Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

 - idr_remove_all() had been added by a memory leak fix, but has since
   become deprecated since idr_destroy() does it for us now.

 - xs_local_connect() had been added by this branch to make AF_LOCAL
   connections be synchronous, but in the meantime Trond had changed the
   calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
  SUNRPC: make AF_LOCAL connect synchronous
  nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
  svcrpc: fix rpc server shutdown races
  svcrpc: make svc_age_temp_xprts enqueue under sv_lock
  lockd: nlmclnt_reclaim(): avoid stack overflow
  nfsd: enable NFSv4 state in containers
  nfsd: disable usermode helper client tracker in container
  nfsd: use proper net while reading "exports" file
  nfsd: containerize NFSd filesystem
  nfsd: fix comments on nfsd_cache_lookup
  SUNRPC: move cache_detail->cache_request callback call to cache_read()
  SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
  SUNRPC: rework cache upcall logic
  SUNRPC: introduce cache_detail->cache_request callback
  NFS: simplify and clean cache library
  NFS: use SUNRPC cache creation and destruction helper for DNS cache
  nfsd4: free_stid can be static
  nfsd: keep a checksum of the first 256 bytes of request
  sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
  sunrpc: fix comment in struct xdr_buf definition
  ...
2013-02-28 18:02:55 -08:00
Weston Andros Adamson
edddbb1eda SUNRPC: add call to get configured timeout
Returns the configured timeout for the xprt of the rpc client.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-28 17:35:20 -08:00
J. Bruce Fields
dc107402ae SUNRPC: make AF_LOCAL connect synchronous
It doesn't appear that anyone actually needs to connect asynchronously.

Also, using a workqueue for the connect means we lose the namespace
information from the original process.  This is a problem since there's
no way to explicitly pass in a filesystem namespace for resolution of an
AF_LOCAL address.

Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-28 09:47:17 -08:00
Sasha Levin
b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Linus Torvalds
70a3a06d01 Main batch of InfiniBand/RDMA changes for 3.9:
- SRP error handling fixes from Bart Van Assche
  - Implementation of memory windows for mlx4 from Shani Michaeli
  - Lots of cxgb4 HW driver fixes from Vipul Pandya
  - Make iSER work for virtual functions, other fixes from Or Gerlitz
  - Fix for bug in qib HW driver from Mike Marciniszyn
  - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
  - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G
 Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5
 q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT
 tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb
 DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J
 Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno
 8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY
 o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS
 WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim
 EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt
 SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU
 AyN2tSiUZVddTV1/aKGL
 =RAQw
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband update from Roland Dreier:
 "Main batch of InfiniBand/RDMA changes for 3.9:

   - SRP error handling fixes from Bart Van Assche

   - Implementation of memory windows for mlx4 from Shani Michaeli

   - Lots of cxgb4 HW driver fixes from Vipul Pandya

   - Make iSER work for virtual functions, other fixes from Or Gerlitz

   - Fix for bug in qib HW driver from Mike Marciniszyn

   - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman

   - Various cleanups and warning fixes from Julia Lawall, Paul Bolle,
     Wei Yongjun"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits)
  IB/mlx4: Advertise MW support
  IB/mlx4: Support memory window binding
  mlx4: Implement memory windows allocation and deallocation
  mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
  mlx4_core: Disable memory windows for virtual functions
  IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
  IB/srp: Fail I/O requests if the transport is offline
  IB/srp: Avoid endless SCSI error handling loop
  IB/srp: Avoid sending a task management function needlessly
  IB/srp: Track connection state properly
  IB/mlx4: Remove redundant NULL check before kfree
  IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable
  IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool
  IB/iser: Enable iser when FMRs are not supported
  IB/iser: Avoid error prints on EAGAIN registration failures
  IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML
  IB/uverbs: Implement memory windows support in uverbs
  IB/core: Add "type 2" memory windows support
  mlx4_core: Propagate MR deregistration failures to caller
  mlx4_core: Rename MPT-related functions to have mpt_ prefix
  ...
2013-02-26 11:41:08 -08:00
Linus Torvalds
94f2f14234 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace and namespace infrastructure changes from Eric W Biederman:
 "This set of changes starts with a few small enhnacements to the user
  namespace.  reboot support, allowing more arbitrary mappings, and
  support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
  user namespace root.

  I do my best to document that if you care about limiting your
  unprivileged users that when you have the user namespace support
  enabled you will need to enable memory control groups.

  There is a minor bug fix to prevent overflowing the stack if someone
  creates way too many user namespaces.

  The bulk of the changes are a continuation of the kuid/kgid push down
  work through the filesystems.  These changes make using uids and gids
  typesafe which ensures that these filesystems are safe to use when
  multiple user namespaces are in use.  The filesystems converted for
  3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs.  The
  changes for these filesystems were a little more involved so I split
  the changes into smaller hopefully obviously correct changes.

  XFS is the only filesystem that remains.  I was hoping I could get
  that in this release so that user namespace support would be enabled
  with an allyesconfig or an allmodconfig but it looks like the xfs
  changes need another couple of days before it they are ready."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
  cifs: Enable building with user namespaces enabled.
  cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
  cifs: Convert struct cifs_sb_info to use kuids and kgids
  cifs: Modify struct smb_vol to use kuids and kgids
  cifs: Convert struct cifsFileInfo to use a kuid
  cifs: Convert struct cifs_fattr to use kuid and kgids
  cifs: Convert struct tcon_link to use a kuid.
  cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
  cifs: Convert from a kuid before printing current_fsuid
  cifs: Use kuids and kgids SID to uid/gid mapping
  cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
  cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
  cifs: Override unmappable incoming uids and gids
  nfsd: Enable building with user namespaces enabled.
  nfsd: Properly compare and initialize kuids and kgids
  nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
  nfsd: Modify nfsd4_cb_sec to use kuids and kgids
  nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
  nfsd: Convert nfsxdr to use kuids and kgids
  nfsd: Convert nfs3xdr to use kuids and kgids
  ...
2013-02-25 16:00:49 -08:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Trond Myklebust
a9a6b52ee1 SUNRPC: Don't start the retransmission timer when out of socket space
If the socket is full, we're better off just waiting until it empties,
or until the connection is broken. The reason why we generally don't
want to time out is that the call to xprt->ops->release_xprt() will
trigger a connection reset, which isn't helpful...

Let's make an exception for soft RPC calls, since they have to provide
timeout guarantees.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-02-22 15:17:17 -05:00
Linus Torvalds
06991c28f3 Driver core patches for 3.9-rc1
Here is the big driver core merge for 3.9-rc1
 
 There are two major series here, both of which touch lots of drivers all
 over the kernel, and will cause you some merge conflicts:
   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.
   - remove CONFIG_EXPERIMENTAL
 
 If you need me to provide a merged tree to handle these resolutions,
 please let me know.
 
 Other than those patches, there's not much here, some minor fixes and
 updates.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmV0cACgkQMUfUDdst+yncCQCfbmnQZju7kzWXk6PjdFuKspT9
 weAAoMCzcAtEzzc4LXuUxxG/sXBVBCjW
 =yWAQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg Kroah-Hartman:
 "Here is the big driver core merge for 3.9-rc1

  There are two major series here, both of which touch lots of drivers
  all over the kernel, and will cause you some merge conflicts:

   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.

   - remove CONFIG_EXPERIMENTAL

  Other than those patches, there's not much here, some minor fixes and
  updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
  base: memory: fix soft/hard_offline_page permissions
  drivercore: Fix ordering between deferred_probe and exiting initcalls
  backlight: fix class_find_device() arguments
  TTY: mark tty_get_device call with the proper const values
  driver-core: constify data for class_find_device()
  firmware: Ignore abort check when no user-helper is used
  firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
  firmware: Make user-mode helper optional
  firmware: Refactoring for splitting user-mode helper code
  Driver core: treat unregistered bus_types as having no devices
  watchdog: Convert to devm_ioremap_resource()
  thermal: Convert to devm_ioremap_resource()
  spi: Convert to devm_ioremap_resource()
  power: Convert to devm_ioremap_resource()
  mtd: Convert to devm_ioremap_resource()
  mmc: Convert to devm_ioremap_resource()
  mfd: Convert to devm_ioremap_resource()
  media: Convert to devm_ioremap_resource()
  iommu: Convert to devm_ioremap_resource()
  drm: Convert to devm_ioremap_resource()
  ...
2013-02-21 12:05:51 -08:00
Shani Michaeli
7083e42ee2 IB/core: Add "type 2" memory windows support
This patch enhances the IB core support for Memory Windows (MWs).

MWs allow an application to have better/flexible control over remote
access to memory.

Two types of MWs are supported, with the second type having two flavors:

    Type 1  - associated with PD only
    Type 2A - associated with QPN only
    Type 2B - associated with PD and QPN

Applications can allocate a MW once, and then repeatedly bind the MW
to different ranges in MRs that are associated to the same PD. Type 1
windows are bound through a verb, while type 2 windows are bound by
posting a work request.

The 32-bit memory key is composed of a 24-bit index and an 8-bit
key. The key is changed with each bind, thus allowing more control
over the peer's use of the memory key.

The changes introduced are the following:

* add memory window type enum and a corresponding parameter to ib_alloc_mw.
* type 2 memory window bind work request support.
* create a struct that contains the common part of the bind verb struct
  ibv_mw_bind and the bind work request into a single struct.
* add the ib_inc_rkey helper function to advance the tag part of an rkey.

Consumer interface details:

* new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
  IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
  for these features.

  Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
  IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
  memory windows. It can set neither to indicate it doesn't support
  type 2 windows at all.

* modify existing provides and consumers code to the new param of
  ib_alloc_mw and the ib_mw_bind_info structure

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-21 11:51:45 -08:00
Linus Torvalds
2171ee8f43 NFS client bugfixes for Linux 3.9
- Fix an Oops in the pNFS layoutget code
 - Fix a number of NFSv4 and v4.1 state recovery deadlocks and hangs
   due to the interaction of the session drain lock and state management
   locks.
 - Remove task->tk_xprt, which was hiding a lot of RCU dereferencing bugs
 - Fix a long standing NFSv3 posix lock recovery bug.
 - Revert commit 324d003b0c. It turned out
   that the root cause of the deadlock was due to interactions with the
   workqueues that have now been resolved.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRJUfVAAoJEGcL54qWCgDySIoP/2QC2qNKqfcoIwOqOAR0pu5m
 wHybR67WXt/R6EK17l273apHfnJQb/IFFKRBfAIAW0vy4bJLwHMtutIeUweFWht2
 hmluYqVWAa9lPwaY1YMolo8Mfizaj0cY3QHv2ogF4poY0SLO6L8O/xo8n8nrn/pk
 QJW9tcvz43A2tbW92t833sQMFQSSjs9L2tvxM/BEahLrSEE4dl6WMWscmHQ/IXt/
 Hs3ADc4//dDF36k5lu7tU0qaVNDacjMXCV6Bw0HUCWCwjWzqmPxSlTDKAcer2vpm
 UawKDFTF4X1QYwbtGlNju0xhsSGRg7ZNRVEeRxqSROT280nzpp7C3CjM9sECtM2l
 SR9RqoYfBsEAyWZtsALApuSBZZZHOvUfVJEL1Cm0ZSeEbWSOO/DT07KRVgfb97h4
 hmE7pn5htgXdiGN2YCphMjrqjaoP4aKmULenj1h1jHbDOgrlqp2QLWqg/5NxMsTw
 M3vmYqQWkW60JN6e1pWJbsoyUkSMTVycRNrZ6WgShtd1lrz3tg66rDXYWdIhU6pT
 6qAKrkXnM5JkLoMoImNQod00Vr1ZXpVc8DLoFtug8rGohCdFJhWkYHefqVJ9rFhv
 2S2NpKDsFiA0q3aK0/YAkUnykYDAnLn7BM5SvhJjq8mKru4bP70PrN9LO8aqhE4E
 fFgj2MGG5QwrL0/GMuls
 =dthb
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix an Oops in the pNFS layoutget code

 - Fix a number of NFSv4 and v4.1 state recovery deadlocks and hangs due
   to the interaction of the session drain lock and state management
   locks.

 - Remove task->tk_xprt, which was hiding a lot of RCU dereferencing
   bugs

 - Fix a long standing NFSv3 posix lock recovery bug.

 - Revert commit 324d003b0c ("NFS: add nfs_sb_deactive_async to avoid
   deadlock").  It turned out that the root cause of the deadlock was
   due to interactions with the workqueues that have now been resolved.

* tag 'nfs-for-3.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (22 commits)
  NLM: Ensure that we resend all pending blocking locks after a reclaim
  umount oops when remove blocklayoutdriver first
  sunrpc: silence build warning in gss_fill_context
  nfs: remove kfree() redundant null checks
  NFSv4.1: Don't decode skipped layoutgets
  NFSv4.1: Fix bulk recall and destroy of layouts
  NFSv4.1: Fix an ABBA locking issue with session and state serialisation
  NFSv4: Fix a reboot recovery race when opening a file
  NFSv4: Ensure delegation recall and byte range lock removal don't conflict
  NFSv4: Fix up the return values of nfs4_open_delegation_recall
  NFSv4.1: Don't lose locks when a server reboots during delegation return
  NFSv4.1: Prevent deadlocks between state recovery and file locking
  NFSv4: Allow the state manager to mark an open_owner as being recovered
  SUNRPC: Add missing static declaration to _gss_mech_get_by_name
  Revert "NFS: add nfs_sb_deactive_async to avoid deadlock"
  SUNRPC: Nuke the tk_xprt macro
  SUNRPC: Avoid RCU dereferences in the transport bind and connect code
  SUNRPC: Fix an RCU dereference in xprt_reserve
  SUNRPC: Pass pointers to struct rpc_xprt to the congestion window
  SUNRPC: Fix an RCU dereference in xs_local_rpcbind
  ...
2013-02-21 09:23:01 -08:00
Jeff Layton
173db30934 sunrpc: silence build warning in gss_fill_context
Since commit 620038f6d2, gcc is throwing the following warning:

  CC [M]  net/sunrpc/auth_gss/auth_gss.o
In file included from include/linux/sunrpc/types.h:14:0,
                 from include/linux/sunrpc/sched.h:14,
                 from include/linux/sunrpc/clnt.h:18,
                 from net/sunrpc/auth_gss/auth_gss.c:45:
net/sunrpc/auth_gss/auth_gss.c: In function ‘gss_pipe_downcall’:
include/linux/sunrpc/debug.h:45:10: warning: ‘timeout’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
    printk(KERN_DEFAULT args); \
          ^
net/sunrpc/auth_gss/auth_gss.c:194:15: note: ‘timeout’ was declared here
  unsigned int timeout;
               ^
If simple_get_bytes returns an error, then we'll end up calling printk
with an uninitialized timeout value. Reasonably harmless, but fairly
simple to fix by removing the printout of the uninitialised parameters.

Cc: Andy Adamson <andros@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
[Trond: just remove the parameters rather than initialising timeout]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-17 15:37:14 -05:00
J. Bruce Fields
cc630d9f47 svcrpc: fix rpc server shutdown races
Rewrite server shutdown to remove the assumption that there are no
longer any threads running (no longer true, for example, when shutting
down the service in one network namespace while it's still running in
others).

Do that by doing what we'd do in normal circumstances: just CLOSE each
socket, then enqueue it.

Since there may not be threads to handle the resulting queued xprts,
also run a simplified version of the svc_recv() loop run by a server to
clean up any closed xprts afterwards.

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-17 10:53:51 -05:00
J. Bruce Fields
e75bafbff2 svcrpc: make svc_age_temp_xprts enqueue under sv_lock
svc_age_temp_xprts expires xprts in a two-step process: first it takes
the sv_lock and moves the xprts to expire off their server-wide list
(sv_tempsocks or sv_permsocks) to a local list.  Then it drops the
sv_lock and enqueues and puts each one.

I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
sv_lock and sp_lock are not otherwise nested anywhere (and documentation
at the top of this file claims it's correct to nest these with sp_lock
inside.)

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-17 10:53:16 -05:00
Stanislav Kinsbursky
d94af6dea9 SUNRPC: move cache_detail->cache_request callback call to cache_read()
The reason to move cache_request() callback call from
sunrpc_cache_pipe_upcall() to cache_read() is that this garantees, that cache
access will be done userspace process context (only userspace process have
proper root context).
This is required for NFSd support in container: svc_export_request() (which is
cache_request callback) calls d_path(), which, in turn, traverse dentry up to
current->fs->root. Kernel threads always have global root, while container
have be in "root jail" - i.e. have it's own nested root.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:48 -05:00
Stanislav Kinsbursky
21cd1254d3 SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
Passing this pointer is redundant since it's stored on cache_detail structure,
which is also passed to sunrpc_cache_pipe_upcall () function.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:47 -05:00
Stanislav Kinsbursky
2d4383383b SUNRPC: rework cache upcall logic
For most of SUNRPC caches (except NFS DNS cache) cache_detail->cache_upcall is
redundant since all that it's implementations are doing is calling
sunrpc_cache_pipe_upcall() with proper function address argument.
Cache request function address is now stored on cache_detail structure and
thus all the code can be simplified.
Now, for those cache details, which doesn't have cache_upcall callback (the
only one, which still has is nfs_dns_resolve_template)
sunrpc_cache_pipe_upcall will be called instead.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:46 -05:00
Stanislav Kinsbursky
73fb847a44 SUNRPC: introduce cache_detail->cache_request callback
This callback will allow to simplify upcalls in further patches in this
series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:45 -05:00
Eric W. Biederman
f025adf191 sunrpc: Properly decode kuids and kgids in RPC_AUTH_UNIX credentials
When reading kuids from the wire map them into the initial user
namespace, and validate the mapping succeded.

When reading kgids from the wire map them into the initial user
namespace, and validate the mapping succeded.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:26 -08:00
Eric W. Biederman
25da926371 sunrpc: Properly encode kuids and kgids in auth.unix.gid rpc pipe upcalls.
When a new rpc connection is established with an in-kernel server, the
traffic passes through svc_process_common, and svc_set_client and down
into svcauth_unix_set_client if it is of type RPC_AUTH_NULL or
RPC_AUTH_UNIX.

svcauth_unix_set_client then looks at the uid of the credential we
have assigned to the incomming client and if we don't have the groups
already cached makes an upcall to get a list of groups that the client
can use.

The upcall encodes send a rpc message to user space encoding the uid
of the user whose groups we want to know.  Encode the kuid of the user
in the initial user namespace as nfs mounts can only happen today in
the initial user namespace.

When a reply to an upcall comes in convert interpret the uid and gid values
from the rpc pipe as uids and gids in the initial user namespace and convert
them into kuids and kgids before processing them further.

When reading proc files listing the uid to gid list cache convert the
kuids and kgids from into uids and gids the initial user namespace.  As we are
displaying server internal details it makes sense to display these values
from the servers perspective.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:25 -08:00
Eric W. Biederman
a570abbb96 sunrpc: Properly encode kuids and kgids in RPC_AUTH_UNIX credentials
When writing kuids onto the wire first map them into the initial user
namespace.

When writing kgids onto the wire first map them into the initial user
namespace.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:24 -08:00
Eric W. Biederman
9e469e30d7 sunrpc: Hash uids by first computing their value in the initial userns
In svcauth_unix introduce a helper unix_gid_hash as otherwise the
expresion to generate the hash value is just too long.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:23 -08:00
Eric W. Biederman
683428fae8 sunrpc: Update svcgss xdr handle to rpsec_contect cache
For each received uid call make_kuid and validate the result.
For each received gid call make_kgid and validate the result.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:22 -08:00
Eric W. Biederman
90602c7b19 sunrpc: Update gss uid to security context mapping.
- Use from_kuid when generating the on the wire uid values.
- Use make_kuid when reading on the wire values.

In gss_encode_v0_msg, since the uid in gss_upcall_msg is now a kuid_t
generate the necessary uid_t value on the stack copy it into
gss_msg->databuf where it can safely live until the message is no
longer needed.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:21 -08:00
Eric W. Biederman
e572fc7398 sunrpc: Use gid_valid to test for gid != INVALID_GID
In auth unix there are a couple of places INVALID_GID is used a
sentinel to mark the end of uc_gids array.  Use gid_valid
as a type safe way to verify we have not hit the end of
valid data in the array.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:20 -08:00
Eric W. Biederman
cdba321e29 sunrpc: Convert kuids and kgids to uids and gids for printing
When printing kuids and kgids for debugging purpropses convert them
to ordinary integers so their values can be fed to the oridnary
print functions.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:19 -08:00
Eric W. Biederman
9132adb021 sunrpc: Simplify auth_unix now that everything is a kgid_t
In unx_create_cred directly assign gids from acred->group_info
to cred->uc_gids.

In unx_match directly compare uc_gids with group_info.

Now that both group_info and unx_cred gids are stored as kgids
this is valid and the extra layer of translation can be removed.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:18 -08:00
Eric W. Biederman
0b4d51b02a sunrpc: Use uid_eq and gid_eq where appropriate
When comparing uids use uid_eq instead of ==.
When comparing gids use gid_eq instead of ==.

And unfortunate cost of type safety.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:17 -08:00
Eric W. Biederman
7eaf040b72 sunrpc: Use kuid_t and kgid_t where appropriate
Convert variables that store uids and gids to be of type
kuid_t and kgid_t instead of type uid_t and gid_t.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:16 -08:00
Eric W. Biederman
bf37f79437 sunrpc: Use userns friendly constants.
Instead of (uid_t)0 use GLOBAL_ROOT_UID.
Instead of (gid_t)0 use GLOBAL_ROOT_GID.
Instead of (uid_t)-1 use INVALID_UID
Instead of (gid_t)-1 use INVALID_GID.
Instead of NOGROUP use INVALID_GID.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:15:15 -08:00
Linus Torvalds
e06b84052a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Revert iwlwifi reclaimed packet tracking, it causes problems for a
    bunch of folks.  From Emmanuel Grumbach.

 2) Work limiting code in brcmsmac wifi driver can clear tx status
    without processing the event.  From Arend van Spriel.

 3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger.

 4) l2tp tunnel delete can race with close, fix from Tom Parkin.

 5) pktgen_add_device() failures are not checked at all, fix from Cong
    Wang.

 6) Fix unintentional removal of carrier off from tun_detach(),
    otherwise we confuse userspace, from Michael S.  Tsirkin.

 7) Don't leak socket reference counts and ubufs in vhost-net driver,
    from Jason Wang.

 8) vmxnet3 driver gets it's initial carrier state wrong, fix from Neil
    Horman.

 9) Protect against USB networking devices which spam the host with 0
    length frames, from Bjørn Mork.

10) Prevent neighbour overflows in ipv6 for locally destined routes,
    from Marcelo Ricardo.  This is the best short-term fix for this, a
    longer term fix has been implemented in net-next.

11) L2TP uses ipv4 datagram routines in it's ipv6 code, whoops.  This
    mistake is largely because the ipv6 functions don't even have some
    kind of prefix in their names to suggest they are ipv6 specific.
    From Tom Parkin.

12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from
    Yuchung Cheng.

13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from
    Francois Romieu and your's truly.

14) Fix infinite loops and divides by zero in TCP congestion window
    handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen.

15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix
    from Phil Sutter.

16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala.

17) Protect XEN netback driver against hostile frontend putting garbage
    into the rings, don't leak pages in TX GOP checking, and add proper
    resource releasing in error path of xen_netbk_get_requests().  From
    Ian Campbell.

18) SCTP authentication keys should be cleared out and released with
    kzfree(), from Daniel Borkmann.

19) L2TP is a bit too clever trying to maintain skb->truesize, and ends
    up corrupting socket memory accounting to the point where packet
    sending is halted indefinitely.  Just remove the adjustments
    entirely, they aren't really needed.  From Eric Dumazet.

20) ATM Iphase driver uses a data type with the same name as the S390
    headers, rename to fix the build.  From Heiko Carstens.

21) Fix a typo in copying the inner network header offset from one SKB
    to another, from Pravin B Shelar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
  net: sctp: sctp_endpoint_free: zero out secret key data
  net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
  atm/iphase: rename fregt_t -> ffreg_t
  net: usb: fix regression from FLAG_NOARP code
  l2tp: dont play with skb->truesize
  net: sctp: sctp_auth_key_put: use kzfree instead of kfree
  netback: correct netbk_tx_err to handle wrap around.
  xen/netback: free already allocated memory on failure in xen_netbk_get_requests
  xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
  xen/netback: shutdown the ring if it contains garbage.
  net: qmi_wwan: add more Huawei devices, including E320
  net: cdc_ncm: add another Huawei vendor specific device
  ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
  tcp: fix for zero packets_in_flight was too broad
  brcmsmac: rework of mac80211 .flush() callback operation
  ssb: unregister gpios before unloading ssb
  bcma: unregister gpios before unloading bcma
  rtlwifi: Fix scheduling while atomic bug
  net: usbnet: fix tx_dropped statistics
  tcp: ipv6: Update MIB counters for drops
  ...
2013-02-09 07:55:24 +11:00
Jeff Layton
4c190e2f91 sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
When GSSAPI integrity signatures are in use, or when we're using GSSAPI
privacy with the v2 token format, there is a trailing checksum on the
xdr_buf that is returned.

It's checked during the authentication stage, and afterward nothing
cares about it. Ordinarily, it's not a problem since the XDR code
generally ignores it, but it will be when we try to compute a checksum
over the buffer to help prevent XID collisions in the duplicate reply
cache.

Fix the code to trim off the checksums after verifying them. Note that
in unwrap_integ_data, we must avoid trying to reverify the checksum if
the request was deferred since it will no longer be present when it's
revisited.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2013-02-08 15:19:10 -05:00
Jeff Layton
5976687a2b sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h
These routines are used by server and client code, so having them in a
separate header would be best.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05 09:41:14 -05:00
Trond Myklebust
c5f5e9c5d2 SUNRPC: Add missing static declaration to _gss_mech_get_by_name
Ditto for _gss_mech_get_by_pseudoflavor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:48 -05:00
Trond Myklebust
ad2368d6f5 SUNRPC: Avoid RCU dereferences in the transport bind and connect code
Avoid an RCU dereference by removing task->tk_xprt

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Trond Myklebust
45bc0dce98 SUNRPC: Fix an RCU dereference in xprt_reserve
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Trond Myklebust
6a24dfb645 SUNRPC: Pass pointers to struct rpc_xprt to the congestion window
Avoid access to task->tk_xprt

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Trond Myklebust
3dc0da278e SUNRPC: Fix an RCU dereference in xs_local_rpcbind
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Trond Myklebust
1b092092bf SUNRPC: Pass a pointer to struct rpc_xprt to the connect callback
Avoid another RCU dereference by passing the pointer to struct rpc_xprt
from the caller.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Trond Myklebust
a4f0835c60 SUNRPC: Eliminate task->tk_xprt accesses that bypass rcu_dereference()
tk_xprt is just a shortcut for tk_client->cl_xprt, however cl_xprt is
defined as an __rcu variable. Replace dereferences of tk_xprt with
non-rcu dereferences where it is safe to do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:47 -05:00
Tom Parkin
73df66f8b1 ipv6: rename datagram_send_ctl and datagram_recv_ctl
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific.  Since
datagram_send_ctl is publicly exported it should be appropriately named to
reflect the fact that it's for IPv6 only.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 13:53:08 -05:00
Trond Myklebust
edd2e36fe8 SUNRPC: When changing the queue priority, ensure that we change the owner
This fixes a livelock in the xprt->sending queue where we end up never
making progress on lower priority tasks because sleep_on_priority()
keeps adding new tasks with the same owner to the head of the queue,
and priority bumps mean that we keep resetting the queue->owner to
whatever task is at the head of the queue.

Regression introduced by commit c05eecf636
(SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones).

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-01-30 17:45:14 -05:00
Andriy Skulysh
35525b7978 sunrpc: Fix lockd sleeping until timeout
There is a race in enqueueing thread to a pool and
waking up a thread.
lockd doesn't wake up on reception of lock granted callback
if svc_wake_up() is called before lockd's thread is added
to a pool.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:39 -05:00
J. Bruce Fields
624ab46448 svcrpc: silence "unused variable" warning in !RPC_DEBUG case
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:38 -05:00
Yanchuan Nian
ec16867617 nfsd: Remove write permission from file content
The write function doesn't be implemented in file content, and it's meaningless
to write data into this file directly. Remove write permission from it.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:38 -05:00
Greg Kroah-Hartman
ed408f7c0f Merge 3.9-rc4 into driver-core-next
This is to fix up a build problem with a wireless driver due to the
dynamic-debug patches in this branch.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 19:48:18 -08:00
Linus Torvalds
93ccb3910a NFS client bugfixe for Linux 3.8
- Fix a socket lock leak in net/sunrpc/xprt.c
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ8FNyAAoJEGcL54qWCgDy1EAP/jetZgUmOLCV37TVAFDPkaDy
 ADjeIshsJt7T2/2zKWBoDQ4sKSNO3wRbuSQ9gaMPglfdf8j3PV38+2MOyL3L4yTp
 2L5RqVrbzs+xgIRN7uu6pajVNeZpZb4PqphO+2SnM8uSz6XMVpYRoDtVBiEhgF16
 F9csoBEX5HMC4AFhbkDoKOUoIb13cutYdd+0ijKnAwBrc31YUrcQDwUtZfcp8h2P
 xk4q/k5uj0ilHGafu0BkkMqyQLVocvp/FJXDQ5CjCI73J55hE7lcfM2LMavrJ0gA
 ACxE5+kr0vVOaasvpyu3nkntQ4Td6Z2PYbXCyIIlGvsyqCM8QgqUrfTU9zZauxRa
 mrRWgw0c/mqJ2o41Jl2GxWXCPIoDMX9izdZad3wZ9ct0OTTk6RumHTvnGo1XoZBI
 i5UTVgmnZoOFBQ+gWsxBay9rBjEoG2IBxsew7eEDPCXM0nIG0NztvGK7psFbjR1y
 +wPAgB9+NghOzTwH3GrC1zEK5tpGq1DAbyciT5HC7gk/1ZmfVcvT0iAqO6nkyeyX
 MArMSS6TAgR4IH+gr/qdybnwI6AezGVLiRwCScNPWyHq/gJ9tMCpZ+iodQKxMkoW
 PGHaldLdMWtL+PEEYAmqWclMTaEnnsgMbbqmU1PucWYZ9Ovq2Kktzucczd/2GwdO
 Gh2Utpg0vfAJSZkxy1yK
 =ukG7
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfix from Trond Myklebust:

- Fix a socket lock leak in net/sunrpc/xprt.c

* tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
2013-01-11 12:09:04 -08:00
Kees Cook
3fb790ee3a net/sunrpc: remove depends on CONFIG_EXPERIMENTAL
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-01-11 11:40:02 -08:00
Randy Dunlap
7144bca681 nfs: fix sunrpc/clnt.c kernel-doc warnings
Fix new kernel-doc warnings in clnt.c:

  Warning(net/sunrpc/clnt.c:561): No description found for parameter 'flavor'
  Warning(net/sunrpc/clnt.c:561): Excess function parameter 'auth' description in 'rpc_clone_client_set_auth'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-10 14:35:23 -08:00
Trond Myklebust
87ed50036b SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
If the rpc_task exits while holding the socket write lock before it has
allocated an rpc slot, then the usual mechanism for releasing the write
lock in xprt_release() is defeated.

The problem occurs if the call to xprt_lock_write() initially fails, so
that the rpc_task is put on the xprt->sending wait queue. If the task
exits after being assigned the lock by __xprt_lock_write_func, but
before it has retried the call to xprt_lock_and_alloc_slot(), then
it calls xprt_release() while holding the write lock, but will
immediately exit due to the test for task->tk_rqstp != NULL.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]
2013-01-08 14:30:43 -05:00
Trond Myklebust
360e1a5349 SUNRPC: Partial revert of commit 168e4b39d1
Partially revert commit (SUNRPC: add WARN_ON_ONCE for potential deadlock).
The looping behaviour has been tracked down to a knownn issue with
workqueues, and a workaround has now been implemented.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org [>= 3.7]
2013-01-04 12:59:30 -05:00
Trond Myklebust
c6567ed140 SUNRPC: Ensure that we free the rpc_task after cleanups are done
This patch ensures that we free the rpc_task after the cleanup callbacks
are done in order to avoid a deadlock problem that can be triggered if
the callback needs to wait for another workqueue item to complete.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org
2013-01-04 12:53:59 -05:00
Linus Torvalds
982197277c Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux
Pull nfsd update from Bruce Fields:
 "Included this time:

   - more nfsd containerization work from Stanislav Kinsbursky: we're
     not quite there yet, but should be by 3.9.

   - NFSv4.1 progress: implementation of basic backchannel security
     negotiation and the mandatory BACKCHANNEL_CTL operation.  See

       http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

     for remaining TODO's

   - Fixes for some bugs that could be triggered by unusual compounds.
     Our xdr code wasn't designed with v4 compounds in mind, and it
     shows.  A more thorough rewrite is still a todo.

   - If you've ever seen "RPC: multiple fragments per record not
     supported" logged while using some sort of odd userland NFS client,
     that should now be fixed.

   - Further work from Jeff Layton on our mechanism for storing
     information about NFSv4 clients across reboots.

   - Further work from Bryan Schumaker on his fault-injection mechanism
     (which allows us to discard selective NFSv4 state, to excercise
     rarely-taken recovery code paths in the client.)

   - The usual mix of miscellaneous bugs and cleanup.

  Thanks to everyone who tested or contributed this cycle."

* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
  nfsd4: don't leave freed stateid hashed
  nfsd4: free_stateid can use the current stateid
  nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
  nfsd: warn on odd reply state in nfsd_vfs_read
  nfsd4: fix oops on unusual readlike compound
  nfsd4: disable zero-copy on non-final read ops
  svcrpc: fix some printks
  NFSD: Correct the size calculation in fault_inject_write
  NFSD: Pass correct buffer size to rpc_ntop
  nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
  nfsd: simplify service shutdown
  nfsd: replace boolean nfsd_up flag by users counter
  nfsd: simplify NFSv4 state init and shutdown
  nfsd: introduce helpers for generic resources init and shutdown
  nfsd: make NFSd service structure allocated per net
  nfsd: make NFSd service boot time per-net
  nfsd: per-net NFSd up flag introduced
  nfsd: move per-net startup code to separated function
  nfsd: pass net to __write_ports() and down
  nfsd: pass net to nfsd_set_nrthreads()
  ...
2012-12-20 14:04:11 -08:00
J. Bruce Fields
afc59400d6 nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
It may be a matter of personal taste, but I find this makes the code
clearer.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 22:00:16 -05:00
J. Bruce Fields
3a28e33111 svcrpc: fix some printks
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 16:02:40 -05:00
Stanislav Kinsbursky
cd6c596858 SUNRPC: continue run over clients list on PipeFS event instead of break
There are SUNRPC clients, which program doesn't have pipe_dir_name. These
clients can be skipped on PipeFS events, because nothing have to be created or
destroyed. But instead of breaking in case of such a client was found, search
for suitable client over clients list have to be continued. Otherwise some
clients could not be covered by PipeFS event handler.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org [>= v3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-17 12:19:16 -05:00
Trond Myklebust
1efc28780b SUNRPC: variable 'svsk' is unused in function bc_send_request
Silence a compile time warning.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15 17:05:57 -05:00
Trond Myklebust
4a20a988f7 SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket
Silence the unnecessary warning "unhandled error (111) connecting to..."
and convert it to a dprintk for debugging purposes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15 17:02:29 -05:00
Andy Adamson
eb96d5c97b SUNRPC handle EKEYEXPIRED in call_refreshresult
Currently, when an RPCSEC_GSS context has expired or is non-existent
and the users (Kerberos) credentials have also expired or are non-existent,
the client receives the -EKEYEXPIRED error and tries to refresh the context
forever.  If an application is performing I/O, or other work against the share,
the application hangs, and the user is not prompted to refresh/establish their
credentials. This can result in a denial of service for other users.

Users are expected to manage their Kerberos credential lifetimes to mitigate
this issue.

Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number
of times to refresh the gss_context, and then return -EACCES to the application.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-12 15:36:02 -05:00
Andy Adamson
620038f6d2 SUNRPC set gss gc_expiry to full lifetime
Only use the default GSSD_MIN_TIMEOUT if the gss downcall timeout is zero.
Store the full lifetime in gc_expiry (not 3/4 of the lifetime) as subsequent
patches will use the gc_expiry to determine buffered WRITE behavior in the
face of expired or soon to be expired gss credentials.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-12 15:35:59 -05:00
Trond Myklebust
7ce0171d4f Merge branch 'bugfixes' into nfs-for-next 2012-12-11 09:16:26 -05:00
Stanislav Kinsbursky
756933ee8a SUNRPC: remove redundant "linux/nsproxy.h" includes
This is a cleanup patch.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:30 -05:00