Commit Graph

661563 Commits

Author SHA1 Message Date
Linus Torvalds
8f03cf50bc NFS client updates for Linux 4.11
Stable bugfixes:
 - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
 - xprtrdma: Fix Read chunk padding
 - xprtrdma: Per-connection pad optimization
 - xprtrdma: Disable pad optimization by default
 - xprtrdma: Reduce required number of send SGEs
 - nlm: Ensure callback code also checks that the files match
 - pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
 - NFSv4: Fix reboot recovery in copy offload
 - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
 - NFSv4: fix getacl head length estimation
 - NFSv4: fix getacl ERANGE for sum ACL buffer sizes
 
 Features:
 - Add and use dprintk_cont macros
 - Various cleanups to NFS v4.x to reduce code duplication and complexity
 - Remove unused cr_magic related code
 - Improvements to sunrpc "read from buffer" code
 - Clean up sunrpc timeout code and allow changing TCP timeout parameters
 - Remove duplicate mw_list management code in xprtrdma
 - Add generic functions for encoding and decoding xdr streams
 
 Bugfixes:
 - Clean up nfs_show_mountd_netid
 - Make layoutreturn_ops static and use NULL instead of 0 to fix sparse warnings
 - Properly handle -ERESTARTSYS in nfs_rename()
 - Check if register_shrinker() failed during rpcauth_init()
 - Properly clean up procfs/pipefs entries
 - Various NFS over RDMA related fixes
 - Silence unititialized variable warning in sunrpc
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAli3F7YACgkQ18tUv7Cl
 QOvzrQ//dL+nnBaqsm9bA2wwuVJSQ2R1zdkwHOCWghEWROZrQHzpi0VHu0ZKBLzr
 YsYFhHvIPax9Q8USY4B/QFQ3eUuZILEVn+xDruRxZaJPnsA4Zmr16VJwGF2F68Lh
 CGekA5qybqy8lAG6v96Gyjbi+JqjHNCmelYWRv7SX9IZcDjNJpsEbrSI4LkabTWh
 70WtCl3LBzVMRYRxe8+f0mcx4g4XCQ8pDaQRgRnfKtNeQk/+PgWz66xSNinDakVb
 A8AkaiUadPRgUTpap6HfBSicpRvtLQeLhARC0E4YE5pXp2H/kUt2MFe5szblfSCv
 zf2nrPUbNEHjBypFhERzCZZk6EonY6FeOojyW0g2C+rmPdK7WLlKbwTQFxdRGvsx
 78fIiPRdlDHDp9CXzD8V4xxRBJX/KkicA1Vp8CoyQtmpzpu2fjwT0kr9HeD+aEe6
 293+72QUfk05re2HYWF9MCGGVVLdnLLjrKCgwwRQ0HX5WF6GNQxX/yVgBVlqFeV3
 xc8m7ltKco5N9JxIqwlIpySq2e114EQOqsmHYz3gxd7ID9J1NJz+9H2z2EvgAKZ7
 wIPSLoZrdBdnoXG8ZDDTAvPKeB8l6egi6wjrvGKxewVlMbjzogdARsMKWoifnCfG
 HMkH+IEvLGvFc1pPeLbscJGEdVWXVn0thO+8fkS9F9sE/zMX9PA=
 =01DU
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "Highlights include:

  Stable bugfixes:
   - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
   - xprtrdma: Fix Read chunk padding
   - xprtrdma: Per-connection pad optimization
   - xprtrdma: Disable pad optimization by default
   - xprtrdma: Reduce required number of send SGEs
   - nlm: Ensure callback code also checks that the files match
   - pNFS/flexfiles: If the layout is invalid, it must be updated before
     retrying
   - NFSv4: Fix reboot recovery in copy offload
   - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION
     replies to OP_SEQUENCE"
   - NFSv4: fix getacl head length estimation
   - NFSv4: fix getacl ERANGE for sum ACL buffer sizes

  Features:
   - Add and use dprintk_cont macros
   - Various cleanups to NFS v4.x to reduce code duplication and
     complexity
   - Remove unused cr_magic related code
   - Improvements to sunrpc "read from buffer" code
   - Clean up sunrpc timeout code and allow changing TCP timeout
     parameters
   - Remove duplicate mw_list management code in xprtrdma
   - Add generic functions for encoding and decoding xdr streams

  Bugfixes:
   - Clean up nfs_show_mountd_netid
   - Make layoutreturn_ops static and use NULL instead of 0 to fix
     sparse warnings
   - Properly handle -ERESTARTSYS in nfs_rename()
   - Check if register_shrinker() failed during rpcauth_init()
   - Properly clean up procfs/pipefs entries
   - Various NFS over RDMA related fixes
   - Silence unititialized variable warning in sunrpc"

* tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits)
  NFSv4: fix getacl ERANGE for some ACL buffer sizes
  NFSv4: fix getacl head length estimation
  Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
  NFSv4: Fix reboot recovery in copy offload
  pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
  NFSv4: Clean up owner/group attribute decode
  SUNRPC: Add a helper function xdr_stream_decode_string_dup()
  NFSv4: Remove bogus "struct nfs_client" argument from decode_ace()
  NFSv4: Fix the underestimation of delegation XDR space reservation
  NFSv4: Replace callback string decode function with a generic
  NFSv4: Replace the open coded decode_opaque_inline() with the new generic
  NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics
  SUNRPC: Add generic helpers for xdr_stream encode/decode
  sunrpc: silence uninitialized variable warning
  nlm: Ensure callback code also checks that the files match
  sunrpc: Allow xprt->ops->timer method to sleep
  xprtrdma: Refactor management of mw_list field
  xprtrdma: Handle stale connection rejection
  xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs
  xprtrdma: Shrink send SGEs array
  ...
2017-03-01 16:10:30 -08:00
Linus Torvalds
25c4e6c3f0 for-f2fs-4.11
This round introduces several interesting features such as on-disk NAT bitmaps,
 IO alignment, and a discard thread. And it includes a couple of major bug fixes
 as below.
 
 == Enhancement ==
 - introduce on-disk bitmaps to avoid scanning NAT blocks when getting free nids
 - support IO alignment to prepare open-channel SSD integration in future
 - introduce a discard thread to avoid long latency during checkpoint and fstrim
 - use SSR for warm node and enable inline_xattr by default
 - introduce in-memory bitmaps to check FS consistency for debugging
 - improve write_begin by avoiding needless read IO
 
 == Bug fix ==
 - fix broken zone_reset behavior for SMR drive
 - fix wrong victim selection policy during GC
 - fix missing behavior when preparing discard commands
 - fix bugs in atomic write support and fiemap
 - workaround to handle multiple f2fs_add_link calls having same name
 
 And it includes a bunch of clean-up patches as well.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYtmdOAAoJEEAUqH6CSFDSs0UP/AzngT37xVIhVBD13J9oHIuv
 rFA/eHVGRJmU1xc4SG1bghKm45xq8rwUX7irarfvLLc5aL+6VPGSdaRBykUr4A5N
 MN/bgK//EPp7If8EF+8PpY+9x7g67i0mtz5iD8dDrK+bUKV/IDKV1LWw5pR3g/g6
 RwMH0dUVOiD/HJ5iFp1ykTdVPe4vFY013uVmyPxUq+nCBlqlQm1nOvrGjF/HeYyX
 kqcD2LEc79GPfS5ebQIKfCfLE0rsWVnnS6YaqlDNCD5/oRim71CUtA4MPTYv29vp
 R/SebWlayEm+u68+uQUu6AyIk/1IdP0+AtRuQd/VxuteoyXmkTMHER662DqN4F8J
 npPdNrbNdlzwuAP77avy+hplqbD19yUa7o7Fl1No5rfheT3CiNTSj2uoriyEAffH
 1AM6tES7S7n5ttrXOr9iOxrK0u/vuaf7fbKVtK+RI09hwzdvyGB5HUdQB0iP/XR+
 obw8dru79ISMVZ9YuDhSfjI5ohAcfthfuqgjUt2RAfDv19IRsg5eayAp3T6nUfEX
 AGQbV/52dkO9svZztMbcBW95zmqkE0cMeX66KIMCPXNuDiE474t8k115K6kHpFwP
 e4Kx+mTSNhR1LEAaVdmCjbLb0gVrumVHTdjaZopnxTFmE70u/M6h1vY90m1LkReF
 ZDK5mhfMmGzU4wkvbgP8
 =tw8c
 -----END PGP SIGNATURE-----

Merge tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This round introduces several interesting features such as on-disk NAT
  bitmaps, IO alignment, and a discard thread. And it includes a couple
  of major bug fixes as below.

  Enhancements:

   - introduce on-disk bitmaps to avoid scanning NAT blocks when getting
     free nids

   - support IO alignment to prepare open-channel SSD integration in
     future

   - introduce a discard thread to avoid long latency during checkpoint
     and fstrim

   - use SSR for warm node and enable inline_xattr by default

   - introduce in-memory bitmaps to check FS consistency for debugging

   - improve write_begin by avoiding needless read IO

  Bug fixes:

   - fix broken zone_reset behavior for SMR drive

   - fix wrong victim selection policy during GC

   - fix missing behavior when preparing discard commands

   - fix bugs in atomic write support and fiemap

   - workaround to handle multiple f2fs_add_link calls having same name

  ... and it includes a bunch of clean-up patches as well"

* tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits)
  f2fs: avoid to flush nat journal entries
  f2fs: avoid to issue redundant discard commands
  f2fs: fix a plint compile warning
  f2fs: add f2fs_drop_inode tracepoint
  f2fs: Fix zoned block device support
  f2fs: remove redundant set_page_dirty()
  f2fs: fix to enlarge size of write_io_dummy mempool
  f2fs: fix memory leak of write_io_dummy mempool during umount
  f2fs: fix to update F2FS_{CP_}WB_DATA count correctly
  f2fs: use MAX_FREE_NIDS for the free nids target
  f2fs: introduce free nid bitmap
  f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
  f2fs: update the comment of default nr_pages to skipping
  f2fs: drop the duplicate pval in f2fs_getxattr
  f2fs: Don't update the xattr data that same as the exist
  f2fs: kill __is_extent_same
  f2fs: avoid bggc->fggc when enough free segments are avaliable after cp
  f2fs: select target segment with closer temperature in SSR mode
  f2fs: show simple call stack in fault injection message
  f2fs: no need lock_op in f2fs_write_inline_data
  ...
2017-03-01 15:55:04 -08:00
Omar Sandoval
c4baad5029 virtio-console: avoid DMA from stack
put_chars() stuffs the buffer it gets into an sg, but that buffer may be
on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it
manifested as printks getting turned into NUL bytes).

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
2017-03-02 01:35:06 +02:00
Jason Wang
f889491380 vhost: introduce O(1) vq metadata cache
When device IOTLB is enabled, all address translations were stored in
interval tree. O(lgN) searching time could be slow for virtqueue
metadata (avail, used and descriptors) since they were accessed much
often than other addresses. So this patch introduces an O(1) array
which points to the interval tree nodes that store the translations of
vq metadata. Those array were update during vq IOTLB prefetching and
were reset during each invalidation and tlb update. Each time we want
to access vq metadata, this small array were queried before interval
tree. This would be sufficient for static mappings but not dynamic
mappings, we could do optimizations on top.

Test were done with l2fwd in guest (2M hugepage):

   noiommu  | before        | after
tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%)
rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%)

We can almost reach the same performance as noiommu mode.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 01:35:06 +02:00
Stephen Smalley
2651225b5e selinux: wrap cgroup seclabel support with its own policy capability
commit 1ea0ce4069 ("selinux: allow
changing labels for cgroupfs") broke the Android init program,
which looks up security contexts whenever creating directories
and attempts to assign them via setfscreatecon().
When creating subdirectories in cgroup mounts, this would previously
be ignored since cgroup did not support userspace setting of security
contexts.  However, after the commit, SELinux would attempt to honor
the requested context on cgroup directories and fail due to permission
denial.  Avoid breaking existing userspace/policy by wrapping this change
with a conditional on a new cgroup_seclabel policy capability.  This
preserves existing behavior until/unless a new policy explicitly enables
this capability.

Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:27:40 +11:00
David Howells
0837e49ab3 KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()
rcu_dereference_key() and user_key_payload() are currently being used in
two different, incompatible ways:

 (1) As a wrapper to rcu_dereference() - when only the RCU read lock used
     to protect the key.

 (2) As a wrapper to rcu_dereference_protected() - when the key semaphor is
     used to protect the key and the may be being modified.

Fix this by splitting both of the key wrappers to produce:

 (1) RCU accessors for keys when caller has the key semaphore locked:

	dereference_key_locked()
	user_key_payload_locked()

 (2) RCU accessors for keys when caller holds the RCU read lock:

	dereference_key_rcu()
	user_key_payload_rcu()

This should fix following warning in the NFS idmapper

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.10.0 #1 Tainted: G        W
  -------------------------------
  ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage!
  other info that might help us debug this:
  rcu_scheduler_active = 2, debug_locks = 0
  1 lock held by mount.nfs/5987:
    #0:  (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4]
  stack backtrace:
  CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G        W       4.10.0 #1
  Call Trace:
    dump_stack+0xe8/0x154 (unreliable)
    lockdep_rcu_suspicious+0x140/0x190
    nfs_idmap_get_key+0x380/0x420 [nfsv4]
    nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4]
    decode_getfattr_attrs+0xfac/0x16b0 [nfsv4]
    decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4]
    nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4]
    rpcauth_unwrap_resp+0xe8/0x140 [sunrpc]
    call_decode+0x29c/0x910 [sunrpc]
    __rpc_execute+0x140/0x8f0 [sunrpc]
    rpc_run_task+0x170/0x200 [sunrpc]
    nfs4_call_sync_sequence+0x68/0xa0 [nfsv4]
    _nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4]
    nfs4_lookup_root+0xe0/0x350 [nfsv4]
    nfs4_lookup_root_sec+0x70/0xa0 [nfsv4]
    nfs4_find_root_sec+0xc4/0x100 [nfsv4]
    nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4]
    nfs4_get_rootfh+0x6c/0x190 [nfsv4]
    nfs4_server_common_setup+0xc4/0x260 [nfsv4]
    nfs4_create_server+0x278/0x3c0 [nfsv4]
    nfs4_remote_mount+0x50/0xb0 [nfsv4]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    nfs_do_root_mount+0xb0/0x140 [nfsv4]
    nfs4_try_mount+0x60/0x100 [nfsv4]
    nfs_fs_mount+0x5ec/0xda0 [nfs]
    mount_fs+0x74/0x210
    vfs_kern_mount+0x78/0x220
    do_mount+0x254/0xf70
    SyS_mount+0x94/0x100
    system_call+0x38/0xe0

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-03-02 10:09:00 +11:00
David S. Miller
16c54ac903 First round of fixes - details in the commits:
* use a valid hrtimer clock ID in mac80211_hwsim
  * don't reorder frames prior to BA session
  * flush a delayed work at suspend so the state is all valid before
    suspend/resume
  * fix packet statistics in fast-RX, the RX packets
    counter increment was simply missing
  * don't try to re-transmit filtered frames in an aggregation session
  * shorten (for tracing) a debug message
  * typo fix in another debug message
  * fix nul-termination with HWSIM_ATTR_RADIO_NAME in hwsim
  * fix mgmt RX processing when station is looked up by driver/device
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAli1HCYACgkQa3t4Rpy0
 AB1Sfw/+MeYZtAqeem6JzNW7OPPHgJIXHAnDyby79jzDiYZ1GF2c2asWMwBMshuH
 dTYFs15gdkfItrUF9SsvGFLRaRlPJO899lgL54EwcALxqP82Byylm36X29HTUyzB
 MS3lzrr84Ad5F/BkHa949ogcoJwmkmhSYpG5L5EclJs98Vu8pyd/AlBJ+fbNFmHz
 Pe2lDiIjt5H5MvGCZfl5PK1l9GwDLIMhLekn03HEfCN5lNz2c8aru1dhcCXxWSPG
 Oxj2J4n0WknwntKLx5F/Wceee8YDRyUBw0peKYxp8VQPPLVzpoc9GKlaCDBemwVQ
 sfVfgkkZchKtAt1SPJS5yKMfUlEQdqE3w3XIzzRS9bN3AsVK5EKL63zDO7HJ+5nZ
 2hB8kY4fjGUx0RrZKKL0jruSl+AUP5bWJUIt2VM55FM195rXUkDSjsn3M1ZIwiHp
 8fvwLWibKoQg7Y0UIdVQSl03Gfd2dEs0XrvD+fNMI9aNy/EriJZ15pxbNAPSi10Z
 pN02zFCmqxju6w1tppxW9eBoxdbt45lr1cyXY+GN6RTgm2ILM9i+tQHoNf4KuWYl
 5dydXhDkRGpUghevR4K29vBYtoaNK0CteY2xECdRGUVFnMl4QUgHInOeKcSneHCi
 Cvp31WYTAP+U/8ni0VYENqEC0iRbUNrMu2mkECxriHt5go+n5jA=
 =wths
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2017-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
First round of fixes - details in the commits:
 * use a valid hrtimer clock ID in mac80211_hwsim
 * don't reorder frames prior to BA session
 * flush a delayed work at suspend so the state is all valid before
   suspend/resume
 * fix packet statistics in fast-RX, the RX packets
   counter increment was simply missing
 * don't try to re-transmit filtered frames in an aggregation session
 * shorten (for tracing) a debug message
 * typo fix in another debug message
 * fix nul-termination with HWSIM_ATTR_RADIO_NAME in hwsim
 * fix mgmt RX processing when station is looked up by driver/device
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 15:08:34 -08:00
Eric Dumazet
449809a66c tcp/dccp: block BH for SYN processing
SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)

=================================
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
 (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
  mark_irqflags kernel/locking/lockdep.c:2923 [inline]
  __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
  lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:299 [inline]
  inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
  inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
  tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
  tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
  tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
  tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
  tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
  ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
  dst_input include/net/dst.h:492 [inline]
  ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
  __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
  __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
  netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
  napi_skb_finish net/core/dev.c:4602 [inline]
  napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
  e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
  e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
  e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
  napi_poll net/core/dev.c:5171 [inline]
  net_rx_action+0xe70/0x1900 net/core/dev.c:5236
  __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
  invoke_softirq kernel/softirq.c:364 [inline]
  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
  do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
  ret_from_intr+0x0/0x20
  native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
  default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
  default_idle_call+0x36/0x60 kernel/sched/idle.c:96
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x348/0x440 kernel/sched/idle.c:243
  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
  start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
  verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last  enabled at (1738): [<ffffffff84d4deff>]
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): [<ffffffff84d4b92c>]
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&hashinfo->ehash_locks[i])->rlock);
  <Interrupt>
    lock(&(&hashinfo->ehash_locks[i])->rlock);

 *** DEADLOCK ***

1 lock held by syz-executor0/5090:
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
include/net/sock.h:1460 [inline]
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
 valid_state kernel/locking/lockdep.c:2400 [inline]
 mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
 mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
 mark_irqflags kernel/locking/lockdep.c:2941 [inline]
 __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
 lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:299 [inline]
 inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
 reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
 inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
 dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
 dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
 dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
 sk_backlog_rcv include/net/sock.h:896 [inline]
 __release_sock+0x127/0x3a0 net/core/sock.c:2052
 release_sock+0xa5/0x2b0 net/core/sock.c:2539
 sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
 SYSC_setsockopt net/socket.c:1782 [inline]
 SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000

Fixes: 5413d1babe ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 15:03:31 -08:00
Gary Lin
eba38a9682 bpf: update the comment about the length of analysis
Commit 07016151a4 ("bpf, verifier: further improve search
pruning") increased the limit of processed instructions from
32k to 64k, but the comment still mentioned the 32k limit.
This commit updates the comment to reflect the change.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Gary Lin <glin@suse.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 14:56:50 -08:00
Yotam Gigi
df2c43343b bridge: Fix error path in nbp_vlan_init
Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
call failes, the vlan_hash wouldn't be destroyed before inited.

Fixes: efa5356b0d ("bridge: per vlan dst_metadata netlink support")
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 14:55:28 -08:00
Pavel Shilovsky
61cfac6f26 CIFS: Fix possible use after free in demultiplex thread
The recent changes that added SMB3 encryption support introduced
a possible use after free in the demultiplex thread. When we
process an encrypted packed we obtain a pointer to SMB session
but do not obtain a reference. This can possibly lead to a situation
when this session was freed before we copy a decryption key from
there. Fix this by obtaining a copy of the key rather than a pointer
to the session under a spinlock.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-01 16:42:40 -06:00
Rafael J. Wysocki
6bff9c609f Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull changes related to turbostat for v4.11 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits)
  tools/power turbostat: version 17.02.24
  tools/power turbostat: bugfix: --add u32 was printed as u64
  tools/power turbostat: show error on exec
  tools/power turbostat: dump p-state software config
  tools/power turbostat: show package number, even without --debug
  tools/power turbostat: support "--hide C1" etc.
  tools/power turbostat: move --Package and --processor into the --cpu option
  tools/power turbostat: turbostat.8 update
  tools/power turbostat: update --list feature
  tools/power turbostat: use wide columns to display large numbers
  tools/power turbostat: Add --list option to show available header names
  tools/power turbostat: fix zero IRQ count shown in one-shot command mode
  tools/power turbostat: add --cpu parameter
  tools/power turbostat: print sysfs C-state stats
  tools/power turbostat: extend --add option to accept /sys path
  tools/power turbostat: skip unused counters on BDX
  tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
  tools/power turbostat: skip unused counters on SKX
  tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
  tools/power turbostat: initial Gemini Lake SOC support
  ...
2017-03-01 23:34:38 +01:00
Viresh Kumar
28db0c7b1c PM / OPP: Documentation: Fix opp-microvolt in examples
The triplet present in "opp-microvolt" property should be in the order
<target min max>, while all the examples have it in the order
<min target max>.

Fix it.

Luckily all of the users of "opp-microvolt" property have applied brain
instead of copying the examples from documentation and none of the
actual dts files have it wrong.

Reported-by: Rob Herring <robh@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-01 22:47:30 +01:00
Max Filippov
b46dcfa378 xtensa: allow merging vectors into .text section
Currently code for exception/IRQ vectors is stored in kernel image as
initialization data and is copied to its working addresses during
startup. It doesn't always make sense. In many cases vectors location
can be automatically decided at kernel link time and code can be placed
right there. This is especially useful for XIP kernel.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:32:50 -08:00
Max Filippov
9a736fcb09 xtensa: clean up bootable image build targets
Currently xtensa uses 'zImage' as a synonym of 'all', but in fact xtensa
supports three targets: 'Image' (ELF image with reset vector), 'zImage'
(compressed redboot image) and 'uImage' (U-Boot image).
Provide separate 'Image', 'zImage' and 'uImage' make targets that only
build corresponding image type. Make 'all' build all images appropriate
for a platform.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:32:50 -08:00
Max Filippov
4ab18701c6 xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD
FDT tag parsing is not related to whether BLK_DEV_INITRD is configured
or not, move it out of the corresponding #ifdef/#endif block.
This fixes passing external FDT to the kernel configured w/o
BLK_DEV_INITRD support.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-01 12:31:52 -08:00
Josh Poimboeuf
e390f9a968 objtool, modules: Discard objtool annotation sections for modules
The '__unreachable' and '__func_stack_frame_non_standard' sections are
only used at compile time.  They're discarded for vmlinux but they
should also be discarded for modules.

Since this is a recurring pattern, prefix the section names with
".discard.".  It's a nice convention and vmlinux.lds.h already discards
such sections.

Also remove the 'a' (allocatable) flag from the __unreachable section
since it doesn't make sense for a discarded section.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d1091c7fa3 ("objtool: Improve detection of BUG() and other dead ends")
Link: http://lkml.kernel.org/r/20170301180444.lhd53c5tibc4ns77@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-01 20:32:25 +01:00
Linus Torvalds
6053dc9814 - Fix kernel panic on specific Qualcomm platform due to broken erratum
workaround
 - Revert contiguous bit support due to TLB conflict aborts in simulation
 - Don't treat all CPU ID register fields as 4-bit quantities
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJYsHbpAAoJELescNyEwWM03xgH/jPZwxdS0UL1WftjqE7VXBI2
 5STwLBXB8cBr527QHBY7VCbtEQZAFnij7vJ8Eqe6nA8SMDbC1RTJ2ZkiPq0rKjVg
 pVJEdd3jEcRb3HNam90tOBTjlsyuR5Eagj0RI07j+YgsKhCxVTf6wu7z2StKhbuk
 o8P3fbV9JA9c68JR2MR7Z2GFe2pZW0vbRrKoSx6CdoCU8Wod46BUq+P+BoeVR+vZ
 593ERXnDi9tzoWFLyJYobO0PVuoipjX4e6+NxX+hPRKck6gJByainhnl63gGil9L
 vavJ2r3GK4pO5qAsusx6eXQksyR9lX5tGPk7XgJi+M+WMd4Moe+/zNXd4NADGww=
 =aG7Y
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The main fix here addresses a kernel panic triggered on Qualcomm
  QDF2400 due to incorrect register usage in an erratum workaround
  introduced during the merge window.

  Summary:

   - Fix kernel panic on specific Qualcomm platform due to broken
     erratum workaround

   - Revert contiguous bit support due to TLB conflict aborts in
     simulation

   - Don't treat all CPU ID register fields as 4-bit quantities"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/cpufeature: check correct field width when updating sys_val
  Revert "arm64: mm: set the contiguous bit for kernel mappings where appropriate"
  arm64: Avoid clobbering mm in erratum workaround on QDF2400
2017-03-01 10:32:30 -08:00
Liping Zhang
3b45a4106f net: route: add missing nla_policy entry for RTA_MARK attribute
This will add stricter validating for RTA_MARK attribute.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 10:25:56 -08:00
Felix Jia
8c171d6ca5 net/ipv6: avoid possible dead locking on addr_gen_mode sysctl
The addr_gen_mode variable can be accessed by both sysctl and netlink.
Repleacd rtnl_lock() with rtnl_trylock() protect the sysctl operation to
avoid the possbile dead lock.`

Signed-off-by: Felix Jia <felix.jia@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 10:22:48 -08:00
Linus Torvalds
b286cedd47 powerpc updates for 4.11 part 2
Highlights include:
 
  - An update of the disassembly code used by xmon to the latest versions in
    binutils. We've received permission from all the authors of the relevant
    binutils changes to relicense their changes to the relevant files from GPLv3
    to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg
    work to get permission from everyone.
 
  - Addition of the "architected" Power9 CPU table entry, allowing us to boot
    in Power9 architected mode under a hypervisor.
 
  - Updates to the Power9 PMU code.
 
  - Implementation of clear_bit_unlock_is_negative_byte() to optimise
    unlock_page().
 
  - Freescale updates from Scott: "Highlights include 8xx breakpoints and perf,
    t1042rdb display support, and board updates."
 
 Thanks to:
   Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller,
   Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney,
   Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYthsKAAoJEFHr6jzI4aWAaWMQAJ7mAwX98ncoYschPgRmmIun
 f6DtE4IonrxiZ22gp1ct4+c9OFtA+B5FXMcEhOKpfh93lg38PTDjHs9e5kfauD7+
 oTQ2Bg1eXaL48FKdmC5Vs4Kt+/J8e9guGafUC1OVIpTyyRPoZeUDH0lx+kSPV5bd
 PkL+wY/k3W0Njo8WgD1P9u3W15+BxISo/k8c7ajzKTHGBZlAvj5h2gO6XUBNMLyy
 YClB/qIymjZriSB+AeWYD79k8gPbBZPsmZG0ZF1hY060894LgqLB9mPOJdffx/DY
 H7/uP6jcsRDOXTOmyueW1SEmPoQbtysiMd1lNrCXKtC/Okr5uhn2cUhi88AsgWvd
 1QFly2lobcDAKPah/yB7YQGMAcmYvGGNuqrWaosaV2T7r0KprzUYYgCOqzvC3WSJ
 QtVatBzMIqRTMYq+3U4G1aHeCXlRazVQHDuvPby8RdR5b2gIexiqMab2eS7tSMIH
 mCOIunRIvT14g/7wxUV7tahN+ifncNxzAk4DvPO+Wc4FQ4sy7wArv2YipSaWRWtE
 u7tNdBkEwlDkKhJgRU5T0Op2PyMbHwCP8pWuz7PQIhKIcgwmP9wb07BIWG/GGIqn
 07TxJYX2ItabyEMZMsYhzILZqjLyiAaCARANB7ScbQbdP8wdcGZcwismhwnfROIU
 NuxsZg63BUDMoxk7Sauu
 =rspd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "Highlights include:

   - an update of the disassembly code used by xmon to the latest
     versions in binutils. We've received permission from all the
     authors of the relevant binutils changes to relicense their changes
     to the relevant files from GPLv3 to GPLv2, for inclusion in Linux.
     Thanks to Peter Bergner for doing the leg work to get permission
     from everyone.

   - addition of the "architected" Power9 CPU table entry, allowing us
     to boot in Power9 architected mode under a hypervisor.

   - updates to the Power9 PMU code.

   - implementation of clear_bit_unlock_is_negative_byte() to optimise
     unlock_page().

   - Freescale updates from Scott: "Highlights include 8xx breakpoints
     and perf, t1042rdb display support, and board updates."

  Thanks to:
    Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas
    Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan,
    Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter
    Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil
    Mehta, Stewart Smith"

* tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
  powerpc: Remove leftover cputime_to_nsecs call causing build error
  powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU
  powerpc/optprobes: Fix TOC handling in optprobes trampoline
  powerpc/pseries: Advertise Hot Plug Event support to firmware
  cxl: fix nested locking hang during EEH hotplug
  powerpc/xmon: Dump memory in CPU endian format
  powerpc/pseries: Revert 'Auto-online hotplugged memory'
  powerpc/powernv: Make PCI non-optional
  powerpc/64: Implement clear_bit_unlock_is_negative_byte()
  powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
  powerpc/kernel: Remove error message in pcibios_setup_phb_resources()
  powerpc/mm: Fix typo in set_pte_at()
  pci/hotplug/pnv-php: Disable MSI and PCI device properly
  pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts
  pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()
  powerpc: Add POWER9 architected mode to cputable
  powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags()
  powerpc/perf: Avoid FAB_*_MATCH checks for power9
  powerpc/perf: Add restrictions to PMC5 in power9 DD1
  powerpc/perf: Use Instruction Counter value
  ...
2017-03-01 10:10:16 -08:00
Benjamin Tissoires
522214d9be Input: rmi4 - f30: detect INPUT_PROP_BUTTONPAD from the button count
INPUT_PROP_BUTTONPAD is currently only set through the platform data.
The RMI4 header doc says that this property is there to force the
buttonpad property, so we also need to detect it by looking at
the exported buttons count.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-01 10:01:56 -08:00
Linus Torvalds
044d5dfd62 sound fixes for 4.11-rc1
A few last-minute fixes for rc1:
 
 - ALSA core timer and sequencer fixes for bugs spotted by syzkaller
 - A couple of trivial HD-audio fixups
 - Additional PCI / codec IDs for Intel Geminilake
 - Fixes for CT-XFi DMA mask bugs
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEECxfAB4MH3rD5mfB6bDGAVD0pKaQFAli2mQcOHHRpd2FpQHN1
 c2UuZGUACgkQbDGAVD0pKaTboA//fvaaaGlU+IUBvEHV8LlqyfHOUBIIyowYIeJE
 mQHV4lrdMa9vEfXpIB5bztrqRcB6O9Trw0l4gEZfkC+YgLdZKZXi/haSdVnYsHRi
 koD8ZM1EtNu8o9FzGhOg8Cefu+bvQCg/dAglt6oitf3av/j6NmpFwC3EZzcAosH/
 VE3VRBk8AyQj1DYmlfoc+i27ksO9OceQI9xvJvGwdbbwzBTH/dL+PGXCwfF88T2p
 CyUvxtCk2HohHMloV6PtbpdD+ldouZxRvQsiV6MRy0Wg+ARAILvjeS9gdn3UT2LE
 E5J2JLM1X0x5J82Hki3z9vctwvmZbifCj/ewlql+3gFgAAvt0/PiRYZ0W1jcUpK6
 5THLRwU8zCOuAhQxsEzhDZh6mQq/gV69mWdVCgp3Er5faZQm6LqPUsHp2+yB/0aK
 0mXFRCAIjJa62ddtl40LPkPtJoEX9M00+ILNeASjMhpZSM8KyuBUnqygCoB1Kxhv
 vOiNhfIzXl8wQl406o/nIiDzbOdK7Ze7GyT6DHNQtNaS/aA9lS5RxYfcxDGce6c4
 nm9bnfkvRypNeY+dQwX5KefOd+ilYLHcOevUv1rC395pby7rEuVuMl/j83Qr8Pof
 cIjdSFefCCCQafH37UZhQ31noBIrxNFwlDBJh2YLWVDj6tU5ikY4GnjZQnMX/uv9
 /6XhpiM=
 =hOYM
 -----END PGP SIGNATURE-----

Merge tag 'sound-fix-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few last-minute fixes for rc1:

   - ALSA core timer and sequencer fixes for bugs spotted by syzkaller

   - a couple of trivial HD-audio fixups

   - additional PCI / codec IDs for Intel Geminilake

   - fixes for CT-XFi DMA mask bugs"

* tag 'sound-fix-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Fix link corruption by event error handling
  ALSA: hda - Add subwoofer support for Dell Inspiron 17 7000 Gaming
  ALSA: ctxfi: Fallback DMA mask to 32bit
  ALSA: timer: Reject user params with too small ticks
  ALSA: hda: Add Geminilake HDMI codec ID
  ALSA: hda - Fix micmute hotkey problem for a lenovo AIO machine
  ALSA: hda - Add Geminilake PCI ID
2017-03-01 09:59:21 -08:00
David S. Miller
a3695e9751 Merge branch 'vxlan-geneve-rcu-fixes'
Jakub Kicinski says:

====================
VXLAN/geneve RCU fixes

VXLAN and GENEVE need to take RCU lock explicitly because TX path
only has the _bh() flavour of RCU locked.  Making the reconfiguration
path wait for both normal and _bh() RCU would be bigger hassle so
just acquire the lock, as suggested by Pravin:

https://www.mail-archive.com/netdev@vger.kernel.org/msg155583.html
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:58:46 -08:00
Jakub Kicinski
a717e3f740 geneve: lock RCU on TX path
There is no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: fceb9c3e38 ("geneve: avoid using stale geneve socket.")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:58:31 -08:00
Jakub Kicinski
56de859e99 vxlan: lock RCU on TX path
There is no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: c6fcc4fc5f ("vxlan: avoid using stale vxlan socket.")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:58:31 -08:00
Linus Torvalds
544a068f1f Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:

 - add thermal driver for R-Car Gen3 thermal sensors.

 - add thermal driver for ZTE' zx2967 family thermal sensors.

 - convert thermal ID allocation from IDR to IDA.

 - fix a possible NULL dereference in imx thermal driver.

 - fix a ti-soc-thermal driver dependency issue so that critical thermal
   control is still available when CPU_THERMAL is not defined.

 - update binding information for QorIQ thermal driver.

 - a couple of cleanups in thermal core, intel_powerclamp, exynos,
   dra752-thermal, mtk-thermal driver.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  powerpc/mpc85xx: Update TMU device tree node for T1023/T1024
  powerpc/mpc85xx: Update TMU device tree node for T1040/T1042
  dt-bindings: Update QorIQ TMU thermal bindings
  thermal: mtk_thermal: Staticise a number of data variables
  thermal: arm: dra752: Remove all TSHUT related definitions
  thermal: arm: dra752: Remove TSHUT configuration
  thermal: ti-soc-thermal: Remove CPU_THERMAL Dependency from TI_THERMAL
  thermal: imx: Fix possible NULL dereference.
  thermal: exynos: Remove parsing unused samsung,tmu_cal_mode property
  thermal: zx2967: add thermal driver for ZTE's zx2967 family
  thermal: use cpumask_var_t for on-stack cpu masks
  dt: bindings: add documentation for zx2967 family thermal sensor
  thermal/intel_powerclamp: Remove set-but-not-used variables
  thermal: rcar_gen3_thermal: Add R-Car Gen3 thermal driver
  thermal: rcar_gen3_thermal: Document the R-Car Gen3
  thermal: convert devfreq_cooling to use an IDA
  thermal: convert cpu_cooling to use an IDA
  thermal: convert clock cooling to use an IDA
  thermal core: convert ID allocation to IDA
2017-03-01 09:54:32 -08:00
Eric Dumazet
39e6c8208d net: solve a NAPI race
While playing with mlx4 hardware timestamping of RX packets, I found
that some packets were received by TCP stack with a ~200 ms delay...

Since the timestamp was provided by the NIC, and my probe was added
in tcp_v4_rcv() while in BH handler, I was confident it was not
a sender issue, or a drop in the network.

This would happen with a very low probability, but hurting RPC
workloads.

A NAPI driver normally arms the IRQ after the napi_complete_done(),
after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
it.

Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
while IRQ are not disabled, we might have later an IRQ firing and
finding this bit set, right before napi_complete_done() clears it.

This can happen with busy polling users, or if gro_flush_timeout is
used. But some other uses of napi_schedule() in drivers can cause this
as well.

thread 1                                 thread 2 (could be on same cpu, or not)

// busy polling or napi_watchdog()
napi_schedule();
...
napi->poll()

device polling:
read 2 packets from ring buffer
                                          Additional 3rd packet is
available.
                                          device hard irq

                                          // does nothing because
NAPI_STATE_SCHED bit is owned by thread 1
                                          napi_schedule();

napi_complete_done(napi, 2);
rearm_irq();

Note that rearm_irq() will not force the device to send an additional
IRQ for the packet it already signaled (3rd packet in my example)

This patch adds a new NAPI_STATE_MISSED bit, that napi_schedule_prep()
can set if it could not grab NAPI_STATE_SCHED

Then napi_complete_done() properly reschedules the napi to make sure
we do not miss something.

Since we manipulate multiple bits at once, use cmpxchg() like in
sk_busy_loop() to provide proper transactions.

In v2, I changed napi_watchdog() to use a relaxed variant of
napi_schedule_prep() : No need to set NAPI_STATE_MISSED from this point.

In v3, I added more details in the changelog and clears
NAPI_STATE_MISSED in busy_poll_stop()

In v4, I added the ideas given by Alexander Duyck in v3 review

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Dan Carpenter
b2d0fe3547 net/mlx4: && vs & typo
Bitwise & was obviously intended here.

Fixes: 745d8ae462 ("net/mlx4: Spoofcheck and zero MAC can't coexist")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Colin Ian King
4f3de46f7a net: usb: asix_devices: fix missing return code check on call to asix_write_medium_mode
The call to asix_write_medium_mode is not updating the return code ret
and yet ret is being checked for an error. Fix this by assigning ret to
the return code from the call asix_write_medium_mode.

Detected by CoverityScan, CID#1357148 ("Logically Dead Code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Baruch Siach
0bf09c397e MAINTAINERS: Orphan usb/net/hso driver
The email address of Jan Dumon bounces, and there is not relevant information
in the linked website.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Ido Schimmel
f7df4923fa mlxsw: spectrum_router: Avoid potential packets loss
When the structure of the LPM tree changes (f.e., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.

Instead, overwrite the old binding with the new one.

Fixes: 6b75c4807d ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Zhu Yanjun
4f7bfb3982 rds: ib: add the static type to the variables
The variables rds_ib_mr_1m_pool_size and rds_ib_mr_8k_pool_size
are used only in the ib.c file. As such, the static type is
added to limit them in this file.

Cc: Joe Jin <joe.jin@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Xin Long
5179b26694 sctp: call rcu_read_lock before checking for duplicate transport nodes
Commit cd2b708750 ("sctp: check duplicate node before inserting a
new transport") called rhltable_lookup() to check for the duplicate
transport node in transport rhashtable.

But rhltable_lookup() doesn't call rcu_read_lock inside, it could cause
a use-after-free issue if it tries to dereference the node that another
cpu has freed it. Note that sock lock can not avoid this as it is per
sock.

This patch is to fix it by calling rcu_read_lock before checking for
duplicate transport nodes.

Fixes: cd2b708750 ("sctp: check duplicate node before inserting a new transport")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
David Howells
540b1c48c3 rxrpc: Fix deadlock between call creation and sendmsg/recvmsg
All the routines by which rxrpc is accessed from the outside are serialised
by means of the socket lock (sendmsg, recvmsg, bind,
rxrpc_kernel_begin_call(), ...) and this presents a problem:

 (1) If a number of calls on the same socket are in the process of
     connection to the same peer, a maximum of four concurrent live calls
     are permitted before further calls need to wait for a slot.

 (2) If a call is waiting for a slot, it is deep inside sendmsg() or
     rxrpc_kernel_begin_call() and the entry function is holding the socket
     lock.

 (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
     from servicing the other calls as they need to take the socket lock to
     do so.

 (4) The socket is stuck until a call is aborted and makes its slot
     available to the waiter.

Fix this by:

 (1) Provide each call with a mutex ('user_mutex') that arbitrates access
     by the users of rxrpc separately for each specific call.

 (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
     they've got a call and taken its mutex.

     Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
     set but someone else has the lock.  Should I instead only return
     EWOULDBLOCK if there's nothing currently to be done on a socket, and
     sleep in this particular instance because there is something to be
     done, but we appear to be blocked by the interrupt handler doing its
     ping?

 (3) Make rxrpc_new_client_call() unlock the socket after allocating a new
     call, locking its user mutex and adding it to the socket's call tree.
     The call is returned locked so that sendmsg() can add data to it
     immediately.

     From the moment the call is in the socket tree, it is subject to
     access by sendmsg() and recvmsg() - even if it isn't connected yet.

 (4) Lock new service calls in the UDP data_ready handler (in
     rxrpc_new_incoming_call()) because they may already be in the socket's
     tree and the data_ready handler makes them live immediately if a user
     ID has already been preassigned.

     Note that the new call is locked before any notifications are sent
     that it is live, so doing mutex_trylock() *ought* to always succeed.
     Userspace is prevented from doing sendmsg() on calls that are in a
     too-early state in rxrpc_do_sendmsg().

 (5) Make rxrpc_new_incoming_call() return the call with the user mutex
     held so that a ping can be scheduled immediately under it.

     Note that it might be worth moving the ping call into
     rxrpc_new_incoming_call() and then we can drop the mutex there.

 (6) Make rxrpc_accept_call() take the lock on the call it is accepting and
     release the socket after adding the call to the socket's tree.  This
     is slightly tricky as we've dequeued the call by that point and have
     to requeue it.

     Note that requeuing emits a trace event.

 (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
     new mutex immediately and don't bother with the socket mutex at all.

This patch has the nice bonus that calls on the same socket are now to some
extent parallelisable.

Note that we might want to move rxrpc_service_prealloc() calls out from the
socket lock and give it its own lock, so that we don't hang progress in
other calls because we're waiting for the allocator.

We probably also want to avoid calling rxrpc_notify_socket() from within
the socket lock (rxrpc_accept_call()).

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.c.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-01 09:50:58 -08:00
Linus Torvalds
545b2820c4 pwm: Changes for v4.11-rc1
This set contains mostly fixes to existing drivers as well as cleanup of
 code that's not been in active use for a while.
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEEiOrDCAFJzPfAjcif3SOs138+s6EFAli1+XgZHHRoaWVycnku
 cmVkaW5nQGdtYWlsLmNvbQAKCRDdI6zXfz6zoX8LD/4l8MeOgKo+uWsfX7tY1kQ/
 gWitT/g+jCl3kXYtmcDTaLbEnNr0b5tGahmYC7rnCLhoq6Sp1VXjm6LChuoIUrMF
 r53IlUEEgHhClfCK/vqrp/kZoFndmtLZN2V6K8gpM730nNLYAwH6FhGmUFYp47Sm
 Xfdl5PwuSVDNnrBehjXPuAqUpFTp8lzrZsRkFrbX5a6TJ7hjn0TwNdovdGa6pV7k
 0LXKBfSGTRS2BxDmAFqXItC2USG7v3WRCDc4jCtykNLyQKU1yCA+AYwgR9+1jj8o
 1I07gjbZ+Fp8hzcQw/88MdodtS1xGV75WquhYLYWTgbneEWAcRVM47wklvL5gm7O
 70lIAvOveSFGjFoWnB9YACkOoZIxDINLhWwBD0Ytfpu/gaj3VsP/WrdMu6dk5kWq
 va4XHCS+dObT3o4fRkxQsQJVhAJmg4000Mupg4DOgaigd1PmFLFjBsmHua4+Ssf5
 c+y+SVAg8+c3jqA9mHxPJAJesJ41fXBQCB1Xy8pDOe8iK9fBiZjkq2ssiDerN0kI
 Sc4+hZHaGRhJqZUItvKJvNhdvrPkzQsRjrZgVLW/bDEXek3SXEmBAFsuZnr6w4BZ
 TVVxTNzg+h4obJZEay/P6oWO6KDfF/RyTK0CUHj/ORHxbqUViiwlSJkTVgp7Etd6
 mJj0xObBIwUGnSYvqGcShA==
 =J9q4
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
 "This set contains mostly fixes to existing drivers as well as cleanup
  of code that's not been in active use for a while"

* tag 'pwm/for-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (27 commits)
  acpi: lpss: call pwm_add_table() for BSW PWM device
  pwm: Try to load modules during pwm_get()
  pwm: Don't hold pwm_lookup_lock longer than necessary
  pwm: Make the PWM_POLARITY flag in DTB optional
  pwm: Print error messages with pr_err() instead of pr_debug()
  pwm: imx: Add polarity inversion support to i.MX's PWMv2
  pwm: imx: doc: Update imx-pwm.txt documentation entry
  pwm: imx: Remove redundant i.MX PWMv2 code
  pwm: imx: Provide atomic PWM support for i.MX PWMv2
  pwm: imx: Move PWMv2 wait for fifo slot code to a separate function
  pwm: imx: Move PWMv2 software reset code to a separate function
  pwm: imx: Rewrite v1 code to facilitate switch to atomic PWM
  pwm: imx: Add separate set of PWM ops for v1 and v2
  pwm: imx: Remove ipg clock and enable per clock when required
  pwm: lpss: Add Intel Gemini Lake PCI ID
  pwm: lpss: Do not export board infos for different PWM types
  pwm: lpss: Avoid reconfiguring while UPDATE bit is still enabled
  pwm: lpss: Switch to new atomic API
  pwm: lpss: Allow duty cycle to be 0
  pwm: lpss: Avoid potential overflow of base_unit
  ...
2017-03-01 09:46:02 -08:00
Linus Torvalds
3437f9f0a6 AST 2500 support for v4.11 - new hardware
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYtO7nAAoJEAx081l5xIa+48wP/3hd6e10Eih5chxsCAD+hNkv
 ljJ4Fn8mD7KbAxZabeSAMkDqwUUJ3h6fHtVb57GH+dklioLQ8PRno0m2dcVpM3US
 3gw5diGfwvADISgYGYsvKLyIBUJx6Ze0fEmdFb6vKZb/S3W85Yiz/rMIlo6Dlxj1
 OD0PdqDnfTwWFDzMfntJd1t3kViShONDmyABLtXyEXDm8efaqWCCAl+kD1wOho6n
 UmWmG15Dt+Wb1fFgi1u96X8eQn0376i0dfl8ose+/IQMbbdC4QaqDu8gs49UoSks
 l3+V7yBteN7C99jvhnuapk+z7H1KvSk49fqo6lPCUmGBy1S5BQSyGeXnQNs9BS36
 MRfeU9L6PCEbI1/jqcvCxJXcEbqBIMMl8525O44a45wynrG5lSZCliSF/CXNcc8r
 nJRJPNtvmqCw+3OCZiS4Au0rTTJbMM6Tni4ex9XI3cUZ/cazdeTy497eFZI/sX/F
 ENdaKNI22Pdb7dBActseOXPzsDDv+ctFZj02T9ll58wSKy7GZCXZrsul57rKKCaA
 nyBkXsWtwDbo6rIoHcaOLYsdhavMU139DXJXca64qhM99ymmMFQsyU45ctplBlp8
 d0a9BLyqzLreRpXdwOzvi1WiEe5vUGhadlAsmscHyRF3ZfHGa7Tet6HQfYN5eCZQ
 8jH7UB45KJcYGjKDPldr
 =Jt8n
 -----END PGP SIGNATURE-----

Merge tag 'drm-ast-2500-for-v4.11' of git://people.freedesktop.org/~airlied/linux

Pull drm AST2500 support from Dave Airlie:
 "This is a set of changes to enable the AST2500 BMC hardware, and also
  fix some bugs interacting with the older AST hardware.

  Some of the bug fixes are cc'ed to stable"

* tag 'drm-ast-2500-for-v4.11' of git://people.freedesktop.org/~airlied/linux:
  drm/ast: Call open_key before enable_mmio in POST code
  drm/ast: Fix test for VGA enabled
  drm/ast: POST code for the new AST2500
  drm/ast: Rename ast_init_dram_2300 to ast_post_chip_2300
  drm/ast: Factor mmc_test code in POST code
  drm/ast: Fixed vram size incorrect issue on POWER
  drm/ast: Base support for AST2500
  drm/ast: Fix calculation of MCLK
  drm/ast: Remove spurious include
  drm/ast: const'ify mode setting tables
  drm/ast: Handle configuration without P2A bridge
  drm/ast: Fix AST2400 POST failure without BMC FW or VBIOS
2017-03-01 09:42:42 -08:00
Linus Torvalds
f3ecc84b09 misc fixes for v4.11-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYtO93AAoJEAx081l5xIa+1soP/0qfpRzPrDU2OGPLK6bZHHaw
 zVaJxPBAMzrnnp4mhp7WW/+e9etYAD5DU3Q3TgY0PgS4np6o2h+zV0t6/su5PAfi
 0Zzdc41gw5Q3503OV+zmqyfX1zlp5oD6PtiQ3iVrfzKZeMBJPyJ741n21GXDZwmV
 2Z+Wk9uXm8cK0Sc/VFWFtpHVlD3Hrai5Jc9DhaMooFxIKw3FEvlmNBeS/aIyhm0z
 sc+f+GM86LOx5jr9IxCTdVa9Fm1O50CzQW949yt3RrV1eFQ1SzvBJomBIA6oZgdw
 n7V2eyJDhXshTL2GhAQ4F5/DDtpHWTb4yjWp3MV3hIMhZBl7BtrWclWMgbqJqOWM
 WI6rSIJUyx/zpqVtK7ItaIaoFadYoTPtujq7x7weZu9CwrejxgIztlvIosYYng7Q
 spt5T3c+ikeBEalDOADFclsD8TQz2m1yHw9dM0dZ2Z72Ey+yGTisybgk9su3UUBx
 jqb4eLUOqJK9jZtlHzru4lKwBd4FCkg31BUx6+iOO/UddyR6aojO7XzydcVEIiUq
 8SZiINc/v6eLij26x1N5lR2DTa2lvIhjhV2C8dHRnWXTyIdhIBvLHZUVZLb9hX7P
 dRm8mRJ6hRiIhQPluNKnJNlLdcqlbytawPSkFknUGi/bVKh+mIP93FPCU1KjaPJP
 n8z2KBIDZpgOAru36V/5
 =Wm0D
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-for-v4.11-rc1' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Misc fixes for v4.11-rc1.

  This is a selection of fixes for recent bugs, the vmwgfx one is
  important to avoid a regression, and compat ioctl one is pretty urgent
  for stable. Otherwise nothing too much.

  I've got a separate pull req for some AST hw IBM need to enable"

* tag 'drm-fixes-for-v4.11-rc1' of git://people.freedesktop.org/~airlied/linux:
  dma-buf: add support for compat ioctl
  drm/vmwgfx: Work around drm removal of control nodes
  drm/rockchip: cdn-dp: Fix error handling
  drm/rockchip: add extcon dependency for DP
  drm: zte: fix static checker warning on variable 'fmt'
2017-03-01 08:50:33 -08:00
Wanpeng Li
acc9ab6013 KVM: nVMX: Fix pending events injection
L2 fails to boot on a non-APICv box dues to 'commit 0ad3bed6c5
("kvm: nVMX: move nested events check to kvm_vcpu_running")'

KVM internal error. Suberror: 3
extra data[0]: 800000ef
extra data[1]: 1
RAX=0000000000000000 RBX=ffffffff81f36140 RCX=0000000000000000 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000000 RBP=ffff88007c92fe90 RSP=ffff88007c92fe90
R8 =ffff88007fccdca0 R9 =0000000000000000 R10=00000000fffedb3d R11=0000000000000000
R12=0000000000000003 R13=0000000000000000 R14=0000000000000000 R15=ffff88007c92c000
RIP=ffffffff810645e6 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00c00000
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff88007fcc0000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff88007fcd4200 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff88007fcc9000 0000007f
IDT=     ffffffffff578000 00000fff
CR0=80050033 CR2=00000000ffffffff CR3=0000000001e0a000 CR4=003406e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01

We should try to reinject previous events if any before trying to inject
new event if pending. If vmexit is triggered by L2 guest and L0 interested
in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
otherwise, we can consider new IRQs/NMIs which can be injected and call
nested events callback to switch from L2 to L1 if needed and inject the
proper vmexit events. However, 'commit 0ad3bed6c5 ("kvm: nVMX: move
nested events check to kvm_vcpu_running")' results in the handle events
order reversely on non-APICv box. This patch fixes it by bailing out for
pending events and not consider new events in this scenario.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: 0ad3bed6c5 ("kvm: nVMX: move nested events check to kvm_vcpu_running")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-03-01 17:03:24 +01:00
Jérémy Lefaure
0fce546f9f x86/kvm/vmx: remove unused variable in segment_base()
The pointer 'struct desc_struct *d' is unused since commit 8c2e41f7ae
("x86/kvm/vmx: Simplify segment_base()") so let's remove it.

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-03-01 17:03:24 +01:00
Andy Lutomirski
0eb1d0fa6a selftests/x86: Add a basic selftest for ioperm
This doesn't fully exercise the interaction between KVM and ioperm(),
but it does test basic functionality.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-03-01 17:03:23 +01:00
Andy Lutomirski
b7ceaec112 x86/asm: Tidy up TSS limit code
In an earlier version of the patch ("x86/kvm/vmx: Defer TR reload
after VM exit") that introduced TSS limit validity tracking, I
confused which helper was which.  On reflection, the names I chose
sucked.  Rename the helpers to make it more obvious what's going on
and add some comments.

While I'm at it, clear __tss_limit_invalid when force-reloading as
well as when contitionally reloading, since any TR reload fixes the
limit.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-03-01 17:03:22 +01:00
Elena Reshetova
e3736c3eb3 kvm: convert kvm.users_count from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-03-01 17:03:21 +01:00
Arnd Bergmann
9ad82f117f watchdog: retu: restore MFD dependency
The retu watchdog calls into the respective mfd driver, but fails to
link if that is diabled:

drivers/watchdog/built-in.o: In function `retu_wdt_set_timeout':
ziirave_wdt.c:(.text+0x8c88): undefined reference to `retu_write'
ziirave_wdt.c:(.text+0x8c88): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `retu_write'
drivers/watchdog/built-in.o: In function `retu_wdt_start':
ziirave_wdt.c:(.text+0x8cc8): undefined reference to `retu_write'
ziirave_wdt.c:(.text+0x8cc8): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `retu_write'

This restores the dependency as it was before

Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Arnd Bergmann
9297b652bd watchdog: db8500: add back prmcu dependency
When the db8500 watchdog is enabled without the PRCMU, we get a lot of
warnings about duplicate or missing helper functions:

In file included from drivers/watchdog/ux500_wdt.c:21:0:
include/linux/mfd/dbx500-prcmu.h:422:19: error: redefinition of 'prcmu_abb_read'
 static inline int prcmu_abb_read(u8 slave, u8 reg, u8 *value, u8 size)

This restores the dependency as it was.

Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Arnd Bergmann
3736d4eb6a watchdog: kempld: fix gcc-4.3 build
gcc-4.3 can't decide whether the constant value in
kempld_prescaler[PRESCALER_21] is built-time constant or
not, and gets confused by the logic in do_div():

drivers/watchdog/kempld_wdt.o: In function `kempld_wdt_set_stage_timeout':
kempld_wdt.c:(.text.kempld_wdt_set_stage_timeout+0x130): undefined reference to `__aeabi_uldivmod'

This adds a call to ACCESS_ONCE() to force it to not consider
it to be constant, and leaves the more efficient normal case
in place for modern compilers, using an #ifdef to annotate
why we do this hack.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Niklas Cassel
8d5755b3f7 watchdog: softdog: fire watchdog even if softirqs do not get to run
Checking for timer expiration is done from the softirq TIMER_SOFTIRQ.

Since commit 4cd13c21b2 ("softirq: Let ksoftirqd do its job"),
pending softirqs are no longer always handled immediately, instead,
if there are pending softirqs, and ksoftirqd is in state TASK_RUNNING,
the handling of the softirqs are deferred, and are instead supposed
to be handled by ksoftirqd, when ksoftirqd gets scheduled.

If a user space process with a real-time policy starts to misbehave
by never relinquishing the CPU while ksoftirqd is in state TASK_RUNNING,
what will happen is that all softirqs will get deferred, while ksoftirqd,
which is supposed to handle the deferred softirqs, will never get to run.

To make sure that the watchdog is able to fire even when we do not get
to run softirqs, replace the timers with hrtimers.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Arnd Bergmann
ed4a9eca66 watchdog: kempld: revert to full dependency
The kempld watchdog driver requires the respective MFD driver:

drivers/watchdog/built-in.o: In function `kempld_wdt_probe':
kempld_wdt.c:(.text+0x5c78): undefined reference to `kempld_get_mutex'
kempld_wdt.c:(.text+0x5c84): undefined reference to `kempld_read8'
kempld_wdt.c:(.text+0x5c8e): undefined reference to `kempld_release_mutex'
kempld_wdt.c:(.text+0x5d1c): undefined reference to `kempld_read8'
kempld_wdt.c:(.text+0x5d2c): undefined reference to `kempld_write8'

This adds the Kconfig dependency back.

Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Arnd Bergmann
2672b7e01a watchdog: bcm2835: add CONFIG_OF dependency
Without CONFIG_OF, the driver fails to link:

drivers/watchdog/built-in.o: In function `bcm2835_power_off':
bcm2835_wdt.c:(.text+0x1946): undefined reference to `of_find_device_by_node'

This adds a new dependency, to allow the COMPILE_TEST check to succeed.

Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00
Arnd Bergmann
3eafee9564 watchdog: sp805: add back AMBA dependency
The driver fails to link if ARM_AMBA is disabled:

drivers/watchdog/sp805_wdt.o: In function `sp805_wdt_driver_init':
sp805_wdt.c:(.init.text+0x4): undefined reference to `amba_driver_register'

It seems that the COMPILE_TEST was added in the wrong place, as there
is no architecture dependency, but a bus dependency. This moves
the dependency accordingly.

Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-03-01 06:15:10 -08:00