Commit Graph

603024 Commits

Author SHA1 Message Date
Bert Kenward
3497ed8c85 sfc: report supported link speeds on SFP connections
7000-series SFC NICs connected with an SFP+ module currently fail to
report any supported link speeds.

Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Reviewed-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:18:45 -07:00
Eric Dumazet
e0d194adfa net_sched: add missing paddattr description
"make htmldocs" complains otherwise:

.//net/core/gen_stats.c:65: warning: No description found for parameter 'padattr'
.//net/core/gen_stats.c:101: warning: No description found for parameter 'padattr'

Fixes: 9854518ea0 ("sched: align nlattr properly when needed")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:17:39 -07:00
Jakub Sitnicki
00bc0ef588 ipv6: Skip XFRM lookup if dst_entry in socket cache is valid
At present we perform an xfrm_lookup() for each UDPv6 message we
send. The lookup involves querying the flow cache (flow_cache_lookup)
and, in case of a cache miss, creating an XFRM bundle.

If we miss the flow cache, we can end up creating a new bundle and
deriving the path MTU (xfrm_init_pmtu) from on an already transformed
dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
to xfrm_lookup(). This can happen only if we're caching the dst_entry
in the socket, that is when we're using a connected UDP socket.

To put it another way, the path MTU shrinks each time we miss the flow
cache, which later on leads to incorrectly fragmented payload. It can
be observed with ESPv6 in transport mode:

  1) Set up a transformation and lower the MTU to trigger fragmentation
    # ip xfrm policy add dir out src ::1 dst ::1 \
      tmpl src ::1 dst ::1 proto esp spi 1
    # ip xfrm state add src ::1 dst ::1 \
      proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
    # ip link set dev lo mtu 1500

  2) Monitor the packet flow and set up an UDP sink
    # tcpdump -ni lo -ttt &
    # socat udp6-listen:12345,fork /dev/null &

  3) Send a datagram that needs fragmentation with a connected socket
    # perl -e 'print "@" x 1470 | socat - udp6:[::1]:12345
    2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
    00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
    00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
    00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
    (^ ICMPv6 Parameter Problem)
    00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136

  4) Compare it to a non-connected socket
    # perl -e 'print "@" x 1500' | socat - udp6-sendto:[::1]:12345
    00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
    00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)

What happens in step (3) is:

  1) when connecting the socket in __ip6_datagram_connect(), we
     perform an XFRM lookup, miss the flow cache, create an XFRM
     bundle, and cache the destination,

  2) afterwards, when sending the datagram, we perform an XFRM lookup,
     again, miss the flow cache (due to mismatch of flowi6_iif and
     flowi6_oif, which is an issue of its own), and recreate an XFRM
     bundle based on the cached (and already transformed) destination.

To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
altogether whenever we already have a destination entry cached in the
socket. This prevents the path MTU shrinkage and brings us on par with
UDPv4.

The fix also benefits connected PINGv6 sockets, another user of
ip6_sk_dst_lookup_flow(), who also suffer messages being transformed
twice.

Joint work with Hannes Frederic Sowa.

Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:16:06 -07:00
Guillaume Nault
a5c5e2da85 l2tp: fix configuration passed to setup_udp_tunnel_sock()
Unused fields of udp_cfg must be all zeros. Otherwise
setup_udp_tunnel_sock() fills ->gro_receive and ->gro_complete
callbacks with garbage, eventually resulting in panic when used by
udp_gro_receive().

[   72.694123] BUG: unable to handle kernel paging request at ffff880033f87d78
[   72.695518] IP: [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530] PGD 26e2067 PUD 26e3067 PMD 342ed063 PTE 8000000033f87163
[   72.696530] Oops: 0011 [#1] SMP KASAN
[   72.696530] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pptp gre pppox ppp_generic slhc crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw acpi_cpufreq button proc\
essor ext4 crc16 jbd2 mbcache virtio_blk virtio_net virtio_pci virtio_ring virtio
[   72.696530] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.7.0-rc1 #1
[   72.696530] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   72.696530] task: ffff880035b59700 ti: ffff880035b70000 task.ti: ffff880035b70000
[   72.696530] RIP: 0010:[<ffff880033f87d78>]  [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530] RSP: 0018:ffff880035f87bc0  EFLAGS: 00010246
[   72.696530] RAX: ffffed000698f996 RBX: ffff88003326b840 RCX: ffffffff814cc823
[   72.696530] RDX: ffff88003326b840 RSI: ffff880033e48038 RDI: ffff880034c7c780
[   72.696530] RBP: ffff880035f87c18 R08: 000000000000a506 R09: 0000000000000000
[   72.696530] R10: ffff880035f87b38 R11: ffff880034b9344d R12: 00000000ebfea715
[   72.696530] R13: 0000000000000000 R14: ffff880034c7c780 R15: 0000000000000000
[   72.696530] FS:  0000000000000000(0000) GS:ffff880035f80000(0000) knlGS:0000000000000000
[   72.696530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.696530] CR2: ffff880033f87d78 CR3: 0000000033c98000 CR4: 00000000000406a0
[   72.696530] Stack:
[   72.696530]  ffffffff814cc834 ffff880034b93468 0000001481416818 ffff88003326b874
[   72.696530]  ffff880034c7ccb0 ffff880033e48038 ffff88003326b840 ffff880034b93462
[   72.696530]  ffff88003326b88a ffff88003326b88c ffff880034b93468 ffff880035f87c70
[   72.696530] Call Trace:
[   72.696530]  <IRQ>
[   72.696530]  [<ffffffff814cc834>] ? udp_gro_receive+0x1c6/0x1f9
[   72.696530]  [<ffffffff814ccb1c>] udp4_gro_receive+0x2b5/0x310
[   72.696530]  [<ffffffff814d989b>] inet_gro_receive+0x4a3/0x4cd
[   72.696530]  [<ffffffff81431b32>] dev_gro_receive+0x584/0x7a3
[   72.696530]  [<ffffffff810adf7a>] ? __lock_is_held+0x29/0x64
[   72.696530]  [<ffffffff814321f7>] napi_gro_receive+0x124/0x21d
[   72.696530]  [<ffffffffa000b145>] virtnet_receive+0x8df/0x8f6 [virtio_net]
[   72.696530]  [<ffffffffa000b27e>] virtnet_poll+0x1d/0x8d [virtio_net]
[   72.696530]  [<ffffffff81431350>] net_rx_action+0x15b/0x3b9
[   72.696530]  [<ffffffff815893d6>] __do_softirq+0x216/0x546
[   72.696530]  [<ffffffff81062392>] irq_exit+0x49/0xb6
[   72.696530]  [<ffffffff81588e9a>] do_IRQ+0xe2/0xfa
[   72.696530]  [<ffffffff81587a49>] common_interrupt+0x89/0x89
[   72.696530]  <EOI>
[   72.696530]  [<ffffffff810b05df>] ? trace_hardirqs_on_caller+0x229/0x270
[   72.696530]  [<ffffffff8102b3c7>] ? default_idle+0x1c/0x2d
[   72.696530]  [<ffffffff8102b3c5>] ? default_idle+0x1a/0x2d
[   72.696530]  [<ffffffff8102bb8c>] arch_cpu_idle+0xa/0xc
[   72.696530]  [<ffffffff810a6c39>] default_idle_call+0x1a/0x1c
[   72.696530]  [<ffffffff810a6d96>] cpu_startup_entry+0x15b/0x20f
[   72.696530]  [<ffffffff81039a81>] start_secondary+0x12c/0x133
[   72.696530] Code: ff ff ff ff ff ff ff ff ff ff 7f ff ff ff ff ff ff ff 7f 00 7e f8 33 00 88 ff ff 6d 61 58 81 ff ff ff ff 5e de 0a 81 ff ff ff ff <00> 5c e2 34 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00
[   72.696530] RIP  [<ffff880033f87d78>] 0xffff880033f87d78
[   72.696530]  RSP <ffff880035f87bc0>
[   72.696530] CR2: ffff880033f87d78
[   72.696530] ---[ end trace ad7758b9a1dccf99 ]---
[   72.696530] Kernel panic - not syncing: Fatal exception in interrupt
[   72.696530] Kernel Offset: disabled
[   72.696530] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

v2: use empty initialiser instead of "{ NULL }" to avoid relying on
    first field's type.

Fixes: 38fd2af24f ("udp: Add socket based GRO and config")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 11:11:53 -07:00
Bob Liu
2a6f71ad99 xen-blkfront: fix resume issues after a migration
After a migrate to another host (which may not have multiqueue
support), the number of rings (block hardware queues)
may be changed and the ring info structure will also be reallocated.

This patch fixes two related bugs:
 * call blk_mq_update_nr_hw_queues() to make blk-core know the number
   of hardware queues have been changed.
 * Don't store rinfo pointer to hctx->driver_data, because rinfo may be
   reallocated so use hctx->queue_num to get the rinfo structure instead.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-06-08 13:54:46 -04:00
Bob Liu
efd1535270 xen-blkfront: don't call talk_to_blkback when already connected to blkback
Sometimes blkfront may twice receive blkback_changed() notification
(XenbusStateConnected) after migration, which will cause
talk_to_blkback() to be called twice too and confuse xen-blkback.

The flow is as follow:
   blkfront                                        blkback
blkfront_resume()
 > talk_to_blkback()
  > Set blkfront to XenbusStateInitialised
                                                front changed()
                                                 > Connect()
                                                  > Set blkback to XenbusStateConnected

blkback_changed()
 > Skip talk_to_blkback()
   because frontstate == XenbusStateInitialised
 > blkfront_connect()
  > Set blkfront to XenbusStateConnected

-----
And here we get another XenbusStateConnected notification leading
to:
-----
blkback_changed()
 > because now frontstate != XenbusStateInitialised
   talk_to_blkback() is also called again
  > blkfront state changed from
  XenbusStateConnected to XenbusStateInitialised
    (Which is not correct!)

						front_changed():
                                                 > Do nothing because blkback
                                                   already in XenbusStateConnected

Now blkback is in XenbusStateConnected but blkfront is still
in XenbusStateInitialised - leading to no disks.

Poking of the XenbusStateConnected state is allowed (to deal with
block disk change) and has to be dealt with. The most likely
cause of this bug are custom udev scripts hooking up the disks
and then validating the size.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-06-08 13:54:39 -04:00
Mel Gorman
077fa7aed1 futex: Calculate the futex key based on a tail page for file-based futexes
Mike Galbraith reported that the LTP test case futex_wake04 was broken
by commit 65d8fc777f ("futex: Remove requirement for lock_page()
in get_futex_key()").

This test case uses futexes backed by hugetlbfs pages and so there is an
associated inode with a futex stored on such pages. The problem is that
the key is being calculated based on the head page index of the hugetlbfs
page and not the tail page.

Prior to the optimisation, the page lock was used to stabilise mappings and
pin the inode is file-backed which is overkill. If the page was a compound
page, the head page was automatically looked up as part of the page lock
operation but the tail page index was used to calculate the futex key.

After the optimisation, the compound head is looked up early and the page
lock is only relied upon to identify truncated pages, special pages or a
shmem page moving to swapcache. The head page is looked up because without
the page lock, special care has to be taken to pin the inode correctly.
However, the tail page is still required to calculate the futex key so
this patch records the tail page.

On vanilla 4.6, the output of the test case is;

futex_wake04    0  TINFO  :  Hugepagesize 2097152
futex_wake04    1  TFAIL  :  futex_wake04.c:126: Bug: wait_thread2 did not wake after 30 secs.

With the patch applied

futex_wake04    0  TINFO  :  Hugepagesize 2097152
futex_wake04    1  TPASS  :  Hi hydra, thread2 awake!

Fixes: 65d8fc777f "futex: Remove requirement for lock_page() in get_futex_key()"
Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160608132522.GM2469@suse.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-06-08 19:23:54 +02:00
Hariprasad Shenai
c0530dd3ef cxgb4: Add device id of T540-BT adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 10:23:46 -07:00
Josef Bacik
d366a0ff1c nbd: pass the nbd pointer for flags debugfs
We were passing in &nbd for the private data in debugfs_create_file() for the
flags entry.  We expect it to just be nbd, fix this so we get proper output from
this debugfs entry.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-08 09:03:54 -06:00
Josh Poimboeuf
0b0d81e3b7 objtool, drm/vmwgfx: Fix "duplicate frame pointer save" warning
objtool reports the following warnings:

  drivers/gpu/drm/vmwgfx/vmwgfx_msg.o: warning: objtool: vmw_send_msg()+0x107: duplicate frame pointer save
  drivers/gpu/drm/vmwgfx/vmwgfx_msg.o: warning: objtool: vmw_host_get_guestinfo()+0x252: duplicate frame pointer save

To quote Linus:

 "The reason is that VMW_PORT_HB_OUT() uses a magic instruction sequence
  (a "rep outsb") to communicate with the hypervisor (it's a virtual GPU
  driver for vmware), and %rbp is part of the communication. So the
  inline asm does a save-and-restore of the frame pointer around the
  instruction sequence.

  I actually find the objtool warning to be quite reasonable, so it's
  not exactly a false positive, since in this case it actually does
  point out that the frame pointer won't be reliable over that
  instruction sequence.

  But in this particular case it just ends up being the wrong thing -
  the code is what it is, and %rbp just can't have the frame information
  due to annoying magic calling conventions."

Silence the warnings by telling objtool to ignore the two functions
which use the VMW_PORT_HB_{IN,OUT} macros.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: DRI <dri-devel@lists.freedesktop.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160526184343.fdtjjjg67smmeekt@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 15:36:18 +02:00
Robin Murphy
5c1d3310d8 drivers: of: Fix of_pci.h header guard
The compilation of of_pci.c is governed by CONFIG_OF_PCI, but the
corresponding declarations in of_pci.h are inconsistently guarded by
CONFIG_OF, with the result that if CONFIG_PCI is disabled for an OF
platform, the dangling external declarations are still active and the
inline stub definitions not. So far this has managed to go unnoticed
since it happens that the only references to these functions are from
code which itself depends on CONFIG_PCI or CONFIG_OF_PCI.

Fix this with the appropriate config guard so that any new callers
outside PCI-specific code don't start unexpectedly breaking under
certain configs.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2016-06-08 08:18:06 -05:00
Fabio Estevam
3eefa7e8cc dt-bindings: Add vendor prefix for TechNexion
TechNexion designs and manufactures embedded computing systems:
http://www.technexion.com/

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2016-06-08 08:13:31 -05:00
Josh Poimboeuf
4698f88c06 sched/debug: Fix 'schedstats=enable' cmdline option
The 'schedstats=enable' option doesn't work, and also produces the
following warning during boot:

  WARNING: CPU: 0 PID: 0 at /home/jpoimboe/git/linux/kernel/jump_label.c:61 static_key_slow_inc+0x8c/0xa0
  static_key_slow_inc used before call to jump_label_init
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.7.0-rc1+ #25
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   0000000000000086 3ae3475a4bea95d4 ffffffff81e03da8 ffffffff8143fc83
   ffffffff81e03df8 0000000000000000 ffffffff81e03de8 ffffffff810b1ffb
   0000003d00000096 ffffffff823514d0 ffff88007ff197c8 0000000000000000
  Call Trace:
   [<ffffffff8143fc83>] dump_stack+0x85/0xc2
   [<ffffffff810b1ffb>] __warn+0xcb/0xf0
   [<ffffffff810b207f>] warn_slowpath_fmt+0x5f/0x80
   [<ffffffff811e9c0c>] static_key_slow_inc+0x8c/0xa0
   [<ffffffff810e07c6>] static_key_enable+0x16/0x40
   [<ffffffff8216d633>] setup_schedstats+0x29/0x94
   [<ffffffff82148a05>] unknown_bootoption+0x89/0x191
   [<ffffffff810d8617>] parse_args+0x297/0x4b0
   [<ffffffff82148d61>] start_kernel+0x1d8/0x4a9
   [<ffffffff8214897c>] ? set_init_arg+0x55/0x55
   [<ffffffff82148120>] ? early_idt_handler_array+0x120/0x120
   [<ffffffff821482db>] x86_64_start_reservations+0x2f/0x31
   [<ffffffff82148427>] x86_64_start_kernel+0x14a/0x16d

The problem is that it tries to update the 'sched_schedstats' static key
before jump labels have been initialized.

Changing jump_label_init() to be called earlier before
parse_early_param() wouldn't fix it: it would still fail trying to
poke_text() because mm isn't yet initialized.

Instead, just create a temporary '__sched_schedstats' variable which can
be copied to the static key later during sched_init() after jump labels
have been initialized.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cb2517653f ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
Link: http://lkml.kernel.org/r/453775fe3433bed65731a583e228ccea806d18cd.1465322027.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 14:33:05 +02:00
Josh Poimboeuf
9c57259117 sched/debug: Fix /proc/sched_debug regression
Commit:

  cb2517653f ("sched/debug: Make schedstats a runtime tunable that is disabled by default")

... introduced a bug when CONFIG_SCHEDSTATS is enabled and the
runtime tunable is disabled (which is the default).

The wait-time, sum-exec, and sum-sleep fields are missing from the
/proc/sched_debug file in the runnable_tasks section.

Fix it with a new schedstat_val() macro which returns the field value
when schedstats is enabled and zero otherwise.  The macro works with
both SCHEDSTATS and !SCHEDSTATS.  I put the macro in stats.h since it
might end up being useful in other places.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cb2517653f ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
Link: http://lkml.kernel.org/r/bcda7c2790cf2ccbe586a28c02dd7b6fe7749a2b.1464994423.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 14:31:58 +02:00
Alexander Shishkin
62a92c8f55 perf/core: Remove a redundant check
There is no way to end up in _free_event() with event::pmu being NULL.
The latter is initialized in event allocation path and remains set
forever. In case of allocation failure, the error path doesn't use
_free_event().

Having the check, however, suggests that it is possible to have a
event::pmu==NULL situation in _free_event() and confuses the robots.

This patch gets rid of the check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1465303455-26032-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 14:30:01 +02:00
Peter Zijlstra
2c61002271 locking/qspinlock: Fix spin_unlock_wait() some more
While this prior commit:

  54cf809b95 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")

... fixes spin_is_locked() and spin_unlock_wait() for the usage
in ipc/sem and netfilter, it does not in fact work right for the
usage in task_work and futex.

So while the 2 locks crossed problem:

	spin_lock(A)		spin_lock(B)
	if (!spin_is_locked(B)) spin_unlock_wait(A)
	  foo()			foo();

... works with the smp_mb() injected by both spin_is_locked() and
spin_unlock_wait(), this is not sufficient for:

	flag = 1;
	smp_mb();		spin_lock()
	spin_unlock_wait()	if (!flag)
				  // add to lockless list
	// iterate lockless list

... because in this scenario, the store from spin_lock() can be delayed
past the load of flag, uncrossing the variables and loosing the
guarantee.

This patch reworks spin_is_locked() and spin_unlock_wait() to work in
both cases by exploiting the observation that while the lock byte
store can be delayed, the contender must have registered itself
visibly in other state contained in the word.

It also allows for architectures to override both functions, as PPC
and ARM64 have an additional issue for which we currently have no
generic solution.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Giovanni Gherdovich <ggherdovich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <waiman.long@hpe.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org # v4.2 and later
Fixes: 54cf809b95 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 14:29:08 +02:00
Ben Dooks
b66b2a0adf gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings
The bcm_kona_gpio_reset() calls bcm_kona_gpio_write_lock_regs()
with what looks like the wrong parameter. The write_lock_regs
function takes a pointer to the registers, not the bcm_kona_gpio
structure.

Fix the warning, and probably bug by changing the function to
pass reg_base instead of kona_gpio, fixing the following warning:

drivers/gpio/gpio-bcm-kona.c:550:47: warning: incorrect type in argument 1
  (different address spaces)
  expected void [noderef] <asn:2>*reg_base
  got struct bcm_kona_gpio *kona_gpio
  warning: incorrect type in argument 1 (different address spaces)
  expected void [noderef] <asn:2>*reg_base
  got struct bcm_kona_gpio *kona_gpio

Cc: stable@vger.kernel.org
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 14:04:35 +02:00
Borislav Petkov
96685a55a8 x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer models
We need to reenable the topology extensions CPUID leafs on newer models
too, if BIOS has disabled them, as we rely on them to get proper compute
unit topology.

Make the printk a once thing, while at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rui Huang <ray.huang@amd.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-hwmon@vger.kernel.org
Link: http://lkml.kernel.org/r/1464775468-23355-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 13:51:34 +02:00
Linus Walleij
60a5eaba46 gpio: select ANON_INODES
The build servers found that gpiolib is using ANON_INODES but
has forgotten to select it. Fix this.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 521a2ad6f8 ("gpio: add userspace ABI for GPIO line information")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 13:47:37 +02:00
Srinivas Kandagatla
d1e44b6b28 regulator: qcom_smd: add regulator ops for pm8941 lnldo
After "regulator: qcom_smd: add list_voltage callback" patch adding
pm8941 lnldo regulators would bug on list_voltages as it is a fixed
regulator without any linear range.
This patch fixes that issue by adding dedicated ops for pm8941 lnldo
without list_voltages callback.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org # v4.6
2016-06-08 11:59:25 +01:00
Srinivas Kandagatla
a8a47540eb regulator: qcom_smd: add list_voltage callback
This patch adds support to list_voltage callback, so that consumers
like mmc core, can get information of supported voltage range.

Without this patch there is no way for mmc core to know this voltage range.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org # v4.6
2016-06-08 11:59:00 +01:00
Dave Hansen
970442c599 x86/cpu/intel: Introduce macros for Intel family numbers
Problem:

We have a boatload of open-coded family-6 model numbers.  Half of
them have these model numbers in hex and the other half in
decimal.  This makes grepping for them tons of fun, if you were
to try.

Solution:

Consolidate all the magic numbers.  Put all the definitions in
one header.

The names here are closely derived from the comments describing
the models from arch/x86/events/intel/core.c.  We could easily
make them shorter by doing things like s/SANDYBRIDGE/SNB/, but
they seemed fine even with the longer versions to me.

Do not take any of these names too literally, like "DESKTOP"
or "MOBILE".  These are all colloquial names and not precise
descriptions of everywhere a given model will show up.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: jacob.jun.pan@intel.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: platform-driver-x86@vger.kernel.org
Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08 11:59:09 +02:00
Linus Walleij
5ab92a7cb8 leds: handle suspend/resume in heartbeat trigger
The following phenomena was observed: when suspending the
system, sometimes the heartbeat LED was left on, glowing and
wasting power while the rest of the system is asleep, also
disturbing power dissapation measures on the odd suspend
cycle when it's left on.

Clearly this is not how we want the heartbeat trigger to
work: it should turn off and leave the LED off during
system suspend.

This removes the heartbeat trigger when preparing suspend and
restores it during resume. The trigger code will make sure all
LEDs are left in OFF state after removing the trigger, and
will re-enable the trigger on all LEDs after resuming.

Cc: linux-pm@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-06-08 11:47:06 +02:00
Tony Makkiel
7cfe749fad leds: core: Fix brightness setting upon hardware blinking enabled
Commit 76931edd54 ("leds: fix brightness changing when software blinking
is active") changed the semantics of led_set_brightness() which according
to the documentation should disable blinking upon any brightness setting.
Moreover it made it different for soft blink case, where it was possible
to change blink brightness, and for hardware blink case, where setting
any brightness greater than 0 was ignored.

While the change itself is against the documentation claims, it was driven
also by the fact that timer trigger remained active after turning blinking
off. Fixing that would have required major refactoring in the led-core,
led-class, and led-triggers because of cyclic dependencies.

Finally, it has been decided that allowing for brightness change during
blinking is beneficial as it can be accomplished without disturbing
blink rhythm.

The change in brightness setting semantics will not affect existing
LED class drivers that implement blink_set op thanks to the LED_BLINK_SW
flag introduced by this patch. The flag state will be from now on checked
in led_set_brightness() which will allow to distinguish between software
and hardware blink mode. In the latter case the control will be passed
directly to the drivers which apply their semantics on brightness set,
which is disable the blinking in case of most such drivers. New drivers
will apply new semantics and just change the brightness while hardware
blinking is on, if possible.

The issue was smuggled by subsequent LED core improvements, which modified
the code that originally introduced the problem.

Fixes: f1e80c0741 ("leds: core: Add two new LED_BLINK_ flags")
Signed-off-by: Tony Makkiel <tony.makkiel@daqri.com>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-06-08 11:47:06 +02:00
Will Deacon
0106d456c4 arm64: mm: always take dirty state from new pte in ptep_set_access_flags
Commit 66dbd6e61a ("arm64: Implement ptep_set_access_flags() for
hardware AF/DBM") ensured that pte flags are updated atomically in the
face of potential concurrent, hardware-assisted updates. However, Alex
reports that:

 | This patch breaks swapping for me.
 | In the broken case, you'll see either systemd cpu time spike (because
 | it's stuck in a page fault loop) or the system hang (because the
 | application owning the screen is stuck in a page fault loop).

It turns out that this is because the 'dirty' argument to
ptep_set_access_flags is always 0 for read faults, and so we can't use
it to set PTE_RDONLY. The failing sequence is:

  1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte
  2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF
  3. A read faults due to the missing access flag
  4. ptep_set_access_flags is called with dirty = 0, due to the read fault
  5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!)
  6. A write faults, but pte_write is true so we get stuck

The solution is to check the new page table entry (as would be done by
the generic, non-atomic definition of ptep_set_access_flags that just
calls set_pte_at) to establish the dirty state.

Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 66dbd6e61a ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Alexander Graf <agraf@suse.de>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-08 10:23:44 +01:00
Linus Walleij
7d4defe21c gpio: include <linux/io-mapping.h> in gpiolib-of
When enabling the gpiolib for all archs a build robot came
up with this:

All errors (new ones prefixed by >>):

   drivers/gpio/gpiolib-of.c: In function 'of_mm_gpiochip_add_data':
>> drivers/gpio/gpiolib-of.c:317:2: error: implicit declaration of
   function 'iounmap' [-Werror=implicit-function-declaration]
     iounmap(mm_gc->regs);
     ^~~~~~~
   cc1: some warnings being treated as errors

Fix this by including <linux/io-mapping.h> explicitly.

Fixes: 296ad4acb8 ("gpio: remove deps on ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 10:58:20 +02:00
Ricardo Ribalda Delgado
f4833b8cc7 gpiolib: Fix unaligned used of reference counters
gpiolib relies on the reference counters to clean up the gpio_device
structure.

Although the number of get/put is properly aligned on gpiolib.c
itself, it does not take into consideration how the referece counters
are affected by other external functions such as cdev_add and device_add.

Because of this, after the last call to put_device, the reference counter
has a value of +3, therefore never calling gpiodevice_release.

Due to the fact that some of the device  has already been cleaned on
gpiochip_remove, the library will end up OOPsing the kernel (e.g. a call
to of_gpiochip_find_and_xlate).

Cc: stable@vger.kernel.org
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 10:40:29 +02:00
Ricardo Ribalda Delgado
11f33a6d15 gpiolib: Fix NULL pointer deference
Under some circumstances, a gpiochip might be half cleaned from the
gpio_device list.

This patch makes sure that the chip pointer is still valid, before
calling the match function.

[  104.088296] BUG: unable to handle kernel NULL pointer dereference at
0000000000000090
[  104.089772] IP: [<ffffffff813d2045>] of_gpiochip_find_and_xlate+0x15/0x80
[  104.128273] Call Trace:
[  104.129802]  [<ffffffff813d2030>] ? of_parse_own_gpio+0x1f0/0x1f0
[  104.131353]  [<ffffffff813cd910>] gpiochip_find+0x60/0x90
[  104.132868]  [<ffffffff813d21ba>] of_get_named_gpiod_flags+0x9a/0x120
...
[  104.141586]  [<ffffffff8163d12b>] gpio_led_probe+0x11b/0x360

Cc: stable@vger.kernel.org
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 10:38:03 +02:00
Helmut Grohne
0f84f29ff3 gpio: zynq: initialize clock even without CONFIG_PM
When the PM initialization was moved in the commit referenced below, the
code enabling the clock was removed from the probe function. On
CONFIG_PM=y kernels, this is not a problem as the pm resume hook enables
the clock, but when power management is disabled, all those pm_*
functions are noops and the clock is never enabled resulting in a
dysfunctional gpio controller.

Put the clock initialization back to support CONFIG_PM=n.

Cc: stable@vger.kernel.org
Signed-off-by: Helmut Grohne <h.grohne@intenta.de>
Fixes: 3773c195d3 ("gpio: zynq: Do PM initialization earlier to support gpio hogs")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 10:36:29 +02:00
William Breathitt Gray
d15d6cf916 gpio: 104-dio-48e: Fix control port offset computation off-by-one error
There are only two control ports, each controlling three distinct I/O
ports. To compute the control port address offset for a respective I/O
port, the I/O port address offset should be divided by 3; dividing by 2
may result in not only the wrong address offset but possibly also an
out-of-bounds array memory access for a non-existent third control port.

Fixes: 1b06d64f73 ("gpio: Add GPIO support for the ACCES 104-DIO-48E")
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-06-08 10:08:12 +02:00
Ben Dooks
88832a22d6 net-sysfs: fix missing <linux/of_net.h>
The of_find_net_device_by_node() function is defined in
<linux/of_net.h> but not included in the .c file that
implements it. Fix the following warning by including the
header:

net/core/net-sysfs.c:1494:19: warning: symbol 'of_find_net_device_by_node' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 00:37:58 -07:00
Toshiaki Makita
0b148def40 bridge: Don't insert unnecessary local fdb entry on changing mac address
The missing br_vlan_should_use() test caused creation of an unneeded
local fdb entry on changing mac address of a bridge device when there is
a vlan which is configured on a bridge port but not on the bridge
device.

Fixes: 2594e9064a ("bridge: vlan: add per-vlan struct and move to rhashtables")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08 00:31:38 -07:00
David Binder
73e81350ad staging: unisys: visornic: change return statements
Changes return statements in visornic_rx() to use literals instead of a
variable. Also changes function description to reflect the correct return
type.

Signed-off-by: David Binder <david.binder@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Erik Arfvidson
35b2141556 staging: unisys: iovmcall_gnuc.h change -1 return values
This patch changes the vague -1 return values to -EPERM.
This operation is not supported is a good alternative
to -1 because the return is basically telling the caller
that the processor doesn't support vmcall operations.

Signed-off-by: Erik Arfvidson <erik.arfvidson@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Erik Arfvidson
119296eaa8 staging: unisys: visorchipset change -1 return value
This patch changes the vague -1 return value to -EINVAL

Signed-off-by: Erik Arfvidson <erik.arfvidson@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Erik Arfvidson
c294ea31aa staging: unisys: visorbus change -1 return values
This patch changes the vague -1 return values to -EFAULT since
it would be the most appropriate, given that this error
would only occur in an unexpected bad offset field.
Resulting in a bad address.

Signed-off-by: Erik Arfvidson <erik.arfvidson@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Erik Arfvidson
ba78c4707c staging: unisys: visorhba change -1 return value
This patch changes the vague -1 return value to -EBUSY

Signed-off-by: Erik Arfvidson <erik.arfvidson@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Erik Arfvidson
2efffad314 staging: unisys: visorinput change -1 return value
This patch changes the vague -1 return value to -EINVAL

Signed-off-by: Erik Arfvidson <erik.arfvidson@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
Tim Sell
403ecd6364 staging: unisys: visorhba: "Prefer 'unsigned int'" checkpatch warnings
This patch fixes a few checkpatch warnings in visorhba:

    WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
David Binder
e1834bd0f6 staging: unisys: visornic: remove extraneous error check
Removes an extraneous error check in devdata_initialize(), and updates the
function comment accordingly.

Signed-off-by: David Binder <david.binder@unisys.com>
Reviewed-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:58:16 -07:00
David Binder
186896fdf0 staging: unisys: visornic: check for error instead of success
Changes the conditional logic to check for an error code instead
of a success code.

Signed-off-by: David Binder <david.binder@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
David Binder
ab2c3d7545 staging: unisys: visorhba: return 0 literal
Returns 0 instead of variable rc in visorhba_init().

Signed-off-by: David Binder <david.binder@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
David Binder
d12324e37d staging: unisys: visornic: cleanup error handling
Adjusts goto labels to prevent attempts to free unallocated resources.

Signed-off-by: David Binder <david.binder@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
David Binder
6d8c96cbc1 staging: unisys: visornic: simplify visornic if statements
Changes the conditional logic by looking for the absence of work
to do, instead of the opposite.

Signed-off-by: David Binder <david.binder@unisys.com>
Reviewed-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Tim Sell
d91184a9c6 staging: unisys: visorhba: visorhbas_open[] no longer used, so deleted
The prior patch which simplified the visorhba debugfs interface made it so
visorhbas_open[] and VISORHBA_OPEN_MAX were no longer needed, so they have
now been deleted.

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Tim Sell
5e1073d3f4 staging: unisys: visorhba: simplify and enhance debugfs interface
debugfs info for each visorhba device is now presented by a file named of
the following form within the debugfs tree:

    visorhba/vbus<x>:dev<y>/info

where <x> is the vbus number, and <y> is the relative device number.

Also, the debugfs presentation function was converted to use the seq_file
interface, so that it could access the device context without resorting to
a global array.  This also simplified the function.

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Tim Sell
bf817f2f40 staging: unisys: visorhba: remove unused (and broken) logic
The handling of CMD_NOTIFYGUEST_TYPE messages from the IO partition appears
to be only partially implemented, but fortunately it is never used in our
current environment.  This patch deletes the unused code.

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Tim Sell
a7d656063e staging: unisys: visorhba: correct scsi task mgmt completion handling
This patch is necessary to enable ANY task mgmt command to complete
successfully via visorhba.

When issuing a task mgmt command (CMD_SCSITASKMGMT_TYPE) to the IO
partition (back-end), forward_taskmgmt_command() includes pointers
within the command area that will be used to wake up the issuing
process and provide the result when the command completes:

    cmdrsp->scsitaskmgmt.notify_handle = (u64)&notifyevent;
    cmdrsp->scsitaskmgmt.notifyresult_handle = (u64)&notifyresult;

'notify_handle' is a pointer to a 'wait_queue_head_t' variable, and
'notifyresult' is a pointer to an int.  Both of these are just local
stack variables in the issuing process.

The way it's supposed to happen is that when the IO partition completes
the command, in our completion handling we get copies of those pointers
back from the IO partition, where we stash the result of the command at
'*notifyresult' (which should not be 0xffff, because that is the initial
value that the caller is looking to see a change in), and wake up the
wait queue at '*notify_handle'.  There are several places we do that dance,
but prior to this patch, we always do it WRONG, like:

    cmdrsp->scsitaskmgmt.notifyresult_handle = TASK_MGMT_FAILED;
    wake_up_all((wait_queue_head_t *)cmdrsp->scsitaskmgmt.notify_handle);

The wake_up_all() part is correct (albeit with the help of the sloppy
pointer casting, but that's irrelevant to the bug), but the assignment of
'notifyresult_handle' is WRONG, and SHOULD read:

    *(int *)(cmdrsp->scsitaskmgmt.notifyresult_handle) = TASK_MGMT_FAILED;

Without this change, the caller is NEVER going to notice a change in his
local value of 'notifyresult' when he does the:

    if (!wait_event_timeout(notifyevent, notifyresult != 0xffff,
                            msecs_to_jiffies(45000)))

and hence will be timing out EVERY taskmgmt command.

This patch also eliminates the need for sloppy casting of pointers
back-and-forth between u64 values, with the help of idr_alloc() to provide
handles for us.  It is the generated int handles we pass to the IO
partition to denote our completion context, and these are validated and
converted back to the required pointers when the task mgmt commands are
returned back to us by the IO partition.

== Testing ==

You must enable dynamic debugging in visorhba (build kernel with
'CONFIG_DYNAMIC_DEBUG=y', provide kernel parameter 'visorhba.dyndbg=+p')
to see kernel messages involved with visorhba scsi task mgmt commands,
which were added in this patch in the form of a few dev_dbg() / pr_debug()
messages.

In order to inject faults necessary to get visorhba to actully issue scsi
task mgmt commands, you will need to compile a kernel with
CONFIG_FAIL_IO_TIMEOUT and friends, in the "Kernel hacking" section:
* Enable "Fault-injection framework"
  * Enable "Fault-injection capability for disk IO"
  * Enable "Fault-injection capability for faking disk interrupts"
* Enable "Debugfs entries for fault-injection capabilities"

When running a kernel with those options, you can manually inject a fault
that will force a scsi task mgmt command to be issued like this:

    # mount -t debugfs nodev /sys/kernel/debug
    # cd /sys/kernel/debug/fail_io_timeout
    # cat interval
    1
    # cat probability
    0
    # cat times
    1
    # echo 100 >probability
    # cd /sys/block/sda
    # l | grep fail
    -rw-r--r--  1 root root 4096 May  5 10:53 io-timeout-fail
    -rw-r--r--  1 root root 4096 May  5 10:54 make-it-fail
    # echo 1 >io-timeout-fail
    # echo 1 >make-it-fail

To test this patch, after performing the above steps, I did something to
force a block device i/o, then shortly afterwards examined the kernel log.
There I found evidence that visorhba had successfully issued a task mgmt
command, and that it completed successfully:

    [  333.352612] FAULT_INJECTION: forcing a failure.
    name fail_io_timeout, interval 1, probability 100, space 0, times 1
    [  333.352617] CPU: 0 PID: 295 Comm: vhba_incoming Tainted: G         C
                   4.6.0-rc3-ARCH+ #2
    [  333.352619] Hardware name: Dell Inc. PowerEdge T110/ ,
                   BIOS 1.23 12/15/2009
    [  333.352620]  0000000000000000 ffff88001d1a7dd0 ffffffff8125beeb
                    ffffffff818507c0
    [  333.352623]  0000000000000064 ffff88001d1a7df0 ffffffff8128047a
                    ffff8800113462b0
    [  333.352625]  ffff88000e523000 ffff88001d1a7e00 ffffffff81241c79
                    ffff88001d1a7e18
    [  333.352627] Call Trace:
    [  333.352634]  [<ffffffff8125beeb>] dump_stack+0x4d/0x72
    [  333.352637]  [<ffffffff8128047a>] should_fail+0x11a/0x120
    [  333.352641]  [<ffffffff81241c79>] blk_should_fake_timeout+0x29/0x30
    [  333.352643]  [<ffffffff81241c36>] blk_complete_request+0x16/0x30
    [  333.352654]  [<ffffffffa0118b36>] scsi_done+0x26/0x80 [scsi_mod]
    [  333.352657]  [<ffffffffa014a56c>] process_incoming_rsps+0x2bc/0x770
                                         [visorhba]
    [  333.352661]  [<ffffffff81095630>] ? wait_woken+0x80/0x80
    [  333.352663]  [<ffffffffa014a2b0>] ? add_scsipending_entry+0x100/0x100
                                         [visorhba]
    [  333.352666]  [<ffffffff81077759>] kthread+0xc9/0xe0
    [  333.352669]  [<ffffffff814609d2>] ret_from_fork+0x22/0x40
    [  333.352671]  [<ffffffff81077690>] ? kthread_create_on_node+0x180/0x180
    [  364.025672] sd 0:0:1:1: visorhba: initiating type=1 taskmgmt command
    [  364.029721] visorhba: notifying initiator with result=0x1
    [  364.029726] sd 0:0:1:1: visorhba: taskmgmt type=1 success; result=0x1

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Tim Sell
9c4dfdaa25 staging: unisys: visorhba: delete processing of vdiskmgmt commands
We never issue SCSI commands of type CMD_VDISKMGMT_TYPE, so there is no
need to have code that processes their completions.

Signed-off-by: Tim Sell <Timothy.Sell@unisys.com>
Signed-off-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:55:19 -07:00
Wolfram Sang
c5d9a03031 staging: ks7010: cleanup file headers
Remove svn-ids and fix typos in the licence declaration. Add my
copyright to the sdio code which I worked on mainly.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-07 22:42:53 -07:00