Commit Graph

506916 Commits

Author SHA1 Message Date
Stephen Rothwell
4b5edb2f4a mpls: using vzalloc requires including vmalloc.h
Fixes this build error:

net/mpls/af_mpls.c: In function 'resize_platform_label_table':
net/mpls/af_mpls.c:767:4: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
    labels = vzalloc(size);
    ^

Fixes: 7720c01f3f ("mpls: Add a sysctl to control the size of the mpls label table")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 21:01:33 -05:00
Quentin Casasnovas
dd9ef135e3 Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref.
Improper arithmetics when calculting the address of the extended ref could
lead to an out of bounds memory read and kernel panic.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-05 17:28:33 -08:00
Filipe Manana
3a8b36f378 Btrfs: fix data loss in the fast fsync path
When using the fast file fsync code path we can miss the fact that new
writes happened since the last file fsync and therefore return without
waiting for the IO to finish and write the new extents to the fsync log.

Here's an example scenario where the fsync will miss the fact that new
file data exists that wasn't yet durably persisted:

1. fs_info->last_trans_committed == N - 1 and current transaction is
   transaction N (fs_info->generation == N);

2. do a buffered write;

3. fsync our inode, this clears our inode's full sync flag, starts
   an ordered extent and waits for it to complete - when it completes
   at btrfs_finish_ordered_io(), the inode's last_trans is set to the
   value N (via btrfs_update_inode_fallback -> btrfs_update_inode ->
   btrfs_set_inode_last_trans);

4. transaction N is committed, so fs_info->last_trans_committed is now
   set to the value N and fs_info->generation remains with the value N;

5. do another buffered write, when this happens btrfs_file_write_iter
   sets our inode's last_trans to the value N + 1 (that is
   fs_info->generation + 1 == N + 1);

6. transaction N + 1 is started and fs_info->generation now has the
   value N + 1;

7. transaction N + 1 is committed, so fs_info->last_trans_committed
   is set to the value N + 1;

8. fsync our inode - because it doesn't have the full sync flag set,
   we only start the ordered extent, we don't wait for it to complete
   (only in a later phase) therefore its last_trans field has the
   value N + 1 set previously by btrfs_file_write_iter(), and so we
   have:

       inode->last_trans <= fs_info->last_trans_committed
           (N + 1)              (N + 1)

   Which made us not log the last buffered write and exit the fsync
   handler immediately, returning success (0) to user space and resulting
   in data loss after a crash.

This can actually be triggered deterministically and the following excerpt
from a testcase I made for xfstests triggers the issue. It moves a dummy
file across directories and then fsyncs the old parent directory - this
is just to trigger a transaction commit, so moving files around isn't
directly related to the issue but it was chosen because running 'sync' for
example does more than just committing the current transaction, as it
flushes/waits for all file data to be persisted. The issue can also happen
at random periods, since the transaction kthread periodicaly commits the
current transaction (about every 30 seconds by default).
The body of the test is:

  _scratch_mkfs >> $seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our main test file 'foo', the one we check for data loss.
  # By doing an fsync against our file, it makes btrfs clear the 'needs_full_sync'
  # bit from its flags (btrfs inode specific flags).
  $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 8K" \
                  -c "fsync" $SCRATCH_MNT/foo | _filter_xfs_io

  # Now create one other file and 2 directories. We will move this second file
  # from one directory to the other later because it forces btrfs to commit its
  # currently open transaction if we fsync the old parent directory. This is
  # necessary to trigger the data loss bug that affected btrfs.
  mkdir $SCRATCH_MNT/testdir_1
  touch $SCRATCH_MNT/testdir_1/bar
  mkdir $SCRATCH_MNT/testdir_2

  # Make sure everything is durably persisted.
  sync

  # Write more 8Kb of data to our file.
  $XFS_IO_PROG -c "pwrite -S 0xbb 8K 8K" $SCRATCH_MNT/foo | _filter_xfs_io

  # Move our 'bar' file into a new directory.
  mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar

  # Fsync our first directory. Because it had a file moved into some other
  # directory, this made btrfs commit the currently open transaction. This is
  # a condition necessary to trigger the data loss bug.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1

  # Now fsync our main test file. If the fsync succeeds, we expect the 8Kb of
  # data we wrote previously to be persisted and available if a crash happens.
  # This did not happen with btrfs, because of the transaction commit that
  # happened when we fsynced the parent directory.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  # Simulate a crash/power loss.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Now check that all data we wrote before are available.
  echo "File content after log replay:"
  od -t x1 $SCRATCH_MNT/foo

  status=0
  exit

The expected golden output for the test, which is what we get with this
fix applied (or when running against ext3/4 and xfs), is:

  wrote 8192/8192 bytes at offset 0
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 8192/8192 bytes at offset 8192
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File content after log replay:
  0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
  *
  0020000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
  *
  0040000

Without this fix applied, the output shows the test file does not have
the second 8Kb extent that we successfully fsynced:

  wrote 8192/8192 bytes at offset 0
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 8192/8192 bytes at offset 8192
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File content after log replay:
  0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
  *
  0020000

So fix this by skipping the fsync only if we're doing a full sync and
if the inode's last_trans is <= fs_info->last_trans_committed, or if
the inode is already in the log. Also remove setting the inode's
last_trans in btrfs_file_write_iter since it's useless/unreliable.

Also because btrfs_file_write_iter no longer sets inode->last_trans to
fs_info->generation + 1, don't set last_trans to 0 if we bail out and don't
bail out if last_trans is 0, otherwise something as simple as the following
example wouldn't log the second write on the last fsync:

  1. write to file

  2. fsync file

  3. fsync file
       |--> btrfs_inode_in_log() returns true and it set last_trans to 0

  4. write to file
       |--> btrfs_file_write_iter() no longers sets last_trans, so it
            remained with a value of 0
  5. fsync
       |--> inode->last_trans == 0, so it bails out without logging the
            second write

A test case for xfstests will be sent soon.

CC: <stable@vger.kernel.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-05 17:28:32 -08:00
Josef Bacik
f5c0a12280 Btrfs: remove extra run_delayed_refs in update_cowonly_root
This got added with my dirty_bgs patch, it's not needed.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-05 17:28:30 -08:00
Rafael J. Wysocki
e178e7d6df Merge branches 'pm-domains' and 'pm-cpufreq'
* pm-domains:
  PM / Domains: cleanup: rename gpd -> genpd in debugfs interface

* pm-cpufreq:
  cpufreq: ppc: Add missing #include <asm/smp.h>
2015-03-06 01:29:31 +01:00
Rafael J. Wysocki
1e3e770cfb Merge branch 'acpi-video'
* acpi-video:
  ACPI / video: Propagate the error code for acpi_video_register
  ACPI / video: Load the module even if ACPI is disabled
2015-03-06 01:29:16 +01:00
Rafael J. Wysocki
79d223646b Merge branch 'irq-pm'
* irq-pm:
  genirq / PM: describe IRQF_COND_SUSPEND
  tty: serial: atmel: rework interrupt and wakeup handling
  watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND
  clk: at91: implement suspend/resume for the PMC irqchip
  rtc: at91rm9200: rework wakeup and interrupt handling
  rtc: at91sam9: rework wakeup and interrupt handling
  PM / wakeup: export pm_system_wakeup symbol
  genirq / PM: Add flag for shared NO_SUSPEND interrupt lines
  genirq / PM: better describe IRQF_NO_SUSPEND semantics
2015-03-06 01:29:05 +01:00
Mark Rutland
7438b633a6 genirq / PM: describe IRQF_COND_SUSPEND
With certain restrictions it is possible for a wakeup device to share
an IRQ with an IRQF_NO_SUSPEND user, and the warnings introduced by
commit cab303be91 are spurious. The new
IRQF_COND_SUSPEND flag allows drivers to tell the core when these
restrictions are met, allowing spurious warnings to be silenced.

This patch documents how IRQF_COND_SUSPEND is expected to be used,
updating some of the text now made invalid by its addition.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-06 01:28:14 +01:00
Pablo Neira Ayuso
1cae565e8b netfilter: nf_tables: limit maximum table name length to 32 bytes
Set the same as we use for chain names, it should be enough.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:21 +01:00
Pablo Neira Ayuso
f04e599e20 netfilter: nf_tables: consolidate Kconfig options
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:15 +01:00
Patrick McHardy
1a1e1a1219 netfilter: nf_tables: cleanup nf_tables.h
The transaction related definitions are squeezed in between the rule
and expression definitions, which are closely related and should be
next to each other. The transaction definitions actually don't belong
into that file at all since it defines the global objects and API and
transactions are internal to nf_tables_api, but for now simply move
them to a seperate section.

Similar, the chain types are in between a set of registration functions,
they belong to the chain section.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:13 +01:00
Patrick McHardy
354bf5a0d7 netfilter: nf_tables: consolidate tracing invocations
* JUMP and GOTO are equivalent except for JUMP pushing the current
  context to the stack

* RETURN and implicit RETURN (CONTINUE) are equivalent except that
  the logged rule number differs

Result:

  nft_do_chain              | -112
 1 function changed, 112 bytes removed, diff: -112

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:12 +01:00
Patrick McHardy
01ef16c2dd netfilter: nf_tables: minor tracing cleanups
The tracing code is squeezed between multiple related parts of the
evaluation code, move it out. Also add an inline wrapper for the
reoccuring test for skb->nf_trace.

Small code savings in nft_do_chain():

  nft_trace_packet          | -137
  nft_do_chain              |   -8
 2 functions changed, 145 bytes removed, diff: -145

net/netfilter/nf_tables_core.c:
  __nft_trace_packet | +137
 1 function changed, 137 bytes added, diff: +137

net/netfilter/nf_tables_core.o:
 3 functions changed, 137 bytes added, 145 bytes removed, diff: -8

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:11 +01:00
Pablo Neira Ayuso
43270b1bc5 netfilter: ipt_CLUSTERIP: deprecate it in favour of xt_cluster
xt_cluster supersedes ipt_CLUSTERIP since it can be also used in
gateway configurations (not only from the backend side).

ipt_CLUSTER is also known to leak the netdev that it uses on
device removal, which requires a rather large fix to workaround
the problem: http://patchwork.ozlabs.org/patch/358629/

So let's deprecate this so we can probably kill code this in the
future.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:05 +01:00
Boris BREZILLON
2c7af5ba65 tty: serial: atmel: rework interrupt and wakeup handling
The IRQ line connected to the DBGU UART is often shared with a timer device
which request the IRQ with IRQF_NO_SUSPEND.

Since the UART driver is correctly disabling IRQs when entering suspend
we can safely request the IRQ with IRQF_COND_SUSPEND so that irq core
will not complain about mixing IRQF_NO_SUSPEND and !IRQF_NO_SUSPEND.

Rework the interrupt handler to wake the system up when an interrupt
happens on the DEBUG_UART while the system is suspended.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-06 00:46:44 +01:00
Boris BREZILLON
d677772e13 watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND
The watchdog interrupt (only used when activating software watchdog)
shouldn't be suspended when entering suspend mode, because it is shared
with a timer device (which request the line with IRQF_NO_SUSPEND) and once
the watchdog "Mode Register" has been written, it cannot be changed (which
means we cannot disable the watchdog interrupt when entering suspend).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-03-06 00:46:31 +01:00
Rafael J. Wysocki
eef16e4362 Merge branch 'suspend-to-idle'
* suspend-to-idle:
  cpuidle / sleep: Use broadcast timer for states that stop local timer
  cpuidle: Clean up fallback handling in cpuidle_idle_call()
  cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
  idle / sleep: Avoid excessive disabling and enabling interrupts
2015-03-05 23:14:51 +01:00
Rafael J. Wysocki
8204680c7b Merge branch 'acpi-resources'
* acpi-resources:
  x86/PCI/ACPI: Relax ACPI resource descriptor checks to work around BIOS bugs
  x86/PCI/ACPI: Ignore resources consumed by host bridge itself
  PCI: versatile: Update for list_for_each_entry() API change
2015-03-05 23:14:40 +01:00
Rafael J. Wysocki
ef2b22ac54 cpuidle / sleep: Use broadcast timer for states that stop local timer
Commit 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state.  If the cpuidle driver in use does not provide
the new ->enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 3810631332.

Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.

Fixes: 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-03-05 23:13:19 +01:00
Mark Salter
b0ab0afaeb net: eth: xgene: fix booting with devicetree
Commit de7b5b3d79 ("net: eth: xgene: change APM X-Gene SoC platform
ethernet to support ACPI") breaks booting with devicetree with UEFI
firmware. In that case, I get:

Unhandled fault: synchronous external abort (0x96000010) at 0xfffffc0000620010
 Internal error: : 96000010 [#1] SMP
 Modules linked in: vfat fat xfs libcrc32c ahci_xgene libahci_platform libahci
 CPU: 7 PID: 634 Comm: NetworkManager Not tainted 4.0.0-rc1+ #4
 Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0-rh-0.14 Mar  1 2015
 task: fffffe03d4c7e100 ti: fffffe03d4e24000 task.ti: fffffe03d4e24000
 PC is at xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4
 LR is at xgene_gmac_tx_enable+0x2c/0x50
 pc : [<fffffe000069d6fc>] lr : [<fffffe000069dcc4>] pstate: 80000145
 sp : fffffe03d4e27590
 x29: fffffe03d4e27590 x28: 0000000000000000
 x27: fffffe03d4e277c0 x26: fffffe03da8fda10
 x25: fffffe03d4e2760c x24: fffffe03d49e28c0
 x23: fffffc0000620004 x22: 0000000000000000
 x21: fffffc0000620000 x20: fffffc0000620010
 x19: 000000000000000b x18: 000003ffd4a96020
 x17: 000003ff7fc1f7a0 x16: fffffe000079b9cc
 x15: 0000000000000000 x14: 0000000000000000
 x13: 0000000000000000 x12: fffffe03d4e24000
 x11: fffffe03d4e27da0 x10: 0000000000000001
 x9 : 0000000000000000 x8 : fffffe03d4e27a20
 x7 : 0000000000000000 x6 : 00000000ffffffef
 x5 : fffffe000105f7d0 x4 : fffffe00007ca8c8
 x3 : fffffe03d4e2760c x2 : 0000000000000000
 x1 : fffffc0000620000 x0 : 0000000040000000

 Process NetworkManager (pid: 634, stack limit = 0xfffffe03d4e24028)
 Stack: (0xfffffe03d4e27590 to 0xfffffe03d4e28000)
 ...
 Call trace:
 [<fffffe000069d6fc>] xgene_enet_rd_mcx_mac.isra.11+0x58/0xd4
 [<fffffe000069dcc0>] xgene_gmac_tx_enable+0x28/0x50
 [<fffffe00006a112c>] xgene_enet_open+0x2c/0x130
 [<fffffe00007b9254>] __dev_open+0xc8/0x148
 [<fffffe00007b956c>] __dev_change_flags+0x90/0x158
 [<fffffe00007b9664>] dev_change_flags+0x30/0x70
 [<fffffe00007c8ab8>] do_setlink+0x278/0x870
 [<fffffe00007c95bc>] rtnl_newlink+0x404/0x6a8
 [<fffffe00007c8040>] rtnetlink_rcv_msg+0x98/0x218
 [<fffffe00007e78e4>] netlink_rcv_skb+0xe0/0xf8
 [<fffffe00007c7f94>] rtnetlink_rcv+0x30/0x44
 [<fffffe00007e6f2c>] netlink_unicast+0xfc/0x210
 [<fffffe00007e75b8>] netlink_sendmsg+0x498/0x5ac
 [<fffffe00007990b8>] do_sock_sendmsg+0xa4/0xcc
 [<fffffe000079a958>] ___sys_sendmsg+0x1fc/0x208
 [<fffffe000079b984>] __sys_sendmsg+0x4c/0x94
 [<fffffe000079b9f8>] SyS_sendmsg+0x2c/0x3c

The problem here is that the enet hw clocks are not getting
initialized because of a test to avoid the initialization if
UEFI is used to boot. This is an incorrect test. When booting
with UEFI and devicetree, the kernel must still initialize
the enet hw clocks. If booting with ACPI, the clock hw is
not exposed to the kernel and it is that case where we want
to avoid initializing clocks.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Feng Kan <fkan@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 15:40:10 -05:00
Brian King
da29370056 bnx2x: Force fundamental reset for EEH recovery
EEH recovery for bnx2x based adapters is not reliable on all Power
systems using the default hot reset, which can result in an
unrecoverable EEH error. Forcing the use of fundamental reset
during EEH recovery fixes this.

Cc: stable<stable@vger.kernel.org>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 15:38:24 -05:00
David S. Miller
0e9974f727 Merge branch 'cxgb4-next'
Hariprasad Shenai says:

====================
cxgb4: RX Queue related cleanup and fixes

This patch series adds a common function to allocate RX queues and queue
allocation changes to RDMA CIQ

The patches series is created against 'net-next' tree.
And includes patches on cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 15:11:57 -05:00
Hariprasad Shenai
f36e58e566 cxgb4: Try and provide an RDMA CIQ per cpu
To allow for better scalability on systems with large core counts, we
will try and allocate enough RDMA Concentrator IQs and MSI/X vectors as
we have cores. If we cannot get enough MSI/X vectors, fall back to the
minimum required: 1 per adapter rx channel.

Also clean up cxgb_enable_msix() to make it readable and correct a bug
where the vectors are not correctly assigned if the driver doesn't get
the full amount requested.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 15:11:52 -05:00
Hariprasad Shenai
1c6a5b0e34 cxgb4: Move offload Rx queue allocation to separate function
Adds a common function for all Rx queue allocation.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 15:11:52 -05:00
David S. Miller
08c852c43a Merge branch 'xen-netback'
David Vrabel says:

====================
xen-netback: fix ethtool stats and memory leak

A couple of bug fixes for netback:
- make ethool stats to report the correct values.
- don't leak 1 MiB every time a VIF is destroyed.

Changes in v2:
- Split 2nd patch into leak fix and refactor patches
====================

Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:59:32 -05:00
David Vrabel
b0c21badf1 xen-netback: refactor xenvif_handle_frag_list()
When handling a from-guest frag list, xenvif_handle_frag_list()
replaces the frags before calling the destructor to clean up the
original (foreign) frags.  Whilst this is safe (the destructor doesn't
actually use the frags), it looks odd.

Reorder the function to be less confusing.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:58:18 -05:00
David Vrabel
49d9991a18 xen-netback: unref frags when handling a from-guest skb with a frag list
Every time a VIF is destroyed up to 256 pages may be leaked if packets
with more than MAX_SKB_FRAGS frags were transmitted from the guest.
Even worse, if another user of ballooned pages allocated one of these
ballooned pages it would not handle the unexpectedly >1 page count
(e.g., gntdev would deadlock when unmapping a grant because the page
count would never reach 1).

When handling a from-guest skb with a frag list, unref the frags
before releasing them so they are freed correctly when the VIF is
destroyed.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:58:17 -05:00
David Vrabel
d63951d744 xen-netback: return correct ethtool stats
Use correct pointer arithmetic to get the pointer to each stat.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:58:17 -05:00
Jouni Malinen
842a9ae08a bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi
This extends the design in commit 958501163d ("bridge: Add support for
IEEE 802.11 Proxy ARP") with optional set of rules that are needed to
meet the IEEE 802.11 and Hotspot 2.0 requirements for ProxyARP. The
previously added BR_PROXYARP behavior is left as-is and a new
BR_PROXYARP_WIFI alternative is added so that this behavior can be
configured from user space when required.

In addition, this enables proxyarp functionality for unicast ARP
requests for both BR_PROXYARP and BR_PROXYARP_WIFI since it is possible
to use unicast as well as broadcast for these frames.

The key differences in functionality:

BR_PROXYARP:
- uses the flag on the bridge port on which the request frame was
  received to determine whether to reply
- block bridge port flooding completely on ports that enable proxy ARP

BR_PROXYARP_WIFI:
- uses the flag on the bridge port to which the target device of the
  request belongs
- block bridge port flooding selectively based on whether the proxyarp
  functionality replied

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 14:52:23 -05:00
Linus Torvalds
99aedde086 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes: EFI fixes, an Intel Quark fix, an asm fix and an FPU
  handling fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/xsaves: Fix improper uses of __ex_table
  x86/intel/quark: Select COMMON_CLK
  x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization
  firmware: dmi_scan: Fix dmi_len type
  efi/libstub: Fix boundary checking in efi_high_alloc()
  firmware: dmi_scan: Fix dmi scan to handle "End of Table" structure
2015-03-05 11:25:23 -08:00
kbuild test robot
787fb2bd42 ax25: Fix the build when CONFIG_INET is disabled
>
> >> net/ax25/ax25_ip.c:225:26: error: unknown type name 'sturct'
>     netdev_tx_t ax25_ip_xmit(sturct sk_buff *skb)
>                              ^
>
> vim +/sturct +225 net/ax25/ax25_ip.c
>
>    219				    unsigned short type, const void *daddr,
>    220				    const void *saddr, unsigned int len)
>    221	{
>    222		return -AX25_HEADER_LEN;
>    223	}
>    224
>  > 225	netdev_tx_t ax25_ip_xmit(sturct sk_buff *skb)
>    226	{
>    227		kfree_skb(skb);
>    228		return NETDEV_TX_OK;

Ooops I misspelled struct...

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-05 13:17:39 -05:00
Marcelo Tosatti
bfb8fb4775 KVM: s390: Fixups for changes in merge window for 4.0
Here are some fixups/improvements for
 
 commit 658b6eda20 ("KVM: s390: add cpu model support")
 commit 9d8d578605 ("KVM: s390: use facilities and cpu_id per KVM")
 commit a374e892c3 ("KVM: s390/cpacf: Enable/disable protected key
 functions for kvm guest")
 commit 45c9b47c58 ("KVM: s390/CPACF: Choose crypto control block format")
 
 which all have been merged during the merge window for 4.0.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJU9ucxAAoJEBF7vIC1phx8OOoP/iJhd73h4QsjjU/ZaXSNW+65
 pwgOhD3eSXdUkGNsKUMp9nsyDo89puSZdUo9+5hvtbal5P89BpLx2VQxryZY0SAU
 21D/AjaaxJ6YcL5mfT6lyp1W5jznOobZLl8wSHoIAWyKrmqv0hLG1M/Ip5TxmdE0
 7rSeABHJyQNcTnOLL9Iq/uut8Qaf8ApD2rxzNfZILjlyQveS1I+l2qRAyi4MYTU0
 E3wZAOirzzNZfhbe049hvyNYd086GfOucwf1wsezvhbVJeqi2MNRf61yCGpedRdw
 2V5ce6LefMSx+sdqBOoKCYziO/vnTMdQjBfsm9315fTMHlGVBY0PgPWLQFVe77pI
 2enB20/T4hTJ203Nwaa0jBkQ2IyunSrE1rP6M+E7g90ZgiXrR9TsURAfl+u5EzE4
 QZe7BcQRJ+Wrj1lrM7W67Mr+hzIlRvWeEJjKsHa/q8BPD4mbQRacPN2OKW2zcbrx
 Ye4Vr0Fanom5nyZqPDbMl5dWlN0xuboFksSMkdaR7KDp5fb4WMITOl0WeOZPRAn9
 16TTO6Mr19I3uCCuJFhZaI1HPyUQPJja2JZqB/HUcribTcp+GGPt8od0Yxvd07kP
 B4T5OlpiYbOIimIIkukGvbdNIBOcfvyiInq3offrE1S/q4PWo3uWf2HyeQE+/G8V
 d9gCVWo+st4Sj5UVRCm0
 =ZaGi
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-20150303' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: Fixups for changes in merge window for 4.0

Here are some fixups/improvements for

commit 658b6eda20 ("KVM: s390: add cpu model support")
commit 9d8d578605 ("KVM: s390: use facilities and cpu_id per KVM")
commit a374e892c3 ("KVM: s390/cpacf: Enable/disable protected key
functions for kvm guest")
commit 45c9b47c58 ("KVM: s390/CPACF: Choose crypto control block format")

which all have been merged during the merge window for 4.0.
2015-03-05 14:42:48 -03:00
Quentin Casasnovas
06c8173eb9 x86/fpu/xsaves: Fix improper uses of __ex_table
Commit:

  f31a9f7c71 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")

introduced alternative instructions for XSAVES/XRSTORS and commit:

  adb9d526e9 ("x86/xsaves: Add xsaves and xrstors support for booting time")

added support for the XSAVES/XRSTORS instructions at boot time.

Unfortunately both failed to properly protect them against faulting:

The 'xstate_fault' macro will use the closest label named '1'
backward and that ends up in the .altinstr_replacement section
rather than in .text. This means that the kernel will never find
in the __ex_table the .text address where this instruction might
fault, leading to serious problems if userspace manages to
trigger the fault.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
[ Improved the changelog, fixed some whitespace noise. ]
Acked-by: Borislav Petkov <bp@alien8.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Allan Xavier <mr.a.xavier@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: adb9d526e9 ("x86/xsaves: Add xsaves and xrstors support for booting time")
Fixes: f31a9f7c71 ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-05 18:20:36 +01:00
Robert Jarzmik
ecb9b4241f dmaengine: mmp_pdma: fix warning about slave caps
Fix the dmaengine complaint about missing slave caps :
 - declare the available bus widths
 - declare the available transfer types
 - declare the residue calculation type

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-03-05 22:15:35 +05:30
Andy Shevchenko
9ab6eb51ef x86/intel/quark: Select COMMON_CLK
The commit 8bbc2a135b ("x86/intel/quark: Add Intel Quark
platform support") introduced a minimal support of Intel Quark
SoC. That allows to use core parts of the SoC. However, the SPI,
I2C, and GPIO drivers can't be selected by kernel configuration
because they depend on COMMON_CLK. The patch adds a COMMON_CLK
selection to the platfrom definition to allow user choose the drivers.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Ong, Boon Leong <boon.leong.ong@intel.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Darren Hart <dvhart@linux.intel.com>
Fixes: 8bbc2a135b ("x86/intel/quark: Add Intel Quark platform support")
Link: http://lkml.kernel.org/r/1425569044-2867-1-git-send-email-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-05 17:44:53 +01:00
Stanimir Varbanov
90b1047f13 dmaengine: qcom_bam_dma: fix wrong register offsets
The commit fb93f520e (dmaengine: qcom_bam_dma: Generalize BAM
register offset calculations) wrongly populated base offsets
for event registers for bam v1.4.

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Reviewed-by: Andy Gross <agross@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-03-05 22:07:12 +05:30
Sravanthi Tangeda
d3866a071c i40e/i40evf: Version bump
Bump i40e to 1.2.11 and i40evf to 1.2.5

Change-ID: Ie13375941606b0a027e5b5dbc235f5f5f03b75c8
Signed-off-by: Sravanthi Tangeda <sravanthi.tangeda@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 08:17:28 -08:00
Stanimir Varbanov
fe4be5e9f9 dmaengine: bam-dma: fix a warning about missing capabilities
Avoid the warning below triggered during dmaengine async device
registration.

WARNING: CPU: 1 PID: 1 at linux/drivers/dma/dmaengine.c:863
dma_async_device_register+0x2a8/0x4b8()
this driver doesn't support generic slave capabilities reporting

To do that fill mandatory .directions bit mask,
.src/dst_addr_widths and .residue_granularity dma_device fields
with appropriate values.

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-03-05 21:44:38 +05:30
Mitch A Williams
9466699001 i40e: don't spam the system log
The PF driver spams the system log with messages about VF VSI when VFs
are created, as well as each time they are reset. This is annoying, and
the information isn't even useful most of the time.

Remove this message to reduce user annoyance.

Change-ID: I8de90d05380f54b038c9c8c3265150be87c9242c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 07:54:10 -08:00
Shannon Nelson
3b44439934 i40e: move IRQ tracking setup into MSIX setup
Move the IRQ tracking setup and teardown into the same routines that
do the IRQ setup and teardown.  This keeps like activities together and
allows us to track exactly the number of vectors reserved from the OS,
which may be fewer than are available from the HW.

Change-ID: I6b2b1a955c5f0ac6b94c3084304ed0b2ea6777cf
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 07:36:08 -08:00
Anjali Singhai
232f47060a i40e: Ioremap changes
For future device support we do not want to map the whole CSR space since some
of it is mapped by other drivers with different mapping methods.

Note: As a side effect, the flash region (if exposed through the memory map)
gets unmapped too since it follows the future use region.

Change-ID: Ic729a2eacd692984220b1a415ff4fa0f98ea419a
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 07:15:28 -08:00
Greg Kroah-Hartman
d3d5389475 USB-serial fixes for v4.0-rc3
Here are a few fixes for reported problems including a usb-debug device
 buffer overflow, potential use-after-free on failed probe, and a couple
 of issues with the USB console.
 
 Some new device IDs are also added.
 
 Signed-off-by: Johan Hovold <johan@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJU+Fy4AAoJEEEN5E/e4bSVxSIQAM/ekC0n6/v2BqF+RJ+1YmGs
 pELjhVxgiUJ+TqgncxhFkbSVVKhBJ7JN8qOj2wV9r55WX8OgoGVoqFopz+PHeHWC
 2JvPYkGi0UhCxCLto/XFnS8KYT+ODCqDF3E9CJKZEl6X3TGmQJ0Lyka953vYRWO3
 0g3DwHGVdAHCH1ZZFJB4R/45i6ACLWd5A7zlyfvjrDYWfTgfONIFbaOjNATJFjre
 3FyUXM2TP/AZNAcPYt8IUrbrUIriKptmQJk0F2gZEyLvYwlmJvtutpMLUL+Xs0Hx
 rbQVsQfKt7Plo2nkUkJ+1B/Q15+0aTyA5dqdXeYG77R6O5Y9v1c6fpc5Jr+fIPI/
 jMP9LBeAUqQpKsLvtN4LG93a19r+0lJfdZtNTComLmSvQXYIiqRPNBlLmrBUN+Tq
 KG7TYHCoDsxw6jgMRPNLGn3lMEHxWej1lhI3UYSAXks1qixHLm+5+hVcqbKqMeaQ
 2gX/xEHIFgaHNvfl9Hr1ke2vMxHB/tSt7DcrQVgBsBQgEP6PgvG+fQajoIZ7YQJC
 K+ElAwUiGeYq2Kqp8ljqVd+q1LyQo0w9B3eJSG5URnmLkxzQ4PPVf2tnVxy+xq3q
 N10cp/6vKLFATF4+Gu2dMGuSbK5LsNvOAGpjr6VDAqb2wAFmaT+MaQKQlgo/6TQH
 L/Cj0mx7uyGwhtPBZES6
 =0CrX
 -----END PGP SIGNATURE-----

Merge tag 'usb-serial-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.0-rc3

Here are a few fixes for reported problems including a usb-debug device
buffer overflow, potential use-after-free on failed probe, and a couple
of issues with the USB console.

Some new device IDs are also added.

Signed-off-by: Johan Hovold <johan@kernel.org>
2015-03-05 07:15:17 -08:00
Jesse Brandeburg
5bbc330100 i40e/i40evf: Clean up some formatting and other things
Fix some double blank lines and un-split a function declaration that all
fits on one line. Also make i40e_get_priv_flags static.

Change-ID: I11b5d25d1153a06b286d0d2f5d916d7727c58e4a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 06:39:42 -08:00
Catherine Sullivan
180204c79f i40e: Add AOC PHY types to case statements
Add the 10G and 40G AOC PHY types to the case statement in get_media_type
and ethtool get_settings so that the correct information gets reported
back to the user.

Change-ID: I1b4849d22199a9acf7c8807166d0317c1faad375
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 06:02:11 -08:00
Greg Rose
5b86c5cf75 i40e: Fix ethtool offline test
If the system administrator is requesting an offline diagnostic test using
'ethtool -t' then we should, you know, actually take the device offline
before doing the testing.

Change-ID: I6afa1cbfcc821c9ab6e6f47ed4d8dc2d8dd20e82
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 05:43:52 -08:00
Catherine Sullivan
088c4ee370 i40e: Reassign incorrect PHY type to fix a FW bug
Some FW versions are incorrectly reporting a breakout cable as PHY type
0x3 when it should be 0x16 (I40E_PHY_TYPE_10GBASE_SFPP_CU).
If we get this value back from FW and the version is < 4.40, reassign it
to I40E_PHY_TYPE_10GBASE_SFPP_CU.

Change-ID: Ibb41a0e3cd2c0753744e8553959240df6ed13ae8
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 05:25:18 -08:00
Tejun Heo
8603e1b300 workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.

try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
this case, __cancel_work_timer() currently invokes flush_work().  The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping

Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority.  Let's say task A just got woken
up from flush_work() by the completion of the target work item.  If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing.  This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.

task A			task B			worker

						executing work
__cancel_work_timer()
  try_to_grab_pending()
  set work CANCELING
  flush_work()
    block for work completion
						completion, wakes up A
			__cancel_work_timer()
			while (forever) {
			  try_to_grab_pending()
			    -ENOENT as work is being canceled
			  flush_work()
			    false as work is no longer executing
			}

This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.

Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com

v3: bit_waitqueue() can't be used for work items defined in vmalloc
    area.  Switched to custom wake function which matches the target
    work item and exclusive wait and wakeup.

v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
    the target bit waitqueue has wait_bit_queue's on it.  Use
    DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
    Vizoso.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rabin Vincent <rabin.vincent@axis.com>
Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
Cc: stable@vger.kernel.org
Tested-by: Jesper Nilsson <jesper.nilsson@axis.com>
Tested-by: Rabin Vincent <rabin.vincent@axis.com>
2015-03-05 08:04:13 -05:00
Jesse Brandeburg
9a660eeae2 i40e: fix XPS mask when resetting
During resets (possibly caused by a Tx hang) the driver would
accidentally clear the XPS mask for all queues back to 0.

This caused higher CPU utilization and had some other performance impacts
for transmit tests.

Change-ID: I95f112432c9e643a153eaa31cd28cdcbfdd01831
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-05 04:57:33 -08:00
Rafał Miłecki
1ca2760fb2 bcma: prepare Kconfig symbol for PCI driver
Driver for PCIe core requires PCI to be enabled, however we shouldn't
require it for the whole bus. Someone may be not interested in extra
PCI devices and what's more there are SoCs without any PCI at all (like
BCM5356C0, BCM5357*, BCM47186B0). For more details see Kconfig "help".
Please note this patch doesn't allow disabling PCI drivers yet, as it
requires more work on calls to bcma_core_pci_* functions.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-03-05 14:11:45 +02:00
Rafał Miłecki
0a4e699a41 bcma: move internal function declarations to private header
These functions are not exported nor used anywhere, so there is no
reason to put them in public headers.
Also drop unused bcma_chipco_(suspend|resume).

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-03-05 14:11:43 +02:00