This patch fixes sparse warnings in vlan driver.
It propagates the sparse __percpu attribute from alloc_percpu
into netdev_alloc_pcpu_stats. I expect it may trigger additional
sparse warnings from other drivers that are missing the __percpu
attribute.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The configuration for CAN FD depends on CAN_CTRLMODE_FD enabled in the driver
specific ctrlmode_supported capabilities.
The configuration can be done either with the 'fd { on | off }' option in the
'ip' tool from iproute2 or by setting the CAN netdevice MTU to CAN_MTU (16) or
to CANFD_MTU (72).
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
As CAN FD offers a second bitrate for the data section of the CAN frame the
infrastructure for storing and configuring this second bitrate is introduced.
Improved the readability of the if-statement by inserting some newlines.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Conflicts:
drivers/net/wireless/ath/ath9k/recv.c
drivers/net/wireless/mwifiex/pcie.c
net/ipv6/sit.c
The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.
The two wireless conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Labels for the Multiprotocol Label Switching are defined in RFC 3032
which was superseded by RFC 5462. Add the definition to UAPI and a stub
header for include/linux.
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
error. From Eytan Lifshitz.
2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
from Patrick McHardy.
3) Must check if payload lenth is a power of 2 in
nft_payload_select_ops(), from Nikolay Aleksandrov.
4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
correct place when we invoke skb_checksum_setup(). From Wei Liu.
5) TUN driver should not advertise HW vlan offload features in
vlan_features. Fix from Fernando Luis Vazquez Cao.
6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix
from Steffen Klassert.
7) Add missing locking in xfrm_migrade_state_find(), we must hold the
per-namespace xfrm_state_lock while traversing the lists. Fix from
Steffen Klassert.
8) Missing locking in ath9k driver, access to tid->sched must be done
under ath_txq_lock(). Fix from Stanislaw Gruszka.
9) Fix two bugs in TCP fastopen. First respect the size argument given
to tcp_sendmsg() in the fastopen path, and secondly prevent
tcp_send_syn_data() from potentially using order-5 allocations.
From Eric Dumazet.
10) Fix handling of default neigh garbage collection params, from Jiri
Pirko.
11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
is in use. From Eric Dumazet.
12) Missing initialization of Realtek r8169 driver's statistics
seqlocks. Fix from Kyle McMartin.
13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
driver, from Ding Tianhong.
14) Bonding slave release race can cause divide by zero, fix from
Nikolay Aleksandrov.
15) Overzealous return from neigh_periodic_work() causes reachability
time to not be computed. Fix from Duain Jiong.
16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
a specific target is specified and found. From Hans Schillstrom.
17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.
18) Tail loss probe can calculate bogus RTTs due to missing packet
marking on retransmit. Fix from Yuchung Cheng.
19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
multicast loopback detection in later code paths need access to
skb_rtable(). Fix from Xin Long.
20) The macvlan driver regresses in that it propagates lower device
offload support disables into itself, causing severe slowdowns when
running over a bridge. Provide the software offloads always on
macvlan devices to deal with this and the regression is gone. From
Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
macvlan: Add support for 'always_on' offload features
net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
net: cpsw: fix cpdma rx descriptor leak on down interface
be2net: isolate TX workarounds not applicable to Skyhawk-R
be2net: Fix skb double free in be_xmit_wrokarounds() failure path
be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
be2net: Fix to reset transparent vlan tagging
qlcnic: dcb: a couple off by one bugs
tcp: fix bogus RTT on special retransmission
hsr: off by one sanity check in hsr_register_frame_in()
can: remove CAN FD compatibility for CAN 2.0 sockets
can: flexcan: factor out soft reset into seperate funtion
can: flexcan: flexcan_remove(): add missing netif_napi_del()
can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
can: flexcan: factor out transceiver {en,dis}able into seperate functions
can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
can: flexcan: fix shutdown: first disable chip, then all interrupts
USB AX88179/178A: Support D-Link DUB-1312
...
When doing some numa tests on powerpc, I triggered an oops bug. I find
it is caused by using page->_last_cpupid. It should be initialized as
"-1 & LAST_CPUPID_MASK", but not "-1". Otherwise, in task_numa_fault(),
we will miss the checking (last_cpupid == (-1 & LAST_CPUPID_MASK)). And
finally cause an oops bug in task_numa_group(), since the online cpu is
less than possible cpu. This happen with CONFIG_SPARSE_VMEMMAP disabled
Call trace:
SMP NR_CPUS=64 NUMA PowerNV
Modules linked in:
CPU: 24 PID: 804 Comm: systemd-udevd Not tainted3.13.0-rc1+ #32
task: c000001e2746aa80 ti: c000001e32c50000 task.ti:c000001e32c50000
REGS: c000001e32c53510 TRAP: 0300 Not tainted(3.13.0-rc1+)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR:28024424 XER: 20000000
CFAR: c000000000009324 DAR: 7265717569726857 DSISR:40000000 SOFTE: 1
NIP .task_numa_fault+0x1470/0x2370
LR .task_numa_fault+0x1468/0x2370
Call Trace:
.task_numa_fault+0x1468/0x2370 (unreliable)
.do_numa_page+0x480/0x4a0
.handle_mm_fault+0x4ec/0xc90
.do_page_fault+0x3a8/0x890
handle_page_fault+0x10/0x30
Instruction dump:
3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb3863af78 3c82fefb
3884b138 48d9cfd5 60000000 e93f0100 <812902e4> 7d2907b45529063e 7d2a07b4
---[ end trace 15f2510da5ae07cf ]---
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Borkmann reported a VM_BUG_ON assertion failing:
------------[ cut here ]------------
kernel BUG at mm/mlock.c:528!
invalid opcode: 0000 [#1] SMP
Modules linked in: ccm arc4 iwldvm [...]
video
CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
RIP: 0010:[<ffffffff81171ad0>] [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0
Call Trace:
do_munmap+0x18f/0x3b0
vm_munmap+0x41/0x60
SyS_munmap+0x22/0x30
system_call_fastpath+0x1a/0x1f
RIP munlock_vma_pages_range+0x2e0/0x2f0
---[ end trace a0088dcf07ae10f2 ]---
because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page. This can be reproduced with default config since 3.11
kernels. A reproducer can be found in the kernel's selftest directory
for networking by running ./psock_tpacket.
The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()) and mistaken for a THP page and assumed to be order=9.
The checks for THP in munlock came with commit ff6a6da60b ("mm:
accelerate munlock() treatment of THP pages"), i.e. since 3.9, but did
not trigger a bug. It just makes munlock_vma_pages_range() skip such
compound pages until the next 512-pages-aligned page, when it encounters
a head page. This is however not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.
Since commit 7225522bb4 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in
PageTransHuge() check.
This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable. The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway. Related Lkml discussion can be
found in [2].
[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reported-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: John David Anglin <dave.anglin@bell.net>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Jared Hulbert <jaredeh@gmail.com>
Tested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [3.11.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit bf6bddf192 ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).
This results in a very rare NULL pointer dereference on the
aforementioned page_count(page). Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.
This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation. This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling. The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.
This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.
Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When BlueFlame is turned on, control segment of the TX WQE is changed,
and the second line of it is used for QPN.
Changed code to use a union in the mlx4_wqe_ctrl_seg instead of casting.
This makes the code clearer and solves the static checker warning:
drivers/net/ethernet/mellanox/mlx4/en_tx.c:839 mlx4_en_xmit()
warn: potential memory corrupting cast 4 vs 2 bytes
CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the EN driver uses a private static function
mlx4_en_mac_to_u64(). Move it to a common include file (driver.h)
for mlx4_en and mlx4_ib for further use.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is a single sysfs fix for 3.14-rc5. It fixes a reported problem
with the namespace code in sysfs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iEYEABECAAYFAlMTq7sACgkQMUfUDdst+yml/wCgkUWPlSGv3UA5AJ1yDBnFqgxB
RcAAn1CM1x6k3ULHG6Hz7SGkFg9dqpjz
=aK2B
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull sysfs fix from Greg KH:
"Here is a single sysfs fix for 3.14-rc5. It fixes a reported problem
with the namespace code in sysfs"
* tag 'driver-core-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: fix namespace refcnt leak
These are private to userspace, and they're unstable
anyway and can be shuffled at will (see 080e4130b1)
so any userspace application relying on them is on crack.
Test compiled with allyesconfig.
mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ make allyesconfig
mcgrof@drvbp1 /pub/mem/mcgrof/net-next (git::master)$ time make -j 20
...
BUILD arch/x86/boot/bzImage
Setup is 16992 bytes (padded to 17408 bytes).
System is 56153 kB
CRC 721d2751
Kernel: arch/x86/boot/bzImage is ready (#1)
real 19m35.744s
user 280m37.984s
sys 27m54.104s
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull filesystem fixes from Jan Kara:
"Notification, writeback, udf, quota fixes
The notification patches are (with one exception) a fallout of my
fsnotify rework which went into -rc1 (I've extented LTP to cover these
cornercases to avoid similar breakage in future).
The UDF patch is a nasty data corruption Al has recently reported,
the revert of the writeback patch is due to possibility of violating
sync(2) guarantees, and a quota bug can lead to corruption of quota
files in ocfs2"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: Allocate overflow events with proper type
fanotify: Handle overflow in case of permission events
fsnotify: Fix detection whether overflow event is queued
Revert "writeback: do not sync data dirtied after sync start"
quota: Fix race between dqput() and dquot_scan_active()
udf: Fix data corruption on file type conversion
inotify: Fix reporting of cookies for inotify events
Upcoming congestion controls for TCP require usec resolution for RTT
estimations. Millisecond resolution is simply not enough these days.
FQ/pacing in DC environments also require this change for finer control
and removal of bimodal behavior due to the current hack in
tcp_update_pacing_rate() for 'small rtt'
TCP_CONG_RTT_STAMP is no longer needed.
As Julian Anastasov pointed out, we need to keep user compatibility :
tcp_metrics used to export RTT and RTTVAR in msec resolution,
so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
to use the new attributes if provided by the kernel.
In this example ss command displays a srtt of 32 usecs (10Gbit link)
lpk51:~# ./ss -i dst lpk52
Netid State Recv-Q Send-Q Local Address:Port Peer
Address:Port
tcp ESTAB 0 1 10.246.11.51:42959
10.246.11.52:64614
cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448
cwnd:10 send
3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559
Updated iproute2 ip command displays :
lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source
10.246.11.51
Old binary displays :
lpk51:~# ip tcp_metrics | grep 10.246.11.52
10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source
10.246.11.51
With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Larry Brakmo <brakmo@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
ktime_get() is too expensive on some cases, and we'd like to get
usec resolution timestamps in TCP stack.
This patch adds a light weight facility using a combination of
local_clock() and jiffies samples.
Instead of :
u64 t0, t1;
t0 = ktime_get();
// stuff
t1 = ktime_get();
delta_us = ktime_us_delta(t1, t0);
use :
struct skb_mstamp t0, t1;
skb_mstamp_get(&t0);
// stuff
skb_mstamp_get(&t1);
delta_us = skb_mstamp_us_delta(&t1, &t0);
Note : local_clock() might have a (bounded) drift between cpus.
Do not use this infra in place of ktime_get() without understanding the
issues.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Larry Brakmo <brakmo@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a sysfs file to enable user space to query the device
port number used by a netdevice instance. This is needed for
devices that have multiple ports on the same PCI function.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 93e6f119c0 ("ipc/mqueue: cleanup definition names and
locations") added global hardcoded limits to the amount of message
queues that can be created. While these limits are per-namespace,
reality is that it ends up breaking userspace applications.
Historically users have, at least in theory, been able to create up to
INT_MAX queues, and limiting it to just 1024 is way too low and dramatic
for some workloads and use cases. For instance, Madars reports:
"This update imposes bad limits on our multi-process application. As
our app uses approaches that each process opens its own set of queues
(usually something about 3-5 queues per process). In some scenarios
we might run up to 3000 processes or more (which of-course for linux
is not a problem). Thus we might need up to 9000 queues or more. All
processes run under one user."
Other affected users can be found in launchpad bug #1155695:
https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695
Instead of increasing this limit, revert it entirely and fallback to the
original way of dealing queue limits -- where once a user's resource
limit is reached, and all memory is used, new queues cannot be created.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reported-by: Madars Vitolins <m@silodev.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org> [3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As mount() and kill_sb() is not a one-to-one match, we shoudn't get
ns refcnt unconditionally in sysfs_mount(), and instead we should
get the refcnt only when kernfs_mount() allocated a new superblock.
v2:
- Changed the name of the new argument, suggested by Tejun.
- Made the argument optional, suggested by Tejun.
v3:
- Make the new argument as second-to-last arg, suggested by Tejun.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
---
fs/kernfs/mount.c | 8 +++++++-
fs/sysfs/mount.c | 5 +++--
include/linux/kernfs.h | 9 +++++----
3 files changed, 15 insertions(+), 7 deletions(-)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7053aee26a "fsnotify: do not share events between notification
groups" used overflow event statically allocated in a group with the
size of the generic notification event. This causes problems because
some code looks at type specific parts of event structure and gets
confused by a random data it sees there and causes crashes.
Fix the problem by allocating overflow event with type corresponding to
the group type so code cannot get confused.
Signed-off-by: Jan Kara <jack@suse.cz>
Steffen Klassert says:
====================
1) Introduce skb_to_sgvec_nomark function to add further data to the sg list
without calling sg_unmark_end first. Needed to add extended sequence
number informations. From Fan Du.
2) Add IPsec extended sequence numbers support to the Authentication Header
protocol for ipv4 and ipv6. From Fan Du.
3) Make the IPsec flowcache namespace aware, from Fan Du.
4) Avoid creating temporary SA for every packet when no key manager is
registered. From Horia Geanta.
5) Support filtering of SA dumps to show only the SAs that match a
given filter. From Nicolas Dichtel.
6) Remove caching of xfrm_policy_sk_bundles. The cached socket policy bundles
are never used, instead we create a new cache entry whenever xfrm_lookup()
is called on a socket policy. Most protocols cache the used routes to the
socket, so this caching is not needed.
7) Fix a forgotten SADB_X_EXT_FILTER length check in pfkey, from Nicolas
Dichtel.
8) Cleanup error handling of xfrm_state_clone.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit c4a391b53a. Dave
Chinner <david@fromorbit.com> has reported the commit may cause some
inodes to be left out from sync(2). This is because we can call
redirty_tail() for some inode (which sets i_dirtied_when to current time)
after sync(2) has started or similarly requeue_inode() can set
i_dirtied_when to current time if writeback had to skip some pages. The
real problem is in the functions clobbering i_dirtied_when but fixing
that isn't trivial so revert is a safer choice for now.
CC: stable@vger.kernel.org # >= 3.13
Signed-off-by: Jan Kara <jack@suse.cz>
Because of a recent syscall design debate; its deemed appropriate for
each syscall to have a flags argument for future extension; without
immediately requiring new syscalls.
Cc: juri.lelli@gmail.com
Cc: Ingo Molnar <mingo@redhat.com>
Suggested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140214161929.GL27965@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
MSI
- Fix AHCI single-MSI fallback (Alexander Gordeev)
- Fix populate_msi_sysfs() error paths (Greg Kroah-Hartman)
- Fix htmldocs problem (Masanari Iida)
- Add pci_enable_msi_exact() and pci_enable_msix_exact() (Alexander Gordeev)
- Update documentation (Alexander Gordeev)
Miscellaneous
- mvebu: expose device ID & revision via lspci (Andrew Lunn)
- Enable INTx if the BIOS left them disabled (Bjorn Helgaas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTBkj5AAoJEFmIoMA60/r8aiAQAIWQnZ7UhXBqMAXDR8nJuTbk
b2l4EpNtrGPKy27ZogwDV7ACE7BcBc8vQWhsuMbaxyYTUh4Amr19CysjyBqmoLe9
4eMuGlItkXCbtEw8wquiSz8rtUHH90yTwXk3XMQ0SkscMuAp+QSUb48a3uBSPMX/
gf29IeV8CJjqfLnvtCYkp9jgVuph9vpw+g+DTaLPGA23KS8QJKvmJ95R15fhfcGZ
B4fbJG8TT8LLLD4LDeZOSqbz2n4rE8Xaif1locLAkQtPhiSe65vZYP5IFwlH/t4T
Rzqtkuy2pbybfMk2JVDXzXQgIbCH0h3fEYRZM7ydhU3dndb1E8oUAYf1CbG1GoLv
36feVn7YWs3VQhs+IpoqJivtgmQKOmFgtGByPOgP47SWXssmyBz2DZCap6WPVGGb
KCJNshSGtpNA3ge34jj8Y5wKN2Y+jGoBvLPObJd80Rwwmx00Nn33jn4dYX9JkPlB
kq4I9+y8CmMuADB+St3kHklAw0qFeK7pj2iMRnpfdEbau4el16ch8S9rEBltOj/2
wMejSViUH1RsdpJMMHads3pR+oAjFxxc8x1fnp4roIr2SkvZhCmcZwM6GJJhMJpi
RM/B4RnK4dMuE6vGX5jsDQFy7xKoE6Wfop/cXK6HbifX+kiZo90PD8vbNthFJ/Wy
2B0AN2cvL5dCKvoX2gqJ
=CTv7
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"The most interesting thing here is the change to enable INTx (by
clearing PCI_COMMAND_INTX_DISABLE) if the BIOS left INTx disabled.
Apparently the Baytrail BIOS does this, which means EHCI doesn't work.
Also, fix an AHCI MSI regression and other issues with the recent MSI
changes. This also adds pci_enable_msi_exact() and
pci_enable_msix_exact(), which aren't regression fixes, but will keep
us from touching drivers twice (once to stop using the deprecated
pci_enable_msi(), etc., and again to use the *_exact() variants).
There's also a minor MVEBU fix.
Summary:
MSI:
- Fix AHCI single-MSI fallback (Alexander Gordeev)
- Fix populate_msi_sysfs() error paths (Greg Kroah-Hartman)
- Fix htmldocs problem (Masanari Iida)
- Add pci_enable_msi_exact() and pci_enable_msix_exact() (Alexander Gordeev)
- Update documentation (Alexander Gordeev)
Miscellaneous:
- mvebu: expose device ID & revision via lspci (Andrew Lunn)
- Enable INTx if the BIOS left them disabled (Bjorn Helgaas)"
* tag 'pci-v3.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ahci: Fix broken fallback to single MSI mode
PCI: Enable INTx if BIOS left them disabled
PCI/MSI: Add pci_enable_msi_exact() and pci_enable_msix_exact()
PCI/MSI: Fix cut-and-paste errors in documentation
PCI/MSI: Add pci_enable_msi() documentation back
PCI/MSI: Fix pci_msix_vec_count() htmldocs failure
PCI/MSI: Fix leak of msi_attrs
PCI/MSI: Check kmalloc() return value, fix leak of name
PCI: mvebu: Use Device ID and revision from underlying endpoint
Pull cgroup fixes from Tejun Heo:
"Quite a few fixes this time.
Three locking fixes, all marked for -stable. A couple error path
fixes and some misc fixes. Hugh found a bug in memcg offlining
sequence and we thought we could fix that from cgroup core side but
that turned out to be insufficient and got reverted. A different fix
has been applied to -mm"
* 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: update cgroup_enable_task_cg_lists() to grab siglock
Revert "cgroup: use an ordered workqueue for cgroup destruction"
cgroup: protect modifications to cgroup_idr with cgroup_mutex
cgroup: fix locking in cgroup_cfts_commit()
cgroup: fix error return from cgroup_create()
cgroup: fix error return value in cgroup_mount()
cgroup: use an ordered workqueue for cgroup destruction
nfs: include xattr.h from fs/nfs/nfs3proc.c
cpuset: update MAINTAINERS entry
arm, pm, vmpressure: add missing slab.h includes
Pull workqueue fixes from Tejun Heo:
"Two workqueue fixes. One for an unlikely but possible critical bug
during kworker shutdown and the other to make lockdep names a bit more
descriptive"
* 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: ensure @task is valid across kthread_stop()
workqueue: add args to workqueue lockdep name
Couple of small issues solved:
- Suspend/Resume call-backs require CONFIG_PM_SLEEP
- Some drivers written for 32bit architectures fail when compiled
with a 64bit compiler. The fixes will future proof the drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTBLjhAAoJEFGvii+H/HdhRn4P/2ovuuE0pGm4OkwIi+5yOM0S
ABrW3UhHXfEW9S8rj6ZpDsgOjio+ZfojIbeeK59j01KkqC3l1mzz66wIfRUpjisk
CPtERrSj/BzGgXiJpMgBEdugYIegSlygZOCTSj1fESDG4T3XMShaHvlZPj6Oe87x
YF83wyqxd59+SFTnmwRnNo2RqXKSgkO+Gl0Rx6/CHQdHk3IXIforC0pLPPx5pNmZ
FuIN8+hWBree6ih8nCPLmQI05KwU74U+NKWO+CKBkdAt+SJ8+3cr16+zfoFu351m
4ZedKAVZ7O3KUpdhIzDAQZzenz5VaVuo0KvQc4ZEgPlAWP3RmOsaUBaoCKI8LWlm
Th2kkBTRleZuxA6psb3craXIasvkVLfLcVVYVAfPU/i+VqVDp9c24BeWDPVoJOLa
//09ND+TokHjBWB+NnITO6dH240k7j6QY09pOpWJnJdWGwZSIo7/rZgDDKFIFes+
EcuN4nrurG379xsffMKNXgtYgLj7Okvn3lPZp1E+c7KF8MMd2o13mAQdGQFl+7V1
bbIKBY5nuL95pCS2DQktuu4WiQaQ5ONWFFdJ3iTpB5lv27eU6FyAPK2GmVU2xwT+
+tAjVwldww1JInNZrDsdLPn+PSKUGKAZWcNOSedjdHUzV0I2emYzXUAttLVjtp3m
380u+hdzHAd959XT2Mie
=705r
-----END PGP SIGNATURE-----
Merge tag 'mfd-fixes-3.14-1' of git://git.linaro.org/people/lee.jones/mfd
Pull MFD fixes from Lee Jones:
"Couple of small issues solved:
- Suspend/Resume call-backs require CONFIG_PM_SLEEP
- Some drivers written for 32bit architectures fail when compiled
with a 64bit compiler. The fixes will future proof the drivers"
* tag 'mfd-fixes-3.14-1' of git://git.linaro.org/people/lee.jones/mfd:
mfd: sec-core: sec_pmic_{suspend,resume}() should depend on CONFIG_PM_SLEEP
mfd: max14577: max14577_{suspend,resume}() should depend on CONFIG_PM_SLEEP
mfd: tps65217: Naturalise cross-architecture discrepancies
mfd: wm8994-core: Naturalise cross-architecture discrepancies
mfd: max8998: Naturalise cross-architecture discrepancies
mfd: max8997: Naturalise cross-architecture discrepancies
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
* Fix nf_trace in nftables if XT_TRACE=n, from Florian Westphal.
* Don't use the fast payload operation in nf_tables if the length is
not power of 2 or it is not aligned, from Nikolay Aleksandrov.
* Fix missing break statement the inet flavour of nft_reject, which
results in evaluating IPv4 packets with the IPv6 evaluation routine,
from Patrick McHardy.
* Fix wrong kconfig symbol in nft_meta to match the routing realm,
from Paul Bolle.
* Allocate the NAT null binding when creating new conntracks via
ctnetlink to avoid that several packets race at initializing the
the conntrack NAT extension, original patch from Florian Westphal,
revisited version from me.
* Fix DNAT handling in the snmp NAT helper, the same handling was being
done for SNAT and DNAT and 2.4 already contains that fix, from
Francois-Xavier Le Bail.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If we compile the TPS65217 for a 64bit architecture we receive the following
warnings:
drivers/mfd/tps65217.c: In function ‘tps65217_probe’:
drivers/mfd/tps65217.c:173:13:
warning: cast from pointer to integer of different size
chip_id = (unsigned int)match->data;
^
Signed-off-by: Lee Jones <lee.jones@linaro.org>
If we compile the MAX8998 for a 64bit architecture we receive the following
warnings:
drivers/mfd/max8998.c: In function ‘max8998_i2c_get_driver_data’:
drivers/mfd/max8998.c:178:10:
warning: cast from pointer to integer of different size
return (int)match->data;
^
Signed-off-by: Lee Jones <lee.jones@linaro.org>
If we compile the MAX8997 for a 64bit architecture we receive the following
warnings:
drivers/mfd/max8997.c: In function ‘max8997_i2c_get_driver_data’:
drivers/mfd/max8997.c:173:10:
warning: cast from pointer to integer of different size
return (int)match->data;
^
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Conflicts:
drivers/net/bonding/bond_3ad.h
drivers/net/bonding/bond_main.c
Two minor conflicts in bonding, both of which were overlapping
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) kvaser CAN driver has fixed limits of some of it's table, validate
that we won't exceed those limits at probe time. Fix from Olivier
Sobrie.
2) Fix rtl8192ce disabling interrupts for too long, from Olivier
Langlois.
3) Fix botched shift in ath5k driver, from Dan Carpenter.
4) Fix corruption of deferred packets in TIPC, from Erik Hugne.
5) Fix newlink error path in macvlan driver, from Cong Wang.
6) Fix netpoll deadlock in bonding, from Ding Tianhong.
7) Handle GSO packets properly in forwarding path when fragmentation is
necessary on egress, from Florian Westphal.
8) Fix axienet build errors, from Michal Simek.
9) Fix refcounting of ubufs on tx in vhost net driver, from Michael S
Tsirkin.
10) Carrier status isn't set properly in hyperv driver, from Haiyang
Zhang.
11) Missing pci_disable_device() in tulip_remove_one), from Ingo Molnar.
12) AF_PACKET qdisc bypass mode doesn't adhere to driver provided TX
queue selection method. Add a fallback method mechanism to fix this
bug, from Daniel Borkmann.
13) Fix regression in link local route handling on GRE tunnels, from
Nicolas Dichtel.
14) Bonding can assign dup aggregator IDs in some sequences of
configuration, fix by making the allocation counter per-bond instead
of global. From Jiri Bohac.
15) sctp_connectx() needs compat translations, from Daniel Borkmann.
16) Fix of_mdio PHY interrupt parsing, from Ben Dooks
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
MAINTAINERS: add entry for the PHY library
of_mdio: fix phy interrupt passing
net: ethernet: update dependency and help text of mvneta
NET: fec: only enable napi if we are successful
af_packet: remove a stray tab in packet_set_ring()
net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
ipv4: fix counter in_slow_tot
irtty-sir.c: Do not set_termios() on irtty_close()
bonding: 802.3ad: make aggregator_identifier bond-private
usbnet: remove generic hard_header_len check
gre: add link local route when local addr is any
batman-adv: fix potential kernel paging error for unicast transmissions
batman-adv: avoid double free when orig_node initialization fails
batman-adv: free skb on TVLV parsing success
batman-adv: fix TT CRC computation by ensuring byte order
batman-adv: fix potential orig_node reference leak
batman-adv: avoid potential race condition when adding a new neighbour
batman-adv: properly check pskb_may_pull return value
batman-adv: release vlan object after checking the CRC
batman-adv: fix TT-TVLV parsing on OGM reception
...
Now that all the logic is handled via last_arp_rx, we don't need to use
last_rx.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My rework of handling of notification events (namely commit 7053aee26a
"fsnotify: do not share events between notification groups") broke
sending of cookies with inotify events. We didn't propagate the value
passed to fsnotify() properly and passed 4 uninitialized bytes to
userspace instead (so it is also an information leak). Sadly I didn't
notice this during my testing because inotify cookies aren't used very
much and LTP inotify tests ignore them.
Fix the problem by passing the cookie value properly.
Fixes: 7053aee26a
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull Ceph fixes from Sage Weil:
"We have some patches fixing up ACL support issues from Zheng and
Guangliang and a mount option to enable/disable this support. (These
fixes were somewhat delayed by the Chinese holiday.)
There is also a small fix for cached readdir handling when directories
are fragmented"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: fix __dcache_readdir()
ceph: add acl, noacl options for cephfs mount
ceph: make ceph_forget_all_cached_acls() static inline
ceph: add missing init_acl() for mkdir() and atomic_open()
ceph: fix ceph_set_acl()
ceph: fix ceph_removexattr()
ceph: remove xattr when null value is given to setxattr()
ceph: properly handle XATTR_CREATE and XATTR_REPLACE
Introduce new netlink attributes for SET_PHY_ATTRS:
* CSMA minimal backoff exponent
* CSMA maximal backoff exponent
* CSMA retry limit
* frame retransmission limit
The CSMA attributes shall correspond to minBE, maxBE and maxCSMABackoffs of
802.15.4, respectively. The frame retransmission shall correspond to
maxFrameRetries of 802.15.4, unless given as -1: then the old behaviour
of the stack shall apply. For RF2xy, the old behaviour is to not do
channel sensing at all and simply send *right now*, which is not
intended behaviour for most applications and actually prohibited for
some channel/page combinations.
For all values except frame retransmission limit, the defaults of
802.15.4 apply. Frame retransmission limits are set to -1 to indicate
backward-compatible behaviour.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since three of the four clear channel assesment modes make use of energy
detection, provide an API to set the energy detection threshold.
Driver support for this is available in at86rf230 for the RF212 chips.
Since for these chips the minimal energy detection threshold depends on
page and channel used, add a field to struct at86rf230_local that stores
the minimal threshold. Actual ED thresholds are configured as offsets
from this value.
For RF212, setting the ED threshold will not work before a channel/page
has been set due to the dependency of energy detection in the chip and
the actual channel/page selected.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The standard describes four modes of clear channel assesment: "energy
above threshold", "carrier found", and the logical and/or of these two.
Support for CCA mode setting is included in the at86rf230 driver,
predicated for RF212 chips.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Listen-before-talk is an alternative to CSMA in uncoordinated networks
and prescribed by european regulations if one wants to have a device
with radio duty cycles above 10% (or less in some bands). Add a phy
property to enable/disable LBT in the phy, including support in the
at86rf230 driver for RF212 chips.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the current u8 transmit_power in wpan_phy with s8 transmit_power.
The u8 field contained the actual tx power and a tolerance field,
which no physical radio every used. Adjust sysfs entries to keep
compatibility with userspace, give tolerances of +-1dB statically there.
This patch only adds support for this in the at86rf230 driver and the
RF212 chip. Configuration calculation for RF212 is also somewhat basic,
but does the job - the RF212 datasheet gives a large table with
suggested values for combinations of TX power and page/channel, if this
does not work well, we might have to copy the whole table.
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed out by Shaohui, most 10G PHYs out there have a non-standard
compliant software reset sequence, eventually something much more
complex than just toggling the BMCR_RESET bit. Allow PHY driver to
implement their own soft_reset() callback to deal with that. If no
callback is provided, call into genphy_soft_reset() which makes sure the
existing behavior is kept intact.
Reported-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed out by Shaohui, this function is generic for 10/100/1000
PHYs, but 10G PHYs might have a slightly different reset sequence which
prevents most of them from using this function.
Move the BMCR_RESET based software resent sequence to
genphy_soft_reset() in preparation for allowing PHY drivers to implement
a soft_reset() callback.
Reported-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the setxattr request, introduce a new flag CEPH_XATTR_REMOVE
to distinguish null value case from the zero-length value case.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
When using nftables with CONFIG_NETFILTER_XT_TARGET_TRACE=n, we get
lots of "TRACE: filter:output:policy:1 IN=..." warnings as several
places will leave skb->nf_trace uninitialised.
Unlike iptables tracing functionality is not conditional in nftables,
so always copy/zero nf_trace setting when nftables is enabled.
Move this into __nf_copy() helper.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In order to allow users to invoke netdev_cap_txqueue, it needs to
be moved into netdevice.h header file. While at it, also add kernel
doc header to document the API.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new argument for ndo_select_queue() callback that passes a
fallback handler. This gets invoked through netdev_pick_tx();
fallback handler is currently __netdev_pick_tx() as most drivers
invoke this function within their customized implementation in
case for skbs that don't need any special handling. This fallback
handler can then be replaced on other call-sites with different
queue selection methods (e.g. in packet sockets, pktgen etc).
This also has the nice side-effect that __netdev_pick_tx() is
then only invoked from netdev_pick_tx() and export of that
function to modules can be undone.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull irq update from Thomas Gleixner:
"Fix from the urgent branch: a trivial oneliner adding the missing
Kconfig dependency curing build failures which have been discovered by
several build robots.
The update in the irq-core branch provides a new function in the
irq/devres code, which is a prerequisite for driver developers to get
rid of boilerplate code all over the place.
Not a bugfix, but it has zero impact on the current kernel due to the
lack of users. It's simpler to provide the infrastructure to
interested parties via your tree than fulfilling the wishlist of
driver maintainers on which particular commit or tag this should be
based on"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Add devm_request_any_context_irq()