The cap_mask1 isn't protected by field_select and not listed among RW
fields, but it is required to be written to properly initialize ports
in IB virtualization mode.
Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com
Fixes: ab118da4c1 ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Nvidia card may come with a "phantom" UCSI device, and its driver gets
stuck in probe routine, prevents any system PM operations like suspend.
There's an unaccounted case that the target time can equal to jiffies in
gpu_i2c_check_status(), let's solve that by using readl_poll_timeout()
instead of jiffies comparison functions.
Fixes: c71bcdcb42 ("i2c: add i2c bus driver for NVIDIA GPU")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Ajay Gupta <ajayg@nvidia.com>
Tested-by: Ajay Gupta <ajayg@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Add a test case to check nf queue infrastructure.
Could be extended in the future to also cover serialization of
conntrack, uid and secctx attributes in nfqueue.
For now, this checks that 'queue bypass' works, that a queue rule with
no bypass option blocks traffic and that userspace receives the expected
number of packets.
For this we add two queues and hook all of
prerouting/input/forward/output/postrouting.
Packets get queued twice with a dummy base chain in between:
This passes with current nf tree, but reverting
commit 946c0d8e6e ("netfilter: nf_queue: fix reinject verdict handling")
makes this trip (it processes 30 instead of expected 20 packets).
v2: update config file with queue and other options missing/needed for
other tests.
v3: also test with tcp, this reveals problem with commit
28f8bfd1ac ("netfilter: Support iif matches in POSTROUTING"), due to
skb->dev pointing at another skb in the retransmit rbtree (skb->dev
aliases to rbnode child).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Set skb->tc_redirected to 1, otherwise the ifb driver drops the packet.
Set skb->tc_from_ingress to 1 to reinject the packet back to the ingress
path after leaving the ifb egress path.
This patch inconditionally sets on these two skb fields that are
meaningful to the ifb driver. The existing forward action is guaranteed
to run from ingress path.
Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Make sure the forward action is only used from ingress.
Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
...and return -ENOTEMPTY to the front-end in this case, instead of
proceeding. Currently, nft takes care of checking for these cases
and not sending them to the kernel, but if we drop the set_overlap()
call in nft we can end up in situations like:
# nft add table t
# nft add set t s '{ type inet_service ; flags interval ; }'
# nft add element t s '{ 1 - 5 }'
# nft add element t s '{ 6 - 10 }'
# nft add element t s '{ 4 - 7 }'
# nft list set t s
table ip t {
set s {
type inet_service
flags interval
elements = { 1-3, 4-5, 6-7 }
}
}
This change has the primary purpose of making the behaviour
consistent with nft_set_pipapo, but is also functional to avoid
inconsistent behaviour if userspace sends overlapping elements for
any reason.
v2: When we meet the same key data in the tree, as start element while
inserting an end element, or as end element while inserting a start
element, actually check that the existing element is active, before
resetting the overlap flag (Pablo Neira Ayuso)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Replace negations of nft_rbtree_interval_end() with a new helper,
nft_rbtree_interval_start(), wherever this helps to visualise the
problem at hand, that is, for all the occurrences except for the
comparison against given flags in __nft_rbtree_get().
This gets especially useful in the next patch.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
...and return -ENOTEMPTY to the front-end on collision, -EEXIST if
an identical element already exists. Together with the previous patch,
element collision will now be returned to the user as -EEXIST.
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, the -EEXIST return code of ->insert() callbacks is ambiguous: it
might indicate that a given element (including intervals) already exists as
such, or that the new element would clash with existing ones.
If identical elements already exist, the front-end is ignoring this without
returning error, in case NLM_F_EXCL is not set. However, if the new element
can't be inserted due an overlap, we should report this to the user.
To this purpose, allow set back-ends to return -ENOTEMPTY on collision with
existing elements, translate that to -EEXIST, and return that to userspace,
no matter if NLM_F_EXCL was set.
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull perf tooling fixes from Ingo Molnar:
"A handful of tooling fixes all across the map, no kernel changes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools headers uapi: Update linux/in.h copy
perf probe: Do not depend on dwfl_module_addrsym()
perf probe: Fix to delete multiple probe event
perf parse-events: Fix reading of invalid memory in event parsing
perf python: Fix clang detection when using CC=clang-version
perf map: Fix off by one in strncpy() size argument
tools: Let O= makes handle a relative path with -C option
Pull x86 fix from Ingo Molnar:
"A build fix with certain Kconfig combinations"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioremap: Fix CONFIG_EFI=n build
Late fixes in dmaengine for v5.6:
- move .device_release missing log warning to debug
- couple of maintainer entries for HiSilicon and IADX drivers
- one off fix for idxd driver
- documentation warn fixes
- TI k3 dma error handling fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAl55mjUACgkQfBQHDyUj
g0dnIA//eHgGyhvuLWkKunLH2+x5YLYL0ZzWdldW6dbsHaxB59/j1aO325KWXiXB
c2GKGxqRzoNc9i4V5k6cDWjml34z9x44HGUhySyOqE3MkNzM4STjtclePE/DDWaV
QU2zxd7cg3QkP0q3WFaJtw7ffObwJyJqL2GbXcLbfEw731XyjsV3qZvrcHcHkiro
X8taSqlVhZEBc6aGQRNQijWYVH/a/SK2kqo79zv1r24EEkvId3f2k0/ZsHT9r/tD
M2+guUvPEWuL7hUUuhul++7tauvi0Klvil7Ye6HRaUWDjJ1UBW5bnQXzVJMzKoRv
n8BJXet6owIWucHJijqifRDkKJDAg6XIT97ado/jpoH11xtmRqFF+85uPOF2MQSR
Ko2CtsHZjk02XmrVBpqesW8vN2iWlpeaG5lKtDyvwMpOR1b/iTs29LEIZkHtpm1z
kA5/w4MEZF9jP1up7dTIs7rQJsArFhfh6hKUahWu9FdaHg8VufmtiRUW71NTfY9z
pM5PNL7+2dq6BBVnxobrSN1GybR4NEie37xKZnF5JPKG6/qUl9WCciRxcfJjNxFz
qip4Bm39elXTzFZ7S/U7qJyB+/vTbKMldDj8YmUa57jev+8pfjp9Pfqja/V2SIHx
IiLG9ugpQFxXRnGOdo4LDjh6ute9sZvTdSrpKUT4J+7E9qFNuAk=
=QaCh
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"Late fixes in dmaengine for v5.6:
- move .device_release missing log warning to debug
- couple of maintainer entries for HiSilicon and IADX drivers
- off-by-one fix for idxd driver
- documentation warning fixes
- TI k3 dma error handling fix"
* tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: ti: k3-udma-glue: Fix an error handling path in 'k3_udma_glue_cfg_rx_flow()'
MAINTAINERS: Add maintainer for HiSilicon DMA engine driver
dmaengine: idxd: fix off by one on cdev dwq refcount
MAINTAINERS: rectify the INTEL IADX DRIVER entry
dmaengine: move .device_release missing log warning to debug level
docs: dmaengine: provider.rst: get rid of some warnings
The timer is disarmed when switching between TSC deadline and other modes,
we should set everything to disarmed state, however, LAPIC timer can be
emulated by preemption timer, it still works if vmx->hv_deadline_timer is
not -1. This patch also cancels preemption timer when disarm LAPIC timer.
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1585031530-19823-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are at least 3 models of the HP x2 10 models:
Bay Trail SoC + AXP288 PMIC
Cherry Trail SoC + AXP288 PMIC
Cherry Trail SoC + TI PMIC
Like on the other HP x2 10 models we need to ignore wakeup for ACPI GPIO
events on the external embedded-controller pin to avoid spurious wakeups
on the HP x2 10 CHT + AXP288 model too.
This commit adds an extra DMI based quirk for the HP x2 10 CHT + AXP288
model, ignoring wakeups for ACPI GPIO events on the EC interrupt pin
on this model. This fixes spurious wakeups from suspend on this model.
Fixes: aa23ca3d98 ("gpiolib: acpi: Add honor_wakeup module-option + quirk mechanism")
Reported-and-tested-by: Marc Lehmann <schmorp@schmorp.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200302111225.6641-4-hdegoede@redhat.com
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
After xfrm_add_policy add a policy, its ref is 2, then
xfrm_policy_timer
read_lock
xp->walk.dead is 0
....
mod_timer()
xfrm_policy_kill
policy->walk.dead = 1
....
del_timer(&policy->timer)
xfrm_pol_put //ref is 1
xfrm_pol_put //ref is 0
xfrm_policy_destroy
call_rcu
xfrm_pol_hold //ref is 1
read_unlock
xfrm_pol_put //ref is 0
xfrm_policy_destroy
call_rcu
xfrm_policy_destroy is called twice, which may leads to
double free.
Call Trace:
RIP: 0010:refcount_warn_saturate+0x161/0x210
...
xfrm_policy_timer+0x522/0x600
call_timer_fn+0x1b3/0x5e0
? __xfrm_decode_session+0x2990/0x2990
? msleep+0xb0/0xb0
? _raw_spin_unlock_irq+0x24/0x40
? __xfrm_decode_session+0x2990/0x2990
? __xfrm_decode_session+0x2990/0x2990
run_timer_softirq+0x5c5/0x10e0
Fix this by use write_lock_bh in xfrm_policy_kill.
Fixes: ea2dea9dac ("xfrm: remove policy lock when accessing policy->walk.dead")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Add missing Makefile for net/forwarding tests and include it to
the targets list, otherwise forwarding tests are not installed
in case of cross-compilation.
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew noticed that some handlers for *_SET commands leak a netdev
reference if required ethtool_ops callbacks do not exist. A simple
reproducer would be e.g.
ip link add veth1 type veth peer name veth2
ethtool -s veth1 wol g
ip link del veth1
Make sure dev_put() is called when ethtool_ops check fails.
v2: add Fixes tags
Fixes: a53f3d41e4 ("ethtool: set link settings with LINKINFO_SET request")
Fixes: bfbcfe2032 ("ethtool: set link modes related data with LINKMODES_SET request")
Fixes: e54d04e3af ("ethtool: set message mask with DEBUG_SET request")
Fixes: 8d425b19b3 ("ethtool: set wake-on-lan settings with WOL_SET request")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When both the switch and the bridge are learning about new addresses,
switch ports attached to the bridge would see duplicate ARP frames
because both entities would attempt to send them.
Fixes: 5037d532b8 ("net: dsa: add Broadcom tag RX/TX handler")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
5 bug fix patches covering an indexing bug for priority counters, memory
leak when retrieving DCB ETS settings, error path return code, proper
disabling of PCI before freeing context memory, and proper ring accounting
in error path.
Please also apply these to -stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If ring counts are not reset when ring reservation fails,
bnxt_init_dflt_ring_mode() will not be called again to reinitialise
IRQs when open() is called and results in system crash as napi will
also be not initialised. This patch fixes it by resetting the ring
counts.
Fixes: 47558acd56 ("bnxt_en: Reserve rings at driver open if none was reserved at probe time.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Other shutdown code paths will always disable PCI first to shutdown DMA
before freeing context memory. Do the same sequence in the error path
of probe to be safe and consistent.
Fixes: c20dc142dd ("bnxt_en: Disable bus master during PCI shutdown and driver unload.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code ignores the return value from
bnxt_hwrm_func_backing_store_cfg(), causing the driver to proceed in
the init path even when this vital firmware call has failed. Fix it
by propagating the error code to the caller.
Fixes: 1b9394e5a2 ("bnxt_en: Configure context memory on new devices.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocated ieee_ets structure goes out of scope without being freed,
leaking memory. Appropriate result codes should be returned so that
callers do not rely on invalid data passed by reference.
Also cache the ETS config retrieved from the device so that it doesn't
need to be freed. The balance of the code was clearly written with the
intent of having the results of querying the hardware cached in the
device structure. The commensurate store was evidently missed though.
Fixes: 7df4ae9fe8 ("bnxt_en: Implement DCBNL to support host-based DCBX.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an indexing bug in determining these ethtool priority
counters. Instead of using the queue ID to index, we need to
normalize by modulo 10 to get the index. This index is then used
to obtain the proper CoS queue counter. Rename bp->pri2cos to
bp->pri2cos_idx to make this more clear.
Fixes: e37fed7903 ("bnxt_en: Add ethtool -S priority counters.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only attach macsec to ethernet devices.
Syzbot was able to trigger a KMSAN warning in macsec_handle_frame
by attaching to a phonet device.
Macvlan has a similar check in macvlan_port_create.
v1->v2
- fix commit message typo
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike NL_SET_ERR_* macros, nl_set_extack_cookie_u64() and
nl_set_extack_cookie_u32() helpers do not check extack argument for null
and neither do their callers, as syzbot recently discovered for
ethnl_parse_header().
Instead of fixing the callers and leaving the trap in place, add check of
null extack to both helpers to make them consistent with NL_SET_ERR_*
macros.
v2: drop incorrect second Fixes tag
Fixes: 2363d73a2f ("ethtool: reject unrecognized request flags")
Reported-by: syzbot+258a9089477493cea67b@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nci_conn_max_data_pkt_payload_size() function sometimes returns
-EPROTO so "max_size" needs to be signed for the error handling to
work. We can make "payload_size" an int as well.
Fixes: a06347c04c ("NFC: Add Intel Fields Peak NFC solution driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a place,
inet_dump_fib()
fib_table_dump
fn_trie_dump_leaf()
hlist_for_each_entry_rcu()
without rcu_read_lock() will trigger a warning,
WARNING: suspicious RCU usage
-----------------------------
net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/1923:
#0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840
Call Trace:
dump_stack+0xa1/0xea
lockdep_rcu_suspicious+0x103/0x10d
fn_trie_dump_leaf+0x581/0x590
fib_table_dump+0x15f/0x220
inet_dump_fib+0x4ad/0x5d0
netlink_dump+0x350/0x840
__netlink_dump_start+0x315/0x3e0
rtnetlink_rcv_msg+0x4d1/0x720
netlink_rcv_skb+0xf0/0x220
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x306/0x460
netlink_sendmsg+0x44b/0x770
__sys_sendto+0x259/0x270
__x64_sys_sendto+0x80/0xa0
do_syscall_64+0x69/0xf4
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 18a8021a7b ("net/ipv4: Plumb support for filtering route dumps")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl5hh6wACgkQSD+KveBX
+j6qvQf9HQsiQ+cE1UIbM/IzyTWXeBMzjljCWFgQfvyKQjSFnoATeVl6GMQJNk7M
ovQ7XHOlN36E/tQW/ypnwbX+btjl/mDEJsxEcvVf4gnw/QH1AwUjo291vPfrE5md
DcWWe9Jrq2MkHeZAgIt/Tnw4GYIwQOBVmYgky3A0+azzuvxK+nXX6JJG5JqRR8Wd
tBsN9UKok1nOt73d+TyHjGnzQHWzGBdS0vlxl0MYcD1QD66UzA0Atgz5aUQLGvTK
Nk/IcHr3SF+0kvbtpjRxlrpi4ywD2gBNHXMJX1DDnCkqg9nWjJ5DByYDASo+I6uL
0fMNsimp/gZG8HD0oFEVTFgaqPACUA==
=DEqe
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-03-05
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v5.4
('net/mlx5: DR, Fix postsend actions write length')
For -stable v5.5
('net/mlx5e: kTLS, Fix TCP seq off-by-1 issue in TX resync flow')
('net/mlx5e: Fix endianness handling in pedit mask')
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following warning can occur when a pq is left on the dmawait list and
the pq is then freed:
WARNING: CPU: 47 PID: 3546 at lib/list_debug.c:29 __list_add+0x65/0xc0
list_add corruption. next->prev should be prev (ffff939228da1880), but was ffff939cabb52230. (next=ffff939cabb52230).
Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev drm_panel_orientation_quirks i2c_i801 mei_me lpc_ich mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables
nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci libahci i2c_algo_bit dca libata ptp pps_core crc32c_intel [last unloaded: i2c_algo_bit]
CPU: 47 PID: 3546 Comm: wrf.exe Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.41.1.el7.x86_64 #1
Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019
Call Trace:
[<ffffffff91f65ac0>] dump_stack+0x19/0x1b
[<ffffffff91898b78>] __warn+0xd8/0x100
[<ffffffff91898bff>] warn_slowpath_fmt+0x5f/0x80
[<ffffffff91a1dabe>] ? ___slab_alloc+0x24e/0x4f0
[<ffffffff91b97025>] __list_add+0x65/0xc0
[<ffffffffc03926a5>] defer_packet_queue+0x145/0x1a0 [hfi1]
[<ffffffffc0372987>] sdma_check_progress+0x67/0xa0 [hfi1]
[<ffffffffc03779d2>] sdma_send_txlist+0x432/0x550 [hfi1]
[<ffffffff91a20009>] ? kmem_cache_alloc+0x179/0x1f0
[<ffffffffc0392973>] ? user_sdma_send_pkts+0xc3/0x1990 [hfi1]
[<ffffffffc0393e3a>] user_sdma_send_pkts+0x158a/0x1990 [hfi1]
[<ffffffff918ab65e>] ? try_to_del_timer_sync+0x5e/0x90
[<ffffffff91a3fe1a>] ? __check_object_size+0x1ca/0x250
[<ffffffffc0395546>] hfi1_user_sdma_process_request+0xd66/0x1280 [hfi1]
[<ffffffffc034e0da>] hfi1_aio_write+0xca/0x120 [hfi1]
[<ffffffff91a4245b>] do_sync_readv_writev+0x7b/0xd0
[<ffffffff91a4409e>] do_readv_writev+0xce/0x260
[<ffffffff918df69f>] ? pick_next_task_fair+0x5f/0x1b0
[<ffffffff918db535>] ? sched_clock_cpu+0x85/0xc0
[<ffffffff91f6b16a>] ? __schedule+0x13a/0x860
[<ffffffff91a442c5>] vfs_writev+0x35/0x60
[<ffffffff91a4447f>] SyS_writev+0x7f/0x110
[<ffffffff91f78ddb>] system_call_fastpath+0x22/0x27
The issue happens when wait_event_interruptible_timeout() returns a value
<= 0.
In that case, the pq is left on the list. The code continues sending
packets and potentially can complete the current request with the pq still
on the dmawait list provided no descriptor shortage is seen.
If the pq is torn down in that state, the sdma interrupt handler could
find the now freed pq on the list with list corruption or memory
corruption resulting.
Fix by adding a flush routine to ensure that the pq is never on a list
after processing a request.
A follow-up patch series will address issues with seqlock surfaced in:
https://lore.kernel.org/r/20200320003129.GP20941@ziepe.ca
The seqlock use for sdma will then be converted to a spin lock since the
list_empty() doesn't need the protection afforded by the sequence lock
currently in use.
Fixes: a0d406934a ("staging/rdma/hfi1: Add page lock limit check for SDMA requests")
Link: https://lore.kernel.org/r/20200320200200.23203.37777.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pull crypto fix from Herbert Xu:
"This fixes a correctness bug in the ARM64 version of ChaCha for
lib/crypto used by WireGuard"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: arm64/chacha - correctly walk through blocks
When application uses TCP_QUEUE_SEQ socket option to
change tp->rcv_next, we must also update tp->copied_seq.
Otherwise, stuff relying on tcp_inq() being precise can
eventually be confused.
For example, tcp_zerocopy_receive() might crash because
it does not expect tcp_recv_skb() to return NULL.
We could add tests in various places to fix the issue,
or simply make sure tcp_inq() wont return a random value,
and leave fast path as it is.
Note that this fixes ioctl(fd, SIOCINQ, &val) at the same
time.
Fixes: ee9952831c ("tcp: Initial repair mode")
Fixes: 05255b823a ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
THUNK_TARGET defines [thunk_target] as having "rm" input constraints
when CONFIG_RETPOLINE is not set, which isn't constrained enough for
this specific case.
For inline assembly that modifies the stack pointer before using this
input, the underspecification of constraints is dangerous, and results
in an indirect call to a previously pushed flags register.
In this case `entry`'s stack slot is good enough to satisfy the "m"
constraint in "rm", but the inline assembly in
handle_external_interrupt_irqoff() modifies the stack pointer via
push+pushf before using this input, which in this case results in
calling what was the previous state of the flags register, rather than
`entry`.
Be more specific in the constraints by requiring `entry` be in a
register, and not a memory operand.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
Debugged-by: Alexander Potapenko <glider@google.com>
Debugged-by: Paolo Bonzini <pbonzini@redhat.com>
Debugged-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Message-Id: <20200323191243.30002-1-ndesaulniers@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The GEO_TX_POWER_LIMIT command was sent although
there is no wgds table, so the fw got wrong SAR values
from the driver.
Fix this by avoiding sending the command if no wgds
tables are available.
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
Fixes: 39c1a9728f ("iwlwifi: refactor the SAR tables from mvm to acpi")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Tested-By: Jonathan McDowell <noodles@earth.li>
Tested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20200318081237.46db40617cc6.Id5cf852ec8c5dbf20ba86bad7b165a0c828f8b2e@changeid
Currently, CLFLUSH is used to flush SEV guest memory before the guest is
terminated (or a memory hotplug region is removed). However, CLFLUSH is
not enough to ensure that SEV guest tagged data is flushed from the cache.
With 33af3a7ef9 ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations"), the
original WBINVD was removed. This then exposed crashes at random times
because of a cache flush race with a page that had both a hypervisor and
a guest tag in the cache.
Restore the WBINVD when destroying an SEV guest and add a WBINVD to the
svm_unregister_enc_region() function to ensure hotplug memory is flushed
when removed. The DF_FLUSH can still be avoided at this point.
Fixes: 33af3a7ef9 ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c8bf9087ca3711c5770bdeaafa3e45b717dc5ef4.1584720426.git.thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
fixing a bunch of memory leaks for a page vector allocated in
alloc_msg_with_page_vector(). Currently, only watch-notify
messages trigger this allocation, and normally the page vector
is freed either in handle_watch_notify() or by the caller of
ceph_osdc_notify(). But if the message is freed before that
(e.g. if the session faults while reading in the message or
if the notify is stale), we leak the page vector.
This was supposed to be fixed by switching to a message-owned
pagelist, but that never happened.
Fixes: 1907920324 ("libceph: support for sending notifies")
Reported-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
per-pool flags as well. Unfortunately the backwards compatibility here
is lacking:
- the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
was guarded by require_osd_release >= RELEASE_LUMINOUS
- it was subsequently backported to luminous in v12.2.2, but that makes
no difference to clients that only check OSDMAP_FULL/NEARFULL because
require_osd_release is not client-facing -- it is for OSDs
Since all kernels are affected, the best we can do here is just start
checking both map flags and pool flags and send that to stable.
These checks are best effort, so take osdc->lock and look up pool flags
just once. Remove the FIXME, since filesystem quotas are checked above
and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
Cc: stable@vger.kernel.org
Reported-by: Yanhu Cao <gmayyyha@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Sage Weil <sage@redhat.com>
Don't let non-letters inside a literal block without escaping it, as
the toolchain would mis-interpret it:
./include/linux/i2c.h:518: WARNING: Inline strong start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Disable all rps-irq interrupts during driver initialization to prevent
an accidental interrupt on GIC.
Fixes: 84316f4ef1 ("ARM: boot: dts: Add Oxford Semiconductor OX810SE dtsi")
Fixes: 38d4a53733 ("ARM: dts: Add support for OX820 and Pogoplug V3")
Signed-off-by: Sungbo Eo <mans0n@gorani.run>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
All but one error handling paths in the 'k3_udma_glue_cfg_rx_flow()'
function 'goto err' and call 'k3_udma_glue_release_rx_flow()'.
This not correct because this function has a 'channel->flows_ready--;' at
the end, but 'flows_ready' has not been incremented here, when we branch to
the error handling path.
In order to keep a correct value in 'flows_ready', un-roll
'k3_udma_glue_release_rx_flow()', simplify it, add some labels and branch
at the correct places when an error is detected.
Doing so, we also NULLify 'flow->udma_rflow' in a path that was lacking it.
Fixes: d702419134 ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine user")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20200318191209.1267-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The refcount check for dedicated workqueue (dwq) is off by one and allows
more than 1 user to open the char device. Fix check so only a single user
can open the device.
Fixes: 42d279f913 ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/158403020187.10208.14117394394540710774.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The bus is virtual and devices have to inherit their DMA constraints
from the underlying interconnect. So add an empty dma-ranges property to
the bus node, implying the firmware bus' DMA constraints are identical to
its parent's.
Fixes: 7dbe8c62ce ("ARM: dts: Add minimal Raspberry Pi 4 support")
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl53Vh0ACgkQxWXV+ddt
WDtfOQ//bbUyKXcdH0FBZOCEcJmegcK1eUFYqKrwR2bHGe5JRdLM8pAvjCcqmWeO
jtaRiFC4NSCqTIl3mkBUb+XmQtjZwixBUHRxJpuEO8zqawvFZXTqg/KJklNvi2rd
KdflSNia6KrozTT+B/lpwZ5emS+wSdj5XTZ6VGj4riwtphSfWAjOu+4cOASMeFu+
Gfn+N9xu0ZcR/6zO20xAg0Xz+WU2uj4EfeM35dtRP2bPLG0yOGmiYT15Ll9h74Wm
7F+28iNTQfYutAexGvUpiouanGXE+ka3TCsJg5LuVTpdKGraOVGEuX+RhsyoKQrB
E8bk91fbkLlooluhUC306iNA9/+RN/yFGtILX8JsgI2Od26ZuU01l/OHrc19MDIm
gw1w3PMsD/hXLsG5ba4QsIYOzXofSrPdWej29h/o5p0VEQrAoCJEpAi7fVsiJDR1
sx6kCodw5jYhVs1P6DdXO1pgjE7iFUmjUQCFkl40edPMLy/LwB99A4zNnCOwI0KZ
49CMWHDe+tXVJBTzPvtma/PycQHIxJYMf1f8ko9E4stB7HtfH4dnUERDkb1UwQ5n
aJgyhsCCnp/EJoPunUT7g9nLUdyu0Rtwknn3NascWZEieX2QhKEF5RcjAUSL+Hlo
jbGGvoLhG0nOtYkU7BNSQbL8wxPJEEAq8e6F4tWMcOkhX4pNZP8=
=YkB0
-----END PGP SIGNATURE-----
Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Two fixes.
The first is a regression: when dropping some incompat bits the
conditions were reversed. The other is a fix for rename whiteout
potentially leaving stack memory linked to a list"
* tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix removal of raid[56|1c34} incompat flags after removing block group
btrfs: fix log context list corruption after rename whiteout error
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
x86/mm: split vmalloc_sync_all()
mm, slub: prevent kmalloc_node crashes and memory leaks
mm/mmu_notifier: silence PROVE_RCU_LIST warnings
epoll: fix possible lost wakeup on epoll_ctl() path
mm: do not allow MADV_PAGEOUT for CoW pages
mm, memcg: throttle allocators based on ancestral memory.high
mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
page-flags: fix a crash at SetPageError(THP_SWAP)
mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event