CQ is armed for solicited events only, ignoring other notification
flags. Correct this by arming for next and arming for solicited
event if IB_CQ_SOLICITED is set. Also protect CQ shadow area update
with spinlock.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
ipoib_neigh_get unconditionally updates the "alive" variable member on
any packet send. This prevents the neighbor garbage collection from
cleaning out a dead neighbor entry if we are still queueing packets
for it. If the queue for this neighbor is full, then don't update the
alive timestamp. That way the neighbor can time out even if packets
are still being queued as long as none of them are being sent.
Fixes: b63b70d877 ("IPoIB: Use a private hash table for path lookup in xmit path")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Userspace flag IBV_QP_ALT_PATH is supposed to set the alternate path
including fields alt_pkey_index and alt_timeout.
Added IB_QP_PKEY_INDEX and IB_QP_TIMEOUT to the attribute mask when
calling mlx5_set_path for the alternate path to force setting the
alt_pkey_index and alt_timeout values.
Fixes: bf24481a3a7c4 ('IB/mlx5: Consider alternate path in pkey ...')
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pkey index fields in the QP context path record are extended to 16
bits, as required by IB spec (version 1.3).
This change affects all QP commands which include path records.
To enable this change, moved the free adaptive routing flag bit
(free_ar) to the most significant byte of the QP path record.
Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB ...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Verify that number of entries is less than device capability.
Add an appropriate warning message for error flow.
Fixes: bde51583f4 ('IB/mlx5: Add support for resize CQ')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Number of entries shouldn't be greater than the device's max
capability. This should be checked before rounding the entries number
to power of two.
Fixes: 51ee86a4af ('IB/mlx5: Fix check of number of entries...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
BlueFlame support is reported only for PFs when the HCA capability is
on.
Fixes: 938fe83c8d ('net/mlx5_core: New device capabilities...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When PAGE_SIZE is larger than 4K, the user shouldn't be able to query
the HCA core clock. This counter is within 4KB boundary and the
user-space shall not read information that's after this boundary.
Fixes: b368d7cb8c ('IB/mlx5: Add hca_core_clock_offset to...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add a 4-digit padding to show FW version in proper format.
Fixes: 9603b61de1 ('mlx5: Move pci device handling from...')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
FW port-change events are fired on Active <-> non Active port state
transitions only.
When the port state changes from Active to Initializing (Active ->
Down -> Initializing), a single event is fired.
The HCA transitions from Down to Initializing unless prevented from
doing so, hence the driver should also propagate events when the port
state is Initializing to consumers so they'll be aware that the port
is no longer Active and act accordingly.
Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Flow steering is supported by mlx5 device when the following
features are supported by firmware:
1. NIC RX flow table.
2. Device has enough flow steering levels.
3. Atomic modification of flow table entry.
4. Flow tables chaining.
To check if flow steering is supported it's enough to check
if the driver opened the mlx5 bypass namespace.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Replace the few u64 casts with ULL to match the rest of the casts.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
ib_device_cap_flags 64-bit expansion caused caps overlapping
and made consumers read wrong device capabilities. For example
IB_DEVICE_SG_GAPS_REG was falsely read by the iser driver causing
it to use a non-existing capability. This happened because signed
int becomes sign extended when converted it to u64. Fix this by
casting IB_DEVICE_ON_DEMAND_PAGING enumeration to ULL.
Fixes: f5aa9159a4 ('IB/core: Add arbitrary sg_list support')
Reported-by: Robert LeBlanc <robert@leblancnet.us>
Cc: Stable <stable@vger.kernel.org> #[v4.6+]
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
For dynamically allocated sysfs attributes there is a need to call
sysfs_attr_init in order to comply with lockdep, not calling it
will result in error complaining key is not in .data section.
Fixes: b40f4757da ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Align locking usage when touching device address with rest
of the kernel. Lock the bottom half when doing so using
netif_addr_lock_bh.
This also solves the following case as reported by lockdep:
CPU0 CPU1
---- ----
lock(_xmit_INFINIBAND);
local_irq_disable();
lock(&(&mc->mca_lock)->rlock);
lock(_xmit_INFINIBAND);
<Interrupt>
lock(&(&mc->mca_lock)->rlock);
*** DEADLOCK ***
Fixes: 492a7e67ff ("IB/IPoIB: Allow setting the device address")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When deleting a default GID from the cache, its gid_type field is set
to 0.
This could set the gid_type to RoCE v1 for a RoCE v2 default GID,
essentially making it inaccessible to future modifications, since it
is no longer found by find_gid().
This fix preserves the gid_type value for default gids during cache
operations.
Fixes: b39ffa1df5 ('IB/core: Add gid_type to gid attribute')
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In ipoib_remove_one the driver holds the rtnl_lock and tries to do some
operation like dev_change_flags or unregister_netdev, while sysfs
callback like ipoib_vlan_delete holds sysfs mutex and tries to hold the
rtnl_lock via rtnl_trylock() and restart_syscall() if the lock is not
free, meanwhile ipoib_remove_one tries to get the sysfs lock in order to
free its sysfs directory, and we will get a->b, b->a deadlock.
Trace like the following:
schedule+0x37/0x80
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0xb5/0x120
mutex_lock+0x23/0x40
rtnl_lock+0x15/0x20
netdev_run_todo+0x17c/0x320
rtnl_unlock+0xe/0x10
ipoib_vlan_delete+0x11b/0x1b0 [ib_ipoib]
delete_child+0x54/0x80 [ib_ipoib]
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
mutex_lock+0x16/0x40
SyS_write+0x55/0xc0
entry_SYSCALL_64_fastpath+0x16/0x75
And
schedule+0x37/0x80
__kernfs_remove+0x1a8/0x260
? wake_atomic_t_function+0x60/0x60
kernfs_remove+0x25/0x40
sysfs_remove_dir+0x50/0x80
kobject_del+0x18/0x50
device_del+0x19f/0x260
netdev_unregister_kobject+0x6a/0x80
rollback_registered_many+0x1fd/0x340
rollback_registered+0x3c/0x70
unregister_netdevice_queue+0x55/0xc0
unregister_netdev+0x20/0x30
ipoib_remove_one+0x114/0x1b0 [ib_ipoib]
ib_unregister_client+0x4a/0x170 [ib_core]
? find_module_all+0x71/0xa0
ipoib_cleanup_module+0x10/0x94 [ib_ipoib]
SyS_delete_module+0x1b5/0x210
entry_SYSCALL_64_fastpath+0x16/0x75
The fix is by checking the flag IPOIB_FLAG_INTF_ON_DESTROY in order to
get out from the sysfs function.
Fixes: 862096a8bb ("IB/ipoib: Add more rtnl_link_ops callbacks")
Fixes: 9baa0b0364 ("IB/ipoib: Add rtnl_link_ops support")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently ib_query_port always attempts to to read the subnet prefix by
calling ib_query_gid(). For RoCE/iWARP there is no subnet manager and no
subnet prefix. Fix this by querying GID[0] only for IB networks.
Fixes: fad61ad4e7 ('IB/core: Add subnet prefix to port info')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Between the initial and final versions of the function setup_hw_stats,
the order of variable initialization was changed. However, the unwind
flow on error did not properly keep up with the flow changes. Make
the unwind flow match a proper unwind of the allocation flow, then
remove no longer needed variable initializations.
Fixes: b40f4757da (IB/core: Make device counter infrastructure
dynamic)
Signed-off-by: Doug Ledford <dledford@redhat.com>
The new sysfs hw_counters code had an off by one in its array allocation
length. Fix that and the comment along with it.
Reported-by: Mark Bloch <markb@mellanox.com>
Fixes: b40f4757da (IB/core: Make device counter infrastructure
dynamic)
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid that sparse reports the following warnings for the hfi1 driver:
trace.c:217:13: warning: no previous prototype for ‘print_u64_array’ [-Wmissing-prototypes]
user_sdma.c:1361:17: warning: dubious: !x & y
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The first argument of test_bit() and clear_bit() is a bit number and
not a bitmask. Hence change that first argument from (1 << 0) into 0.
This patch avoids that smatch reports the following warnings:
user_sdma.c:1059: sdma_cache_evict() warn: test_bit() takes a bit number
user_sdma.c:1590: sdma_rb_remove() warn: test_bit() takes a bit number
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Make the indentation of the source code consistent. Detected by
smatch.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch avoids that sparse reports the following warning:
rdmavt/qp.c:507:17: warning: context imbalance in 'rvt_reset_qp' - unexpected unlock
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Make indentation consistent. Detected by smatch.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hal Rosenstock <hal@mellanox.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-By: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Because patch "IB/srp: Move common code into the caller" was applied
partially srp_map_sg_dma() doesn't work properly. Fix this by
applying the remainder of that patch. See also
http://thread.gmane.org/gmane.linux.drivers.rdma/35803/focus=35811.
Fixes: 3849e44d1c ("IB/srp: Move common code into the caller")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Sagi Grimberg <sai@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid that mapping fails due to use_fast_reg != 0 or use_fmr != 0
if both member variables should be zero (if never_register == 1 or
if neither FMR nor FR is supported). Remove an initialization that
became superfluous due to changing a kmalloc() into a kzalloc()
call.
Fixes: 509c5f33f4 ("IB/srp: Prevent mapping failures")
Cc: Sagi Grimberg <sai@grimberg.m>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Perform the test for device managed flow steering support even if
memory windows are not supported. I noticed this because smatch
reported inconsistent indentation for the device managed flow
steering support test.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The DMA attributes are set but never used.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The current error handling in setup_hw_stats has a couple of issues.
It is possible to generate a null pointer deference on the
kfree of hsag->attrs[i] because two of the early error exit paths
jump to the kfree when hsags NULL and not allocated. Fix this by
moving the kfree on stats and jumping to that, avoiding the hsag
freeing.
Secondly, there is a memory leak of stats if the hsag allocation
fails; instead of returning, jump to the kfree on stats.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
We should return the error code if ib_add_ibnl_clients() fails. The
current code returns success.
Fixes: 735c631ae9 ('IB/core: Register SA ibnl client during ib_core initialization')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When CONFIG_FRAME_WARN is set to 1024 bytes, which is useful to find
stack consumers, we get a warning in hfi1 driver.
drivers/infiniband/hw/hfi1/affinity.c: In function
‘hfi1_get_proc_affinity’:
drivers/infiniband/hw/hfi1/affinity.c:415:1: warning: the frame size of
1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
This change removes unneeded buf[1024] declaration and usage.
Fixes: f48ad614c1 ("IB/hfi1: Move driver out of staging")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
That extra tabs are misleading.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull parisc fixes from Helge Deller:
- Fix printk time stamps on SMP systems which got wrong due to a patch
which was added during the merge window
- Fix two bugs in the stack backtrace code: Races in module unloading
and possible invalid accesses to memory due to wrong instruction
decoding (Mikulas Patocka)
- Fix userspace crash when syscalls access invalid unaligned userspace
addresses. Those syscalls will now return EFAULT as expected.
(tagged for stable kernel series)
* 'parisc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Move die_if_kernel() prototype into traps.h header
parisc: Fix pagefault crash in unaligned __get_user() call
parisc: Fix printk time during boot
parisc: Fix backtrace on PA-RISC
Pull key handling update from James Morris:
"This alters a new keyctl function added in the current merge window to
allow for a future extension planned for the next merge window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Add placeholder for KDF usage with DH
The /dev/ptmx device node is changed to lookup the directory entry "pts"
in the same directory as the /dev/ptmx device node was opened in. If
there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx
uses that filesystem. Otherwise the open of /dev/ptmx fails.
The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that
userspace can now safely depend on each mount of devpts creating a new
instance of the filesystem.
Each mount of devpts is now a separate and equal filesystem.
Reserved ttys are now available to all instances of devpts where the
mounter is in the initial mount namespace.
A new vfs helper path_pts is introduced that finds a directory entry
named "pts" in the directory of the passed in path, and changes the
passed in path to point to it. The helper path_pts uses a function
path_parent_directory that was factored out of follow_dotdot.
In the implementation of devpts:
- devpts_mnt is killed as it is no longer meaningful if all mounts of
devpts are equal.
- pts_sb_from_inode is replaced by just inode->i_sb as all cached
inodes in the tty layer are now from the devpts filesystem.
- devpts_add_ref is rolled into the new function devpts_ptmx. And the
unnecessary inode hold is removed.
- devpts_del_ref is renamed devpts_release and reduced to just a
deacrivate_super.
- The newinstance mount option continues to be accepted but is now
ignored.
In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as
they are never used.
Documentation/filesystems/devices.txt is updated to describe the current
situation.
This has been verified to work properly on openwrt-15.05, centos5,
centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3,
ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1,
slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01. With the
caveat that on centos6 and on slackware-14.1 that there wind up being
two instances of the devpts filesystem mounted on /dev/pts, the lower
copy does not end up getting used.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jann@thejh.net>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the debian buildd servers had this crash in the syslog without
any other information:
Unaligned handler failed, ret = -2
clock_adjtime (pid 22578): Unaligned data reference (code 28)
CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111100000001111 Tainted: G E
r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff
CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff
ORIG_R28: 00000002369fe628
IAOQ[0]: compat_get_timex+0x2dc/0x3c0
IAOQ[1]: compat_get_timex+0x2e0/0x3c0
RP(r2): compat_get_timex+0x40/0x3c0
Backtrace:
[<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0
[<0000000040205024>] syscall_exit+0x0/0x14
This means the userspace program clock_adjtime called the clock_adjtime()
syscall and then crashed inside the compat_get_timex() function.
Syscalls should never crash programs, but instead return EFAULT.
The IIR register contains the executed instruction, which disassebles
into "ldw 0(sr3,r5),r9".
This load-word instruction is part of __get_user() which tried to read the word
at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The
unaligned handler is able to emulate all ldw instructions, but it fails if it
fails to read the source e.g. because of page fault.
The following program reproduces the problem:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
int main(void) {
/* allocate 8k */
char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
/* free second half (upper 4k) and make it invalid. */
munmap(ptr+4096, 4096);
/* syscall where first int is unaligned and clobbers into invalid memory region */
/* syscall should return EFAULT */
return syscall(__NR_clock_adjtime, 0, ptr+4095);
}
To fix this issue we simply need to check if the faulting instruction address
is in the exception fixup table when the unaligned handler failed. If it
is, call the fixup routine instead of crashing.
While looking at the unaligned handler I found another issue as well: The
target register should not be modified if the handler was unsuccessful.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
This patch fixes backtrace on PA-RISC
There were several problems:
1) The code that decodes instructions handles instructions that subtract
from the stack pointer incorrectly. If the instruction subtracts the
number X from the stack pointer the code increases the frame size by
(0x100000000-X). This results in invalid accesses to memory and
recursive page faults.
2) Because gcc reorders blocks, handling instructions that subtract from
the frame pointer is incorrect. For example, this function
int f(int a)
{
if (__builtin_expect(a, 1))
return a;
g();
return a;
}
is compiled in such a way, that the code that decreases the stack
pointer for the first "return a" is placed before the code for "g" call.
If we recognize this decrement, we mistakenly believe that the frame
size for the "g" call is zero.
To fix problems 1) and 2), the patch doesn't recognize instructions that
decrease the stack pointer at all. To further safeguard the unwind code
against nonsense values, we don't allow frame size larger than
Total_frame_size.
3) The backtrace is not locked. If stack dump races with module unload,
invalid table can be accessed.
This patch adds a spinlock when processing module tables.
Note, that for correct backtrace, you need recent binutils.
Binutils 2.18 from Debian 5 produce garbage unwind tables.
Binutils 2.21 work better (it sometimes forgets function frames, but at
least it doesn't generate garbage).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Pull drm fixes from Dave Airlie:
"A bunch of ARM drivers got into the fixes vibe this time around, so
this contains a bunch of fixes for imx, atmel hlcdc, arm hdlcd (only
so many combos of hlcd), mediatek and omap drm.
Other than that there is one mgag200 fix and a few core drm regression
fixes"
* tag 'drm-fixes-for-v4.7-rc2' of git://people.freedesktop.org/~airlied/linux: (34 commits)
drm/omap: fix unused variable warning.
drm: hdlcd: Add information about the underlying framebuffers in debugfs
drm: hdlcd: Cleanup the atomic plane operations
drm/hdlcd: Fix up crtc_state->event handling
drm: hdlcd: Revamp runtime power management
drm/mediatek: mtk_dsi: Remove spurious drm_connector_unregister
drm/mediatek: mtk_dpi: remove invalid error message
drm: atmel-hlcdc: fix a NULL check
drm: atmel-hlcdc: fix atmel_hlcdc_crtc_reset() implementation
drm/mgag200: Black screen fix for G200e rev 4
drm: Wrap direct calls to driver->gem_free_object from CMA
drm: fix fb refcount issue with atomic modesetting
drm: make drm_atomic_set_mode_prop_for_crtc() more reliable
drm/sti: remove extra mode fixup
drm: add missing drm_mode_set_crtcinfo call
drm/omap: include gpio/consumer.h where needed
drm/omap: include linux/seq_file.h where needed
Revert "drm/omap: no need to select OMAP2_DSS"
drm/omap: Remove regulator API abuse
OMAPDSS: HDMI5: Change DDC timings
...
Pull btrfs fixes from Chris Mason:
"The important part of this pull is Filipe's set of fixes for btrfs
device replacement. Filipe fixed a few issues seen on the list and a
number he found on his own"
* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent
Btrfs: fix race between device replace and read repair
Btrfs: fix race between device replace and discard
Btrfs: fix race between device replace and chunk allocation
Btrfs: fix race setting block group back to RW mode during device replace
Btrfs: fix unprotected assignment of the left cursor for device replace
Btrfs: fix race setting block group readonly during device replace
Btrfs: fix race between device replace and block group removal
Btrfs: fix race between readahead and device replace/removal