If CPU_UP_PREPARE is called it is not guaranteed, that a previously allocated
and assigned hash has been freed already, but perf_event_init_cpu()
unconditionally allocates and assignes a new hash if the swhash is referenced.
By overwriting the pointer the existing hash is not longer accessible.
Verify that there is no hash assigned on this cpu before allocating and
assigning a new one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.843269966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If CPU_DOWN_PREPARE fails the perf hotplug notifier is called for
CPU_DOWN_FAILED and calls perf_event_init_cpu(), which checks whether the
swhash is referenced. If yes it allocates a new hash and stores the pointer in
the per cpu data structure.
But at this point the cpu is still online, so there must be a valid hash
already. By overwriting the pointer the existing hash is not longer
accessible.
Remove the CPU_DOWN_FAILED state, as there is nothing to (re)allocate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.763417379@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If CPU_UP_PREPARE fails the perf hotplug code calls perf_event_exit_cpu(),
which is a pointless exercise. The cpu is not online, so the smp function
calls return -ENXIO. So the result is a list walk to call noops.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.682184765@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Looks like the HPET spec at intel.com got moved.
It isn't hard to find so drop the link, just mention
the revision assumed.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1455145462-3877-1-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit ca369d51b ("sd: Fix device-imposed transfer length limits")
introduced a new queue limit max_dev_sectors which limits the maximum
sectors for requests. The default value leads to small dasd requests
and therefor to a performance drop.
Set the max_dev_sectors value to the same value as the max_hw_sectors
to use the maximum available request size for DASD devices.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Data corruption issues were observed in tests which initiated
a system crash/reset while accessing BTT devices. This problem
is reproducible.
The BTT driver calls pmem_rw_bytes() to update data in pmem
devices. This interface calls __copy_user_nocache(), which
uses non-temporal stores so that the stores to pmem are
persistent.
__copy_user_nocache() uses non-temporal stores when a request
size is 8 bytes or larger (and is aligned by 8 bytes). The
BTT driver updates the BTT map table, which entry size is
4 bytes. Therefore, updates to the map table entries remain
cached, and are not written to pmem after a crash.
Change __copy_user_nocache() to use non-temporal store when
a request size is 4 bytes. The change extends the current
byte-copy path for a less-than-8-bytes request, and does not
add any overhead to the regular path.
Reported-and-tested-by: Micah Parrish <micah.parrish@hpe.com>
Reported-and-tested-by: Brian Boylston <brian.boylston@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: <stable@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe.com
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When fixing the DAT off bug ("s390: fix DAT off memory access, e.g.
on kdump") both Christian and I missed that we can save an additional
stnsm instruction.
This saves us a couple of cycles which could improve the speed of
memcpy_real.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In the error path of amd_uncore_cpu_up_prepare() the newly allocated uncore
struct is freed, but the percpu pointer still references it. Set it to NULL.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1602162302170.19512@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The qxl_gem_prime_mmap() function returns ENOSYS instead of -ENOSYS
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In the display resume path, move the calls to drm_vblank_on()
after the point when the display engine is running again.
Since changes were made to drm_update_vblank_count() in Linux 4.4+
to emulate hw vblank counters via vblank timestamping, the function
drm_vblank_on() now needs working high precision vblank timestamping
and therefore working scanout position queries at time of call.
These don't work before the display engine gets restarted, causing
miscalculation of vblank counter increments and thereby large forward
jumps in vblank count at display resume. These jumps can cause client
hangs on resume, or desktop hangs in the case of composited desktops.
Fix this Linux 4.4 regression by reordering calls accordingly.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: <stable@vger.kernel.org> # 4.4+
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
drm_vblank_offdelay can have three different types of values:
< 0 is to be always treated the same as dev->vblank_disable_immediate
= 0 is to be treated as "never disable vblanks"
> 0 is to be treated as disable immediate if kms driver wants it
that way via dev->vblank_disable_immediate. Otherwise it is
a disable timeout in msecs.
This got broken in Linux 3.18+ for the implementation of
drm_vblank_on. If the user specified a value of zero which should
always reenable vblank irqs in this function, a kms driver could
override the users choice by setting vblank_disable_immediate
to true. This patch fixes the regression and keeps the user in
control.
v2: Only reenable vblank if there are clients left or the user
requested to "never disable vblanks" via offdelay 0. Enabling
vblanks even in the "delayed disable" case (offdelay > 0) was
specifically added by Ville in commit cd19e52aee
("drm: Kick start vblank interrupts at drm_vblank_on()"),
but after discussion it turns out that this was done by accident.
Citing Ville: "I think it just ended up as a mess due to changing
some of the semantics of offdelay<0 vs. offdelay==0 vs.
disable_immediate during the review of the series. So yeah, given
how drm_vblank_put() works now, I'd just make this check for
offdelay==0."
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # 3.18+
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Changes to drm_update_vblank_count() in Linux 4.4 broke the
behaviour of the pre/post modeset functions as the new update
code doesn't deal with hw vblank counter resets inbetween calls
to drm_vblank_pre_modeset an drm_vblank_post_modeset, as it
should.
This causes mistreatment of such hw counter resets as counter
wraparound, and thereby large forward jumps of the software
vblank counter which in turn cause vblank event dispatching
and vblank waits to fail/hang --> userspace clients hang.
This symptom was reported on radeon-kms to cause a infinite
hang of KDE Plasma 5 shell's login procedure, preventing users
from logging in.
Fix this by detecting when drm_update_vblank_count() is called
inside a pre->post modeset interval. If so, clamp valid vblank
increments to the safe values 0 and 1, pretty much restoring
the update behavior of the old update code of Linux 4.3 and
earlier. Also reset the last recorded hw vblank count at call
to drm_vblank_post_modeset() to be safe against hw that after
modesetting, dpms on etc. only fires its first vblank irq after
drm_vblank_post_modeset() was already called.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> # 4.4+
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes a regression introduced by the new drm_update_vblank_count()
implementation in Linux 4.4:
Restrict the bump of the software vblank counter in drm_update_vblank_count()
to a safe maximum value of +1 whenever there is the possibility that
concurrent readers of vblank timestamps could be active at the moment,
as the current implementation of the timestamp caching and updating is
not safe against concurrent readers for calls to store_vblank() with a
bump of anything but +1. A bump != 1 would very likely return corrupted
timestamps to userspace, because the same slot in the cache could
be concurrently written by store_vblank() and read by one of those
readers in a non-atomic fashion and without the read-retry logic
detecting this collision.
Concurrent readers can exist while drm_update_vblank_count() is called
from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
irq callers. However, all those calls are happening with the vbl_lock
locked thereby preventing a drm_vblank_get(), so the vblank refcount
can't increase while drm_update_vblank_count() is executing. Therefore
a zero vblank refcount during execution of that function signals that
is safe for arbitrary counter bumps if called from outside vblank irq,
whereas a non-zero count is not safe.
Whenever the function is called from vblank irq, we have to assume concurrent
readers could show up any time during its execution, even if the refcount
is currently zero, as vblank irqs are usually only enabled due to the
presence of readers, and because when it is called from vblank irq it
can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
Therefore also restrict bumps to +1 when the function is called from vblank
irq.
Such bumps of more than +1 can happen at other times than reenabling
vblank irqs, e.g., when regular vblank interrupts get delayed by more
than 1 frame due to long held locks, long irq off periods, realtime
preemption on RT kernels, or system management interrupts.
A better solution would be to rewrite the timestamp caching to use
full seqlocks to allow concurrent writes and reads for arbitrary
vblank counter increments.
v2: Add code comment that this is essentially a hack and should
be replaced by a full seqlock implementation for caching of
timestamps.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # 4.4+
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Otherwise if a kms driver calls into drm_vblank_off() more than once
before calling drm_vblank_on() again, the redundant calls to
vblank_disable_and_save() will call drm_update_vblank_count()
while hw vblank counters and vblank timestamping are in a undefined
state during modesets, dpms off etc.
At least with the legacy drm helpers it is not unusual to
get multiple calls to drm_vblank_off and drm_vblank_on, e.g.,
half a dozen calls to drm_vblank_off and two calls to drm_vblank_on
were observed on radeon-kms during dpms-off -> dpms-on transition.
We don't no-op calls from atomic modesetting drivers, as they
should do a proper job of tracking hw state.
Fixes large jumps of the software maintained vblank counter due to
the hardware vblank counter resetting to zero during dpms off or
modeset, e.g., if radeon-kms is modified to use drm_vblank_off/on
instead of drm_vblank_pre/post_modeset().
This fixes a regression caused by the changes made to
drm_update_vblank_count() in Linux 4.4.
v2: Don't no-op on atomic modesetting drivers, per suggestion
of Daniel Vetter.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # 4.4+
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
This avoids integer overflows on 32bit machines when calculating
reloc_info size, as reported by Alan Cox.
Cc: stable@vger.kernel.org
Cc: gnomes@lxorguk.ukuu.org.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Summary:
- fix compilation warnings on ARM64bit.
- fix mic driver initialization.
. MIC is a part of KMS so it converts it to use component framework
like other KMS drivers did.
- fix wrong driver state and disable clock order on DECON driver.
- fix incorrect use of dma_mmap_attrs function.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos/decon: fix disable clocks order
drm/exynos: fix incorrect cpu address for dma_mmap_attrs()
drm/exynos: exynos5433_decon: fix wrong state in decon_vblank_enable
drm/exynos: exynos5433_decon: fix wrong state assignment in decon_enable
drm/exynos: dsi: restore support for drm bridge
drm/exynos: mic: make all functions static
drm/exynos: mic: convert to component framework
drm/exynos: mic: use devm_clk interface
drm/exynos: fix types for compilation on 64bit architectures
drm/exynos: ipp: fix incorrect format specifiers in debug messages
drm/exynos: depend on ARCH_EXYNOS for DRM_EXYNOS
This reverts commit cfcfa086d4.
This causes the tiling properties to break in some unexpected ways,
Revert it for now.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
If tcp_v4_inbound_md5_hash() returns an error, we must release
the refcount on the request socket, not on the listener.
The bug was added for IPv4 only.
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inode struct members that track cgroup writeback information
should be reinitialized when inode gets allocated from
kmem_cache. Otherwise, their values remain and get used by the
new inode.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Fixes: d10c809552 ("writeback: implement foreign cgroup inode bdi_writeback switching")
Signed-off-by: Jens Axboe <axboe@fb.com>
There are some cases where rtt_us derives from deltas of jiffies,
instead of using usec timestamps.
Since we want to track minimal rtt, better to assume a delta of 0 jiffie
might be in fact be very close to 1 jiffie.
It is kind of sad jiffies_to_usecs(1) calls a function instead of simply
using a constant.
Fixes: f672258391 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy has not been initialized, disconnecting it in the error
path results in a NULL pointer exception. Drop the phy_disconnect
from the error path.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell 88E6240 has been tested successfully without further
changes. Add entry to the table of supported devices.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 5266698661 ("tipc: let broadcast packet reception
use new link receive function") we introduced a new per-node
broadcast reception link instance. This link is created at the
moment the node itself is created. Unfortunately, the allocation
is done after the node instance has already been added to the node
lookup hash table. This creates a potential race condition, where
arriving broadcast packets are able to find and access the node
before it has been fully initialized, and before the above mentioned
link has been created. The result is occasional crashes in the function
tipc_bcast_rcv(), which is trying to access the not-yet existing link.
We fix this by deferring the addition of the node instance until after
it has been fully initialized in the function tipc_node_create().
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
Fixed autoneg logic and some related cleanups, fixed tx push operation,
and reduced default ring sizes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20G is not supported by production hardware and only the 40GbaseCR4 standard
is supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup bnxt_probe_phy() to cleanly separate 2 code blocks for autoneg
on and off. Autoneg flow control is possible only if autoneg is enabled.
In bnxt_get_settings(), Pause and Asym_Pause are always supported.
Only the advertisement bits change depending on the ethtool -A setting
in auto mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Determine autoneg on|off setting from link_info->autoneg. Using the
firmware returned setting can be misleading if autoneg is changed and
there hasn't been a phy update from the firmware.
2. If autoneg is disabled, link_info->autoneg should be set to 0 to
indicate both speed and flow control autoneg are disabled.
3. To enable autoneg flow control, speed autoneg must be enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent change to the mdb code confused the compiler to the point
where it did not realize that the port-group returned from
br_mdb_add_group() is always valid when the function returns a nonzero
return value, so we get a spurious warning:
net/bridge/br_mdb.c: In function 'br_mdb_add':
net/bridge/br_mdb.c:542:4: error: 'pg' may be used uninitialized in this function [-Werror=maybe-uninitialized]
__br_mdb_notify(dev, entry, RTM_NEWMDB, pg);
Slightly rearranging the code in br_mdb_add_group() makes the problem
go away, as gcc is clever enough to see that both functions check
for 'ret != 0'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9e8430f8d6 ("bridge: mdb: Passing the port-group pointer to br_mdb module")
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Kochetkov says:
====================
Fixes for rockchip EMAC
Here is a set of 3 patches what fix koops, memory leak and
rockchip EMAC hang. Tested on radxarock lite.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
EMAC could be disabled, while there is some sb_buff
in use. That buffers got lost for linux.
In order to reproduce run on device during active ethernet work:
ifconfig eth0 down
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EMAC reset internal tx ring pointer to zero at statup.
txbd_curr and txbd_dirty can be different from zero.
That cause ethernet transfer hang (no packets transmitted).
In order to reproduce, run on device:
ifconfig eth0 down
ifconfig eth0 up
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects the unaligned accesses seen on GRE TEB tunnels when
generating hash keys. Specifically what this patch does is make it so that
we force the use of skb_copy_bits when the GRE inner headers will be
unaligned due to NET_IP_ALIGNED being a non-zero value.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
mlx5 driver fixes for 4.5-rc2
We added here a patch from Matan and Alaa for addressing Linus comments on
the mess w.r.t reserved field names in the driver/firmware auto-generated file.
Once the patch hits linus tree, we'll ask Doug to rebase his tree on that
rc so both net-next and rdma-next development for 4.6 will be done under
the fixed robust form.
Also provided two patches that addresses the dynamic ndo initialization
issue of mlx5e netdevice.
Or and Saeed.
changes from V1: (Only first patch was changed)
In this V we fixed the issues addressed in Or's previous e-mail.
1. Offsets took into account two dimensional u8 arrays
2. Offsets took into account nesting unions and structs
3. Offsets for unions
4. Offsets for any reserved field
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently our netdevice ops is a one static global variable which
is referenced by all mlx5e netdevice instances. This can be
problematic when different driver instances do not share same
HW capabilities (e.g SRIOV PF and VFs probed to the host).
Now we have two constant global netdevice ops variables, one
for basic netdevice ops and the other with extended SRIOV ops,
on netdevice construction we choose the one suitable for
current device capabilities.
Fixes: 66e49dedad ("net/mlx5e: Add support for SR-IOV ndos")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently mlx5e_select_queue is redundant since num_tc is always 1.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5_ifc.h is a header file representing the API and ABI between
the driver to the firmware and hardware. This file is used from
both the mlx5_ib and mlx5_core drivers.
Previously, this file used incrementing counter to indicate
reserved fields, for example:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
u8 send[0x1];
u8 receive[0x1];
u8 write[0x1];
u8 read[0x1];
u8 reserved_0[0x1];
u8 srq_receive[0x1];
u8 reserved_1[0x1a];
};
If one developer implements through net-next feature A that uses
reserved_0, they replace it with featureA and renames reserved_1 to
reserved_0. In the same kernel cycle, a 2nd developer could implement
feature B through the rdma tree, that uses reserved_1 and split it to
featureB and a smaller reserved_1 field. This will cause a conflict
when the two trees are merged.
The source of this conflict is that the 1st developer changed *all*
reserved fields.
As Linus suggested, we change the layout of structs to:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
u8 send[0x1];
u8 receive[0x1];
u8 write[0x1];
u8 read[0x1];
u8 reserved_at_4[0x1];
u8 srq_receive[0x1];
u8 reserved_at_6[0x1a];
};
This makes the conflicts much more rare and preserves the locality of
changes.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This gets us functional GPU reset again, like we had until a refactor
at merge time. Tested with a little patch to stuff in a broken binner
job every 100 frames.
Signed-off-by: Eric Anholt <eric@anholt.net>
This may actually get us a feature that the closed driver didn't have:
turning off the GPU in between rendering jobs, while the V3D device is
still opened by the client.
There may be some tuning to be applied here to use autosuspend so that
we don't bounce the device's power so much, but in steady-state
GPU-bound rendering we keep the power on (since we keep multiple jobs
outstanding) and even if we power cycle on every job we can still
manage at least 680 fps.
More importantly, though, runtime PM will allow us to power off the
device to do a GPU reset.
v2: Switch #ifdef to CONFIG_PM not CONFIG_PM_SLEEP (caught by kbuild
test robot)
Signed-off-by: Eric Anholt <eric@anholt.net>
We were tracking the "where are the head pointers pointing" globally,
so if another job reused the same BOs and execution was at the same
point as last time we checked, we'd stop and trigger a reset even
though the GPU had made progress.
Signed-off-by: Eric Anholt <eric@anholt.net>
These ioctls end up getting exposed to fairly directly to GL users,
and having normal user operations print DRM errors is obviously wrong.
The message was originally to give us some idea of what happened when
a hang occurred, but we have a DRM_INFO from reset for that.
Signed-off-by: Eric Anholt <eric@anholt.net>
This caused the wait ioctls to claim that waiting had completed when
we actually got interrupted by a signal before it was done. Fixes
broken rendering throttling that produced serious lag in X window
dragging.
Signed-off-by: Eric Anholt <eric@anholt.net>