Transports need to be able to detect legacy-only
devices (at the moment, only balloon) so they can use the legacy path
to drive them.
Add a core API to do just that.
The implementation just blacklists balloon:
not too pretty, but let's not over-engineer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
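As a rough illustration of the check described above, the sketch below treats a device as legacy-only when its ID matches the balloon. The struct and function names are hypothetical stand-ins, not the actual core API; the value 5 is the balloon's virtio device ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Virtio device ID of the memory balloon. */
#define SKETCH_VIRTIO_ID_BALLOON 5

/* Hypothetical, trimmed-down device ID record. */
struct sketch_virtio_device_id {
	uint32_t device;
	uint32_t vendor;
};

/* Single-entry blacklist: only the balloon is treated as legacy-only. */
static bool sketch_device_is_legacy_only(struct sketch_virtio_device_id id)
{
	return id.device == SKETCH_VIRTIO_ID_BALLOON;
}
```

A transport could call such a helper before deciding whether to offer the modern interface for a device.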
We have no plans to support virtio 1.0 in balloon driver. Add an
explicit flag to mark it legacy only.
This will be used by follow-up patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Guests need to use the virtio scsi API, so export it to uapi.
This is also convenient for e.g. qemu, and will help us remember
that this file affects the ABI.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Note: for consistency, and to avoid sparse errors,
convert all fields, even those no longer in use
for virtio v1.0.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
virtio-blk has some legacy feature bits that modern drivers
must not negotiate, but that are needed for old legacy hosts
(that e.g. don't support virtio-scsi).
Allow a separate legacy feature table for such cases.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
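One plausible way to use a separate legacy feature table is sketched below. This is an assumption-laden illustration, not the core's actual negotiation code: the struct, the helper names and the rule "use the legacy table when the device does not offer VERSION_1 (bit 32)" are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_F_VERSION_1 32

/* Hypothetical driver description with an optional legacy table. */
struct sketch_driver {
	const unsigned int *feature_table;
	size_t feature_table_size;
	const unsigned int *feature_table_legacy;	/* may be NULL */
	size_t feature_table_size_legacy;
};

/* Turn a table of feature bit numbers into a 64-bit mask. */
static uint64_t sketch_table_to_mask(const unsigned int *table, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++)
		mask |= 1ULL << table[i];
	return mask;
}

/* Pick the table matching the kind of device, then intersect with its offer. */
static uint64_t sketch_negotiate(const struct sketch_driver *drv,
				 uint64_t device_features)
{
	uint64_t modern = sketch_table_to_mask(drv->feature_table,
					       drv->feature_table_size);
	uint64_t legacy = modern;	/* no legacy table: reuse the modern one */

	if (drv->feature_table_legacy)
		legacy = sketch_table_to_mask(drv->feature_table_legacy,
					      drv->feature_table_size_legacy);

	if (device_features & (1ULL << SKETCH_F_VERSION_1))
		return device_features & modern;
	return device_features & legacy;
}
```

With this shape, a bit listed only in the legacy table can never be negotiated with a device that offers VERSION_1.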
For virtio-1, we can theoretically have a more complex virtqueue
layout with avail and used buffers not in a memory area contiguous
with the descriptor table. For now, it's fine for a transport driver
to stay with the old layout. It needs, however, a way to access
the locations of the avail/used rings so it can register them with
the host.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
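For reference, the old contiguous layout mentioned above can be sketched as a set of byte offsets from the start of the ring memory. The names are hypothetical and the code only models the legacy layout (16-byte descriptors, 2-byte avail entries, used ring aligned to a power-of-two boundary); it is not the accessor API this change adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes per descriptor: addr (8) + len (4) + flags (2) + next (2). */
#define SKETCH_DESC_SIZE 16u

struct sketch_vring_offsets {
	size_t desc_off;
	size_t avail_off;
	size_t used_off;
};

/* Offsets of the three parts of a legacy vring with num entries. */
static struct sketch_vring_offsets sketch_vring_layout(unsigned int num,
							 size_t align)
{
	struct sketch_vring_offsets l;
	size_t avail_end;

	l.desc_off = 0;
	l.avail_off = (size_t)num * SKETCH_DESC_SIZE;
	/* avail ring: flags, idx, num entries and used_event, 2 bytes each */
	avail_end = l.avail_off + sizeof(uint16_t) * (3 + (size_t)num);
	/* used ring starts at the next multiple of align (a power of two) */
	l.used_off = (avail_end + align - 1) & ~(align - 1);
	return l;
}
```

A transport that registers ring addresses with the host would add these offsets to the ring's base address; with a non-contiguous layout the three addresses would instead be independent.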
We (ab)use virtio conversion functions for device-specific
config space accesses.
Based on original patches by Cornelia and Rusty.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
virtio 1.0 makes all memory structures LE, so
we need APIs to conditionally do a byteswap on BE
architectures.
To make it easier to check code statically,
add virtio specific types for multi-byte integers
in memory.
Add low level wrappers that do a byteswap conditionally; these will be
useful e.g. for vhost. Add high level wrappers that
query device endian-ness and act accordingly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
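A minimal sketch of the low level wrapper idea described above, for 16-bit values only. All names here are hypothetical stand-ins; the assumption is that a "little_endian" flag is true for virtio 1.0 structures and false for legacy structures, which are taken to be in the host's native byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a dedicated "virtio 16-bit integer" type. */
typedef uint16_t sketch_virtio16;

/* True when the CPU running this code stores integers little-endian. */
static bool sketch_host_is_le(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}

static uint16_t sketch_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Swap only when the data is little-endian and the host is big-endian. */
static uint16_t sketch_virtio16_to_cpu(bool little_endian, sketch_virtio16 val)
{
	if (little_endian && !sketch_host_is_le())
		return sketch_swab16(val);
	return val;
}

/* A 16-bit swap is its own inverse, so both directions share the logic. */
static sketch_virtio16 sketch_cpu_to_virtio16(bool little_endian, uint16_t val)
{
	return sketch_virtio16_to_cpu(little_endian, val);
}
```

A high level wrapper would derive the little_endian argument from the device (for example, from whether VERSION_1 was negotiated) instead of taking it from the caller.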
Change u32 to u64, and use BIT_ULL and 1ULL everywhere.
Note: transports are unchanged, and only set the low 32 bits.
This guarantees that no transport sets e.g. VERSION_1
by mistake without proper support.
Based on patch by Rusty.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
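The guarantee in the note above follows from plain integer widening, as this small sketch shows. The helper and macro names are hypothetical; the only outside fact used is that VERSION_1 is feature bit 32.

```c
#include <stdint.h>

#define SKETCH_BIT_ULL(n)	(1ULL << (n))
#define SKETCH_F_VERSION_1	32

/* An unchanged transport only ever reads a 32-bit feature word. */
static uint64_t sketch_transport_get_features(uint32_t host_features_lo)
{
	/* Zero-extended on return: bits 32..63 stay clear. */
	return host_features_lo;
}
```

Whatever the host reports in the 32-bit word, the VERSION_1 bit cannot appear in the 64-bit result until the transport is taught to read the high word.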
It seemed like a good idea to use a bitmap for features
in struct virtio_device, but it's actually a pain,
and seems to become even more painful when we get more
than 32 feature bits. Just change it to a u32 for now.
Based on patch by Rusty.
Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Add low level APIs to test/set/clear feature bits.
For use by transports, to make it easier to
write code independent of feature bit array format.
Note: the APIs are prefixed with __ and have a _bit suffix
to stress their low level nature. They are for use by transports only:
drivers should use virtio_has_feature and never need to set/clear
features.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
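The point of such helpers is that callers never touch the storage format directly. The sketch below assumes, purely for illustration, that features are kept in a single 64-bit word; the struct and function names are hypothetical, not the actual API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device with features stored in one 64-bit word. */
struct sketch_virtio_device {
	uint64_t features;
};

static bool sketch_test_bit(const struct sketch_virtio_device *vdev,
			    unsigned int fbit)
{
	return fbit < 64 && (vdev->features & (1ULL << fbit)) != 0;
}

static void sketch_set_bit(struct sketch_virtio_device *vdev,
			   unsigned int fbit)
{
	if (fbit < 64)
		vdev->features |= 1ULL << fbit;
}

static void sketch_clear_bit(struct sketch_virtio_device *vdev,
			     unsigned int fbit)
{
	if (fbit < 64)
		vdev->features &= ~(1ULL << fbit);
}
```

If the storage later changes (bitmap, u32, u64), only these three bodies need updating; transport code that calls them stays the same.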
Here are some Staging and IIO driver fixes for 3.18-rc7 that resolve a
number of reported issues, and a new device id for a staging wireless
driver.
All of these have been in linux-next.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'staging-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO driver fixes from Greg KH:
* tag 'staging-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8188eu: Add new device ID for DLink GO-USB-N150
staging: r8188eu: Fix scheduling while atomic error introduced in commit fadbe0cd
iio: accel: bmc150: set low default thresholds
iio: accel: bmc150: Fix iio_event_spec direction
iio: accel: bmc150: Send x, y and z motion separately
iio: accel: bmc150: Error handling when mode set fails
iio: gyro: bmg160: Fix iio_event_spec direction
iio: gyro: bmg160: Send x, y and z motion separately
iio: gyro: bmg160: Don't let interrupt mode to be open drain
iio: gyro: bmg160: Error handling when mode set fails
iio: adc: men_z188_adc: Add terminating entry for men_z188_ids
iio: accel: kxcjk-1013: Fix kxcjk10013_set_range
iio: Fix IIO_EVENT_CODE_EXTRACT_DIR bit mask
Merge tag 'iio-fixes-for-3.18c' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
Third set of IIO fixes for the 3.18 cycle.
Most of these are fairly standard little fixes. The bmc150 and bmg160 patches
make an ABI change to indicate a specific axis in an event rather
than the generic option in the original drivers. As both of these drivers
are new in this cycle it would be ideal to push this minor change through
even though it isn't strictly a fix. A couple of other 'fixes' change
defaults for some settings on these new drivers to more intuitive values.
Looks like some useful feedback has been coming in for this driver
since it was applied.
* IIO_EVENT_CODE_EXTRACT_DIR bit mask was wrong and has been for a while;
0xCF clearly doesn't give a contiguous bitmask.
* kxcjk-1013 range setting was failing to mask out the previous value
in the register and hence was 'enable only'.
* men_z188 device id table wasn't null terminated.
* bmg160 and bmc150 both failed to correctly handle an error in mode
setting.
* bmg160 and bmc150 both had a bug in setting the event direction in the
event spec (leads to an attribute name being incorrect)
* bmg160 defaulted to an open drain output for the interrupt - as a default
this obviously only works with some interrupt chips - hence change the
default to push-pull (note this is a new driver so we aren't going to
cause any regressions with this change).
* bmc150 had an unintuitive default for the rate of change (motion detector)
so change it to 0 (new driver so change of default won't cause any
regressions).
This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.
The problem being addressed by the patch above was that some ARM code
based the memory mapping attributes of a pfn on the return value of
kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
be mapped as device memory.
However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
and the existing non-ARM users were already using it in a way which
suggests that its name should probably have been 'kvm_is_reserved_pfn'
from the beginning, e.g., whether or not to call get_page/put_page on
it etc. This means that returning false for the zero page is a mistake
and the patch above should be reverted.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull powerpc fixes from Ben Herrenschmidt:
"This series fixes a nasty issue with radeon adapters on powerpc servers,
it's all CC'ed stable and has the relevant maintainers' acks/reviews.
Basically, some (radeon) adapters have issues with MSI addresses above
1T (only support 40-bits). We had a powerpc specific quirk but it only
listed a specific revision of an adapter that we shipped with our
machines and didn't properly handle the audio function which some
distros enable nowadays.
So we made the quirk generic and fixed both the graphic and audio
drivers properly to use it.
Without that, ppc64 server machines will crash at boot with a radeon
adapter.
Note: This has been brewing for a while, it just needed a last respin
which got delayed due to us moving ozlabs to a new location in town
and other such things taking priority"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/pci: Remove unused force_32bit_msi quirk
powerpc/pseries: Honor the generic "no_64bit_msi" flag
powerpc/powernv: Honor the generic "no_64bit_msi" flag
sound/radeon: Move 64-bit MSI quirk from arch to driver
gpu/radeon: Set flag to indicate broken 64-bit MSI
PCI/MSI: Add device flag indicating that 64-bit MSIs don't work
ALSA: hda - Limit 40bit DMA for AMD HDMI controllers
Merge tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux
Pull clock fixes from Mike Turquette:
"The fixes for the clock framework are all regressions in drivers, plus
a single fix in one of the basic clock templates. No fixes to the
core this time around.
As with most clock driver fixes these run the gamut from fixing a
build warning to fixing wrecked memory timings, with a little USB
tossed in for fun"
* tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux:
clk: pxa: fix pxa27x CCCR bit usage
clk-divider: Fix READ_ONLY when divider > 1
clk: qcom: Fix duplicate rbcpr clock name
clk: at91: usb: fix at91sam9x5 recalc, round and set rate
clk: at91: usb: fix at91rm9200 round and set rate
This can be set by quirks/drivers to be used by the architecture code
that assigns the MSI addresses.
We additionally add verification in the core MSI code that the values
assigned by the architecture do satisfy the limitation in order to fail
gracefully if they don't (ie. the arch hasn't been updated to deal with
that quirk yet).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
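The verification step described above can be sketched as follows. The struct names, the helper name and the choice of -EIO as the error code are assumptions for illustration; only the flag name no_64bit_msi comes from the text.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down MSI message: a 64-bit address split in two. */
struct sketch_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Hypothetical device carrying the quirk flag. */
struct sketch_pci_dev {
	bool no_64bit_msi;
};

/*
 * Reject an address above 4G for a device that cannot handle it, so an
 * architecture that ignores the flag fails cleanly instead of misbehaving.
 */
static int sketch_validate_msi(const struct sketch_pci_dev *dev,
			       const struct sketch_msi_msg *msg)
{
	if (dev->no_64bit_msi && msg->address_hi != 0)
		return -EIO;
	return 0;
}
```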
Pull percpu fix from Tejun Heo:
"This contains one patch to fix a race condition which can lead to
percpu_ref using a percpu pointer which is corrupted with a set DEAD
bit. The bug was introduced while separating out the ATOMIC mode flag
from the DEAD flag. The fix is pretty straightforward.
I just committed the patch to the percpu tree but am sending out the
pull request early as I'll be on vacation for a week. The patch
should be fairly safe and while the latency will be higher I'll be
checking emails"
* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-ref: fix DEAD flag contamination of percpu pointer
While decoupling ATOMIC and DEAD flags, f47ad45784 ("percpu_ref:
decouple switching to percpu mode and reinit") updated
__ref_is_percpu() so that it only tests ATOMIC flag to determine
whether the ref is in percpu mode or not; however, while DEAD implies
ATOMIC, the two flags are set separately during percpu_ref_kill() and
if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
ATOMIC. Because __ref_is_percpu() returns @ref->percpu_count_ptr
value verbatim as the percpu pointer after testing ATOMIC, the pointer
may now be contaminated with the DEAD flag.
This can be fixed by clearing the flag bits before returning the
pointer which was the fix proposed by Shaohua; however, as DEAD
implies ATOMIC, we can just test for both flags at once and avoid the
explicit masking.
Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
are clear before returning @ref->percpu_count_ptr as the percpu
pointer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-Reviewed-by: Shaohua Li <shli@kernel.org>
Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
Fixes: f47ad45784 ("percpu_ref: decouple switching to percpu mode and reinit")
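The fix described above, testing both flags at once, can be modelled in plain C. This is a simplified sketch with hypothetical names: the flags are assumed to live in the two low bits of the pointer word, and the memory-ordering details of the real helper are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits carried in the low bits of the percpu pointer word. */
#define SKETCH_REF_ATOMIC	((uintptr_t)1 << 0)
#define SKETCH_REF_DEAD		((uintptr_t)1 << 1)
#define SKETCH_REF_FLAGS	(SKETCH_REF_ATOMIC | SKETCH_REF_DEAD)

/*
 * Report percpu mode only when both flags are clear. A racing kill may
 * have set DEAD without ATOMIC yet; testing ATOMIC alone would then hand
 * back a pointer value that still has the DEAD bit in it.
 */
static bool sketch_ref_is_percpu(uintptr_t percpu_count_ptr,
				 uintptr_t *percpu_countp)
{
	if (percpu_count_ptr & SKETCH_REF_FLAGS)
		return false;

	*percpu_countp = percpu_count_ptr;
	return true;
}
```

Because the pointer is only returned when no flag bit is set, no explicit masking is needed before using it.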
Pull networking fixes from David Miller:
1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl.
2) nf_nat_range is not fully initialized and this is copied back to
userspace, from Daniel Borkmann.
3) Fix read past end of buffer in netfilter ipset, also from Dan
Carpenter.
4) Signed integer overflow in ipv4 address mask creation helper
inet_make_mask(), from Vincent BENAYOUN.
5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods to
properly describe the device's capabilities, from Joe Stringer.
6) Fix memory leaks and checksum miscalculations in openvswitch, from
Pravin B Shelar and Jesse Gross.
7) FIB rules pass back an ambiguous error code for unreachable routes,
making behavior confusing for userspace. Fix from Panu Matilainen.
8) ieee802154fake_probe() doesn't release resources properly on error,
from Alexey Khoroshilov.
9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann.
10) Fix access of stale slave pointers in bonding code, from Nikolay
Aleksandrov.
11) Fix stack info leak in PPP pptp code, from Mathias Krause.
12) Cure locking bug in IPX stack, from Jiri Bohac.
13) Revert SKB fclone memory freeing optimization that is racy and can
allow accesses to freed up memory, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (71 commits)
tcp: Restore RFC5961-compliant behavior for SYN packets
net: Revert "net: avoid one atomic operation in skb_clone()"
virtio-net: validate features during probe
cxgb4 : Fix DCB priority groups being returned in wrong order
ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg
openvswitch: Don't validate IPv6 label masks.
pptp: fix stack info leak in pptp_getname()
brcmfmac: don't include linux/unaligned/access_ok.h
cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection
ipv6: delete protocol and unregister rtnetlink when cleanup
net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
bonding: fix curr_active_slave/carrier with loadbalance arp monitoring
mac80211: minstrel_ht: fix a crash in rate sorting
vxlan: Inline vxlan_gso_check().
can: m_can: update to support CAN FD features
can: m_can: fix incorrect error messages
can: m_can: add missing delay after setting CCCR_INIT bit
can: m_can: fix not set can_dlc for remote frame
can: m_can: fix possible sleep in napi poll
can: m_can: add missing message RAM initialization
...
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: two NUMA fixes, two cputime fixes and an RCU/lockdep fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency
sched/cputime: Fix cpu_timer_sample_group() double accounting
sched/numa: Avoid selecting oneself as swap target
sched/numa: Fix out of bounds read in sched_init_numa()
sched: Remove lockdep check in sched_move_task()
Pull core fix from Ingo Molnar:
"Fix GENMASK macro shift overflow"
Nobody seems to currently use GENMASK() to fill every single last bit
(which is what overflows) in-tree, and gcc would warn about it, so we
have that going for us. But apparently there are pending changes that
want this.
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
bitops: Fix shift overflow in GENMASK macros
The CAN device drivers can use can_is_canfd_skb() to check if the frame to send
is in CAN FD mode or normal CAN mode.
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Dong Aisheng <b29396@freescale.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
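A sketch of how such a check can work, assuming the distinction is made on buffer length: a classic CAN frame structure is 16 bytes and a CAN FD frame structure is 72 bytes. The helper below is a hypothetical stand-in that takes the length directly instead of an skb.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CAN_MTU		16u	/* size of a classic CAN frame struct */
#define SKETCH_CANFD_MTU	72u	/* size of a CAN FD frame struct */

/* A buffer holding exactly one CAN FD frame has the CAN FD length. */
static bool sketch_is_canfd_len(size_t skb_len)
{
	return skb_len == SKETCH_CANFD_MTU;
}
```

A driver's transmit path could branch on this result to decide whether to program the controller for an FD frame or a classic frame.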
Commit 79c6ab5095 (clk: divider: add CLK_DIVIDER_READ_ONLY flag) in
v3.16 introduced the CLK_DIVIDER_READ_ONLY flag which caused the
recalc_rate() and round_rate() clock callbacks to be omitted.
However using this flag has the unfortunate side effect of causing the
clock recalculation code when a clock rate change is attempted to always
treat it as a pass-through clock, i.e. with a fixed divide of 1, which
may not be the case. Child clock rates are then recalculated using the
wrong parent rate.
Therefore instead of dropping the recalc_rate() and round_rate()
callbacks, alter clk_divider_bestdiv() to always report the current
divider as the best divider so that it is never altered.
For me the read only clock was the system clock, which divided the PLL
rate by 2, from which both the UART and the SPI clocks were divided.
Initial setting of the UART rate set it correctly, but when the SPI
clock was set, the other child clocks were miscalculated. The UART clock
was recalculated using the PLL rate as the parent rate, resulting in a
UART new_rate of double what it should be, and a UART which spewed forth
garbage when the rate changes were propagated.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Thomas Abraham <thomas.ab@samsung.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Max Schwarz <max.schwarz@online.de>
Cc: <stable@vger.kernel.org> # v3.16+
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
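The arithmetic behind the UART symptom can be sketched as below. The helper names and the 100 MHz figure are hypothetical; the code only models "rate = parent rate / divider" and the idea of always reporting the current divider for a read-only clock.

```c
#include <stdbool.h>

/* Rate of a divider clock given its parent rate and divider. */
static unsigned long sketch_recalc_rate(unsigned long parent_rate,
					unsigned int div)
{
	return parent_rate / div;
}

/*
 * For a read-only divider the best divider is always the current one,
 * so a rate change can never alter it.
 */
static unsigned int sketch_bestdiv(bool read_only, unsigned int current_div,
				   unsigned int wanted_div)
{
	return read_only ? current_div : wanted_div;
}
```

With an assumed 100 MHz PLL and a read-only system clock dividing by 2, the children should see a 50 MHz parent. Treating the system clock as a pass-through (divider 1) hands them 100 MHz instead, which is the doubled UART rate described above.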
While looking over the cpu-timer code I found that we appear to add
the delta for the calling task twice, through:
  cpu_timer_sample_group()
    thread_group_cputimer()
      thread_group_cputime()
        times->sum_exec_runtime += task_sched_runtime();
    *sample = cputime.sum_exec_runtime + task_delta_exec();
Which would make the sample run ahead, making the sleep short.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On some 32-bit architectures, including x86, GENMASK(31, 0) returns 0
instead of the expected ~0UL.
The same happens on some 64-bit architectures with GENMASK_ULL(63, 0).
This is due to an overflow in the shift operand, 1 << 32 for GENMASK,
1 << 64 for GENMASK_ULL.
Reported-by: Eric Paire <eric.paire@st.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.13+
Cc: linux@rasmusvillemoes.dk
Cc: gong.chen@linux.intel.com
Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Fixes: 10ef6b0dff ("bitops: Introduce a more generic BITMASK macro")
Link: http://lkml.kernel.org/r/1415267659-10563-1-git-send-email-maxime.coquelin@st.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
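One way to build the mask without ever shifting by the full type width is sketched below for the 64-bit case. The macro name is hypothetical, and the sketch assumes unsigned long long is exactly 64 bits wide and that 0 <= l <= h <= 63.

```c
#include <stdint.h>

/*
 * Overflowing form, per the description above: computing
 * ((1 << (h - l + 1)) - 1) << l shifts 1 by the full type width
 * when the mask is meant to cover every bit.
 *
 * Overflow-free form: AND together "all bits from l upwards" and
 * "all bits from h downwards". Both shift counts stay in 0..63.
 */
#define SKETCH_GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
```

For example, SKETCH_GENMASK_ULL(63, 0) shifts by 0 on both sides and yields all ones, which is the case the overflowing form gets wrong.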
Merge tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- stable patches to fix NFSv4.x delegation reclaim error paths
- fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2
features
- fix a use-after-free problem with pNFS block layouts
- fix a memory leak in the pNFS files O_DIRECT code
- replace an intrusive and Oops-prone performance fix in the NFSv4
atomic open code with a safer one-line version and revert the two
original patches"
* tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
NFS: Don't try to reclaim delegation open state if recovery failed
NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked
NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE
NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
NFS: SEEK is an NFS v4.2 feature
nfs: Fix use of uninitialized variable in nfs_getattr()
nfs: Remove bogus assignment
nfs: remove spurious WARN_ON_ONCE in write path
pnfs/blocklayout: serialize GETDEVICEINFO calls
nfs: fix pnfs direct write memory leak
Revert "NFS: nfs4_do_open should add negative results to the dcache."
Revert "NFS: remove BUG possibility in nfs4_open_and_get_state"
NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT
The direction field is 7 bits wide, thus we need to AND it with a 0111 1111 mask
in order to retrieve it, that is 0x7F, not 0xCF as it is now.
Fixes: ade7ef7ba (staging:iio: Differential channel handling)
Signed-off-by: Cristina Ciocan <cristina.ciocan@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
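The effect of the wrong mask is easy to see on a small example. In the sketch below the names are hypothetical and the position of the direction field (a shift of 48) is an assumption made for illustration; the two mask values come from the text above.

```c
#include <stdint.h>

#define SKETCH_DIR_SHIFT	48	/* assumed position of the field */
#define SKETCH_DIR_MASK_OLD	0xCF	/* 1100 1111: bits 4 and 5 missing */
#define SKETCH_DIR_MASK_NEW	0x7F	/* 0111 1111: contiguous 7 bits */

/* Extract the direction field with the corrected, contiguous mask. */
static unsigned int sketch_extract_dir(uint64_t code)
{
	return (unsigned int)((code >> SKETCH_DIR_SHIFT) & SKETCH_DIR_MASK_NEW);
}

/* Same extraction with the old mask, kept only for comparison. */
static unsigned int sketch_extract_dir_old(uint64_t code)
{
	return (unsigned int)((code >> SKETCH_DIR_SHIFT) & SKETCH_DIR_MASK_OLD);
}
```

Any direction value that uses bit 4 or bit 5 is mangled by the old mask, and bit 7, which lies outside the 7-bit field, leaks into the result.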
There could be a signed overflow in the following code.
The expression (32-logmask) lies between 0 and 31 inclusive.
It may be equal to 31.
In such a case the left shift will produce a signed integer overflow.
According to the C99 Standard, this is undefined behavior.
A simple fix is to replace the signed int 1 with the unsigned int 1U.
Signed-off-by: Vincent BENAYOUN <vincent.benayoun@trust-in-soft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
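A host-byte-order sketch of the mask computation with the 1U fix applied. The function name is hypothetical and the conversion to network byte order is deliberately left out; the sketch assumes logmask is a prefix length in the range 0..32.

```c
#include <stdint.h>

/*
 * Build an IPv4 netmask with logmask leading one bits.
 * With a signed 1, logmask == 1 would compute 1 << 31, which overflows
 * a 32-bit int. Using 1U keeps the shift well defined for 0..31.
 */
static uint32_t sketch_make_mask(int logmask)
{
	if (logmask)
		return ~((1U << (32 - logmask)) - 1);
	return 0;
}
```

The logmask == 0 case is handled separately because it would otherwise require a shift by 32, which is undefined even for unsigned operands.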
Merge tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are three regression fixes, two recent (generic power domains,
suspend-to-idle) and one older (cpufreq), an ACPI blacklist entry for
one more machine having problems with Windows 8 compatibility, a minor
cpufreq driver fix (cpufreq-dt) and a fixup for new callback
definitions (generic power domains).
Specifics:
- Fix a crash in the suspend-to-idle code path introduced by a recent
commit that forgot to check a pointer against NULL before
dereferencing it (Dmitry Eremin-Solenikov).
- Fix a boot crash on Exynos5 introduced by a recent commit making
that platform use generic Device Tree bindings for power domains
which exposed a weakness in the generic power domains framework
leading to that crash (Ulf Hansson).
- Fix a crash during system resume on systems where cpufreq depends
on Operation Performance Points (OPP) for functionality, but
CONFIG_OPP is not set. This leads the cpufreq driver registration
to fail, but the resume code attempts to restore the pre-suspend
cpufreq configuration (which does not exist) nevertheless and
crashes. From Geert Uytterhoeven.
- Add a new ACPI blacklist entry for Dell Vostro 3546 that has
problems if it is reported as Windows 8 compatible to the BIOS
(Adam Lee).
- Fix swapped arguments in an error message in the cpufreq-dt driver
(Abhilash Kesavan).
- Fix up the prototypes of new callbacks in struct generic_pm_domain
to make them more useful. Users of those callbacks will be added
in 3.19 and it's better for them to be based on the correct struct
definition in mainline from the start. From Ulf Hansson and Kevin
Hilman"
* tag 'pm+acpi-3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Domains: Fix initial default state of the need_restore flag
PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set
PM / Domains: Change prototype for the attach and detach callbacks
cpufreq: Avoid crash in resume on SMP without OPP
cpufreq: cpufreq-dt: Fix arguments in clock failure error message
ACPI / blacklist: blacklist Win8 OSI for Dell Vostro 3546
* pm-domains:
PM / Domains: Fix initial default state of the need_restore flag
PM / Domains: Change prototype for the attach and detach callbacks
* pm-sleep:
PM / sleep: Fix entering suspend-to-IDLE if no freeze_oops is set
* pm-cpufreq:
cpufreq: Avoid crash in resume on SMP without OPP
cpufreq: cpufreq-dt: Fix arguments in clock failure error message
Pull networking fixes from David Miller:
1) sunhme driver lacks DMA mapping error checks, based upon a report by
Meelis Roos.
2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.
3) DMA memory allocation sizes are wrong in systemport ethernet driver,
fix from Florian Fainelli.
4) Fix use after free in mac80211 defragmentation code, from Johannes
Berg.
5) Some networking uapi headers missing from Kbuild file, from Stephen
Hemminger.
6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
and macvtap has a similar bug, from Herbert Xu.
7) Adjust several tunneling drivers to set dev->iflink after registry,
because registry sets that to -1 overwriting whatever we did. From
Steffen Klassert.
8) Geneve forgets to set inner tunneling type, causing GSO segmentation
to fail on some NICs. From Jesse Gross.
9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
Giuseppe CAVALLARO.
10) Fix spurious timeouts with NewReno on low traffic connections, from
Marcelo Leitner.
11) Fix descriptor updates in enic driver, from Govindarajulu
Varadarajan.
12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
Fix from Takashi Iwai.
13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
Borkmann.
14) psock_fanout selftest accesses past the end of the mmap ring, fix
from Shuah Khan.
15) Fix PTP timestamping for VLAN packets, from Richard Cochran.
16) netlink_unbind() calls in netlink pass wrong initial argument, from
Hiroaki SHIMODA.
17) vxlan socket reuse accidentally reuses a socket when the address
family is different, so we have to explicitly check this, from
Marcelo Leitner.
18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
and other architectures, from Guenter Roeck.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
vxlan: Do not reuse sockets for a different address family
smsc911x: power-up phydev before doing a software reset.
lib: rhashtable - Remove weird non-ASCII characters from comments
net/smsc911x: Fix delays in the PHY enable/disable routines
net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
netlink: Properly unbind in error conditions.
net: ptp: fix time stamp matching logic for VLAN packets.
cxgb4 : dcb open-lldp interop fixes
selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
net: bcmgenet: apply MII configuration in bcmgenet_open()
net: bcmgenet: connect and disconnect from the PHY state machine
net: qualcomm: Fix dependency
ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
net: phy: Correctly handle MII ioctl which changes autonegotiation.
ipv6: fix IPV6_PKTINFO with v4 mapped
net: sctp: fix memory leak in auth key management
net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
net: ppp: Don't call bpf_prog_create() in ppp_lock
net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
cxgb4 : Fix bug in DCB app deletion
...
In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.
But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory. As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.
Even if the memory on that node has not been onlined,
/sys/devices/system/node/nodeXXX/meminfo shows a wrong value:
hot-add node2 (memory not onlined)
cat /sys/devices/system/node/node2/meminfo
Node 2 MemTotal: 33554432 kB
Node 2 MemFree: 0 kB
Node 2 MemUsed: 33554432 kB
Node 2 Active: 0 kB
This patch fixes the problem by resetting the node's managed pages to 0
after hot-adding a new node.
1. Move reset_managed_pages_done from reset_node_managed_pages() to
reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
is initialized
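The three steps above can be sketched as a small userspace model. This is
only an illustration, not the kernel code: the *_model types and
functions and the MAX_NR_ZONES value are assumptions made up for the
sketch, and only the managed_pages bookkeeping is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NR_ZONES 4	/* illustrative; the real value depends on config */

/* Simplified stand-ins for the kernel's struct zone and pg_data_t. */
struct zone_model {
	unsigned long present_pages;
	unsigned long managed_pages;
};

struct pgdat_model {
	struct zone_model node_zones[MAX_NR_ZONES];
};

/* Models free_area_init_core(): managed_pages starts as an estimate. */
static void init_core_model(struct pgdat_model *pgdat,
			    unsigned long pages_per_zone)
{
	for (size_t i = 0; i < MAX_NR_ZONES; i++) {
		pgdat->node_zones[i].present_pages = pages_per_zone;
		pgdat->node_zones[i].managed_pages = pages_per_zone;
	}
}

/* Models reset_node_managed_pages(): nothing is managed until onlining. */
static void reset_node_managed_pages_model(struct pgdat_model *pgdat)
{
	assert(pgdat != NULL);
	for (size_t i = 0; i < MAX_NR_ZONES; i++)
		pgdat->node_zones[i].managed_pages = 0;
}

/* Models the hot-add path after the fix: initialize, then reset. */
static void hotadd_new_pgdat_model(struct pgdat_model *pgdat,
				   unsigned long pages_per_zone)
{
	init_core_model(pgdat, pages_per_zone);
	reset_node_managed_pages_model(pgdat);
}

/* Sum of managed pages over all zones, i.e. what MemTotal is derived from. */
static unsigned long node_managed_pages(const struct pgdat_model *pgdat)
{
	unsigned long total = 0;

	for (size_t i = 0; i < MAX_NR_ZONES; i++)
		total += pgdat->node_zones[i].managed_pages;
	return total;
}
```

In this model a freshly hot-added node reports zero managed pages until
something onlines memory, while present_pages is left untouched.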
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before describing the bugs themselves, I first explain the definition of
freepage.
1. pages on buddy list are counted as freepage.
2. pages on isolate migratetype buddy list are *not* counted as freepage.
3. pages on cma buddy list are counted as CMA freepage, too.
Now, I describe the problems and the related patches.
Patch 1: There are race conditions when getting the pageblock
migratetype; they result in misplacement of freepages on the buddy list,
an incorrect freepage count and unavailability of freepages.
Patch 2: Freepages on the pcp list can carry stale cached information
about which migratetype buddy list they should go to. This causes
misplacement of freepages on the buddy list and an incorrect freepage
count.
Patch 4: Merging freepages across pageblocks of different migratetypes
causes a freepage accounting problem. This patch fixes it.
Without patchset [3], the above problems don't happen in my CMA
allocation test, because CMA reserved pages aren't used at all. So there
is no chance for the above race.
With patchset [3], I ran a simple CMA allocation test and got the result
below:
- Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
- run kernel build (make -j16) in the background
- 30 CMA allocation attempts (8MB * 30 = 240MB) at 5 sec intervals
- Result: the freepage count is off by more than 5000
With patchset [3] and this patchset, I found that no freepage counts are
missed, so I conclude that the problems are solved.
These problems also occur in my simple memory offlining test.
This patch (of 4):
There are two paths to reach the core free function of the buddy
allocator, __free_one_page(): one is free_one_page()->__free_one_page()
and the other is
free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each path has a race condition causing serious problems. This patch
focuses on the first freepath; a following patch solves the problem in
the second freepath.
In the first freepath, we get the migratetype of the page being freed
without holding the zone lock, so it can be racy. There are two cases
of this race.
1. pages are added to isolate buddy list after restoring original
migratetype
        CPU1                                    CPU2
get migratetype => return MIGRATE_ISOLATE
call free_one_page() with MIGRATE_ISOLATE
                                                grab the zone lock
                                                unisolate pageblock
                                                release the zone lock
grab the zone lock
call __free_one_page() with MIGRATE_ISOLATE
freepage goes into isolate buddy list,
although pageblock is already unisolated
This may cause two problems. One is that we can't use this page anymore
until the next isolation attempt of this pageblock, because the freepage
is on the isolate buddy list. The other is that freepage accounting could
be wrong due to merging between different buddy lists. Freepages on the
isolate buddy list aren't counted as freepages, but ones on the normal
buddy list are. If a merge happens, a buddy freepage on the normal buddy
list is inevitably moved to the isolate buddy list without any
consideration of freepage accounting, so the count could be incorrect.
2. pages are added to normal buddy list while pageblock is isolated.
It is similar to the above case.
This also may cause two problems. One is that we can't keep these
freepages from being allocated. Although this pageblock is isolated,
the freepage would be added to the normal buddy list so that it could be
allocated without any restriction. The other problem is the same as in
case 1, that is, incorrect freepage accounting.
This race condition can be prevented by checking the migratetype again
while holding the zone lock. Because that is a somewhat heavy operation
and isn't needed in the common case, we want to avoid rechecking as much
as possible. So this patch introduces a new variable,
nr_isolate_pageblock, in struct zone to check whether there is an
isolated pageblock. With this, we can avoid re-checking the migratetype
in the common case and do it only if there is an isolated pageblock or
the migratetype is MIGRATE_ISOLATE. This solves the above-mentioned
problems.
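The re-check described above can be sketched as follows. This is a
simplified userspace model rather than the actual mm code: the zone is
reduced to a single pageblock, the lock is a plain flag, and every
*_model name is an assumption introduced for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

enum migratetype_model { MT_MOVABLE, MT_ISOLATE };

struct zone_model {
	bool locked;				/* stands in for zone->lock */
	unsigned long nr_isolate_pageblock;	/* counter added by this patch */
	enum migratetype_model pageblock_mt;	/* current type of one pageblock */
	unsigned long nr_free;			/* freepage count */
};

static bool has_isolate_pageblock(const struct zone_model *zone)
{
	return zone->nr_isolate_pageblock != 0;
}

/*
 * Models free_one_page(): 'cached_mt' was read without the lock and may
 * be stale. Returns the migratetype the page was actually freed with.
 */
static enum migratetype_model
free_one_page_model(struct zone_model *zone, enum migratetype_model cached_mt)
{
	enum migratetype_model mt = cached_mt;

	assert(!zone->locked);
	zone->locked = true;			/* take the zone lock */

	/* Re-check only in the rare cases where the cached value can be wrong. */
	if (has_isolate_pageblock(zone) || mt == MT_ISOLATE)
		mt = zone->pageblock_mt;

	/* Pages on the isolate list are not counted as freepages. */
	if (mt != MT_ISOLATE)
		zone->nr_free++;

	zone->locked = false;			/* release the zone lock */
	return mt;
}
```

Both racy cases end up with the migratetype that is current under the
lock, while the common case (no isolated pageblock, non-isolate cached
type) skips the lookup entirely.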
Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, the above-mentioned case 1 could
happen.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Rabin Vincent found a way that tracing could cause an infinite loop in
the kernel. The splice logic wants a full page from the ring buffer
but the ring_buffer_wait() returns when there's any data in the ring
buffer. The splice code would then continue the loop waiting for a
full page. But if a full page never happens, the splice code will
never sleep and just continue to loop.
There's another case that Rabin fixed that could loop if there's no
memory and kmalloc() constantly returns NULL"
* tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Do not risk busy looping in buffer splice
tracing: Do not busy wait in buffer splice
For pNFS direct writes, layout driver may dynamically allocate ds_cinfo.buckets.
So we need to take care to free them when freeing dreq.
Ideally this needs to be done inside layout driver where ds_cinfo.buckets
are allocated. But buckets are attached to dreq and reused across LD IO iterations.
So I feel it's OK to free them in the generic layer.
Cc: stable@vger.kernel.org [v3.4+]
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The initial state of the device's need_restore flag shouldn't depend on
the current state of the PM domain. For example, it should be perfectly
valid to attach an inactive device to a powered PM domain.
The pm_genpd_dev_need_restore() API allows us to update the need_restore
flag to somewhat cope with such scenarios. Typically that should have
been done from the drivers'/buses' ->probe(), since it's those that put
the requirements on the value of the need_restore flag.
Until recently, the Exynos SoCs were the only user of the
pm_genpd_dev_need_restore() API, though they invoked it from a
centralized location while adding devices to their PM domains.
Because Exynos has now switched to the generic OF-based PM domain
look-up, it's no longer possible to invoke the API from a centralized
location, since devices are now added to their PM domains during the
probe sequence.
Commit "ARM: exynos: Move to generic PM domain DT bindings"
did the switch for Exynos to the generic OF-based PM domain look-up,
but it also removed the call to pm_genpd_dev_need_restore(). This
caused a regression for some of the Exynos drivers.
To handle things more properly in the generic PM domain, let's change
the default initial value of the need_restore flag to reflect that the
state is unknown. As soon as one of the runtime PM callbacks gets
invoked, update the initial value accordingly.
Moreover, since the generic PM domain verifies that all devices are
both runtime PM enabled and suspended, using pm_runtime_suspended()
while pm_genpd_poweroff() is invoked from the scheduled work, we can be
sure that the PM domain won't be powered off while it has active
devices.
Do note that the generic PM domain can still only know about active
devices which have been activated through invoking its runtime PM resume
callback. In other words, buses/drivers using pm_runtime_set_active()
during ->probe() will still suffer from a race condition, potentially
probing a device without its PM domain being powered. That issue
will have to be solved using a different approach.
This is a log from the boot regression for Exynos5, which is fixed by
this patch.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 308 at ../drivers/clk/clk.c:851 clk_disable+0x24/0x30()
Modules linked in:
CPU: 0 PID: 308 Comm: kworker/0:1 Not tainted 3.18.0-rc3-00569-gbd9449f-dirty #10
Workqueue: pm pm_runtime_work
[<c0013c64>] (unwind_backtrace) from [<c0010dec>] (show_stack+0x10/0x14)
[<c0010dec>] (show_stack) from [<c03ee4cc>] (dump_stack+0x70/0xbc)
[<c03ee4cc>] (dump_stack) from [<c0020d34>] (warn_slowpath_common+0x64/0x88)
[<c0020d34>] (warn_slowpath_common) from [<c0020d74>] (warn_slowpath_null+0x1c/0x24)
[<c0020d74>] (warn_slowpath_null) from [<c03107b0>] (clk_disable+0x24/0x30)
[<c03107b0>] (clk_disable) from [<c02cc834>] (gsc_runtime_suspend+0x128/0x160)
[<c02cc834>] (gsc_runtime_suspend) from [<c0249024>] (pm_generic_runtime_suspend+0x2c/0x38)
[<c0249024>] (pm_generic_runtime_suspend) from [<c024f44c>] (pm_genpd_default_save_state+0x2c/0x8c)
[<c024f44c>] (pm_genpd_default_save_state) from [<c024ff2c>] (pm_genpd_poweroff+0x224/0x3ec)
[<c024ff2c>] (pm_genpd_poweroff) from [<c02501b4>] (pm_genpd_runtime_suspend+0x9c/0xcc)
[<c02501b4>] (pm_genpd_runtime_suspend) from [<c024a4f8>] (__rpm_callback+0x2c/0x60)
[<c024a4f8>] (__rpm_callback) from [<c024a54c>] (rpm_callback+0x20/0x74)
[<c024a54c>] (rpm_callback) from [<c024a930>] (rpm_suspend+0xd4/0x43c)
[<c024a930>] (rpm_suspend) from [<c024bbcc>] (pm_runtime_work+0x80/0x90)
[<c024bbcc>] (pm_runtime_work) from [<c0032a9c>] (process_one_work+0x12c/0x314)
[<c0032a9c>] (process_one_work) from [<c0032cf4>] (worker_thread+0x3c/0x4b0)
[<c0032cf4>] (worker_thread) from [<c003747c>] (kthread+0xcc/0xe8)
[<c003747c>] (kthread) from [<c000e738>] (ret_from_fork+0x14/0x3c)
---[ end trace 40cd58bcd6988f12 ]---
Fixes: a4a8c2c496 (ARM: exynos: Move to generic PM domain DT bindings)
Reported-and-tested-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:
# trace-cmd record -e raw_syscalls:* -F false
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
...
Call Trace:
[<ffffffff8105b580>] ? __wake_up_common+0x90/0x90
[<ffffffff81092e25>] wait_on_pipe+0x35/0x40
[<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0
[<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0
[<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40
[<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290
[<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0
[<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
[<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60
[<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
[<ffffffff8112415d>] do_splice_to+0x6d/0x90
[<ffffffff81126971>] SyS_splice+0x7c1/0x800
[<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8
The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers. The buffers
are not empty so ring_buffer_wait() returns immediately. But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page. When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.
Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e. until full-page reads will succeed.
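The difference between the two wait conditions can be sketched as a
single predicate. This is a userspace illustration only: struct rb_model
and rb_wait_done() are hypothetical names, and pages are reduced to
indices instead of the real ring buffer page structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-CPU buffer: pages are identified by index only. */
struct rb_model {
	unsigned int reader_page;	/* page the reader would consume next */
	unsigned int commit_page;	/* page the writer is currently filling */
	unsigned int entries;		/* unread events in the buffer */
};

/*
 * Returns true when a waiter may stop sleeping.
 * full == false: any unread data is enough.
 * full == true:  the writer must have moved off the reader's page, so
 *                that a full-page read can succeed.
 */
static bool rb_wait_done(const struct rb_model *rb, bool full)
{
	assert(rb != NULL);

	if (rb->entries == 0)
		return false;
	if (!full)
		return true;
	return rb->reader_page != rb->commit_page;
}
```

With a few events on a partially filled page, the non-full condition is
already satisfied (the cause of the busy loop), while the full condition
keeps the waiter asleep until the writer moves on to another page.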
Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in
Cc: stable@vger.kernel.org # 3.16+
Fixes: b1169cc69b ("tracing: Remove mock up poll wait function")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
All interrupts coming from MUIC were ignored because interrupt source
register was masked.
The Maxim 77693 has an "interrupt source": a separate register and
interrupts which give information about the PMIC block that triggered an
individual interrupt (charger, topsys, MUIC, flash LED).
By default the bootloader could initialize this register to the "mask
all" value. In such a case (observed on the Trats2 board) MUIC
interrupts won't be generated regardless of their mask status. The regmap
irq chip was unmasking individual MUIC interrupts, but the source was
masked.
Before the regmap irq chip was introduced, this interrupt source was
unmasked, read and acked. Reading and acking are not necessary, but
unmasking is.
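The two-level masking can be illustrated with a small model. Everything
here is an assumption made for the sketch: the bit positions, the
"1 means masked" polarity and the names do not come from the datasheet
or the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SRC_MUIC	(1u << 3)	/* hypothetical bit position */
#define SRC_ALL		0x0fu		/* hypothetical: all four sources */

/* In this model a set bit means "masked" in both registers. */
struct pmic_model {
	uint8_t intsrc_mask;	/* top-level interrupt source mask */
	uint8_t muic_int_mask;	/* per-interrupt MUIC mask */
};

/* A MUIC interrupt is delivered only if both levels are unmasked. */
static bool muic_irq_delivered(const struct pmic_model *p, uint8_t muic_bit)
{
	assert(p != NULL);

	if (p->intsrc_mask & SRC_MUIC)
		return false;
	return !(p->muic_int_mask & muic_bit);
}

/* The step that was missing: unmask the interrupt sources themselves. */
static void unmask_all_sources(struct pmic_model *p)
{
	p->intsrc_mask &= (uint8_t)~SRC_ALL;
}
```

Starting from a "mask all" source register, unmasking individual MUIC
bits alone changes nothing in this model; delivery only begins once the
source level is unmasked as well.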
Fixes: 342d669c1e ("mfd: max77693: Handle IRQs using regmap")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Pull devicetree bugfix from Grant Likely:
"One buffer overflow bug that shouldn't be left around"
* 'devicetree/merge' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
of: Fix overflow bug in string property parsing functions
Convert the prototypes to return an int in order to support error
handling in these callbacks.
Also, as suggested by Dmitry Torokhov, pass the domain pointer for use
inside the callbacks, and so that they match the existing
power_on/power_off callbacks which currently take the domain pointer.
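A minimal sketch of what the new callback shape makes possible is shown
below. Only the attach side is modeled, and all types and function names
are stand-ins invented for the example, not the genpd definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct device_model {
	const char *name;
};

/* The callback receives the domain and reports errors via its return value. */
struct domain_model {
	const char *name;
	int (*attach_dev)(struct domain_model *domain,
			  struct device_model *dev);
	unsigned int nr_devices;
};

/* Example callback: refuses devices that have no name. */
static int reject_unnamed_attach(struct domain_model *domain,
				 struct device_model *dev)
{
	assert(domain != NULL);
	return dev->name ? 0 : -EINVAL;
}

/* Caller side: a failing callback now aborts the add instead of being ignored. */
static int add_device_model(struct domain_model *domain,
			    struct device_model *dev)
{
	int ret = 0;

	if (domain->attach_dev)
		ret = domain->attach_dev(domain, dev);
	if (ret)
		return ret;

	domain->nr_devices++;
	return 0;
}
```

With a void callback the caller would have had no way to learn that the
second device below should not be counted as attached.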
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ khilman: added domain as parameter to callbacks, as suggested by Dmitry ]
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge tag 'pci-v3.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"This fixes an oops when enabling SR-IOV VF devices. The oops is a
regression I added by configuring all devices during enumeration.
- Don't oops on virtual buses in acpi_pci_get_bridge_handle() (Yinghai Lu)"
* tag 'pci-v3.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
File descriptors are always closed on exit :-)
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
(either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus. But
SR-IOV virtual functions can be on a virtual bus with no bridge leading to
it. Return a NULL acpi_handle in this case instead of trying to
dereference the NULL pointer to the bridge.
This fixes a NULL pointer dereference oops in pci_get_hp_params() when
adding SR-IOV VF devices on virtual buses.
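The lookup with the added check can be modeled roughly as below. This is
a userspace sketch with stand-in structures: the *_model types and the
way a root bus is detected here are assumptions for illustration, not
the PCI or ACPI definitions.

```c
#include <assert.h>
#include <stddef.h>

struct dev_model {
	void *acpi_handle;
};

struct pci_dev_model {
	struct dev_model dev;
};

struct pci_bus_model {
	struct pci_bus_model *parent;	/* NULL for a root bus */
	struct dev_model *bridge;	/* host bridge device (root buses) */
	struct pci_dev_model *self;	/* bridge leading here; NULL on virtual buses */
};

/* Returns the handle of the bridge leading to 'pbus', or NULL if none. */
static void *get_bridge_handle_model(const struct pci_bus_model *pbus)
{
	const struct dev_model *dev;

	assert(pbus != NULL);

	if (pbus->parent == NULL) {
		dev = pbus->bridge;
	} else {
		/* A virtual bus has no bridge leading to it. */
		if (pbus->self == NULL)
			return NULL;
		dev = &pbus->self->dev;
	}

	return dev->acpi_handle;
}
```

Callers then have to treat a NULL handle as "no ACPI companion" instead
of assuming that every non-root bus has a bridge device.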
[bhelgaas: changelog, add comment in code]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591
Reported-by: Chao Zhou <chao.zhou@intel.com>
Reported-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The string property read helpers will run off the end of the buffer if
they are handed a malformed string property. Rework the parsers to make
sure that doesn't happen. At the same time add new test cases to make
sure the functions behave themselves.
The original implementations of of_property_read_string_index() and
of_property_count_strings() both open-coded the same block of parsing
code, each with its own subtly different bugs. The fix here merges the
functions into a single helper and makes the original functions static
inline wrappers around the helper.
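The kind of bounded parsing a shared helper needs can be sketched like
this. It is a standalone illustration, not the helper from the patch:
read_strings_model(), its signature and its return convention are
assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Walks a buffer of back-to-back NUL-terminated strings without reading
 * past 'len'. Stores up to 'sz' string pointers in 'out'; pass NULL to
 * only count. Returns the number of strings visited, or -EILSEQ if a
 * string is not terminated inside the buffer.
 */
static int read_strings_model(const char *buf, size_t len,
			      const char **out, size_t sz)
{
	const char *p = buf;
	const char *end = buf + len;
	size_t i;

	assert(buf != NULL);

	for (i = 0; p < end && (!out || i < sz); i++) {
		/* Search for the terminator only within the remaining bytes. */
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			return -EILSEQ;
		if (out)
			out[i] = p;
		p = nul + 1;
	}

	return (int)i;
}
```

Counting and indexed reads can then both be thin wrappers around one
loop, so the bounds check exists in exactly one place.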
One non-bugfix aspect of this patch is the addition of a new wrapper,
of_property_read_string_array(). The new wrapper is needed by the
device_properties feature that Rafael is working on and planning to
merge for v3.19. The implementation is identical both with and without
the new static inline wrapper, so it just got left in to reduce the
churn on the header file.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Darren Hart <darren.hart@intel.com>
Cc: <stable@vger.kernel.org> # v3.3+: Drop selftest hunks that don't apply
Pull CMA and DMA-mapping fixes from Marek Szyprowski:
"This contains important fixes for recently introduced highmem support
for default contiguous memory region used for dma-mapping subsystem"
* 'fixes-for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
mm, cma: make parameters order consistent in func declaration and definition
mm: cma: Use %pa to print physical addresses
mm: cma: Ensure that reservations never cross the low/high mem boundary
mm: cma: Always consider a 0 base address reservation as dynamic
mm: cma: Don't crash on allocation if CMA area can't be activated
Merge tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"Three main MTD fixes for 3.18:
- A regression from 3.16 which was noticed in 3.17. With the
restructuring of the m25p80.c driver and the SPI NOR library
framework, we omitted proper listing of the SPI device IDs. This
means m25p80.c wouldn't auto-load (modprobe) properly when built as
a module. For now, we duplicate the device IDs into both modules.
- The OMAP / ELM modules were depending on an implicit link ordering.
Use deferred probing so that the new link order (in 3.18-rc) can
still allow for successful probing.
- Fix suspend/resume support for LH28F640BF NOR flash"
* tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd:
mtd: cfi_cmdset_0001.c: fix resume for LH28F640BF chips
mtd: omap: fix mtd devices not showing up
mtd: m25p80,spi-nor: Fix module aliases for m25p80
mtd: spi-nor: make spi_nor_scan() take a chip type name, not spi_device_id
mtd: m25p80: get rid of spi_get_device_id