Commit Graph

785655 Commits

Author SHA1 Message Date
Jon Maloy
b06f9d9f1a tipc: fix info leak from kernel tipc_event
We initialize a struct tipc_event allocated on the kernel stack to
zero to avert info leak to user space.

Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 16:49:53 -07:00
Wenwen Wang
b6168562c8 net: socket: fix a missing-check bug
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-enforced on 'rxnfc->rule_cnt'. So it is possible for a malicious user to
race to change the value of 'compat_rxnfc->rule_cnt' between these two
copies. In this way, the attacker can bypass the previous check on
'rule_cnt' and inject malicious data. This can cause undefined behavior in
the kernel and introduce a potential security risk.

This patch avoids the above issue by copying the value acquired by
get_user() to 'rxnfc->rule_cnt', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
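
The double-fetch pattern described in this message can be modelled in userspace. The Python sketch below is not kernel code: `RacyUserBuffer` and `prepare_rxnfc` are hypothetical names, and a dictionary stands in for the 'rxnfc' buffer. It only illustrates why reusing the value from the first fetch closes the race.

```python
class RacyUserBuffer:
    """Hypothetical stand-in for 'compat_rxnfc': a user buffer whose
    rule_cnt is changed by another thread after the first read."""

    def __init__(self, first_value, later_value):
        self._first = first_value
        self._later = later_value
        self._reads = 0

    def read_rule_cnt(self):
        self._reads += 1
        return self._first if self._reads == 1 else self._later


def prepare_rxnfc(user_buf, patched):
    # First fetch: models get_user(); this is the value that was checked
    # and used to size the buffer.
    rule_cnt = user_buf.read_rule_cnt()
    rxnfc = {"capacity": rule_cnt}
    # Second fetch: models copy_in_user() copying the whole structure,
    # including a possibly modified rule_cnt.
    rxnfc["rule_cnt"] = user_buf.read_rule_cnt()
    if patched:
        # The fix: overwrite the copied field with the checked value.
        rxnfc["rule_cnt"] = rule_cnt
    return rxnfc
```

In this model, the unpatched path ends up with a 'rule_cnt' that no longer matches the size the buffer was allocated for, while the patched path always keeps the two in agreement.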

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 16:43:06 -07:00
Phil Sutter
3c53ed8fef net: sched: Fix for duplicate class dump
When dumping classes by parent, kernel would return classes twice:

| # tc qdisc add dev lo root prio
| # tc class show dev lo
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| # tc class show dev lo parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:

This comes from qdisc_match_from_root() potentially returning the root
qdisc itself if its handle matched. Though in that case, root's classes
were already dumped a few lines above.
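
A rough userspace model of the duplicate dump, in Python rather than kernel C. The function and dictionary layout are hypothetical, and the lookup only imitates what the message says about qdisc_match_from_root(): it can return the root qdisc itself.

```python
def dump_classes(root, qdiscs, parent_handle, skip_root):
    """Return the class list that a dump by parent would produce.

    root and the entries of qdiscs are dicts with 'handle' and 'classes'.
    qdiscs includes root, as a lookup starting from the root would.
    """
    dumped = list(root["classes"])  # root's classes are always dumped first
    match = next((q for q in qdiscs if q["handle"] == parent_handle), None)
    # skip_root=True models the fix: ignore a match that is the root itself.
    if match is not None and not (skip_root and match is root):
        dumped.extend(match["classes"])
    return dumped
```

With a single prio qdisc as root, the unfixed path yields six entries for the three classes, matching the tc output shown in this message; the fixed path yields three.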

Fixes: cb395b2010 ("net: sched: optimize class dumps")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 16:00:02 -07:00
Christoph Hellwig
ee75fa2ae0 mtip32xx: fully switch to the generic DMA API
The mtip32xx used an odd mix of the old PCI and the generic DMA API,
so switch it over to the generic API entirely.

Note that this also removes a weird fallback to just a 32-bit coherent
dma mask if the 64-bit dma mask doesn't work, as that can't even happen.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:50 -06:00
Christoph Hellwig
77a12e51fc rsxx: switch to the generic DMA API
The PCI DMA API is deprecated, switch to the generic DMA API instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:48 -06:00
Christoph Hellwig
b46d40daba umem: switch to the generic DMA API
The PCI DMA API is deprecated, switch to the generic DMA API instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:47 -06:00
Christoph Hellwig
931da2f7a5 sx8: switch to the generic DMA API
The PCI DMA API is deprecated, switch to the generic DMA API instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:45 -06:00
Christoph Hellwig
64ab1fa5da sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE code
This code has effectively been commented out since the first commit,
so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:43 -06:00
Christoph Hellwig
1381262148 skd: switch to the generic DMA API
The PCI DMA API is deprecated, switch to the generic DMA API instead.
Also make use of the dma_set_mask_and_coherent helper to easily set
the streaming and coherent DMA masks together.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:14:42 -06:00
Christoph Hellwig
ecb0a83e31 ubd: remove use of blk_rq_map_sg
There is no good reason to create a scatterlist in the ubd driver,
it can just iterate the request directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[rw: Folded in improvements as discussed with hch and jens]
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:13:12 -06:00
Heiner Kallweit
6b839b6cf9 r8169: fix NAPI handling under high load
rtl_rx() and rtl_tx() are called only if the respective bits are set
in the interrupt status register. Under high load NAPI may not be
able to process all data (work_done == budget) and it will schedule
subsequent calls to the poll callback.
rtl_ack_events() however resets the bits in the interrupt status
register, therefore subsequent calls to rtl8169_poll() won't call
rtl_rx() and rtl_tx() - chip interrupts are still disabled.

Fix this by calling rtl_rx() and rtl_tx() independent of the bits
set in the interrupt status register. Both functions will detect
if there's nothing to do for them.
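
The stall can be illustrated with a small userspace model. This Python sketch is an assumption-laden simplification, not the driver: `Nic`, `poll`, `drain` and the `RX_OK` bit value are hypothetical, only the receive side is modelled, and interrupt re-enabling is left out.

```python
RX_OK = 0x01  # hypothetical status bit, for illustration only


class Nic:
    def __init__(self, pending):
        self.status = RX_OK      # interrupt status register
        self.pending = pending   # packets waiting in the RX ring

    def rx(self, budget):
        done = min(self.pending, budget)
        self.pending -= done
        return done


def poll(nic, budget, check_status):
    status = nic.status
    nic.status = 0  # models rtl_ack_events() clearing the status bits
    if check_status and not (status & RX_OK):
        return 0    # old behaviour: rx is skipped when the bit is clear
    return nic.rx(budget)


def drain(nic, budget, check_status, max_polls=10):
    """Poll until less than the full budget is used; return what is left."""
    for _ in range(max_polls):
        if poll(nic, budget, check_status) < budget:
            break   # NAPI would consider the work complete here
    return nic.pending
```

With 100 pending packets and a budget of 64, the status-gated variant leaves 36 packets unprocessed after the second poll, while the unconditional variant drains the ring.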

Fixes: da78dbff2e ("r8169: remove work from irq handler.")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 11:33:29 -07:00
David S. Miller
27faeebd00 sparc: Revert unintended perf changes.
Some local debugging hacks accidentally slipped into the VDSO commit.

Sorry!

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 11:32:29 -07:00
Leo Li
4364bcb2cd drm: Get ref on CRTC commit object when waiting for flip_done
This fixes a general protection fault, caused by accessing the contents
of a flip_done completion object that has already been freed. It occurs
due to the preemption of a non-blocking commit worker thread W by
another commit thread X. X continues to clear its atomic state at the
end, destroying the CRTC commit object that W still needs. Switching
back to W and accessing the commit objects then leads to bad results.

Worker W becomes preemptable when waiting for flip_done to complete. At
this point, a frequently occurring commit thread X can take over. Here's
an example where W is a worker thread that flips on both CRTCs, and X
does a legacy cursor update on both CRTCs:

        ...
     1. W does flip work
     2. W runs commit_hw_done()
     3. W waits for flip_done on CRTC 1
     4. > flip_done for CRTC 1 completes
     5. W finishes waiting for CRTC 1
     6. W waits for flip_done on CRTC 2

     7. > Preempted by X
     8. > flip_done for CRTC 2 completes
     9. X atomic_check: hw_done and flip_done are complete on all CRTCs
    10. X updates cursor on both CRTCs
    11. X destroys atomic state
    12. X done

    13. > Switch back to W
    14. W waits for flip_done on CRTC 2
    15. W raises general protection fault

The error looks like so:

    general protection fault: 0000 [#1] PREEMPT SMP PTI
    **snip**
    Call Trace:
     lock_acquire+0xa2/0x1b0
     _raw_spin_lock_irq+0x39/0x70
     wait_for_completion_timeout+0x31/0x130
     drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
     amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
     commit_tail+0x3d/0x70 [drm_kms_helper]
     process_one_work+0x212/0x650
     worker_thread+0x49/0x420
     kthread+0xfb/0x130
     ret_from_fork+0x3a/0x50
    Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
    gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
    fb_sys_fops ttm(O) drm(O)

Note that i915 has this issue masked, since hw_done is signaled after
waiting for flip_done. Doing so will block the cursor update from
happening until hw_done is signaled, preventing the cursor commit from
destroying the state.

v2: The reference on the commit object needs to be obtained before
    hw_done() is signaled, since that's the point where another commit
    is allowed to modify the state. Assuming that the
    new_crtc_state->commit object still exists within flip_done() is
    incorrect.

    Fix by getting a reference in setup_commit(), and releasing it
    during default_clear().
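
The effect of the extra reference can be sketched with a toy reference counter. This is a Python model under stated assumptions, not DRM code: `CrtcCommit` and `run_scenario` are hypothetical, and the scenario collapses the timeline above into three steps.

```python
class CrtcCommit:
    """Toy refcounted object standing in for the CRTC commit object."""

    def __init__(self):
        self.refs = 1        # reference held by the atomic state
        self.freed = False

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def run_scenario(take_extra_ref):
    commit = CrtcCommit()
    if take_extra_ref:
        commit.get()                 # models the get in setup_commit()
    commit.put()                     # X destroys the atomic state
    usable_by_w = not commit.freed   # W resumes waiting on flip_done
    if take_extra_ref:
        commit.put()                 # models the release in default_clear()
    return usable_by_w, commit.freed
```

Without the extra reference, the object is already freed when W resumes. With it, W can still use the object, and it is freed only after W's own release.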

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com
2018-10-18 14:23:13 -04:00
David S. Miller
2ee653f644 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-10-18

1) Free the xfrm interface gro_cells when deleting the
   interface, otherwise we leak it. From Li RongQing.

2) net/core/flow.c does not exist anymore, so remove it
   from the MAINTAINERS file.

3) Fix a slab-out-of-bounds in _decode_session6.
   From Alexei Starovoitov.

4) Fix RCU protection when policies are inserted into
   their bydst lists. From Florian Westphal.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 09:55:08 -07:00
Ming Lei
744889b7cb block: don't deal with discard limit in blkdev_issue_discard()
blk_queue_split() does respect this limit via bio splitting, so no
need to do that in blkdev_issue_discard(), then we can align to
normal bio submit(bio_add_page() & submit_bio()).

More importantly, this patch fixes one issue introduced in a22c4d7e34
("block: re-add discard_granularity and alignment checks"), in which
zero discard bio may be generated in case of zero alignment.

Fixes: a22c4d7e34 ("block: re-add discard_granularity and alignment checks")
Cc: stable@vger.kernel.org
Cc: Ming Lin <ming.l@ssi.samsung.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Tested-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 07:23:40 -06:00
Rafael J. Wysocki
0a1875ad29 Merge branches 'acpi-property' and 'acpi-sbs'
* acpi-property:
  ACPI / property: Switch to bitmap_zalloc()

* acpi-sbs:
  ACPI / SBS: Fix rare oops when removing modules
  ACPI / SBS: Fix GPE storm on recent MacBookPro's
2018-10-18 12:37:51 +02:00
Rafael J. Wysocki
1f825f74c1 Merge branches 'acpi-soc', 'acpi-processor', 'acpi-pmic', 'acpi-cppc' and 'acpi-tad'
* acpi-soc:
  ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq
  ACPI / LPSS: Add a device link from the GPU to the BYT I2C5 controller
  ACPI / LPSS: Add a device link from the GPU to the CHT I2C7 controller
  ACPI / LPSS: Make acpi_lpss_find_device() also find PCI devices
  ACPI / LPSS: Make hid_uid_match helper accept a NULL uid argument
  ACPI / LPSS: Make hid_uid_match helper take an acpi_device as first argument
  ACPI / LPSS: Exclude I2C busses shared with PUNIT from pmc_atom_d3_mask
  ACPI / LPSS: Add alternative ACPI HIDs for Cherry Trail DMA controllers

* acpi-processor:
  ACPI / processor: Fix the return value of acpi_processor_ids_walk()

* acpi-pmic:
  ACPI / PMIC: Convert drivers to use SPDX identifier
  ACPI / PMIC: Sort headers alphabetically

* acpi-cppc:
  mailbox: PCC: handle parse error

* acpi-tad:
  ACPI: TAD: Add low-level support for real time capability
2018-10-18 12:37:27 +02:00
Rafael J. Wysocki
bd371e088b Merge branches 'acpi-init', 'acpi-osl', 'acpi-bus', 'acpi-tables' and 'acpi-misc'
* acpi-init:
  ACPI: probe ECDT before loading AML tables regardless of module-level code flag

* acpi-osl:
  ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()

* acpi-bus:
  ACPI / glue: Split dev_is_platform() out of module for wide use

* acpi-tables:
  ACPI/PPTT: Handle architecturally unknown cache types
  drivers: base: cacheinfo: Do not populate sysfs for unknown cache types

* acpi-misc:
  ACPI: remove redundant 'default n' from Kconfig
  ACPI: custom_method: remove meaningless null check before debugfs_remove()
2018-10-18 12:37:11 +02:00
Rafael J. Wysocki
cc19b05e38 Merge branches 'pm-devfreq' and 'pm-tools'
* pm-devfreq:
  PM / devfreq: remove redundant null pointer check before kfree
  PM / devfreq: stopping the governor before device_unregister()
  PM / devfreq: Convert to using %pOFn instead of device_node.name
  PM / devfreq: Make update_devfreq() public
  PM / devfreq: Don't adjust to user limits in governors
  PM / devfreq: Fix handling of min/max_freq == 0
  PM / devfreq: Drop custom MIN/MAX macros
  PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.

* pm-tools:
  PM / tools: sleepgraph and bootgraph: upgrade to v5.2
  PM / tools: sleepgraph: first batch of v5.2 changes
  cpupower: Fix coredump on VMWare
  cpupower: Fix AMD Family 0x17 msr_pstate size
  cpupower: remove stringop-truncation waring
2018-10-18 12:28:12 +02:00
Rafael J. Wysocki
5d113aa679 Merge branches 'pm-opp' and 'powercap'
* pm-opp:
  PM / OPP: _of_add_opp_table_v2(): increment count only if OPP is added
  cpufreq: dt: Try freeing static OPPs only if we have added them
  OPP: Return error on error from dev_pm_opp_get_opp_count()
  OPP: Improve error handling in dev_pm_opp_of_cpumask_add_table()
  OPP: Pass OPP table to _of_add_opp_table_v{1|2}()
  OPP: Prevent creating multiple OPP tables for devices sharing OPP nodes
  OPP: Use a single mechanism to free the OPP table
  OPP: Don't remove dynamic OPPs from _dev_pm_opp_remove_table()
  cpufreq: mvebu: Remove OPPs using dev_pm_opp_remove()
  OPP: Create separate kref for static OPPs list
  OPP: Don't take OPP table's kref for static OPPs
  OPP: Parse OPP table's DT properties from _of_init_opp_table()
  OPP: Pass index to _of_init_opp_table()
  OPP: Protect dev_list with opp_table lock
  OPP: Don't try to remove all OPP tables on failure
  OPP: Free OPP table properly on performance state irregularities

* powercap:
  powercap: RAPL: Get rid of custom RAPL_CPU() macro
2018-10-18 12:27:51 +02:00
Rafael J. Wysocki
3f858ae02c Merge branches 'acpi-pm' and 'pm-sleep'
* acpi-pm:
  ACPI / PM: LPIT: Register sysfs attributes based on FADT

* pm-sleep:
  x86-32, hibernate: Adjust in_suspend after resumed on 32bit system
  x86-32, hibernate: Set up temporary text mapping for 32bit system
  x86-32, hibernate: Switch to relocated restore code during resume on 32bit system
  x86-32, hibernate: Switch to original page table after resumed
  x86-32, hibernate: Use the page size macro instead of constant value
  x86-32, hibernate: Use temp_pgt as the temporary page table
  x86, hibernate: Rename temp_level4_pgt to temp_pgt
  x86-32, hibernate: Enable CONFIG_ARCH_HIBERNATION_HEADER on 32bit system
  x86, hibernate: Extract the common code of 64/32 bit system
  x86-32/asm/power: Create stack frames in hibernate_asm_32.S
  PM / hibernate: Check the success of generating md5 digest before hibernation
  x86, hibernate: Fix nosave_regions setup for hibernation
  PM / sleep: Show freezing tasks that caused a suspend abort
  PM / hibernate: Documentation: fix image_size default value
2018-10-18 12:27:30 +02:00
Rafael J. Wysocki
d1551f7a5a Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
  cpufreq: conservative: Take limits changes into account properly
  Documentation: intel_pstate: Add base_frequency information
  cpufreq: intel_pstate: Add base_frequency attribute
  ACPI / CPPC: Add support for guaranteed performance
  cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
  cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
  cpufreq / CPPC: Mark acpi_ids as used
  cpufreq: dt: Add support for r8a7744
  cpufreq: Convert to using %pOFn instead of device_node.name
  cpufreq: remove unnecessary unlikely()
2018-10-18 12:26:11 +02:00
Rafael J. Wysocki
41fd838cda Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: menu: Avoid computations when result will be discarded
  cpuidle: menu: Drop redundant comparison
  cpuidle: menu: Simplify checks related to the polling state
  cpuidle: poll_state: Revise loop termination condition
  cpuidle: menu: Move the latency_req == 0 special case check
  cpuidle: menu: Avoid computations for very close timers
  cpuidle: menu: Do not update last_state_idx in menu_select()
  cpuidle: menu: Get rid of first_idx from menu_select()
  cpuidle: menu: Compute first_idx when latency_req is known
  cpuidle: menu: Fix wakeup statistics updates for polling state
  cpuidle: menu: Replace data->predicted_us with local variable
  cpuidle: enter_state: Don't needlessly calculate diff time
  cpuidle: Remove unnecessary wrapper cpuidle_get_last_residency()
  intel_idle: Get rid of custom ICPU() macro
2018-10-18 12:26:00 +02:00
Ulf Hansson
e5089c2c73 PM / Domains: Document flags for genpd
The currently documented description of the GENPD_FLAG_* flags is too
simplified, so let's extend it.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 12:25:10 +02:00
Ulf Hansson
2c9b7f8772 PM / Domains: Deal with multiple states but no governor in genpd
A caller of pm_genpd_init() that provides some states for the genpd via the
->states pointer in the struct generic_pm_domain, should also provide a
governor. This is because it's the job of the governor to pick a state that
satisfies the constraints.

Therefore, let's print a warning to inform the user about such a bogus
configuration and avoid bailing out, by instead picking the shallowest
state before genpd invokes the ->power_off() callback.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Lina Iyer <ilina@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 12:25:09 +02:00
Ulf Hansson
2c36168480 PM / Domains: Don't treat zero found compatible idle states as an error
Instead of returning -EINVAL from of_genpd_parse_idle_states() in case no
compatible states were found, let's return 0 to indicate success. Also
assign the out-parameter *states to NULL and *n to 0, to indicate to the
caller that zero states have been found/allocated.

This enables the caller of of_genpd_parse_idle_states() to act more easily
on the returned error code.
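
The new return convention can be sketched as follows. This is a Python model, not the kernel function: `parse_idle_states` is a hypothetical name, the tuple mirrors the C return value plus the two out-parameters, and the compatible string is an assumption made for illustration.

```python
COMPATIBLE = "domain-idle-state"  # assumed compatible string, illustration only


def parse_idle_states(nodes):
    """Return (err, states, n), mirroring the return value, *states and *n."""
    states = [node for node in nodes if node.get("compatible") == COMPATIBLE]
    if not states:
        # No compatible states is no longer an error: success, NULL, zero.
        return 0, None, 0
    return 0, states, len(states)
```

A caller can then treat any non-zero error as a real failure and handle "no states" as an ordinary case by checking n.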

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Lina Iyer <ilina@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 12:25:09 +02:00
Rafael J. Wysocki
3c88a889b4 Merge branch 'acpica'
* acpica:
  ACPICA: Remove acpi_gbl_group_module_level_code and only use acpi_gbl_execute_tables_as_methods instead
  ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes
  ACPICA: AML interpreter: add region addresses in global list during initialization
  ACPICA: Update version to 20181003
  ACPICA: Never run _REG on system_memory and system_IO
  ACPICA: Split large interpreter file
  ACPICA: Update for field unit access
  ACPICA: Rename some of the Field Attribute defines
  ACPICA: Update for generic_serial_bus and attrib_raw_process_bytes protocol
2018-10-18 12:20:59 +02:00
Eric Sandeen
fa520c47ea fscache: Fix out of bound read in long cookie keys
fscache_set_key() can incur an out-of-bounds read, reported by KASAN:

 BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
 Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615

and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236

  BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
  BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
  Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466

This happens for any index_key_len which is not divisible by 4 and is
larger than the size of the inline key, because the code allocates exactly
index_key_len for the key buffer, but the hashing loop is stepping through
it 4 bytes (u32) at a time in the buf[] array.

Fix this by calculating how many u32 buffers we'll need by using
DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
buffer to hold the index_key, then using that same count as the hashing
index limit.
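
The sizing arithmetic can be checked with a short userspace sketch. This is Python, not the fscache code: `key_words` is a hypothetical helper, a zero-filled bytearray stands in for the kcalloc() buffer, and little-endian word order is assumed for the example.

```python
import struct


def div_round_up(n, d):
    # Same arithmetic as the kernel's DIV_ROUND_UP macro.
    return (n + d - 1) // d


def key_words(index_key):
    """Split a key into whole u32 words, zero-padding the final word."""
    bufs = div_round_up(len(index_key), 4)   # number of u32 slots needed
    buf = bytearray(bufs * 4)                # zero-filled, like kcalloc()
    buf[:len(index_key)] = index_key
    # Read exactly 'bufs' words, so the loop limit matches the allocation.
    return struct.unpack("<%dI" % bufs, bytes(buf))
```

A 17-byte key needs five words; the fifth holds the last key byte followed by three zero bytes rather than whatever lies past the end of a 17-byte allocation.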

Fixes: ec0328e46d ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
David Howells
1ff22883b0 fscache: Fix incomplete initialisation of inline key space
The inline key in struct fscache_cookie is insufficiently initialized,
zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
bytes will end up hashing uninitialized memory because the memcpy only
partially fills the last buf[] element.

Fix this by clearing fscache_cookie objects on allocation rather than using
the slab constructor to initialise them.  We're going to pretty much fill
in the entire struct anyway, so bringing it into our dcache writably
shouldn't incur much overhead.

This removes the need to do clearance in fscache_set_key() (where we aren't
doing it correctly anyway).

Also, we don't need to set cookie->key_len in fscache_set_key() as we
already did it in the only caller, so remove that.

Fixes: ec0328e46d ("fscache: Maintain a catalogue of allocated cookies")
Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
Reported-by: Eric Sandeen <sandeen@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
Al Viro
169b803397 cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
the victim might've been rmdir'ed just before the lock_rename();
unlike the normal callers, we do not look the source up after the
parents are locked - we know it beforehand and just recheck that it's
still the child of what used to be its parent.  Unfortunately,
the check is too weak - we don't spot a dead directory since its
->d_parent is unchanged, dentry is positive, etc.  So we sail all
the way to ->rename(), with hosting filesystems _not_ expecting
to be asked to rename an rmdir'ed subdirectory.

The fix is easy, fortunately - the lock on parent is sufficient for
making IS_DEADDIR() on child safe.

Cc: stable@vger.kernel.org
Fixes: 9ae326a690 (CacheFiles: A cache that backs onto a mounted filesystem)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:32:21 +02:00
Linus Torvalds
eb66ae0308 mremap: properly flush TLB before releasing the page
Jann Horn points out that our TLB flushing was subtly wrong for the
mremap() case.  What makes mremap() special is that we don't follow the
usual "add page to list of pages to be freed, then flush tlb, and then
free pages".  No, mremap() obviously just _moves_ the page from one page
table location to another.

That matters, because mremap() thus doesn't directly control the
lifetime of the moved page with a freelist: instead, the lifetime of the
page is controlled by the page table locking, that serializes access to
the entry.

As a result, we need to flush the TLB not just before releasing the lock
for the source location (to avoid any concurrent accesses to the entry),
but also before we release the destination page table lock (to avoid the
TLB being flushed after somebody else has already done something to that
page).

This also makes the whole "need_flush" logic unnecessary, since we now
always end up flushing the TLB for every valid entry.

Reported-and-tested-by: Jann Horn <jannh@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:30:52 +02:00
Christoph Hellwig
19e6420e41 LICENSES: Remove CC-BY-SA-4.0 license text
Using non-GPL licenses for our documentation is rather problematic,
as it can directly include other files, which generally are GPLv2
licensed and thus not compatible.

Remove this license now that the only user (idr.rst) is gone to avoid
people semi-accidentally using it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18 11:28:50 +02:00
Greg Kroah-Hartman
ca9f672f7c Merge branch 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax
Matthew writes:
  "IDA/IDR fixes for 4.19

   I have two tiny fixes, one for the IDA test-suite and one for the IDR
   documentation license."

* 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax:
  idr: Change documentation license
  test_ida: Fix lockdep warning
2018-10-18 11:24:32 +02:00
Rafael J. Wysocki
f1c8e410cd cpuidle: menu: Avoid computations when result will be discarded
If the minimum interval taken into account in the average computation
loop in get_typical_interval() is not below the expected idle
duration determined so far, the resultant average cannot be below
that value either and the entire return result of the function
is going to be discarded anyway going forward.

In that case, it is a waste of time to carry out the remaining
computations in get_typical_interval(), so avoid that by returning
early if the minimum interval is not below the expected idle duration.
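
The reasoning behind the early return can be checked with a simplified model. This Python sketch assumes a plain integer average, which is a simplification of what get_typical_interval() computes; `typical_interval` and `select_duration` are hypothetical names.

```python
def typical_interval(intervals, predicted_us):
    """Return the average interval, or None when it would be discarded."""
    if min(intervals) >= predicted_us:
        # Every sample is at least predicted_us, so the average is too.
        return None
    return sum(intervals) // len(intervals)


def select_duration(intervals, predicted_us):
    typical = typical_interval(intervals, predicted_us)
    if typical is None:
        return predicted_us
    # The caller only ever uses the smaller of the two values.
    return min(predicted_us, typical)
```

Because the caller takes the minimum of the two values, skipping the average when it cannot be the smaller one gives the same result as computing it.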

No intentional changes of behavior.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:34:13 +02:00
Rafael J. Wysocki
12b65eadf0 cpuidle: menu: Drop redundant comparison
Since the correction factor cannot be greater than RESOLUTION * DECAY,
the result of the predicted_us computation in menu_select() cannot be
greater than data->next_timer_us, so it is not necessary to compare
the "typical interval" value coming from get_typical_interval() with
data->next_timer_us separately.

It is sufficient to compare predicted_us with the return value of
get_typical_interval() directly, so do that and drop the now
redundant expected_interval variable.

No intentional changes of behavior.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:34:13 +02:00
Chaitanya Kulkarni
3045c0d05e nvme-pci: remove duplicate check
This is a cleanup patch that doesn't change any functionality. It removes
the duplicate call to blk_integrity_rq() in nvme_map_data().

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-10-18 09:31:43 +02:00
Hans de Goede
589edb56b4 ACPI / scan: Create platform device for INT33FE ACPI nodes
Bay and Cherry Trail devices with a Dollar Cove or Whiskey Cove PMIC
have an ACPI node with a HID of INT33FE which is a "virtual" battery
device implementing a standard ACPI battery interface which depends upon
a proprietary, undocumented OpRegion called BMOP. Since we do have docs
for the actual fuel-gauges used on these boards we instead use native
fuel-gauge drivers talking directly to the fuel-gauge ICs on boards which
rely on this INT33FE device for their battery monitoring.

On boards with a Dollar Cove PMIC the INT33FE device's resources (_CRS)
describe a non-existing I2C client at address 0x6b with a bus-speed of
100KHz. This is a problem on some boards since there are actual devices
on that same bus which need a speed of 400KHz to function properly.

This commit adds the INT33FE HID to the list of devices with I2C resources
which should be enumerated as a platform-device rather than letting the
i2c-core instantiate an i2c-client matching the first I2C resource,
so that its bus-speed will not influence the max speed of the I2C bus.
This fixes e.g. the touchscreen not working on the Teclast X98 II Plus.

The INT33FE device on boards with a Whiskey Cove PMIC is somewhat special.
Its first I2C resource is for a secondary I2C address of the PMIC itself,
which is already described in an ACPI device with an INT34D3 HID.

But it has 3 more I2C resources describing 3 other chips for which we do
need to instantiate I2C clients and which need device-connections added
between them for things to work properly. This special case is handled by
the drivers/platform/x86/intel_cht_int33fe.c code.

Before this commit that code was binding to the i2c-client instantiated
for the secondary I2C address of the PMIC, since we now instantiate a
platform device for the INT33FE device instead, this commit also changes
the intel_cht_int33fe driver from an i2c driver to a platform driver.

This also brings the intel_cht_int33fe driver in line with how we instantiate
multiple i2c clients from a single ACPI device in other cases, as done
by the drivers/platform/x86/i2c-multi-instantiate.c code.

Reported-and-tested-by: Alexander Meiler <alex.meiler@protonmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:26:37 +02:00
Bart Van Assche
83b2348e27 ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()
Since acpi_os_get_timer() may be called after the timer subsystem has
been suspended, use the jiffies counter instead of ktime_get(). This
patch prevents the following warning from being reported during hibernation:

WARNING: CPU: 0 PID: 612 at kernel/time/timekeeping.c:751 ktime_get+0x116/0x120
RIP: 0010:ktime_get+0x116/0x120
Call Trace:
 acpi_os_get_timer+0xe/0x30
 acpi_ds_exec_begin_control_op+0x175/0x1de
 acpi_ds_exec_begin_op+0x2c7/0x39a
 acpi_ps_create_op+0x573/0x5e4
 acpi_ps_parse_loop+0x349/0x1220
 acpi_ps_parse_aml+0x25b/0x6da
 acpi_ps_execute_method+0x327/0x41b
 acpi_ns_evaluate+0x4e9/0x6f5
 acpi_ut_evaluate_object+0xd9/0x2f2
 acpi_rs_get_method_data+0x8f/0x114
 acpi_walk_resources+0x122/0x1b6
 acpi_pci_link_get_current.isra.2+0x157/0x280
 acpi_pci_link_set+0x32f/0x4a0
 irqrouter_resume+0x58/0x80
 syscore_resume+0x84/0x380
 hibernation_snapshot+0x20c/0x4f0
 hibernate+0x22d/0x3a6
 state_store+0x99/0xa0
 kobj_attr_store+0x37/0x50
 sysfs_kf_write+0x87/0xa0
 kernfs_fop_write+0x1a5/0x240
 __vfs_write+0xd2/0x410
 vfs_write+0x101/0x250
 ksys_write+0xab/0x120
 __x64_sys_write+0x43/0x50
 do_syscall_64+0x71/0x220
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 164a08cee1 (ACPICA: Dispatcher: Introduce timeout mechanism for infinite loop detection)
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
References: https://lists.01.org/pipermail/lkp/2018-April/008406.html
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:23:16 +02:00
Erik Schmauss
d737f333b2 ACPI: probe ECDT before loading AML tables regardless of module-level code flag
It was discovered that AML tables were loaded before or after the
ECDT depending on acpi_gbl_execute_tables_as_methods. According to
the ACPI spec, the ECDT should be loaded before the namespace is
populated by loading AML tables (DSDT and SSDT). Since the ECDT
should be loaded early in the boot process, this change moves the
ECDT probing to acpi_early_init.

Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:19:17 +02:00
Erik Schmauss
08930d56c7 ACPICA: Remove acpi_gbl_group_module_level_code and only use acpi_gbl_execute_tables_as_methods instead
acpi_gbl_group_module_level_code and acpi_gbl_execute_tables_as_methods were
used to enable different table load behavior. The different table
load behaviors are as follows:

A.) acpi_gbl_group_module_level_code enabled the legacy approach where
    ASL if statements are executed after the namespace object has
    been loaded.
B.) acpi_gbl_execute_tables_as_methods is currently used to enable the
    table load to be a method invocation. This means that ASL If
    statements are executed in-line rather than deferred until after
    the ACPI namespace has been populated. This is the correct
    behavior and option A will be removed in the future.

We do not support a table load behavior where these variables are
assigned the same value. In other words, we only support option A or B
and do not need acpi_gbl_group_module_level_code to enable A. From now on,
acpi_gbl_execute_tables_as_methods == 0 enables option A and
acpi_gbl_execute_tables_as_methods == 1 enables option B.
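
The difference between the two behaviors can be pictured with a toy model. The sketch below is not ACPICA code: the statement tuples, `load_table` and `run_if` are hypothetical names, and the "namespace" is just a dictionary. It only shows why the order of evaluation matters.

```python
# Toy model of option A (deferred If) versus option B (in-line If).
def run_if(stmt, namespace):
    """Run ("if", predicate_key, ("name", key, value)) against the namespace."""
    _, predicate_key, (_, key, value) = stmt
    if namespace.get(predicate_key):
        namespace[key] = value


def load_table(statements, execute_as_method: bool):
    """Load a list of statements and return the resulting namespace."""
    namespace = {}
    deferred = []
    for stmt in statements:
        if stmt[0] == "name":
            _, key, value = stmt
            namespace[key] = value
        elif execute_as_method:
            run_if(stmt, namespace)   # option B: evaluate where it appears
        else:
            deferred.append(stmt)     # option A: evaluate after the load
    for stmt in deferred:
        run_if(stmt, namespace)
    return namespace
```

With the table `[("if", "FLAG", ("name", "A", 1)), ("name", "FLAG", 1)]`, the in-line model does not create `A` because `FLAG` is not yet defined when the If runs, while the deferred model does.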

Note: option A is expected to be removed in the future and option B
will become the only supported table load behavior.

Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:17:04 +02:00
Erik Schmauss
c64baa3a6f ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes
AML opcodes come in two lengths: 1-byte opcodes and 2-byte, extended opcodes.
If an error occurs due to illegal opcodes during table load, the AML parser
needs to continue loading the table. In order to do this, it needs to skip
parsing of the offending opcode and operands associated with that opcode.

This change fixes the AML parse loop to correctly skip parsing of incorrect
extended opcodes. Previously, only the short opcodes were skipped correctly.
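
As a rough illustration of the length distinction, the Python sketch below decides how many opcode bytes to step over. It assumes that an extended opcode is introduced by the AML ExtOpPrefix byte 0x5B; the function names are hypothetical, and skipping the operands that follow the opcode is not modelled.

```python
# Toy model: step over a 1-byte or 2-byte (extended) AML opcode.
EXT_OP_PREFIX = 0x5B  # prefix byte that introduces a 2-byte opcode


def opcode_length(aml: bytes, offset: int) -> int:
    """Return 2 for an extended opcode at `offset`, otherwise 1."""
    return 2 if aml[offset] == EXT_OP_PREFIX else 1


def skip_opcode(aml: bytes, offset: int) -> int:
    """Return the offset just past the opcode, clamped to the buffer end."""
    return min(offset + opcode_length(aml, offset), len(aml))
```

Skipping only one byte of an extended opcode would leave the parser pointing at the second opcode byte, which is the kind of misalignment the fix addresses.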

Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:17:04 +02:00
Erik Schmauss
4abb951b73 ACPICA: AML interpreter: add region addresses in global list during initialization
The table load process omitted adding the operation region address
range to the global list. This omission is problematic because the OS
queries the global list to check for address range conflicts before
deciding which drivers to load. This commit may result in warning
messages that look like the following:

[    7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts with op_region 0x00000400-0x0000047F (\PMIO) (20180531/utaddress-213)
[    7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

However, these messages do not signify regressions. They are the result of
properly adding address ranges to the global address list.
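
The conflict in the sample warning is an ordinary overlap test on inclusive ranges. The sketch below is a Python model with hypothetical names, not the ACPICA address-list code.

```python
# Toy model: detect overlap between two inclusive address ranges.
def ranges_conflict(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """Return True if [start_a, end_a] and [start_b, end_b] share any address."""
    return start_a <= end_b and start_b <= end_a


def find_conflicts(start: int, end: int, regions):
    """Return the names of registered regions that overlap [start, end].

    `regions` is a list of (name, region_start, region_end) tuples.
    """
    return [name for name, r_start, r_end in regions
            if ranges_conflict(start, end, r_start, r_end)]
```

For the values in the warning, `find_conflicts(0x428, 0x42F, [("\\PMIO", 0x400, 0x47F)])` reports the `\PMIO` region, since 0x428-0x42F lies entirely inside 0x400-0x47F.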

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011
Tested-by: Jean-Marc Lenoir <archlinux@jihemel.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18 09:17:04 +02:00
Rafael J. Wysocki
3230b2b3c1 ACPI: TAD: Add low-level support for real time capability
Add low-level support for the (optional) real time capability of the
ACPI Time and Alarm Device (TAD) to the ACPI TAD driver.

This allows the real time to be acquired or set via sysfs with the
help of the _GRT and _SRT methods of the TAD, respectively.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2018-10-18 09:11:53 +02:00
Steven Rostedt (VMware)
c2712b8581 kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
Andy raised concerns about using regs_get_kernel_stack_nth() in the new
function regs_get_kernel_argument(): if there is any error in the stack
code, it could cause a bad memory access. To be on the safe side, call
probe_kernel_read() on the stack address when accessing the memory. A
helper function, regs_get_kernel_stack_nth_addr(), was added that just
returns the stack address (or NULL if it is not on the stack). It is used
to find the address, which is then read with probe_kernel_read(), and it
could be used by other functions as well.
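
The two-step pattern described here (compute the address, then read it through a call that cannot fault) can be modelled as follows. This is a Python sketch with hypothetical names; memory is a dictionary of mapped addresses, the word size is assumed to be 8 bytes, and returning 0 on failure is an assumption of the model.

```python
# Toy model: bounds-checked address lookup followed by a non-faulting read.
WORD = 8  # assumed word size in bytes


def stack_nth_addr(sp: int, stack_top: int, n: int):
    """Return the address of the n-th stack word, or None if off the stack."""
    addr = sp + n * WORD
    if n < 0 or addr + WORD > stack_top:
        return None
    return addr


def probe_read(memory: dict, addr: int):
    """Return (ok, value); ok is False instead of faulting on unmapped memory."""
    if addr in memory:
        return True, memory[addr]
    return False, 0


def stack_nth(memory: dict, sp: int, stack_top: int, n: int) -> int:
    """Read the n-th stack word, returning 0 if it cannot be read safely."""
    addr = stack_nth_addr(sp, stack_top, n)
    if addr is None:
        return 0
    ok, value = probe_read(memory, addr)
    return value if ok else 0
```

Keeping the address calculation in its own helper lets other callers reuse the bounds check without performing the read.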

Requested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18 08:28:35 +02:00
Ingo Molnar
20e8e72d0f perf/urgent fixes:
Merge tag 'perf-urgent-for-mingo-4.19-20181017' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Stop falling back to kallsyms for vDSO symbol lookup; this wasn't
  really being used and is not valid on arches such as Sparc, where
  user and kernel space don't share the address space, relying only on
  cpumode to figure out which DSOs to look up (Arnaldo Carvalho de Melo)

- Align CPU map synthesized events properly, fixing SIGBUS in
  CPUs like Sparc (David Miller)

- Fix use of alternatives to find JDIR (Jarod Wilson)

- Store IDs for events with their own CPUs when synthesizing user
  level event details (scale, unit, etc) events, fixing a crash
  when recording a PMU event with a cpumask defined (Jiri Olsa)

- Fix wrong filter_band* values for uncore Intel vendor events (Jiri Olsa)

- Fix detection of tracefs path in systems without tracefs, where
  that path should be the debugfs mountpoint plus "/tracing/" (Jiri Olsa)

- Pass build flags to the traceevent build, allowing alternative
  flags to be used in distro packages (RPM, for instance) (Jiri Olsa)

- Fix 'perf report' crash on invalid inline debug information (Milian Wolff)

- Synch KVM UAPI copies (Arnaldo Carvalho de Melo)
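
The tracefs item in the list above describes a simple fallback rule, which can be sketched as follows. This is a Python model with a hypothetical function name, not the perf implementation; the mount points are passed in rather than discovered.

```python
# Toy model: choose the tracing directory from the known mount points.
from typing import Optional


def tracing_dir(tracefs_mount: Optional[str],
                debugfs_mount: Optional[str]) -> Optional[str]:
    """Prefer the tracefs mount; otherwise use debugfs plus "/tracing/"."""
    if tracefs_mount:
        return tracefs_mount.rstrip("/") + "/"
    if debugfs_mount:
        return debugfs_mount.rstrip("/") + "/tracing/"
    return None  # neither filesystem is mounted
```

For a system with only debugfs mounted at `/sys/kernel/debug`, the model returns `/sys/kernel/debug/tracing/`.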

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18 07:41:29 +02:00
Nikolay Aleksandrov
eddf016b91 net: ipmr: fix unresolved entry dumps
If the skb space runs out at an unresolved entry while dumping, we'll miss
some unresolved entries. This happens because the entry counter is zeroed
between dumping the resolved and the unresolved mfc entries. We should just
keep counting until the whole table is dumped, and zero the counter only
when we move to the next table, as we have a separate table counter.
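
The effect of zeroing the counter can be reproduced with a small model of a resumable dump. The Python sketch below is an assumption-laden toy, not the ipmr code: `dump_once`, `dump_all`, `room` and `reset_between` are hypothetical names, and a single table is walked in calls that each have room for `room` entries.

```python
# Toy model: resumable dump over resolved entries followed by unresolved ones.
def dump_once(resolved, unresolved, start, room, reset_between=False):
    """Dump up to `room` entries, skipping the first `start` entries.

    Returns (entries, resume_counter, finished).
    """
    out = []
    e = 0
    for index, entries in enumerate((resolved, unresolved)):
        if index == 1 and reset_between:
            e = 0       # the behavior being fixed: the counter is zeroed
            start = 0   # between the two lists
        for entry in entries:
            if e >= start:
                if len(out) == room:
                    return out, e, False  # out of space, resume at e
                out.append(entry)
            e += 1
    return out, e, True


def dump_all(resolved, unresolved, room, reset_between=False, max_calls=20):
    """Call dump_once repeatedly, as a dump that spans several buffers would."""
    seen = []
    start = 0
    for _ in range(max_calls):
        out, start, finished = dump_once(resolved, unresolved, start, room,
                                         reset_between)
        seen.extend(out)
        if finished:
            break
    return seen
```

With three resolved entries, two unresolved entries and room for two per call, the continuous counter yields every entry exactly once, while the model with `reset_between=True` never reaches the last unresolved entry.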

Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 8fb472c09b ("ipmr: improve hash scalability")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 22:35:42 -07:00
Gregory CLEMENT
06a36ecb5d net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion()
The ocelot_vlant_wait_for_completion() function is very similar to the
ocelot_mact_wait_for_completion(). It seems to have been copied but the
comment was not updated, so let's fix it.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 22:33:43 -07:00
Xin Long
5660b9d9d6 sctp: fix the data size calculation in sctp_data_size
The SCTP data size should be calculated by subtracting the data chunk
header's length from chunk_hdr->length, not just the data header's length.
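
As a worked example of the arithmetic: the chunk length field covers the whole chunk, so both the common chunk header and the DATA-specific header must be subtracted to get the payload size. The Python sketch below assumes the plain DATA chunk layout (4-byte chunk header followed by a 12-byte data header); other chunk formats such as I-DATA have a different header length and are not modelled, and the function name is only illustrative.

```python
# Toy model: payload size of a plain SCTP DATA chunk.
CHUNK_HDR_LEN = 4   # type (1) + flags (1) + length (2)
DATA_HDR_LEN = 12   # TSN (4) + stream id (2) + stream seq (2) + PPID (4)


def sctp_data_size(chunk_length: int) -> int:
    """Return the user data size for a DATA chunk of the given total length."""
    header_len = CHUNK_HDR_LEN + DATA_HDR_LEN
    if chunk_length < header_len:
        raise ValueError("chunk shorter than its headers")
    return chunk_length - header_len
```

Subtracting only `DATA_HDR_LEN` would overstate the payload by the 4 bytes of the common chunk header.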

Fixes: 668c9beb90 ("sctp: implement assign_number for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 22:32:21 -07:00
Ake Koomsin
05c998b738 virtio_net: avoid using netif_tx_disable() for serializing tx routine
Commit 713a98d90c ("virtio-net: serialize tx routine during reset")
introduces netif_tx_disable() after netif_device_detach() in order to
avoid use-after-free of tx queues. However, there are two issues.

1) Its operation is redundant with netif_device_detach() in case the
   interface is running.
2) In case the interface is not running before suspending and
   resuming, the tx does not get resumed by netif_device_attach().
   This results in losing network connectivity.

It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for
serializing tx routine during reset. This also preserves the symmetry
of netif_device_detach() and netif_device_attach().

Fixes: 713a98d90c ("virtio-net: serialize tx routine during reset")
Signed-off-by: Ake Koomsin <ake@igel.co.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 22:29:30 -07:00
Greg Kroah-Hartman
9bd871df56 This fixes two bugs:
Merge tag 'trace-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Steven writes:
  "tracing: Two fixes for 4.19

   This fixes two bugs:
    - Fix size mismatch of tracepoint array
    - Have preemptirq test module use same clock source of the selftest"

* tag 'trace-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Use trace_clock_local() for looping in preemptirq_delay_test.c
  tracepoint: Fix tracepoint array element size mismatch
2018-10-18 07:29:05 +02:00