Pull UML updates from Richard Weinberger:
- a new and faster epoll based IRQ controller and NIC driver
- misc fixes and janitorial updates
* git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
Fix vector raw inintialization logic
Migrate vector timers to new timer API
um: Compile with modern headers
um: vector: Fix an error handling path in 'vector_parse()'
um: vector: Fix a memory allocation check
um: vector: fix missing unlock on error in vector_net_open()
um: Add missing EXPORT for free_irq_by_fd()
High Performance UML Vector Network Driver
Epoll based IRQ controller
um: Use POSIX ucontext_t instead of struct ucontext
um: time: Use timespec64 for persistent clock
um: Restore symbol versions for __memcpy and memcpy
Here is a very small set of fixes for inclusion in linux-4.17-rc1:
Two changes for the maintainer file, and one more fix for the newly
added npcm platform, to enable the level 2 cache controller.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJazib3AAoJEGCrR//JCVIndE8P/1vRE7HcR+DPVwjDWGdfDhQ+
LuF86+eZRwxHewykA+RQpsdQ8c2N2g+5oL20EvtJpInS/2iE36evZicFIMHQKfnL
cbij5fi2/hlCRGWfscr2g5Zq1KNPSLqyfRms5z27TRddYoZMYqMg67TwwkQ0tdmf
WeQ+2Dg17SaZecGFLWmr8cJlo+bx6U32KYHOMo7X8mhW91GPEHLqh/u+LhVQPnnt
LdZ4IcMH4lC/uXl39qojT8aWmzm2fcKkYBJAMbm3cL3tu76Jyy8rEy5AAdqjHsin
Xb+wcnCfk6q+dO3OjFhA3bvgDw4LUIyntMiFQvANWWMiGrmIynw9VYIMiQ2Y0S15
Kv9e2uwkUHiuWJZCDbFCa9pa0kDmxMpoC45wDBR2ktuAa4HB/BP6PKTFrLC0Fxag
KJvRRDklxHDWbsA+mZTser1mEKG9kslnGNfF3LCALWnFkMgWKiV1NuLr8HucaO7d
7b7ppofmjva+qcHjDd8I9r4qFnLcWDJHZg3cZus+4WjTHt5ws+ueb0pYPNIKvnE1
dhvjY9dSmeEZrM8rqLcUD9Npnm4GA+ySFpFty1jPP4hb0Q+Ho1p8BssP1Ktxw8mX
FQ59x/RYj/5+yPj7st0ERZsxir4D39H3PlYsTWeqjZCCG8VMgBX+UND0ozh7ytYM
1+1r1us28shsL5Xp6HRo
=kiZ6
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Here is a very small set of fixes for inclusion in linux-4.17-rc1: Two
changes for the maintainer file, and one more fix for the newly added
npcm platform, to enable the level 2 cache controller"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: Update ASPEED entry with details
MAINTAINERS: Migrate oxnas list to groups.io
arm: npcm: enable L2 cache in NPCM7xx architecture
* Since cifs_vfs_error was just using pr_debug_ratelimited like the rest
of cifs_dbg, move it there too
* Add a ONCE type flag to call the pr_xxx_once() debug function instead
of the ratelimited ones.
To convert existing printk_once() calls to this we can run:
perl -i -pE \
's/printk_once\s*\(([^" \n]+)(.*)/cifs_dbg(VFS|ONCE,$2/g' \
fs/cifs/*.c
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
On 32-bit (e.g. with m68k-linux-gnu-gcc-4.1):
fs/cifs/inode.c: In function ‘simple_hashstr’:
fs/cifs/inode.c:713: warning: integer constant is too large for ‘long’ type
Fixes: 7ea884c77e ("smb3: Fix root directory when server returns inode number of zero")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Adding an extra debug message to show if a tree connect failure during
reconnect (and made it a log once so it doesn't spam the logs).
Saw a case recently where tree connect repeatedly returned
access denied on reconnect and it wasn't as easy to spot as it
should have been.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
tcon->ses is being dereferenced before it is null checked, hence
there is a potential null pointer dereference.
Fix this by moving the pointer dereference after tcon->ses has
been properly null checked.
Addresses-Coverity-ID: 1467426 ("Dereference before null check")
Fixes: 93012bf984 ("cifs: add server->vals->header_preamble_size")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Guillaume Nault says:
====================
l2tp: tunnel creation fixes
L2TP tunnel creation is racy. We need to make sure that the tunnel
returned by l2tp_tunnel_create() isn't going to be freed while the
caller is using it. This is done in patch #1, by separating tunnel
creation from tunnel registration.
With the tunnel registration code in place, we can now check for
duplicate tunnels in a race-free way. This is done in patch #2, which
incidentally removes the last use of l2tp_tunnel_find().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't use l2tp_tunnel_find() to prevent l2tp_nl_cmd_tunnel_create()
from creating a duplicate tunnel. A tunnel can be concurrently
registered after l2tp_tunnel_find() returns. Therefore, searching for
duplicates must be done at registration time.
Finally, remove l2tp_tunnel_find() entirely as it isn't use anywhere
anymore.
Fixes: 309795f4be ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
l2tp_tunnel_create() inserts the new tunnel into the namespace's tunnel
list and sets the socket's ->sk_user_data field, before returning it to
the caller. Therefore, there are two ways the tunnel can be accessed
and freed, before the caller even had the opportunity to take a
reference. In practice, syzbot could crash the module by closing the
socket right after a new tunnel was returned to pppol2tp_create().
This patch moves tunnel registration out of l2tp_tunnel_create(), so
that the caller can safely hold a reference before publishing the
tunnel. This second step is done with the new l2tp_tunnel_register()
function, which is now responsible for associating the tunnel to its
socket and for inserting it into the namespace's list.
While moving the code to l2tp_tunnel_register(), a few modifications
have been done. First, the socket validation tests are done in a helper
function, for clarity. Also, modifying the socket is now done after
having inserted the tunnel to the namespace's tunnels list. This will
allow insertion to fail, without having to revert theses modifications
in the error path (a followup patch will check for duplicate tunnels
before insertion). Either the socket is a kernel socket which we
control, or it is a user-space socket for which we have a reference on
the file descriptor. In any case, the socket isn't going to be closed
from under us.
Reported-by: syzbot+fbeeb5c3b538e8545644@syzkaller.appspotmail.com
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The API docs describe i2c_transfer() as taking a pointer to an array
of i2c_msg containing at least 1 entry, but leaves it to the individual
drivers to sanity check the msgs and num parameters. Let's do this in
core code instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[wsa: changed '<= 0' to '< 1']
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Intentionally missing i2c-riic here, Chris Brandt will add himself for
that one later.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The number of I2C host controller drivers keeps increasing, and although
I had some success acquiring specific driver maintainers, my bandwidth
is by far not enough to act as a fallback for the rest of the drivers.
To reflect this status-quo in MAINTAINERS, add a separate entry for I2C
host drivers, let the I2C list (= community) be the contact point, and
mark this section as "Odd fixes".
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
On some systems, the BIOS expects certain SMBus register values to
match the hardware defaults. Restore these configuration registers at
shutdown time to avoid confusing the BIOS. This avoids hard-locking
such systems upon reboot.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org
Saving the original value of register SMBSLVCMD in
i801_enable_host_notify() doesn't work, because this function is
called not only at probe time but also at resume time. Do it in
i801_probe() instead, so that the saved value is not overwritten at
resume time.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 22e94bd677 ("i2c: i801: store and restore the SLVCMD register at load and unload")
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org # v4.10+
I added dumping of link information about tun devices over netlink in
commit 1ec010e705 ("tun: export flags, uid, gid, queue information
over netlink"), but didn't add the missing netlink notifications when
the device's exported properties change.
This patch adds notifications when owner/group or flags are modified,
when queues are attached/detached, and when a tun fd is closed.
Reported-by: Thomas Haller <thaller@redhat.com>
Fixes: 1ec010e705 ("tun: export flags, uid, gid, queue information over netlink")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise, register_netdevice advertises the creation of the device with
the default flags, instead of what the user requested.
Reported-by: Thomas Haller <thaller@redhat.com>
Fixes: 1ec010e705 ("tun: export flags, uid, gid, queue information over netlink")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 92571a1aae ("lan78xx: Connect phy early") moves the PHY
initialisation into lan78xx_probe, but lan78xx_open subsequently calls
lan78xx_reset. As well as forcing a second round of link negotiation,
this reset frequently prevents the phy interrupt from being generated
(even though the link is up), rendering the interface unusable.
Fix this issue by removing the lan78xx_reset call from lan78xx_open.
Fixes: 92571a1aae ("lan78xx: Connect phy early")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Fixes for net.
This bug fix series include NULL pointer fixes in ethtool -x code path
and in the error clean up path when freeing IRQs, a ring accounting bug
that missed rings used by the RDMA driver, and 3 bug fixes related to TC
Flower and VF-reps.
v2: Fixed commit message of patch 4. Changed the pound sign to $ sign
in front of the ip command.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When open fails during ethtool -L ring change, for example, the driver
may crash at bnxt_free_irq() because bp->bnapi is NULL.
If we fail to allocate all the new rings, bnxt_open_nic() will free
all the memory including bp->bnapi. Subsequent call to bnxt_close_nic()
will try to dereference bp->bnapi in bnxt_free_irq().
Fix it by checking for !bp->bnapi in bnxt_free_irq().
Fixes: e5811b8c09 ("bnxt_en: Add IRQ remapping logic.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With recent changes to reserve both L2 and RDMA rings, we need to include
the RDMA rings in bnxt_check_rings(). Otherwise we will under-estimate
the rings we need during ethtool -L and may lead to failure.
Fixes: fbcfc8e467 ("bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While a VF is configured with a bigger mtu (> 1500), any packets that
are punted to the VF-rep (slow-path) get dropped by OVS kernel-datapath
with the following message: "dropped over-mtu packet". Fix this by
returning the max-mtu value for a VF-rep derived from its corresponding VF.
VF-rep's mtu can be changed using 'ip' command as shown in this example:
$ ip link set bnxt0_pf0vf0 mtu 9000
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver currently uses src port field (along with other fields) in the
decap tunnel key, while looking up and adding tunnel nodes. This leads to
redundant cfa_decap_filter_alloc() requests to the FW and flow-miss in the
flow engine. Fix this by ignoring the src port field in decap tunnel nodes.
Fixes: f484f6782e ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this patch the following commands would succeed as far as the
user was concerned:
$ tc qdisc add dev p1p1 ingress
$ tc filter add dev p1p1 parent ffff: protocol all \
flower skip_sw action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:01/44 action drop
The current flow offload infrastructure used does not support wildcard
matching for ethernet headers, so do not allow the second or third
commands to succeed. If a user wants to drop traffic on that interface
the protocol and MAC addresses need to be specified explicitly:
$ tc qdisc add dev p1p1 ingress
$ tc filter add dev p1p1 parent ffff: protocol arp \
flower skip_sw action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw action drop
...
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:01 action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:02 action drop
...
There are also checks for VLAN parameters in this patch as other callers
may wildcard those parameters even if tc does not. Using different
flow infrastructure could allow this to work in the future for L2 flows,
but for now it does not.
Fixes: 2ae7408fed ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix ethtool .get_rxfh() crash by checking for valid indirection table
address before copying the data.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding missing call to cache current backlight values.
Otherwise the brightness resets to default value on resume.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With this the dGPU turns on correctly.
Signed-off-by: Nico Sneck <nicosneck@hotmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Merge more updates from Andrew Morton:
- almost all of the rest of MM
- kasan updates
- lots of procfs work
- misc things
- lib/ updates
- checkpatch
- rapidio
- ipc/shm updates
- the start of willy's XArray conversion
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (140 commits)
page cache: use xa_lock
xarray: add the xa_lock to the radix_tree_root
fscache: use appropriate radix tree accessors
export __set_page_dirty
unicore32: turn flush_dcache_mmap_lock into a no-op
arm64: turn flush_dcache_mmap_lock into a no-op
mac80211_hwsim: use DEFINE_IDA
radix tree: use GFP_ZONEMASK bits of gfp_t for flags
linux/const.h: refactor _BITUL and _BITULL a bit
linux/const.h: move UL() macro to include/linux/const.h
linux/const.h: prefix include guard of uapi/linux/const.h with _UAPI
xen, mm: allow deferred page initialization for xen pv domains
elf: enforce MAP_FIXED on overlaying elf segments
fs, elf: drop MAP_FIXED usage from elf_map
mm: introduce MAP_FIXED_NOREPLACE
MAINTAINERS: update bouncing aacraid@adaptec.com addresses
fs/dcache.c: add cond_resched() in shrink_dentry_list()
include/linux/kfifo.h: fix comment
ipc/shm.c: shm_split(): remove unneeded test for NULL shm_file_data.vm_ops
kernel/sysctl.c: add kdoc comments to do_proc_do{u}intvec_minmax_conv_param
...
Add support macros to conditionally yield the NEON (and thus the CPU)
that may be called from the assembler code.
In some cases, yielding the NEON involves saving and restoring a non
trivial amount of context (especially in the CRC folding algorithms),
and so the macro is split into three, and the code in between is only
executed when the yield path is taken, allowing the context to be preserved.
The third macro takes an optional label argument that marks the resume
path after a yield has been performed.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We are going to add code to all the NEON crypto routines that will
turn them into non-leaf functions, so we need to manage the stack
frames. To make this less tedious and error prone, add some macros
that take the number of callee saved registers to preserve and the
extra size to allocate in the stack frame (for locals) and emit
the ldp/stp sequences.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
bpi.S was introduced as we were starting to build the Spectre v2
mitigation framework, and it was rather unclear that it would
become strictly KVM specific.
Now that the picture is a lot clearer, let's move the content
of that file to hyp-entry.S, where it actually belong.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The very existence of __smccc_workaround_1_hvc_* is a thinko, as
KVM will never use a HVC call to perform the branch prediction
invalidation. Even as a nested hypervisor, it would use an SMC
instruction.
Let's get rid of it.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since 5e7951ce19 ("arm64: capabilities: Clean up midr range helpers"),
capabilities must be represented with a single entry. If multiple
CPU types can use the same capability, then they need to be enumerated
in a list.
The EL2 hardening stuff (which affects both A57 and A72) managed to
escape the conversion in the above patch thanks to the 4.17 merge
window. Let's fix it now.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC
V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
of Silicon provider service ID 0xC2001700.
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[maz: reworked errata framework integration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Remove the address_space ->tree_lock and use the xa_lock newly added to
the radix_tree_root. Rename the address_space ->page_tree to ->i_pages,
since we don't really care that it's a tree.
[willy@infradead.org: fix nds32, fs/dax.c]
Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.orgLink: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This results in no change in structure size on 64-bit machines as it
fits in the padding between the gfp_t and the void *. 32-bit machines
will grow the structure from 8 to 12 bytes. Almost all radix trees are
protected with (at least) a spinlock, so as they are converted from
radix trees to xarrays, the data structures will shrink again.
Initialising the spinlock requires a name for the benefit of lockdep, so
RADIX_TREE_INIT() now needs to know the name of the radix tree it's
initialising, and so do IDR_INIT() and IDA_INIT().
Also add the xa_lock() and xa_unlock() family of wrappers to make it
easier to use the lock. If we could rely on -fplan9-extensions in the
compiler, we could avoid all of this syntactic sugar, but that wasn't
added until gcc 4.6.
Link: http://lkml.kernel.org/r/20180313132639.17387-8-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't open-code accesses to data structure internals.
Link: http://lkml.kernel.org/r/20180313132639.17387-7-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
XFS currently contains a copy-and-paste of __set_page_dirty(). Export
it from buffer.c instead.
Link: http://lkml.kernel.org/r/20180313132639.17387-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unicore doesn't walk the VMA tree in its flush_dcache_page()
implementation, so has no need to take the tree_lock.
Link: http://lkml.kernel.org/r/20180313132639.17387-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ARM64 doesn't walk the VMA tree in its flush_dcache_page()
implementation, so has no need to take the tree_lock.
Link: http://lkml.kernel.org/r/20180313132639.17387-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is preferred to opencoding an IDA_INIT.
Link: http://lkml.kernel.org/r/20180313132639.17387-2-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "XArray", v9. (First part thereof).
This patchset is, I believe, appropriate for merging for 4.17. It
contains the XArray implementation, to eventually replace the radix
tree, and converts the page cache to use it.
This conversion keeps the radix tree and XArray data structures in sync
at all times. That allows us to convert the page cache one function at
a time and should allow for easier bisection. Other than renaming some
elements of the structures, the data structures are fundamentally
unchanged; a radix tree walk and an XArray walk will touch the same
number of cachelines. I have changes planned to the XArray data
structure, but those will happen in future patches.
Improvements the XArray has over the radix tree:
- The radix tree provides operations like other trees do; 'insert' and
'delete'. But what most users really want is an automatically
resizing array, and so it makes more sense to give users an API that
is like an array -- 'load' and 'store'. We still have an 'insert'
operation for users that really want that semantic.
- The XArray considers locking as part of its API. This simplifies a
lot of users who formerly had to manage their own locking just for
the radix tree. It also improves code generation as we can now tell
RCU that we're holding a lock and it doesn't need to generate as much
fencing code. The other advantage is that tree nodes can be moved
(not yet implemented).
- GFP flags are now parameters to calls which may need to allocate
memory. The radix tree forced users to decide what the allocation
flags would be at creation time. It's much clearer to specify them at
allocation time.
- Memory is not preloaded; we don't tie up dozens of pages on the off
chance that the slab allocator fails. Instead, we drop the lock,
allocate a new node and retry the operation. We have to convert all
the radix tree, IDA and IDR preload users before we can realise this
benefit, but I have not yet found a user which cannot be converted.
- The XArray provides a cmpxchg operation. The radix tree forces users
to roll their own (and at least four have).
- Iterators take a 'max' parameter. That simplifies many users and will
reduce the amount of iteration done.
- Iteration can proceed backwards. We only have one user for this, but
since it's called as part of the pagefault readahead algorithm, that
seemed worth mentioning.
- RCU-protected pointers are not exposed as part of the API. There are
some fun bugs where the page cache forgets to use rcu_dereference()
in the current codebase.
- Value entries gain an extra bit compared to radix tree exceptional
entries. That gives us the extra bit we need to put huge page swap
entries in the page cache.
- Some iterators now take a 'filter' argument instead of having
separate iterators for tagged/untagged iterations.
The page cache is improved by this:
- Shorter, easier to read code
- More efficient iterations
- Reduction in size of struct address_space
- Fewer walks from the top of the data structure; the XArray API
encourages staying at the leaf node and conducting operations there.
This patch (of 8):
None of these bits may be used for slab allocations, so we can use them
as radix tree flags as long as we mask them off before passing them to
the slab allocator. Move the IDR flag from the high bits to the
GFP_ZONEMASK bits.
Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minor cleanups available by _UL and _ULL.
Link: http://lkml.kernel.org/r/1519301715-31798-5-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ARM, ARM64 and UniCore32 duplicate the definition of UL():
#define UL(x) _AC(x, UL)
This is not actually arch-specific, so it will be useful to move it to a
common header. Currently, we only have the uapi variant for
linux/const.h, so I am creating include/linux/const.h.
I also added _UL(), _ULL() and ULL() because _AC() is mostly used in
the form either _AC(..., UL) or _AC(..., ULL). I expect they will be
replaced in follow-up cleanups. The underscore-prefixed ones should
be used for exported headers.
Link: http://lkml.kernel.org/r/1519301715-31798-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "linux/const.h: cleanups of macros such as UL(), _BITUL(),
BIT() etc", v3.
ARM, ARM64, UniCore32 define UL() as a shorthand of _AC(..., UL). More
architectures may introduce it in the future.
UL() is arch-agnostic, and useful. So let's move it to
include/linux/const.h
Currently, <asm/memory.h> must be included to use UL(). It pulls in more
bloats just for defining some bit macros.
I posted V2 one year ago.
The previous posts are:
https://patchwork.kernel.org/patch/9498273/https://patchwork.kernel.org/patch/9498275/https://patchwork.kernel.org/patch/9498269/https://patchwork.kernel.org/patch/9498271/
At that time, what blocked this series was a comment from
David Howells:
You need to be very careful doing this. Some userspace stuff
depends on the guard macro names on the kernel header files.
(https://patchwork.kernel.org/patch/9498275/)
Looking at the code closer, I noticed this is not a problem.
See the following line.
https://github.com/torvalds/linux/blob/v4.16-rc2/scripts/headers_install.sh#L40
scripts/headers_install.sh rips off _UAPI prefix from guard macro names.
I ran "make headers_install" and confirmed the result is what I expect.
So, we can prefix the include guard of include/uapi/linux/const.h,
and add a new include/linux/const.h.
This patch (of 4):
I am going to add include/linux/const.h for the kernel space.
Add _UAPI to the include guard of include/uapi/linux/const.h to
prepare for that.
Please notice the guard name of the exported one will be kept as-is.
So, this commit has no impact to the userspace even if some userspace
stuff depends on the guard macro names.
scripts/headers_install.sh processes exported headers by SED, and
rips off "_UAPI" from guard macro names.
#ifndef _UAPI_LINUX_CONST_H
#define _UAPI_LINUX_CONST_H
will be turned into
#ifndef _LINUX_CONST_H
#define _LINUX_CONST_H
Link: http://lkml.kernel.org/r/1519301715-31798-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Juergen Gross noticed that commit f7f99100d8 ("mm: stop zeroing memory
during allocation in vmemmap") broke XEN PV domains when deferred struct
page initialization is enabled.
This is because the xen's PagePinned() flag is getting erased from
struct pages when they are initialized later in boot.
Juergen fixed this problem by disabling deferred pages on xen pv
domains. It is desirable, however, to have this feature available as it
reduces boot time. This fix re-enables the feature for pv-dmains, and
fixes the problem the following way:
The fix is to delay setting PagePinned flag until struct pages for all
allocated memory are initialized, i.e. until after free_all_bootmem().
A new x86_init.hyper op init_after_bootmem() is called to let xen know
that boot allocator is done, and hence struct pages for all the
allocated memory are now initialized. If deferred page initialization
is enabled, the rest of struct pages are going to be initialized later
in boot once page_alloc_init_late() is called.
xen_after_bootmem() walks page table's pages and marks them pinned.
Link: http://lkml.kernel.org/r/20180226160112.24724-2-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Jinbum Park <jinb.park7@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jia Zhang <zhang.jia@linux.alibaba.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>