With the recent addition of filesystem checksum types other than CRC32c,
it is not anymore hard-coded which checksum type a btrfs filesystem uses.
Up to now there is no good way to read the filesystem checksum, apart from
reading the filesystem UUID and then query sysfs for the checksum type.
Add a new csum_type and csum_size fields to the BTRFS_IOC_FS_INFO ioctl
command which usually is used to query filesystem features. Also add a
flags member indicating that the kernel responded with a set csum_type and
csum_size field.
For compatibility reasons, only return the csum_type and csum_size if
the BTRFS_FS_INFO_FLAG_CSUM_INFO flag was passed to the kernel. Also
clear any unknown flags so we don't pass false positives to user-space
newer than the kernel.
To simplify further additions to the ioctl, also switch the padding to a
u8 array. Pahole was used to verify the result of this switch:
The csum members are added before flags, which might look odd, but this
is to keep the alignment requirements and not to introduce holes in the
structure.
$ pahole -C btrfs_ioctl_fs_info_args fs/btrfs/btrfs.ko
struct btrfs_ioctl_fs_info_args {
__u64 max_id; /* 0 8 */
__u64 num_devices; /* 8 8 */
__u8 fsid[16]; /* 16 16 */
__u32 nodesize; /* 32 4 */
__u32 sectorsize; /* 36 4 */
__u32 clone_alignment; /* 40 4 */
__u16 csum_type; /* 44 2 */
__u16 csum_size; /* 46 2 */
__u64 flags; /* 48 8 */
__u8 reserved[968]; /* 56 968 */
/* size: 1024, cachelines: 16, members: 10 */
};
Fixes: 3951e7f050 ("btrfs: add xxhash64 to checksumming algorithms")
Fixes: 3831bf0094 ("btrfs: add sha256 to checksumming algorithm")
CC: stable@vger.kernel.org # 5.5+
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The qgroup level is limited to u16, so no need to use u64 for it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Fix the Amlogic Video Framebuffer Compression modifier macro to
correctly add the layout options, a pair of parenthesis was missing.
Fixes: d6528ec883 ("drm/fourcc: Add modifier definitions for describing Amlogic Video Framebuffer Compression")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723090551.27529-1-narmstrong@baylibre.com
Add bpf_link-based API (bpf_xdp_link) to attach BPF XDP program through
BPF_LINK_CREATE command.
bpf_xdp_link is mutually exclusive with direct BPF program attachment,
previous BPF program should be detached prior to attempting to create a new
bpf_xdp_link attachment (for a given XDP mode). Once BPF link is attached, it
can't be replaced by other BPF program attachment or link attachment. It will
be detached only when the last BPF link FD is closed.
bpf_xdp_link will be auto-detached when net_device is shutdown, similarly to
how other BPF links behave (cgroup, flow_dissector). At that point bpf_link
will become defunct, but won't be destroyed until last FD is closed.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-5-andriin@fb.com
The bpf iterator for map elements are implemented.
The bpf program will receive four parameters:
bpf_iter_meta *meta: the meta data
bpf_map *map: the bpf_map whose elements are traversed
void *key: the key of one element
void *value: the value of the same element
Here, meta and map pointers are always valid, and
key has register type PTR_TO_RDONLY_BUF_OR_NULL and
value has register type PTR_TO_RDWR_BUF_OR_NULL.
The kernel will track the access range of key and value
during verification time. Later, these values will be compared
against the values in the actual map to ensure all accesses
are within range.
A new field iter_seq_info is added to bpf_map_ops which
is used to add map type specific information, i.e., seq_ops,
init/fini seq_file func and seq_file private data size.
Subsequent patches will have actual implementation
for bpf_map_ops->iter_seq_info.
In user space, BPF_ITER_LINK_MAP_FD needs to be
specified in prog attr->link_create.flags, which indicates
that attr->link_create.target_fd is a map_fd.
The reason for such an explicit flag is for possible
future cases where one bpf iterator may allow more than
one possible customization, e.g., pid and cgroup id for
task_file.
Current kernel internal implementation only allows
the target to register at most one required bpf_iter_link_info.
To support the above case, optional bpf_iter_link_info's
are needed, the target can be extended to register such link
infos, and user provided link_info needs to match one of
target supported ones.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com
Platform reboots are expensive. Towards reducing downtime to apply
firmware updates the Intel NVDIMM command definition is growing support
for applying live firmware updates that only require temporarily
suspending memory traffic instead of a full reboot.
Follow-on commits add support for triggering firmware activation, this
patch only defines the commands, adds probe support, and validates that
they are blocked via the ioctl path. The ioctl-path block ensures that
the OS is in charge since these commands have side effects only the OS
can handle. Specifically firmware activation may cause the memory
controller to be quiesced on the order of 100s of milliseconds. In that
case Linux ensure the activation only takes place while the OS is in a
suspend state.
Link: https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.
- Update the NFIT bus implementation (the only one that supports
ND_CMD_CALL passthrough) to also passlist the valid set of command
family indices.
- Update the generic __nd_ioctl() path to validate that field on behalf
of all implementations.
Fixes: 31eca76ba2 ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
The UDP reuseport conflict was a little bit tricky.
The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.
At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.
This requires moving the reuseport_has_conns() logic into the callers.
While we are here, get rid of inline directives as they do not belong
in foo.c files.
The other changes were cases of more straightforward overlapping
modifications.
Signed-off-by: David S. Miller <davem@davemloft.net>
The large bucket feature is to extend bucket_size from 16bit to 32bit.
When create cache device on zoned device (e.g. zoned NVMe SSD), making
a single bucket cover one or more zones of the zoned device is the
simplest way to support zoned device as cache by bcache.
But current maximum bucket size is 16MB and a typical zone size of zoned
device is 256MB, this is the major motiviation to extend bucket size to
a larger bit width.
This patch is the basic and first change to support large bucket size,
the major changes it makes are,
- Add BCH_FEATURE_INCOMPAT_LARGE_BUCKET for the large bucket feature,
INCOMPAT means it introduces incompatible on-disk format change.
- Add BCH_FEATURE_INCOMPAT_FUNCS(large_bucket, LARGE_BUCKET) routines.
- Adds __le16 bucket_size_hi into struct cache_sb_disk at offset 0x8d0
for the on-disk super block format.
- For the in-memory super block struct cache_sb, member bucket_size is
extended from __u16 to __32.
- Add get_bucket_size() to combine the bucket_size and bucket_size_hi
from struct cache_sb_disk into an unsigned int value.
Since we already have large bucket size helpers meta_bucket_pages(),
meta_bucket_bytes() and alloc_meta_bucket_pages(), they make sure when
bucket size > 8MB, the memory allocation for bcache meta data bucket
won't fail no matter how large the bucket size extended. So these meta
data buckets are handled properly when the bucket size width increase
from 16bit to 32bit, we don't need to worry about them.
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We have struct cache_sb_disk for on-disk super block already, it is
unnecessary to keep the in-memory super block format exactly mapping
to the on-disk struct layout.
This patch adds code comments to notice that struct cache_sb is not
exactly mapping to cache_sb_disk, and removes the useless member csum
and pad[5].
Although struct cache_sb does not belong to uapi, but there are still
some on-disk format related macros reference it and it is unncessary to
get rid of such dependency now. So struct cache_sb will continue to stay
in include/uapi/linux/bache.h for now.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The new added super block version BCACHE_SB_VERSION_BDEV_WITH_FEATURES
(5) BCACHE_SB_VERSION_CDEV_WITH_FEATURES value (6), is for the feature
set bits.
Devices have super block version equal to the new version will have
three new members for feature set bits in the on-disk super block,
__le64 feature_compat;
__le64 feature_incompat;
__le64 feature_ro_compat;
They are used for further new features which may introduce on-disk
format change, and avoid unncessary super block version increase.
The very basic features handling code skeleton is also initialized in
this patch.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Extend the rfc 4884 read interface introduced for ipv4 in
commit eba75c587e ("icmp: support rfc 4884") to ipv6.
Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.
Changes v1->v2:
- make ipv6_icmp_error_rfc4884 static (file scope)
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding new cls flower keys for hash value and hash
mask and dissect the hash info from the skb into
the flow key towards flow classication.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nnothing uses the enum name, so this is harmless.
Fixes: 3226944124 ("IB/mlx5: Introduce driver create and destroy flow methods")
Link: https://lore.kernel.org/r/20200724084112.GC31930@amd
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Merge in io_uring-5.8 fixes, as changes/cleanups to how we do locked
mem accounting require a fixup, and only one of the spots are noticed
by git as the other merges cleanly. The flags fix from io_uring-5.8
also causes a merge conflict, the leak fix for recvmsg, the double poll
fix, and the link failure locking fix.
* io_uring-5.8:
io_uring: fix lockup in io_fail_links()
io_uring: fix ->work corruption with poll_add
io_uring: missed req_init_async() for IOSQE_ASYNC
io_uring: always allow drain/link/hardlink/async sqe flags
io_uring: ensure double poll additions work with both request types
io_uring: fix recvmsg memory leak with buffer selection
io_uring: fix not initialised work->flags
io_uring: fix missing msg_name assignment
io_uring: account user memory freed when exit has been queued
io_uring: fix memleak in io_sqe_files_register()
io_uring: fix memleak in __io_sqe_files_update()
io_uring: export cq overflow status to userspace
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add command submission statistics structure which can be obtained
through the info ioctl. Each drop counter describes the reason for
which the command submission was dropped.
This information is needed for the user to be aware of the specific
reason for which the submitted work was dropped. The user can then
utilize the driver more efficiently.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
0+eXO4A=
=9EpR
-----END PGP SIGNATURE-----
Merge v5.8-rc6 into drm-next
I've got a silent conflict + two trees based on fixes to merge.
Fixes a silent merge with amdgpu
Signed-off-by: Dave Airlie <airlied@redhat.com>
Drop the repeated word "the" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: linux-hyperv@vger.kernel.org
Link: https://lore.kernel.org/r/20200719002841.20369-1-rdunlap@infradead.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Here is the (slightly larger than usual) patch set for the 5.9-rc1 merge
window.
DFL:
- Xu's changes add support for AFU interrupt handling and puts them to
use for error handling.
- Xu's other change also adds another device-id for the Intel FPGA PAC N3000.
- John's change converts from using get_user_pages() to
pin_user_pages().
- Gustavo's patch cleans up some of the allocation by using
struct_size().
Xilinx:
- Luca's changes clean up the xilinx-spi and xilinx-slave-serial drivers
and updates the comments and dt-bindings to reflect the fact it also
supports 7 series devices.
Core:
- Tom cleaned up the fpga-bridge / fpga-mgr core by removing some
dead-stores.
All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my for-next branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
-----BEGIN PGP SIGNATURE-----
iIUEABYIAC0WIQRORt0E5Sb/c/mZMgkXxQAtim5VSwUCXxJbZQ8cbWRmQGtlcm5l
bC5vcmcACgkQF8UALYpuVUsAIgD/eMm7y2Kgw0vnteTR4diLH0T8isv24ZjgRNL1
2FpscKsBAIp7XJUF9z1I85D476GAlArbxGK8BpscbKPJVw3mP9AC
=Owcl
-----END PGP SIGNATURE-----
Merge tag 'fpga-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga into char-misc-next
Moritz writes:
FPGA Manager changes for 5.9-rc1
Here is the (slightly larger than usual) patch set for the 5.9-rc1 merge
window.
DFL:
- Xu's changes add support for AFU interrupt handling and puts them to
use for error handling.
- Xu's other change also adds another device-id for the Intel FPGA PAC N3000.
- John's change converts from using get_user_pages() to
pin_user_pages().
- Gustavo's patch cleans up some of the allocation by using
struct_size().
Xilinx:
- Luca's changes clean up the xilinx-spi and xilinx-slave-serial drivers
and updates the comments and dt-bindings to reflect the fact it also
supports 7 series devices.
Core:
- Tom cleaned up the fpga-bridge / fpga-mgr core by removing some
dead-stores.
All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my for-next branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: dfl: pci: add device id for Intel FPGA PAC N3000
Documentation: fpga: dfl: add descriptions for interrupt related interfaces.
fpga: dfl: afu: add AFU interrupt support
fpga: dfl: fme: add interrupt support for global error reporting
fpga: dfl: afu: add interrupt support for port error reporting
fpga: dfl: introduce interrupt trigger setting API
fpga: dfl: pci: add irq info for feature devices enumeration
fpga: dfl: parse interrupt info for feature devices on enumeration
fpga manager: xilinx-spi: check INIT_B pin during write_init
dt-bindings: fpga: xilinx-slave-serial: add optional INIT_B GPIO
fpga: Fix dead store in fpga-bridge.c
fpga: Fix dead store fpga-mgr.c
fpga: dfl: Use struct_size() in kzalloc()
fpga manager: xilinx-spi: remove unneeded, mistyped variables
fpga manager: xilinx-spi: valid for the 7 Series too
dt-bindings: fpga: xilinx-slave-serial: valid for the 7 Series too
fpga: dfl: afu: convert get_user_pages() --> pin_user_pages()
amd-drm-next-5.9-2020-07-17:
amdgpu:
- SI UVD/VCE clock support
- Updates for Sienna Cichlid
- Expose drm rotation property
- Atomfirmware updates for renoir
- updates to GPUVM hub handling for different register layouts
- swSMU restructuring and cleanups
- RAS fixes
- DC fixes
- mode1 reset support for Sienna Cichlid
- Add support for Navy Flounder GPUs
amdkfd:
- Add SMI events watch interface
UAPI:
- Add amdkfd SMI events watch interface
Userspace which uses this interface:
2235ede34c
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200717132022.4014-1-alexander.deucher@amd.com
UAPI Changes:
Cross-subsystem Changes:
- Convert panel-dsi-cm and ingenic bindings to YAML.
- Add lockdep annotations for dma-fence. \o/
- Describe why indefinite fences are a bad idea
- Update binding for rocktech jh057n00900.
Core Changes:
- Add vblank workers.
- Use spin_(un)lock_irq instead of the irqsave/restore variants in crtc code.
- Add managed vram helpers.
- Convert more logging to drm functions.
- Replace more http links with https in core and drivers.
- Cleanup to ttm iomem functions and implementation.
- Remove TTM CMA memtype as it doesn't work correctly.
- Remove TTM_MEMTYPE_FLAG_MAPPABLE for many drivers that have no
unmappable memory resources.
Driver Changes:
- Add CRC support to nouveau, using the new vblank workers.
- Dithering and atomic state fix for nouveau.
- Fixes for Frida FRD350H54004 panel.
- Add support for OSD mode (sprite planes), IPU (scaling) and multiple
panels/bridges to ingenic.
- Use managed vram helpers in ast.
- Assorted small fixes to ingenic, i810, mxsfb.
- Remove optional unused ttm dummy functions.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl8YFw0ACgkQ/lWMcqZw
E8MlixAAk+lI5qZ95xtZL8Evdg4c70wYYLuPKz5JPb/lTYoV0MciFEUCF5J2Df9N
83oMB1fycCEe396fb0aQlzq7IzMV5RFF+Y4hrDSqq0m7prlK7EphVTmTlaSFovPW
nQQTXBuET9LzgM7Dnhu4MPbD75IeFZ+pT+58yr3oUjki3r9bf0KIgy9QnkanDwl1
nH1b2MCCqvVjgca8zk3NpD4H2FwOpLL87/SQmRINR9CshvuACZD6zRLoJuKrWGzs
XZDVifhTib/ZfONr6tTWgsfv5d4IEifKml6XOV5OPy0K37u9tG0MmjJOm7pswq1/
F8oyvNWbfP7IOgTeBvT3sDMgVv4v8rvHumYoL+J4v0Sg4Qpsro/KDX9aLQLT1SIA
ZlHahSxW10H699UrV4Lr6DnW1caTaWLuvJyqmo838MNhskuEmKFWGxOPH8oGqcwW
2/Hk8Ni+z2q7do+VWwezniy9k2d4AHF40B1ZjzWMR3dCQdz/sCHd7YY4fLOeGEF3
C5zx6On9+S80iok4zfSATI/uVd+zwsngaqGsxZkoOhLum7xS1s8JwIyeVjTPQ2OX
iOKH4vFu3sIBzq0dz1Wrha2uBRDur+nzDL2EqD9EuSBDMN0Du5cVic94QoaXNUeO
9yBiDNF8p8xLX2TdfzeGPx9hxchs+2Tx9GV1B62KNpkbIo3Ix0I=
=jDr3
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:
UAPI Changes:
Cross-subsystem Changes:
- Convert panel-dsi-cm and ingenic bindings to YAML.
- Add lockdep annotations for dma-fence. \o/
- Describe why indefinite fences are a bad idea
- Update binding for rocktech jh057n00900.
Core Changes:
- Add vblank workers.
- Use spin_(un)lock_irq instead of the irqsave/restore variants in crtc code.
- Add managed vram helpers.
- Convert more logging to drm functions.
- Replace more http links with https in core and drivers.
- Cleanup to ttm iomem functions and implementation.
- Remove TTM CMA memtype as it doesn't work correctly.
- Remove TTM_MEMTYPE_FLAG_MAPPABLE for many drivers that have no
unmappable memory resources.
Driver Changes:
- Add CRC support to nouveau, using the new vblank workers.
- Dithering and atomic state fix for nouveau.
- Fixes for Frida FRD350H54004 panel.
- Add support for OSD mode (sprite planes), IPU (scaling) and multiple
panels/bridges to ingenic.
- Use managed vram helpers in ast.
- Assorted small fixes to ingenic, i810, mxsfb.
- Remove optional unused ttm dummy functions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d6bf269e-ccb2-8a7b-fdae-226e9e3f8274@linux.intel.com
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-07-21
The following pull-request contains BPF updates for your *net-next* tree.
We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).
The main changes are:
1) Run BPF program on socket lookup, from Jakub.
2) Introduce cpumap, from Lorenzo.
3) s390 JIT fixes, from Ilya.
4) teach riscv JIT to emit compressed insns, from Luke.
5) use build time computed BTF ids in bpf iter, from Yonghong.
====================
Purely independent overlapping changes in both filter.h and xdp.h
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the doubled word "the" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: linux-raid@vger.kernel.org
Signed-off-by: Song Liu <songliubraving@fb.com>
The commit fe80536acf ("bareudp: Added attribute to enable & disable
rx metadata collection") breaks the the original(5.7) default behavior of
bareudp module to collect RX metadadata at the receive. It was added to
avoid the crash at the kernel neighbour subsytem when packet with metadata
from bareudp is processed. But it is no more needed as the
commit 394de110a7 ("net: Added pointer check for
dst->ops->neigh_lookup in dst_neigh_lookup_skb") solves this crash.
Fixes: fe80536acf ("bareudp: Added attribute to enable & disable rx metadata collection")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In environments where the preservation of audit events and predictable
usage of system memory are prioritized, admins may use a combination of
--backlog_wait_time and -b options at the risk of degraded performance
resulting from backlog waiting. In some cases, this risk may be
preferred to lost events or unbounded memory usage. Ideally, this risk
can be mitigated by making adjustments when backlog waiting is detected.
However, detection can be difficult using the currently available
metrics. For example, an admin attempting to debug degraded performance
may falsely believe a full backlog indicates backlog waiting. It may
turn out the backlog frequently fills up but drains quickly.
To make it easier to reliably track degraded performance to backlog
waiting, this patch makes the following changes:
Add a new field backlog_wait_time_total to the audit status reply.
Initialize this field to zero. Add to this field the total time spent
by the current task on scheduled timeouts while the backlog limit is
exceeded. Reset field to zero upon request via AUDIT_SET.
Tested on Ubuntu 18.04 using complementary changes to the
audit-userspace and audit-testsuite:
- https://github.com/linux-audit/audit-userspace/pull/134
- https://github.com/linux-audit/audit-testsuite/pull/97
Signed-off-by: Max Englander <max.englander@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change the repeated word "it" in a comment to "it to".
Also insert a dash in the sentence to add clarity.
Link: https://lore.kernel.org/r/20200719003220.21250-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to support short clock counters, provide an ABI extension.
As a whole:
u64 time, delta, cyc = read_cycle_counter();
+ if (cap_user_time_short)
+ cyc = time_cycle + ((cyc - time_cycle) & time_mask);
delta = mul_u64_u32_shr(cyc, time_mult, time_shift);
if (cap_user_time_zero)
time = time_zero + delta;
delta += time_offset;
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-6-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719171428.60470-1-grandmaster@al2klimov.de
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
0+eXO4A=
=9EpR
-----END PGP SIGNATURE-----
Merge 5.8-rc6 into usb-next
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
0+eXO4A=
=9EpR
-----END PGP SIGNATURE-----
Merge 5.8-rc6 into tty-next
We need the serial/tty fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UAPI Changes:
Cross-subsystem Changes:
- Add ckoenig as dma-buf maintainer.
- Revert invalid fix for dma-fence-chain, and fix selftest.
- Add fixmes to amifb about APUS support.
- Use array3_size in fbcon_prepare_logo, and struct_size() in alloc_apertures.
- Fix leaks in neofb, fb/savage and omapfb.
- Other small fixes to fb code.
- Convert some dt bindings to schema for some panels, and fix simple-framebuffer dt example.
Core Changes:
- Add DRM_FORMAT_MOD_GENERIC_16_16_TILE as alias to DRM_FORMAT_MOD_SAMSUNG_16_16_TILE,
as it can be used more generic.
- Add support for multiple DispID extension blocks in edid.
- Use https instead of http for some of the urls.
- Use drm_* macros for logging in mipi-dsi and fb-helper.
- Further cleanup ttm_mem_reg handling.
- Remove duplicated words in comments.
Driver Changes:
- Use __drm_atomic_helper_crtc_reset in all atomic drivers.
- Add Amlogic Video FBC support to meson and fourcc to core.
- Refactor hisilicon's hibmc_drv_vdac.
- Create a TXP CRTC for vc4.
- Rework cursor support in ast.
- Fix runtime PM in STM.
- Allow bigger cursors in vkms.
- Cleanup sg handling in radeon and amdgpu, and stop creating dummy
gtt nodes with ttm fixed.
- Rework crtc handling in mgag200.
- Miscellaneous small fixes to meson, vgem, bridge/dw-hdmi,
panel/auo,b116xw03, panel/LG LB070WV8, lima, bridge/sil_sii8620,
virtio, tilcdc.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl8QO6cACgkQ/lWMcqZw
E8O/xxAAick0N8Q8sf0bL7Emh7iiHeUl2fWzbfQ8VmKEOjoScO9KYz7SSuSW8868
qsy1pdI+T/ko2shl9w8hZ8aunNDOdCycL7F3WNSKT+SNAP2XY7R57xXjn0NCsfm/
iY3RsST6vU1gJMFyS/6U+4OddbTcqjDT5dwSD26/kva86s2CiS/P/I6dxoH0bcDg
Yo9QcNflZKjnP/0imRYDAmm3y7N09VKtYa2Df5dOCJiiijXxMTSQN6TB/TfFtYTi
CfyIm7dEF1Nnmy+jlxiYxAXVZYlPvIJ/7nTInO/gRGLhiEyIuG9u1lZSna9kRGrn
5eg+41u4sq3YnDB5+qZmMJ7yBKFIy51+5JweVQeaykBW8p4Z4Qrir2ENPLZWuyeC
CR1cOUUrUkSaMThy2H6IPe+T6BDzKpceuHnOxv7MmTfBSzLwRR7Bn216zrC33sET
i6AsS6Ir+lfkH26oGceceEHdL5biMjFuRPiq8MfzzEfnh1o7RZ2wvEg7gHV/QeiE
ugD7peLR28gJnupFQyBzcbyqKr761W7twgwAOvEOo3Up1LldxYLmQmc3VQeB84j2
mndhyBfXD6Jniuit2+PxuNXGRcK1oYExRxJKD9msZCkUMe1pezSDrHZcc+emnh2G
bqy8EPWcpCL0KkO/xICdJx57UwaLfAMsyP1C4u0vxy2GGSirxeg=
=XKYB
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-07-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:
UAPI Changes:
Cross-subsystem Changes:
- Add ckoenig as dma-buf maintainer.
- Revert invalid fix for dma-fence-chain, and fix selftest.
- Add fixmes to amifb about APUS support.
- Use array3_size in fbcon_prepare_logo, and struct_size() in alloc_apertures.
- Fix leaks in neofb, fb/savage and omapfb.
- Other small fixes to fb code.
- Convert some dt bindings to schema for some panels, and fix simple-framebuffer dt example.
Core Changes:
- Add DRM_FORMAT_MOD_GENERIC_16_16_TILE as alias to DRM_FORMAT_MOD_SAMSUNG_16_16_TILE,
as it can be used more generic.
- Add support for multiple DispID extension blocks in edid.
- Use https instead of http for some of the urls.
- Use drm_* macros for logging in mipi-dsi and fb-helper.
- Further cleanup ttm_mem_reg handling.
- Remove duplicated words in comments.
Driver Changes:
- Use __drm_atomic_helper_crtc_reset in all atomic drivers.
- Add Amlogic Video FBC support to meson and fourcc to core.
- Refactor hisilicon's hibmc_drv_vdac.
- Create a TXP CRTC for vc4.
- Rework cursor support in ast.
- Fix runtime PM in STM.
- Allow bigger cursors in vkms.
- Cleanup sg handling in radeon and amdgpu, and stop creating dummy
gtt nodes with ttm fixed.
- Rework crtc handling in mgag200.
- Miscellaneous small fixes to meson, vgem, bridge/dw-hdmi,
panel/auo,b116xw03, panel/LG LB070WV8, lima, bridge/sil_sii8620,
virtio, tilcdc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8b360d65-f228-9286-d247-3004156a5254@linux.intel.com
Some PHCs like the ocelot/felix switch cannot emit generic periodic
output, but just PPS (pulse per second) signals, which:
- don't start from arbitrary absolute times, but are rather
phase-aligned to the beginning of [the closest next] second.
- have an optional phase offset relative to that beginning of the
second.
For those, it was initially established that they should reject any
other absolute time for the PTP_PEROUT_REQUEST than 0.000000000 [1].
But when it actually came to writing an application [2] that makes use
of this functionality, we realized that we can't really deal generically
with PHCs that support absolute start time, and with PHCs that don't,
without an explicit interface. Namely, in an ideal world, PHC drivers
would ensure that the "perout.start" value written to hardware will
result in a functional output. This means that if the PTP time has
become in the past of this PHC's current time, it should be
automatically fast-forwarded by the driver into a close enough future
time that is known to work (note: this is necessary only if the hardware
doesn't do this fast-forward by itself). But we don't really know what
is the status for PHC drivers in use today, so in the general sense,
user space would be risking to have a non-functional periodic output if
it simply asked for a start time of 0.000000000.
So let's introduce a flag for this type of reduced-functionality
hardware, named PTP_PEROUT_PHASE. The start time is just "soon", the
only thing we know for sure about this signal is that its rising edge
events, Rn, occur at:
Rn = perout.phase + n * perout.period
The "phase" in the periodic output structure is simply an alias to the
"start" time, since both cannot logically be specified at the same time.
Therefore, the binary layout of the structure is not affected.
[1]: https://patchwork.ozlabs.org/project/netdev/patch/20200320103726.32559-7-yangbo.lu@nxp.com/
[2]: https://www.mail-archive.com/linuxptp-devel@lists.sourceforge.net/msg04142.html
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are external event timestampers (PHCs with support for
PTP_EXTTS_REQUEST) that timestamp both event edges.
When those edges are very close (such as in the case of a short pulse),
there is a chance that the collected timestamp might be of the rising,
or of the falling edge, we never know.
There are also PHCs capable of generating periodic output with a
configurable duty cycle. This is good news, because we can space the
rising and falling edge out enough in time, that the risks to overrun
the 1-entry timestamp FIFO of the extts PHC are lower (example: the
perout PHC can be configured for a period of 1 second, and an "on" time
of 0.5 seconds, resulting in a duty cycle of 50%).
A flag is introduced for signaling that an on time is present in the
perout request structure, for preserving compatibility. Logically
speaking, the duty cycle cannot exceed 100% and the PTP core checks for
this.
PHC drivers that don't support this flag emit a periodic output of an
unspecified duty cycle, same as before.
The duty cycle is encoded as an "on" time, similar to the "start" and
"period" times, and reuses the reserved space while preserving overall
binary layout.
Pahole reported before:
struct ptp_perout_request {
struct ptp_clock_time start; /* 0 16 */
struct ptp_clock_time period; /* 16 16 */
unsigned int index; /* 32 4 */
unsigned int flags; /* 36 4 */
unsigned int rsv[4]; /* 40 16 */
/* size: 56, cachelines: 1, members: 5 */
/* last cacheline: 56 bytes */
};
And now:
struct ptp_perout_request {
struct ptp_clock_time start; /* 0 16 */
struct ptp_clock_time period; /* 16 16 */
unsigned int index; /* 32 4 */
unsigned int flags; /* 36 4 */
union {
struct ptp_clock_time on; /* 40 16 */
unsigned int rsv[4]; /* 40 16 */
}; /* 40 16 */
/* size: 56, cachelines: 1, members: 5 */
/* last cacheline: 56 bytes */
};
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an
extension struct if present.
ICMP messages may include an extension structure after the original
datagram. RFC 4884 standardized this behavior. It stores the offset
in words to the extension header in u8 icmphdr.un.reserved[1].
The field is valid only for ICMP types destination unreachable, time
exceeded and parameter problem, if length is at least 128 bytes and
entire packet does not exceed 576 bytes.
Return the offset to the start of the extension struct when reading an
ICMP error from the error queue, if it matches the above constraints.
Do not return the raw u8 field. Return the offset from the start of
the user buffer, in bytes. The kernel does not return the network and
transport headers, so subtract those.
Also validate the headers. Return the offset regardless of validation,
as an invalid extension must still not be misinterpreted as part of
the original datagram. Note that !invalid does not imply valid. If
the extension version does not match, no validation can take place,
for instance.
For backward compatibility, make this optional, set by setsockopt
SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see
github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c
For forward compatibility, reserve only setsockopt value 1, leaving
other bits for additional icmp extensions.
Changes
v1->v2:
- convert word offset to byte offset from start of user buffer
- return in ee_data as u8 may be insufficient
- define extension struct and object header structs
- return len only if constraints met
- if returning len, also validate
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.
This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.
It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The constants are taken from the USXGMII Singleport Copper Interface
specification. The naming are based on the SGMII ones, but with an MDIO_
prefix.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.
Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.
The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.
In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
resource manager distributing jobs where users are always running as
non-root. There is a desire to provide a way to checkpoint and
restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
CRIU into a Java VM to decrease the startup time. These
checkpoint/restore applications are not meant to be running with
CAP_SYS_ADMIN.
We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
Unfortunately, this helper delegation technique is impossible to use
with clone3, and is thus prone to races.
See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
This has been demonstrated to work with cycling rates of 100,000 PIDs/s
See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
use in typical container environments (e.g., Kubernetes) as /proc is
typically protected with read-only layers (e.g., /proc/sys) for
hardening purposes. Read-only layers prevent additional /proc mounts
(due to proc's SB_I_USERNS_VISIBLE property), making the use of new
PID namespaces limited as certain applications need access to /proc
matching their PID namespace.
The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
recovering files that are unreachable via the file system such as
deleted files, or memfd files.
See corresponding selftest for an example with clone3().
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
It's all too easy to get confused by the V4L2_TYPE_IS_OUTPUT
macro, when it's used as !V4L2_TYPE_IS_OUTPUT.
Reduce the risk of confusion with macro to explicitly
check for the CAPTURE queue type case.
This change does not affect functionality, and it's
only intended to make the code more readable.
Suggested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil-cisco@xs4all.nl: checkpatch: align with parenthesis]
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
when looking up a listening socket for a new connection request for
connection oriented protocols, or when looking up an unconnected socket for
a packet for connection-less protocols.
When called, SK_LOOKUP BPF program can select a socket that will receive
the packet. This serves as a mechanism to overcome the limits of what
bind() API allows to express. Two use-cases driving this work are:
(1) steer packets destined to an IP range, on fixed port to a socket
192.0.2.0/24, port 80 -> NGINX socket
(2) steer packets destined to an IP address, on any port to a socket
198.51.100.1, any port -> L7 proxy socket
In its run-time context program receives information about the packet that
triggered the socket lookup. Namely IP version, L4 protocol identifier, and
address 4-tuple. Context can be further extended to include ingress
interface identifier.
To select a socket BPF program fetches it from a map holding socket
references, like SOCKMAP or SOCKHASH, and calls bpf_sk_assign(ctx, sk, ...)
helper to record the selection. Transport layer then uses the selected
socket as a result of socket lookup.
In its basic form, SK_LOOKUP acts as a filter and hence must return either
SK_PASS or SK_DROP. If the program returns with SK_PASS, transport should
look for a socket to receive the packet, or use the one selected by the
program if available, while SK_DROP informs the transport layer that the
lookup should fail.
This patch only enables the user to attach an SK_LOOKUP program to a
network namespace. Subsequent patches hook it up to run on local delivery
path in ipv4 and ipv6 stacks.
Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
There are two existing SNMP counters, TCPDSACKRecv and TCPDSACKOfoRecv,
which are incremented depending on whether the DSACKed range is below
the cumulative ACK sequence number or not. Unfortunately, these both
implicitly assume each DSACK covers only one segment. This makes these
counters unusable for estimating spurious retransmit rates,
or real/non-spurious loss rate.
This patch introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments based on:
(DSACKed sequence range) / MSS. This counter is usable for estimating
spurious retransmit rates, or real/non-spurious loss rate.
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
User space should receive the maximum edpm size from kernel driver,
similar to other edpm/ldpm related limits. Add an additional parameter to
the alloc_ucontext_resp structure for the edpm maximum size.
In addition, pass an indication from user-space to kernel
(and not just kernel to user) that the DPM sizes are supported.
This is for supporting backward-forward compatibility between driver and
lib for everything related to DPM transaction and limit sizes.
This should have been part of commit mentioned in Fixes tag.
Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com
Fixes: 93a3d05f9d ("RDMA/qedr: Add kernel capability flags for dpm enabled mode")
Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In older FW versions the completion flag was treated as the ack flag in
edpm messages. commit ff937b916e ("qed: Add EDPM mode type for user-fw
compatibility") exposed the FW option of setting which mode the QP is in
by adding a flag to the qedr <-> qed API.
This patch adds the qedr <-> libqedr interface so that the libqedr can set
the flag appropriately and qedr can pass it down to FW. Flag is added for
backward compatibility with libqedr.
For older libs, this flag didn't exist and therefore set to zero.
Fixes: ac1b36e55a ("qedr: Add support for user context verbs")
Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com
Signed-off-by: Yuval Bason <yuval.bason@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Here are number of small char/misc driver fixes for 5.8-rc6
Not that many complex fixes here, just a number of small fixes for
reported issues, and some new device ids. Nothing fancy.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXxBvPg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynJEQCfTmYNFFZ7WTx1FNsG/ScZLvgEC/QAnA19tK48
HBVDaLxodkGEa24u582V
=EcB/
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc into master
Pull char/misc fixes from Greg KH:
"Here are number of small char/misc driver fixes for 5.8-rc6
Not that many complex fixes here, just a number of small fixes for
reported issues, and some new device ids. Nothing fancy.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
virtio: virtio_console: add missing MODULE_DEVICE_TABLE() for rproc serial
intel_th: Fix a NULL dereference when hub driver is not loaded
intel_th: pci: Add Emmitsburg PCH support
intel_th: pci: Add Tiger Lake PCH-H support
intel_th: pci: Add Jasper Lake CPU support
virt: vbox: Fix guest capabilities mask check
virt: vbox: Fix VBGL_IOCTL_VMMDEV_REQUEST_BIG and _LOG req numbers to match upstream
uio_pdrv_genirq: fix use without device tree and no interrupt
uio_pdrv_genirq: Remove warning when irq is not specified
coresight: etmv4: Fix CPU power management setup in probe() function
coresight: cti: Fix error handling in probe
Revert "zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO()"
mei: bus: don't clean driver pointer
misc: atmel-ssc: lock with mutex instead of spinlock
phy: sun4i-usb: fix dereference of pointer phy0 before it is null checked
phy: rockchip: Fix return value of inno_dsidphy_probe()
phy: ti: j721e-wiz: Constify structs
phy: ti: am654-serdes: Constify regmap_config
phy: intel: fix enum type mismatch warning
phy: intel: Fix compilation error on FIELD_PREP usage
...
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU run the eBPF program if the underlying hw does not support
RSS. Current supported verdicts are XDP_DROP and XDP_PASS.
This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packet-per-second:
$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
As it has been already done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update cpumap code to use the struct.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org
Drop doubled words "or" and "the" in several comments.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bump KFD ioctl after adding SMI events support
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the compute is malfunctioning or performance drops, the system admin
will use SMI (System Management Interface) tool to monitor/diagnostic what
went wrong. This patch provides an event watch interface for the user
space to register devices and subscribe events they are interested. After
registered, the user can use annoymous file descriptor's poll function
with wait-time specified and wait for events to happen. Once an event
happens, the user can use read() to retrieve information related to the
event.
VM fault event is done in this patch.
v2: - remove UNREGISTER and add event ENABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
v6: sparse fix
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over
an fd. It is often used in settings where a supervising task emulates
syscalls on behalf of a supervised task in userspace, either to further
restrict the supervisee's syscall abilities or to circumvent kernel
enforced restrictions the supervisor deems safe to lift (e.g. actually
performing a mount(2) for an unprivileged container).
While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall,
only a certain subset of syscalls could be correctly emulated. Over the
last few development cycles, the set of syscalls which can't be emulated
has been reduced due to the addition of pidfd_getfd(2). With this we are
now able to, for example, intercept syscalls that require the supervisor
to operate on file descriptors of the supervisee such as connect(2).
However, syscalls that cause new file descriptors to be installed can not
currently be correctly emulated since there is no way for the supervisor
to inject file descriptors into the supervisee. This patch adds a
new addfd ioctl to remove this restriction by allowing the supervisor to
install file descriptors into the intercepted task. By implementing this
feature via seccomp the supervisor effectively instructs the supervisee
to install a set of file descriptors into its own file descriptor table
during the intercepted syscall. This way it is possible to intercept
syscalls such as open() or accept(), and install (or replace, like
dup2(2)) the supervisor's resulting fd into the supervisee. One
replacement use-case would be to redirect the stdout and stderr of a
supervisee into log file descriptors opened by the supervisor.
The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.
As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[2][3]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.
[1]: https://lore.kernel.org/lkml/87o8w9bcaf.fsf@mid.deneb.enyo.de/
[2]: https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@rasmusvillemoes.dk/
[3]: https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Jann Horn <jannh@google.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Suggested-by: Matt Denton <mpdenton@google.com>
Link: https://lore.kernel.org/r/20200603011044.7972-4-sargun@sargun.me
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Will Drewry <wad@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
This patch adds a new port attribute, IFLA_BRPORT_MRP_IN_OPEN, which
allows to notify the userspace when the node lost the contiuity of
MRP_InTest frames.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the existing MRP_INFO to return status of MRP interconnect. In
case there is no MRP interconnect on the node then the role will be
disabled so the other attributes can be ignored.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the existing MRP netlink attributes to allow to configure MRP
Interconnect:
IFLA_BRIDGE_MRP_IN_ROLE - the parameter type is br_mrp_in_role which
contains the interconnect id, the ring id, the interconnect role(MIM
or MIC) and the port ifindex that represents the interconnect port.
IFLA_BRIDGE_MRP_IN_STATE - the parameter type is br_mrp_in_state which
contains the interconnect id and the interconnect state.
IFLA_BRIDGE_MRP_IN_TEST - the parameter type is br_mrp_start_in_test
which contains the interconnect id, the interval at which to send
MRP_InTest frames, how many test frames can be missed before declaring
the interconnect ring open and the period which represents for how long
to send MRP_InTest frames.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull input fixes from Dmitry Torokhov:
"A few quirks for the Elan touchpad driver, another Thinkpad is being
switched over from PS/2 to native RMI4 interface, and we gave a brand
new SW_MACHINE_COVER switch definition"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - add more hardware ID for Lenovo laptops
Input: i8042 - add Lenovo XiaoXin Air 12 to i8042 nomux list
Revert "Input: elants_i2c - report resolution information for touch major"
Input: elan_i2c - only increment wakeup count on touch
Input: synaptics - enable InterTouch for ThinkPad X1E 1st gen
ARM: dts: n900: remove mmc1 card detect gpio
Input: add `SW_MACHINE_COVER`
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-07-13
The following pull-request contains BPF updates for your *net-next* tree.
We've added 36 non-merge commits during the last 7 day(s) which contain
a total of 62 files changed, 2242 insertions(+), 468 deletions(-).
The main changes are:
1) Avoid trace_printk warning banner by switching bpf_trace_printk to use
its own tracing event, from Alan.
2) Better libbpf support on older kernels, from Andrii.
3) Additional AF_XDP stats, from Ciara.
4) build time resolution of BTF IDs, from Jiri.
5) BPF_CGROUP_INET_SOCK_RELEASE hook, from Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty
Also, on the tx path introduce a counter which tracks the number of times
we attempt pull from the tx ring when it is empty.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com
Implement client side caching for NFSv4.2 extended attributes. The cache
is a per-inode hashtable, with name/value entries. There is one special
entry for the listxattr cache.
NFS inodes have a pointer to a cache structure. The cache structure is
allocated on demand, freed when the cache is invalidated.
Memory shrinkers keep the size in check. Large entries (> PAGE_SIZE)
are collected by a separate shrinker, and freed more aggressively
than others.
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Add definitions for the new operations, errors and flags as defined
in RFC 8276 (File System Extended Attributes in NFSv4).
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The second line of the description for event_type is before the first.
Move it to after the first line.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Pull networking fixes from David Miller:
1) Restore previous behavior of CAP_SYS_ADMIN wrt loading networking
BPF programs, from Maciej Żenczykowski.
2) Fix dropped broadcasts in mac80211 code, from Seevalamuthu
Mariappan.
3) Slay memory leak in nl80211 bss color attribute parsing code, from
Luca Coelho.
4) Get route from skb properly in ip_route_use_hint(), from Miaohe Lin.
5) Don't allow anything other than ARPHRD_ETHER in llc code, from Eric
Dumazet.
6) xsk code dips too deeply into DMA mapping implementation internals.
Add dma_need_sync and use it. From Christoph Hellwig
7) Enforce power-of-2 for BPF ringbuf sizes. From Andrii Nakryiko.
8) Check for disallowed attributes when loading flow dissector BPF
programs. From Lorenz Bauer.
9) Correct packet injection to L3 tunnel devices via AF_PACKET, from
Jason A. Donenfeld.
10) Don't advertise checksum offload on ipa devices that don't support
it. From Alex Elder.
11) Resolve several issues in TCP MD5 signature support. Missing memory
barriers, bogus options emitted when using syncookies, and failure
to allow md5 key changes in established states. All from Eric
Dumazet.
12) Fix interface leak in hsr code, from Taehee Yoo.
13) VF reset fixes in hns3 driver, from Huazhong Tan.
14) Make loopback work again with ipv6 anycast, from David Ahern.
15) Fix TX starvation under high load in fec driver, from Tobias
Waldekranz.
16) MLD2 payload lengths not checked properly in bridge multicast code,
from Linus Lüssing.
17) Packet scheduler code that wants to find the inner protocol
currently only works for one level of VLAN encapsulation. Allow
Q-in-Q situations to work properly here, from Toke
Høiland-Jørgensen.
18) Fix route leak in l2tp, from Xin Long.
19) Resolve conflict between the sk->sk_user_data usage of bpf reuseport
support and various protocols. From Martin KaFai Lau.
20) Fix socket cgroup v2 reference counting in some situations, from
Cong Wang.
21) Cure memory leak in mlx5 connection tracking offload support, from
Eli Britstein.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
mlxsw: pci: Fix use-after-free in case of failed devlink reload
mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()
net: macb: fix call to pm_runtime in the suspend/resume functions
net: macb: fix macb_suspend() by removing call to netif_carrier_off()
net: macb: fix macb_get/set_wol() when moving to phylink
net: macb: mark device wake capable when "magic-packet" property present
net: macb: fix wakeup test in runtime suspend/resume routines
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
libbpf: Fix libbpf hashmap on (I)LP32 architectures
net/mlx5e: CT: Fix memory leak in cleanup
net/mlx5e: Fix port buffers cell size value
net/mlx5e: Fix 50G per lane indication
net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash
net/mlx5e: Fix VXLAN configuration restore after function reload
net/mlx5e: Fix usage of rcu-protected pointer
net/mxl5e: Verify that rpriv is not NULL
net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode
net/mlx5: Fix eeprom support for SFP module
cgroup: Fix sock_cgroup_data on big-endian.
selftests: bpf: Fix detach from sockmap tests
...
When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong
direction flag set. While this isn't a big deal as nothing currently
enforces these bits in the kernel, it should be defined correctly. Fix
the define and provide support for the old command until it is no longer
needed for backward compatibility.
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook <keescook@chromium.org>
This patch adds a new capability KVM_CAP_SMALLER_MAXPHYADDR which
allows userspace to query if the underlying architecture would
support GUEST_MAXPHYADDR < HOST_MAXPHYADDR and hence act accordingly
(e.g. qemu can decide if it should warn for -cpu ..,phys-bits=X)
The complications in this patch are due to unexpected (but documented)
behaviour we see with NPF vmexit handling in AMD processor. If
SVM is modified to add guest physical address checks in the NPF
and guest #PF paths, we see the followning error multiple times in
the 'access' test in kvm-unit-tests:
test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001
Dump mapping: address: 0x123400000000
------L4: 24c3027
------L3: 24c4027
------L2: 24c5021
------L1: 1002000021
This is because the PTE's accessed bit is set by the CPU hardware before
the NPF vmexit. This is handled completely by hardware and cannot be fixed
in software.
Therefore, availability of the new capability depends on a boolean variable
allow_smaller_maxphyaddr which is set individually by VMX and SVM init
routines. On VMX it's always set to true, on SVM it's only set to true
when NPT is not enabled.
CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Babu Moger <babu.moger@amd.com>
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Message-Id: <20200710154811.418214-10-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add an interface to report offloaded UDP ports via ethtool netlink.
Now that core takes care of tracking which UDP tunnel ports the NICs
are aware of we can quite easily export this information out to
user space.
The responsibility of writing the netlink dumps is split between
ethtool code and udp_tunnel_nic.c - since udp_tunnel module may
not always be loaded, yet we should always report the capabilities
of the NIC.
$ ethtool --show-tunnels eth0
Tunnel information for eth0:
UDP port table 0:
Size: 4
Types: vxlan
No entries
UDP port table 1:
Size: 4
Types: geneve, vxlan-gpe
Entries (1):
port 1230, vxlan-gpe
v4:
- back to v2, build fix is now directly in udp_tunnel.h
v3:
- don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET
not set.
v2:
- fix string set count,
- reorder enums in the uAPI,
- fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset
in docs and comments.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the "PID" category QPs have same PID will be bound to same counter;
If this category is not set then QPs have different PIDs will be bound
to same counter.
This is implemented for 2 reasons:
1. The counter is a limited resource, while there may be dozens of
applications, each of which creates several types of QPs, which means
it may doesn't have enough counter.
2. The system administrator needs all QPs created by all applications
with same type bound to one counter.
The counter name and PID is only make sense when "PID" category are
configured.
This category can also be used in combine with others, e.g. QP type.
Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8IkFIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgph7tEACdkJ9CupSjVQBJ35/2wBiUCU6lh0iUbru3
oURuw9AdjzT75en68LOyrhfdGCkNX8xrIoHnh9pGukoPid//R3rAsk+5j4Doc7SY
FBZuH7HbOC/gdTAzSd0tLyaddmf06eq0PPmlWQGUYu29Vy49nvLDM2V+455OgkT9
csCNk/Fq/np7IKVZW56KZ1Iz90Dtp8BVwFJeFdzFK5cpCjKXTyvzS/5GEFqz0rc3
AwvxkBCpDjXQqh5OOs2qM18QUeJAQV0PX1x+XjtASMB3dr4W8D5KDEhaQwiXiGOT
j7yHX/JThtOYI41IiKpF/LP9EQAkavIT1n7ZjXwablq6ecQjj24DxCe0h1uwnKyW
G5Hztjcb2YMhhDVj4q6m/uzD13+ccIeEmVoKjRuGj+0YDFETIBYK8U4GElu7VUIr
yUKqy7jFEHKFZj5eQz5JDl08+zq/ocZHJctqZhCB3DWMbE5f+VCzNGEqGGs2IiRS
J7/S8jB1j7Te8jFXBN3uaNpuj+U/JlYCPqO83fxF14ar3wZ1mblML4cw0sXUJazO
1YI3LU91qBeCNK/9xUc43MzzG/SgdIKJjfoaYYPBPMG5SzVYAtRoUUiNf942UE1k
7u8/leql+RqwSLedjU/AQqzs29p7wcK8nzFBApmTfiiBHYcRbJ6Fxp3/UwTOR3tQ
kwmkUtivtQ==
=31iz
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Fix memleak for error path in registered files (Yang)
- Export CQ overflow state in flags, necessary to fix a case where
liburing doesn't know if it needs to enter the kernel (Xiaoguang)
- Fix for a regression in when user memory is accounted freed, causing
issues with back-to-back ring exit + init if the ulimit -l setting is
very tight.
* tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
io_uring: account user memory freed when exit has been queued
io_uring: fix memleak in io_sqe_files_register()
io_uring: fix memleak in __io_sqe_files_update()
io_uring: export cq overflow status to userspace
include/uapi/linux/raw.h leaks CONFIG_MAX_RAW_DEVS to userspace.
Userspace programs cannot use MAX_RAW_MINORS since CONFIG_MAX_RAW_DEVS
is not available anyway.
Remove the MAX_RAW_MINORS definition from the exported header, and use
CONFIG_MAX_RAW_DEVS in drivers/char/raw.c
While I was here, I converted printk(KERN_WARNING ...) to pr_warn(...)
and stretched the warning message.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20200617083313.183184-1-masahiroy@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upstream VirtualBox has defined and is using a few new request types for
vmmdev requests passed through /dev/vboxguest to the hypervisor.
Add the defines for these to vbox_vmmdev_types.h and add add them to the
whitelists of vmmdev requests which userspace is allowed to make.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1789545
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-7-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support for the new VBG_IOCTL_ACQUIRE_GUEST_CAPABILITIES ioctl, this
is necessary for automatic resizing of the guest resolution to match the
VM-window size to work with the new VMSVGA virtual GPU which is now the
new default in VirtualBox.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1789545
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-6-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Until this commit the mainline kernel version (this version) of the
vboxguest module contained a bug where it defined
VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG using
_IOC(_IOC_READ | _IOC_WRITE, 'V', ...) instead of
_IO(V, ...) as the out of tree VirtualBox upstream version does.
Since the VirtualBox userspace bits are always built against VirtualBox
upstream's headers, this means that so far the mainline kernel version
of the vboxguest module has been failing these 2 ioctls with -ENOTTY.
I guess that VBGL_IOCTL_VMMDEV_REQUEST_BIG is never used causing us to
not hit that one and sofar the vboxguest driver has failed to actually
log any log messages passed it through VBGL_IOCTL_LOG.
This commit changes the VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG
defines to match the out of tree VirtualBox upstream vboxguest version,
while keeping compatibility with the old wrong request defines so as
to not break the kernel ABI in case someone has been using the old
request defines.
Fixes: f6ddd094f5 ("virt: Add vboxguest driver for Virtual Box Guest integration UAPI")
Cc: stable@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a new attribute that indicates the split ability of devlink port.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new devlink port attribute that indicates the port's number of lanes.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
The attribute is not passed to user space in case the number of lanes is
invalid (0).
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
exposes basic inet socket attribute, plus some MPTCP socket
fields comprising PM status and MPTCP-level sequence numbers.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit bf9765145b ("sock: Make sk_protocol a 16-bit value")
the current size of 'sdiag_protocol' is not sufficient to represent
the possible protocol values.
This change introduces a new inet diag request attribute to let
user space specify the relevant protocol number using u32 values.
The attribute is parsed by inet diag core on get/dump command
and the extended protocol value, if available, is preferred to
'sdiag_protocol' to lookup the diag handler.
The parse attributed are exposed to all the diag handlers via
the cb->data.
Note that inet_diag_dump_one_icsk() is left unmodified, as it
will not be used by protocol using the extended attribute.
Suggested-by: David S. Miller <davem@davemloft.net>
Co-developed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For those applications which are not willing to use io_uring_enter()
to reap and handle cqes, they may completely rely on liburing's
io_uring_peek_cqe(), but if cq ring has overflowed, currently because
io_uring_peek_cqe() is not aware of this overflow, it won't enter
kernel to flush cqes, below test program can reveal this bug:
static void test_cq_overflow(struct io_uring *ring)
{
struct io_uring_cqe *cqe;
struct io_uring_sqe *sqe;
int issued = 0;
int ret = 0;
do {
sqe = io_uring_get_sqe(ring);
if (!sqe) {
fprintf(stderr, "get sqe failed\n");
break;;
}
ret = io_uring_submit(ring);
if (ret <= 0) {
if (ret != -EBUSY)
fprintf(stderr, "sqe submit failed: %d\n", ret);
break;
}
issued++;
} while (ret > 0);
assert(ret == -EBUSY);
printf("issued requests: %d\n", issued);
while (issued) {
ret = io_uring_peek_cqe(ring, &cqe);
if (ret) {
if (ret != -EAGAIN) {
fprintf(stderr, "peek completion failed: %s\n",
strerror(ret));
break;
}
printf("left requets: %d\n", issued);
continue;
}
io_uring_cqe_seen(ring, cqe);
issued--;
printf("left requets: %d\n", issued);
}
}
int main(int argc, char *argv[])
{
int ret;
struct io_uring ring;
ret = io_uring_queue_init(16, &ring, 0);
if (ret) {
fprintf(stderr, "ring setup failed: %d\n", ret);
return 1;
}
test_cq_overflow(&ring);
return 0;
}
To fix this issue, export cq overflow status to userspace by adding new
IORING_SQ_CQ_OVERFLOW flag, then helper functions() in liburing, such as
io_uring_peek_cqe, can be aware of this cq overflow and do flush accordingly.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Define 100G, 200G and 400G link modes using 100Gbps per lane
LR, ER and FR are defined as a single link mode because they are
using same technology and by design are fully interoperable.
EEPROM content indicates if the module is LR, ER, or FR, and the
user space ethtool decoder is planned to support decoding these
modes in the EEPROM.
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
CC: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
More often than not, a failed VM-entry in an x86 production
environment is induced by a defective CPU. To help identify the bad
hardware, include the id of the last logical CPU to run a vCPU in the
information provided to userspace on a KVM exit for failed VM-entry or
for KVM internal errors not associated with emulation. The presence of
this additional information is indicated by a new capability,
KVM_CAP_LAST_CPU.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-5-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Support for rejecting packets from the prerouting chain, from
Laura Garcia Liebana.
2) Remove useless assignment in pipapo, from Stefano Brivio.
3) On demand hook registration in IPVS, from Julian Anastasov.
4) Expire IPVS connection from process context to not overload
timers, also from Julian.
5) Fallback to conntrack TCP tracker to handle connection reuse
in IPVS, from Julian Anastasov.
6) Several patches to support for chain bindings.
7) Expose enum nft_chain_flags through UAPI.
8) Reject unsupported chain flags from the netlink control plane.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In the zoned storage model, the sectors within a zone are typically all
writeable. With the introduction of the Zoned Namespace (ZNS) Command
Set in the NVM Express organization, the model was extended to have a
specific writeable capacity.
Extend the zone descriptor data structure with a zone capacity field to
indicate to the user how many sectors in a zone are writeable.
Introduce backward compatibility in the zone report ioctl by extending
the zone report header data structure with a flags field to indicate if
the capacity field is available.
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sometimes it's handy to know when the socket gets freed. In
particular, we'd like to try to use a smarter allocation of
ports for bpf_bind and explore the possibility of limiting
the number of SOCK_DGRAM sockets the process can have.
Implement BPF_CGROUP_INET_SOCK_RELEASE hook that triggers on
inet socket release. It triggers only for userspace sockets
(not in-kernel ones) and therefore has the same semantics as
the existing BPF_CGROUP_INET_SOCK_CREATE.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200706230128.4073544-2-sdf@google.com
Initially the thermal framework had a very simple notification
mechanism to send generic netlink messages to the userspace.
The notification function was never called from anywhere and the
corresponding dead code was removed. It was probably a first attempt
to introduce the netlink notification.
At LPC2018, the presentation "Linux thermal: User kernel interface",
proposed to create the notifications to the userspace via a kfifo.
The advantage of the kfifo is the performance. It is usually used from
a 1:1 communication channel where a driver captures data and sends it
as fast as possible to a userspace process.
The drawback is that only one process uses the notification channel
exclusively, thus no other process is allowed to use the channel to
get temperature or notifications.
This patch defines a generic netlink API to discover the current
thermal setup and adds event notifications as well as temperature
sampling. As any genetlink protocol, it can evolve and the versioning
allows to keep the backward compatibility.
In order to prevent the user from getting flooded with data on a
single channel, there are two multicast channels, one for the
temperature sampling when the thermal zone is updated and another one
for the events, so the user can get the events only without the
thermal zone temperature sampling.
Also, a list of commands to discover the thermal setup is added and
can be extended when needed.
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200706105538.2159-3-daniel.lezcano@linaro.org
AFU (Accelerated Function Unit) is dynamic region of the DFL based FPGA,
and always defined by users. Some DFL based FPGA cards allow users to
implement their own interrupts in AFU. In order to support this,
hardware implements a new UINT (AFU Interrupt) private feature with
related capability register which describes the number of supported
AFU interrupts as well as the local index of the interrupts for
software enumeration, and from software side, driver follows the common
DFL interrupt notification and handling mechanism, and it implements
two ioctls below for user to query number of irqs supported and set/unset
interrupt triggers.
Ioctls:
* DFL_FPGA_PORT_UINT_GET_IRQ_NUM
get the number of irqs, which is used to determine how many interrupts
UINT feature supports.
* DFL_FPGA_PORT_UINT_SET_IRQ
set/unset eventfds as AFU interrupt triggers.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Error reporting interrupt is very useful to notify users that some
errors are detected by the hardware. Once users are notified, they
could query hardware logged error states, no need to continuously
poll on these states.
This patch adds interrupt support for fme global error reporting sub
feature. It follows the common DFL interrupt notification and handling
mechanism. And it implements two ioctls below for user to query
number of irqs supported, and set/unset interrupt triggers.
Ioctls:
* DFL_FPGA_FME_ERR_GET_IRQ_NUM
get the number of irqs, which is used to determine whether/how many
interrupts fme error reporting feature supports.
* DFL_FPGA_FME_ERR_SET_IRQ
set/unset given eventfds as fme error reporting interrupt triggers.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Error reporting interrupt is very useful to notify users that some
errors are detected by the hardware. Once users are notified, they
could query hardware logged error states, no need to continuously
poll on these states.
This patch adds interrupt support for port error reporting sub feature.
It follows the common DFL interrupt notification and handling mechanism,
implements two ioctl commands below for user to query number of irqs
supported, and set/unset interrupt triggers.
Ioctls:
* DFL_FPGA_PORT_ERR_GET_IRQ_NUM
get the number of irqs, which is used to determine whether/how many
interrupts error reporting feature supports.
* DFL_FPGA_PORT_ERR_SET_IRQ
set/unset given eventfds as error interrupt triggers.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Expose UAPI to query MR, this will let user space application that
didn't allocate the MR but has access to by owning the matching command
FD to retrieve its information.
Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Introduce UAPI to query PD attributes, this can be used to retrieve PD
attributes by having the PD handle of the created one and owning the
command FD for the ucontxet.
Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Implement the query ucontext functionality by returning the original
ucontext data as part of an extra mlx5 attribute that holds the driver
UAPI response.
Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Expose UAPI to query ucontext, this will let user space application that
didn't allocate the ucontext but has access to by owning the matching
command FD to retrieve the ucontext information.
Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In cases such as DRM_FORMAT_MOD_SAMSUNG_16_16_TILE, the modifier
describes a generic pixel re-ordering which can be applicable to
multiple vendors.
Define an alias: DRM_FORMAT_MOD_GENERIC_16_16_TILE, which can be
used to describe this layout in a vendor-neutral way, and add a
comment about the expected usage of such "generic" modifiers.
Changes in v2:
- Move note about future cases to comment (Daniel)
Signed-off-by: Brian Starkey <brian.starkey@arm.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200626164800.11595-1-brian.starkey@arm.com
There exists the same macro definition of port type from 0 to 13
in include/uapi/linux/serial.h, remove these duplicated code in
include/uapi/linux/serial_core.h which includes the former header.
Acked-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1588853015-28392-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-07-04
The following pull-request contains BPF updates for your *net-next* tree.
We've added 73 non-merge commits during the last 17 day(s) which contain
a total of 106 files changed, 5233 insertions(+), 1283 deletions(-).
The main changes are:
1) bpftool ability to show PIDs of processes having open file descriptors
for BPF map/program/link/BTF objects, relying on BPF iterator progs
to extract this info efficiently, from Andrii Nakryiko.
2) Addition of BPF iterator progs for dumping TCP and UDP sockets to
seq_files, from Yonghong Song.
3) Support access to BPF map fields in struct bpf_map from programs
through BTF struct access, from Andrey Ignatov.
4) Add a bpf_get_task_stack() helper to be able to dump /proc/*/stack
via seq_file from BPF iterator progs, from Song Liu.
5) Make SO_KEEPALIVE and related options available to bpf_setsockopt()
helper, from Dmitry Yakunin.
6) Optimize BPF sk_storage selection of its caching index, from Martin
KaFai Lau.
7) Removal of redundant synchronize_rcu()s from BPF map destruction which
has been a historic leftover, from Alexei Starovoitov.
8) Several improvements to test_progs to make it easier to create a shell
loop that invokes each test individually which is useful for some CIs,
from Jesper Dangaard Brouer.
9) Fix bpftool prog dump segfault when compiled without skeleton code on
older clang versions, from John Fastabend.
10) Bunch of cleanups and minor improvements, from various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This new chain flag specifies that:
* the kernel dynamically allocates the chain name, if no chain name
is specified.
* If the immediate expression that refers to this chain is removed,
then this bound chain (and its content) is destroyed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This enum definition was never exposed through UAPI. Rename
NFT_BASE_CHAIN to NFT_CHAIN_BASE for consistency.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This netlink attribute allows you to refer to chains inside a
transaction as an alternative to the name and the handle. The chain
binding support requires this new chain ID approach.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Amlogic uses a proprietary lossless image compression protocol and format
for their hardware video codec accelerators, either video decoders or
video input encoders.
It considerably reduces memory bandwidth while writing and reading
frames in memory.
The underlying storage is considered to be 3 components, 8bit or 10-bit
per component, YCbCr 420, single plane :
- DRM_FORMAT_YUV420_8BIT
- DRM_FORMAT_YUV420_10BIT
This modifier will be notably added to DMA-BUF frames imported from the V4L2
Amlogic VDEC decoder.
This introduces the basic layout composed of:
- a body content organized in 64x32 superblocks with 4096 bytes per
superblock in default mode.
- a 32 bytes per 128x64 header block
This layout is tranferrable between Amlogic SoCs supporting this modifier.
The Memory Saving option exist changing the layout superblock size to save memory when
using 8bit components pixels size.
Finally is also adds the Scatter Memory layout, meaning the header contains IOMMU
references to the compressed frames content to optimize memory access
and layout.
In this mode, only the header memory address is needed, thus the content
memory organization is tied to the current producer execution and cannot
be saved/dumped neither transferrable between Amlogic SoCs supporting this
modifier.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703080728.25207-2-narmstrong@baylibre.com
This patch extends the function br_fill_ifinfo to return also the MRP
status for each instance on a bridge. It also adds a new filter
RTEXT_FILTER_MRP to return the MRP status only when this is set, not to
interfer with the vlans. The MRP status is return only on the bridge
interfaces.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add MRP attribute IFLA_BRIDGE_MRP_INFO to allow the userspace to get the
current state of the MRP instances. This is a nested attribute that
contains other attributes like, ring id, index of primary and secondary
port, priority, ring state, ring role.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amd-drm-next-5.9-2020-07-01:
amdgpu:
- DC DMUB updates
- HDCP fixes
- Thermal interrupt fixes
- Add initial support for Sienna Cichlid GPU
- Add support for unique id on Arcturus
- Major swSMU code cleanup
- Skip BAR resizing if the bios already did id
- Fixes for DCN bandwidth calculations
- Runtime PM reference count fixes
- Add initial UVD support for SI
- Add support for ASSR on eDP links
- Lots of misc fixes and cleanups
- Enable runtime PM on vega10 boards that support BACO
- RAS fixes
- SR-IOV fixes
- Use IP discovery table on renoir
- DC stream synchronization fixes
amdkfd:
- Track SDMA usage per process
- Fix GCC10 compiler warnings
- Locking fix
radeon:
- Default to on chip GART for AGP boards on all arches
- Runtime PM reference count fixes
UAPI:
- Update comments to clarify MTYPE
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701155041.1102829-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
SAN Congestion Management generates ELS pkts whose size can vary and be >
64 bytes. Change the PUREX handling code to support non-standard ELS pkt
size.
Link: https://lore.kernel.org/r/20200630102229.29660-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce helper bpf_get_task_stack(), which dumps stack trace of given
task. This is different to bpf_get_stack(), which gets stack track of
current task. One potential use case of bpf_get_task_stack() is to call
it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file.
bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
Daniel Borkmann says:
====================
pull-request: bpf 2020-06-30
The following pull-request contains BPF updates for your *net* tree.
We've added 28 non-merge commits during the last 9 day(s) which contain
a total of 35 files changed, 486 insertions(+), 232 deletions(-).
The main changes are:
1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer
types, from Yonghong Song.
2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various
arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.
3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb
internals and integrate it properly into DMA core, from Christoph Hellwig.
4) Fix RCU splat from recent changes to avoid skipping ingress policy when
kTLS is enabled, from John Fastabend.
5) Fix BPF ringbuf map to enforce size to be the power of 2 in order for its
position masking to work, from Andrii Nakryiko.
6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading
of network programs, from Maciej Żenczykowski.
7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.
8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- bump version strings, by Simon Wunderlich
- update mailing list URL, by Sven Eckelmann
- fix typos and grammar in documentation, by Sven Eckelmann
- introduce a configurable per interface hop penalty,
by Linus Luessing
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl769w8WHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoatoEACN9O+fCwDIlIdxw/bUlGFw1J+V
4rDReo5grIkl3BSdnLkkaPXTcazCWRoFhCdH6esTsF4/R/nwUa5AZuAbVd2NnCFU
H+hVc7q/w0Hqppx5APT0AghgLcp8mFuHV9jqtJOjv9Kvd05Il6XwGkALT8QX2e2P
ypMD+PBuQT1XCMl0v02lA0ki2TxvCtFYs1UznbAdl86GOGMua3t6fuRWQ2fpUMou
t9251NnwSvSyeYuI8Zws7Wza2S4FOilX/CAjjGi02D0xf7T/JDs3FiPd6obdNfe/
4j6R2dHWjxz1sb6bEdM3wfmUlU4StMQ+Vywpu1fh3gihLc2tmtMZhn6tJk3YjALt
zlwDO2p/si7FLrizfMS7mUesZJH7Ji79BZtZWXsdW3p1zU1jlfJh5dc2ZaKwL9qk
MSpd9DH2F459YC6ELcITLnhCymc7O/g3n3voNnSxi7uX4M5LsGt0mcx6g/iiFN16
nPNHzXo7WJjgRcmLEQ3kYeD7Kjqh5XEogKLnwhThBxCFIQkNFF3OYqWtJyj5Sg5o
+EredOAcq1Rc5CR//V43rQZLk4o1I9qQysKFqbjHohMIgLccz+47QAeBXrhku0ZT
XU6k09Q0pkuo3iVPYnIUTDUkHtvFtsCKR98+8tB5xDR5F155OpFg7AXaKIx5Iym8
4/eGrx22T+6TIeSGGg==
=ceu6
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20200630' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- update mailing list URL, by Sven Eckelmann
- fix typos and grammar in documentation, by Sven Eckelmann
- introduce a configurable per interface hop penalty,
by Linus Luessing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This event code represents the state of a removable cover of a device.
Value 0 means that the cover is open or removed, value 1 means that the
cover is closed.
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Link: https://lore.kernel.org/r/20200612125402.18393-2-merlijn@wizzup.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Some PCIe devices do not expect a PASID value in PRI Page Responses.
If the "PRG Response PASID Required" bit in the PRI capability is zero,
then the OS should not set the PASID field. Similarly on Arm SMMU,
responses to stall events do not have a PASID.
Currently iommu_page_response() systematically checks that the PASID in
the page response corresponds to the one in the page request. This can't
work with virtualization because a page response coming from a guest OS
won't have a PASID if the passed-through device does not require one.
Add a flag to page requests that declares whether the corresponding
response needs to have a PASID. When this flag isn't set, allow page
responses without PASID.
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200616144712.748818-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cross-subsystem Changes:
- Improve dma-buf docs.
Core Changes:
- Add NV15, Q410, Q401 yuv formats.
- Add uncompressed AFBC modifier.
- Add DP helepr for reading Ignore MSA from DPCD.
- Add missing panel type for some panels
- Optimize drm/mm hole handling.
- Constify connector to infoframe functions.
- Add debugfs for VRR monitor range.
Driver Changes:
- Assorted small bugfixes in panfrost, malidp, panel/otm8009a.
- Convert tfp410 dt bindings to yaml, and rework time calculations.
- Add support for a few more simple panels.
- Cleanups and optimizations for ast.
- Allow adv7511 and simple-bridge to be used without connector creation.
- Cleanups to dw-hdmi function prototypes.
- Remove enabled bool from tiny/repaper and mipi-dbi, atomic handles it.
- Remove unused header file from dw-mipi-dsi
- Begin removing ttm_bo->offset.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl7103UACgkQ/lWMcqZw
E8NTPhAAiB/FjDzZq86qg61OSLeB8yEOHFfWvwRyydentg3aeIhuMXUKGEmBCwwZ
IAbmfbR2PwrpUF/iyZ0JuI2NWs9ErB2/9unhW9CyIo/Ij34LXL+1PW+qjq6FIq6O
quWMbZMm23rQbOnHXr9BHEJh9qs1WrvQkZF+8AxgJJ0JmLMjkbhcyupN6tZC2uWQ
2e1p0A3zRVb+Avk9xHaWz5pShYbZ68S7ysIpIOh0KGpQvmu7TExzqLgg/rJcqVKd
Ze6URs8SZ7Lwx6EM8ixJ4pd39OnrkWpEafacIqhpJv794Uftgec8mV7eBc4y8WJa
4/VNz7QbhtYKkFClTSvPhh001vUpDZnqdi/t7GsxHoY30InMpUD+zO29P1BkD+gf
geWE5NxV9Tv549GosuwggG4LWa+sZxl2iurdDATyxh2wips9zR1gNUzqK5rsnptZ
CN3OqeUaBRljM2oXV9o5/zc9fb3HMdFaLut1XKUX7fHOYrWXN+KBEv530/zaBTNq
7gzadci/VkebOxIJ/6wMFqqBedrNlEYAjD3X/WCJUk2Y1V4R60oMtBNVG27jW8ov
JY5wrt8HXMzLfdk2CIZ08mIXJmQqWHyFCBvJEtAyULMvKWUKzg32R4hqtirAJM5c
9w9SzZ4RjrJIBVKR8hl6zE8WGqH/BWRiJXa+NelUP3i64Hk/B44=
=72wu
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-06-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:
Cross-subsystem Changes:
- Improve dma-buf docs.
Core Changes:
- Add NV15, Q410, Q401 yuv formats.
- Add uncompressed AFBC modifier.
- Add DP helepr for reading Ignore MSA from DPCD.
- Add missing panel type for some panels
- Optimize drm/mm hole handling.
- Constify connector to infoframe functions.
- Add debugfs for VRR monitor range.
Driver Changes:
- Assorted small bugfixes in panfrost, malidp, panel/otm8009a.
- Convert tfp410 dt bindings to yaml, and rework time calculations.
- Add support for a few more simple panels.
- Cleanups and optimizations for ast.
- Allow adv7511 and simple-bridge to be used without connector creation.
- Cleanups to dw-hdmi function prototypes.
- Remove enabled bool from tiny/repaper and mipi-dbi, atomic handles it.
- Remove unused header file from dw-mipi-dsi
- Begin removing ttm_bo->offset.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b1e53620-7937-895c-bfcf-ed208be59c7c@linux.intel.com
Currently, drivers can only tell whether the link is up/down using
LINKSTATE_GET, but no additional information is given.
Add attributes to LINKSTATE_GET command in order to allow drivers
to expose the user more information in addition to link state to ease
the debug process, for example, reason for link down state.
Extended state consists of two attributes - link_ext_state and
link_ext_substate. The idea is to avoid 'vendor specific' states in order
to prevent drivers to use specific link_ext_state that can be in the future
common link_ext_state.
The substates allows drivers to add more information to the common
link_ext_state. For example, vendor can expose 'Autoneg' as link_ext_state
and add 'No partner detected during force mode' as link_ext_substate.
If a driver cannot pinpoint the extended state with the substate
accuracy, it is free to expose only the extended state and omit the
substate attribute.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to allow acting on dropped and/or ECN-marked packets, add two new
qevents to the RED qdisc: "early_drop" and "mark". Filters attached at
"early_drop" block are executed as packets are early-dropped, those
attached at the "mark" block are executed as packets are ECN-marked.
Two new attributes are introduced: TCA_RED_EARLY_DROP_BLOCK with the block
index for the "early_drop" qevent, and TCA_RED_MARK_BLOCK for the "mark"
qevent. Absence of these attributes signifies "don't care": no block is
allocated in that case, or the existing blocks are left intact in case of
the change callback.
For purposes of offloading, blocks attached to these qevents appear with
newly-introduced binder types, FLOW_BLOCK_BINDER_TYPE_RED_EARLY_DROP and
FLOW_BLOCK_BINDER_TYPE_RED_MARK.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some conflicts with ttm_bo->offset removal, but drm-misc-next needs updating to v5.8.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Metadata need not be collected in receive if the packet from bareudp
device is not targeted to openvswitch.
Signed-off-by: Martin <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FPGA user applications may be interested in interrupts generated by
DFL features. For example, users can implement their own FPGA
logics with interrupts enabled in AFU (Accelerated Function Unit,
dynamic region of DFL based FPGA). So user applications need to be
notified to handle these interrupts.
In order to allow userspace applications to monitor interrupts,
driver requires userspace to provide eventfds as interrupt
notification channels. Applications then poll/select on the eventfds
to get notified.
This patch introduces a generic helper functions to do eventfds binding
with given interrupts.
Sub feature drivers are expected to use XXX_GET_IRQ_NUM to query irq
info, and XXX_SET_IRQ to set eventfds for interrupts. This patch also
introduces helper functions for these 2 ioctls.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
In some setups multiple hard interfaces with similar link qualities
or throughput values are available. But people have expressed the desire
to consider one of them as a backup only.
Some creative solutions are currently in use: Such people are
configuring multiple batman-adv mesh/soft interfaces, wire them
together with some veth pairs and then tune the hop penalty to achieve
an effect similar to a tunable per interface hop penalty.
This patch introduces a new, configurable, per hard interface hop penalty
to simplify such setups.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
* In mcde, set up fbdev after device registration and removde the last access
to dev->dev_private. Fixes an error message and a segmentation fault.
* Set the connector type for LogicPT Type 28 and newhaven_nhd_43_480272ef_atxl
panels.
* In uvesafb, fix the handling of the noblank option.
* Fix panel orientation for Asus T101HA and Acer S1003.
* Fix DMA configuration for sun4i if IOMMU is present.
* Fix regression in VT restoration. Unbreaks userspace (i.e., Xorg) VT handling.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl70X3MACgkQaA3BHVML
eiPP+ggAnquBWHn4QYVrw4tlcI8supVJurQflcYYYBzPksiijheJAE+74nfCWwTB
aijbyiPSp6tq8aP2NQpasW3pQdo0gQ2I9CwYzCxPTDk3X0iSxGEIToDieBTonmKn
UrvUZCQ5L775XJASus2gH4/b34Y0zIrQfCLj1ECOOSNjUWvTLntWrbJDl8S9SISh
teIAFZ1qAlcrTTNuAZ1lp+e+AlXEXKhi8+CqY0ltiV1q4JtO1SdOMnb5y0rxsYOy
H/KVWbZWdWkgIrZpJ1v/+jio0Mkh0jDi1ldJ6UDxWAv5viQnExICUY1d88Qw2xVo
73w4iJ/ujNIK3kDZQmvLMlVTjc1oSw==
=GRqS
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2020-06-25' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull (less than what git shortlog provides):
* In mcde, set up fbdev after device registration and removde the last access
to dev->dev_private. Fixes an error message and a segmentation fault.
* Set the connector type for LogicPT Type 28 and newhaven_nhd_43_480272ef_atxl
panels.
* In uvesafb, fix the handling of the noblank option.
* Fix panel orientation for Asus T101HA and Acer S1003.
* Fix DMA configuration for sun4i if IOMMU is present.
* Fix regression in VT restoration. Unbreaks userspace (i.e., Xorg) VT handling.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200625082717.GA14856@linux-uq9g
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.
2) The default crypto algorithm selection in Kconfig for IPSEC is out
of touch with modern reality, fix this up. From Eric Biggers.
3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
Nakryiko.
4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
Hangbin Liu.
5) Adjust packet alignment handling in ax88179_178a driver to match
what the hardware actually does. From Jeremy Kerr.
6) register_netdevice can leak in the case one of the notifiers fail,
from Yang Yingliang.
7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.
8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
Oltean.
9) tg3 driver can sleep forever when we get enough EEH errors, fix from
David Christensen.
10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
drivers, from Ciara Loftus.
11) Fix scanning loop break condition in of_mdiobus_register(), from
Florian Fainelli.
12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.
13) Endianness fix in mlxsw, from Ido Schimmel.
14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.
15) Missing bridge mrp configuration validation, from Horatiu Vultur.
16) Fix circular netns references in wireguard, from Jason A. Donenfeld.
17) PTP initialization on recovery is not done properly in qed driver,
from Alexander Lobakin.
18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
from Rahul Lakkireddy.
19) Don't clear bound device TX queue of socket prematurely otherwise we
get problems with ktls hw offloading, from Tariq Toukan.
20) ipset can do atomics on unaligned memory, fix from Russell King.
21) Align ethernet addresses properly in bridging code, from Thomas
Martitz.
22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
from Marcelo Ricardo Leitner.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
rds: transport module should be auto loaded when transport is set
sch_cake: fix a few style nits
sch_cake: don't call diffserv parsing code when it is not needed
sch_cake: don't try to reallocate or unshare skb unconditionally
ethtool: fix error handling in linkstate_prepare_data()
wil6210: account for napi_gro_receive never returning GRO_DROP
hns: do not cast return value of napi_gro_receive to null
socionext: account for napi_gro_receive never returning GRO_DROP
wireguard: receive: account for napi_gro_receive never returning GRO_DROP
vxlan: fix last fdb index during dump of fdb with nhid
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
tc-testing: avoid action cookies with odd length.
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
net: dsa: sja1105: fix tc-gate schedule with single element
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
net: dsa: sja1105: unconditionally free old gating config
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
net: macb: free resources on failure path of at91ether_open()
net: macb: call pm_runtime_put_sync on failure path
...
This enhancement auto loads transport module when the transport
is set via SO_RDS_TRANSPORT socket option.
Reviewed-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The helper is used in tracing programs to cast a socket
pointer to a udp6_sock pointer.
The return value could be NULL if the casting is illegal.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
Three more helpers are added to cast a sock_common pointer to
an tcp_sock, tcp_timewait_sock or a tcp_request_sock for
tracing programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.
A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.
Different from the previous BTF_ID based helpers,
the bpf_skc_to_tcp6_sock() argument can be several possible
btf_ids. More specifically, all possible socket data structures
with sock_common appearing in the first in the memory layout.
This patch only added socket types related to tcp and udp.
All possible argument btf_id and return value btf_id
for helper bpf_skc_to_tcp6_sock() are pre-calculcated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
When we modify or create a new fdb entry sometimes we want to avoid
refreshing its activity in order to track it properly. One example is
when a mac is received from EVPN multi-homing peer by FRR, which doesn't
want to change local activity accounting. It makes it static and sets a
flag to track its activity.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the ability to notify about activity of any entries
(static, permanent or ext_learn). EVPN multihoming peers need it to
properly and efficiently handle mac sync (peer active/locally active).
We add a new NFEA_ACTIVITY_NOTIFY attribute which is used to dump the
current activity state and to control if static entries should be monitored
at all. We use 2 bits - one to activate fdb entry tracking (disabled by
default) and the second to denote that an entry is inactive. We need
the second bit in order to avoid multiple notifications of inactivity.
Obviously this makes no difference for dynamic entries since at the time
of inactivity they get deleted, while the tracked non-dynamic entries get
the inactive bit set and get a notification.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an attribute to NDA which will contain all future fdb-specific
attributes in order to avoid polluting the NDA namespace with e.g.
bridge or vxlan specific attributes. The attribute is called
NDA_FDB_EXT_ATTRS and the structure would look like:
[NDA_FDB_EXT_ATTRS] = {
[NFEA_xxx]
}
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the past we had a pile of hacks to orchestrate access between fbdev
emulation and native kms clients. We've tried to streamline this, by
always preferring the kms side above fbdev calls when a drm master
exists, because drm master controls access to the display resources.
Unfortunately this breaks existing userspace, specifically Xorg. When
exiting Xorg first restores the console to text mode using the KDSET
ioctl on the vt. This does nothing, because a drm master is still
around. Then it drops the drm master status, which again does nothing,
because logind is keeping additional drm fd open to be able to
orchestrate vt switches. In the past this is the point where fbdev was
restored, as part of the ->lastclose hook on the drm side.
Now to fix this regression we don't want to go back to letting fbdev
restore things whenever it feels like, or to the pile of hacks we've
had before. Instead try and go with a minimal exception to make the
KDSET case work again, and nothing else.
This means that if userspace does a KDSET call when switching between
graphical compositors, there will be some flickering with fbcon
showing up for a bit. But a) that's not a regression and b) userspace
can fix it by improving the vt switching dance - logind should have
all the information it needs.
While pondering all this I'm also wondering wheter we should have a
SWITCH_MASTER ioctl to allow race-free master status handover. But
that's for another day.
v2: Somehow forgot to cc all the fbdev people.
v3: Fix typo Alex spotted.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208179
Cc: shlomo@fastmail.com
Reported-and-Tested-by: shlomo@fastmail.com
Cc: Michel Dänzer <michel@daenzer.net>
Fixes: 64914da24e ("drm/fbdev-helper: don't force restores")
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Qiujun Huang <hqjagain@gmail.com>
Cc: Peter Rosin <peda@axentia.se>
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200624092910.3280448-1-daniel.vetter@ffwll.ch
This patch adds support of SO_KEEPALIVE flag and TCP related options
to bpf_setsockopt() routine. This is helpful if we want to enable or tune
TCP keepalive for applications which don't do it in the userspace code.
v3:
- update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>)
v4:
- update kernel-doc in tools too (Alexei Starovoitov)
- add test to selftests (Alexei Starovoitov)
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
For some reason, the TEST_ defines in the usb/ch9.h files did not have
the USB_ prefix on it, making it a bit confusing when reading the file,
as well as not the nicest thing to do in a uapi file.
So fix that up and add the USB_ prefix on to them, and fix up all
in-kernel usages. This included deleting the duplicate copy in the
net2272.h file.
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Pawel Laszczak <pawell@cadence.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jules Irenge <jbi.octave@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Cc: Rob Gill <rrobgill@protonmail.com>
Cc: Macpaul Lin <macpaul.lin@mediatek.com>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Acked-by: Bin Liu <b-liu@ti.com>
Acked-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Acked-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/20200618144206.2655890-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support to get resource dump in raw format. It enable drivers to
return the entire device specific QP/CQ/MR context without a need from the
driver to set each field separately.
The raw query returns only the device specific data, general data is still
returned by using the existing queries.
Example:
$ rdma res show mr dev mlx5_1 mrn 2 -r -j
[{"ifindex":7,"ifname":"mlx5_1",
"data":[0,4,255,254,0,0,0,0,0,0,0,0,16,28,0,216,...]}]
Link: https://lore.kernel.org/r/20200623113043.1228482-9-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
RFC 4303 in section 3.3.3 suggests to disable anti-replay for manually
distributed ICVs in which case the sender does not need to monitor or
reset the counter. However, the sender still increments the counter and
when it reaches the maximum value, the counter rolls over back to zero.
This patch introduces new extra_flag XFRM_SA_XFLAG_OSEQ_MAY_WRAP which
allows sequence number to cycle in outbound packets if set. This flag is
used only in legacy and bmp code, because esn should not be negotiated
if anti-replay is disabled (see note in 3.3.3 section).
Signed-off-by: Petr Vaněk <pv@excello.cz>
Acked-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
When producing the bpf-helpers.7 man page from the documentation from
the BPF user space header file, rst2man complains:
<stdin>:2636: (ERROR/3) Unexpected indentation.
<stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
Let's fix formatting for the relevant chunk (item list in
bpf_ringbuf_query()'s description), and for a couple other functions.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com
Switch most of BPF helper definitions from returning int to long. These
definitions are coming from comments in BPF UAPI header and are used to
generate bpf_helper_defs.h (under libbpf) to be later included and used from
BPF programs.
In actual in-kernel implementation, all the helpers are defined as returning
u64, but due to some historical reasons, most of them are actually defined as
returning int in UAPI (usually, to return 0 on success, and negative value on
error).
This actually causes Clang to quite often generate sub-optimal code, because
compiler believes that return value is 32-bit, and in a lot of cases has to be
up-converted (usually with a pair of 32-bit bit shifts) to 64-bit values,
before they can be used further in BPF code.
Besides just "polluting" the code, these 32-bit shifts quite often cause
problems for cases in which return value matters. This is especially the case
for the family of bpf_probe_read_str() functions. There are few other similar
helpers (e.g., bpf_read_branch_records()), in which return value is used by
BPF program logic to record variable-length data and process it. For such
cases, BPF program logic carefully manages offsets within some array or map to
read variable-length data. For such uses, it's crucial for BPF verifier to
track possible range of register values to prove that all the accesses happen
within given memory bounds. Those extraneous zero-extending bit shifts,
inserted by Clang (and quite often interleaved with other code, which makes
the issues even more challenging and sometimes requires employing extra
per-variable compiler barriers), throws off verifier logic and makes it mark
registers as having unknown variable offset. We'll study this pattern a bit
later below.
Another common pattern is to check return of BPF helper for non-zero state to
detect error conditions and attempt alternative actions in such case. Even in
this simple and straightforward case, this 32-bit vs BPF's native 64-bit mode
quite often leads to sub-optimal and unnecessary extra code. We'll look at
this pattern as well.
Clang's BPF target supports two modes of code generation: ALU32, in which it
is capable of using lower 32-bit parts of registers, and no-ALU32, in which
only full 64-bit registers are being used. ALU32 mode somewhat mitigates the
above described problems, but not in all cases.
This patch switches all the cases in which BPF helpers return 0 or negative
error from returning int to returning long. It is shown below that such change
in definition leads to equivalent or better code. No-ALU32 mode benefits more,
but ALU32 mode doesn't degrade or still gets improved code generation.
Another class of cases switched from int to long are bpf_probe_read_str()-like
helpers, which encode successful case as non-negative values, while still
returning negative value for errors.
In all of such cases, correctness is preserved due to two's complement
encoding of negative values and the fact that all helpers return values with
32-bit absolute value. Two's complement ensures that for negative values
higher 32 bits are all ones and when truncated, leave valid negative 32-bit
value with the same value. Non-negative values have upper 32 bits set to zero
and similarly preserve value when high 32 bits are truncated. This means that
just casting to int/u32 is correct and efficient (and in ALU32 mode doesn't
require any extra shifts).
To minimize the chances of regressions, two code patterns were investigated,
as mentioned above. For both patterns, BPF assembly was analyzed in
ALU32/NO-ALU32 compiler modes, both with current 32-bit int return type and
new 64-bit long return type.
Case 1. Variable-length data reading and concatenation. This is quite
ubiquitous pattern in tracing/monitoring applications, reading data like
process's environment variables, file path, etc. In such case, many pieces of
string-like variable-length data are read into a single big buffer, and at the
end of the process, only a part of array containing actual data is sent to
user-space for further processing. This case is tested in test_varlen.c
selftest (in the next patch). Code flow is roughly as follows:
void *payload = &sample->payload;
u64 len;
len = bpf_probe_read_kernel_str(payload, MAX_SZ1, &source_data1);
if (len <= MAX_SZ1) {
payload += len;
sample->len1 = len;
}
len = bpf_probe_read_kernel_str(payload, MAX_SZ2, &source_data2);
if (len <= MAX_SZ2) {
payload += len;
sample->len2 = len;
}
/* and so on */
sample->total_len = payload - &sample->payload;
/* send over, e.g., perf buffer */
There could be two variations with slightly different code generated: when len
is 64-bit integer and when it is 32-bit integer. Both variations were analysed.
BPF assembly instructions between two successive invocations of
bpf_probe_read_kernel_str() were used to check code regressions. Results are
below, followed by short analysis. Left side is using helpers with int return
type, the right one is after the switch to long.
ALU32 + INT ALU32 + LONG
=========== ============
64-BIT (13 insns): 64-BIT (10 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: if w0 > 256 goto +9 <LBB0_4> 18: if r0 > 256 goto +6 <LBB0_4>
19: w1 = w0 19: r1 = 0 ll
20: r1 <<= 32 21: *(u64 *)(r1 + 0) = r0
21: r1 s>>= 32 22: r6 = 0 ll
22: r2 = 0 ll 24: r6 += r0
24: *(u64 *)(r2 + 0) = r1 00000000000000c8 <LBB0_4>:
25: r6 = 0 ll 25: r1 = r6
27: r6 += r1 26: w2 = 256
00000000000000e0 <LBB0_4>: 27: r3 = 0 ll
28: r1 = r6 29: call 115
29: w2 = 256
30: r3 = 0 ll
32: call 115
32-BIT (11 insns): 32-BIT (12 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: if w0 > 256 goto +7 <LBB1_4> 18: if w0 > 256 goto +8 <LBB1_4>
19: r1 = 0 ll 19: r1 = 0 ll
21: *(u32 *)(r1 + 0) = r0 21: *(u32 *)(r1 + 0) = r0
22: w1 = w0 22: r0 <<= 32
23: r6 = 0 ll 23: r0 >>= 32
25: r6 += r1 24: r6 = 0 ll
00000000000000d0 <LBB1_4>: 26: r6 += r0
26: r1 = r6 00000000000000d8 <LBB1_4>:
27: w2 = 256 27: r1 = r6
28: r3 = 0 ll 28: w2 = 256
30: call 115 29: r3 = 0 ll
31: call 115
In ALU32 mode, the variant using 64-bit length variable clearly wins and
avoids unnecessary zero-extension bit shifts. In practice, this is even more
important and good, because BPF code won't need to do extra checks to "prove"
that payload/len are within good bounds.
32-bit len is one instruction longer. Clang decided to do 64-to-32 casting
with two bit shifts, instead of equivalent `w1 = w0` assignment. The former
uses extra register. The latter might potentially lose some range information,
but not for 32-bit value. So in this case, verifier infers that r0 is [0, 256]
after check at 18:, and shifting 32 bits left/right keeps that range intact.
We should probably look into Clang's logic and see why it chooses bitshifts
over sub-register assignments for this.
NO-ALU32 + INT NO-ALU32 + LONG
============== ===============
64-BIT (14 insns): 64-BIT (10 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: r0 <<= 32 18: if r0 > 256 goto +6 <LBB0_4>
19: r1 = r0 19: r1 = 0 ll
20: r1 >>= 32 21: *(u64 *)(r1 + 0) = r0
21: if r1 > 256 goto +7 <LBB0_4> 22: r6 = 0 ll
22: r0 s>>= 32 24: r6 += r0
23: r1 = 0 ll 00000000000000c8 <LBB0_4>:
25: *(u64 *)(r1 + 0) = r0 25: r1 = r6
26: r6 = 0 ll 26: r2 = 256
28: r6 += r0 27: r3 = 0 ll
00000000000000e8 <LBB0_4>: 29: call 115
29: r1 = r6
30: r2 = 256
31: r3 = 0 ll
33: call 115
32-BIT (13 insns): 32-BIT (13 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: r1 = r0 18: r1 = r0
19: r1 <<= 32 19: r1 <<= 32
20: r1 >>= 32 20: r1 >>= 32
21: if r1 > 256 goto +6 <LBB1_4> 21: if r1 > 256 goto +6 <LBB1_4>
22: r2 = 0 ll 22: r2 = 0 ll
24: *(u32 *)(r2 + 0) = r0 24: *(u32 *)(r2 + 0) = r0
25: r6 = 0 ll 25: r6 = 0 ll
27: r6 += r1 27: r6 += r1
00000000000000e0 <LBB1_4>: 00000000000000e0 <LBB1_4>:
28: r1 = r6 28: r1 = r6
29: r2 = 256 29: r2 = 256
30: r3 = 0 ll 30: r3 = 0 ll
32: call 115 32: call 115
In NO-ALU32 mode, for the case of 64-bit len variable, Clang generates much
superior code, as expected, eliminating unnecessary bit shifts. For 32-bit
len, code is identical.
So overall, only ALU-32 32-bit len case is more-or-less equivalent and the
difference stems from internal Clang decision, rather than compiler lacking
enough information about types.
Case 2. Let's look at the simpler case of checking return result of BPF helper
for errors. The code is very simple:
long bla;
if (bpf_probe_read_kenerl(&bla, sizeof(bla), 0))
return 1;
else
return 0;
ALU32 + CHECK (9 insns) ALU32 + CHECK (9 insns)
==================================== ====================================
0: r1 = r10 0: r1 = r10
1: r1 += -8 1: r1 += -8
2: w2 = 8 2: w2 = 8
3: r3 = 0 3: r3 = 0
4: call 113 4: call 113
5: w1 = w0 5: r1 = r0
6: w0 = 1 6: w0 = 1
7: if w1 != 0 goto +1 <LBB2_2> 7: if r1 != 0 goto +1 <LBB2_2>
8: w0 = 0 8: w0 = 0
0000000000000048 <LBB2_2>: 0000000000000048 <LBB2_2>:
9: exit 9: exit
Almost identical code, the only difference is the use of full register
assignment (r1 = r0) vs half-registers (w1 = w0) in instruction #5. On 32-bit
architectures, new BPF assembly might be slightly less optimal, in theory. But
one can argue that's not a big issue, given that use of full registers is
still prevalent (e.g., for parameter passing).
NO-ALU32 + CHECK (11 insns) NO-ALU32 + CHECK (9 insns)
==================================== ====================================
0: r1 = r10 0: r1 = r10
1: r1 += -8 1: r1 += -8
2: r2 = 8 2: r2 = 8
3: r3 = 0 3: r3 = 0
4: call 113 4: call 113
5: r1 = r0 5: r1 = r0
6: r1 <<= 32 6: r0 = 1
7: r1 >>= 32 7: if r1 != 0 goto +1 <LBB2_2>
8: r0 = 1 8: r0 = 0
9: if r1 != 0 goto +1 <LBB2_2> 0000000000000048 <LBB2_2>:
10: r0 = 0 9: exit
0000000000000058 <LBB2_2>:
11: exit
NO-ALU32 is a clear improvement, getting rid of unnecessary zero-extension bit
shifts.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-1-andriin@fb.com
Currently the MRP_PORT_ROLE_NONE has the value 0x2 but this is in conflict
with the IEC 62439-2 standard. The standard defines the following port
roles: primary (0x0), secondary(0x1), interconnect(0x2).
Therefore remove the port role none.
Fixes: 4714d13791 ("bridge: uapi: mrp: Add mrp attributes.")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keepalived can set global static ip routes or virtual ip routes dynamically
following VRRP protocol states. Using a dedicated rtm_protocol will help
keeping track of it.
Changes in v2:
- fix tab/space indenting
Signed-off-by: Alexandre Cassen <acassen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the V4L2_FMT_FLAG_ENC_CAP_FRAME_INTERVAL flag to signal that
the coded frame interval can be set separately from the raw frame
interval for stateful encoders.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
Acked-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
This patch lets user-space to request a non-consistent memory
allocation during CREATE_BUFS and REQBUFS ioctl calls.
= CREATE_BUFS
struct v4l2_create_buffers has seven 4-byte reserved areas,
so reserved[0] is renamed to ->flags. The struct, thus, now
has six reserved 4-byte regions.
= CREATE_BUFS32
struct v4l2_create_buffers32 has seven 4-byte reserved areas,
so reserved[0] is renamed to ->flags. The struct, thus, now
has six reserved 4-byte regions.
= REQBUFS
We use one bit of a ->reserved[1] member of struct v4l2_requestbuffers,
which is now renamed to ->flags. Unlike v4l2_create_buffers, struct
v4l2_requestbuffers does not have enough reserved room. Therefore for
backward compatibility ->reserved and ->flags were put into anonymous
union.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
By setting or clearing V4L2_FLAG_MEMORY_NON_CONSISTENT flag
user-space should be able to set or clear queue's NON_CONSISTENT
->dma_attrs. Queue's ->dma_attrs are passed to the underlying
allocator in __vb2_buf_mem_alloc(), so thus user-space is able
to request vb2 buffer's memory to be either consistent (coherent)
or non-consistent.
The patch set also adds a corresponding capability flag:
fill_buf_caps() reports V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
when queue supports user-space cache management hints. Note,
however, that MMAP_CACHE_HINTS capability only valid when the
queue is used for memory MMAP-ed streaming I/O.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
DIAGNOSE 0x318 (diag318) sets information regarding the environment
the VM is running in (Linux, z/VM, etc) and is observed via
firmware/service events.
This is a privileged s390x instruction that must be intercepted by
SIE. Userspace handles the instruction as well as migration. Data
is communicated via VCPU register synchronization.
The Control Program Name Code (CPNC) is stored in the SIE block. The
CPNC along with the Control Program Version Code (CPVC) are stored
in the kvm_vcpu_arch struct.
This data is reset on load normal and clear resets.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200622154636.5499-3-walling@linux.ibm.com
[borntraeger@de.ibm.com: fix sync_reg position]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Board serial number is a serial number, often available in PCI
*Vital Product Data*.
Also, update devlink-info.rst documentation file.
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PCI PF and VF devlink port can manage the function represented by
a devlink port.
Enable users to query port function's hardware address.
Example of a PCI VF port which supports a port function:
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
function:
hw_addr 00:11:22:33:44:66
$ devlink port show pci/0000:06:00.0/2 -jp
{
"port": {
"pci/0000:06:00.0/2": {
"type": "eth",
"netdev": "enp6s0pf0vf1",
"flavour": "pcivf",
"pfnum": 0,
"vfnum": 1,
"function": {
"hw_addr": "00:11:22:33:44:66"
}
}
}
}
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quite a lot of fixes here for no single reason. There's a collection of
the usual sort of device specific fixes and also a bunch of people have
been working on spidev and the userspace test program spidev_test so
they've got an unusually large collection of small fixes.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wl/8THGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0NJBB/4oTOxPIfkKQyPcSTQB8FC6AHJ5Z8FT
/4dtK4U32rn5Sms/CHNXMMFxomBgB46Ud19KJ/bEgqugJfI9Vl0bl9fZT+sDyVU3
rFPthVluIprFe1bITyfcgYmjRoznLRdzj9LMFsAWIe3Vmy584TdvXbl8n5COu5Wh
6sR6SuQnEn65x1HC4++3IsG70+ogz81sFRapk/7jXwS+YdwzWIqtN2s2IlUcTmcy
ylijl04XUSyo6e/v/sjZaaRmIz72mW+HbMKW4iC0mGZ6yuACoy7zop9BuwAg6S7z
dFDyqTZ5gq72ZeJc7fuco1y7jqQoKUC1oTRghPlbq2XG3Huta80l415l
=32lr
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"Quite a lot of fixes here for no single reason.
There's a collection of the usual sort of device specific fixes and
also a bunch of people have been working on spidev and the userspace
test program spidev_test so they've got an unusually large collection
of small fixes"
* tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spidev: fix a potential use-after-free in spidev_release()
spi: spidev: fix a race between spidev_release and spidev_remove
spi: stm32-qspi: Fix error path in case of -EPROBE_DEFER
spi: uapi: spidev: Use TABs for alignment
spi: spi-fsl-dspi: Free DMA memory with matching function
spi: tools: Add macro definitions to fix build errors
spi: tools: Make default_tx/rx and input_tx static
spi: dt-bindings: amlogic, meson-gx-spicc: Fix schema for meson-g12a
spi: rspi: Use requested instead of maximum bit rate
spi: spidev_test: Use %u to format unsigned numbers
spi: sprd: switch the sequence of setting WDG_LOAD_LOW and _HIGH
poll events should be 32-bits to cover EPOLLEXCLUSIVE.
Explicit word-swap the poll32_events for big endian to make sure the ABI
is not changed. We call this feature IORING_FEAT_POLL_32BITS,
applications who want to use EPOLLEXCLUSIVE should check the feature bit
first.
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Fix the visibility of the region 'align' attribute. The new unit tests
for region alignment handling caught a corner case where the alignment
cannot be specified if the region is converted from static to dynamic
provisioning at runtime.
- Add support for device health retrieval for the persistent memory
supported by the papr_scm driver. This includes both the standard
sysfs "health flags" that the nfit persistent memory driver publishes
and a mechanism for the ndctl tool to retrieve a health-command payload.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEf41QbsdZzFdA8EfZHtKRamZ9iAIFAl7tMD8ACgkQHtKRamZ9
iAJNdA//aoULVhq/ZEo+tiPPXwgC9yHPJJ+Oe3+pcV2saWNeSBuHu047IdSQayRi
6VZgsQ3lLRlEmk5aUbK00T84tVtwN20fppw1H4XyqfyGg4hmW7O5HomY8ZyOAETc
0lEJet3+jf4Ym36BxBvfFTU9fOwtvLum5EHUw3Dc1Xl7UZVtp2zb0lklZx5xU/xr
/nKNkPktvj7ENtmIOKegPFC/eSeKtOmD9shzjFygLeG52ZU1n+TEvhm4LCUU7Z+Y
e5s/Ny3LUWzPY+bESXxTI1XpLthWOR6T+yPkNp6JoZHzk3P6vKawMGzRrvI1bUai
hOj7Bf6BeNnKqIXIIZOXO8/ISgveHbnvUp6Kk5Lq5OYzPuGn81zZhen3EfHkB0y2
Dc+xyceDsTeTlu2RGcrGrV60hCo2+5DqSLodu+kUAX6btvhtdrEgt3kmWAZSctk2
7gHy8H+54p5ocXS+eDqafX6qYGmQIdGbtfTtiTuZJKIIH+Dsh9+mcu3KtH1AwtpN
bV+/sbjuWOMmMBNnlVRtqhy1xA9bSVSaodthl+iWpw1nKXFPSKdl3w8ngSeZsfFD
eZnGa4hO5G02s96gozh1n61ye8KFJLk8gCjx100PQKCemvIWTWLIzlh2NjPnCdTx
SAI0vNfJgpmNZfb62NHYAGFwNwvOn4b7j9xFSqtHVh7+Vuyt6JM=
=qVy4
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"A feature (papr_scm health retrieval) and a fix (sysfs attribute
visibility) for v5.8.
Vaibhav explains in the merge commit below why missing v5.8 would be
painful and I agreed to try a -rc2 pull because only cosmetics kept
this out of -rc1 and his initial versions were posted in more than
enough time for v5.8 consideration:
'These patches are tied to specific features that were committed to
customers in upcoming distros releases (RHEL and SLES) whose
time-lines are tied to 5.8 kernel release.
Being able to track the health of an nvdimm is critical for our
customers that are running workloads leveraging papr-scm nvdimms.
Missing the 5.8 kernel would mean missing the distro timelines and
shifting forward the availability of this feature in distro kernels
by at least 6 months'
Summary:
- Fix the visibility of the region 'align' attribute.
The new unit tests for region alignment handling caught a corner
case where the alignment cannot be specified if the region is
converted from static to dynamic provisioning at runtime.
- Add support for device health retrieval for the persistent memory
supported by the papr_scm driver.
This includes both the standard sysfs "health flags" that the nfit
persistent memory driver publishes and a mechanism for the ndctl
tool to retrieve a health-command payload"
* tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm/region: always show the 'align' attribute
powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
powerpc/papr_scm: Fetch nvdimm health information from PHYP
seq_buf: Export seq_buf_printf
powerpc: Document details on H_SCM_HEALTH hcall
AFBC has a mode that guarantees use of AFBC with an uncompressed
payloads, we add a new modifier to support this mode.
V2: updated modifier comment
Signed-off-by: Ben Davis <ben.davis@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430083220.17347-1-ben.davis@arm.com
DRM_FORMAT_NV15 is a 2 plane format suitable for linear and 16x16
block-linear memory layouts (DRM_FORMAT_MOD_SAMSUNG_16_16_TILE). The
format is similar to P010 with 4:2:0 sub-sampling but has no padding
between components. Instead, luminance and chrominance samples are
grouped into 4s so that each group is packed into an integer number
of bytes:
YYYY = UVUV = 4 * 10 bits = 40 bits = 5 bytes
The '15' suffix refers to the optimum effective bits per pixel which is
achieved when the total number of luminance samples is a multiple of 8.
Q410 and Q401 are both 3 plane non-subsampled formats with 16 bits per
component, but only 10 bits are used and 6 are padded. 'Q' is chosen
as the first letter to denote 3 plane YUV444, (and is the next letter
along from P which is usually 2 plane).
V2: Updated block_w of NV15 to {4, 2, 0}
V3: Updated commit message to include specific modifier name
NV15:
Tested-by: Jonas Karlman <jonas@kwiboo.se>
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: Ben Davis <ben.davis@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601162817.18230-1-ben.davis@arm.com
ID 1 is already used by the IOVA range capability, use ID 2.
Reported-by: Liu Yi L <yi.l.liu@intel.com>
Fixes: ad721705d0 ("vfio iommu: Add migration capability to report supported features")
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Replace hardcoded maximum USB string length (126 bytes) by definition
"USB_MAX_STRING_LEN".
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/1592471618-29428-1-git-send-email-macpaul.lin@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Several newer USB Device classes are not presently reported individually at
/sys/kernel/debug/usb/devices, (reported as "unk."). This patch adds the
following classes: 0fh (Personal Healthcare devices), 10h (USB Type-C combined
Audio/Video devices) 11h (USB billboard), 12h (USB Type-C Bridge). As defined
at [https://www.usb.org/defined-class-codes]
Corresponding classes defined in include/linux/usb/ch9.h.
Signed-off-by: Rob Gill <rrobgill@protonmail.com>
Link: https://lore.kernel.org/r/20200601211749.6878-1-rrobgill@protonmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-17
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 158 insertions(+), 59 deletions(-).
The main changes are:
1) Important fix for bpf_probe_read_kernel_str() return value, from Andrii.
2) [gs]etsockopt fix for large optlen, from Stanislav.
3) devmap allocation fix, from Toke.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the use-cases of close_range() is to drop file descriptors just before
execve(). This would usually be expressed in the sequence:
unshare(CLONE_FILES);
close_range(3, ~0U);
as pointed out by Linus it might be desirable to have this be a part of
close_range() itself under a new flag CLOSE_RANGE_UNSHARE.
This expands {dup,unshare)_fd() to take a max_fds argument that indicates the
maximum number of file descriptors to copy from the old struct files. When the
user requests that all file descriptors are supposed to be closed via
close_range(min, max) then we can cap via unshare_fd(min) and hence don't need
to do any of the heavy fput() work for everything above min.
The patch makes it so that if CLOSE_RANGE_UNSHARE is requested and we do in
fact currently share our file descriptor table we create a new private copy.
We then close all fds in the requested range and finally after we're done we
install the new fd table.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.
The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a 'struct nd_pkg_pdsm' and a maximal union named
'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg'
for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by
papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is
communicated by member 'struct nd_cmd_pkg.nd_command' together with
other information on the pdsm payload (size-in, size-out).
The patch also introduces 'struct pdsm_cmd_desc' instances of which
are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd
and corresponding access function pdsm_cmd_desc() is
introduced. 'struct pdsm_cdm_desc' holds the service function for a
given PDSM and corresponding payload in/out sizes.
A new function papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm. The function performs validation on the PDSM
payload based on info present in corresponding PDSM descriptor and if
valid calls the 'struct pdcm_cmd_desc.service' function to service the
PDSM.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix definition of bpf_ringbuf_output() in UAPI header comments, which is used
to generate libbpf's bpf_helper_defs.h header. Return value is a number (error
code), not a pointer.
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200615214926.3638836-1-andriin@fb.com
includes the per-inode DAX support, which was dependant on the DAX
infrastructure which came in via the XFS tree, and a number of
regression and bug fixes; most notably the "BUG: using
smp_processor_id() in preemptible code in ext4_mb_new_blocks" reported
by syzkaller.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAl7mgCcACgkQ8vlZVpUN
gaPftwf8C4w/7SG+CYLdwg0d2u9TKk77yDuWaioFHOcMSjZvG4TCSgtMhZxQnyty
9t4yqacILx12pCj/mZnrZp5BOSn9O2ZbuDoXNKNrFXU0BF+CsbnhvJvrrh1j/MUa
PPtcqyGFdOLSDvHSD9xPVT76juwh79aR8vB7qnQXaEO5wcLodZWoqBEFSKCl6Bo8
hjXs1EXidusKsoarQxW6mEITmnhU2S2fuCVDgVcoM/LmKwzbgqvlWrentq9u8qLH
W+XbjWgUtCM1byeDZWqe5FYyyJ8x+dTv7H5an3KR92EN6hKo5AOvzA0I41pZscq/
bJ9p2THDxJQX4rJBevGAS5mZ6hTkRw==
=z6eO
-----END PGP SIGNATURE-----
Merge tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull more ext4 updates from Ted Ts'o:
"This is the second round of ext4 commits for 5.8 merge window [1].
It includes the per-inode DAX support, which was dependant on the DAX
infrastructure which came in via the XFS tree, and a number of
regression and bug fixes; most notably the "BUG: using
smp_processor_id() in preemptible code in ext4_mb_new_blocks" reported
by syzkaller"
[1] The pull request actually came in 15 minutes after I had tagged the
rc1 release. Tssk, tssk, late.. - Linus
* tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4, jbd2: ensure panic by fix a race between jbd2 abort and ext4 error handlers
ext4: support xattr gnu.* namespace for the Hurd
ext4: mballoc: Use this_cpu_read instead of this_cpu_ptr
ext4: avoid utf8_strncasecmp() with unstable name
ext4: stop overwrite the errcode in ext4_setup_super
ext4: fix partial cluster initialization when splitting extent
ext4: avoid race conditions when remounting with options that change dax
Documentation/dax: Update DAX enablement for ext4
fs/ext4: Introduce DAX inode flag
fs/ext4: Remove jflag variable
fs/ext4: Make DAX mount option a tri-state
fs/ext4: Only change S_DAX on inode load
fs/ext4: Update ext4_should_use_dax()
fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
fs/ext4: Disallow verity if inode is DAX
fs/ext4: Narrow scope of DAX check in setflags
The UAPI <linux/spi/spidev.h> uses TABs for alignment.
Convert the recently introduced spaces to TABs to restore consistency.
Fixes: 7bb64402a0 ("spi: tools: Add macro definitions to fix build errors")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20200613073755.15906-1-geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
Symbols are needed for tools to describe instruction addresses. Pages
allocated for ftrace's purposes need symbols to be created for them.
Add such symbols to be visible via perf ksymbol events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200512121922.8997-8-adrian.hunter@intel.com
Symbols are needed for tools to describe instruction addresses. Pages
allocated for kprobe's purposes need symbols to be created for them.
Add such symbols to be visible via perf ksymbol events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20200512121922.8997-5-adrian.hunter@intel.com
Record (single instruction) changes to the kernel text (i.e.
self-modifying code) in order to support tracers like Intel PT and
ARM CoreSight.
A copy of the running kernel code is needed as a reference point (e.g.
from /proc/kcore). The text poke event records the old bytes and the
new bytes so that the event can be processed forwards or backwards.
The basic problem is recording the modified instruction in an
unambiguous manner given SMP instruction cache (in)coherence. That is,
when modifying an instruction concurrently any solution with one or
multiple timestamps is not sufficient:
CPU0 CPU1
0
1 write insn A
2 execute insn A
3 sync-I$
4
Due to I$, CPU1 might execute either the old or new A. No matter where
we record tracepoints on CPU0, one simply cannot tell what CPU1 will
have observed, except that at 0 it must be the old one and at 4 it
must be the new one.
To solve this, take inspiration from x86 text poking, which has to
solve this exact problem due to variable length instruction encoding
and I-fetch windows.
1) overwrite the instruction with a breakpoint and sync I$
This guarantees that that code flow will never hit the target
instruction anymore, on any CPU (or rather, it will cause an
exception).
2) issue the TEXT_POKE event
3) overwrite the breakpoint with the new instruction and sync I$
Now we know that any execution after the TEXT_POKE event will either
observe the breakpoint (and hit the exception) or the new instruction.
So by guarding the TEXT_POKE event with an exception on either side;
we can now tell, without doubt, which instruction another CPU will
have observed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200512121922.8997-2-adrian.hunter@intel.com
Pull networking fixes from David Miller:
1) Fix cfg80211 deadlock, from Johannes Berg.
2) RXRPC fails to send norigications, from David Howells.
3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
Geliang Tang.
4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.
5) The ucc_geth driver needs __netdev_watchdog_up exported, from
Valentin Longchamp.
6) Fix hashtable memory leak in dccp, from Wang Hai.
7) Fix how nexthops are marked as FDB nexthops, from David Ahern.
8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.
9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.
10) Fix link speed reporting in iavf driver, from Brett Creeley.
11) When a channel is used for XSK and then reused again later for XSK,
we forget to clear out the relevant data structures in mlx5 which
causes all kinds of problems. Fix from Maxim Mikityanskiy.
12) Fix memory leak in genetlink, from Cong Wang.
13) Disallow sockmap attachments to UDP sockets, it simply won't work.
From Lorenz Bauer.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: ethernet: ti: ale: fix allmulti for nu type ale
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
net: atm: Remove the error message according to the atomic context
bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
libbpf: Support pre-initializing .bss global variables
tools/bpftool: Fix skeleton codegen
bpf: Fix memlock accounting for sock_hash
bpf: sockmap: Don't attach programs to UDP sockets
bpf: tcp: Recv() should return 0 when the peer socket is closed
ibmvnic: Flush existing work items before device removal
genetlink: clean up family attributes allocations
net: ipa: header pad field only valid for AP->modem endpoint
net: ipa: program upper nibbles of sequencer type
net: ipa: fix modem LAN RX endpoint id
net: ipa: program metadata mask differently
ionic: add pcie_print_link_status
rxrpc: Fix race between incoming ACK parser and retransmitter
net/mlx5: E-Switch, Fix some error pointer dereferences
net/mlx5: Don't fail driver on failure to create debugfs
net/mlx5e: CT: Fix ipv6 nat header rewrite actions
...
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-12
The following pull-request contains BPF updates for your *net* tree.
We've added 26 non-merge commits during the last 10 day(s) which contain
a total of 27 files changed, 348 insertions(+), 93 deletions(-).
The main changes are:
1) sock_hash accounting fix, from Andrey.
2) libbpf fix and probe_mem sanitizing, from Andrii.
3) sock_hash fixes, from Jakub.
4) devmap_val fix, from Jesper.
5) load_bytes_relative fix, from YiFei.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl7U/i8ACgkQ+7dXa6fL
C2u2eg/+Oy6ybq0hPovYVkFI9WIG7ZCz7w9Q6BEnfYMqqn3dnfJxKQ3l4pnQEOWw
f4QfvpvevsYfMtOJkYcG6s66rQgbFdqc5TEyBBy0QNp3acRolN7IXkcopvv9xOpQ
JxedpbFG1PTFLWjvBpyjlrUPouwLzq2FXAf1Ox0ZIMw6165mYOMWoli1VL8dh0A0
Ai7JUB0WrvTNbrwhV413obIzXT/rPCdcrgbQcgrrLPex8lQ47ZAE9bq6k4q5HiwK
KRzEqkQgnzId6cCNTFBfkTWsx89zZunz7jkfM5yx30MvdAtPSxvvpfIPdZRZkXsP
E2K9Fk1/6OQZTC0Op3Pi/bt+hVG/mD1p0sQUDgo2MO3qlSS+5mMkR8h3mJEgwK12
72P4YfOJkuAy2z3v4lL0GYdUDAZY6i6G8TMxERKu/a9O3VjTWICDOyBUS6F8YEAK
C7HlbZxAEOKTVK0BTDTeEUBwSeDrBbvH6MnRlZCG5g1Fos2aWP0udhjiX8IfZLO7
GN6nWBvK1fYzfsUczdhgnoCzQs3suoDo04HnsTPGJ8De52T4x2RsjV+gPx0nrNAq
eWChl1JvMWsY2B3GLnl9XQz4NNN+EreKEkk+PULDGllrArrPsp5Vnhb9FJO1PVCU
hMDJHohPiXnKbc8f4Bd78OhIvnuoGfJPdM5MtNe2flUKy2a2ops=
=YTGf
-----END PGP SIGNATURE-----
Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull notification queue from David Howells:
"This adds a general notification queue concept and adds an event
source for keys/keyrings, such as linking and unlinking keys and
changing their attributes.
Thanks to Debarshi Ray, we do have a pull request to use this to fix a
problem with gnome-online-accounts - as mentioned last time:
https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47
Without this, g-o-a has to constantly poll a keyring-based kerberos
cache to find out if kinit has changed anything.
[ There are other notification pending: mount/sb fsinfo notifications
for libmount that Karel Zak and Ian Kent have been working on, and
Christian Brauner would like to use them in lxc, but let's see how
this one works first ]
LSM hooks are included:
- A set of hooks are provided that allow an LSM to rule on whether or
not a watch may be set. Each of these hooks takes a different
"watched object" parameter, so they're not really shareable. The
LSM should use current's credentials. [Wanted by SELinux & Smack]
- A hook is provided to allow an LSM to rule on whether or not a
particular message may be posted to a particular queue. This is
given the credentials from the event generator (which may be the
system) and the watch setter. [Wanted by Smack]
I've provided SELinux and Smack with implementations of some of these
hooks.
WHY
===
Key/keyring notifications are desirable because if you have your
kerberos tickets in a file/directory, your Gnome desktop will monitor
that using something like fanotify and tell you if your credentials
cache changes.
However, we also have the ability to cache your kerberos tickets in
the session, user or persistent keyring so that it isn't left around
on disk across a reboot or logout. Keyrings, however, cannot currently
be monitored asynchronously, so the desktop has to poll for it - not
so good on a laptop. This facility will allow the desktop to avoid the
need to poll.
DESIGN DECISIONS
================
- The notification queue is built on top of a standard pipe. Messages
are effectively spliced in. The pipe is opened with a special flag:
pipe2(fds, O_NOTIFICATION_PIPE);
The special flag has the same value as O_EXCL (which doesn't seem
like it will ever be applicable in this context)[?]. It is given up
front to make it a lot easier to prohibit splice&co from accessing
the pipe.
[?] Should this be done some other way? I'd rather not use up a new
O_* flag if I can avoid it - should I add a pipe3() system call
instead?
The pipe is then configured::
ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);
Messages are then read out of the pipe using read().
- It should be possible to allow write() to insert data into the
notification pipes too, but this is currently disabled as the
kernel has to be able to insert messages into the pipe *without*
holding pipe->mutex and the code to make this work needs careful
auditing.
- sendfile(), splice() and vmsplice() are disabled on notification
pipes because of the pipe->mutex issue and also because they
sometimes want to revert what they just did - but one or more
notification messages might've been interleaved in the ring.
- The kernel inserts messages with the wait queue spinlock held. This
means that pipe_read() and pipe_write() have to take the spinlock
to update the queue pointers.
- Records in the buffer are binary, typed and have a length so that
they can be of varying size.
This allows multiple heterogeneous sources to share a common
buffer; there are 16 million types available, of which I've used
just a few, so there is scope for others to be used. Tags may be
specified when a watchpoint is created to help distinguish the
sources.
- Records are filterable as types have up to 256 subtypes that can be
individually filtered. Other filtration is also available.
- Notification pipes don't interfere with each other; each may be
bound to a different set of watches. Any particular notification
will be copied to all the queues that are currently watching for it
- and only those that are watching for it.
- When recording a notification, the kernel will not sleep, but will
rather mark a queue as having lost a message if there's
insufficient space. read() will fabricate a loss notification
message at an appropriate point later.
- The notification pipe is created and then watchpoints are attached
to it, using one of:
keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);
where in both cases, fd indicates the queue and the number after is
a tag between 0 and 255.
- Watches are removed if either the notification pipe is destroyed or
the watched object is destroyed. In the latter case, a message will
be generated indicating the enforced watch removal.
Things I want to avoid:
- Introducing features that make the core VFS dependent on the
network stack or networking namespaces (ie. usage of netlink).
- Dumping all this stuff into dmesg and having a daemon that sits
there parsing the output and distributing it as this then puts the
responsibility for security into userspace and makes handling
namespaces tricky. Further, dmesg might not exist or might be
inaccessible inside a container.
- Letting users see events they shouldn't be able to see.
TESTING AND MANPAGES
====================
- The keyutils tree has a pipe-watch branch that has keyctl commands
for making use of notifications. Proposed manual pages can also be
found on this branch, though a couple of them really need to go to
the main manpages repository instead.
If the kernel supports the watching of keys, then running "make
test" on that branch will cause the testing infrastructure to spawn
a monitoring process on the side that monitors a notifications pipe
for all the key/keyring changes induced by the tests and they'll
all be checked off to make sure they happened.
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch
- A test program is provided (samples/watch_queue/watch_test) that
can be used to monitor for keyrings, mount and superblock events.
Information on the notifications is simply logged to stdout"
* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
smack: Implement the watch_key and post_notification hooks
selinux: Implement the watch_key security hook
keys: Make the KEY_NEED_* perms an enum rather than a mask
pipe: Add notification lossage handling
pipe: Allow buffers to be marked read-whole-or-error for notifications
Add sample notification program
watch_queue: Add a key/keyring notification facility
security: Add hooks to rule on setting a watch
pipe: Add general notification queue support
pipe: Add O_NOTIFICATION_PIPE
security: Add a hook for the point of notification insertion
uapi: General notification queue definitions
Add SPI_TX_OCTAL and SPI_RX_OCTAL to fix the following build errors:
CC spidev_test.o
spidev_test.c: In function ‘transfer’:
spidev_test.c:131:13: error: ‘SPI_TX_OCTAL’ undeclared (first use in this function)
if (mode & SPI_TX_OCTAL)
^
spidev_test.c:131:13: note: each undeclared identifier is reported only once for each function it appears in
spidev_test.c:137:13: error: ‘SPI_RX_OCTAL’ undeclared (first use in this function)
if (mode & SPI_RX_OCTAL)
^
spidev_test.c: In function ‘parse_opts’:
spidev_test.c:290:12: error: ‘SPI_TX_OCTAL’ undeclared (first use in this function)
mode |= SPI_TX_OCTAL;
^
spidev_test.c:308:12: error: ‘SPI_RX_OCTAL’ undeclared (first use in this function)
mode |= SPI_RX_OCTAL;
^
LD spidev_test-in.o
ld: cannot find spidev_test.o: No such file or directory
Additionally, maybe SPI_CS_WORD and SPI_3WIRE_HIZ will be used in the future,
so add them too.
Fixes: 896fa73508 ("spi: spidev_test: Add support for Octal mode data transfers")
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/1591880212-13479-2-git-send-email-zhangqing@loongson.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
This adds the same per-file/per-directory DAX support for ext4 as was
done for xfs, now that we finally have consensus over what the
interface should be.
virtio-mem
doorbell mapping for vdpa
config interrupt support in ifc
fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl7fZ6APHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpkDoIAMcBcQx5su1iuX7vT35xzUWZO478eAf1jOMZ
7KxKUVBeztkcxVFUlRVRu9MR6wOzwHils+1HD6025775Smr5M6x3aJxR6xOORaBj
RoU6OVGkpDvbzsxlhW+xhONz4O7/RkveKJPCwzGjqHrsFeh92lkfTqroz/EuNpw+
LZsO0+DhdUf123HbwHQp5lxW8EjyrRabgeZZg/D9VLPhoCP88vCjRhBXU2GPuaUl
/UNXsQafn4xUgrxPaoN5f4Phn/P46NNrbZ1jmlkw/z/3QhF/DhktGXGaZsIHDCN/
vicUii0or5QLeBsZpMbKko/BIe2xWHxFjkMRhMOMZOfcBb6sMBI=
=auUa
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- virtio-mem: paravirtualized memory hotplug
- support doorbell mapping for vdpa
- config interrupt support in ifc
- fixes all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits)
vhost/test: fix up after API change
virtio_mem: convert device block size into 64bit
virtio-mem: drop unnecessary initialization
ifcvf: implement config interrupt in IFCVF
vhost: replace -1 with VHOST_FILE_UNBIND in ioctls
vhost_vdpa: Support config interrupt in vdpa
ifcvf: ignore continuous setting same status value
virtio-mem: Don't rely on implicit compiler padding for requests
virtio-mem: Try to unplug the complete online memory block first
virtio-mem: Use -ETXTBSY as error code if the device is busy
virtio-mem: Unplug subblocks right-to-left
virtio-mem: Drop manual check for already present memory
virtio-mem: Add parent resource for all added "System RAM"
virtio-mem: Better retry handling
virtio-mem: Offline and remove completely unplugged memory blocks
mm/memory_hotplug: Introduce offline_and_remove_memory()
virtio-mem: Allow to offline partially unplugged memory blocks
mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE
virtio-mem: Paravirtualized memory hotunplug part 2
virtio-mem: Paravirtualized memory hotunplug part 1
...
* partition parser: Support MTD names containing one or more colons.
* mtdblock: clear cache_state to avoid writing to bad blocks repeatedly.
Raw NAND core changes:
* Stop using nand_release(), patched all drivers.
* Give more information about the ECC weakness when not matching the
chip's requirement.
* MAINTAINERS updates.
* Support emulated SLC mode on MLC NANDs.
* Support "constrained" controllers, adapt the core and ONFI/JEDEC
table parsing and Micron's code.
* Take check_only into account.
* Add an invalid ECC mode to discriminate with valid ones.
* Return an enum from of_get_nand_ecc_algo().
* Drop OOB_FIRST placement scheme.
* Introduce nand_extract_bits().
* Ensure a consistent bitflips numbering.
* BCH lib:
- Allow easy bit swapping.
- Rework a little bit the exported function names.
* Fix nand_gpio_waitrdy().
* Propage CS selection to sub operations.
* Add a NAND_NO_BBM_QUIRK flag.
* Give the possibility to verify a read operation is supported.
* Add a helper to check supported operations.
* Avoid indirect access to ->data_buf().
* Rename the use_bufpoi variables.
* Fix comments about the use of bufpoi.
* Rename a NAND chip option.
* Reorder the nand_chip->options flags.
* Translate obscure bitfields into readable macros.
* Timings:
- Fix default values.
- Add mode information to the timings structure.
Raw NAND controller driver changes:
* Fixed many error paths.
* Arasan
- New driver
* Au1550nd:
- Various cleanups
- Migration to ->exec_op()
* brcmnand:
- Misc cleanup.
- Support v2.1-v2.2 controllers.
- Remove unused including <linux/version.h>.
- Correctly verify erased pages.
- Fix Hamming OOB layout.
* Cadence
- Make cadence_nand_attach_chip static.
* Cafe:
- Set the NAND_NO_BBM_QUIRK flag
* cmx270:
- Remove this controller driver.
* cs553x:
- Misc cleanup
- Migration to ->exec_op()
* Davinci:
- Misc cleanup.
- Migration to ->exec_op()
* Denali:
- Add more delays before latching incoming data
* Diskonchip:
- Misc cleanup
- Migration to ->exec_op()
* Fsmc:
- Change to non-atomic bit operations.
* GPMI:
- Use nand_extract_bits()
- Fix runtime PM imbalance.
* Ingenic:
- Migration to exec_op()
- Fix the RB gpio active-high property on qi, lb60
- Make qi_lb60_ooblayout_ops static.
* Marvell:
- Misc cleanup and small fixes
* Nandsim:
- Fix the error paths, driver wide.
* Omap_elm:
- Fix runtime PM imbalance.
* STM32_FMC2:
- Misc cleanups (error cases, comments, timeout valus, cosmetic
changes).
SPI NOR core changes:
* Add, update support and fix few flashes.
* Prepare BFPT parsing for JESD216 rev D.
* Kernel doc fixes.
CFI changes:
* Support the absence of protection registers for Intel CFI flashes.
* Replace zero-length array with flexible-arrays.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAl7f8msWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wXhlEADO3dfrWS9bsbZokuMppHlOTAsm
d0hPexu0ztRYr2qWgXScENtrcJ/0ygDPxEQxwIiWYAqwFn6yOBbum+tOo25edbEH
hGpkV5551vc48vD55nvxdoyWiJAgx93jVmfXU/Ad8EBDV4wGTBwzJJvZ8bxovUIl
Ccs9p8KU/Z5c7yNhYtLOJChU3gfMS0WS5iQakSnnT82TFJIdC8d8Y+bjupRfvHbz
ZkEC44Y+QcvSX6C2PJ2U9ScBf6r6WZkHmpOef8UzrxdLRvnhU16u9yRlepsm2D+x
KycQ81KPBhagLI+9AWGZQYma5GH0z+40LmhxR6YBS0ipS2lAc1wM904KB8RGohfl
SY4EYQSyx2/42bLEgR823rCfIIrzzNvjwnWdcZik2p2IWsocpzhdW2Fe3eJ7ULUe
9toQMg8JObawyKw7vRJtdiiX/OsNNv53FJsRu6rHkq3kgLXcmAUQYMh02LQFAkD6
gT8W8wmseZixI6mnG7tV2KHtRU70xWTTwJgFp5FvvBAP0p7KfbIlgJ0XrzQor2vB
3Jhb7e2DrOfu2RatZ12bmQpvpoU1Jv1U81UNnwsNpXawCPuRUYG3KCt+hjjr7HiV
++7YZ01pQ1GQ/pgcprwwKcpw5iTah0uXUEnE6pVbyX7hxg+OBgsWh8SK9MZMUioM
3yUGbotWAu2j6uM46g==
=2M9Y
-----END PGP SIGNATURE-----
Merge tag 'mtd/for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD updates from Richard Weinberger:
"MTD core changes:
- partition parser: Support MTD names containing one or more colons.
- mtdblock: clear cache_state to avoid writing to bad blocks
repeatedly.
Raw NAND core changes:
- Stop using nand_release(), patched all drivers.
- Give more information about the ECC weakness when not matching the
chip's requirement.
- MAINTAINERS updates.
- Support emulated SLC mode on MLC NANDs.
- Support "constrained" controllers, adapt the core and ONFI/JEDEC
table parsing and Micron's code.
- Take check_only into account.
- Add an invalid ECC mode to discriminate with valid ones.
- Return an enum from of_get_nand_ecc_algo().
- Drop OOB_FIRST placement scheme.
- Introduce nand_extract_bits().
- Ensure a consistent bitflips numbering.
- BCH lib:
- Allow easy bit swapping.
- Rework a little bit the exported function names.
- Fix nand_gpio_waitrdy().
- Propage CS selection to sub operations.
- Add a NAND_NO_BBM_QUIRK flag.
- Give the possibility to verify a read operation is supported.
- Add a helper to check supported operations.
- Avoid indirect access to ->data_buf().
- Rename the use_bufpoi variables.
- Fix comments about the use of bufpoi.
- Rename a NAND chip option.
- Reorder the nand_chip->options flags.
- Translate obscure bitfields into readable macros.
- Timings:
- Fix default values.
- Add mode information to the timings structure.
Raw NAND controller driver changes:
- Fixed many error paths.
- Arasan
- New driver
- Au1550nd:
- Various cleanups
- Migration to ->exec_op()
- brcmnand:
- Misc cleanup.
- Support v2.1-v2.2 controllers.
- Remove unused including <linux/version.h>.
- Correctly verify erased pages.
- Fix Hamming OOB layout.
- Cadence
- Make cadence_nand_attach_chip static.
- Cafe:
- Set the NAND_NO_BBM_QUIRK flag
- cmx270:
- Remove this controller driver.
- cs553x:
- Misc cleanup
- Migration to ->exec_op()
- Davinci:
- Misc cleanup.
- Migration to ->exec_op()
- Denali:
- Add more delays before latching incoming data
- Diskonchip:
- Misc cleanup
- Migration to ->exec_op()
- Fsmc:
- Change to non-atomic bit operations.
- GPMI:
- Use nand_extract_bits()
- Fix runtime PM imbalance.
- Ingenic:
- Migration to exec_op()
- Fix the RB gpio active-high property on qi, lb60
- Make qi_lb60_ooblayout_ops static.
- Marvell:
- Misc cleanup and small fixes
- Nandsim:
- Fix the error paths, driver wide.
- Omap_elm:
- Fix runtime PM imbalance.
- STM32_FMC2:
- Misc cleanups (error cases, comments, timeout valus, cosmetic
changes).
SPI NOR core changes:
- Add, update support and fix few flashes.
- Prepare BFPT parsing for JESD216 rev D.
- Kernel doc fixes.
CFI changes:
- Support the absence of protection registers for Intel CFI flashes.
- Replace zero-length array with flexible-arrays"
* tag 'mtd/for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (208 commits)
mtd: clear cache_state to avoid writing to bad blocks repeatedly
mtd: parser: cmdline: Support MTD names containing one or more colons
mtd: physmap_of_gemini: remove defined but not used symbol 'syscon_match'
mtd: rawnand: Add an invalid ECC mode to discriminate with valid ones
mtd: rawnand: Return an enum from of_get_nand_ecc_algo()
mtd: rawnand: Drop OOB_FIRST placement scheme
mtd: rawnand: Avoid a typedef
mtd: Fix typo in mtd_ooblayout_set_databytes() description
mtd: rawnand: Stop using nand_release()
mtd: rawnand: nandsim: Reorganize ns_cleanup_module()
mtd: rawnand: nandsim: Rename a label in ns_init_module()
mtd: rawnand: nandsim: Manage lists on error in ns_init_module()
mtd: rawnand: nandsim: Fix the label pointing on nand_cleanup()
mtd: rawnand: nandsim: Free erase_block_wear on error
mtd: rawnand: nandsim: Use an additional label when freeing the nandsim object
mtd: rawnand: nandsim: Stop using nand_release()
mtd: rawnand: nandsim: Free the partition names in ns_free()
mtd: rawnand: nandsim: Free the allocated device on error in ns_init()
mtd: rawnand: nandsim: Free partition names on error in ns_init()
mtd: rawnand: nandsim: Fix the two ns_alloc_device() error paths
...
V2:
- Defer changing BPF-syscall to start at file-descriptor 1
- Use {} to zero initialise struct.
The recent commit fbee97feed ("bpf: Add support to attach bpf program to a
devmap entry"), introduced ability to attach (and run) a separate XDP
bpf_prog for each devmap entry. A bpf_prog is added via a file-descriptor.
As zero were a valid FD, not using the feature requires using value minus-1.
The UAPI is extended via tail-extending struct bpf_devmap_val and using
map->value_size to determine the feature set.
This will break older userspace applications not using the bpf_prog feature.
Consider an old userspace app that is compiled against newer kernel
uapi/bpf.h, it will not know that it need to initialise the member
bpf_prog.fd to minus-1. Thus, users will be forced to update source code to
get program running on newer kernels.
This patch remove the minus-1 checks, and have zero mean feature isn't used.
Followup patches either for kernel or libbpf should handle and avoid
returning file-descriptor zero in the first place.
Fixes: fbee97feed ("bpf: Add support to attach bpf program to a devmap entry")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159170950687.2102545.7235914718298050113.stgit@firesoul
If subblock size is large (e.g. 1G) 32 bit math involving it
can overflow. Rather than try to catch all instances of that,
let's tweak block size to 64 bit.
It ripples through UAPI which is an ABI change, but it's not too late to
make it, and it will allow supporting >4Gbyte blocks while might
become necessary down the road.
Fixes: 5f1f79bbc9 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
* fix the deadlock on rfkill/wireless removal that a few
people reported
* fix an uninitialized variable
* update wiki URLs
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl7d8c0ACgkQB8qZga/f
l8SRzg//ZTtHTKOsfZ2IpsAmExkQ+1ZdsGHAGfkgDLQz4rvv1Lug7TvrFPiSyHSm
jwLlRQNsQ5+Cv2CRY3Xm7Qf8j9wBavYnfHhJkoTrnD3Z770KUS+BXBYb31+Odkxv
CzsR1GZYTWdYhCrzVIyE+GkQmW2pZ3L8U7ODioM7ETYaK0gAjmCb/HXLiX/m8cGa
O0uUlJZqE57Trfy5p+WO7cQOLJ9v6WXgSCcrDCNb9Ek25wg5J6RVOMEm7w6oV8oC
F8uZyVXPC0fSblzHC4cch0yX3z4YIuD12BVZBOVDLJKQBZwqohtxd0jT4MNHJB2y
BflU13M2kW5pw3l+cBPLZFOsURcDmOcBo9pNYCbi7Uxsd5Hvgft039jeXpukI3QW
e3d50KB0gSE/plOgXShPVSvm4eQ7WGS3Vyv2IfmU3dY6mxLv7kazSOErFD+fxUMy
vtdVN/Ie9XyRbh30n5MfTrE3PIf6k7XI3zirZrpMMNfu9fw4a3DQycqoZRBOoU1Y
l4ThlIduREp+wr14OnF2ueaho9hxVRxh+gnfuhWbzI8VKLHBCVOKe/MsTXzxg5OB
8xSA9Q1xo/bv+VymaQrY6ENG39sDZB+uI5fi0hnQ2Fu7BHPgp/Juzb56nQ/bWrfG
DOItqu5PoejvwMP+ju43i8oUDdqjlNgHhwDze+nCHiSnHUf+yWE=
=wP9I
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2020-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a small update:
* fix the deadlock on rfkill/wireless removal that a few
people reported
* fix an uninitialized variable
* update wiki URLs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- An iopen glock locking scheme rework that speeds up deletes of
inodes accessed from multiple nodes.
- Various bug fixes and debugging improvements.
- Convert gfs2-glocks.txt to ReST.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAl7eYjMUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTr/9g//cJ6jgiD/+qzh0VzougVksVZIduAl
RMB+kldOjBS2ORbyYM87Jm1tdyakgZuFO91XlSwChWRdC3Y2mqdaJIEE/kATqfY9
7Frlw++SyFKLvIf04kDYGk2hXX+umXXYFfrIiKb0tzDSGkPRaARUb3RM4TRvlSDP
/U0JlYA/4aXMUge+VpYsbpSGeqHNfEzmcmCyPXGmZYyh1MZ/RocMZFYEsP9NH82J
l07fxowUd10LJPEmBajzjD2NmEvjdvF4gBCOfJVNIfOzCj0CwXL3vmtu1SUNOKr+
em266EWZF89eOcvfdtE6xF0w81oCAK43wYRjIODSgI9JCLXGiOYmlWZVwZoqu5iy
2GQDhj/taq3ObuVqjR5n6GYuqMoJ+D0LSD13qccDALq/Bdy4lq9TMLSdDbkrVIM/
8BVn0nI5MzUlTV3mq6uxhU0HqtYDwUEiHWURWw6bYRug5OvQy3nbg/XZptYlnD87
XQccE4ErjlgSHLiYx1YckFz/GG6ytrRAKl9bGMkZ0u2+XmDsH+iJJgLcaXCPUP9h
/hhYagKI55UBDer7we4tppbu+gnJrg3PXlgImf53iMc7ia0KQHd+TfSIFkGPuydI
aEKKhIQzje23JayMbPRnwPbNlw9zU1loPi7hPS3rCpDY+w8oawpFyIieEOcxJyEt
pYkOt4BQi9LvpGc=
=PsCY
-----END PGP SIGNATURE-----
Merge tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- An iopen glock locking scheme rework that speeds up deletes of inodes
accessed from multiple nodes
- Various bug fixes and debugging improvements
- Convert gfs2-glocks.txt to ReST
* tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: fix use-after-free on transaction ail lists
gfs2: new slab for transactions
gfs2: initialize transaction tr_ailX_lists earlier
gfs2: Smarter iopen glock waiting
gfs2: Wake up when setting GLF_DEMOTE
gfs2: Check inode generation number in delete_work_func
gfs2: Move inode generation number check into gfs2_inode_lookup
gfs2: Minor gfs2_lookup_by_inum cleanup
gfs2: Try harder to delete inodes locally
gfs2: Give up the iopen glock on contention
gfs2: Turn gl_delete into a delayed work
gfs2: Keep track of deleted inode generations in LVBs
gfs2: Allow ASPACE glocks to also have an lvb
gfs2: instrumentation wrt log_flush stuck
gfs2: introduce new gfs2_glock_assert_withdraw
gfs2: print mapping->nrpages in glock dump for address space glocks
gfs2: Only do glock put in gfs2_create_inode for free inodes
gfs2: Allow lock_nolock mount to specify jid=X
gfs2: Don't ignore inode write errors during inode_go_sync
docs: filesystems: convert gfs2-glocks.txt to ReST
- Add support for multi-function devices in pci code.
- Enable PF-VF linking for architectures using the
pdev->no_vf_scan flag (currently just s390).
- Add reipl from NVMe support.
- Get rid of critical section cleanup in entry.S.
- Refactor PNSO CHSC (perform network subchannel operation) in cio
and qeth.
- QDIO interrupts and error handling fixes and improvements, more
refactoring changes.
- Align ioremap() with generic code.
- Accept requests without the prefetch bit set in vfio-ccw.
- Enable path handling via two new regions in vfio-ccw.
- Other small fixes and improvements all over the code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl7eVGcACgkQjYWKoQLX
FBhweQgAkicvx31x230rdfG+jQkQkl0UqF99vvWrJHEll77SqadfjzKAGIjUB+K0
EoeHVD5Wcj7BogDGcyHeQ0bZpu4WzE+y1nmnrsvu7TEEvcBmkJH0rF2jF+y0sb/O
3qvwFkX/CB5OqaMzKC/AEeRpcCKR+ZUXkWu1irbYth7CBXaycD9EAPc4cj8CfYGZ
r5njUdYOVk77TaO4aV+t5pCYc5TCRJaWXSsWaAv/nuLcIqsFBYOy2q+L47zITGXp
utZVanIDjzx+ikpaKicOIfC3hJsRuNX9MnlZKsQFwpVEZAUZmIUm29XdhGJTWSxU
RV7m1ORINbFP1nGAqWqkOvGo/LC0ZA==
=VhXR
-----END PGP SIGNATURE-----
Merge tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Add support for multi-function devices in pci code.
- Enable PF-VF linking for architectures using the pdev->no_vf_scan
flag (currently just s390).
- Add reipl from NVMe support.
- Get rid of critical section cleanup in entry.S.
- Refactor PNSO CHSC (perform network subchannel operation) in cio and
qeth.
- QDIO interrupts and error handling fixes and improvements, more
refactoring changes.
- Align ioremap() with generic code.
- Accept requests without the prefetch bit set in vfio-ccw.
- Enable path handling via two new regions in vfio-ccw.
- Other small fixes and improvements all over the code.
* tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits)
vfio-ccw: make vfio_ccw_regops variables declarations static
vfio-ccw: Add trace for CRW event
vfio-ccw: Wire up the CRW irq and CRW region
vfio-ccw: Introduce a new CRW region
vfio-ccw: Refactor IRQ handlers
vfio-ccw: Introduce a new schib region
vfio-ccw: Refactor the unregister of the async regions
vfio-ccw: Register a chp_event callback for vfio-ccw
vfio-ccw: Introduce new helper functions to free/destroy regions
vfio-ccw: document possible errors
vfio-ccw: Enable transparent CCW IPL from DASD
s390/pci: Log new handle in clp_disable_fh()
s390/cio, s390/qeth: cleanup PNSO CHSC
s390/qdio: remove q->first_to_kick
s390/qdio: fix up qdio_start_irq() kerneldoc
s390: remove critical section cleanup from entry.S
s390: add machine check SIGP
s390/pci: ioremap() align with generic code
s390/ap: introduce new ap function ap_get_qdev()
Documentation/s390: Update / remove developerWorks web links
...
Including:
- A big part of this is a change in how devices get connected to
IOMMUs in the core code. It contains the change from the old
add_device()/remove_device() to the new
probe_device()/release_device() call-backs. As a result
functionality that was previously in the IOMMU drivers has
been moved to the IOMMU core code, including IOMMU group
allocation for each device.
The reason for this change was to get more robust allocation
of default domains for the iommu groups.
A couple of fixes were necessary after this was merged into
the IOMMU tree, but there are no known bugs left. The last fix
is applied on-top of the merge commit for the topic branches.
- Removal of the driver private domain handling in the Intel
VT-d driver. This was fragile code and I am glad it is gone
now.
- More Intel VT-d updates from Lu Baolu:
- Nested Shared Virtual Addressing (SVA) support to the
Intel VT-d driver
- Replacement of the Intel SVM interfaces to the common
IOMMU SVA API
- SVA Page Request draining support
- ARM-SMMU Updates from Will:
- Avoid mapping reserved MMIO space on SMMUv3, so that
it can be claimed by the PMU driver
- Use xarray to manage ASIDs on SMMUv3
- Reword confusing shutdown message
- DT compatible string updates
- Allow implementations to override the default domain
type
- A new IOMMU driver for the Allwinner Sun50i platform
- Support for ATS gets disabled for untrusted devices (like
Thunderbolt devices). This includes a PCI patch, acked by
Bjorn.
- Some cleanups to the AMD IOMMU driver to make more use of
IOMMU core features.
- Unification of some printk formats in the Intel and AMD IOMMU
drivers and in the IOVA code.
- Updates for DT bindings
- A number of smaller fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl7eX5gACgkQK/BELZcB
GuOMMQ//Si8h3uC4QhTmeNM6OwYpTcImMuCtqOebVDOJYWfbjGb4U2ZvDSUu4r7u
KGj66pWBq9kciKaM5HcLnWNg4iNNG+iZHwYSOy2DAOdPorWh40aM/Obozdd4D4eK
sXt4uy1JEQem/Bm4eTwmvaJV5/riyK6xn1HVocPejstGSJCh4kal/bYuhj415qEa
LLrN0AcitoPaSRl4Pl7/wEtesk+Az0g94jY9qDhtxIQJXWlAwO25s+rIPy4S7QuW
WAFGU+Xp+J7WC3hQm6nHKQtURIqPHtqozT9Flws9YETuyeKwn47GRitMiAXZsy7R
t+kj1cHyglEhe2hdPnJBSFIjyrO3cCrV7CUVryJHigPCQOaQLjoEegThQCYU3VQu
FPRBX+bp4haHeo3BCBy2jQv4JZrPFkTVXeVEtpMRDOoJLb2OKaI34xbOvGy6dMM0
dFtpbAW2IjHuneJaQCbJIC+jaEYii8mr3Zwok4LS8u8Sy+7PPSKmt6Tti3enD8+C
pBB/0CxNJvQFhl13s6oI8NHTT9D6cPTbjxc2Gfc3UuKyyWsz+eR54gRhaBi0FypA
p6syMosNVjjOaHFd5K5gsbpUFCC3X/drIhqeXRLgQ51mqfkNZMuBBtiyLWTk7iJd
CK+1f2aqtBrpUdSNjTzE/XmR+AhjIn2oIcG/7jPCgYXQoSGM2Sg=
=a4z4
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"A big part of this is a change in how devices get connected to IOMMUs
in the core code. It contains the change from the old add_device() /
remove_device() to the new probe_device() / release_device()
call-backs.
As a result functionality that was previously in the IOMMU drivers has
been moved to the IOMMU core code, including IOMMU group allocation
for each device. The reason for this change was to get more robust
allocation of default domains for the iommu groups.
A couple of fixes were necessary after this was merged into the IOMMU
tree, but there are no known bugs left. The last fix is applied on-top
of the merge commit for the topic branches.
Other than that change, we have:
- Removal of the driver private domain handling in the Intel VT-d
driver. This was fragile code and I am glad it is gone now.
- More Intel VT-d updates from Lu Baolu:
- Nested Shared Virtual Addressing (SVA) support to the Intel VT-d
driver
- Replacement of the Intel SVM interfaces to the common IOMMU SVA
API
- SVA Page Request draining support
- ARM-SMMU Updates from Will:
- Avoid mapping reserved MMIO space on SMMUv3, so that it can be
claimed by the PMU driver
- Use xarray to manage ASIDs on SMMUv3
- Reword confusing shutdown message
- DT compatible string updates
- Allow implementations to override the default domain type
- A new IOMMU driver for the Allwinner Sun50i platform
- Support for ATS gets disabled for untrusted devices (like
Thunderbolt devices). This includes a PCI patch, acked by Bjorn.
- Some cleanups to the AMD IOMMU driver to make more use of IOMMU
core features.
- Unification of some printk formats in the Intel and AMD IOMMU
drivers and in the IOVA code.
- Updates for DT bindings
- A number of smaller fixes and cleanups.
* tag 'iommu-updates-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (109 commits)
iommu: Check for deferred attach in iommu_group_do_dma_attach()
iommu/amd: Remove redundant devid checks
iommu/amd: Store dev_data as device iommu private data
iommu/amd: Merge private header files
iommu/amd: Remove PD_DMA_OPS_MASK
iommu/amd: Consolidate domain allocation/freeing
iommu/amd: Free page-table in protection_domain_free()
iommu/amd: Allocate page-table in protection_domain_init()
iommu/amd: Let free_pagetable() not rely on domain->pt_root
iommu/amd: Unexport get_dev_data()
iommu/vt-d: Fix compile warning
iommu/vt-d: Remove real DMA lookup in find_domain
iommu/vt-d: Allocate domain info for real DMA sub-devices
iommu/vt-d: Only clear real DMA device's context entries
iommu: Remove iommu_sva_ops::mm_exit()
uacce: Remove mm_exit() op
iommu/sun50i: Constify sun50i_iommu_ops
iommu/hyper-v: Constify hyperv_ir_domain_ops
iommu/vt-d: Use pci_ats_supported()
iommu/arm-smmu-v3: Use pci_ats_supported()
...
* new gpu support: a405, a640, a650
* dpu: color processing support
* mdp5: support for msm8x36 (the thing with a405)
* some prep work for per-context pagetables (ie the part that
does not depend on in-flight iommu patches)
* last but not least, UABI update for submit ioctl to support
syncobj (from Bas)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJe3bDxAAoJEAx081l5xIa+tkoQAIGUxvEgYBQ+S6RvANZAT+Wq
2JZS2JPvExcB3Xe4erI+y7DeIuK2VghQUAcxMWhrDGgU7jKLV7jq09HTKkdE7++4
feLQMZziy3rAN3H6Pe1+72ZI9jAeK7JpvyRxI1nSu1O1JnaZS2rHmCOnBT8yA8sw
tHld1b5KUMmgTLR6CcJQYz0qp7p8x5LE8MdWY57Px5AqcnXFf1z/oiYNiCcxK2Jl
tEic1b9mvCwvlGWYdu00aavqo7WESj3oWYxtb8MsmVBWjAHtTqrlBY21DyQzgdEu
sgc8QAG+zHJ7Ls81INSVfDQ1zrspn/n+yL8efMhQibpMAQqGgt17nF+ZIx50nLMi
USg5qBJKgBL2iccooA9QEioFE3tB6Ld8SfcjLGIU7jegi0Fw/KpVPqmUVjKdqrXT
qjUKExa4e4pFxOlgbOYc1lIzSLwpGjGpLWbRWj8aee1GyrWRJA0Y9aRo75G6Sr4e
SX6807kX+h0IrF1rJzftVKa+KviD9SD4NyAyah6OJvg0FVJnhbO75PmnAkB6GVnQ
Jgg7fALjjkANRd8764H2B0pjke6wPDnUNXnh32ei2FWxVfQfIu/qhlJg9cU7TdMf
Z2kcHijoRGjAfvddD+oDs3DS58b9o7DHKgsZuLWvh87MpVbv9CynZSh5SgGqqNKR
nHajwsRXQc6e/fXT4YzN
=hIK6
-----END PGP SIGNATURE-----
Merge tag 'drm-next-msm-5.8-2020-06-08' of git://anongit.freedesktop.org/drm/drm
Pull drm msm updates from Dave Airlie:
"This tree has been in next for a couple of weeks, but Rob missed an
arm32 build issue, so I was awaiting the tree with a patch reverted.
- new gpu support: a405, a640, a650
- dpu: color processing support
- mdp5: support for msm8x36 (the thing with a405)
- some prep work for per-context pagetables (ie the part that does
not depend on in-flight iommu patches)
- last but not least, UABI update for submit ioctl to support syncobj
(from Bas)"
* tag 'drm-next-msm-5.8-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (30 commits)
Revert "drm/msm/dpu: add support for clk and bw scaling for display"
drm/msm/a6xx: skip HFI set freq if GMU is powered down
drm/msm: Update the MMU helper function APIs
drm/msm: Refactor address space initialization
drm/msm: Attach the IOMMU device during initialization
drm/msm/dpu: dpu_setup_dspp_pcc() can be static
drm/msm/a6xx: a6xx_hfi_send_start() can be static
drm/msm/a4xx: add a405_registers for a405 device
drm/msm/a4xx: add adreno a405 support
drm/msm/a6xx: update a6xx_hw_init for A640 and A650
drm/msm/a6xx: enable GMU log
drm/msm/a6xx: update pdc/rscc GMU registers for A640/A650
drm/msm/a6xx: A640/A650 GMU firmware path
drm/msm/a6xx: HFI v2 for A640 and A650
drm/msm/a6xx: add A640/A650 to gpulist
drm/msm/a6xx: use msm_gem for GMU memory objects
drm/msm: add internal MSM_BO_MAP_PRIV flag
drm/msm: add msm_gem_get_and_pin_iova_range
drm/msm: Check for powered down HW in the devfreq callbacks
drm/msm/dpu: update bandwidth threshold check
...
The wiki url is still the old "wireless.kernel.org"
instead of the new "wireless.wiki.kernel.org"
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
Link: https://lore.kernel.org/r/20200605154112.16277-9-f.suligoi@asem.it
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* new gpu support: a405, a640, a650
* dpu: color processing support
* mdp5: support for msm8x36 (the thing with a405)
* some prep work for per-context pagetables (ie the part that
does not depend on in-flight iommu patches)
* last but not least, UABI update for submit ioctl to support
syncobj (from Bas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGvLMubYPeKZ0rvOp45=+h4HZz-K9XNf0CXYcvPDVbnqLA@mail.gmail.com
Subsystem:
- new VL flag for backup switch over
Drivers:
- ingenic: only support device tree
- pcf2127: report battery switch over, handle nowayout
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEycoQi/giopmpPgB12wIijOdRNOUFAl7dWSkACgkQ2wIijOdR
NOXGbQ//cSTxUJbuYBNi/VCV7J3/khGlyoQQqDsru/tzuEwXHGBoG2LRNQMOauWd
2Osg61VQj4IY+WCqp4+ivn5H0K26y1PPKkt+UmrlRgkl0eeDFWmY4ejpziZ85D7Z
kDlzcUi3YWkd6m4YSJJrtdCcKljBMIEXb/PEKKK9y6dkrcG5990N8JchpmkCzrjx
fTPVIOfxu43msDc5b8egUDzPYnNbFw3ERAeasr6/EGTz+ksCspXtvWDk/mJzum0G
FiermTkO499Dr66Nf0AS3ex9SvEoqH+kd9KA1CKii5OlYEl7K9sI+eSmTQ1EutZO
L5WAvvQdW8UkARo6R4HAobhwK27pL+wpzUljbyXxt940/RTeqp82kl7rnH+0ihU7
tTbR2Vu+uwWrfQbPkCCj0TJmqIHgam5/Vhn1+ZR2f4U2JIlPvvHoLRVKO0oP7XKK
1ZDcP8zc9V2LQ2G2M1/ec6eOmoGW3EZDnKp4hcv9mnEiePSvVn04t5sa83NjNs4R
e+awVY1x5pFwoXu99gjlfQTV2kTyaA7Jywp6gIO7BKaw/Ci3+d3tlpowfsDH+UVI
WwKxNNqmuNXqoIep0zqUhqXHNIizKxGEk8wE4mr8HP2SlGJ+lUHAyrTTdpLeinN1
5qTEPT3BhjExSFfDZQyWV3+CzKMvxtfFA4/Ca/0iSoaqzMZpm1E=
=dsKr
-----END PGP SIGNATURE-----
Merge tag 'rtc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Not much this cycle apart from the ingenic rtc driver rework.
The fixes are mainly minor issues reported by coccinelle rather than
real world issues.
Subsystem:
- new VL flag for backup switch over
Drivers:
- ingenic: only support device tree
- pcf2127: report battery switch over, handle nowayout"
* tag 'rtc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (29 commits)
rtc: pcf2127: watchdog: handle nowayout feature
rtc: fsl-ftm-alarm: fix freeze(s2idle) failed to wake
rtc: abx80x: Provide debug feedback for invalid dt properties
rtc: abx80x: Add Device Tree matching table
rtc: rv3028: Add missed check for devm_regmap_init_i2c()
rtc: mpc5121: Use correct return value for mpc5121_rtc_probe()
rtc: goldfish: Use correct return value for goldfish_rtc_probe()
rtc: snvs: Add necessary clock operations for RTC APIs
rtc: snvs: Make SNVS clock always prepared
rtc: ingenic: Reset regulator register in probe
rtc: ingenic: Fix masking of error code
rtc: ingenic: Remove unused fields from private structure
rtc: ingenic: Set wakeup params in probe
rtc: ingenic: Enable clock in probe
rtc: ingenic: Use local 'dev' variable in probe
rtc: ingenic: Only support probing from devicetree
rtc: mc13xxx: fix a double-unlock issue
rtc: stmp3xxx: update contact email
rtc: max77686: Use single-byte writes on MAX77620
rtc: pcf2127: report battery switch over
...
Here is the large set of char/misc driver patches for 5.8-rc1
Included in here are:
- habanalabs driver updates, loads
- mhi bus driver updates
- extcon driver updates
- clk driver updates (approved by the clock maintainer)
- firmware driver updates
- fpga driver updates
- gnss driver updates
- coresight driver updates
- interconnect driver updates
- parport driver updates (it's still alive!)
- nvmem driver updates
- soundwire driver updates
- visorbus driver updates
- w1 driver updates
- various misc driver updates
In short, loads of different driver subsystem updates along with the
drivers as well.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXtzkHw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yldOwCgus/DgpnI1UL4z+NdBxJrAXtkPmgAn2sgTUea
i5RblCmcVMqvHaGtYkY+
=tScN
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the large set of char/misc driver patches for 5.8-rc1
Included in here are:
- habanalabs driver updates, loads
- mhi bus driver updates
- extcon driver updates
- clk driver updates (approved by the clock maintainer)
- firmware driver updates
- fpga driver updates
- gnss driver updates
- coresight driver updates
- interconnect driver updates
- parport driver updates (it's still alive!)
- nvmem driver updates
- soundwire driver updates
- visorbus driver updates
- w1 driver updates
- various misc driver updates
In short, loads of different driver subsystem updates along with the
drivers as well.
All have been in linux-next for a while with no reported issues"
* tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (233 commits)
habanalabs: correctly cast u64 to void*
habanalabs: initialize variable to default value
extcon: arizona: Fix runtime PM imbalance on error
extcon: max14577: Add proper dt-compatible strings
extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()'
extcon: remove redundant assignment to variable idx
w1: omap-hdq: print dev_err if irq flags are not cleared
w1: omap-hdq: fix interrupt handling which did show spurious timeouts
w1: omap-hdq: fix return value to be -1 if there is a timeout
w1: omap-hdq: cleanup to add missing newline for some dev_dbg
/dev/mem: Revoke mappings when a driver claims the region
misc: xilinx-sdfec: convert get_user_pages() --> pin_user_pages()
misc: xilinx-sdfec: cleanup return value in xsdfec_table_write()
misc: xilinx-sdfec: improve get_user_pages_fast() error handling
nvmem: qfprom: remove incorrect write support
habanalabs: handle MMU cache invalidation timeout
habanalabs: don't allow hard reset with open processes
habanalabs: GAUDI does not support soft-reset
habanalabs: add print for soft reset due to event
habanalabs: improve MMU cache invalidation code
...
* Fix performance problems found in dioread_nolock now that it is the
default, caused by transaction leaks.
* Clean up fiemap handling in ext4
* Clean up and refactor multiple block allocator (mballoc) code
* Fix a problem with mballoc with a smaller file systems running out
of blocks because they couldn't properly use blocks that had been
reserved by inode preallocation.
* Fixed a race in ext4_sync_parent() versus rename()
* Simplify the error handling in the extent manipulation code
* Make sure all metadata I/O errors are felected to ext4_ext_dirty()'s and
ext4_make_inode_dirty()'s callers.
* Avoid passing an error pointer to brelse in ext4_xattr_set()
* Fix race which could result to freeing an inode on the dirty last
in data=journal mode.
* Fix refcount handling if ext4_iget() fails
* Fix a crash in generic/019 caused by a corrupted extent node
-----BEGIN PGP SIGNATURE-----
iQEyBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAl7Ze8kACgkQ8vlZVpUN
gaNChAf4xn0ytFSrweI/S2Sp05G/2L/ocZ2TZZk2ZdGeN1E+ABdSIv/zIF9zuFgZ
/pY/C+fyEZWt4E3FlNO8gJzoEedkzMCMnUhSIfI+wZbcclyTOSNMJtnrnJKAEtVH
HOvGZJmg357jy407RCGhZpJ773nwU2xhBTr5OFxvSf9mt/vzebxIOnw5D7HPlC1V
Fgm6Du8q+tRrPsyjv1Yu4pUEVXMJ7qUcvt326AXVM3kCZO1Aa5GrURX0w3J4mzW1
tc1tKmtbLcVVYTo9CwHXhk/edbxrhAydSP2iACand3tK6IJuI6j9x+bBJnxXitnr
vsxsfTYMG18+2SxrJ9LwmagqmrRq
=HMTs
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"A lot of bug fixes and cleanups for ext4, including:
- Fix performance problems found in dioread_nolock now that it is the
default, caused by transaction leaks.
- Clean up fiemap handling in ext4
- Clean up and refactor multiple block allocator (mballoc) code
- Fix a problem with mballoc with a smaller file systems running out
of blocks because they couldn't properly use blocks that had been
reserved by inode preallocation.
- Fixed a race in ext4_sync_parent() versus rename()
- Simplify the error handling in the extent manipulation code
- Make sure all metadata I/O errors are felected to
ext4_ext_dirty()'s and ext4_make_inode_dirty()'s callers.
- Avoid passing an error pointer to brelse in ext4_xattr_set()
- Fix race which could result to freeing an inode on the dirty last
in data=journal mode.
- Fix refcount handling if ext4_iget() fails
- Fix a crash in generic/019 caused by a corrupted extent node"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
ext4: avoid unnecessary transaction starts during writeback
ext4: don't block for O_DIRECT if IOCB_NOWAIT is set
ext4: remove the access_ok() check in ext4_ioctl_get_es_cache
fs: remove the access_ok() check in ioctl_fiemap
fs: handle FIEMAP_FLAG_SYNC in fiemap_prep
fs: move fiemap range validation into the file systems instances
iomap: fix the iomap_fiemap prototype
fs: move the fiemap definitions out of fs.h
fs: mark __generic_block_fiemap static
ext4: remove the call to fiemap_check_flags in ext4_fiemap
ext4: split _ext4_fiemap
ext4: fix fiemap size checks for bitmap files
ext4: fix EXT4_MAX_LOGICAL_BLOCK macro
add comment for ext4_dir_entry_2 file_type member
jbd2: avoid leaking transaction credits when unreserving handle
ext4: drop ext4_journal_free_reserved()
ext4: mballoc: use lock for checking free blocks while retrying
ext4: mballoc: refactor ext4_mb_good_group()
ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling
ext4: mballoc: refactor ext4_mb_discard_preallocations()
...
A few large, long discussed works this time. The RNBD block driver has
been posted for nearly two years now, and the removal of FMR has been a
recurring discussion theme for a long time. The usual smattering of
features and bug fixes.
- Various small driver bugs fixes in rxe, mlx5, hfi1, and efa
- Continuing driver cleanups in bnxt_re, hns
- Big cleanup of mlx5 QP creation flows
- More consistent use of src port and flow label when LAG is used and a
mlx5 implementation
- Additional set of cleanups for IB CM
- 'RNBD' network block driver and target. This is a network block RDMA
device specific to ionos's cloud environment. It brings strong multipath
and resiliency capabilities.
- Accelerated IPoIB for HFI1
- QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds
- Support for exchanging the new IBTA defiend ECE data during RDMA CM
exchanges
- Removal of the very old and insecure FMR interface from all ULPs and
drivers. FRWR should be preferred for at least a decade now.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl7X/IwACgkQOG33FX4g
mxp2uw/+MI2S/aXqEBvZfTT8yrkAwqYezS0VeTDnwH/T6UlTMDhHVN/2Ji3tbbX3
FEKT1i2mnAL5RqUAL1lr9g4sG/bVozrpN46Ws5Lu9dTbIPLKTNPWDuLFQDUShKY7
OyMI/bRx6anGnsOy20iiBqnrQbrrZj5TECgnmrkAl62QFdcl7aBWe/yYjy4CT11N
ub+aBXBREN1F1pc0HIjd2tI+8gnZc+mNm1LVVDRH9Capun/pI26qDNh7e6QwGyIo
n8ItraC8znLwv/nsUoTE7/JRcsTEe6vJI26PQmczZfNJs/4O65G7fZg0eSBseZYi
qKf7Uwtb3qW0R7jRUMEgFY4DKXVAA0G2ph40HXBuzOSsqlT6HqYMO2wgG8pJkrTc
qAjoSJGzfAHIsjxzxKI8wKuufCddjCm30VWWU7EKeriI6h1J0uPVqKkQMfYBTkik
696eZSBycAVgwayOng3XaehiTxOL7qGMTjUpDjUR6UscbiPG919vP+QsbIUuBXdb
YoddBQJdyGJiaCXv32ciJjo9bjPRRi/bII7Q5qzCNI2mi4ZVbudF4ffzyQvdHtNJ
nGnpRXoPi7kMvUrKTMPWkFjj0R5/UsPszsA51zbxPydfgBe0Dlc2PrrIG8dlzYAp
wbV0Lec+iJucKlt7EZtrjz1xOiOOaQt/5/cW1bWqL+wk2t6gAuY=
=9zTe
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A more active cycle than most of the recent past, with a few large,
long discussed works this time.
The RNBD block driver has been posted for nearly two years now, and
flowing through RDMA due to it also introducing a new ULP.
The removal of FMR has been a recurring discussion theme for a long
time.
And the usual smattering of features and bug fixes.
Summary:
- Various small driver bugs fixes in rxe, mlx5, hfi1, and efa
- Continuing driver cleanups in bnxt_re, hns
- Big cleanup of mlx5 QP creation flows
- More consistent use of src port and flow label when LAG is used and
a mlx5 implementation
- Additional set of cleanups for IB CM
- 'RNBD' network block driver and target. This is a network block
RDMA device specific to ionos's cloud environment. It brings strong
multipath and resiliency capabilities.
- Accelerated IPoIB for HFI1
- QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
async fds
- Support for exchanging the new IBTA defiend ECE data during RDMA CM
exchanges
- Removal of the very old and insecure FMR interface from all ULPs
and drivers. FRWR should be preferred for at least a decade now"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
RDMA/mlx5: Return ECE DC support
RDMA/mlx5: Don't rely on FW to set zeros in ECE response
RDMA/mlx5: Return an error if copy_to_user fails
IB/hfi1: Use free_netdev() in hfi1_netdev_free()
RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
RDMA/core: Move and rename trace_cm_id_create()
IB/hfi1: Fix hfi1_netdev_rx_init() error handling
RDMA: Remove 'max_map_per_fmr'
RDMA: Remove 'max_fmr'
RDMA/core: Remove FMR device ops
RDMA/rdmavt: Remove FMR memory registration
RDMA/mthca: Remove FMR support for memory registration
RDMA/mlx4: Remove FMR support for memory registration
RDMA/i40iw: Remove FMR leftovers
RDMA/bnxt_re: Remove FMR leftovers
RDMA/mlx5: Remove FMR leftovers
RDMA/core: Remove FMR pool API
RDMA/rds: Remove FMR support for memory registration
RDMA/srp: Remove support for FMR memory registration
...
When deleting an inode, keep track of the generation of the deleted inode in
the inode glock Lock Value Block (LVB). When trying to delete an inode
remotely, check the last-known inode generation against the deleted inode
generation to skip duplicate remote deletes. This avoids taking the resource
group glock in order to verify the block type.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
These are updates to SoC specific drivers that did not have
another subsystem maintainer tree to go through for some
reason:
- Some bus and memory drivers for the MIPS P5600 based
Baikal-T1 SoC that is getting added through the MIPS tree.
- There are new soc_device identification drivers for TI K3,
Qualcomm MSM8939
- New reset controller drivers for NXP i.MX8MP, Renesas
RZ/G1H, and Hisilicon hi6220
- The SCMI firmware interface can now work across ARM SMC/HVC
as a transport.
- Mediatek platforms now use a new driver for their "MMSYS"
hardware block that controls clocks and some other aspects
in behalf of the media and gpu drivers.
- Some Tegra processors have improved power management
support, including getting woken up by the PMIC and cluster
power down during idle.
- A new v4l staging driver for Tegra is added.
- Cleanups and minor bugfixes for TI, NXP, Hisilicon,
Mediatek, and Tegra.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl7XvgAACgkQmmx57+YA
GNmj/hAAnAJ/hYehLfgCe711HUntgeRkaoTVpCt8BJNMdxsa23sn3V6k5+WYn1uG
PtlgpefZEMHLUEEVDegR4nZXLG0Pzu1SR12KW34YPcQKkNo/+vlQ9zYUajnJ/KX6
10zdLSIzHfk1VtXKvvQQ8xFyE+S/trGmjC57E6gfoCUT3rl1maD+ccVXUBaz9oob
wuMxGXQAl57mio5yT1OfSk6Fev39xRE2dN1hzP7KUYhsemZajBwBBW5wVJZCsCB8
LCGmxVkavM7BV4r2NokbBDs5rlfedBl/P/IPd9Is5a5tuGUkSsVRG9zqShxYLGM3
S06az6POQFwXKFJoUKW0dK/Koy0D7BK+vhUBPzFv4HZ8iDCVf6Jju2MJ02GMqHPj
OOrXaCbLYrvN/edVUWeeFywqwMbYTRwC4DxyTq5m7HxEB004xTOhs0rX5aR0u4n1
bbsR97LguolwH9iEMzd3F3jCiKBcMecH3lAh5WcrtwlFIRrNhbWoGDoA/4TuORFS
b11rgsTRIJ5Vc++D1HnSnx0ZZvUzyluMvygdALnSgVah6xYe6KVw9Kg/wioAJ04G
uSTidqP3qRhsyET2HQo7CxdVfZbKfP25iKCKrdhQziztKvhF8qrUmZKloXOodRw+
ewYSRmv8c324OYYit1X43oAdW8dntq1XbSIauaqxEb4JC7x/xRY=
=44Jc
-----END PGP SIGNATURE-----
Merge tag 'arm-drivers-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM/SoC driver updates from Arnd Bergmann:
"These are updates to SoC specific drivers that did not have another
subsystem maintainer tree to go through for some reason:
- Some bus and memory drivers for the MIPS P5600 based Baikal-T1 SoC
that is getting added through the MIPS tree.
- There are new soc_device identification drivers for TI K3, Qualcomm
MSM8939
- New reset controller drivers for NXP i.MX8MP, Renesas RZ/G1H, and
Hisilicon hi6220
- The SCMI firmware interface can now work across ARM SMC/HVC as a
transport.
- Mediatek platforms now use a new driver for their "MMSYS" hardware
block that controls clocks and some other aspects in behalf of the
media and gpu drivers.
- Some Tegra processors have improved power management support,
including getting woken up by the PMIC and cluster power down
during idle.
- A new v4l staging driver for Tegra is added.
- Cleanups and minor bugfixes for TI, NXP, Hisilicon, Mediatek, and
Tegra"
* tag 'arm-drivers-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (155 commits)
clk: sprd: fix compile-testing
bus: bt1-axi: Build the driver into the kernel
bus: bt1-apb: Build the driver into the kernel
bus: bt1-axi: Use sysfs_streq instead of strncmp
bus: bt1-axi: Optimize the return points in the driver
bus: bt1-apb: Use sysfs_streq instead of strncmp
bus: bt1-apb: Use PTR_ERR_OR_ZERO to return from request-regs method
bus: bt1-apb: Fix show/store callback identations
bus: bt1-apb: Include linux/io.h
dt-bindings: memory: Add Baikal-T1 L2-cache Control Block binding
memory: Add Baikal-T1 L2-cache Control Block driver
bus: Add Baikal-T1 APB-bus driver
bus: Add Baikal-T1 AXI-bus driver
dt-bindings: bus: Add Baikal-T1 APB-bus binding
dt-bindings: bus: Add Baikal-T1 AXI-bus binding
staging: tegra-video: fix V4L2 dependency
tee: fix crypto select
drivers: soc: ti: knav_qmss_queue: Make knav_gp_range_ops static
soc: ti: add k3 platforms chipid module driver
dt-bindings: soc: ti: add binding for k3 platforms chipid module
...