Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:
Before NMI IRET test
Sending NMI to self
NMI isr running stack 0x461000
Sending nested NMI to self
After nested NMI to self
Nested NMI isr running rip=40038e
After iret
After NMI to self
FAIL: NMI
Commit 4c4a6f790e (KVM: nVMX: track NMI blocking state separately
for each VMCS) tracks NMI blocking state separately for vmcs01 and
vmcs02. However it is not enough:
- The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
on IRET, so the L2 can generate #PF which can be intercepted by L0.
- L0 walks L1's guest page table and sees the mapping is invalid, it
resumes the L1 guest and injects the #PF into L1. At this point the
vmcs02 has nmi_known_unmasked=true.
- L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field
of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
- L1 executes VMRESUME to resume L2, causing a vmexit to L0
- during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
interruptibility-state field of vmcs02, but nmi_known_unmasked is
still true.
- L2 immediately exits to L0 with another page fault, because L0 still has
not updated the NGVA->HPA page tables. However, nmi_known_unmasked is
true so vmx_recover_nmi_blocking does not do anything.
The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PI vector for L0 and L1 must be different. If dest vcpu0
is in guest mode while vcpu1 is delivering a non-nested PI to
vcpu0, there wont't be any vmexit so that the non-nested interrupt
will be delayed.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are using the same vector for nested/non-nested posted
interrupts delivery, this may cause interrupts latency in
L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.
This patch introduces a new vector which is only for nested
posted interrupts to solve the problems above.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts the change of commit f85c758dbe,
as the behavior it modified was intended.
The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual
says:
Table 4-7. Use of CR3 with PAE Paging
Bit Position(s) Contents
4:0 Ignored
31:5 Physical address of the 32-Byte aligned
page-directory-pointer table used for linear-address
translation
63:32 Ignored (these bits exist only on processors supporting
the Intel-64 architecture)
To placate the static checker, write the mask explicitly as an
unsigned long constant instead of using a 32-bit unsigned constant.
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: f85c758dbe
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simplify and improve the code so that the PID is always available in
the uevent even when debugfs is not available.
This adds a userspace_pid field to struct kvm, as per Radim's
suggestion, so that the PID can be retrieved on destruction too.
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Fixes: 286de8f6ac ("KVM: trigger uevents when creating or destroying a VM")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just like in the allocator we must avoid touching multiple AGs out of
order when freeing blocks, as freeing still locks the AGF and can cause
the same AB-BA deadlocks as in the allocation path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
write_sysreg() may misparse the value argument because it is used
without parentheses to protect it.
This patch adds the ( ) in order to avoid any surprises.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
[will: same change to write_sysreg_s]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The check for column exclusion did not verify that the event being
checked was an L2 event, and not a software event.
Software events should not be checked for column exclusion.
This resulted in a group with both software and L2 events sometimes
incorrectly rejecting the L2 event for column exclusion and
not counting it.
Add a check for PMU type before applying column exclusion logic.
Fixes: 21bdbb7102 ("perf: add qcom l2 cache perf events driver")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In commit efe0160cfd ("powerpc/64: Linker on-demand sfpr functions
for modules"), we added an ld version check early in the powerpc
top-level Makefile.
Because the Makefile runs before the kernel config is setup, the
checks for CONFIG_CPU_LITTLE_ENDIAN etc. all take the default case. So
we end up configuring ld for 32-bit big endian.
That would be OK, except that for historical (or perhaps no) reason,
we use 'override LD' to add the endian flags to the LD variable
itself, rather than the normal approach of adding them to LDFLAGS.
The end result is that when we check the ld version we run it as:
$(CROSS_COMPILE)ld -EB -m elf32ppc --version
This often works, unless you are using a 64-bit only and/or little
endian only, toolchain. In which case you see something like:
$ make defconfig
powerpc64le-linux-ld: unrecognised emulation mode: elf32ppc
Supported emulations: elf64lppc elf32lppc elf32lppclinux elf32lppcsim
/bin/sh: 1: [: -ge: unexpected operator
The proper fix is to stop using 'override LD', but that will require a
fair bit of testing. Instead we can fix it for now just by reordering
the Makefile to do the version check earlier.
Fixes: efe0160cfd ("powerpc/64: Linker on-demand sfpr functions for modules")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
As for commit 68baf692c4 ("powerpc/pseries: Fix of_node_put()
underflow during DLPAR remove"), the call to of_node_put() must be
removed from pSeries_reconfig_remove_node().
dlpar_detach_node() and pSeries_reconfig_remove_node() both call
of_detach_node(), and thus the node should not be released in both
cases.
Fixes: 0829f6d1f6 ("of: device_node kobject lifecycle fixes")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There's a somewhat architectural issue with Radix MMU and KVM.
When coming out of a guest with AIL (Alternate Interrupt Location, ie,
MMU enabled), we start executing hypervisor code with the PID register
still containing whatever the guest has been using.
The problem is that the CPU can (and will) then start prefetching or
speculatively load from whatever host context has that same PID (if
any), thus bringing translations for that context into the TLB, which
Linux doesn't know about.
This can cause stale translations and subsequent crashes.
Fixing this in a way that is neither racy nor a huge performance
impact is difficult. We could just make the host invalidations always
use broadcast forms but that would hurt single threaded programs for
example.
We chose to fix it instead by partitioning the PID space between guest
and host. This is possible because today Linux only use 19 out of the
20 bits of PID space, so existing guests will work if we make the host
use the top half of the 20 bits space.
We additionally add support for a property to indicate to Linux the
size of the PID register which will be useful if we eventually have
processors with a larger PID space available.
There is still an issue with malicious guests purposefully setting the
PID register to a value in the hosts PID range. Hopefully future HW
can prevent that, but in the meantime, we handle it with a pair of
kludges:
- On the way out of a guest, before we clear the current VCPU in the
PACA, we check the PID and if it's outside of the permitted range
we flush the TLB for that PID.
- When context switching, if the mm is "new" on that CPU (the
corresponding bit was set for the first time in the mm cpumask), we
check if any sibling thread is in KVM (has a non-NULL VCPU pointer
in the PACA). If that is the case, we also flush the PID for that
CPU (core).
This second part is needed to handle the case where a process is
migrated (or starts a new pthread) on a sibling thread of the CPU
coming out of KVM, as there's a window where stale translations can
exist before we detect it and flush them out.
A future optimization could be added by keeping track of whether the
PID has ever been used and avoid doing that for completely fresh PIDs.
We could similarily mark PIDs that have been the subject of a global
invalidation as "fresh". But for now this will do.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop
unneeded include of kvm_book3s_asm.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Three small fixes. The transfer size fixes are actually correcting
some performance drops on the hpsa and smartpqi cards. The cards
actually have an internal cache for request speed up but bypass it for
transfers > 1MB. Since 4.3 the efficiency of our merges has rendered
the cache mostly unused, so limit transfers to under 1MB to recover
the cache boost.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZd9JkAAoJEAVr7HOZEZN4vp0P/3jzI6VM/yZM/dVRRc0BizCQ
fsZRVTsin0FPnsztaA/S4aGIpSLFZE2TfF2AXEuwMw2hqGfvdF+7lrPkQb50j7a+
YP3M17hBN1J015BKm9lCN/fhVVCONxhZtnqxoXxiiRaakKkTuucC32NyAB85ans0
2OTh+fpodSQmPtX2bDU3bFx0AdSIX5aZgW2XhLBtosH9dRT/HrJ4svkIr3FVMBzV
CZD0YXqsush9xwZJJy2nkcW44yhL1xtcdN30YkWuSJOnITt6RgWGMhiTrZCa2sNR
LZSRkb47WohaNY9TR67NhsGv3z8FY3FhXO507SCs1HBZL/DIXVg+DDkF5pirAdR9
THLXvaNAFuYORgYum4UQJOqUM9a0lh/CWRYioQtQBlXNnvvvUE2fRRsgGGHU+G+D
9mMjUPGQhlLhOM/1h5OxGqnArcugb18LflBfPhfJ22XnK0KDj9c3CW68quNHLiBg
SGLZxUjlnHe1Ebe6T4UR1/fv/mp2LsIhXvQvFnfAqCp8ehj7wjl02R8CG0c4ldGu
4u4o4zLxHyz27sXpST0je89ceGaHVyCMmFyQ4xxUB61TPSaC52FGjxz2m8RQAaSm
fowc6niH8RtJ+v7EwDje/9BFOIVfn/Y/wKcSjKdV7kvh9zZQ7FB06sW7fhhUhqqh
IYLXDAo2AYis/FnL932c
=7jN6
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small fixes.
The transfer size fixes are actually correcting some performance drops
on the hpsa and smartpqi cards. The cards actually have an internal
cache for request speed up but bypass it for transfers > 1MB. Since
4.3 the efficiency of our merges has rendered the cache mostly unused,
so limit transfers to under 1MB to recover the cache boost"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sg: fix static checker warning in sg_is_valid_dxfer
scsi: smartpqi: limit transfer length to 1MB
scsi: hpsa: limit transfer length to 1MB
- add a missing "!" in the uuid tests
- remove the last remaining user of the uuid_be type, and then
the type and its helpers
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAll3mDcLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPIvw//XeCc0g2xMHJAX7Z6T1KoEubDVxpFaVKMIqZgE8ia
NRy5RBq3cTKxhVRRj1KDSP7Zf1eNppYycY/fTZ0tRx7ssjFDjtxBMHyMv/wBvR/Z
Hg0YAyHtlk/S9hzZhB9xj9jarvXYXTvCLOgIsDlaPcgqdlDeSC0thhscJOqvliIo
+rdVp/fyQcUbtKXyCMtiaf0AJfncNa31VdD/VmFQEM9dltohyaWOzx+ZOcI2OhnD
YYjt2fMBFOH87q8A+OZMzA1j/LEhMyDxIiPB8N9+qYkuKhyfdZi9lhKwN3YZL0y0
IZ+AgKWEzAz0t08BTn5AURCytm84i5UtidE9s5WCnOIqtMT5D1hKcrmkgZKywQ2R
GFpXnw8J+LI4ZPhrC5dMmdVESvGSXeWZoztoPZBSRPrrYA4co2MemiwMP6SzBocu
S04Hgh5rMXJN/iJxasuNIIyJfA4eOyZVhszlKlkFT8YyGmaV3o9znvSkFd33HxR8
IpneM1ymMJHZvqKX9OmFPZWWpwyu4eToT+NgPbONzeKRNf3qTMRztCHaERNnFk8u
Zdhh2mVKAwWcAglJzJ8q72qywec8VIsC+b14BVpWmjtBva5XhC4TBQw3fz+BMpMb
Bjpj4d9KaynTV1d3ululkkYjSRLUO9/F0pOUJUFEuGJezmF06qkyJQAW/iHyhqze
ANE=
=FeA7
-----END PGP SIGNATURE-----
Merge tag 'uuid-for-4.13-2' of git://git.infradead.org/users/hch/uuid
Pull uuid fixes from Christoph Hellwig:
- add a missing "!" in the uuid tests
- remove the last remaining user of the uuid_be type, and then the type
and its helpers
* tag 'uuid-for-4.13-2' of git://git.infradead.org/users/hch/uuid:
uuid: remove uuid_be
thunderbolt: use uuid_t instead of uuid_be
uuid: fix incorrect uuid_equal conversion in test_uuid_test
- split the global dma coherent pool from the per-device pool.
This fixes a regression in the earlier 4.13 pull requests where the
global pool would override a per-device CMA pool. (Vladimir Murzin).
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAll3l1sLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYN8BhAAqFxy2CrpEBk7gD2byOi9M4kTeXDYCESEoEAwuvTG
Fesbw5zumliBR2cjt/qk/uIDZ93fP4BuHn89NtIfcGOD1LqYOyIPwUTpmb9AgicD
y4eO1Gy/3DrG2haZcWYmDvq8yfSuR01H3ecY1KNsX1Y2kXxeBQfVKaUDR6fuix4+
uCf98LzIWs3TYmj7h48LVB/oNnigvs0oljrB2dWrWVJHbgGYEpmdPjBEe6r95e5U
5cHtPno5JA1lbBFt/nvsZl/NmzSd745SL3QwJsaVmSTf7oYnAuwyPI+5gqaoeQT6
24947e8hJjuLhBpO7RiqnJY9QdPxT0XKclkCcjnRb5j3dB9KL09f9Dz60exyJzSe
18V8+8+1m1BgvPsAOS/pLKYxKr9Kgzl9LFrFQaBkA5+7SPlywfV7HqaCkN/mKB4F
XJoQyRDLlZiDStDKbrhGEAHG6oYaZXnkpQ5xDitSXcSkh9/2a/elsG3caUBRI5qP
vKC0qvfBPjnHa/3lYNNoLgADB4tZCE3rRrVP6tqdHQbjuNUNK1wLNT7PiMfeoUVj
Oqql4le0AKlsxO4vRjavOrtaW1bVT+eAYLEtdQfXWQDvhffriEW6r6I8PGqIOiCO
OzxemCG2M6fcD9ho/VDpjo3Ei6tZylrxdTbrsm7ogQmo/U3ID9cfs452vIOYtCcB
9so=
=fJWP
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping fixes from Christoph Hellwig:
"split the global dma coherent pool from the per-device pool.
This fixes a regression in the earlier 4.13 pull requests where the
global pool would override a per-device CMA pool (Vladimir Murzin)"
* tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping:
ARM: NOMMU: Wire-up default DMA interface
dma-coherent: introduce interface for default DMA pool
It's always bothered me that we only disable preemption in
copy_user_page around the call to flush_dcache_page_asm.
This patch extends this to after the copy.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Helge noticed that we flush the TLB page in flush_cache_page but not in
flush_cache_range or flush_cache_mm.
For a long time, we have had random segmentation faults building
packages on machines with PA8800/8900 processors. These machines only
support equivalent aliases. We don't see these faults on machines that
don't require strict coherency. So, it appears TLB speculation
sometimes leads to cache corruption on machines that require coherency.
This patch adds TLB flushes to flush_cache_range and flush_cache_mm when
coherency is required. We only flush the TLB in flush_cache_page when
coherency is required.
The patch also optimizes flush_cache_range. It turns out we always have
the right context to use flush_user_dcache_range_asm and
flush_user_icache_range_asm.
The patch has been tested for some time on rp3440, rp3410 and A500-44.
It's been boot tested on c8000. No random segmentation faults were
observed during testing.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Mic mute led does not work on HP ProBook 440 G4.
We can use CXT_FIXUP_MUTE_LED_GPIO fixup to support it.
BugLink: https://bugs.launchpad.net/bugs/1705586
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Copy the approach taken by gfx8, which simplifies the code, and set the
instance index properly. The latter is required for debugging, e.g. for
reading wave status by UMR.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If our device loses its connection for longer than the dead timeout we
will set NBD_DISCONNECTED in order to quickly fail any pending IO's that
flood in after the IO's that were waiting during the dead timer.
However if we re-connect at some point in the future we'll still see
this DISCONNECTED flag set if we then lose our connection again after
that, which means we won't get notifications for our newly lost
connections. Fix this by just clearing the DISCONNECTED flag on
reconnect in order to make sure everything works as it's supposed to.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Some machines can't power off the machine, so disable the lockup detectors to
avoid this watchdog BUG to show up every few seconds:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-shutdow:1]
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.9+
The Page Deallocation Table (PDT) holds the physical addresses of all broken
memory addresses. With the physical address we now are able to show which DIMM
slot (e.g. 1a, 3c) actually holds the broken memory module so that users are
able to replace it.
Signed-off-by: Helge Deller <deller@gmx.de>
The value REQ_OP_FLUSH is only used by the block code for
request-based devices.
Remove the tests for REQ_OP_FLUSH from the bio-based dm-zoned-target.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Bumo dm-raid target version to 1.12.1 to reflect that commit cc27b0c78c
("md: fix deadlock between mddev_suspend() and md_write_start()") is
available.
This version change allows userspace to detect that MD fix is available.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Use runtime flag to ensure that an mddev gets suspended/resumed just once.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
During growing reshapes (i.e. stripes being added to a raid set), the
new stripe images are not in-sync and not part of the raid set until
the reshape is started.
LVM2 has to request multiple table reloads involving superblock updates
in order to reflect proper size of SubLVs in the cluster. Before a stripe
adding reshape starts, validate_raid_redundancy() fails as a result of that
because it checks the total number of devices against the number of rebuild
ones rather than the actual ones in the raid set (as retrieved from the
superblock) thus resulting in failed raid4/5/6/10 redundancy checks.
E.g. convert 3 stripes -> 7 stripes raid5 (which only allows for maximum
1 device to fail) requesting +4 delta disks causing 4 devices to rebuild
during reshaping thus failing activation.
To fix this, move validate_raid_redundancy() to get access to the
current raid_set members.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add a firmware wrapper function, which asks PDC firmware for the DIMM slot of a
physical address. This is needed to show users which DIMM module needs
replacement in case a broken DIMM was encountered.
Signed-off-by: Helge Deller <deller@gmx.de>
Commit c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
introduced the pdc_pat_mem_read_pd_pdt() firmware helper function, which
crashed the system because it trashed the stack if the
pdc_pat_mem_read_pd_retinfo struct was located on the stack (and which is
in size less than the required 32 64-bit values).
Fix it by using the pdc_result struct instead when calling firmware and copy
the return values back into the result struct when finished sucessfully.
While debugging this code I noticed that the pdc_type wasn't set correctly
either, so let's fix that too.
Fixes: c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
Signed-off-by: Helge Deller <deller@gmx.de>
It's possible the preferred HMB size may not be a multiple of the
chunk_size. This patch moves len to function scope and uses that in
the for loop increment so the last iteration doesn't cause the total
size to exceed the allocated HMB size.
Based on an earlier patch from Keith Busch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Fixes: 87ad72a59a ("nvme-pci: implement host memory buffer support")
The FC-NVME spec hasn't locked down on the format string for TRADDR.
Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>"
where the wwn's are hex values but not prefixed by 0x.
Most implementations so far expect a string format of
"nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
uses the match_u64 parser which requires a leading 0x prefix to set
the base properly. If it's not there, a match will either fail or return
a base 10 value.
The resolution in T11 is pushing out. Therefore, to fix things now and
to cover any eventuality and any implementations already in the field,
this patch adds support for both formats.
The change consists of replacing the token matching routine with a
routine that validates the fixed string format, and then builds
a local copy of the hex name with a 0x prefix before calling
the system parser.
Note: the same parser routine exists in both the initiator and target
transports. Given this is about the only "shared" item, we chose to
replicate rather than create an interdendency on some shared code.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There are cases where threads are in the process of submitting new
io when the LLDD calls in to remove the remote port. In some cases,
the next io actually goes to the LLDD, who knows the remoteport isn't
present and rejects it. To properly recovery/restart these i/o's we
don't want to hard fail them, we want to treat them as temporary
resource errors in which a delayed retry will work.
Add a couple more checks on remoteport connectivity and commonize the
busy response handling when it's seen.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fabrics commands with opcode 0x7F use the fctype field to indicate data
direction.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Sagi Grimberg <sai@grmberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
The WWID sysfs attribute can provide multiple means of a World Wide ID
for a NVMe device. It can either be a NGUID, a EUI-64 or a concatenation
of VID, Serial Number, Model and the Namespace ID in this order of
preference.
If the target also sends us a UUID use the UUID for identification and
give it the highest priority.
This eases generation of /dev/disk/by-* symlinks.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull HID fixes from Jiri Kosina:
- regression fix (missing IRQs) for devices that require 'always poll'
quirk, from Dmitry Torokhov
- new device ID addition to Ortek driver, from Benjamin Tissoires
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: ortek: add one more buggy device
HID: usbhid: fix "always poll" quirk
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: set change and reference bit on lazy key enablement
s390: chp: handle CRW_ERC_INIT for channel-path status change
s390/perf: fix problem state detection
When we're checking the entries in a directory buffer, make sure that
the entry length doesn't push us off the end of the buffer. Found via
xfs/388 writing ones to the length fields.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
This patch partially reverts 3df0e50 ("xen/blkfront: pseudo support for
multi hardware queues/rings"). The xen-blkfront queue/ring might hang due
to grants allocation failure in the situation when gnttab_free_head is
almost empty while many persistent grants are reserved for this queue/ring.
As persistent grants management was per-queue since 73716df ("xen/blkfront:
make persistent grants pool per-queue"), we should always allocate from
persistent grants first.
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When ring buf full, hw queue will be stopped. While blkif interrupt consume
request and make free space in ring buf, hw queue will be started again.
But since start queue is protected by spin lock while stop not, that will
cause a race.
interrupt: process:
blkif_interrupt() blkif_queue_rq()
kick_pending_request_queues_locked()
blk_mq_start_stopped_hw_queues()
clear_bit(BLK_MQ_S_STOPPED, &hctx->state)
blk_mq_stop_hw_queue(hctx)
blk_mq_run_hw_queue(hctx, async)
If ring buf is made empty in this case, interrupt will never come, then the
hw queue will be stopped forever, all processes waiting for the pending io
in the queue will hung.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We should be returning normal negative error codes here. The "a"
variables comes from &c->async_write_error which is a blk_status_t
converted to a regular error code.
In the current code, the blk_status_t gets propogated back to
pool_create() and eventually results in an Oops.
Fixes: 4e4cbee93d ("block: switch bios to blk_status_t")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If the dm-integrity superblock was corrupted in such a way that the
journal_sections field was zero, the integrity target would deadlock
because it would wait forever for free space in the journal.
Detect this situation and refuse to activate the device.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 7eada909bf ("dm: add integrity target")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If this WARN_ON triggers it speaks to programmer error, and likely
implies corruption, but no released kernel should trigger it. This
WARN_ON serves to assist DM integrity developers as changes are
made/tested in the future.
BUG_ON is excessive for catching programmer error, if a user or
developer would like warnings to trigger a panic, they can enable that
via /proc/sys/kernel/panic_on_warn
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Unregister the driver before removing multi-instance hotplug
callbacks. This order avoids the warning issued from
__cpuhp_remove_state_cpuslocked when the number of remaining
instances isn't yet zero.
Fixes: 8017c27919 ("net/virtio-net: Convert to hotplug state machine")
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch saves the deflated pages to a list, instead of the PFN array.
Accordingly, the balloon_pfn_to_page() function is removed.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>