The root cause of panic is the num_pm of nfit_test1 is wrong.
Though 1 is specified for num_pm at nfit_test_init(), it must be 2,
because nfit_test1->spa_set[] array has 2 elements.
Since the array is smaller than expected, the driver breaks other area.
(it is often the link list of devres).
As a result, panic occurs like the following example.
CPU: 4 PID: 2233 Comm: lt-libndctl Tainted: G O 4.12.0-rc1+ #12
RIP: 0010:__list_del_entry_valid+0x6c/0xa0
Call Trace:
release_nodes+0x76/0x260
devres_release_all+0x3c/0x50
device_release_driver_internal+0x159/0x200
device_release_driver+0x12/0x20
bus_remove_device+0xfd/0x170
device_del+0x1e8/0x330
platform_device_del+0x28/0x90
platform_device_unregister+0x12/0x30
nfit_test_exit+0x2a/0x93b [nfit_test]
Cc: <stable@vger.kernel.org>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
bytes. Instead we convert them to use guid_t type. At the same time we
convert current users.
acpi_str_to_uuid() becomes useless after the conversion and it's safe to
get rid of it.
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The workqueue may still be running when the devres callbacks start
firing to deallocate an acpi_nfit_desc instance. Stop and flush the
workqueue before letting any other devres de-allocations proceed.
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Keep the nfit_test instances alive until after nfit_test_teardown(), as
we may be doing resource lookups until the final un-registrations have
completed. This fixes crashes of the form.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: __release_resource+0x12/0x90
Call Trace:
remove_resource+0x23/0x40
__wrap_remove_resource+0x29/0x30 [nfit_test_iomap]
acpi_nfit_remove_resource+0xe/0x10 [nfit]
devm_action_release+0xf/0x20
release_nodes+0x16d/0x2b0
devres_release_all+0x3c/0x60
device_release+0x21/0x90
kobject_release+0x6a/0x170
kobject_put+0x2f/0x60
put_device+0x17/0x20
platform_device_unregister+0x20/0x30
nfit_test_exit+0x36/0x960 [nfit_test]
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add a simulated dimm with an ACPI_NFIT_MEM_MAP_FAILED indication, and
set the ACPI_NFIT_MEM_HEALTH_ENABLED flag on all the dimms where
nfit_test simulates health events, but spread it out over several
redundant memdev entries to test that the nfit driver coalesces all the
flags.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For testing changes to the iset cookie algorithm we need a value that is
constant from run-to-run.
Stop including dynamic data in the emulated region_offset values. Also,
pick values that sort in a different order depending on whether the
comparison is a memcmp() of two 8-byte arrays or subtraction of two
64-bit values.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A recent flurry of bug discoveries in the nfit driver's DSM marshalling
routine has highlighted the fact that we do not have unit test coverage
for this routine. Add a self-test of acpi_nfit_ctl() routine before
probing the "nfit_test.0" device. This mocks stimulus to acpi_nfit_ctl()
and if any of the tests fail "nfit_test.0" will be unavailable causing
the rest of the tests to not run / fail.
This unit test will also be a place to land reproductions of quirky BIOS
behavior discovered in the field and ensure the kernel does not regress
against implementations it has seen in practice.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Update nfit_test infrastructure to enable labels for the dimm on the
nfit_test.1 bus. This bus has a pmem region without aliased blk space,
so it is a candidate for dynamically enabling label support by writing
a namespace index block.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Update nfit_test to handle multiple sub-allocations within a given pmem
region. The mock resource now tracks and un-tracks sub-ranges as they
are requested and released (either explicitly or via devm callback).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add an nfit_test specific attribute for gating whether a get_config_size
DSM, or any DSM for that matter, succeeds or fails. The get_config_size
DSM is initial motivation since that is the first command libnvdimm core
issues to determine the state of the namespace label area.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit 480b6837aa "nvdimm: fix PHYS_PFN/PFN_PHYS mixup" identified
that we were passing an invalid address to devm_nvdimm_ioremap(). With
that fixed it exposed a bug in the memory reservation size for flush
hint tables. Since we map a full page we need to mock a full page of
memory to back the flush hint table entries.
Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Trigger an nmemX/nfit/flags attribute to fire an event whenever a
smart-threshold DSM is received.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We have had a couple bugs in this implementation in the past and before
we add another ->notify() implementation for nvdimm devices, lets allow
this routine to be exercised via nfit_test.
Rewrite acpi_nfit_notify() in terms of a generic struct device and
acpi_handle parameter, and then implement a mock acpi_evaluate_object()
that returns a _FIT payload.
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The unit tests crash when hotplug races the previous probe. This race
requires that the loading of the nfit_test module be terminated with
SIGTERM, and the module to be unloaded while the ars scan is still
running.
In contrast to the normal nfit driver, the unit test calls
acpi_nfit_init() twice to simulate hotplug, whereas the nominal case
goes through the acpi_nfit_notify() event handler. The
acpi_nfit_notify() path is careful to flush the previous region
registration before servicing the hotplug event. The unit test was
missing this guarantee.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810cdce7>] pwq_activate_delayed_work+0x47/0x170
[..]
Call Trace:
[<ffffffff810ce186>] pwq_dec_nr_in_flight+0x66/0xa0
[<ffffffff810ce490>] process_one_work+0x2d0/0x680
[<ffffffff810ce331>] ? process_one_work+0x171/0x680
[<ffffffff810ce88e>] worker_thread+0x4e/0x480
[<ffffffff810ce840>] ? process_one_work+0x680/0x680
[<ffffffff810ce840>] ? process_one_work+0x680/0x680
[<ffffffff810d5343>] kthread+0xf3/0x110
[<ffffffff8199846f>] ret_from_fork+0x1f/0x40
[<ffffffff810d5250>] ? kthread_create_on_node+0x230/0x230
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
With the arrival of x86-machine-check support the nfit driver will add a
(conditionally-compiled) source file. Prepare for this by moving all
nfit source to drivers/acpi/nfit/. This is pure code movement, no
functional changes.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
While testing the new on-demand ARS patches we discovered that
differences between the nfit_test and normal nfit driver shutdown paths
can leak resources. Unify the shutdown paths to trigger via a devm_
callback when the acpi_desc->dev is unbound from its driver.
Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Let the provider module be explicitly passed in rather than implicitly
assumed by the module that calls nvdimm_bus_register(). This is in
preparation for unifying the nfit and nfit_test driver teardown paths.
Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pass the nfit buffer as a parameter rather than hanging it off of
acpi_desc.
Reviewed-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
New for ACPI 6.1, these fields are used in the common dimm
representation format defined by section 5.2.25.9 "NVDIMM representation
format".
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Test the virtual disk ranges that platform firmware like EDK2/OVMF might
emit.
Tested-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
DMA_CMA is incompatible with SWIOTLB used in enterprise distro
configurations. Switch to vmalloc() allocations for all resources.
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently phys_to_pfn_t() is an exported symbol to allow nfit_test to
override it and indicate that nfit_test-pmem is not device-mapped. Now,
we want to enable nfit_test to operate without DMA_CMA and the pmem it
provides will no longer be physically contiguous, i.e. won't be capable
of supporting direct_access requests larger than a page. Make
pmem_direct_access() a weak symbol so that it can be replaced by the
tools/testing/nvdimm/ version, and move phys_to_pfn_t() to a static
inline now that it no longer needs to be overridden.
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Clarify the distinction between "commands", the ioctls userspace calls
to request the kernel take some action on a given dimm device, and
"_DSMs", the actual function numbers used in the firmware interface to
the DIMM. _DSMs are ACPI specific whereas commands are Linux kernel
generic.
This is in preparation for breaking the 1:1 implicit relationship
between the kernel ioctl number space and the firmware specific function
numbers.
Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for providing an alternative (to block device) access
mechanism to persistent memory, convert pmem_rw_bytes() to
nsio_rw_bytes(). This allows ->rw_bytes() functionality without
requiring a 'struct pmem_device' to be instantiated.
In other words, when ->rw_bytes() is in use i/o is driven through
'struct nd_namespace_io', otherwise it is driven through 'struct
pmem_device' and the block layer. This consolidates the disjoint calls
to devm_exit_badblocks() and devm_memunmap() into a common
devm_nsio_disable() and cleans up the init path to use a unified
pmem_attach_disk() implementation.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Provide simulated SMART data to enable the ndctl implementation of SMART
data retrieval and parsing.
The payload is defined here, "Section 4.1 SMART and Health Info
(Function Index 1)":
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add the boiler-plate for a 'clear error' command based on section
9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI
6.1 specification, and add a reference implementation in nfit_test.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Simulate platform-firmware-initiated and asynchronous scrub results.
This injects poison in the middle of all nfit_test pmem address ranges.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The nvdimm unit test infrastructure performs its own initialization of
an acpi_nfit_desc to specify test overrides over the native
implementation. Make it clear which attributes and operations it is
overriding by re-using acpi_nfit_init_desc() as a common starting point.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The return value from an 'ndctl_fn' reports the command execution
status, i.e. was the command properly formatted and was it successfully
submitted to the bus provider. The new 'cmd_rc' parameter allows the bus
provider to communicate command specific results, translated into
common error codes.
Convert the ARS commands to this scheme to:
1/ Consolidate status reporting
2/ Prepare for for expanding ars unit test cases
3/ Make the implementation more generic
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ACPI 6.1 clarifies that "The system shall include an NVDIMM Control
Region Structure for every Function Interface in the NVDIMM."
Implement this clarification in nfit_test.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ACPI 6.1 and JEDEC Annex L Release 3 formalize the format interface
code. Add definitions and update their usage in the unit test.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use the output length specified in the command to size the receive
buffer rather than the arbitrary 4K limit.
This bug was hiding the fact that the ndctl implementation of
ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size.
Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A dma_addr_t is potentially smaller than a phys_addr_t on some archs.
Don't truncate the address when doing the pfn conversion.
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Matthew Wilcox <willy@linux.intel.com>
[willy: fix pfn_t_to_phys as well]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for getting a poison list using ARS DSMs, enable DSMs for
all manufactured NFITs supplied by the test framework. Also, supply
valid response data for ars_status.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The unit test infrastructure uses CMA and real memory to emulate nvdimm
resources. The call to devm_memremap_pages() can simply be mocked in
the same manner as memremap and we mock phys_to_pfn_t() to clear PFN_MAP
since these resources are not registered with in the pgmap_radix.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When support for _FIT was added, the code presumed that the data
returned by the _FIT method is identical to the NFIT table, which
starts with an acpi_table_header. However, the _FIT is defined
to return a data in the format of a series of NFIT type structure
entries and as a method, has an acpi_object header rather tahn
an acpi_table_header.
To address the differences, explicitly save the acpi_table_header
from the NFIT, since it is accessible through /sys, and change
the nfit pointer in the acpi_desc structure to point to the
table entries rather than the headers.
Reported-by: Jeff Moyer (jmoyer@redhat.com>
Signed-off-by: Linda Knippers <linda.knippers@hpe.com>
Acked-by: Vishal Verma <vishal.l.verma@intel.com>
[vishal: fix up unit test for new header assumptions]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit ca321d1ca6 "ACPICA: Update NFIT table to rename a flags field"
performed a tree-wide s/ACPI_NFIT_MEM_ARMED/ACPI_NFIT_MEM_NOT_ARMED/
operation, but missed the tools/testing/nvdimm/ directory.
Cc: Bob Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add a .notify callback to the acpi_nfit_driver that gets called on a
hotplug event. From this, evaluate the _FIT ACPI method which returns
the updated NFIT with handles for the hot-plugged NVDIMM.
Iterate over the new NFIT, and add any new tables found, and
register/enable the corresponding regions.
In the nfit test framework, after normal initialization, update the NFIT
with a new hot-plugged NVDIMM, and directly call into the driver to
update its view of the available regions.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Elliott, Robert <elliott@hpe.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: <linux-acpi@vger.kernel.org>
Cc: <linux-nvdimm@lists.01.org>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Enable the pmem driver to handle PFN device instances. Attaching a pmem
namespace to a pfn device triggers the driver to allocate and initialize
struct page entries for pmem. Memory capacity for this allocation comes
exclusively from RAM for now which is suitable for low PMEM to RAM
ratios. This mechanism will be expanded later for setting an "allocate
from PMEM" policy.
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This should result in a pretty sizeable performance gain for reads. For
rough comparison I did some simple read testing using PMEM to compare
reads of write combining (WC) mappings vs write-back (WB). This was
done on a random lab machine.
PMEM reads from a write combining mapping:
# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
PMEM reads from a write-back mapping:
# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
1000000+0 records in
1000000+0 records out
4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:
http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read. This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM. We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.
In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range(). This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djbw: tools/testing/nvdimm/ and memunmap_pmem support]
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>