Commit Graph

59 Commits

Author SHA1 Message Date
Linus Torvalds
a379f71a30 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a few block updates that fell in my lap

 - lib/ updates

 - checkpatch

 - autofs

 - ipc

 - a ton of misc other things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits)
  mm: split gfp_mask and mapping flags into separate fields
  fs: use mapping_set_error instead of opencoded set_bit
  treewide: remove redundant #include <linux/kconfig.h>
  hung_task: allow hung_task_panic when hung_task_warnings is 0
  kthread: add kerneldoc for kthread_create()
  kthread: better support freezable kthread workers
  kthread: allow to modify delayed kthread work
  kthread: allow to cancel kthread work
  kthread: initial support for delayed kthread work
  kthread: detect when a kthread work is used by more workers
  kthread: add kthread_destroy_worker()
  kthread: add kthread_create_worker*()
  kthread: allow to call __kthread_create_on_node() with va_list args
  kthread/smpboot: do not park in kthread_create_on_cpu()
  kthread: kthread worker API cleanup
  kthread: rename probe_kthread_data() to kthread_probe_data()
  scripts/tags.sh: enable code completion in VIM
  mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping
  kdump, vmcoreinfo: report memory sections virtual addresses
  ipc/sem.c: add cond_resched in exit_sme
  ...
2016-10-11 17:34:10 -07:00
Masahiro Yamada
97139d4a6f treewide: remove redundant #include <linux/kconfig.h>
Kernel source files need not include <linux/kconfig.h> explicitly
because the top Makefile forces to include it with:

  -include $(srctree)/include/linux/kconfig.h

This commit removes explicit includes except the following:

  * arch/s390/include/asm/facilities_src.h
  * tools/testing/radix-tree/linux/kernel.h

These two are used for host programs.

Link: http://lkml.kernel.org/r/1473656164-11929-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:33 -07:00
Dan Williams
178d6f4be8 Merge branch 'for-4.9/libnvdimm' into libnvdimm-for-next 2016-10-07 16:46:24 -07:00
Dan Williams
bd4cd745b3 tools/testing/nvdimm: support for sub-dividing a pmem region
Update nfit_test to handle multiple sub-allocations within a given pmem
region.  The mock resource now tracks and un-tracks sub-ranges as they
are requested and released (either explicitly or via devm callback).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-10-07 09:20:53 -07:00
Dan Williams
73606afd46 tools/testing/nvdimm: test get_config_size DSM failures
Add an nfit_test specific attribute for gating whether a get_config_size
DSM, or any DSM for that matter, succeeds or fails.  The get_config_size
DSM is initial motivation since that is the first command libnvdimm core
issues to determine the state of the namespace label area.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21 09:36:36 -07:00
Dan Williams
9d15ce9caa tools/testing/nvdimm: fix allocation range for mock flush hint tables
Commit 480b6837aa "nvdimm: fix PHYS_PFN/PFN_PHYS mixup" identified
that we were passing an invalid address to devm_nvdimm_ioremap(). With
that fixed it exposed a bug in the memory reservation size for flush
hint tables.  Since we map a full page we need to mock a full page of
memory to back the flush hint table entries.

Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-19 13:49:48 -07:00
Dan Williams
231bf117aa tools/testing/nvdimm: unit test for acpi_nvdimm_notify()
Trigger an nmemX/nfit/flags attribute to fire an event whenever a
smart-threshold DSM is received.

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-01 18:20:14 -07:00
Dan Williams
c14a868a5a tools/testing/nvdimm: unit test for acpi_nfit_notify()
We have had a couple bugs in this implementation in the past and before
we add another ->notify() implementation for nvdimm devices, lets allow
this routine to be exercised via nfit_test.

Rewrite acpi_nfit_notify() in terms of a generic struct device and
acpi_handle parameter, and then implement a mock acpi_evaluate_object()
that returns a _FIT payload.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-08-23 07:49:42 -07:00
Dan Williams
d8d378fa1a tools/testing/nvdimm: fix SIGTERM vs hotplug crash
The unit tests crash when hotplug races the previous probe. This race
requires that the loading of the nfit_test module be terminated with
SIGTERM, and the module to be unloaded while the ars scan is still
running.

In contrast to the normal nfit driver, the unit test calls
acpi_nfit_init() twice to simulate hotplug, whereas the nominal case
goes through the acpi_nfit_notify() event handler.  The
acpi_nfit_notify() path is careful to flush the previous region
registration before servicing the hotplug event. The unit test was
missing this guarantee.

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffff810cdce7>] pwq_activate_delayed_work+0x47/0x170
 [..]
 Call Trace:
  [<ffffffff810ce186>] pwq_dec_nr_in_flight+0x66/0xa0
  [<ffffffff810ce490>] process_one_work+0x2d0/0x680
  [<ffffffff810ce331>] ? process_one_work+0x171/0x680
  [<ffffffff810ce88e>] worker_thread+0x4e/0x480
  [<ffffffff810ce840>] ? process_one_work+0x680/0x680
  [<ffffffff810ce840>] ? process_one_work+0x680/0x680
  [<ffffffff810d5343>] kthread+0xf3/0x110
  [<ffffffff8199846f>] ret_from_fork+0x1f/0x40
  [<ffffffff810d5250>] ? kthread_create_on_node+0x230/0x230

Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-08-10 15:59:09 -07:00
Vishal Verma
6839a6d96f nfit: do an ARS scrub on hitting a latent media error
When a latent (unknown to 'badblocks') error is encountered, it will
trigger a machine check exception. On a system with machine check
recovery, this will only SIGBUS the process(es) which had the bad page
mapped (as opposed to a kernel panic on platforms without machine
check recovery features). In the former case, we want to trigger a full
rescan of that nvdimm bus. This will allow any additional, new errors
to be captured in the block devices' badblocks lists, and offending
operations on them can be trapped early, avoiding machine checks.

This is done by registering a callback function with the
x86_mce_decoder_chain and calling the new ars_rescan functionality with
the address in the mce notificatiion.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-24 08:04:04 -07:00
Dan Williams
bdf97013ce nfit: move to nfit/ sub-directory
With the arrival of x86-machine-check support the nfit driver will add a
(conditionally-compiled) source file.  Prepare for this by moving all
nfit source to drivers/acpi/nfit/.  This is pure code movement, no
functional changes.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-24 08:04:04 -07:00
Dan Williams
58cd71b474 nfit, tools/testing/nvdimm/: unify shutdown paths
While testing the new on-demand ARS patches we discovered that
differences between the nfit_test and normal nfit driver shutdown paths
can leak resources.  Unify the shutdown paths to trigger via a devm_
callback when the acpi_desc->dev is unbound from its driver.

Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-22 13:35:54 -07:00
Dan Williams
bc9775d869 libnvdimm: move ->module to struct nvdimm_bus_descriptor
Let the provider module be explicitly passed in rather than implicitly
assumed by the module that calls nvdimm_bus_register().  This is in
preparation for unifying the nfit and nfit_test driver teardown paths.

Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-21 20:03:19 -07:00
Dan Williams
e7a11b449e nfit: cleanup acpi_nfit_init calling convention
Pass the nfit buffer as a parameter rather than hanging it off of
acpi_desc.

Reviewed-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-21 14:12:18 -07:00
Dan Williams
5dc68e5574 tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
New for ACPI 6.1, these fields are used in the common dimm
representation format defined by section 5.2.25.9 "NVDIMM representation
format".

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-21 14:12:18 -07:00
Dan Williams
7bfe97c763 tools/testing/nvdimm: add virtual ramdisk range
Test the virtual disk ranges that platform firmware like EDK2/OVMF might
emit.

Tested-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-21 14:12:18 -07:00
Dan Williams
7a9eb20666 pmem: kill __pmem address space
The __pmem address space was meant to annotate codepaths that touch
persistent memory and need to coordinate a call to wmb_pmem().  Now that
wmb_pmem() is gone, there is little need to keep this annotation.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-12 19:25:38 -07:00
Dan Williams
85d3fa02e4 tools/testing/nvdimm: simulate multiple flush hints per-dimm
Sample nfit data to test the kernel's handling of the multiple
flush-hint case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-11 16:13:41 -07:00
Dan Williams
b28f08ce50 tools/testing/nvdimm: remove __wrap_devm_memremap_pages placeholder
This now dead code was needed to prevent compile errors while being
staged in -next for v4.5.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-07 17:11:02 -07:00
Dan Williams
ee8520fe8c tools/testing/nvdimm: replace CONFIG_DMA_CMA dependency with vmalloc()
DMA_CMA is incompatible with SWIOTLB used in enterprise distro
configurations.  Switch to vmalloc() allocations for all resources.

Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-27 11:40:46 -07:00
Dan Williams
f295e53b60 libnvdimm, pmem: allow nfit_test to override pmem_direct_access()
Currently phys_to_pfn_t() is an exported symbol to allow nfit_test to
override it and indicate that nfit_test-pmem is not device-mapped.  Now,
we want to enable nfit_test to operate without DMA_CMA and the pmem it
provides will no longer be physically contiguous, i.e. won't be capable
of supporting direct_access requests larger than a page.  Make
pmem_direct_access() a weak symbol so that it can be replaced by the
tools/testing/nvdimm/ version, and move phys_to_pfn_t() to a static
inline now that it no longer needs to be overridden.

Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-24 11:39:29 -07:00
Dan Williams
6b0a57ed43 tools/testing/nvdimm: add pfn device dependency
Fail building nfit_test.ko when the configuration is missing pfn device
support.

Reported-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-17 16:23:23 -07:00
Dan Williams
36092ee8ba Merge branch 'for-4.7/dax' into libnvdimm-for-next 2016-05-21 12:33:04 -07:00
Dan Williams
ab68f26221 /dev/dax, pmem: direct access to persistent memory
Device DAX is the device-centric analogue of Filesystem DAX
(CONFIG_FS_DAX).  It allows memory ranges to be allocated and mapped
without need of an intervening file system.  Device DAX is strict,
precise and predictable.  Specifically this interface:

1/ Guarantees fault granularity with respect to a given page size (pte,
pmd, or pud) set at configuration time.

2/ Enforces deterministic behavior by being strict about what fault
scenarios are supported.

For example, by forcing MADV_DONTFORK semantics and omitting MAP_PRIVATE
support device-dax guarantees that a mapping always behaves/performs the
same once established.  It is the "what you see is what you get" access
mechanism to differentiated memory vs filesystem DAX which has
filesystem specific implementation semantics.

Persistent memory is the first target, but the mechanism is also
targeted for exclusive allocations of performance differentiated memory
ranges.

This commit is limited to the base device driver infrastructure to
associate a dax device with pmem range.

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-20 22:02:53 -07:00
Dan Williams
1f716d05f8 Merge branch 'for-4.7/dsm' into libnvdimm-for-next 2016-05-18 10:06:59 -07:00
Dan Williams
2159669f58 Merge branch 'for-4.7/libnvdimm' into libnvdimm-for-next 2016-05-18 10:06:48 -07:00
Dan Williams
cd03412a51 libnvdimm, dax: introduce device-dax infrastructure
Device DAX is the device-centric analogue of Filesystem DAX
(CONFIG_FS_DAX).  It allows persistent memory ranges to be allocated and
mapped without need of an intervening file system.  This initial
infrastructure arranges for a libnvdimm pfn-device to be represented as
a different device-type so that it can be attached to a driver other
than the pmem driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-09 15:35:42 -07:00
Dan Williams
6634fb0690 tools/testing/nvdimm: ND_CMD_CALL support
Enable nfit_test to use nd_cmd_pkg marshaling.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-05 19:02:45 -07:00
Dan Williams
e3654eca70 nfit, libnvdimm: clarify "commands" vs "_DSMs"
Clarify the distinction between "commands", the ioctls userspace calls
to request the kernel take some action on a given dimm device, and
"_DSMs", the actual function numbers used in the firmware interface to
the DIMM.  _DSMs are ACPI specific whereas commands are Linux kernel
generic.

This is in preparation for breaking the 1:1 implicit relationship
between the kernel ioctl number space and the firmware specific function
numbers.

Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-28 16:23:16 -07:00
Dan Williams
200c79da82 libnvdimm, pmem, pfn: make pmem_rw_bytes generic and refactor pfn setup
In preparation for providing an alternative (to block device) access
mechanism to persistent memory, convert pmem_rw_bytes() to
nsio_rw_bytes().  This allows ->rw_bytes() functionality without
requiring a 'struct pmem_device' to be instantiated.

In other words, when ->rw_bytes() is in use i/o is driven through
'struct nd_namespace_io', otherwise it is driven through 'struct
pmem_device' and the block layer.  This consolidates the disjoint calls
to devm_exit_badblocks() and devm_memunmap() into a common
devm_nsio_disable() and cleans up the init path to use a unified
pmem_attach_disk() implementation.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-22 12:26:23 -07:00
Dan Williams
baa51277cf libnvdimm, test: add mock SMART data payload
Provide simulated SMART data to enable the ndctl implementation of SMART
data retrieval and parsing.

The payload is defined here, "Section 4.1 SMART and Health Info
(Function Index 1)":

    http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-04-11 11:11:14 -07:00
Dan Williams
d4f323672a nfit, libnvdimm: clear poison command support
Add the boiler-plate for a 'clear error' command based on section
9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI
6.1 specification, and add a reference implementation in nfit_test.

Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 18:06:14 -08:00
Dan Williams
f471f1a7d0 tools/testing/nvdimm: expand ars unit testing
Simulate platform-firmware-initiated and asynchronous scrub results.
This injects poison in the middle of all nfit_test pmem address ranges.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 12:24:06 -08:00
Dan Williams
a61fe6f790 nfit, tools/testing/nvdimm: unify common init for acpi_nfit_desc
The nvdimm unit test infrastructure performs its own initialization of
an acpi_nfit_desc to specify test overrides over the native
implementation.  Make it clear which attributes and operations it is
overriding by re-using acpi_nfit_init_desc() as a common starting point.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 12:24:06 -08:00
Dan Williams
aef2533822 libnvdimm, nfit: centralize command status translation
The return value from an 'ndctl_fn' reports the command execution
status, i.e. was the command properly formatted and was it successfully
submitted to the bus provider.  The new 'cmd_rc' parameter allows the bus
provider to communicate command specific results, translated into
common error codes.

Convert the ARS commands to this scheme to:

1/ Consolidate status reporting

2/ Prepare for for expanding ars unit test cases

3/ Make the implementation more generic

Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 12:24:06 -08:00
Dan Williams
3b87356f50 nfit, tools/testing/nvdimm: test multiple control regions per-dimm
ACPI 6.1 clarifies that "The system shall include an NVDIMM Control
Region Structure for every Function Interface in the NVDIMM."
Implement this clarification in nfit_test.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 12:24:06 -08:00
Dan Williams
be26f9ae02 nfit, tools/testing/nvdimm: add format interface code definitions
ACPI 6.1 and JEDEC Annex L Release 3 formalize the format interface
code.  Add definitions and update their usage in the unit test.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-05 12:24:06 -08:00
Dan Williams
747ffe11b4 libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
Use the output length specified in the command to size the receive
buffer rather than the arbitrary 4K limit.

This bug was hiding the fact that the ndctl implementation of
ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size.

Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-02-19 15:21:52 -08:00
Dan Williams
76e9f0ee52 phys_to_pfn_t: use phys_addr_t
A dma_addr_t is potentially smaller than a phys_addr_t on some archs.
Don't truncate the address when doing the pfn conversion.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Matthew Wilcox <willy@linux.intel.com>
[willy: fix pfn_t_to_phys as well]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-31 09:10:19 -08:00
Dan Williams
8b63b6bfc1 Merge branch 'for-4.5/block-dax' into for-4.5/libnvdimm 2016-01-10 07:53:55 -08:00
Dan Williams
d26f73f083 nfit_test: Enable DSMs for all test NFITs
In preparation for getting a poison list using ARS DSMs, enable DSMs for
all manufactured NFITs supplied by the test framework.  Also, supply
valid response data for ars_status.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-01-09 08:39:03 -08:00
Dan Williams
9bfa84969d tools/testing/libnvdimm: cleanup mock resource lookup
Push the locking around get_nfit_res() into get_nfit_res().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-12-24 12:20:19 -08:00
Dan Williams
979fccfb73 libnvdimm, pfn: enable pfn sysfs interface unit testing
The unit test infrastructure uses CMA and real memory to emulate nvdimm
resources.  The call to devm_memremap_pages() can simply be mocked in
the same manner as memremap and we mock phys_to_pfn_t() to clear PFN_MAP
since these resources are not registered with in the pgmap_radix.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-12-15 00:34:21 -08:00
Linda Knippers
6b577c9d77 nfit: Adjust for different _FIT and NFIT headers
When support for _FIT was added, the code presumed that the data
returned by the _FIT method is identical to the NFIT table, which
starts with an acpi_table_header.  However, the _FIT is defined
to return a data in the format of a series of NFIT type structure
entries and as a method, has an acpi_object header rather tahn
an acpi_table_header.

To address the differences, explicitly save the acpi_table_header
from the NFIT, since it is accessible through /sys, and change
the nfit pointer in the acpi_desc structure to point to the
table entries rather than the headers.

Reported-by: Jeff Moyer (jmoyer@redhat.com>
Signed-off-by: Linda Knippers <linda.knippers@hpe.com>
Acked-by: Vishal Verma <vishal.l.verma@intel.com>
[vishal: fix up unit test for new header assumptions]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-30 14:51:46 -08:00
Dan Williams
f42957967f tools/testing/nvdimm, acpica: fix flag rename build breakage
Commit ca321d1ca6 "ACPICA: Update NFIT table to rename a flags field"
performed a tree-wide s/ACPI_NFIT_MEM_ARMED/ACPI_NFIT_MEM_NOT_ARMED/
operation, but missed the tools/testing/nvdimm/ directory.

Cc: Bob Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-12 09:21:18 -08:00
Vishal Verma
209851649d acpi: nfit: Add support for hot-add
Add a .notify callback to the acpi_nfit_driver that gets called on a
hotplug event. From this, evaluate the _FIT ACPI method which returns
the updated NFIT with handles for the hot-plugged NVDIMM.

Iterate over the new NFIT, and add any new tables found, and
register/enable the corresponding regions.

In the nfit test framework, after normal initialization, update the NFIT
with a new hot-plugged NVDIMM, and directly call into the driver to
update its view of the available regions.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Elliott, Robert <elliott@hpe.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: <linux-acpi@vger.kernel.org>
Cc: <linux-nvdimm@lists.01.org>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-11-02 15:28:07 -05:00
Dan Williams
32ab0a3f51 libnvdimm, pmem: 'struct page' for pmem
Enable the pmem driver to handle PFN device instances.  Attaching a pmem
namespace to a pfn device triggers the driver to allocate and initialize
struct page entries for pmem.  Memory capacity for this allocation comes
exclusively from RAM for now which is suitable for low PMEM to RAM
ratios.  This mechanism will be expanded later for setting an "allocate
from PMEM" policy.

Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-28 23:40:04 -04:00
Dan Williams
e1455744b2 libnvdimm, pfn: 'struct page' provider infrastructure
Implement the base infrastructure for libnvdimm PFN devices. Similar to
BTT devices they take a namespace as a backing device and layer
functionality on top. In this case the functionality is reserving space
for an array of 'struct page' entries to be handed out through
pfn_to_page(). For now this is just the basic libnvdimm-device-model for
configuring the base PFN device.

As the namespace claiming mechanism for PFN devices is mostly identical
to BTT devices drivers/nvdimm/claim.c is created to house the common
bits.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-28 23:39:36 -04:00
Dan Williams
4a9bf88a5c Merge branch 'pmem-api' into libnvdimm-for-next 2015-08-27 19:40:26 -04:00
Ross Zwisler
67a3e8fe90 nd_blk: change aperture mapping from WC to WB
This should result in a pretty sizeable performance gain for reads.  For
rough comparison I did some simple read testing using PMEM to compare
reads of write combining (WC) mappings vs write-back (WB).  This was
done on a random lab machine.

PMEM reads from a write combining mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
	100000+0 records in
	100000+0 records out
	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

PMEM reads from a write-back mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
	1000000+0 records in
	1000000+0 records out
	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read.  This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM.  We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.

In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range().  This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:38:28 -04:00