linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-24 03:07:29 +07:00

Author	SHA1	Message	Date
Dan Williams	41fce90f26	libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment The following namespace configuration attempt: # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f libndctl: ndctl_dax_enable: dax0.1: failed to enable Error: namespace0.0: failed to enable failed to reconfigure namespace: No such device or address ...fails when the backing memory range is not physically aligned to 1G: # cat /proc/iomem \| grep Persistent 210000000-30fffffff : Persistent Memory (legacy) In the above example the 4G persistent memory range starts and ends on a 256MB boundary. We handle this case correctly when needing to handle cases that violate section alignment (128MB) collisions against "System RAM", and we simply need to extend that padding/truncation for the 1GB alignment use case. Cc: <stable@vger.kernel.org> Fixes: `315c562536` ("libnvdimm, pfn: add 'align' attribute...") Reported-and-tested-by: Jane Chu <jane.chu@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-12-19 15:37:34 -08:00
Dan Williams	19deaa217b	libnvdimm, pfn: fix start_pad handling for aligned namespaces The alignment checks at pfn driver startup fail to properly account for the 'start_pad' in the case where the namespace is misaligned relative to its internal alignment. This is typically triggered in 1G aligned namespace, but could theoretically trigger with small namespace alignments. When this triggers the kernel reports messages of the form: dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000 Cc: <stable@vger.kernel.org> Fixes: `1ee6667cd8` ("libnvdimm, pfn, dax: fix initialization vs autodetect...") Reported-by: Jane Chu <jane.chu@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-12-19 15:10:06 -08:00
Dan Williams	26417ae4fc	libnvdimm, pfn: make 'resource' attribute only readable by root For the same reason that /proc/iomem returns 0's for non-root readers and acpi tables are root-only, make the 'resource' attribute for pfn devices only readable by root. Otherwise we disclose physical address information. Fixes: `f6ed58c70d` ("libnvdimm, pfn: 'resource'-address and 'size'...") Cc: <stable@vger.kernel.org> Reported-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-09-28 09:13:06 -07:00
Dan Williams	f13d2b61e5	libnvdimm, pfn, dax: limit namespace alignments to the supported set Now that we properly advertise the supported pte, pmd, and pud sizes, restrict the supported alignments that can be set on a namespace. This assumes that userspace was not previously relying on the ability to set odd alignments. At least ndctl only ever supported setting the namespace alignment to 4K, 2M, or 1G. Cc: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-08-15 09:32:12 -07:00
Oliver O'Halloran	1fdadbebc4	libnvdimm, pfn, dax: show supported dax/pfn region alignments in sysfs The alignment of a DAX and PFN regions dictates the page sizes that can be used to map the region. Even if the hardware page sizes are known the actual range of supported page sizes that can be used with DAX depends on the kernel configuration. As a result it's best that the kernel advertises the alignments that should be used with these region types. This patch adds the 'supported_alignments' region attribute to expose this information to userspace. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [djbw: integrate with nd_size_select_show() rename and other fixups] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-08-11 18:06:16 -07:00
Oliver O'Halloran	0dd6964306	libnvdimm: Stop using HPAGE_SIZE Currently libnvdimm uses HPAGE_SIZE as the default alignment for DAX and PFN devices. HPAGE_SIZE is the default hugetlbfs page size and when hugetlbfs is disabled it defaults to PAGE_SIZE. Given DAX has more in common with THP than hugetlbfs we should proably be using HPAGE_PMD_SIZE, but this is undefined when THP is disabled so lets just give it a new name. The other usage of HPAGE_SIZE in libnvdimm is when determining how large the altmap should be. For the reasons mentioned above it doesn't really make sense to use HPAGE_SIZE here either. PMD_SIZE seems to be safe to use in generic code and it happens to match the vmemmap allocation block on x86 and Power. It's still a hack, but it's a slightly nicer hack. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-07-25 16:30:51 -07:00
Dan Williams	9d92573fff	Merge branch 'for-4.13/dax' into libnvdimm-for-next	2017-07-03 16:54:58 -07:00
Dan Williams	c9e582aa68	libnvdimm, nfit: enable support for volatile ranges Allow volatile nfit ranges to participate in all the same infrastructure provided for persistent memory regions. A resulting resulting namespace device will still be called "pmem", but the parent region type will be "nd_volatile". This is in preparation for disabling the dax ->flush() operation in the pmem driver when it is hosted on a volatile range. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-06-27 16:44:13 -07:00
Dan Williams	b3fde74ea1	libnvdimm, label: add address abstraction identifiers Starting with v1.2 labels, 'address abstractions' can be hinted via an address abstraction id that implies an info-block format. The standard address abstraction in the specification is the v2 format of the Block-Translation-Table (BTT). Support for that is saved for a later patch, for now we add support for the Linux supported address abstractions BTT (v1), PFN, and DAX. The new 'holder_class' attribute for namespace devices is added for tooling to specify the 'abstraction_guid' to store in the namespace label. For v1.1 labels this field is undefined and any setting of 'holder_class' away from the default 'none' value will only have effect until the driver is unloaded. Setting 'holder_class' requires that whatever device tries to claim the namespace must be of the specified class. Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-06-15 14:31:40 -07:00
Vishal Verma	3ae3d67ba7	libnvdimm: add an atomic vs process context flag to rw_bytes nsio_rw_bytes can clear media errors, but this cannot be done while we are in an atomic context due to locking within ACPI. From the BTT, ->rw_bytes may be called either from atomic or process context depending on whether the calls happen during initialization or during IO. During init, we want to ensure error clearing happens, and the flag marking process context allows nsio_rw_bytes to do that. When called during IO, we're in atomic context, and error clearing can be skipped. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-05-10 21:46:22 -07:00
Dan Williams	d5483feda8	libnvdimm, pfn: fix 'npfns' vs section alignment Fix failures to create namespaces due to the vmem_altmap not advertising enough free space to store the memmap. WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0 [..] Call Trace: dump_stack+0x63/0x83 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xde/0xf0 devm_memremap_pages+0x244/0x440 pmem_attach_disk+0x37e/0x490 [nd_pmem] nd_pmem_probe+0x7e/0xa0 [nd_pmem] nvdimm_bus_probe+0x71/0x120 [libnvdimm] driver_probe_device+0x2bb/0x460 bind_store+0x114/0x160 drv_attr_store+0x25/0x30 In commit `658922e57b` "libnvdimm, pfn: fix memmap reservation sizing" we arranged for the capacity to be allocated, but failed to also update the 'npfns' parameter. This leads to cases where there is enough capacity reserved to hold all the allocated sections, but vmemmap_populate_hugepages() still encounters -ENOMEM from altmap_alloc_block_buf(). This fix is a stop-gap until we can teach the core memory hotplug implementation to permit sub-section hotplug. Cc: <stable@vger.kernel.org> Fixes: `658922e57b` ("libnvdimm, pfn: fix memmap reservation sizing") Reported-by: Anisha Allada <anisha.allada@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-05-04 19:54:42 -07:00
Dan Williams	452bae0aed	libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering A debug patch to turn the standard device_lock() into something that lockdep can analyze yielded the following: ====================================================== [ INFO: possible circular locking dependency detected ] 4.11.0-rc4+ #106 Tainted: G O ------------------------------------------------------- lt-libndctl/1898 is trying to acquire lock: (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm] but task is already holding lock: (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nvdimm_bus_lock+0x21/0x30 [libnvdimm] nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm] nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm] nd_pmem_probe+0x14/0x180 [nd_pmem] nvdimm_bus_probe+0xa9/0x260 [libnvdimm] -> #0 (&dev->nvdimm_mutex/3){+.+.+.}: __lock_acquire+0x1107/0x1280 lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nd_attach_ndns+0x178/0x1b0 [libnvdimm] nd_namespace_store+0x308/0x3c0 [libnvdimm] namespace_store+0x87/0x220 [libnvdimm] In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'. Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect nd_{attach,detach}_ndns() operations. Cc: <stable@vger.kernel.org> Fixes: `8c2f7e8658` ("libnvdimm: infrastructure for btt devices") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-05-01 08:29:37 -07:00
Dan Williams	bfb34527a3	libnvdimm, pfn: fix memmap reservation size versus 4K alignment When vmemmap_populate() allocates space for the memmap it does so in 2MB sized chunks. The libnvdimm-pfn driver incorrectly accounts for this when the alignment of the device is set to 4K. When this happens we trigger memory allocation failures in altmap_alloc_block_buf() and trigger warnings of the form: WARNING: CPU: 0 PID: 3376 at arch/x86/mm/init_64.c:656 arch_add_memory+0xe4/0xf0 [..] Call Trace: dump_stack+0x86/0xc3 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xe4/0xf0 devm_memremap_pages+0x29b/0x4e0 Fixes: `315c562536` ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-02-04 14:47:31 -08:00
Dan Williams	af7d9f0c57	libnvdimm, pfn: fix align attribute Fix the format specifier so that the attribute can be parsed correctly. Currently it returns decimal 1000 for a 4096-byte alignment. Cc: <stable@vger.kernel.org> Reported-by: Dave Jiang <dave.jiang@intel.com> Fixes: `315c562536` ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE") Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-12-10 08:12:05 -08:00
Dan Williams	1ee6667cd8	libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment The updated ndctl unit tests discovered that if a pfn configuration with a 4K alignment is read from the namespace, that alignment will be ignored in favor of the default 2M alignment. The result is that the configuration will fail initialization with a message like: dax6.1: bad offset: 0x22000 dax disabled align: 0x200000 Fix this by allowing the alignment read from the info block to override the default which is 2M not 0 in the autodetect path. This also fixes a similar problem with the mode and alignment settings silently being overwritten by the kernel when userspace has changed it. We now will either overwrite the info block if userspace changes the uuid or fail and warn if a live setting disagrees with the info block. Cc: <stable@vger.kernel.org> Cc: Micah Parrish <micah.parrish@hpe.com> Cc: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-06-23 17:50:39 -07:00
Dan Williams	36092ee8ba	Merge branch 'for-4.7/dax' into libnvdimm-for-next	2016-05-21 12:33:04 -07:00
Dan Williams	03dca343af	libnvdimm, dax: fix deletion The ndctl unit tests discovered that the dax enabling omitted updates to nd_detach_and_reset(). This routine clears device the configuration when the namespace is detached. Without this clearing userspace may assume that the device is in the process of being configured by another agent in the system. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-21 12:22:41 -07:00
Dan Williams	5e24c9fd36	libnvdimm, dax: fix alignment validation Testing the dax-device autodetect support revealed a probe failure with the following result: dax0.1: bad offset: 0x8200000 dax disabled The original pfn-device implementation inferred the alignment from ilog2(offset), now that the alignment is explicit the is_power_of_2() needs replacing with a real sanity check against the recorded alignment. Otherwise the alignment check is useless in the implicit case and only the minimum size of the offset matters. This self-consistency check is further validated by the probe path that will re-check that the offset is large enough to contain all the metadata required to enable the device. Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-21 11:01:41 -07:00
Dan Williams	c5ed926864	libnvdimm, dax: autodetect support For autodetecting a previously established dax configuration we need the info block to indicate block-device vs device-dax mode, and we need to have the default namespace probe hand-off the configuration to the dax_pmem driver. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-20 22:02:57 -07:00
Dan Williams	594d6d96ea	Merge branch 'for-4.7/dax' into libnvdimm-for-next	2016-05-18 09:59:34 -07:00
Dan Williams	45a0dac045	libnvdimm, dax: record the specified alignment of a dax-device instance We want to use the alignment as the allocation and mapping unit. Previously this information was only useful for establishing the data offset, but now it is important to remember the granularity for the later use. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-09 15:35:42 -07:00
Dan Williams	52ac23b25e	libnvdimm, dax: reserve space to store labels for device-dax We may want to subdivide a device-dax range into multiple devices so that each can have separate permissions or naming. Reserve 128K of label space by default so we have the capability of making allocation decisions persistent. This reservation is not something we can add later since it would result in the default size of a device-dax range changing between kernel versions. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-09 15:35:42 -07:00
Dan Williams	cd03412a51	libnvdimm, dax: introduce device-dax infrastructure Device DAX is the device-centric analogue of Filesystem DAX (CONFIG_FS_DAX). It allows persistent memory ranges to be allocated and mapped without need of an intervening file system. This initial infrastructure arranges for a libnvdimm pfn-device to be represented as a different device-type so that it can be attached to a driver other than the pmem driver. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-05-09 15:35:42 -07:00
Dan Williams	ac515c084b	libnvdimm, pmem, pfn: move pfn setup to the core Now that pmem internals have been disentangled from pfn setup, that code can move to the core. This is in preparation for adding another user of the pfn-device capabilities. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-04-22 12:26:23 -07:00
Dan Williams	200c79da82	libnvdimm, pmem, pfn: make pmem_rw_bytes generic and refactor pfn setup In preparation for providing an alternative (to block device) access mechanism to persistent memory, convert pmem_rw_bytes() to nsio_rw_bytes(). This allows ->rw_bytes() functionality without requiring a 'struct pmem_device' to be instantiated. In other words, when ->rw_bytes() is in use i/o is driven through 'struct nd_namespace_io', otherwise it is driven through 'struct pmem_device' and the block layer. This consolidates the disjoint calls to devm_exit_badblocks() and devm_memunmap() into a common devm_nsio_disable() and cleans up the init path to use a unified pmem_attach_disk() implementation. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-04-22 12:26:23 -07:00
Dan Williams	bd032943b5	libnvdimm, pfn, convert nd_pfn_probe() to devm Pass the device performing the probe so we can use a devm allocation for the pfn superblock. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-04-22 10:59:54 -07:00
Dan Williams	e5670563f5	libnvdimm, pfn: fix uuid validation If we detect a namespace has a stale info block in the init path, we should overwrite with the latest configuration. In fact, we already return -ENODEV when the parent uuid is invalid, the same should be done for the 'self' uuid. Otherwise we can get into a condition where userspace is unable to reconfigure the pfn-device without directly / manually invalidating the info block. Cc: <stable@vger.kernel.org> Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-04-07 19:59:27 -07:00
Dan Williams	f6ed58c70d	libnvdimm, pfn: 'resource'-address and 'size' attributes for pfn devices Currenty with a raw mode pmem namespace the physical memory address range for the device can be obtained via /sys/block/pmemX/device/{resource\|size}. Add similar attributes for pfn instances that takes the struct page memmap and section padding into account. Reported-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-03-05 12:25:45 -08:00
Dan Williams	cfe30b8720	libnvdimm, pmem: adjust for section collisions with 'System RAM' On a platform where 'Persistent Memory' and 'System RAM' are mixed within a given sparsemem section, trim the namespace and notify about the sub-optimal alignment. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-03-05 12:25:45 -08:00
Dan Williams	45eb570a0d	libnvdimm, pfn: fix restoring memmap location This path was missed when turning on the memmap in pmem support. Permit 'pmem' as a valid location for the map. Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-01-29 17:43:16 -08:00
Dan Williams	d2c0f041e1	libnvdimm, pfn, pmem: allocate memmap array in persistent memory Use the new vmem_altmap capability to enable the pmem driver to arrange for a struct page memmap to be established in persistent memory. [linux@roeck-us.net: mn10300: declare __pfn_to_phys() to fix build error] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Dan Williams	a34d5e8a6a	libnvdimm, pfn: add parent uuid validation Track and check the uuid of the namespace hosting a pfn instance. This forces the pfn info block to be invalidated if the namespace is re-configured with a different uuid. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-13 11:40:20 -08:00
Dan Williams	315c562536	libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE When setting aside capacity for struct page it must be aligned to the largest mapping size that is to be made available via DAX. Make the alignment configurable to enable support for 1GiB page-size mappings. The offset for PFN_MODE_RAM may now be larger than SZ_8K, so fixup the offset check in nvdimm_namespace_attach_pfn(). Reported-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-12 15:04:26 -08:00
Dan Williams	f7c6ab80fa	libnvdimm, pfn: clean up pfn create parameters In all cases __nd_pfn_create is called with default parameters which are then overridden by values in the info block. Clean up pfn creation by dropping the parameters and setting default values internal to __nd_pfn_create. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-10 16:11:59 -08:00
Dan Williams	9f1e8cee77	libnvdimm, pfn: kill ND_PFN_ALIGN The alignment constraint isn't necessary now that devm_memremap_pages() allows for unaligned mappings. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-10 15:14:20 -08:00
Axel Lin	4ca8b57a0a	libnvdimm: pfn_devs: Fix locking in namespace_store Always take device_lock() before nvdimm_bus_lock() to prevent deadlock. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-09-17 11:47:50 -04:00
Dan Williams	32ab0a3f51	libnvdimm, pmem: 'struct page' for pmem Enable the pmem driver to handle PFN device instances. Attaching a pmem namespace to a pfn device triggers the driver to allocate and initialize struct page entries for pmem. Memory capacity for this allocation comes exclusively from RAM for now which is suitable for low PMEM to RAM ratios. This mechanism will be expanded later for setting an "allocate from PMEM" policy. Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-08-28 23:40:04 -04:00
Dan Williams	e1455744b2	libnvdimm, pfn: 'struct page' provider infrastructure Implement the base infrastructure for libnvdimm PFN devices. Similar to BTT devices they take a namespace as a backing device and layer functionality on top. In this case the functionality is reserving space for an array of 'struct page' entries to be handed out through pfn_to_page(). For now this is just the basic libnvdimm-device-model for configuring the base PFN device. As the namespace claiming mechanism for PFN devices is mostly identical to BTT devices drivers/nvdimm/claim.c is created to house the common bits. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-08-28 23:39:36 -04:00

38 Commits