linux_dsm_epyc7002/mm/sparse.c
Linus Torvalds f5a8eb632b arch: remove obsolete architecture ports
This removes the entire architecture code for blackfin, cris, frv, m32r,
 metag, mn10300, score, and tile, including the associated device drivers.
 
 I have been working with the (former) maintainers for each one to ensure
 that my interpretation was right and the code is definitely unused in
 mainline kernels. Many had fond memories of working on the respective
 ports to start with and getting them included in upstream, but also saw
 no point in keeping the port alive without any users.
 
 In the end, it seems that while the eight architectures are extremely
 different, they all suffered the same fate: There was one company
 in charge of an SoC line, a CPU microarchitecture and a software
 ecosystem, which was more costly than licensing newer off-the-shelf
 CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
 that all the SoC product lines are still around, but have not used the
 custom CPU architectures for several years at this point. In contrast,
 CPU instruction sets that remain popular and have actively maintained
 kernel ports tend to all be used across multiple licensees.
 
 The removal came out of a discussion that is now documented at
 https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
 marking any ports as deprecated but remove them all at once after I made
 sure that they are all unused. Some architectures (notably tile, mn10300,
 and blackfin) are still being shipped in products with old kernels,
 but those products will never be updated to newer kernel releases.
 
 After this series, we still have a few architectures without mainline
 gcc support:
 
 - unicore32 and hexagon both have very outdated gcc releases, but the
   maintainers promised to work on providing something newer. At least
   in case of hexagon, this will only be llvm, not gcc.
 
 - openrisc, risc-v and nds32 are still in the process of finishing their
   support or getting it added to mainline gcc in the first place.
   They all have patched gcc-7.3 ports that work to some degree, but
   complete upstream support won't happen before gcc-8.1. Csky posted
   their first kernel patch set last week, their situation will be similar.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
 PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
 Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
 yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
 FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
 sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
 ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
 cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
 pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
 X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
 57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
 3iIVHovno/JuJnTOE8LY
 =fQ8z
 -----END PGP SIGNATURE-----

Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pul removal of obsolete architecture ports from Arnd Bergmann:
 "This removes the entire architecture code for blackfin, cris, frv,
  m32r, metag, mn10300, score, and tile, including the associated device
  drivers.

  I have been working with the (former) maintainers for each one to
  ensure that my interpretation was right and the code is definitely
  unused in mainline kernels. Many had fond memories of working on the
  respective ports to start with and getting them included in upstream,
  but also saw no point in keeping the port alive without any users.

  In the end, it seems that while the eight architectures are extremely
  different, they all suffered the same fate: There was one company in
  charge of an SoC line, a CPU microarchitecture and a software
  ecosystem, which was more costly than licensing newer off-the-shelf
  CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
  seems that all the SoC product lines are still around, but have not
  used the custom CPU architectures for several years at this point. In
  contrast, CPU instruction sets that remain popular and have actively
  maintained kernel ports tend to all be used across multiple licensees.

  [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem"   - Linus ]

  The removal came out of a discussion that is now documented at
  https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
  marking any ports as deprecated but remove them all at once after I
  made sure that they are all unused. Some architectures (notably tile,
  mn10300, and blackfin) are still being shipped in products with old
  kernels, but those products will never be updated to newer kernel
  releases.

  After this series, we still have a few architectures without mainline
  gcc support:

   - unicore32 and hexagon both have very outdated gcc releases, but the
     maintainers promised to work on providing something newer. At least
     in case of hexagon, this will only be llvm, not gcc.

   - openrisc, risc-v and nds32 are still in the process of finishing
     their support or getting it added to mainline gcc in the first
     place. They all have patched gcc-7.3 ports that work to some
     degree, but complete upstream support won't happen before gcc-8.1.
     Csky posted their first kernel patch set last week, their situation
     will be similar

  [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum  - Linus ]"

This really says it all:

 2498 files changed, 95 insertions(+), 467668 deletions(-)

* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
  MAINTAINERS: UNICORE32: Change email account
  staging: iio: remove iio-trig-bfin-timer driver
  tty: hvc: remove tile driver
  tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
  serial: remove tile uart driver
  serial: remove m32r_sio driver
  serial: remove blackfin drivers
  serial: remove cris/etrax uart drivers
  usb: Remove Blackfin references in USB support
  usb: isp1362: remove blackfin arch glue
  usb: musb: remove blackfin port
  usb: host: remove tilegx platform glue
  pwm: remove pwm-bfin driver
  i2c: remove bfin-twi driver
  spi: remove blackfin related host drivers
  watchdog: remove bfin_wdt driver
  can: remove bfin_can driver
  mmc: remove bfin_sdh driver
  input: misc: remove blackfin rotary driver
  input: keyboard: remove bf54x driver
  ...
2018-04-02 20:20:12 -07:00

870 lines
23 KiB
C

// SPDX-License-Identifier: GPL-2.0
/*
* sparse memory mappings.
*/
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/mmzone.h>
#include <linux/bootmem.h>
#include <linux/compiler.h>
#include <linux/highmem.h>
#include <linux/export.h>
#include <linux/spinlock.h>
#include <linux/vmalloc.h>
#include "internal.h"
#include <asm/dma.h>
#include <asm/pgalloc.h>
#include <asm/pgtable.h>
/*
* Permanent SPARSEMEM data:
*
* 1) mem_section - memory sections, mem_map's for valid memory
*/
#ifdef CONFIG_SPARSEMEM_EXTREME
struct mem_section **mem_section;
#else
struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
____cacheline_internodealigned_in_smp;
#endif
EXPORT_SYMBOL(mem_section);
#ifdef NODE_NOT_IN_PAGE_FLAGS
/*
* If we did not store the node number in the page then we have to
* do a lookup in the section_to_node_table in order to find which
* node the page belongs to.
*/
#if MAX_NUMNODES <= 256
static u8 section_to_node_table[NR_MEM_SECTIONS] __cacheline_aligned;
#else
static u16 section_to_node_table[NR_MEM_SECTIONS] __cacheline_aligned;
#endif
int page_to_nid(const struct page *page)
{
return section_to_node_table[page_to_section(page)];
}
EXPORT_SYMBOL(page_to_nid);
static void set_section_nid(unsigned long section_nr, int nid)
{
section_to_node_table[section_nr] = nid;
}
#else /* !NODE_NOT_IN_PAGE_FLAGS */
static inline void set_section_nid(unsigned long section_nr, int nid)
{
}
#endif
#ifdef CONFIG_SPARSEMEM_EXTREME
static noinline struct mem_section __ref *sparse_index_alloc(int nid)
{
struct mem_section *section = NULL;
unsigned long array_size = SECTIONS_PER_ROOT *
sizeof(struct mem_section);
if (slab_is_available())
section = kzalloc_node(array_size, GFP_KERNEL, nid);
else
section = memblock_virt_alloc_node(array_size, nid);
return section;
}
static int __meminit sparse_index_init(unsigned long section_nr, int nid)
{
unsigned long root = SECTION_NR_TO_ROOT(section_nr);
struct mem_section *section;
if (mem_section[root])
return -EEXIST;
section = sparse_index_alloc(nid);
if (!section)
return -ENOMEM;
mem_section[root] = section;
return 0;
}
#else /* !SPARSEMEM_EXTREME */
static inline int sparse_index_init(unsigned long section_nr, int nid)
{
return 0;
}
#endif
#ifdef CONFIG_SPARSEMEM_EXTREME
int __section_nr(struct mem_section* ms)
{
unsigned long root_nr;
struct mem_section *root = NULL;
for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
if (!root)
continue;
if ((ms >= root) && (ms < (root + SECTIONS_PER_ROOT)))
break;
}
VM_BUG_ON(!root);
return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
}
#else
int __section_nr(struct mem_section* ms)
{
return (int)(ms - mem_section[0]);
}
#endif
/*
* During early boot, before section_mem_map is used for an actual
* mem_map, we use section_mem_map to store the section's NUMA
* node. This keeps us from having to use another data structure. The
* node information is cleared just before we store the real mem_map.
*/
static inline unsigned long sparse_encode_early_nid(int nid)
{
return (nid << SECTION_NID_SHIFT);
}
static inline int sparse_early_nid(struct mem_section *section)
{
return (section->section_mem_map >> SECTION_NID_SHIFT);
}
/* Validate the physical addressing limitations of the model */
void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
unsigned long *end_pfn)
{
unsigned long max_sparsemem_pfn = 1UL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
/*
* Sanity checks - do not allow an architecture to pass
* in larger pfns than the maximum scope of sparsemem:
*/
if (*start_pfn > max_sparsemem_pfn) {
mminit_dprintk(MMINIT_WARNING, "pfnvalidation",
"Start of range %lu -> %lu exceeds SPARSEMEM max %lu\n",
*start_pfn, *end_pfn, max_sparsemem_pfn);
WARN_ON_ONCE(1);
*start_pfn = max_sparsemem_pfn;
*end_pfn = max_sparsemem_pfn;
} else if (*end_pfn > max_sparsemem_pfn) {
mminit_dprintk(MMINIT_WARNING, "pfnvalidation",
"End of range %lu -> %lu exceeds SPARSEMEM max %lu\n",
*start_pfn, *end_pfn, max_sparsemem_pfn);
WARN_ON_ONCE(1);
*end_pfn = max_sparsemem_pfn;
}
}
/*
* There are a number of times that we loop over NR_MEM_SECTIONS,
* looking for section_present() on each. But, when we have very
* large physical address spaces, NR_MEM_SECTIONS can also be
* very large which makes the loops quite long.
*
* Keeping track of this gives us an easy way to break out of
* those loops early.
*/
int __highest_present_section_nr;
static void section_mark_present(struct mem_section *ms)
{
int section_nr = __section_nr(ms);
if (section_nr > __highest_present_section_nr)
__highest_present_section_nr = section_nr;
ms->section_mem_map |= SECTION_MARKED_PRESENT;
}
static inline int next_present_section_nr(int section_nr)
{
do {
section_nr++;
if (present_section_nr(section_nr))
return section_nr;
} while ((section_nr < NR_MEM_SECTIONS) &&
(section_nr <= __highest_present_section_nr));
return -1;
}
#define for_each_present_section_nr(start, section_nr) \
for (section_nr = next_present_section_nr(start-1); \
((section_nr >= 0) && \
(section_nr < NR_MEM_SECTIONS) && \
(section_nr <= __highest_present_section_nr)); \
section_nr = next_present_section_nr(section_nr))
/* Record a memory area against a node. */
void __init memory_present(int nid, unsigned long start, unsigned long end)
{
unsigned long pfn;
#ifdef CONFIG_SPARSEMEM_EXTREME
if (unlikely(!mem_section)) {
unsigned long size, align;
size = sizeof(struct mem_section*) * NR_SECTION_ROOTS;
align = 1 << (INTERNODE_CACHE_SHIFT);
mem_section = memblock_virt_alloc(size, align);
}
#endif
start &= PAGE_SECTION_MASK;
mminit_validate_memmodel_limits(&start, &end);
for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
unsigned long section = pfn_to_section_nr(pfn);
struct mem_section *ms;
sparse_index_init(section, nid);
set_section_nid(section, nid);
ms = __nr_to_section(section);
if (!ms->section_mem_map) {
ms->section_mem_map = sparse_encode_early_nid(nid) |
SECTION_IS_ONLINE;
section_mark_present(ms);
}
}
}
/*
* Subtle, we encode the real pfn into the mem_map such that
* the identity pfn - section_mem_map will return the actual
* physical page frame number.
*/
static unsigned long sparse_encode_mem_map(struct page *mem_map, unsigned long pnum)
{
unsigned long coded_mem_map =
(unsigned long)(mem_map - (section_nr_to_pfn(pnum)));
BUILD_BUG_ON(SECTION_MAP_LAST_BIT > (1UL<<PFN_SECTION_SHIFT));
BUG_ON(coded_mem_map & ~SECTION_MAP_MASK);
return coded_mem_map;
}
/*
* Decode mem_map from the coded memmap
*/
struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
{
/* mask off the extra low bits of information */
coded_mem_map &= SECTION_MAP_MASK;
return ((struct page *)coded_mem_map) + section_nr_to_pfn(pnum);
}
static int __meminit sparse_init_one_section(struct mem_section *ms,
unsigned long pnum, struct page *mem_map,
unsigned long *pageblock_bitmap)
{
if (!present_section(ms))
return -EINVAL;
ms->section_mem_map &= ~SECTION_MAP_MASK;
ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
SECTION_HAS_MEM_MAP;
ms->pageblock_flags = pageblock_bitmap;
return 1;
}
unsigned long usemap_size(void)
{
return BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS) * sizeof(unsigned long);
}
#ifdef CONFIG_MEMORY_HOTPLUG
static unsigned long *__kmalloc_section_usemap(void)
{
return kmalloc(usemap_size(), GFP_KERNEL);
}
#endif /* CONFIG_MEMORY_HOTPLUG */
#ifdef CONFIG_MEMORY_HOTREMOVE
static unsigned long * __init
sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
unsigned long size)
{
unsigned long goal, limit;
unsigned long *p;
int nid;
/*
* A page may contain usemaps for other sections preventing the
* page being freed and making a section unremovable while
* other sections referencing the usemap remain active. Similarly,
* a pgdat can prevent a section being removed. If section A
* contains a pgdat and section B contains the usemap, both
* sections become inter-dependent. This allocates usemaps
* from the same section as the pgdat where possible to avoid
* this problem.
*/
goal = __pa(pgdat) & (PAGE_SECTION_MASK << PAGE_SHIFT);
limit = goal + (1UL << PA_SECTION_SHIFT);
nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
again:
p = memblock_virt_alloc_try_nid_nopanic(size,
SMP_CACHE_BYTES, goal, limit,
nid);
if (!p && limit) {
limit = 0;
goto again;
}
return p;
}
static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
{
unsigned long usemap_snr, pgdat_snr;
static unsigned long old_usemap_snr;
static unsigned long old_pgdat_snr;
struct pglist_data *pgdat = NODE_DATA(nid);
int usemap_nid;
/* First call */
if (!old_usemap_snr) {
old_usemap_snr = NR_MEM_SECTIONS;
old_pgdat_snr = NR_MEM_SECTIONS;
}
usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
if (usemap_snr == pgdat_snr)
return;
if (old_usemap_snr == usemap_snr && old_pgdat_snr == pgdat_snr)
/* skip redundant message */
return;
old_usemap_snr = usemap_snr;
old_pgdat_snr = pgdat_snr;
usemap_nid = sparse_early_nid(__nr_to_section(usemap_snr));
if (usemap_nid != nid) {
pr_info("node %d must be removed before remove section %ld\n",
nid, usemap_snr);
return;
}
/*
* There is a circular dependency.
* Some platforms allow un-removable section because they will just
* gather other removable sections for dynamic partitioning.
* Just notify un-removable section's number here.
*/
pr_info("Section %ld and %ld (node %d) have a circular dependency on usemap and pgdat allocations\n",
usemap_snr, pgdat_snr, nid);
}
#else
static unsigned long * __init
sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
unsigned long size)
{
return memblock_virt_alloc_node_nopanic(size, pgdat->node_id);
}
static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
{
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
static void __init sparse_early_usemaps_alloc_node(void *data,
unsigned long pnum_begin,
unsigned long pnum_end,
unsigned long usemap_count, int nodeid)
{
void *usemap;
unsigned long pnum;
unsigned long **usemap_map = (unsigned long **)data;
int size = usemap_size();
usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nodeid),
size * usemap_count);
if (!usemap) {
pr_warn("%s: allocation failed\n", __func__);
return;
}
for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
if (!present_section_nr(pnum))
continue;
usemap_map[pnum] = usemap;
usemap += size;
check_usemap_section_nr(nodeid, usemap_map[pnum]);
}
}
#ifndef CONFIG_SPARSEMEM_VMEMMAP
struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid,
struct vmem_altmap *altmap)
{
struct page *map;
unsigned long size;
size = PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
map = memblock_virt_alloc_try_nid(size,
PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
BOOTMEM_ALLOC_ACCESSIBLE, nid);
return map;
}
void __init sparse_mem_maps_populate_node(struct page **map_map,
unsigned long pnum_begin,
unsigned long pnum_end,
unsigned long map_count, int nodeid)
{
void *map;
unsigned long pnum;
unsigned long size = sizeof(struct page) * PAGES_PER_SECTION;
size = PAGE_ALIGN(size);
map = memblock_virt_alloc_try_nid_raw(size * map_count,
PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
BOOTMEM_ALLOC_ACCESSIBLE, nodeid);
if (map) {
for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
if (!present_section_nr(pnum))
continue;
map_map[pnum] = map;
map += size;
}
return;
}
/* fallback */
for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
struct mem_section *ms;
if (!present_section_nr(pnum))
continue;
map_map[pnum] = sparse_mem_map_populate(pnum, nodeid, NULL);
if (map_map[pnum])
continue;
ms = __nr_to_section(pnum);
pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
__func__);
ms->section_mem_map = 0;
}
}
#endif /* !CONFIG_SPARSEMEM_VMEMMAP */
#ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
static void __init sparse_early_mem_maps_alloc_node(void *data,
unsigned long pnum_begin,
unsigned long pnum_end,
unsigned long map_count, int nodeid)
{
struct page **map_map = (struct page **)data;
sparse_mem_maps_populate_node(map_map, pnum_begin, pnum_end,
map_count, nodeid);
}
#else
static struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
{
struct page *map;
struct mem_section *ms = __nr_to_section(pnum);
int nid = sparse_early_nid(ms);
map = sparse_mem_map_populate(pnum, nid, NULL);
if (map)
return map;
pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
__func__);
ms->section_mem_map = 0;
return NULL;
}
#endif
void __weak __meminit vmemmap_populate_print_last(void)
{
}
/**
* alloc_usemap_and_memmap - memory alloction for pageblock flags and vmemmap
* @map: usemap_map for pageblock flags or mmap_map for vmemmap
*/
static void __init alloc_usemap_and_memmap(void (*alloc_func)
(void *, unsigned long, unsigned long,
unsigned long, int), void *data)
{
unsigned long pnum;
unsigned long map_count;
int nodeid_begin = 0;
unsigned long pnum_begin = 0;
for_each_present_section_nr(0, pnum) {
struct mem_section *ms;
ms = __nr_to_section(pnum);
nodeid_begin = sparse_early_nid(ms);
pnum_begin = pnum;
break;
}
map_count = 1;
for_each_present_section_nr(pnum_begin + 1, pnum) {
struct mem_section *ms;
int nodeid;
ms = __nr_to_section(pnum);
nodeid = sparse_early_nid(ms);
if (nodeid == nodeid_begin) {
map_count++;
continue;
}
/* ok, we need to take cake of from pnum_begin to pnum - 1*/
alloc_func(data, pnum_begin, pnum,
map_count, nodeid_begin);
/* new start, update count etc*/
nodeid_begin = nodeid;
pnum_begin = pnum;
map_count = 1;
}
/* ok, last chunk */
alloc_func(data, pnum_begin, NR_MEM_SECTIONS,
map_count, nodeid_begin);
}
/*
* Allocate the accumulated non-linear sections, allocate a mem_map
* for each and record the physical to section mapping.
*/
void __init sparse_init(void)
{
unsigned long pnum;
struct page *map;
unsigned long *usemap;
unsigned long **usemap_map;
int size;
#ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
int size2;
struct page **map_map;
#endif
/* see include/linux/mmzone.h 'struct mem_section' definition */
BUILD_BUG_ON(!is_power_of_2(sizeof(struct mem_section)));
/* Setup pageblock_order for HUGETLB_PAGE_SIZE_VARIABLE */
set_pageblock_order();
/*
* map is using big page (aka 2M in x86 64 bit)
* usemap is less one page (aka 24 bytes)
* so alloc 2M (with 2M align) and 24 bytes in turn will
* make next 2M slip to one more 2M later.
* then in big system, the memory will have a lot of holes...
* here try to allocate 2M pages continuously.
*
* powerpc need to call sparse_init_one_section right after each
* sparse_early_mem_map_alloc, so allocate usemap_map at first.
*/
size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
usemap_map = memblock_virt_alloc(size, 0);
if (!usemap_map)
panic("can not allocate usemap_map\n");
alloc_usemap_and_memmap(sparse_early_usemaps_alloc_node,
(void *)usemap_map);
#ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
size2 = sizeof(struct page *) * NR_MEM_SECTIONS;
map_map = memblock_virt_alloc(size2, 0);
if (!map_map)
panic("can not allocate map_map\n");
alloc_usemap_and_memmap(sparse_early_mem_maps_alloc_node,
(void *)map_map);
#endif
for_each_present_section_nr(0, pnum) {
usemap = usemap_map[pnum];
if (!usemap)
continue;
#ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
map = map_map[pnum];
#else
map = sparse_early_mem_map_alloc(pnum);
#endif
if (!map)
continue;
sparse_init_one_section(__nr_to_section(pnum), pnum, map,
usemap);
}
vmemmap_populate_print_last();
#ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
memblock_free_early(__pa(map_map), size2);
#endif
memblock_free_early(__pa(usemap_map), size);
}
#ifdef CONFIG_MEMORY_HOTPLUG
/* Mark all memory sections within the pfn range as online */
void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
unsigned long section_nr = pfn_to_section_nr(pfn);
struct mem_section *ms;
/* onlining code should never touch invalid ranges */
if (WARN_ON(!valid_section_nr(section_nr)))
continue;
ms = __nr_to_section(section_nr);
ms->section_mem_map |= SECTION_IS_ONLINE;
}
}
#ifdef CONFIG_MEMORY_HOTREMOVE
/* Mark all memory sections within the pfn range as online */
void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
unsigned long section_nr = pfn_to_section_nr(start_pfn);
struct mem_section *ms;
/*
* TODO this needs some double checking. Offlining code makes
* sure to check pfn_valid but those checks might be just bogus
*/
if (WARN_ON(!valid_section_nr(section_nr)))
continue;
ms = __nr_to_section(section_nr);
ms->section_mem_map &= ~SECTION_IS_ONLINE;
}
}
#endif
#ifdef CONFIG_SPARSEMEM_VMEMMAP
static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
struct vmem_altmap *altmap)
{
/* This will make the necessary allocations eventually. */
return sparse_mem_map_populate(pnum, nid, altmap);
}
static void __kfree_section_memmap(struct page *memmap,
struct vmem_altmap *altmap)
{
unsigned long start = (unsigned long)memmap;
unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
vmemmap_free(start, end, altmap);
}
#ifdef CONFIG_MEMORY_HOTREMOVE
static void free_map_bootmem(struct page *memmap)
{
unsigned long start = (unsigned long)memmap;
unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
vmemmap_free(start, end, NULL);
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
#else
static struct page *__kmalloc_section_memmap(void)
{
struct page *page, *ret;
unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
if (page)
goto got_map_page;
ret = vmalloc(memmap_size);
if (ret)
goto got_map_ptr;
return NULL;
got_map_page:
ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
got_map_ptr:
return ret;
}
static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
struct vmem_altmap *altmap)
{
return __kmalloc_section_memmap();
}
static void __kfree_section_memmap(struct page *memmap,
struct vmem_altmap *altmap)
{
if (is_vmalloc_addr(memmap))
vfree(memmap);
else
free_pages((unsigned long)memmap,
get_order(sizeof(struct page) * PAGES_PER_SECTION));
}
#ifdef CONFIG_MEMORY_HOTREMOVE
static void free_map_bootmem(struct page *memmap)
{
unsigned long maps_section_nr, removing_section_nr, i;
unsigned long magic, nr_pages;
struct page *page = virt_to_page(memmap);
nr_pages = PAGE_ALIGN(PAGES_PER_SECTION * sizeof(struct page))
>> PAGE_SHIFT;
for (i = 0; i < nr_pages; i++, page++) {
magic = (unsigned long) page->freelist;
BUG_ON(magic == NODE_INFO);
maps_section_nr = pfn_to_section_nr(page_to_pfn(page));
removing_section_nr = page_private(page);
/*
* When this function is called, the removing section is
* logical offlined state. This means all pages are isolated
* from page allocator. If removing section's memmap is placed
* on the same section, it must not be freed.
* If it is freed, page allocator may allocate it which will
* be removed physically soon.
*/
if (maps_section_nr != removing_section_nr)
put_page_bootmem(page);
}
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
#endif /* CONFIG_SPARSEMEM_VMEMMAP */
/*
* returns the number of sections whose mem_maps were properly
* set. If this is <=0, then that means that the passed-in
* map was not consumed and must be freed.
*/
int __meminit sparse_add_one_section(struct pglist_data *pgdat,
unsigned long start_pfn, struct vmem_altmap *altmap)
{
unsigned long section_nr = pfn_to_section_nr(start_pfn);
struct mem_section *ms;
struct page *memmap;
unsigned long *usemap;
unsigned long flags;
int ret;
/*
* no locking for this, because it does its own
* plus, it does a kmalloc
*/
ret = sparse_index_init(section_nr, pgdat->node_id);
if (ret < 0 && ret != -EEXIST)
return ret;
memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, altmap);
if (!memmap)
return -ENOMEM;
usemap = __kmalloc_section_usemap();
if (!usemap) {
__kfree_section_memmap(memmap, altmap);
return -ENOMEM;
}
pgdat_resize_lock(pgdat, &flags);
ms = __pfn_to_section(start_pfn);
if (ms->section_mem_map & SECTION_MARKED_PRESENT) {
ret = -EEXIST;
goto out;
}
memset(memmap, 0, sizeof(struct page) * PAGES_PER_SECTION);
section_mark_present(ms);
ret = sparse_init_one_section(ms, section_nr, memmap, usemap);
out:
pgdat_resize_unlock(pgdat, &flags);
if (ret <= 0) {
kfree(usemap);
__kfree_section_memmap(memmap, altmap);
}
return ret;
}
#ifdef CONFIG_MEMORY_HOTREMOVE
#ifdef CONFIG_MEMORY_FAILURE
static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
{
int i;
if (!memmap)
return;
for (i = 0; i < nr_pages; i++) {
if (PageHWPoison(&memmap[i])) {
atomic_long_sub(1, &num_poisoned_pages);
ClearPageHWPoison(&memmap[i]);
}
}
}
#else
static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
{
}
#endif
static void free_section_usemap(struct page *memmap, unsigned long *usemap,
struct vmem_altmap *altmap)
{
struct page *usemap_page;
if (!usemap)
return;
usemap_page = virt_to_page(usemap);
/*
* Check to see if allocation came from hot-plug-add
*/
if (PageSlab(usemap_page) || PageCompound(usemap_page)) {
kfree(usemap);
if (memmap)
__kfree_section_memmap(memmap, altmap);
return;
}
/*
* The usemap came from bootmem. This is packed with other usemaps
* on the section which has pgdat at boot time. Just keep it as is now.
*/
if (memmap)
free_map_bootmem(memmap);
}
void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
unsigned long map_offset, struct vmem_altmap *altmap)
{
struct page *memmap = NULL;
unsigned long *usemap = NULL, flags;
struct pglist_data *pgdat = zone->zone_pgdat;
pgdat_resize_lock(pgdat, &flags);
if (ms->section_mem_map) {
usemap = ms->pageblock_flags;
memmap = sparse_decode_mem_map(ms->section_mem_map,
__section_nr(ms));
ms->section_mem_map = 0;
ms->pageblock_flags = NULL;
}
pgdat_resize_unlock(pgdat, &flags);
clear_hwpoisoned_pages(memmap + map_offset,
PAGES_PER_SECTION - map_offset);
free_section_usemap(memmap, usemap, altmap);
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
#endif /* CONFIG_MEMORY_HOTPLUG */