mlx4_core: allocate ICM memory in page size chunks

When a system is under memory pressure (high usage with fragmentation),
the original 256KB ICM chunk allocations will likely push kernel memory
management into the slow path, doing memory compaction/migration in
order to complete high order memory allocations.
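
To make "high order" concrete: the page allocator hands out blocks of
2^order contiguous pages. The sketch below is a userspace illustration
only; alloc_order() is a hypothetical helper (not a kernel function)
that computes roughly what the kernel's get_order() would, and a 4KB
page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest order such that
 * (page_size << order) >= size.
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(size > 0 && page_size > 0);
	while (span < size) {
		span <<= 1;	/* double the block: one order higher */
		order++;
	}
	return order;
}
```

With these assumptions, alloc_order(1 << 18, 4096) is 6, i.e. each
256KB chunk needs 64 physically contiguous pages, while
alloc_order(4096, 4096) is 0, a single page.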

When that happens, user processes calling uverb APIs can easily get
stuck for more than 120s, even though plenty of free pages are
available in smaller chunks in the system.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...

With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.

However in order to support smaller ICM chunk size, we need to fix
another issue in large size kcalloc allocations.

E.g.
Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for a table->icm pointer array to
hold 2M pointers which can easily cause kcalloc to fail.
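
The arithmetic above can be checked with a small userspace sketch.
icm_ptr_array_bytes() is a hypothetical helper written for this
illustration, and 8-byte pointers (a 64-bit build) are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for the table->icm pointer array
 * when 2^log_num_mtt entries are split into chunks of chunk_size bytes.
 */
static uint64_t icm_ptr_array_bytes(unsigned int log_num_mtt,
				    uint64_t chunk_size,
				    uint64_t mtt_entry_size,
				    uint64_t ptr_size)
{
	uint64_t entries, per_chunk, chunks;

	assert(log_num_mtt < 64 && mtt_entry_size > 0);
	entries = 1ULL << log_num_mtt;
	per_chunk = chunk_size / mtt_entry_size;
	assert(per_chunk > 0);

	/* Round up so a partially filled last chunk is still counted. */
	chunks = (entries + per_chunk - 1) / per_chunk;
	return chunks * ptr_size;
}
```

icm_ptr_array_bytes(30, 4096, 8, 8) gives 16777216 (16MB, 2M
pointers). With the old 256KB chunk size the same table needs only
32768 pointers, so icm_ptr_array_bytes(30, 1 << 18, 8, 8) gives
262144 (256KB).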

The solution is to replace kcalloc with kvzalloc, which falls back to
vmalloc automatically if kmalloc fails.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Acked-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qing Huang 2018-05-23 16:22:46 -07:00 committed by David S. Miller
parent 322eaa06d5
commit 1383cb8103

@@ -43,12 +43,12 @@
 #include "fw.h"
 /*
- * We allocate in as big chunks as we can, up to a maximum of 256 KB
- * per chunk.
+ * We allocate in page size (default 4KB on many archs) chunks to avoid high
+ * order memory allocations in fragmented/high usage memory situation.
  */
 enum {
-	MLX4_ICM_ALLOC_SIZE = 1 << 18,
-	MLX4_TABLE_CHUNK_SIZE = 1 << 18
+	MLX4_ICM_ALLOC_SIZE = PAGE_SIZE,
+	MLX4_TABLE_CHUNK_SIZE = PAGE_SIZE,
 };
 static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -398,9 +398,11 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 	u64 size;
 	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
+	if (WARN_ON(!obj_per_chunk))
+		return -EINVAL;
 	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
-	table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
+	table->icm = kvzalloc(num_icm * sizeof(*table->icm), GFP_KERNEL);
 	if (!table->icm)
 		return -ENOMEM;
 	table->virt = virt;
@@ -446,7 +448,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 			mlx4_free_icm(dev, table->icm[i], use_coherent);
 		}
-	kfree(table->icm);
+	kvfree(table->icm);
 	return -ENOMEM;
 }
@@ -462,5 +464,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
 			mlx4_free_icm(dev, table->icm[i], table->coherent);
 		}
-	kfree(table->icm);
+	kvfree(table->icm);
 }
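
The new WARN_ON() check matters because a page-sized chunk can be
smaller than one object: the integer division then yields 0 and the
rounding division that follows would divide by zero. A minimal
userspace model of that guard is sketched below. icm_num_chunks() is
a hypothetical name, and the obj_size == 0 check is an extra
assumption of this sketch, not something the patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the chunk-count computation in
 * mlx4_init_icm_table(). Returns 0 and stores the count in *num_icm,
 * or -EINVAL if not even one object fits in a chunk.
 */
static int icm_num_chunks(uint64_t nobj, uint64_t chunk_size,
			  uint64_t obj_size, uint64_t *num_icm)
{
	uint64_t obj_per_chunk;

	assert(num_icm != NULL);
	if (obj_size == 0)	/* sketch-only guard, see note above */
		return -EINVAL;

	obj_per_chunk = chunk_size / obj_size;
	if (obj_per_chunk == 0)	/* object larger than a chunk */
		return -EINVAL;

	*num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
	return 0;
}
```

For example, with a 4096-byte chunk an 8192-byte object is rejected
with -EINVAL, while 1000 objects of 64 bytes each need 16 chunks
(64 objects per chunk, rounded up).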