linux_dsm_epyc7002/kernel/sched
Ming Lei 11ea68f553 genirq, sched/isolation: Isolate from handling managed interrupts
The affinity of managed interrupts is completely handled in the kernel and
cannot be changed via the /proc/irq/* interfaces from user space. As the
kernel tries to spread out interrupts evenly accross CPUs on x86 to prevent
vector exhaustion, it can happen that a managed interrupt whose affinity
mask contains both isolated and housekeeping CPUs is routed to an isolated
CPU. As a consequence IO submitted on a housekeeping CPU causes interrupts
on the isolated CPU.

Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding
logic in the interrupt affinity selection code.

The subparameter indicates to the interrupt affinity selection logic that
it should try to avoid the above scenario.

This isolation is best effort and only effective if the automatically
assigned interrupt mask of a device queue contains isolated and
housekeeping CPUs. If housekeeping CPUs are online then such interrupts are
directed to the housekeeping CPU so that IO submitted on the housekeeping
CPU cannot disturb the isolated CPU.

If a queue's affinity mask contains only isolated CPUs then this parameter
has no effect on the interrupt routing decision, though interrupts are only
happening when tasks running on those isolated CPUs submit IO. IO submitted
on housekeeping CPUs has no influence on those queues.

If the affinity mask contains both housekeeping and isolated CPUs, but none
of the contained housekeeping CPUs is online, then the interrupt is also
routed to an isolated CPU. Interrupts are only delivered when one of the
isolated CPUs in the affinity mask submits IO. If one of the contained
housekeeping CPUs comes online, the CPU hotplug logic migrates the
interrupt automatically back to the upcoming housekeeping CPU. Depending on
the type of interrupt controller, this can require that at least one
interrupt is delivered to the isolated CPU in order to complete the
migration.

[ tglx: Removed unused parameter, added and edited comments/documentation
  	and rephrased the changelog so it contains more details. ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
2020-01-22 16:29:49 +01:00
..
autogroup.c sched/autogroup: Make autogroup_path() always available 2019-06-24 19:23:40 +02:00
autogroup.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
clock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
completion.c sched/Documentation: Update wake_up() & co. memory-barrier guarantees 2018-07-17 09:30:34 +02:00
core.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-11-26 16:02:40 -08:00
cpuacct.c sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cpudeadline.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpudeadline.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cpufreq_schedutil.c cpufreq: Avoid leaving stale IRQ work items during CPU offline 2019-12-12 17:59:43 +01:00
cpufreq.c cpufreq: Avoid leaving stale IRQ work items during CPU offline 2019-12-12 17:59:43 +01:00
cpupri.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpupri.h sched/headers: Simplify and clean up header usage in the scheduler 2018-03-04 12:39:29 +01:00
cputime.c sched/vtime: Bring up complete kcpustat accessor 2019-11-21 07:33:24 +01:00
deadline.c sched/core: Further clarify sched_class::set_next_task() 2019-11-11 08:35:21 +01:00
debug.c Linux 5.2-rc6 2019-06-24 19:19:53 +02:00
fair.c sched/cfs: fix spurious active migration 2019-12-17 13:32:48 +01:00
features.h sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases 2019-10-29 10:01:07 +01:00
idle.c Power management updates for 5.5-rc1 2019-11-26 19:06:44 -08:00
isolation.c genirq, sched/isolation: Isolate from handling managed interrupts 2020-01-22 16:29:49 +01:00
loadavg.c sched: loadavg: make calc_load_n() public 2018-10-26 16:26:32 -07:00
Makefile psi: pressure stall information for CPU, memory, and IO 2018-10-26 16:26:32 -07:00
membarrier.c membarrier: Fix RCU locking bug caused by faulty merge 2019-10-01 21:27:50 +02:00
pelt.c sched/debug: Add new tracepoint to track PELT at se level 2019-06-24 19:23:42 +02:00
pelt.h sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity() 2019-06-24 19:23:39 +02:00
psi.c psi: Fix a division error in psi poll() 2019-12-17 13:32:48 +01:00
rt.c sched/core: Further clarify sched_class::set_next_task() 2019-11-11 08:35:21 +01:00
sched-pelt.h sched/fair: Fix "runnable_avg_yN_inv" not used warnings 2019-06-17 12:15:58 +02:00
sched.h sched/uclamp: Fix overzealous type replacement 2019-11-17 10:46:05 +01:00
stats.c proc: introduce proc_create_seq{,_data} 2018-05-16 07:23:35 +02:00
stats.h sched/stats: Fix unlikely() use of sched_info_on() 2019-07-25 15:51:55 +02:00
stop_task.c sched/core: Further clarify sched_class::set_next_task() 2019-11-11 08:35:21 +01:00
swait.c kernel/sched/: remove caller signal_pending branch predictions 2019-01-04 13:13:48 -08:00
topology.c Linux 5.4-rc7 2019-11-11 08:34:59 +01:00
wait_bit.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
wait.c Add wake_up_interruptible_sync_poll_locked() 2019-10-31 15:12:23 +00:00