linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 00:40:51 +07:00

Lukasz Luba 736b78c748 PM: EM: Increase energy calculation precision

[ Upstream commit 7fcc17d0cb12938d2b3507973a6f93fc9ed2c7a1 ]

The Energy Model (EM) provides useful information about device power in each performance state to other subsystems, such as the Energy Aware Scheduler (EAS). The energy calculation in EAS performs arithmetic based on the EM function em_cpu_energy(). The current implementation of that function uses em_perf_state::cost as a pre-computed cost coefficient equal to:

    cost = power * max_frequency / frequency

The 'power' is expressed in milli-Watts (or in an abstract scale).

There are corner cases in which the EAS energy calculation for two Performance Domains (PDs) returns the same value. EAS compares these values to choose the smaller one, and they can end up equal because of rounding error. In such a scenario we need better resolution, e.g. 1000 times better. To make this possible, increase the resolution of em_perf_state::cost on 64-bit architectures. The cost of increasing the resolution on 32-bit is pretty high (64-bit division) and is not justified, since no new 32-bit big.LITTLE EAS systems are expected that would benefit from the higher resolution.

This patch avoids the rounding-to-milli-Watt errors which might occur in the EAS energy estimation for each PD. The rounding error is common for small tasks which have a small utilization value.

There are two places in the code where it makes a difference:

1. In find_energy_efficient_cpu(), where we are searching for best_delta. We might suffer there when two PDs return the same result, as in the example below.

Scenario: a lightly utilized system, e.g. ~200 sum_util for PD0 and ~220 for PD1. There are quite a few small tasks of ~10-15 util. These tasks would suffer from the rounding error. These utilization values are typical when running games on Android. One of our partners has reported 5..10mA less battery drain when running with increased resolution.

Some details:
We have two PDs: PD0 (big) and PD1 (little).
Let's compare w/o patch set ('old') and w/ patch set ('new').
We are comparing energy w/ task and w/o task placed in the PDs.

a) 'old' w/o patch set, PD0
    task_util = 13
    cost = 480
    sum_util_w/o_task = 215
    sum_util_w_task = 228
    scale_cpu = 1024
    energy_w/o_task = 480 * 215 / 1024 = 100.78 => 100
    energy_w_task = 480 * 228 / 1024 = 106.87 => 106
    energy_diff = 106 - 100 = 6
    (this is equal to 'old' PD1's energy_diff in 'c)')

b) 'new' w/ patch set, PD0
    task_util = 13
    cost = 480 * 1000 = 480000
    sum_util_w/o_task = 215
    sum_util_w_task = 228
    scale_cpu = 1024
    energy_w/o_task = 480000 * 215 / 1024 = 100781
    energy_w_task = 480000 * 228 / 1024 = 106875
    energy_diff = 106875 - 100781 = 6094
    (this is not equal to 'new' PD1's energy_diff in 'd)')

c) 'old' w/o patch set, PD1
    task_util = 13
    cost = 160
    sum_util_w/o_task = 283
    sum_util_w_task = 296
    scale_cpu = 355
    energy_w/o_task = 160 * 283 / 355 = 127.55 => 127
    energy_w_task = 160 * 296 / 355 = 133.41 => 133
    energy_diff = 133 - 127 = 6
    (this is equal to 'old' PD0's energy_diff in 'a)')

d) 'new' w/ patch set, PD1
    task_util = 13
    cost = 160 * 1000 = 160000
    sum_util_w/o_task = 283
    sum_util_w_task = 296
    scale_cpu = 355
    energy_w/o_task = 160000 * 283 / 355 = 127549
    energy_w_task = 160000 * 296 / 355 = 133408
    energy_diff = 133408 - 127549 = 5859
    (this is not equal to 'new' PD0's energy_diff in 'b)')

2. In the 6% energy margin filter at the end of find_energy_efficient_cpu(). With this patch the margin comparison also has better resolution, so better task placement becomes possible.

Fixes: `27871f7a8a` ("PM: Introduce an Energy Model management framework")
Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

2024-07-05 19:11:38 +02:00
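The rounding effect described in the commit message can be reproduced with plain integer arithmetic. The following is a minimal sketch, not the kernel's em_cpu_energy(): the helper names pd_energy() and energy_delta() are hypothetical, and the formula energy = cost * sum_util / scale_cpu is taken from the worked example in the message, where sum_util_w_task is sum_util_w/o_task plus task_util.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): integer energy estimate
 * for one performance domain, truncating like the worked example does.
 */
uint64_t pd_energy(uint64_t cost, uint64_t sum_util, uint64_t scale_cpu)
{
	assert(scale_cpu > 0);
	return cost * sum_util / scale_cpu;
}

/*
 * Hypothetical helper: energy difference between placing and not placing
 * a task with utilization task_util in the performance domain.
 */
uint64_t energy_delta(uint64_t cost, uint64_t util_without_task,
		      uint64_t task_util, uint64_t scale_cpu)
{
	uint64_t with_task = pd_energy(cost, util_without_task + task_util,
				       scale_cpu);
	uint64_t without_task = pd_energy(cost, util_without_task, scale_cpu);

	return with_task - without_task;
}
```

With the numbers from the message, energy_delta(480, 215, 13, 1024) and energy_delta(160, 283, 13, 355) both yield 6, so the two PDs tie. Multiplying each cost by 1000 gives 6094 for PD0 and 5859 for PD1, which lets the comparison pick PD1.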
..
acpi	ACPI: fix NULL pointer dereference	2024-07-05 18:07:39 +02:00
asm-generic	vmlinux.lds.h: Handle clang's module.{c,d}tor sections	2024-07-05 18:55:16 +02:00
clocksource	clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG	2021-07-14 16:56:12 +02:00
crypto	crypto: shash - avoid comparing pointers to exported functions under CFI	2021-07-14 16:55:54 +02:00
drm	drm: Return -ENOTTY for non-drm ioctls	2021-07-28 14:35:47 +02:00
dt-bindings	init: add dsm gpl source	2024-07-05 18:00:04 +02:00
keys	certs: Add EFI_CERT_X509_GUID support for dbx entries	2021-06-30 08:47:30 -04:00
kunit	kunit: fix display of failed expectations for strings	2020-11-10 13:45:15 -07:00
kvm	ARM:	2020-10-23 11:17:56 -07:00
linux	PM: EM: Increase energy calculation precision	2024-07-05 19:11:38 +02:00
math-emu
media	media: subdev: disallow ioctl for saa6588/davinci	2021-07-19 09:45:02 +02:00
memory
misc
net	Revert "flow_offload: action should not be NULL when it is referenced"	2024-07-05 18:55:48 +02:00
pcmcia
ras	mm,hwpoison: introduce MF_MSG_UNSPLIT_THP	2020-10-16 11:11:17 -07:00
rdma	RDMA: Lift ibdev_to_node from rds to common code	2021-02-26 10:12:59 +01:00
scsi	init: add dsm gpl source	2024-07-05 18:00:04 +02:00
soc	init: add dsm gpl source	2024-07-05 18:00:04 +02:00
sound	ALSA: hda: intel-nhlt: verify config type	2021-03-09 11:11:14 +01:00
target	scsi: target: core: Add cmd length set before cmd complete	2021-03-17 17:06:25 +01:00
trace	init: add dsm gpl source	2024-07-05 18:00:04 +02:00
uapi	bpf: Fix a typo of reuseport map in bpf.h.	2024-07-05 19:10:44 +02:00
vdso
video	gpu: ipu-v3: remove unused functions	2020-10-26 10:42:38 +01:00
xen	Xen/gntdev: correct error checking in gntdev_map_grant_pages()	2021-02-23 15:53:24 +01:00