linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 14:10:49 +07:00


Daniel Borkmann 5fc6ed1831 bpf: Fix leakage under speculation on mispredicted branches

[ Upstream commit 9183671af6dbf60a1219371d4ed73e23f43b49db ]

The verifier only enumerates valid control-flow paths and skips paths that
are unreachable in the non-speculative domain. And so it can miss issues
under speculative execution on mispredicted branches.

For example, a type confusion has been demonstrated with the following
crafted program:

  // r0 = pointer to a map array entry
  // r6 = pointer to readable stack slot
  // r9 = scalar controlled by attacker
  1: r0 = *(u64 *)(r0) // cache miss
  2: if r0 != 0x0 goto line 4
  3: r6 = r9
  4: if r0 != 0x1 goto line 6
  5: r9 = *(u8 *)(r6)
  6: // leak r9

Since line 3 runs iff r0 == 0 and line 5 runs iff r0 == 1, the verifier
concludes that the pointer dereference on line 5 is safe. But: if the
attacker trains both the branches to fall-through, such that the following
is speculatively executed ...

  r6 = r9
  r9 = *(u8 *)(r6)
  // leak r9

... then the program will dereference an attacker-controlled value and could
leak its content under speculative execution via side-channel. This requires
to mistrain the branch predictor, which can be rather tricky, because the
branches are mutually exclusive. However such training can be done at
congruent addresses in user space using different branches that are not
mutually exclusive. That is, by training branches in user space ...

  A:  if r0 != 0x0 goto line C
  B:  ...
  C:  if r0 != 0x0 goto line D
  D:  ...

... such that addresses A and C collide to the same CPU branch prediction
entries in the PHT (pattern history table) as those of the BPF program's
lines 2 and 4, respectively. A non-privileged attacker could simply brute
force such collisions in the PHT until observing the attack succeeding.

Alternative methods to mistrain the branch predictor are also possible that
avoid brute forcing the collisions in the PHT. A reliable attack has been
demonstrated, for example, using the following crafted program:

  // r0 = pointer to a [control] map array entry
  // r7 = *(u64 *)(r0 + 0), training/attack phase
  // r8 = *(u64 *)(r0 + 8), oob address
  // [...]
  // r0 = pointer to a [data] map array entry
  1: if r7 == 0x3 goto line 3
  2: r8 = r0
  // crafted sequence of conditional jumps to separate the conditional
  // branch in line 193 from the current execution flow
  3: if r0 != 0x0 goto line 5
  4: if r0 == 0x0 goto exit
  5: if r0 != 0x0 goto line 7
  6: if r0 == 0x0 goto exit
  [...]
  187: if r0 != 0x0 goto line 189
  188: if r0 == 0x0 goto exit
  // load any slowly-loaded value (due to cache miss in phase 3) ...
  189: r3 = *(u64 *)(r0 + 0x1200)
  // ... and turn it into known zero for verifier, while preserving slowly-
  // loaded dependency when executing:
  190: r3 &= 1
  191: r3 &= 2
  // speculatively bypassed phase dependency
  192: r7 += r3
  193: if r7 == 0x3 goto exit
  194: r4 = *(u8 *)(r8 + 0)
  // leak r4

As can be seen, in training phase (phase != 0x3), the condition in line 1
turns into false and therefore r8 with the oob address is overridden with
the valid map value address, which in line 194 we can read out without
issues. However, in attack phase, line 2 is skipped, and due to the cache
miss in line 189 where the map value is (zeroed and later) added to the
phase register, the condition in line 193 takes the fall-through path due
to prior branch predictor training, where under speculation, it'll load the
byte at oob address r8 (unknown scalar type at that point) which could then
be leaked via side-channel.

One way to mitigate these is to 'branch off' an unreachable path, meaning,
the current verification path keeps following the is_branch_taken() path
and we push the other branch to the verification stack. Given this is
unreachable from the non-speculative domain, this branch's vstate is
explicitly marked as speculative. This is needed for two reasons: i) if
this path is solely seen from speculative execution, then we later on still
want the dead code elimination to kick in in order to sanitize these
instructions with jmp-1s, and ii) to ensure that paths walked in the
non-speculative domain are not pruned from earlier walks of paths walked in
the speculative domain. Additionally, for robustness, we mark the registers
which have been part of the conditional as unknown in the speculative path
given there should be no assumptions made on their content.

The fix in here mitigates type confusion attacks described earlier due to
i) all code paths in the BPF program being explored and ii) existing
verifier logic already ensuring that given memory access instruction
references one specific data structure.

An alternative to this fix that has also been looked at in this scope was
to mark aux->alu_state at the jump instruction with a BPF_JMP_TAKEN state
as well as direction encoding (always-goto, always-fallthrough, unknown),
such that mixing of different always-* directions themselves as well as
mixing of always-* with unknown directions would cause a program rejection
by the verifier, e.g. programs with constructs like 'if ([...]) { x = 0; }
else { x = 1; }' with subsequent 'if (x == 1) { [...] }'. For unprivileged,
this would result in only single direction always-* taken paths, and
unknown taken paths being allowed, such that the former could be patched
from a conditional jump to an unconditional jump (ja). Compared to this
approach here, it would have two downsides: i) valid programs that
otherwise are not performing any pointer arithmetic, etc, would potentially
be rejected/broken, and ii) we are required to turn off path pruning for
unprivileged, where both can be avoided in this work through pushing the
invalid branch to the verification stack.

The issue was originally discovered by Adam and Ofek, and later
independently discovered and reported as a result of Benedict and Piotr's
research work.

Fixes: `b2157399cc` ("bpf: prevent out-of-bounds speculation")
Reported-by: Adam Morrison <mad@cs.tau.ac.il>
Reported-by: Ofek Kirzner <ofekkir@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

2021-06-23 14:42:45 +02:00
arch	kvm: LAPIC: Restore guard to prevent illegal APIC register access	2021-06-23 14:42:41 +02:00
block	blk-mq: Swap two calls in blk_mq_exit_queue()	2021-05-19 10:13:14 +02:00
certs	certs: Fix blacklist flag type confusion	2021-03-04 11:37:59 +01:00
crypto	async_xor: check src_offs is not NULL before updating it	2021-06-16 12:01:40 +02:00
Documentation	ASoC: meson: gx-card: fix sound-dai dt schema	2021-06-16 12:01:45 +02:00
drivers	cxgb4: fix wrong ethtool n-tuple rule lookup	2021-06-23 14:42:45 +02:00
fs	fanotify: fix copy_event_to_user() fid error clean up	2021-06-23 14:42:41 +02:00
include	net: make get_net_ns return error if NET_NS is disabled	2021-06-23 14:42:44 +02:00
init	pid: take a reference when initializing `cad_pid`	2021-06-10 13:39:26 +02:00
ipc	ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry	2021-05-26 12:06:54 +02:00
kernel	bpf: Fix leakage under speculation on mispredicted branches	2021-06-23 14:42:45 +02:00
lib	lib/lz4: explicitly support in-place decompression	2021-06-10 13:39:29 +02:00
LICENSES	LICENSES/deprecated: add Zlib license text	2020-09-16 14:33:49 +02:00
mm	mm/memory-failure: make sure wait for page writeback in memory_failure	2021-06-23 14:42:40 +02:00
net	net: qrtr: fix OOB Read in qrtr_endpoint_post	2021-06-23 14:42:45 +02:00
samples	samples: vfio-mdev: fix error handing in mdpy_fb_probe()	2021-06-10 13:39:15 +02:00
scripts	scripts/clang-tools: switch explicitly to Python 3	2021-06-03 09:00:52 +02:00
security	KEYS: trusted: Fix memory leak on object td	2021-05-19 10:12:50 +02:00
sound	ASoC: core: Fix Null-point-dereference in fmt_single_name()	2021-06-16 12:01:45 +02:00
tools	ipv4: Fix device used for dst_alloc with local routes	2021-06-23 14:42:45 +02:00
usr	Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2020-08-07 13:29:39 -07:00
virt	Revert "irqbypass: do not start cons/prod when failed connect"	2021-06-03 09:00:34 +02:00
.clang-format	RDMA 5.10 pull request	2020-10-17 11:18:18 -07:00
.cocciconfig	scripts: add Linux .cocciconfig for coccinelle	2016-07-22 12:13:39 +02:00
.get_maintainer.ignore	Opt out of scripts/get_maintainer.pl	2019-05-16 10:53:40 -07:00
.gitattributes	.gitattributes: use 'dts' diff driver for dts files	2019-12-04 19:44:11 -08:00
.gitignore	kbuild: generate Module.symvers only when vmlinux exists	2021-05-19 10:12:59 +02:00
.mailmap	mailmap: add two more addresses of Uwe Kleine-König	2020-12-06 10:19:07 -08:00
COPYING	COPYING: state that all contributions really are covered by this file	2020-02-10 13:32:20 -08:00
CREDITS	MAINTAINERS: Move Jason Cooper to CREDITS	2020-11-30 10:20:34 +01:00
Kbuild	kbuild: rename hostprogs-y/always to hostprogs/always-y	2020-02-04 01:53:07 +09:00
Kconfig	kbuild: ensure full rebuild when the compiler is updated	2020-05-12 13:28:33 +09:00
MAINTAINERS	f2fs: move ioctl interface definitions to separated file	2021-05-19 10:13:00 +02:00
Makefile	Linux 5.10.45	2021-06-18 10:00:06 +02:00
README	Drop all 00-INDEX files from Documentation/	2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.