mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git
commit 51bb38cb78
If raw_copy_from_user(to, from, N) returns K, callers expect the first
N - K bytes starting at to to have been replaced with the contents of the
corresponding area starting at from, and the last K bytes of the destination
left *unmodified*.

What arch/csky/lib/usercopy.c is doing is broken - it can lead to e.g. data
corruption on write(2). raw_copy_to_user() is inaccurate about its return
value, which is a bug, but the consequences are less drastic than for
raw_copy_from_user().

And just what are those access_ok() calls doing in there? I mean, look into
linux/uaccess.h; that's where we do that check (as well as zero the tail on
failure in the callers that need zeroing).

AFAICS, none of that should be hard to fix; something like the patch below
might make a useful starting point.

I would suggest moving these macros into usercopy.c (they are never used
anywhere else) and possibly expanding them there; if you leave them alive,
please at least rename __copy_user_zeroing(). Again, it must not zero
anything on a failed read.

That said, I'm not sure we wouldn't be better off simply turning usercopy.c
into usercopy.S - all that is left there is a couple of functions, each
consisting only of inline asm.

Guo Ren reply:

Yes, raw_copy_from_user() is wrong; it does not need the zeroing code.

    unsigned long _copy_from_user(void *to, const void __user *from,
                                  unsigned long n)
    {
            unsigned long res = n;

            might_fault();
            if (likely(access_ok(from, n))) {
                    kasan_check_write(to, n);
                    res = raw_copy_from_user(to, from, n);
            }
            if (unlikely(res))
                    memset(to + (n - res), 0, res);
            return res;
    }
    EXPORT_SYMBOL(_copy_from_user);

You are right, and access_ok() should be removed. But how about:

    do {
    ...
            "2:     stw     %3, (%1, 0)     \n"     \
    +       "       subi    %0, 4           \n"     \
            "9:     stw     %4, (%1, 4)     \n"     \
    +       "       subi    %0, 4           \n"     \
            "10:    stw     %5, (%1, 8)     \n"     \
    +       "       subi    %0, 4           \n"     \
            "11:    stw     %6, (%1, 12)    \n"     \
    +       "       subi    %0, 4           \n"     \
            "       addi    %2, 16          \n"     \
            "       addi    %1, 16          \n"     \

That way we don't expand __ex_table.

Al Viro reply:

Hey, I've no idea about instruction scheduling on csky - if that doesn't
slow things down, all the better. It's just that copy_to_user() and friends
are on fairly hot codepaths, and in quite a few situations they will
dominate the speed of e.g. read(2). So I tried to keep the fast path
unchanged. Up to the architecture maintainers, obviously. Which would be
you...

As for the fixup size increase (__ex_table size is unchanged)... you have
each of those macros expanded exactly once. So the size is not a serious
argument, IMO - useless complexity would be, if it is in fact useless; the
size... not really, especially since those extra subi will at least offset
it.

Again, up to you - asm optimizations of (essentially) memcpy()-style loops
are tricky and can depend upon fairly subtle details of the architecture.
So even on something I know reasonably well I would resort to direct
experiments if I couldn't pass the buck to the architecture maintainers.
It *is* worth optimizing - this is where read() from a file that is already
in the page cache spends most of its time, etc.

Guo Ren reply:

Thanks; after fixing up a typo ("sub %0, 4"), the patch is applied.

TODO:
 - the copy to/from user code still needs optimizing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
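To make the contract and the counter bookkeeping concrete, here is a minimal
user-space sketch. It is an illustration, not the csky kernel code: the
names copy_words() and would_fault() are hypothetical, with would_fault()
standing in for a real page fault on the source. The point is the
accounting: because the residual count is decremented after every word that
has actually been stored (the per-word "subi %0, 4" above), a fault leaves
the count exact and the destination tail untouched, with no zeroing on a
failed read.

    /*
     * Illustrative sketch only, not kernel code.  copy_words() mimics a
     * raw_copy_from_user()-style loop: on a (simulated) fault it returns
     * the exact number of bytes not copied and leaves the destination
     * tail unmodified.
     */
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical stand-in for a page fault on the source. */
    static int would_fault(const unsigned char *p) { (void)p; return 0; }

    static size_t copy_words(void *to, const void *from, size_t n)
    {
            unsigned char *dst = to;
            const unsigned char *src = from;
            size_t remaining = n;

            while (remaining >= 4) {            /* word-at-a-time fast path */
                    if (would_fault(src))
                            return remaining;   /* exact residual, no zeroing */
                    memcpy(dst, src, 4);
                    dst += 4;
                    src += 4;
                    remaining -= 4;             /* the per-word "subi %0, 4" */
            }
            while (remaining) {                 /* byte tail */
                    if (would_fault(src))
                            return remaining;
                    *dst++ = *src++;
                    remaining--;
            }
            return 0;                           /* everything copied */
    }

    int main(void)
    {
            unsigned char src[16] = "source data....";
            unsigned char dst[16];

            memset(dst, 0xAA, sizeof(dst));     /* sentinel for the tail */
            size_t left = copy_words(dst, src, sizeof(src));

            /* Had copy_words() stopped early, the last "left" bytes of dst
             * would still hold the 0xAA sentinel. */
            assert(left == 0);
            assert(memcmp(dst, src, sizeof(src) - left) == 0);
            return 0;
    }

Because the counter is kept current per stored word, a fixup handler does
not have to reconstruct how far the loop got, which is why the extra subi
instructions make the return value accurate without growing __ex_table.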
addrspace.h
asid.h
atomic.h
barrier.h
bitops.h
bug.h
cache.h
cacheflush.h
checksum.h
cmpxchg.h
elf.h
fixmap.h
ftrace.h
highmem.h
io.h
irqflags.h
Kbuild
kprobes.h
memory.h
mmu_context.h
mmu.h
page.h
pci.h
perf_event.h
pgalloc.h
pgtable.h
probes.h
processor.h
ptrace.h
reg_ops.h
segment.h
shmparam.h
smp.h
spinlock_types.h
spinlock.h
stackprotector.h
string.h
switch_to.h
syscall.h
syscalls.h
tcm.h
thread_info.h
tlb.h
tlbflush.h
traps.h
uaccess.h
unistd.h
uprobes.h
vdso.h
vmalloc.h