Pull sparc fixes from David Miller:
- Fix symbol version generation for assembler on sparc, from
Nagarathnam Muthusamy.
- Fix compound page handling in gup_huge_pmd(), from Nitin Gupta.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix gup_huge_pmd
Adding the type of exported symbols
sed regex in Makefile.build requires line break between exported symbols
Adding asm-prototypes.h for genksyms to generate crc
Merge tag 'kbuild-thinar-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild thin archives updates from Masahiro Yamada:
"Thin archives migration by Nicholas Piggin.
THIN_ARCHIVES has been available for a while as an optional feature
only for the PowerPC architecture, but we do not need two different
intermediate-artifact schemes.
Using thin archives instead of conventional incremental linking has
various advantages:
- save disk space for builds
- speed-up building a little
- fix some link issues (for example, allyesconfig on ARM) due to more
flexibility for the final linking
- work better with dead code elimination we are planning
As discussed before, this migration has been done unconditionally so
that any problems caused by this will show up with "git bisect".
Testing with 0-day and linux-next actually turned up problems on some
architectures, but they were trivial and have all been fixed now"
* tag 'kbuild-thinar-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
tile: remove unneeded extra-y in Makefile
kbuild: thin archives make default for all archs
x86/um: thin archives build fix
tile: thin archives fix linking
ia64: thin archives fix linking
sh: thin archives fix linking
kbuild: handle libs-y archives separately from built-in.o archives
kbuild: thin archives use P option to ar
kbuild: thin archives final link close --whole-archives option
ia64: remove unneeded extra-y in Makefile.gate
tile: fix dependency and .*.cmd inclusion for incremental build
sparc64: Use indirect calls in hamming weight stubs
Otherwise, depending upon link order, the branch relocation
limits could be exceeded.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
A missing symbol type for a few functions prevents genksyms from generating
symbol versions for those functions. This patch fixes them.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following sed regex in Makefile.build matches only one ___EXPORT_SYMBOL per line:
sed
's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
The ATOMIC_OPS macro in atomic_64.S expands multiple exported symbols on the
same line, so version generation is done only for the last matched symbol.
This patch adds a line break between the symbol expansions.
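For illustration only (the symbol names and arguments below are hypothetical,
not the exact ATOMIC_OPS output), consider an expansion that places two
exports on a single line:

___EXPORT_SYMBOL atomic_add, ; ___EXPORT_SYMBOL atomic_sub, ;

The leading .* in the sed expression is greedy, so the substitution emits
only EXPORT_SYMBOL(atomic_sub); and no CRC is generated for atomic_add.
With a line break between the two expansions, each export is matched and
versioned separately.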
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When any of the functions contained in NGbzero.S and GENbzero.S
vectors through *bzero_from_clear_user, we may end up taking a
fault when executing one of the store alternate address space
instructions. If this happens, the exception handler does not
restore the %asi register.
This commit fixes the issue by introducing a new exception
handler that ensures the %asi register is restored when
a fault is handled.
Orabug: 25577560
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull uaccess unification updates from Al Viro:
"This is the uaccess unification pile. It's _not_ the end of uaccess
work, but the next batch of that will go into the next cycle. This one
mostly takes copy_from_user() and friends out of arch/* and gets the
zero-padding behaviour in sync for all architectures.
Dealing with the nocache/writethrough mess is for the next cycle;
fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
sold on access_ok() in there, BTW; just not in this pile), same for
reducing __copy_... callsites, strn*... stuff, etc. - there will be a
pile about as large as this one in the next merge window.
This one sat in -next for weeks. -3KLoC"
* 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
HAVE_ARCH_HARDENED_USERCOPY is unconditional now
CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
m32r: switch to RAW_COPY_USER
hexagon: switch to RAW_COPY_USER
microblaze: switch to RAW_COPY_USER
get rid of padding, switch to RAW_COPY_USER
ia64: get rid of copy_in_user()
ia64: sanitize __access_ok()
ia64: get rid of 'segment' argument of __do_{get,put}_user()
ia64: get rid of 'segment' argument of __{get,put}_user_check()
ia64: add extable.h
powerpc: get rid of zeroing, switch to RAW_COPY_USER
esas2r: don't open-code memdup_user()
alpha: fix stack smashing in old_adjtimex(2)
don't open-code kernel_setsockopt()
mips: switch to RAW_COPY_USER
mips: get rid of tail-zeroing in primitives
mips: make copy_from_user() zero tail explicitly
mips: clean and reorder the forest of macros...
mips: consolidate __invoke_... wrappers
...
Avoid unintended DCTI couples; the use of DCTI couples is deprecated.
Also address the "Programming Note" for optimal performance.
Here is the complete text from Oracle SPARC Architecture Specs.
6.3.4.7 DCTI Couples
"A delayed control transfer instruction (DCTI) in the delay slot of
another DCTI is referred to as a “DCTI couple”. The use of DCTI couples
is deprecated in the Oracle SPARC Architecture; no new software should
place a DCTI in the delay slot of another DCTI, because on future Oracle
SPARC Architecture implementations DCTI couples may execute either
slowly or differently than the programmer assumes it will.
SPARC V8 and SPARC V9 Compatibility Note
The SPARC V8 architecture left behavior undefined for a DCTI couple. The
SPARC V9 architecture defined behavior in that case, but as of
UltraSPARC Architecture 2005, use of DCTI couples was deprecated.
Software should not expect high performance from DCTI couples, and
performance of DCTI couples should be expected to decline further in
future processors.
Programming Note
As noted in TABLE 6-5 on page 115, an annulled branch-always
(branch-always with a = 1) instruction is not architecturally a DCTI.
However, since not all implementations make that distinction, for
optimal performance, a DCTI should not be placed in the instruction word
immediately following an annulled branch-always instruction (BA,A or
BPA,A)."
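For reference, a DCTI couple is simply a delayed control transfer sitting in
the delay slot of another one; a hypothetical two-instruction fragment (the
labels are made up) looks like:

        ba      label_a         ! DCTI
         ba     label_b         ! DCTI in the delay slot: a "DCTI couple"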
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all of the user copy routines are converted to return
accurate residual lengths when an exception occurs, we no longer need
the broken fixup routines.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.
Signed-off-by: David S. Miller <davem@davemloft.net>
The fixup helper mechanism for handling user copy faults
is not 100% accurate, and can never be made so.
We are going to transition the code to return the running remaining
length, which is always kept track of in one or more registers
of each of these routines.
In order to convert them one by one, we have to allow the existing
behavior to continue functioning.
Therefore make all the copy code that wants the fixup helper to be
used return negative one.
After all of the user copy routines have been converted, this logic
and the fixup helpers themselves can be removed completely.
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement FETCH-OP atomic primitives, these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.
This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
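A minimal sketch of the intended semantics, written as a generic cmpxchg()
loop rather than the actual sparc64 implementation (the helper name is
illustrative):

#include <linux/atomic.h>

static inline int atomic_fetch_or_sketch(int mask, atomic_t *v)
{
        int old;

        do {
                old = atomic_read(v);
        } while (atomic_cmpxchg(v, old, old | mask) != old);

        return old;     /* the value before the OR, unlike the *_return ops */
}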
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Y Knight <jyknight@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Short story: Exception handlers used by some copy_to_user() and
copy_from_user() functions do not diligently clean up floating point
register usage, and this can result in a user process seeing invalid
values in floating point registers. This sometimes makes the process
fail.
Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
use floating point registers and VIS alignaddr/faligndata to
accelerate data copying when source and dest addresses don't align
well. Linux uses a lazy scheme for saving floating point registers; it
is not done upon entering the kernel since it's a very expensive
operation. Rather, it is done only when needed. If the kernel ends up
not using FP regs during the course of some trap or system call, then
it can return to user space without saving or restoring them.
The various memcpy functions begin their FP code with VISEntry (or a
variation thereof), which saves the FP regs. They conclude their FP
code with VISExit (or a variation) which essentially marks the FP regs
"clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned
off so that a lazy restore will be triggered when/if the user process
accesses floating point regs again.
The bug is that the user copy variants of memcpy, copy_from_user() and
copy_to_user(), employ an exception handling mechanism to detect faults
when accessing user space addresses, and when this handler is invoked,
an immediate return from the function is forced, and VISExit is not
executed, thus leaving the fprs register in an indeterminate state,
but often with fprs.FPRS_FEF set and one or more dirty bits. This
results in a return to user space with invalid values in the FP regs,
and since fprs.FPRS_FEF is on, no lazy restore occurs.
This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
U3, and U1. All are fixed by using a new exception handler for those
loads and stores that are done during the time between VISEnter and
VISExit.
n.b. In NG4memcpy, the problematic code can be triggered by a copy
size greater than 128 bytes and an unaligned source address. This bug
is known to be the cause of random user process memory corruptions
while perf is running with the callgraph option (i.e., perf record -g).
This occurs because perf uses copy_from_user() to read user stacks,
and may fault when it follows a stack frame pointer off to an
invalid page. Validation checks on the stack address just obscure
the underlying problem.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sparc updates from David Miller:
"Just a couple of fixes/cleanups:
- Correct NUMA latency calculations on sparc64, from Nitin Gupta.
- ASI_ST_BLKINIT_MRU_S value was wrong, from Rob Gardner.
- Fix non-faulting load handling of non-quad values, also from Rob
Gardner.
- Cleanup VISsave assembler, from Sam Ravnborg.
- Fix iommu-common code so it doesn't emit ridiculous warnings on
some architectures, particularly ARM"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix numa distance values
sparc64: Don't restrict fp regs for no-fault loads
iommu-common: Fix error code used in iommu_tbl_range_{alloc,free}().
sparc64: use ENTRY/ENDPROC in VISsave
sparc64: Fix incorrect ASI_ST_BLKINIT_MRU_S value
Pull locking and atomic updates from Ingo Molnar:
"Main changes in this cycle are:
- Extend atomic primitives with coherent logic op primitives
(atomic_{or,and,xor}()) and deprecate the old partial APIs
(atomic_{set,clear}_mask())
The old ops were incoherent with incompatible signatures across
architectures and with incomplete support. Now every architecture
supports the primitives consistently (by Peter Zijlstra)
- Generic support for 'relaxed atomics':
- _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
- atomic_read_acquire()
- atomic_set_release()
This came out of porting qwrlock code to arm64 (by Will Deacon)
- Clean up the fragile static_key APIs that were causing repeat bugs,
by introducing a new one:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case (a short usage sketch follows this list). To be able to know the
'type' of the static key we encode it in the jump entry (by Peter
Zijlstra)
- Static key self-tests (by Jason Baron)
- qrwlock optimizations (by Waiman Long)
- small futex enhancements (by Davidlohr Bueso)
- ... and misc other changes"
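A short usage sketch of the new static key interface (the key name and the
guarded functions are hypothetical):

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_FALSE(my_feature_key);        /* hypothetical key */

static void do_feature_work(void)
{
        /* hypothetical slow path */
}

void hot_path(void)
{
        /* compiles to a straight-line no-op until the key is enabled */
        if (static_branch_unlikely(&my_feature_key))
                do_feature_work();
}

void enable_feature(void)
{
        static_branch_enable(&my_feature_key);  /* patches the branch at runtime */
}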
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
jump_label/x86: Work around asm build bug on older/backported GCCs
locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
locking/static_keys: Make verify_keys() static
jump label, locking/static_keys: Update docs
locking/static_keys: Provide a selftest
jump_label: Provide a self-test
s390/uaccess, locking/static_keys: employ static_branch_likely()
x86, tsc, locking/static_keys: Employ static_branch_likely()
locking/static_keys: Add selftest
locking/static_keys: Add a new static_key interface
locking/static_keys: Rework update logic
locking/static_keys: Add static_key_{en,dis}able() helpers
...
sparc64: use ENTRY/ENDPROC in VISsave
Commit 44922150d8
("sparc64: Fix userspace FPU register corruptions") left a
stale .globl symbol which was not used.
Fix this and introduce the use of ENTRY/ENDPROC.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have a series of events from userspace, with %fprs=FPRS_FEF,
like follows:
ETRAP
ETRAP
VIS_ENTRY(fprs=0x4)
VIS_EXIT
RTRAP (kernel FPU restore with fpu_saved=0x4)
RTRAP
We will not restore the user registers that were clobbered by the FPU
using kernel code in the inner-most trap.
Traps allocate FPU save slots in the thread struct, and FPU using
sequences save the "dirty" FPU registers only.
This works at the initial trap level because all of the registers
get recorded into the top-level FPU save area, and we'll return
to userspace with the FPU disabled so that any FPU use by the user
will take an FPU disabled trap wherein we'll load the registers
back up properly.
But this is not how trap returns from kernel to kernel operate.
The simplest fix for this bug is to always save all FPU register state
for anything other than the top-most FPU save area.
Getting rid of the optimized inner-slot FPU saving code ends up
making VISEntryHalf degenerate into plain VISEntry.
Longer term we need to do something smarter to reinstate the partial
save optimizations. Perhaps the fundamental error is having trap entry
and exit allocate FPU save slots and restore register state. Instead,
the VISEntry et al. calls should be doing that work.
This bug is about two decades old.
Reported-by: James Y Knight <jyknight@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement atomic logic ops -- atomic_{or,xor,and}.
These will replace the atomic_{set,clear}_mask functions that are
available on some archs.
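The conversion for callers is mechanical; a hedged sketch with made-up names
(argument order of the old helpers varied by architecture):

#include <linux/atomic.h>

static atomic_t flags_word;

void set_and_clear_flag(int flag_mask)
{
        /* old, deprecated helpers:
         *   atomic_set_mask(flag_mask, &flags_word);
         *   atomic_clear_mask(flag_mask, &flags_word);
         * new coherent logic ops:
         */
        atomic_or(flag_mask, &flags_word);
        atomic_and(~flag_mask, &flags_word);
}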
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src. The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.
For example, considering NG4memcpy, the main unrolled loop begins like
this:
load src + 0x00
load src + 0x08
load src + 0x10
load src + 0x18
load src + 0x20
store dst + 0x00
Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call. That store at the end there is the one to the
first word of the cache line, thus clearing the whole line, which
clobbers "src + 0x28" before it even gets loaded.
To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well. We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.
Reported-by: David Ahern <david.ahern@oracle.com>
Reported-by: Bob Picco <bpicco@meloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atomicity between xchg and cmpxchg cannot be guaranteed when xchg is
implemented with a swap and cmpxchg is implemented with locks.
Without this, e.g. mcs_spin_lock and mcs_spin_unlock are broken.
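A minimal user-space sketch of the failure mode (not the sparc32 code
itself): if cmpxchg() is emulated under a lock while xchg() compiles to a
bare hardware swap, the swap is not serialized against the emulation and can
slip in between its load and store.

#include <pthread.h>

static pthread_mutex_t emu_lock = PTHREAD_MUTEX_INITIALIZER;

/* lock-based cmpxchg emulation */
unsigned long emu_cmpxchg(volatile unsigned long *p,
                          unsigned long old, unsigned long new)
{
        unsigned long prev;

        pthread_mutex_lock(&emu_lock);
        prev = *p;
        if (prev == old)
                *p = new;       /* a raw hardware swap ignores emu_lock and can
                                 * modify *p between the read above and this
                                 * write; that update is then silently lost */
        pthread_mutex_unlock(&emu_lock);
        return prev;
}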
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
key material is preloaded into the FPU registers, and then we loop
over and over doing the crypt operation, reusing those pre-cooked key
registers.
There are intervening blkcipher*() calls between the crypt operation
calls. And those might perform memcpy() and thus also try to use the
FPU.
The sparc64 kernel FPU usage mechanism is designed to allow such
recursive uses, but with a catch.
There has to be a trap between the two FPU using threads of control.
The mechanism works by, when the FPU is already in use by the kernel,
allocating a slot for FPU saving at trap time. Then if, within the
trap handler, we try to use the FPU registers, the pre-trap FPU
register state is saved into the slot. Then at trap return time we
notice this and restore the pre-trap FPU state.
Over the long term there are various more involved ways we can make
this work, but for a quick fix let's take advantage of the fact that
the situation where this happens is very limited.
All sparc64 chips that support the crypto instructions are also using
the Niagara4 memcpy routine, and that routine only uses the FPU for
large copies where we can't get the source aligned properly to a
multiple of 8 bytes.
We look to see if the FPU is already in use in this context, and if so
we use the non-large copy path which only uses integer registers.
Furthermore, we also limit this special logic to when we are doing a
kernel copy, rather than a user copy.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull arch atomic cleanups from Ingo Molnar:
"This is a series kept separate from the main locking tree, which
cleans up and improves various details in the atomics type handling:
- Remove the unused atomic_or_long() method
- Consolidate and compress atomic ops implementations between
architectures, to reduce linecount and to make it easier to add new
ops.
- Rewrite generic atomic support to only require cmpxchg() from an
architecture - generate all other methods from that"
* 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read()
locking, mips: Fix atomics
locking, sparc64: Fix atomics
locking,arch: Rewrite generic atomic support
locking,arch,xtensa: Fold atomic_ops
locking,arch,sparc: Fold atomic_ops
locking,arch,sh: Fold atomic_ops
locking,arch,powerpc: Fold atomic_ops
locking,arch,parisc: Fold atomic_ops
locking,arch,mn10300: Fold atomic_ops
locking,arch,mips: Fold atomic_ops
locking,arch,metag: Fold atomic_ops
locking,arch,m68k: Fold atomic_ops
locking,arch,m32r: Fold atomic_ops
locking,arch,ia64: Fold atomic_ops
locking,arch,hexagon: Fold atomic_ops
locking,arch,cris: Fold atomic_ops
locking,arch,avr32: Fold atomic_ops
locking,arch,arm64: Fold atomic_ops
locking,arch,arm: Fold atomic_ops
...
The patch folding the atomic ops had a silly fail in the _return primitives.
Fixes: 4f3316c2b5 ("locking,arch,sparc: Fold atomic_ops")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/20140902094016.GD31157@worktop.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This makes memset follow the standard and return the destination address
(instead of returning 0 on success). This is needed because certain versions
of gcc optimize around memset calls and assume that the address argument is
preserved in %o0.
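A hedged illustration (buf/len are arbitrary): because the standard requires
memset() to return its first argument, gcc may keep using the register it
passed the address in (%o0 on sparc) after the call.

#include <string.h>

void clear_and_mark(char *buf, size_t len)
{
        memset(buf, 0, len);
        /* gcc may assume buf is still live in %o0 after the call and use
         * that register for the store below; a memset() that returns 0
         * clobbers %o0 and the store then targets a bogus address. */
        buf[0] = 1;
}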
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many of the atomic op implementations are the same except for one
instruction; fold the lot into a few CPP macros and reduce LoC.
This also prepares for easy addition of new ops.
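A minimal sketch of the folding pattern (the plain C update below is
illustrative only; the real per-arch bodies use ll/sc, cas or spinlocks):

#include <linux/types.h>

#define ATOMIC_OP(op, c_op)                                     \
static inline void atomic_##op(int i, atomic_t *v)              \
{                                                               \
        v->counter c_op i;                                      \
}

ATOMIC_OP(add, +=)
ATOMIC_OP(sub, -=)
ATOMIC_OP(and, &=)
ATOMIC_OP(or,  |=)

#undef ATOMIC_OP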
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/20140508135852.825281379@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull sparc updates from David Miller:
1) Add sparc RAM output to /proc/iomem, from Bob Picco.
2) Allow seeks on /dev/mdesc, from Khalid Aziz.
3) Cleanup sparc64 I/O accessors, from Sam Ravnborg.
4) If update_mmu_cache{,_pmd}() is called with a not-valid mapping, do
not insert it into the TLB miss hash tables otherwise we'll
livelock. Based upon work by Christopher Alexander Tobias Schulze.
5) Fix BREAK detection in sunsab driver when no actual characters are
pending, from Christopher Alexander Tobias Schulze.
6) Because we have modules --> openfirmware --> vmalloc ordering of
virtual memory, the lazy VMAP TLB flusher can cons up an invocation
of flush_tlb_kernel_range() that covers the openfirmware address
range. Unfortunately this will flush out the firmware's locked TLB
mapping which causes all kinds of trouble. Just split up the flush
request if this happens, but in the long term the lazy VMAP flusher
should probably be made a little bit smarter.
Based upon work by Christopher Alexander Tobias Schulze.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
sparc64: Fix up merge thinko.
sparc: Add "install" target
arch/sparc/math-emu/math_32.c: drop stray break operator
sparc64: ldc_connect() should not return EINVAL when handshake is in progress.
sparc64: Guard against flushing openfirmware mappings.
sunsab: Fix detection of BREAK on sunsab serial console
bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
sparc64: Do not insert non-valid PTEs into the TSB hash table.
sparc64: avoid code duplication in io_64.h
sparc64: reorder functions in io_64.h
sparc64: drop unused SLOW_DOWN_IO definitions
sparc64: remove macro indirection in io_64.h
sparc64: update IO access functions in PeeCeeI
sparcspkr: use sbus_*() primitives for IO
sparc: Add support for seek and shorter read to /dev/mdesc
sparc: use %s for unaligned panic
drivers/sbus/char: Micro-optimization in display7seg.c
display7seg: Introduce the use of the managed version of kzalloc
sparc64 - add mem to iomem resource
The PeeCeeI.c code used in*() + out*() for IO access.
But these are little-endian, and the native (big) endian
result was required, which resulted in some bit-shifting.
Shift the code over to use the __raw_*() variants all over.
This simplifies the code as we can drop the calls
to le16_to_cpu() and le32_to_cpu().
And it should be a little faster too.
With this change we now use the same type of IO access functions
throughout the file.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nothing sets function_trace_stop to disable function tracing anymore.
Remove the check for it in the arch code.
Link: http://lkml.kernel.org/r/20140703.211820.1674895115102216877.davem@davemloft.net
Cc: David S. Miller <davem@davemloft.net>
OKed-to-go-through-tracing-tree-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This is to prevent previous stores from overlapping the block stores
done by the memcpy loop.
Based upon a glibc patch by Jose E. Marchesi
Signed-off-by: David S. Miller <davem@davemloft.net>
Use asm-generic/io.h definitions where applicable.
The inxx() and outxx() methods, which were duplicated in pcic.c +
leon_pci.c, are replaced by a set of static inlines from asm-generic/io.h.
iomap.c is replaced by the generic versions, but is still
present to support sparc64.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Choose PAGE_OFFSET dynamically based upon cpu type.
Original UltraSPARC-I (spitfire) chips only supported a 44-bit
virtual address space.
Newer chips (T4 and later) support 52-bit virtual addresses
and up to 47-bits of physical memory space.
Therefore we have to adjust PAGE_OFFSET dynamically based upon
the capabilities of the chip.
Note that this change alone does not allow us to support > 43-bit
physical memory, to do that we need to re-arrange our page table
support. The current encodings of the pmd_t and pgd_t pointers
restricts us to "32 + 11" == 43 bits.
This change can waste quite a bit of memory for the various tables.
In particular, a future change should work to size and allocate
kern_linear_bitmap[] and sparc64_valid_addr_bitmap[] dynamically.
This isn't easy as we really cannot take a TLB miss when accessing
kern_linear_bitmap[]. We'd have to lock it into the TLB or similar.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Bob Picco <bob.picco@oracle.com>
The functions
__down_read
__down_read_trylock
__down_write
__down_write_trylock
__up_read
__up_write
__downgrade_write
are implemented inline, so remove the corresponding EXPORT_SYMBOLs
(they lead to compile errors on the RT kernel).
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files. Arnd Bergmann noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre-4.4 gcc.
To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.
Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs. ones which emit errors. The details of how an
architecture decides to implement the checks aren't as important as the
concept of compile-time checking of copy_from_user() calls.
While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
srmmu_nocache_bitmap is cleared by bit_map_init(). But bit_map_init()
attempts to clear it by memset(), so it can't clear the trailing edge of
the bitmap properly on big-endian architectures if the number of bits is
not a multiple of BITS_PER_LONG.
Actually, the number of bits in srmmu_nocache_bitmap is not always
a multiple of BITS_PER_LONG. It is calculated as below:
bitmap_bits = srmmu_nocache_size >> SRMMU_NOCACHE_BITMAP_SHIFT;
srmmu_nocache_size is decided proportionally by the amount of system RAM
and it is rounded to a multiple of PAGE_SIZE. SRMMU_NOCACHE_BITMAP_SHIFT
is defined as (PAGE_SHIFT - 4). So it can only be said that bitmap_bits
is a multiple of 16.
This fixes the problem by using bitmap_clear() instead of memset()
in bit_map_init(), and also uses BITS_TO_LONGS() to calculate the correct
size at bitmap allocation time.
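A minimal sketch of the fixed pattern (the helper is hypothetical, not the
actual bit_map_init() change):

#include <linux/bitmap.h>
#include <linux/slab.h>

unsigned long *alloc_cleared_bitmap(unsigned int nbits, gfp_t gfp)
{
        /* size in whole longs, not nbits / 8 bytes */
        unsigned long *map = kmalloc(BITS_TO_LONGS(nbits) * sizeof(*map), gfp);

        if (map)
                bitmap_clear(map, 0, nbits);    /* clears the trailing partial
                                                 * long too, on any endianness */
        return map;
}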
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Sparc32 already supported it, as a consequence of using the
generic atomic64 implementation. And the sparc64 implementation
is rather trivial.
This allows us to set ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE for all
of sparc, and avoid the annoying warning from lib/atomic64_test.c
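For reference, a generic cmpxchg()-loop sketch of the semantics (the sparc64
version is written in assembler; the name here is just illustrative):

#include <linux/atomic.h>

static inline long atomic64_dec_if_positive_sketch(atomic64_t *v)
{
        long old, new;

        do {
                old = atomic64_read(v);
                new = old - 1;
                if (new < 0)
                        break;
        } while (atomic64_cmpxchg(v, old, new) != old);

        return new;     /* old - 1, even when no decrement was performed */
}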
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds optimized memset/bzero/page-clear routines for Niagara-4.
We basically can do what powerpc has been able to do for a decade (via
the "dcbz" instruction), which is use cache line clearing stores for
bzero and memsets with a 'c' argument of zero.
As long as we make the cache initializing store to each 32-byte
subblock of the L2 cache line, it works.
As with other Niagara-4 optimized routines, the key is to make sure to
avoid any usage of the %asi register, as reads and writes to it cost
at least 50 cycles.
For the user clear cases, we don't use these new routines, we use the
Niagara-1 variants instead. Those have to use %asi in an unavoidable
way.
A Niagara-4 8K page clear costs just under 600 cycles.
Add definitions of the MRU variants of the cache initializing store
ASIs. By default, cache initializing stores install the line as Least
Recently Used. If we know we're going to use the data immediately
(which is true for page copies and clears) we can use the Most
Recently Used variant, to decrease the likelihood of the lines being
evicted before they get used.
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a Niagara 2 memcpy fix in this tree and I have
a Kconfig fix from Dave Jones which requires the sparc-next
changes which went upstream yesterday.
Signed-off-by: David S. Miller <davem@davemloft.net>