linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Borislav Petkov	e839004b49	x86/asm/head*.S: Change global labels to local Make the disassembly look less confusing: -- head_64.o.before.asm ++ head_64.o.after.asm 0000000000000120 <early_idt_handler>: 120: fc cld 121: 83 3c 24 02 cmpl $0x2,(%rsp) - 125: 0f 84 9d 00 00 00 je 1c8 <is_nmi> + 125: 0f 84 9d 00 00 00 je 1c8 <early_idt_handler+0xa8> 12b: 83 3d 00 00 00 00 02 cmpl $0x2,0x0(%rip) # 132 <early_idt_handler+0x12> 132: 74 7e je 1b2 <early_idt_handler+0x92> 134: ff 05 00 00 00 00 incl 0x0(%rip) # 13a <early_idt_handler+0x1a> @@ -1198,9 +1198,7 @@ Disassembly of section .init.text: 1bf: 5a pop %rdx 1c0: 59 pop %rcx 1c1: 58 pop %rax - 1c2: ff 0d 00 00 00 00 decl 0x0(%rip) # 1c8 <is_nmi> - -00000000000001c8 <is_nmi>: + 1c2: ff 0d 00 00 00 00 decl 0x0(%rip) # 1c8 <early_idt_handler+0xa8> 1c8: 48 83 c4 10 add $0x10,%rsp 1cc: 48 cf iretq -- head_32.o.before.asm ++ head_32.o.after.asm 0000016c <early_idt_handler>: 16c: fc cld 16d: 83 3c 24 02 cmpl $0x2,(%esp) - 171: 74 73 je 1e6 <is_nmi> + 171: 74 73 je 1e6 <ex_entry+0xc> 173: 36 83 3d 00 00 00 00 cmpl $0x2,%ss:0x0 17a: 02 17b: 74 5a je 1d7 <hlt_loop> @@ -483,8 +483,6 @@ Disassembly of section .init.text: 1dd: 59 pop %ecx 1de: 58 pop %eax 1df: 36 ff 0d 00 00 00 00 decl %ss:0x0 - -000001e6 <is_nmi>: 1e6: 83 c4 08 add $0x8,%esp 1e9: cf iret 1ea: 66 90 xchg %ax,%ax No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1431793079-11153-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-17 07:57:53 +02:00
Denys Vlasenko	3e1aa7cb59	x86/asm: Optimize unnecessarily wide TEST instructions By the nature of the TEST operation, it is often possible to test a narrower part of the operand: "testl $3, mem" -> "testb $3, mem", "testq $3, %rcx" -> "testb $3, %cl" This results in shorter instructions, because the TEST instruction has no sign-entending byte-immediate forms unlike other ALU ops. Note that this change does not create any LCP (Length-Changing Prefix) stalls, which happen when adding a 0x66 prefix, which happens when 16-bit immediates are used, which changes such TEST instructions: [test_opcode] [modrm] [imm32] to: [0x66] [test_opcode] [modrm] [imm16] where [imm16] has a different length now: 2 bytes instead of 4. This confuses the decoder and slows down execution. REX prefixes were carefully designed to almost never hit this case: adding REX prefix does not change instruction length except MOVABS and MOV [addr],RAX instruction. This patch does not add instructions which would use a 0x66 prefix, code changes in assembly are: -48 f7 07 01 00 00 00 testq $0x1,(%rdi) +f6 07 01 testb $0x1,(%rdi) -48 f7 c1 01 00 00 00 test $0x1,%rcx +f6 c1 01 test $0x1,%cl -48 f7 c1 02 00 00 00 test $0x2,%rcx +f6 c1 02 test $0x2,%cl -41 f7 c2 01 00 00 00 test $0x1,%r10d +41 f6 c2 01 test $0x1,%r10b -48 f7 c1 04 00 00 00 test $0x4,%rcx +f6 c1 04 test $0x4,%cl -48 f7 c1 08 00 00 00 test $0x8,%rcx +f6 c1 08 test $0x8,%cl Linus further notes: "There are no stalls from using 8-bit instruction forms. Now, changing from 64-bit or 32-bit 'test' instructions to 8-bit ones could cause problems if it ends up having forwarding issues, so that instead of just forwarding the result, you end up having to wait for it to be stable in the L1 cache (or possibly the register file). The forwarding from the store buffer is simplest and most reliable if the read is done at the exact same address and the exact same size as the write that gets forwarded. But that's true only if: (a) the write was very recent and is still in the write queue. I'm not sure that's the case here anyway. (b) on at least most Intel microarchitectures, you have to test a different byte than the lowest one (so forwarding a 64-bit write to a 8-bit read ends up working fine, as long as the 8-bit read is of the low 8 bits of the written data). A very similar issue might show up for registers too, not just memory writes, if you use 'testb' with a high-byte register (where instead of forwarding the value from the original producer it needs to go through the register file and then shifted). But it's mainly a problem for store buffers. But afaik, the way Denys changed the test instructions, neither of the above issues should be true. The real problem for store buffer forwarding tends to be "write 8 bits, read 32 bits". That can be really surprisingly expensive, because the read ends up having to wait until the write has hit the cacheline, and we might talk tens of cycles of latency here. But "write 32 bits, read the low 8 bits" should be fast on pretty much all x86 chips, afaik." Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425675332-31576-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-07 11:12:43 +01:00
Ingo Molnar	d2c032e3dc	Linux 4.0-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJU9enEAAoJEHm+PkMAQRiG/ewIAJ4MW4tcAhaVj6ndCF3+uL/b RaVm1apUjsTloe5Fl0TT9J5CO3zdOetmMNToy2sf0W4MJDIyHf21o83l7eniV/6q al/c3fQ6HVtNjiSUNghTtzVlL+gUD1F60b9BGYi1V5h2Mp8u0NG1alTGLQfCB8sE ArB+v2aWEdSPn7mZDA0Yuc1In+8bkpht3oy+OLD/8JNkqqLnml9YOyPjM1cuRpBr NxKCLcPzSHH9/nR3T6XtkxXYV5xD3+CDm9roJhfHukoFmfT/G3C65Zcp2KEed/Cw QQpu+ox7fpUs10F/Fbfm8AE+tRB4o2sGh97sprXrO5oaFdx6FPIBo4WN8i/Vy68= =qpY+ -----END PGP SIGNATURE----- Merge tag 'v4.0-rc2' into x86/asm, to refresh the tree Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-04 06:35:43 +01:00
Alexander Kuleshov	5b171e8218	x86/asm/boot: Fix path in comments Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Cc: Martin Mares <mj@ucw.cz> Link: http://lkml.kernel.org/r/1422382588-10367-1-git-send-email-kuleshovmail@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-02-19 00:03:30 +01:00
Andrey Ryabinin	ef7f0d6a6c	x86_64: add KASan support This patch adds arch specific code for kernel address sanitizer. 16TB of virtual addressed used for shadow memory. It's located in range [ffffec0000000000 - fffffc0000000000] between vmemmap and %esp fixup stacks. At early stage we map whole shadow region with zero page. Latter, after pages mapped to direct mapping address range we unmap zero pages from corresponding shadow (see kasan_map_shadow()) and allocate and map a real shadow memory reusing vmemmap_populate() function. Also replace __pa with __pa_nodebug before shadow initialized. __pa with CONFIG_DEBUG_VIRTUAL=y make external function call (__phys_addr) __phys_addr is instrumented, so __asan_load could be called before shadow area initialized. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrey Konovalov <adech.fo@gmail.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Jim Davis <jim.epost@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:41 -08:00
Linus Torvalds	b01d4e6893	x86: fix compile error due to X86_TRAP_NMI use in asm files It's an enum, not a #define, you can't use it in asm files. Introduced in commit `5fa10196bd` ("x86: Ignore NMIs that come in during early boot"), and sadly I didn't compile-test things like I should have before pushing out. My weak excuse is that the x86 tree generally doesn't introduce stupid things like this (and the ARM pull afterwards doesn't cause me to do a compile-test either, since I don't cross-compile). Cc: Don Zickus <dzickus@redhat.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-03-07 18:58:40 -08:00
H. Peter Anvin	5fa10196bd	x86: Ignore NMIs that come in during early boot Don Zickus reports: A customer generated an external NMI using their iLO to test kdump worked. Unfortunately, the machine hung. Disabling the nmi_watchdog made things work. I speculated the external NMI fired, caused the machine to panic (as expected) and the perf NMI from the watchdog came in and was latched. My guess was this somehow caused the hang. ---- It appears that the latched NMI stays latched until the early page table generation on 64 bits, which causes exceptions to happen which end in IRET, which re-enable NMI. Therefore, ignore NMIs that come in during early execution, until we have proper exception handling. Reported-and-tested-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1394221143-29713-1-git-send-email-dzickus@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v3.5+, older with some backport effort	2014-03-07 15:08:14 -08:00
Kees Cook	4df05f3619	x86: Make sure IDT is page aligned Since the IDT is referenced from a fixmap, make sure it is page aligned. Merge with 32-bit one, since it was already aligned to deal with F00F bug. Since bss is cleared before IDT setup, it can live there. This also moves the other *_idt_table variables into common locations. This avoids the risk of the IDT ever being moved in the bss and having the mapping be offset, resulting in calling incorrect handlers. In the current upstream kernel this is not a manifested bug, but heavily patched kernels (such as those using the PaX patch series) did encounter this bug. The tables other than idt_table technically do not need to be page aligned, at least not at the current time, but using a common declaration avoids mistakes. On 64 bits the table is exactly one page long, anyway. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130716183441.GA14232@www.outflux.net Reported-by: PaX Team <pageexec@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-07-16 15:14:48 -07:00
Seiji Aguchi	cf910e83ae	x86, trace: Add irq vector tracepoints [Purpose of this patch] As Vaibhav explained in the thread below, tracepoints for irq vectors are useful. http://www.spinics.net/lists/mm-commits/msg85707.html <snip> The current interrupt traces from irq_handler_entry and irq_handler_exit provide when an interrupt is handled. They provide good data about when the system has switched to kernel space and how it affects the currently running processes. There are some IRQ vectors which trigger the system into kernel space, which are not handled in generic IRQ handlers. Tracing such events gives us the information about IRQ interaction with other system events. The trace also tells where the system is spending its time. We want to know which cores are handling interrupts and how they are affecting other processes in the system. Also, the trace provides information about when the cores are idle and which interrupts are changing that state. <snip> On the other hand, my usecase is tracing just local timer event and getting a value of instruction pointer. I suggested to add an argument local timer event to get instruction pointer before. But there is another way to get it with external module like systemtap. So, I don't need to add any argument to irq vector tracepoints now. [Patch Description] Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events. But there is an above use case to trace specific irq_vector rather than tracing all events. In this case, we are concerned about overhead due to unwanted events. So, add following tracepoints instead of introducing irq_vector_entry/exit. so that we can enable them independently. - local_timer_vector - reschedule_vector - call_function_vector - call_function_single_vector - irq_work_entry_vector - error_apic_vector - thermal_apic_vector - threshold_apic_vector - spurious_apic_vector - x86_platform_ipi_vector Also, introduce a logic switching IDT at enabling/disabling time so that a time penalty makes a zero when tracepoints are disabled. Detailed explanations are as follows. - Create trace irq handlers with entering_irq()/exiting_irq(). - Create a new IDT, trace_idt_table, at boot time by adding a logic to _set_gate(). It is just a copy of original idt table. - Register the new handlers for tracpoints to the new IDT by introducing macros to alloc_intr_gate() called at registering time of irq_vector handlers. - Add checking, whether irq vector tracing is on/off, into load_current_idt(). This has to be done below debug checking for these reasons. - Switching to debug IDT may be kicked while tracing is enabled. - On the other hands, switching to trace IDT is kicked only when debugging is disabled. In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being used for other purposes. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org>	2013-06-20 22:25:34 -07:00
Seiji Aguchi	629f4f9d59	x86: Rename variables for debugging Rename variables for debugging to describe meaning of them precisely. Also, introduce a generic way to switch IDT by checking a current state, debug on/off. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Link: http://lkml.kernel.org/r/51C323A8.7050905@hds.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org>	2013-06-20 22:25:13 -07:00
Zhang Yanfei	e9d0626ed4	x86-64, init: Fix a possible wraparound bug in switchover in head_64.S In head_64.S, a switchover has been used to handle kernel crossing 1G, 512G boundaries. And commit `8170e6bed4` x86, 64bit: Use a #PF handler to materialize early mappings on demand said: During the switchover in head_64.S, before #PF handler is available, we use three pages to handle kernel crossing 1G, 512G boundaries with sharing page by playing games with page aliasing: the same page is mapped twice in the higher-level tables with appropriate wraparound. But from the switchover code, when we set up the PUD table: 114 addq $4096, %rdx 115 movq %rdi, %rax 116 shrq $PUD_SHIFT, %rax 117 andl $(PTRS_PER_PUD-1), %eax 118 movq %rdx, (4096+0)(%rbx,%rax,8) 119 movq %rdx, (4096+8)(%rbx,%rax,8) It seems line 119 has a potential bug there. For example, if the kernel is loaded at physical address 511G+1008M, that is 000000000 111111111 111111000 000000000000000000000 and the kernel _end is 512G+2M, that is 000000001 000000000 000000001 000000000000000000000 So in this example, when using the 2nd page to setup PUD (line 114~119), rax is 511. In line 118, we put rdx which is the address of the PMD page (the 3rd page) into entry 511 of the PUD table. But in line 119, the entry we calculate from (4096+8)(%rbx,%rax,8) has exceeded the PUD page. IMO, the entry in line 119 should be wraparound into entry 0 of the PUD table. The patch fixes the bug. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Link: http://lkml.kernel.org/r/5191DE5A.3020302@cn.fujitsu.com Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: <stable@vger.kernel.org> v3.9 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-05-28 15:41:59 -07:00
H. Peter Anvin	78d77df715	x86-64, init: Do not set NX bits on non-NX capable hardware During early init, we would incorrectly set the NX bit even if the NX feature was not supported. Instead, only set this bit if NX is actually available and enabled. We already do very early detection of the NX bit to enable it in EFER, this simply extends this detection to the early page table mask. Reported-by: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1367476850.5660.2.camel@nexus Cc: <stable@vger.kernel.org> v3.9	2013-05-02 11:27:35 -07:00
Linus Torvalds	18a44a7ff1	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 fixes from Peter Anvin: "Additional x86 fixes. Three of these patches are pure documentation, two are pretty trivial; the remaining one fixes boot problems on some non-BIOS machines." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Make sure we can boot in the case the BDA contains pure garbage x86, efi: Mark disable_runtime as __initdata x86, doc: Fix incorrect comment about 64-bit code segment descriptors doc, kernel-parameters: Document 'console=hvc<n>' doc, xen: Mention 'earlyprintk=xen' in the documentation. ACPI: Overriding ACPI tables via initrd only works with an initrd and on X86	2013-02-27 16:16:39 -08:00
Konrad Rzeszutek Wilk	1256276c98	x86, doc: Fix incorrect comment about 64-bit code segment descriptors The AMD64 Architecture Programmer's Manual Volume 2, on page 89 mentions: "If the processor is running in 64-bit mode (L=1), the only valid setting of the D bit is 0." This matches with what the code does. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1361825650-14031-4-git-send-email-konrad.wilk@oracle.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-02-25 13:42:37 -08:00
Linus Torvalds	ac630dd98a	x86-64: don't set the early IDT to point directly to 'early_idt_handler' The code requires the use of the proper per-exception-vector stub functions (set up as the early_idt_handlers[] array - note the 's') that make sure to set up the error vector number. This is true regardless of whether CONFIG_EARLY_PRINTK is set or not. Why? The stack offset for the comparison of __KERNEL_CS won't be right otherwise, nor will the new check (from commit `8170e6bed4`: "x86, 64bit: Use a #PF handler to materialize early mappings on demand") for the page fault exception vector. Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-22 13:09:51 -08:00
H. Peter Anvin	8170e6bed4	x86, 64bit: Use a #PF handler to materialize early mappings on demand Linear mode (CR0.PG = 0) is mutually exclusive with 64-bit mode; all 64-bit code has to use page tables. This makes it awkward before we have first set up properly all-covering page tables to access objects that are outside the static kernel range. So far we have dealt with that simply by mapping a fixed amount of low memory, but that fails in at least two upcoming use cases: 1. We will support load and run kernel, struct boot_params, ramdisk, command line, etc. above the 4 GiB mark. 2. need to access ramdisk early to get microcode to update that as early possible. We could use early_iomap to access them too, but it will make code to messy and hard to be unified with 32 bit. Hence, set up a #PF table and use a fixed number of buffers to set up page tables on demand. If the buffers fill up then we simply flush them and start over. These buffers are all in __initdata, so it does not increase RAM usage at runtime. Thus, with the help of the #PF handler, we can set the final kernel mapping from blank, and switch to init_level4_pgt later. During the switchover in head_64.S, before #PF handler is available, we use three pages to handle kernel crossing 1G, 512G boundaries with sharing page by playing games with page aliasing: the same page is mapped twice in the higher-level tables with appropriate wraparound. The kernel region itself will be properly mapped; other mappings may be spurious. early_make_pgtable is using kernel high mapping address to access pages to set page table. -v4: Add phys_base offset to make kexec happy, and add init_mapping_kernel() - Yinghai -v5: fix compiling with xen, and add back ident level3 and level2 for xen also move back init_level4_pgt from BSS to DATA again. because we have to clear it anyway. - Yinghai -v6: switch to init_level4_pgt in init_mem_mapping. - Yinghai -v7: remove not needed clear_page for init_level4_page it is with fill 512,8,0 already in head_64.S - Yinghai -v8: we need to keep that handler alive until init_mem_mapping and don't let early_trap_init to trash that early #PF handler. So split early_trap_pf_init out and move it down. - Yinghai -v9: switchover only cover kernel space instead of 1G so could avoid touch possible mem holes. - Yinghai -v11: change far jmp back to far return to initial_code, that is needed to fix failure that is reported by Konrad on AMD systems. - Yinghai Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1359058816-7615-12-git-send-email-yinghai@kernel.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-01-29 15:20:06 -08:00
Fenghua Yu	42e78e9719	x86-64, hotplug: Add start_cpu0() entry point to head_64.S start_cpu0() is defined in head_64.S for 64-bit. The function sets up stack and jumps to start_secondary() for CPU0 wake up. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1352835171-3958-8-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-11-14 09:39:51 -08:00
Linus Torvalds	731a7378b8	Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 trampoline rework from H. Peter Anvin: "This code reworks all the "trampoline"/"realmode" code (various bits that need to live in the first megabyte of memory, most but not all of which runs in real mode at some point) in the kernel into a single object. The main reason for doing this is that it eliminates the last place in the kernel where we needed pages to be mapped RWX. This code separates all that code into proper R/RW/RX pages." Fix up conflicts in arch/x86/kernel/Makefile (mca removed next to reboot code), and arch/x86/kernel/reboot.c (reboot code moved around in one branch, modified in this one), and arch/x86/tools/relocs.c (mostly same code came in earlier due to working around the ld bugs just before the 3.4 release). Also remove stale x86-relocs entry from scripts/.gitignore as per Peter Anvin. * commit '61f5446169046c217a5479517edac3a890c3bee7': (36 commits) x86, realmode: Move end signature into header.S x86, relocs: When printing an error, say relative or absolute x86, relocs: More relocations which may end up as absolute x86, relocs: Workaround for binutils 2.22.52.0.1 section bug xen-acpi-processor: Add missing #include <xen/xen.h> acpi, bgrd: Add missing <linux/io.h> to drivers/acpi/bgrt.c x86, realmode: Change EFER to a single u64 field x86, realmode: Move kernel/realmode.c to realmode/init.c x86, realmode: Move not-common bits out of trampoline_common.S x86, realmode: Mask out EFER.LMA when saving trampoline EFER x86, realmode: Fix no cache bits test in reboot_32.S x86, realmode: Make sure all generated files are listed in targets x86, realmode: build fix: remove duplicate build x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline x86, realmode: fixes compilation issue in tboot.c x86, realmode: move relocs from scripts/ to arch/x86/tools x86, realmode: header for trampoline code x86, realmode: flattened rm hierachy x86, realmode: don't copy real_mode_header x86, realmode: fix 64-bit wakeup sequence ...	2012-05-29 20:14:53 -07:00
Jarkko Sakkinen	48927bbb97	x86, realmode: Move SMP trampoline to unified realmode code Migrated SMP trampoline code to the real mode blob. SMP trampoline code is not yet removed from .x86_trampoline because it is needed by the wakeup code. [ hpa: always enable compiling startup_32_smp in head_32.S... it is only a few instructions which go into .init on UP builds, and it makes the rest of the code less #ifdef ugly. ] Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-6-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-08 11:41:51 -07:00
H. Peter Anvin	9900aa2f95	x86-64: Handle exception table entries during early boot If we get an exception during early boot, walk the exception table to see if we should intercept it. The main use case for this is to allow rdmsr_safe()/wrmsr_safe() during CPU initialization. Since the exception table is currently sorted at runtime, and fairly late in startup, this code walks the exception table linearly. We obviously don't need to worry about modules, however: none have been loaded at this point. [ v2: Use early_fixup_exception() instead of linear search ] Link: http://lkml.kernel.org/r/1334794610-5546-5-git-send-email-hpa@zytor.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-19 15:42:45 -07:00
H. Peter Anvin	ffc4bc9c6f	x86, paravirt: Replace GET_CR2_INTO_RCX with GET_CR2_INTO_RAX GET_CR2_INTO_RCX is asinine: it is only used in one place, the actual paravirt call returns the value in %rax, not %rcx; and the one place that wants it wants the result in %r9. We actually generate as a result of this call: call ... movq %rax, %rcx xorq %rax, %rax /* this value isn't even used... */ movq %rcx, %r9 At least make the macro do what the paravirt call does, which is put the value into %rax. Nevermind the fact that the macro clobbers all the volatile registers. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1334794610-5546-4-git-send-email-hpa@zytor.com Cc: Glauber de Oliveira Costa <glommer@parallels.com>	2012-04-19 15:07:56 -07:00
H. Peter Anvin	84f4fc524e	x86: Add symbolic constant for exceptions with error code Add a symbolic constant for the bitmask which states which exceptions carry an error code. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1334794610-5546-3-git-send-email-hpa@zytor.com	2012-04-19 15:07:49 -07:00
Steven Rostedt	228bdaa95f	x86: Keep current stack in NMI breakpoints We want to allow NMI handlers to have breakpoints to be able to remove stop_machine from ftrace, kprobes and jump_labels. But if an NMI interrupts a current breakpoint, and then it triggers a breakpoint itself, it will switch to the breakpoint stack and corrupt the data on it for the breakpoint processing that it interrupted. Instead, have the NMI check if it interrupted breakpoint processing by checking if the stack that is currently used is a breakpoint stack. If it is, then load a special IDT that changes the IST for the debug exception to keep the same stack in kernel context. When the NMI is done, it puts it back. This way, if the NMI does trigger a breakpoint, it will keep using the same stack and not stomp on the breakpoint data for the breakpoint it interrupted. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-12-21 15:38:55 -05:00
H. Peter Anvin	4822b7fc6d	x86, trampoline: Common infrastructure for low memory trampolines Common infrastructure for low memory trampolines. This code installs the trampolines permanently in low memory very early. It also permits multiple pieces of code to be used for this purpose. This code also introduces a standard infrastructure for computing symbol addresses in the trampoline code. The only change to the actual SMP trampolines themselves is that the 64-bit trampoline has been made reusable -- the previous version would overwrite the code with a status variable; this moves the status variable to a separate location. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <4D5DFBE4.7090104@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Matthieu Castet <castet.matthieu@free.fr> Cc: Stephen Rothwell <sfr@canb.auug.org.au>	2011-02-17 21:02:43 -08:00
Brian Gerst	650fb4393d	x86-64: Simplify loading initial_gs Load initial_gs as two 32-bit values instead of splitting a 64-bit value. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-3-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-07-21 21:23:51 -07:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Linus Torvalds	e33c019722	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) x86, mm: Correct the implementation of is_untracked_pat_range() x86/pat: Trivial: don't create debugfs for memtype if pat is disabled x86, mtrr: Fix sorting of mtrr after subtracting x86: Move find_smp_config() earlier and avoid bootmem usage x86, platform: Change is_untracked_pat_range() to bool; cleanup init x86: Change is_ISA_range() into an inline function x86, mm: is_untracked_pat_range() takes a normal semiclosed range x86, mm: Call is_untracked_pat_range() rather than is_ISA_range() x86: UV SGI: Don't track GRU space in PAT x86: SGI UV: Fix BAU initialization x86, numa: Use near(er) online node instead of roundrobin for NUMA x86, numa, bootmem: Only free bootmem on NUMA failure path x86: Change crash kernel to reserve via reserve_early() x86: Eliminate redundant/contradicting cache line size config options x86: When cleaning MTRRs, do not fold WP into UC x86: remove "extern" from function prototypes in <asm/proto.h> x86, mm: Report state of NX protections during boot x86, mm: Clean up and simplify NX enablement x86, pageattr: Make set_memory_(x\|nx) aware of NX support x86, sleep: Always save the value of EFER ... Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range) to 'struct x86_platform_ops') in arch/x86/include/asm/x86_init.h arch/x86/kernel/x86_init.c	2009-12-08 13:27:33 -08:00
Linus Torvalds	ef26b1691d	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: include/linux/compiler-gcc4.h: Fix build bug - gcc-4.0.2 doesn't understand __builtin_object_size x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h x86/alternatives: Check replacementlen <= instrlen at build time x86, 64-bit: Set data segments to null after switching to 64-bit mode x86: Clean up the loadsegment() macro x86: Optimize loadsegment() x86: Add missing might_fault() checks to copy_{to,from}_user() x86-64: __copy_from_user_inatomic() adjustments x86: Remove unused thread_return label from switch_to() x86, 64-bit: Fix bstep_iret jump x86: Don't use the strict copy checks when branch profiling is in use x86, 64-bit: Move K8 B step iret fixup to fault entry asm x86: Generate cmpxchg build failures x86: Add a Kconfig option to turn the copy_from_user warnings into errors x86: Turn the copy_from_user check into an (optional) compile time warning x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy x86: Use __builtin_object_size() to validate the buffer size for copy_from_user()	2009-12-05 15:32:03 -08:00
Brian Gerst	8ec6993d9f	x86, 64-bit: Set data segments to null after switching to 64-bit mode This prevents kernel threads from inheriting non-null segment selectors, and causing optimizations in __switch_to() to be ineffective. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Tim Blechmann <tim@klingt.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jan Beulich <JBeulich@novell.com> LKML-Reference: <1259165856-3512-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:44:30 +01:00
Suresh Siddha	b9af7c0d44	x86-64: preserve large page mapping for 1st 2MB kernel txt with CONFIG_DEBUG_RODATA In the first 2MB, kernel text is co-located with kernel static page tables setup by head_64.S. CONFIG_DEBUG_RODATA chops this 2MB large page mapping to small 4KB pages as we mark the kernel text as RO, leaving the static page tables as RW. With CONFIG_DEBUG_RODATA disabled, OLTP run on NHM-EP shows 1% improvement with 2% reduction in system time and 1% improvement in iowait idle time. To recover this, move the kernel static page tables to .data section, so that we don't have to break the first 2MB of kernel text to small pages with CONFIG_DEBUG_RODATA. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.063193621@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-20 14:46:00 +09:00
Linus Torvalds	5bb241b325	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove redundant non-NUMA topology functions x86: early_printk: Protect against using the same device twice x86: Reduce verbosity of "PAT enabled" kernel message x86: Reduce verbosity of "TSC is reliable" message x86: mce: Use safer ways to access MCE registers x86: mce, inject: Use real inject-msg in raise_local x86: mce: Fix thermal throttling message storm x86: mce: Clean up thermal throttling state tracking code x86: split NX setup into separate file to limit unstack-protected code xen: check EFER for NX before setting up GDT mapping x86: Cleanup linker script using new linker script macros. x86: Use section .data.page_aligned for the idt_table. x86: convert to use __HEAD and HEAD_TEXT macros. x86: convert compressed loader to use __HEAD and HEAD_TEXT macros. x86: fix fragile computation of vsyscall address	2009-09-26 10:13:35 -07:00
Tim Abbott	02b7da37f7	Use macros for .bss.page_aligned section. This patch changes the remaining direct references to .bss.page_aligned in C and assembly code to use the macros in include/linux/linkage.h. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-09-21 06:27:08 +02:00
Tim Abbott	4ae59b916d	x86: convert to use __HEAD and HEAD_TEXT macros. This has the consequence of changing the section name use for head code from ".text.head" to ".head.text". It also eliminates the ".text.head" output section (instead placing head code at the start of the .text output section), which should be harmless. This patch only changes the sections in the actual kernel, not those in the compressed boot loader. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-09-18 10:21:50 -07:00
Alexander van Heukelum	bc3f5d3dbd	x86: de-assembler-ize asm/desc.h asm/desc.h is included in three assembly files, but the only macro it defines, GET_DESC_BASE, is never used. This patch removes the includes, removes the macro GET_DESC_BASE and the ASSEMBLY guard from asm/desc.h. Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-06-17 21:35:10 -07:00
Cyrill Gorcunov	5e112ae23b	x86: head_64.S - use IDT_ENTRIES instead of hardcoded number Impact: cleanup Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: heukelum@fastmail.fm Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-24 18:08:38 +01:00
Cyrill Gorcunov	2a0b100111	x86: head_64.S - remove useless balign Impact: cleanup NEXT_PAGE already has 'balign' so no need to keep this redundant one. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: heukelum@fastmail.fm Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-24 18:08:38 +01:00
Brian Gerst	2add8e235c	x86: use linker to offset symbols by __per_cpu_load Impact: cleanup and bug fix Use the linker to create symbols for certain per-cpu variables that are offset by __per_cpu_load. This allows the removal of the runtime fixup of the GDT pointer, which fixes a bug with resume reported by Jiri Slaby. Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:30:30 +01:00
Brian Gerst	947e76cdc3	x86: move stack_canary into irq_stack Impact: x86_64 percpu area layout change, irq_stack now at the beginning Now that the PDA is empty except for the stack canary, it can be removed. The irqstack is moved to the start of the per-cpu section. If the stack protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack. tj: * updated subject * dropped asm relocation of irq_stack_ptr * updated comments a bit * rebased on top of stack canary changes Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-20 12:29:20 +09:00
Brian Gerst	8c7e58e690	x86: rework __per_cpu_load adjustments Impact: cleanup Use cpu_number to determine if the adjustment is necessary. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-20 12:29:20 +09:00
Tejun Heo	b12d8db8fb	x86: make pda a percpu variable [ Based on original patch from Christoph Lameter and Mike Travis. ] As pda is now allocated in percpu area, it can easily be made a proper percpu variable. Make it so by defining per cpu symbol from linker script and declaring it in C code for SMP and simply defining it for UP. This change cleans up code and brings SMP and UP closer a bit. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 14:20:03 +01:00
Tejun Heo	1a51e3a0ae	x86: fold pda into percpu area on SMP [ Based on original patch from Christoph Lameter and Mike Travis. ] Currently pdas and percpu areas are allocated separately. %gs points to local pda and percpu area can be reached using pda->data_offset. This patch folds pda into percpu area. Due to strange gcc requirement, pda needs to be at the beginning of the percpu area so that pda->stack_canary is at %gs:40. To achieve this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is added and used to reserve pda sized chunk at the start of the percpu area. After this change, for boot cpu, %gs first points to pda in the data.init area and later during setup_per_cpu_areas() gets updated to point to the actual pda. This means that setup_per_cpu_areas() need to reload %gs for CPU0 while clearing pda area for other cpus as cpu0 already has modified it when control reaches setup_per_cpu_areas(). This patch also removes now unnecessary get_local_pda() and its call sites. A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into per cpu area" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 14:19:46 +01:00
Tejun Heo	f32ff5388d	x86: load pointer to pda into %gs while brining up a CPU [ Based on original patch from Christoph Lameter and Mike Travis. ] CPU startup code in head_64.S loaded address of a zero page into %gs for temporary use till pda is loaded but address to the actual pda is available at the point. Load the real address directly instead. This will help unifying percpu and pda handling later on. This patch is mostly taken from Mike Travis' "x86_64: Fold pda into per cpu area" patch. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-16 14:19:26 +01:00
Tejun Heo	3e5d8f9784	x86: make percpu symbols zerobased on SMP [ Based on original patch from Christoph Lameter and Mike Travis. ] This patch makes percpu symbols zerobased on x86_64 SMP by adding PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on the percpu output section and using it in vmlinux_64.lds.S. A new PHDR is added as existing ones cannot contain sections near address zero. PERCPU_VADDR() also adds a new symbol __per_cpu_load which always points to the vaddr of the loaded percpu data.init region. The following adjustments have been made to accomodate the address change. * code to locate percpu gdt_page in head_64.S is updated to add the load address to the gdt_page offset. * __per_cpu_load is used in places where access to the init data area is necessary. * pda->data_offset is initialized soon after C code is entered as zero value doesn't work anymore. This patch is mostly taken from Mike Travis' "x86_64: Base percpu variables at zero" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-16 14:19:14 +01:00
Jiri Slaby	7aed55d108	x86: fix RIP printout in early_idt_handler Impact: fix debug/crash printout Since errorcode is popped out, RIP is on the top of the stack. Use real RIP value instead of wrong CS. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-04 10:20:29 +01:00
Suresh Siddha	b2bc273146	x86, cpa: rename PTE attribute macros for kernel direct mapping in early boot Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:11 +02:00
Ingo Molnar	6596f24223	Revert "x86_64: there's no need to preallocate level1_fixmap_pgt" This reverts commit 033786969d1d1b5af12a32a19d3a760314d05329. Suresh Siddha reported that this broke booting on his 2GB testbox. Reported-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:07:30 +02:00
Jeremy Fitzhardinge	8c5e5ac32f	xen64: add xen-head code to head_64.S Add the Xen entrypoint and ELF notes to head_64.S. Adapts xen-head.S to compile either 32-bit or 64-bit. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:58:41 +02:00
Jeremy Fitzhardinge	8840c0ccd7	x86_64: there's no need to preallocate level1_fixmap_pgt Early fixmap will allocate its own L1 pagetable page for fixmap mappings, so there's no need to preallocate one. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:54:11 +02:00
Jeremy Fitzhardinge	8490638cf0	x86: always set _PAGE_GLOBAL in _PAGE_KERNEL* flags Consistently set _PAGE_GLOBAL in _PAGE_KERNEL flags. This makes 32- and 64-bit code consistent, and removes some special cases where __PAGE_KERNEL* did not have _PAGE_GLOBAL set, causing confusion as a result of the inconsistencies. This patch only affects x86-64, which generally always supports PGD. The x86-32 patch is next. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:28 +02:00
Jeremy Fitzhardinge	cd5dce2fb0	x86: fix CPA self-test for "x86/paravirt: groundwork for 64-bit Xen support" Ingo Molnar wrote: > -tip auto-testing found pagetable corruption (CPA self-test failure): > > [ 32.956015] CPA self-test: > [ 32.958822] 4k 2048 large 508 gb 0 x 2556[ffff880000000000-ffff88003fe00000] miss 0 > [ 32.964000] CPA ffff88001d54e000: bad pte 1d4000e3 > [ 32.968000] CPA ffff88001d54e000: unexpected level 2 > [ 32.972000] CPA ffff880022c5d000: bad pte 22c000e3 > [ 32.976000] CPA ffff880022c5d000: unexpected level 2 > [ 32.980000] CPA ffff8800200ce000: bad pte 200000e3 > [ 32.984000] CPA ffff8800200ce000: unexpected level 2 > [ 32.988000] CPA ffff8800210f0000: bad pte 210000e3 > > config and full log can be found at: > > http://redhat.com/~mingo/misc/config-Mon_Jun_30_11_11_51_CEST_2008.bad > http://redhat.com/~mingo/misc/log-Mon_Jun_30_11_11_51_CEST_2008.bad Phew. OK, I've worked this out. Short version is that's it's a false alarm, and there was no real failure here. Long version: * I changed the code to create the physical mapping pagetables to reuse any existing mapping rather than replace it. Specifically, reusing an pud pointed to by the pgd caused this symptom to appear. * The specific PUD being reused is the one created statically in head_64.S, which creates an initial 1GB mapping. * That mapping doesn't have _PAGE_GLOBAL set on it, due to the inconsistency between __PAGE_* and PAGE_. The CPA test attempts to clear _PAGE_GLOBAL, and then checks to see that the resulting range is 1) shattered into 4k pages, and 2) has no _PAGE_GLOBAL. * However, since it didn't have _PAGE_GLOBAL on that range to start with, change_page_attr_clear() had nothing to do, and didn't bother shattering the range, * resulting in the reported messages The simple fix is to set _PAGE_GLOBAL in level2_ident_pgt. An additional fix to make CPA testing more robust by using some other pagetable bit (one of the unused available-to-software ones). This would solve spurious CPA test warnings under Xen which uses _PAGE_GLOBAL for its own purposes (ie, not under guest control). Also, we should revisit the use of _PAGE_GLOBAL in asm-x86/pgtable.h, and use it consistently, and drop MAKE_GLOBAL. The first time I proposed it it caused breakages in the very early CPA code; with luck that's all fixed now. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:15 +02:00
Eduardo Habkost	a6523748bd	paravirt/x86, 64-bit: move __PAGE_OFFSET to leave a space for hypervisor Set __PAGE_OFFSET to the most negative possible address + 16*PGDIR_SIZE. The gap is to allow a space for a hypervisor to fit. The gap is more or less arbitrary, but it's what Xen needs. When booting native, kernel/head_64.S has a set of compile-time generated pagetables used at boot time. This patch removes their absolutely hard-coded layout, and makes it parameterised on __PAGE_OFFSET (and __START_KERNEL_map). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:04 +02:00
Glauber Costa	a939098afc	x86: move x86_64 gdt closer to i386 i386 and x86_64 used two different schemes for maintaining the gdt. With this patch, x86_64 initial gdt table is defined in a .c file, same way as i386 is now. Also, we call it "gdt_page", and the descriptor, "early_gdt_descr". This way we achieve common naming, which can allow for more code integration. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:16 +02:00
Glauber Costa	9cf4f298e2	x86: use stack_start in x86_64 call x86_64's init_rsp stack_start, just as i386 does. Put a zeroed stack segment for consistency. With this, we can eliminate one ugly ifdef in smpboot.c. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:14 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Rafael J. Wysocki	64e83b5a91	x86 ACPI: fix resume from suspend to RAM on uniprocessor x86-64 Since the trampoline code is now used for ACPI resume from suspend to RAM, the trampoline page tables have to be fixed up during boot not only on SMP systems, but also on UP systems that use the trampoline. Reference: http://bugzilla.kernel.org/show_bug.cgi?id=10923 Reported-by: Dionisus Torimens <djtm@gmx.net> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: pm list <linux-pm@lists.linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-05 08:42:28 +02:00
Cyrill Gorcunov	0e192b99d7	x86: head_64.S cleanup - use PMD_SHIFT instead of numeric constant Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:33 +02:00
Cyrill Gorcunov	05139d8fb4	x86: head_64.S cleanup - use straight move to CR4 register Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:33 +02:00
Cyrill Gorcunov	369101da7e	x86: head_64.S cleanup - use predefined flags from processor-flags.h We should better use already defined flags from processor-flags.h instead of defining own ones [>>> object code check >>>] original md5sum: 9cfa6dbf045a046bb5dfb85f8bcfe8c4 arch/x86/kernel/head_64.o text data bss dec hex filename 37361 4432 8192 49985 c341 arch/x86/kernel/head_64.o patched md5sum: 9cfa6dbf045a046bb5dfb85f8bcfe8c4 arch/x86/kernel/head_64.o text data bss dec hex filename 37361 4432 8192 49985 c341 arch/x86/kernel/head_64.o [<<< object code check <<<] Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:31 +02:00
Pavel Machek	e44b7b7525	x86: move suspend wakeup code to C Move wakeup code to .c, so that video mode setting code can be shared between boot and wakeup. Remove nasty assembly code in 64-bit case by re-using trampoline code. Stack setup was fixed to clear high 16bits of %esp, maybe that fixes some machines. .c code sharing and morse code was done H. Peter Anvin, Sam Ravnborg reviewed kbuild related stuff, and it seems okay to him. Rafael did some cleanups. [rjw: * Made the patch stop breaking compilation on x86-32 * Added arch/x86/kernel/acpi/sleep.h * Got rid of compiler warnings in arch/x86/kernel/acpi/sleep.c * Fixed 32-bit compilation on x86-64 systems * Added include/asm-x86/trampoline.h and fixed the non-SMP compilation on 64-bit x86 * Removed arch/x86/kernel/acpi/sleep_32.c which was not used * Fixed some breakage caused by the integration of smpboot.c done under us in the meantime] Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:37 +02:00
Andi Kleen	41bd4eac74	x86: move early exception handlers into init.text Currently they are in .text.head because the rest of head_64.S. .text.head is not removed as init data, but the early exception handlers should be because they are not needed after early boot of the BP. So move them over. Signed-off-by: Andi Kleen <ak@suse.de> Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:29 +02:00
Andi Kleen	749c970ae9	x86: replace early exception setup macro recursion with loop The early exception handlers are currently set up using a macro recursion. There is only one user left. Replace the macro with a standard loop in place. Noop patch, just a cleanup. [ tglx@linutronix.de: simplified ] Signed-off-by: Andi Kleen <ak@suse.de> Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:29 +02:00
Andi Kleen	5524ea320d	x86: don't set up early exception handlers for external interrupts All of early setup runs with interrupts disabled, so there is no need to set up early exception handlers for vectors >= 32 This saves some minor text size. Signed-off-by: Andi Kleen <ak@suse.de> Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:29 +02:00
Ingo Molnar	85eb69a16a	x86: increase the kernel text limit to 512 MB people sometimes do crazy stuff like building really large static arrays into their kernels or building allyesconfig kernels. Give more space to the kernel and push modules up a bit: kernel has 512 MB and modules have 1.5 GB. Should be enough for a few years ;-) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:45 +02:00
Ingo Molnar	d4afe41418	x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZE The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel text but data, bss and init sections as well. That name led me on the wrong path with the KERNEL_TEXT_SIZE regression, because i knew how big of _text_ my images have and i knew about the 40 MB "text" limit so i wrongly thought to be on the safe side of the 40 MB limit with my 29 MB of text, while the total image size was slightly above 40 MB. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00
Ingo Molnar	88f3aec7af	x86: fix spontaneous reboot with allyesconfig bzImage recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (4010241024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 `8646224` 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00
Sam Ravnborg	da5968ae30	x86: fix section mismatch in head_64.S:initial_code initial_code are initially used to hold a function pointer from __init and later from __cpuinit. This confuses modpost and changing initial_code to REFDATA silence the warning. (But now we do not discard the variable anymore). Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:31 +01:00
Thomas Gleixner	31eedd823c	x86: zap invalid and unused pmds in early boot The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting from __START_KERNEL_map. The kernel itself only needs _text to _end mapped in the high alias. On relocatible kernels the ASM setup code adjusts the compile time created high mappings to the relocation. This creates invalid pmd entries for negative offsets: 0xffffffff80000000 -> pmd entry: ffffffffff2001e3 It points outside of the physical address space and is marked present. This starts at the virtual address __START_KERNEL_map and goes up to the point where the first valid physical address (0x0) is mapped. Zap the mappings before _text and after _end right away in early boot. This removes also the invalid entries. Furthermore it simplifies the range check for high aliases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Sam Ravnborg	f1fbabb312	x86: fix 64-bit sections fix 64-bit section warnings. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Andi Kleen	31422c51e0	x86: rename LARGE_PAGE_SIZE to PMD_PAGE_SIZE Fix up all users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Ingo Molnar	076f9776f5	x86: make early printk selectable on 64-bit as well Enable CONFIG_EMBEDDED to select CONFIG_EARLY_PRINTK on 64-bit as well. saves ~2K: text data bss dec hex filename 7290283 3672091 1907848 12870222 c4624e vmlinux.before 7288373 3671795 1907848 12868016 c459b0 vmlinux.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:06 +01:00
Roland McGrath	8866cd9dc9	x86: early_idt_handler improvements, 64-bit It's not too pretty, but I found this made the "PANIC: early exception" messages become much more reliably useful: 1. print the vector number, 2. print the %cs value, 3. handle error-code-pushing vs non-pushing vectors. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:06 +01:00
Glauber de Oliveira Costa	49a697871e	x86: turn priviled operation into a macro in head_64.S under paravirt, read cr2 cannot be issued directly anymore. So wrap it in a macro, defined to the operation itself in case paravirt is off, but to something else if we have paravirt in the game Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:10 +01:00
Thomas Gleixner	250c22777f	x86_64: move kernel Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-11 11:17:24 +02:00

1 2 3

123 Commits