Commit Graph

588802 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
3ea223adcb perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples
In 473398a21d ("perf tools: Add cpumode to struct perf_sample"), I
missed some places where perf_sample fields are directly initialized in
addition to what is done in perf_evsel__parse_sample(), namely when
synthesizing PERF_RECORD_{MMAP*,COMM,FORK,EXIT} for pre-existing threads
and also in intel_pt and intel_bts when synthesizing events from
processor trace, the jitdump code also was affected, fix it.

The problem was noticed with running:

  # perf record -e intel_pt//u true
  # perf script

Where the samples wouldn't get resolved because perf_sample.cpumode
would be left as zero, i.e. PERF_RECORD_MISC_CPUMODE_UNKNOWN, not
resolving as kernel, hypervisor or user cpu modes.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 473398a21d ("perf tools: Add cpumode to struct perf_sample")
Link: http://lkml.kernel.org/n/tip-n5sdauxgk24d5nun8kuuu2mh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-29 20:03:56 -03:00
James Hogan
e95008a121 MIPS: cpu_name_string: Use raw_smp_processor_id().
If cpu_name_string() is used in non-atomic context when preemption is
enabled, it can trigger a BUG such as this one:

BUG: using smp_processor_id() in preemptible [00000000] code: unaligned/156
caller is __show_regs+0x1e4/0x330
CPU: 2 PID: 156 Comm: unaligned Tainted: G        W       4.3.0-00366-ga3592179816d-dirty #1501
Stack : ffffffff80900000 ffffffff8019bc18 000000000000005f ffffffff80a20000
         0000000000000000 0000000000000009 ffffffff8019c0e0 ffffffff80835648
         a8000000ff2bdec0 ffffffff80a1e628 000000000000009c 0000000000000002
         ffffffff80840000 a8000000fff2ffb0 0000000000000020 ffffffff8020e43c
         a8000000fff2fcf8 ffffffff80a20000 0000000000000000 ffffffff808f2607
         ffffffff8082b138 ffffffff8019cd1c 0000000000000030 ffffffff8082b138
         0000000000000002 000000000000009c 0000000000000000 0000000000000000
         0000000000000000 a8000000fff2fc40 0000000000000000 ffffffff8044dbf4
         0000000000000000 0000000000000000 0000000000000000 ffffffff8010c400
         ffffffff80855bb0 ffffffff8010d008 0000000000000000 ffffffff8044dbf4
         ...
Call Trace:
[<ffffffff8010d008>] show_stack+0x90/0xb0
[<ffffffff8044dbf4>] dump_stack+0x84/0xe0
[<ffffffff8046d4ec>] check_preemption_disabled+0x10c/0x110
[<ffffffff8010c40c>] __show_regs+0x1e4/0x330
[<ffffffff8010d060>] show_registers+0x28/0xc0
[<ffffffff80110748>] do_ade+0xcc8/0xce0
[<ffffffff80105b84>] resume_userspace_check+0x0/0x10

This is possible because cpu_name_string() is used by __show_regs(),
which is used by both show_regs() and show_registers(). These two
functions are used by various exception handling functions, only some of
which ensure that interrupts or preemption is disabled.

However the following have interrupts explicitly enabled or not
explicitly disabled:
- do_reserved() (irqs enabled)
- do_ade() (irqs not disabled)

This can be hit by setting /sys/kernel/debug/mips/unaligned_action to 2,
and triggering an address error exception, e.g. an unaligned access or
access to kernel segment from user mode.

To fix the above cases, use raw_smp_processor_id() instead. It is
unusual for CPU names to be different in the same system, and even if
they were, its possible the process has migrated between the exception
of interest and the cpu_name_string() call anyway.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12212/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-29 23:56:36 +02:00
Manuel Lauss
e34b6fcf9b pcmcia: db1xxx_ss: fix last irq_to_gpio user
remove the usage of removed irq_to_gpio() function.  On pre-DB1200
boards, pass the actual carddetect GPIO number instead of the IRQ,
because we need the gpio to actually test card status (inserted or
not) and can get the irq number with gpio_to_irq() instead.

Tested on DB1300 and DB1500, this patch fixes PCMCIA on the DB1500,
which used irq_to_gpio().

Fixes: 832f5dacfa ("MIPS: Remove all the uses of custom gpio.h")
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-pcmcia@lists.infradead.org
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Cc: stable@vger.kernel.org	# v4.3+
Patchwork: https://patchwork.linux-mips.org/patch/12747/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-29 22:48:53 +02:00
Mickaël Salaün
505ce68c6d selftest/seccomp: Fix the seccomp(2) signature
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Will Drewry <wad@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-03-29 13:01:36 -06:00
Mickaël Salaün
6c045d07bb selftest/seccomp: Fix the flag name SECCOMP_FILTER_FLAG_TSYNC
Rename SECCOMP_FLAG_FILTER_TSYNC to SECCOMP_FILTER_FLAG_TSYNC to match
the UAPI.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Will Drewry <wad@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2016-03-29 13:01:28 -06:00
Stefano Stabellini
85d1a29de8 Xen on ARM and ARM64: update MAINTAINERS info
Not my full time job anymore, but still maintaining it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-03-29 13:26:27 -04:00
Will Deacon
431597bb95 arm64: defconfig: updates for 4.6
A few defconfig updates got dropped on the floor during the merge window,
so I've rounded up the remainder here:

  * Fix duplicate definition of MMC_BLOCK_MINORS and bump to 32 for
    msm8916

  * CPUFreq support for the Juno platform, using the MHU/SCPI interface

  * Removal of the default command line, which assumed a console called
    ttyAMA0

  * Bits and pieces for the Hi6220 (96Boards HiKey)

Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-29 16:56:00 +01:00
Jan Kara
8f9e8f5fcc ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas
When Q_GETNEXTQUOTA was called for filesystem with quotas disabled
ocfs2_get_next_id() oopses. Fix the problem by checking early whether
the filesystem has quotas enabled.

Signed-off-by: Jan Kara <jack@suse.cz>
2016-03-29 17:44:11 +02:00
Andrew Price
82c7d823cc dlm: config: Fix ENOMEM failures in make_cluster()
Commit 1ae1602de0 "configfs: switch ->default groups to a linked list"
left the NULL gps pointer behind after removing the kcalloc() call which
made it non-NULL. It also left the !gps check in place so make_cluster()
now fails with ENOMEM. Remove the remaining uses of the gps variable to
fix that.

Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2016-03-29 10:28:08 -05:00
Jan Kara
17e8a8936c quota: Handle Q_GETNEXTQUOTA when quota is disabled
Currently we oopsed when Q_GETNEXTQUOTA got called when quota was
disabled. Properly check whether quota is enabled for the filesystem
before calling into quota format handler.

Reported-by: Ted Tso <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
2016-03-29 17:20:10 +02:00
Shannon Zhao
b8cfadfcef arm64: perf: Move PMU register related defines to asm/perf_event.h
To use the ARMv8 PMU related register defines from the KVM code, we move
the relevant definitions to asm/perf_event.h header file and rename them
with prefix ARMV8_PMU_. This allows us to get rid of kvm_perf_event.h.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-29 16:04:57 +01:00
James Morse
a6002ec5a8 arm64: opcodes.h: Add arm big-endian config options before including arm header
arm and arm64 use different config options to specify big endian. This
needs taking into account when including code/headers between the two
architectures.

A case in point is PAN, which uses the __instr_arm() macro to output
instructions. The macro comes from opcodes.h, which lives under arch/arm.
On a big-endian build the mismatched config options mean the instruction
isn't byte swapped correctly, resulting in undefined instruction exceptions
during boot:

| alternatives: patching kernel code
| kdevtmpfs[87]: undefined instruction: pc=ffffffc0004505b4
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| Internal error: Oops - undefined instruction: 0 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 87 Comm: kdevtmpfs Not tainted 4.1.16+ #5
| Hardware name: Hisilicon PhosphorHi1382 EVB (DT)
| task: ffffffc336591700 ti: ffffffc3365a4000 task.ti: ffffffc3365a4000
| PC is at dump_instr+0x68/0x100
| LR is at do_undefinstr+0x1d4/0x2a4
| pc : [<ffffffc00076231c>] lr : [<ffffffc0000811d4>] pstate: 604001c5
| sp : ffffffc3365a6450

Cc: <stable@vger.kernel.org> #4.3.x-
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-29 16:04:57 +01:00
Bjorn Andersson
d75868496c MAINTAINERS: Add mailing list for remote processor subsystems
Add the newly created linux-remoteproc mailing list for the three
subsystems related to remote processor management and communication.

Acked-by: Ohad Ben-Cohen <ohad@wizery.com>
Acked-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-03-29 07:05:24 -07:00
Boris Ostrovsky
dc6416f1d7 xen/x86: Call cpu_startup_entry(CPUHP_AP_ONLINE_IDLE) from xen_play_dead()
This call has always been missing from xen_play dead() but until
recently this was rather benign. With new cpu hotplug framework
(commit 8df3e07e7f ("cpu/hotplug: Let upcoming cpu bring itself fully up").
however this call is required, otherwise a hot-plugged CPU will not
be properly brough up (by never calling cpuhp_online_idle())

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2016-03-29 09:34:10 -04:00
Konrad Rzeszutek Wilk
8041dcc881 Linux 4.6-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW9xVQAAoJEHm+PkMAQRiGLCYH/RaD6PE78ttLoAIm2rTUY+P9
 jtNhRVVNPmNn6103r0g9RYxyN0k+KyRQiO0I3XC77i30ksJV0duqhZfpg9nyHaFZ
 ljiYBLWFcc6qyMiE5B9ogEwveQriKtbUTrIPHer/vEmOdJHDRoUIVZ/qQ+eNzGmq
 aHa0bJalCXqkRpXatfZBUz8OyblrNjZMpX6907vqR5kRrcXqRn52X32gmy6bwSRp
 oRRQvSjzzZzApISz1uQIY5TFEb9ARrtK2FUf/rM0WS4gE1tB/p8Ccx7iopqk/zjI
 f+vZEdQ6o6FdPZ38XZGnwyr1nf52AV5/7wJs+5D1rLCHK8buEJD01BiT7qQvpnI=
 =q0+P
 -----END PGP SIGNATURE-----

Merge tag 'v4.6-rc1' into for-linus-4.6

Linux 4.6-rc1

* tag 'v4.6-rc1': (12823 commits)
  Linux 4.6-rc1
  f2fs/crypto: fix xts_tweak initialization
  NTB: Remove _addr functions from ntb_hw_amd
  orangefs: fix orangefs_superblock locking
  orangefs: fix do_readv_writev() handling of error halfway through
  orangefs: have ->kill_sb() evict the VFS side of things first
  orangefs: sanitize ->llseek()
  orangefs-bufmap.h: trim unused junk
  orangefs: saner calling conventions for getting a slot
  orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
  orangefs: get rid of readdir_handle_s
  thp: fix typo in khugepaged_scan_pmd()
  MAINTAINERS: fill entries for KASAN
  mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
  kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
  mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
  arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
  mm, kasan: add GFP flags to KASAN API
  mm, kasan: SLAB support
  kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
  ...
2016-03-29 09:33:47 -04:00
Paul Burton
fa8ff601d7 MIPS: Fix MSA ld unaligned failure cases
Copying the content of an MSA vector from user memory may involve TLB
faults & mapping in pages. This will fail when preemption is disabled
due to an inability to acquire mmap_sem from do_page_fault, which meant
such vector loads to unmapped pages would always fail to be emulated.
Fix this by disabling preemption later only around the updating of
vector register state.

This change does however introduce a race between performing the load
into thread context & the thread being preempted, saving its current
live context & clobbering the loaded value. This should be a rare
occureence, so optimise for the fast path by simply repeating the load if
we are preempted.

Additionally if the copy failed then the failure path was taken with
preemption left disabled, leading to the kernel typically encountering
further issues around sleeping whilst atomic. The change to where
preemption is disabled avoids this issue.

Fixes: e4aa1f153a "MIPS: MSA unaligned memory access support"
Reported-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: James Cowgill <James.Cowgill@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: stable <stable@vger.kernel.org> # v4.3
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12345/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-29 14:18:18 +02:00
Qais Yousef
19fb5818ed MIPS: Fix broken malta qemu
Malta defconfig compiles with GIC on. Hence when compiling for SMP it causes
the new IPI code to be activated. But on qemu malta there's no GIC causing a
BUG_ON(!ipidomain) to be hit in mips_smp_ipi_init().

Since in that configuration one can only run a single core SMP (!), skip IPI
initialisation if we detect that this is the case. It is a sensible
behaviour to introduce and should keep such possible configuration to run
rather than die hard unnecessarily.

Signed-off-by: Qais Yousef <qsyousef@gmail.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12892/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-03-29 14:18:18 +02:00
Frederic Weisbecker
5529578a27 locking/atomic, sched: Unexport fetch_or()
This patch functionally reverts:

  5fd7a09cfb ("atomic: Export fetch_or()")

During the merge Linus observed that the generic version of fetch_or()
was messy:

  " This makes the ugly "fetch_or()" macro that the scheduler used
    internally a new generic helper, and does a bad job at it. "

  e23604edac Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Now that we have introduced atomic_fetch_or(), fetch_or() is only used
by the scheduler in order to deal with thread_info flags which type
can vary across architectures.

Lets confine fetch_or() back to the scheduler so that we encourage
future users to use the more robust and well typed atomic_t version
instead.

While at it, fetch_or() gets robustified, pasting improvements from a
previous patch by Ingo Molnar that avoids needless expression
re-evaluations in the loop.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1458830281-4255-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 11:52:11 +02:00
Frederic Weisbecker
f009a7a767 timers/nohz: Convert tick dependency mask to atomic_t
The tick dependency mask was intially unsigned long because this is the
type on which clear_bit() operates on and fetch_or() accepts it.

But now that we have atomic_fetch_or(), we can instead use
atomic_andnot() to clear the bit. This consolidates the type of our
tick dependency mask, reduce its size on structures and benefit from
possible architecture optimizations on atomic_t operations.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1458830281-4255-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 11:52:11 +02:00
Frederic Weisbecker
5acba71e18 locking/atomic: Introduce atomic_fetch_or()
This is deemed to replace the type generic fetch_or() which brings a lot
of issues such as macro induced block variable aliasing and sloppy types.
Not to mention fetch_or() doesn't refer to any namespace, adding even
more confusion.

So lets provide an atomic_t version. Current and next users of fetch_or()
are thus encouraged to use atomic_t.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1458830281-4255-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 11:52:11 +02:00
Huang Rui
34a4cceb78 x86/cpu: Add advanced power management bits
Bit 11 of CPUID 8000_0007 edx is processor feedback interface.
Bit 12 of CPUID 8000_0007 edx is accumulated power.

Print proper names in proc/cpuinfo

Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Tony Li <tony.li@amd.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: "Len Brown" <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1458871720-3209-1-git-send-email-ray.huang@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 11:12:11 +02:00
Borislav Petkov
5f870a3f71 x86/thread_info: Merge two !__ASSEMBLY__ sections
We have

  #ifndef __ASSEMBLY__
  ...
  #endif

  #ifndef __ASSEMBLY__
  ...
  #endif

Merge the two.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1459189217-25532-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 11:12:10 +02:00
Vladimir Zapolskiy
4a6772f514 x86/cpufreq: Remove duplicated TDP MSR macro definitions
The list of CPU model specific registers contains two copies of TDP
registers, remove the one, which is out of numerical order in the
list.

Fixes: 6a35fc2d6c ("cpufreq: intel_pstate: get P1 from TAR when available")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Kristen Carlson
 Accardi <kristen@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: http://lkml.kernel.org/r/1459018020-24577-1-git-send-email-vladimir_zapolskiy@mentor.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 11:12:10 +02:00
Harald Freudenberger
74b2375e67 s390/crypto: provide correct file mode at device register.
When the prng device driver calls misc_register() there is the possibility
to also provide the recommented file permissions. This fix now gives
useful values (0644) where previously just the default was used (resulting
in 0600 for the device file).

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-29 10:55:12 +02:00
Borislav Petkov
f7be8610bc x86/Documentation: Start documenting x86 topology
This should contain important aspects of how we represent the system
topology on x86. If people have questions about it and this file doesn't
answer it, then it must be updated.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160328095609.GD26651@pd.tnic
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Borislav Petkov
8196dab4fc x86/cpu: Get rid of compute_unit_id
It is cpu_core_id anyway.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1458917557-8757-3-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Peter Zijlstra
32b62f4468 perf/x86/amd: Cleanup Fam10h NB event constraints
Avoid allocating the AMD NB event constraints data structure when not
needed. This gets rid of x86_max_cores usage and avoids allocating
this on AMD Core Perfctr supporting hardware (which has separate MSRs
for NB events).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: aherrmann@suse.com
Cc: Rui Huang <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: jencce.kernel@gmail.com
Link: http://lkml.kernel.org/r/20160320124629.GY6375@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Peter Zijlstra
ee6825c80e x86/topology: Fix AMD core count
It turns out AMD gets x86_max_cores wrong when there are compute
units.

The issue is that Linux assumes:

	nr_logical_cpus = nr_cores * nr_siblings

But AMD reports its CU unit as 2 cores, but then sets num_smp_siblings
to 2 as well.

Boris: fixup ras/mce_amd_inj.c too, to compute the Node Base Core
properly, according to the new nomenclature.

Fixes: 1f12e32f4c ("x86/topology: Create logical package id")
Reported-by: Xiong Zhou <jencce.kernel@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Herrmann <aherrmann@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20160317095220.GO6344@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Ingo Molnar
5dc1037305 perf/urgent fix:
- Fix build break on powerpc due to missing headers with prototypes for
   functions defined in tools/perf/arch/powerpc/util/header.c (Sukadev Bhattiprolu)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW+ZkkAAoJENZQFvNTUqpALsgP/AzeyT5Yqj6qUWoNsSryPsph
 89E/M+008jJ8QTBbde1sO2u5SXpnHA26CPQBXawGSK7zrySeqvxzdhMXZ9c4OF8t
 tvOQYl+W6KWdXREovFwsGyrBA12auIymm5DzsldIrbCHQH6neDP6TXkxKAxQrWOf
 2gWvxBm5Q7yYnUdIEWE7PoqKeldpH1il+dCteU8ASIRNYWPYYCXVTV47MrXsAJcv
 paHliq5LIxG3+LAyNJMtbwK98IfFGn7PefZ7mEHKWA1APjnVjNKIH3Sg2CCvs92c
 mVRoNfHCzrO7hAVQzmFSIh4KEgDxbMl1EXx+zjD5q4nhCsr4Y0IzSgzojDu1v/FU
 vcKkRSepGQrZ1cPuGiIi//+qryUl8gXVEMQy18MEhpgw20/KhdsoRvsfgAV3pxFf
 oXw3PrMbi+pipCoZkMUH5xp9mmq1vKd3s9D8UScsQ/VWMcnw/HnXoJOB4FMglW5o
 BSQLraFZXXoX3/fsXDA8HV0xzMvkSY0bXooeVx/PKQlAW8RgDnqDaa6LxtZ2HeOV
 8lg2aXuM1zGVuJMnIiDfWPYtUSoI2NRwcwsAx03Hr9Ukv9WpCVknQ4G2h3/jQO7/
 9MDn5QtGQV1eQ4nTunGeSHIThGOHZFQOB/vG+eqQBJnPjHEN9m5pvlQxuMYg/3vG
 LMkHzt/X0NaqNn1Ebhuw
 =B8Q3
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-20160328' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

  - Fix build break on PowerPC due to missing headers with prototypes for
    functions defined in tools/perf/arch/powerpc/util/header.c (Sukadev Bhattiprolu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 10:39:12 +02:00
Simon Guo
71528d8bd7 powerpc: Correct used_vsr comment
The used_vsr flag is set if process has used VSX registers, not Altivec
registers. But the comment says otherwise, correct the comment.

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29 12:08:08 +11:00
Oliver O'Halloran
01d7c2a2de powerpc/process: Fix altivec SPR not being saved
In save_sprs() in process.c contains the following test:

	if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC)))
		t->vrsave = mfspr(SPRN_VRSAVE);

CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test
is equivilent to:

	if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
		cpu_has_feature(CPU_FTR_COHERENT_ICACHE))

On CPUs without support for both (i.e G5) this results in vrsave not
being saved between context switches. The vector register save/restore
code doesn't use VRSAVE to determine which registers to save/restore,
but the value of VRSAVE is used to determine if altivec is being used
in several code paths.

Fixes: 152d523e63 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()")
Cc: stable@vger.kernel.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29 12:08:08 +11:00
Sebastian Siewior
08a5bb2921 powerpc/mm: Fixup preempt underflow with huge pages
hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
accessing the variable did not migrate from one CPU to another and soon
this was noticed by Tiejun Chen in 94b09d7554 ("powerpc/hugetlb:
Replace __get_cpu_var with get_cpu_var"). So we had it fixed.

Christoph Lameter was doing his __get_cpu_var() replaces and forgot
PowerPC. Then he noticed this and sent his fixed up batch again which
got applied as 69111bac42 ("powerpc: Replace __get_cpu_var uses").

The careful reader will noticed one little detail: get_cpu_var() got
replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
a preempt_enable() and nothing that does preempt_disable() so we
underflow the preempt counter.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29 12:08:07 +11:00
Dan Williams
fc0c202813 x86, pmem: use memcpy_mcsafe() for memcpy_from_pmem()
Update the definition of memcpy_from_pmem() to return 0 or a negative
error code.  Implement x86/arch_memcpy_from_pmem() with memcpy_mcsafe().

Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-28 17:19:31 -07:00
Bjørn Mork
e84810c7b8 qmi_wwan: add "D-Link DWM-221 B1" device id
Thomas reports:
"Windows:

00 diagnostics
01 modem
02 at-port
03 nmea
04 nic

Linux:

T:  Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=2001 ProdID=7e19 Rev=02.32
S:  Manufacturer=Mobile Connect
S:  Product=Mobile Connect
S:  SerialNumber=0123456789ABCDEF
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage"

Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-28 20:12:01 -04:00
Vladimir Zapolskiy
4b28038fd6 remoteproc: st: fix check of syscon_regmap_lookup_by_phandle() return value
syscon_regmap_lookup_by_phandle() returns either a valid pointer to
struct regmap or ERR_PTR() error value, check for NULL is invalid and
on error path may lead to oops, the change corrects the check.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-03-28 16:19:00 -07:00
Sukadev Bhattiprolu
379649cfea perf tools: Fix build break on powerpc
Commit 531d241063 ("perf tools: Do not include stringify.h from the
kernel sources") seems to have accidentially removed the inclusion of
"util/header.h" from "arch/powerpc/util/header.c".

"util/header.h" provides the prototype for get_cpuid() and is needed to
build perf on Powerpc:

	arch/powerpc/util/header.c:17:1: error: no previous prototype for 'get_cpuid' [-Werror=missing-prototypes]

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 531d241063 ("perf tools: Do not include stringify.h from the kernel sources")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[ Included "util.h" too, to get the scnprintf() prototype ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-28 17:46:20 -03:00
Linus Torvalds
1993b176a8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide
Pull IDE fixes from David Miller:
 "Just two small changes:

  1) Remove bogus init annotation in icside, from Arnd Bergmann.

  2) Don't use zero clock rates in palm_bk3710 driver, from Wolfram
     Sang"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
  ide: palm_bk3710: test clock rate to avoid division by 0
  ide: icside: remove incorrect initconst annotation
2016-03-28 15:17:02 -05:00
Linus Torvalds
d4dc3b2442 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Minor typing cleanup from Joe Perches, and some comment typo fixes
  from Adam Buchbinder"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Convert naked unsigned uses to unsigned int
  sparc: Fix misspellings in comments.
2016-03-28 15:14:11 -05:00
Linus Torvalds
ec3c07377b Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile bugfixes from Chris Metcalf:
 "These include updates to MAINTAINERS, some comment spelling fixes, and
  a bugfix to the tile kgdb.c support"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: Fix misspellings in comments.
  MAINTAINERS: update web link for tile architecture
  MAINTAINERS: update arch/tile maintainer email domain
  tile kgdb: fix bug in copy to gdb regs, and optimize memset
2016-03-28 15:04:55 -05:00
David S. Miller
0c84ea17ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for you net tree,
they are:

1) There was a race condition between parallel save/swap and delete,
   which resulted a kernel crash due to the increase ref for save, swap,
   wrong ref decrease operations. Reported and fixed by Vishwanath Pai.

2) OVS should call into CT NAT for packets of new expected connections only
   when the conntrack state is persisted with the 'commit' option to the
   OVS CT action. From Jarno Rajahalme.

3) Resolve kconfig dependencies with new OVS NAT support. From Arnd Bergmann.

4) Early validation of entry->target_offset to make sure it doesn't take us
   out from the blob, from Florian Westphal.

5) Again early validation of entry->next_offset to make sure it doesn't take
   out from the blob, also from Florian.

6) Check that entry->target_offset is always of of sizeof(struct xt_entry)
   for unconditional entries, when checking both from check_underflow()
   and when checking for loops in mark_source_chains(), again from
   Florian.

7) Fix inconsistent behaviour in nfnetlink_queue when
   NFQA_CFG_F_FAIL_OPEN is set and netlink_unicast() fails due to buffer
   overrun, we have to reinject the packet as the user expects.

8) Enforce nul-terminated table names from getsockopt GET_ENTRIES
   requests.

9) Don't assume skb->sk is set from nft_bridge_reject and synproxy,
   this fixes a recent update of the code to namespaceify
   ip_default_ttl, patch from Liping Zhang.

This batch comes with four patches to validate x_tables blobs coming
from userspace. CONFIG_USERNS exposes the x_tables interface to
unpriviledged users and to be honest this interface never received the
attention for this move away from the CAP_NET_ADMIN domain. Florian is
working on another round with more patches with more sanity checks, so
expect a bit more Netfilter fixes in this development cycle than usual.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-28 15:38:59 -04:00
Jaegeuk Kim
fd694733d5 f2fs: cover large section in sanity check of super
This patch fixes the bug which does not cover a large section case when checking
the sanity of superblock.
If f2fs detects misalignment, it will fix the superblock during the mount time,
so it doesn't need to trigger fsck.f2fs further.

Reported-by: Matthias Prager <linux@matthiasprager.de>
Reported-by: David Gnedt <david.gnedt@davizone.at>
Cc: stable 4.5+ <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-03-28 10:43:01 -07:00
Liping Zhang
29421198c3 netfilter: ipv4: fix NULL dereference
Commit fa50d974d1 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
use sock_net(skb->sk) to get the net namespace, but we can't assume
that sk_buff->sk is always exist, so when it is NULL, oops will happen.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:59:29 +02:00
Pablo Neira Ayuso
b301f25387 netfilter: x_tables: enforce nul-terminated table name from getsockopt GET_ENTRIES
Make sure the table names via getsockopt GET_ENTRIES is nul-terminated
in ebtables and all the x_tables variants and their respective compat
code. Uncovered by KASAN.

Reported-by: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:59:24 +02:00
Pablo Neira Ayuso
931401137f netfilter: nfnetlink_queue: honor NFQA_CFG_F_FAIL_OPEN when netlink unicast fails
When netlink unicast fails to deliver the message to userspace, we
should also check if the NFQA_CFG_F_FAIL_OPEN flag is set so we reinject
the packet back to the stack.

I think the user expects no packet drops when this flag is set due to
queueing to userspace errors, no matter if related to the internal queue
or when sending the netlink message to userspace.

The userspace application will still get the ENOBUFS error via recvmsg()
so the user still knows that, with the current configuration that is in
place, the userspace application is not consuming the messages at the
pace that the kernel needs.

Reported-by: "Yigal Reiss (yreiss)" <yreiss@cisco.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: "Yigal Reiss (yreiss)" <yreiss@cisco.com>
2016-03-28 17:59:20 +02:00
Florian Westphal
54d83fc74a netfilter: x_tables: fix unconditional helper
Ben Hawkes says:

 In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
 is possible for a user-supplied ipt_entry structure to have a large
 next_offset field. This field is not bounds checked prior to writing a
 counter value at the supplied offset.

Problem is that mark_source_chains should not have been called --
the rule doesn't have a next entry, so its supposed to return
an absolute verdict of either ACCEPT or DROP.

However, the function conditional() doesn't work as the name implies.
It only checks that the rule is using wildcard address matching.

However, an unconditional rule must also not be using any matches
(no -m args).

The underflow validator only checked the addresses, therefore
passing the 'unconditional absolute verdict' test, while
mark_source_chains also tested for presence of matches, and thus
proceeeded to the next (not-existent) rule.

Unify this so that all the callers have same idea of 'unconditional rule'.

Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:59:15 +02:00
Florian Westphal
6e94e0cfb0 netfilter: x_tables: make sure e->next_offset covers remaining blob size
Otherwise this function may read data beyond the ruleset blob.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:59:08 +02:00
Florian Westphal
bdf533de69 netfilter: x_tables: validate e->target_offset early
We should check that e->target_offset is sane before
mark_source_chains gets called since it will fetch the target entry
for loop detection.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:59:04 +02:00
Arnd Bergmann
99b7248e2a openvswitch: call only into reachable nf-nat code
The openvswitch code has gained support for calling into the
nf-nat-ipv4/ipv6 modules, however those can be loadable modules
in a configuration in which openvswitch is built-in, leading
to link errors:

net/built-in.o: In function `__ovs_ct_lookup':
:(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
:(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'

The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
but NF_NAT is set to 'y' if any of the symbols selecting
it are built-in, but the link error happens when any of them
are modular.

A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
to be useful in practice, but the driver currently only handles
IPv6 being optional.

This patch improves the Kconfig dependency so that openvswitch
cannot be built-in if either of the two other symbols are set
to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
with two "if (IS_ENABLED())" checks that should catch all corner
cases also make the code more readable.

The same #ifdef exists ovs_ct_nat_to_attr(), where it does not
cause a link error, but for consistency I'm changing it the same
way.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 05752523e5 ("openvswitch: Interface with NAT.")
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:58:59 +02:00
Jarno Rajahalme
5745b0be05 openvswitch: Fix checking for new expected connections.
OVS should call into CT NAT for packets of new expected connections only
when the conntrack state is persisted with the 'commit' option to the
OVS CT action.  The test for this condition is doubly wrong, as the CT
status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
than the mask (IPS_EXPECTED), and due to the wrong assumption that the
expected bit would apply only for the first (i.e., 'new') packet of a
connection, while in fact the expected bit remains on for the lifetime of
an expected connection.  The 'ctinfo' value IP_CT_RELATED derived from
the ct status can be used instead, as it is only ever applicable to
the 'new' packets of the expected connection.

Fixes: 05752523e5 ('openvswitch: Interface with NAT.')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:58:51 +02:00
Vishwanath Pai
596cf3fe58 netfilter: ipset: fix race condition in ipset save, swap and delete
This fix adds a new reference counter (ref_netlink) for the struct ip_set.
The other reference counter (ref) can be swapped out by ip_set_swap and we
need a separate counter to keep track of references for netlink events
like dump. Using the same ref counter for dump causes a race condition
which can be demonstrated by the following script:

ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
counters
ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
counters
ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
counters

ipset save &

ipset swap hash_ip3 hash_ip2
ipset destroy hash_ip3 /* will crash the machine */

Swap will exchange the values of ref so destroy will see ref = 0 instead of
ref = 1. With this fix in place swap will not succeed because ipset save
still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).

Both delete and swap will error out if ref_netlink != 0 on the set.

Note: The changes to *_head functions is because previously we would
increment ref whenever we called these functions, we don't do that
anymore.

Reviewed-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-28 17:57:45 +02:00