Commit Graph

433 Commits

Author SHA1 Message Date
Prarit Bhargava
d3051b489a modules, lock around setting of MODULE_STATE_UNFORMED
A panic was seen in the following sitation.

There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
driver module.

When doing this, the following panic occurred:

 ------------[ cut here ]------------
 kernel BUG at kernel/module.c:3739!
 invalid opcode: 0000 [#1] SMP
 Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
 CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
 task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
 RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
 RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
 RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
 RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
 RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
 R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
 R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
 FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
  ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
  ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
 Call Trace:
  [<ffffffff810d666c>] m_show+0x19c/0x1e0
  [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
  [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
  [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
  [<ffffffff811c1a58>] SyS_read+0x58/0xb0
  [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
 Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
 RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
  RSP <ffff88080fa7fe18>

    Consider the two processes running on the system.

    CPU 0 (/proc/modules reader)
    CPU 1 (loading/unloading module)

    CPU 0 opens /proc/modules, and starts displaying data for each module by
    traversing the modules list via fs/seq_file.c:seq_open() and
    fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
    does

            op->start()  <-- this is a pointer to m_start()
            op->show()   <- this is a pointer to m_show()
            op->stop()   <-- this is a pointer to m_stop()

    The m_start(), m_show(), and m_stop() module functions are defined in
    kernel/module.c. The m_start() and m_stop() functions acquire and release
    the module_mutex respectively.

    ie) When reading /proc/modules, the module_mutex is acquired and released
    for each module.

    m_show() is called with the module_mutex held.  It accesses the module
    struct data and attempts to write out module data.  It is in this code
    path that the above BUG_ON() warning is encountered, specifically m_show()
    calls

    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;

            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
    ...

    The other thread, CPU 1, in unloading the module calls the syscall
    delete_module() defined in kernel/module.c.  The module_mutex is acquired
    for a short time, and then released.  free_module() is called without the
    module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
    also without the module_mutex.  Some additional code is called and then the
    module_mutex is reacquired to remove the module from the modules list:

        /* Now we can delete it from the lists */
        mutex_lock(&module_mutex);
        stop_machine(__unlink_module, mod, NULL);
        mutex_unlock(&module_mutex);

This is the sequence of events that leads to the panic.

CPU 1 is removing dummy_module via delete_module().  It acquires the
module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.

CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module.  The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.

Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.

CPU 0 now calls module_flags() with dummy_module and ...

static char *module_flags(struct module *mod, char *buf)
{
        int bx = 0;

        BUG_ON(mod->state == MODULE_STATE_UNFORMED);

and BOOM.

Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.

Testing: In the unpatched kernel I can panic the system within 1 minute by
doing

while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done

and

while (true) do cat /proc/modules; done

in separate terminals.

In the patched kernel I was able to run just over one hour without seeing
any issues.  I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.

        dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
        dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
        dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2014-10-15 10:20:09 +10:30
Linus Torvalds
6325e940e7 arm64 updates for 3.18:
- eBPF JIT compiler for arm64
 - CPU suspend backend for PSCI (firmware interface) with standard idle
   states defined in DT (generic idle driver to be merged via a different
   tree)
 - Support for CONFIG_DEBUG_SET_MODULE_RONX
 - Support for unmapped cpu-release-addr (outside kernel linear mapping)
 - set_arch_dma_coherent_ops() implemented and bus notifiers removed
 - EFI_STUB improvements when base of DRAM is occupied
 - Typos in KGDB macros
 - Clean-up to (partially) allow kernel building with LLVM
 - Other clean-ups (extern keyword, phys_addr_t usage)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNB6NAAoJEGvWsS0AyF7x22sP/1qPQvFoY71fSqTZmSY+kfgW
 UMXhDFZOd+khD2TPHWptbgBRDElTQjRPHyISv/8ILKwDNoMlUDLlYkp1XPLM/nlB
 ea9ou2GX8iktqgM2JF5r4vk1hjH6JqEGOUHyWKZc7ibphTVm3dhg3nWL1A4peOUG
 0UyX79kl8BLAaggLSUhjtUz1GMpSNlb6Pc1ForUXaPMayBlOcVoOzh1ir7b5wb3e
 IvotUY1gv+opE9uK0QPr1AJSfpCogPEfQ2TSCP8MQZjxkrEz69n0HaFvdy60rwf4
 DaJiqBoQ5MSP3Bw+qvoYgyz+tfiPFAvEF+O3YQ5x3LBTteoooriFYH4mL7DsicAs
 2WLor/342mHykE0bOc44/gNl8B/xaZNzvO2ezLYrjVGsiY2QHTZ7fXB8arPUvQSS
 RUXVfHmcv4qthZjI17rgreBKvsfeFIMighSfvMJnVhGqDSvB8abjiPwZjzqB91Bq
 pu5MDitNgR3k3ctwzRaS6JtH2CluVFv97xIS4VaD/hm3JnS5NPeTXFou3Gb3lvon
 d/wXOIB3vY8FDMIt+BMCQPzWiU0liZ/sN7p1bsOmkgZ1wLOZ0nmsaHF09PDRGbtA
 vifopwaw9qtNlcVrTB/rDBCDaT0Ds/mTYD/a3+ch5CYUeLmQmfW/vBMfq/3gUt65
 JdI/nTVXawbl2CpBWw36
 =SAfQ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 - eBPF JIT compiler for arm64
 - CPU suspend backend for PSCI (firmware interface) with standard idle
   states defined in DT (generic idle driver to be merged via a
   different tree)
 - Support for CONFIG_DEBUG_SET_MODULE_RONX
 - Support for unmapped cpu-release-addr (outside kernel linear mapping)
 - set_arch_dma_coherent_ops() implemented and bus notifiers removed
 - EFI_STUB improvements when base of DRAM is occupied
 - Typos in KGDB macros
 - Clean-up to (partially) allow kernel building with LLVM
 - Other clean-ups (extern keyword, phys_addr_t usage)

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (51 commits)
  arm64: Remove unneeded extern keyword
  ARM64: make of_device_ids const
  arm64: Use phys_addr_t type for physical address
  aarch64: filter $x from kallsyms
  arm64: Use DMA_ERROR_CODE to denote failed allocation
  arm64: Fix typos in KGDB macros
  arm64: insn: Add return statements after BUG_ON()
  arm64: debug: don't re-enable debug exceptions on return from el1_dbg
  Revert "arm64: dmi: Add SMBIOS/DMI support"
  arm64: Implement set_arch_dma_coherent_ops() to replace bus notifiers
  of: amba: use of_dma_configure for AMBA devices
  arm64: dmi: Add SMBIOS/DMI support
  arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()
  arm64:mm: initialize max_mapnr using function set_max_mapnr
  setup: Move unmask of async interrupts after possible earlycon setup
  arm64: LLVMLinux: Fix inline arm64 assembly for use with clang
  arm64: pageattr: Correctly adjust unaligned start addresses
  net: bpf: arm64: fix module memory leak when JIT image build fails
  arm64: add PSCI CPU_SUSPEND based cpu_suspend support
  arm64: kernel: introduce cpu_init_idle CPU operation
  ...
2014-10-08 05:34:24 -04:00
Kyle McMartin
6c34f1f542 aarch64: filter $x from kallsyms
Similar to ARM, AArch64 is generating $x and $d syms... which isn't
terribly helpful when looking at %pF output and the like. Filter those
out in kallsyms, modpost and when looking at module symbols.

Seems simplest since none of these check EM_ARM anyway, to just add it
to the strchr used, rather than trying to make things overly
complicated.

initcall_debug improves:
dmesg_before.txt: initcall $x+0x0/0x154 [sg] returned 0 after 26331 usecs
dmesg_after.txt: initcall init_sg+0x0/0x154 [sg] returned 0 after 15461 usecs

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-10-02 17:01:51 +01:00
Jani Nikula
6a4c264313 module: rename KERNEL_PARAM_FL_NOARG to avoid confusion
Make it clear this is about kernel_param_ops, not kernel_param (which
will soon have a flags field of its own). No functional changes.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Jon Mason <jon.mason@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-27 21:54:07 +09:30
Andy Lutomirski
ff7e0055bb module: Clean up ro/nx after early module load failures
The commit

    4982223e51 module: set nx before marking module MODULE_STATE_COMING.

introduced a regression: if a module fails to parse its arguments or
if mod_sysfs_setup fails, then the module's memory will be freed
while still read-only.  Anything that reuses that memory will crash
as soon as it tries to write to it.

Cc: stable@vger.kernel.org # v3.16
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-16 04:47:00 +09:30
Linus Torvalds
c8d6637d04 This finally applies the stricter sysfs perms checking we pulled out
before last merge window.  A few stragglers are fixed (thanks linux-next!)
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT6CrEAAoJENkgDmzRrbjx3GoQAI1rt8XbTE8zVGf1PKp0SL10
 gWWL9BnnHtUFriwgIbT4mBa1p0wnavIzJIeUBH0rJb2BNAbf7mBT7CFPrMuS+iV2
 WlRoy/chIFnX5A7m6ddaHnzL8lPhMFvUi8dpvxO6FwpyhhNcUHqmb+uCZeLjTX/m
 Gj5mlOlilvH2NSugKyiTapCgcQMQqaaxcwKxyg1z3FRo12gwKvTBdjzdA3Fg7k4T
 TAEbTG4Fq6Q7DkQYDpJK2KWDkPmJ7hxExHFW/M0m1r7DpxY1oHI95TsugU3Mr2mM
 90S15vA6Sn0l1+bRiv5qHF26VjOpdhC8uQhydjnX+lqzBGBRNoMUE/ubmxd43G4m
 /VlVJ9ZD40HLEmRFdtJI6UZSHYwDh7eruVH7Sjj8KFiqGps/F6nDOhV7fVLOdI+0
 J9pLBbj1mA38pIK/XC3r2k8Z/u9GB/7tJFirzmk5rIVzNb/4GBrn/Cgf+GDX7djz
 r8c2QnLeUIht5fm34qKNnSQ/o+ZBKmG6f2bLuBesntZMsAD2cC5TUEP15NERuF3a
 Wa7Wn1Y9WuonH7O3j+PoUOys/bGLXZeFXfKYS8A8SGroE99xo/QhkRm/sNU0+wEz
 JTN4Sra03imE/YSniFnRyRiAShR3KAVen/yfOx6XPs/r5XrFG14Q7cqCKjp1EjHj
 TX5scRWFM5qntTSloGJt
 =9mjn
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "This finally applies the stricter sysfs perms checking we pulled out
  before last merge window.  A few stragglers are fixed (thanks
  linux-next!)"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files
  arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files
  drivers/video/fbdev/s3c2410fb.c: don't make debug world-writable.
  ARM: avoid ARM binutils leaking ELF local symbols
  scripts: modpost: Remove numeric suffix pattern matching
  scripts: modpost: fix compilation warning
  sysfs: disallow world-writable files.
  module: return bool from within_module*()
  module: add within_module() function
  modules: Fix build error in moduleloader.h
2014-08-10 21:31:58 -07:00
Russell King
2e3a10a155 ARM: avoid ARM binutils leaking ELF local symbols
Symbols starting with .L are ELF local symbols and should not appear
in ELF symbol tables.  However, unfortunately ARM binutils leaks the
.LANCHOR symbols into the symbol table, which leads kallsyms to report
these symbols rather than the real name.  It is not very useful when
%pf reports symbols against these leaked .LANCHOR symbols.

Arrange for kallsyms to ignore these symbols using the same mechanism
that is used for the ARM mapping symbols.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-07-27 20:52:47 +09:30
Petr Mladek
9b20a352d7 module: add within_module() function
It is just a small optimization that allows to replace few
occurrences of within_module_init() || within_module_core()
with a single call.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-07-27 20:52:43 +09:30
Jarod Wilson
002c77a48b crypto: fips - only panic on bad/missing crypto mod signatures
Per further discussion with NIST, the requirements for FIPS state that
we only need to panic the system on failed kernel module signature checks
for crypto subsystem modules. This moves the fips-mode-only module
signature check out of the generic module loading code, into the crypto
subsystem, at points where we can catch both algorithm module loads and
mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on
CONFIG_MODULE_SIG, as this is entirely necessary for FIPS mode.

v2: remove extraneous blank line, perform checks in static inline
function, drop no longer necessary fips.h include.

CC: "David S. Miller" <davem@davemloft.net>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Stephan Mueller <stephan.mueller@atsec.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:38:32 +08:00
Linus Torvalds
4251c2a670 Most of this is cleaning up various driver sysfs permissions so we can
re-add the perm check (we unified the module param and sysfs checks, but
 the module ones were stronger so we weakened them temporarily).
 
 Param parsing gets documented, and also "--" now forces args to be
 handed to init (and ignored by the kernel).
 
 Module NX/RO protections get tightened: we now set them before calling
 parse_args().
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTl+oJAAoJENkgDmzRrbjxtUEP/jIXml01jE2HquOJ/DfrCJOt
 ry5L5Iy8wVBRotTszrXqlD6+W8fLYsEdhM65Wof1H7X1qjaulqYZmrL7bQn4rIGN
 YPUmO5rOzECeAPNW5+e2JLnR4bmS99gVcWzJFCHUBd7Z8ceKaoIk7/XvUg6Mdjg7
 v0kJ5X+U9da2sVYYcZ71euth4ADLFDRNRexA1mPI6mKzJLOBgfvCBWZnkFVdBcjd
 VmL6ceFo/yP9Ed4pgG/4uXq1dZ4ZttpjPusDmNcjq+snOzsQb4tW+KB2Pr6iTwQy
 TDt7lQm5+xfUXgUG/S5L6PYn10P44Voo7AEJa+QK5YPSOY/eRVA0h4/ayP0vqDaJ
 LpZjqXbW77G4yOgEV9KRFLLXiFXykTh2TyCPYL5G2XVXQp1OmViu2f21JWJLFLgL
 mqOXYWdowOGVOOoTgwxIdxczCFCATJUaU5Ig6ay8C02E2mCwIV+IaGSdpsCiyjz/
 dNNumMxWg0NMo/c0YG4K3Ake6ZaGrwbnuJYijaEj6mgpifhh7k4yhFciXGLpkLnS
 Yuo4ORO0GX34z1+bX0iwrgMGPdy7+BnbXsDdWJsbsnwnKKes/Sp44fNl4lPwdM3n
 siaPsxmfAtl9EGqbkU1Fk+x5+X/Lv2I/7/nX5n53520RLkJJpbeMDfHUqpbrqeUN
 JNUTOZ9o72EqDVKnn175
 =IxSN
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Most of this is cleaning up various driver sysfs permissions so we can
  re-add the perm check (we unified the module param and sysfs checks,
  but the module ones were stronger so we weakened them temporarily).

  Param parsing gets documented, and also "--" now forces args to be
  handed to init (and ignored by the kernel).

  Module NX/RO protections get tightened: we now set them before calling
  parse_args()"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: set nx before marking module MODULE_STATE_COMING.
  samples/kobject/: avoid world-writable sysfs files.
  drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
  drivers/staging/speakup/: avoid world-writable sysfs files.
  drivers/regulator/virtual: avoid world-writable sysfs files.
  drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
  drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
  drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
  drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
  speakup: fix incorrect perms on speakup_acntsa.c
  cpumask.h: silence warning with -Wsign-compare
  Documentation: Update kernel-parameters.tx
  param: hand arguments after -- straight to init
  modpost: Fix resource leak in read_dump()
2014-06-11 16:09:14 -07:00
Rusty Russell
4982223e51 module: set nx before marking module MODULE_STATE_COMING.
We currently set RO & NX on modules very late: after we move them from
MODULE_STATE_UNFORMED to MODULE_STATE_COMING, and after we call
parse_args() (which can exec code in the module).

Much better is to do it in complete_formation() and then call
the notifier.

This means that the notifiers will be called on a module which
is already RO & NX, so that may cause problems (ftrace already
changed so they're unaffected).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-05-14 10:55:47 +09:30
Linus Torvalds
60b88f3941 Fixed one missing place for the new taint flag, and remove a warning
giving only false positives (now we finally figured out why).
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTYdCQAAoJENkgDmzRrbjxms4QALIGN2l8VugSoh3TSRHSGZtj
 5clH84FXkDR8DFA0w9rYxAsr1EhTadet8U1nCm6LWaz8FAPizH2hyUq6tFMU1+Jk
 zdWRPYLhuUBWW+XVFSeYo2gIclFHEYefawX9SmRcZJxuDy7xHW/bkmX/NT5p/Ll7
 3eKRPckO09agofLQgIOJGL21IQPFXYiCwur5b/OvNfzEkBfRmUALbO2oFhU+oebZ
 2P4M3Wmp7gEGbus2dB23v06BqpEhrdpXlAnvM61PS8exhsQI6ojgL3ZAYEl+6wkr
 whd0SjYs5Sd+3czlQDhlArYlcOlVAhvY4F5CHysEmM/CxjF1YAnk2Q7RLOV958Bk
 TTfDGG2b8qkJwN/2+CymDXyIUIppNPMuPXSOp3XQrRGOz8Uyh1URQD8l24Ssmrtt
 +3fUPDZ6npmtkxZdBu0SkdesCXYOtOeqpqt7MQpJiYbVMxx+ul4LnPB/A1+wf/Xx
 uvXMrpp1fz/hs9ZOK8n+nRMtbsc75LDQ0lYGcbbW8YJRkluf5/GJgyG8ptIvbbFW
 kh90ObVaJ2FN0Uj31POdtsOwM7tf2W5C1lZkE/aWf+wgNylHAylYoUHRIGFOcCqV
 PeWrD0Chz+bzrZk1sT6cHIvTu6u5ShjkOfcEGhWK2JFllxpKO4eZV4O1IaGhWaoV
 Y9JtmJNSOnnS261i1Rmb
 =725P
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module fixes from Rusty Russell:
 "Fixed one missing place for the new taint flag, and remove a warning
  giving only false positives (now we finally figured out why)"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: remove warning about waiting module removal.
  Fix: tracing: use 'E' instead of 'X' for unsigned module taint flag
2014-05-01 10:35:01 -07:00
Steven Rostedt (Red Hat)
a949ae560a ftrace/module: Hardcode ftrace_module_init() call into load_module()
A race exists between module loading and enabling of function tracer.

	CPU 1				CPU 2
	-----				-----
  load_module()
   module->state = MODULE_STATE_COMING

				register_ftrace_function()
				 mutex_lock(&ftrace_lock);
				 ftrace_startup()
				  update_ftrace_function();
				   ftrace_arch_code_modify_prepare()
				    set_all_module_text_rw();
				   <enables-ftrace>
				    ftrace_arch_code_modify_post_process()
				     set_all_module_text_ro();

				[ here all module text is set to RO,
				  including the module that is
				  loading!! ]

   blocking_notifier_call_chain(MODULE_STATE_COMING);
    ftrace_init_module()

     [ tries to modify code, but it's RO, and fails!
       ftrace_bug() is called]

When this race happens, ftrace_bug() will produces a nasty warning and
all of the function tracing features will be disabled until reboot.

The simple solution is to treate module load the same way the core
kernel is treated at boot. To hardcode the ftrace function modification
of converting calls to mcount into nops. This is done in init/main.c
there's no reason it could not be done in load_module(). This gives
a better control of the changes and doesn't tie the state of the
module to its notifiers as much. Ftrace is special, it needs to be
treated as such.

The reason this would work, is that the ftrace_module_init() would be
called while the module is in MODULE_STATE_UNFORMED, which is ignored
by the set_all_module_text_ro() call.

Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com

Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-28 10:37:21 -04:00
Rusty Russell
51e158c12a param: hand arguments after -- straight to init
The kernel passes any args it doesn't need through to init, except it
assumes anything containing '.' belongs to the kernel (for a module).
This change means all users can clearly distinguish which arguments
are for init.

For example, the kernel uses debug ("dee-bug") to mean log everything to
the console, where systemd uses the debug from the Scandinavian "day-boog"
meaning "fail to boot".  If a future versions uses argv[] instead of
reading /proc/cmdline, this confusion will be avoided.

eg: test 'FOO="this is --foo"' -- 'systemd.debug="true true true"'

Gives:
argv[0] = '/debug-init'
argv[1] = 'test'
argv[2] = 'systemd.debug=true true true'
envp[0] = 'HOME=/'
envp[1] = 'TERM=linux'
envp[2] = 'FOO=this is --foo'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-04-28 11:48:34 +09:30
Rusty Russell
79465d2fd4 module: remove warning about waiting module removal.
We remove the waiting module removal in commit 3f2b9c9cdf (September
2013), but it turns out that modprobe in kmod (< version 16) was
asking for waiting module removal.  No one noticed since modprobe would
check for 0 usage immediately before trying to remove the module, and
the race is unlikely.

However, it means that anyone running old (but not ancient) kmod
versions is hitting the printk designed to see if anyone was running
"rmmod -w".  All reports so far have been false positives, so remove
the warning.

Fixes: 3f2b9c9cdf
Reported-by: Valerio Vanni <valerio.vanni@inwind.it>
Cc: Elliott, Robert (Server Storage) <Elliott@hp.com>
Cc: stable@kernel.org
Acked-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-04-28 11:06:59 +09:30
Christoph Lameter
08f141d3db modules: use raw_cpu_write for initialization of per cpu refcount.
The initialization of a structure is not subject to synchronization.
The use of __this_cpu would trigger a false positive with the additional
preemption checks for __this_cpu ops.

So simply disable the check through the use of raw_cpu ops.

Trace:

  __this_cpu_write operation in preemptible [00000000] code: modprobe/286
  caller is __this_cpu_preempt_check+0x38/0x60
  CPU: 3 PID: 286 Comm: modprobe Tainted: GF            3.12.0-rc4+ #187
  Call Trace:
    dump_stack+0x4e/0x82
    check_preemption_disabled+0xec/0x110
    __this_cpu_preempt_check+0x38/0x60
    load_module+0xcfd/0x2650
    SyS_init_module+0xa6/0xd0
    tracesys+0xe1/0xe6

Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:14 -07:00
Linus Torvalds
6f4c98e1c2 Nothing major: the stricter permissions checking for sysfs broke
a staging driver; fix included.  Greg KH said he'd take the patch
 but hadn't as the merge window opened, so it's included here
 to avoid breaking build.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTQMH9AAoJENkgDmzRrbjxo4UP/jwlenP44v+RFpo/dn8Z8E2n
 SREQscU5ZZKvuyFD6kUdvOz8YC/nTrJvXoVkMUF05GVbuvb8/8UPtT9ECVemd0rW
 xNy4aFfv9rbrqRLBLpLK9LAgTuhwlbTgGxgL78zRn3hWmf1hBZWCY+cEvKM8l/+9
 oEQdORL0sUpZh7iryAeGqbOrXT4gqJEvSLOFwiYTSo6ryzWIilmdXSUAh6s8MIEX
 PR1+oH9J8B6J29lcXKMf8/sDI1EBUeSLdBmMCuN5Y7xpYxsQLroVx94kPbdBY+XK
 ZRoYuUGSUJfGRZY46cFKApIGeF07z1DGoyXghbSWEQrI+23TMUmrKUg47LSukE4Y
 yCUf8HAtqIA3gVc9GKDdSp/2UpkAhTTv5ogKgnIzs1InWtOIBdDRSVUQXDosFEXw
 6ZZe1pQs2zfXyXxO4j0Wq36K4RgI0aqOVw+dcC+w5BidjVylgnYRV0PSDd72tid7
 bIfnjDbUBo+o4LanPNGYK474KyO7AslgTE50w6zwbJzgdwCQ36hCpKqScBZzm60a
 42LrgTVoIHHWAL1tDzWL/LzWflZGdJAezzNje0/f2Q3bGMiNHWoljAvUphkTZ7qt
 E8+jWqmM+riH3e8Y5wKpO1BKt7NGHISEy//bUlnqTwisjIzVILZ6VjfugQ1AI+0x
 llTXPBotFvfvXqxunBg7
 =yzUO
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Nothing major: the stricter permissions checking for sysfs broke a
  staging driver; fix included.  Greg KH said he'd take the patch but
  hadn't as the merge window opened, so it's included here to avoid
  breaking build"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  staging: fix up speakup kobject mode
  Use 'E' instead of 'X' for unsigned module taint flag.
  VERIFY_OCTAL_PERMISSIONS: stricter checking for sysfs perms.
  kallsyms: fix percpu vars on x86-64 with relocation.
  kallsyms: generalize address range checking
  module: LLVMLinux: Remove unused function warning from __param_check macro
  Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
  module: remove MODULE_GENERIC_TABLE
  module: allow multiple calls to MODULE_DEVICE_TABLE() per module
  module: use pr_cont
2014-04-06 09:38:07 -07:00
Linus Torvalds
176ab02d49 Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 LTO changes from Peter Anvin:
 "More infrastructure work in preparation for link-time optimization
  (LTO).  Most of these changes is to make sure symbols accessed from
  assembly code are properly marked as visible so the linker doesn't
  remove them.

  My understanding is that the changes to support LTO are still not
  upstream in binutils, but are on the way there.  This patchset should
  conclude the x86-specific changes, and remaining patches to actually
  enable LTO will be fed through the Kbuild tree (other than keeping up
  with changes to the x86 code base, of course), although not
  necessarily in this merge window"

* 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  Kbuild, lto: Handle basic LTO in modpost
  Kbuild, lto: Disable LTO for asm-offsets.c
  Kbuild, lto: Add a gcc-ld script to let run gcc as ld
  Kbuild, lto: add ld-version and ld-ifversion macros
  Kbuild, lto: Drop .number postfixes in modpost
  Kbuild, lto, workaround: Don't warn for initcall_reference in modpost
  lto: Disable LTO for sys_ni
  lto: Handle LTO common symbols in module loader
  lto, workaround: Add workaround for initcall reordering
  lto: Make asmlinkage __visible
  x86, lto: Disable LTO for the x86 VDSO
  initconst, x86: Fix initconst mistake in ts5500 code
  initconst: Fix initconst mistake in dcdbas
  asmlinkage: Make trace_hardirqs_on/off_caller visible
  asmlinkage, x86: Fix 32bit memcpy for LTO
  asmlinkage Make __stack_chk_failed and memcmp visible
  asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage
  asmlinkage: Make main_extable_sort_needed visible
  asmlinkage, mutex: Mark __visible
  asmlinkage: Make trace_hardirq visible
  ...
2014-03-31 14:13:25 -07:00
Rusty Russell
57673c2b0b Use 'E' instead of 'X' for unsigned module taint flag.
Takashi Iwai <tiwai@suse.de> says:
> The letter 'X' has been already used for SUSE kernels for very long
> time, to indicate the external supported modules.  Can the new flag be
> changed to another letter for avoiding conflict...?
> (BTW, we also use 'N' for "no support", too.)

Note: this code should be cleaned up, so we don't have such maps in
three places!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-31 14:52:43 +10:30
Dave Jones
8c90487cdc Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC
Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC, so we can repurpose
the flag to encompass a wider range of pushing the CPU beyond its
warrany.

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Link: http://lkml.kernel.org/r/20140226154949.GA770@redhat.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-03-20 16:28:09 -07:00
Mathieu Desnoyers
66cc69e34e Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
Users have reported being unable to trace non-signed modules loaded
within a kernel supporting module signature.

This is caused by tracepoint.c:tracepoint_module_coming() refusing to
take into account tracepoints sitting within force-loaded modules
(TAINT_FORCED_MODULE). The reason for this check, in the first place, is
that a force-loaded module may have a struct module incompatible with
the layout expected by the kernel, and can thus cause a kernel crash
upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.

Tracepoints, however, specifically accept TAINT_OOT_MODULE and
TAINT_CRAP, since those modules do not lead to the "very likely system
crash" issue cited above for force-loaded modules.

With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
module is tainted re-using the TAINT_FORCED_MODULE taint flag.
Unfortunately, this means that Tracepoints treat that module as a
force-loaded module, and thus silently refuse to consider any tracepoint
within this module.

Since an unsigned module does not fit within the "very likely system
crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
to specifically address this taint behavior, and accept those modules
within Tracepoints. We use the letter 'X' as a taint flag character for
a module being loaded that doesn't know how to sign its name (proposed
by Steven Rostedt).

Also add the missing 'O' entry to trace event show_module_flags() list
for the sake of completeness.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
NAKed-by: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: David Howells <dhowells@redhat.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 12:11:51 +10:30
Jiri Slaby
27bba4d6bb module: use pr_cont
When dumping loaded modules, we print them one by one in separate
printks. Let's use pr_cont as they are continuation prints.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 12:10:59 +10:30
Joe Mario
80375980f1 lto: Handle LTO common symbols in module loader
Here is the workaround I made for having the kernel not reject modules
built with -flto.  The clean solution would be to get the compiler to not
emit the symbol.  Or if it has to emit the symbol, then emit it as
initialized data but put it into a comdat/linkonce section.

Minor tweaks by AK over Joe's patch.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391846481-31491-5-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 20:24:50 -08:00
Tetsuo Handa
22e669568d module: Add missing newline in printk call.
Add missing \n and also follow commit bddb12b3 "kernel/module.c: use pr_foo()".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-01-21 09:59:16 +10:30
Linus Torvalds
ce6513f758 Mainly boring here, too. rmmod --wait finally removed, though.
Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSe3ngAAoJENkgDmzRrbjxqkMP/jFwTIVy+tZbPL36xR4C7UI/
 JZ9JU2c2HTyAtqp/T/bljA0QUDQWUASCfwG5WmRgyvMkEwhfuGrQ3dveQLRq5iKD
 Ln/LIN8JXXijhRr+ywhXLAcp1P5ysSJJYYS5lZTCmJ2Cv9jnAvmUl0KqdTEx+ZNH
 YsWBiI9+WmwhODiAdUlqtThDK37w8OsWeMq2agf97bBERlRYnRZvzwy3tSP2mf5j
 4wx8viOdzPC7NVblyX1cj3gonFFQJtMI4s/e787QzkUpNQjvrN3XecPiQX6aBCX3
 seVjuv6panv1tw1HqyU1KXWo7fs2uCc9mVR5Rr3Zok+8qpKWkj0dyCnF3A+ufsrO
 vlkrFLUsv/U1NUkWJM6mJKzMjKRD4iF702QsEEpNA5rlOsAMMGSSlju4eu6GvadI
 ZJ+ZDaNWUDPbWa9Xgjyp+DKWR6vybNgEHZmLmcCdeLt1u8Th1E/ujsKxv4SN6eIO
 2v+lNPjGEivoNXUX52toRZ1324U3FFzburCSA0c55+r1sjPT6SXCfl8kISSKvVtt
 iFemsDxhaSwqVzqbsx3ztU010Z0f9uVbpZHAQgZ514Uk25HtwhkaQSdiIP+cPXE8
 rClzj9m4gD+Jy0T+P0HjPlSxKCGSlgLiEBWEigX36/F4Isv+GL1HjvrGGCWM4VnO
 lIyw5ux/UH8USct9nH4x
 =xg2p
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Mainly boring here, too.  rmmod --wait finally removed, though"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modpost: fix bogus 'exported twice' warnings.
  init: fix in-place parameter modification regression
  asmlinkage, module: Make ksymtab and kcrctab symbols and __this_module __visible
  kernel: add support for init_array constructors
  modpost: Optionally ignore secondary errors seen if a single module build fails
  module: remove rmmod --wait option.
2013-11-15 13:27:50 +09:00
Andrew Morton
bddb12b32f kernel/module.c: use pr_foo()
kernel/module.c uses a mix of printk(KERN_foo and pr_foo().  Convert it
all to pr_foo and make the offered cleanups.

Not sure what to do about the printk(KERN_DEFAULT).  We don't have a
pr_default().

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Joe Perches <joe@perches.com>
Cc: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:34 +09:00
Frantisek Hrbata
eb3057df73 kernel: add support for init_array constructors
This adds the .init_array section as yet another section with constructors. This
is needed because gcc could add __gcov_init calls to .init_array or .ctors
section, depending on gcc (and binutils) version .

v2: - reuse mod->ctors for .init_array section for modules, because gcc uses
      .ctors or .init_array, but not both at the same time
v3: - fail to load if that does happen somehow.

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-10-17 15:05:17 +10:30
Rusty Russell
3f2b9c9cdf module: remove rmmod --wait option.
The option to wait for a module reference count to reach zero was in
the initial module implementation, but it was never supported in
modprobe (you had to use rmmod --wait).  After discussion with Lucas,
It has been deprecated (with a 10 second sleep) in kmod for the last
year.

This finally removes it: the flag will evoke a printk warning and a
normal (non-blocking) remove attempt.

Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-23 15:44:58 +09:30
Linus Torvalds
45d9a2220f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 1 from Al Viro:
 "Unfortunately, this merge window it'll have a be a lot of small piles -
  my fault, actually, for not keeping #for-next in anything that would
  resemble a sane shape ;-/

  This pile: assorted fixes (the first 3 are -stable fodder, IMO) and
  cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last
  components) + several long-standing patches from various folks.

  There definitely will be a lot more (starting with Miklos'
  check_submount_and_drop() series)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  direct-io: Handle O_(D)SYNC AIO
  direct-io: Implement generic deferred AIO completions
  add formats for dentry/file pathnames
  kvm eventfd: switch to fdget
  powerpc kvm: use fdget
  switch fchmod() to fdget
  switch epoll_ctl() to fdget
  switch copy_module_from_fd() to fdget
  git simplify nilfs check for busy subtree
  ibmasmfs: don't bother passing superblock when not needed
  don't pass superblock to hypfs_{mkdir,create*}
  don't pass superblock to hypfs_diag_create_files
  don't pass superblock to hypfs_vm_create_files()
  oprofile: get rid of pointless forward declarations of struct super_block
  oprofilefs_create_...() do not need superblock argument
  oprofilefs_mkdir() doesn't need superblock argument
  don't bother with passing superblock to oprofile_create_stats_files()
  oprofile: don't bother with passing superblock to ->create_files()
  don't bother passing sb to oprofile_create_files()
  coh901318: don't open-code simple_read_from_buffer()
  ...
2013-09-05 08:50:26 -07:00
Al Viro
a2e0578be3 switch copy_module_from_fd() to fdget
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-03 23:04:44 -04:00
Li Zhong
942e443127 module: Fix mod->mkobj.kobj potentially freed too early
DEBUG_KOBJECT_RELEASE helps to find the issue attached below.

After some investigation, it seems the reason is:
The mod->mkobj.kobj(ffffffffa01600d0 below) is freed together with mod
itself in free_module(). However, its children still hold references to
it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
child(holders below) tries to decrease the reference count to its parent
in kobject_del(), BUG happens as it tries to access already freed memory.

This patch tries to fix it by waiting for the mod->mkobj.kobj to be
really released in the module removing process (and some error code
paths).

[ 1844.175287] kobject: 'holders' (ffff88007c1f1600): kobject_release, parent ffffffffa01600d0 (delayed)
[ 1844.178991] kobject: 'notes' (ffff8800370b2a00): kobject_release, parent ffffffffa01600d0 (delayed)
[ 1845.180118] kobject: 'holders' (ffff88007c1f1600): kobject_cleanup, parent ffffffffa01600d0
[ 1845.182130] kobject: 'holders' (ffff88007c1f1600): auto cleanup kobject_del
[ 1845.184120] BUG: unable to handle kernel paging request at ffffffffa01601d0
[ 1845.185026] IP: [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
[ 1845.185026] Oops: 0000 [#1] PREEMPT
[ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: kprobe_example]
[ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G           O 3.11.0-rc6-next-20130819+ #1
[ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1845.185026] Workqueue: events kobject_delayed_cleanup
[ 1845.185026] task: ffff88007ca51f00 ti: ffff88007ca5c000 task.ti: ffff88007ca5c000
[ 1845.185026] RIP: 0010:[<ffffffff812cda81>]  [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026] RSP: 0018:ffff88007ca5dd08  EFLAGS: 00010282
[ 1845.185026] RAX: 0000000000002000 RBX: ffffffffa01600d0 RCX: ffffffff8177d638
[ 1845.185026] RDX: ffff88007ca5dc18 RSI: 0000000000000000 RDI: ffffffffa01600d0
[ 1845.185026] RBP: ffff88007ca5dd18 R08: ffffffff824e9810 R09: ffffffffffffffff
[ 1845.185026] R10: ffff8800ffffffff R11: dead4ead00000001 R12: ffffffff81a95040
[ 1845.185026] R13: ffff88007b27a960 R14: ffff88007c1f1600 R15: 0000000000000000
[ 1845.185026] FS:  0000000000000000(0000) GS:ffffffff81a23000(0000) knlGS:0000000000000000
[ 1845.185026] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1845.185026] CR2: ffffffffa01601d0 CR3: 0000000037207000 CR4: 00000000000006b0
[ 1845.185026] Stack:
[ 1845.185026]  ffff88007c1f1600 ffff88007c1f1600 ffff88007ca5dd38 ffffffff812cdb7e
[ 1845.185026]  0000000000000000 ffff88007c1f1640 ffff88007ca5dd68 ffffffff812cdbfe
[ 1845.185026]  ffff88007c974800 ffff88007c1f1640 ffff88007ff61a00 0000000000000000
[ 1845.185026] Call Trace:
[ 1845.185026]  [<ffffffff812cdb7e>] kobject_del+0x2e/0x40
[ 1845.185026]  [<ffffffff812cdbfe>] kobject_delayed_cleanup+0x6e/0x1d0
[ 1845.185026]  [<ffffffff81063a45>] process_one_work+0x1e5/0x670
[ 1845.185026]  [<ffffffff810639e3>] ? process_one_work+0x183/0x670
[ 1845.185026]  [<ffffffff810642b3>] worker_thread+0x113/0x370
[ 1845.185026]  [<ffffffff810641a0>] ? rescuer_thread+0x290/0x290
[ 1845.185026]  [<ffffffff8106bfba>] kthread+0xda/0xe0
[ 1845.185026]  [<ffffffff814ff0f0>] ? _raw_spin_unlock_irq+0x30/0x60
[ 1845.185026]  [<ffffffff8106bee0>] ? kthread_create_on_node+0x130/0x130
[ 1845.185026]  [<ffffffff8150751a>] ret_from_fork+0x7a/0xb0
[ 1845.185026]  [<ffffffff8106bee0>] ? kthread_create_on_node+0x130/0x130
[ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff ff ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d <f6> 87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84
[ 1845.185026] RIP  [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026]  RSP <ffff88007ca5dd08>
[ 1845.185026] CR2: ffffffffa01601d0
[ 1845.185026] ---[ end trace 49a70afd109f5653 ]---

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-03 16:35:47 +09:30
Chen Gang
cc56ded3fd kernel/module.c: use scnprintf() instead of sprintf()
For some strings, they are permitted to be larger than PAGE_SIZE, so
need use scnprintf() instead of sprintf(), or it will cause issue.

One case is:

  if a module version is crazy defined (length more than PAGE_SIZE),
  'modinfo' command is still OK (print full contents),
  but for "cat /sys/modules/'modname'/version", will cause issue in kernel.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-08-20 15:37:46 +09:30
Steven Rostedt
0ce814096f module: Add NOARG flag for ops with param_set_bool_enable_only() set function
The ops that uses param_set_bool_enable_only() as its set function can
easily handle being used without an argument. There's no reason to
fail the loading of the module if it does not have one.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-08-20 15:37:43 +09:30
Linus Torvalds
8133633368 Nothing interesting. Except the most embarrassing bugfix ever. But let's
ignore that.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3NrgAAoJENkgDmzRrbjxaQ0P/RwaKDlmrrFiFD6yYZRQKM1G
 esv3H5j3jU29Eo5WvY1IEoWbMCK4ObL9DL86CCL/XgYZIhJfTiND+R2W8XBSEXec
 qwJXb3oMWVMDBBnxknl/pf0RWhwaFwxsQ89Yco6zsGTWQ0kB6+PAaUNOh91MBOYJ
 UM3nJD79Xqv9X/WcTtPNUmVmmZ7WJFqQ3UYhpTL2WRfRnEO/ku8GwAvLunSFuLBP
 iI4IjSMM8ssLL10g3Cm7J/pzLTTRctmjEZzIclACUrAgm3V+s1NeARztKZiuCVoz
 3Eu0VIp5t2rvifOajj0UF0pdO8Q3XpsMj+lr21gg2rc2VTHKZ3oobEzGX7Ev0lae
 BH1DdW3ivgyq3cBi56Dh3aZDggF24h+QO6qN92s73l0qYK1t+eAg1qc76lsayclB
 ozJJD9/xuEqLV3GScbMdqb4AxTc6Y0ks0SWD07PvoQkthH/2tPiLpSvrITdKdCrY
 0+RssMuO4StM1e8ALjHqDz+/2k8MS70b0e8JUp0e1WaKzMUKKUYZIqQTGqczHH7u
 4hWJiSMnsZqzLLKUX2A6nVlg0eo6zgyqHuxk6O4fHghXWOJJlLjWcMQW26dBS3ki
 FpWYsJeHw6HcITcbWXkdozygf0GOwiaHaI6ivVnB3L8spc6VFD4gvDpbcpAeixGX
 BfFA7nG/Hy3aOT05juMj
 =OVO/
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Nothing interesting.  Except the most embarrassing bugfix ever.  But
  let's ignore that"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: cleanup call chain.
  module: do percpu allocation after uniqueness check.  No, really!
  modules: don't fail to load on unknown parameters.
  ABI: Clarify when /sys/module/MODULENAME is created
  There is no /sys/parameters
  module: don't modify argument of module_kallsyms_lookup_name()
2013-07-10 14:51:41 -07:00
Rusty Russell
9eb76d7797 module: cleanup call chain.
Fold alloc_module_percpu into percpu_modalloc().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-03 10:15:10 +09:30
Rusty Russell
8d8022e8ab module: do percpu allocation after uniqueness check. No, really!
v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
percpu memory on large machines:

    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.

In my defence, it didn't actually say the patch did this.  Just that
we "can".

This patch actually *does* it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Jim Hull <jim.hull@hp.com>
Cc: stable@kernel.org # 3.8
2013-07-03 10:15:09 +09:30
Rusty Russell
54041d8a73 modules: don't fail to load on unknown parameters.
Although parameters are supposed to be part of the kernel API, experimental
parameters are often removed.  In addition, downgrading a kernel might cause
previously-working modules to fail to load.

On balance, it's probably better to warn, and load the module anyway.
This may let through a typo, but at least the logs will show it.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:21 +09:30
Mathias Krause
4f6de4d51f module: don't modify argument of module_kallsyms_lookup_name()
If we pass a pointer to a const string in the form "module:symbol"
module_kallsyms_lookup_name() will try to split the string at the colon,
i.e., will try to modify r/o data. That will, in fact, fail on a kernel
with enabled CONFIG_DEBUG_RODATA.

Avoid modifying the passed string in module_kallsyms_lookup_name(),
modify find_module_all() instead to pass it the module name length.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-02 15:38:18 +09:30
Steven Rostedt
89c837351d kmemleak: No need for scanning specific module sections
As kmemleak now scans all module sections that are allocated, writable
and non executable, there's no need to scan individual sections that
might reference data.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
2013-05-17 09:53:36 +01:00
Steven Rostedt
06c9494c0e kmemleak: Scan all allocated, writeable and not executable module sections
Instead of just picking data sections by name (names that start
with .data, .bss or .ref.data), use the section flags and scan all
sections that are allocated, writable and not executable. Which should
cover all sections of a module that might reference data.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[catalin.marinas@arm.com: removed unused 'name' variable]
[catalin.marinas@arm.com: collapsed 'if' blocks]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
2013-05-17 09:53:07 +01:00
Rusty Russell
944a1fa012 module: don't unlink the module until we've removed all exposure.
Otherwise we get a race between unload and reload of the same module:
the new module doesn't see the old one in the list, but then fails because
it can't register over the still-extant entries in sysfs:

 [  103.981925] ------------[ cut here ]------------
 [  103.986902] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xab/0xd0()
 [  103.993606] Hardware name: CrownBay Platform
 [  103.998075] sysfs: cannot create duplicate filename '/module/pch_gbe'
 [  104.004784] Modules linked in: pch_gbe(+) [last unloaded: pch_gbe]
 [  104.011362] Pid: 3021, comm: modprobe Tainted: G        W    3.9.0-rc5+ #5
 [  104.018662] Call Trace:
 [  104.021286]  [<c103599d>] warn_slowpath_common+0x6d/0xa0
 [  104.026933]  [<c1168c8b>] ? sysfs_add_one+0xab/0xd0
 [  104.031986]  [<c1168c8b>] ? sysfs_add_one+0xab/0xd0
 [  104.037000]  [<c1035a4e>] warn_slowpath_fmt+0x2e/0x30
 [  104.042188]  [<c1168c8b>] sysfs_add_one+0xab/0xd0
 [  104.046982]  [<c1168dbe>] create_dir+0x5e/0xa0
 [  104.051633]  [<c1168e78>] sysfs_create_dir+0x78/0xd0
 [  104.056774]  [<c1262bc3>] kobject_add_internal+0x83/0x1f0
 [  104.062351]  [<c126daf6>] ? kvasprintf+0x46/0x60
 [  104.067231]  [<c1262ebd>] kobject_add_varg+0x2d/0x50
 [  104.072450]  [<c1262f07>] kobject_init_and_add+0x27/0x30
 [  104.078075]  [<c1089240>] mod_sysfs_setup+0x80/0x540
 [  104.083207]  [<c1260851>] ? module_bug_finalize+0x51/0xc0
 [  104.088720]  [<c108ab29>] load_module+0x1429/0x18b0

We can teardown sysfs first, then to be sure, put the state in
MODULE_STATE_UNFORMED so it's ignored while we deconstruct it.

Reported-by: Veaceslav Falico <vfalico@redhat.com>
Tested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-04-17 13:23:02 +09:30
James Hogan
a4b6a77b77 module: fix symbol versioning with symbol prefixes
Fix symbol versioning on architectures with symbol prefixes. Although
the build was free from warnings the actual modules still wouldn't load
as the ____versions table contained unprefixed symbol names, which were
being compared against the prefixed symbol names when checking the
symbol versions.

This is fixed by modifying modpost to add the symbol prefix to the
____versions table it outputs (Modules.symvers still contains unprefixed
symbol names). The check_modstruct_version() function is also fixed as
it checks the version of the unprefixed "module_layout" symbol which
would no longer work.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Kliegman <kliegs@chromium.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use VMLINUX_SYMBOL_STR)
2013-03-20 11:27:26 +10:30
Rusty Russell
b92021b09d CONFIG_SYMBOL_PREFIX: cleanup.
We have CONFIG_SYMBOL_PREFIX, which three archs define to the string
"_".  But Al Viro broke this in "consolidate cond_syscall and
SYSCALL_ALIAS declarations" (in linux-next), and he's not the first to
do so.

Using CONFIG_SYMBOL_PREFIX is awkward, since we usually just want to
prefix it so something.  So various places define helpers which are
defined to nothing if CONFIG_SYMBOL_PREFIX isn't set:

1) include/asm-generic/unistd.h defines __SYMBOL_PREFIX.
2) include/asm-generic/vmlinux.lds.h defines VMLINUX_SYMBOL(sym)
3) include/linux/export.h defines MODULE_SYMBOL_PREFIX.
4) include/linux/kernel.h defines SYMBOL_PREFIX (which differs from #7)
5) kernel/modsign_certificate.S defines ASM_SYMBOL(sym)
6) scripts/modpost.c defines MODULE_SYMBOL_PREFIX
7) scripts/Makefile.lib defines SYMBOL_PREFIX on the commandline if
   CONFIG_SYMBOL_PREFIX is set, so that we have a non-string version
   for pasting.

(arch/h8300/include/asm/linkage.h defines SYMBOL_NAME(), too).

Let's solve this properly:
1) No more generic prefix, just CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX.
2) Make linux/export.h usable from asm.
3) Define VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR().
4) Make everyone use them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com> (metag)
2013-03-15 15:09:43 +10:30
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Al Viro
3dadecce20 switch vfs_getattr() to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:08 -05:00
Rusty Russell
a3535c7e4f module: clean up load_module a little more.
1fb9341ac3 made our locking in
load_module more complicated: we grab the mutex once to insert the
module in the list, then again to upgrade it once it's formed.

Since the locking is self-contained, it's neater to do this in
separate functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-21 17:20:09 +10:30
Rusty Russell
373d4d0997 taint: add explicit flag to show whether lock dep is still OK.
Fix up all callers as they were before, with make one change: an
unsigned module taints the kernel, but doesn't turn off lockdep.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-21 17:17:57 +10:30
Rusty Russell
64748a2c90 module: printk message when module signature fail taints kernel.
Reported-by: Chris Samuel <chris@csamuel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-21 17:17:05 +10:30
Linus Torvalds
ee61abb322 module: fix missing module_mutex unlock
Commit 1fb9341ac3 ("module: put modules in list much earlier") moved
some of the module initialization code around, and in the process
changed the exit paths too.  But for the duplicate export symbol error
case the change made the ddebug_cleanup path jump to after the module
mutex unlock, even though it happens with the mutex held.

Rusty has some patches to split this function up into some helper
functions, hopefully the mess of complex goto targets will go away
eventually.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-20 20:22:58 -08:00
Linus Torvalds
226364766f Various minor fixes, but a slightly more complex one to fix the per-cpu overload
problem introduced recently by kvm id changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ/IaJAAoJENkgDmzRrbjxOjAQAIrI9+Jo3Lsxk1v9gXeo9xn2
 ST4LNv7/oW2+3NFBOkKsGVpcXe1JtGySIXyx9k+dELPa5xe4Rs4HE3pHQj/VoEx8
 FKz3oUXSHkuh+paKuFXvZ2u/z0/FI99GmqHPObvGQ4iS3hTXAibzO83yYYPxwApq
 Zq4kof/dAcVVPLm8fGVAMPA2Rbh/WmjDfrIv8gv71QkDjtRLzcr40VIgky5cvu7V
 FWcBl4/DVoKkGnDPsLDhLK9QGqgBGhFIlNIcVX4Jv50DiCibOyzdjeUXYxMftoGr
 Rw56hHwGpPdqbRIjBkR071vIl/mlXTmxIv+d77vZNBin2MIBwAzCQXo8I1/HojCK
 /wKhI+RFj0J5DaDo/BTB80cmI3X2oah5sRUebW6vd9HjunhFFndg4mVeDNPa0E0+
 F72xWlj79BjdIOuD06TLg6Tg2klL49nC8bUc0wrsh6onEjhd9v7Cp/X/rxi5cKYW
 eEv3oLkKwUHoheF9gBlpnT0Yyl/HpFe+nemblzj/ybRKnk4A5vtJqV9eZnqoOS16
 lgIkKOpgXT9dzSom2EL/f4sMCeLLYC44DQwOvxNKt/BdMY0r5y8OLaJORXQGfEDF
 Ztvu2G8PmELxV0B3JZcGR/zOcKxpOBsrGoVn0/EQIul3A/0C0ID7i5zwJAyX6LP7
 V+6vyF2eHMf10tB0rbfB
 =SpOo
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module fixes and a virtio block fix from Rusty Russell:
 "Various minor fixes, but a slightly more complex one to fix the
  per-cpu overload problem introduced recently by kvm id changes."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: put modules in list much earlier.
  module: add new state MODULE_STATE_UNFORMED.
  module: prevent warning when finit_module a 0 sized file
  virtio-blk: Don't free ida when disk is in use
2013-01-20 16:44:28 -08:00
Tejun Heo
774a1221e8 module, async: async_synchronize_full() on module init iff async is used
If the default iosched is built as module, the kernel may deadlock
while trying to load the iosched module on device probe if the probing
was running off async.  This is because async_synchronize_full() at
the end of module init ends up waiting for the async job which
initiated the module loading.

 async A				modprobe

 1. finds a device
 2. registers the block device
 3. request_module(default iosched)
					4. modprobe in userland
					5. load and init module
					6. async_synchronize_full()

Async A waits for modprobe to finish in request_module() and modprobe
waits for async A to finish in async_synchronize_full().

Because there's no easy to track dependency once control goes out to
userland, implementing properly nested flushing is difficult.  For
now, make module init perform async_synchronize_full() iff module init
has queued async jobs as suggested by Linus.

This avoids the described deadlock because iosched module doesn't use
async and thus wouldn't invoke async_synchronize_full().  This is
hacky and incomplete.  It will deadlock if async module loading nests;
however, this works around the known problem case and seems to be the
best of bad options.

For more details, please refer to the following thread.

  http://thread.gmane.org/gmane.linux.kernel/1420814

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alex Riesen <raa.lkml@gmail.com>
Tested-by: Ming Lei <ming.lei@canonical.com>
Tested-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-16 09:05:33 -08:00
Rusty Russell
1fb9341ac3 module: put modules in list much earlier.
Prarit's excellent bug report:
> In recent Fedora releases (F17 & F18) some users have reported seeing
> messages similar to
>
> [   15.478160] kvm: Could not allocate 304 bytes percpu data
> [   15.478174] PERCPU: allocation failed, size=304 align=32, alloc from
> reserved chunk failed
>
> during system boot.  In some cases, users have also reported seeing this
> message along with a failed load of other modules.
>
> What is happening is systemd is loading an instance of the kvm module for
> each cpu found (see commit e9bda3b).  When the module load occurs the kernel
> currently allocates the modules percpu data area prior to checking to see
> if the module is already loaded or is in the process of being loaded.  If
> the module is already loaded, or finishes load, the module loading code
> releases the current instance's module's percpu data.

Now we have a new state MODULE_STATE_UNFORMED, we can insert the
module into the list (and thus guarantee its uniqueness) before we
allocate the per-cpu region.

Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Prarit Bhargava <prarit@redhat.com>
2013-01-12 13:27:46 +10:30
Rusty Russell
0d21b0e347 module: add new state MODULE_STATE_UNFORMED.
You should never look at such a module, so it's excised from all paths
which traverse the modules list.

We add the state at the end, to avoid gratuitous ABI break (ksplice).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-12 13:27:05 +10:30
Sasha Levin
52441fa8f2 module: prevent warning when finit_module a 0 sized file
If we try to finit_module on a file sized 0 bytes vmalloc will
scream and spit out a warning.

Since modules have to be bigger than 0 bytes anyways we can just
check that beforehand and avoid the warning.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-03 11:10:32 +10:30
Linus Torvalds
7a684c452e Nothing all that exciting; a new module-from-fd syscall for those who want
to verify the source of the module (ChromeOS) and/or use standard IMA on it
 or other security hooks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP
 PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd
 KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w
 Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort
 4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx
 HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4
 mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u
 LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp
 wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt
 T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ
 Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W
 pNkoJU+dGC7kG/yVAS8N
 =uoJj
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "Nothing all that exciting; a new module-from-fd syscall for those who
  want to verify the source of the module (ChromeOS) and/or use standard
  IMA on it or other security hooks."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Fix kbuild output when using default extra_certificates
  MODSIGN: Avoid using .incbin in C source
  modules: don't hand 0 to vmalloc.
  module: Remove a extra null character at the top of module->strtab.
  ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
  ASN.1: Define indefinite length marker constant
  moduleparam: use __UNIQUE_ID()
  __UNIQUE_ID()
  MODSIGN: Add modules_sign make target
  powerpc: add finit_module syscall.
  ima: support new kernel module syscall
  add finit_module syscall to asm-generic
  ARM: add finit_module syscall to ARM
  security: introduce kernel_module_from_file hook
  module: add flags arg to sys_finit_module()
  module: add syscall to load module from fd
2012-12-19 07:55:08 -08:00
Tao Ma
8ec7d50f1e kernel: remove reference to feature-removal-schedule.txt
In commit 9c0ece069b ("Get rid of Documentation/feature-removal.txt"),
Linus removed feature-removal-schedule.txt from Documentation, but there
is still some reference to this file.  So remove them.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:12 -08:00
Rusty Russell
82fab442f5 modules: don't hand 0 to vmalloc.
In commit d0a21265df David Rientjes unified various archs'
module_alloc implementation (including x86) and removed the graduitous
shortcut for size == 0.

Then, in commit de7d2b567d, Joe Perches added a warning for
zero-length vmallocs, which can happen without kallsyms on modules
with no init sections (eg. zlib_deflate).

Fix this once and for all; the module code has to handle zero length
anyway, so get it right at the caller and remove the now-gratuitous
checks within the arch-specific module_alloc implementations.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42608
Reported-by: Conrad Kostecki <ConiKost@gmx.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:06:43 +10:30
Satoru Takeuchi
54523ec71f module: Remove a extra null character at the top of module->strtab.
There is a extra null character('\0') at the top of module->strtab for
each module. Commit 59ef28b introduced this bug and this patch fixes it.

Live dump log of the current linus git kernel(HEAD is 2844a4870):
============================================================================
crash> mod | grep loop
ffffffffa01db0a0  loop             16689  (not loaded)  [CONFIG_KALLSYMS]
crash> module.core_symtab ffffffffa01db0a0
  core_symtab = 0xffffffffa01db320crash> rd 0xffffffffa01db320 12
ffffffffa01db320:  0000005500000001 0000000000000000   ....U...........
ffffffffa01db330:  0000000000000000 0002007400000002   ............t...
ffffffffa01db340:  ffffffffa01d8000 0000000000000038   ........8.......
ffffffffa01db350:  001a00640000000e ffffffffa01daeb0   ....d...........
ffffffffa01db360:  00000000000000a0 0002007400000019   ............t...
ffffffffa01db370:  ffffffffa01d8068 000000000000001b   h...............
crash> module.core_strtab ffffffffa01db0a0
  core_strtab = 0xffffffffa01dbb30 ""
crash> rd 0xffffffffa01dbb30 4
ffffffffa01dbb30:  615f70616d6b0000 66780063696d6f74   ..kmap_atomic.xf
ffffffffa01dbb40:  73636e75665f7265 72665f646e696600   er_funcs.find_fr
============================================================================

We expect Just first one byte of '\0', but actually first two bytes
are '\0'. Here is The relationship between symtab and strtab.

	symtab_idx	strtab_idx	symbol
	-----------------------------------------------
	0		0x1		"\0" # startab_idx should be 0
	1		0x2		"kmap_atomic"
	2		0xe		"xfer_funcs"
	3		0x19		"find_fr..."

By applying this patch, it becomes as follows.

	symtab_idx	strtab_idx	symbol
	-----------------------------------------------
	0		0x0		"\0"	# extra byte is removed
	1		0x1		"kmap_atomic"
	2		0xd		"xfer_funcs"
	3		0x18		"find_fr..."

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: Masaki Kimura <masaki.kimura.kz@hitachi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:06:43 +10:30
Kees Cook
2e72d51b4a security: introduce kernel_module_from_file hook
Now that kernel module origins can be reasoned about, provide a hook to
the LSMs to make policy decisions about the module file. This will let
Chrome OS enforce that loadable kernel modules can only come from its
read-only hash-verified root filesystem. Other LSMs can, for example,
read extended attributes for signatures, etc.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:05:24 +10:30
Rusty Russell
2f3238aebe module: add flags arg to sys_finit_module()
Thanks to Michael Kerrisk for keeping us honest.  These flags are actually
useful for eliminating the only case where kmod has to mangle a module's
internals: for overriding module versioning.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Acked-by: Kees Cook <keescook@chromium.org>
2012-12-14 13:05:23 +10:30
Kees Cook
34e1169d99 module: add syscall to load module from fd
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.

Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.

If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on.  In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.

This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
2012-12-14 13:05:22 +10:30
Rusty Russell
59ef28b1f1 module: fix out-by-one error in kallsyms
Masaki found and patched a kallsyms issue: the last symbol in a
module's symtab wasn't transferred.  This is because we manually copy
the zero'th entry (which is always empty) then copy the rest in a loop
starting at 1, though from src[0].  His fix was minimal, I prefer to
rewrite the loops in more standard form.

There are two loops: one to get the size, and one to copy.  Make these
identical: always count entry 0 and any defined symbol in an allocated
non-init section.

This bug exists since the following commit was introduced.
   module: reduce symbol table for loaded modules (v2)
   commit: 4a4962263f

LKML: http://lkml.org/lkml/2012/10/24/27
Reported-by: Masaki Kimura <masaki.kimura.kz@hitachi.com>
Cc: stable@kernel.org
2012-10-31 13:56:37 +10:30
David Howells
caabe24057 MODSIGN: Move the magic string to the end of a module and eliminate the search
Emit the magic string that indicates a module has a signature after the
signature data instead of before it.  This allows module_sig_check() to
be made simpler and faster by the elimination of the search for the
magic string.  Instead we just need to do a single memcmp().

This works because at the end of the signature data there is the
fixed-length signature information block.  This block then falls
immediately prior to the magic number.

From the contents of the information block, it is trivial to calculate
the size of the signature data and thus the size of the actual module
data.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 17:30:40 -07:00
David Howells
1d0059f3a4 MODSIGN: Add FIPS policy
If we're in FIPS mode, we should panic if we fail to verify the signature on a
module or we're asked to load an unsigned module in signature enforcing mode.
Possibly FIPS mode should automatically enable enforcing mode.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-10 20:01:19 +10:30
Rusty Russell
106a4ee258 module: signature checking hook
We do a very simple search for a particular string appended to the module
(which is cache-hot and about to be SHA'd anyway).  There's both a config
option and a boot parameter which control whether we accept or fail with
unsigned modules and modules that are signed with an unknown key.

If module signing is enabled, the kernel will be tainted if a module is
loaded that is unsigned or has a signature for which we don't have the
key.

(Useful feedback and tweaks by David Howells <dhowells@redhat.com>)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-10 20:00:55 +10:30
Rusty Russell
9bb9c3be56 module: wait when loading a module which is currently initializing.
The original module-init-tools module loader used a fnctl lock on the
.ko file to avoid attempts to simultaneously load a module.
Unfortunately, you can't get an exclusive fcntl lock on a read-only
fd, making this not work for read-only mounted filesystems.
module-init-tools has a hacky sleep-and-loop for this now.

It's not that hard to wait in the kernel, and only return -EEXIST once
the first module has finished loading (or continue loading the module
if the first one failed to initialize for some reason).  It's also
consistent with what we do for dependent modules which are still loading.

Suggested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 14:31:03 +09:30
Rusty Russell
6f13909f4f module: fix symbol waiting when module fails before init
We use resolve_symbol_wait(), which blocks if the module containing
the symbol is still loading.  However:

1) The module_wq we use is only woken after calling the modules' init
   function, but there are other failure paths after the module is
   placed in the linked list where we need to do the same thing.

2) wake_up() only wakes one waiter, and our waitqueue is shared by all
   modules, so we need to wake them all.

3) wake_up_all() doesn't imply a memory barrier: I feel happier calling
   it after we've grabbed and dropped the module_mutex, not just after
   the state assignment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 14:31:03 +09:30
David Howells
786d35d45c Make most arch asm/module.h files use asm-generic/module.h
Use the mapping of Elf_[SPE]hdr, Elf_Addr, Elf_Sym, Elf_Dyn, Elf_Rel/Rela,
ELF_R_TYPE() and ELF_R_SYM() to either the 32-bit version or the 64-bit version
into asm-generic/module.h for all arches bar MIPS.

Also, use the generic definition mod_arch_specific where possible.

To this end, I've defined three new config bools:

 (*) HAVE_MOD_ARCH_SPECIFIC

     Arches define this if they don't want to use the empty generic
     mod_arch_specific struct.

 (*) MODULES_USE_ELF_RELA

     Arches define this if their modules can contain RELA records.  This causes
     the Elf_Rela mapping to be emitted and allows apply_relocate_add() to be
     defined by the arch rather than have the core emit an error message.

 (*) MODULES_USE_ELF_REL

     Arches define this if their modules can contain REL records.  This causes
     the Elf_Rel mapping to be emitted and allows apply_relocate() to be
     defined by the arch rather than have the core emit an error message.

Note that it is possible to allow both REL and RELA records: m68k and mips are
two arches that do this.

With this, some arch asm/module.h files can be deleted entirely and replaced
with a generic-y marker in the arch Kbuild file.

Additionally, I have removed the bits from m32r and score that handle the
unsupported type of relocation record as that's now handled centrally.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-28 14:31:03 +09:30
Matthew Garrett
c99af3752b module: taint kernel when lve module is loaded
Cloudlinux have a product called lve that includes a kernel module. This
was previously GPLed but is now under a proprietary license, but the
module continues to declare MODULE_LICENSE("GPL") and makes use of some
EXPORT_SYMBOL_GPL symbols. Forcibly taint it in order to avoid this.

Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Alex Lyashkov <umka@cloudlinux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2012-09-28 14:31:02 +09:30
David Howells
ef26a5a6ea Guard check in module loader against integer overflow
The check:

	if (len < hdr->e_shoff + hdr->e_shnum * sizeof(Elf_Shdr))

may not work if there's an overflow in the right-hand side of the condition.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-05-23 22:28:53 +09:30
Jim Cromie
b48420c1d3 dynamic_debug: make dynamic-debug work for module initialization
This introduces a fake module param $module.dyndbg.  Its based upon
Thomas Renninger's $module.ddebug boot-time debugging patch from
https://lkml.org/lkml/2010/9/15/397

The 'fake' module parameter is provided for all modules, whether or
not they need it.  It is not explicitly added to each module, but is
implemented in callbacks invoked from parse_args.

For builtin modules, dynamic_debug_init() now directly calls
parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params
undeclared in the modules, just after the ddebug tables are processed.

While its slightly weird to reprocess the boot params, parse_args() is
already called repeatedly by do_initcall_levels().  More importantly,
the dyndbg queries (given in ddebug_query or dyndbg params) cannot be
activated until after the ddebug tables are ready, and reusing
parse_args is cleaner than doing an ad-hoc parse.  This reparse would
break options like inc_verbosity, but they probably should be params,
like verbosity=3.

ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka:
ddebug_query) and module-prefixed dyndbg params, and ignores all other
parameters.  For example, the following will enable pr_debug()s in 4
builtin modules, in the order given:

  dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg

For loadable modules, parse_args() in load_module() calls
ddebug_dyndbg_module_params_cb().  This handles bare dyndbg params as
passed from modprobe, and errors on other unknown params.

Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all
foo.params, strips the "foo.", and passes these to the kernel.
ddebug_dyndbg_module_params_cb() is again called for the unknown
params; it handles dyndbg, and errors on others.  The "doing" arg
added previously contains the module name.

For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts
and ignores $module.dyndbg params, other unknowns get -ENOENT.

If no param value is given (as in pci.dyndbg example above), "+p" is
assumed, which enables all pr_debug callsites in the module.

The dyndbg fake parameter is not shown in /sys/module/*/parameters,
thus it does not use any resources.  Changes to it are made via the
control file.

Also change pr_info in ddebug_exec_queries to vpr_info,
no need to see it all the time.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
CC: Thomas Renninger <trenn@suse.de>
CC: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-04-30 14:31:46 -04:00
Sasha Levin
f946eeb931 module: Remove module size limit
Module size was limited to 64MB, this was legacy limitation due to vmalloc()
which was removed a while ago.

Limiting module size to 64MB is both pointless and affects real world use
cases.

Cc: Tim Abbott <tim.abbott@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-26 12:50:53 +10:30
Steven Rostedt
d53799be67 module: move __module_get and try_module_get() out of line.
With the preempt, tracepoint and everything, it's getting a bit
chubby.  For an Ubuntu-based config:

Before:
	$ size -t `find * -name '*.ko'` | grep TOTAL
	56199906        3870760	1606616	61677282	3ad1ee2	(TOTALS)
	$ size vmlinux
	   text	   data	    bss	    dec	    hex	filename
	8509342	 850368	3358720	12718430	 c2115e	vmlinux

After:
	$ size -t `find * -name '*.ko'` | grep TOTAL
	56183760	3867892	1606616	61658268	3acd49c	(TOTALS)
	$ size vmlinux
	   text	   data	    bss	    dec	    hex	filename
	8501842	 849088	3358720	12709650	 c1ef12	vmlinux

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (made all out-of-line)
2012-03-26 12:50:52 +10:30
Pawel Moll
026cee0086 params: <level>_initcall-like kernel parameters
This patch adds a set of macros that can be used to declare
kernel parameters to be parsed _before_ initcalls at a chosen
level are executed.  We rename the now-unused "flags" field of
struct kernel_param as the level.  It's signed, for when we
use this for early params as well, in future.

Linker macro collating init calls had to be modified in order
to add additional symbols between levels that are later used
by the init code to split the calls into blocks.

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-26 12:50:51 +10:30
Dave Young
02608bef8f module: add kernel param to force disable module load
Sometimes we need to test a kernel of same version with code or config
option changes.

We already have sysctl to disable module load, but add a kernel
parameter will be more convenient.

Since modules_disabled is int, so here use bint type in core_param.
TODO: make sysctl accept bool and change modules_disabled to bool

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-03-26 12:50:50 +10:30
Kevin Winchester
53999bf34d error: implicit declaration of function 'module_flags_taint'
Recent changes to kernel/module.c caused the following compile
error:

  kernel/module.c: In function ‘show_taint’:
  kernel/module.c:1024:2: error: implicit declaration of function ‘module_flags_taint’ [-Werror=implicit-function-declaration]
  cc1: some warnings being treated as errors

Correct this error by moving the definition of module_flags_taint
outside of the #ifdef CONFIG_MODULE_UNLOAD section.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-15 16:21:07 -08:00
Kay Sievers
cca3e70730 modules: sysfs - export: taint, coresize, initsize
Recent tools do not want to use /proc to retrieve module information. A few
values are currently missing from sysfs to replace the information available
in /proc/modules.

This adds /sys/module/*/{coresize,initsize,taint} attributes.

TAINT_PROPRIETARY_MODULE (P) and TAINT_OOT_MODULE (O) flags are both always
shown now, and do no longer exclude each other, also in /proc/modules.

Replace the open-coded sysfs attribute initializers with the __ATTR() macro.

Add the new attributes to Documentation/ABI.

Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:15 +10:30
Jim Cromie
5e12416927 module: replace DEBUGP with pr_debug
Use more flexible pr_debug.  This allows:

  echo "module module +p" > /dbg/dynamic_debug/control

to turn on debug messages when needed.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:15 +10:30
Eric Dumazet
bd77c04772 module: struct module_ref should contains long fields
module_ref contains two "unsigned int" fields.

Thats now too small, since some machines can open more than 2^32 files.

Check commit 518de9b39e (fs: allow for more than 2^31 files) for
reference.

We can add an aligned(2 * sizeof(unsigned long)) attribute to force
alloc_percpu() allocating module_ref areas in single cache lines.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Tejun Heo <tj@kernel.org>
CC: Robin Holt <holt@sgi.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:14 +10:30
Kevin Cernekee
48fd11880b module: Fix performance regression on modules with large symbol tables
Looking at /proc/kallsyms, one starts to ponder whether all of the extra
strtab-related complexity in module.c is worth the memory savings.

Instead of making the add_kallsyms() loop even more complex, I tried the
other route of deleting the strmap logic and naively copying each string
into core_strtab with no consideration for consolidating duplicates.

Performance on an "already exists" insmod of nvidia.ko (runs
add_kallsyms() but does not actually initialize the module):

	Original scheme: 1.230s
	With naive copying: 0.058s

Extra space used: 35k (of a 408k module).

Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <73defb5e4bca04a6431392cc341112b1@localhost>
2012-01-13 09:32:14 +10:30
Kevin Cernekee
70b1e9161e module: Add comments describing how the "strmap" logic works
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:13 +10:30
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Ben Hutchings
2449b8ba07 module,bug: Add TAINT_OOT_MODULE flag for modules not built in-tree
Use of the GPL or a compatible licence doesn't necessarily make the code
any good.  We already consider staging modules to be suspect, and this
should also be true for out-of-tree modules which may receive very
little review.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Dave Jones <davej@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (patched oops-tracing.txt)
2011-11-07 07:54:42 +10:30
Ben Hutchings
1cd0d6c302 module: Enable dynamic debugging regardless of taint
Dynamic debugging is currently disabled for tainted modules, except
for TAINT_CRAP.  This prevents use of dynamic debugging for
out-of-tree modules once the next patch is applied.

This condition was apparently intended to avoid a crash if a force-
loaded module has an incompatible definition of dynamic debug
structures.  However, a administrator that forces us to load a module
is claiming that it *is* compatible even though it fails our version
checks.  If they are mistaken, there are any number of ways the module
could crash the system.

As a side-effect, proprietary and other tainted modules can now use
dynamic_debug.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-11-07 07:54:40 +10:30
Paul Gortmaker
9984de1a5a kernel: Map most files to use export.h instead of module.h
The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else.  Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

  -#include <linux/module.h>
  +#include <linux/export.h>

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:12 -04:00
Mathieu Desnoyers
b75ef8b44b Tracepoint: Dissociate from module mutex
Copy the information needed from struct module into a local module list
held within tracepoint.c from within the module coming/going notifier.

This vastly simplifies locking of tracepoint registration /
unregistration, because we don't have to take the module mutex to
register and unregister tracepoints anymore. Steven Rostedt ran into
dependency problems related to modules mutex vs kprobes mutex vs ftrace
mutex vs tracepoint mutex that seems to be hard to fix without removing
this dependency between tracepoint and module mutex. (note: it should be
investigated whether kprobes could benefit of being dissociated from the
modules mutex too.)

This also fixes module handling of tracepoint list iterators, because it
was expecting the list to be sorted by pointer address. Given we have
control on our own list now, it's OK to sort this list which has
tracepoints as its only purpose. The reason why this sorting is required
is to handle the fact that seq files (and any read() operation from
user-space) cannot hold the tracepoint mutex across multiple calls, so
list entries may vanish between calls. With sorting, the tracepoint
iterator becomes usable even if the list don't contain the exact item
pointed to by the iterator anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Jason Baron <jbaron@redhat.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-08-10 20:38:14 -04:00
Kay Sievers
88bfa32479 module: add /sys/module/<name>/uevent files
Userspace wants to manage module parameters with udev rules.
This currently only works for loaded modules, but not for
built-in ones.

To allow access to the built-in modules we need to
re-trigger all module load events that happened before any
userspace was running. We already do the same thing for all
devices, subsystems(buses) and drivers.

This adds the currently missing /sys/module/<name>/uevent files
to all module entries.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split & trivial fix)
2011-07-24 22:06:04 +09:30
Kay Sievers
4befb026cf module: change attr callbacks to take struct module_kobject
This simplifies the next patch, where we have an attribute on a
builtin module (ie. module == NULL).

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (split into 2)
2011-07-24 22:06:04 +09:30
Jonas Bonn
74e08fcf7b modules: add default loader hook implementations
The module loader code allows architectures to hook into the code by
providing a small number of entry points that each arch must implement.
This patch provides __weakly linked generic implementations of these
entry points for architectures that don't need to do anything special.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-24 22:06:04 +09:30
Linus Torvalds
1f3a8e093f Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits)
  staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_
  staging:iio: Trivial kconfig reorganization and uniformity improvements.
  staging:iio:documenation partial update.
  staging:iio: use pollfunc allocation helpers in remaining drivers.
  staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out.
  staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev.
  staging:iio:meter:ade7758: Use private data space from iio_allocate_device
  staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value.
  staging:iio: ring core cleanups + check if read_last available in lis3l02dq
  staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name.
  staging:iio: poll func allocation clean up.
  staging:iio:ad7780 trivial unused header cleanup.
  staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes
  staging:iio:adc:AD7780: Convert to new channel registration method
  staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv()
  staging:iio:adc: AD7606: Consitently use indio_dev
  staging:iio: Rip out helper for software rings.
  staging:iio:adc:AD7298: Use private data space from iio_allocate_device
  staging:iio: rationalization of different buffer implementation hooks.
  staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev.
  ...

Fix up trivial conflicts in
 - drivers/staging/intel_sst/intelmid.c: patches applied in both branches
 - drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling
 - drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion
2011-05-23 12:49:28 -07:00
Alessio Igor Bogani
9d63487f86 module: Use binary search in lookup_symbol()
The function is_exported() with its helper function lookup_symbol() are used to
verify if a provided symbol is effectively exported by the kernel or by the
modules. Now that both have their symbols sorted we can replace a linear search
with a binary search which provide a considerably speed-up.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 16:55:27 +09:30
Alessio Igor Bogani
403ed27846 module: Use the binary search for symbols resolution
Takes advantage of the order and locates symbols using binary search.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dirk Behme <dirk.behme@googlemail.com>
2011-05-19 16:55:27 +09:30
Rusty Russell
de4d8d5346 module: each_symbol_section instead of each_symbol
Instead of having a callback function for each symbol in the kernel,
have a callback for each array of symbols.

This eases the logic when we move to sorted symbols and binary search.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
2011-05-19 16:55:26 +09:30
Jan Glauber
01526ed083 module: split unset_section_ro_nx function.
Split the unprotect function into a function per section to make
the code more readable and add the missing static declaration.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 16:55:26 +09:30
Jan Glauber
448694a1d5 module: undo module RONX protection correctly.
While debugging I stumbled over two problems in the code that protects module
pages.

First issue is that disabling the protection before freeing init or unload of
a module is not symmetric with the enablement. For instance, if pages are set
to RO the page range from module_core to module_core + core_ro_size is
protected. If a module is unloaded the page range from module_core to
module_core + core_size is set back to RW.
So pages that were not set to RO are also changed to RW.
This is not critical but IMHO it should be symmetric.

Second issue is that while set_memory_rw & set_memory_ro are used for
RO/RW changes only set_memory_nx is involved for NX/X. One would await that
the inverse function is called when the NX protection should be removed,
which is not the case here, unless I'm missing something.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 16:55:26 +09:30
Jan Glauber
4d10380e72 module: zero mod->init_ro_size after init is freed.
Reset mod->init_ro_size to zero after the init part of a module is unloaded.
Otherwise we need to check if module->init is NULL in the unprotect functions
in the next patch.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 16:55:26 +09:30
Daniel J Blueman
5d05c70849 minor ANSI prototype sparse fix
Fix function prototype to be ANSI-C compliant, consistent with other
function prototypes, addressing a sparse warning.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-19 16:55:25 +09:30
Roland Vossen
7816c45bf1 modules: Enabled dynamic debugging for staging modules
Driver modules from the staging directory are marked 'tainted'
by module.c. Subsequently, tainted modules are denied dynamic
debugging. This is unwanted behavior, since staging modules should
be able to use the dynamic debugging mechanism.

Please merge this also into the staging-linus branch.

Signed-off-by: Roland Vossen <rvossen@broadcom.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-25 16:45:22 -07:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Kees Cook
9f36e2c448 printk: use %pK for /proc/kallsyms and /proc/modules
In an effort to reduce kernel address leaks that might be used to help
target kernel privilege escalation exploits, this patch uses %pK when
displaying addresses in /proc/kallsyms, /proc/modules, and
/sys/module/*/sections/*.

Note that this changes %x to %p, so some legitimately 0 values in
/proc/kallsyms would have changed from 00000000 to "(null)".  To avoid
this, "(null)" is not used when using the "K" format.  Anything that was
already successfully parsing "(null)" in addition to full hex digits
should have no problem with this change.  (Thanks to Joe Perches for the
suggestion.) Due to the %x to %p, "void *" casts are needed since these
addresses are already "unsigned long" everywhere internally, due to their
starting life as ELF section offsets.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Cc: Eugene Teo <eugene@redhat.com>
Cc: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:12 -07:00