By aligning n_buckets to power of 2 we can turn the "bucket = hashval %
n_buckets" into a less expensive bucket = hashval & (n_buckets - 1).
This removes the DIV instruction as shown below.
Before:
xor %edx,%edx
divl 0x8(%rbx)
mov %edx,%eax
add $0x1,%rax
shl $0x4,%rax
add %rbx,%rax
After:
lea -0x1(%rdi),%edx
and %edx,%eax
add $0x1,%rax
shl $0x4,%rax
add %rbx,%rax
With a microbenchmark, measuring the time to locate the bucket (i.e.
time_to_calculate_hashval + time_to_calculate_bucket_position) we have
the results below (time in clock cycles):
keylen before after
2-10 79.0 61.9 (-21.65%)
11-17 81.0 64.4 (-20.48%)
18-25 90.0 73.2 (-18.69%)
26-32 104.7 87.0 (-16.82%)
33-40 108.4 89.6 (-17.37%)
41-48 111.2 91.9 (-17.38%)
49-55 120.1 102.1 (-15.04%)
56-63 134.4 115.7 (-13.91%)
As expected the gain is constant, regardless of the key length.
The time to clculate the hashval varies with the key length, which
explains the bigger gains for short keys.
Add static inline function to align a value to it's next power of 2.
This is commonly done by a SWAR like the one in:
http://aggregate.org/MAGIC/#Next Largest Power of 2
However a microbench shows that the implementation herer is a faster.
It doesn't really impact the possible user of this function, but it's
interesting nonetheless.
Using a x86_64 i7 Ivy Bridge it shows a ~4% advantage by using clz
instead instead of the OR and SHL chain. And this is by using a BSR
since Ivy Bridge doesn't have LZCNT. New Haswell processors have the
LZCNT instruction which can make this even better. ARM also has a CLZ
instruction so it should be better, too.
Code used to test:
...
v = val[i];
t1 = get_cycles(0);
a = ALIGN_POWER2(v);
t1 = get_cycles(t1);
t2 = get_cycles(0);
v = nlpo2(v);
t2 = get_cycles(t2);
printf("%u\t%llu\t%llu\t%d\n", v, t1, t2, v == a);
...
In which val is an array of 20 random unsigned int, nlop2 is the SWAR
implementation and get_cycles uses RDTSC to measure the performance.
Averages:
ALIGN_POWER2: 30 cycles
nlop2: 31.4 cycles
It's used in so many places without checking, that's really pointless to
check for it in macro.h.
Also remove AC_C_TYPEOF from configure.ac since we don't use -ansi.
Commit 8efede20ef ("Use _Static_assert") introduced the usage of
_Static_assert(). However, _Static_assert() is a fairly new thing,
since it was introduced only in gcc 4.6. In order to support older
compilers, this patch adds a configure.in test that checks whether
_Static_assert() is usable or not, and adjust the behavior of the
assert_cc() macro accordingly.
With readdir_r() we should be providing enough space to store the dir
name. This could be accomplished by define an union like systemd does:
union dirent_storage {
struct dirent de;
uint8_t storage[offsetof(struct dirent, d_name) +
((NAME_MAX + 1 + sizeof(long)) & ~(sizeof(long) - 1))];
};
However in all places that we use readdir_r() we have no concerns about
reentrance nor we have problems with threads. Thus use the simpler
readdir() instead.
We also remove the error logging here (that could be added back by
checking errno), but it was not adding much value so it's gone.
It occurred to an openSUSE user that our mkinitrd would throw a
warning when used with kmod:
libkmod: conf_files_list: unsupported file mode /dev/null: 0x21b6
Grepping for the error message revealed that there might be a missing
"else" keyword here, since it is unusual to put an "if" directly after
closing brace.
At least in qemu 1.4.1 for vexpress/arm-cortexa9, this resulted in an
illegal instruction error. Solve that by returning an error when
__NR_finit_module is -1.
This reverts commit 38829712e5. It fixes
the problem, but it breaks the testsuite for those who don't have
__NR_finit_module. The testsuite would have to make the same check.
Instead, I'm reverting this change and I'm going to apply another patch
from Jan Luebbe who got this right from the beginning.
There are several exported enums by libkmod without document, this patch
mainly added documentation for below enums like the way kmod_resources
be documented in.
* kmod_index
* kmod_remove
* kmod_insert
* kmod_probe
* kmod_filter
* kmod_module_initstate
This is not the best way to document these exported enums, however, it's
the simple way due to gtkdoc limits. It doesn't support export plain
enum like below: see https://bugzilla.gnome.org/show_bug.cgi?id=657444
---------8<-------head.h--------------8<-----------
...
enum foo {
...
};
...
---------8<-------end of head.h-------8<-----------
---------8<-------source.c------------8<-----------
...
/**
* document for foo here
*/
...
typedef enum foo foo;
...
---------8<-------end of source.c-----8<----------
Add __attribute__((format)) to log_filep() and _show() functions, fixing
the bugs they found in the source code.
For functions that receive va_list instead of being variadic functions
we put 0 in the last argument, so at least the string is checked and we
get warnings of -Wformat-nonliteral type. So, it's better than adding a
pragma here to shut up the warning.
Check for finit_module() and don't use our own static inline function if
there's such function in libc (or another lib).
In testsuite we need to unconditionally define HAVE_FINIT_MODULE because
we want to override this function, and never use the static inline one
in missing.h
Depending on kernel header and simply not passing the flags in
finit_module() if this header is not found is not good.
Add a missing.h header in which stuff like this should be added.
"The secure_getenv() function is intended for use in general-purpose
libraries to avoid vulnerabilities that could occur if set-user-ID or
set-group-ID programs accidentally trusted the environment."
Fix compilation issue with musl-libc:
CC libkmod/libkmod-list.lo
In file included from libkmod/libkmod-private.h:183:0,
from libkmod/libkmod-list.c:24:
libkmod/libkmod-util.h:33:45: warning: 'struct stat' declared inside parameter list [enabled by default]
libkmod/libkmod-util.h:33:45: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
When a module is being loaded directly from disk (no compression, etc),
pass the file descriptor to the new finit_module() syscall. If the
finit_module syscall is exported by the kernel syscall headers, use it.
Additionally, if the kernel's module.h file is available, map kmod flags
to finit_module flags.
If the module is built with CONFIG_MODULE_SIG, add the the signer's
name, hexadecimal key id and hash algorithm to the list returned in
kmod_module_get_info(). The modinfo output then looks like this:
filename: /home/mmarek/kmod/testsuite/rootfs-pristine/test-modinfo/ext4-x86_64-sha256.ko
license: GPL
description: Fourth Extended Filesystem
author: Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
alias: ext3
alias: ext2
depends: mbcache,jbd2
intree: Y
vermagic: 3.7.0 SMP mod_unload
signer: Magrathea: Glacier signing key
sig_key: E3:C8:FC:A7:3F:B3:1D:DE:84:81:EF:38:E3:4C:DE:4B:0C:FD:1B:F9
sig_hashalgo: sha256
The signature algorithm (RSA) and key identifier type (X509) are not
displayed, because they are constant information for every signed
module. But it would be trivial to add this. Note: No attempt is made at
verifying the signature, I don't think that modinfo is the right tool
for this.
When told to force load a module, we were removing only the value of
vermagic instead of the complete entry.
Philippe De Swert (philippe.deswert@jollamobile.com) sent a patch that
was additionally mangling also the last two chars of the key
("vermagic="). Instead of creating an invalid entry in .modinfo section
like this, this patch removes the complete entry, key + value, by
zeroing the entire string.
Much thanks to Philippe who found the issue and pointed to the fix.
If we are accessing several times the modules and reading some sections
by sucessive calls to the functions below, we are incurring in a penalty
of having to open, parse the header and close the file. For each
function.
- kmod_module_get_info()
- kmod_module_get_versions()
- kmod_module_get_symbols()
- kmod_module_get_dependency_symbols()
These functions are particularly important to depmod. It calls all of
them, for each module. Moreover there's a huge bottleneck in the open
operation if we are using compression. Every time we open the module we
need to uncompress the file and after getting the information we need we
discard the result. This is clearly shown by profiling depmod with perf
(record + report), using compressed modules:
64.07% depmod libz.so.1.2.7 [.] 0x00000000000074b8 ◆
18.18% depmod libz.so.1.2.7 [.] crc32 ▒
2.42% depmod libz.so.1.2.7 [.] inflate ▒
1.17% depmod libc-2.16.so [.] __memcpy_ssse3_back ▒
0.96% depmod [kernel.kallsyms] [k] copy_user_generic_string ▒
0.89% depmod libc-2.16.so [.] __strcmp_sse42 ▒
0.82% depmod [kernel.kallsyms] [k] hrtimer_interrupt ▒
0.77% depmod libc-2.16.so [.] _int_malloc ▒
0.44% depmod kmod-nolib [.] kmod_elf_get_strings ▒
0.41% depmod kmod-nolib [.] kmod_elf_get_dependency_symbols ▒
0.37% depmod kmod-nolib [.] kmod_elf_get_section ▒
0.36% depmod kmod-nolib [.] kmod_elf_get_symbols
...
Average of running depmod 5 times, dropping caches between them, in a
slow spinning disk:
Before: 12.25 +- 0.20
After: 8.20 +- 0.21
m-i-t: 9.62 +- 0.27
So this patch leads to an improvement of ~33% over unpatched version,
ending up with 15% speedup over module-init-tools.
Although the hash table implementation allows passing a callback function
to free a value when it is removed from the hash table, hash_del() wasn't
freeing it if it was provided. Now it does.
As a bonus, it now checks if the callback is set in hash_add() as well.
This is a broken option that only leads to misery and incompatabilities
with other systems. Kbuild doesn't come close to supporting directories
other than /lib/modules with several targets simply failing without
hacky fixes. Simply remove the option and all traces of it, as it
doesn't make sense in today's world.
With this flag kmod_module_probe_insert_module() check if module is
blacklisted only if it's also an alias. This is needed in order to allow
blacklisting a module by name and effectively blacklisting all its
aliases as module-init-tools was doing.
Before this patch we could load pcspkr module as follows:
/etc/modprobe.d/test.conf:
alias yay pcspkr
blacklist pcspkr
$ modprobe yay
Now libkmod has support to blacklist "yay" because "pcspkr" is blacklisted.
Only the public header maintains #ifndef in the header, together with
pragma. The other ones contain only pragma.
As reported by Shawn Landden on systemd mailing list this is compatible
with all major compilers and gcc has this since version 3.3.