Remove the overwriting of main_argv[] hack and use the values
from the udev object.
Pass the udev object to call_foreach_file().
In the udevstart case, export SUBSYSTEM and UDEVSTART to the
environment.
This makes the udev operation completely lockless by storing a
file for every node in /dev/.udevdb/*. This solves the problem
of concurrent udev processes deadlocking while waiting for each other
to release the file lock under heavy load.
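A minimal sketch of the one-file-per-device idea, not the actual udev code: every device gets its own file below the db directory, so there is no shared file that needs a lock. The helper name db_filename() and the use of '@' as a replacement for '/' are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Location of the per-device files, as named in the text above. */
#define UDEVDB_DIR "/dev/.udevdb"

/*
 * Hypothetical helper: build the name of the db file for one device.
 * Every '/' of the devpath is replaced, so the result is one flat file
 * name inside UDEVDB_DIR. Returns 0 on success, -1 if out is too small.
 */
int db_filename(const char *devpath, char *out, size_t size)
{
	size_t i;
	int len;

	assert(devpath != NULL && out != NULL);

	len = snprintf(out, size, "%s/%s", UDEVDB_DIR, devpath);
	if (len < 0 || (size_t)len >= size)
		return -1;

	/* flatten only the devpath part, not the directory prefix */
	for (i = strlen(UDEVDB_DIR) + 1; out[i] != '\0'; i++)
		if (out[i] == '/')
			out[i] = '@';
	return 0;
}
```

With this sketch, the devpath /block/sda ends up in /dev/.udevdb/@block@sda, and two udev processes working on different devices never touch the same file.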
Here is the first patch to clean up the internal processing of the
various stages of a udev event. It should not change any behavior,
but if your system depends on udev, please always test it before reboot :)
We pass only one generic structure around between add, remove,
namedev, db and dev_d handling and make all relevant data available
to all internal stages. All udev structures are renamed to "udev".
We replace the fake parameter by a flag in the udev structure.
We open the class device in the main binaries and not in udev_add, to
make it possible to use libsysfs for udevstart directory crawling.
The last sleep parameters are removed.
Here we remove all the sysfs sleep loops from udev as wait_for_sysfs
will do this for us and any other hotplug user. We still keep a small
blacklist of subsystems we don't care about but any missing entry here
will no longer lead to a spinning udev waiting for files.
> > > > That explains the spaces. What about stuff trailing %s, if %s does not
> > > > contain spaces. I.e, in the above example, model is ST336753LC and the
> > > > resulting device file is /dev/scsi_disks/some-proceding-stuff-
> > > > ST336753LC.
> > >
> > > I expect the model value has trailing spaces.
> > >
> > > You may look with:
> > > udevinfo -a -p /block/sdX
> >
> > Yes it does, and it seems for most SCSI devices, vendor and model will
> > have trailing spaces.
>
> It all depends on the vendor and model :)
>
> > I have included a patch to udev-036 to deal with
> > this issue. It trims off trailing whitespace for all sysfs attributes.
> > It might be better to trim off leading whitespace as well.
>
> We already trim it off when matching, but we also allow matching if you
> do put the spaces in there. This patch breaks that, right?
Correct, I have a new patch that trims after the comparison, so it
should work in both cases.
Here is the patch, that should prevent all of the known deadlocks with
corrupt tdb databases we discovered.
Thanks to Frank Steiner <fsteiner-mail@bio.ifi.lmu.de>, who tested all this
endlessly with a NFS mounted /dev. The conclusion is, that udev will not work
on filesystems without proper record locking, but we should prevent the
endless loops anyway. This patch implements:
o recovery from a corrupted udev database. udev will continue
without database support now, instead of doing nothing. So the node should
be generated in any case, remove will obviously not work for custom names.
o added iteration limits to the tdb-code at the places we discovered endless
loops. In the case tdb tries to find more than 100.000 entries with the
same hash, we better give up :)
o prevent a {all_partitions} loop caused by corrupt db data
o log all tdb errors to syslog
o switch sleep() to usleep() cause we want to use alarm()
Here is the correction for the dev.d/ scripts too. We should pass
the right argv[0] here too. A script may depend on the right value, as
udev does with udev/udevstart.
Here is the old version:
[pid 4692] execve("/etc/dev.d/default/log.dev", ["./udev", "block"], [/* 41 vars */]) = 0
This is the new one:
[pid 9832] execve("/etc/dev.d/default/log.dev", ["/etc/dev.d/default/log.dev", "block"], [/* 41 vars */]) = 0
When udevstart was running, we didn't set the environment and the
subsystem argument for the callouts of the dev.d/ scripts.
Here is a fix that sets them with every udevstart iteration, corrects
argv[0] to be the basename() only, not the whole path, and adds a test
for invoking callouts without arguments.
On Mon, 2004-09-13 at 01:56 +0200, Marco d'Itri wrote:
> Starting from udev 031, the %-arguments passed to PROGRAMs are not
> correct when the new udevstart code is being used.
>
> KERNEL="event[0-9]*", NAME="input/%k", PROGRAM="/etc/udev/inputdev.sh %k %n %M %m", RESULT="inputdev", MODE="0664", GROUP="video"
>
> generates this log (just echo $*):
>
> event0 0 13 64
> event0 0 13 64
> event0 0 13 64
>
> while the correct log (generated using the old shell script instead of
> udevstart) would be:
>
> event0 0 13 64
> event1 1 13 65
> event2 2 13 66
Yes, I can simulate this, please try the attached patch. I expect, that
it fixes it, cause we better not mangle the parsed config while matching
the rules.
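A small sketch of why the parsed rule must stay untouched, under the assumption of a much reduced formatter that only knows "%k" and "%%". The function name expand_format() is hypothetical. The template is const and the expansion goes into a separate buffer, so the next device still sees the unexpanded "%k".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, reduced formatter: copy tmpl to out, replace "%k" with
 * the kernel name and "%%" with a literal '%'. The template is never
 * written to, so the same rule can be expanded again for the next device.
 * Returns 0 on success, -1 if out is too small.
 */
int expand_format(const char *tmpl, const char *kernel, char *out, size_t size)
{
	size_t used = 0;
	size_t klen = strlen(kernel);

	assert(size > 0);

	while (*tmpl != '\0') {
		if (tmpl[0] == '%' && tmpl[1] == 'k') {
			if (used + klen >= size)
				return -1;
			memcpy(out + used, kernel, klen);
			used += klen;
			tmpl += 2;
			continue;
		}
		if (tmpl[0] == '%' && tmpl[1] == '%')
			tmpl++;		/* "%%" yields a single '%' */
		if (used + 1 >= size)
			return -1;
		out[used++] = *tmpl++;
	}
	out[used] = '\0';
	return 0;
}
```

Expanding the same template first for event0 and then for event1 gives two different strings, which is the behavior the report above asks for.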
On Mon, 2004-09-06 at 17:45 +0200, Kay Sievers wrote:
> On Mon, 2004-09-06 at 16:46 +0200, David Zeuthen wrote:
>
> Nice, I like it. It's an easy way to group device nodes of the same type,
> but coming from different kernel subsystems.
>
That's a good way of putting it, yeah.
> > Here's a patch against udev-030 that can help create compatibility
> > symlinks like /dev/cdrom, /dev/cdrom1 etc. The patch introduces a new
> > substitution type %C (for Compatibility) that can be used as follows
>
> I suggest using %e for enumeration here, cause "compatibility" can
> easily be misunderstood.
>
Good point, I've changed that.
> And we need a few lines added to the man page at udev.8.in :)
>
Done. I've also added an example.
Also, Kay pointed out offlist that the rules can be written to not
require a shell script; this actually works
KERNEL="sr*", NAME="%k", SYMLINK="cdrom%e"
KERNEL="scd*", NAME="%k", SYMLINK="cdrom%e"
KERNEL="pcd*", NAME="%k", SYMLINK="cdrom%e"
KERNEL="hd[a-z]", PROGRAM="/bin/cat /proc/ide/%k/media", RESULT="cdrom", NAME="%k", SYMLINK="cdrom%e"
KERNEL="fd[0-9]", NAME="%k", SYMLINK="floppy%e"
KERNEL="hd[a-z]", PROGRAM="/bin/cat /proc/ide/%k/media", RESULT="floppy", NAME="%k", SYMLINK="floppy%e"
New patch is attached.
David
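To illustrate the enumeration, here is a sketch of how a "%e"-like substitution could pick the first free name of the series cdrom, cdrom1, cdrom2 and so on. It assumes the names already in use are available as a plain array. The function names and the upper bound of 256 are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name is contained in the list of names already in use. */
int name_in_use(const char *name, const char *const used[], size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (strcmp(used[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Hypothetical helper: write the first free name of the series
 * base, base1, base2, ... to out.
 * Returns 0 on success, -1 if nothing fits or no free name is found.
 */
int enum_name(const char *base, const char *const used[], size_t count,
	      char *out, size_t size)
{
	int len;
	int i;

	assert(base != NULL && out != NULL);

	for (i = 0; i < 256; i++) {	/* arbitrary upper bound */
		if (i == 0)
			len = snprintf(out, size, "%s", base);
		else
			len = snprintf(out, size, "%s%d", base, i);
		if (len < 0 || (size_t)len >= size)
			return -1;
		if (!name_in_use(out, used, count))
			return 0;
	}
	return -1;
}
```

The first matching device gets the bare name, every following one the next free number, independent of the kernel subsystem it comes from.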
I noticed a comment in namedev.c which stated
"Figure out where the device symlink is at. For char devices this will
always be in the class_dev->path. But for block devices, it's
different. The main block device will have the device symlink in it's
path, but all partitions have the symlink in its parent directory. But
we need to watch out for block devices that do not have parents, yet
look like a partition (fd0, loop0, etc.). They all do not have a device
symlink yet. We do a sit and spin on waiting for them right now, we
should possibly have a whitelist for these devices here..."
I went ahead and created a whitelist for the block devices that look
like partitions (mainly by using devices.txt) and tested for any
performance increase that we would see. The whitelist only impacts
udevstart time depending on the state of UDEV_NO_SLEEP. Since the list
was short, I just did a sequential search and ordered the list in such a
way that those block devices which have more /dev entries (ex. loop0,
loop1, loop2, etc) appear sooner in the list and will thus be found
quicker. I've enclosed the patch and some of the performance results I
saw below. Basically, as the number of block devices without device
symlinks increased, the use of the whitelist improved udevstart
performance compared to just sitting and spinning. I just thought it
was interesting and thought I'd share. If you feel the patch is
beneficial please consider for merging. Also, if you'd be interested in
expanding the whitelist for other devices which are missing device
symlinks and seeing if there are added performance increases let me know
and I'll do what I can. Thanks,
Leann
Note: ex. loop represents all the loop devices (i.e. loop0, loop1,
loop2, etc)
block devices present with whitelist time
On Sat, Apr 17, 2004 at 03:30:29AM +0200, Kay Sievers wrote:
> On Sat, Apr 17, 2004 at 02:04:55AM +0200, Kay Sievers wrote:
> > On Fri, Apr 16, 2004 at 04:04:42PM -0700, Greg KH wrote:
> > > Oh, and if you run the latest udev_test.pl, we have a bunch more tests,
> > > including a few that fail, if you were looking for something to do :)
> >
> > Will do it. We need to change apply_format(). It tries to expand the '%%'
> > with the next iteration over the string and removes the '%'.
The tests are all successful now.
If this patch breaks something else, we simply have too few tests :)
Here is a patch to change the netdev handling in the database and for
the dev.d/ calls. It applies on top of the udevd.patch, cause klibc has
no sysinfo().
o netdev's are also put into our database now. I want this for the
udevruler gui to get a list of all handled devices.
All devices in the db are stamped with the system uptime value at
the creation time. 'udevinfo -d' prints it.
o the DEVPATH value is the key for udevdb, but if we rename
a netdev, the name is replaced in the kernel, so we add
the changed name to the db to match with the remove event.
NOTE: The dev.d/ scripts still get the original name from the
hotplug call. Should we replace DEVPATH with the new name too?
o We now only add a device to the db, if we have successfully created
the main node or successfully renamed a netdev. This is the main part
of the patch, cause I needed to clean the retval passing through all
the functions used for node creation.
o DEVNODE sounds a bit ugly for netdev's so I exported DEVNAME too.
Can we change the name?
o I've added a UDEV_NO_DEVD to possibly skip the script execution
and used it in udev-test.pl.
udevstart is the same horror now, if you have scripts with logging
statements in dev.d/ it takes minutes to finish, can we skip the
scripts here too?
o The get_device_type() function is changed to be more strict, cause
'udevinfo -a -p /block/' gets a class device for it and tries to
print the major/minor values.
o bugfix, the RESULT value has now a working newline removal and a test
for this case.
Hmm, Arndt Bergmann sent a patch like this one a few weeks ago and
I want to bring the question back, if we want to handle net device
naming with udev.
With this patch it is actually possible to specify something like this
in udev.rules:
KERNEL="dummy*", SYSFS{address}="00:00:00:00:00:00", SYSFS{features}="0x0", NAME="blind%n"
KERNEL="eth*", SYSFS{address}="00:0d:60:77:30:91", NAME="private"
and you will get:
[root@pim udev.kay]# cat /proc/net/dev
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo: 1500 30 0 0 0 0 0 0 1500 30 0 0 0 0 0 0
private: 278393 1114 0 0 0 0 0 0 153204 1468 0 0 0 0 0 0
sit0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
blind0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The udevinfo program is also working:
[root@pim udev.kay]# ./udevinfo -a -p /sys/class/net/private
looking at class device '/sys/class/net/private':
SYSFS{addr_len}="6"
SYSFS{address}="00:0d:60:77:30:91"
SYSFS{broadcast}="ff:ff:ff:ff:ff:ff"
SYSFS{features}="0x3a9"
SYSFS{flags}="0x1003"
SYSFS{ifindex}="2"
SYSFS{iflink}="2"
SYSFS{mtu}="1500"
SYSFS{tx_queue_len}="1000"
SYSFS{type}="1"
follow the class device's "device"
looking at the device chain at '/sys/devices/pci0000:00/0000:00:1e.0/0000:02:01.0':
BUS="pci"
ID="0000:02:01.0"
SYSFS{class}="0x020000"
SYSFS{detach_state}="0"
SYSFS{device}="0x101e"
SYSFS{irq}="11"
SYSFS{subsystem_device}="0x0549"
SYSFS{subsystem_vendor}="0x1014"
SYSFS{vendor}="0x8086"
The matching device will be renamed to the given name. The device name
will not be put into the udev database, cause the kernel renames the
device and the sysfs name disappears.
I like it, cause it plugs in nicely. We have all the naming features
and sysfs queries and walks inside of udev. The sysfs timing races
are already solved and the management tools are working for net devices
too. nameif can only match the MAC address now. udev can match any sysfs
value of the device tree the net device is connected to.
But right, net devices do not have device nodes :)
Patch from Andrey, which restores the ability to use RESULT values in a
"symlink only" rule. We need to call apply_format() directly after
the matching rule, otherwise the RESULT value may be lost.
mknod gets an uninitialized variable, which leads to interesting file
modes. The bug is in namedev: devices with no match must not use the
uninitialized stuff where dev points to.
On Mon, Mar 15, 2004 at 09:28:17PM +0100, Kay Sievers wrote:
> Here is a first simple and pretty stupid try to make a simple tool for
> composing of a udev rule.
>
> It reads the udevdb to get all currently handled devices and presents a
> list, where you can choose the device to compose the rule for.
>
> The composed rule is just printed out in a window, nothing else by now.
>
> Do we want something like this?
> Nevermind, I always wanted to know, how this newt thing works :)
Here is the next step, I still can't sleep and there are too many patches
pending to make something useful :)
Cause nobody wanted to play with me, I've made a screenshot.
The device list is sorted in alphabetical order now and if there are only
a few recently discovered devices, they are placed on top of the list.
For those who want to have a look:
http://vrfy.org/projects/udev/udevruler.png
The patch applies on top of today's mmap() patch. The db format is
changed to have the file and line number of the applied rule. So it
should be easy to edit the matching rule with this beast. It compiles
with "make all udevruler".
Here we replace the various fgets() with a mmap() call for the config
file reading, due to the reported performance problems with klibc.
Thanks to Patrick's testing, it makes a very small, close to nothing
speed gain for libc users, but a 6 times speed increase for klibc users
with a 1000 line config file.
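A rough sketch of what line handling looks like once the file is one block of memory instead of a stream read with fgets(). The buffer would come from mmap() in the real code; here it is just a pointer and a length, and both function names are chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the line starting at offset cur, not counting
 * the terminating newline. The buffer does not need to be
 * NUL-terminated, which is how a mapped file looks.
 */
size_t buf_get_line(const char *buf, size_t buflen, size_t cur)
{
	size_t count = 0;

	assert(cur <= buflen);
	while (cur + count < buflen && buf[cur + count] != '\n')
		count++;
	return count;
}

/* Count the lines that are neither empty nor start with '#'. */
size_t count_rule_lines(const char *buf, size_t buflen)
{
	size_t cur = 0;
	size_t lines = 0;

	while (cur < buflen) {
		size_t len = buf_get_line(buf, buflen, cur);

		if (len > 0 && buf[cur] != '#')
			lines++;
		cur += len + 1;		/* skip the line and its newline */
	}
	return lines;
}
```

Nothing is copied line by line here; the parser only moves an offset through the mapped file.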
I've created a udev_lib.[hc] for this and also moved all the generic
stuff from udev.h in there and uninlined the functions.
Martin Schwenke <martin@meltin.net> asked for this feature and posted a
patch:
The following patch almost lets me have the following configuration:
PROGRAM="/sbin/aliaser %b %k %n %M %m", RESULT="?*", NAME="%c{1}", SYMLINK="%c{2+}"
allowing me to specify an arbitrary number of symlinks by saying
"give me the second and later words".
Here is the actual version with tests and a few words in the man page.
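The part selection can be sketched like this, assuming the program output is separated by single spaces only. The function result_part() and its rest flag are hypothetical names: rest = 0 stands for "%c{N}", rest = 1 for "%c{N+}".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the n-th space-separated word (counting
 * from 1) of result to out. If rest is non-zero, copy everything from
 * that word to the end of the string instead.
 * Returns 0 on success, -1 if the word does not exist or does not fit.
 */
int result_part(const char *result, int n, int rest, char *out, size_t size)
{
	const char *pos = result;
	size_t len;
	int i;

	assert(n >= 1 && size > 0);

	pos += strspn(pos, " ");
	for (i = 1; i < n; i++) {
		pos += strcspn(pos, " ");	/* skip the word */
		pos += strspn(pos, " ");	/* skip the separators */
	}
	if (*pos == '\0')
		return -1;

	len = rest ? strlen(pos) : strcspn(pos, " ");
	if (len >= size)
		return -1;
	memcpy(out, pos, len);
	out[len] = '\0';
	return 0;
}
```

For a program that prints "node link1 link2 link3", the first word would become the NAME and the second and later words the SYMLINK list.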
Here I change the callout fork logic.
The current version is unable to read a pipe which is not flushed at once.
Now we read until it's closed.
The maximum argument count is calculated by the strlen now. We have 100
chars for our result buffer so we can't have more than 50 parameters.
So it's much more clear what will happen now and not some magic boundary
where we use shell behind it.
Parameters can be combined into one by using apostrophes.
This one works now:
BUS="scsi", PROGRAM="/bin/sh -c 'echo foo3 foo4 foo5 foo6 foo7 foo8 foo9 | sed s/foo9/bar9/'", KERNEL="sda3", NAME="%c{7}"
Two new tests are also added.
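How the apostrophes combine parameters can be sketched as follows. This is an illustration with a made-up function name, split_args(), and it only knows spaces and single quotes; it is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical splitter: cut cmd into arguments in place. Arguments are
 * separated by spaces, text between apostrophes stays one argument.
 * argv gets at most max - 1 entries plus a terminating NULL.
 * Returns the number of arguments found.
 */
int split_args(char *cmd, char *argv[], int max)
{
	char *p = cmd;
	int argc = 0;

	assert(max >= 1);

	while (*p != '\0' && argc < max - 1) {
		while (*p == ' ')
			p++;
		if (*p == '\0')
			break;
		if (*p == '\'') {
			p++;			/* drop the opening apostrophe */
			argv[argc++] = p;
			while (*p != '\0' && *p != '\'')
				p++;
		} else {
			argv[argc++] = p;
			while (*p != '\0' && *p != ' ')
				p++;
		}
		if (*p != '\0')
			*p++ = '\0';		/* terminate this argument */
	}
	argv[argc] = NULL;
	return argc;
}
```

The resulting array can be handed to execv() directly, so no shell is needed unless the rule asks for one with /bin/sh -c as in the example above.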
This allows to set the permissions along with the rule.
This is not a general replacement for the permissions config, but it
may be easier sometimes for the user to specify the permissions along
with the rule, cause the permissions config file wants the final node
name to match, which seems sometimes a bit difficult to guess, if
format % chars are used in the NAME field.
Any value not given in the rule is still read from the permissions
file or set to the default. This one will also work:
BUS="usb", KERNEL="video*", NAME="my-%k", OWNER="$local"
A few words to the man page are also added and add_perm_dev() is moved into
namedev_parse.c where it belongs.
Hey, I wrote the strn*() macros just 10 days ago and yesterday this trap
caught me with the %c{x} bug.
The names are misleading cause we all expect that the from field is limited by
the size argument, but we actually limit the overall size of the destination
string to prevent an overflow.
Here we rename all strn*() macros to str*max(). That should be
more self-explanatory.
Hey, it may never happen, that one wants to distinguish attributes by
trailing spaces, but we should not lose the control over it, just for
being lazy :)
Here we remove the trailing spaces of the sysfs attribute only if the
configured value to match doesn't have any trailing spaces by itself.
So if you put an attribute in a rule with spaces at the end, the sysfs
attribute _must_ match exactly.
Is that cool for everyone?
As usual, 2 tests are added for it with an artificial sysfs file and
a few words to the man page.
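The rule described above can be written down as a small comparison function. This is a sketch with plain string equality instead of the pattern matching udev really does, and attr_matches() is a name invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Length of s without trailing whitespace. */
size_t len_trimmed(const char *s)
{
	size_t len = strlen(s);

	while (len > 0 && isspace((unsigned char)s[len - 1]))
		len--;
	return len;
}

/*
 * Hypothetical comparison: if the configured value ends in whitespace,
 * the sysfs value must match exactly. Otherwise trailing whitespace of
 * the sysfs value is ignored. Returns 1 on match, 0 otherwise.
 */
int attr_matches(const char *rule_value, const char *sysfs_value)
{
	size_t rule_len = strlen(rule_value);
	size_t attr_len;

	if (rule_len > 0 && isspace((unsigned char)rule_value[rule_len - 1]))
		attr_len = strlen(sysfs_value);		/* exact match wanted */
	else
		attr_len = len_trimmed(sysfs_value);	/* ignore trailing spaces */

	return rule_len == attr_len &&
	       strncmp(rule_value, sysfs_value, rule_len) == 0;
}
```

So "IBM" in a rule matches a sysfs value of "IBM" followed by any number of spaces, while a rule value that itself ends in spaces only matches a sysfs value with exactly the same padding.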
On Wed, Mar 03, 2004 at 04:56:34PM -0800, Greg KH wrote:
> On Wed, Mar 03, 2004 at 03:57:04PM -0800, Patrick Mansfield wrote:
> >
> > Here is a patch for some new tests.
>
> Applied, thanks.
Here is a small improvement, which looks much better.
Hey Pat, thanks a lot for finding the recent bug, hope this one will
not break it again :)
On Wed, Mar 03, 2004 at 02:43:34PM -0800, Patrick Mansfield wrote:
> Here is a fix and a new test for the problem Atul hit, where if we have a
> NAME based on a result of the form:
>
> NAME="foo-%c{7}"
>
> udev truncates the name. Without any prefix (the foo- in this example),
> the rule was working OK.
Here is a fix for the fix :)
Sorry, I broke it yesterday.
Here I try to cleanup our various multifield iteration over the strings.
Inspired by our nice list.h we now have a macro to iterate over the string
and process the parts of it:
It makes the code more readable and we don't change the string while we
process it like the former strsep() does.
Example:
foreach_strpart(dev->symlink, " ", pos, len) {
if (strncmp(&dev->symlink[pos], find_name, len) != 0)
continue;
...
}
For the callout part selector %c{2} we separate now not only by space but
also newline and return characters, cause some programs may give multiline
values back. A possible RESULT match must contain wildcards for these
characters.
Also a bug in the recent udevinfo symlink query feature is fixed.
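For readers without the patch at hand, this is one way such an iterator macro could look. It is a sketch written for this text, not a copy of the macro in the patch: pos is used as an index into the string, as in the example above, and the string itself is only read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the iterator: pos is the index of the current part inside
 * str, len is its length. Empty parts between repeated separators are
 * skipped, and str is never modified.
 */
#define foreach_strpart(str, separators, pos, len) \
	for ((pos) = strspn((str), (separators)), \
	     (len) = strcspn(&(str)[(pos)], (separators)); \
	     (str)[(pos)] != '\0'; \
	     (pos) += (len), \
	     (pos) += strspn(&(str)[(pos)], (separators)), \
	     (len) = strcspn(&(str)[(pos)], (separators)))

/* Return 1 if name is one of the parts of list, 0 otherwise. */
int has_part(const char *list, const char *separators, const char *name)
{
	size_t pos;
	size_t len;

	foreach_strpart(list, separators, pos, len) {
		if (len != strlen(name))
			continue;	/* avoid matching a mere prefix */
		if (strncmp(&list[pos], name, len) != 0)
			continue;
		return 1;
	}
	return 0;
}
```

Passing " \n\r" as separators gives the multiline behavior described for the callout part selector.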
On Sat, Feb 28, 2004 at 09:56:32PM +0100, Kay Sievers wrote:
> Andrey pointed out that we don't print the right filename in the debug
> output. Here is a fix for that. It applies on top of Andrey's symlink
> patch, cause we are touching the same part of the code.
The copy/paste devil caught me :)
Here is a fixed one.
We carried the old callout part selector syntax for two releases
now after it was replaced by the new %c{1} syntax. So here we remove
the old syntax and use the code to possibly specify the maximum count
of chars to insert into the string. It will work with all of our format
chars.
I don't know if somebody will use it, but the code is already there :)
'%3s{vendor}' returns "IBM" now, instead of "IBM-ESXS".
Also added is a test for it and a few words in the man page.
Here is for now my last patch to the string handling for a rather
theoretical case, where the node is very very very long. :)
Analogous to strfieldcat(to, from), we now have a strintcat(to, i) macro,
which appends the ASCII representation of an integer to a string in a
safe way.
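A possible shape of such a macro, as a sketch only: it assumes that the destination is a real char array (so sizeof gives its size) and that it already holds a terminated string. The implementation via snprintf() is a choice made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append the decimal representation of i to the char array to.
 * The output is cut off at the end of the array and stays terminated.
 */
#define strintcat(to, i) \
do { \
	size_t used_ = strlen(to); \
	snprintf((to) + used_, sizeof(to) - used_, "%d", (i)); \
} while (0)
```

If the number does not fit, the string is truncated instead of running over the end of the field.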
Mainly a cleanup of the earlier patches with a few missing pieces
and some cosmetic changes.
I've moved the udev_init_config() to very early init, otherwise we
don't get any logging for the processing of the input. What would I
do without gdb :)
Greg, it's the 7th patch in your box to apply. I will stop now and
wait for you :)
Here we truncate our input strings from the environment to our
defined limit. It's a bit theoretical but better check for it.
It cleans up some magic length definitions and removes the code
duplication in udev, udevtest and udevsend.
udevd needs to be killed after installation, cause the message size
is changed with this patch.
Should we do this with the 'make install', like we do with the '.udevdb'?
As promised, here is the next round. We provide in addition to the
already used macros:
strfieldcpy(to, from)
strfieldcat(to, from)
the corresponding friends, if the size of the target is not known and
must be provided by the caller:
strnfieldcpy(to, from, maxsize)
strnfieldcat(to, from, maxsize)
and switch nearly all possibly unsafe users of strcat(), strncat(),
strcpy() and strncpy() to these safer macros.
The last known remaining issue seems the use of sprintf() and
snprintf(). I will take on it later today or tomorrow.
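For illustration, the four macros could be written roughly like this. Treat it as a sketch under the assumption that the "field" variants are only used on real char arrays and that the destination of a cat already holds a terminated string; the bodies are not copied from the patch.

```c
#include <assert.h>
#include <string.h>

/* Destination is a char array: the size is taken from sizeof. */
#define strfieldcpy(to, from) \
do { \
	strncpy((to), (from), sizeof(to) - 1); \
	(to)[sizeof(to) - 1] = '\0'; \
} while (0)

#define strfieldcat(to, from) \
do { \
	(to)[sizeof(to) - 1] = '\0'; \
	strncat((to), (from), sizeof(to) - strlen(to) - 1); \
} while (0)

/* Destination is only a pointer: the caller provides the size. */
#define strnfieldcpy(to, from, maxsize) \
do { \
	strncpy((to), (from), (maxsize) - 1); \
	(to)[(maxsize) - 1] = '\0'; \
} while (0)

#define strnfieldcat(to, from, maxsize) \
do { \
	(to)[(maxsize) - 1] = '\0'; \
	strncat((to), (from), (maxsize) - strlen(to) - 1); \
} while (0)
```

In all four cases the size limits the whole destination string, not the number of characters taken from the source, and the result is always terminated.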
On Tue, Feb 24, 2004 at 11:50:52PM +0100, Kay Sievers wrote:
> Here is the first step towards a safer string handling.
> More will follow, but for now only the easy ones :)
>
> Thanks to all who pointed this out. strncat() isn't a nice function. We
> all should remember that the destination string is not terminated if the
> given length is shorter than the strlen of the source string.
>
> And shame on the various implementers of strfieldcat() I found in the
> unapplied patches on this list, it's not really better than strncpy()
> and hides the real problem.
Hmm, bk didn't check in one file, maybe I edited it again as root.
Nevermind, here is the more complete version.