There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
SELinux would like to implement a new labeling behavior of newly created
inodes. We currently label new inodes based on the parent and the creating
process. This new behavior would also take into account the name of the
new object when deciding the new label. This is not the (supposed) full path,
just the last component of the path.
This is very useful because creating /etc/shadow is different than creating
/etc/passwd but the kernel hooks are unable to differentiate these
operations. We currently require that userspace realize it is doing some
difficult operation like that and than userspace jumps through SELinux hoops
to get things set up correctly. This patch does not implement new
behavior, that is obviously contained in a seperate SELinux patch, but it
does pass the needed name down to the correct LSM hook. If no such name
exists it is fine to pass NULL.
Signed-off-by: Eric Paris <eparis@redhat.com>
When JFFS2 is used for large volumes, the mount times are quite long.
Increasing the hash size provides a significant speed boost on the OLPC
XO-1 laptop.
Add logic that dynamically selects a hash size based on the size of
the medium. A 64mb medium will result in a hash size of 128, and a 512mb
medium will result in a hash size of 1024.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We're about to start calling this from the jffs2_garbage_collect_pass(), and
we'll want to know whether it actually did anything or not.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
'rb_prev()', 'rb_next()' and 'rb_replace_node()' are declared in
include/linux/rbtree.h, no need for JFFS2 to re-declare them. I
believe these are left-overs from the old days when the common
RB tree code did not have those call and JFFS2 had private
implementation.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
To support NFS export, we need to know the parent inode of directories.
Rather than growing the jffs2_inode_cache structure, share space with
the nlink field -- which was always set to 1 for directories anyway.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Haven't had any complaints about it recently, despite having the test
code enabled to verify that the calculated length is correct.
Kill it off, just by #undef TEST_TOTLEN for now; removing it for real
can come a little later.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This should never happen unless there's corruption on the medium and the
actual data nodes go missing. But the failure mode (an oops when we assume
the fragtree isn't empty and go looking for its last node) isn't useful.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
In particular, remove the bit in the LICENCE file about contacting
Red Hat for alternative arrangements. Their errant IS department broke
that arrangement a long time ago -- the policy of collecting copyright
assignments from contributors came to an end when the plug was pulled on
the servers hosting the project, without notice or reason.
We do still dual-license it for use with eCos, with the GPL+exception
licence approved by the FSF as being GPL-compatible. It's just that nobody
has the right to license it differently.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We originally used to read every node and allocate a jffs2_tmp_dnode_info
structure for each, before processing them in (reverse) version order
and discarding the ones which are obsoleted by later nodes.
With huge logfiles, this behaviour caused memory problems. For example, a
file involved in OLPC trac #1292 has 1822391 nodes, and would cause the XO
machine to run out of memory during the first stage of read_inode().
Instead of just inserting nodes into a tree in version order as we find
them, we now put them into a tree in order of their offset within the
file, which allows us to immediately discard nodes which are completely
obsoleted.
We don't use a full tree with 'fragments' pointing to the real data
structure, as we do in the normal fragtree. We sort only on the start
address, and add an 'overlapped' flag to the tmp_dnode_info to indicate
that the node in question is (partially) overlapped by another.
When the scan is complete, we start at the end of the file, adding each
node to a real fragtree as before. Where the node is non-overlapped, we
just add it (it doesn't matter that it's not the latest version; there is
no overlap). When the node at the end of the tree _is_ overlapped, we sort
it and all its overlapping nodes into version order and then add them to
the fragtree in that order.
This 'early discard' reduces the peak allocation of tmp_dnode_info
structures from 1.8M to a mere 62872 (3.5%) in the degenerate case
referenced above.
This version of the patch also correctly rememembers the highest node
version# seen for an inode when it's scanned.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
When compiling a LE-capable JFFS2 on PowerPC, wbuf.c fails to compile:
fs/jffs2/wbuf.c:973: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:973: error: initializer element is not constant
fs/jffs2/wbuf.c:973: error: (near initialization for ‘oob_cleanmarker.magic’)
fs/jffs2/wbuf.c:974: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:974: error: initializer element is not constant
fs/jffs2/wbuf.c:974: error: (near initialization for ‘oob_cleanmarker.nodetype’)
fs/jffs2/wbuf.c:975: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:976: error: initializer element is not constant
fs/jffs2/wbuf.c:976: error: (near initialization for ‘oob_cleanmarker.totlen’)
Provide constant_cpu_to_je{16,32} functions, and use them for initialising the
offending structure.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Use rb_first() and rb_last() to implement frag_first() and frag_last().
Signed-off-by: Akinbou Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch makes the needlessly global jffs2_obsolete_node_frag()
static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* git://git.infradead.org/~dwmw2/rbtree-2.6:
[RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Update UML kernel/physmem.c to use rb_parent() accessor macro
[RBTREE] Update hrtimers to use rb_parent() accessor macro.
[RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
[RBTREE] Merge colour and parent fields of struct rb_node.
[RBTREE] Remove dead code in rb_erase()
[RBTREE] Update JFFS2 to use rb_parent() accessor macro.
[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
[RBTREE] Update key.c to use rb_parent() accessor macro.
[RBTREE] Update ext3 to use rb_parent() accessor macro.
[RBTREE] Change rbtree off-tree marking in I/O schedulers.
[RBTREE] Add accessor macros for colour and parent fields of rb_node
This allows us to drop another pointer from the struct jffs2_raw_node_ref,
shrinking it to 8 bytes on 32-bit machines (if the TEST_TOTLEN) paranoia
check is turned off, which will be committed soon).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Preallocation of refs is shortly going to be a per-eraseblock thing,
rather than per-filesystem. Add the required argument to the function.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
As the first step towards eliminating the ref->next_phys member and saving
memory by using an _array_ of struct jffs2_raw_node_ref per eraseblock,
stop the write functions from allocating their own refs; have them just
_reserve_ the appropriate number instead. Then jffs2_link_node_ref() can
just fill them in.
Use a linked list of pre-allocated refs in the superblock, for now. Once
we switch to an array, it'll just be a case of extending that array.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We don't need the upper layers to deal with the physical offset. It's
_always_ c->nextblock->offset + c->sector_size - c->nextblock->free_size
so we might as well just let the actual write functions deal with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We'll be using a proper list of nodes in the jffs2_xattr_datum and
jffs2_xattr_ref structures, because the existing code to overwrite
them is just broken. Put it in the common part at the front of the
structure which is shared with the jffs2_inode_cache, so that the
jffs2_link_node_ref() function can do the right thing.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Let's avoid the potential for forgetting to set ref->next_in_ino, by doing
it within jffs2_link_node_ref() instead.
This highlights the ugliness of what we're currently doing with
xattr_datum and xattr_ref structures -- we should find a nicer way of
dealing with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Well, almost. We'll actually keep a 'TEST_TOTLEN' macro set for now, and keep
doing some paranoia checks to make sure it's all working correctly. But if
TEST_TOTLEN is unset, the size of struct jffs2_raw_node_ref drops from 16
bytes to 12 on 32-bit machines. That's a saving of about half a megabyte of
memory on the OLPC prototype board, with 125K or so nodes in its 512MiB of
flash.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If __totlen is going away, we need to pass the length in separately.
Also stop callers from needlessly setting ref->next_phys to NULL,
since that's done for them... and since that'll also be going away soon.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
To eliminate the __totlen field from struct jffs2_raw_node_ref, we need
to allocate nodes for dirty space instead of just tweaking the accounting
data. Introduce jffs2_scan_dirty_space() in preparation for that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
For RWCOMPAT and ROCOMPAT nodes, we should still allow the mount to
succeed. Just abandon the summary and fall through to the full scan.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We should preserve these when we come to garbage collect them, not let
them get erased. Use jffs2_garbage_collect_pristine() for this, and make
sure the summary code copes -- just refrain from writing a summary for any
block which contains a node we don't understand.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The same sequence of code was repeated in many places, to add a new
struct jffs2_raw_node_ref to an eraseblock and adjust the space accounting
accordingly. Move it out-of-line.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Device node major/minor numbers are just stored in the payload of a single
data node. Just extend that to 4 bytes and use new_encode_dev() for it.
We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo).
This preserves backwards compatibility with older code as much as
possible. If we do make devices with major or minor numbers above 255, and
then mount the file system with the old code, it'll just read the first
two bytes and get the numbers wrong. If it comes to garbage-collect it,
it'll then write back those wrong numbers. But that's about the best we
can expect.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch can reduce 4-byte of memory usage per inode_cache.
[4/10] jffs2-xattr-v5.1-04-remove_ilist_from_ic.patch
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
This attached patches provide xattr support including POSIX-ACL and
SELinux support on JFFS2 (version.5).
There are some significant differences from previous version posted
at last December.
The biggest change is addition of EBS(Erase Block Summary) support.
Currently, both kernel and usermode utility (sumtool) can recognize
xattr nodes which have JFFS2_NODETYPE_XATTR/_XREF nodetype.
In addition, some bugs are fixed.
- A potential race condition was fixed.
- Unexpected fail when updating a xattr by same name/value pair was fixed.
- A bug when removing xattr name/value pair was fixed.
The fundamental structures (such as using two new nodetypes and exclusion
mechanism by rwsem) are unchanged. But most of implementation were reviewed
and updated if necessary.
Espacially, we had to change several internal implementations related to
load_xattr_datum() to avoid a potential race condition.
[1/2] xattr_on_jffs2.kernel.version-5.patch
[2/2] xattr_on_jffs2.utils.version-5.patch
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The goal of summary is to speed up the mount time. Erase block summary (EBS)
stores summary information at the end of every (closed) erase block. It is
no longer necessary to scan all nodes separetly (and read all pages of them)
just read this "small" summary, where every information is stored which is
needed at mount time.
This summary information is stored in a JFFS2_FEATURE_RWCOMPAT_DELETE. During
the mount process if there is no summary info the orignal scan process will
be executed. EBS works with NAND and NOR flashes, too.
There is a user space tool called sumtool to generate this summary
information for a JFFS2 image.
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove support for virtual blocks, which are build by
concatenation of multiple physical erase blocks.
For more information please read the MTD mailing list thread
"[PATCH] remove support for virtual blocks"
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
From: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Instead of building fragtree starting from node with the smallest version
number, start from the highest. This helps to avoid reading and checking
obsolete nodes.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move functions to read inodes into readinode.c
Move functions to handle fragtree and dentry lists into nodelist.[ch]
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>