After updating alloc_dinode counts in ocfs2_alloc_dinode_update_counts(),
if ocfs2_alloc_dinode_update_bitmap() failed, there is a rare case that
some space may be lost.
So, roll back alloc_dinode counts when ocfs2_block_group_set_bits()
failed.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Younger Liu <younger.liucn@gmail.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_do_flock() calls ocfs2_file_lock() to get the cross-node clock and
then call flock_lock_file_wait() to compete with local processes. In
case flock_lock_file_wait() failed, say -ENOMEM, clean up work is not
done. This patch adds the cleanup --drop the cross-node lock which was
just granted.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ensure that ocfs2_update_inode_fsync_trans() is called any time we touch
an inode in a given transaction. This is a follow-on to the previous
patch to reduce lock contention and deadlocking during an fsync
operation.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Wengang <wen.gang.wang@oracle.com>
Cc: Greg Marsden <greg.marsden@oracle.com>
Cc: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do not put bh when buffer_uptodate failed in ocfs2_write_block and
ocfs2_write_super_or_backup, because it will put bh in b_end_io.
Otherwise it will hit a warning "VFS: brelse: Trying to free free
buffer".
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When ocfs2_create_new_inode_locks() return error, inode open lock may
not be obtainted for this inode. So other nodes can remove this file
and free dinode when inode still remain in memory on this node, which is
not correct and may trigger BUG. So __ocfs2_mknod_locked should return
error when ocfs2_create_new_inode_locks() failed.
Node_1 Node_2
create fileA, call ocfs2_mknod()
-> ocfs2_get_init_inode(), allocate inodeA
-> ocfs2_claim_new_inode(), claim dinode(dinodeA)
-> call ocfs2_create_new_inode_locks(),
create open lock failed, return error
-> __ocfs2_mknod_locked return success
unlink fileA
try open lock succeed,
and free dinodeA
create another file, call ocfs2_mknod()
-> ocfs2_get_init_inode(), allocate inodeB
-> ocfs2_claim_new_inode(), as Node_2 had freed dinodeA,
so claim dinodeA and update generation for dinodeA
call __ocfs2_drop_dl_inodes()->ocfs2_delete_inode()
to free inodeA, and finally triggers BUG
on(inode->i_generation != le32_to_cpu(fe->i_generation))
in function ocfs2_inode_lock_update().
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 18108070
ocfs2_xattr_extend_allocation() hits panic when creating xattr during
data extent alloc phase. The problem occurs if due to local alloc
fragmentation, clusters are spread over multiple extents. In this case
ocfs2_add_clusters_in_btree() finds no space to store more than one
extent record and therefore fails returning RESTART_META. The situation
is anticipated for xattr update case but not xattr create case. This
fix simply ports that code to create case.
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In dlm_query_region_handler(), once kmalloc failed, it will unlock
dlm_domain_lock without lock first, then deadlock happens.
Signed-off-by: Zhonghua Guo <guozhonghua@h3c.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Tested-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
llseek requires ocfs2 inode lock for updating the file size in SEEK_END.
because the file size maybe update on another node.
This bug can be reproduce the following scenario: at first, we dd a test
fileA, the file size is 10k.
on NodeA:
---------
1) open the test fileA, lseek the end of file. and print the position.
2) close the test fileA
on NodeB:
1) open the test fileA, append the 5k data to test FileA.
2) lseek the end of file. and print the position.
3) close file.
At first we run the test program1 on NodeA , the result is 10k. And
then run the test program2 on NodeB, the result is 15k. At last, we run
the test program1 on NodeA again, the result is 10k.
After applying this patch the three step result is 15k.
test result: 1000000 times lseek call;
index lseek with inode lock (unit:us) lseek without inode lock (unit:us)
1 1168162 555383
2 1168011 549504
3 1170538 549396
4 1170375 551685
5 1170444 556719
6 1174364 555307
7 1163294 551552
8 1170080 549350
9 1162464 553700
10 1165441 552594
avg 1168317 552519
avg with lock - avg without lock = 615798
(avg with lock - avg without lock)/1000000=0.615798 us
Signed-off-by: Jensen <shencanquan@huawei.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In o2nm_cluster, cl_idle_timeout_ms, cl_keepalive_delay_ms, as well as
cl_reconnect_delay_ms, are defined as type of unsigned int. So we
should also use unsigned int in the helper functions.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following patches are reverted in this patch because these patches
caused performance regression in the remote unlink() calls.
ea455f8ab6 - ocfs2: Push out dropping of dentry lock to ocfs2_wq
f7b1aa69be - ocfs2: Fix deadlock on umount
5fd1318937 - ocfs2: Don't oops in ocfs2_kill_sb on a failed mount
Previous patches in this series removed the possible deadlocks from
downconvert thread so the above patches shouldn't be needed anymore.
The regression is caused because these patches delay the iput() in case
of dentry unlocks. This also delays the unlocking of the open lockres.
The open lockresource is required to test if the inode can be wiped from
disk or not. When the deleting node does not get the open lock, it
marks it as orphan (even though it is not in use by another
node/process) and causes a journal checkpoint. This delays operations
following the inode eviction. This also moves the inode to the orphaned
inode which further causes more I/O and a lot of unneccessary orphans.
The following script can be used to generate the load causing issues:
declare -a create
declare -a remove
declare -a iterations=(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384)
unique="`mktemp -u XXXXX`"
script="/tmp/idontknow-${unique}.sh"
cat <<EOF > "${script}"
for n in {1..8}; do mkdir -p test/dir\${n}
eval touch test/dir\${n}/foo{1.."\$1"}
done
EOF
chmod 700 "${script}"
function fcreate ()
{
exec 2>&1 /usr/bin/time --format=%E "${script}" "$1"
}
function fremove ()
{
exec 2>&1 /usr/bin/time --format=%E ssh node2 "cd `pwd`; rm -Rf test*"
}
function fcp ()
{
exec 2>&1 /usr/bin/time --format=%E ssh node3 "cd `pwd`; cp -R test test.new"
}
echo -------------------------------------------------
echo "| # files | create #s | copy #s | remove #s |"
echo -------------------------------------------------
for ((x=0; x < ${#iterations[*]} ; x++)) do
create[$x]="`fcreate ${iterations[$x]}`"
copy[$x]="`fcp ${iterations[$x]}`"
remove[$x]="`fremove`"
printf "| %8d | %9s | %9s | %9s |\n" ${iterations[$x]} ${create[$x]} ${copy[$x]} ${remove[$x]}
done
rm "${script}"
echo "------------------------"
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we are dropping last inode reference from downconvert thread, we will
end up calling ocfs2_mark_lockres_freeing() which can block if the lock
we are freeing is queued thus creating an A-A deadlock. Luckily, since
we are the downconvert thread, we can immediately dequeue the lock and
thus avoid waiting in this case.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We cannot drop last dquot reference from downconvert thread as that
creates the following deadlock:
NODE 1 NODE2
holds dentry lock for 'foo'
holds inode lock for GLOBAL_BITMAP_SYSTEM_INODE
dquot_initialize(bar)
ocfs2_dquot_acquire()
ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
...
downconvert thread (triggered from another
node or a different process from NODE2)
ocfs2_dentry_post_unlock()
...
iput(foo)
ocfs2_evict_inode(foo)
ocfs2_clear_inode(foo)
dquot_drop(inode)
...
ocfs2_dquot_release()
ocfs2_inode_lock(USER_QUOTA_SYSTEM_INODE)
- blocks
finds we need more space in
quota file
...
ocfs2_extend_no_holes()
ocfs2_inode_lock(GLOBAL_BITMAP_SYSTEM_INODE)
- deadlocks waiting for
downconvert thread
We solve the problem by postponing dropping of the last dquot reference to
a workqueue if it happens from the downconvert thread.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide dqgrab() function to get quota structure reference when we are
sure it already has at least one active reference. Make use of this
function inside quota code.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move dquot_initalize() call in ocfs2_delete_inode() after the moment we
verify inode is actually a sane one to delete. We certainly don't want
to initialize quota for system inodes etc. This also avoids calling
into quota code from downconvert thread.
Add more details into the comment why bailing out from
ocfs2_delete_inode() when we are in downconvert thread is OK.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The flag was never set, delete it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a part of the nocontrold feature which was incorporated sometime
back.
This is required for backward compatibility of the tools, specifically
the scenario where the tools with recovery callback is used with a
kernel not using the recovery callbacks (older kernel + newer tools).
The tools look for this file to understand if the kernel supports DLM
recovery callbacks.
For kernels which support recovery callbacks but will miss this patch,
ocfs2 will continue to use the older API and would still be able to
mount the filesystem.
[akpm@linux-foundation.org: simplify]
[sfr@canb.auug.org.au: VERIFY_OCTAL_PERMISSIONS fix up]
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a race window in dlm_do_recovery() between dlm_remaster_locks()
and dlm_reset_recovery() when the recovery master nearly finish the
recovery process for a dead node. After the master sends FINALIZE_RECO
message in dlm_remaster_locks(), another node may become the recovery
master for another dead node, and then send the BEGIN_RECO message to
all the nodes included the old master, in the handler of this message
dlm_begin_reco_handler() of old master, dlm->reco.dead_node and
dlm->reco.new_master will be set to the second dead node and the new
master, then in dlm_reset_recovery(), these two variables will be reset
to default value. This will cause new recovery master can not finish
the recovery process and hung, at last the whole cluster will hung for
recovery.
old recovery master: new recovery master:
dlm_remaster_locks()
become recovery master for
another dead node.
dlm_send_begin_reco_message()
dlm_begin_reco_handler()
{
if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
return -EAGAIN;
}
dlm_set_reco_master(dlm, br->node_idx);
dlm_set_reco_dead_node(dlm, br->dead_node);
}
dlm_reset_recovery()
{
dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
}
will hang in dlm_remaster_locks() for
request dlm locks info
Before send FINALIZE_RECO message, recovery master should set
DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery done,
this can break the race windows as the BEGIN_RECO messages will not be
handled before DLM_RECO_STATE_FINALIZE flag is cleared.
A similar race may happen between new recovery master and normal node
which is in dlm_finalize_reco_handler(), also fix it.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This issue was introduced by commit 800deef3f6 ("ocfs2: use
list_for_each_entry where benefical") in 2007 where it replaced
list_for_each with list_for_each_entry. The variable "lock" will point
to invalid data if "tmpq" list is empty and a panic will be triggered
due to this. Sunil advised reverting it back, but the old version was
also not right. At the end of the outer for loop, that
list_for_each_entry will also set "lock" to an invalid data, then in the
next loop, if the "tmpq" list is empty, "lock" will be an stale invalid
data and cause the panic. So reverting the list_for_each back and reset
"lock" to NULL to fix this issue.
Another concern is that this seemes can not happen because the "tmpq"
list should not be empty. Let me describe how.
old lock resource owner(node 1): migratation target(node 2):
image there's lockres with a EX lock from node 2 in
granted list, a NR lock from node x with convert_type
EX in converting list.
dlm_empty_lockres() {
dlm_pick_migration_target() {
pick node 2 as target as its lock is the first one
in granted list.
}
dlm_migrate_lockres() {
dlm_mark_lockres_migrating() {
res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res));
//after the above code, we can not dirty lockres any more,
// so dlm_thread shuffle list will not run
downconvert lock from EX to NR
upconvert lock from NR to EX
<<< migration may schedule out here, then
<<< node 2 send down convert request to convert type from EX to
<<< NR, then send up convert request to convert type from NR to
<<< EX, at this time, lockres granted list is empty, and two locks
<<< in the converting list, node x up convert lock followed by
<<< node 2 up convert lock.
// will set lockres RES_MIGRATING flag, the following
// lock/unlock can not run
dlm_lockres_release_ast(dlm, res);
}
dlm_send_one_lockres()
dlm_process_recovery_data()
for (i=0; i<mres->num_locks; i++)
if (ml->node == dlm->node_num)
for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
list_for_each_entry(lock, tmpq, list)
if (lock) break; <<< lock is invalid as grant list is empty.
}
if (lock->ml.node != ml->node)
BUG() >>> crash here
}
I see the above locks status from a vmcore of our internal bug.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, ocfs2_sync_file grabs i_mutex and forces the current journal
transaction to complete. This isn't terribly efficient, since sync_file
really only needs to wait for the last transaction involving that inode
to complete, and this doesn't require i_mutex.
Therefore, implement the necessary bits to track the newest tid
associated with an inode, and teach sync_file to wait for that instead
of waiting for everything in the journal to commit. Furthermore, only
issue the flush request to the drive if jbd2 hasn't already done so.
This also eliminates the deadlock between ocfs2_file_aio_write() and
ocfs2_sync_file(). aio_write takes i_mutex then calls
ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
However, if that dio completion involves calling fsync, then we can get
into trouble when some ocfs2_sync_file tries to take i_mutex.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Variable uuid_net_key in ocfs2_initialize_super() is not used. Clean it
up.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a problem that waitqueue_active() may check stale data thus miss
a wakeup of threads waiting on ip_unaligned_aio.
The valid value of ip_unaligned_aio is only 0 and 1 so we can change it to
be of type mutex thus the above prolem is avoid. Another benifit is that
mutex which works as FIFO is fairer than wake_up_all().
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When mounting an ocfs2 volume, it will firstly generate a file
/sys/kernel/debug/o2dlm/<uuid>/dlm_state, and then launch the dlm thread.
So the following situation will cause a null pointer dereference.
dlm_debug_init -> access file dlm_state which will call dlm_state_print ->
dlm_launch_thread
Move dlm_debug_init after dlm_launch_thread and dlm_launch_recovery_thread
can fix this issue.
Signed-off-by: Zongxun Wang <wangzongxun@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the RSPI MSTP clock on SH7757 from a con ID match to a dev ID
match, so we can start looking it up using clk_get() with a NULL ID.
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The compiler is permitted to generate code which overwrites the
parameters to a function. If those parameters include the only saved
copy we have of userspace's registers, we're in trouble.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This does not appear to have been used since commit 74d99a5e26 ("sh:
SH-2A FPU support") in 2007.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When invoking syscall handlers on sh32, the saved userspace registers
are at the top of the stack. This seems to have been intentional, as it
is an easy way to pass r0, r1, ... to the handler as parameters 5, 6,
...
It causes problems, however, because the compiler is allowed to generate
code for a function which clobbers that function's own parameters. For
example, gcc generates the following code for clone:
<SyS_clone>:
mov.l 8c020714 <SyS_clone+0xc>,r1 ! 8c020540 <do_fork>
mov.l r7,@r15
mov r6,r7
jmp @r1
mov #0,r6
nop
.word 0x0540
.word 0x8c02
The `mov.l r7,@r15` clobbers the saved value of r0 passed from
userspace. For most system calls, this might not be a problem, because
we'll be overwriting r0 with the return value anyway. But in the case
of clone, copy_thread will need the original value of r0 if the
CLONE_SETTLS flag was specified.
The first patch in this series fixes this issue for system calls by
pushing to the stack and extra copy of r0-r2 before invoking the
handler. We discard this copy before restoring the userspace registers,
so it is not a problem if they are clobbered.
Exception handlers also receive the userspace register values in a
similar manner, and may hit the same problem. The second patch removes
the do_fpu_error handler, which looks susceptible to this problem and
which, as far as I can tell, has not been used in some time. The third
patch addresses other exception handlers.
This patch (of 3):
The userspace registers are stored at the top of the stack when the
syscall handler is invoked, which allows r0-r2 to act as parameters 5-7.
Parameters passed on the stack may be clobbered by the syscall handler.
The solution is to push an extra copy of the registers which might be
used as syscall parameters to the stack, so that the authoritative set
of saved register values does not get clobbered.
A few system call handlers are also updated to get the userspace
registers using current_pt_regs() instead of from the stack.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This removes the CPU_SCORE7 Kconfig parameter, which is no longer used
anywhere in the source code and Makefiles.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recent increased use of typeof() throughout the tree resulted in a
number of symbols (25 in a typical distro config of ours) not getting a
proper CRC calculated for them anymore, due to the parser in genksyms
not coping with several of these uses (interestingly in the majority of
[if not all] cases the problem is due to the use of typeof() in code
preceding a certain export, not in the declaration/definition of the
exported function/object itself; I wasn't able to find a way to address
this more general parser shortcoming).
The use of parameter_declaration is a little more relaxed than would be
ideal (permitting not just a bare type specification, but also one with
identifier), but since the same code is being passed through an actual
compiler, there's no apparent risk of allowing through any broken code.
Otoh using parameter_declaration instead of the ad hoc
"decl_specifier_seq '*'" / "decl_specifier_seq" pair allows all types to
be handled rather than just plain ones and pointers to plain ones.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move code moving event structure to access_list from copy_event_to_user()
to fanotify_read() where it is more logical (so that we can immediately
see in the main loop that we either move the event to a different list
or free it). Also move special error handling for permission events
from copy_event_to_user() to the main loop to have it in one place with
error handling for normal events. This makes copy_event_to_user()
really only copy the event to user without any side effects.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Swap the error / "read ok" branches in the main loop of fanotify_read().
We will grow the "read ok" part in the next patch and this makes the
indentation easier. Also it is more common to have error conditions
inside an 'if' instead of the fast path.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
access_mutex is used only to guard operations on access_list. There's
no need for sleeping within this lock so just make a spinlock out of it.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, fanotify creates new structure to track the fact that
permission event has been reported to userspace and someone is waiting
for a response to it. As event structures are now completely in the
hands of each notification framework, we can use the event structure for
this tracking instead of allocating a new structure.
Since this makes the event structures for normal events and permission
events even more different and the structures have different lifetime
rules, we split them into two separate structures (where permission
event structure contains the structure for a normal event). This makes
normal events 8 bytes smaller and the code a tad bit cleaner.
[akpm@linux-foundation.org: fix build]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The prepare_for_access_response() function checks whether
group->fanotify_data.bypass_perm is set. However this test can never be
true because prepare_for_access_response() is called only from
fanotify_read() which means fanotify group is alive with an active fd
while bypass_perm is set from fanotify_release() when all file
descriptors pointing to the group are closed and the group is going
away.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nameidata was replaced by flags in commit 00cd8dd3bf ("stop passing
nameidata to ->lookup()").
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cifs_init_inodecache is only called by __init init_cifs.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They don't have to be atomic_t, because they are simple boolean toggles.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently if kmemleak is disabled, the kmemleak objects can never be
freed, no matter if it's disabled by a user or due to fatal errors.
Those objects can be a big waste of memory.
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
1200264 1197433 99% 0.30K 46164 26 369312K kmemleak_object
With this patch, after kmemleak was disabled you can reclaim memory
with:
# echo clear > /sys/kernel/debug/kmemleak
Also inform users about this with a printk.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently if you stop kmemleak thread before disabling kmemleak,
kmemleak objects will be freed and so you won't be able to check
previously reported leaks.
With this patch, kmemleak objects won't be freed if there're leaks that
can be reported.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the presence of memoryless nodes, numa_node_id() will return the
current CPU's NUMA node, but that may not be where we expect to allocate
from memory from. Instead, we should rely on the fallback code in the
memory allocator itself, by using NUMA_NO_NODE. Also, when calling
kthread_create_on_node(), use the nearest node with memory to the cpu in
question, rather than the node it is running on.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit 839a8e8660 ("writeback: replace custom worker pool
implementation with unbound workqueue") when device is removed while we
are writing to it we crash in bdi_writeback_workfn() ->
set_worker_desc() because bdi->dev is NULL.
This can happen because even though bdi_unregister() cancels all pending
flushing work, nothing really prevents new ones from being queued from
balance_dirty_pages() or other places.
Fix the problem by clearing BDI_registered bit in bdi_unregister() and
checking it before scheduling of any flushing work.
Fixes: 839a8e8660
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Derek Basehore <dbasehore@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bdi_wakeup_thread_delayed() used the mod_delayed_work() function to
schedule work to writeback dirty inodes. The problem with this is that
it can delay work that is scheduled for immediate execution, such as the
work from sync_inodes_sb(). This can happen since mod_delayed_work()
can now steal work from a work_queue. This fixes the problem by using
queue_delayed_work() instead. This is a regression caused by commit
839a8e8660 ("writeback: replace custom worker pool implementation with
unbound workqueue").
The reason that this causes a problem is that laptop-mode will change
the delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.
In the case that bdi_wakeup_thread_delayed() races with
sync_inodes_sb(), sync will be stopped for 10 minutes and trigger a hung
task. Even if dirty_writeback_centisecs is not long enough to cause a
hung task, we still don't want to delay sync for that long.
We fix the problem by using queue_delayed_work() when we want to
schedule writeback sometime in future. This function doesn't change the
timer if it is already armed.
For the same reason, we also change bdi_writeback_workfn() to
immediately queue the work again in the case that the work_list is not
empty. The same problem can happen if the sync work is run on the
rescue worker.
[jack@suse.cz: update changelog, add comment, use bdi_wakeup_thread_delayed()]
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zento.linux.org.uk>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Derek Basehore <dbasehore@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Luigi Semenzato <semenzato@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees reported the following error:
arch/sh/kernel/dumpstack.c: In function 'print_trace_address':
arch/sh/kernel/dumpstack.c:118:2: error: format not a string literal and no format arguments [-Werror=format-security]
Use the "%s" format so that it's impossible to interpret 'data' as a
format string.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reported-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Allow the vfio-type1 IOMMU to support multiple domains within a container
- Plumb path to query whether all domains are cache-coherent
- Wire query into kvm-vfio device to avoid KVM x86 WBINVD emulation
- Always select CONFIG_ANON_INODES, vfio depends on it (Arnd)
The first patch also makes the vfio-type1 IOMMU driver completely independent
of the bus_type of the devices it's handling, which enables it to be used for
both vfio-pci and a future vfio-platform (and hopefully combinations involving
both simultaneously).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTPbikAAoJECObm247sIsiG9cP/jnurW84YuAHmybzy3R4nMaa
8BWcQST+TyQ78GWvubFDcRt+vHmJUI4iFWPBIm1twBSVrMy6F1GcT/spmSne1Dfb
OvcfGEy59tGO+BeklHg5Qq2Hj1UzfeV9uQoy+PrpTF/sYzsyL6g5+O3Zv39SNvr0
zCwnO2JAKXIpQlkK3wVha6V13X07Z/+d2b4JSsvmON2cnlPzOUYNrgBvoSeV2IQe
3kMAU9qfAfQlpNDhRRZL/HshgWazCg4XZUp9UdNjUkOxrUp4vXHrVTUJnUGBDytk
8V9UGRKF3mik9PqpZJk4jLV5urgUVpnUR5747uqs0KF+9GWxClXvh3gp1XX8Zn7f
NCfDSMn/wrDr6fQBglKUadITaDhF49KV+J0cX083q46BbYxhAfgxv1I9UOvLreG6
SGAezlTB0mefa1BmSwJdfKwDBsuDOhbF0TsHC6zT7yFc8pqe3hl/vQzUyu3HtwuM
yPycQiQQmvOaymaiiBRwyLocwJbDNKPigDomv8NZ8Mcd97Fu9YsF4ydaxMJmmeKf
TffYS4z4tUElYiU2vv1ipB8T+ReCmdtj2cIvyfwTL9jycY9ouO4bJ56dOGWd3WNl
m7AVokbrrJQ03itUpFMuYjHAzTNUlXVvXlGr9n6L4VTR0RTL1zwJVrMVDsXdhxm4
XgDbVB5PrxinDAC1vJvI
=mnzO
-----END PGP SIGNATURE-----
Merge tag 'vfio-v3.15-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
"VFIO updates for v3.15 include:
- Allow the vfio-type1 IOMMU to support multiple domains within a
container
- Plumb path to query whether all domains are cache-coherent
- Wire query into kvm-vfio device to avoid KVM x86 WBINVD emulation
- Always select CONFIG_ANON_INODES, vfio depends on it (Arnd)
The first patch also makes the vfio-type1 IOMMU driver completely
independent of the bus_type of the devices it's handling, which
enables it to be used for both vfio-pci and a future vfio-platform
(and hopefully combinations involving both simultaneously)"
* tag 'vfio-v3.15-rc1' of git://github.com/awilliam/linux-vfio:
vfio: always select ANON_INODES
kvm/vfio: Support for DMA coherent IOMMUs
vfio: Add external user check extension interface
vfio/type1: Add extension to test DMA cache coherence of IOMMU
vfio/iommu_type1: Multi-IOMMU domain support
kernel-based backends (by not populated m2p overrides when mapping),
and assorted minor bug fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQEcBAABAgAGBQJTPTGyAAoJEFxbo/MsZsTRnjgH/10j5CbOK1RFvIyCSslGTf4G
slhK8P8dhhplGAxwXXji322lWNYEx9Jd+V0Bhxnvr4drSlsP/qkWuBWf+u1LBvRq
AVPM99tk0XHCVAuvMMNo/lc62dTIR9IpQvnY6WhHSHnSlfqyVcdnbaGk8/LRuxWJ
u2F0MXzDNH00b/kt6hDBt3F7CkHfjwsEn43LCkkxyHPp5MJGD7bGDIe+bKtnjv9u
D9VJtCWQkrjWQ6jNpjdP833JCNCGQrXtVO3DeTAGs3T1tGmiEsqp6kT6Gp5zCFnh
oaQk9jfQL2S+IVnVhHVMW9nTwNPPrnIrD69FlgTrK301mcYW1mKoFotTogzHu+0=
=2IG+
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen features and fixes from David Vrabel:
"Support PCI devices with multiple MSIs, performance improvement for
kernel-based backends (by not populated m2p overrides when mapping),
and assorted minor bug fixes and cleanups"
* tag 'stable/for-linus-3.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/acpi-processor: fix enabling interrupts on syscore_resume
xen/grant-table: Refactor gnttab_[un]map_refs to avoid m2p_override
xen: remove XEN_PRIVILEGED_GUEST
xen: add support for MSI message groups
xen-pciback: Use pci_enable_msix_exact() instead of pci_enable_msix()
xen/xenbus: remove unused xenbus_bind_evtchn()
xen/events: remove unnecessary call to bind_evtchn_to_cpu()
xen/events: remove the unused resend_irq_on_evtchn()
drivers:xen-selfballoon:reset 'frontswap_inertia_counter' after frontswap_shrink
drivers: xen: Include appropriate header file in pcpu.c
drivers: xen: Mark function as static in platform-pci.c
Pull cgroup updates from Tejun Heo:
"A lot updates for cgroup:
- The biggest one is cgroup's conversion to kernfs. cgroup took
after the long abandoned vfs-entangled sysfs implementation and
made it even more convoluted over time. cgroup's internal objects
were fused with vfs objects which also brought in vfs locking and
object lifetime rules. Naturally, there are places where vfs rules
don't fit and nasty hacks, such as credential switching or lock
dance interleaving inode mutex and cgroup_mutex with object serial
number comparison thrown in to decide whether the operation is
actually necessary, needed to be employed.
After conversion to kernfs, internal object lifetime and locking
rules are mostly isolated from vfs interactions allowing shedding
of several nasty hacks and overall simplification. This will also
allow implmentation of operations which may affect multiple cgroups
which weren't possible before as it would have required nesting
i_mutexes.
- Various simplifications including dropping of module support,
easier cgroup name/path handling, simplified cgroup file type
handling and task_cg_lists optimization.
- Prepatory changes for the planned unified hierarchy, which is still
a patchset away from being actually operational. The dummy
hierarchy is updated to serve as the default unified hierarchy.
Controllers which aren't claimed by other hierarchies are
associated with it, which BTW was what the dummy hierarchy was for
anyway.
- Various fixes from Li and others. This pull request includes some
patches to add missing slab.h to various subsystems. This was
triggered xattr.h include removal from cgroup.h. cgroup.h
indirectly got included a lot of files which brought in xattr.h
which brought in slab.h.
There are several merge commits - one to pull in kernfs updates
necessary for converting cgroup (already in upstream through
driver-core), others for interfering changes in the fixes branch"
* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
cgroup: remove useless argument from cgroup_exit()
cgroup: fix spurious lockdep warning in cgroup_exit()
cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
cgroup: break kernfs active_ref protection in cgroup directory operations
cgroup: fix cgroup_taskset walking order
cgroup: implement CFTYPE_ONLY_ON_DFL
cgroup: make cgrp_dfl_root mountable
cgroup: drop const from @buffer of cftype->write_string()
cgroup: rename cgroup_dummy_root and related names
cgroup: move ->subsys_mask from cgroupfs_root to cgroup
cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
cgroup: reorganize cgroup bootstrapping
cgroup: relocate setting of CGRP_DEAD
cpuset: use rcu_read_lock() to protect task_cs()
cgroup_freezer: document freezer_fork() subtleties
cgroup: update cgroup_transfer_tasks() to either succeed or fail
cgroup: drop task_lock() protection around task->cgroups
cgroup: update how a newly forked task gets associated with css_set
...
But there were a few features that were added.
Uprobes now work with event triggers and multi buffers.
Uprobes have support under ftrace and perf.
The big feature is that the function tracer can now be used within the
multi buffer instances. That is, you can now trace some functions
in one buffer, others in another buffer, all functions in a third buffer
and so on. They are basically agnostic from each other. This only
works for the function tracer and not for the function graph trace,
although you can have the function graph tracer running in the top level
buffer (or any tracer for that matter) and have different function tracing
going on in the sub buffers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTOthtAAoJEKQekfcNnQGu5c8H/Ana/U+0tmksp1dbHkRHsKSH
+Fsv4Jeu8gf1NaFKHEhkUTcFtnzE6qAPV2VCrcJwXbhAhhwZm+LjrnWdoy3215S3
cQW4LftLEonh2cM36Cos74TulMEYN6XmL6dQZV+CILKQkDrWU4qJjQ64okXEkqrd
9iG3p/mSXyvJcmnyg61ALnMOhZDLsXY3djBhWBPhiTPGS6BRb9zh4Pmw6Zv0n2rJ
U93Gt/3AQrv1ybu73dUxqP0abp60oXOiWoF/R2jcbKqIM+K9RPJX79unCV3jq3u9
f+6jMlB9PgAMqQj6ihJdwxKDDuzwyrVdEPnsgvl4jarCBCtVVwhKedBaKN/KS8k=
=HdXY
-----END PGP SIGNATURE-----
Merge tag 'trace-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Most of the changes were largely clean ups, and some documentation.
But there were a few features that were added:
Uprobes now work with event triggers and multi buffers and have
support under ftrace and perf.
The big feature is that the function tracer can now be used within the
multi buffer instances. That is, you can now trace some functions in
one buffer, others in another buffer, all functions in a third buffer
and so on. They are basically agnostic from each other. This only
works for the function tracer and not for the function graph trace,
although you can have the function graph tracer running in the top
level buffer (or any tracer for that matter) and have different
function tracing going on in the sub buffers"
* tag 'trace-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (45 commits)
tracing: Add BUG_ON when stack end location is over written
tracepoint: Remove unused API functions
Revert "tracing: Move event storage for array from macro to standalone function"
ftrace: Constify ftrace_text_reserved
tracepoints: API doc update to tracepoint_probe_register() return value
tracepoints: API doc update to data argument
ftrace: Fix compilation warning about control_ops_free
ftrace/x86: BUG when ftrace recovery fails
ftrace: Warn on error when modifying ftrace function
ftrace: Remove freelist from struct dyn_ftrace
ftrace: Do not pass data to ftrace_dyn_arch_init
ftrace: Pass retval through return in ftrace_dyn_arch_init()
ftrace: Inline the code from ftrace_dyn_table_alloc()
ftrace: Cleanup of global variables ftrace_new_pgs and ftrace_update_cnt
tracing: Evaluate len expression only once in __dynamic_array macro
tracing: Correctly expand len expressions from __dynamic_array macro
tracing/module: Replace include of tracepoint.h with jump_label.h in module.h
tracing: Fix event header migrate.h to include tracepoint.h
tracing: Fix event header writeback.h to include tracepoint.h
tracing: Warn if a tracepoint is not set via debugfs
...