License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                             # files
   ---------------------------------------------------|-------
   GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
   SPDX license identifier                             # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                             # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                         270
   GPL-2.0+ WITH Linux-syscall-note                        169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
   LGPL-2.1+ WITH Linux-syscall-note                        15
   GPL-1.0+ WITH Linux-syscall-note                         14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
   LGPL-2.0+ WITH Linux-syscall-note                         4
   LGPL-2.1 WITH Linux-syscall-note                          3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
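The first heuristic above (no license traces found by either scanner) reduces to a path check. The following is a minimal sketch of that rule only, not the tooling actually used; the function name `default_spdx_id` is hypothetical, and matching on the substring "/uapi/" is an assumption about how the `*/uapi/*` pattern was applied.

```c
#include <string.h>

/*
 * Hypothetical helper: pick the default SPDX identifier for a file in
 * which no license information was detected. Files under a uapi
 * directory get the syscall note; everything else falls back to the
 * license of the top level COPYING file.
 */
static const char *default_spdx_id(const char *path)
{
    if (strstr(path, "/uapi/"))
        return "GPL-2.0 WITH Linux-syscall-note";
    return "GPL-2.0";
}
```

Files with existing license text are not covered by this sketch; per the list above, those went through scanner comparison and manual review.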
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 21:07:57 +07:00
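The commit message notes that headers and .c source files need different comment types for the SPDX tag. A minimal sketch of that distinction follows, assuming the usual kernel convention of a `/* */` comment in headers and a `//` comment in C sources. The helper `spdx_line` is hypothetical and is not the script that generated the patches.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the SPDX tag line for `path`, choosing the
 * comment style from the file suffix. Returns the number of characters
 * written, or -1 for an unhandled suffix or a buffer that is too small.
 */
static int spdx_line(const char *path, const char *license,
                     char *buf, size_t len)
{
    const char *dot = strrchr(path, '.');
    int n;

    if (!dot)
        return -1;
    if (strcmp(dot, ".c") == 0)
        n = snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
    else if (strcmp(dot, ".h") == 0)
        n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
    else
        return -1;
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For this header the sketch yields the same first line that the patch added below.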
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Filesystem access notification for Linux
 *
 *  Copyright (C) 2008 Red Hat, Inc., Eric Paris <eparis@redhat.com>
 */

#ifndef __LINUX_FSNOTIFY_BACKEND_H
#define __LINUX_FSNOTIFY_BACKEND_H

#ifdef __KERNEL__

#include <linux/idr.h> /* inotify uses this */
#include <linux/fs.h> /* struct inode */
#include <linux/list.h>
#include <linux/path.h> /* struct path */
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/atomic.h>
#include <linux/user_namespace.h>
#include <linux/refcount.h>

/*
 * IN_* from inotify.h lines up EXACTLY with FS_*, this is so we can easily
 * convert between them.  dnotify only needs conversion at watch creation
 * so no perf loss there.  fanotify isn't defined yet, so it can use the
 * holes if it needs more events.
 */
#define FS_ACCESS           0x00000001  /* File was accessed */
#define FS_MODIFY           0x00000002  /* File was modified */
#define FS_ATTRIB           0x00000004  /* Metadata changed */
#define FS_CLOSE_WRITE      0x00000008  /* Writable file was closed */
#define FS_CLOSE_NOWRITE    0x00000010  /* Unwritable file closed */
#define FS_OPEN             0x00000020  /* File was opened */
#define FS_MOVED_FROM       0x00000040  /* File was moved from X */
#define FS_MOVED_TO         0x00000080  /* File was moved to Y */
#define FS_CREATE           0x00000100  /* Subfile was created */
#define FS_DELETE           0x00000200  /* Subfile was deleted */
#define FS_DELETE_SELF      0x00000400  /* Self was deleted */
#define FS_MOVE_SELF        0x00000800  /* Self was moved */
#define FS_OPEN_EXEC        0x00001000  /* File was opened for exec */

#define FS_UNMOUNT          0x00002000  /* inode on umount fs */
#define FS_Q_OVERFLOW       0x00004000  /* Event queue overflowed */
#define FS_IN_IGNORED       0x00008000  /* last inotify event here */

#define FS_OPEN_PERM        0x00010000  /* open event in a permission hook */
#define FS_ACCESS_PERM      0x00020000  /* access event in a permission hook */
#define FS_OPEN_EXEC_PERM   0x00040000  /* open/exec event in a permission hook */
fanotify: send FAN_DIR_MODIFY event flavor with dir inode and name
Dirent events are going to be supported in two flavors:
1. Directory fid info + mask that includes the specific event types
(e.g. FAN_CREATE) and an optional FAN_ONDIR flag.
2. Directory fid info + name + mask that includes only FAN_DIR_MODIFY.
To request the second event flavor, user needs to set the event type
FAN_DIR_MODIFY in the mark mask.
The first flavor is supported since kernel v5.1 for groups initialized
with flag FAN_REPORT_FID. It is intended to be used for watching
directories in "batch mode" - the watcher is notified when directory is
changed and re-scans the directory content in response. This event
flavor is stored more compactly in the event queue, so it is optimal
for workloads with frequent directory changes.
The second event flavor is intended to be used for watching large
directories, where the cost of re-scan of the directory on every change
is considered too high. The watcher getting the event with the directory
fid and entry name is expected to call fstatat(2) to query the content of
the entry after the change.
Legacy inotify events are reported with name and event mask (e.g. "foo",
FAN_CREATE | FAN_ONDIR). That can lead users to the conclusion that
there is *currently* an entry "foo" that is a sub-directory, when in fact
"foo" may be negative or non-dir by the time user gets the event.
To make it clear that the current state of the named entry is unknown,
when reporting an event with name info, fanotify obfuscates the specific
event types (e.g. create,delete,rename) and uses a common event type -
FAN_DIR_MODIFY to describe the change. This should make it harder for
users to make wrong assumptions and write buggy filesystem monitors.
At this point, name info reporting is not yet implemented, so trying to
set FAN_DIR_MODIFY in mark mask will return -EINVAL.
Link: https://lore.kernel.org/r/20200319151022.31456-12-amir73il@gmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-03-19 22:10:19 +07:00
#define FS_DIR_MODIFY       0x00080000  /* Directory entry was modified */

#define FS_EXCL_UNLINK      0x04000000  /* do not send events if object is unlinked */
/* This inode cares about things that happen to its children.  Always set for
 * dnotify and inotify. */
#define FS_EVENT_ON_CHILD   0x08000000

#define FS_DN_RENAME        0x10000000  /* file renamed */
#define FS_DN_MULTISHOT     0x20000000  /* dnotify multishot */
#define FS_ISDIR            0x40000000  /* event occurred against dir */
#define FS_IN_ONESHOT       0x80000000  /* only send event once */

#define FS_MOVE             (FS_MOVED_FROM | FS_MOVED_TO)

/*
 * Directory entry modification events - reported only to directory
 * where entry is modified and not to a watching parent.
 * The watching parent may get an FS_ATTRIB|FS_EVENT_ON_CHILD event
 * when a directory entry inside a child subdir changes.
 */
#define ALL_FSNOTIFY_DIRENT_EVENTS (FS_CREATE | FS_DELETE | FS_MOVE | \
                                    FS_DIR_MODIFY)

#define ALL_FSNOTIFY_PERM_EVENTS (FS_OPEN_PERM | FS_ACCESS_PERM | \
                                  FS_OPEN_EXEC_PERM)

/*
 * This is a list of all events that may get sent to a parent based on fs event
 * happening to inodes inside that directory.
 */
#define FS_EVENTS_POSS_ON_CHILD   (ALL_FSNOTIFY_PERM_EVENTS | \
                                   FS_ACCESS | FS_MODIFY | FS_ATTRIB | \
                                   FS_CLOSE_WRITE | FS_CLOSE_NOWRITE | \
                                   FS_OPEN | FS_OPEN_EXEC)

/* Events that can be reported to backends */
#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
                             FS_EVENTS_POSS_ON_CHILD | \
                             FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
                             FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)

/* Extra flags that may be reported with event or control handling of events */
#define ALL_FSNOTIFY_FLAGS  (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
                             FS_DN_MULTISHOT | FS_EVENT_ON_CHILD)

#define ALL_FSNOTIFY_BITS   (ALL_FSNOTIFY_EVENTS | ALL_FSNOTIFY_FLAGS)

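The composite masks above are plain bitwise ORs, so testing an incoming event against a class of events is a single AND. The sketch below is a userspace illustration that copies a handful of the values defined in this header; the helpers `is_dirent_event` and `dirent_bits` are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/* Values copied from the definitions above, for illustration only. */
#define FS_MOVED_FROM   0x00000040u
#define FS_MOVED_TO     0x00000080u
#define FS_CREATE       0x00000100u
#define FS_DELETE       0x00000200u
#define FS_DIR_MODIFY   0x00080000u
#define FS_ISDIR        0x40000000u
#define FS_MOVE         (FS_MOVED_FROM | FS_MOVED_TO)
#define ALL_FSNOTIFY_DIRENT_EVENTS \
    (FS_CREATE | FS_DELETE | FS_MOVE | FS_DIR_MODIFY)

/* Hypothetical helper: does the mask carry at least one dirent event? */
static int is_dirent_event(uint32_t mask)
{
    return (mask & ALL_FSNOTIFY_DIRENT_EVENTS) != 0;
}

/* Hypothetical helper: drop flag bits such as FS_ISDIR, keep dirent events. */
static uint32_t dirent_bits(uint32_t mask)
{
    return mask & ALL_FSNOTIFY_DIRENT_EVENTS;
}
```

A mask such as `FS_CREATE | FS_ISDIR` counts as a dirent event here, and `dirent_bits` reduces it to `FS_CREATE`, because FS_ISDIR is a flag rather than an event type.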
struct fsnotify_group;
struct fsnotify_event;
struct fsnotify_mark;
struct fsnotify_event_private_data;
struct fsnotify_fname;
struct fsnotify_iter_info;

fs: fsnotify: account fsnotify metadata to kmemcg
Patch series "Directed kmem charging", v8.
The Linux kernel's memory cgroup allows limiting the memory usage of the
jobs running on the system to provide isolation between the jobs. All
the kernel memory allocated in the context of the job and marked with
__GFP_ACCOUNT will also be included in the memory usage and be limited
by the job's limit.
The kernel memory can only be charged to the memcg of the process in
whose context kernel memory was allocated. However there are cases
where the allocated kernel memory should be charged to the memcg
different from the current process's memcg. This patch series
contains two such concrete use-cases i.e. fsnotify and buffer_head.
The fsnotify event objects can consume a lot of system memory for large
or unlimited queues if there is either no or slow listener. The events
are allocated in the context of the event producer. However they should
be charged to the event consumer. Similarly the buffer_head objects can
be allocated in a memcg different from the memcg of the page for which
buffer_head objects are being allocated.
To solve this issue, this patch series introduces a mechanism to charge
kernel memory to a given memcg. In case of fsnotify events, the memcg
of the consumer can be used for charging and for buffer_head, the memcg
of the page can be charged. For directed charging, the caller can use
the scope API memalloc_[un]use_memcg() to specify the memcg to charge
for all the __GFP_ACCOUNT allocations within the scope.
This patch (of 2):
A lot of memory can be consumed by the events generated for the huge or
unlimited queues if there is either no or slow listener. This can cause
system level memory pressure or OOMs. So, it's better to account the
fsnotify kmem caches to the memcg of the listener.
However the listener can be in a different memcg than the memcg of the
producer and these allocations happen in the context of the event
producer. This patch introduces remote memcg charging API which the
producer can use to charge the allocations to the memcg of the listener.
There are seven fsnotify kmem caches and among them allocations from
dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and
inotify_inode_mark_cachep happens in the context of syscall from the
listener. So, SLAB_ACCOUNT is enough for these caches.
The objects from fsnotify_mark_connector_cachep are not accounted as
they are small compared to the notification mark or events and it is
unclear whom to account connector to since it is shared by all events
attached to the inode.
The allocations from the event caches happen in the context of the event
producer. For such caches we will need to remote charge the allocations
to the listener's memcg. Thus we save the memcg reference in the
fsnotify_group structure of the listener.
This patch has also moved the members of fsnotify_group to keep the size
same, at least for 64 bit build, even with additional member by filling
the holes.
[shakeelb@google.com: use GFP_KERNEL_ACCOUNT rather than open-coding it]
Link: http://lkml.kernel.org/r/20180702215439.211597-1-shakeelb@google.com
Link: http://lkml.kernel.org/r/20180627191250.209150-2-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-18 05:46:39 +07:00
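The directed charging idea in this commit message (the producer allocates, the listener is charged) can be illustrated with a userspace toy model. Everything in the sketch is hypothetical: the `toy_` names only imitate the scope pattern described above and are not the kernel's memalloc_[un]use_memcg() implementation, and a byte counter stands in for real memcg accounting.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory cgroup: just a running byte counter. */
struct toy_memcg {
    size_t charged;
};

/* Per-thread charge target; NULL means "charge the caller's own group". */
static _Thread_local struct toy_memcg *toy_active_memcg;

/* Open a scope: accounted allocations are charged to `target`. */
static void toy_use_memcg(struct toy_memcg *target)
{
    toy_active_memcg = target;
}

/* Close the scope: go back to charging the caller's own group. */
static void toy_unuse_memcg(void)
{
    toy_active_memcg = NULL;
}

/* Allocate on behalf of `caller`, charging the scoped target if one is set. */
static void *toy_alloc_accounted(struct toy_memcg *caller, size_t size)
{
    struct toy_memcg *target = toy_active_memcg ? toy_active_memcg : caller;
    void *p = malloc(size);

    if (p)
        target->charged += size;
    return p;
}
```

In this model an event producer would wrap its event allocation in `toy_use_memcg(listener)` and `toy_unuse_memcg()`, so the bytes land on the listener's counter even though the producer made the call.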
struct mem_cgroup;

/*
 * Each group must define these ops.  The fsnotify infrastructure will call
 * these operations for each relevant group.
 *
 * handle_event - main call for a group to handle an fs event
 * free_group_priv - called when a group refcnt hits 0 to clean up the private union
fsnotify: change locking order
On Mon, Aug 01, 2011 at 04:38:22PM -0400, Eric Paris wrote:
>
> I finally built and tested a v3.0 kernel with these patches (I know I'm
> SOOOOOO far behind). Not what I hoped for:
>
> > [ 150.937798] VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day...
> > [ 150.945290] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> > [ 150.946012] IP: [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [ 150.946012] PGD 2bf9e067 PUD 2bf9f067 PMD 0
> > [ 150.946012] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > [ 150.946012] CPU 0
> > [ 150.946012] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ext4 jbd2 crc16 joydev ata_piix i2c_piix4 pcspkr uinput ipv6 autofs4 usbhid [last unloaded: scsi_wait_scan]
> > [ 150.946012]
> > [ 150.946012] Pid: 2764, comm: syscall_thrash Not tainted 3.0.0+ #1 Red Hat KVM
> > [ 150.946012] RIP: 0010:[<ffffffff810ffd58>] [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [ 150.946012] RSP: 0018:ffff88002c2e5df8 EFLAGS: 00010282
> > [ 150.946012] RAX: 000000004e370d9f RBX: 0000000000000000 RCX: ffff88003a029438
> > [ 150.946012] RDX: 0000000033630a5f RSI: 0000000000000000 RDI: ffff88003491c240
> > [ 150.946012] RBP: ffff88002c2e5e08 R08: 0000000000000000 R09: 0000000000000000
> > [ 150.946012] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a029428
> > [ 150.946012] R13: ffff88003a029428 R14: ffff88003a029428 R15: ffff88003499a610
> > [ 150.946012] FS: 00007f5a05420700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
> > [ 150.946012] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 150.946012] CR2: 0000000000000070 CR3: 000000002a662000 CR4: 00000000000006f0
> > [ 150.946012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 150.946012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 150.946012] Process syscall_thrash (pid: 2764, threadinfo ffff88002c2e4000, task ffff88002bfbc760)
> > [ 150.946012] Stack:
> > [ 150.946012] ffff88003a029438 ffff88003a029428 ffff88002c2e5e38 ffffffff81102f76
> > [ 150.946012] ffff88003a029438 ffff88003a029598 ffffffff8160f9c0 ffff88002c221250
> > [ 150.946012] ffff88002c2e5e68 ffffffff8115e9be ffff88002c2e5e68 ffff88003a029438
> > [ 150.946012] Call Trace:
> > [ 150.946012] [<ffffffff81102f76>] shmem_evict_inode+0x76/0x130
> > [ 150.946012] [<ffffffff8115e9be>] evict+0x7e/0x170
> > [ 150.946012] [<ffffffff8115ee40>] iput_final+0xd0/0x190
> > [ 150.946012] [<ffffffff8115ef33>] iput+0x33/0x40
> > [ 150.946012] [<ffffffff81180205>] fsnotify_destroy_mark_locked+0x145/0x160
> > [ 150.946012] [<ffffffff81180316>] fsnotify_destroy_mark+0x36/0x50
> > [ 150.946012] [<ffffffff81181937>] sys_inotify_rm_watch+0x77/0xd0
> > [ 150.946012] [<ffffffff815aca52>] system_call_fastpath+0x16/0x1b
> > [ 150.946012] Code: 67 4a 00 b8 e4 ff ff ff eb aa 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 48 8b 9f 40 05 00 00
> > [ 150.946012] 83 7b 70 00 74 1c 4c 8d a3 80 00 00 00 4c 89 e7 e8 d2 5d 4a
> > [ 150.946012] RIP [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [ 150.946012] RSP <ffff88002c2e5df8>
> > [ 150.946012] CR2: 0000000000000070
>
> Looks at aweful lot like the problem from:
> http://www.spinics.net/lists/linux-fsdevel/msg46101.html
>
I tried to reproduce this bug with your test program, but without success.
However, if I understand correctly, this occurs since we dont hold any locks when
we call iput() in mark_destroy(), right?
With the patches you tested, iput() is also not called within any lock, since the
groups mark_mutex is released temporarily before iput() is called. This is, since
the original codes behaviour is similar.
However since we now have a mutex as the biggest lock, we can do what you
suggested (http://www.spinics.net/lists/linux-fsdevel/msg46107.html) and
call iput() with the mutex held to avoid the race.
The patch below implements this. It uses nested locking to avoid deadlock in case
we do the final iput() on an inode which still holds marks and thus would take
the mutex again when calling fsnotify_inode_delete() in destroy_inode().
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-08-12 06:13:31 +07:00
 * freeing_mark - called when a mark is being destroyed for some reason.  The group
 *		MUST be holding a reference on each mark and that reference must be
 *		dropped in this function.  inotify uses this function to send
 *		userspace messages that marks have been removed.
 */
struct fsnotify_ops {
	int (*handle_event)(struct fsnotify_group *group,
			    struct inode *inode,
			    u32 mask, const void *data, int data_type,
			    const struct qstr *file_name, u32 cookie,
			    struct fsnotify_iter_info *iter_info);
	void (*free_group_priv)(struct fsnotify_group *group);
	void (*freeing_mark)(struct fsnotify_mark *mark, struct fsnotify_group *group);
	void (*free_event)(struct fsnotify_event *event);
	/* called on final put+free to free memory */
	void (*free_mark)(struct fsnotify_mark *mark);
};

/*
 * all of the information about the original object we want to now send to
 * a group.  If you want to carry more info from the accessing task to the
 * listener this structure is where you need to be adding fields.
 */
struct fsnotify_event {
	struct list_head list;
	unsigned long objectid;	/* identifier for queue merges */
};

/*
 * A group is a "thing" that wants to receive notification about filesystem
 * events.  The mask holds the subset of event types this group cares about.
 * refcnt on a group is up to the implementor and at any moment if it goes 0
 * everything will be cleaned up.
 */
struct fsnotify_group {
|
|
|
const struct fsnotify_ops *ops; /* how this group handles things */
|
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
/*
|
|
|
|
* How the refcnt is used is up to each group. When the refcnt hits 0
|
|
|
|
* fsnotify will clean up all of the resources associated with this group.
|
|
|
|
* As an example, the dnotify group will always have a refcnt=1 and that
|
|
|
|
* will never change. Inotify, on the other hand, has a group per
|
|
|
|
* inotify_init() and the refcnt will hit 0 only when that fd has been
|
|
|
|
* closed.
|
|
|
|
*/
|
2017-10-20 17:26:01 +07:00
|
|
|
refcount_t refcnt; /* things with interest in this group */
|
2009-05-22 04:01:20 +07:00
|
|
|
|
2009-05-22 04:01:37 +07:00
|
|
|
/* needed to send notification to userspace */
|
2016-10-08 06:56:52 +07:00
|
|
|
spinlock_t notification_lock; /* protect the notification_list */
|
2009-05-22 04:01:37 +07:00
|
|
|
struct list_head notification_list; /* list of event_holder this group needs to send to userspace */
|
|
|
|
wait_queue_head_t notification_waitq; /* read() on the notification file blocks on this waitq */
|
|
|
|
unsigned int q_len; /* events on the queue */
|
|
|
|
unsigned int max_events; /* maximum events allowed on the list */
|
2010-10-29 04:21:56 +07:00
|
|
|
/*
|
|
|
|
* Valid fsnotify group priorities. Events are sent in order from highest
|
|
|
|
* priority to lowest priority. We default to the lowest priority.
|
|
|
|
*/
|
|
|
|
#define FS_PRIO_0 0 /* normal notifiers, no permissions */
|
|
|
|
#define FS_PRIO_1 1 /* fanotify content based access control */
|
|
|
|
#define FS_PRIO_2 2 /* fanotify pre-content access */
|
|
|
|
unsigned int priority;
|
2016-09-20 04:44:27 +07:00
|
|
|
bool shutdown; /* group is being shut down, don't queue more events */
|
2009-05-22 04:01:37 +07:00
|
|
|
|
2009-12-18 09:24:24 +07:00
|
|
|
/* stores all fastpath marks assoc with this group so they can be cleaned on unregister */
|
2011-06-14 22:29:50 +07:00
|
|
|
struct mutex mark_mutex; /* protect marks_list */
|
2009-12-18 09:24:24 +07:00
|
|
|
atomic_t num_marks; /* 1 for each mark and 1 for not being
|
2009-05-22 04:01:26 +07:00
|
|
|
* past the point of no return when freeing
|
|
|
|
* a group */
|
2018-08-18 05:46:39 +07:00
|
|
|
atomic_t user_waits; /* Number of tasks waiting for user
|
|
|
|
* response */
|
2009-12-18 09:24:24 +07:00
|
|
|
struct list_head marks_list; /* all inode marks for this group */
|
2009-05-22 04:01:26 +07:00
|
|
|
|
2014-01-22 06:48:14 +07:00
|
|
|
struct fasync_struct *fsn_fa; /* async notification */
|
|
|
|
|
2014-02-22 01:14:11 +07:00
|
|
|
struct fsnotify_event *overflow_event; /* Event we queue when the
|
2014-01-22 06:48:14 +07:00
|
|
|
* notification list is too
|
|
|
|
* full */
|
2018-08-18 05:46:39 +07:00
|
|
|
|
|
|
|
struct mem_cgroup *memcg; /* memcg to charge allocations */
|
2011-10-15 04:43:39 +07:00
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
/* groups can define private fields here or use the void *private */
|
|
|
|
union {
|
|
|
|
void *private;
|
2009-05-22 04:02:01 +07:00
|
|
|
#ifdef CONFIG_INOTIFY_USER
|
|
|
|
struct inotify_group_private_data {
|
|
|
|
spinlock_t idr_lock;
|
|
|
|
struct idr idr;
|
2016-12-14 20:56:33 +07:00
|
|
|
struct ucounts *ucounts;
|
2009-05-22 04:02:01 +07:00
|
|
|
} inotify_data;
|
2009-12-18 09:24:34 +07:00
|
|
|
#endif
|
2010-07-28 21:18:37 +07:00
|
|
|
#ifdef CONFIG_FANOTIFY
|
2009-12-18 09:24:34 +07:00
|
|
|
struct fanotify_group_private_data {
|
|
|
|
/* allows a group to block waiting for a userspace response */
|
|
|
|
struct list_head access_list;
|
|
|
|
wait_queue_head_t access_waitq;
|
2018-09-22 01:20:30 +07:00
|
|
|
int flags; /* flags from fanotify_init() */
|
|
|
|
int f_flags; /* event_f_flags from fanotify_init() */
|
2010-10-29 04:21:57 +07:00
|
|
|
unsigned int max_marks;
|
2010-10-29 04:21:58 +07:00
|
|
|
struct user_struct *user;
|
2009-12-18 09:24:34 +07:00
|
|
|
} fanotify_data;
|
2010-07-28 21:18:37 +07:00
|
|
|
#endif /* CONFIG_FANOTIFY */
|
2009-05-22 04:01:20 +07:00
|
|
|
};
|
|
|
|
};
|
|
|
|
|
2020-03-19 22:10:12 +07:00
|
|
|
/* When calling fsnotify tell it if the data is a path or inode */
|
|
|
|
enum fsnotify_data_type {
|
|
|
|
FSNOTIFY_EVENT_NONE,
|
|
|
|
FSNOTIFY_EVENT_PATH,
|
|
|
|
FSNOTIFY_EVENT_INODE,
|
|
|
|
};
|
|
|
|
|
2020-07-08 18:11:38 +07:00
|
|
|
static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
|
2020-03-19 22:10:12 +07:00
|
|
|
{
|
|
|
|
switch (data_type) {
|
|
|
|
case FSNOTIFY_EVENT_INODE:
|
2020-07-08 18:11:38 +07:00
|
|
|
return (struct inode *)data;
|
2020-03-19 22:10:12 +07:00
|
|
|
case FSNOTIFY_EVENT_PATH:
|
|
|
|
return d_inode(((const struct path *)data)->dentry);
|
|
|
|
default:
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline const struct path *fsnotify_data_path(const void *data,
|
|
|
|
int data_type)
|
|
|
|
{
|
|
|
|
switch (data_type) {
|
|
|
|
case FSNOTIFY_EVENT_PATH:
|
|
|
|
return data;
|
|
|
|
default:
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
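The two helpers above decode a `void *` payload according to a separate type tag. A minimal userspace sketch of that pattern follows. All `demo_*` names and struct layouts are hypothetical stand-ins for illustration, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct inode, struct dentry and struct path. */
struct demo_inode { unsigned long ino; };
struct demo_dentry { struct demo_inode *inode; };
struct demo_path { struct demo_dentry *dentry; };

/* Tag that tells the callee how to interpret the opaque data pointer. */
enum demo_data_type {
	DEMO_EVENT_NONE,
	DEMO_EVENT_PATH,
	DEMO_EVENT_INODE,
};

/* Resolve the tagged pointer to an inode, or NULL if the tag carries none. */
static inline struct demo_inode *demo_data_inode(const void *data, int data_type)
{
	switch (data_type) {
	case DEMO_EVENT_INODE:
		return (struct demo_inode *)data;
	case DEMO_EVENT_PATH:
		return ((const struct demo_path *)data)->dentry->inode;
	default:
		return NULL;
	}
}

/* Only a PATH payload can be viewed as a path; every other tag yields NULL. */
static inline const struct demo_path *demo_data_path(const void *data, int data_type)
{
	return data_type == DEMO_EVENT_PATH ? data : NULL;
}
```

Keeping the casts inside the helpers means callers never cast the payload themselves, so a mismatched tag produces NULL rather than a misinterpreted pointer.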
|
2009-05-22 04:01:20 +07:00
|
|
|
|
2018-04-21 06:10:49 +07:00
|
|
|
enum fsnotify_obj_type {
|
|
|
|
FSNOTIFY_OBJ_TYPE_INODE,
|
|
|
|
FSNOTIFY_OBJ_TYPE_VFSMOUNT,
|
2018-09-01 14:41:11 +07:00
|
|
|
FSNOTIFY_OBJ_TYPE_SB,
|
2018-04-21 06:10:49 +07:00
|
|
|
FSNOTIFY_OBJ_TYPE_COUNT,
|
|
|
|
FSNOTIFY_OBJ_TYPE_DETACHED = FSNOTIFY_OBJ_TYPE_COUNT
|
|
|
|
};
|
|
|
|
|
|
|
|
#define FSNOTIFY_OBJ_TYPE_INODE_FL (1U << FSNOTIFY_OBJ_TYPE_INODE)
|
|
|
|
#define FSNOTIFY_OBJ_TYPE_VFSMOUNT_FL (1U << FSNOTIFY_OBJ_TYPE_VFSMOUNT)
|
2018-09-01 14:41:11 +07:00
|
|
|
#define FSNOTIFY_OBJ_TYPE_SB_FL (1U << FSNOTIFY_OBJ_TYPE_SB)
|
2018-04-21 06:10:49 +07:00
|
|
|
#define FSNOTIFY_OBJ_ALL_TYPES_MASK ((1U << FSNOTIFY_OBJ_TYPE_COUNT) - 1)
|
|
|
|
|
2018-06-23 21:54:48 +07:00
|
|
|
static inline bool fsnotify_valid_obj_type(unsigned int type)
|
|
|
|
{
|
|
|
|
return (type < FSNOTIFY_OBJ_TYPE_COUNT);
|
|
|
|
}
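The object-type enum doubles as a bit index: each `_FL` flag is `1U << type`, and the all-types mask is derived from the `COUNT` sentinel. The sketch below mirrors that arithmetic in userspace. The `DEMO_*` names are hypothetical copies made only so the block compiles on its own.

```c
#include <stdbool.h>

/* Hypothetical mirror of the object-type enum; COUNT is the sentinel. */
enum demo_obj_type {
	DEMO_OBJ_TYPE_INODE,
	DEMO_OBJ_TYPE_VFSMOUNT,
	DEMO_OBJ_TYPE_SB,
	DEMO_OBJ_TYPE_COUNT,
};

/* One bit per object type, indexed by the enum value. */
#define DEMO_OBJ_TYPE_INODE_FL		(1U << DEMO_OBJ_TYPE_INODE)
#define DEMO_OBJ_TYPE_VFSMOUNT_FL	(1U << DEMO_OBJ_TYPE_VFSMOUNT)
#define DEMO_OBJ_TYPE_SB_FL		(1U << DEMO_OBJ_TYPE_SB)

/* (1 << COUNT) - 1 sets exactly the low COUNT bits. */
#define DEMO_OBJ_ALL_TYPES_MASK		((1U << DEMO_OBJ_TYPE_COUNT) - 1)

/* A type is valid only if it indexes a real slot below the sentinel. */
static inline bool demo_valid_obj_type(unsigned int type)
{
	return type < DEMO_OBJ_TYPE_COUNT;
}
```

With three types, the flags are 1, 2 and 4 and the mask is 7. Adding a new enumerator before `COUNT` widens the mask without further changes.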
|
|
|
|
|
2018-04-21 06:10:50 +07:00
|
|
|
struct fsnotify_iter_info {
|
2018-04-21 06:10:52 +07:00
|
|
|
struct fsnotify_mark *marks[FSNOTIFY_OBJ_TYPE_COUNT];
|
2018-04-21 06:10:50 +07:00
|
|
|
unsigned int report_mask;
|
|
|
|
int srcu_idx;
|
|
|
|
};
|
|
|
|
|
2018-04-21 06:10:52 +07:00
|
|
|
static inline bool fsnotify_iter_should_report_type(
|
|
|
|
struct fsnotify_iter_info *iter_info, int type)
|
|
|
|
{
|
|
|
|
return (iter_info->report_mask & (1U << type));
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void fsnotify_iter_set_report_type(
|
|
|
|
struct fsnotify_iter_info *iter_info, int type)
|
|
|
|
{
|
|
|
|
iter_info->report_mask |= (1U << type);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void fsnotify_iter_set_report_type_mark(
|
|
|
|
struct fsnotify_iter_info *iter_info, int type,
|
|
|
|
struct fsnotify_mark *mark)
|
|
|
|
{
|
|
|
|
iter_info->marks[type] = mark;
|
|
|
|
iter_info->report_mask |= (1U << type);
|
|
|
|
}
|
|
|
|
|
2018-04-21 06:10:50 +07:00
|
|
|
#define FSNOTIFY_ITER_FUNCS(name, NAME) \
|
|
|
|
static inline struct fsnotify_mark *fsnotify_iter_##name##_mark( \
|
|
|
|
struct fsnotify_iter_info *iter_info) \
|
|
|
|
{ \
|
|
|
|
return (iter_info->report_mask & FSNOTIFY_OBJ_TYPE_##NAME##_FL) ? \
|
2018-04-21 06:10:52 +07:00
|
|
|
iter_info->marks[FSNOTIFY_OBJ_TYPE_##NAME] : NULL; \
|
2018-04-21 06:10:50 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
FSNOTIFY_ITER_FUNCS(inode, INODE)
|
|
|
|
FSNOTIFY_ITER_FUNCS(vfsmount, VFSMOUNT)
|
2018-09-01 14:41:11 +07:00
|
|
|
FSNOTIFY_ITER_FUNCS(sb, SB)
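FSNOTIFY_ITER_FUNCS generates one accessor per object type, and each accessor returns the stored mark only when the matching bit is set in report_mask. A userspace sketch of the same token-pasting pattern is shown below. The `demo_*` and `DEMO_*` names are hypothetical, and the srcu_idx member is omitted.

```c
#include <stddef.h>

enum { DEMO_TYPE_INODE, DEMO_TYPE_VFSMOUNT, DEMO_TYPE_SB, DEMO_TYPE_COUNT };

#define DEMO_TYPE_INODE_FL	(1U << DEMO_TYPE_INODE)
#define DEMO_TYPE_VFSMOUNT_FL	(1U << DEMO_TYPE_VFSMOUNT)
#define DEMO_TYPE_SB_FL		(1U << DEMO_TYPE_SB)

/* Hypothetical stand-in for struct fsnotify_mark. */
struct demo_mark { unsigned int mask; };

struct demo_iter_info {
	struct demo_mark *marks[DEMO_TYPE_COUNT];
	unsigned int report_mask;
};

/* Store the mark and flag its type as reportable in one step. */
static inline void demo_iter_set_report_type_mark(struct demo_iter_info *info,
						  int type, struct demo_mark *mark)
{
	info->marks[type] = mark;
	info->report_mask |= (1U << type);
}

/* Generates demo_iter_<name>_mark(); NULL unless the type bit is set. */
#define DEMO_ITER_FUNCS(name, NAME) \
static inline struct demo_mark *demo_iter_##name##_mark( \
		struct demo_iter_info *info) \
{ \
	return (info->report_mask & DEMO_TYPE_##NAME##_FL) ? \
		info->marks[DEMO_TYPE_##NAME] : NULL; \
}

DEMO_ITER_FUNCS(inode, INODE)
DEMO_ITER_FUNCS(vfsmount, VFSMOUNT)
DEMO_ITER_FUNCS(sb, SB)
```

Note that a mark left in the array is hidden once its bit is cleared from report_mask, so the mask, not the array slot, decides what a caller sees.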
|
2018-04-21 06:10:50 +07:00
|
|
|
|
2018-04-21 06:10:52 +07:00
|
|
|
#define fsnotify_foreach_obj_type(type) \
|
|
|
|
for (type = 0; type < FSNOTIFY_OBJ_TYPE_COUNT; type++)
|
|
|
|
|
2018-06-23 21:54:49 +07:00
|
|
|
/*
|
|
|
|
* fsnotify_connp_t is what we embed in objects which connector can be attached
|
|
|
|
* to. fsnotify_connp_t * is how we refer from connector back to object.
|
|
|
|
*/
|
|
|
|
struct fsnotify_mark_connector;
|
|
|
|
typedef struct fsnotify_mark_connector __rcu *fsnotify_connp_t;
|
|
|
|
|
2017-03-14 18:31:02 +07:00
|
|
|
/*
|
2018-09-01 14:41:11 +07:00
|
|
|
* Inode/vfsmount/sb point to this structure which tracks all marks attached to
|
|
|
|
* the inode/vfsmount/sb. The reference to inode/vfsmount/sb is held by this
|
2017-02-01 15:21:58 +07:00
|
|
|
* structure. We destroy this structure when there are no more marks attached
|
|
|
|
* to it. The structure is protected by fsnotify_mark_srcu.
|
2017-03-14 18:31:02 +07:00
|
|
|
*/
|
|
|
|
struct fsnotify_mark_connector {
|
2017-02-01 14:19:43 +07:00
|
|
|
spinlock_t lock;
|
2019-06-19 17:34:44 +07:00
|
|
|
unsigned short type; /* Type of object [lock] */
|
|
|
|
#define FSNOTIFY_CONN_FLAG_HAS_FSID 0x01
|
|
|
|
unsigned short flags; /* flags [lock] */
|
2019-01-11 00:04:37 +07:00
|
|
|
__kernel_fsid_t fsid; /* fsid of filesystem containing object */
|
2018-06-23 21:54:49 +07:00
|
|
|
union {
|
|
|
|
/* Object pointer [lock] */
|
|
|
|
fsnotify_connp_t *obj;
|
2017-02-01 15:21:58 +07:00
|
|
|
/* Links connectors queued to be freed after the SRCU period expires */
|
|
|
|
struct fsnotify_mark_connector *destroy_next;
|
|
|
|
};
|
2018-04-20 00:44:33 +07:00
|
|
|
struct hlist_head list;
|
2017-03-14 18:31:02 +07:00
|
|
|
};
|
|
|
|
|
2009-05-22 04:01:26 +07:00
|
|
|
/*
|
2015-09-05 05:43:06 +07:00
|
|
|
* A mark is simply an object attached to an in core inode which allows an
|
2009-05-22 04:01:26 +07:00
|
|
|
* fsnotify listener to indicate they are either no longer interested in events
|
|
|
|
* of a type matching mask or only interested in those events.
|
|
|
|
*
|
2015-09-05 05:43:06 +07:00
|
|
|
* These are flushed when an inode is evicted from core and may be flushed
|
|
|
|
* when the inode is modified (as seen by fsnotify_access). Some fsnotify
|
|
|
|
* users (such as dnotify) will flush these when the open fd is closed and not
|
|
|
|
* at inode eviction or modification.
|
|
|
|
*
|
|
|
|
* Text in brackets is showing the lock(s) protecting modifications of a
|
|
|
|
* particular entry. obj_lock means either inode->i_lock or
|
|
|
|
* mnt->mnt_root->d_lock depending on the mark type.
|
2009-05-22 04:01:26 +07:00
|
|
|
*/
|
2009-12-18 09:24:24 +07:00
|
|
|
struct fsnotify_mark {
|
2015-09-05 05:43:06 +07:00
|
|
|
/* Mask this mark is for [mark->lock, group->mark_mutex] */
|
|
|
|
__u32 mask;
|
|
|
|
/* We hold one for presence in g_list. Also one ref for each 'thing'
|
2009-05-22 04:01:26 +07:00
|
|
|
* in kernel that found and may be using this mark. */
|
2017-10-20 17:26:02 +07:00
|
|
|
refcount_t refcnt;
|
2015-09-05 05:43:06 +07:00
|
|
|
/* Group this mark is for. Set on mark creation, stable until last ref
|
|
|
|
* is dropped */
|
|
|
|
struct fsnotify_group *group;
|
2018-04-05 20:18:04 +07:00
|
|
|
/* List of marks by group->marks_list. Also reused for queueing
|
2015-09-05 05:43:06 +07:00
|
|
|
* mark into destroy_list when it's waiting for the end of SRCU period
|
|
|
|
* before it can be freed. [group->mark_mutex] */
|
2016-02-18 04:11:18 +07:00
|
|
|
struct list_head g_list;
|
2015-09-05 05:43:06 +07:00
|
|
|
/* Protects inode / mnt pointers, flags, masks */
|
|
|
|
spinlock_t lock;
|
2016-12-21 18:15:30 +07:00
|
|
|
/* List of marks for inode / vfsmount [connector->lock, mark ref] */
|
2015-09-05 05:43:06 +07:00
|
|
|
struct hlist_node obj_list;
|
2016-12-21 18:15:30 +07:00
|
|
|
/* Head of list of marks for an object [mark ref] */
|
2017-03-14 20:29:35 +07:00
|
|
|
struct fsnotify_mark_connector *connector;
|
2015-09-05 05:43:06 +07:00
|
|
|
/* Events types to ignore [mark->lock, group->mark_mutex] */
|
|
|
|
__u32 ignored_mask;
|
2017-03-14 20:48:00 +07:00
|
|
|
#define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY 0x01
|
|
|
|
#define FSNOTIFY_MARK_FLAG_ALIVE 0x02
|
|
|
|
#define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
|
2015-09-05 05:43:06 +07:00
|
|
|
unsigned int flags; /* flags [mark->lock] */
|
2009-05-22 04:01:26 +07:00
|
|
|
};
|
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
#ifdef CONFIG_FSNOTIFY
|
|
|
|
|
|
|
|
/* called from the vfs helpers */
|
|
|
|
|
|
|
|
/* main fsnotify call to send events */
|
2020-03-19 22:10:13 +07:00
|
|
|
extern int fsnotify(struct inode *to_tell, __u32 mask, const void *data,
|
|
|
|
int data_type, const struct qstr *name, u32 cookie);
|
2020-07-08 18:11:36 +07:00
|
|
|
extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
|
2020-03-19 22:10:13 +07:00
|
|
|
int data_type);
|
2009-05-22 04:01:26 +07:00
|
|
|
extern void __fsnotify_inode_delete(struct inode *inode);
|
2009-12-18 09:24:27 +07:00
|
|
|
extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
|
2018-09-01 14:41:11 +07:00
|
|
|
extern void fsnotify_sb_delete(struct super_block *sb);
|
2009-05-22 04:01:47 +07:00
|
|
|
extern u32 fsnotify_get_cookie(void);
|
2009-05-22 04:01:20 +07:00
|
|
|
|
2009-05-22 04:01:29 +07:00
|
|
|
static inline int fsnotify_inode_watches_children(struct inode *inode)
|
|
|
|
{
|
|
|
|
/* FS_EVENT_ON_CHILD is set if the inode may care */
|
|
|
|
if (!(inode->i_fsnotify_mask & FS_EVENT_ON_CHILD))
|
|
|
|
return 0;
|
|
|
|
/* this inode might care about child events, does it care about the
|
|
|
|
* specific set of events that can happen on a child? */
|
|
|
|
return inode->i_fsnotify_mask & FS_EVENTS_POSS_ON_CHILD;
|
|
|
|
}
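fsnotify_inode_watches_children() performs a two-stage mask test: first the opt-in bit, then the subset of events that can actually occur on a child. The sketch below reproduces that logic on a bare mask. The `DEMO_*` constants and their values are assumptions chosen for illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Assumed illustrative flag values. */
#define DEMO_ACCESS		0x00000001u
#define DEMO_MODIFY		0x00000002u
#define DEMO_UNMOUNT		0x00002000u	/* assumed not deliverable via a child */
#define DEMO_EVENT_ON_CHILD	0x08000000u	/* opt-in: watcher wants child events */

/* Assumed set of events that may be reported on a child. */
#define DEMO_EVENTS_POSS_ON_CHILD	(DEMO_ACCESS | DEMO_MODIFY)

/* Nonzero only if the watcher opted in AND asked for a child-capable event. */
static inline int demo_watches_children(uint32_t mask)
{
	if (!(mask & DEMO_EVENT_ON_CHILD))
		return 0;
	return mask & DEMO_EVENTS_POSS_ON_CHILD;
}
```

The early return keeps the common case, a watcher that never opted in, to a single bit test.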
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Update the dentry with a flag indicating the interest of its parent to receive
|
|
|
|
* filesystem events when those events happen to this dentry->d_inode.
|
|
|
|
*/
|
2016-05-30 05:35:12 +07:00
|
|
|
static inline void fsnotify_update_flags(struct dentry *dentry)
|
2009-05-22 04:01:29 +07:00
|
|
|
{
|
|
|
|
assert_spin_locked(&dentry->d_lock);
|
|
|
|
|
2011-01-07 13:49:38 +07:00
|
|
|
/*
|
|
|
|
* Serialisation of setting PARENT_WATCHED on the dentries is provided
|
|
|
|
* by d_lock. If inotify_inode_watched changes after we have taken
|
|
|
|
* d_lock, the following __fsnotify_update_child_dentry_flags call will
|
|
|
|
* find our entry, so it will spin until we complete here, and update
|
|
|
|
* us with the new state.
|
|
|
|
*/
|
2016-05-30 05:35:12 +07:00
|
|
|
if (fsnotify_inode_watches_children(dentry->d_parent->d_inode))
|
2009-05-22 04:01:29 +07:00
|
|
|
dentry->d_flags |= DCACHE_FSNOTIFY_PARENT_WATCHED;
|
|
|
|
else
|
|
|
|
dentry->d_flags &= ~DCACHE_FSNOTIFY_PARENT_WATCHED;
|
|
|
|
}
|
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
/* called from fsnotify listeners, such as fanotify or dnotify */
|
|
|
|
|
2011-06-14 22:29:46 +07:00
|
|
|
/* create a new group */
|
2009-12-18 09:24:22 +07:00
|
|
|
extern struct fsnotify_group *fsnotify_alloc_group(const struct fsnotify_ops *ops);
|
2011-06-14 22:29:46 +07:00
|
|
|
/* get reference to a group */
|
|
|
|
extern void fsnotify_get_group(struct fsnotify_group *group);
|
2009-12-18 09:24:22 +07:00
|
|
|
/* drop reference on a group from fsnotify_alloc_group */
|
2009-05-22 04:01:20 +07:00
|
|
|
extern void fsnotify_put_group(struct fsnotify_group *group);
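The get/put pair follows the usual reference-counting contract described in the refcnt comment of struct fsnotify_group: the group is cleaned up when the count reaches 0. A minimal userspace sketch of that contract, using C11 atomics instead of the kernel's refcount_t, might look like this. The `demo_*` names and the `released` flag are hypothetical, and the flag stands in for the real cleanup.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a group; 'released' marks where cleanup would run. */
struct demo_group {
	atomic_int refcnt;
	bool released;
};

/* A new group starts with one reference owned by its creator. */
static inline void demo_group_init(struct demo_group *g)
{
	atomic_init(&g->refcnt, 1);
	g->released = false;
}

static inline void demo_get_group(struct demo_group *g)
{
	atomic_fetch_add(&g->refcnt, 1);
}

static inline void demo_put_group(struct demo_group *g)
{
	/* atomic_fetch_sub returns the previous value; 1 means the last ref was dropped. */
	if (atomic_fetch_sub(&g->refcnt, 1) == 1)
		g->released = true;
}
```

Under this scheme a dnotify-style group that never drops its initial reference is never released, while an inotify-style group is released when the final put follows the close of its file descriptor.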
|
2016-09-20 04:44:27 +07:00
|
|
|
/* group destruction begins, stop queuing new events */
|
|
|
|
extern void fsnotify_group_stop_queueing(struct fsnotify_group *group);
|
2011-06-14 22:29:45 +07:00
|
|
|
/* destroy group */
|
|
|
|
extern void fsnotify_destroy_group(struct fsnotify_group *group);
|
2011-10-15 04:43:39 +07:00
|
|
|
/* fasync handler function */
|
|
|
|
extern int fsnotify_fasync(int fd, struct file *file, int on);
|
2014-01-22 06:48:14 +07:00
|
|
|
/* Free event from memory */
|
|
|
|
extern void fsnotify_destroy_event(struct fsnotify_group *group,
|
|
|
|
struct fsnotify_event *event);
|
2009-05-22 04:01:37 +07:00
|
|
|
/* attach the event to the group notification queue */
|
2014-08-07 06:03:26 +07:00
|
|
|
extern int fsnotify_add_event(struct fsnotify_group *group,
|
|
|
|
struct fsnotify_event *event,
|
|
|
|
int (*merge)(struct list_head *,
|
|
|
|
struct fsnotify_event *));
|
2018-02-21 21:07:52 +07:00
|
|
|
/* Queue overflow event to a notification group */
|
|
|
|
static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
|
|
|
|
{
|
|
|
|
fsnotify_add_event(group, group->overflow_event, NULL);
|
|
|
|
}
|
|
|
|
|
2009-05-22 04:01:37 +07:00
|
|
|
/* true if the group notification queue is empty */
|
|
|
|
extern bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group);
|
|
|
|
/* return, but do not dequeue the first event on the notification queue */
|
2014-08-07 06:03:26 +07:00
|
|
|
extern struct fsnotify_event *fsnotify_peek_first_event(struct fsnotify_group *group);
|
2009-05-22 04:01:50 +07:00
|
|
|
/* return AND dequeue the first event on the notification queue */
|
2014-08-07 06:03:26 +07:00
|
|
|
extern struct fsnotify_event *fsnotify_remove_first_event(struct fsnotify_group *group);
|
2019-01-09 19:15:23 +07:00
|
|
|
/* Remove event queued in the notification list */
|
|
|
|
extern void fsnotify_remove_queued_event(struct fsnotify_group *group,
|
|
|
|
struct fsnotify_event *event);
|
2009-05-22 04:01:37 +07:00
|
|
|
|
2009-05-22 04:01:26 +07:00
|
|
|
/* functions used to manipulate the marks attached to inodes */
|
|
|
|
|
2018-06-23 21:54:50 +07:00
|
|
|
/* Get mask of events for a list of marks */
|
|
|
|
extern __u32 fsnotify_conn_mask(struct fsnotify_mark_connector *conn);
|
2017-03-15 15:16:27 +07:00
|
|
|
/* Calculate mask of events for a list of marks */
|
|
|
|
extern void fsnotify_recalc_mask(struct fsnotify_mark_connector *conn);
|
2016-12-22 00:32:48 +07:00
|
|
|
extern void fsnotify_init_mark(struct fsnotify_mark *mark,
|
2016-12-22 00:06:12 +07:00
|
|
|
struct fsnotify_group *group);
|
2016-12-21 22:28:45 +07:00
|
|
|
/* Find mark belonging to given group in the list of marks */
|
2018-06-23 21:54:47 +07:00
|
|
|
extern struct fsnotify_mark *fsnotify_find_mark(fsnotify_connp_t *connp,
|
|
|
|
struct fsnotify_group *group);
|
2019-01-11 00:04:37 +07:00
|
|
|
/* Get cached fsid of filesystem containing object */
|
|
|
|
extern int fsnotify_get_conn_fsid(const struct fsnotify_mark_connector *conn,
|
|
|
|
__kernel_fsid_t *fsid);
|
2018-06-23 21:54:48 +07:00
|
|
|
/* attach the mark to the object */
|
|
|
|
extern int fsnotify_add_mark(struct fsnotify_mark *mark,
|
|
|
|
fsnotify_connp_t *connp, unsigned int type,
|
2019-01-11 00:04:37 +07:00
|
|
|
int allow_dups, __kernel_fsid_t *fsid);
|
2016-12-22 00:32:48 +07:00
|
|
|
extern int fsnotify_add_mark_locked(struct fsnotify_mark *mark,
|
2019-01-11 00:04:37 +07:00
|
|
|
fsnotify_connp_t *connp,
|
|
|
|
unsigned int type, int allow_dups,
|
|
|
|
__kernel_fsid_t *fsid);
|
|
|
|
|
2018-04-21 06:10:55 +07:00
|
|
|
/* attach the mark to the inode */
|
|
|
|
static inline int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
|
|
|
|
struct inode *inode,
|
|
|
|
int allow_dups)
|
|
|
|
{
|
2018-06-23 21:54:48 +07:00
|
|
|
return fsnotify_add_mark(mark, &inode->i_fsnotify_marks,
|
2019-01-11 00:04:37 +07:00
|
|
|
FSNOTIFY_OBJ_TYPE_INODE, allow_dups, NULL);
|
2018-04-21 06:10:55 +07:00
|
|
|
}
|
|
|
|
static inline int fsnotify_add_inode_mark_locked(struct fsnotify_mark *mark,
|
|
|
|
struct inode *inode,
|
|
|
|
int allow_dups)
|
|
|
|
{
|
2018-06-23 21:54:48 +07:00
|
|
|
return fsnotify_add_mark_locked(mark, &inode->i_fsnotify_marks,
|
2019-01-11 00:04:37 +07:00
|
|
|
FSNOTIFY_OBJ_TYPE_INODE, allow_dups,
|
|
|
|
NULL);
|
2018-04-21 06:10:55 +07:00
|
|
|
}
|
2019-01-11 00:04:37 +07:00
|
|
|
|
2011-06-14 22:29:51 +07:00
|
|
|
/* given a group and a mark, flag mark to be freed when all references are dropped */
|
|
|
|
extern void fsnotify_destroy_mark(struct fsnotify_mark *mark,
|
|
|
|
struct fsnotify_group *group);
|
2015-09-05 05:43:12 +07:00
|
|
|
/* detach mark from inode / mount list, group list, drop inode reference */
|
|
|
|
extern void fsnotify_detach_mark(struct fsnotify_mark *mark);
|
|
|
|
/* free mark */
|
|
|
|
extern void fsnotify_free_mark(struct fsnotify_mark *mark);
|
2019-08-19 01:18:46 +07:00
|
|
|
/* Wait until all marks queued for destruction are destroyed */
|
|
|
|
extern void fsnotify_wait_marks_destroyed(void);
|
2017-03-14 20:29:35 +07:00
|
|
|
/* run all the marks in a group, and clear all of the marks attached to given object type */
|
2017-01-04 16:33:18 +07:00
|
|
|
extern void fsnotify_clear_marks_by_group(struct fsnotify_group *group, unsigned int type);
|
2016-12-21 22:20:32 +07:00
|
|
|
/* run all the marks in a group, and clear all of the vfsmount marks */
|
|
|
|
static inline void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
|
|
|
|
{
|
2018-04-21 06:10:49 +07:00
|
|
|
fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_VFSMOUNT_FL);
|
2016-12-21 22:20:32 +07:00
|
|
|
}
|
|
|
|
/* run all the marks in a group, and clear all of the inode marks */
|
|
|
|
static inline void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
|
|
|
|
{
|
2018-04-21 06:10:49 +07:00
|
|
|
fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_INODE_FL);
|
2016-12-21 22:20:32 +07:00
|
|
|
}
|
2018-09-01 14:41:11 +07:00
|
|
|
/* run all the marks in a group, and clear all of the sb marks */
|
|
|
|
static inline void fsnotify_clear_sb_marks_by_group(struct fsnotify_group *group)
|
|
|
|
{
|
|
|
|
fsnotify_clear_marks_by_group(group, FSNOTIFY_OBJ_TYPE_SB_FL);
|
|
|
|
}
|
2009-12-18 09:24:24 +07:00
|
|
|
extern void fsnotify_get_mark(struct fsnotify_mark *mark);
|
|
|
|
extern void fsnotify_put_mark(struct fsnotify_mark *mark);
|
2016-11-10 22:02:11 +07:00
|
|
|
extern void fsnotify_finish_user_wait(struct fsnotify_iter_info *iter_info);
|
|
|
|
extern bool fsnotify_prepare_user_wait(struct fsnotify_iter_info *iter_info);
|
2009-05-22 04:01:26 +07:00
|
|
|
|
2019-01-11 00:04:31 +07:00
|
|
|
static inline void fsnotify_init_event(struct fsnotify_event *event,
|
2020-03-19 22:10:15 +07:00
|
|
|
unsigned long objectid)
|
2019-01-11 00:04:31 +07:00
|
|
|
{
|
|
|
|
INIT_LIST_HEAD(&event->list);
|
2020-03-19 22:10:15 +07:00
|
|
|
event->objectid = objectid;
|
2019-01-11 00:04:31 +07:00
|
|
|
}
|
2009-12-18 09:24:21 +07:00
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
#else
|
|
|
|
|
2020-03-19 22:10:13 +07:00
|
|
|
static inline int fsnotify(struct inode *to_tell, __u32 mask, const void *data,
|
|
|
|
int data_type, const struct qstr *name, u32 cookie)
|
2009-12-18 09:24:34 +07:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2009-05-22 04:01:26 +07:00
|
|
|
|
2020-07-08 18:11:36 +07:00
|
|
|
static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
|
2020-03-19 22:10:13 +07:00
|
|
|
const void *data, int data_type)
|
2010-10-29 04:21:56 +07:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2009-05-22 04:01:29 +07:00
|
|
|
|
2009-05-22 04:01:26 +07:00
|
|
|
static inline void __fsnotify_inode_delete(struct inode *inode)
|
|
|
|
{}
|
|
|
|
|
2009-12-18 09:24:27 +07:00
|
|
|
static inline void __fsnotify_vfsmount_delete(struct vfsmount *mnt)
|
|
|
|
{}
|
|
|
|
|
2018-09-01 14:41:11 +07:00
|
|
|
static inline void fsnotify_sb_delete(struct super_block *sb)
|
|
|
|
{}
|
|
|
|
|
2016-05-30 05:35:12 +07:00
|
|
|
static inline void fsnotify_update_flags(struct dentry *dentry)
|
2009-05-22 04:01:29 +07:00
|
|
|
{}
|
|
|
|
|
2009-05-22 04:01:47 +07:00
|
|
|
static inline u32 fsnotify_get_cookie(void)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2015-03-05 00:37:22 +07:00
|
|
|
static inline void fsnotify_unmount_inodes(struct super_block *sb)
|
2009-05-22 04:01:58 +07:00
|
|
|
{}
|
|
|
|
|
2009-05-22 04:01:20 +07:00
|
|
|
#endif /* CONFIG_FSNOTIFY */
|
|
|
|
|
|
|
|
#endif /* __KERNEL__ */
|
|
|
|
|
|
|
|
#endif /* __LINUX_FSNOTIFY_BACKEND_H */
|