Commit Graph

3215 Commits

Author SHA1 Message Date
John Johansen
73688d1ed0 apparmor: refactor prepare_ns() and make usable from different views
prepare_ns() will need to be called from alternate views, and namespaces
will need to be created via different interfaces. So refactor and
allow specifying the view ns.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:28 -08:00
John Johansen
5fd1b95fc9 apparmor: update policy_destroy to use new debug asserts
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:27 -08:00
John Johansen
d102d89571 apparmor: pass gfp param into aa_policy_init()
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:27 -08:00
John Johansen
bbe4a7c873 apparmor: constify policy name and hname
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:26 -08:00
John Johansen
6e474e3063 apparmor: rename hname_tail to basename
Rename to the shorter and more familiar shell cmd name

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:25 -08:00
John Johansen
efeee83a70 apparmor: rename mediated_filesystem() to path_mediated_fs()
Rename to indicate the test is only about whether path mediation is used,
not whether other types of mediation might be used.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:24 -08:00
John Johansen
680cd62e91 apparmor: add debug assert AA_BUG and Kconfig to control debug info
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:24 -08:00
John Johansen
57e36bbd67 apparmor: add macro for bug asserts to check that a lock is held
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:23 -08:00
John Johansen
92b6d8eff5 apparmor: allow ns visibility question to consider subnses
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:22 -08:00
John Johansen
31617ddfdd apparmor: add fn to lookup profiles by fqname
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:22 -08:00
John Johansen
3b0aaf5866 apparmor: add lib fn to find the "split" for fqnames
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:21 -08:00
John Johansen
9a2d40c12d apparmor: add strn version of aa_find_ns
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:20 -08:00
John Johansen
1741e9eb8c apparmor: add strn version of lookup_profile fn
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:19 -08:00
John Johansen
8399588a7f apparmor: rename replacedby to proxy
Proxy is shorter and a better fit than replaceby, so rename it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:19 -08:00
John Johansen
d97d51d253 apparmor: rename PFLAG_INVALID to PFLAG_STALE
Invalid does not convey the meaning of the flag anymore so rename it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:16:37 -08:00
John Johansen
121d4a91e3 apparmor: rename sid to secid
Move to common terminology with other LSMs and kernel infrastucture

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:17 -08:00
John Johansen
98849dff90 apparmor: rename namespace to ns to improve code line lengths
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:16 -08:00
John Johansen
cff281f686 apparmor: split apparmor policy namespaces code into its own file
Policy namespaces will be diverging from profile management and
expanding so put it in its own file.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:15 -08:00
John Johansen
fe6bb31f59 apparmor: split out shared policy_XXX fns to lib
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:14 -08:00
John Johansen
12557dcba2 apparmor: move lib definitions into separate lib include
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:13 -08:00
Kees Cook
8486adf0d7 apparmor: use designated initializers
Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-15 20:00:32 -08:00
Tetsuo Handa
a7f6c1b63b AppArmor: Use GFP_KERNEL for __aa_kvmalloc().
Calling kmalloc(GFP_NOIO) with order == PAGE_ALLOC_COSTLY_ORDER is not
recommended because it might fall into infinite retry loop without
invoking the OOM killer.

Since aa_dfa_unpack() is the only caller of kvzalloc() and
aa_dfa_unpack() which is calling kvzalloc() via unpack_table() is
doing kzalloc(GFP_KERNEL), it is safe to use GFP_KERNEL from
__aa_kvmalloc().

Since aa_simple_write_to_buffer() is the only caller of kvmalloc()
and aa_simple_write_to_buffer() is calling copy_from_user() which
is GFP_KERNEL context (see memdup_user_nul()), it is safe to use
GFP_KERNEL from __aa_kvmalloc().

Therefore, replace GFP_NOIO with GFP_KERNEL. Also, since we have
vmalloc() fallback, add __GFP_NORETRY so that we don't invoke the OOM
killer by kmalloc(GFP_KERNEL) with order == PAGE_ALLOC_COSTLY_ORDER.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-15 13:41:09 -08:00
Peter Zijlstra
6b1ffa06e5 locking/atomic, kref: Use kref_get_unless_zero() more
For some obscure reason apparmor thinks its needs to locally implement
kref primitives that already exist. Stop doing this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:37:20 +01:00
Stephen Smalley
3a2f5a59a6 security,selinux,smack: kill security_task_wait hook
As reported by yangshukui, a permission denial from security_task_wait()
can lead to a soft lockup in zap_pid_ns_processes() since it only expects
sys_wait4() to return 0 or -ECHILD. Further, security_task_wait() can
in general lead to zombies; in the absence of some way to automatically
reparent a child process upon a denial, the hook is not useful.  Remove
the security hook and its implementations in SELinux and Smack.  Smack
already removed its check from its hook.

Reported-by: yangshukui <yangshukui@huawei.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-12 11:10:57 -05:00
Stephen Smalley
b4ba35c75a selinux: drop unused socket security classes
Several of the extended socket classes introduced by
commit da69a5306a ("selinux: support distinctions
among all network address families") are never used because
sockets can never be created with the associated address family.
Remove these unused socket security classes.  The removed classes
are bridge_socket for PF_BRIDGE, ib_socket for PF_IB, and mpls_socket
for PF_MPLS.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-12 11:10:24 -05:00
Seung-Woo Kim
83a1e53f39 Smack: ignore private inode for file functions
The access to fd from anon_inode is always failed because there is
no set xattr operations. So this patch fixes to ignore private
inode including anon_inode for file functions.

It was only ignored for smack_file_receive() to share dma-buf fd,
but dma-buf has other functions like ioctl and mmap.

Reference: https://lkml.org/lkml/2015/4/17/16

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Rafal Krypa
805b65a80b Smack: fix d_instantiate logic for sockfs and pipefs
Since 4b936885a (v2.6.32) all inodes on sockfs and pipefs are disconnected.
It caused filesystem specific code in smack_d_instantiate to be skipped,
because all inodes on those pseudo filesystems were treated as root inodes.
As a result all sockfs inodes had the Smack label set to floor.

In most cases access checks for sockets use socket_smack data so the inode
label is not important. But there are special cases that were broken.
One example would be calling fcntl with F_SETOWN command on a socket fd.

Now smack_d_instantiate expects all pipefs and sockfs inodes to be
disconnected and has the logic in appropriate place.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
c9d238a18b SMACK: Use smk_tskacc() instead of smk_access() for proper logging
smack_file_open() is first checking the capability of calling subject,
this check will skip the SMACK logging for success case. Use smk_tskacc()
for proper logging and SMACK access check.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
348dc288d4 Smack: Traverse the smack_known_list using list_for_each_entry_rcu macro
In smack_from_secattr function,"smack_known_list" is being traversed
using list_for_each_entry macro, although it is a rcu protected
structure. So it should be traversed using "list_for_each_entry_rcu"
macro to fetch the rcu protected entry.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
3d4f673a69 SMACK: Free the i_security blob in inode using RCU
There is race condition issue while freeing the i_security blob in SMACK
module. There is existing condition where i_security can be freed while
inode_permission is called from path lookup on second CPU. There has been
observed the page fault with such condition. VFS code and Selinux module
takes care of this condition by freeing the inode and i_security field
using RCU via call_rcu(). But in SMACK directly the i_secuirty blob is
being freed. Use call_rcu() to fix this race condition issue.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Himanshu Shukla
d54a197964 SMACK: Delete list_head repeated initialization
smk_copy_rules() and smk_copy_relabel() are initializing list_head though
they have been initialized already in new_task_smack() function. Delete
repeated initialization.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
2e962e2fec SMACK: Add new lock for adding entry in smack master list
"smk_set_access()" function adds a new rule entry in subject label specific
list(rule_list) and in global rule list(smack_rule_list) both. Mutex lock
(rule_lock) is used to avoid simultaneous updates. But this lock is subject
label specific lock. If 2 processes tries to add different rules(i.e with
different subject labels) simultaneously, then both the processes can take
the "rule_lock" respectively. So it will cause a problem while adding
entries in master rule list.
Now a new mutex lock(smack_master_list_lock) has been taken to add entry in
smack_rule_list to avoid simultaneous updates of different rules.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
0c96d1f532 Smack: Fix the issue of wrong SMACK label update in socket bind fail case
Fix the issue of wrong SMACK label (SMACK64IPIN) update when a second bind
call is made to same IP address & port, but with different SMACK label
(SMACK64IPIN) by second instance of server. In this case server returns
with "Bind:Address already in use" error but before returning, SMACK label
is updated in SMACK port-label mapping list inside smack_socket_bind() hook

To fix this issue a new check has been added in smk_ipv6_port_label()
function before updating the existing port entry. It checks whether the
socket for matching port entry is closed or not. If it is closed then it
means port is not bound and it is safe to update the existing port entry
else return if port is still getting used. For checking whether socket is
closed or not, one more field "smk_can_reuse" has been added in the
"smk_port_label" structure. This field will be set to '1' in
"smack_sk_free_security()" function which is called to free the socket
security blob when the socket is being closed. In this function, port entry
is searched in the SMACK port-label mapping list for the closing socket.
If entry is found then "smk_can_reuse" field is set to '1'.Initially
"smk_can_reuse" field is set to '0' in smk_ipv6_port_label() function after
creating a new entry in the list which indicates that socket is in use.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
9d44c97384 Smack: Fix the issue of permission denied error in ipv6 hook
Permission denied error comes when 2 IPv6 servers are running and client
tries to connect one of them. Scenario is that both servers are using same
IP and port but different protocols(Udp and tcp). They are using different
SMACK64IPIN labels.Tcp server is using "test" and udp server is using
"test-in". When we try to run tcp client with SMACK64IPOUT label as "test",
then connection denied error comes. It should not happen since both tcp
server and client labels are same.This happens because there is no check
for protocol in smk_ipv6_port_label() function while searching for the
earlier port entry. It checks whether there is an existing port entry on
the basis of port only. So it updates the earlier port entry in the list.
Due to which smack label gets changed for earlier entry in the
"smk_ipv6_port_list" list and permission denied error comes.

Now a check is added for socket type also.Now if 2 processes use same
port  but different protocols (tcp or udp), then 2 different port entries
will be  added in the list. Similarly while checking smack access in
smk_ipv6_port_check() function,  port entry is searched on the basis of
both port and protocol.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <Himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Vishal Goel
3c7ce34271 SMACK: Add the rcu synchronization mechanism in ipv6 hooks
Add the rcu synchronization mechanism for accessing smk_ipv6_port_list
in smack IPv6 hooks. Access to the port list is vulnerable to a race
condition issue,it does not apply proper synchronization methods while
working on critical section. It is possible that when one thread is
reading the list, at the same time another thread is modifying the
same port list, which can cause the major problems.

To ensure proper synchronization between two threads, rcu mechanism
has been applied while accessing and modifying the port list. RCU will
also not affect the performance, as there are more accesses than
modification where RCU is most effective synchronization mechanism.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2017-01-10 09:47:20 -08:00
Gary Tierney
900fde06cb selinux: default to security isid in sel_make_bools() if no sid is found
Use SECINITSID_SECURITY as the default SID for booleans which don't have
a matching SID returned from security_genfs_sid(), also update the
error message to a warning which matches this.

This prevents the policy failing to load (and consequently the system
failing to boot) when there is no default genfscon statement matched for
the selinuxfs in the new policy.

Signed-off-by: Gary Tierney <gary.tierney@gmx.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:32 -05:00
Gary Tierney
4262fb51c9 selinux: log errors when loading new policy
Adds error logging to the code paths which can fail when loading a new
policy in sel_write_load().  If the policy fails to be loaded from
userspace then a warning message is printed, whereas if a failure occurs
after loading policy from userspace an error message will be printed
with details on where policy loading failed (recreating one of /classes/,
/policy_capabilities/, /booleans/ in the SELinux fs).

Also, if sel_make_bools() fails to obtain an SID for an entry in
/booleans/* an error will be printed indicating the path of the
boolean.

Signed-off-by: Gary Tierney <gary.tierney@gmx.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
b21507e272 proc,security: move restriction on writing /proc/pid/attr nodes to proc
Processes can only alter their own security attributes via
/proc/pid/attr nodes.  This is presently enforced by each individual
security module and is also imposed by the Linux credentials
implementation, which only allows a task to alter its own credentials.
Move the check enforcing this restriction from the individual
security modules to proc_pid_attr_write() before calling the security hook,
and drop the unnecessary task argument to the security hook since it can
only ever be the current task.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
be0554c9bf selinux: clean up cred usage and simplify
SELinux was sometimes using the task "objective" credentials when
it could/should use the "subjective" credentials.  This was sometimes
hidden by the fact that we were unnecessarily passing around pointers
to the current task, making it appear as if the task could be something
other than current, so eliminate all such passing of current.  Inline
various permission checking helper functions that can be reduced to a
single avc_has_perm() call.

Since the credentials infrastructure only allows a task to alter
its own credentials, we can always assume that current must be the same
as the target task in selinux_setprocattr after the check. We likely
should move this check from selinux_setprocattr() to proc_pid_attr_write()
and drop the task argument to the security hook altogether; it can only
serve to confuse things.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
01593d3299 selinux: allow context mounts on tmpfs, ramfs, devpts within user namespaces
commit aad82892af ("selinux: Add support for
unprivileged mounts from user namespaces") prohibited any use of context
mount options within non-init user namespaces.  However, this breaks
use of context mount options for tmpfs mounts within user namespaces,
which are being used by Docker/runc.  There is no reason to block such
usage for tmpfs, ramfs or devpts.  Exempt these filesystem types
from this restriction.

Before:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
mount: tmpfs is write-protected, mounting read-only
mount: cannot mount tmpfs read-only

After:
sh$ userns_child_exec  -p -m -U -M '0 1000 1' -G '0 1000 1' bash
sh# mount -t tmpfs -o context=system_u:object_r:user_tmp_t:s0:c13 none /tmp
sh# ls -Zd /tmp
unconfined_u:object_r:user_tmp_t:s0:c13 /tmp

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Stephen Smalley
ef37979a2c selinux: handle ICMPv6 consistently with ICMP
commit 79c8b348f215 ("selinux: support distinctions among all network
address families") mapped datagram ICMP sockets to the new icmp_socket
security class, but left ICMPv6 sockets unchanged.  This change fixes
that oversight to handle both kinds of sockets consistently.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:31 -05:00
Yongqin Liu
a2c7c6fbe5 selinux: add security in-core xattr support for tracefs
Since kernel 4.1 ftrace is supported as a new separate filesystem. It
gets automatically mounted by the kernel under the old path
/sys/kernel/debug/tracing. Because it lives now on a separate filesystem
SELinux needs to be updated to also support setting SELinux labels
on tracefs inodes.  This is required for compatibility in Android
when moving to Linux 4.1 or newer.

Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: William Roberts <william.c.roberts@intel.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:30 -05:00
Stephen Smalley
da69a5306a selinux: support distinctions among all network address families
Extend SELinux to support distinctions among all network address families
implemented by the kernel by defining new socket security classes
and mapping to them. Otherwise, many sockets are mapped to the generic
socket class and are indistinguishable in policy.  This has come up
previously with regard to selectively allowing access to bluetooth sockets,
and more recently with regard to selectively allowing access to AF_ALG
sockets.  Guido Trentalancia submitted a patch that took a similar approach
to add only support for distinguishing AF_ALG sockets, but this generalizes
his approach to handle all address families implemented by the kernel.
Socket security classes are also added for ICMP and SCTP sockets.
Socket security classes were not defined for AF_* values that are reserved
but unimplemented in the kernel, e.g. AF_NETBEUI, AF_SECURITY, AF_ASH,
AF_ECONET, AF_SNA, AF_WANPIPE.

Backward compatibility is provided by only enabling the finer-grained
socket classes if a new policy capability is set in the policy; older
policies will behave as before.  The legacy redhat1 policy capability
that was only ever used in testing within Fedora for ptrace_child
is reclaimed for this purpose; as far as I can tell, this policy
capability is not enabled in any supported distro policy.

Add a pair of conditional compilation guards to detect when new AF_* values
are added so that we can update SELinux accordingly rather than having to
belatedly update it long after new address families are introduced.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-01-09 10:07:30 -05:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds
67327145c4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SElinux fix from James Morris:
 "From Paul:
   'A small SELinux patch to fix some clang/llvm compiler warnings and
    ensure the tools under scripts work well in the face of kernel
    changes'"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  selinux: use the kernel headers when building scripts/selinux
2016-12-22 10:03:52 -08:00
Paul Moore
bfc5e3a6af selinux: use the kernel headers when building scripts/selinux
Commit 3322d0d64f ("selinux: keep SELinux in sync with new capability
definitions") added a check on the defined capabilities without
explicitly including the capability header file which caused problems
when building genheaders for users of clang/llvm.  Resolve this by
using the kernel headers when building genheaders, which is arguably
the right thing to do regardless, and explicitly including the
kernel's capability.h header file in classmap.h.  We also update the
mdp build, even though it wasn't causing an error we really should
be using the headers from the kernel we are building.

Reported-by: Nicolas Iooss <nicolas.iooss@m4x.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-12-21 10:39:25 -05:00
Andreas Steffen
98e1d55d03 ima: platform-independent hash value
For remote attestion it is important for the ima measurement values to
be platform-independent.  Therefore integer fields to be hashed must be
converted to canonical format.

Link: http://lkml.kernel.org/r/1480554346-29071-11-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:46 -08:00
Mimi Zohar
d68a6fe9fc ima: define a canonical binary_runtime_measurements list format
The IMA binary_runtime_measurements list is currently in platform native
format.

To allow restoring a measurement list carried across kexec with a
different endianness than the targeted kernel, this patch defines
little-endian as the canonical format.  For big endian systems wanting
to save/restore the measurement list from a system with a different
endianness, a new boot command line parameter named "ima_canonical_fmt"
is defined.

Considerations: use of the "ima_canonical_fmt" boot command line option
will break existing userspace applications on big endian systems
expecting the binary_runtime_measurements list to be in platform native
format.

Link: http://lkml.kernel.org/r/1480554346-29071-10-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
c7d0936770 ima: support restoring multiple template formats
The configured IMA measurement list template format can be replaced at
runtime on the boot command line, including a custom template format.
This patch adds support for restoring a measuremement list containing
multiple builtin/custom template formats.

Link: http://lkml.kernel.org/r/1480554346-29071-9-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
3f23d624de ima: store the builtin/custom template definitions in a list
The builtin and single custom templates are currently stored in an
array.  In preparation for being able to restore a measurement list
containing multiple builtin/custom templates, this patch stores the
builtin and custom templates as a linked list.  This will permit
defining more than one custom template per boot.

Link: http://lkml.kernel.org/r/1480554346-29071-8-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:45 -08:00
Mimi Zohar
7b8589cc29 ima: on soft reboot, save the measurement list
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (eg.  kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.

This patch uses the kexec buffer passing mechanism to pass the
serialized IMA binary_runtime_measurements to the next kernel.

Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:44 -08:00
Mimi Zohar
d158847ae8 ima: maintain memory size needed for serializing the measurement list
In preparation for serializing the binary_runtime_measurements, this
patch maintains the amount of memory required.

Link: http://lkml.kernel.org/r/1480554346-29071-5-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:44 -08:00
Mimi Zohar
dcfc56937b ima: permit duplicate measurement list entries
Measurements carried across kexec need to be added to the IMA
measurement list, but should not prevent measurements of the newly
booted kernel from being added to the measurement list.  This patch adds
support for allowing duplicate measurements.

The "boot_aggregate" measurement entry is the delimiter between soft
boots.

Link: http://lkml.kernel.org/r/1480554346-29071-4-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:43 -08:00
Mimi Zohar
94c3aac567 ima: on soft reboot, restore the measurement list
The TPM PCRs are only reset on a hard reboot.  In order to validate a
TPM's quote after a soft reboot (eg.  kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.  This
patch restores the measurement list.

Link: http://lkml.kernel.org/r/1480554346-29071-3-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-20 09:48:43 -08:00
Linus Torvalds
9a19a6db37 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:

 - more ->d_init() stuff (work.dcache)

 - pathname resolution cleanups (work.namei)

 - a few missing iov_iter primitives - copy_from_iter_full() and
   friends. Either copy the full requested amount, advance the iterator
   and return true, or fail, return false and do _not_ advance the
   iterator. Quite a few open-coded callers converted (and became more
   readable and harder to fuck up that way) (work.iov_iter)

 - several assorted patches, the big one being logfs removal

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  logfs: remove from tree
  vfs: fix put_compat_statfs64() does not handle errors
  namei: fold should_follow_link() with the step into not-followed link
  namei: pass both WALK_GET and WALK_MORE to should_follow_link()
  namei: invert WALK_PUT logics
  namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
  namei: saner calling conventions for mountpoint_last()
  namei.c: get rid of user_path_parent()
  switch getfrag callbacks to ..._full() primitives
  make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
  [iov_iter] new primitives - copy_from_iter_full() and friends
  don't open-code file_inode()
  ceph: switch to use of ->d_init()
  ceph: unify dentry_operations instances
  lustre: switch to use of ->d_init()
2016-12-16 10:24:44 -08:00
Al Viro
c4364f837c Merge branches 'work.namei', 'work.dcache' and 'work.iov_iter' into for-linus 2016-12-15 01:07:29 -05:00
Linus Torvalds
a57cb1c1d7 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a few misc things

 - kexec updates

 - DMA-mapping updates to better support networking DMA operations

 - IPC updates

 - various MM changes to improve DAX fault handling

 - lots of radix-tree changes, mainly to the test suite. All leading up
   to reimplementing the IDA/IDR code to be a wrapper layer over the
   radix-tree. However the final trigger-pulling patch is held off for
   4.11.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  radix tree test suite: delete unused rcupdate.c
  radix tree test suite: add new tag check
  radix-tree: ensure counts are initialised
  radix tree test suite: cache recently freed objects
  radix tree test suite: add some more functionality
  idr: reduce the number of bits per level from 8 to 6
  rxrpc: abstract away knowledge of IDR internals
  tpm: use idr_find(), not idr_find_slowpath()
  idr: add ida_is_empty
  radix tree test suite: check multiorder iteration
  radix-tree: fix replacement for multiorder entries
  radix-tree: add radix_tree_split_preload()
  radix-tree: add radix_tree_split
  radix-tree: add radix_tree_join
  radix-tree: delete radix_tree_range_tag_if_tagged()
  radix-tree: delete radix_tree_locate_item()
  radix-tree: improve multiorder iterators
  btrfs: fix race in btrfs_free_dummy_fs_info()
  radix-tree: improve dump output
  radix-tree: make radix_tree_find_next_bit more useful
  ...
2016-12-14 17:25:18 -08:00
Lorenzo Stoakes
5b56d49fc3 mm: add locked parameter to get_user_pages_remote()
Patch series "mm: unexport __get_user_pages_unlocked()".

This patch series continues the cleanup of get_user_pages*() functions
taking advantage of the fact we can now pass gup_flags as we please.

It firstly adds an additional 'locked' parameter to
get_user_pages_remote() to allow for its callers to utilise
VM_FAULT_RETRY functionality.  This is necessary as the invocation of
__get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
this and no other existing higher level function would allow it to do
so.

Secondly existing callers of __get_user_pages_unlocked() are replaced
with the appropriate higher-level replacement -
get_user_pages_unlocked() if the current task and memory descriptor are
referenced, or get_user_pages_remote() if other task/memory descriptors
are referenced (having acquiring mmap_sem.)

This patch (of 2):

Add a int *locked parameter to get_user_pages_remote() to allow
VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked().

Taking into account the previous adjustments to get_user_pages*()
functions allowing for the passing of gup_flags, we are now in a
position where __get_user_pages_unlocked() need only be exported for his
ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to
subsequently unexport __get_user_pages_unlocked() as well as allowing
for future flexibility in the use of get_user_pages_remote().

[sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change]
  Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au
Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Linus Torvalds
412ac77a9d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "After a lot of discussion and work we have finally reachanged a basic
  understanding of what is necessary to make unprivileged mounts safe in
  the presence of EVM and IMA xattrs which the last commit in this
  series reflects. While technically it is a revert the comments it adds
  are important for people not getting confused in the future. Clearing
  up that confusion allows us to seriously work on unprivileged mounts
  of fuse in the next development cycle.

  The rest of the fixes in this set are in the intersection of user
  namespaces, ptrace, and exec. I started with the first fix which
  started a feedback cycle of finding additional issues during review
  and fixing them. Culiminating in a fix for a bug that has been present
  since at least Linux v1.0.

  Potentially these fixes were candidates for being merged during the rc
  cycle, and are certainly backport candidates but enough little things
  turned up during review and testing that I decided they should be
  handled as part of the normal development process just to be certain
  there were not any great surprises when it came time to backport some
  of these fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
  exec: Ensure mm->user_ns contains the execed files
  ptrace: Don't allow accessing an undumpable mm
  ptrace: Capture the ptracer's creds not PT_PTRACE_CAP
  mm: Add a user_ns owner to mm_struct and fix ptrace permission checks
2016-12-14 14:09:48 -08:00
Linus Torvalds
683b96f4d1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Generally pretty quiet for this release. Highlights:

  Yama:
   - allow ptrace access for original parent after re-parenting

  TPM:
   - add documentation
   - many bugfixes & cleanups
   - define a generic open() method for ascii & bios measurements

  Integrity:
   - Harden against malformed xattrs

  SELinux:
   - bugfixes & cleanups

  Smack:
   - Remove unnecessary smack_known_invalid label
   - Do not apply star label in smack_setprocattr hook
   - parse mnt opts after privileges check (fixes unpriv DoS vuln)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits)
  Yama: allow access for the current ptrace parent
  tpm: adjust return value of tpm_read_log
  tpm: vtpm_proxy: conditionally call tpm_chip_unregister
  tpm: Fix handling of missing event log
  tpm: Check the bios_dir entry for NULL before accessing it
  tpm: return -ENODEV if np is not set
  tpm: cleanup of printk error messages
  tpm: replace of_find_node_by_name() with dev of_node property
  tpm: redefine read_log() to handle ACPI/OF at runtime
  tpm: fix the missing .owner in tpm_bios_measurements_ops
  tpm: have event log use the tpm_chip
  tpm: drop tpm1_chip_register(/unregister)
  tpm: replace dynamically allocated bios_dir with a static array
  tpm: replace symbolic permission with octal for securityfs files
  char: tpm: fix kerneldoc tpm2_unseal_trusted name typo
  tpm_tis: Allow tpm_tis to be bound using DT
  tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV
  tpm: Only call pm_runtime_get_sync if device has a parent
  tpm: define a generic open() method for ascii & bios measurements
  Documentation: tpm: add the Physical TPM device tree binding documentation
  ...
2016-12-14 13:57:44 -08:00
Linus Torvalds
9465d9cc31 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The time/timekeeping/timer folks deliver with this update:

   - Fix a reintroduced signed/unsigned issue and cleanup the whole
     signed/unsigned mess in the timekeeping core so this wont happen
     accidentaly again.

   - Add a new trace clock based on boot time

   - Prevent injection of random sleep times when PM tracing abuses the
     RTC for storage

   - Make posix timers configurable for real tiny systems

   - Add tracepoints for the alarm timer subsystem so timer based
     suspend wakeups can be instrumented

   - The usual pile of fixes and updates to core and drivers"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  timekeeping: Use mul_u64_u32_shr() instead of open coding it
  timekeeping: Get rid of pointless typecasts
  timekeeping: Make the conversion call chain consistently unsigned
  timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion
  alarmtimer: Add tracepoints for alarm timers
  trace: Update documentation for mono, mono_raw and boot clock
  trace: Add an option for boot clock as trace clock
  timekeeping: Add a fast and NMI safe boot clock
  timekeeping/clocksource_cyc2ns: Document intended range limitation
  timekeeping: Ignore the bogus sleep time if pm_trace is enabled
  selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous"
  clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap
  clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map()
  arm64: dts: rockchip: Arch counter doesn't tick in system suspend
  clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
  posix-timers: Make them configurable
  posix_cpu_timers: Move the add_device_randomness() call to a proper place
  timer: Move sys_alarm from timer.c to itimer.c
  ptp_clock: Allow for it to be optional
  Kconfig: Regenerate *.c_shipped files after previous changes
  ...
2016-12-12 19:56:15 -08:00
Al Viro
cbbd26b8b1 [iov_iter] new primitives - copy_from_iter_full() and friends
copy_from_iter_full(), copy_from_iter_full_nocache() and
csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
et.al., advancing iterator only in case of successful full copy
and returning whether it had been successful or not.

Convert some obvious users.  *NOTE* - do not blindly assume that
something is a good candidate for those unless you are sure that
not advancing iov_iter in failure case is the right thing in
this case.  Anything that does short read/short write kind of
stuff (or is in a loop, etc.) is unlikely to be a good one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-05 14:33:36 -05:00
Josh Stone
50523a29d9 Yama: allow access for the current ptrace parent
Under ptrace_scope=1, it's possible to have a tracee that is already
ptrace-attached, but is no longer a direct descendant.  For instance, a
forking daemon will be re-parented to init, losing its ancestry to the
tracer that launched it.

The tracer can continue using ptrace in that state, but it will be
denied other accesses that check PTRACE_MODE_ATTACH, like process_vm_rw
and various procfs files.  There's no reason to prevent such access for
a tracer that already has ptrace control anyway.

This patch adds a case to ptracer_exception_found to allow access for
any task in the same thread group as the current ptrace parent.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-12-05 11:48:01 +11:00
Al Viro
450630975d don't open-code file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-04 18:29:28 -05:00
Eric W. Biederman
19339c2516 Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
This reverts commit 0b3c9761d1.

Seth Forshee <seth.forshee@canonical.com> writes:
> All right, I think 0b3c9761d1 should be
> reverted then. EVM is a machine-local integrity mechanism, and so it
> makes sense that the signature would be based on the kernel's notion of
> the uid and not the filesystem's.

I added a commment explaining why the EVM hmac needs to be in the
kernel's notion of uid and gid, not the filesystems to prevent
remounting the filesystem and gaining unwaranted trust in files.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-12-02 20:58:41 -06:00
James Morris
0821e30cd2 Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/selinux into next 2016-11-24 11:21:25 +11:00
James Morris
b075361e91 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next 2016-11-23 09:52:11 +11:00
Andreas Gruenbacher
9287aed2ad selinux: Convert isec->lock into a spinlock
Convert isec->lock from a mutex into a spinlock.  Instead of holding
the lock while sleeping in inode_doinit_with_dentry, set
isec->initialized to LABEL_PENDING and release the lock.  Then, when
the sid has been determined, re-acquire the lock.  If isec->initialized
is still set to LABEL_PENDING, set isec->sid; otherwise, the sid has
been set by another task (LABEL_INITIALIZED) or invalidated
(LABEL_INVALID) in the meantime.

This fixes a deadlock on gfs2 where

 * one task is in inode_doinit_with_dentry -> gfs2_getxattr, holds
   isec->lock, and tries to acquire the inode's glock, and

 * another task is in do_xmote -> inode_go_inval ->
   selinux_inode_invalidate_secctx, holds the inode's glock, and
   tries to acquire isec->lock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
[PM: minor tweaks to keep checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-22 17:44:02 -05:00
James Morris
636e4625ad Merge remote branch 'smack/smack-for-4.10' into next 2016-11-22 22:15:30 +11:00
Stephen Smalley
3322d0d64f selinux: keep SELinux in sync with new capability definitions
When a new capability is defined, SELinux needs to be updated.
Trigger a build error if a new capability is defined without
corresponding update to security/selinux/include/classmap.h's
COMMON_CAP2_PERMS.  This is similar to BUILD_BUG_ON() guards
in the SELinux nlmsgtab code to ensure that SELinux tracks
new netlink message types as needed.

Note that there is already a similar build guard in
security/selinux/hooks.c to detect when more than 64
capabilities are defined, since that will require adding
a third capability class to SELinux.

A nicer way to do this would be to extend scripts/selinux/genheaders
or a similar tool to auto-generate the necessary definitions and code
for SELinux capability checking from include/uapi/linux/capability.h.
AppArmor does something similar in its Makefile, although it only
needs to generate a single table of names.  That is left as future
work.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: reformat the description to keep checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-21 15:37:24 -05:00
John Johansen
3d40658c97 apparmor: fix change_hat not finding hat after policy replacement
After a policy replacement, the task cred may be out of date and need
to be updated. However change_hat is using the stale profiles from
the out of date cred resulting in either: a stale profile being applied
or, incorrect failure when searching for a hat profile as it has been
migrated to the new parent profile.

Fixes: 01e2b670aa (failure to find hat)
Fixes: 898127c34e (stale policy being applied)
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000287
Cc: stable@vger.kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-11-21 18:01:28 +11:00
Stephen Smalley
ea49d10eee selinux: normalize input to /sys/fs/selinux/enforce
At present, one can write any signed integer value to
/sys/fs/selinux/enforce and it will be stored,
e.g. echo -1 > /sys/fs/selinux/enforce or echo 2 >
/sys/fs/selinux/enforce. This makes no real difference
to the kernel, since it only ever cares if it is zero or non-zero,
but some userspace code compares it with 1 to decide if SELinux
is enforcing, and this could confuse it. Only a process that is
already root and is allowed the setenforce permission in SELinux
policy can write to /sys/fs/selinux/enforce, so this is not considered
to be a security issue, but it should be fixed.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-20 17:13:19 -05:00
Nicolas Pitre
baa73d9e47 posix-timers: Make them configurable
Some embedded systems have no use for them.  This removes about
25KB from the kernel binary size when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime: timer_getoverrun, timer_settime, timer_delete,
clock_adjtime, setitimer, getitimer, alarm.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep
syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast
majority of use cases with very little code.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:26:35 +01:00
Casey Schaufler
152f91d4d1 Smack: Remove unnecessary smack_known_invalid
The invalid Smack label ("") and the Huh ("?") Smack label
serve the same purpose and having both is unnecessary.
While pulling out the invalid label it became clear that
the use of smack_from_secid() was inconsistent, so that
is repaired. The setting of inode labels to the invalid
label could never happen in a functional system, has
never been observed in the wild and is not what you'd
really want for a failure behavior in any case. That is
removed.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-15 09:34:39 -08:00
Tetsuo Handa
8c15d66e42 Smack: Use GFP_KERNEL for smack_parse_opts_str().
Since smack_parse_opts_str() is calling match_strdup() which uses
GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is
called by smack_parse_opts_str().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-14 12:55:11 -08:00
Andreas Gruenbacher
13457d073c selinux: Clean up initialization of isec->sclass
Now that isec->initialized == LABEL_INITIALIZED implies that
isec->sclass is valid, skip such inodes immediately in
inode_doinit_with_dentry.

For the remaining inodes, initialize isec->sclass at the beginning of
inode_doinit_with_dentry to simplify the code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:53:04 -05:00
Andreas Gruenbacher
db978da8fa proc: Pass file mode to proc_pid_make_inode
Pass the file mode of the proc inode to be created to
proc_pid_make_inode.  In proc_pid_make_inode, initialize inode->i_mode
before calling security_task_to_inode.  This allows selinux to set
isec->sclass right away without introducing "half-initialized" inode
security structs.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:39:48 -05:00
Andreas Gruenbacher
420591128c selinux: Minor cleanups
Fix the comment for function __inode_security_revalidate, which returns
an integer.

Use the LABEL_* constants consistently for isec->initialized.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:25:07 -05:00
Tetsuo Handa
8931c3bdb3 SELinux: Use GFP_KERNEL for selinux_parse_opts_str().
Since selinux_parse_opts_str() is calling match_strdup() which uses
GFP_KERNEL, it is safe to use GFP_KERNEL from kcalloc() which is
called by selinux_parse_opts_str().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:03:38 -05:00
Seth Forshee
b4bfec7f4a security/integrity: Harden against malformed xattrs
In general the handling of IMA/EVM xattrs is good, but I found
a few locations where either the xattr size or the value of the
type field in the xattr are not checked. Add a few simple checks
to these locations to prevent malformed or malicious xattrs from
causing problems.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:11 -05:00
Mimi Zohar
064be15c52 ima: include the reason for TPM-bypass mode
This patch includes the reason for going into TPM-bypass mode
and not using the TPM.

Signed-off-by: Mimi Zohar (zohar@linux.vnet.ibm>
2016-11-13 22:50:09 -05:00
Mimi Zohar
f5acb3dcba Revert "ima: limit file hash setting by user to fix and log modes"
Userspace applications have been modified to write security xattrs,
but they are not context aware.  In the case of security.ima, the
security xattr can be either a file hash or a file signature.
Permitting writing one, but not the other requires the application to
be context aware.

In addition, userspace applications might write files to a staging
area, which might not be in policy, and then change some file metadata
(eg. owner) making it in policy.  As a result, these files are not
labeled properly.

This reverts commit c68ed80c97, which
prevents writing file hashes as security.ima xattrs.

Requested-by: Patrick Ohly <patrick.ohly@intel.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:09 -05:00
Eric Richter
9a11a18902 ima: fix memory leak in ima_release_policy
When the "policy" securityfs file is opened for read, it is opened as a
sequential file. However, when it is eventually released, there is no
cleanup for the sequential file, therefore some memory is leaked.

This patch adds a call to seq_release() in ima_release_policy() to clean up
the memory when the file is opened for read.

Fixes: 80eae209d6 IMA: allow reading back the current policy
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-11-13 22:50:08 -05:00
Casey Schaufler
2e4939f702 Smack: ipv6 label match fix
The check for a deleted entry in the list of IPv6 host
addresses was being performed in the wrong place, leading
to most peculiar results in some cases. This puts the
check into the right place.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:22:18 -08:00
Himanshu Shukla
b437aba85b SMACK: Fix the memory leak in smack_cred_prepare() hook
Memory leak in smack_cred_prepare()function.
smack_cred_prepare() hook returns error if there is error in allocating
memory in smk_copy_rules() or smk_copy_relabel() function.
If smack_cred_prepare() function returns error then the calling
function should call smack_cred_free() function for cleanup.
In smack_cred_free() function first credential is  extracted and
then all rules are deleted. In smack_cred_prepare() function security
field is assigned in the end when all function return success. But this
function may return before and memory will not be freed.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:22:06 -08:00
Himanshu Shukla
7128ea159d SMACK: Do not apply star label in smack_setprocattr hook
Smack prohibits processes from using the star ("*") and web ("@") labels.
Checks have been added in other functions. In smack_setprocattr()
hook, only check for web ("@") label has been added and restricted
from applying web ("@") label.
Check for star ("*") label should also be added in smack_setprocattr()
hook. Return error should be "-EINVAL" not "-EPERM" as permission
is there for setting label but not the label value as star ("*") or
web ("@").

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:21:52 -08:00
Himanshu Shukla
2097f59920 smack: parse mnt opts after privileges check
In smack_set_mnt_opts()first the SMACK mount options are being
parsed and later it is being checked whether the user calling
mount has CAP_MAC_ADMIN capability.
This sequence of operationis will allow unauthorized user to add
SMACK labels in label list and may cause denial of security attack
by adding many labels by allocating kernel memory by unauthorized user.
Superblock smack flag is also being set as initialized though function
may return with EPERM error.
First check the capability of calling user then set the SMACK attributes
and smk_flags.

Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-10 11:21:32 -08:00
jooseong lee
08382c9f6e Smack: Assign smack_known_web label for kernel thread's
Assign smack_known_web label for kernel thread's socket

Creating struct sock by sk_alloc function in various kernel subsystems
like bluetooth doesn't call smack_socket_post_create(). In such case,
received sock label is the floor('_') label and makes access deny.

Signed-off-by: jooseong lee <jooseong.lee@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-11-04 17:42:57 -07:00
Artem Savkov
31e6ec4519 security/keys: make BIG_KEYS dependent on stdrng.
Since BIG_KEYS can't be compiled as module it requires one of the "stdrng"
providers to be compiled into kernel. Otherwise big_key_crypto_init() fails
on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in
big_key_preparse()) results in a NULL pointer dereference.

Fixes: 13100a72f4 ('Security: Keys: Big keys stored encrypted')
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Stephan Mueller <smueller@chronox.de>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:33 +11:00
David Howells
7df3e59c3d KEYS: Sort out big_key initialisation
big_key has two separate initialisation functions, one that registers the
key type and one that registers the crypto.  If the key type fails to
register, there's no problem if the crypto registers successfully because
there's no way to reach the crypto except through the key type.

However, if the key type registers successfully but the crypto does not,
big_key_rng and big_key_blkcipher may end up set to NULL - but the code
neither checks for this nor unregisters the big key key type.

Furthermore, since the key type is registered before the crypto, it is
theoretically possible for the kernel to try adding a big_key before the
crypto is set up, leading to the same effect.

Fix this by merging big_key_crypto_init() and big_key_init() and calling
the resulting function late.  If they're going to be encrypted, we
shouldn't be creating big_keys before we have the facilities to do the
encryption available.  The key type registration is also moved after the
crypto initialisation.

The fix also includes message printing on failure.

If the big_key type isn't correctly set up, simply doing:

	dd if=/dev/zero bs=4096 count=1 | keyctl padd big_key a @s

ought to cause an oops.

Fixes: 13100a72f4 ('Security: Keys: Big keys stored encrypted')
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Peter Hlavaty <zer0mem@yahoo.com>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
cc: Artem Savkov <asavkov@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:27 +11:00
David Howells
03dab869b7 KEYS: Fix short sprintf buffer in /proc/keys show function
This fixes CVE-2016-7042.

Fix a short sprintf buffer in proc_keys_show().  If the gcc stack protector
is turned on, this can cause a panic due to stack corruption.

The problem is that xbuf[] is not big enough to hold a 64-bit timeout
rendered as weeks:

	(gdb) p 0xffffffffffffffffULL/(60*60*24*7)
	$2 = 30500568904943

That's 14 chars plus NUL, not 11 chars plus NUL.

Expand the buffer to 16 chars.

I think the unpatched code apparently works if the stack-protector is not
enabled because on a 32-bit machine the buffer won't be overflowed and on a
64-bit machine there's a 64-bit aligned pointer at one side and an int that
isn't checked again on the other side.

The panic incurred looks something like:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe
CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f
 ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6
 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679
Call Trace:
 [<ffffffff813d941f>] dump_stack+0x63/0x84
 [<ffffffff811b2cb6>] panic+0xde/0x22a
 [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0
 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30
 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0
 [<ffffffff81350410>] ? key_validate+0x50/0x50
 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20
 [<ffffffff8126b31c>] seq_read+0x2cc/0x390
 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70
 [<ffffffff81244fc7>] __vfs_read+0x37/0x150
 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0
 [<ffffffff81246156>] vfs_read+0x96/0x130
 [<ffffffff81247635>] SyS_read+0x55/0xc0
 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4

Reported-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ondrej Kozina <okozina@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-10-27 16:03:24 +11:00
Linus Torvalds
86c5bf7101 Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vmap stack fixes from Ingo Molnar:
 "This is fallout from CONFIG_HAVE_ARCH_VMAP_STACK=y on x86: stack
  accesses that used to be just somewhat questionable are now totally
  buggy.

  These changes try to do it without breaking the ABI: the fields are
  left there, they are just reporting zero, or reporting narrower
  information (the maps file change)"

* 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  mm: Change vm_is_stack_for_task() to vm_is_stack_for_current()
  fs/proc: Stop trying to report thread stacks
  fs/proc: Stop reporting eip and esp in /proc/PID/stat
  mm/numa: Remove duplicated include from mprotect.c
2016-10-22 09:39:10 -07:00
Andy Lutomirski
d17af5056c mm: Change vm_is_stack_for_task() to vm_is_stack_for_current()
Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode.  As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.

The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-20 09:21:41 +02:00
Lorenzo Stoakes
9beae1ea89 mm: replace get_user_pages_remote() write/force parameters with gup_flags
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-19 08:12:02 -07:00
Linus Torvalds
101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Al Viro
3873691e5a Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
Linus Torvalds
97d2116708 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "xattr stuff from Andreas

  This completes the switch to xattr_handler ->get()/->set() from
  ->getxattr/->setxattr/->removexattr"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Remove {get,set,remove}xattr inode operations
  xattr: Stop calling {get,set,remove}xattr inode operations
  vfs: Check for the IOP_XATTR flag in listxattr
  xattr: Add __vfs_{get,set,remove}xattr helpers
  libfs: Use IOP_XATTR flag for empty directory handling
  vfs: Use IOP_XATTR flag for bad-inode handling
  vfs: Add IOP_XATTR inode operations flag
  vfs: Move xattr_resolve_name to the front of fs/xattr.c
  ecryptfs: Switch to generic xattr handlers
  sockfs: Get rid of getxattr iop
  sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
  kernfs: Switch to generic xattr handlers
  hfs: Switch to generic xattr handlers
  jffs2: Remove jffs2_{get,set,remove}xattr macros
  xattr: Remove unnecessary NULL attribute name check
2016-10-10 17:11:50 -07:00
Linus Torvalds
abb5a14fa2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted misc bits and pieces.

  There are several single-topic branches left after this (rename2
  series from Miklos, current_time series from Deepa Dinamani, xattr
  series from Andreas, uaccess stuff from from me) and I'd prefer to
  send those separately"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
  proc: switch auxv to use of __mem_open()
  hpfs: support FIEMAP
  cifs: get rid of unused arguments of CIFSSMBWrite()
  posix_acl: uapi header split
  posix_acl: xattr representation cleanups
  fs/aio.c: eliminate redundant loads in put_aio_ring_file
  fs/internal.h: add const to ns_dentry_operations declaration
  compat: remove compat_printk()
  fs/buffer.c: make __getblk_slow() static
  proc: unsigned file descriptors
  fs/file: more unsigned file descriptors
  fs: compat: remove redundant check of nr_segs
  cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
  cifs: don't use memcpy() to copy struct iov_iter
  get rid of separate multipage fault-in primitives
  fs: Avoid premature clearing of capabilities
  fs: Give dentry to inode_change_ok() instead of inode
  fuse: Propagate dentry down to inode_change_ok()
  ceph: Propagate dentry down to inode_change_ok()
  xfs: Propagate dentry down to inode_change_ok()
  ...
2016-10-10 13:04:49 -07:00
Linus Torvalds
563873318d Merge branch 'printk-cleanups'
Merge my system logging cleanups, triggered by the broken '\n' patches.

The line continuation handling has been broken basically forever, and
the code to handle the system log records was both confusing and
dubious.  And it would do entirely the wrong thing unless you always had
a terminating newline, partly because it couldn't actually see whether a
message was marked KERN_CONT or not (but partly because the LOG_CONT
handling in the recording code was rather confusing too).

This re-introduces a real semantically meaningful KERN_CONT, and fixes
the few places I noticed where it was missing.  There are probably more
missing cases, since KERN_CONT hasn't actually had any semantic meaning
for at least four years (other than the checkpatch meaning of "no log
level necessary, this is a continuation line").

This also allows the combination of KERN_CONT and a log level.  In that
case the log level will be ignored if the merging with a previous line
is successful, but if a new record is needed, that new record will now
get the right log level.

That also means that you can at least in theory combine KERN_CONT with
the "pr_info()" style helpers, although any use of pr_fmt() prefixing
would make that just result in a mess, of course (the prefix would end
up in the middle of a continuing line).

* printk-cleanups:
  printk: make reading the kernel log flush pending lines
  printk: re-organize log_output() to be more legible
  printk: split out core logging code into helper function
  printk: reinstate KERN_CONT for printing continuation lines
2016-10-10 09:29:50 -07:00
Linus Torvalds
4bcc595ccd printk: reinstate KERN_CONT for printing continuation lines
Long long ago the kernel log buffer was a buffered stream of bytes, very
much like stdio in user space.  It supported log levels by scanning the
stream and noticing the log level markers at the beginning of each line,
but if you wanted to print a partial line in multiple chunks, you just
did multiple printk() calls, and it just automatically worked.

Except when it didn't, and you had very confusing output when different
lines got all mixed up with each other.  Then you got fragment lines
mixing with each other, or with non-fragment lines, because it was
traditionally impossible to tell whether a printk() call was a
continuation or not.

To at least help clarify the issue of continuation lines, we added a
KERN_CONT marker back in 2007 to mark continuation lines:

  4749252776 ("printk: add KERN_CONT annotation").

That continuation marker was initially an empty string, and didn't
actuall make any semantic difference.  But it at least made it possible
to annotate the source code, and have check-patch notice that a printk()
didn't need or want a log level marker, because it was a continuation of
a previous line.

To avoid the ambiguity between a continuation line that had that
KERN_CONT marker, and a printk with no level information at all, we then
in 2009 made KERN_CONT be a real log level marker which meant that we
could now reliably tell the difference between the two cases.

  5fd29d6ccb ("printk: clean up handling of log-levels and newlines")

and we could take advantage of that to make sure we didn't mix up
continuation lines with lines that just didn't have any loglevel at all.

Then, in 2012, the kernel log buffer was changed to be a "record" based
log, where each line was a record that has a loglevel and a timestamp.

You can see the beginning of that conversion in commits

  e11fea92e1 ("kmsg: export printk records to the /dev/kmsg interface")
  7ff9554bb5 ("printk: convert byte-buffer to variable-length record buffer")

with a number of follow-up commits to fix some painful fallout from that
conversion.  Over all, it took a couple of months to sort out most of
it.  But the upside was that you could have concurrent readers (and
writers) of the kernel log and not have lines with mixed output in them.

And one particular pain-point for the record-based kernel logging was
exactly the fragmentary lines that are generated in smaller chunks.  In
order to still log them as one recrod, the continuation lines need to be
attached to the previous record properly.

However the explicit continuation record marker that is actually useful
for this exact case was actually removed in aroundm the same time by commit

  61e99ab8e3 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")

due to the incorrect belief that KERN_CONT wasn't meaningful.  The
ambiguity between "is this a continuation line" or "is this a plain
printk with no log level information" was reintroduced, and in fact
became an even bigger pain point because there was now the whole
record-level merging of kernel messages going on.

This patch reinstates the KERN_CONT as a real non-empty string marker,
so that the ambiguity is fixed once again.

But it's not a plain revert of that original removal: in the four years
since we made KERN_CONT an empty string again, not only has the format
of the log level markers changed, we've also had some usage changes in
this area.

For example, some ACPI code seems to use KERN_CONT _together_ with a log
level, and now uses both the KERN_CONT marker and (for example) a
KERN_INFO marker to show that it's an informational continuation of a
line.

Which is actually not a bad idea - if the continuation line cannot be
attached to its predecessor, without the log level information we don't
know what log level to assign to it (and we traditionally just assigned
it the default loglevel).  So having both a log level and the KERN_CONT
marker is not necessarily a bad idea, but it does mean that we need to
actually iterate over potentially multiple markers, rather than just a
single one.

Also, since KERN_CONT was still conceptually needed, and encouraged, but
didn't actually _do_ anything, we've also had the reverse problem:
rather than having too many annotations it has too few, and there is bit
rot with code that no longer marks the continuation lines with the
KERN_CONT marker.

So this patch not only re-instates the non-empty KERN_CONT marker, it
also fixes up the cases of bit-rot I noticed in my own logs.

There are probably other cases where KERN_CONT will be needed to be
added, either because it is new code that never dealt with the need for
KERN_CONT, or old code that has bitrotted without anybody noticing.

That said, we should strive to avoid the need for KERN_CONT.  It does
result in real problems for logging, and should generally not be seen as
a good feature.  If we some day can get rid of the feature entirely,
because nobody does any fragmented printk calls, that would be lovely.

But until that point, let's at mark the code that relies on the hacky
multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
it also annotates code as "maybe this would be good to fix some day".

(That said, particularly during single-threaded bootup, the downsides of
KERN_CONT are very limited.  Things get much hairier when you have
multiple threads going on and user level reading and writing logs too).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:38 -07:00
Al Viro
f334bcd94b Merge remote-tracking branch 'ovl/misc' into work.misc 2016-10-08 11:00:01 -04:00
Andreas Gruenbacher
5d6c31910b xattr: Add __vfs_{get,set,remove}xattr helpers
Right now, various places in the kernel check for the existence of
getxattr, setxattr, and removexattr inode operations and directly call
those operations.  Switch to helper functions and test for the IOP_XATTR
flag instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07 20:10:44 -04:00
Linus Torvalds
2ab704a47e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
 "The usual rocket science from the trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  tracing/syscalls: fix multiline in error message text
  lib/Kconfig.debug: fix DEBUG_SECTION_MISMATCH description
  doc: vfs: fix fadvise() sycall name
  x86/entry: spell EBX register correctly in documentation
  securityfs: fix securityfs_create_dir comment
  irq: Fix typo in tracepoint.xml
2016-10-07 12:24:08 -07:00
Linus Torvalds
a3443cda55 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

  SELinux/LSM:
   - overlayfs support, necessary for container filesystems

  LSM:
   - finally remove the kernel_module_from_file hook

  Smack:
   - treat signal delivery as an 'append' operation

  TPM:
   - lots of bugfixes & updates

  Audit:
   - new audit data type: LSM_AUDIT_DATA_FILE

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (47 commits)
  Revert "tpm/tpm_crb: implement tpm crb idle state"
  Revert "tmp/tpm_crb: fix Intel PTT hw bug during idle state"
  Revert "tpm/tpm_crb: open code the crb_init into acpi_add"
  Revert "tmp/tpm_crb: implement runtime pm for tpm_crb"
  lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE
  tmp/tpm_crb: implement runtime pm for tpm_crb
  tpm/tpm_crb: open code the crb_init into acpi_add
  tmp/tpm_crb: fix Intel PTT hw bug during idle state
  tpm/tpm_crb: implement tpm crb idle state
  tpm: add check for minimum buffer size in tpm_transmit()
  tpm: constify TPM 1.x header structures
  tpm/tpm_crb: fix the over 80 characters checkpatch warring
  tpm/tpm_crb: drop useless cpu_to_le32 when writing to registers
  tpm/tpm_crb: cache cmd_size register value.
  tmp/tpm_crb: drop include to platform_device
  tpm/tpm_tis: remove unused itpm variable
  tpm_crb: fix incorrect values of cmdReady and goIdle bits
  tpm_crb: refine the naming of constants
  tpm_crb: remove wmb()'s
  tpm_crb: fix crb_req_canceled behavior
  ...
2016-10-04 14:48:27 -07:00
Linus Torvalds
3cd013ab79 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "Another relatively small pull request for v4.9 with just two patches.

  The patch from Richard updates the list of features we support and
  report back to userspace; this should have been sent earlier with the
  rest of the v4.8 patches but it got lost in my inbox.

  The second patch fixes a problem reported by our Android friends where
  we weren't very consistent in recording PIDs"

* 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit:
  audit: add exclude filter extension to feature bitmap
  audit: consistently record PIDs with task_tgid_nr()
2016-10-04 14:21:41 -07:00
Laurent Georget
1b4606511d securityfs: fix securityfs_create_dir comment
If there is an error creating a directory with securityfs_create_dir,
the error is propagated via ERR_PTR but the function comment claims that
NULL is returned.

This is a similar commit to 88e6c94cda
("fix long-broken securityfs_create_file comment") that did not fix
securityfs_create_dir comment at the same time.

Signed-off-by: Laurent Georget <laurent.georget@supelec.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-09-29 10:07:01 +02:00
Deepa Dinamani
078cd8279e fs: Replace CURRENT_TIME with current_time() for inode timestamps
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.

CURRENT_TIME is also not y2038 safe.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.

Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-27 21:06:21 -04:00
Miklos Szeredi
2773bf00ae fs: rename "rename2" i_op to "rename"
Generated patch:

sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27 11:03:58 +02:00
Miklos Szeredi
18fc84dafa vfs: remove unused i_op->rename
No in-tree uses remain.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-27 11:03:58 +02:00
Linus Torvalds
2ddfdd4289 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a regression in RSA that was only half-fixed earlier in the
  cycle.  It also fixes an older regression that breaks the keyring
  subsystem"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: rsa-pkcs1pad - Handle leading zero for decryption
  KEYS: Fix skcipher IV clobbering
2016-09-23 11:28:04 -07:00
Herbert Xu
456bee986e KEYS: Fix skcipher IV clobbering
The IV must not be modified by the skcipher operation so we need
to duplicate it.

Fixes: c3917fd9df ("KEYS: Use skcipher")
Cc: stable@vger.kernel.org
Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-22 17:42:07 +08:00
James Morris
8a17ef9d85 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next 2016-09-21 11:54:19 +10:00
Vivek Goyal
43af5de742 lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE
Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u"
of common_audit_data. This information is used to print path of file
at the same time it is also used to get to dentry and inode. And this
inode information is used to get to superblock and device and print
device information.

This does not work well for layered filesystems like overlay where dentry
contained in path is overlay dentry and not the real dentry of underlying
file system. That means inode retrieved from dentry is also overlay
inode and not the real inode.

SELinux helpers like file_path_has_perm() are doing checks on inode
retrieved from file_inode(). This returns the real inode and not the
overlay inode. That means we are doing check on real inode but for audit
purposes we are printing details of overlay inode and that can be
confusing while debugging.

Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file
information and inode retrieved is real inode using file_inode(). That
way right avc denied information is given to user.

For example, following is one example avc before the patch.

  type=AVC msg=audit(1473360868.399:214): avc:  denied  { read open } for
    pid=1765 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="overlay" ino=21443
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

It looks as follows after the patch.

  type=AVC msg=audit(1473360017.388:282): avc:  denied  { read open } for
    pid=2530 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="dm-0" ino=2377915
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

Notice that now dev information points to "dm-0" device instead of
"overlay" device. This makes it clear that check failed on underlying
inode and not on the overlay inode.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
[PM: slight tweaks to the description to make checkpatch.pl happy]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-09-19 13:42:38 -04:00
James Morris
de2f4b3453 Merge branch 'stable-4.9' of git://git.infradead.org/users/pcmoore/selinux into next 2016-09-19 12:27:10 +10:00
Miklos Szeredi
e71b9dff06 ima: use file_dentry()
Ima tries to call ->setxattr() on overlayfs dentry after having locked
underlying inode, which results in a deadlock.

Reported-by: Krisztian Litkey <kli@iki.fi>
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-09-16 12:44:20 +02:00
Wei Yongjun
9b6a9ecc2d selinux: fix error return code in policydb_read()
Fix to return error code -EINVAL from the error handling case instead
of 0 (rc is overwrite to 0 when policyvers >=
POLICYDB_VERSION_ROLETRANS), as done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
[PM: normalize "selinux" in patch subject, description line wrap]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-09-13 17:14:43 -04:00
Casey Schaufler
c60b906673 Smack: Signal delivery as an append operation
Under a strict subject/object security policy delivering a
signal or delivering network IPC could be considered either
a write or an append operation. The original choice to make
both write operations leads to an issue where IPC delivery
is desired under policy, but delivery of signals is not.
This patch provides the option of making signal delivery
an append operation, allowing Smack rules that deny signal
delivery while allowing IPC. This was requested for Tizen.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2016-09-08 13:22:56 -07:00
Linus Torvalds
80a77045da - force check_object_size() to be inline too
- move page-spanning check behind a CONFIG since it's triggering false positives
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJX0F3qAAoJEIly9N/cbcAmaxEP/R737i+XhP4VOv+rYW050NHB
 K+FDeLv5/Gx56JrHRRWwj80K5/9F09wS929AWcaTDjEb2NKp/o/6W/gN1eVdGlJF
 K3UQk+Kmncb44poVoHkjMRAl6+sfKm6mTWZBTjBECKwQuCFyDoDoqDXhn5IXTlw9
 Ig+TTOSgNw9gRke3ECtFynbVnDWx/Ry/axfT9vGXhFOkWclMUFy2UOdDSTtFAB6x
 yw5hdrfGakk2BPscHLO1xNqRuVLRUSXZVUiJGIQ6AiUupm34Yqmm69mrMuxaOtPC
 Ai3zhNGDuYClcGJAiPJYX+7nRjgPCWAdlyzQqLp5hwx63TJ+gxvhmxoFOJxEmHE/
 99i2Ak073Es6WII532Eknk3vV+UJzQNT/HO+0LcrJFkOEp9EHfVUb19CngQTaX7Q
 UbfYdyFgp3y24cRp7v0tP8gE2LCrsRe0UEhUq2NGmrerw3caNqZGHS9Od5OYEM8D
 uIhaotWoOv9Z0r+DZMGkUjfqeLb6RWNcUoWc5wZ3VYG27BM/pfhRxKf/2aw6O9u0
 2Jk1QJxBr+/8DQ500xu/IBOP9V7aGAc4nxKyqUlwA05/JEFGiAzCwqfZW5CKTxgD
 5Ht994WbTEH3/VaAskKnwggeHvttiEpehBCdVA4bXuhBhJhmPjFiKHX7uRrcM2GV
 /yH3UTkPnr/VD/I2ndiX
 =r8BE
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more hardened usercopyfixes from Kees Cook:

 - force check_object_size() to be inline too

 - move page-spanning check behind a CONFIG since it's triggering false
   positives

[ Changed the page-spanning config option to depend on EXPERT in the
  merge.  That way it still gets build testing, and you can enable it if
  you want to, but is never enabled for "normal" configurations ]

* tag 'usercopy-v4.8-rc6-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  usercopy: remove page-spanning test for now
  usercopy: force check_object_size() inline
2016-09-07 14:03:49 -07:00
Kees Cook
8e1f74ea02 usercopy: remove page-spanning test for now
A custom allocator without __GFP_COMP that copies to userspace has been
found in vmw_execbuf_process[1], so this disables the page-span checker
by placing it behind a CONFIG for future work where such things can be
tracked down later.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1373326

Reported-by: Vinson Lee <vlee@freedesktop.org>
Fixes: f5509cc18d ("mm: Hardened usercopy")
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-07 11:33:26 -07:00
Paul Moore
fa2bea2f5c audit: consistently record PIDs with task_tgid_nr()
Unfortunately we record PIDs in audit records using a variety of
methods despite the correct way being the use of task_tgid_nr().
This patch converts all of these callers, except for the case of
AUDIT_SET in audit_receive_msg() (see the comment in the code).

Reported-by: Jeff Vander Stoep <jeffv@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-30 17:19:13 -04:00
William Roberts
7c686af071 selinux: fix overflow and 0 length allocations
Throughout the SELinux LSM, values taken from sepolicy are
used in places where length == 0 or length == <saturated>
matter, find and fix these.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-30 15:45:50 -04:00
William Roberts
3bc7bcf69b selinux: initialize structures
libsepol pointed out an issue where its possible to have
an unitialized jmp and invalid dereference, fix this.
While we're here, zero allocate all the *_val_to_struct
structures.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-29 19:22:10 -04:00
William Roberts
74d977b65e selinux: detect invalid ebitmap
When count is 0 and the highbit is not zero, the ebitmap is not
valid and the internal node is not allocated. This causes issues
when routines, like mls_context_isvalid() attempt to use the
ebitmap_for_each_bit() and ebitmap_node_get_bit() as they assume
a highbit > 0 will have a node allocated.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-29 19:19:50 -04:00
Markus Elfring
63e24c4971 Smack: Use memdup_user() rather than duplicating its implementation
Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-08-23 09:58:21 -07:00
Linus Torvalds
6040e57658 Make the hardened user-copy code depend on having a hardened allocator
The kernel test robot reported a usercopy failure in the new hardened
sanity checks, due to a page-crossing copy of the FPU state into the
task structure.

This happened because the kernel test robot was testing with SLOB, which
doesn't actually do the required book-keeping for slab allocations, and
as a result the hardening code didn't realize that the task struct
allocation was one single allocation - and the sanity checks fail.

Since SLOB doesn't even claim to support hardening (and you really
shouldn't use it), the straightforward solution is to just make the
usercopy hardening code depend on the allocator supporting it.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-19 12:47:01 -07:00
William Roberts
348a0db9e6 selinux: drop SECURITY_SELINUX_POLICYDB_VERSION_MAX
Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option

Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo

This was only needed on Fedora 3 and 4 and just causes issues now,
so drop it.

The MAX and MIN should just be whatever the kernel can support.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-18 20:01:15 -04:00
Vivek Goyal
a518b0a5b0 selinux: Implement dentry_create_files_as() hook
Calculate what would be the label of newly created file and set that
secid in the passed creds.

Context of the task which is actually creating file is retrieved from
set of creds passed in. (old->security).

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-10 08:25:22 -04:00
Vivek Goyal
2602625b7e security, overlayfs: Provide hook to correctly label newly created files
During a new file creation we need to make sure new file is created with the
right label. New file is created in upper/ so effectively file should get
label as if task had created file in upper/.

We switched to mounter's creds for actual file creation. Also if there is a
whiteout present, then file will be created in work/ dir first and then
renamed in upper. In none of the cases file will be labeled as we want it to
be.

This patch introduces a new hook dentry_create_files_as(), which determines
the label/context dentry will get if it had been created by task in upper
and modify passed set of creds appropriately. Caller makes use of these new
creds for file creation.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fix whitespace issues found with checkpatch.pl]
[PM: changes to use stat->mode in ovl_create_or_link()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:46:46 -04:00
Vivek Goyal
c957f6df52 selinux: Pass security pointer to determine_inode_label()
Right now selinux_determine_inode_label() works on security pointer of
current task. Soon I need this to work on a security pointer retrieved
from a set of creds. So start passing in a pointer and caller can
decide where to fetch security pointer from.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:45:29 -04:00
Vivek Goyal
19472b69d6 selinux: Implementation for inode_copy_up_xattr() hook
When a file is copied up in overlay, we have already created file on
upper/ with right label and there is no need to copy up selinux
label/xattr from lower file to upper file. In fact in case of context
mount, we don't want to copy up label as newly created file got its label
from context= option.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:43:59 -04:00
Vivek Goyal
121ab822ef security,overlayfs: Provide security hook for copy up of xattrs for overlay file
Provide a security hook which is called when xattrs of a file are being
copied up. This hook is called once for each xattr and LSM can return
0 if the security module wants the xattr to be copied up, 1 if the
security module wants the xattr to be discarded on the copy, -EOPNOTSUPP
if the security module does not handle/manage the xattr, or a -errno
upon an error.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup for checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:42:13 -04:00
Vivek Goyal
56909eb3f5 selinux: Implementation for inode_copy_up() hook
A file is being copied up for overlay file system. Prepare a new set of
creds and set create_sid appropriately so that new file is created with
appropriate label.

Overlay inode has right label for both context and non-context mount
cases. In case of non-context mount, overlay inode will have the label
of lower file and in case of context mount, overlay inode will have
the label from context= mount option.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:41:52 -04:00
Vivek Goyal
d8ad8b4961 security, overlayfs: provide copy up security hook for unioned files
Provide a security hook to label new file correctly when a file is copied
up from lower layer to upper layer of a overlay/union mount.

This hook can prepare a new set of creds which are suitable for new file
creation during copy up. Caller will use new creds to create file and then
revert back to old creds and release new creds.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup to appease checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:06:53 -04:00
Linus Torvalds
1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
William Roberts
8b31f456c7 selinux: print leading 0x on ioctlcmd audits
ioctlcmd is currently printing hex numbers, but their is no leading
0x. Thus things like ioctlcmd=1234 are misleading, as the base is
not evident.

Correct this by adding 0x as a prefix, so ioctlcmd=1234 becomes
ioctlcmd=0x1234.

Signed-off-by: William Roberts <william.c.roberts@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 13:08:34 -04:00
Javier Martinez Canillas
1a93a6eac3 security: Use IS_ENABLED() instead of checking for built-in or module
The IS_ENABLED() macro checks if a Kconfig symbol has been enabled
either built-in or as a module, use that macro instead of open coding
the same.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 13:08:25 -04:00
Linus Torvalds
835c92d43b Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull qstr constification updates from Al Viro:
 "Fairly self-contained bunch - surprising lot of places passes struct
  qstr * as an argument when const struct qstr * would suffice; it
  complicates analysis for no good reason.

  I'd prefer to feed that separately from the assorted fixes (those are
  in #for-linus and with somewhat trickier topology)"

* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qstr: constify instances in adfs
  qstr: constify instances in lustre
  qstr: constify instances in f2fs
  qstr: constify instances in ext2
  qstr: constify instances in vfat
  qstr: constify instances in procfs
  qstr: constify instances in fuse
  qstr constify instances in fs/dcache.c
  qstr: constify instances in nfs
  qstr: constify instances in ocfs2
  qstr: constify instances in autofs4
  qstr: constify instances in hfs
  qstr: constify instances in hfsplus
  qstr: constify instances in logfs
  qstr: constify dentry_init_security
2016-08-06 09:49:02 -04:00
Linus Torvalds
7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds
a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Linus Torvalds
6784725ab0 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  Probably the most interesting part long-term is ->d_init() - that will
  have a bunch of followups in (at least) ceph and lustre, but we'll
  need to sort the barrier-related rules before it can get used for
  really non-trivial stuff.

  Another fun thing is the merge of ->d_iput() callers (dentry_iput()
  and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
  except the one in __d_lookup_lru())"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  fs/dcache.c: avoid soft-lockup in dput()
  vfs: new d_init method
  vfs: Update lookup_dcache() comment
  bdev: get rid of ->bd_inodes
  Remove last traces of ->sync_page
  new helper: d_same_name()
  dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
  vfs: clean up documentation
  vfs: document ->d_real()
  vfs: merge .d_select_inode() into .d_real()
  unify dentry_iput() and dentry_unlink_inode()
  binfmt_misc: ->s_root is not going anywhere
  drop redundant ->owner initializations
  ufs: get rid of redundant checks
  orangefs: constify inode_operations
  missed comment updates from ->direct_IO() prototype change
  file_inode(f)->i_mapping is f->f_mapping
  trim fsnotify hooks a bit
  9p: new helper - v9fs_parent_fid()
  debugfs: ->d_parent is never NULL or negative
  ...
2016-07-28 12:59:05 -07:00
Linus Torvalds
554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
Arnd Bergmann
7616ac70d1 apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
The newly added Kconfig option could never work and just causes a build error
when disabled:

security/apparmor/lsm.c:675:25: error: 'CONFIG_SECURITY_APPARMOR_HASH_DEFAULT' undeclared here (not in a function)
 bool aa_g_hash_policy = CONFIG_SECURITY_APPARMOR_HASH_DEFAULT;

The problem is that the macro undefined in this case, and we need to use the IS_ENABLED()
helper to turn it into a boolean constant.

Another minor problem with the original patch is that the option is even offered
in sysfs when SECURITY_APPARMOR_HASH is not enabled, so this also hides the option
in that case.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6059f71f1e ("apparmor: add parameter to control whether policy hashing is used")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-27 17:39:26 +10:00
Kees Cook
f5509cc18d mm: Hardened usercopy
This is the start of porting PAX_USERCOPY into the mainline kernel. This
is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The
work is based on code by PaX Team and Brad Spengler, and an earlier port
from Casey Schaufler. Additional non-slab page tests are from Rik van Riel.

This patch contains the logic for validating several conditions when
performing copy_to_user() and copy_from_user() on the kernel object
being copied to/from:
- address range doesn't wrap around
- address range isn't NULL or zero-allocated (with a non-zero copy size)
- if on the slab allocator:
  - object size must be less than or equal to copy size (when check is
    implemented in the allocator, which appear in subsequent patches)
- otherwise, object must not span page allocations (excepting Reserved
  and CMA ranges)
- if on the stack
  - object must not extend before/after the current process stack
  - object must be contained by a valid stack frame (when there is
    arch/build support for identifying stack frames)
- object must not overlap with kernel text

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-26 14:41:47 -07:00
Linus Torvalds
bbce2ad2d7 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.8:

  API:
   - first part of skcipher low-level conversions
   - add KPP (Key-agreement Protocol Primitives) interface.

  Algorithms:
   - fix IPsec/cryptd reordering issues that affects aesni
   - RSA no longer does explicit leading zero removal
   - add SHA3
   - add DH
   - add ECDH
   - improve DRBG performance by not doing CTR by hand

  Drivers:
   - add x86 AVX2 multibuffer SHA256/512
   - add POWER8 optimised crc32c
   - add xts support to vmx
   - add DH support to qat
   - add RSA support to caam
   - add Layerscape support to caam
   - add SEC1 AEAD support to talitos
   - improve performance by chaining requests in marvell/cesa
   - add support for Araneus Alea I USB RNG
   - add support for Broadcom BCM5301 RNG
   - add support for Amlogic Meson RNG
   - add support Broadcom NSP SoC RNG"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
  crypto: vmx - Fix aes_p8_xts_decrypt build failure
  crypto: vmx - Ignore generated files
  crypto: vmx - Adding support for XTS
  crypto: vmx - Adding asm subroutines for XTS
  crypto: skcipher - add comment for skcipher_alg->base
  crypto: testmgr - Print akcipher algorithm name
  crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
  crypto: nx - off by one bug in nx_of_update_msc()
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: scatterwalk - Inline start/map/done
  crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
  crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: api - Optimise away crypto_yield when hard preemption is on
  crypto: scatterwalk - add no-copy support to copychunks
  crypto: scatterwalk - Remove scatterwalk_bytes_sglen
  crypto: omap - Stop using crypto scatterwalk_bytes_sglen
  crypto: skcipher - Remove top-level givcipher interface
  crypto: user - Remove crypto_lookup_skcipher call
  crypto: cts - Convert to skcipher
  ...
2016-07-26 13:40:17 -07:00
Al Viro
4f3ccd7657 qstr: constify dentry_init_security
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-20 23:30:06 -04:00
John Johansen
d4d03f74a7 apparmor: fix arg_size computation for when setprocattr is null terminated
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Vegard Nossum
e89b808132 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-12 08:43:10 -07:00
Heinrich Schuchardt
f4ee2def2d apparmor: do not expose kernel stack
Do not copy uninitalized fields th.td_hilen, th.td_data.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
58acf9d911 apparmor: fix module parameters can be changed after policy is locked
the policy_lock parameter is a one way switch that prevents policy
from being further modified. Unfortunately some of the module parameters
can effectively modify policy by turning off enforcement.

split policy_admin_capable into a view check and a full admin check,
and update the admin check to test the policy_lock parameter.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
5f20fdfed1 apparmor: fix oops in profile_unpack() when policy_db is not present
BugLink: http://bugs.launchpad.net/bugs/1592547

If unpack_dfa() returns NULL due to the dfa not being present,
profile_unpack() is not checking if the dfa is not present (NULL).

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
3197f5adf5 apparmor: don't check for vmalloc_addr if kvzalloc() failed
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
15756178c6 apparmor: add missing id bounds check on dfa verification
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Jeff Mahoney
ff118479a7 apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
While using AppArmor, SYS_CAP_RESOURCE is insufficient to call prlimit
on another task. The only other example of a AppArmor mediating access to
another, already running, task (ignoring fork+exec) is ptrace.

The AppArmor model for ptrace is that one of the following must be true:
1) The tracer is unconfined
2) The tracer is in complain mode
3) The tracer and tracee are confined by the same profile
4) The tracer is confined but has SYS_CAP_PTRACE

1), 2, and 3) are already true for setrlimit.

We can match the ptrace model just by allowing CAP_SYS_RESOURCE.

We still test the values of the rlimit since it can always be overridden
using a value that means unlimited for a particular resource.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Geliang Tang
38dbd7d8be apparmor: use list_next_entry instead of list_entry_next
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
de7c4cc947 apparmor: fix refcount race when finding a child profile
When finding a child profile via an rcu critical section, the profile
may be put and scheduled for deletion after the child is found but
before its refcount is incremented.

Protect against this by repeating the lookup if the profiles refcount
is 0 and is one its way to deletion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
0b938a2e2c apparmor: fix ref count leak when profile sha1 hash is read
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
23ca7b640b apparmor: check that xindex is in trans_table bounds
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f7da2de011 apparmor: ensure the target profile name is always audited
The target profile name was not being correctly audited in a few
cases because the target variable was not being set and gotos
passed the code to set it at apply:

Since it is always based on new_profile just drop the target var
and conditionally report based on new_profile.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
7ee6da25dc apparmor: fix audit full profile hname on successful load
Currently logging of a successful profile load only logs the basename
of the profile. This can result in confusion when a child profile has
the same name as the another profile in the set. Logging the hname
will ensure there is no confusion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bf15cf0c64 apparmor: fix log failures for all profiles in a set
currently only the profile that is causing the failure is logged. This
makes it more confusing than necessary about which profiles loaded
and which didn't. So make sure to log success and failure messages for
all profiles in the set being loaded.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f351841f8d apparmor: fix put() parent ref after updating the active ref
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
6059f71f1e apparmor: add parameter to control whether policy hashing is used
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bd35db8b8c apparmor: internal paths should be treated as disconnected
Internal mounts are not mounted anywhere and as such should be treated
as disconnected paths.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f2e561d190 apparmor: fix disconnected bind mnts reconnection
Bind mounts can fail to be properly reconnected when PATH_CONNECT is
specified. Ensure that when PATH_CONNECT is specified the path has
a root.

BugLink: http://bugs.launchpad.net/bugs/1319984

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
d671e89020 apparmor: fix update the mtime of the profile file on replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
9049a79221 apparmor: exec should not be returning ENOENT when it denies
The current behavior is confusing as it causes exec failures to report
the executable is missing instead of identifying that apparmor
caused the failure.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
b6b1b81b3a apparmor: fix uninitialized lsm_audit member
BugLink: http://bugs.launchpad.net/bugs/1268727

The task field in the lsm_audit struct needs to be initialized if
a change_hat fails, otherwise the following oops will occur

BUG: unable to handle kernel paging request at 0000002fbead7d08
IP: [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
PGD 1e3f35067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in: pppox crc_ccitt p8023 p8022 psnap llc ax25 btrfs raid6_pq xor xfs libcrc32c dm_multipath scsi_dh kvm_amd dcdbas kvm microcode amd64_edac_mod joydev edac_core psmouse edac_mce_amd serio_raw k10temp sp5100_tco i2c_piix4 ipmi_si ipmi_msghandler acpi_power_meter mac_hid lp parport hid_generic usbhid hid pata_acpi mpt2sas ahci raid_class pata_atiixp bnx2 libahci scsi_transport_sas [last unloaded: tipc]
CPU: 2 PID: 699 Comm: changehat_twice Tainted: GF          O 3.13.0-7-generic #25-Ubuntu
Hardware name: Dell Inc. PowerEdge R415/08WNM9, BIOS 1.8.6 12/06/2011
task: ffff8802135c6000 ti: ffff880212986000 task.ti: ffff880212986000
RIP: 0010:[<ffffffff8171153e>]  [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
RSP: 0018:ffff880212987b68  EFLAGS: 00010006
RAX: 0000000000020000 RBX: 0000002fbead7500 RCX: 0000000000000000
RDX: 0000000000000292 RSI: ffff880212987ba8 RDI: 0000002fbead7d08
RBP: ffff880212987b68 R08: 0000000000000246 R09: ffff880216e572a0
R10: ffffffff815fd677 R11: ffffea0008469580 R12: ffffffff8130966f
R13: ffff880212987ba8 R14: 0000002fbead7d08 R15: ffff8800d8c6b830
FS:  00002b5e6c84e7c0(0000) GS:ffff880216e40000(0000) knlGS:0000000055731700
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000002fbead7d08 CR3: 000000021270f000 CR4: 00000000000006e0
Stack:
 ffff880212987b98 ffffffff81075f17 ffffffff8130966f 0000000000000009
 0000000000000000 0000000000000000 ffff880212987bd0 ffffffff81075f7c
 0000000000000292 ffff880212987c08 ffff8800d8c6b800 0000000000000026
Call Trace:
 [<ffffffff81075f17>] __lock_task_sighand+0x47/0x80
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81075f7c>] do_send_sig_info+0x2c/0x80
 [<ffffffff81075fee>] send_sig_info+0x1e/0x30
 [<ffffffff8130242d>] aa_audit+0x13d/0x190
 [<ffffffff8130c1dc>] aa_audit_file+0xbc/0x130
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81304cc2>] aa_change_hat+0x202/0x530
 [<ffffffff81308fc6>] aa_setprocattr_changehat+0x116/0x1d0
 [<ffffffff8130a11d>] apparmor_setprocattr+0x25d/0x300
 [<ffffffff812cee56>] security_setprocattr+0x16/0x20
 [<ffffffff8121fc87>] proc_pid_attr_write+0x107/0x130
 [<ffffffff811b7604>] vfs_write+0xb4/0x1f0
 [<ffffffff811b8039>] SyS_write+0x49/0xa0
 [<ffffffff8171a1bf>] tracesys+0xe1/0xe6

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
ec34fa24a9 apparmor: fix replacement bug that adds new child to old parent
When set atomic replacement is used and the parent is updated before the
child, and the child did not exist in the old parent so there is no
direct replacement then the new child is incorrectly added to the old
parent. This results in the new parent not having the child(ren) that
it should and the old parent when being destroyed asserting the
following error.

AppArmor: policy_destroy: internal error, policy '<profile/name>' still
contains profiles

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
dcda617a0c apparmor: fix refcount bug in profile replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
James Morris
e1e5fa9616 Merge tag 'keys-misc-20160708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-07-09 12:49:00 +10:00
James Morris
c632809953 Merge branch 'smack-for-4.8' of https://github.com/cschaufler/smack-next into next 2016-07-08 10:41:15 +10:00
Vegard Nossum
30a46a4647 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-08 10:26:25 +10:00
James Morris
d011a4d861 Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next 2016-07-07 10:15:34 +10:00
Seth Forshee
0b3c9761d1 evm: Translate user/group ids relative to s_user_ns when computing HMAC
The EVM HMAC should be calculated using the on disk user and
group ids, so the k[ug]ids in the inode must be translated
relative to the s_user_ns of the inode's super block.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-07-05 15:13:10 -05:00
Al Viro
b223f4e215 Merge branch 'd_real' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into work.misc 2016-06-30 23:34:49 -04:00
Eric Richter
544e1cea03 ima: extend the measurement entry specific pcr
Extend the PCR supplied as a parameter, instead of assuming that the
measurement entry uses the default configured PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
a422638d49 ima: change integrity cache to store measured pcr
IMA avoids re-measuring files by storing the current state as a flag in
the integrity cache. It will then skip adding a new measurement log entry
if the cache reports the file as already measured.

If a policy measures an already measured file to a new PCR, the measurement
will not be added to the list. This patch implements a new bitfield for
specifying which PCR the file was measured into, rather than if it was
measured.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
67696f6d79 ima: redefine duplicate template entries
Template entry duplicates are prevented from being added to the
measurement list by checking a hash table that contains the template
entry digests. However, the PCR value is not included in this comparison,
so duplicate template entry digests with differing PCRs may be dropped.

This patch redefines duplicate template entries as template entries with
the same digest and same PCR values.

Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
5f6f027b50 ima: change ima_measurements_show() to display the entry specific pcr
IMA assumes that the same default Kconfig PCR is extended for each
entry. This patch replaces the default configured PCR with the policy
defined PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
14b1da85bb ima: include pcr for each measurement log entry
The IMA measurement list entries include the Kconfig defined PCR value.
This patch defines a new ima_template_entry field for including the PCR
as specified in the policy rule.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
725de7fabb ima: extend ima_get_action() to return the policy pcr
Different policy rules may extend different PCRs. This patch retrieves
the specific PCR for the matched rule.  Subsequent patches will include
the rule specific PCR in the measurement list and extend the appropriate
PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
0260643ce8 ima: add policy support for extending different pcrs
This patch defines a new IMA measurement policy rule option "pcr=",
which allows extending different PCRs on a per rule basis. For example,
the system independent files could extend the default IMA Kconfig
specified PCR, while the system dependent files could extend a different
PCR.

The following is an example of this usage with an SELinux policy; the
rule would extend PCR 11 with system configuration files:

  measure func=FILE_CHECK mask=MAY_READ obj_type=system_conf_t pcr=11

Changelog v3:
- FIELD_SIZEOF returns bytes, not bits. Fixed INVALID_PCR

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
96d450bbec integrity: add measured_pcrs field to integrity cache
To keep track of which measurements have been extended to which PCRs, this
patch defines a new integrity_iint_cache field named measured_pcrs. This
field is a bitmask of the PCRs measured. Each bit corresponds to a PCR
index. For example, bit 10 corresponds to PCR 10.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:19 -04:00
Huw Davies
4fee5242bf calipso: Add a label cache.
This works in exactly the same way as the CIPSO label cache.
The idea is to allow the lsm to cache the result of a secattr
lookup so that it doesn't need to perform the lookup for
every skbuff.

It introduces two sysctl controls:
 calipso_cache_enable - enables/disables the cache.
 calipso_cache_bucket_size - sets the size of a cache bucket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:17 -04:00
Huw Davies
a04e71f631 netlabel: Pass a family parameter to netlbl_skbuff_err().
This makes it possible to route the error to the appropriate
labelling engine.  CALIPSO is far less verbose than CIPSO
when encountering a bogus packet, so there is no need for a
CALIPSO error handler.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:16 -04:00
Huw Davies
2917f57b6b calipso: Allow the lsm to label the skbuff directly.
In some cases, the lsm needs to add the label to the skbuff directly.
A NF_INET_LOCAL_OUT IPv6 hook is added to selinux to match the IPv4
behaviour.  This allows selinux to label the skbuffs that it requires.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:15 -04:00
Huw Davies
e1adea9270 calipso: Allow request sockets to be relabelled by the lsm.
Request sockets need to have a label that takes into account the
incoming connection as well as their parent's label.  This is used
for the outgoing SYN-ACK and for their child full-socket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:29 -04:00
Huw Davies
1f440c99d3 netlabel: Prevent setsockopt() from changing the hop-by-hop option.
If a socket has a netlabel in place then don't let setsockopt() alter
the socket's IPv6 hop-by-hop option.  This is in the same spirit as
the existing check for IPv4.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:27 -04:00
Huw Davies
ceba1832b1 calipso: Set the calipso socket label to match the secattr.
CALIPSO is a hop-by-hop IPv6 option.  A lot of this patch is based on
the equivalent CISPO code.  The main difference is due to manipulating
the options in the hop-by-hop header.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:02:51 -04:00
Seth Forshee
aad82892af selinux: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.

Pieced together from code snippets provided by Stephen Smalley.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:54 -05:00
Seth Forshee
809c02e091 Smack: Handle labels consistently in untrusted mounts
The SMACK64, SMACK64EXEC, and SMACK64MMAP labels are all handled
differently in untrusted mounts. This is confusing and
potentically problematic. Change this to handle them all the same
way that SMACK64 is currently handled; that is, read the label
from disk and check it at use time. For SMACK64 and SMACK64MMAP
access is denied if the label does not match smk_root. To be
consistent with suid, a SMACK64EXEC label which does not match
smk_root will still allow execution of the file but will not run
with the label supplied in the xattr.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:22 -05:00
Seth Forshee
9f50eda2a9 Smack: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts cannot be trusted.
Ideally for these mounts we would assign the objects in the
filesystem the same label as the inode for the backing device
passed to mount. Unfortunately it's currently impossible to
determine which inode this is from the LSM mount hooks, so we
settle for the label of the process doing the mount.

This label is assigned to s_root, and also to smk_default to
ensure that new inodes receive this label. The transmute property
is also set on s_root to make this behavior more explicit, even
though it is technically not necessary.

If a filesystem has existing security labels, access to inodes is
permitted if the label is the same as smk_root, otherwise access
is denied. The SMACK64EXEC xattr is completely ignored.

Explicit setting of security labels continues to require
CAP_MAC_ADMIN in init_user_ns.

Altogether, this ensures that filesystem objects are not
accessible to subjects which cannot already access the backing
store, that MAC is not violated for any objects in the fileystem
which are already labeled, and that a user cannot use an
unprivileged mount to gain elevated MAC privileges.

sysfs, tmpfs, and ramfs are already mountable from user
namespaces and support security labels. We can't rule out the
possibility that these filesystems may already be used in mounts
from user namespaces with security lables set from the init
namespace, so failing to trust lables in these filesystems may
introduce regressions. It is safe to trust labels from these
filesystems, since the unprivileged user does not control the
backing store and thus cannot supply security labels, so an
explicit exception is made to trust labels from these
filesystems.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:42 -05:00
Andy Lutomirski
380cf5ba6b fs: Treat foreign mounts as nosuid
If a process gets access to a mount from a different user
namespace, that process should not be able to take advantage of
setuid files or selinux entrypoints from that filesystem.  Prevent
this by treating mounts from other mount namespaces and those not
owned by current_user_ns() or an ancestor as nosuid.

This will make it safer to allow more complex filesystems to be
mounted in non-root user namespaces.

This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
setgid, and file capability bits can no longer be abused if code in
a user namespace were to clear nosuid on an untrusted filesystem,
but this patch, by itself, is insufficient to protect the system
from abuse of files that, when execed, would increase MAC privilege.

As a more concrete explanation, any task that can manipulate a
vfsmount associated with a given user namespace already has
capabilities in that namespace and all of its descendents.  If they
can cause a malicious setuid, setgid, or file-caps executable to
appear in that mount, then that executable will only allow them to
elevate privileges in exactly the set of namespaces in which they
are already privileges.

On the other hand, if they can cause a malicious executable to
appear with a dangerous MAC label, running it could change the
caller's security context in a way that should not have been
possible, even inside the namespace in which the task is confined.

As a hardening measure, this would have made CVE-2014-5207 much
more difficult to exploit.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:41 -05:00
Seth Forshee
d07b846f62 fs: Limit file caps to the user namespace of the super block
Capability sets attached to files must be ignored except in the
user namespaces where the mounter is privileged, i.e. s_user_ns
and its descendants. Otherwise a vector exists for gaining
privileges in namespaces where a user is not already privileged.

Add a new helper function, current_in_user_ns(), to test whether a user
namespace is the same as or a descendant of another namespace.
Use this helper to determine whether a file's capability set
should be applied to the caps constructed during exec.

--EWB Replaced in_userns with the simpler current_in_userns.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:31 -05:00
Herbert Xu
d56d72c6a0 KEYS: Use skcipher for big keys
This patch replaces use of the obsolete blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David Howells <dhowells@redhat.com>
2016-06-24 21:24:58 +08:00
Dan Carpenter
38327424b4 KEYS: potential uninitialized variable
If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does including a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

	echo 80 >/proc/sys/kernel/keys/root_maxbytes
	keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

	kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
	------------[ cut here ]------------
	kernel BUG at ../mm/slab.c:2821!
	...
	RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
	RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
	RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
	RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
	RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
	R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
	R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
	...
	Call Trace:
	  kfree+0xde/0x1bc
	  assoc_array_cancel_edit+0x1f/0x36
	  __key_link_end+0x55/0x63
	  key_reject_and_link+0x124/0x155
	  keyctl_reject_key+0xb6/0xe0
	  keyctl_negate_key+0x10/0x12
	  SyS_keyctl+0x9f/0xe7
	  do_syscall_64+0x63/0x13a
	  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: f70e2e0619 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-16 17:15:04 -10:00
Heinrich Schuchardt
309c5fad5d selinux: fix type mismatch
avc_cache_threshold is of type unsigned int.  Do not use a signed
new_value in sscanf(page, "%u", &new_value).

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
[PM: subject prefix fix, description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-15 16:20:28 -04:00
David Howells
965475acca KEYS: Strip trailing spaces
Strip some trailing spaces.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-14 10:29:44 +01:00
Linus Torvalds
8387ff2577 vfs: make the string hashes salt the hash
We always mixed in the parent pointer into the dentry name hash, but we
did it late at lookup time.  It turns out that we can simplify that
lookup-time action by salting the hash with the parent pointer early
instead of late.

A few other users of our string hashes also wanted to mix in their own
pointers into the hash, and those are updated to use the same mechanism.

Hash users that don't have any particular initial salt can just use the
NULL pointer as a no-salt.

Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 20:21:46 -07:00
Paul Moore
8bebe88c09 selinux: import NetLabel category bitmaps correctly
The existing ebitmap_netlbl_import() code didn't correctly handle the
case where the ebitmap_node was not aligned/sized to a power of two,
this patch fixes this (on x86_64 ebitmap_node contains six bitmaps
making a range of 0..383).

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-09 10:40:37 -04:00