Like direct file execution (e.g. ./script.sh), indirect file execution
(e.g. sh script.sh) needs to be measured and appraised. Instantiate
the new security_bprm_creds_for_exec() hook to measure and verify the
indirect file's integrity. Unlike direct file execution, indirect file
execution is optionally enforced by the interpreter.
Differentiate kernel and userspace enforced integrity audit messages.
Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-9-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
The new SECBIT_EXEC_RESTRICT_FILE, SECBIT_EXEC_DENY_INTERACTIVE, and
their *_LOCKED counterparts are designed to be set by processes setting
up an execution environment, such as a user session, a container, or a
security sandbox. Unlike other securebits, these ones can be set by
unprivileged processes. Like seccomp filters or Landlock domains, the
securebits are inherited across processes.
When SECBIT_EXEC_RESTRICT_FILE is set, programs interpreting code should
control executable resources according to execveat(2) + AT_EXECVE_CHECK
(see previous commit).
When SECBIT_EXEC_DENY_INTERACTIVE is set, a process should deny
execution of user interactive commands (which excludes executable
regular files).
Being able to configure each of these securebits enables system
administrators or owner of image containers to gradually validate the
related changes and to identify potential issues (e.g. with interpreter
or audit logs).
It should be noted that unlike other security bits, the
SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE bits are
dedicated to user space willing to restrict itself. Because of that,
they only make sense in the context of a trusted environment (e.g.
sandbox, container, user session, full system) where the process
changing its behavior (according to these bits) and all its parent
processes are trusted. Otherwise, any parent process could just execute
its own malicious code (interpreting a script or not), or even enforce a
seccomp filter to mask these bits.
Such a secure environment can be achieved with an appropriate access
control (e.g. mount's noexec option, file access rights, LSM policy) and
an enlighten ld.so checking that libraries are allowed for execution
e.g., to protect against illegitimate use of LD_PRELOAD.
Ptrace restrictions according to these securebits would not make sense
because of the processes' trust assumption.
Scripts may need some changes to deal with untrusted data (e.g. stdin,
environment variables), but that is outside the scope of the kernel.
See chromeOS's documentation about script execution control and the
related threat model:
https://www.chromium.org/chromium-os/developer-library/guides/security/noexec-shell-scripts/
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-3-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
Add a new AT_EXECVE_CHECK flag to execveat(2) to check if a file would
be allowed for execution. The main use case is for script interpreters
and dynamic linkers to check execution permission according to the
kernel's security policy. Another use case is to add context to access
logs e.g., which script (instead of interpreter) accessed a file. As
any executable code, scripts could also use this check [1].
This is different from faccessat(2) + X_OK which only checks a subset of
access rights (i.e. inode permission and mount options for regular
files), but not the full context (e.g. all LSM access checks). The main
use case for access(2) is for SUID processes to (partially) check access
on behalf of their caller. The main use case for execveat(2) +
AT_EXECVE_CHECK is to check if a script execution would be allowed,
according to all the different restrictions in place. Because the use
of AT_EXECVE_CHECK follows the exact kernel semantic as for a real
execution, user space gets the same error codes.
An interesting point of using execveat(2) instead of openat2(2) is that
it decouples the check from the enforcement. Indeed, the security check
can be logged (e.g. with audit) without blocking an execution
environment not yet ready to enforce a strict security policy.
LSMs can control or log execution requests with
security_bprm_creds_for_exec(). However, to enforce a consistent and
complete access control (e.g. on binary's dependencies) LSMs should
restrict file executability, or measure executed files, with
security_file_open() by checking file->f_flags & __FMODE_EXEC.
Because AT_EXECVE_CHECK is dedicated to user space interpreters, it
doesn't make sense for the kernel to parse the checked files, look for
interpreters known to the kernel (e.g. ELF, shebang), and return ENOEXEC
if the format is unknown. Because of that, security_bprm_check() is
never called when AT_EXECVE_CHECK is used.
It should be noted that script interpreters cannot directly use
execveat(2) (without this new AT_EXECVE_CHECK flag) because this could
lead to unexpected behaviors e.g., `python script.sh` could lead to Bash
being executed to interpret the script. Unlike the kernel, script
interpreters may just interpret the shebang as a simple comment, which
should not change for backward compatibility reasons.
Because scripts or libraries files might not currently have the
executable permission set, or because we might want specific users to be
allowed to run arbitrary scripts, the following patch provides a dynamic
configuration mechanism with the SECBIT_EXEC_RESTRICT_FILE and
SECBIT_EXEC_DENY_INTERACTIVE securebits.
This is a redesign of the CLIP OS 4's O_MAYEXEC:
f5cb330d6b/1901_open_mayexec.patch
This patch has been used for more than a decade with customized script
interpreters. Some examples can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Link: https://docs.python.org/3/library/io.html#io.open_code [1]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-2-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmdiU+EUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP/Dg//X14XikP3UB0OcVRFkG3etPuUTf0L
gCTDvPcv+Ck4T1AVhYgyPnZCjkuzvIWeqPMPcSOpUmgeJb9x3pPAB1pJSJnrhAoE
3VmOmyalxnj/weboKwFLHRgEBN+gYe1J+fchFkQjGJQF+LzZ3I4jk/FARhYzE2UY
gy/WVKS68MWK/RwED4Hc4c+ZJ/fM27bc3QPLB3C62J9qlQI4p+4XIRNrcfqYYvah
X+Gd0oKMpRF6evHfx7LujWq+e9fZv5ZaGrRDRUwTTmdyWK2+iFKfQw1x24ijw3Iq
0xrj8XR1O8nVd+FWo78mSEax+YXa8UY/WbQlTC1IxlN1lETshVGlQPz7QYV0yOpu
FH47UhXDN2fPHGnMQRbSZf7d8GhOmEBEpms7xll5mDKQnx78Cqxp+xL7BzMCRMyK
ktO8HPyQcxlKMAIrNStvA9xYWcbXf6PhNfogKln9hAiUyJBeEAMEQWp/tz2r1IHw
yl78ZsbL3bNOjlk4K7G9w1qqiHjo7DDPgvzE7bTi2yolG/QX4iUIbAeEUAKqxKtl
qn7R+GGIy/oijSohbkxIPDlf93dzQfMG8QzWN+Z/WZ4NtbdDQglZD6F3ediPNPvP
RpmabcXBEK4TKnHzwWx1fsxd256OzrWI3QF5bJaEQ2u+R4RIJGmPjz27xiXZiXyb
oheacqtiYnAyJQU=
=LS+v
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small SELinux patch to get rid improve our handling of unknown
extended permissions by safely ignoring them.
Not only does this make it easier to support newer SELinux policy
on older kernels in the future, it removes to BUG() calls from the
SELinux code."
* tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: ignore unknown extended permissions
Fedora 41 has reached Linux 6.12 kernel with TOMOYO enabled. I observed
that /usr/lib/systemd/systemd executes /usr/lib/systemd/systemd-executor
by passing dirfd == 9 or dirfd == 16 upon execveat().
Commit ada1986d07 ("tomoyo: fallback to realpath if symlink's pathname
does not exist") used realpath only if symlink's pathname does not exist.
But an out of tree patch suggested that it will be reasonable to always
use realpath if symlink's pathname refers to proc filesystem.
Therefore, this patch changes the pathname used for checking "file execute"
and the domainname used after a successful execve() request.
Before:
<kernel> /usr/lib/systemd/systemd
file execute proc:/self/fd/16 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"
file execute proc:/self/fd/9 exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"
<kernel> /usr/lib/systemd/systemd proc:/self/fd/16
file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd"
<kernel> /usr/lib/systemd/systemd proc:/self/fd/16 /usr/sbin/auditd
<kernel> /usr/lib/systemd/systemd proc:/self/fd/9
file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl"
<kernel> /usr/lib/systemd/systemd proc:/self/fd/9 /usr/bin/systemctl
After:
<kernel> /usr/lib/systemd/systemd
file execute /usr/lib/systemd/systemd-executor exec.realpath="/usr/lib/systemd/systemd-executor" exec.argv[0]="/usr/lib/systemd/systemd-executor"
<kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor
file execute /usr/bin/systemctl exec.realpath="/usr/bin/systemctl" exec.argv[0]="/usr/bin/systemctl"
file execute /usr/sbin/auditd exec.realpath="/usr/sbin/auditd" exec.argv[0]="/usr/sbin/auditd"
<kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/bin/systemctl
<kernel> /usr/lib/systemd/systemd /usr/lib/systemd/systemd-executor /usr/sbin/auditd
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
free_task() already calls bpf_task_storage_free(). It is not necessary
to call it again on security_task_free(). Remove the hook.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://patch.msgid.link/20241212075956.2614894-1-song@kernel.org
syzbot is reporting too large allocation warning at tomoyo_write_control(),
for one can write a very very long line without new line character. To fix
this warning, I use __GFP_NOWARN rather than checking for KMALLOC_MAX_SIZE,
for practically a valid line should be always shorter than 32KB where the
"too small to fail" memory-allocation rule applies.
One might try to write a valid line that is longer than 32KB, but such
request will likely fail with -ENOMEM. Therefore, I feel that separately
returning -EINVAL when a line is longer than KMALLOC_MAX_SIZE is redundant.
There is no need to distinguish over-32KB and over-KMALLOC_MAX_SIZE.
Reported-by: syzbot+7536f77535e5210a5c76@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7536f77535e5210a5c76
Reported-by: Leo Stone <leocstone@gmail.com>
Closes: https://lkml.kernel.org/r/20241216021459.178759-2-leocstone@gmail.com
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
When evaluating extended permissions, ignore unknown permissions instead
of calling BUG(). This commit ensures that future permissions can be
added without interfering with older kernels.
Cc: stable@vger.kernel.org
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add a new audit message type to capture nlmsg-related information. This
is similar to LSM_AUDIT_DATA_IOCTL_OP which was added for the other
SELinux extended permission (ioctl).
Adding a new type is preferred to adding to the existing
lsm_network_audit structure which contains irrelevant information for
the netlink sockets (i.e., dport, sport).
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
[PM: change "nlnk-msgtype" to "nl-msgtype" as discussed]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add support for extended permission rules in conditional policies.
Currently the kernel accepts such rules already, but evaluating a
security decision will hit a BUG() in
services_compute_xperms_decision(). Thus reject extended permission
rules in conditional policies for current policy versions.
Add a new policy version for this feature.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Tested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Check sk->sk_protocol instead of security class to recognize SCTP socket.
SCTP socket is initialized with SECCLASS_SOCKET class if policy does not
support EXTSOCKCLASS capability. In this case bind(2) hook wrongfully
return EAFNOSUPPORT instead of EINVAL.
The inconsistency was detected with help of Landlock tests:
https://lore.kernel.org/all/b58680ca-81b2-7222-7287-0ac7f4227c3c@huawei-partners.com/
Fixes: 0f8db8cc73 ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()")
Signed-off-by: Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Use types for iterators equal to the type of the to be compared values.
Reported by clang:
../ss/sidtab.c:126:2: warning: comparison of integers of different
signs: 'int' and 'unsigned long'
126 | hash_for_each_rcu(sidtab->context_to_sid, i, entry, list) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../hashtable.h:139:51: note: expanded from macro 'hash_for_each_rcu'
139 | for (... ; obj == NULL && (bkt) < HASH_SIZE(name);\
| ~~~ ^ ~~~~~~~~~~~~~~~
../selinuxfs.c:1520:23: warning: comparison of integers of different
signs: 'int' and 'unsigned int'
1520 | for (cpu = *idx; cpu < nr_cpu_ids; ++cpu) {
| ~~~ ^ ~~~~~~~~~~
../hooks.c:412:16: warning: comparison of integers of different signs:
'int' and 'unsigned long'
412 | for (i = 0; i < ARRAY_SIZE(tokens); i++) {
| ~ ^ ~~~~~~~~~~~~~~~~~~
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: munged the clang output due to line length concerns]
Signed-off-by: Paul Moore <paul@paul-moore.com>
av_permissions.h was not declared as a target and therefore not cleaned
up automatically by kbuild.
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/lkml/CAK7LNATUnCPt03BRFSKh1EH=+Sy0Q48wE4ER0BZdJqOb_44L8w@mail.gmail.com/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The new FS_PRE_ACCESS permission event is similar to FS_ACCESS_PERM,
but it meant for a different use case of filling file content before
access to a file range, so it has slightly different semantics.
Generate FS_PRE_ACCESS/FS_ACCESS_PERM as two seperate events, so content
scanners could inspect the content filled by pre-content event handler.
Unlike FS_ACCESS_PERM, FS_PRE_ACCESS is also called before a file is
modified by syscalls as write() and fallocate().
FS_ACCESS_PERM is reported also on blockdev and pipes, but the new
pre-content events are only reported for regular files and dirs.
The pre-content events are meant to be used by hierarchical storage
managers that want to fill the content of files on first access.
There are some specific requirements from filesystems that could
be used with pre-content events, so add a flag for fs to opt-in
for pre-content events explicitly before they can be used.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/b934c5e3af205abc4e0e4709f6486815937ddfdf.1731684329.git.josef@toxicpanda.com
Current release - regressions:
- rtnetlink: fix double call of rtnl_link_get_net_ifla()
- tcp: populate XPS related fields of timewait sockets
- ethtool: fix access to uninitialized fields in set RXNFC command
- selinux: use sk_to_full_sk() in selinux_ip_output()
Current release - new code bugs:
- net: make napi_hash_lock irq safe
- eth: bnxt_en: support header page pool in queue API
- eth: ice: fix NULL pointer dereference in switchdev
Previous releases - regressions:
- core: fix icmp host relookup triggering ip_rt_bug
- ipv6:
- avoid possible NULL deref in modify_prefix_route()
- release expired exception dst cached in socket
- smc: fix LGR and link use-after-free issue
- hsr: avoid potential out-of-bound access in fill_frame_info()
- can: hi311x: fix potential use-after-free
- eth: ice: fix VLAN pruning in switchdev mode
Previous releases - always broken:
- netfilter:
- ipset: hold module reference while requesting a module
- nft_inner: incorrect percpu area handling under softirq
- can: j1939: fix skb reference counting
- eth: mlxsw: use correct key block on Spectrum-4
- eth: mlx5: fix memory leak in mlx5hws_definer_calc_layout
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmdRve4SHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOk/TUQAItDVkTQDiAUUqd0TDH2SSnboxXozcMF
fKVrdWulFAOe7qcajUqjkzvTDjAOw9Xbfh8rEiFBLaXmUzSUT2eqbm2VahvWIR5/
k08v7fTTuCNzEwhnbQlsZ47Nd26LJVPwOvbtE/4V8d50pU/serjWuI/74tUCWAjn
DQOjyyqRjKgKKY+WWCQ6g4tVpD//DB4Sj15GiE3MhlW1f08AAnPJSe2oTaq0RZBG
nXo7abOGn8x3RYilrlp/ZwWYuNpVI4oF+qmp+t/46NV+7ER1JgrC97k0kFyyCYVD
g7vBvFjvA7vDmiuzfsOW2n7IRdcfBtkfi8UJYOIvVgJg7KDF0DXoxj3qD4FagI35
RWRMJw+PoNlFFkPprlp0we/19jPJWOO6rx+AOMEQD78jrH7NoFQ/+eeBf/nppfjy
wX0LD1aQsgPk2ju0I8GbcM/qaJ81EJUiYYVLJHieH0+vvmqts8cMDRLZf5t08EHa
myXcRGB9N8gjguGp5mdR5KmtY82zASlNC0PbDp3nlPYc3e/opCmMRYGjBohO+vqn
n7u250WThPwiBtOwYmSbcK7zpS034/VX0ufnTT2X3MWnFGGNDv6XVmho/OBuCHqJ
m/EiJo/D9qVGAIql/vAxN9a4lQKrcZgaMzCyEPYCf7rtLmzx7sfyHWbf/GZlUAU/
9dUUfWqCbZcL
=nkPk
-----END PGP SIGNATURE-----
Merge tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from can and netfilter.
Current release - regressions:
- rtnetlink: fix double call of rtnl_link_get_net_ifla()
- tcp: populate XPS related fields of timewait sockets
- ethtool: fix access to uninitialized fields in set RXNFC command
- selinux: use sk_to_full_sk() in selinux_ip_output()
Current release - new code bugs:
- net: make napi_hash_lock irq safe
- eth:
- bnxt_en: support header page pool in queue API
- ice: fix NULL pointer dereference in switchdev
Previous releases - regressions:
- core: fix icmp host relookup triggering ip_rt_bug
- ipv6:
- avoid possible NULL deref in modify_prefix_route()
- release expired exception dst cached in socket
- smc: fix LGR and link use-after-free issue
- hsr: avoid potential out-of-bound access in fill_frame_info()
- can: hi311x: fix potential use-after-free
- eth: ice: fix VLAN pruning in switchdev mode
Previous releases - always broken:
- netfilter:
- ipset: hold module reference while requesting a module
- nft_inner: incorrect percpu area handling under softirq
- can: j1939: fix skb reference counting
- eth:
- mlxsw: use correct key block on Spectrum-4
- mlx5: fix memory leak in mlx5hws_definer_calc_layout"
* tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
net :mana :Request a V2 response version for MANA_QUERY_GF_STAT
net: avoid potential UAF in default_operstate()
vsock/test: verify socket options after setting them
vsock/test: fix parameter types in SO_VM_SOCKETS_* calls
vsock/test: fix failures due to wrong SO_RCVLOWAT parameter
net/mlx5e: Remove workaround to avoid syndrome for internal port
net/mlx5e: SD, Use correct mdev to build channel param
net/mlx5: E-Switch, Fix switching to switchdev mode in MPV
net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled
net/mlx5: HWS: Properly set bwc queue locks lock classes
net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout
bnxt_en: handle tpa_info in queue API implementation
bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap()
bnxt_en: refactor tpa_info alloc/free into helpers
geneve: do not assume mac header is set in geneve_xmit_skb()
mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4
ethtool: Fix wrong mod state in case of verbose and no_mask bitset
ipmr: tune the ipmr_can_free_table() checks.
netfilter: nft_set_hash: skip duplicated elements pending gc run
netfilter: ipset: Hold module reference while requesting a module
...
In cases where we want a stable way to observe/trace
cap_capable (e.g. protection from inlining and API updates)
add a tracepoint that passes:
- The credentials used
- The user namespace of the resource being accessed
- The user namespace in which the credential provides the
capability to access the targeted resource
- The capability to check for
- The return value of the check
Signed-off-by: Jordan Rome <linux@jordanrome.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Link: https://lore.kernel.org/r/20241204155911.1817092-1-linux@jordanrome.com
Signed-off-by: Serge Hallyn <sergeh@kernel.org>
The cap_mmap_file() LSM callback returns the default value for the
security_mmap_file() LSM hook and can be safely removed.
Signed-off-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Serge Hallyn <sergeh@kernel.org>
Verify that the LSM releasing the secctx is the LSM that
allocated it. This was not necessary when only one LSM could
create a secctx, but once there can be more than one it is.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Replace the (secctx,seclen) pointer pair with a single lsm_context
pointer to allow return of the LSM identifier along with the context
and context length. This allows security_release_secctx() to know how
to release the context. Callers have been modified to use or save the
returned data from the new structure.
Cc: ceph-devel@vger.kernel.org
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change the security_inode_getsecctx() interface to fill a lsm_context
structure instead of data and length pointers. This provides
the information about which LSM created the context so that
security_release_secctx() can use the correct hook.
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Replace the (secctx,seclen) pointer pair with a single
lsm_context pointer to allow return of the LSM identifier
along with the context and context length. This allows
security_release_secctx() to know how to release the
context. Callers have been modified to use or save the
returned data from the new structure.
security_secid_to_secctx() and security_lsmproc_to_secctx()
will now return the length value on success instead of 0.
Cc: netdev@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: Todd Kjos <tkjos@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak, kdoc fix, signedness fix from Dan Carpenter]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add a new lsm_context data structure to hold all the information about a
"security context", including the string, its size and which LSM allocated
the string. The allocation information is necessary because LSMs have
different policies regarding the lifecycle of these strings. SELinux
allocates and destroys them on each use, whereas Smack provides a pointer
to an entry in a list that never goes away.
Update security_release_secctx() to use the lsm_context instead of a
(char *, len) pair. Change its callers to do likewise. The LSMs
supporting this hook have had comments added to remind the developer
that there is more work to be done.
The BPF security module provides all LSM hooks. While there has yet to
be a known instance of a BPF configuration that uses security contexts,
the possibility is real. In the existing implementation there is
potential for multiple frees in that case.
Cc: linux-integrity@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: linux-nfs@vger.kernel.org
Cc: Todd Kjos <tkjos@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
When utilized it dodges strlen() in vfs_readlink(), giving about 1.5%
speed up when issuing readlink on /initrd.img on ext4.
Filesystems opt in by calling inode_set_cached_link() when creating an
inode.
The size is stored in a new union utilizing the same space as i_devices,
thus avoiding growing the struct or taking up any more space.
Churn-wise the current readlink_copy() helper is patched to accept the
size instead of calculating it.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20241120112037.822078-2-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmdKdi4UHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP36A//UudCkttNTVCErQM0r8mjcPjtthlo
fRblzsuXbHB3FTsrDQA4bpPcLTlxMcNZXGgO9F+cwjNRNRbavBZo8+AbtAWOppNx
3XGr65Z0iIelUgvy/T3jYl8xQM4IePWbBn/wXMtqX3synE/UX4fqfxC/feeaUly8
qdyoefM0I4JOabjn9i9N6KyeyU/sjI4aTRhd5YwaCdy6wK2u9hQEqvER5XdFZjhN
HnuFKlBKEH5fo3Wwq5X+PbX8hP4szl3DE7QEplTlMjrh0y7xIc/ndugp/f3IN2Ri
0/Da6FpqjtBiis+wkyGWwwfmw/erRRaLUhwbuW3MjCQTmtbcjkIjA2xb4nxtcT+n
e/sCdWkgf1PX15igHpbWAynlv+jF3Ue+Kr/hrWi/H50fWiZh9skUmUxJ92SlI4fK
DT1N430xvoBpQwynASAM6hcWWgLJlFBo11aZA2fl++ORC3z/dbCZ5iLPdbIvaemF
KSmQTnrk4sVH8sPl9aIHqCFxj78uTHkSzLGQyq76AWU/s8hGzPszpU6iPwlcx1wk
WpkMLG5FQr0a4vKnNx1FVLCmwalPlB6/ZclRvKuSiEGZWTrnOo5KhPy1+Yn39AUo
IDBFnJkJNWxUVyb6zK9OJJ3Tsd6c1qeBEqb0VzUDGmw2umTuEAKqlqNJAohxVvyP
A7GFrlg9FvsDAwM=
=ZCsN
-----END PGP SIGNATURE-----
Merge tag 'lsm-pr-20241129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull ima fix from Paul Moore:
"One small patch to fix a function parameter / local variable naming
snafu that went up to you in the current merge window"
* tag 'lsm-pr-20241129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
ima: uncover hidden variable in ima_match_rules()
The variable name "prop" is inadvertently used twice in
ima_match_rules(), resulting in incorrect use of the local
variable when the function parameter should have been.
Rename the local variable and correct the use of the parameter.
Suggested-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Roberto Sassu <roberto.sassu@huawei.com>
[PM: subj tweak, Roberto's ACK]
Signed-off-by: Paul Moore <paul@paul-moore.com>
the kernel test robot reports a C23 extension
warning: label followed by a declaration is a C23 extension
[-Wc23-extensions]
696 | struct aa_profile *new_profile = NULL;
Instead of adding a null statement creating a C99 style inline var
declaration lift the label declaration out of the block so that it no
longer immediatedly follows the label.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202411101808.AI8YG6cs-lkp@intel.com/
Fixes: ee650b3820 ("apparmor: properly handle cx/px lookup failure for complain")
Signed-off-by: John Johansen <john.johansen@canonical.com>
The wording of 'scrubbing environment' implied that all environment
variables would be removed, when instead secure-execution mode only
removes a small number of environment variables. This patch updates the
wording to describe what actually occurs instead: setting AT_SECURE for
ld.so's secure-execution mode.
Link: https://gitlab.com/apparmor/apparmor/-/merge_requests/1315 is a
merge request that does similar updating for apparmor userspace.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The macros for label combination XXX_comb are no longer used and there
are no plans to use them so remove the dead code.
Signed-off-by: John Johansen <john.johansen@canonical.com>
In the macro definition of next_comb(), a parameter L1 is accepted,
but it is not used. Hence, it should be removed.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The previous audit_cap cache deduping was based on the profile that was
being audited. This could cause confusion due to the deduplication then
occurring across multiple processes, which could happen if multiple
instances of binaries matched the same profile attachment (and thus ran
under the same profile) or a profile was attached to a container and its
processes.
Instead, perform audit_cap deduping over ad->subj_cred, which ensures the
deduping only occurs across a single process, instead of across all
processes that match the current one's profile.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
When auditing capabilities, AppArmor uses a per-CPU, per-profile cache
such that the same capability for the same profile doesn't get repeatedly
audited, with the original goal of reducing audit logspam. However, this
cache does not have an expiration time, resulting in confusion when a
profile is shared across binaries (for example) and an expected DENIED
audit entry doesn't appear, despite the cache entry having been populated
much longer ago. This confusion was exacerbated by the per-CPU nature of
the cache resulting in the expected entries sporadically appearing when
the later denial+audit occurred on a different CPU.
To resolve this, record the last time a capability was audited for a
profile and add a timestamp expiration check before doing the audit.
v1 -> v2:
- Hardcode a longer timeout and drop the patches making it a sysctl,
after discussion with John Johansen.
- Cache the expiration time instead of the last-audited time. This value
can never be zero, which lets us drop the kernel_cap_t caps field from
the cache struct.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The profile_capabile function takes a struct apparmor_audit_data *ad,
which is documented as possibly being NULL. However, the single place that
calls this function never passes it a NULL ad. If we were ever to call
profile_capable with a NULL ad elsewhere, we would need to rework the
function, as its very first use of ad is to dereference ad->class without
checking if ad is NULL.
Thus, document profile_capable's ad parameter as not accepting NULL.
Signed-off-by: Ryan Lee <ryan.lee@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Multiple profiles shared 'ent->caps', so some logs missed.
Fixes: 0ed3b28ab8 ("AppArmor: mediation of non file objects")
Signed-off-by: chao liu <liuzgyid@outlook.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Add a comment to unpack_perm to document the first entry in the packed
perms struct is reserved, and make a non-functional change of unpacking
to a temporary stack variable named "reserved" to help suppor the
documentation of which value is reserved.
Suggested-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>