mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-15 18:04:36 +00:00
a867d7349e
Pull userns vfs updates from Eric Biederman: "This tree contains some very long awaited work on generalizing the user namespace support for mounting filesystems to include filesystems with a backing store. The real world target is fuse but the goal is to update the vfs to allow any filesystem to be supported. This patchset is based on a lot of code review and testing to approach that goal. While looking at what is needed to support the fuse filesystem it became clear that there were things like xattrs for security modules that needed special treatment. That the resolution of those concerns would not be fuse specific. That sorting out these general issues made most sense at the generic level, where the right people could be drawn into the conversation, and the issues could be solved for everyone. At a high level what this patchset does a couple of simple things: - Add a user namespace owner (s_user_ns) to struct super_block. - Teach the vfs to handle filesystem uids and gids not mapping into to kuids and kgids and being reported as INVALID_UID and INVALID_GID in vfs data structures. By assigning a user namespace owner filesystems that are mounted with only user namespace privilege can be detected. This allows security modules and the like to know which mounts may not be trusted. This also allows the set of uids and gids that are communicated to the filesystem to be capped at the set of kuids and kgids that are in the owning user namespace of the filesystem. One of the crazier corner casees this handles is the case of inodes whose i_uid or i_gid are not mapped into the vfs. Most of the code simply doesn't care but it is easy to confuse the inode writeback path so no operation that could cause an inode write-back is permitted for such inodes (aka only reads are allowed). This set of changes starts out by cleaning up the code paths involved in user namespace permirted mounts. Then when things are clean enough adds code that cleanly sets s_user_ns. Then additional restrictions are added that are possible now that the filesystem superblock contains owner information. These changes should not affect anyone in practice, but there are some parts of these restrictions that are changes in behavior. - Andy's restriction on suid executables that does not honor the suid bit when the path is from another mount namespace (think /proc/[pid]/fd/) or when the filesystem was mounted by a less privileged user. - The replacement of the user namespace implicit setting of MNT_NODEV with implicitly setting SB_I_NODEV on the filesystem superblock instead. Using SB_I_NODEV is a stronger form that happens to make this state user invisible. The user visibility can be managed but it caused problems when it was introduced from applications reasonably expecting mount flags to be what they were set to. There is a little bit of work remaining before it is safe to support mounting filesystems with backing store in user namespaces, beyond what is in this set of changes. - Verifying the mounter has permission to read/write the block device during mount. - Teaching the integrity modules IMA and EVM to handle filesystems mounted with only user namespace root and to reduce trust in their security xattrs accordingly. - Capturing the mounters credentials and using that for permission checks in d_automount and the like. (Given that overlayfs already does this, and we need the work in d_automount it make sense to generalize this case). Furthermore there are a few changes that are on the wishlist: - Get all filesystems supporting posix acls using the generic posix acls so that posix_acl_fix_xattr_from_user and posix_acl_fix_xattr_to_user may be removed. [Maintainability] - Reducing the permission checks in places such as remount to allow the superblock owner to perform them. - Allowing the superblock owner to chown files with unmapped uids and gids to something that is mapped so the files may be treated normally. I am not considering even obvious relaxations of permission checks until it is clear there are no more corner cases that need to be locked down and handled generically. Many thanks to Seth Forshee who kept this code alive, and putting up with me rewriting substantial portions of what he did to handle more corner cases, and for his diligent testing and reviewing of my changes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits) fs: Call d_automount with the filesystems creds fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns evm: Translate user/group ids relative to s_user_ns when computing HMAC dquot: For now explicitly don't support filesystems outside of init_user_ns quota: Handle quota data stored in s_user_ns in quota_setxquota quota: Ensure qids map to the filesystem vfs: Don't create inodes with a uid or gid unknown to the vfs vfs: Don't modify inodes with a uid or gid unknown to the vfs cred: Reject inodes with invalid ids in set_create_file_as() fs: Check for invalid i_uid in may_follow_link() vfs: Verify acls are valid within superblock's s_user_ns. userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS fs: Refuse uid/gid changes which don't map into s_user_ns selinux: Add support for unprivileged mounts from user namespaces Smack: Handle labels consistently in untrusted mounts Smack: Add support for unprivileged mounts from user namespaces fs: Treat foreign mounts as nosuid fs: Limit file caps to the user namespace of the super block userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag userns: Remove implicit MNT_NODEV fragility. ...
142 lines
3.4 KiB
C
142 lines
3.4 KiB
C
/*
|
|
File: linux/posix_acl.h
|
|
|
|
(C) 2002 Andreas Gruenbacher, <a.gruenbacher@computer.org>
|
|
*/
|
|
|
|
|
|
#ifndef __LINUX_POSIX_ACL_H
|
|
#define __LINUX_POSIX_ACL_H
|
|
|
|
#include <linux/bug.h>
|
|
#include <linux/slab.h>
|
|
#include <linux/rcupdate.h>
|
|
|
|
#define ACL_UNDEFINED_ID (-1)
|
|
|
|
/* a_type field in acl_user_posix_entry_t */
|
|
#define ACL_TYPE_ACCESS (0x8000)
|
|
#define ACL_TYPE_DEFAULT (0x4000)
|
|
|
|
/* e_tag entry in struct posix_acl_entry */
|
|
#define ACL_USER_OBJ (0x01)
|
|
#define ACL_USER (0x02)
|
|
#define ACL_GROUP_OBJ (0x04)
|
|
#define ACL_GROUP (0x08)
|
|
#define ACL_MASK (0x10)
|
|
#define ACL_OTHER (0x20)
|
|
|
|
/* permissions in the e_perm field */
|
|
#define ACL_READ (0x04)
|
|
#define ACL_WRITE (0x02)
|
|
#define ACL_EXECUTE (0x01)
|
|
//#define ACL_ADD (0x08)
|
|
//#define ACL_DELETE (0x10)
|
|
|
|
struct posix_acl_entry {
|
|
short e_tag;
|
|
unsigned short e_perm;
|
|
union {
|
|
kuid_t e_uid;
|
|
kgid_t e_gid;
|
|
};
|
|
};
|
|
|
|
struct posix_acl {
|
|
atomic_t a_refcount;
|
|
struct rcu_head a_rcu;
|
|
unsigned int a_count;
|
|
struct posix_acl_entry a_entries[0];
|
|
};
|
|
|
|
#define FOREACH_ACL_ENTRY(pa, acl, pe) \
|
|
for(pa=(acl)->a_entries, pe=pa+(acl)->a_count; pa<pe; pa++)
|
|
|
|
|
|
/*
|
|
* Duplicate an ACL handle.
|
|
*/
|
|
static inline struct posix_acl *
|
|
posix_acl_dup(struct posix_acl *acl)
|
|
{
|
|
if (acl)
|
|
atomic_inc(&acl->a_refcount);
|
|
return acl;
|
|
}
|
|
|
|
/*
|
|
* Free an ACL handle.
|
|
*/
|
|
static inline void
|
|
posix_acl_release(struct posix_acl *acl)
|
|
{
|
|
if (acl && atomic_dec_and_test(&acl->a_refcount))
|
|
kfree_rcu(acl, a_rcu);
|
|
}
|
|
|
|
|
|
/* posix_acl.c */
|
|
|
|
extern void posix_acl_init(struct posix_acl *, int);
|
|
extern struct posix_acl *posix_acl_alloc(int, gfp_t);
|
|
extern int posix_acl_valid(struct user_namespace *, const struct posix_acl *);
|
|
extern int posix_acl_permission(struct inode *, const struct posix_acl *, int);
|
|
extern struct posix_acl *posix_acl_from_mode(umode_t, gfp_t);
|
|
extern int posix_acl_equiv_mode(const struct posix_acl *, umode_t *);
|
|
extern int __posix_acl_create(struct posix_acl **, gfp_t, umode_t *);
|
|
extern int __posix_acl_chmod(struct posix_acl **, gfp_t, umode_t);
|
|
|
|
extern struct posix_acl *get_posix_acl(struct inode *, int);
|
|
extern int set_posix_acl(struct inode *, int, struct posix_acl *);
|
|
|
|
#ifdef CONFIG_FS_POSIX_ACL
|
|
extern int posix_acl_chmod(struct inode *, umode_t);
|
|
extern int posix_acl_create(struct inode *, umode_t *, struct posix_acl **,
|
|
struct posix_acl **);
|
|
|
|
extern int simple_set_acl(struct inode *, struct posix_acl *, int);
|
|
extern int simple_acl_create(struct inode *, struct inode *);
|
|
|
|
struct posix_acl *get_cached_acl(struct inode *inode, int type);
|
|
struct posix_acl *get_cached_acl_rcu(struct inode *inode, int type);
|
|
void set_cached_acl(struct inode *inode, int type, struct posix_acl *acl);
|
|
void forget_cached_acl(struct inode *inode, int type);
|
|
void forget_all_cached_acls(struct inode *inode);
|
|
|
|
static inline void cache_no_acl(struct inode *inode)
|
|
{
|
|
inode->i_acl = NULL;
|
|
inode->i_default_acl = NULL;
|
|
}
|
|
#else
|
|
static inline int posix_acl_chmod(struct inode *inode, umode_t mode)
|
|
{
|
|
return 0;
|
|
}
|
|
|
|
#define simple_set_acl NULL
|
|
|
|
static inline int simple_acl_create(struct inode *dir, struct inode *inode)
|
|
{
|
|
return 0;
|
|
}
|
|
static inline void cache_no_acl(struct inode *inode)
|
|
{
|
|
}
|
|
|
|
static inline int posix_acl_create(struct inode *inode, umode_t *mode,
|
|
struct posix_acl **default_acl, struct posix_acl **acl)
|
|
{
|
|
*default_acl = *acl = NULL;
|
|
return 0;
|
|
}
|
|
|
|
static inline void forget_all_cached_acls(struct inode *inode)
|
|
{
|
|
}
|
|
#endif /* CONFIG_FS_POSIX_ACL */
|
|
|
|
struct posix_acl *get_acl(struct inode *inode, int type);
|
|
|
|
#endif /* __LINUX_POSIX_ACL_H */
|