Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (synced 2024-12-29)
19 hotfixes, 8 of which are cc:stable.

Mainly MM singleton fixes. And a couple of ocfs2 regression fixes.

-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZnCEQAAKCRDdBJ7gKXxA
jmgSAQDk3BYs1n67cnwx/Zi04yMYDyfYTCYg2udPfT2a+GpmbwD+N5dJd/vCztXH
5eLpP11xd/yr2+I9FefyZeUuA80KtgQ=
=2agY
-----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Mainly MM singleton fixes. And a couple of ocfs2 regression fixes"

* tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  kcov: don't lose track of remote references during softirqs
  mm: shmem: fix getting incorrect lruvec when replacing a shmem folio
  mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick
  mm: fix possible OOB in numa_rebuild_large_mapping()
  mm/migrate: fix kernel BUG at mm/compaction.c:2761!
  selftests: mm: make map_fixed_noreplace test names stable
  mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
  mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default
  gcov: add support for GCC 14
  zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
  mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
  lib/alloc_tag: fix RCU imbalance in pgalloc_tag_get()
  lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n
  MAINTAINERS: remove Lorenzo as vmalloc reviewer
  Revert "mm: init_mlocked_on_free_v3"
  mm/page_table_check: fix crash on ZONE_DEVICE
  gcc: disable '-Warray-bounds' for gcc-9
  ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger()
  ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
commit e6b324fbf2
Documentation/admin-guide/kernel-parameters.txt
@@ -2192,12 +2192,6 @@
 			Format: 0 | 1
 			Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
 
-	init_mlocked_on_free=	[MM] Fill freed userspace memory with zeroes if
-				it was mlock'ed and not explicitly munlock'ed
-				afterwards.
-				Format: 0 | 1
-				Default set by CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON
-
 	init_pkru=	[X86] Specify the default memory protection keys rights
 			register contents for all processes. 0x55555554 by
 			default (disallow access to all but pkey 0). Can
Documentation/userspace-api/index.rst
@@ -32,6 +32,7 @@ Security-related interfaces
    seccomp_filter
    landlock
    lsm
+   mfd_noexec
    spec_ctrl
    tee
Documentation/userspace-api/mfd_noexec.rst (new file, 86 lines)
@@ -0,0 +1,86 @@
.. SPDX-License-Identifier: GPL-2.0

==================================
Introduction of non-executable mfd
==================================
:Author:
    Daniel Verkamp <dverkamp@chromium.org>
    Jeff Xu <jeffxu@chromium.org>

:Contributor:
    Aleksa Sarai <cyphar@cyphar.com>

Since Linux introduced the memfd feature, memfds have always had their
execute bit set, and the memfd_create() syscall doesn't allow setting
it differently.

However, in a secure-by-default system, such as ChromeOS, (where all
executables should come from the rootfs, which is protected by verified
boot), this executable nature of memfd opens a door for NoExec bypass
and enables "confused deputy attack". E.g, in VRP bug [1]: cros_vm
process created a memfd to share the content with an external process,
however the memfd is overwritten and used for executing arbitrary code
and root escalation. [2] lists more VRP of this kind.

On the other hand, executable memfd has its legit use: runc uses memfd's
seal and executable feature to copy the contents of the binary then
execute them. For such a system, we need a solution to differentiate runc's
use of executable memfds and an attacker's [3].

To address those above:
- Let memfd_create() set X bit at creation time.
- Let memfd be sealed for modifying X bit when NX is set.
- Add a new pid namespace sysctl: vm.memfd_noexec to help applications in
  migrating and enforcing non-executable MFD.

User API
========
``int memfd_create(const char *name, unsigned int flags)``

``MFD_NOEXEC_SEAL``
	When MFD_NOEXEC_SEAL bit is set in the ``flags``, memfd is created
	with NX. F_SEAL_EXEC is set and the memfd can't be modified to
	add X later. MFD_ALLOW_SEALING is also implied.
	This is the most common case for the application to use memfd.

``MFD_EXEC``
	When MFD_EXEC bit is set in the ``flags``, memfd is created with X.

Note:
	``MFD_NOEXEC_SEAL`` implies ``MFD_ALLOW_SEALING``. In case that
	an app doesn't want sealing, it can add F_SEAL_SEAL after creation.


Sysctl:
========
``pid namespaced sysctl vm.memfd_noexec``

The new pid namespaced sysctl vm.memfd_noexec has 3 values:

- 0: MEMFD_NOEXEC_SCOPE_EXEC
	memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
	MFD_EXEC was set.

- 1: MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL
	memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
	MFD_NOEXEC_SEAL was set.

- 2: MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
	memfd_create() without MFD_NOEXEC_SEAL will be rejected.

The sysctl allows finer control of memfd_create for old software that
doesn't set the executable bit; for example, a container with
vm.memfd_noexec=1 means the old software will create non-executable memfd
by default while new software can create executable memfd by setting
MFD_EXEC.

The value of vm.memfd_noexec is passed to child namespace at creation
time. In addition, the setting is hierarchical, i.e. during memfd_create,
we will search from current ns to root ns and use the most restrictive
setting.

[1] https://crbug.com/1305267

[2] https://bugs.chromium.org/p/chromium/issues/list?q=type%3Dbug-security%20memfd%20escalation&can=1

[3] https://lwn.net/Articles/781013/
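As a concrete illustration of the API the new document describes, here is a minimal userspace sketch (not part of the patch): it creates a sealed non-executable memfd and checks that the seal holds. It assumes a kernel with MFD_NOEXEC_SEAL support (6.3+) and defines the flag locally in case the libc headers predate it.

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U	/* value from <uapi/linux/memfd.h> */
#endif

int main(void)
{
	/* Created with NX; F_SEAL_EXEC applied; sealing allowed (implied). */
	int fd = memfd_create("demo", MFD_CLOEXEC | MFD_NOEXEC_SEAL);
	if (fd < 0) {
		perror("memfd_create");	/* EINVAL on kernels without support */
		return 1;
	}
	/* The exec seal forbids re-adding the X bit, so fchmod() must fail. */
	if (fchmod(fd, 0755) == 0)
		fprintf(stderr, "unexpected: seal did not hold\n");
	else
		perror("fchmod (expected EPERM)");
	close(fd);
	return 0;
}

Under vm.memfd_noexec=1 the same program could omit MFD_NOEXEC_SEAL and still receive a non-executable, exec-sealed memfd; under value 2 any memfd_create() call without MFD_NOEXEC_SEAL is rejected outright.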
MAINTAINERS
@@ -23974,7 +23974,6 @@ VMALLOC
 M:	Andrew Morton <akpm@linux-foundation.org>
 R:	Uladzislau Rezki <urezki@gmail.com>
 R:	Christoph Hellwig <hch@infradead.org>
-R:	Lorenzo Stoakes <lstoakes@gmail.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 W:	http://www.linux-mm.org
arch/Kconfig | 12
@@ -1046,10 +1046,21 @@ config ARCH_MMAP_RND_BITS_MAX
 config ARCH_MMAP_RND_BITS_DEFAULT
 	int
 
+config FORCE_MAX_MMAP_RND_BITS
+	bool "Force maximum number of bits to use for ASLR of mmap base address"
+	default y if !64BIT
+	help
+	  ARCH_MMAP_RND_BITS and ARCH_MMAP_RND_COMPAT_BITS represent the number
+	  of bits to use for ASLR and if no custom value is assigned (EXPERT)
+	  then the architecture's lower bound (minimum) value is assumed.
+	  This toggle changes that default assumption to assume the arch upper
+	  bound (maximum) value instead.
+
 config ARCH_MMAP_RND_BITS
 	int "Number of bits to use for ASLR of mmap base address" if EXPERT
 	range ARCH_MMAP_RND_BITS_MIN ARCH_MMAP_RND_BITS_MAX
 	default ARCH_MMAP_RND_BITS_DEFAULT if ARCH_MMAP_RND_BITS_DEFAULT
+	default ARCH_MMAP_RND_BITS_MAX if FORCE_MAX_MMAP_RND_BITS
 	default ARCH_MMAP_RND_BITS_MIN
 	depends on HAVE_ARCH_MMAP_RND_BITS
 	help
@@ -1084,6 +1095,7 @@ config ARCH_MMAP_RND_COMPAT_BITS
 	int "Number of bits to use for ASLR of mmap base address for compatible applications" if EXPERT
 	range ARCH_MMAP_RND_COMPAT_BITS_MIN ARCH_MMAP_RND_COMPAT_BITS_MAX
 	default ARCH_MMAP_RND_COMPAT_BITS_DEFAULT if ARCH_MMAP_RND_COMPAT_BITS_DEFAULT
+	default ARCH_MMAP_RND_COMPAT_BITS_MAX if FORCE_MAX_MMAP_RND_BITS
 	default ARCH_MMAP_RND_COMPAT_BITS_MIN
 	depends on HAVE_ARCH_MMAP_RND_COMPAT_BITS
 	help
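To see what these knobs affect, here is a small demo program (not from the patch): each run maps an anonymous page and prints its address. With more randomization bits in play (e.g. FORCE_MAX_MMAP_RND_BITS=y, or a larger value in the vm.mmap_rnd_bits sysctl), the printed addresses spread over a wider range across runs.

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	/* mmap_base is randomized once per process at exec time, so the
	 * spread is visible across separate runs, not within one run. */
	void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("anonymous page at %p\n", p);
	munmap(p, 4096);
	return 0;
}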
fs/ocfs2/journal.c
@@ -479,12 +479,6 @@ int ocfs2_allocate_extend_trans(handle_t *handle, int thresh)
 	return status;
 }
 
-
-struct ocfs2_triggers {
-	struct jbd2_buffer_trigger_type	ot_triggers;
-	int				ot_offset;
-};
-
 static inline struct ocfs2_triggers *to_ocfs2_trigger(struct jbd2_buffer_trigger_type *triggers)
 {
 	return container_of(triggers, struct ocfs2_triggers, ot_triggers);
@@ -548,85 +542,76 @@ static void ocfs2_db_frozen_trigger(struct jbd2_buffer_trigger_type *triggers,
 static void ocfs2_abort_trigger(struct jbd2_buffer_trigger_type *triggers,
 				struct buffer_head *bh)
 {
+	struct ocfs2_triggers *ot = to_ocfs2_trigger(triggers);
+
 	mlog(ML_ERROR,
 	     "ocfs2_abort_trigger called by JBD2. bh = 0x%lx, "
 	     "bh->b_blocknr = %llu\n",
 	     (unsigned long)bh,
 	     (unsigned long long)bh->b_blocknr);
 
-	ocfs2_error(bh->b_assoc_map->host->i_sb,
+	ocfs2_error(ot->sb,
 		    "JBD2 has aborted our journal, ocfs2 cannot continue\n");
 }
 
-static struct ocfs2_triggers di_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_dinode, i_check),
-};
-
-static struct ocfs2_triggers eb_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_extent_block, h_check),
-};
-
-static struct ocfs2_triggers rb_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_refcount_block, rf_check),
-};
-
-static struct ocfs2_triggers gd_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_group_desc, bg_check),
-};
-
-static struct ocfs2_triggers db_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_db_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-};
-
-static struct ocfs2_triggers xb_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_xattr_block, xb_check),
-};
-
-static struct ocfs2_triggers dq_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_dq_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-};
-
-static struct ocfs2_triggers dr_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_dx_root_block, dr_check),
-};
-
-static struct ocfs2_triggers dl_triggers = {
-	.ot_triggers = {
-		.t_frozen = ocfs2_frozen_trigger,
-		.t_abort = ocfs2_abort_trigger,
-	},
-	.ot_offset	= offsetof(struct ocfs2_dx_leaf, dl_check),
-};
+static void ocfs2_setup_csum_triggers(struct super_block *sb,
+				      enum ocfs2_journal_trigger_type type,
+				      struct ocfs2_triggers *ot)
+{
+	BUG_ON(type >= OCFS2_JOURNAL_TRIGGER_COUNT);
+
+	switch (type) {
+	case OCFS2_JTR_DI:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_dinode, i_check);
+		break;
+	case OCFS2_JTR_EB:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_extent_block, h_check);
+		break;
+	case OCFS2_JTR_RB:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_refcount_block, rf_check);
+		break;
+	case OCFS2_JTR_GD:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_group_desc, bg_check);
+		break;
+	case OCFS2_JTR_DB:
+		ot->ot_triggers.t_frozen = ocfs2_db_frozen_trigger;
+		break;
+	case OCFS2_JTR_XB:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_xattr_block, xb_check);
+		break;
+	case OCFS2_JTR_DQ:
+		ot->ot_triggers.t_frozen = ocfs2_dq_frozen_trigger;
+		break;
+	case OCFS2_JTR_DR:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_dx_root_block, dr_check);
+		break;
+	case OCFS2_JTR_DL:
+		ot->ot_triggers.t_frozen = ocfs2_frozen_trigger;
+		ot->ot_offset = offsetof(struct ocfs2_dx_leaf, dl_check);
+		break;
+	case OCFS2_JTR_NONE:
+		/* To make compiler happy... */
+		return;
+	}
+
+	ot->ot_triggers.t_abort = ocfs2_abort_trigger;
+	ot->sb = sb;
+}
+
+void ocfs2_initialize_journal_triggers(struct super_block *sb,
+				       struct ocfs2_triggers triggers[])
+{
+	enum ocfs2_journal_trigger_type type;
+
+	for (type = OCFS2_JTR_DI; type < OCFS2_JOURNAL_TRIGGER_COUNT; type++)
+		ocfs2_setup_csum_triggers(sb, type, &triggers[type]);
+}
 
 static int __ocfs2_journal_access(handle_t *handle,
 				  struct ocfs2_caching_info *ci,
@@ -708,56 +693,91 @@ static int __ocfs2_journal_access(handle_t *handle,
 int ocfs2_journal_access_di(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &di_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_DI],
+				      type);
 }
 
 int ocfs2_journal_access_eb(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &eb_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_EB],
+				      type);
 }
 
 int ocfs2_journal_access_rb(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &rb_triggers,
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_RB],
 				      type);
 }
 
 int ocfs2_journal_access_gd(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &gd_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_GD],
+				      type);
 }
 
 int ocfs2_journal_access_db(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &db_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_DB],
+				      type);
 }
 
 int ocfs2_journal_access_xb(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &xb_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_XB],
+				      type);
 }
 
 int ocfs2_journal_access_dq(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &dq_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_DQ],
+				      type);
 }
 
 int ocfs2_journal_access_dr(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &dr_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_DR],
+				      type);
 }
 
 int ocfs2_journal_access_dl(handle_t *handle, struct ocfs2_caching_info *ci,
 			    struct buffer_head *bh, int type)
 {
-	return __ocfs2_journal_access(handle, ci, bh, &dl_triggers, type);
+	struct ocfs2_super *osb = OCFS2_SB(ocfs2_metadata_cache_get_super(ci));
+
+	return __ocfs2_journal_access(handle, ci, bh,
+				      &osb->s_journal_triggers[OCFS2_JTR_DL],
+				      type);
 }
 
 int ocfs2_journal_access(handle_t *handle, struct ocfs2_caching_info *ci,
@@ -778,13 +798,15 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
 		if (!is_handle_aborted(handle)) {
 			journal_t *journal = handle->h_transaction->t_journal;
 
-			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
-			     "Aborting transaction and journal.\n");
+			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed: "
+			     "handle type %u started at line %u, credits %u/%u "
+			     "errcode %d. Aborting transaction and journal.\n",
+			     handle->h_type, handle->h_line_no,
+			     handle->h_requested_credits,
+			     jbd2_handle_buffer_credits(handle), status);
 			handle->h_err = status;
 			jbd2_journal_abort_handle(handle);
 			jbd2_journal_abort(journal, status);
-			ocfs2_abort(bh->b_assoc_map->host->i_sb,
-				    "Journal already aborted.\n");
 		}
 	}
 }
fs/ocfs2/ocfs2.h
@@ -284,6 +284,30 @@ enum ocfs2_mount_options
 #define OCFS2_OSB_ERROR_FS	0x0004
 #define OCFS2_DEFAULT_ATIME_QUANTUM	60
 
+struct ocfs2_triggers {
+	struct jbd2_buffer_trigger_type	ot_triggers;
+	int				ot_offset;
+	struct super_block		*sb;
+};
+
+enum ocfs2_journal_trigger_type {
+	OCFS2_JTR_DI,
+	OCFS2_JTR_EB,
+	OCFS2_JTR_RB,
+	OCFS2_JTR_GD,
+	OCFS2_JTR_DB,
+	OCFS2_JTR_XB,
+	OCFS2_JTR_DQ,
+	OCFS2_JTR_DR,
+	OCFS2_JTR_DL,
+	OCFS2_JTR_NONE	/* This must be the last entry */
+};
+
+#define OCFS2_JOURNAL_TRIGGER_COUNT OCFS2_JTR_NONE
+
+void ocfs2_initialize_journal_triggers(struct super_block *sb,
+				       struct ocfs2_triggers triggers[]);
+
 struct ocfs2_journal;
 struct ocfs2_slot_info;
 struct ocfs2_recovery_map;
@@ -351,6 +375,9 @@ struct ocfs2_super
 	struct ocfs2_journal *journal;
 	unsigned long osb_commit_interval;
 
+	/* Journal triggers for checksum */
+	struct ocfs2_triggers s_journal_triggers[OCFS2_JOURNAL_TRIGGER_COUNT];
+
 	struct delayed_work		la_enable_wq;
 
 	/*
fs/ocfs2/super.c
@@ -1075,9 +1075,11 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	debugfs_create_file("fs_state", S_IFREG|S_IRUSR, osb->osb_debug_root,
 			    osb, &ocfs2_osb_debug_fops);
 
-	if (ocfs2_meta_ecc(osb))
+	if (ocfs2_meta_ecc(osb)) {
+		ocfs2_initialize_journal_triggers(sb, osb->s_journal_triggers);
 		ocfs2_blockcheck_stats_debugfs_install( &osb->osb_ecc_stats,
 							osb->osb_debug_root);
+	}
 
 	status = ocfs2_mount_volume(sb);
 	if (status < 0)
include/linux/kcov.h
@@ -21,6 +21,8 @@ enum kcov_mode {
 	KCOV_MODE_TRACE_PC = 2,
 	/* Collecting comparison operands mode. */
 	KCOV_MODE_TRACE_CMP = 3,
+	/* The process owns a KCOV remote reference. */
+	KCOV_MODE_REMOTE = 4,
 };
 
 #define KCOV_IN_CTXSW	(1 << 30)
include/linux/mm.h
@@ -3776,14 +3776,7 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_FREE_DEFAULT_ON, init_on_free);
 static inline bool want_init_on_free(void)
 {
 	return static_branch_maybe(CONFIG_INIT_ON_FREE_DEFAULT_ON,
 				   &init_on_free);
 }
 
-DECLARE_STATIC_KEY_MAYBE(CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON, init_mlocked_on_free);
-static inline bool want_init_mlocked_on_free(void)
-{
-	return static_branch_maybe(CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON,
-				   &init_mlocked_on_free);
-}
-
 extern bool _debug_pagealloc_enabled_early;
include/linux/pagemap.h
@@ -381,6 +381,10 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
  */
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
+	/* AS_LARGE_FOLIO_SUPPORT is only reasonable for pagecache folios */
+	VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON,
+			"Anonymous mapping always supports large folio");
+
 	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
 		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
 }
include/linux/pgalloc_tag.h
@@ -37,6 +37,9 @@ static inline union codetag_ref *get_page_tag_ref(struct page *page)
 
 static inline void put_page_tag_ref(union codetag_ref *ref)
 {
+	if (WARN_ON(!ref))
+		return;
+
 	page_ext_put(page_ext_from_codetag_ref(ref));
 }
 
@@ -102,9 +105,11 @@ static inline struct alloc_tag *pgalloc_tag_get(struct page *page)
 		union codetag_ref *ref = get_page_tag_ref(page);
 
 		alloc_tag_sub_check(ref);
-		if (ref && ref->ct)
-			tag = ct_to_alloc_tag(ref->ct);
-		put_page_tag_ref(ref);
+		if (ref) {
+			if (ref->ct)
+				tag = ct_to_alloc_tag(ref->ct);
+			put_page_tag_ref(ref);
+		}
 	}
 
 	return tag;
init/Kconfig
@@ -883,7 +883,7 @@ config GCC10_NO_ARRAY_BOUNDS
 
 config CC_NO_ARRAY_BOUNDS
 	bool
-	default y if CC_IS_GCC && GCC_VERSION >= 100000 && GCC10_NO_ARRAY_BOUNDS
+	default y if CC_IS_GCC && GCC_VERSION >= 90000 && GCC10_NO_ARRAY_BOUNDS
 
 # Currently, disable -Wstringop-overflow for GCC globally.
 config GCC_NO_STRINGOP_OVERFLOW
kernel/gcov/gcc_4_7.c
@@ -18,7 +18,9 @@
 #include <linux/mm.h>
 #include "gcov.h"
 
-#if (__GNUC__ >= 10)
+#if (__GNUC__ >= 14)
+#define GCOV_COUNTERS			9
+#elif (__GNUC__ >= 10)
 #define GCOV_COUNTERS			8
 #elif (__GNUC__ >= 7)
 #define GCOV_COUNTERS			9
kernel/kcov.c
@@ -632,6 +632,7 @@ static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
 			return -EINVAL;
 		kcov->mode = mode;
 		t->kcov = kcov;
+		t->kcov_mode = KCOV_MODE_REMOTE;
 		kcov->t = t;
 		kcov->remote = true;
 		kcov->remote_size = remote_arg->area_size;
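For context, this one-line fix marks the task with the new KCOV_MODE_REMOTE so the remote reference is not lost across softirq context switches. The remote interface follows the same open/init/mmap/enable pattern as basic KCOV usage; below is a minimal trace-PC sketch adapted from the example in Documentation/dev-tools/kcov.rst (requires CONFIG_KCOV=y and a mounted debugfs; ioctl numbers are the documented ones):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define KCOV_INIT_TRACE	_IOR('c', 1, unsigned long)
#define KCOV_ENABLE	_IO('c', 100)
#define KCOV_DISABLE	_IO('c', 101)
#define COVER_SIZE	(64 << 10)
#define KCOV_TRACE_PC	0

int main(void)
{
	int fd = open("/sys/kernel/debug/kcov", O_RDWR);
	if (fd < 0) { perror("open"); return 1; }
	if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) { perror("init"); return 1; }
	unsigned long *cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
				    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (cover == MAP_FAILED) { perror("mmap"); return 1; }
	if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC)) { perror("enable"); return 1; }
	__atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);	/* reset counter */

	read(-1, NULL, 0);	/* some syscall whose kernel path we trace */

	unsigned long n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
	printf("collected %lu PCs, first: 0x%lx\n", n, n ? cover[1] : 0);
	ioctl(fd, KCOV_DISABLE, 0);
	munmap(cover, COVER_SIZE * sizeof(unsigned long));
	close(fd);
	return 0;
}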
kernel/pid_namespace.c
@@ -218,6 +218,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
 	 */
 	do {
 		clear_thread_flag(TIF_SIGPENDING);
+		clear_thread_flag(TIF_NOTIFY_SIGNAL);
 		rc = kernel_wait4(-1, NULL, __WALL, NULL);
 	} while (rc != -ECHILD);
lib/alloc_tag.c
@@ -227,6 +227,7 @@ struct page_ext_operations page_alloc_tagging_ops = {
 };
 EXPORT_SYMBOL(page_alloc_tagging_ops);
 
+#ifdef CONFIG_SYSCTL
 static struct ctl_table memory_allocation_profiling_sysctls[] = {
 	{
 		.procname	= "mem_profiling",
@@ -241,6 +242,17 @@ static struct ctl_table memory_allocation_profiling_sysctls[] = {
 	{ }
 };
 
+static void __init sysctl_init(void)
+{
+	if (!mem_profiling_support)
+		memory_allocation_profiling_sysctls[0].mode = 0444;
+
+	register_sysctl_init("vm", memory_allocation_profiling_sysctls);
+}
+#else /* CONFIG_SYSCTL */
+static inline void sysctl_init(void) {}
+#endif /* CONFIG_SYSCTL */
+
 static int __init alloc_tag_init(void)
 {
 	const struct codetag_type_desc desc = {
@@ -253,9 +265,7 @@ static int __init alloc_tag_init(void)
 	if (IS_ERR(alloc_tag_cttype))
 		return PTR_ERR(alloc_tag_cttype);
 
-	if (!mem_profiling_support)
-		memory_allocation_profiling_sysctls[0].mode = 0444;
-	register_sysctl_init("vm", memory_allocation_profiling_sysctls);
+	sysctl_init();
 	procfs_init();
 
 	return 0;
mm/debug_vm_pgtable.c
@@ -40,22 +40,7 @@
 * Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics
 * expectations that are being validated here. All future changes in here
 * or the documentation need to be in sync.
- *
- * On s390 platform, the lower 4 bits are used to identify given page table
- * entry type. But these bits might affect the ability to clear entries with
- * pxx_clear() because of how dynamic page table folding works on s390. So
- * while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
- * used to mark a pte entry.
 */
-#define S390_SKIP_MASK		GENMASK(3, 0)
-#if __BITS_PER_LONG == 64
-#define PPC64_SKIP_MASK		GENMASK(62, 62)
-#else
-#define PPC64_SKIP_MASK		0x0
-#endif
-#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
-#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
 #define RANDOM_NZVALUE	GENMASK(7, 0)
 
 struct pgtable_debug_args {
@@ -511,8 +496,7 @@ static void __init pud_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PUD clear\n");
-	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
 	WRITE_ONCE(*args->pudp, pud);
 	WARN_ON(pud_none(pud));
 	pud_clear(args->pudp);
 	pud = READ_ONCE(*args->pudp);
 	WARN_ON(!pud_none(pud));
@@ -548,8 +532,7 @@ static void __init p4d_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating P4D clear\n");
-	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
 	WRITE_ONCE(*args->p4dp, p4d);
 	WARN_ON(p4d_none(p4d));
 	p4d_clear(args->p4dp);
 	p4d = READ_ONCE(*args->p4dp);
 	WARN_ON(!p4d_none(p4d));
@@ -582,8 +565,7 @@ static void __init pgd_clear_tests(struct pgtable_debug_args *args)
 		return;
 
 	pr_debug("Validating PGD clear\n");
-	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
 	WRITE_ONCE(*args->pgdp, pgd);
 	WARN_ON(pgd_none(pgd));
 	pgd_clear(args->pgdp);
 	pgd = READ_ONCE(*args->pgdp);
 	WARN_ON(!pgd_none(pgd));
@@ -634,10 +616,8 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args)
 	if (WARN_ON(!args->ptep))
 		return;
 
-#ifndef CONFIG_RISCV
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
-#endif
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
 	WARN_ON(pte_none(pte));
 	flush_dcache_page(page);
 	barrier();
 	ptep_clear(args->mm, args->vaddr, args->ptep);
@@ -650,8 +630,7 @@ static void __init pmd_clear_tests(struct pgtable_debug_args *args)
 	pmd_t pmd = READ_ONCE(*args->pmdp);
 
 	pr_debug("Validating PMD clear\n");
-	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
 	WRITE_ONCE(*args->pmdp, pmd);
 	WARN_ON(pmd_none(pmd));
 	pmd_clear(args->pmdp);
 	pmd = READ_ONCE(*args->pmdp);
 	WARN_ON(!pmd_none(pmd));
mm/huge_memory.c
@@ -3009,30 +3009,36 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	if (new_order >= folio_order(folio))
 		return -EINVAL;
 
-	/* Cannot split anonymous THP to order-1 */
-	if (new_order == 1 && folio_test_anon(folio)) {
-		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
-		return -EINVAL;
-	}
-
-	if (new_order) {
-		/* Only swapping a whole PMD-mapped folio is supported */
-		if (folio_test_swapcache(folio))
+	if (folio_test_anon(folio)) {
+		/* order-1 is not supported for anonymous THP. */
+		if (new_order == 1) {
+			VM_WARN_ONCE(1, "Cannot split to order-1 folio");
 			return -EINVAL;
+		}
+	} else if (new_order) {
 		/* Split shmem folio to non-zero order not supported */
 		if (shmem_mapping(folio->mapping)) {
 			VM_WARN_ONCE(1,
 				"Cannot split shmem folio to non-0 order");
 			return -EINVAL;
 		}
-		/* No split if the file system does not support large folio */
-		if (!mapping_large_folio_support(folio->mapping)) {
+		/*
+		 * No split if the file system does not support large folio.
+		 * Note that we might still have THPs in such mappings due to
+		 * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
+		 * does not actually support large folios properly.
+		 */
+		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+		    !mapping_large_folio_support(folio->mapping)) {
 			VM_WARN_ONCE(1,
 				"Cannot split file folio to non-0 order");
 			return -EINVAL;
 		}
 	}
 
+	/* Only swapping a whole PMD-mapped folio is supported */
+	if (folio_test_swapcache(folio) && new_order)
+		return -EINVAL;
+
 	is_hzp = is_huge_zero_folio(folio);
 	if (is_hzp) {
mm/internal.h
@@ -588,7 +588,6 @@ extern void __putback_isolated_page(struct page *page, unsigned int order,
 extern void memblock_free_pages(struct page *page, unsigned long pfn,
 					unsigned int order);
 extern void __free_pages_core(struct page *page, unsigned int order);
-extern void kernel_init_pages(struct page *page, int numpages);
 
 /*
  * This will have no effect, other than possibly generating a warning, if the
mm/memcontrol.c
@@ -7745,8 +7745,7 @@ void __mem_cgroup_uncharge_folios(struct folio_batch *folios)
 * @new: Replacement folio.
 *
 * Charge @new as a replacement folio for @old. @old will
- * be uncharged upon free. This is only used by the page cache
- * (in replace_page_cache_folio()).
+ * be uncharged upon free.
 *
 * Both folios must be locked, @new->mapping must be set up.
 */
mm/memory.c | 20
@@ -1507,12 +1507,6 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
 		if (unlikely(folio_mapcount(folio) < 0))
 			print_bad_pte(vma, addr, ptent, page);
 	}
 
-	if (want_init_mlocked_on_free() && folio_test_mlocked(folio) &&
-	    !delay_rmap && folio_test_anon(folio)) {
-		kernel_init_pages(page, folio_nr_pages(folio));
-	}
-
 	if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) {
 		*force_flush = true;
 		*force_break = true;
@@ -5106,10 +5100,16 @@ static void numa_rebuild_large_mapping(struct vm_fault *vmf, struct vm_area_struct *vma,
 				       bool ignore_writable, bool pte_write_upgrade)
 {
 	int nr = pte_pfn(fault_pte) - folio_pfn(folio);
-	unsigned long start = max(vmf->address - nr * PAGE_SIZE, vma->vm_start);
-	unsigned long end = min(vmf->address + (folio_nr_pages(folio) - nr) * PAGE_SIZE, vma->vm_end);
-	pte_t *start_ptep = vmf->pte - (vmf->address - start) / PAGE_SIZE;
-	unsigned long addr;
+	unsigned long start, end, addr = vmf->address;
+	unsigned long addr_start = addr - (nr << PAGE_SHIFT);
+	unsigned long pt_start = ALIGN_DOWN(addr, PMD_SIZE);
+	pte_t *start_ptep;
+
+	/* Stay within the VMA and within the page table. */
+	start = max3(addr_start, pt_start, vma->vm_start);
+	end = min3(addr_start + folio_size(folio), pt_start + PMD_SIZE,
+		   vma->vm_end);
+	start_ptep = vmf->pte - ((addr - start) >> PAGE_SHIFT);
 
 	/* Restore all PTEs' mapping of the large folio */
 	for (addr = start; addr != end; start_ptep++, addr += PAGE_SIZE) {
mm/migrate.c
@@ -1654,7 +1654,12 @@ static int migrate_pages_batch(struct list_head *from,
 
 			/*
 			 * The rare folio on the deferred split list should
-			 * be split now. It should not count as a failure.
+			 * be split now. It should not count as a failure:
+			 * but increment nr_failed because, without doing so,
+			 * migrate_pages() may report success with (split but
+			 * unmigrated) pages still on its fromlist; whereas it
+			 * always reports success when its fromlist is empty.
+			 *
 			 * Only check it without removing it from the list.
 			 * Since the folio can be on deferred_split_scan()
 			 * local list and removing it can cause the local list
@@ -1669,6 +1674,7 @@ static int migrate_pages_batch(struct list_head *from,
 			if (nr_pages > 2 &&
 			    !list_empty(&folio->_deferred_list)) {
 				if (try_split_folio(folio, split_folios) == 0) {
+					nr_failed++;
 					stats->nr_thp_split += is_thp;
 					stats->nr_split++;
 					continue;
mm/mm_init.c | 43
@@ -2523,9 +2523,6 @@ EXPORT_SYMBOL(init_on_alloc);
 DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_FREE_DEFAULT_ON, init_on_free);
 EXPORT_SYMBOL(init_on_free);
 
-DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON, init_mlocked_on_free);
-EXPORT_SYMBOL(init_mlocked_on_free);
-
 static bool _init_on_alloc_enabled_early __read_mostly
 				= IS_ENABLED(CONFIG_INIT_ON_ALLOC_DEFAULT_ON);
 static int __init early_init_on_alloc(char *buf)
@@ -2543,14 +2540,6 @@ static int __init early_init_on_free(char *buf)
 }
 early_param("init_on_free", early_init_on_free);
 
-static bool _init_mlocked_on_free_enabled_early __read_mostly
-				= IS_ENABLED(CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON);
-static int __init early_init_mlocked_on_free(char *buf)
-{
-	return kstrtobool(buf, &_init_mlocked_on_free_enabled_early);
-}
-early_param("init_mlocked_on_free", early_init_mlocked_on_free);
-
 DEFINE_STATIC_KEY_MAYBE(CONFIG_DEBUG_VM, check_pages_enabled);
 
 /*
@@ -2578,21 +2567,12 @@ static void __init mem_debugging_and_hardening_init(void)
 	}
 #endif
 
-	if ((_init_on_alloc_enabled_early || _init_on_free_enabled_early ||
-	    _init_mlocked_on_free_enabled_early) &&
+	if ((_init_on_alloc_enabled_early || _init_on_free_enabled_early) &&
 	    page_poisoning_requested) {
 		pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, "
-			"will take precedence over init_on_alloc, init_on_free "
-			"and init_mlocked_on_free\n");
+			"will take precedence over init_on_alloc and init_on_free\n");
 		_init_on_alloc_enabled_early = false;
 		_init_on_free_enabled_early = false;
-		_init_mlocked_on_free_enabled_early = false;
-	}
-
-	if (_init_mlocked_on_free_enabled_early && _init_on_free_enabled_early) {
-		pr_info("mem auto-init: init_on_free is on, "
-			"will take precedence over init_mlocked_on_free\n");
-		_init_mlocked_on_free_enabled_early = false;
 	}
 
 	if (_init_on_alloc_enabled_early) {
@@ -2609,17 +2589,9 @@ static void __init mem_debugging_and_hardening_init(void)
 		static_branch_disable(&init_on_free);
 	}
 
-	if (_init_mlocked_on_free_enabled_early) {
-		want_check_pages = true;
-		static_branch_enable(&init_mlocked_on_free);
-	} else {
-		static_branch_disable(&init_mlocked_on_free);
-	}
-
-	if (IS_ENABLED(CONFIG_KMSAN) && (_init_on_alloc_enabled_early ||
-	    _init_on_free_enabled_early || _init_mlocked_on_free_enabled_early))
-		pr_info("mem auto-init: please make sure init_on_alloc, init_on_free and "
-			"init_mlocked_on_free are disabled when running KMSAN\n");
+	if (IS_ENABLED(CONFIG_KMSAN) &&
+	    (_init_on_alloc_enabled_early || _init_on_free_enabled_early))
+		pr_info("mem auto-init: please make sure init_on_alloc and init_on_free are disabled when running KMSAN\n");
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
 	if (debug_pagealloc_enabled()) {
@@ -2658,10 +2630,9 @@ static void __init report_meminit(void)
 	else
 		stack = "off";
 
-	pr_info("mem auto-init: stack:%s, heap alloc:%s, heap free:%s, mlocked free:%s\n",
+	pr_info("mem auto-init: stack:%s, heap alloc:%s, heap free:%s\n",
 		stack, want_init_on_alloc(GFP_KERNEL) ? "on" : "off",
-		want_init_on_free() ? "on" : "off",
-		want_init_mlocked_on_free() ? "on" : "off");
+		want_init_on_free() ? "on" : "off");
 	if (want_init_on_free())
 		pr_info("mem auto-init: clearing system memory may take some time...\n");
 }
mm/page_alloc.c
@@ -1016,7 +1016,7 @@ static inline bool should_skip_kasan_poison(struct page *page)
 	return page_kasan_tag(page) == KASAN_TAG_KERNEL;
 }
 
-void kernel_init_pages(struct page *page, int numpages)
+static void kernel_init_pages(struct page *page, int numpages)
 {
 	int i;
 
mm/page_table_check.c
@@ -73,6 +73,9 @@ static void page_table_check_clear(unsigned long pfn, unsigned long pgcnt)
 	page = pfn_to_page(pfn);
 	page_ext = page_ext_get(page);
 
+	if (!page_ext)
+		return;
+
 	BUG_ON(PageSlab(page));
 	anon = PageAnon(page);
 
@@ -110,6 +113,9 @@ static void page_table_check_set(unsigned long pfn, unsigned long pgcnt,
 	page = pfn_to_page(pfn);
 	page_ext = page_ext_get(page);
 
+	if (!page_ext)
+		return;
+
 	BUG_ON(PageSlab(page));
 	anon = PageAnon(page);
 
@@ -140,7 +146,10 @@ void __page_table_check_zero(struct page *page, unsigned int order)
 	BUG_ON(PageSlab(page));
 
 	page_ext = page_ext_get(page);
-	BUG_ON(!page_ext);
+
+	if (!page_ext)
+		return;
+
 	for (i = 0; i < (1ul << order); i++) {
 		struct page_table_check *ptc = get_page_table_check(page_ext);
 
mm/shmem.c
@@ -1786,7 +1786,7 @@ static int shmem_replace_folio(struct folio **foliop, gfp_t gfp,
 	xa_lock_irq(&swap_mapping->i_pages);
 	error = shmem_replace_entry(swap_mapping, swap_index, old, new);
 	if (!error) {
-		mem_cgroup_migrate(old, new);
+		mem_cgroup_replace_folio(old, new);
 		__lruvec_stat_mod_folio(new, NR_FILE_PAGES, 1);
 		__lruvec_stat_mod_folio(new, NR_SHMEM, 1);
 		__lruvec_stat_mod_folio(old, NR_FILE_PAGES, -1);
security/Kconfig.hardening
@@ -255,21 +255,6 @@ config INIT_ON_FREE_DEFAULT_ON
 	  touching "cold" memory areas. Most cases see 3-5% impact. Some
 	  synthetic workloads have measured as high as 8%.
 
-config INIT_MLOCKED_ON_FREE_DEFAULT_ON
-	bool "Enable mlocked memory zeroing on free"
-	depends on !KMSAN
-	help
-	  This config has the effect of setting "init_mlocked_on_free=1"
-	  on the kernel command line. If it is enabled, all mlocked process
-	  memory is zeroed when freed. This restriction to mlocked memory
-	  improves performance over "init_on_free" but can still be used to
-	  protect confidential data like key material from content exposures
-	  to other processes, as well as live forensics and cold boot attacks.
-	  Any non-mlocked memory is not cleared before it is reassigned. This
-	  configuration can be overwritten by setting "init_mlocked_on_free=0"
-	  on the command line. The "init_on_free" boot option takes
-	  precedence over "init_mlocked_on_free".
-
 config CC_HAS_ZERO_CALL_USED_REGS
 	def_bool $(cc-option,-fzero-call-used-regs=used-gpr)
 	# https://github.com/ClangBuiltLinux/linux/issues/1766
tools/testing/selftests/mm/map_fixed_noreplace.c
@@ -67,7 +67,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error: munmap failed!?\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 5*PAGE_SIZE at base\n");
 
 	addr = base_addr + page_size;
 	size = 3 * page_size;
@@ -76,7 +77,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error: first mmap() failed unexpectedly\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 3*PAGE_SIZE at base+PAGE_SIZE\n");
 
 	/*
 	 * Exact same mapping again:
@@ -93,7 +95,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:1: mmap() succeeded when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 5*PAGE_SIZE at base\n");
 
 	/*
 	 * Second mapping contained within first:
@@ -111,7 +114,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:2: mmap() succeeded when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 2*PAGE_SIZE at base+PAGE_SIZE\n");
 
 	/*
	 * Overlap end of existing mapping:
@@ -128,7 +132,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:3: mmap() succeeded when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 2*PAGE_SIZE at base+(3*PAGE_SIZE)\n");
 
 	/*
	 * Overlap start of existing mapping:
@@ -145,7 +150,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:4: mmap() succeeded when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() 2*PAGE_SIZE bytes at base\n");
 
 	/*
	 * Adjacent to start of existing mapping:
@@ -162,7 +168,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:5: mmap() failed when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() PAGE_SIZE at base\n");
 
 	/*
	 * Adjacent to end of existing mapping:
@@ -179,7 +186,8 @@ int main(void)
 		dump_maps();
 		ksft_exit_fail_msg("Error:6: mmap() failed when it shouldn't have\n");
 	}
-	ksft_test_result_pass("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_print_msg("mmap() @ 0x%lx-0x%lx p=%p result=%m\n", addr, addr + size, p);
+	ksft_test_result_pass("mmap() PAGE_SIZE at base+(4*PAGE_SIZE)\n");
 
 	addr = base_addr;
 	size = 5 * page_size;
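All of the renamed tests exercise MAP_FIXED_NOREPLACE; a hypothetical standalone reduction of the core assertion (not part of the selftest) looks like this:

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t sz = 4096;
	void *base = mmap(NULL, sz, PROT_READ,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED) { perror("mmap"); return 1; }

	/* A second mapping at the same address must not replace the first:
	 * unlike MAP_FIXED, MAP_FIXED_NOREPLACE fails with EEXIST. */
	void *p = mmap(base, sz, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
	if (p == MAP_FAILED && errno == EEXIST)
		printf("refused to clobber %p, as expected\n", base);
	else
		printf("unexpected result: %p\n", p);
	return 0;
}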