Fixing the following checkpatch.pl errors:
ERROR: "foo * bar" should be "foo *bar"
+ befs_blocknr_t blockno, befs_block_run * run);
WARNING: Missing a blank line after declarations
+ struct buffer_head *bh;
+ befs_debug(sb, "---> %s length: %llu", __func__, len);
WARNING: Block comments use * on subsequent lines
+ /*
+ Double indir block, plus all the indirect blocks it maps.
(and other instances of these)
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Convert function descriptions to kernel-doc style.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Fixing typos in kernel-doc function descriptions in fs/befs/btree.c.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Fixing the following checkpatch.pl error:
ERROR: "foo * bar" should be "foo *bar"
+befs_load_sb(struct super_block *sb, befs_super_block * disk_sb)
And the following warnings:
WARNING: suspect code indent for conditional statements (8, 12)
+ if (disk_sb->fs_byte_order == BEFS_BYTEORDER_NATIVE_LE)
+ befs_sb->byte_order = BEFS_BYTESEX_LE;
WARNING: suspect code indent for conditional statements (8, 12)
+ else if (disk_sb->fs_byte_order == BEFS_BYTEORDER_NATIVE_BE)
+ befs_sb->byte_order = BEFS_BYTESEX_BE;
WARNING: break quoted strings at a space character
+ befs_error(sb, "blocksize(%u) cannot be larger"
+ "than system pagesize(%lu)", befs_sb->block_size,
WARNING: line over 80 characters
+ if (befs_sb->log_start != befs_sb->log_end || befs_sb->flags == BEFS_DIRTY) {
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
The description of befs_load_sb was confusing the kernel-doc system since,
because it starts with /**, it thinks it will document the function with
kernel-doc formatting. Which it isn't.
Fix other comment style issues in the file while we are at it.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
ag_shift and blocks_per_ag contain the same information in different ways,
same as block_shift and block_size do. It is worth checking this two are
consistent, but since blocks_per_ag isn't documented as mandatory to use
some implementations of befs don't enforce this, so making it non-fatal if
they don't match and just having it as a warning.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
befs_dump_super_block() wasn't giving the inode_size information when
dumping all elements of the superblock. Add this element to have complete
information of the superblock.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
There is no need to init block, since it will be overwitten later by
iaddr2blockno().
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
For validating superblock state, add flags field to befs_sb_info, read the state from the disk
and check if it is equal to BEFS_DIRTY.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
befs_btree_find(), the only caller of befs_find_key(), only cares about if
the return from that function is BEFS_BT_MATCH or not. It never uses the
partial match given with BEFS_BT_PARMATCH. Make the overflow return clearer
by having BEFS_BT_OVERFLOW instead of BEFS_BT_PARMATCH.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
ret is initialized to -EIO and is never modified, so remove ret and use
-EIO directly.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
There is no need to init res, since it will be overwitten later by
befs_fblock2brun().
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Remove *befs_sb and just call BEFS_SB(sb) directly, since the returned
value by this function is only used once.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
node_off is unconditionally set to bt_super.root_node_ptr, so no need to
init it to zero.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
There is no need to set *value, it will be overwritten later.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
As VFS expects, lookup inserts NULL inode to dentry when the named inode
does not exist.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
The calls to brelse are useless since dbl_indir_block and indir_block
are NULL.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
The only caller of befs_find_brun_direct is befs_fblock2brun, which
already validates that the block is within the range of direct blocks.
So remove the duplicate validation.
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Acked-by: Luis de Bethencourt <luisbg@osg.samsung.com>
The only place the values of free_node_ptr and max_size are read is in
befs_dump_index_entry(), which both times it is called, it is passed the on
disk superblock. Removing assignment of unused values.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
befs_error() is used in potential errors that could happen in befs to
provide informational log messages. befs_debug() is silent when
CONFIG_BEFS_DEBUG=no, and very verbose when switched on, which is why it is
used for general debugging but not for errors.
Fix a few cases where the befs debug utility usage isn't following the
expected pattern. To make sure we have consistent information in the logs.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Use macro directly instead of via assigning it to an unchanging variable.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Acked-by: Salah Triki <salah.triki@gmail.com>
No need to dereference dentry twice to get the name when we already have
it stored in a local variable.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
This comment with a mysterious unfinished line confuses the kernel-doc
system since, because it starts with /**, it thinks it is documenting a
function.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Log error only when silent flag is not set.
Fixes: dbe6460388bc ("fs/befs/linuxvfs.c: check silent flag before logging errors")
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Acked-by: Salah Triki <salah.triki@gmail.com>
Merge updates from Andrew Morton:
- fsnotify updates
- ocfs2 updates
- all of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (127 commits)
console: don't prefer first registered if DT specifies stdout-path
cred: simpler, 1D supplementary groups
CREDITS: update Pavel's information, add GPG key, remove snail mail address
mailmap: add Johan Hovold
.gitattributes: set git diff driver for C source code files
uprobes: remove function declarations from arch/{mips,s390}
spelling.txt: "modeled" is spelt correctly
nmi_backtrace: generate one-line reports for idle cpus
arch/tile: adopt the new nmi_backtrace framework
nmi_backtrace: do a local dump_stack() instead of a self-NMI
nmi_backtrace: add more trigger_*_cpu_backtrace() methods
min/max: remove sparse warnings when they're nested
Documentation/filesystems/proc.txt: add more description for maps/smaps
mm, proc: fix region lost in /proc/self/smaps
proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self
proc: add LSM hook checks to /proc/<tid>/timerslack_ns
proc: relax /proc/<tid>/timerslack_ns capability requirements
meminfo: break apart a very long seq_printf with #ifdefs
seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char
proc: faster /proc/*/status
...
These inode operations are no longer used; remove them.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Current supplementary groups code can massively overallocate memory and
is implemented in a way so that access to individual gid is done via 2D
array.
If number of gids is <= 32, memory allocation is more or less tolerable
(140/148 bytes). But if it is not, code allocates full page (!)
regardless and, what's even more fun, doesn't reuse small 32-entry
array.
2D array means dependent shifts, loads and LEAs without possibility to
optimize them (gid is never known at compile time).
All of the above is unnecessary. Switch to the usual
trailing-zero-len-array scheme. Memory is allocated with
kmalloc/vmalloc() and only as much as needed. Accesses become simpler
(LEA 8(gi,idx,4) or even without displacement).
Maximum number of gids is 65536 which translates to 256KB+8 bytes. I
think kernel can handle such allocation.
On my usual desktop system with whole 9 (nine) aux groups, struct
group_info shrinks from 148 bytes to 44 bytes, yay!
Nice side effects:
- "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
- fix little mess in net/ipv4/ping.c
should have been using GROUP_AT macro but this point becomes moot,
- aux group allocation is persistent and should be accounted as such.
Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recently, Redhat reported that nvml test suite failed on QEMU/KVM,
more detailed info please refer to:
https://bugzilla.redhat.com/show_bug.cgi?id=1365721
Actually, this bug is not only for NVDIMM/DAX but also for any other
file systems. This simple test case abstracted from nvml can easily
reproduce this bug in common environment:
-------------------------- testcase.c -----------------------------
int
is_pmem_proc(const void *addr, size_t len)
{
const char *caddr = addr;
FILE *fp;
if ((fp = fopen("/proc/self/smaps", "r")) == NULL) {
printf("!/proc/self/smaps");
return 0;
}
int retval = 0; /* assume false until proven otherwise */
char line[PROCMAXLEN]; /* for fgets() */
char *lo = NULL; /* beginning of current range in smaps file */
char *hi = NULL; /* end of current range in smaps file */
int needmm = 0; /* looking for mm flag for current range */
while (fgets(line, PROCMAXLEN, fp) != NULL) {
static const char vmflags[] = "VmFlags:";
static const char mm[] = " wr";
/* check for range line */
if (sscanf(line, "%p-%p", &lo, &hi) == 2) {
if (needmm) {
/* last range matched, but no mm flag found */
printf("never found mm flag.\n");
break;
} else if (caddr < lo) {
/* never found the range for caddr */
printf("#######no match for addr %p.\n", caddr);
break;
} else if (caddr < hi) {
/* start address is in this range */
size_t rangelen = (size_t)(hi - caddr);
/* remember that matching has started */
needmm = 1;
/* calculate remaining range to search for */
if (len > rangelen) {
len -= rangelen;
caddr += rangelen;
printf("matched %zu bytes in range "
"%p-%p, %zu left over.\n",
rangelen, lo, hi, len);
} else {
len = 0;
printf("matched all bytes in range "
"%p-%p.\n", lo, hi);
}
}
} else if (needmm && strncmp(line, vmflags,
sizeof(vmflags) - 1) == 0) {
if (strstr(&line[sizeof(vmflags) - 1], mm) != NULL) {
printf("mm flag found.\n");
if (len == 0) {
/* entire range matched */
retval = 1;
break;
}
needmm = 0; /* saw what was needed */
} else {
/* mm flag not set for some or all of range */
printf("range has no mm flag.\n");
break;
}
}
}
fclose(fp);
printf("returning %d.\n", retval);
return retval;
}
void *Addr;
size_t Size;
/*
* worker -- the work each thread performs
*/
static void *
worker(void *arg)
{
int *ret = (int *)arg;
*ret = is_pmem_proc(Addr, Size);
return NULL;
}
int main(int argc, char *argv[])
{
if (argc < 2 || argc > 3) {
printf("usage: %s file [env].\n", argv[0]);
return -1;
}
int fd = open(argv[1], O_RDWR);
struct stat stbuf;
fstat(fd, &stbuf);
Size = stbuf.st_size;
Addr = mmap(0, stbuf.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
close(fd);
pthread_t threads[NTHREAD];
int ret[NTHREAD];
/* kick off NTHREAD threads */
for (int i = 0; i < NTHREAD; i++)
pthread_create(&threads[i], NULL, worker, &ret[i]);
/* wait for all the threads to complete */
for (int i = 0; i < NTHREAD; i++)
pthread_join(threads[i], NULL);
/* verify that all the threads return the same value */
for (int i = 1; i < NTHREAD; i++) {
if (ret[0] != ret[i]) {
printf("Error i %d ret[0] = %d ret[i] = %d.\n", i,
ret[0], ret[i]);
}
}
printf("%d", ret[0]);
return 0;
}
It failed as some threads can not find the memory region in
"/proc/self/smaps" which is allocated in the main process
It is caused by proc fs which uses 'file->version' to indicate the VMA that
is the last one has already been handled by read() system call. When the
next read() issues, it uses the 'version' to find the VMA, then the next
VMA is what we want to handle, the related code is as follows:
if (last_addr) {
vma = find_vma(mm, last_addr);
if (vma && (vma = m_next_vma(priv, vma)))
return vma;
}
However, VMA will be lost if the last VMA is gone, e.g:
The process VMA list is A->B->C->D
CPU 0 CPU 1
read() system call
handle VMA B
version = B
return to userspace
unmap VMA B
issue read() again to continue to get
the region info
find_vma(version) will get VMA C
m_next_vma(C) will get VMA D
handle D
!!! VMA C is lost !!!
In order to fix this bug, we make 'file->version' indicate the end address
of the current VMA. m_start will then look up a vma which with vma_start
< last_vm_end and moves on to the next vma if we found the same or an
overlapping vma. This will guarantee that we will not miss an exclusive
vma but we can still miss one if the previous vma was shrunk. This is
acceptable because guaranteeing "never miss a vma" is simply not feasible.
User has to cope with some inconsistencies if the file is not read in one
go.
[mhocko@suse.com: changelog fixes]
Link: http://lkml.kernel.org/r/1475296958-27652-1-git-send-email-robert.hu@intel.com
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Robert Hu <robert.hu@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In changing from checking ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)
to capable(CAP_SYS_NICE), I missed that ptrace_my_access succeeds when p
== current, but the CAP_SYS_NICE doesn't.
Thus while the previous commit was intended to loosen the needed
privileges to modify a processes timerslack, it needlessly restricted a
task modifying its own timerslack via the proc/<tid>/timerslack_ns
(which is permitted also via the PR_SET_TIMERSLACK method).
This patch corrects this by checking if p == current before checking the
CAP_SYS_NICE value.
This patch applies on top of my two previous patches currently in -mm
Link: http://lkml.kernel.org/r/1471906870-28624-1-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As requested, this patch checks the existing LSM hooks
task_getscheduler/task_setscheduler when reading or modifying the task's
timerslack value.
Previous versions added new get/settimerslack LSM hooks, but since they
checked the same PROCESS__SET/GETSCHED values as existing hooks, it was
suggested we just use the existing ones.
Link: http://lkml.kernel.org/r/1469132667-17377-2-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: James Morris <jmorris@namei.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When an interface to allow a task to change another tasks timerslack was
first proposed, it was suggested that something greater then
CAP_SYS_NICE would be needed, as a task could be delayed further then
what normally could be done with nice adjustments.
So CAP_SYS_PTRACE was adopted instead for what became the
/proc/<tid>/timerslack_ns interface. However, for Android (where this
feature originates), giving the system_server CAP_SYS_PTRACE would allow
it to observe and modify all tasks memory. This is considered too high
a privilege level for only needing to change the timerslack.
After some discussion, it was realized that a CAP_SYS_NICE process can
set a task as SCHED_FIFO, so they could fork some spinning processes and
set them all SCHED_FIFO 99, in effect delaying all other tasks for an
infinite amount of time.
So as a CAP_SYS_NICE task can already cause trouble for other tasks,
using it as a required capability for accessing and modifying
/proc/<tid>/timerslack_ns seems sufficient.
Thus, this patch loosens the capability requirements to CAP_SYS_NICE and
removes CAP_SYS_PTRACE, simplifying some of the code flow as well.
This is technically an ABI change, but as the feature just landed in
4.6, I suspect no one is yet using it.
Link: http://lkml.kernel.org/r/1469132667-17377-1-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Nick Kralevich <nnk@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use a specific routine to emit most lines so that the code is easier to
read and maintain.
akpm:
text data bss dec hex filename
2976 8 0 2984 ba8 fs/proc/meminfo.o before
2669 8 0 2677 a75 fs/proc/meminfo.o after
Link: http://lkml.kernel.org/r/8fce7fdef2ba081a4ef531594e97da8a9feebb58.1470810406.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>