License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                            # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                        270
GPL-2.0+ WITH Linux-syscall-note                       169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
LGPL-2.1+ WITH Linux-syscall-note                       15
GPL-1.0+ WITH Linux-syscall-note                        14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
LGPL-2.0+ WITH Linux-syscall-note                        4
LGPL-2.1 WITH Linux-syscall-note                         3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
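
As an illustration of the last step (adding the tag in the comment format each file type expects), here is a minimal userspace sketch. It is not the script Thomas and Greg used: the helpers `is_uapi_path`, `has_suffix` and `format_spdx_line` are hypothetical names, only .c and .h files are handled, and it covers only the "no license information" case described above. The `//` form for .c files matches the first line of this file; the `/* */` form for headers follows the kernel's documented convention.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if the path contains a "/uapi/" component. */
static int is_uapi_path(const char *path)
{
	return strstr(path, "/uapi/") != NULL;
}

/* Hypothetical helper: true if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Write the SPDX line for a file that had no license information.
 * uapi files get the syscall-note exception, everything else plain GPL-2.0.
 * Returns the snprintf() result, or -1 for file types this sketch
 * does not handle.
 */
static int format_spdx_line(const char *path, char *buf, size_t len)
{
	const char *id = is_uapi_path(path) ?
		"GPL-2.0 WITH Linux-syscall-note" : "GPL-2.0";

	if (has_suffix(path, ".h"))	/* headers: block comment */
		return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	if (has_suffix(path, ".c"))	/* sources: line comment */
		return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	return -1;
}
```

Called with "block/bounce.c" this yields the same line that now opens this file; a header under a */uapi/* path gets the block-comment form with the Linux-syscall-note exception.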
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
// SPDX-License-Identifier: GPL-2.0
2006-08-29 18:06:00 +00:00
/* bounce buffer handling for block devices
 *
 * - Split from highmem.c
 */

2014-06-06 21:38:30 +00:00
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

2006-08-29 18:06:00 +00:00
#include <linux/mm.h>
2011-10-16 06:01:52 +00:00
#include <linux/export.h>
2006-08-29 18:06:00 +00:00
#include <linux/swap.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
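
The first rule above (include only what is used: gfp.h if only gfp is used, slab.h if slab is used) can be sketched in a few lines of C. This is only an illustration and not slabh-sweep.py: the marker lists below are hypothetical and deliberately incomplete, since the text does not say which patterns the real script matched. slab.h wins when both kinds of usage are present because, as noted above, slab.h itself includes gfp.h.

```c
#include <string.h>

/* Hypothetical, incomplete marker lists; the real script's patterns are unknown. */
static const char *const slab_markers[] = {
	"kmalloc(", "kzalloc(", "kfree(", "kmem_cache_",
};
static const char *const gfp_markers[] = {
	"GFP_KERNEL", "GFP_ATOMIC", "alloc_pages(",
};

/* True if any marker occurs anywhere in the source text. */
static int contains_any(const char *src, const char *const *markers, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strstr(src, markers[i]))
			return 1;
	return 0;
}

/*
 * Decide which header a file needs: "linux/slab.h" if slab facilities
 * are used, "linux/gfp.h" if only gfp facilities are used, NULL if
 * neither is needed.
 */
static const char *needed_include(const char *src)
{
	if (contains_any(src, slab_markers,
			 sizeof(slab_markers) / sizeof(slab_markers[0])))
		return "linux/slab.h";
	if (contains_any(src, gfp_markers,
			 sizeof(gfp_markers) / sizeof(gfp_markers[0])))
		return "linux/gfp.h";
	return NULL;
}
```

A plain substring match like this would also trigger on comments and strings, which is one reason the conversion steps below still involved manual review.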
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 08:04:11 +00:00
#include <linux/gfp.h>
2006-08-29 18:06:00 +00:00
#include <linux/bio.h>
#include <linux/pagemap.h>
#include <linux/mempool.h>
#include <linux/blkdev.h>
2015-05-22 21:13:32 +00:00
#include <linux/backing-dev.h>
2006-08-29 18:06:00 +00:00
#include <linux/init.h>
#include <linux/hash.h>
#include <linux/highmem.h>
2014-06-06 21:38:30 +00:00
#include <linux/printk.h>
2006-08-29 18:06:00 +00:00
#include <asm/tlbflush.h>

tracing/events: convert block trace points to TRACE_EVENT()
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...
Cons:
- no dev_t info for the output of plug, unplug_timer and unplug_io events.
no dev_t info for getrq and sleeprq events if bio == NULL.
no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
This is mainly because we can't get the device from a request queue.
But this may change in the future.
- A packet command is converted to a string in TP_assign, not TP_print,
whereas blktrace does the conversion just before output.
Since pc requests should be rather rare, this is not a big issue.
- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
has a unique format, which means we have some unused data in a trace entry.
The overhead is minimized by using __dynamic_array() instead of __array().
I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
     dd                 dd + ioctl blktrace   dd + TRACE_EVENT (splice)
1    7.36s, 42.7 MB/s   7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
2    7.43s, 42.3 MB/s   7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
3    7.38s, 42.6 MB/s   7.45s, 42.2 MB/s      7.41s, 42.5 MB/s
So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.
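
To put a number on "very small", the elapsed times from the three runs above can be averaged and compared against the plain dd baseline. The sketch below does just that arithmetic; the helper names `mean` and `overhead_percent` are made up for this illustration, and the arrays simply repeat the timings from the table.

```c
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative slowdown of `traced` versus `base`, in percent of the base mean. */
static double overhead_percent(const double *base, const double *traced, size_t n)
{
	double b = mean(base, n);

	return (mean(traced, n) - b) / b * 100.0;
}

/* Elapsed seconds copied from the three benchmark runs in the table. */
static const double dd_plain[]       = { 7.36, 7.43, 7.38 };
static const double dd_blktrace[]    = { 7.50, 7.48, 7.45 };
static const double dd_trace_event[] = { 7.41, 7.43, 7.41 };
```

With these inputs, `overhead_percent(dd_plain, dd_blktrace, 3)` comes out at roughly 1.2% and `overhead_percent(dd_plain, dd_trace_event, 3)` at roughly 0.4%. With only three runs per column this is an indication rather than a statistically solid result.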
And the binary output of TRACE_EVENT is much smaller than blktrace:
# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
Following are some comparisons between TRACE_EVENT and blktrace:
plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]
unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1
remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384
bio_backmerge:
kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]
getrq:
kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
bash-2066 [001] 1072.953770: 8,0 G N [bash]
bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
rq_complete:
konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
rq_insert:
kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]
Changelog from v2 -> v3:
- use the newly introduced __dynamic_array().
Changelog from v1 -> v2:
- use __string() instead of __array() to minimize the memory required
to store hex dump of rq->cmd().
- support large pc requests.
- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
- some cleanups.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-09 05:43:05 +00:00
#include <trace/events/block.h>
2017-06-19 07:26:21 +00:00
#include "blk.h"
2009-06-09 05:43:05 +00:00
|
|
|
|
2006-08-29 18:06:00 +00:00
#define POOL_SIZE 64
#define ISA_POOL_SIZE 16

2018-05-20 22:25:47 +00:00
static struct bio_set bounce_bio_set, bounce_bio_split;
2021-03-31 07:29:59 +00:00
static mempool_t page_pool;
2006-08-29 18:06:00 +00:00

2018-10-21 18:02:36 +00:00
static void init_bounce_bioset(void)
{
	static bool bounce_bs_setup;
	int ret;

	if (bounce_bs_setup)
		return;

	ret = bioset_init(&bounce_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS);
	BUG_ON(ret);
	if (bioset_integrity_create(&bounce_bio_set, BIO_POOL_SIZE))
		BUG_ON(1);

	ret = bioset_init(&bounce_bio_split, BIO_POOL_SIZE, 0, 0);
	BUG_ON(ret);
	bounce_bs_setup = true;
}

2006-08-29 18:06:00 +00:00
static __init int init_emergency_pool(void)
{
2018-05-20 22:25:47 +00:00
	int ret;
2021-03-31 07:30:00 +00:00

#ifndef CONFIG_MEMORY_HOTPLUG
2011-10-20 19:24:30 +00:00
	if (max_pfn <= max_low_pfn)
2006-08-29 18:06:00 +00:00
		return 0;
2011-10-20 19:24:30 +00:00
#endif
2006-08-29 18:06:00 +00:00

2018-05-20 22:25:47 +00:00
	ret = mempool_init_page_pool(&page_pool, POOL_SIZE, 0);
	BUG_ON(ret);
2014-06-06 21:38:30 +00:00
	pr_info("pool size: %d pages\n", POOL_SIZE);
2006-08-29 18:06:00 +00:00

2018-10-21 18:02:36 +00:00
	init_bounce_bioset();
2006-08-29 18:06:00 +00:00
	return 0;
}

__initcall(init_emergency_pool);

/*
 * highmem version, map in to vec
 */
static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
{
	unsigned char *vto;

2011-11-25 15:14:39 +00:00
	vto = kmap_atomic(to->bv_page);
2006-08-29 18:06:00 +00:00
	memcpy(vto + to->bv_offset, vfrom, to->bv_len);
2011-11-25 15:14:39 +00:00
	kunmap_atomic(vto);
2006-08-29 18:06:00 +00:00
}

/*
 * Simple bounce buffer support for highmem pages. Depending on the
 * queue gfp mask set, *to may or may not be a highmem page. kmap it
 * always, it will do the Right Thing
 */
static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
{
	unsigned char *vfrom;
2017-12-18 12:22:07 +00:00
	struct bio_vec tovec, fromvec;
2013-11-24 01:19:00 +00:00
	struct bvec_iter iter;
2017-12-18 12:22:07 +00:00
	/*
	 * The bio of @from is created by bounce, so we can iterate
	 * its bvec from start to end, but the @from->bi_iter can't be
	 * trusted because it might be changed by splitting.
	 */
	struct bvec_iter from_iter = BVEC_ITER_ALL_INIT;
2013-11-24 01:19:00 +00:00

	bio_for_each_segment(tovec, to, iter) {
2017-12-18 12:22:07 +00:00
		fromvec = bio_iter_iovec(from, from_iter);
		if (tovec.bv_page != fromvec.bv_page) {
2013-11-24 01:19:00 +00:00
			/*
			 * fromvec->bv_offset and fromvec->bv_len might have
			 * been modified by the block layer, so use the original
			 * copy, bounce_copy_vec already uses tovec->bv_len
			 */
2017-12-18 12:22:07 +00:00
			vfrom = page_address(fromvec.bv_page) +
2013-11-24 01:19:00 +00:00
				tovec.bv_offset;

			bounce_copy_vec(&tovec, vfrom);
			flush_dcache_page(tovec.bv_page);
		}
2017-12-18 12:22:07 +00:00
		bio_advance_iter(from, &from_iter, tovec.bv_len);
2006-08-29 18:06:00 +00:00
	}
}

2021-03-31 07:29:59 +00:00
static void bounce_end_io(struct bio *bio)
2006-08-29 18:06:00 +00:00
{
	struct bio *bio_orig = bio->bi_private;
2017-12-18 12:22:06 +00:00
	struct bio_vec *bvec, orig_vec;
	struct bvec_iter orig_iter = bio_orig->bi_iter;
2019-02-15 11:13:19 +00:00
	struct bvec_iter_all iter_all;
2006-08-29 18:06:00 +00:00

	/*
	 * free up bounce indirect pages used
	 */
2019-04-25 07:03:00 +00:00
	bio_for_each_segment_all(bvec, bio, iter_all) {
2017-12-18 12:22:06 +00:00
		orig_vec = bio_iter_iovec(bio_orig, orig_iter);
		if (bvec->bv_page != orig_vec.bv_page) {
			dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
2021-03-31 07:29:59 +00:00
			mempool_free(bvec->bv_page, &page_pool);
2017-12-18 12:22:06 +00:00
		}
		bio_advance_iter(bio_orig, &orig_iter, orig_vec.bv_len);
2006-08-29 18:06:00 +00:00
	}

2017-06-03 07:38:06 +00:00
	bio_orig->bi_status = bio->bi_status;
2015-07-20 13:29:37 +00:00
	bio_endio(bio_orig);
2006-08-29 18:06:00 +00:00
	bio_put(bio);
}

2015-07-20 13:29:37 +00:00
static void bounce_end_io_write(struct bio *bio)
2006-08-29 18:06:00 +00:00
{
2021-03-31 07:29:59 +00:00
	bounce_end_io(bio);
2006-08-29 18:06:00 +00:00
}

2021-03-31 07:29:59 +00:00
static void bounce_end_io_read(struct bio *bio)
2006-08-29 18:06:00 +00:00
{
	struct bio *bio_orig = bio->bi_private;

2017-06-03 07:38:06 +00:00
	if (!bio->bi_status)
2006-08-29 18:06:00 +00:00
		copy_to_high_bio_irq(bio_orig, bio);

2021-03-31 07:29:59 +00:00
	bounce_end_io(bio);
2006-08-29 18:06:00 +00:00
}

2021-02-24 07:24:06 +00:00
static struct bio *bounce_clone_bio(struct bio *bio_src)
2018-07-24 07:52:34 +00:00
{
	struct bvec_iter iter;
	struct bio_vec bv;
	struct bio *bio;

	/*
	 * Pre immutable biovecs, __bio_clone() used to just do a memcpy from
	 * bio_src->bi_io_vec to bio->bi_io_vec.
	 *
	 * We can't do that anymore, because:
	 *
	 * - The point of cloning the biovec is to produce a bio with a biovec
	 *   the caller can modify: bi_idx and bi_bvec_done should be 0.
	 *
2021-03-11 11:01:37 +00:00
	 * - The original bio could've had more than BIO_MAX_VECS biovecs; if
2018-07-24 07:52:34 +00:00
	 *   we tried to clone the whole thing bio_alloc_bioset() would fail.
	 *   But the clone should succeed as long as the number of biovecs we
2021-03-11 11:01:37 +00:00
	 *   actually need to allocate is fewer than BIO_MAX_VECS.
2018-07-24 07:52:34 +00:00
	 *
	 * - Lastly, bi_vcnt should not be looked at or relied upon by code
	 *   that does not own the bio - reason being drivers don't use it for
	 *   iterating over the biovec anymore, so expecting it to be kept up
	 *   to date (i.e. for clones that share the parent biovec) is just
	 *   asking for trouble and would force extra work on
	 *   __bio_clone_fast() anyways.
	 */
2021-02-24 07:24:05 +00:00
	if (bio_is_passthrough(bio_src))
2021-02-24 07:24:07 +00:00
		bio = bio_kmalloc(GFP_NOIO | __GFP_NOFAIL,
				  bio_segments(bio_src));
2021-02-24 07:24:05 +00:00
	else
2021-02-24 07:24:06 +00:00
		bio = bio_alloc_bioset(GFP_NOIO, bio_segments(bio_src),
2021-02-24 07:24:05 +00:00
				       &bounce_bio_set);
2021-01-24 10:02:34 +00:00
	bio->bi_bdev = bio_src->bi_bdev;
2021-01-26 14:33:08 +00:00
	if (bio_flagged(bio_src, BIO_REMAPPED))
		bio_set_flag(bio, BIO_REMAPPED);
2018-07-24 07:52:34 +00:00
	bio->bi_opf = bio_src->bi_opf;
2018-11-12 17:35:25 +00:00
	bio->bi_ioprio = bio_src->bi_ioprio;
2018-07-24 07:52:34 +00:00
	bio->bi_write_hint = bio_src->bi_write_hint;
	bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector;
	bio->bi_iter.bi_size = bio_src->bi_iter.bi_size;

	switch (bio_op(bio)) {
	case REQ_OP_DISCARD:
	case REQ_OP_SECURE_ERASE:
	case REQ_OP_WRITE_ZEROES:
		break;
	case REQ_OP_WRITE_SAME:
		bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
		break;
	default:
		bio_for_each_segment(bv, bio_src, iter)
			bio->bi_io_vec[bio->bi_vcnt++] = bv;
		break;
	}

2021-02-24 07:24:06 +00:00
	if (bio_crypt_clone(bio, bio_src, GFP_NOIO) < 0)
2020-09-16 03:53:13 +00:00
		goto err_put;
block: Inline encryption support for blk-mq
We must have some way of letting a storage device driver know what
encryption context it should use for en/decrypting a request. However,
it's the upper layers (like the filesystem/fscrypt) that know about and
manage encryption contexts. As such, when the upper layer submits a bio
to the block layer, and this bio eventually reaches a device driver with
support for inline encryption, the device driver will need to have been
told the encryption context for that bio.
We want to communicate the encryption context from the upper layer to the
storage device along with the bio, when the bio is submitted to the block
layer. To do this, we add a struct bio_crypt_ctx to struct bio, which can
represent an encryption context (note that we can't use the bi_private
field in struct bio to do this because that field does not function to pass
information across layers in the storage stack). We also introduce various
functions to manipulate the bio_crypt_ctx and make the bio/request merging
logic aware of the bio_crypt_ctx.
We also make changes to blk-mq to make it handle bios with encryption
contexts. blk-mq can merge many bios into the same request. These bios need
to have contiguous data unit numbers (the necessary changes to blk-merge
are also made to ensure this) - as such, it suffices to keep the data unit
number of just the first bio, since that's all a storage driver needs to
infer the data unit number to use for each data block in each bio in a
request. blk-mq keeps track of the encryption context to be used for all
the bios in a request with the request's rq_crypt_ctx. When the first bio
is added to an empty request, blk-mq will program the encryption context
of that bio into the request_queue's keyslot manager, and store the
returned keyslot in the request's rq_crypt_ctx. All the functions to
operate on encryption contexts are in blk-crypto.c.
Upper layers only need to call bio_crypt_set_ctx with the encryption key,
algorithm and data_unit_num; they don't have to worry about getting a
keyslot for each encryption context, as blk-mq/blk-crypto handles that.
Blk-crypto also makes it possible for request-based layered devices like
dm-rq to make use of inline encryption hardware by cloning the
rq_crypt_ctx and programming a keyslot in the new request_queue when
necessary.
Note that any user of the block layer can submit bios with an
encryption context, such as filesystems, device-mapper targets, etc.
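
The merging rule described above (bios in one request must have contiguous data unit numbers, so keeping the number of the first bio is enough) can be modelled in userspace as follows. This is a simplified sketch and not the blk-crypto code: `struct model_bio`, `MODEL_DATA_UNIT_SIZE` and both functions are hypothetical, the data unit number is reduced to a single 64-bit value, and the data unit size is assumed to be fixed at 4096 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_DATA_UNIT_SIZE 4096u	/* assumed data unit size in bytes */

/* Hypothetical, simplified stand-in for a bio with an encryption context. */
struct model_bio {
	int key_id;		/* stands in for key and algorithm */
	uint64_t first_dun;	/* data unit number of the first data unit */
	uint32_t bytes;		/* payload length in bytes */
};

/* True if `next` may be appended to `prev` within the same request. */
static bool model_crypt_mergeable(const struct model_bio *prev,
				  const struct model_bio *next)
{
	if (prev->key_id != next->key_id)
		return false;
	if (prev->bytes % MODEL_DATA_UNIT_SIZE)
		return false;	/* a partial data unit cannot be continued */
	return next->first_dun ==
	       prev->first_dun + prev->bytes / MODEL_DATA_UNIT_SIZE;
}

/*
 * Data unit number for the block at byte `offset` of a merged request,
 * derived from the first bio alone.
 */
static uint64_t model_dun_at(const struct model_bio *first, uint64_t offset)
{
	return first->first_dun + offset / MODEL_DATA_UNIT_SIZE;
}
```

Because merging is refused unless the numbers line up, `model_dun_at()` can recover the number for any block in the request from the first bio, which is the property the paragraph about blk-mq relies on.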
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-14 00:37:18 +00:00
|
|
|
|
2020-09-16 03:53:13 +00:00
	if (bio_integrity(bio_src) &&
2021-02-24 07:24:06 +00:00
	    bio_integrity_clone(bio, bio_src, GFP_NOIO) < 0)
2020-09-16 03:53:13 +00:00
		goto err_put;
2018-07-24 07:52:34 +00:00

2018-12-05 17:10:35 +00:00
	bio_clone_blkg_association(bio, bio_src);
2018-12-05 17:10:32 +00:00
	blkcg_bio_issue_init(bio);
2018-09-11 18:41:30 +00:00

2018-07-24 07:52:34 +00:00
	return bio;
2020-09-16 03:53:13 +00:00

err_put:
	bio_put(bio);
	return NULL;
2018-07-24 07:52:34 +00:00
}

2021-03-31 07:30:00 +00:00
void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
2006-08-29 18:06:00 +00:00
{
2012-09-10 21:30:37 +00:00
	struct bio *bio;
	int rw = bio_data_dir(*bio_orig);
2013-11-24 01:19:00 +00:00
	struct bio_vec *to, from;
	struct bvec_iter iter;
2017-06-18 04:38:58 +00:00
	unsigned i = 0;
	bool bounce = false;
	int sectors = 0;
2006-08-29 18:06:00 +00:00

2017-06-18 04:38:58 +00:00
	bio_for_each_segment(from, *bio_orig, iter) {
2021-03-11 11:01:37 +00:00
		if (i++ < BIO_MAX_VECS)
2017-06-18 04:38:58 +00:00
			sectors += from.bv_len >> 9;
2021-03-31 07:30:00 +00:00
		if (PageHighMem(from.bv_page))
2017-06-18 04:38:58 +00:00
			bounce = true;
	}
	if (!bounce)
		return;

2021-02-24 07:24:05 +00:00
	if (!bio_is_passthrough(*bio_orig) &&
	    sectors < bio_sectors(*bio_orig)) {
2018-05-20 22:25:47 +00:00
		bio = bio_split(*bio_orig, sectors, GFP_NOIO, &bounce_bio_split);
2017-06-18 04:38:58 +00:00
		bio_chain(bio, *bio_orig);
2020-07-01 08:59:44 +00:00
		submit_bio_noacct(*bio_orig);
2017-06-18 04:38:58 +00:00
		*bio_orig = bio;
	}
2021-02-24 07:24:06 +00:00
	bio = bounce_clone_bio(*bio_orig);
2006-08-29 18:06:00 +00:00

2019-02-21 15:43:36 +00:00
	/*
	 * Bvec table can't be updated by bio_for_each_segment_all(),
	 * so retrieve bvec from the table directly. This way is safe
	 * because the 'bio' is single-page bvec.
	 */
	for (i = 0, to = bio->bi_io_vec; i < bio->bi_vcnt; to++, i++) {
2012-09-10 21:30:37 +00:00
		struct page *page = to->bv_page;
2008-12-23 11:44:19 +00:00

2021-03-31 07:30:00 +00:00
		if (!PageHighMem(page))
2012-09-10 21:30:37 +00:00
			continue;
2006-08-29 18:06:00 +00:00

2021-03-31 07:29:59 +00:00
		to->bv_page = mempool_alloc(&page_pool, GFP_NOIO);
block:bounce: fix call inc_|dec_zone_page_state on different pages confuse value of NR_BOUNCE
Commit d2c5e30c9a1420902262aa923794d2ae4e0bc391
("[PATCH] zoned vm counters: conversion of nr_bounce to per zone counter")
converted the nr_bounce statistic to per-zone counters plus one global value in vm_stat,
but it calls inc_|dec_zone_page_state on different pages, and therefore on different
zones, which causes NR_BOUNCE to report unexpected values.
Below is the result on my machine:
Mar 2 09:26:08 udknight kernel: [144766.778265] Mem-Info:
Mar 2 09:26:08 udknight kernel: [144766.778266] DMA per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778268] CPU 0: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778269] CPU 1: hi: 0, btch: 1 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778270] Normal per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778271] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778273] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778274] HighMem per-cpu:
Mar 2 09:26:08 udknight kernel: [144766.778275] CPU 0: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778276] CPU 1: hi: 186, btch: 31 usd: 0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_anon:46926 inactive_anon:287406 isolated_anon:0
Mar 2 09:26:08 udknight kernel: [144766.778279] active_file:105085 inactive_file:139432 isolated_file:0
Mar 2 09:26:08 udknight kernel: [144766.778279] unevictable:653 dirty:0 writeback:0 unstable:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free:178957 slab_reclaimable:6419 slab_unreclaimable:9966
Mar 2 09:26:08 udknight kernel: [144766.778279] mapped:4426 shmem:305277 pagetables:784 bounce:0
Mar 2 09:26:08 udknight kernel: [144766.778279] free_cma:0
Mar 2 09:26:08 udknight kernel: [144766.778286] DMA free:3324kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778287] lowmem_reserve[]: 0 822 3754 3754
Mar 2 09:26:08 udknight kernel: [144766.778293] Normal free:26828kB min:3632kB low:4540kB high:5448kB active_anon:4872kB inactive_anon:68kB active_file:1796kB inactive_file:1796kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:892920kB managed:842560kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:4144kB slab_reclaimable:25676kB slab_unreclaimable:39864kB kernel_stack:1944kB pagetables:3136kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2412612 all_unreclaimable? yes
Mar 2 09:26:08 udknight kernel: [144766.778294] lowmem_reserve[]: 0 0 23451 23451
Mar 2 09:26:08 udknight kernel: [144766.778299] HighMem free:685676kB min:512kB low:3748kB high:6984kB active_anon:182832kB inactive_anon:1149556kB active_file:418544kB inactive_file:555932kB unevictable:2612kB isolated(anon):0kB isolated(file):0kB present:3001732kB managed:3001732kB mlocked:0kB dirty:0kB writeback:0kB mapped:17704kB shmem:1216964kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:75771152kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Mar 2 09:26:08 udknight kernel: [144766.778300] lowmem_reserve[]: 0 0 0 0
You can see bounce:75771152kB for HighMem, but bounce:0 for lowmem and global.
This patch fixes it.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
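The counter drift described in the commit message above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `zone_id`, `page_model`, `bounce_counters`, `inc_bounce`, `dec_bounce`, `run_mismatched` and `run_matched` are hypothetical names, and it assumes the increment was charged to the original HighMem page while the decrement was charged to the low-memory bounce page.

```c
#include <assert.h>

/* Hypothetical model: two zones, one NR_BOUNCE-like counter per zone. */
enum zone_id { ZONE_LOWMEM, ZONE_HIGHMEM, NR_ZONES };

struct page_model { enum zone_id zone; };

struct bounce_counters { long per_zone[NR_ZONES]; };

/* Charge one bounce page to the zone that owns 'page'. */
static void inc_bounce(struct bounce_counters *c, const struct page_model *page)
{
	assert(page->zone < NR_ZONES);
	c->per_zone[page->zone]++;
}

/* Uncharge one bounce page from the zone that owns 'page'. */
static void dec_bounce(struct bounce_counters *c, const struct page_model *page)
{
	assert(page->zone < NR_ZONES);
	c->per_zone[page->zone]--;
}

/* The global value is the sum of the per-zone counters. */
static long global_bounce(const struct bounce_counters *c)
{
	long sum = 0;

	for (int z = 0; z < NR_ZONES; z++)
		sum += c->per_zone[z];
	return sum;
}

/* Assumed buggy pattern: inc and dec are applied to pages in different zones. */
static struct bounce_counters run_mismatched(unsigned int n_ios)
{
	struct bounce_counters c = { {0, 0} };
	const struct page_model orig = { ZONE_HIGHMEM };
	const struct page_model bounce = { ZONE_LOWMEM };

	for (unsigned int i = 0; i < n_ios; i++) {
		inc_bounce(&c, &orig);   /* charged to HighMem */
		dec_bounce(&c, &bounce); /* uncharged from low memory */
	}
	return c;
}

/* Fixed pattern: inc and dec are both applied to the bounce page. */
static struct bounce_counters run_matched(unsigned int n_ios)
{
	struct bounce_counters c = { {0, 0} };
	const struct page_model bounce = { ZONE_LOWMEM };

	for (unsigned int i = 0; i < n_ios; i++) {
		inc_bounce(&c, &bounce);
		dec_bounce(&c, &bounce);
	}
	return c;
}
```

In the mismatched run the HighMem counter grows with every completed I/O while the global sum stays at zero, which matches the shape of the log above (a large HighMem value and a zero global value). The model lets the low-memory counter go negative; how the kernel reports such a value is outside this sketch.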
2015-04-26 08:43:31 +00:00
|
|
|
inc_zone_page_state(to->bv_page, NR_BOUNCE);
|
2006-08-29 18:06:00 +00:00
|
|
|
|
|
|
|
if (rw == WRITE) {
|
|
|
|
char *vto, *vfrom;
|
|
|
|
|
2012-09-10 21:30:37 +00:00
|
|
|
flush_dcache_page(page);
|
|
|
|
|
2006-08-29 18:06:00 +00:00
|
|
|
vto = page_address(to->bv_page) + to->bv_offset;
|
2012-09-10 21:30:37 +00:00
|
|
|
vfrom = kmap_atomic(page) + to->bv_offset;
|
2006-08-29 18:06:00 +00:00
|
|
|
memcpy(vto, vfrom, to->bv_len);
|
2012-09-10 21:30:37 +00:00
|
|
|
kunmap_atomic(vfrom);
|
2006-08-29 18:06:00 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-12-03 16:21:36 +00:00
|
|
|
trace_block_bio_bounce(*bio_orig);
|
2007-01-12 11:20:26 +00:00
|
|
|
|
2006-08-29 18:06:00 +00:00
|
|
|
bio->bi_flags |= (1 << BIO_BOUNCED);
|
|
|
|
|
2021-03-31 07:29:59 +00:00
|
|
|
if (rw == READ)
|
|
|
|
bio->bi_end_io = bounce_end_io_read;
|
|
|
|
else
|
2006-08-29 18:06:00 +00:00
|
|
|
bio->bi_end_io = bounce_end_io_write;
|
|
|
|
|
|
|
|
bio->bi_private = *bio_orig;
|
|
|
|
*bio_orig = bio;
|
|
|
|
}
|
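The first loop of __blk_queue_bounce() makes two decisions: whether any segment needs bouncing, and whether the bio must be split because it has more segments than fit in one clone. A minimal userspace sketch of that decision follows. The names `seg_model`, `bounce_plan` and `plan_bounce` are hypothetical, `max_vecs` stands in for BIO_MAX_VECS, and the `highmem` flag stands in for PageHighMem(); no kernel structure is modelled beyond that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one bio segment. */
struct seg_model {
	unsigned int len; /* segment length in bytes */
	bool highmem;     /* stands in for PageHighMem(bv_page) */
};

/* Outcome of the scan over all segments. */
struct bounce_plan {
	bool bounce;  /* at least one segment lives in high memory */
	bool split;   /* the bio must be split before cloning */
	int sectors;  /* 512-byte sectors covered by the first max_vecs segments */
};

static struct bounce_plan plan_bounce(const struct seg_model *segs, size_t nsegs,
				      size_t max_vecs, bool passthrough)
{
	struct bounce_plan plan = { false, false, 0 };
	unsigned long total_bytes = 0;

	for (size_t i = 0; i < nsegs; i++) {
		/* Only the first max_vecs segments fit in the cloned bio. */
		if (i < max_vecs)
			plan.sectors += segs[i].len >> 9;
		total_bytes += segs[i].len;
		if (segs[i].highmem)
			plan.bounce = true;
	}

	/* No high-memory page: nothing to do, mirroring the early return. */
	if (!plan.bounce)
		return plan;

	/* Split only non-passthrough bios that exceed the counted sectors. */
	if (!passthrough && plan.sectors < (int)(total_bytes >> 9))
		plan.split = true;

	return plan;
}
```

For example, three 4096-byte segments with `max_vecs` set to 2 give 16 counted sectors out of 24, so a non-passthrough bio containing a high-memory segment would be split at sector 16.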