mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-08 15:04:45 +00:00
b24413180f
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results of the output of
two independent scanners (ScanCode & Windriver) producing SPDX tag:value
files created by Philippe Ombredanne. Philippe prepared the base
worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0". Results of that were:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point). Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).
 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
README.mm
Memory management for CRIS/MMU
------------------------------
HISTORY:

$Log: README.mm,v $
Revision 1.1  2001/12/17 13:59:27  bjornw
Initial revision

Revision 1.1  2000/07/10 16:25:21  bjornw
Initial revision

Revision 1.4  2000/01/17 02:31:59  bjornw
Added discussion of paging and VM.

Revision 1.3  1999/12/03 16:43:23  hp
Blurb about that the 3.5G-limitation is not a MMU limitation

Revision 1.2  1999/12/03 16:04:21  hp
Picky comment about not mapping the first page

Revision 1.1  1999/12/03 15:41:30  bjornw
First version of CRIS/MMU memory layout specification.

------------------------------

See the ETRAX-NG HSDD for reference.

We use a page size of 8 kbytes, as opposed to the i386 page size of 4 kbytes.

The MMU can, apart from the normal mapping of pages, also do a top-level
segmentation of the kernel memory space. We use this feature to avoid having
to use page-tables to map the physical memory into the kernel's address
space. We also use it to keep the user-mode virtual mapping in the same map
during kernel-mode, so that the kernel can easily access the corresponding
user-mode process' data.

As a comparison, the Linux/i386 2.0 puts the kernel and physical RAM at
address 0, overlapping with the user-mode virtual space, so that descriptor
registers are needed for each memory access to specify which MMU space to
map through. That changed in 2.2, putting the kernel/physical RAM at
0xc0000000, to co-exist with the user-mode mapping. We will do something
quite similar, but with the additional complexity of having to map the
internal chip I/O registers and the flash memory area (including SRAM and
peripheral chip-selects).
The kernel-mode segmentation map:

        ------------------------                ------------------------
FFFFFFFF|                      | => cached      |                      |
        |    kernel seg_f      |    flash       |                      |
F0000000|______________________|                |                      |
EFFFFFFF|                      | => uncached    |                      |
        |    kernel seg_e      |    flash       |                      |
E0000000|______________________|                |        DRAM          |
DFFFFFFF|                      |  paged to any  |      Un-cached       |
        |    kernel seg_d      |    =======>    |                      |
D0000000|______________________|                |                      |
CFFFFFFF|                      |                |                      |
        |    kernel seg_c      |==\             |                      |
C0000000|______________________|   \            |______________________|
BFFFFFFF|                      |  uncached      |                      |
        |    kernel seg_b      |=====\=========>|      Registers       |
B0000000|______________________|      \c        |______________________|
AFFFFFFF|                      |       \a       |                      |
        |                      |        \c      | FLASH/SRAM/Peripheral|
        |                      |         \h     |______________________|
        |                      |          \e    |                      |
        |                      |           \d   |                      |
        | kernel seg_0 - seg_a |            \==>|         DRAM         |
        |                      |                |        Cached        |
        |                      |  paged to any  |                      |
        |                      |    =======>    |______________________|
        |                      |                |                      |
        |                      |                |        Illegal       |
        |                      |                |______________________|
        |                      |                |                      |
        |                      |                | FLASH/SRAM/Peripheral|
00000000|______________________|                |______________________|

In user-mode it looks the same except that only the space 0-AFFFFFFF is
available. Therefore, in this model, the virtual address space per process
is limited to 0xb0000000 bytes (minus 8192 bytes, since the first page,
0..8191, is never mapped, in order to trap NULL references).

It also means that the total physical RAM that can be mapped is 256 MB
(kseg_c above). More RAM can be mapped by choosing a different segmentation
and shrinking the user-mode memory space.

The MMU can map all 4 GB in user mode, but doing that would mean that a
few extra instructions would be needed for each access to user mode memory.

The kernel needs access to both cached and uncached flash. Uncached is
necessary because of the special write/erase sequences. Also, the
peripheral chip-selects are decoded from that region.

The kernel also needs its own virtual memory space. That is kseg_d.
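The segmentation map can be read as a lookup on the top four address bits.
The following is a minimal sketch of that reading, not code from the kernel:
the names seg_role, seg_index, seg_role_of, kseg_c_to_phys and user_vm_bytes
are hypothetical, and the translation helper assumes that a linear segment
simply replaces the top nibble of the address with its base value (0x4 for
kseg_c, i.e. DRAM at physical 0x40000000).

```c
#include <assert.h>
#include <stdint.h>

/* Roles of the 16 kernel-mode segments, as drawn in the map. */
enum seg_role {
    SEG_PAGED,           /* seg_0 - seg_a: paged, shared with user mode */
    SEG_REGISTERS,       /* seg_b: uncached on-chip registers */
    SEG_DRAM_LINEAR,     /* seg_c: cached DRAM, mapped linearly */
    SEG_VMALLOC,         /* seg_d: paged kernel virtual memory */
    SEG_FLASH_UNCACHED,  /* seg_e */
    SEG_FLASH_CACHED     /* seg_f */
};

/* The top four address bits select one of sixteen 256 MB segments. */
static unsigned seg_index(uint32_t vaddr)
{
    return vaddr >> 28;
}

static enum seg_role seg_role_of(uint32_t vaddr)
{
    switch (seg_index(vaddr)) {
    case 0xf: return SEG_FLASH_CACHED;
    case 0xe: return SEG_FLASH_UNCACHED;
    case 0xd: return SEG_VMALLOC;
    case 0xc: return SEG_DRAM_LINEAR;
    case 0xb: return SEG_REGISTERS;
    default:  return SEG_PAGED;
    }
}

/*
 * Assumption: a linear segment replaces the top nibble of the virtual
 * address with the segment base, and the base for kseg_c is 0x4.
 */
static uint32_t kseg_c_to_phys(uint32_t vaddr)
{
    assert(seg_index(vaddr) == 0xc);
    return (vaddr & 0x0fffffffu) | (0x4u << 28);
}

/* User space is 0-AFFFFFFF minus the never-mapped first 8 kbyte page. */
static const uint32_t user_vm_bytes = 0xb0000000u - 8192u;
```

Under these assumptions, seg_role_of(0xd004000c) yields SEG_VMALLOC and
kseg_c_to_phys(0xc0001234) yields 0x40001234.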
The kseg_d segment is used by the vmalloc() kernel function to allocate
virtually contiguous chunks of memory, which is not possible with the
normal kmalloc physical RAM allocator.

The setting of the actual MMU control registers to use this layout would
be something like this:

R_MMU_KSEG = ( ( seg_f, seg  ) |   // Flash cached
               ( seg_e, seg  ) |   // Flash uncached
               ( seg_d, page ) |   // kernel vmalloc area
               ( seg_c, seg  ) |   // kernel linear segment
               ( seg_b, seg  ) |   // kernel linear segment
               ( seg_a, page ) |
               ( seg_9, page ) |
               ( seg_8, page ) |
               ( seg_7, page ) |
               ( seg_6, page ) |
               ( seg_5, page ) |
               ( seg_4, page ) |
               ( seg_3, page ) |
               ( seg_2, page ) |
               ( seg_1, page ) |
               ( seg_0, page ) );

R_MMU_KBASE_HI = ( ( base_f, 0x0 ) |   // flash/sram/periph cached
                   ( base_e, 0x8 ) |   // flash/sram/periph uncached
                   ( base_d, 0x0 ) |   // don't care
                   ( base_c, 0x4 ) |   // physical RAM cached area
                   ( base_b, 0xb ) |   // uncached on-chip registers
                   ( base_a, 0x0 ) |   // don't care
                   ( base_9, 0x0 ) |   // don't care
                   ( base_8, 0x0 ) );  // don't care

R_MMU_KBASE_LO = ( ( base_7, 0x0 ) |   // don't care
                   ( base_6, 0x0 ) |   // don't care
                   ( base_5, 0x0 ) |   // don't care
                   ( base_4, 0x0 ) |   // don't care
                   ( base_3, 0x0 ) |   // don't care
                   ( base_2, 0x0 ) |   // don't care
                   ( base_1, 0x0 ) |   // don't care
                   ( base_0, 0x0 ) );  // don't care

NOTE: while setting up the MMU, we run in a non-mapped mode in the DRAM
(0x40 segment) and need to set up seg_4 as a unity mapping, so that we
don't get a fault before we have had time to jump into the real kernel
segment (0xc0). This is done in head.S temporarily, but fixed by the kernel
later in paging_init.


Paging - PTE's, PMD's and PGD's
-------------------------------

[ References: asm/pgtable.h, asm/page.h, asm/mmu.h ]

The paging mechanism uses virtual addresses to split a process memory-space
into pages, a page being the smallest unit that can be freely remapped in
memory. On Linux/CRIS, a page is 8192 bytes (for technical reasons not equal
to 4096 as in most other 32-bit architectures).
It would be inefficient to let a virtual memory mapping be controlled by a
long table of page mappings, so it is broken down into a 2-level structure
with a Page Directory containing pointers to Page Tables which each have
maps of up to 2048 pages (8192 / sizeof(void *)). Linux can actually handle
3-level structures as well, with a Page Middle Directory in between, but in
many cases, this is folded into a two-level structure by excluding the
Middle Directory.

We'll take a look at how an address is translated while we discuss how it's
handled in the Linux kernel.

The example address is 0xd004000c; in binary this is:

31       23       15       7      0
11010000 00000100 00000000 00001100

|______| |__________||____________|
  PGD        PTE       page offset

Given the top-level Page Directory, the offset in that directory is
calculated using the upper 8 bits:

static inline pgd_t * pgd_offset(struct mm_struct * mm, unsigned long address)
{
	return mm->pgd + (address >> PGDIR_SHIFT);
}

PGDIR_SHIFT is the log2 of the amount of memory an entry in the PGD can map;
in our case it is 24, corresponding to 16 MB. This means that each entry in
the PGD corresponds to 16 MB of virtual memory.

The pgd_t from our example will therefore be the 208th (0xd0) entry in
mm->pgd.

Since the Middle Directory does not exist, it is a unity mapping:

static inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
{
	return (pmd_t *) dir;
}

The Page Table provides the final lookup by using bits 13 to 23 as index:

static inline pte_t * pte_offset(pmd_t * dir, unsigned long address)
{
	return (pte_t *) pmd_page(*dir) + ((address >> PAGE_SHIFT) &
					   (PTRS_PER_PTE - 1));
}

PAGE_SHIFT is the log2 of the size of a page; 13 in our case. PTRS_PER_PTE
is the number of pointers that fit in a Page Table and is used to mask off
the PGD-part of the address.

The so-far unused bits 0 to 12 are used to index inside a page linearly.
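The translation steps above can be checked with a small stand-alone sketch.
It is not kernel code: struct vaddr_parts, split_vaddr and join_vaddr are
hypothetical names, and the constants are the values stated in the text
(PAGE_SHIFT 13, PGDIR_SHIFT 24, 2048 entries per Page Table).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   13                 /* 8192-byte pages */
#define PGDIR_SHIFT  24                 /* each PGD entry maps 16 MB */
#define PTRS_PER_PTE (1u << (PGDIR_SHIFT - PAGE_SHIFT))   /* 2048 */

/* The three fields of a 32-bit virtual address in the 2-level scheme. */
struct vaddr_parts {
    uint32_t pgd_index;    /* bits 24-31 */
    uint32_t pte_index;    /* bits 13-23 */
    uint32_t page_offset;  /* bits 0-12 */
};

static struct vaddr_parts split_vaddr(uint32_t address)
{
    struct vaddr_parts p;

    p.pgd_index   = address >> PGDIR_SHIFT;
    p.pte_index   = (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
    p.page_offset = address & ((1u << PAGE_SHIFT) - 1);
    return p;
}

/* Inverse of split_vaddr; each field must fit its bit range. */
static uint32_t join_vaddr(struct vaddr_parts p)
{
    assert(p.pgd_index < 256u);
    assert(p.pte_index < PTRS_PER_PTE);
    assert(p.page_offset < (1u << PAGE_SHIFT));
    return (p.pgd_index << PGDIR_SHIFT) |
           (p.pte_index << PAGE_SHIFT) |
           p.page_offset;
}
```

For the example address 0xd004000c, split_vaddr gives PGD index 208 (0xd0),
PTE index 32 and page offset 12, and join_vaddr recombines those fields into
the original address.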
The VM system
-------------

The kernel's own page-directory is the swapper_pg_dir, cleared in
paging_init, and contains the kernel's virtual mappings (the kernel itself
is not paged - it is mapped linearly using kseg_c as described above).
Architectures without kernel segments, like the i386, need to set up
swapper_pg_dir directly in head.S to map the kernel itself. swapper_pg_dir
is pointed to by init_mm.pgd as the init-task's PGD.

To see what support functions are used to set up a page-table, let's look
at the kernel's internal paged memory system, vmalloc/vfree.

void * vmalloc(unsigned long size)

The vmalloc-system keeps a paged segment in kernel-space at 0xd0000000. What
happens first is that a virtual address chunk is allocated to the request
using get_vm_area(size). After that, physical RAM pages are allocated and
put into the kernel's page-table using alloc_area_pages(addr, size).

static int alloc_area_pages(unsigned long address, unsigned long size)

First the PGD entry is found using init_mm.pgd. This is passed to
alloc_area_pmd (remember the 3->2 folding). It uses pte_alloc_kernel to
check if the PGD entry points anywhere - if not, a page table page is
allocated and the PGD entry updated. Then the alloc_area_pte function is
used just like alloc_area_pmd to check which page table entry is desired,
and a physical page is allocated and the table entry updated. All of this is
repeated at the top-level until the entire address range specified has been
mapped.
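The walk described above can be imitated in user space. The following is a
toy model under stated assumptions, not the kernel implementation: struct
toy_mm, toy_alloc_area_pages and toy_lookup are hypothetical names, calloc
stands in for the page table allocation, and a running counter stands in
for physical page allocation (0 means "not mapped").

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT   13
#define PAGE_SIZE    (1u << PAGE_SHIFT)
#define PGDIR_SHIFT  24
#define PTRS_PER_PGD 256u
#define PTRS_PER_PTE 2048u

typedef uint32_t toy_pte_t;             /* 0 = not present */

struct toy_mm {
    toy_pte_t *pgd[PTRS_PER_PGD];       /* NULL = no page table yet */
    uint32_t next_frame;                /* stand-in for a page allocator */
    unsigned tables_allocated;
};

/* Map every page in [address, address + size); 0 on success, -1 on OOM. */
static int toy_alloc_area_pages(struct toy_mm *mm, uint32_t address,
                                uint32_t size)
{
    uint64_t npages = ((uint64_t)size + PAGE_SIZE - 1) >> PAGE_SHIFT;

    assert((address & (PAGE_SIZE - 1)) == 0);
    assert((uint64_t)address + size <= 0x100000000ull);

    for (uint64_t i = 0; i < npages; i++) {
        uint32_t a = address + (uint32_t)(i << PAGE_SHIFT);
        uint32_t pgd_i = a >> PGDIR_SHIFT;
        uint32_t pte_i = (a >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);

        /* PGD entry points nowhere: allocate a page table first. */
        if (!mm->pgd[pgd_i]) {
            mm->pgd[pgd_i] = calloc(PTRS_PER_PTE, sizeof(toy_pte_t));
            if (!mm->pgd[pgd_i])
                return -1;
            mm->tables_allocated++;
        }
        /* "Allocate" a physical page and fill in the table entry. */
        mm->pgd[pgd_i][pte_i] = ++mm->next_frame;
    }
    return 0;
}

/* Return the entry for an address, or 0 if it is not mapped. */
static toy_pte_t toy_lookup(const struct toy_mm *mm, uint32_t address)
{
    const toy_pte_t *table = mm->pgd[address >> PGDIR_SHIFT];

    if (!table)
        return 0;
    return table[(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)];
}
```

Mapping three pages at 0xd0000000 in this model touches a single page
table; a range that crosses a 16 MB boundary, such as two pages starting at
0xd0ffe000, causes a second table to be allocated for PGD entry 0xd1.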