Currently the kernel minimal compiler requirement is gcc 4.9 or
clang 10.0.1.
* gcc -mhotpatch option is supported since 4.8.
* A combination of -pg -mrecord-mcount -mnop-mcount -mfentry flags is
supported since gcc 9 and since clang 10.
Drop support for old -pg function prologues. Which leaves binary
compatible -mhotpatch / -mnop-mcount -mfentry prologues in a form:
brcl 0,0
Which are also do not require initial nop optimization / conversion and
presence of _mcount symbol.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Currently we have to consider too many different values which
in the end only affect identity mapping size. These are:
1. max_physmem_end - end of physical memory online or standby.
Always <= end of the last online memory block (get_mem_detect_end()).
2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
kernel is able to support.
3. "mem=" kernel command line option which limits physical memory usage.
4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as
crash kernel.
5. "hsa" size which is a memory limit when the kernel is executed during
zfcp/nvme dump.
Through out kernel startup and run we juggle all those values at once
but that does not bring any amusement, only confusion and complexity.
Unify all those values to a single one we should really care, that is
our identity mapping size.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Instead of creating the sysfs attributes for the prng devices by hand,
describe them in .groups and let the misdevice core handle it.
This also ensures that the attributes are available when the KOBJ_ADD
event is raised.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
System call and program check handler both use the system call exit
path when returning to previous context. However the program check
handler jumps right to the end of the system call exit path if the
previous context is kernel context.
This lead to the quite odd double disabling of interrupts in the
system call exit path introduced with commit ce9dfafe29be ("s390:
fix system call exit path").
To avoid that have a separate program check handler exit path if the
previous context is kernel context.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
As the number of cpus increases, the sccb response can exceed 4k for
read cpu and read scp info sclp commands. Hence, all cpu info entries
cant be embedded within a sccb response
Solution:
To overcome this limitation, extended sccb facility is provided by sclp.
1. Check if the extended sccb facility is installed.
2. If extended sccb is installed, perform the read scp and read cpu
command considering a max sccb length of three page size. This max
length is based on factors like max cpus, sccb header.
3. If extended sccb is not installed, perform the read scp and read cpu
sclp command considering a max sccb length of one page size.
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
For extended sccb support, sccb size could be up to 3 pages. Hence avoid
copy of sclp_info_sccb.
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
sclp early read cpu info is used to detect the number of configured
cpus, which is utilized by smp_detect_cpus() in early startup.
* For read cpu info, the sccb block should be below 2gb.
* smp_detect_cpus() utilizes read cpu info early, but after memblock
initialization. Thus use memblock_allow_low() instead.
* Avoid copy of sclp_core_info structure.
* sclp_early_init_core_info(), sclp_early_core_info and
sclp_early_core_info_valid initdata are no longer required.
* smp_get_core_info() is called only once during early stage. Hence for
early sclp_get_core_info(), directly call read cpu command. No need to
maintain sclp_early_core_info_valid.
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
when we're missing the necessary machine facilities zPCI can
not function. Until now it would silently fail to be initialized,
add an informational print.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
This file is installed by the s390 CPU Measurement sampling
facility device driver to export supported minimum and
maximum sample buffer sizes.
This file is read by lscpumf tool to display the details
of the device driver capabilities. The lscpumf tool might
be invoked by a non-root user. In this case it does not
print anything because the file contents can not be read.
Fix this by allowing read access for all users. Reading
the file contents is ok, changing the file contents is
left to the root user only.
For further reference and details see:
[1] https://github.com/ibm-s390-tools/s390-tools/issues/97
Fixes: 69f239ed335a ("s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur")
Cc: <stable@vger.kernel.org> # 3.14
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The zcrypt api provides a new function to wait until the zcrypt
api is operational:
int zcrypt_wait_api_operational(void);
The AP bus scan and the binding of ap devices to device drivers is
an asynchronous job. This function waits until these initial jobs
are done and so the zcrypt api should be ready to serve crypto
requests - if there are resources available. The function uses an
internal timeout of 60s. The very first caller will either wait for
ap bus bindings complete or the timeout happens. This state will be
remembered for further callers which will only be blocked until a
decision is made (timeout or bindings complete).
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
This patch adds notifications to userspace for two important
conditions of the ap bus:
I) Initial ap bus scan done. This indicates that the initial
scan of all the ap devices (cards, queues) is complete and
ap devices have been build up for all the hardware found.
This condition is signaled with
1) An ap bus change uevent send to userspace with an environment
key/value pair "INITSCAN=done":
# udevadm monitor -k -p
...
KERNEL[97.830919] change /devices/ap (ap)
ACTION=change
DEVPATH=/devices/ap
SUBSYSTEM=ap
INITSCAN=done
SEQNUM=10421
2) A sysfs attribute /sys/bus/ap/scans which shows the
number of completed ap bus scans done since bus init.
So a value of 1 or greater signals that the initial
ap bus scan is complete.
Note: The initial ap bus scan complete condition is fulfilled
and will be signaled even if there was no ap resource found.
II) APQN driver bindings complete. This indicates that all
APQNs have been bound to an zcrypt or alternate device
driver. Only with the help of an device driver an APQN
can be used for crypto load. So the binding complete
condition is the starting point for user space to be
sure all crypto resources on the ap bus are available
for use.
This condition is signaled with
1) An ap bus change uevent send to userspace with an environment
key/value pair "BINDINGS=complete":
# udevadm monitor -k -p
...
KERNEL[97.830975] change /devices/ap (ap)
ACTION=change
DEVPATH=/devices/ap
SUBSYSTEM=ap
BINDINGS=complete
SEQNUM=10422
2) A sysfs attribute /sys/bus/ap/bindings showing
"<nr of bound apqns>/<total nr of apqns> (complete)"
when all available apqns have been bound to device drivers, or
"<nr of bound apqns>/<total nr of apqns>"
when there are some apqns not bound to an device driver.
Note: The binding complete condition is also fulfilled, when
there are no apqns available to bind any device driver. In
this case the binding complete will be signaled AFTER init
scan is done.
Note: This condition may arise multiple times when after
initial scan modifications on the bindings take place. For
example a manual unbind of an APQN switches the binding
complete condition off. When at a later time the unbound APQNs
are bound with an device driver the binding is (again) complete
resulting in another uevent and marking the bindings sysfs
attribute with '(complete)'.
There is also a new function to be used within the kernel:
int ap_wait_init_apqn_bindings_complete(unsigned long timeout)
Interface to wait for the AP bus to have done one initial ap bus
scan and all detected APQNs have been bound to device drivers.
If these both conditions are not fulfilled, this function blocks
on a condition with wait_for_completion_interruptible_timeout().
If these both conditions are fulfilled (before the timeout hits)
the return value is 0. If the timeout (in jiffies) hits instead
-ETIME is returned. On failures negative return values are
returned to the caller. Please note that further unbind/bind
actions after initial binding complete is through do not cause this
function to block again.
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The s390-trng does provide 100% entropy. The quality value is supported
to be between 1 and 1024 and not 1..1000. Use 1024 to make this driver
the preferred one. If we ever have a better driver that has the same
quality but is faster we can change this again when merging the new
driver. No need to be conservative.
This makes sure that the hw variant is preferred over things like
virtio-rng, where the hypervisor has a potential to be misconfigured
and thus should have a slightly lower confidence.
Cc: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
And move it earlier in the decompressor.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
It is relying on _REGION1_SHIFT / _REGION2_SHIFT values which come from
asm/pgtable.h, so include it.
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Kasan early code is only working on init_mm, remove unneeded pgd
parameter from kasan_copy_shadow and rename it to
kasan_copy_shadow_mapping.
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Kasan has nothing to do with vmemmap, strip vmemmap from function names
to avoid confusing people.
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Fixes the following warning with CONFIG_KERNEL_UNCOMPRESSED=y
arch/s390/boot/compressed/decompressor.h:6:46: warning: non-void function
does not return a value [-Wreturn-type]
static inline void *decompress_kernel(void) {}
^
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
To make sure that the vmalloc area size is for almost all cases large
enough let it depend on the (potential) physical memory size. There is
still the possibility to override this with the vmalloc kernel command
line parameter.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
We've seen several occurences in the past where the default vmalloc
size of 128GB is not sufficient. Therefore extend the default size.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Since commit 29d37e5b82f3 ("s390/protvirt: add ultravisor initialization")
vmax is adjusted to the ultravisor secure storage limit. This limit is
currently applied when 4-level paging is used. Later vmax is also used
to align vmemmap address to the top region table entry border. When vmax
is set to the ultravisor secure storage limit this is no longer the case.
Instead of changing vmax, make only MODULES_END be affected by the
secure storage limit, so that vmax stays intact for further vmemmap
address alignment.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Compiling the kernel with Kasan disables automatic 3-level vs 4-level
kernel space paging selection, because the shadow memory offset has
to be known at compile time and there is no such offset which would be
acceptable for both 3 and 4-level paging. Instead S390_4_LEVEL_PAGING
option was introduced which allowed to pick how many paging levels to
use under Kasan.
With the introduction of protected virtualization, kernel memory layout
may be affected due to ultravisor secure storage limit. This adds
additional complexity into how memory layout would look like in
combination with Kasan predefined shadow memory offsets. To simplify
this make Kasan 4-level paging default and remove Kasan 3-level paging
support.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
s390_base_ext_handler_fn haven't been used since its introduction in
commit ab14de6c37fa ("[S390] Convert memory detection into C code.").
s390_base_ext_handler itself is currently falsely storing 16 registers
at __LC_SAVE_AREA_ASYNC rewriting several following lowcore values:
cpu_flags, return_psw, return_mcck_psw, sync_enter_timer and
async_enter_timer.
Besides that s390_base_ext_handler itself is only potentially hiding
EXT interrupts which should not have happen in the first place. Any
piece of code which requires EXT interrupts before fully functional
ext_int_handler is enabled has to do it on its own, like this is done
by sclp_early_cmd() which is doing EXT interrupts handling synchronously
in sclp_early_wait_irq().
With s390_base_ext_handler removed unexpected EXT interrupt leads
to disabled wait with the address 0x1b0 (__LC_EXT_NEW_PSW), which is
currently setup in the decompressor.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Currently udelay relies on working EXT interrupts handler, which is not
the case during early startup. In such cases udelay_simple() has to be
used instead.
To avoid mistakes of calling udelay too early, which could happen from
the common code as well - make udelay work for the early code by
introducing static branch and redirecting all udelay calls to
udelay_simple until EXT interrupts handler is fully initialized and
async stack is allocated.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Set io/ext handlers to disabled wait in the initial lowcore, so that they
are effective right from the kernel start, when a boot method used does
not rewrite this part of the lowcore for its own needs (i.e. kexec, z/vm
ipl reader boot, qemu direct boot, load from removable media or server).
When the kernel is loaded by zipl, scsi loader or qemu loader, some or
all of the io/ext/pgm handlers addresses might be rewritten. Rewrite them
to initial values again as early as possible.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The system call exit path is running with interrupts enabled while
checking for TIF/PIF/CIF bits which require special handling. If all
bits have been checked interrupts are disabled and the kernel exits to
user space.
The problem is that after checking all bits and before interrupts are
disabled bits can be set already again, due to interrupt handling.
This means that the kernel can exit to user space with some
TIF/PIF/CIF bits set, which should never happen. E.g. TIF_NEED_RESCHED
might be set, which might lead to additional latencies, since that bit
will only be recognized with next exit to user space.
Fix this by checking the corresponding bits only when interrupts are
disabled.
Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org> # 5.8
Acked-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Here are some small Documentation fixes for 5.10-rc3 that were fallout
from the larger documentation update we did in 5.10-rc2. Nothing major
here at all, but all of these have been in linux-next and resolve build
warnings when building the documentation files.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g8vw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykdGQCgjEHwT/N2jltd8yHF1Kca5yR+FJYAoMb3sbJS
gR1iX48G20OhkrPNSIUw
=HDyQ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core documentation fixes from Greg KH:
"Some small Documentation fixes that were fallout from the larger
documentation update we did in 5.10-rc2.
Nothing major here at all, but all of these have been in linux-next
and resolve build warnings when building the documentation files"
* tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation: remove mic/index from misc-devices/index.rst
scripts: get_api.pl: Add sub-titles to ABI output
scripts: get_abi.pl: Don't let ABI files to create subtitles
docs: leds: index.rst: add a missing file
docs: ABI: sysfs-class-net: fix a typo
docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys
Here are a small number of small tty and serial fixes for some reported
problems for the tty core, vt code, and some serial drivers.
They include fixes for:
- a buggy and obsolete vt font ioctl removal
- 8250_mtk serial baudrate runtime warnings
- imx serial earlycon build configuration fix
- txx9 serial driver error path cleanup issues
- tty core fix in release_tty that can be triggered by trying to
bind an invalid serial port name to a speakup console device
Almost all of these have been in linux-next without any problems, the
only one that hasn't, just deletes code :)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g7nQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymmVACeNXgZpIAUJujtM7hQAEpCDYrFZ08An13TN07Y
wZe5okUITYIXQRZevKyi
=w0og
-----END PGP SIGNATURE-----
Merge tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are a small number of small tty and serial fixes for some
reported problems for the tty core, vt code, and some serial drivers.
They include fixes for:
- a buggy and obsolete vt font ioctl removal
- 8250_mtk serial baudrate runtime warnings
- imx serial earlycon build configuration fix
- txx9 serial driver error path cleanup issues
- tty core fix in release_tty that can be triggered by trying to bind
an invalid serial port name to a speakup console device
Almost all of these have been in linux-next without any problems, the
only one that hasn't, just deletes code :)"
* tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: Disable KD_FONT_OP_COPY
tty: fix crash in release_tty if tty->port is not set
serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init
tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled
serial: 8250_mtk: Fix uart_get_baud_rate warning
Here are some small USB fixes and new device ids for 5.10-rc3
They include:
- USB gadget fixes for some reported issues
- Fixes for the every-troublesome apple fastcharge driver,
hopefully we finally have it right.
- More USB core quirks for odd devices
- USB serial driver fixes for some long-standing issues that
were recently found
- some new USB serial driver device ids
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g8RA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynX/wCg2StXM3DVNqnEvOF8OQ23WAgcC0IAoKIYuTNi
Uh303+JJ6A0cz4uGfhc0
=OKt5
-----END PGP SIGNATURE-----
Merge tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes and new device ids:
- USB gadget fixes for some reported issues
- Fixes for the ever-troublesome apple fastcharge driver, hopefully
we finally have it right.
- More USB core quirks for odd devices
- USB serial driver fixes for some long-standing issues that were
recently found
- some new USB serial driver device ids
All have been in linux-next with no reported issues"
* tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property
usb: mtu3: fix panic in mtu3_gadget_stop()
USB: serial: option: add Telit FN980 composition 0x1055
USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
USB: serial: cyberjack: fix write-URB completion race
USB: Add NO_LPM quirk for Kingston flash drive
USB: serial: option: add Quectel EC200T module support
usb: raw-gadget: fix memory leak in gadget_setup
usb: dwc2: Avoid leaving the error_debugfs label unused
usb: dwc3: ep0: Fix delay status handling
usb: gadget: fsl: fix null pointer checking
usb: gadget: goku_udc: fix potential crashes in probe
usb: dwc3: pci: add support for the Intel Alder Lake-S
current->group_leader->exit_signal may change during copy_process() if
current->real_parent exits.
Move the assignment inside tasklist_lock to avoid the race.
Signed-off-by: Eddy Wu <eddy_wu@trendmicro.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's buggy:
On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.
Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016
Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken
https://github.com/systemd/systemd/pull/3651
So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.
Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.
Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:
/* copy font from active VT, where the font was uploaded to */
cfo.op = KD_FONT_OP_COPY;
cfo.height = vcs.v_active-1; /* tty1 == index 0 */
(void) ioctl(vcfd, KDFONTOP, &cfo);
Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.
v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.
Acked-by: Peilin Ye <yepeilin.cs@gmail.com>
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <greg@kroah.com>
Cc: Peilin Ye <yepeilin.cs@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Fix an uninitialized struct problem.
- Fix an iomap problem zeroing unwritten EOF blocks.
- Fix some clumsy error handling when writeback fails on
blocksize < pagesize filesystems.
- Fix a retry loop not resetting loop variables properly.
- Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel
actually does permit that combination.
- Fix excessive page cache flushing when unsharing part of a file.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl+jWK8ACgkQ+H93GTRK
tOuiEQ/+IAEncpqUS1PTSWRlNX7MEQDvlnoLl9ZqhaYrW9pNyz8JzxejubkP/7RA
qkI/fgcBIhxOf+mTKguAUsu81we49PmlObWCEb5mfBI2aeoSL/yM4zikOHFNpy0o
f4U9++kpKJwrWG6kyvNwYMyT6r74vLW0EO9lhYjxAY6+5KgZL0SuFuRAaADDtWj8
SKIc/dli6qDS3IrnkibQtzFOOcmeOEn0qcWcS4gD7tbUpJlw0M2g88JjBPoT8oTK
wRBNrspbAA42YbYqlwmkBQZZwM+XZKLZNcvzzLgQLaQdTKEem2w2pB1j1KvJXsSo
ibxhmk1/tGFKtPTmbpm7dUC9ubr7xch6J+GHNwHuaWL2hxBWJzNRVokG1BsbDXcc
FW3ilwLFd8CFUXttQqQfhiUx8wfe2eJ1aXEBK5JeHWRwD+egLI9WXFJQzjUUwe+v
T+7r+0kS2TL3SXKU5TE+gsuuI5mcJpYvcWVqYPwBxjZW0tIhUzBldpfBYysG3ZAm
uhYcw3BHw1ucsjqcSe14CWqA4KnwgfAcKva5AJSLjJBu3wOi1wrFKg/+Wtpo0xA2
yFAqFP5FGW13oqeYtqJy0J79qOw6Po9wl+XnekSiBCEif965KtV+RBMP2/TBG+Pl
R+bNvSXlb1QiDNSjiIG2b34RiDNoiV2k+ELxOz3SSbzIxx8gvm4=
=MqoT
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
- Fix an uninitialized struct problem
- Fix an iomap problem zeroing unwritten EOF blocks
- Fix some clumsy error handling when writeback fails on filesystems
with blocksize < pagesize
- Fix a retry loop not resetting loop variables properly
- Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel
actually does permit that combination
- Fix excessive page cache flushing when unsharing part of a file
* tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: only flush the unshared range in xfs_reflink_unshare
xfs: fix scrub flagging rtinherit even if there is no rt device
xfs: fix missing CoW blocks writeback conversion retry
iomap: clean up writeback state logic on writepage error
iomap: support partial page discard on writeback block mapping failure
xfs: flush new eof page on truncate to avoid post-eof corruption
xfs: set xefi_discard when creating a deferred agfl free log intent item
Merge procfs splice read fixes from Christoph Hellwig:
"Greg reported a problem due to the fact that Android tests use procfs
files to test splice, which stopped working with the changes for
set_fs() removal.
This series adds read_iter support for seq_file, and uses those for
various proc files using seq_file to restore splice read support"
[ Side note: Christoph initially had a scripted "move everything over"
patch, which looks fine, but I personally would prefer us to actively
discourage splice() on random files. So this does just the minimal
basic core set of proc file op conversions.
For completeness, and in case people care, that script was
sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g'
but I'll wait and see if somebody has a strong argument for using
splice on random small /proc files before I'd run it on the whole
kernel. - Linus ]
* emailed patches from Christoph Hellwig <hch@lst.de>:
proc "seq files": switch to ->read_iter
proc "single files": switch to ->read_iter
proc/stat: switch to ->read_iter
proc/cpuinfo: switch to ->read_iter
proc: wire up generic_file_splice_read for iter ops
seq_file: add seq_read_iter
- Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a
combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs
integrated assembler upset.
- Correct the mitigation selection logic which prevented the related prctl
to work correctly.
- Make the UV5 hubless system work correctly by fixing up the malformed
table entries and adding the missing ones.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+oDNYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoaN0EACPWY15k1YuAEIjiQxRBhq22J8Y6wNX
Ui/rF2AZcAnNEJDTIyvjP6COnT9mjX/tuuluMaI6i/XY/9Xp5LpKvivkL2PXNN3X
onW01ouIc1iYxXwQEVZvhYHsOyhkR9Z8yNG/q9I7xYAXNSZcAHwXVar4VlPBT7Ay
iP75i8pGmb/NCc4oHNXuBp/dV/0/dCoLTndb5p5pX8oS60AAt9ZuK3IRc3ucayhI
M4rTTEya1oY+ZNbtP4A4Jp7Qc/NGYDo6q04za+jcxZ5Gqacs+fk/PNuWgL1fZZtW
sn1D+SMWEb55Xcsdy976b29FFU/DcOcf7TRASzyKgyPW5jg1dP6BZ6U0wpVV3KZw
S2h5/pt48JZI7olrDsLQ0tzjALlk2CcFNrnRtOMDduHdw9wyz+Sg58lZYuvH3sXK
5ZblWRJ3JiBNsNO0sA3kd4sp7xWQB3ey6mkYD8Vqb7zRIt8aXT9jqBxhDrP+Vqs/
/UKv+BJfD6WxC0nQ4x6MS3g4sDvI+1SLfHSZ/UjWJ6NfYJW5/w429pFCaF73xCTd
cqxja1dZYixn7ioFZjolMUdvuDiC5B2+5+RzEV87kaDzO9QZQyvsl7G74MSfwx6G
DAydvuyJoxP2qVASobOBcVOzLQO7DsLzFZzJTttZcnkK2iprcz4qrsFLMxF9SxTD
Amb8qck60dLfqA==
=JdPk
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes:
- Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a
combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs
integrated assembler upset
- Correct the mitigation selection logic which prevented the related
prctl to work correctly
- Make the UV5 hubless system work correctly by fixing up the
malformed table entries and adding the missing ones"
* tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/uv: Recognize UV5 hubless system identifier
x86/platform/uv: Remove spaces from OEM IDs
x86/platform/uv: Fix missing OEM_TABLE_ID
x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S
parser.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+oC4ATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoXveD/4mIAq1umSj/YdThj65S6nAHjhW1MjO
VxGRNSEiUxOld6OnxZlyup1q4NvyYB1cVk71jLjdBIvEK/ANdq3Cd+PVB5tAAoNK
726zrPlOXx4NJulq2q17ygazZQgOtl/5QsYovS0PymCyyvwfz56yr/OfPRThCe+S
gJZs+zt8uunhEKz1J7lki/vn7D6bHRdaL8yTLabsPIQ296WXrUd2NS4fuanFNyLO
kxETMAECyy77q17dgClFFDheyLHAotoySLm6qSwNeWUPFwlCmoCbHgzSBoryIlH5
DMlwA5iuA5qVv/4jlZ0+WvVWeSWOyF0boZ1cZI+r4nitVvdwnnxTgXcyuTxDSggu
nygQqLlTIQ31dwQleAfKrooJfwk9nB5uLEQkTwdx0BQwdHhvb7usnNEcIMx8WVFf
86LorPlfbKrUTD7PWj7zg3x32GNv/nWjuDnzD0NW6BPKqVy6XaRZtaG2WyYfB9YW
9KXpfsxjDeTBrZK5b/YOna1ah00J+2hwIL7sUhVf24aOmmImy3+Yqe2bQOlR/JGs
yIBFOAPCWrl7AJOwziMD2NQGCTzk9hqKRyhy1WtPC/L0uohf16/xbnViyL1Ijhud
Hzu09qoo8EXYbSwRcHQ+qK39s9fwDK1yvPp+k4XGXS/v94YN0W5rd3o49bTMvhxE
T8CKfwngsBzeYg==
=r2wz
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Thomas Gleixner:
"A single fix for the perf core plugging a memory leak in the address
filter parser"
* tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix a memory leak in perf_event_parse_addr_filter()
underlying RT mutex was not handled correctly and triggering a BUG()
instead of treating it as another variant of retry condition.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+oCqcTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUSvD/kBBV+cSHoLZqWzoYCySxnIgTX5xLgH
imdhBnGX4MZsHydD1zu9IKuHbOTdRdnf/aXk+6akqyRTqtAxqL3e1BW7zfX2aWo/
EOk5P36ZLzdqqLrc4Z/FZOLaUAx95t2dse0PA3jXOUTHaxr/bwih+/f1vR84xFBG
/Sv1TubPLU7koZI057SrZsHmhF7yc22BMvg30UFUgx8tZyXGBIUEYOYS6sgGBwQe
qXIxTHkpiRwCHu+i44QJiBdhPg8gEb4tsJQLlnmhPYz4m5v/w4HGe7ixZlBQU7f1
isY4GEuDHRRheOzvP0d5LgcRx6r1hNDDu8MWNjWp7N1sTTKMGnDnZ/a36TgBtlvc
Go+HVNCapfhSeul8qjZuL022a9Yv2NcsouwdPWfZpMyWuhkXYWnaVHWvuEKsB4UZ
z0b34q5pULwSz/R4BiqwoLZXdWNsN8x/C31F3v0jBNa/HwK5dTEAgeez0+HNJhDA
Ci/+CgggLuch6V6JYwOT1AEB4ymEXTuRK/d8IwlVADA3fN0wulUUAnbgVYHRWD4y
JaXJKWHnw/G8R7AI31aVHa70D64tAM+0UdsLM+4/idP2kc2Cv7EcCdIHz2OUfQBD
I1/RmugdCPsF5Zutb0dQFNumzxRGk6l+woa7EFnzpkO96GBJDCQiUZKvTFINdCh0
LoLv49vOBVDifQ==
=PX5t
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex fix from Thomas Gleixner:
"A single fix for the futex code where an intermediate state in the
underlying RT mutex was not handled correctly and triggering a BUG()
instead of treating it as another variant of retry condition"
* tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Handle transient "ownerless" rtmutex state correctly
- Fix the fallout of the IPI as interrupt conversion in Kconfig and the
BCM2836 interrupt chip driver/
- Fixes for interrupt affinity setting and the handling of hierarchical
irq domains in the SiFive PLIC driver.
- Make the unmapped event handling in the TI SCI driver work correctly.
- A few minor fixes and cleanups in various chip drivers and Kconfig.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+oCiATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUIkEAC9wYn4x5fObSRpamwj0lagGkUsd0dB
JmKW+u/oUkhM1ZosqPegslA6RDc7LitONFcnqBD348JBrxyP94OnskhBWITv9Nx1
p4AkLS7HDs0pgOpam66qJdxAmLZ0F0vOPnEgr2VQWqaRV8qlTESNiCxT2wKF7mRd
qJn5pW3kSJnyfA66cpFVC3Db64KmdWQvv0BCnc1Wqq3odXgOuTvOkqzVrlqJZQ59
RrZudU0Lz9FtUujQ7AuP+RZY7Ti/AjmEaZDvcsnz3SR6vZGV8qG1f/88RP31d4qK
62cIOfz/bsl1uxAwnKGM4U84tzYua6djhRZXInlzL4/iYKlm5qVR8Qpotl7IxT5n
ntMJGhu/Evy987mq1maOR2rWyqVNU5BoVJUeHHibP8LANTKf7sdWOaNqieXQRKYS
ZnTmoRImBOFhxi0BuqKwwHqJILC4yOZpOa+ARMPqns2KzA4jpAvN+MwPWwpsBVaD
giVm8e8CQvFgMQjjRHcOkfernSsQs/fyQSYQY9qJI/IzVqTEFcYUONneelJGFB7R
iDFMURx7aQUPNf8p9c7eEx0BSBUBan+Quul4HQ1I+SnmOZg3QgRvlMm/zPxhnfyU
3NV4oi4c/fG+7ex+tR3igcfYPCK35BP+6iDG+GKTBjAHimQqk9XXsDYN/Wh3nW6f
UnsJIs9zdhPJsA==
=HUPi
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of fixes for interrupt chip drivers:
- Fix the fallout of the IPI as interrupt conversion in Kconfig and
the BCM2836 interrupt chip driver
- Fixes for interrupt affinity setting and the handling of
hierarchical irq domains in the SiFive PLIC driver
- Make the unmapped event handling in the TI SCI driver work
correctly
- A few minor fixes and cleanups in various chip drivers and Kconfig"
* tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings: irqchip: ti, sci-inta: Fix diagram indentation for unmapped events
irqchip/ti-sci-inta: Add support for unmapped event handling
dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling
irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm
irqchip/sifive-plic: Fix chip_data access within a hierarchy
irqchip/sifive-plic: Fix broken irq_set_affinity() callback
irqchip/stm32-exti: Add all LP timer exti direct events support
irqchip/bcm2836: Fix missing __init annotation
irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY
irqchip/mst: Make mst_intc_of_init static
irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7
genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
that the lockdep interrupt state needs not to be established before calling
the RCU check.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+oCN8THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYgDD/42A/CTltvs1meUm8gpCbACe/mwY4Ms
4K35p9E/3ozyhjkXbP1odZ6I/rh1pKZYz+qxS7JYDEDgl+3ZfZCeb8WFNspBRvD1
uURn3ou5aVFHGutjF/5yEzrHP+Vkw8ViOvIjhZ1zpPQ0kWyEP7P5K+0wPFV1BVda
ugk6T0ouQo7Kl7c7g/EP9tr9UNorih9ZmkJcsZ7fkzDin+5Ine6v9J1VRumSkm11
4OU7My/c/0EpTuslnW4w4kVYk90B+o6dJccG5SVJVrRHvziD3qhM3slI2UqnIIb+
/8NTrUy79mikZQh6LG5QrPoX9/+tp68csoFwzE6MpVvV+GTlfnbX7xn2ZW0t7ZSA
FVYUxYSO3kCzvU2VfqzLc8HtgK/esoMpcgloB0FM9XDw2khlgvnJyTCAnsn44QIP
tIA+FHGv9oa+QW5NoiRpBtTAYImTfiwaTa5AXCBFijHXsJVMziWDqZ+AQwf58LP9
RjItnOu0ZKzvLHRYDfqpusIUytFv04YuAn2kJ1vDWL4nXeo3vma2at+s/zbPHOjW
LZDUeavv7e5pqMDFC1QDja4o5fSfK0Q7ddVP8Xm1yrnZz5jPmQNCGpLxZWx8wsjq
qxPEUdfqTFvRJQhLHkVznuQUrOVYPVlZfoPro47HNHeHs+qLI1QxmsJHhRu2vd1M
PwVVBz1vvk4Txg==
=ZSdr
-----END PGP SIGNATURE-----
Merge tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull entry code fix from Thomas Gleixner:
"A single fix for the generic entry code to correct the wrong
assumption that the lockdep interrupt state needs not to be
established before calling the RCU check"
* tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Fix the incorrect ordering of lockdep and RCU check
Fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user().
A fix for an RCU splat at boot caused by a recent lockdep change.
A fix for a possible deadlock in our EEH debugfs code.
Several fixes for handling of _PAGE_ACCESSED on 32-bit platforms.
A build fix when CONFIG_NUMA=n.
Thanks to:
Andreas Schwab,
Christophe Leroy,
Oliver O'Halloran,
Qian Cai,
Scott Cheloha.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+nxNcTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEWMD/0SVRMNlv5K4QYi38GPQMR/fZ20uyf6
oLlGNwrXxdYfZEQGvjJ4XiNVEVHmj+nylVyqmI3HnjLRpjmfLopZE5HuIHMknszw
VaYZ/MwnbHcnIt8q/3xM56zpk2zJo9kK1FEItmupbWIbQirJyeE1CEpVI1LXn9FR
2hNSpSk6hhwI/xrT6L1exIReP0CFlsZCMCgNbP9vEDPqOorx3Wxf1uqznV9uZWaZ
AzQwevh2OvYd/rcsDLMlisRWv+JJTBJp/CyvKvawow9Akh81dpic26FqPPVLZkkT
maxY2uGLEyI/qpYFor7Fz1LanMbu3SnXT483Cu3jSv5wzL+2YcdmGsb0IMgqaFlQ
os8waD9q3KDeohCqgqcEYdnkNUo3TQjFP8ilZYYQXQZVBlsWuHkv5k59Bc03aa1w
OAAXvmv+SlhNuCDRZI0qQQbSFlIMIGaUo+RsUZ7WkXBnYE4SCmfykVvP8uB2Djsf
98F4dpfWGPDF4n+wothUBycjJa3NG3Ceset04r94KAMfp0SR73xYHHIRSR1Xa6Pj
1s8EQ0MNMgVb1UdJ3eJCRpfO6oU1p+V7cnVhYH9rTSKBGAoHAwnCh/Py6JPBMcea
8ydNkApKODdZ5d6/oHa4i5oXJTik34f5p3nrBNOWQlA/dCrQeZX7YbYlhCUi0/4U
OCI4V3sEs+dIaQ==
=IFi+
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user()
- fix for an RCU splat at boot caused by a recent lockdep change
- fix for a possible deadlock in our EEH debugfs code
- several fixes for handling of _PAGE_ACCESSED on 32-bit platforms
- build fix when CONFIG_NUMA=n
Thanks to Andreas Schwab, Christophe Leroy, Oliver O'Halloran, Qian Cai,
and Scott Cheloha.
* tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/numa: Fix build when CONFIG_NUMA=n
powerpc/8xx: Manage _PAGE_ACCESSED through APG bits in L1 entry
powerpc/8xx: Always fault when _PAGE_ACCESSED is not set
powerpc/40x: Always fault when _PAGE_ACCESSED is not set
powerpc/603: Always fault when _PAGE_ACCESSED is not set
powerpc: Use asm_goto_volatile for put_user()
powerpc/smp: Call rcu_cpu_starting() earlier
powerpc/eeh_cache: Fix a possible debugfs deadlock
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+m/0MQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpjPHD/4gRQTmbcfC2K9lLvm1xrHZKDvaoRHW5IqT
gMHWYQzVGHKKi3lk0tRzgClXNRE21OGI19gyJ9gdGWqi/2NPYm+4CjDlZezXmQXC
x7icKhJygrOGgsi1/IDE5Y2XClIUIEL+iROIaFz+AtI7J+9v3BfuMty3kqMRK5VY
psqJDOUXdb5ci0SwXKMXzrvHJUHr4tiRUy+TbHBWYfjY4r8WObh35FKOtveBNNtK
5ZyxinXqRu0WcPPJ9t/wdvA8YiwiDJOY3w2zhlWPMGpiJ4gc+OBAxRiNmnEeJ6RN
XKW7xW7Qff6rmNad2K/rQMMMOK19e8csL6hk69gxgMnWDvkWqSJC0DUMzwRGbC2d
IezL/rrjJ7zwrKcwud+QpeGc7niO/DRgLbB4BuH96TE1J1ZbE/lqkVg/JkjDCDyo
atwzF62HbZ+z52+6/V/2IDqTFf0QalmjDJFXY/az3i6bzb8xQcVqri6MO8IbNJDE
pB44BDyZPKwPOH6e2W6U4s/K63E8wvp75YibwSOUhoFiAy7jkfkvQoE7dFXA21/s
kqCUl0voOrGAAAOchnxI4KAoQOXbBM1sjaJtiEWuKgMLxd57sR1GXSYw1zoz7ymX
8wqhKu76RP3WRFNtBQpEeq+xj3iCnaoXkX7q8EJ9tfpe6GJqln5/ROA3Gg9V6oHh
KMqgOAgdYg==
=zplt
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A set of fixes for io_uring:
- SQPOLL cancelation fixes
- Two fixes for the io_identity COW
- Cancelation overflow fix (Pavel)
- Drain request cancelation fix (Pavel)
- Link timeout race fix (Pavel)"
* tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block:
io_uring: fix link lookup racing with link timeout
io_uring: use correct pointer for io_uring_show_cred()
io_uring: don't forget to task-cancel drained reqs
io_uring: fix overflowed cancel w/ linked ->files
io_uring: drop req/tctx io_identity separately
io_uring: ensure consistent view of original task ->mm from SQPOLL
io_uring: properly handle SQPOLL request cancelations
io-wq: cancel request if it's asking for files and we don't have them
Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner().
This is one possible chain of events leading to this:
Task Prio Operation
T1 120 lock(F)
T2 120 lock(F) -> blocks (top waiter)
T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter)
XX timeout/ -> wakes T2
signal
T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set)
T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter
and the lower priority T2 cannot steal the lock.
-> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON()
The comment states that this is invalid and rt_mutex_real_owner() must
return a non NULL owner when the trylock failed, but in case of a queued
and woken up waiter rt_mutex_real_owner() == NULL is a valid transient
state. The higher priority waiter has simply not yet managed to take over
the rtmutex.
The BUG_ON() is therefore wrong and this is just another retry condition in
fixup_pi_state_owner().
Drop the locks, so that T3 can make progress, and then try the fixup again.
Gratian provided a great analysis, traces and a reproducer. The analysis is
to the point, but it confused the hell out of that tglx dude who had to
page in all the futex horrors again. Condensed version is above.
[ tglx: Wrote comment and changelog ]
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Reported-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com
Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de
Pull i2c fixes from Wolfram Sang:
"Driver bugfixes for I2C.
Most of them are for the new mlxbf driver which got more exposure
after rc1. The sh_mobile patch should already have reached you during
the merge window, but I accidently dropped it. However, since it fixes
a problem with rebooting, it is still fine for rc3"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: slave should do WRITE_REQUESTED before WRITE_RECEIVED
i2c: designware: call i2c_dw_read_clear_intrbits_slave() once
i2c: mlxbf: I2C_MLXBF should depend on MELLANOX_PLATFORM
i2c: mlxbf: Update author and maintainer email info
i2c: mlxbf: Update reference clock frequency
i2c: mlxbf: Remove unecessary wrapper functions
i2c: mlxbf: Fix resrticted cast warning of sparse
i2c: mlxbf: Add CONFIG_ACPI to guard ACPI function call
i2c: sh_mobile: implement atomic transfers
i2c: mediatek: move dma reset before i2c reset
* An SPDX comment style fix.
* A fix to ignore memory that is unusable.
* A fix to avoid setting a kernel text offset for the !MMU kernels, where
skipping the first page of memory is both unnecessary and costly.
* A fix to avoid passing the flag bits in satp to pfn_to_virt().
* A fix to __put_kernel_nofault, where we had the arguments to
__put_user_nocheck reversed.
* A workaround for a bug in the FU540 to avoid triggering PMP issues during
early boot.
* A change to how we pull symbols out of the vDSO. The old mechanism was
removed from binutils-2.35 (and has been backported to Debian's 2.34).
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl+mTYMTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYicfIEACUOSbfrGDQGjXXjp7uy2q/t3B6MLFd
lfoT6jsoDEd5S56xz2GVRuf4usK/spZQTNCRRFC6y0mEA6t/b17WRHrgKgce6M+D
JIvg0p5aGKihU+wSEBWINUwmjRAzSmM4wnAdOLSGMYGmgb2XdsMm5XAG+ejpxlRc
23gfF3H50cDQKbvbHpEpIOH3sk81TMYl+YO+U4f1xz0jC6j21Fxui03xSn0NSJkQ
eci3awi5PSqhS1rlWgObMactomZtfDs1vfwUt4x5XIoPqzqWLvifigvpmXgXcK2u
38BpOf4EAe5qjLmQ3C8zfxqMLUoU0zG3j9LgEcZOtkyLONeYAzN56Jy/o8+u2bN8
3BX6LMkIdpb2KQowgeP4q//zaPS/uxMueaLkyqtUKqkdl2VDeB0VuASr/k6zZVbI
0aEKJAPvuSMtdpr0mrFE6m3WWIP2qzHT6pXkbv20yWrLkf/18DudjnNkPzgGooVW
sMkXZDe1TRbMlf7/xwIrIgl+iDfzNLUUEL2i+1VAoloNI5InI9e9n2et6KxEdhXb
8K7qemUwtyBxlYX1pgfittltj5t2SBCKVuCxuJJCt1SYiIqljEAKYTjTIKINrzzz
QIDpTcEN17hgHnjEprae6dBfpD/Sn7YdYx35zmgN+9ynpPLClbNFbYZyl/4kP0aD
5f793CVJErfRVQ==
=TT6+
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- SPDX comment style fix
- ignore memory that is unusable
- avoid setting a kernel text offset for the !MMU kernels, where
skipping the first page of memory is both unnecessary and costly
- avoid passing the flag bits in satp to pfn_to_virt()
- fix __put_kernel_nofault, where we had the arguments to
__put_user_nocheck reversed
- workaround for a bug in the FU540 to avoid triggering PMP issues
during early boot
- change to how we pull symbols out of the vDSO. The old mechanism was
removed from binutils-2.35 (and has been backported to Debian's 2.34)
* tag 'riscv-for-linus-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Fix the VDSO symbol generaton for binutils-2.35+
RISC-V: Use non-PGD mappings for early DTB access
riscv: uaccess: fix __put_kernel_nofault()
riscv: fix pfn_to_virt err in do_page_fault().
riscv: Set text_offset correctly for M-Mode
RISC-V: Remove any memblock representing unusable memory area
risc-v: kernel: ftrace: Fixes improper SPDX comment style