Merge tag 'docs-6.13-2' of git://git.lwn.net/linux

Pull more documentation updates from Jonathan Corbet:
 "A few late-arriving fixes, plus two more significant changes that were
  *almost* ready at the beginning of the merge window:

   - A new document on debugging techniques from Sebastian Fricke

   - A clarification on MODULE_LICENSE terms meant to head off the sort
     of confusion that led to the recent Tuxedo Computers mess"

* tag 'docs-6.13-2' of git://git.lwn.net/linux:
  docs: Add debugging guide for the media subsystem
  docs: Add debugging section to process
  docs/licensing: Clarify wording about "GPL" and "Proprietary"
  docs: core-api/gfp_mask-from-fs-io: indicate that vmalloc supports GFP_NOFS/GFP_NOIO
  Documentation: kernel-doc: enumerate identifier *type*s
  Documentation: pwrseq: Fix trivial misspellings
  Documentation: filesystems: update filename extensions
Linus Torvalds 2024-11-26 13:44:27 -08:00
commit e68ce9474a
17 changed files with 804 additions and 31 deletions

@ -20,6 +20,11 @@ Documentation/driver-api/media/index.rst
- for driver development information and Kernel APIs used by
media devices;
Documentation/process/debugging/media_specific_debugging_guide.rst
- for advice about essential tools and techniques to debug drivers on this
subsystem
.. toctree::
:caption: Table of Contents
:maxdepth: 2

@ -55,14 +55,16 @@ scope.
What about __vmalloc(GFP_NOFS)
==============================
vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quite non-trivial
to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
almost always a bug. The good news is that the NOFS/NOIO semantic can be
achieved by the scope API.
Since v5.17, and specifically after the commit 451769ebb7e79 ("mm/vmalloc:
alloc GFP_NO{FS,IO} for vmalloc"), GFP_NOFS/GFP_NOIO are now supported in
``[k]vmalloc`` by implicitly using scope API.
In earlier kernels ``vmalloc`` didn't support GFP_NOFS semantic because there
were hardcoded GFP_KERNEL allocations deep inside the allocator. That means
that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO was almost always a bug.
In the ideal world, upper layers should already mark dangerous contexts
and so no special care is required and vmalloc should be called without
any problems. Sometimes if the context is not really clear or there are
layering violations then the recommended way around that is to wrap ``vmalloc``
by the scope API with a comment explaining the problem.
and so no special care is required and ``vmalloc`` should be called without any
problems. Sometimes if the context is not really clear or there are layering
violations then the recommended way around that (on pre-v5.17 kernels) is to
wrap ``vmalloc`` by the scope API with a comment explaining the problem.

@ -533,6 +533,7 @@ identifiers: *[ function/type ...]*
Include documentation for each *function* and *type* in *source*.
If no *function* is specified, the documentation for all functions
and types in the *source* will be included.
*type* can be a struct, union, enum, or typedef identifier.
Examples::

@ -11,7 +11,7 @@ Introduction
============
This framework is designed to abstract complex power-up sequences that are
shared between multiple logical devices in the linux kernel.
shared between multiple logical devices in the Linux kernel.
The intention is to allow consumers to obtain a power sequencing handle
exposed by the power sequence provider and delegate the actual requesting and
@ -25,7 +25,7 @@ The power sequencing API uses a number of terms specific to the subsystem:
Unit
A unit is a discreet chunk of a power sequence. For instance one unit may
A unit is a discrete chunk of a power sequence. For instance one unit may
enable a set of regulators, another may enable a specific GPIO. Units can
define dependencies in the form of other units that must be enabled before
it itself can be.
@ -62,7 +62,7 @@ Provider interface
The provider API is admittedly not nearly as straightforward as the one for
consumers but it makes up for it in flexibility.
Each provider can logically split the power-up sequence into descrete chunks
Each provider can logically split the power-up sequence into discrete chunks
(units) and define their dependencies. They can then expose named targets that
consumers may use as the final point in the sequence that they wish to reach.
@ -72,7 +72,7 @@ register with the pwrseq subsystem by calling pwrseq_device_register().
Dynamic consumer matching
-------------------------
The main difference between pwrseq and other linux kernel providers is the
The main difference between pwrseq and other Linux kernel providers is the
mechanism for dynamic matching of consumers and providers. Every power sequence
provider driver must implement the `match()` callback and pass it to the pwrseq
core when registering with the subsystems.

@ -442,7 +442,7 @@ which can be used to communicate directly with the autofs filesystem.
It requires CAP_SYS_ADMIN for access.
The 'ioctl's that can be used on this device are described in a separate
document `autofs-mount-control.txt`, and are summarised briefly here.
document `autofs-mount-control.rst`, and are summarised briefly here.
Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure::
struct autofs_dev_ioctl {

@ -36,7 +36,7 @@ None
Usage
=====
If you're just interested in OCFS2, then please see ocfs2.txt. The
If you're just interested in OCFS2, then please see ocfs2.rst. The
rest of this document will be geared towards those who want to use
dlmfs for easy to setup and easy to use clustered locking in
userspace.

@ -16,7 +16,7 @@ btrfs filesystems. Like fscrypt, not too much filesystem-specific
code is needed to support fs-verity.
fs-verity is similar to `dm-verity
<https://www.kernel.org/doc/Documentation/device-mapper/verity.txt>`_
<https://www.kernel.org/doc/Documentation/admin-guide/device-mapper/verity.rst>`_
but works on files rather than block devices. On regular files on
filesystems supporting fs-verity, userspace can execute an ioctl that
causes the filesystem to build a Merkle tree for the file and persist

@ -531,7 +531,7 @@ this retry process in the next article.
Automount points are locations in the filesystem where an attempt to
lookup a name can trigger changes to how that lookup should be
handled, in particular by mounting a filesystem there. These are
covered in greater detail in autofs.txt in the Linux documentation
covered in greater detail in autofs.rst in the Linux documentation
tree, but a few notes specifically related to path lookup are in order
here.

@ -379,4 +379,4 @@ Papers and other documentation on dcache locking
2. http://lse.sourceforge.net/locking/dcache/dcache.html
3. path-lookup.md in this directory.
3. path-lookup.rst in this directory.

@ -315,7 +315,7 @@ the above threads) is:
2) The cpio archive format chosen by the kernel is simpler and cleaner (and
thus easier to create and parse) than any of the (literally dozens of)
various tar archive formats. The complete initramfs archive format is
explained in buffer-format.txt, created in usr/gen_init_cpio.c, and
explained in buffer-format.rst, created in usr/gen_init_cpio.c, and
extracted in init/initramfs.c. All three together come to less than 26k
total of human-readable text.

@ -587,7 +587,7 @@ Defined in ``include/linux/export.h``
Similar to :c:func:`EXPORT_SYMBOL()` except that the symbols
exported by :c:func:`EXPORT_SYMBOL_GPL()` can only be seen by
modules with a :c:func:`MODULE_LICENSE()` that specifies a GPL
modules with a :c:func:`MODULE_LICENSE()` that specifies a GPLv2
compatible license. It implies that the function is considered an
internal implementation issue, and not really an interface. Some
maintainers and developers may however require EXPORT_SYMBOL_GPL()

@ -0,0 +1,223 @@
.. SPDX-License-Identifier: GPL-2.0
========================================
Debugging advice for driver development
========================================
This document serves as a general starting point and lookup for debugging
device drivers.
While this guide focuses on debugging that requires re-compiling the
module/kernel, the :doc:`userspace debugging guide
</process/debugging/userspace_debugging_guide>` will guide
you through tools like dynamic debug, ftrace and other tools useful for
debugging issues and behavior.
For general debugging advice, see the :doc:`general advice document
</process/debugging/index>`.
.. contents::
:depth: 3
The following sections show you the available tools.
printk() & friends
------------------
These are derivatives of printf() with varying destinations and support for
being dynamically turned on or off, or lack thereof.
Simple printk()
~~~~~~~~~~~~~~~
The classic approach. It can be used to great effect for quick and dirty
development of new modules or to extract arbitrary data needed for
troubleshooting.
Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)
**Pros**:
- No need to learn anything, simple to use
- Easy to modify exactly to your needs (formatting of the data (See:
:doc:`/core-api/printk-formats`), visibility in the log)
- Can cause delays in the execution of the code (beneficial to confirm whether
timing is a factor)
**Cons**:
- Requires rebuilding the kernel/module
- Can cause delays in the execution of the code (which can prevent issues from
being reproduced)
For the full documentation see :doc:`/core-api/printk-basics`
Trace_printk
~~~~~~~~~~~~
Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``
It is a tiny bit less comfortable to use than printk(), because you will have
to read the messages from the trace file (see: :ref:`read_ftrace_log`) instead
of from the kernel log, but it is very useful when printk() adds unwanted
delays into the code execution, causing issues to be flaky or hidden.
If the processing of this still causes timing issues then you can try
trace_puts().
For the full documentation see trace_printk()
dev_dbg
~~~~~~~
A print statement that contains additional information about the device used
within the context and that can be targeted by
:ref:`process/debugging/userspace_debugging_guide:dynamic debug`.
**When is it appropriate to leave a debug print in the code?**
Permanent debug statements have to be useful for a developer to troubleshoot
driver misbehavior. Judging that is a bit more of an art than a science, but
some guidelines are in the :ref:`Coding style guidelines
<process/coding-style:13) printing kernel messages>`. In almost all cases the
debug statements shouldn't be upstreamed, as a working driver is supposed to be
silent.
Custom printk
~~~~~~~~~~~~~
Example::
#define core_dbg(fmt, arg...) do { \
if (core_debug) \
printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
} while (0)
**When should you do this?**
It is better to just use a pr_debug(), which can later be turned on/off with
dynamic debug. Additionally, a lot of drivers activate these prints via a
variable like ``core_debug`` set by a module parameter. However, module
parameters `are not recommended anymore
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.
Ftrace
------
Creating a custom Ftrace tracepoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A tracepoint adds a hook into your code that will be called and logged when the
tracepoint is enabled. This can be used, for example, to trace hitting a
conditional branch or to dump the internal state at specific points of the code
flow during a debugging session.
Here is a basic description of :ref:`how to implement new tracepoints
<trace/tracepoints:usage>`.
For the full event tracing documentation see :doc:`/trace/events`
For the full Ftrace documentation see :doc:`/trace/ftrace`
DebugFS
-------
Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``
DebugFS differs from the other approaches of debugging, as it doesn't write
messages to the kernel log nor add traces to the code. Instead it allows the
developer to handle a set of files.
With these files you can either store values of variables or make
register/memory dumps or you can make these files writable and modify
values/settings in the driver.
Possible use-cases among others:
- Store register values
- Keep track of variables
- Store errors
- Store settings
- Toggle a setting like debug on/off
- Error injection
This is especially useful when the size of a data dump would be hard to digest
as part of the general kernel log (for example when dumping raw bitstream data)
or when you are not interested in all the values all the time, but still want
the possibility to inspect them.
The general idea is:
- Create a directory during probe (``struct dentry *parent =
debugfs_create_dir("my_driver", NULL);``)
- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)
- In this example the file is found in
``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
user/group/all)
- any read of the file will return the current contents of the variable
``my_variable``
- Clean up the directory when removing the device
(``debugfs_remove_recursive(parent);``)
For the full documentation see :doc:`/filesystems/debugfs`.
KASAN, UBSAN, lockdep and other error checkers
----------------------------------------------
KASAN (Kernel Address Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prerequisite: ``CONFIG_KASAN``
KASAN is a dynamic memory error detector that helps to find use-after-free and
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
access.
For the full documentation see :doc:`/dev-tools/kasan`.
UBSAN (Undefined Behavior Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prerequisite: ``CONFIG_UBSAN``
UBSAN relies on compiler instrumentation and runtime checks to detect undefined
behavior. It is designed to find a variety of issues, including signed integer
overflow, array index out of bounds, and more.
For the full documentation see :doc:`/dev-tools/ubsan`
lockdep (Lock Dependency Validator)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prerequisite: ``CONFIG_DEBUG_LOCKDEP``
lockdep is a runtime lock dependency validator that detects potential deadlocks
and other locking-related issues in the kernel.
It tracks lock acquisitions and releases, building a dependency graph that is
analyzed for potential deadlocks.
lockdep is especially useful for validating the correctness of lock ordering in
the kernel.
PSI (Pressure stall information tracking)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prerequisite: ``CONFIG_PSI``
PSI is a measurement tool to identify excessive overcommits on hardware
resources that can cause performance disruptions or even OOM kills.
device coredump
---------------
Prerequisite: ``#include <linux/devcoredump.h>``
Provides the infrastructure for a driver to provide arbitrary data to userland.
It is most often used in conjunction with udev or a similar userland application
that listens for kernel uevents, which indicate that the dump is ready. Udev has
rules to copy that file somewhere for long-term storage and analysis, as by
default, the data for the dump is automatically cleaned up after 5 minutes.
That data is analyzed with driver-specific tools or GDB.
You can find an example implementation at:
`drivers/media/platform/qcom/venus/core.c
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__
**Copyright** ©2024 : Collabora

@ -0,0 +1,78 @@
.. SPDX-License-Identifier: GPL-2.0
============================================
Debugging advice for Linux Kernel developers
============================================
general guides
--------------
.. toctree::
:maxdepth: 1
driver_development_debugging_guide
userspace_debugging_guide
.. only:: subproject and html
subsystem specific guides
-------------------------
.. toctree::
:maxdepth: 1
media_specific_debugging_guide
.. only:: subproject and html
Indices
=======
* :ref:`genindex`
General debugging advice
========================
Depending on the issue, a different set of tools is available to track down the
problem or even to realize whether there is one in the first place.
As a first step you have to figure out what kind of issue you want to debug.
Depending on the answer, your methodology and choice of tools may vary.
Do I need to debug with limited access?
---------------------------------------
Do you have limited access to the machine or are you unable to stop the running
execution?
In this case your debugging capability depends on the built-in debugging
support of the provided distribution kernel.
The :doc:`/process/debugging/userspace_debugging_guide` provides a brief
overview of a range of possible debugging tools in that situation. You can
check the capability of your kernel, in most cases, by looking into the config
file within the ``/boot`` directory.
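As a rough sketch of such a check, the following shell helper reports whether
an option is enabled in a kernel config file. The helper name
``kconfig_enabled`` and the default path ``/boot/config-$(uname -r)`` are
assumptions made for illustration; the location of the config file varies
between distributions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if option $1 is set to y or m in config file $2.
# The default path is an assumption and differs between distributions.
kconfig_enabled() {
    option=$1
    config=${2:-/boot/config-$(uname -r)}
    grep -Eq "^${option}=(y|m)$" "$config"
}

# Check a few of the options named in the debugging guides.
for opt in CONFIG_DYNAMIC_DEBUG CONFIG_DYNAMIC_FTRACE CONFIG_DEBUG_FS; do
    if kconfig_enabled "$opt" 2>/dev/null; then
        echo "$opt: enabled"
    else
        echo "$opt: not enabled (or config file not found)"
    fi
done
```

An option that is reported as not enabled means the matching tool from the
userspace guide is unlikely to be usable on that kernel without a rebuild.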
Do I have root access to the system?
------------------------------------
Are you easily able to replace the module in question or to install a new
kernel?
In that case your range of available tools is a lot bigger; you can find the
tools in the :doc:`/process/debugging/driver_development_debugging_guide`.
Is timing a factor?
-------------------
It is important to understand if the problem you want to debug manifests itself
consistently (i.e. given a set of inputs you always get the same, incorrect
output), or inconsistently. If it manifests itself inconsistently, some timing
factor might be at play. If inserting delays into the code does change the
behavior, then quite likely timing is a factor.
When timing does alter the outcome of the code execution, using a simple
printk() for debugging purposes may not work. A similar alternative is
trace_printk(), which logs the debug messages to the trace file instead of the
kernel log.
**Copyright** ©2024 : Collabora

@ -0,0 +1,180 @@
.. SPDX-License-Identifier: GPL-2.0
============================================
Debugging and tracing in the media subsystem
============================================
This document serves as a starting point and lookup for debugging device
drivers in the media subsystem and to debug these drivers from userspace.
.. contents::
:depth: 3
General debugging advice
------------------------
For general advice see the :doc:`general advice document
</process/debugging/index>`.
The following sections show you some of the available tools.
dev_debug module parameter
--------------------------
Every video device provides a ``dev_debug`` parameter, which gives further
insight into the IOCTLs issued in the background::
# cat /sys/class/video4linux/video3/name
rkvdec
# echo 0xff > /sys/class/video4linux/video3/dev_debug
# dmesg -wH
[...] videodev: v4l2_open: video3: open (0)
[ +0.000036] video3: VIDIOC_QUERYCAP: driver=rkvdec, card=rkvdec,
bus=platform:rkvdec, version=0x00060900, capabilities=0x84204000,
device_caps=0x04204000
For the full documentation see :ref:`driver-api/media/v4l2-dev:video device
debugging`
dev_dbg() / v4l2_dbg()
----------------------
Two debug print statements, specific to devices and to the v4l2 subsystem
respectively. Avoid adding these to your final submission unless they have
long-term value for investigations.
For a general overview please see the
:ref:`process/debugging/driver_development_debugging_guide:printk() & friends`
guide.
- Difference between both?
- v4l2_dbg() utilizes v4l2_printk() under the hood, which further uses
printk() directly, thus it cannot be targeted by dynamic debug
- dev_dbg() can be targeted by dynamic debug
- v4l2_dbg() has a more specific prefix format for the media subsystem, while
dev_dbg() only highlights the driver name and the location of the log
Dynamic debug
-------------
A method to trim down the debug output to your needs.
For general advice see the
:ref:`process/debugging/userspace_debugging_guide:dynamic debug` guide.
Here is one example that enables all available pr_debug() calls within the file::
$ alias ddcmd='echo $* > /proc/dynamic_debug/control'
$ ddcmd '-p; file v4l2-h264.c +p'
$ grep =p /proc/dynamic_debug/control
drivers/media/v4l2-core/v4l2-h264.c:372 [v4l2_h264]print_ref_list_b =p
"ref_pic_list_b%u (cur_poc %u%c) %s"
drivers/media/v4l2-core/v4l2-h264.c:333 [v4l2_h264]print_ref_list_p =p
"ref_pic_list_p (cur_poc %u%c) %s\n"
Ftrace
------
An internal kernel tracer that can trace static predefined events, function
calls, etc. Very useful for debugging problems without changing the kernel and
understanding the behavior of subsystems.
For general advice see the
:ref:`process/debugging/userspace_debugging_guide:ftrace` guide.
DebugFS
-------
This tool allows you to dump or modify internal values of your driver to files
in a custom filesystem.
For general advice see the
:ref:`process/debugging/driver_development_debugging_guide:debugfs` guide.
Perf & alternatives
-------------------
Tools to measure the various stats on a running system to diagnose issues.
For general advice see the
:ref:`process/debugging/userspace_debugging_guide:perf & alternatives` guide.
Example for media devices:
Gather statistics data for a decoding job: (This example is on a RK3399 SoC
with the rkvdec codec driver using the `fluster test suite
<https://github.com/fluendo/fluster>`__)::
perf stat -d python3 fluster.py run -d GStreamer-H.264-V4L2SL-Gst1.0 -ts
JVT-AVC_V1 -tv AUD_MW_E -j1
...
Performance counter stats for 'python3 fluster.py run -d
GStreamer-H.264-V4L2SL-Gst1.0 -ts JVT-AVC_V1 -tv AUD_MW_E -j1 -v':
7794.23 msec task-clock:u # 0.697 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
11901 page-faults:u # 1.527 K/sec
882671556 cycles:u # 0.113 GHz (95.79%)
711708695 instructions:u # 0.81 insn per cycle (95.79%)
10581935 branches:u # 1.358 M/sec (15.13%)
6871144 branch-misses:u # 64.93% of all branches (95.79%)
281716547 L1-dcache-loads:u # 36.144 M/sec (95.79%)
9019581 L1-dcache-load-misses:u # 3.20% of all L1-dcache accesses (95.79%)
<not supported> LLC-loads:u
<not supported> LLC-load-misses:u
11.180830431 seconds time elapsed
1.502318000 seconds user
6.377221000 seconds sys
The availability of events and metrics depends on the system you are running.
Error checking & panic analysis
-------------------------------
Various kernel configuration options that enhance the error detection of the
Linux kernel at the cost of lower performance.
For general advice see the
:ref:`process/debugging/driver_development_debugging_guide:kasan, ubsan,
lockdep and other error checkers` guide.
Driver verification with v4l2-compliance
----------------------------------------
To verify that a driver adheres to the v4l2 API, use the tool v4l2-compliance,
which is part of `v4l-utils
<https://git.linuxtv.org/v4l-utils.git>`__, a suite of userspace tools to work
with the media subsystem.
To see the detailed media topology (and check it) use::
v4l2-compliance -M /dev/mediaX --verbose
You can also run a full compliance check for all devices referenced in the
media topology with::
v4l2-compliance -m /dev/mediaX
Debugging problems with receiving video
---------------------------------------
Implementing vidioc_log_status in the driver: this can log the current status
to the kernel log. It's called by v4l2-ctl --log-status. Very useful for
debugging problems with receiving video (TV/S-Video/HDMI/etc) since the video
signal is external (so unpredictable). Less useful with camera sensor inputs
since you have control over what the camera sensor does.
Usually you can just assign the default::
.vidioc_log_status = v4l2_ctrl_log_status,
But you can also create your own callback, to create a custom status log.
You can find an example in the cobalt driver
(`drivers/media/pci/cobalt/cobalt-v4l2.c <https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/pci/cobalt/cobalt-v4l2.c#L567>`__).
**Copyright** ©2024 : Collabora

@ -0,0 +1,280 @@
.. SPDX-License-Identifier: GPL-2.0
==========================
Userspace debugging advice
==========================
This document provides a brief overview of common tools to debug the Linux
Kernel from userspace.
For debugging advice aimed at driver developers go :doc:`here
</process/debugging/driver_development_debugging_guide>`.
For general debugging advice, see :doc:`general advice document
</process/debugging/index>`.
.. contents::
:depth: 3
The following sections show you the available tools.
Dynamic debug
-------------
Mechanism to filter what ends up in the kernel log by enabling or disabling log
messages.
Prerequisite: ``CONFIG_DYNAMIC_DEBUG``
Dynamic debug is only able to target:
- pr_debug()
- dev_dbg()
- print_hex_dump_debug()
- print_hex_dump_bytes()
Therefore the usability of this tool is, as of now, quite limited as there is
no uniform rule for adding debug prints to the codebase, resulting in a variety
of ways these prints are implemented.
Also, note that most debug statements are implemented as a variation of
dprintk(), which has to be activated via a parameter in the respective module;
dynamic debug is unable to do that step for you.
Here is one example that enables all available pr_debug() calls within the file::
$ alias ddcmd='echo $* > /proc/dynamic_debug/control'
$ ddcmd '-p; file v4l2-h264.c +p'
$ grep =p /proc/dynamic_debug/control
drivers/media/v4l2-core/v4l2-h264.c:372 [v4l2_h264]print_ref_list_b =p
"ref_pic_list_b%u (cur_poc %u%c) %s"
drivers/media/v4l2-core/v4l2-h264.c:333 [v4l2_h264]print_ref_list_p =p
"ref_pic_list_p (cur_poc %u%c) %s\n"
**When should you use this over Ftrace?**
- When the code contains one of the valid print statements (see above) or when
you have added multiple pr_debug() statements during development
- When timing is not an issue, meaning if multiple pr_debug() statements in
the code won't cause delays
- When you care more about receiving specific log messages than tracing the
pattern of how a function is called
For the full documentation see :doc:`/admin-guide/dynamic-debug-howto`
Ftrace
------
Prerequisite: ``CONFIG_DYNAMIC_FTRACE``
This tool uses the tracefs file system for the control files and output files.
That file system will be mounted as a ``tracing`` directory, which can be found
in either ``/sys/kernel/`` or ``/sys/kernel/debug/``.
Some of the most important operations for debugging are:
- You can perform a function trace by adding a function name to the
``set_ftrace_filter`` file (which accepts any function name found within the
``available_filter_functions`` file) or you can specifically disable certain
functions by adding their names to the ``set_ftrace_notrace`` file (more info
at: :ref:`trace/ftrace:dynamic ftrace`).
- In order to find out where calls originate from you can activate the
``func_stack_trace`` option under ``options/func_stack_trace``.
- Tracing the children of a function call and showing the return values are
possible by adding the desired function in the ``set_graph_function`` file
(requires config ``FUNCTION_GRAPH_RETVAL``); more info at
:ref:`trace/ftrace:dynamic ftrace with the function graph tracer`.
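The first two operations above can be sketched as a small shell helper. This
is an illustration under stated assumptions, not a reference recipe: the helper
name ``trace_function`` and the ``TRACEFS`` variable are made up for this
example, while ``set_ftrace_filter``, ``current_tracer``,
``options/func_stack_trace`` and ``tracing_on`` are regular tracefs control
files. Writing to them requires root.

```shell
#!/bin/sh
# Assumption: tracefs is mounted here; override TRACEFS if your system differs.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Hypothetical helper: trace one function and record where each call came from.
trace_function() {
    func=$1
    # Set the filter first so that stack traces are only recorded for $func.
    echo "$func" > "$TRACEFS/set_ftrace_filter"
    echo function > "$TRACEFS/current_tracer"
    echo 1 > "$TRACEFS/options/func_stack_trace"
    echo 1 > "$TRACEFS/tracing_on"
}

# Usage (as root), with a function listed in available_filter_functions:
#   trace_function rkvdec_device_run
#   cat "$TRACEFS/trace"
```

Setting the filter before enabling the stack trace option keeps the amount of
recorded data manageable.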
For the full Ftrace documentation see :doc:`/trace/ftrace`
You can also trace specific events by :ref:`using event tracing
<trace/events:2. using event tracing>`, which can be defined as described here:
:ref:`Creating a custom Ftrace tracepoint
<process/debugging/driver_development_debugging_guide:ftrace>`.
For the full Ftrace event tracing documentation see :doc:`/trace/events`
.. _read_ftrace_log:
Reading the ftrace log
~~~~~~~~~~~~~~~~~~~~~~
The ``trace`` file can be read just like any other file (``cat``, ``tail``,
``head``, ``vim``, etc.). The size of the file is limited by
``buffer_size_kb`` (for example, ``echo 1000 > buffer_size_kb``). The
:ref:`trace/ftrace:trace_pipe` will behave similarly to the ``trace`` file, but
whenever you read from the file the content is consumed.
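A minimal sketch of reading the log from a script follows. The helper name
``dump_trace`` and the ``TRACEFS`` variable are assumptions for illustration;
``buffer_size_kb`` and ``trace`` are the tracefs files described above.

```shell
#!/bin/sh
# Assumption: tracefs is mounted here; override TRACEFS if your system differs.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Hypothetical helper: resize the ring buffer, then print a snapshot of the
# log. Reading "trace" does not consume the entries; reading "trace_pipe" does.
dump_trace() {
    size_kb=${1:-1000}
    echo "$size_kb" > "$TRACEFS/buffer_size_kb"
    cat "$TRACEFS/trace"
}

# Usage (as root):
#   dump_trace 2000 | tail -n 20
```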
Kernelshark
~~~~~~~~~~~
A GUI to visualize the traces as a graph and list view from the
output of the `trace-cmd
<https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/>`__ application.
For the full documentation see `<https://kernelshark.org/Documentation.html>`__
Perf & alternatives
-------------------
The tools mentioned above provide ways to inspect kernel code, results,
variable values, etc. Sometimes you have to find out first where to look and
for those cases, a box of performance tracking tools can help you to frame the
issue.
Why should you do a performance analysis?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A performance analysis is a good first step when, among other reasons:
- you cannot define the issue
- you do not know where it occurs
- the running system should not be interrupted or it is a remote system, where
you cannot install a new module/kernel
How to do a simple analysis with Linux tools?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can begin a performance analysis with the usual tools, such as:
- ``top`` / ``htop`` / ``atop`` (*get an overview of the system load, see
spikes on specific processes*)
- ``mpstat -P ALL`` (*look at the load distribution among CPUs*)
- ``iostat -x`` (*observe input and output devices utilization and performance*)
- ``vmstat`` (*overview of memory usage on the system*)
- ``pidstat`` (*similar to* ``vmstat`` *but per process, to dial it down to the
target*)
- ``strace -tp $PID`` (*once you know the process, you can figure out how it
communicates with the Kernel*)
These should help to narrow down the areas to look at sufficiently.
Diving deeper with perf
~~~~~~~~~~~~~~~~~~~~~~~
The **perf** tool provides a series of metrics and events to further dial down
on issues.
Prerequisite: build or install perf on your system
Gather statistics data for finding all files starting with ``gcc`` in ``/usr``::
# perf stat -d find /usr -name 'gcc*' | wc -l
Performance counter stats for 'find /usr -name gcc*':
1277.81 msec task-clock # 0.997 CPUs utilized
9 context-switches # 7.043 /sec
1 cpu-migrations # 0.783 /sec
704 page-faults # 550.943 /sec
766548897 cycles # 0.600 GHz (97.15%)
798285467 instructions # 1.04 insn per cycle (97.15%)
57582731 branches # 45.064 M/sec (2.85%)
3842573 branch-misses # 6.67% of all branches (97.15%)
281616097 L1-dcache-loads # 220.390 M/sec (97.15%)
4220975 L1-dcache-load-misses # 1.50% of all L1-dcache accesses (97.15%)
<not supported> LLC-loads
<not supported> LLC-load-misses
1.281746009 seconds time elapsed
0.508796000 seconds user
0.773209000 seconds sys
52
The availability of events and metrics depends on the system you are running.
For the full documentation see
`<https://perf.wiki.kernel.org/index.php/Main_Page>`__
Perfetto
~~~~~~~~
A set of tools to measure and analyze how well applications and systems perform.
You can use it to:
* identify bottlenecks
* optimize code
* make software run faster and more efficiently.
**What is the difference between perfetto and perf?**
* perf is a tool that is part of, and specialized for, the Linux kernel and has
a CLI user interface.
* perfetto is a cross-platform performance analysis stack that extends its
functionality into userspace and provides a web user interface.
For the full documentation see `<https://perfetto.dev/docs/>`__
Kernel panic analysis tools
---------------------------
To capture the crash dump please use ``Kdump`` & ``Kexec``. Below you can find
some advice for analysing the data.
For the full documentation see the :doc:`/admin-guide/kdump/kdump`
In order to find the corresponding line in the code you can use `faddr2line
<https://elixir.bootlin.com/linux/v6.11.6/source/scripts/faddr2line>`__; note
that you need to enable ``CONFIG_DEBUG_INFO`` for that to work.
An alternative to using ``faddr2line`` is the use of ``objdump`` (and its
derivatives for the different platforms like ``aarch64-linux-gnu-objdump``).
Take this line as an example:
``[ +0.000240] rkvdec_device_run+0x50/0x138 [rockchip_vdec]``.
We can find the corresponding line of code by executing::
aarch64-linux-gnu-objdump -dS drivers/staging/media/rkvdec/rockchip-vdec.ko | grep rkvdec_device_run\>: -A 40
0000000000000ac8 <rkvdec_device_run>:
ac8: d503201f nop
acc: d503201f nop
{
ad0: d503233f paciasp
ad4: a9bd7bfd stp x29, x30, [sp, #-48]!
ad8: 910003fd mov x29, sp
adc: a90153f3 stp x19, x20, [sp, #16]
ae0: a9025bf5 stp x21, x22, [sp, #32]
const struct rkvdec_coded_fmt_desc *desc = ctx->coded_fmt_desc;
ae4: f9411814 ldr x20, [x0, #560]
struct rkvdec_dev *rkvdec = ctx->dev;
ae8: f9418015 ldr x21, [x0, #768]
if (WARN_ON(!desc))
aec: b4000654 cbz x20, bb4 <rkvdec_device_run+0xec>
ret = pm_runtime_resume_and_get(rkvdec->dev);
af0: f943d2b6 ldr x22, [x21, #1952]
ret = __pm_runtime_resume(dev, RPM_GET_PUT);
af4: aa0003f3 mov x19, x0
af8: 52800081 mov w1, #0x4 // #4
afc: aa1603e0 mov x0, x22
b00: 94000000 bl 0 <__pm_runtime_resume>
if (ret < 0) {
b04: 37f80340 tbnz w0, #31, b6c <rkvdec_device_run+0xa4>
dev_warn(rkvdec->dev, "Not good\n");
b08: f943d2a0 ldr x0, [x21, #1952]
b0c: 90000001 adrp x1, 0 <rkvdec_try_ctrl-0x8>
b10: 91000021 add x1, x1, #0x0
b14: 94000000 bl 0 <_dev_warn>
*bad = 1;
b18: d2800001 mov x1, #0x0 // #0
...
Meaning, in this line from the crash dump::
[ +0.000240] rkvdec_device_run+0x50/0x138 [rockchip_vdec]
I can take the ``0x50`` as offset, which I have to add to the base address
of the corresponding function, which I find in this line::
0000000000000ac8 <rkvdec_device_run>:
The result is ``0xac8 + 0x50 = 0xb18``.
And when I search for that address within the function I get the
following line::
*bad = 1;
b18: d2800001 mov x1, #0x0
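The same arithmetic can be scripted. The following is a minimal sketch; the
helper name ``resolve_offset`` is hypothetical and not part of any kernel
tooling. It takes the base address from the objdump output and the
``symbol+offset/size`` entry from the crash dump.

```shell
#!/bin/sh
# Hypothetical helper: add the offset of a "symbol+0xOFF/0xSIZE" backtrace
# entry to the base address of the symbol as shown by objdump.
resolve_offset() {
    base=$1                 # e.g. 0xac8
    entry=$2                # e.g. rkvdec_device_run+0x50/0x138
    offset=${entry#*+}      # drop "symbol+"  -> 0x50/0x138
    offset=${offset%%/*}    # drop "/0xSIZE"  -> 0x50
    printf '0x%x\n' $(( ${base} + ${offset} ))
}

resolve_offset 0xac8 'rkvdec_device_run+0x50/0x138'   # prints 0xb18
```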
**Copyright** ©2024 : Collabora

@ -72,13 +72,15 @@ beyond).
Dealing with bugs
-----------------
Bugs are a fact of life; it is important that we handle them properly.
The documents below describe our policies around the handling of a couple
of special classes of bugs: regressions and security problems.
Bugs are a fact of life; it is important that we handle them properly. The
documents below provide general advice about debugging and describe our
policies around the handling of a couple of special classes of bugs:
regressions and security problems.
.. toctree::
:maxdepth: 1
debugging/index
handling-regressions
security-bugs
cve

View File

@ -471,14 +471,16 @@ _`MODULE_LICENSE`
source files.
"Proprietary" The module is under a proprietary license.
This string is solely for proprietary third
party modules and cannot be used for modules
which have their source code in the kernel
tree. Modules tagged that way are tainting
the kernel with the 'P' flag when loaded and
the kernel module loader refuses to link such
modules against symbols which are exported
with EXPORT_SYMBOL_GPL().
"Proprietary" is to be understood only as
"The license is not compatible to GPLv2".
This string is solely for non-GPL2 compatible
third party modules and cannot be used for
modules which have their source code in the
kernel tree. Modules tagged that way are
tainting the kernel with the 'P' flag when
loaded and the kernel module loader refuses
to link such modules against symbols which
are exported with EXPORT_SYMBOL_GPL().
============================= =============================================