mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-11 16:29:05 +00:00
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM: PM QOS update fix Freezer / cgroup freezer: Update stale locking comments PM / platform_bus: Allow runtime PM by default i2c: Fix bus-level power management callbacks PM QOS update PM / Hibernate: Fix block_io.c printk warning PM / Hibernate: Group swap ops PM / Hibernate: Move the first_sector out of swsusp_write PM / Hibernate: Separate block_io PM / Hibernate: Snapshot cleanup FS / libfs: Implement simple_write_to_buffer PM / Hibernate: document open(/dev/snapshot) side effects PM / Runtime: Add sysfs debug files PM: Improve device power management document PM: Update device power management document PM: Allow runtime_suspend methods to call pm_schedule_suspend() PM: pm_wakeup - switch to using bool
This commit is contained in:
commit
46ee964509
@ -1,7 +1,13 @@
|
||||
Device Power Management
|
||||
|
||||
Copyright (c) 2010 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
|
||||
Copyright (c) 2010 Alan Stern <stern@rowland.harvard.edu>
|
||||
|
||||
|
||||
Most of the code in Linux is device drivers, so most of the Linux power
|
||||
management code is also driver-specific. Most drivers will do very little;
|
||||
others, especially for platforms with small batteries (like cell phones),
|
||||
will do a lot.
|
||||
management (PM) code is also driver-specific. Most drivers will do very
|
||||
little; others, especially for platforms with small batteries (like cell
|
||||
phones), will do a lot.
|
||||
|
||||
This writeup gives an overview of how drivers interact with system-wide
|
||||
power management goals, emphasizing the models and interfaces that are
|
||||
@ -15,9 +21,10 @@ Drivers will use one or both of these models to put devices into low-power
|
||||
states:
|
||||
|
||||
System Sleep model:
|
||||
Drivers can enter low power states as part of entering system-wide
|
||||
low-power states like "suspend-to-ram", or (mostly for systems with
|
||||
disks) "hibernate" (suspend-to-disk).
|
||||
Drivers can enter low-power states as part of entering system-wide
|
||||
low-power states like "suspend" (also known as "suspend-to-RAM"), or
|
||||
(mostly for systems with disks) "hibernation" (also known as
|
||||
"suspend-to-disk").
|
||||
|
||||
This is something that device, bus, and class drivers collaborate on
|
||||
by implementing various role-specific suspend and resume methods to
|
||||
@ -25,33 +32,41 @@ states:
|
||||
them without loss of data.
|
||||
|
||||
Some drivers can manage hardware wakeup events, which make the system
|
||||
leave that low-power state. This feature may be disabled using the
|
||||
relevant /sys/devices/.../power/wakeup file; enabling it may cost some
|
||||
power usage, but let the whole system enter low power states more often.
|
||||
leave the low-power state. This feature may be enabled or disabled
|
||||
using the relevant /sys/devices/.../power/wakeup file (for Ethernet
|
||||
drivers the ioctl interface used by ethtool may also be used for this
|
||||
purpose); enabling it may cost some power usage, but let the whole
|
||||
system enter low-power states more often.
|
||||
|
||||
Runtime Power Management model:
|
||||
Drivers may also enter low power states while the system is running,
|
||||
independently of other power management activity. Upstream drivers
|
||||
will normally not know (or care) if the device is in some low power
|
||||
state when issuing requests; the driver will auto-resume anything
|
||||
that's needed when it gets a request.
|
||||
Devices may also be put into low-power states while the system is
|
||||
running, independently of other power management activity in principle.
|
||||
However, devices are not generally independent of each other (for
|
||||
example, a parent device cannot be suspended unless all of its child
|
||||
devices have been suspended). Moreover, depending on the bus type the
|
||||
device is on, it may be necessary to carry out some bus-specific
|
||||
operations on the device for this purpose. Devices put into low power
|
||||
states at run time may require special handling during system-wide power
|
||||
transitions (suspend or hibernation).
|
||||
|
||||
This doesn't have, or need much infrastructure; it's just something you
|
||||
should do when writing your drivers. For example, clk_disable() unused
|
||||
clocks as part of minimizing power drain for currently-unused hardware.
|
||||
Of course, sometimes clusters of drivers will collaborate with each
|
||||
other, which could involve task-specific power management.
|
||||
For these reasons not only the device driver itself, but also the
|
||||
appropriate subsystem (bus type, device type or device class) driver and
|
||||
the PM core are involved in runtime power management. As in the system
|
||||
sleep power management case, they need to collaborate by implementing
|
||||
various role-specific suspend and resume methods, so that the hardware
|
||||
is cleanly powered down and reactivated without data or service loss.
|
||||
|
||||
There's not a lot to be said about those low power states except that they
|
||||
are very system-specific, and often device-specific. Also, that if enough
|
||||
drivers put themselves into low power states (at "runtime"), the effect may be
|
||||
the same as entering some system-wide low-power state (system sleep) ... and
|
||||
that synergies exist, so that several drivers using runtime pm might put the
|
||||
system into a state where even deeper power saving options are available.
|
||||
There's not a lot to be said about those low-power states except that they are
|
||||
very system-specific, and often device-specific. Also, that if enough devices
|
||||
have been put into low-power states (at runtime), the effect may be very similar
|
||||
to entering some system-wide low-power state (system sleep) ... and that
|
||||
synergies exist, so that several drivers using runtime PM might put the system
|
||||
into a state where even deeper power saving options are available.
|
||||
|
||||
Most suspended devices will have quiesced all I/O: no more DMA or irqs, no
|
||||
more data read or written, and requests from upstream drivers are no longer
|
||||
accepted. A given bus or platform may have different requirements though.
|
||||
Most suspended devices will have quiesced all I/O: no more DMA or IRQs (except
|
||||
for wakeup events), no more data read or written, and requests from upstream
|
||||
drivers are no longer accepted. A given bus or platform may have different
|
||||
requirements though.
|
||||
|
||||
Examples of hardware wakeup events include an alarm from a real time clock,
|
||||
network wake-on-LAN packets, keyboard or mouse activity, and media insertion
|
||||
@ -60,129 +75,152 @@ or removal (for PCMCIA, MMC/SD, USB, and so on).
|
||||
|
||||
Interfaces for Entering System Sleep States
|
||||
===========================================
|
||||
Most of the programming interfaces a device driver needs to know about
|
||||
relate to that first model: entering a system-wide low power state,
|
||||
rather than just minimizing power consumption by one device.
|
||||
There are programming interfaces provided for subsystems (bus type, device type,
|
||||
device class) and device drivers to allow them to participate in the power
|
||||
management of devices they are concerned with. These interfaces cover both
|
||||
system sleep and runtime power management.
|
||||
|
||||
|
||||
Bus Driver Methods
|
||||
------------------
|
||||
The core methods to suspend and resume devices reside in struct bus_type.
|
||||
These are mostly of interest to people writing infrastructure for busses
|
||||
like PCI or USB, or because they define the primitives that device drivers
|
||||
may need to apply in domain-specific ways to their devices:
|
||||
Device Power Management Operations
|
||||
----------------------------------
|
||||
Device power management operations, at the subsystem level as well as at the
|
||||
device driver level, are implemented by defining and populating objects of type
|
||||
struct dev_pm_ops:
|
||||
|
||||
struct bus_type {
|
||||
...
|
||||
int (*suspend)(struct device *dev, pm_message_t state);
|
||||
int (*resume)(struct device *dev);
|
||||
struct dev_pm_ops {
|
||||
int (*prepare)(struct device *dev);
|
||||
void (*complete)(struct device *dev);
|
||||
int (*suspend)(struct device *dev);
|
||||
int (*resume)(struct device *dev);
|
||||
int (*freeze)(struct device *dev);
|
||||
int (*thaw)(struct device *dev);
|
||||
int (*poweroff)(struct device *dev);
|
||||
int (*restore)(struct device *dev);
|
||||
int (*suspend_noirq)(struct device *dev);
|
||||
int (*resume_noirq)(struct device *dev);
|
||||
int (*freeze_noirq)(struct device *dev);
|
||||
int (*thaw_noirq)(struct device *dev);
|
||||
int (*poweroff_noirq)(struct device *dev);
|
||||
int (*restore_noirq)(struct device *dev);
|
||||
int (*runtime_suspend)(struct device *dev);
|
||||
int (*runtime_resume)(struct device *dev);
|
||||
int (*runtime_idle)(struct device *dev);
|
||||
};
|
||||
|
||||
Bus drivers implement those methods as appropriate for the hardware and
|
||||
the drivers using it; PCI works differently from USB, and so on. Not many
|
||||
people write bus drivers; most driver code is a "device driver" that
|
||||
builds on top of bus-specific framework code.
|
||||
This structure is defined in include/linux/pm.h and the methods included in it
|
||||
are also described in that file. Their roles will be explained in what follows.
|
||||
For now, it should be sufficient to remember that the last three methods are
|
||||
specific to runtime power management while the remaining ones are used during
|
||||
system-wide power transitions.
|
||||
|
||||
There also is a deprecated "old" or "legacy" interface for power management
|
||||
operations available at least for some subsystems. This approach does not use
|
||||
struct dev_pm_ops objects and it is suitable only for implementing system sleep
|
||||
power management methods. Therefore it is not described in this document, so
|
||||
please refer directly to the source code for more information about it.
|
||||
|
||||
|
||||
Subsystem-Level Methods
|
||||
-----------------------
|
||||
The core methods to suspend and resume devices reside in struct dev_pm_ops
|
||||
pointed to by the pm member of struct bus_type, struct device_type and
|
||||
struct class. They are mostly of interest to the people writing infrastructure
|
||||
for buses, like PCI or USB, or device type and device class drivers.
|
||||
|
||||
Bus drivers implement these methods as appropriate for the hardware and the
|
||||
drivers using it; PCI works differently from USB, and so on. Not many people
|
||||
write subsystem-level drivers; most driver code is a "device driver" that builds
|
||||
on top of bus-specific framework code.
|
||||
|
||||
For more information on these driver calls, see the description later;
|
||||
they are called in phases for every device, respecting the parent-child
|
||||
sequencing in the driver model tree. Note that as this is being written,
|
||||
only the suspend() and resume() are widely available; not many bus drivers
|
||||
leverage all of those phases, or pass them down to lower driver levels.
|
||||
sequencing in the driver model tree.
|
||||
|
||||
|
||||
/sys/devices/.../power/wakeup files
|
||||
-----------------------------------
|
||||
All devices in the driver model have two flags to control handling of
|
||||
wakeup events, which are hardware signals that can force the device and/or
|
||||
system out of a low power state. These are initialized by bus or device
|
||||
driver code using device_init_wakeup(dev,can_wakeup).
|
||||
All devices in the driver model have two flags to control handling of wakeup
|
||||
events (hardware signals that can force the device and/or system out of a low
|
||||
power state). These flags are initialized by bus or device driver code using
|
||||
device_set_wakeup_capable() and device_set_wakeup_enable(), defined in
|
||||
include/linux/pm_wakeup.h.
|
||||
|
||||
The "can_wakeup" flag just records whether the device (and its driver) can
|
||||
physically support wakeup events. When that flag is clear, the sysfs
|
||||
"wakeup" file is empty, and device_may_wakeup() returns false.
|
||||
physically support wakeup events. The device_set_wakeup_capable() routine
|
||||
affects this flag. The "should_wakeup" flag controls whether the device should
|
||||
try to use its wakeup mechanism. device_set_wakeup_enable() affects this flag;
|
||||
for the most part drivers should not change its value. The initial value of
|
||||
should_wakeup is supposed to be false for the majority of devices; the major
|
||||
exceptions are power buttons, keyboards, and Ethernet adapters whose WoL
|
||||
(wake-on-LAN) feature has been set up with ethtool.
|
||||
|
||||
For devices that can issue wakeup events, a separate flag controls whether
|
||||
that device should try to use its wakeup mechanism. The initial value of
|
||||
device_may_wakeup() will be true, so that the device's "wakeup" file holds
|
||||
the value "enabled". Userspace can change that to "disabled" so that
|
||||
device_may_wakeup() returns false; or change it back to "enabled" (so that
|
||||
it returns true again).
|
||||
Whether or not a device is capable of issuing wakeup events is a hardware
|
||||
matter, and the kernel is responsible for keeping track of it. By contrast,
|
||||
whether or not a wakeup-capable device should issue wakeup events is a policy
|
||||
decision, and it is managed by user space through a sysfs attribute: the
|
||||
power/wakeup file. User space can write the strings "enabled" or "disabled" to
|
||||
set or clear the should_wakeup flag, respectively. Reads from the file will
|
||||
return the corresponding string if can_wakeup is true, but if can_wakeup is
|
||||
false then reads will return an empty string, to indicate that the device
|
||||
doesn't support wakeup events. (But even though the file appears empty, writes
|
||||
will still affect the should_wakeup flag.)
|
||||
|
||||
The device_may_wakeup() routine returns true only if both flags are set.
|
||||
Drivers should check this routine when putting devices in a low-power state
|
||||
during a system sleep transition, to see whether or not to enable the devices'
|
||||
wakeup mechanisms. However for runtime power management, wakeup events should
|
||||
be enabled whenever the device and driver both support them, regardless of the
|
||||
should_wakeup flag.
|
||||
|
||||
|
||||
EXAMPLE: PCI Device Driver Methods
|
||||
-----------------------------------
|
||||
PCI framework software calls these methods when the PCI device driver bound
|
||||
to a device device has provided them:
|
||||
/sys/devices/.../power/control files
|
||||
------------------------------------
|
||||
Each device in the driver model has a flag to control whether it is subject to
|
||||
runtime power management. This flag, called runtime_auto, is initialized by the
|
||||
bus type (or generally subsystem) code using pm_runtime_allow() or
|
||||
pm_runtime_forbid(); the default is to allow runtime power management.
|
||||
|
||||
struct pci_driver {
|
||||
...
|
||||
int (*suspend)(struct pci_device *pdev, pm_message_t state);
|
||||
int (*suspend_late)(struct pci_device *pdev, pm_message_t state);
|
||||
The setting can be adjusted by user space by writing either "on" or "auto" to
|
||||
the device's power/control sysfs file. Writing "auto" calls pm_runtime_allow(),
|
||||
setting the flag and allowing the device to be runtime power-managed by its
|
||||
driver. Writing "on" calls pm_runtime_forbid(), clearing the flag, returning
|
||||
the device to full power if it was in a low-power state, and preventing the
|
||||
device from being runtime power-managed. User space can check the current value
|
||||
of the runtime_auto flag by reading the file.
|
||||
|
||||
int (*resume_early)(struct pci_device *pdev);
|
||||
int (*resume)(struct pci_device *pdev);
|
||||
};
|
||||
The device's runtime_auto flag has no effect on the handling of system-wide
|
||||
power transitions. In particular, the device can (and in the majority of cases
|
||||
should and will) be put into a low-power state during a system-wide transition
|
||||
to a sleep state even though its runtime_auto flag is clear.
|
||||
|
||||
Drivers will implement those methods, and call PCI-specific procedures
|
||||
like pci_set_power_state(), pci_enable_wake(), pci_save_state(), and
|
||||
pci_restore_state() to manage PCI-specific mechanisms. (PCI config space
|
||||
could be saved during driver probe, if it weren't for the fact that some
|
||||
systems rely on userspace tweaking using setpci.) Devices are suspended
|
||||
before their bridges enter low power states, and likewise bridges resume
|
||||
before their devices.
|
||||
For more information about the runtime power management framework, refer to
|
||||
Documentation/power/runtime_pm.txt.
|
||||
|
||||
|
||||
Upper Layers of Driver Stacks
|
||||
-----------------------------
|
||||
Device drivers generally have at least two interfaces, and the methods
|
||||
sketched above are the ones which apply to the lower level (nearer PCI, USB,
|
||||
or other bus hardware). The network and block layers are examples of upper
|
||||
level interfaces, as is a character device talking to userspace.
|
||||
|
||||
Power management requests normally need to flow through those upper levels,
|
||||
which often use domain-oriented requests like "blank that screen". In
|
||||
some cases those upper levels will have power management intelligence that
|
||||
relates to end-user activity, or other devices that work in cooperation.
|
||||
|
||||
When those interfaces are structured using class interfaces, there is a
|
||||
standard way to have the upper layer stop issuing requests to a given
|
||||
class device (and restart later):
|
||||
|
||||
struct class {
|
||||
...
|
||||
int (*suspend)(struct device *dev, pm_message_t state);
|
||||
int (*resume)(struct device *dev);
|
||||
};
|
||||
|
||||
Those calls are issued in specific phases of the process by which the
|
||||
system enters a low power "suspend" state, or resumes from it.
|
||||
|
||||
|
||||
Calling Drivers to Enter System Sleep States
|
||||
============================================
|
||||
When the system enters a low power state, each device's driver is asked
|
||||
to suspend the device by putting it into state compatible with the target
|
||||
Calling Drivers to Enter and Leave System Sleep States
|
||||
======================================================
|
||||
When the system goes into a sleep state, each device's driver is asked to
|
||||
suspend the device by putting it into a state compatible with the target
|
||||
system state. That's usually some version of "off", but the details are
|
||||
system-specific. Also, wakeup-enabled devices will usually stay partly
|
||||
functional in order to wake the system.
|
||||
|
||||
When the system leaves that low power state, the device's driver is asked
|
||||
to resume it. The suspend and resume operations always go together, and
|
||||
both are multi-phase operations.
|
||||
When the system leaves that low-power state, the device's driver is asked to
|
||||
resume it by returning it to full power. The suspend and resume operations
|
||||
always go together, and both are multi-phase operations.
|
||||
|
||||
For simple drivers, suspend might quiesce the device using the class code
|
||||
and then turn its hardware as "off" as possible with late_suspend. The
|
||||
For simple drivers, suspend might quiesce the device using class code
|
||||
and then turn its hardware as "off" as possible during suspend_noirq. The
|
||||
matching resume calls would then completely reinitialize the hardware
|
||||
before reactivating its class I/O queues.
|
||||
|
||||
More power-aware drivers drivers will use more than one device low power
|
||||
state, either at runtime or during system sleep states, and might trigger
|
||||
system wakeup events.
|
||||
More power-aware drivers might prepare the devices for triggering system wakeup
|
||||
events.
|
||||
|
||||
|
||||
Call Sequence Guarantees
|
||||
------------------------
|
||||
To ensure that bridges and similar links needed to talk to a device are
|
||||
To ensure that bridges and similar links needing to talk to a device are
|
||||
available when the device is suspended or resumed, the device tree is
|
||||
walked in a bottom-up order to suspend devices. A top-down order is
|
||||
used to resume those devices.
|
||||
@ -194,67 +232,310 @@ its parent; and can't be removed or suspended after that parent.
|
||||
The policy is that the device tree should match hardware bus topology.
|
||||
(Or at least the control bus, for devices which use multiple busses.)
|
||||
In particular, this means that a device registration may fail if the parent of
|
||||
the device is suspending (ie. has been chosen by the PM core as the next
|
||||
the device is suspending (i.e. has been chosen by the PM core as the next
|
||||
device to suspend) or has already suspended, as well as after all of the other
|
||||
devices have been suspended. Device drivers must be prepared to cope with such
|
||||
situations.
|
||||
|
||||
|
||||
Suspending Devices
|
||||
------------------
|
||||
Suspending a given device is done in several phases. Suspending the
|
||||
system always includes every phase, executing calls for every device
|
||||
before the next phase begins. Not all busses or classes support all
|
||||
these callbacks; and not all drivers use all the callbacks.
|
||||
System Power Management Phases
|
||||
------------------------------
|
||||
Suspending or resuming the system is done in several phases. Different phases
|
||||
are used for standby or memory sleep states ("suspend-to-RAM") and the
|
||||
hibernation state ("suspend-to-disk"). Each phase involves executing callbacks
|
||||
for every device before the next phase begins. Not all busses or classes
|
||||
support all these callbacks and not all drivers use all the callbacks. The
|
||||
various phases always run after tasks have been frozen and before they are
|
||||
unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
|
||||
been disabled (except for those marked with the IRQ_WAKEUP flag).
|
||||
|
||||
The phases are seen by driver notifications issued in this order:
|
||||
Most phases use bus, type, and class callbacks (that is, methods defined in
|
||||
dev->bus->pm, dev->type->pm, and dev->class->pm). The prepare and complete
|
||||
phases are exceptions; they use only bus callbacks. When multiple callbacks
|
||||
are used in a phase, they are invoked in the order: <class, type, bus> during
|
||||
power-down transitions and in the opposite order during power-up transitions.
|
||||
For example, during the suspend phase the PM core invokes
|
||||
|
||||
1 class.suspend(dev, message) is called after tasks are frozen, for
|
||||
devices associated with a class that has such a method. This
|
||||
method may sleep.
|
||||
dev->class->pm.suspend(dev);
|
||||
dev->type->pm.suspend(dev);
|
||||
dev->bus->pm.suspend(dev);
|
||||
|
||||
Since I/O activity usually comes from such higher layers, this is
|
||||
a good place to quiesce all drivers of a given type (and keep such
|
||||
code out of those drivers).
|
||||
before moving on to the next device, whereas during the resume phase the core
|
||||
invokes
|
||||
|
||||
2 bus.suspend(dev, message) is called next. This method may sleep,
|
||||
and is often morphed into a device driver call with bus-specific
|
||||
parameters and/or rules.
|
||||
dev->bus->pm.resume(dev);
|
||||
dev->type->pm.resume(dev);
|
||||
dev->class->pm.resume(dev);
|
||||
|
||||
This call should handle parts of device suspend logic that require
|
||||
sleeping. It probably does work to quiesce the device which hasn't
|
||||
been abstracted into class.suspend().
|
||||
These callbacks may in turn invoke device- or driver-specific methods stored in
|
||||
dev->driver->pm, but they don't have to.
|
||||
|
||||
The pm_message_t parameter is currently used to refine those semantics
|
||||
(described later).
|
||||
|
||||
At the end of those phases, drivers should normally have stopped all I/O
|
||||
transactions (DMA, IRQs), saved enough state that they can re-initialize
|
||||
or restore previous state (as needed by the hardware), and placed the
|
||||
device into a low-power state. On many platforms they will also use
|
||||
clk_disable() to gate off one or more clock sources; sometimes they will
|
||||
also switch off power supplies, or reduce voltages. Drivers which have
|
||||
runtime PM support may already have performed some or all of the steps
|
||||
needed to prepare for the upcoming system sleep state.
|
||||
Entering System Suspend
|
||||
-----------------------
|
||||
When the system goes into the standby or memory sleep state, the phases are:
|
||||
|
||||
When any driver sees that its device_can_wakeup(dev), it should make sure
|
||||
to use the relevant hardware signals to trigger a system wakeup event.
|
||||
For example, enable_irq_wake() might identify GPIO signals hooked up to
|
||||
a switch or other external hardware, and pci_enable_wake() does something
|
||||
similar for PCI's PME# signal.
|
||||
prepare, suspend, suspend_noirq.
|
||||
|
||||
If a driver (or bus, or class) fails it suspend method, the system won't
|
||||
enter the desired low power state; it will resume all the devices it's
|
||||
suspended so far.
|
||||
1. The prepare phase is meant to prevent races by preventing new devices
|
||||
from being registered; the PM core would never know that all the
|
||||
children of a device had been suspended if new children could be
|
||||
registered at will. (By contrast, devices may be unregistered at any
|
||||
time.) Unlike the other suspend-related phases, during the prepare
|
||||
phase the device tree is traversed top-down.
|
||||
|
||||
Note that drivers may need to perform different actions based on the target
|
||||
system lowpower/sleep state. At this writing, there are only platform
|
||||
specific APIs through which drivers could determine those target states.
|
||||
The prepare phase uses only a bus callback. After the callback method
|
||||
returns, no new children may be registered below the device. The method
|
||||
may also prepare the device or driver in some way for the upcoming
|
||||
system power transition, but it should not put the device into a
|
||||
low-power state.
|
||||
|
||||
2. The suspend methods should quiesce the device to stop it from performing
|
||||
I/O. They also may save the device registers and put it into the
|
||||
appropriate low-power state, depending on the bus type the device is on,
|
||||
and they may enable wakeup events.
|
||||
|
||||
3. The suspend_noirq phase occurs after IRQ handlers have been disabled,
|
||||
which means that the driver's interrupt handler will not be called while
|
||||
the callback method is running. The methods should save the values of
|
||||
the device's registers that weren't saved previously and finally put the
|
||||
device into the appropriate low-power state.
|
||||
|
||||
The majority of subsystems and device drivers need not implement this
|
||||
callback. However, bus types allowing devices to share interrupt
|
||||
vectors, like PCI, generally need it; otherwise a driver might encounter
|
||||
an error during the suspend phase by fielding a shared interrupt
|
||||
generated by some other device after its own device had been set to low
|
||||
power.
|
||||
|
||||
At the end of these phases, drivers should have stopped all I/O transactions
|
||||
(DMA, IRQs), saved enough state that they can re-initialize or restore previous
|
||||
state (as needed by the hardware), and placed the device into a low-power state.
|
||||
On many platforms they will gate off one or more clock sources; sometimes they
|
||||
will also switch off power supplies or reduce voltages. (Drivers supporting
|
||||
runtime PM may already have performed some or all of these steps.)
|
||||
|
||||
If device_may_wakeup(dev) returns true, the device should be prepared for
|
||||
generating hardware wakeup signals to trigger a system wakeup event when the
|
||||
system is in the sleep state. For example, enable_irq_wake() might identify
|
||||
GPIO signals hooked up to a switch or other external hardware, and
|
||||
pci_enable_wake() does something similar for the PCI PME signal.
|
||||
|
||||
If any of these callbacks returns an error, the system won't enter the desired
|
||||
low-power state. Instead the PM core will unwind its actions by resuming all
|
||||
the devices that were suspended.
|
||||
|
||||
|
||||
Leaving System Suspend
|
||||
----------------------
|
||||
When resuming from standby or memory sleep, the phases are:
|
||||
|
||||
resume_noirq, resume, complete.
|
||||
|
||||
1. The resume_noirq callback methods should perform any actions needed
|
||||
before the driver's interrupt handlers are invoked. This generally
|
||||
means undoing the actions of the suspend_noirq phase. If the bus type
|
||||
permits devices to share interrupt vectors, like PCI, the method should
|
||||
bring the device and its driver into a state in which the driver can
|
||||
recognize if the device is the source of incoming interrupts, if any,
|
||||
and handle them correctly.
|
||||
|
||||
For example, the PCI bus type's ->pm.resume_noirq() puts the device into
|
||||
the full-power state (D0 in the PCI terminology) and restores the
|
||||
standard configuration registers of the device. Then it calls the
|
||||
device driver's ->pm.resume_noirq() method to perform device-specific
|
||||
actions.
|
||||
|
||||
2. The resume methods should bring the the device back to its operating
|
||||
state, so that it can perform normal I/O. This generally involves
|
||||
undoing the actions of the suspend phase.
|
||||
|
||||
3. The complete phase uses only a bus callback. The method should undo the
|
||||
actions of the prepare phase. Note, however, that new children may be
|
||||
registered below the device as soon as the resume callbacks occur; it's
|
||||
not necessary to wait until the complete phase.
|
||||
|
||||
At the end of these phases, drivers should be as functional as they were before
|
||||
suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are
|
||||
gated on. Even if the device was in a low-power state before the system sleep
|
||||
because of runtime power management, afterwards it should be back in its
|
||||
full-power state. There are multiple reasons why it's best to do this; they are
|
||||
discussed in more detail in Documentation/power/runtime_pm.txt.
|
||||
|
||||
However, the details here may again be platform-specific. For example,
|
||||
some systems support multiple "run" states, and the mode in effect at
|
||||
the end of resume might not be the one which preceded suspension.
|
||||
That means availability of certain clocks or power supplies changed,
|
||||
which could easily affect how a driver works.
|
||||
|
||||
Drivers need to be able to handle hardware which has been reset since the
|
||||
suspend methods were called, for example by complete reinitialization.
|
||||
This may be the hardest part, and the one most protected by NDA'd documents
|
||||
and chip errata. It's simplest if the hardware state hasn't changed since
|
||||
the suspend was carried out, but that can't be guaranteed (in fact, it ususally
|
||||
is not the case).
|
||||
|
||||
Drivers must also be prepared to notice that the device has been removed
|
||||
while the system was powered down, whenever that's physically possible.
|
||||
PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses
|
||||
where common Linux platforms will see such removal. Details of how drivers
|
||||
will notice and handle such removals are currently bus-specific, and often
|
||||
involve a separate thread.
|
||||
|
||||
These callbacks may return an error value, but the PM core will ignore such
|
||||
errors since there's nothing it can do about them other than printing them in
|
||||
the system log.
|
||||
|
||||
|
||||
Entering Hibernation
|
||||
--------------------
|
||||
Hibernating the system is more complicated than putting it into the standby or
|
||||
memory sleep state, because it involves creating and saving a system image.
|
||||
Therefore there are more phases for hibernation, with a different set of
|
||||
callbacks. These phases always run after tasks have been frozen and memory has
|
||||
been freed.
|
||||
|
||||
The general procedure for hibernation is to quiesce all devices (freeze), create
|
||||
an image of the system memory while everything is stable, reactivate all
|
||||
devices (thaw), write the image to permanent storage, and finally shut down the
|
||||
system (poweroff). The phases used to accomplish this are:
|
||||
|
||||
prepare, freeze, freeze_noirq, thaw_noirq, thaw, complete,
|
||||
prepare, poweroff, poweroff_noirq
|
||||
|
||||
1. The prepare phase is discussed in the "Entering System Suspend" section
|
||||
above.
|
||||
|
||||
2. The freeze methods should quiesce the device so that it doesn't generate
|
||||
IRQs or DMA, and they may need to save the values of device registers.
|
||||
However the device does not have to be put in a low-power state, and to
|
||||
save time it's best not to do so. Also, the device should not be
|
||||
prepared to generate wakeup events.
|
||||
|
||||
3. The freeze_noirq phase is analogous to the suspend_noirq phase discussed
|
||||
above, except again that the device should not be put in a low-power
|
||||
state and should not be allowed to generate wakeup events.
|
||||
|
||||
At this point the system image is created. All devices should be inactive and
|
||||
the contents of memory should remain undisturbed while this happens, so that the
|
||||
image forms an atomic snapshot of the system state.
|
||||
|
||||
4. The thaw_noirq phase is analogous to the resume_noirq phase discussed
|
||||
above. The main difference is that its methods can assume the device is
|
||||
in the same state as at the end of the freeze_noirq phase.
|
||||
|
||||
5. The thaw phase is analogous to the resume phase discussed above. Its
|
||||
methods should bring the device back to an operating state, so that it
|
||||
can be used for saving the image if necessary.
|
||||
|
||||
6. The complete phase is discussed in the "Leaving System Suspend" section
|
||||
above.
|
||||
|
||||
At this point the system image is saved, and the devices then need to be
|
||||
prepared for the upcoming system shutdown. This is much like suspending them
|
||||
before putting the system into the standby or memory sleep state, and the phases
|
||||
are similar.
|
||||
|
||||
7. The prepare phase is discussed above.
|
||||
|
||||
8. The poweroff phase is analogous to the suspend phase.
|
||||
|
||||
9. The poweroff_noirq phase is analogous to the suspend_noirq phase.
|
||||
|
||||
The poweroff and poweroff_noirq callbacks should do essentially the same things
|
||||
as the suspend and suspend_noirq callbacks. The only notable difference is that
|
||||
they need not store the device register values, because the registers should
|
||||
already have been stored during the freeze or freeze_noirq phases.
|
||||
|
||||
|
||||
Leaving Hibernation
|
||||
-------------------
|
||||
Resuming from hibernation is, again, more complicated than resuming from a sleep
|
||||
state in which the contents of main memory are preserved, because it requires
|
||||
a system image to be loaded into memory and the pre-hibernation memory contents
|
||||
to be restored before control can be passed back to the image kernel.
|
||||
|
||||
Although in principle, the image might be loaded into memory and the
|
||||
pre-hibernation memory contents restored by the boot loader, in practice this
|
||||
can't be done because boot loaders aren't smart enough and there is no
|
||||
established protocol for passing the necessary information. So instead, the
|
||||
boot loader loads a fresh instance of the kernel, called the boot kernel, into
|
||||
memory and passes control to it in the usual way. Then the boot kernel reads
|
||||
the system image, restores the pre-hibernation memory contents, and passes
|
||||
control to the image kernel. Thus two different kernels are involved in
|
||||
resuming from hibernation. In fact, the boot kernel may be completely different
|
||||
from the image kernel: a different configuration and even a different version.
|
||||
This has important consequences for device drivers and their subsystems.
|
||||
|
||||
To be able to load the system image into memory, the boot kernel needs to
|
||||
include at least a subset of device drivers allowing it to access the storage
|
||||
medium containing the image, although it doesn't need to include all of the
|
||||
drivers present in the image kernel. After the image has been loaded, the
|
||||
devices managed by the boot kernel need to be prepared for passing control back
|
||||
to the image kernel. This is very similar to the initial steps involved in
|
||||
creating a system image, and it is accomplished in the same way, using prepare,
|
||||
freeze, and freeze_noirq phases. However the devices affected by these phases
|
||||
are only those having drivers in the boot kernel; other devices will still be in
|
||||
whatever state the boot loader left them.
|
||||
|
||||
Should the restoration of the pre-hibernation memory contents fail, the boot
|
||||
kernel would go through the "thawing" procedure described above, using the
|
||||
thaw_noirq, thaw, and complete phases, and then continue running normally. This
|
||||
happens only rarely. Most often the pre-hibernation memory contents are
|
||||
restored successfully and control is passed to the image kernel, which then
|
||||
becomes responsible for bringing the system back to the working state.
|
||||
|
||||
To achieve this, the image kernel must restore the devices' pre-hibernation
|
||||
functionality. The operation is much like waking up from the memory sleep
|
||||
state, although it involves different phases:
|
||||
|
||||
restore_noirq, restore, complete
|
||||
|
||||
1. The restore_noirq phase is analogous to the resume_noirq phase.
|
||||
|
||||
2. The restore phase is analogous to the resume phase.
|
||||
|
||||
3. The complete phase is discussed above.
|
||||
|
||||
The main difference from resume[_noirq] is that restore[_noirq] must assume the
|
||||
device has been accessed and reconfigured by the boot loader or the boot kernel.
|
||||
Consequently the state of the device may be different from the state remembered
|
||||
from the freeze and freeze_noirq phases. The device may even need to be reset
|
||||
and completely re-initialized. In many cases this difference doesn't matter, so
|
||||
the resume[_noirq] and restore[_norq] method pointers can be set to the same
|
||||
routines. Nevertheless, different callback pointers are used in case there is a
|
||||
situation where it actually matters.
|
||||
|
||||
|
||||
System Devices
|
||||
--------------
|
||||
System devices (sysdevs) follow a slightly different API, which can be found in
|
||||
|
||||
include/linux/sysdev.h
|
||||
drivers/base/sys.c
|
||||
|
||||
System devices will be suspended with interrupts disabled, and after all other
|
||||
devices have been suspended. On resume, they will be resumed before any other
|
||||
devices, and also with interrupts disabled. These things occur in special
|
||||
"sysdev_driver" phases, which affect only system devices.
|
||||
|
||||
Thus, after the suspend_noirq (or freeze_noirq or poweroff_noirq) phase, when
|
||||
the non-boot CPUs are all offline and IRQs are disabled on the remaining online
|
||||
CPU, then a sysdev_driver.suspend phase is carried out, and the system enters a
|
||||
sleep state (or a system image is created). During resume (or after the image
|
||||
has been created or loaded) a sysdev_driver.resume phase is carried out, IRQs
|
||||
are enabled on the only online CPU, the non-boot CPUs are enabled, and the
|
||||
resume_noirq (or thaw_noirq or restore_noirq) phase begins.
|
||||
|
||||
Code to actually enter and exit the system-wide low power state sometimes
|
||||
involves hardware details that are only known to the boot firmware, and
|
||||
may leave a CPU running software (from SRAM or flash memory) that monitors
|
||||
the system and manages its wakeup sequence.
|
||||
|
||||
|
||||
Device Low Power (suspend) States
|
||||
---------------------------------
|
||||
Device low-power states aren't very standard. One device might only handle
|
||||
Device low-power states aren't standard. One device might only handle
|
||||
"on" and "off, while another might support a dozen different versions of
|
||||
"on" (how many engines are active?), plus a state that gets back to "on"
|
||||
faster than from a full "off".
|
||||
@ -265,7 +546,7 @@ PCI device may not perform DMA or issue IRQs, and any wakeup events it
|
||||
issues would be issued through the PME# bus signal. Plus, there are
|
||||
several PCI-standard device states, some of which are optional.
|
||||
|
||||
In contrast, integrated system-on-chip processors often use irqs as the
|
||||
In contrast, integrated system-on-chip processors often use IRQs as the
|
||||
wakeup event sources (so drivers would call enable_irq_wake) and might
|
||||
be able to treat DMA completion as a wakeup event (sometimes DMA can stay
|
||||
active too, it'd only be the CPU and some peripherals that sleep).
|
||||
@ -284,120 +565,17 @@ ways; the aforementioned LCD might be active in one product's "standby",
|
||||
but a different product using the same SOC might work differently.
|
||||
|
||||
|
||||
Meaning of pm_message_t.event
|
||||
-----------------------------
|
||||
Parameters to suspend calls include the device affected and a message of
|
||||
type pm_message_t, which has one field: the event. If driver does not
|
||||
recognize the event code, suspend calls may abort the request and return
|
||||
a negative errno. However, most drivers will be fine if they implement
|
||||
PM_EVENT_SUSPEND semantics for all messages.
|
||||
Power Management Notifiers
|
||||
--------------------------
|
||||
There are some operations that cannot be carried out by the power management
|
||||
callbacks discussed above, because the callbacks occur too late or too early.
|
||||
To handle these cases, subsystems and device drivers may register power
|
||||
management notifiers that are called before tasks are frozen and after they have
|
||||
been thawed. Generally speaking, the PM notifiers are suitable for performing
|
||||
actions that either require user space to be available, or at least won't
|
||||
interfere with user space.
|
||||
|
||||
The event codes are used to refine the goal of suspending the device, and
|
||||
mostly matter when creating or resuming system memory image snapshots, as
|
||||
used with suspend-to-disk:
|
||||
|
||||
PM_EVENT_SUSPEND -- quiesce the driver and put hardware into a low-power
|
||||
state. When used with system sleep states like "suspend-to-RAM" or
|
||||
"standby", the upcoming resume() call will often be able to rely on
|
||||
state kept in hardware, or issue system wakeup events.
|
||||
|
||||
PM_EVENT_HIBERNATE -- Put hardware into a low-power state and enable wakeup
|
||||
events as appropriate. It is only used with hibernation
|
||||
(suspend-to-disk) and few devices are able to wake up the system from
|
||||
this state; most are completely powered off.
|
||||
|
||||
PM_EVENT_FREEZE -- quiesce the driver, but don't necessarily change into
|
||||
any low power mode. A system snapshot is about to be taken, often
|
||||
followed by a call to the driver's resume() method. Neither wakeup
|
||||
events nor DMA are allowed.
|
||||
|
||||
PM_EVENT_PRETHAW -- quiesce the driver, knowing that the upcoming resume()
|
||||
will restore a suspend-to-disk snapshot from a different kernel image.
|
||||
Drivers that are smart enough to look at their hardware state during
|
||||
resume() processing need that state to be correct ... a PRETHAW could
|
||||
be used to invalidate that state (by resetting the device), like a
|
||||
shutdown() invocation would before a kexec() or system halt. Other
|
||||
drivers might handle this the same way as PM_EVENT_FREEZE. Neither
|
||||
wakeup events nor DMA are allowed.
|
||||
|
||||
To enter "standby" (ACPI S1) or "Suspend to RAM" (STR, ACPI S3) states, or
|
||||
the similarly named APM states, only PM_EVENT_SUSPEND is used; the other event
|
||||
codes are used for hibernation ("Suspend to Disk", STD, ACPI S4).
|
||||
|
||||
There's also PM_EVENT_ON, a value which never appears as a suspend event
|
||||
but is sometimes used to record the "not suspended" device state.
|
||||
|
||||
|
||||
Resuming Devices
|
||||
----------------
|
||||
Resuming is done in multiple phases, much like suspending, with all
|
||||
devices processing each phase's calls before the next phase begins.
|
||||
|
||||
The phases are seen by driver notifications issued in this order:
|
||||
|
||||
1 bus.resume(dev) reverses the effects of bus.suspend(). This may
|
||||
be morphed into a device driver call with bus-specific parameters;
|
||||
implementations may sleep.
|
||||
|
||||
2 class.resume(dev) is called for devices associated with a class
|
||||
that has such a method. Implementations may sleep.
|
||||
|
||||
This reverses the effects of class.suspend(), and would usually
|
||||
reactivate the device's I/O queue.
|
||||
|
||||
At the end of those phases, drivers should normally be as functional as
|
||||
they were before suspending: I/O can be performed using DMA and IRQs, and
|
||||
the relevant clocks are gated on. The device need not be "fully on"; it
|
||||
might be in a runtime lowpower/suspend state that acts as if it were.
|
||||
|
||||
However, the details here may again be platform-specific. For example,
|
||||
some systems support multiple "run" states, and the mode in effect at
|
||||
the end of resume() might not be the one which preceded suspension.
|
||||
That means availability of certain clocks or power supplies changed,
|
||||
which could easily affect how a driver works.
|
||||
|
||||
|
||||
Drivers need to be able to handle hardware which has been reset since the
|
||||
suspend methods were called, for example by complete reinitialization.
|
||||
This may be the hardest part, and the one most protected by NDA'd documents
|
||||
and chip errata. It's simplest if the hardware state hasn't changed since
|
||||
the suspend() was called, but that can't always be guaranteed.
|
||||
|
||||
Drivers must also be prepared to notice that the device has been removed
|
||||
while the system was powered off, whenever that's physically possible.
|
||||
PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses
|
||||
where common Linux platforms will see such removal. Details of how drivers
|
||||
will notice and handle such removals are currently bus-specific, and often
|
||||
involve a separate thread.
|
||||
|
||||
|
||||
Note that the bus-specific runtime PM wakeup mechanism can exist, and might
|
||||
be defined to share some of the same driver code as for system wakeup. For
|
||||
example, a bus-specific device driver's resume() method might be used there,
|
||||
so it wouldn't only be called from bus.resume() during system-wide wakeup.
|
||||
See bus-specific information about how runtime wakeup events are handled.
|
||||
|
||||
|
||||
System Devices
|
||||
--------------
|
||||
System devices follow a slightly different API, which can be found in
|
||||
|
||||
include/linux/sysdev.h
|
||||
drivers/base/sys.c
|
||||
|
||||
System devices will only be suspended with interrupts disabled, and after
|
||||
all other devices have been suspended. On resume, they will be resumed
|
||||
before any other devices, and also with interrupts disabled.
|
||||
|
||||
That is, IRQs are disabled, the suspend_late() phase begins, then the
|
||||
sysdev_driver.suspend() phase, and the system enters a sleep state. Then
|
||||
the sysdev_driver.resume() phase begins, followed by the resume_early()
|
||||
phase, after which IRQs are enabled.
|
||||
|
||||
Code to actually enter and exit the system-wide low power state sometimes
|
||||
involves hardware details that are only known to the boot firmware, and
|
||||
may leave a CPU running software (from SRAM or flash memory) that monitors
|
||||
the system and manages its wakeup sequence.
|
||||
For details refer to Documentation/power/notifiers.txt.
|
||||
|
||||
|
||||
Runtime Power Management
|
||||
@ -407,82 +585,23 @@ running. This feature is useful for devices that are not being used, and
|
||||
can offer significant power savings on a running system. These devices
|
||||
often support a range of runtime power states, which might use names such
|
||||
as "off", "sleep", "idle", "active", and so on. Those states will in some
|
||||
cases (like PCI) be partially constrained by a bus the device uses, and will
|
||||
cases (like PCI) be partially constrained by the bus the device uses, and will
|
||||
usually include hardware states that are also used in system sleep states.
|
||||
|
||||
However, note that if a driver puts a device into a runtime low power state
|
||||
and the system then goes into a system-wide sleep state, it normally ought
|
||||
to resume into that runtime low power state rather than "full on". Such
|
||||
distinctions would be part of the driver-internal state machine for that
|
||||
hardware; the whole point of runtime power management is to be sure that
|
||||
drivers are decoupled in that way from the state machine governing phases
|
||||
of the system-wide power/sleep state transitions.
|
||||
A system-wide power transition can be started while some devices are in low
|
||||
power states due to runtime power management. The system sleep PM callbacks
|
||||
should recognize such situations and react to them appropriately, but the
|
||||
necessary actions are subsystem-specific.
|
||||
|
||||
In some cases the decision may be made at the subsystem level while in other
|
||||
cases the device driver may be left to decide. In some cases it may be
|
||||
desirable to leave a suspended device in that state during a system-wide power
|
||||
transition, but in other cases the device must be put back into the full-power
|
||||
state temporarily, for example so that its system wakeup capability can be
|
||||
disabled. This all depends on the hardware and the design of the subsystem and
|
||||
device driver in question.
|
||||
|
||||
Power Saving Techniques
|
||||
-----------------------
|
||||
Normally runtime power management is handled by the drivers without specific
|
||||
userspace or kernel intervention, by device-aware use of techniques like:
|
||||
|
||||
Using information provided by other system layers
|
||||
- stay deeply "off" except between open() and close()
|
||||
- if transceiver/PHY indicates "nobody connected", stay "off"
|
||||
- application protocols may include power commands or hints
|
||||
|
||||
Using fewer CPU cycles
|
||||
- using DMA instead of PIO
|
||||
- removing timers, or making them lower frequency
|
||||
- shortening "hot" code paths
|
||||
- eliminating cache misses
|
||||
- (sometimes) offloading work to device firmware
|
||||
|
||||
Reducing other resource costs
|
||||
- gating off unused clocks in software (or hardware)
|
||||
- switching off unused power supplies
|
||||
- eliminating (or delaying/merging) IRQs
|
||||
- tuning DMA to use word and/or burst modes
|
||||
|
||||
Using device-specific low power states
|
||||
- using lower voltages
|
||||
- avoiding needless DMA transfers
|
||||
|
||||
Read your hardware documentation carefully to see the opportunities that
|
||||
may be available. If you can, measure the actual power usage and check
|
||||
it against the budget established for your project.
|
||||
|
||||
|
||||
Examples: USB hosts, system timer, system CPU
|
||||
----------------------------------------------
|
||||
USB host controllers make interesting, if complex, examples. In many cases
|
||||
these have no work to do: no USB devices are connected, or all of them are
|
||||
in the USB "suspend" state. Linux host controller drivers can then disable
|
||||
periodic DMA transfers that would otherwise be a constant power drain on the
|
||||
memory subsystem, and enter a suspend state. In power-aware controllers,
|
||||
entering that suspend state may disable the clock used with USB signaling,
|
||||
saving a certain amount of power.
|
||||
|
||||
The controller will be woken from that state (with an IRQ) by changes to the
|
||||
signal state on the data lines of a given port, for example by an existing
|
||||
peripheral requesting "remote wakeup" or by plugging a new peripheral. The
|
||||
same wakeup mechanism usually works from "standby" sleep states, and on some
|
||||
systems also from "suspend to RAM" (or even "suspend to disk") states.
|
||||
(Except that ACPI may be involved instead of normal IRQs, on some hardware.)
|
||||
|
||||
System devices like timers and CPUs may have special roles in the platform
|
||||
power management scheme. For example, system timers using a "dynamic tick"
|
||||
approach don't just save CPU cycles (by eliminating needless timer IRQs),
|
||||
but they may also open the door to using lower power CPU "idle" states that
|
||||
cost more than a jiffie to enter and exit. On x86 systems these are states
|
||||
like "C3"; note that periodic DMA transfers from a USB host controller will
|
||||
also prevent entry to a C3 state, much like a periodic timer IRQ.
|
||||
|
||||
That kind of runtime mechanism interaction is common. "System On Chip" (SOC)
|
||||
processors often have low power idle modes that can't be entered unless
|
||||
certain medium-speed clocks (often 12 or 48 MHz) are gated off. When the
|
||||
drivers gate those clocks effectively, then the system idle task may be able
|
||||
to use the lower power idle modes and thereby increase battery life.
|
||||
|
||||
If the CPU can have a "cpufreq" driver, there also may be opportunities
|
||||
to shift to lower voltage settings and reduce the power cost of executing
|
||||
a given number of instructions. (Without voltage adjustment, it's rare
|
||||
for cpufreq to save much power; the cost-per-instruction must go down.)
|
||||
During system-wide resume from a sleep state it's best to put devices into the
|
||||
full-power state, as explained in Documentation/power/runtime_pm.txt. Refer to
|
||||
that document for more information regarding this particular issue as well as
|
||||
for information on the device runtime power management framework in general.
|
||||
|
@ -18,44 +18,46 @@ and pm_qos_params.h. This is done because having the available parameters
|
||||
being runtime configurable or changeable from a driver was seen as too easy to
|
||||
abuse.
|
||||
|
||||
For each parameter a list of performance requirements is maintained along with
|
||||
For each parameter a list of performance requests is maintained along with
|
||||
an aggregated target value. The aggregated target value is updated with
|
||||
changes to the requirement list or elements of the list. Typically the
|
||||
aggregated target value is simply the max or min of the requirement values held
|
||||
changes to the request list or elements of the list. Typically the
|
||||
aggregated target value is simply the max or min of the request values held
|
||||
in the parameter list elements.
|
||||
|
||||
From kernel mode the use of this interface is simple:
|
||||
pm_qos_add_requirement(param_id, name, target_value):
|
||||
Will insert a named element in the list for that identified PM_QOS parameter
|
||||
with the target value. Upon change to this list the new target is recomputed
|
||||
and any registered notifiers are called only if the target value is now
|
||||
different.
|
||||
|
||||
pm_qos_update_requirement(param_id, name, new_target_value):
|
||||
Will search the list identified by the param_id for the named list element and
|
||||
then update its target value, calling the notification tree if the aggregated
|
||||
target is changed. with that name is already registered.
|
||||
handle = pm_qos_add_request(param_class, target_value):
|
||||
Will insert an element into the list for that identified PM_QOS class with the
|
||||
target value. Upon change to this list the new target is recomputed and any
|
||||
registered notifiers are called only if the target value is now different.
|
||||
Clients of pm_qos need to save the returned handle.
|
||||
|
||||
pm_qos_remove_requirement(param_id, name):
|
||||
Will search the identified list for the named element and remove it, after
|
||||
removal it will update the aggregate target and call the notification tree if
|
||||
the target was changed as a result of removing the named requirement.
|
||||
void pm_qos_update_request(handle, new_target_value):
|
||||
Will update the list element pointed to by the handle with the new target value
|
||||
and recompute the new aggregated target, calling the notification tree if the
|
||||
target is changed.
|
||||
|
||||
void pm_qos_remove_request(handle):
|
||||
Will remove the element. After removal it will update the aggregate target and
|
||||
call the notification tree if the target was changed as a result of removing
|
||||
the request.
|
||||
|
||||
|
||||
From user mode:
|
||||
Only processes can register a pm_qos requirement. To provide for automatic
|
||||
cleanup for process the interface requires the process to register its
|
||||
parameter requirements in the following way:
|
||||
Only processes can register a pm_qos request. To provide for automatic
|
||||
cleanup of a process, the interface requires the process to register its
|
||||
parameter requests in the following way:
|
||||
|
||||
To register the default pm_qos target for the specific parameter, the process
|
||||
must open one of /dev/[cpu_dma_latency, network_latency, network_throughput]
|
||||
|
||||
As long as the device node is held open that process has a registered
|
||||
requirement on the parameter. The name of the requirement is "process_<PID>"
|
||||
derived from the current->pid from within the open system call.
|
||||
request on the parameter.
|
||||
|
||||
To change the requested target value the process needs to write a s32 value to
|
||||
the open device node. This translates to a pm_qos_update_requirement call.
|
||||
To change the requested target value the process needs to write an s32 value to
|
||||
the open device node. Alternatively the user mode program could write a hex
|
||||
string for the value using 10 char long format e.g. "0x12345678". This
|
||||
translates to a pm_qos_update_request call.
|
||||
|
||||
To remove the user mode request for a target value simply close the device
|
||||
node.
|
||||
|
@ -24,6 +24,10 @@ assumed to be in the resume mode. The device cannot be open for simultaneous
|
||||
reading and writing. It is also impossible to have the device open more than
|
||||
once at a time.
|
||||
|
||||
Even opening the device has side effects. Data structures are
|
||||
allocated, and PM_HIBERNATION_PREPARE / PM_RESTORE_PREPARE chains are
|
||||
called.
|
||||
|
||||
The ioctl() commands recognized by the device are:
|
||||
|
||||
SNAPSHOT_FREEZE - freeze user space processes (the current process is
|
||||
|
@ -698,7 +698,7 @@ static int acpi_processor_power_seq_show(struct seq_file *seq, void *offset)
|
||||
"max_cstate: C%d\n"
|
||||
"maximum allowed latency: %d usec\n",
|
||||
pr->power.state ? pr->power.state - pr->power.states : 0,
|
||||
max_cstate, pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY));
|
||||
max_cstate, pm_qos_request(PM_QOS_CPU_DMA_LATENCY));
|
||||
|
||||
seq_puts(seq, "states:\n");
|
||||
|
||||
|
@ -967,17 +967,17 @@ static int platform_pm_restore_noirq(struct device *dev)
|
||||
|
||||
int __weak platform_pm_runtime_suspend(struct device *dev)
|
||||
{
|
||||
return -ENOSYS;
|
||||
return pm_generic_runtime_suspend(dev);
|
||||
};
|
||||
|
||||
int __weak platform_pm_runtime_resume(struct device *dev)
|
||||
{
|
||||
return -ENOSYS;
|
||||
return pm_generic_runtime_resume(dev);
|
||||
};
|
||||
|
||||
int __weak platform_pm_runtime_idle(struct device *dev)
|
||||
{
|
||||
return -ENOSYS;
|
||||
return pm_generic_runtime_idle(dev);
|
||||
};
|
||||
|
||||
#else /* !CONFIG_PM_RUNTIME */
|
||||
|
@ -229,14 +229,16 @@ int __pm_runtime_suspend(struct device *dev, bool from_wq)
|
||||
|
||||
if (retval) {
|
||||
dev->power.runtime_status = RPM_ACTIVE;
|
||||
pm_runtime_cancel_pending(dev);
|
||||
|
||||
if (retval == -EAGAIN || retval == -EBUSY) {
|
||||
notify = true;
|
||||
if (dev->power.timer_expires == 0)
|
||||
notify = true;
|
||||
dev->power.runtime_error = 0;
|
||||
} else {
|
||||
pm_runtime_cancel_pending(dev);
|
||||
}
|
||||
} else {
|
||||
dev->power.runtime_status = RPM_SUSPENDED;
|
||||
pm_runtime_deactivate_timer(dev);
|
||||
|
||||
if (dev->parent) {
|
||||
parent = dev->parent;
|
||||
@ -659,8 +661,6 @@ int pm_schedule_suspend(struct device *dev, unsigned int delay)
|
||||
|
||||
if (dev->power.runtime_status == RPM_SUSPENDED)
|
||||
retval = 1;
|
||||
else if (dev->power.runtime_status == RPM_SUSPENDING)
|
||||
retval = -EINPROGRESS;
|
||||
else if (atomic_read(&dev->power.usage_count) > 0
|
||||
|| dev->power.disable_depth > 0)
|
||||
retval = -EAGAIN;
|
||||
|
@ -5,6 +5,7 @@
|
||||
#include <linux/device.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/pm_runtime.h>
|
||||
#include <asm/atomic.h>
|
||||
#include "power.h"
|
||||
|
||||
/*
|
||||
@ -143,7 +144,59 @@ wake_store(struct device * dev, struct device_attribute *attr,
|
||||
|
||||
static DEVICE_ATTR(wakeup, 0644, wake_show, wake_store);
|
||||
|
||||
#ifdef CONFIG_PM_SLEEP_ADVANCED_DEBUG
|
||||
#ifdef CONFIG_PM_ADVANCED_DEBUG
|
||||
#ifdef CONFIG_PM_RUNTIME
|
||||
|
||||
static ssize_t rtpm_usagecount_show(struct device *dev,
|
||||
struct device_attribute *attr, char *buf)
|
||||
{
|
||||
return sprintf(buf, "%d\n", atomic_read(&dev->power.usage_count));
|
||||
}
|
||||
|
||||
static ssize_t rtpm_children_show(struct device *dev,
|
||||
struct device_attribute *attr, char *buf)
|
||||
{
|
||||
return sprintf(buf, "%d\n", dev->power.ignore_children ?
|
||||
0 : atomic_read(&dev->power.child_count));
|
||||
}
|
||||
|
||||
static ssize_t rtpm_enabled_show(struct device *dev,
|
||||
struct device_attribute *attr, char *buf)
|
||||
{
|
||||
if ((dev->power.disable_depth) && (dev->power.runtime_auto == false))
|
||||
return sprintf(buf, "disabled & forbidden\n");
|
||||
else if (dev->power.disable_depth)
|
||||
return sprintf(buf, "disabled\n");
|
||||
else if (dev->power.runtime_auto == false)
|
||||
return sprintf(buf, "forbidden\n");
|
||||
return sprintf(buf, "enabled\n");
|
||||
}
|
||||
|
||||
static ssize_t rtpm_status_show(struct device *dev,
|
||||
struct device_attribute *attr, char *buf)
|
||||
{
|
||||
if (dev->power.runtime_error)
|
||||
return sprintf(buf, "error\n");
|
||||
switch (dev->power.runtime_status) {
|
||||
case RPM_SUSPENDED:
|
||||
return sprintf(buf, "suspended\n");
|
||||
case RPM_SUSPENDING:
|
||||
return sprintf(buf, "suspending\n");
|
||||
case RPM_RESUMING:
|
||||
return sprintf(buf, "resuming\n");
|
||||
case RPM_ACTIVE:
|
||||
return sprintf(buf, "active\n");
|
||||
}
|
||||
return -EIO;
|
||||
}
|
||||
|
||||
static DEVICE_ATTR(runtime_usage, 0444, rtpm_usagecount_show, NULL);
|
||||
static DEVICE_ATTR(runtime_active_kids, 0444, rtpm_children_show, NULL);
|
||||
static DEVICE_ATTR(runtime_status, 0444, rtpm_status_show, NULL);
|
||||
static DEVICE_ATTR(runtime_enabled, 0444, rtpm_enabled_show, NULL);
|
||||
|
||||
#endif
|
||||
|
||||
static ssize_t async_show(struct device *dev, struct device_attribute *attr,
|
||||
char *buf)
|
||||
{
|
||||
@ -170,15 +223,21 @@ static ssize_t async_store(struct device *dev, struct device_attribute *attr,
|
||||
}
|
||||
|
||||
static DEVICE_ATTR(async, 0644, async_show, async_store);
|
||||
#endif /* CONFIG_PM_SLEEP_ADVANCED_DEBUG */
|
||||
#endif /* CONFIG_PM_ADVANCED_DEBUG */
|
||||
|
||||
static struct attribute * power_attrs[] = {
|
||||
#ifdef CONFIG_PM_RUNTIME
|
||||
&dev_attr_control.attr,
|
||||
#endif
|
||||
&dev_attr_wakeup.attr,
|
||||
#ifdef CONFIG_PM_SLEEP_ADVANCED_DEBUG
|
||||
#ifdef CONFIG_PM_ADVANCED_DEBUG
|
||||
&dev_attr_async.attr,
|
||||
#ifdef CONFIG_PM_RUNTIME
|
||||
&dev_attr_runtime_usage.attr,
|
||||
&dev_attr_runtime_active_kids.attr,
|
||||
&dev_attr_runtime_status.attr,
|
||||
&dev_attr_runtime_enabled.attr,
|
||||
#endif
|
||||
#endif
|
||||
NULL,
|
||||
};
|
||||
|
@ -67,7 +67,7 @@ static int ladder_select_state(struct cpuidle_device *dev)
|
||||
struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
|
||||
struct ladder_device_state *last_state;
|
||||
int last_residency, last_idx = ldev->last_state_idx;
|
||||
int latency_req = pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY);
|
||||
int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
|
||||
|
||||
/* Special case when user has set very strict latency requirement */
|
||||
if (unlikely(latency_req == 0)) {
|
||||
|
@ -182,7 +182,7 @@ static u64 div_round64(u64 dividend, u32 divisor)
|
||||
static int menu_select(struct cpuidle_device *dev)
|
||||
{
|
||||
struct menu_device *data = &__get_cpu_var(menu_devices);
|
||||
int latency_req = pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY);
|
||||
int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
|
||||
int i;
|
||||
int multiplier;
|
||||
|
||||
|
@ -159,82 +159,8 @@ static void i2c_device_shutdown(struct device *dev)
|
||||
driver->shutdown(client);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SUSPEND
|
||||
static int i2c_device_pm_suspend(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm;
|
||||
|
||||
if (!dev->driver)
|
||||
return 0;
|
||||
pm = dev->driver->pm;
|
||||
if (!pm || !pm->suspend)
|
||||
return 0;
|
||||
return pm->suspend(dev);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_resume(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm;
|
||||
|
||||
if (!dev->driver)
|
||||
return 0;
|
||||
pm = dev->driver->pm;
|
||||
if (!pm || !pm->resume)
|
||||
return 0;
|
||||
return pm->resume(dev);
|
||||
}
|
||||
#else
|
||||
#define i2c_device_pm_suspend NULL
|
||||
#define i2c_device_pm_resume NULL
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_PM_RUNTIME
|
||||
static int i2c_device_runtime_suspend(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm;
|
||||
|
||||
if (!dev->driver)
|
||||
return 0;
|
||||
pm = dev->driver->pm;
|
||||
if (!pm || !pm->runtime_suspend)
|
||||
return 0;
|
||||
return pm->runtime_suspend(dev);
|
||||
}
|
||||
|
||||
static int i2c_device_runtime_resume(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm;
|
||||
|
||||
if (!dev->driver)
|
||||
return 0;
|
||||
pm = dev->driver->pm;
|
||||
if (!pm || !pm->runtime_resume)
|
||||
return 0;
|
||||
return pm->runtime_resume(dev);
|
||||
}
|
||||
|
||||
static int i2c_device_runtime_idle(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = NULL;
|
||||
int ret;
|
||||
|
||||
if (dev->driver)
|
||||
pm = dev->driver->pm;
|
||||
if (pm && pm->runtime_idle) {
|
||||
ret = pm->runtime_idle(dev);
|
||||
if (ret)
|
||||
return ret;
|
||||
}
|
||||
|
||||
return pm_runtime_suspend(dev);
|
||||
}
|
||||
#else
|
||||
#define i2c_device_runtime_suspend NULL
|
||||
#define i2c_device_runtime_resume NULL
|
||||
#define i2c_device_runtime_idle NULL
|
||||
#endif
|
||||
|
||||
static int i2c_device_suspend(struct device *dev, pm_message_t mesg)
|
||||
#ifdef CONFIG_PM_SLEEP
|
||||
static int i2c_legacy_suspend(struct device *dev, pm_message_t mesg)
|
||||
{
|
||||
struct i2c_client *client = i2c_verify_client(dev);
|
||||
struct i2c_driver *driver;
|
||||
@ -247,7 +173,7 @@ static int i2c_device_suspend(struct device *dev, pm_message_t mesg)
|
||||
return driver->suspend(client, mesg);
|
||||
}
|
||||
|
||||
static int i2c_device_resume(struct device *dev)
|
||||
static int i2c_legacy_resume(struct device *dev)
|
||||
{
|
||||
struct i2c_client *client = i2c_verify_client(dev);
|
||||
struct i2c_driver *driver;
|
||||
@ -260,6 +186,104 @@ static int i2c_device_resume(struct device *dev)
|
||||
return driver->resume(client);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_suspend(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
|
||||
if (pm_runtime_suspended(dev))
|
||||
return 0;
|
||||
|
||||
if (pm)
|
||||
return pm->suspend ? pm->suspend(dev) : 0;
|
||||
|
||||
return i2c_legacy_suspend(dev, PMSG_SUSPEND);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_resume(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
int ret;
|
||||
|
||||
if (pm)
|
||||
ret = pm->resume ? pm->resume(dev) : 0;
|
||||
else
|
||||
ret = i2c_legacy_resume(dev);
|
||||
|
||||
if (!ret) {
|
||||
pm_runtime_disable(dev);
|
||||
pm_runtime_set_active(dev);
|
||||
pm_runtime_enable(dev);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int i2c_device_pm_freeze(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
|
||||
if (pm_runtime_suspended(dev))
|
||||
return 0;
|
||||
|
||||
if (pm)
|
||||
return pm->freeze ? pm->freeze(dev) : 0;
|
||||
|
||||
return i2c_legacy_suspend(dev, PMSG_FREEZE);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_thaw(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
|
||||
if (pm_runtime_suspended(dev))
|
||||
return 0;
|
||||
|
||||
if (pm)
|
||||
return pm->thaw ? pm->thaw(dev) : 0;
|
||||
|
||||
return i2c_legacy_resume(dev);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_poweroff(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
|
||||
if (pm_runtime_suspended(dev))
|
||||
return 0;
|
||||
|
||||
if (pm)
|
||||
return pm->poweroff ? pm->poweroff(dev) : 0;
|
||||
|
||||
return i2c_legacy_suspend(dev, PMSG_HIBERNATE);
|
||||
}
|
||||
|
||||
static int i2c_device_pm_restore(struct device *dev)
|
||||
{
|
||||
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
|
||||
int ret;
|
||||
|
||||
if (pm)
|
||||
ret = pm->restore ? pm->restore(dev) : 0;
|
||||
else
|
||||
ret = i2c_legacy_resume(dev);
|
||||
|
||||
if (!ret) {
|
||||
pm_runtime_disable(dev);
|
||||
pm_runtime_set_active(dev);
|
||||
pm_runtime_enable(dev);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
#else /* !CONFIG_PM_SLEEP */
|
||||
#define i2c_device_pm_suspend NULL
|
||||
#define i2c_device_pm_resume NULL
|
||||
#define i2c_device_pm_freeze NULL
|
||||
#define i2c_device_pm_thaw NULL
|
||||
#define i2c_device_pm_poweroff NULL
|
||||
#define i2c_device_pm_restore NULL
|
||||
#endif /* !CONFIG_PM_SLEEP */
|
||||
|
||||
static void i2c_client_dev_release(struct device *dev)
|
||||
{
|
||||
kfree(to_i2c_client(dev));
|
||||
@ -301,9 +325,15 @@ static const struct attribute_group *i2c_dev_attr_groups[] = {
|
||||
static const struct dev_pm_ops i2c_device_pm_ops = {
|
||||
.suspend = i2c_device_pm_suspend,
|
||||
.resume = i2c_device_pm_resume,
|
||||
.runtime_suspend = i2c_device_runtime_suspend,
|
||||
.runtime_resume = i2c_device_runtime_resume,
|
||||
.runtime_idle = i2c_device_runtime_idle,
|
||||
.freeze = i2c_device_pm_freeze,
|
||||
.thaw = i2c_device_pm_thaw,
|
||||
.poweroff = i2c_device_pm_poweroff,
|
||||
.restore = i2c_device_pm_restore,
|
||||
SET_RUNTIME_PM_OPS(
|
||||
pm_generic_runtime_suspend,
|
||||
pm_generic_runtime_resume,
|
||||
pm_generic_runtime_idle
|
||||
)
|
||||
};
|
||||
|
||||
struct bus_type i2c_bus_type = {
|
||||
@ -312,8 +342,6 @@ struct bus_type i2c_bus_type = {
|
||||
.probe = i2c_device_probe,
|
||||
.remove = i2c_device_remove,
|
||||
.shutdown = i2c_device_shutdown,
|
||||
.suspend = i2c_device_suspend,
|
||||
.resume = i2c_device_resume,
|
||||
.pm = &i2c_device_pm_ops,
|
||||
};
|
||||
EXPORT_SYMBOL_GPL(i2c_bus_type);
|
||||
|
@ -2524,12 +2524,12 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
|
||||
* excessive C-state transition latencies result in
|
||||
* dropped transactions.
|
||||
*/
|
||||
pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
adapter->netdev->name, 55);
|
||||
pm_qos_update_request(
|
||||
adapter->netdev->pm_qos_req, 55);
|
||||
} else {
|
||||
pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
adapter->netdev->name,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
pm_qos_update_request(
|
||||
adapter->netdev->pm_qos_req,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
}
|
||||
}
|
||||
|
||||
@ -2824,8 +2824,8 @@ int e1000e_up(struct e1000_adapter *adapter)
|
||||
|
||||
/* DMA latency requirement to workaround early-receive/jumbo issue */
|
||||
if (adapter->flags & FLAG_HAS_ERT)
|
||||
pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
adapter->netdev->name,
|
||||
adapter->netdev->pm_qos_req =
|
||||
pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
|
||||
/* hardware has been reset, we need to reload some things */
|
||||
@ -2887,9 +2887,11 @@ void e1000e_down(struct e1000_adapter *adapter)
|
||||
e1000_clean_tx_ring(adapter);
|
||||
e1000_clean_rx_ring(adapter);
|
||||
|
||||
if (adapter->flags & FLAG_HAS_ERT)
|
||||
pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
adapter->netdev->name);
|
||||
if (adapter->flags & FLAG_HAS_ERT) {
|
||||
pm_qos_remove_request(
|
||||
adapter->netdev->pm_qos_req);
|
||||
adapter->netdev->pm_qos_req = NULL;
|
||||
}
|
||||
|
||||
/*
|
||||
* TODO: for power management, we could drop the link and
|
||||
|
@ -48,6 +48,7 @@
|
||||
#define DRV_VERSION "1.0.0-k0"
|
||||
char igbvf_driver_name[] = "igbvf";
|
||||
const char igbvf_driver_version[] = DRV_VERSION;
|
||||
struct pm_qos_request_list *igbvf_driver_pm_qos_req;
|
||||
static const char igbvf_driver_string[] =
|
||||
"Intel(R) Virtual Function Network Driver";
|
||||
static const char igbvf_copyright[] = "Copyright (c) 2009 Intel Corporation.";
|
||||
@ -2899,7 +2900,7 @@ static int __init igbvf_init_module(void)
|
||||
printk(KERN_INFO "%s\n", igbvf_copyright);
|
||||
|
||||
ret = pci_register_driver(&igbvf_driver);
|
||||
pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name,
|
||||
igbvf_driver_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
|
||||
return ret;
|
||||
@ -2915,7 +2916,8 @@ module_init(igbvf_init_module);
|
||||
static void __exit igbvf_exit_module(void)
|
||||
{
|
||||
pci_unregister_driver(&igbvf_driver);
|
||||
pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name);
|
||||
pm_qos_remove_request(igbvf_driver_pm_qos_req);
|
||||
igbvf_driver_pm_qos_req = NULL;
|
||||
}
|
||||
module_exit(igbvf_exit_module);
|
||||
|
||||
|
@ -174,6 +174,8 @@ that only one external action is invoked at a time.
|
||||
#define DRV_DESCRIPTION "Intel(R) PRO/Wireless 2100 Network Driver"
|
||||
#define DRV_COPYRIGHT "Copyright(c) 2003-2006 Intel Corporation"
|
||||
|
||||
struct pm_qos_request_list *ipw2100_pm_qos_req;
|
||||
|
||||
/* Debugging stuff */
|
||||
#ifdef CONFIG_IPW2100_DEBUG
|
||||
#define IPW2100_RX_DEBUG /* Reception debugging */
|
||||
@ -1739,7 +1741,7 @@ static int ipw2100_up(struct ipw2100_priv *priv, int deferred)
|
||||
/* the ipw2100 hardware really doesn't want power management delays
|
||||
* longer than 175usec
|
||||
*/
|
||||
pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100", 175);
|
||||
pm_qos_update_request(ipw2100_pm_qos_req, 175);
|
||||
|
||||
/* If the interrupt is enabled, turn it off... */
|
||||
spin_lock_irqsave(&priv->low_lock, flags);
|
||||
@ -1887,8 +1889,7 @@ static void ipw2100_down(struct ipw2100_priv *priv)
|
||||
ipw2100_disable_interrupts(priv);
|
||||
spin_unlock_irqrestore(&priv->low_lock, flags);
|
||||
|
||||
pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100",
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
pm_qos_update_request(ipw2100_pm_qos_req, PM_QOS_DEFAULT_VALUE);
|
||||
|
||||
/* We have to signal any supplicant if we are disassociating */
|
||||
if (associated)
|
||||
@ -6669,7 +6670,7 @@ static int __init ipw2100_init(void)
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100",
|
||||
ipw2100_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
#ifdef CONFIG_IPW2100_DEBUG
|
||||
ipw2100_debug_level = debug;
|
||||
@ -6692,7 +6693,7 @@ static void __exit ipw2100_exit(void)
|
||||
&driver_attr_debug_level);
|
||||
#endif
|
||||
pci_unregister_driver(&ipw2100_pci_driver);
|
||||
pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100");
|
||||
pm_qos_remove_request(ipw2100_pm_qos_req);
|
||||
}
|
||||
|
||||
module_init(ipw2100_init);
|
||||
|
35
fs/libfs.c
35
fs/libfs.c
@ -546,6 +546,40 @@ ssize_t simple_read_from_buffer(void __user *to, size_t count, loff_t *ppos,
|
||||
return count;
|
||||
}
|
||||
|
||||
/**
|
||||
* simple_write_to_buffer - copy data from user space to the buffer
|
||||
* @to: the buffer to write to
|
||||
* @available: the size of the buffer
|
||||
* @ppos: the current position in the buffer
|
||||
* @from: the user space buffer to read from
|
||||
* @count: the maximum number of bytes to read
|
||||
*
|
||||
* The simple_write_to_buffer() function reads up to @count bytes from the user
|
||||
* space address starting at @from into the buffer @to at offset @ppos.
|
||||
*
|
||||
* On success, the number of bytes written is returned and the offset @ppos is
|
||||
* advanced by this number, or negative value is returned on error.
|
||||
**/
|
||||
ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos,
|
||||
const void __user *from, size_t count)
|
||||
{
|
||||
loff_t pos = *ppos;
|
||||
size_t res;
|
||||
|
||||
if (pos < 0)
|
||||
return -EINVAL;
|
||||
if (pos >= available || !count)
|
||||
return 0;
|
||||
if (count > available - pos)
|
||||
count = available - pos;
|
||||
res = copy_from_user(to + pos, from, count);
|
||||
if (res == count)
|
||||
return -EFAULT;
|
||||
count -= res;
|
||||
*ppos = pos + count;
|
||||
return count;
|
||||
}
|
||||
|
||||
/**
|
||||
* memory_read_from_buffer - copy data from the buffer
|
||||
* @to: the kernel space buffer to read to
|
||||
@ -864,6 +898,7 @@ EXPORT_SYMBOL(simple_statfs);
|
||||
EXPORT_SYMBOL(simple_sync_file);
|
||||
EXPORT_SYMBOL(simple_unlink);
|
||||
EXPORT_SYMBOL(simple_read_from_buffer);
|
||||
EXPORT_SYMBOL(simple_write_to_buffer);
|
||||
EXPORT_SYMBOL(memory_read_from_buffer);
|
||||
EXPORT_SYMBOL(simple_transaction_set);
|
||||
EXPORT_SYMBOL(simple_transaction_get);
|
||||
|
@ -2362,6 +2362,8 @@ extern void simple_release_fs(struct vfsmount **mount, int *count);
|
||||
|
||||
extern ssize_t simple_read_from_buffer(void __user *to, size_t count,
|
||||
loff_t *ppos, const void *from, size_t available);
|
||||
extern ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos,
|
||||
const void __user *from, size_t count);
|
||||
|
||||
extern int simple_fsync(struct file *, struct dentry *, int);
|
||||
|
||||
|
@ -31,6 +31,7 @@
|
||||
#include <linux/if_link.h>
|
||||
|
||||
#ifdef __KERNEL__
|
||||
#include <linux/pm_qos_params.h>
|
||||
#include <linux/timer.h>
|
||||
#include <linux/delay.h>
|
||||
#include <linux/mm.h>
|
||||
@ -711,6 +712,9 @@ struct net_device {
|
||||
* the interface.
|
||||
*/
|
||||
char name[IFNAMSIZ];
|
||||
|
||||
struct pm_qos_request_list *pm_qos_req;
|
||||
|
||||
/* device name hash chain */
|
||||
struct hlist_node name_hlist;
|
||||
/* snmp alias */
|
||||
|
@ -14,12 +14,14 @@
|
||||
#define PM_QOS_NUM_CLASSES 4
|
||||
#define PM_QOS_DEFAULT_VALUE -1
|
||||
|
||||
int pm_qos_add_requirement(int qos, char *name, s32 value);
|
||||
int pm_qos_update_requirement(int qos, char *name, s32 new_value);
|
||||
void pm_qos_remove_requirement(int qos, char *name);
|
||||
struct pm_qos_request_list;
|
||||
|
||||
int pm_qos_requirement(int qos);
|
||||
struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value);
|
||||
void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
|
||||
s32 new_value);
|
||||
void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
|
||||
|
||||
int pm_qos_add_notifier(int qos, struct notifier_block *notifier);
|
||||
int pm_qos_remove_notifier(int qos, struct notifier_block *notifier);
|
||||
int pm_qos_request(int pm_qos_class);
|
||||
int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier);
|
||||
int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier);
|
||||
|
||||
|
@ -30,6 +30,9 @@ extern void pm_runtime_enable(struct device *dev);
|
||||
extern void __pm_runtime_disable(struct device *dev, bool check_resume);
|
||||
extern void pm_runtime_allow(struct device *dev);
|
||||
extern void pm_runtime_forbid(struct device *dev);
|
||||
extern int pm_generic_runtime_idle(struct device *dev);
|
||||
extern int pm_generic_runtime_suspend(struct device *dev);
|
||||
extern int pm_generic_runtime_resume(struct device *dev);
|
||||
|
||||
static inline bool pm_children_suspended(struct device *dev)
|
||||
{
|
||||
@ -96,6 +99,10 @@ static inline bool device_run_wake(struct device *dev) { return false; }
|
||||
static inline void device_set_run_wake(struct device *dev, bool enable) {}
|
||||
static inline bool pm_runtime_suspended(struct device *dev) { return false; }
|
||||
|
||||
static inline int pm_generic_runtime_idle(struct device *dev) { return 0; }
|
||||
static inline int pm_generic_runtime_suspend(struct device *dev) { return 0; }
|
||||
static inline int pm_generic_runtime_resume(struct device *dev) { return 0; }
|
||||
|
||||
#endif /* !CONFIG_PM_RUNTIME */
|
||||
|
||||
static inline int pm_runtime_get(struct device *dev)
|
||||
|
@ -25,32 +25,34 @@
|
||||
# error "please don't include this file directly"
|
||||
#endif
|
||||
|
||||
#include <linux/types.h>
|
||||
|
||||
#ifdef CONFIG_PM
|
||||
|
||||
/* changes to device_may_wakeup take effect on the next pm state change.
|
||||
* by default, devices should wakeup if they can.
|
||||
*/
|
||||
static inline void device_init_wakeup(struct device *dev, int val)
|
||||
static inline void device_init_wakeup(struct device *dev, bool val)
|
||||
{
|
||||
dev->power.can_wakeup = dev->power.should_wakeup = !!val;
|
||||
dev->power.can_wakeup = dev->power.should_wakeup = val;
|
||||
}
|
||||
|
||||
static inline void device_set_wakeup_capable(struct device *dev, int val)
|
||||
static inline void device_set_wakeup_capable(struct device *dev, bool capable)
|
||||
{
|
||||
dev->power.can_wakeup = !!val;
|
||||
dev->power.can_wakeup = capable;
|
||||
}
|
||||
|
||||
static inline int device_can_wakeup(struct device *dev)
|
||||
static inline bool device_can_wakeup(struct device *dev)
|
||||
{
|
||||
return dev->power.can_wakeup;
|
||||
}
|
||||
|
||||
static inline void device_set_wakeup_enable(struct device *dev, int val)
|
||||
static inline void device_set_wakeup_enable(struct device *dev, bool enable)
|
||||
{
|
||||
dev->power.should_wakeup = !!val;
|
||||
dev->power.should_wakeup = enable;
|
||||
}
|
||||
|
||||
static inline int device_may_wakeup(struct device *dev)
|
||||
static inline bool device_may_wakeup(struct device *dev)
|
||||
{
|
||||
return dev->power.can_wakeup && dev->power.should_wakeup;
|
||||
}
|
||||
@ -58,20 +60,28 @@ static inline int device_may_wakeup(struct device *dev)
|
||||
#else /* !CONFIG_PM */
|
||||
|
||||
/* For some reason the next two routines work even without CONFIG_PM */
|
||||
static inline void device_init_wakeup(struct device *dev, int val)
|
||||
static inline void device_init_wakeup(struct device *dev, bool val)
|
||||
{
|
||||
dev->power.can_wakeup = !!val;
|
||||
dev->power.can_wakeup = val;
|
||||
}
|
||||
|
||||
static inline void device_set_wakeup_capable(struct device *dev, int val) { }
|
||||
static inline void device_set_wakeup_capable(struct device *dev, bool capable)
|
||||
{
|
||||
}
|
||||
|
||||
static inline int device_can_wakeup(struct device *dev)
|
||||
static inline bool device_can_wakeup(struct device *dev)
|
||||
{
|
||||
return dev->power.can_wakeup;
|
||||
}
|
||||
|
||||
#define device_set_wakeup_enable(dev, val) do {} while (0)
|
||||
#define device_may_wakeup(dev) 0
|
||||
static inline void device_set_wakeup_enable(struct device *dev, bool enable)
|
||||
{
|
||||
}
|
||||
|
||||
static inline bool device_may_wakeup(struct device *dev)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
#endif /* !CONFIG_PM */
|
||||
|
||||
|
@ -29,6 +29,7 @@
|
||||
#include <linux/poll.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/bitops.h>
|
||||
#include <linux/pm_qos_params.h>
|
||||
|
||||
#define snd_pcm_substream_chip(substream) ((substream)->private_data)
|
||||
#define snd_pcm_chip(pcm) ((pcm)->private_data)
|
||||
@ -365,7 +366,7 @@ struct snd_pcm_substream {
|
||||
int number;
|
||||
char name[32]; /* substream name */
|
||||
int stream; /* stream (direction) */
|
||||
char latency_id[20]; /* latency identifier */
|
||||
struct pm_qos_request_list *latency_pm_qos_req; /* pm_qos request */
|
||||
size_t buffer_bytes_max; /* limit ring buffer size */
|
||||
struct snd_dma_buffer dma_buffer;
|
||||
unsigned int dma_buf_id;
|
||||
|
@ -89,10 +89,10 @@ struct cgroup_subsys freezer_subsys;
|
||||
|
||||
/* Locks taken and their ordering
|
||||
* ------------------------------
|
||||
* css_set_lock
|
||||
* cgroup_mutex (AKA cgroup_lock)
|
||||
* task->alloc_lock (AKA task_lock)
|
||||
* freezer->lock
|
||||
* css_set_lock
|
||||
* task->alloc_lock (AKA task_lock)
|
||||
* task->sighand->siglock
|
||||
*
|
||||
* cgroup code forces css_set_lock to be taken before task->alloc_lock
|
||||
@ -100,33 +100,38 @@ struct cgroup_subsys freezer_subsys;
|
||||
* freezer_create(), freezer_destroy():
|
||||
* cgroup_mutex [ by cgroup core ]
|
||||
*
|
||||
* can_attach():
|
||||
* cgroup_mutex
|
||||
* freezer_can_attach():
|
||||
* cgroup_mutex (held by caller of can_attach)
|
||||
*
|
||||
* cgroup_frozen():
|
||||
* cgroup_freezing_or_frozen():
|
||||
* task->alloc_lock (to get task's cgroup)
|
||||
*
|
||||
* freezer_fork() (preserving fork() performance means can't take cgroup_mutex):
|
||||
* task->alloc_lock (to get task's cgroup)
|
||||
* freezer->lock
|
||||
* sighand->siglock (if the cgroup is freezing)
|
||||
*
|
||||
* freezer_read():
|
||||
* cgroup_mutex
|
||||
* freezer->lock
|
||||
* write_lock css_set_lock (cgroup iterator start)
|
||||
* task->alloc_lock
|
||||
* read_lock css_set_lock (cgroup iterator start)
|
||||
*
|
||||
* freezer_write() (freeze):
|
||||
* cgroup_mutex
|
||||
* freezer->lock
|
||||
* write_lock css_set_lock (cgroup iterator start)
|
||||
* task->alloc_lock
|
||||
* read_lock css_set_lock (cgroup iterator start)
|
||||
* sighand->siglock
|
||||
* sighand->siglock (fake signal delivery inside freeze_task())
|
||||
*
|
||||
* freezer_write() (unfreeze):
|
||||
* cgroup_mutex
|
||||
* freezer->lock
|
||||
* write_lock css_set_lock (cgroup iterator start)
|
||||
* task->alloc_lock
|
||||
* read_lock css_set_lock (cgroup iterator start)
|
||||
* task->alloc_lock (to prevent races with freeze_task())
|
||||
* task->alloc_lock (inside thaw_process(), prevents race with refrigerator())
|
||||
* sighand->siglock
|
||||
*/
|
||||
static struct cgroup_subsys_state *freezer_create(struct cgroup_subsys *ss,
|
||||
|
@ -2,7 +2,7 @@
|
||||
* This module exposes the interface to kernel space for specifying
|
||||
* QoS dependencies. It provides infrastructure for registration of:
|
||||
*
|
||||
* Dependents on a QoS value : register requirements
|
||||
* Dependents on a QoS value : register requests
|
||||
* Watchers of QoS value : get notified when target QoS value changes
|
||||
*
|
||||
* This QoS design is best effort based. Dependents register their QoS needs.
|
||||
@ -14,19 +14,21 @@
|
||||
* timeout: usec <-- currently not used.
|
||||
* throughput: kbs (kilo byte / sec)
|
||||
*
|
||||
* There are lists of pm_qos_objects each one wrapping requirements, notifiers
|
||||
* There are lists of pm_qos_objects each one wrapping requests, notifiers
|
||||
*
|
||||
* User mode requirements on a QOS parameter register themselves to the
|
||||
* User mode requests on a QOS parameter register themselves to the
|
||||
* subsystem by opening the device node /dev/... and writing there request to
|
||||
* the node. As long as the process holds a file handle open to the node the
|
||||
* client continues to be accounted for. Upon file release the usermode
|
||||
* requirement is removed and a new qos target is computed. This way when the
|
||||
* requirement that the application has is cleaned up when closes the file
|
||||
* request is removed and a new qos target is computed. This way when the
|
||||
* request that the application has is cleaned up when closes the file
|
||||
* pointer or exits the pm_qos_object will get an opportunity to clean up.
|
||||
*
|
||||
* Mark Gross <mgross@linux.intel.com>
|
||||
*/
|
||||
|
||||
/*#define DEBUG*/
|
||||
|
||||
#include <linux/pm_qos_params.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/spinlock.h>
|
||||
@ -42,25 +44,25 @@
|
||||
#include <linux/uaccess.h>
|
||||
|
||||
/*
|
||||
* locking rule: all changes to requirements or notifiers lists
|
||||
* locking rule: all changes to requests or notifiers lists
|
||||
* or pm_qos_object list and pm_qos_objects need to happen with pm_qos_lock
|
||||
* held, taken with _irqsave. One lock to rule them all
|
||||
*/
|
||||
struct requirement_list {
|
||||
struct pm_qos_request_list {
|
||||
struct list_head list;
|
||||
union {
|
||||
s32 value;
|
||||
s32 usec;
|
||||
s32 kbps;
|
||||
};
|
||||
char *name;
|
||||
int pm_qos_class;
|
||||
};
|
||||
|
||||
static s32 max_compare(s32 v1, s32 v2);
|
||||
static s32 min_compare(s32 v1, s32 v2);
|
||||
|
||||
struct pm_qos_object {
|
||||
struct requirement_list requirements;
|
||||
struct pm_qos_request_list requests;
|
||||
struct blocking_notifier_head *notifiers;
|
||||
struct miscdevice pm_qos_power_miscdev;
|
||||
char *name;
|
||||
@ -72,7 +74,7 @@ struct pm_qos_object {
|
||||
static struct pm_qos_object null_pm_qos;
|
||||
static BLOCKING_NOTIFIER_HEAD(cpu_dma_lat_notifier);
|
||||
static struct pm_qos_object cpu_dma_pm_qos = {
|
||||
.requirements = {LIST_HEAD_INIT(cpu_dma_pm_qos.requirements.list)},
|
||||
.requests = {LIST_HEAD_INIT(cpu_dma_pm_qos.requests.list)},
|
||||
.notifiers = &cpu_dma_lat_notifier,
|
||||
.name = "cpu_dma_latency",
|
||||
.default_value = 2000 * USEC_PER_SEC,
|
||||
@ -82,7 +84,7 @@ static struct pm_qos_object cpu_dma_pm_qos = {
|
||||
|
||||
static BLOCKING_NOTIFIER_HEAD(network_lat_notifier);
|
||||
static struct pm_qos_object network_lat_pm_qos = {
|
||||
.requirements = {LIST_HEAD_INIT(network_lat_pm_qos.requirements.list)},
|
||||
.requests = {LIST_HEAD_INIT(network_lat_pm_qos.requests.list)},
|
||||
.notifiers = &network_lat_notifier,
|
||||
.name = "network_latency",
|
||||
.default_value = 2000 * USEC_PER_SEC,
|
||||
@ -93,8 +95,7 @@ static struct pm_qos_object network_lat_pm_qos = {
|
||||
|
||||
static BLOCKING_NOTIFIER_HEAD(network_throughput_notifier);
|
||||
static struct pm_qos_object network_throughput_pm_qos = {
|
||||
.requirements =
|
||||
{LIST_HEAD_INIT(network_throughput_pm_qos.requirements.list)},
|
||||
.requests = {LIST_HEAD_INIT(network_throughput_pm_qos.requests.list)},
|
||||
.notifiers = &network_throughput_notifier,
|
||||
.name = "network_throughput",
|
||||
.default_value = 0,
|
||||
@ -135,31 +136,34 @@ static s32 min_compare(s32 v1, s32 v2)
|
||||
}
|
||||
|
||||
|
||||
static void update_target(int target)
|
||||
static void update_target(int pm_qos_class)
|
||||
{
|
||||
s32 extreme_value;
|
||||
struct requirement_list *node;
|
||||
struct pm_qos_request_list *node;
|
||||
unsigned long flags;
|
||||
int call_notifier = 0;
|
||||
|
||||
spin_lock_irqsave(&pm_qos_lock, flags);
|
||||
extreme_value = pm_qos_array[target]->default_value;
|
||||
extreme_value = pm_qos_array[pm_qos_class]->default_value;
|
||||
list_for_each_entry(node,
|
||||
&pm_qos_array[target]->requirements.list, list) {
|
||||
extreme_value = pm_qos_array[target]->comparitor(
|
||||
&pm_qos_array[pm_qos_class]->requests.list, list) {
|
||||
extreme_value = pm_qos_array[pm_qos_class]->comparitor(
|
||||
extreme_value, node->value);
|
||||
}
|
||||
if (atomic_read(&pm_qos_array[target]->target_value) != extreme_value) {
|
||||
if (atomic_read(&pm_qos_array[pm_qos_class]->target_value) !=
|
||||
extreme_value) {
|
||||
call_notifier = 1;
|
||||
atomic_set(&pm_qos_array[target]->target_value, extreme_value);
|
||||
pr_debug(KERN_ERR "new target for qos %d is %d\n", target,
|
||||
atomic_read(&pm_qos_array[target]->target_value));
|
||||
atomic_set(&pm_qos_array[pm_qos_class]->target_value,
|
||||
extreme_value);
|
||||
pr_debug(KERN_ERR "new target for qos %d is %d\n", pm_qos_class,
|
||||
atomic_read(&pm_qos_array[pm_qos_class]->target_value));
|
||||
}
|
||||
spin_unlock_irqrestore(&pm_qos_lock, flags);
|
||||
|
||||
if (call_notifier)
|
||||
blocking_notifier_call_chain(pm_qos_array[target]->notifiers,
|
||||
(unsigned long) extreme_value, NULL);
|
||||
blocking_notifier_call_chain(
|
||||
pm_qos_array[pm_qos_class]->notifiers,
|
||||
(unsigned long) extreme_value, NULL);
|
||||
}
|
||||
|
||||
static int register_pm_qos_misc(struct pm_qos_object *qos)
|
||||
@ -185,125 +189,112 @@ static int find_pm_qos_object_by_minor(int minor)
|
||||
}
|
||||
|
||||
/**
|
||||
* pm_qos_requirement - returns current system wide qos expectation
|
||||
* pm_qos_request - returns current system wide qos expectation
|
||||
* @pm_qos_class: identification of which qos value is requested
|
||||
*
|
||||
* This function returns the current target value in an atomic manner.
|
||||
*/
|
||||
int pm_qos_requirement(int pm_qos_class)
|
||||
int pm_qos_request(int pm_qos_class)
|
||||
{
|
||||
return atomic_read(&pm_qos_array[pm_qos_class]->target_value);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pm_qos_requirement);
|
||||
EXPORT_SYMBOL_GPL(pm_qos_request);
|
||||
|
||||
/**
|
||||
* pm_qos_add_requirement - inserts new qos request into the list
|
||||
* pm_qos_add_request - inserts new qos request into the list
|
||||
* @pm_qos_class: identifies which list of qos request to us
|
||||
* @name: identifies the request
|
||||
* @value: defines the qos request
|
||||
*
|
||||
* This function inserts a new entry in the pm_qos_class list of requested qos
|
||||
* performance characteristics. It recomputes the aggregate QoS expectations
|
||||
* for the pm_qos_class of parameters.
|
||||
* for the pm_qos_class of parameters, and returns the pm_qos_request list
|
||||
* element as a handle for use in updating and removal. Call needs to save
|
||||
* this handle for later use.
|
||||
*/
|
||||
int pm_qos_add_requirement(int pm_qos_class, char *name, s32 value)
|
||||
struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value)
|
||||
{
|
||||
struct requirement_list *dep;
|
||||
struct pm_qos_request_list *dep;
|
||||
unsigned long flags;
|
||||
|
||||
dep = kzalloc(sizeof(struct requirement_list), GFP_KERNEL);
|
||||
dep = kzalloc(sizeof(struct pm_qos_request_list), GFP_KERNEL);
|
||||
if (dep) {
|
||||
if (value == PM_QOS_DEFAULT_VALUE)
|
||||
dep->value = pm_qos_array[pm_qos_class]->default_value;
|
||||
else
|
||||
dep->value = value;
|
||||
dep->name = kstrdup(name, GFP_KERNEL);
|
||||
if (!dep->name)
|
||||
goto cleanup;
|
||||
dep->pm_qos_class = pm_qos_class;
|
||||
|
||||
spin_lock_irqsave(&pm_qos_lock, flags);
|
||||
list_add(&dep->list,
|
||||
&pm_qos_array[pm_qos_class]->requirements.list);
|
||||
&pm_qos_array[pm_qos_class]->requests.list);
|
||||
spin_unlock_irqrestore(&pm_qos_lock, flags);
|
||||
update_target(pm_qos_class);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
cleanup:
|
||||
kfree(dep);
|
||||
return -ENOMEM;
|
||||
return dep;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pm_qos_add_requirement);
|
||||
EXPORT_SYMBOL_GPL(pm_qos_add_request);
|
||||
|
||||
/**
|
||||
* pm_qos_update_requirement - modifies an existing qos request
|
||||
* @pm_qos_class: identifies which list of qos request to us
|
||||
* @name: identifies the request
|
||||
* pm_qos_update_request - modifies an existing qos request
|
||||
* @pm_qos_req : handle to list element holding a pm_qos request to use
|
||||
* @value: defines the qos request
|
||||
*
|
||||
* Updates an existing qos requirement for the pm_qos_class of parameters along
|
||||
* Updates an existing qos request for the pm_qos_class of parameters along
|
||||
* with updating the target pm_qos_class value.
|
||||
*
|
||||
* If the named request isn't in the list then no change is made.
|
||||
* Attempts are made to make this code callable on hot code paths.
|
||||
*/
|
||||
int pm_qos_update_requirement(int pm_qos_class, char *name, s32 new_value)
|
||||
void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
|
||||
s32 new_value)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct requirement_list *node;
|
||||
int pending_update = 0;
|
||||
s32 temp;
|
||||
|
||||
spin_lock_irqsave(&pm_qos_lock, flags);
|
||||
list_for_each_entry(node,
|
||||
&pm_qos_array[pm_qos_class]->requirements.list, list) {
|
||||
if (strcmp(node->name, name) == 0) {
|
||||
if (new_value == PM_QOS_DEFAULT_VALUE)
|
||||
node->value =
|
||||
pm_qos_array[pm_qos_class]->default_value;
|
||||
else
|
||||
node->value = new_value;
|
||||
if (pm_qos_req) { /*guard against callers passing in null */
|
||||
spin_lock_irqsave(&pm_qos_lock, flags);
|
||||
if (new_value == PM_QOS_DEFAULT_VALUE)
|
||||
temp = pm_qos_array[pm_qos_req->pm_qos_class]->default_value;
|
||||
else
|
||||
temp = new_value;
|
||||
|
||||
if (temp != pm_qos_req->value) {
|
||||
pending_update = 1;
|
||||
break;
|
||||
pm_qos_req->value = temp;
|
||||
}
|
||||
spin_unlock_irqrestore(&pm_qos_lock, flags);
|
||||
if (pending_update)
|
||||
update_target(pm_qos_req->pm_qos_class);
|
||||
}
|
||||
spin_unlock_irqrestore(&pm_qos_lock, flags);
|
||||
if (pending_update)
|
||||
update_target(pm_qos_class);
|
||||
|
||||
return 0;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pm_qos_update_requirement);
|
||||
EXPORT_SYMBOL_GPL(pm_qos_update_request);
|
||||
|
||||
/**
|
||||
* pm_qos_remove_requirement - modifies an existing qos request
|
||||
* @pm_qos_class: identifies which list of qos request to us
|
||||
* @name: identifies the request
|
||||
* pm_qos_remove_request - modifies an existing qos request
|
||||
* @pm_qos_req: handle to request list element
|
||||
*
|
||||
* Will remove named qos request from pm_qos_class list of parameters and
|
||||
* recompute the current target value for the pm_qos_class.
|
||||
* Will remove pm qos request from the list of requests and
|
||||
* recompute the current target value for the pm_qos_class. Call this
|
||||
* on slow code paths.
|
||||
*/
|
||||
void pm_qos_remove_requirement(int pm_qos_class, char *name)
|
||||
void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct requirement_list *node;
|
||||
int pending_update = 0;
|
||||
int qos_class;
|
||||
|
||||
if (pm_qos_req == NULL)
|
||||
return;
|
||||
/* silent return to keep pcm code cleaner */
|
||||
|
||||
qos_class = pm_qos_req->pm_qos_class;
|
||||
spin_lock_irqsave(&pm_qos_lock, flags);
|
||||
list_for_each_entry(node,
|
||||
&pm_qos_array[pm_qos_class]->requirements.list, list) {
|
||||
if (strcmp(node->name, name) == 0) {
|
||||
kfree(node->name);
|
||||
list_del(&node->list);
|
||||
kfree(node);
|
||||
pending_update = 1;
|
||||
break;
|
||||
}
|
||||
}
|
||||
list_del(&pm_qos_req->list);
|
||||
kfree(pm_qos_req);
|
||||
spin_unlock_irqrestore(&pm_qos_lock, flags);
|
||||
if (pending_update)
|
||||
update_target(pm_qos_class);
|
||||
update_target(qos_class);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pm_qos_remove_requirement);
|
||||
EXPORT_SYMBOL_GPL(pm_qos_remove_request);
|
||||
|
||||
/**
|
||||
* pm_qos_add_notifier - sets notification entry for changes to target value
|
||||
@ -313,7 +304,7 @@ EXPORT_SYMBOL_GPL(pm_qos_remove_requirement);
|
||||
* will register the notifier into a notification chain that gets called
|
||||
* upon changes to the pm_qos_class target value.
|
||||
*/
|
||||
int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier)
|
||||
int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier)
|
||||
{
|
||||
int retval;
|
||||
|
||||
@ -343,21 +334,16 @@ int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pm_qos_remove_notifier);
|
||||
|
||||
#define PID_NAME_LEN 32
|
||||
|
||||
static int pm_qos_power_open(struct inode *inode, struct file *filp)
|
||||
{
|
||||
int ret;
|
||||
long pm_qos_class;
|
||||
char name[PID_NAME_LEN];
|
||||
|
||||
pm_qos_class = find_pm_qos_object_by_minor(iminor(inode));
|
||||
if (pm_qos_class >= 0) {
|
||||
filp->private_data = (void *)pm_qos_class;
|
||||
snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
|
||||
ret = pm_qos_add_requirement(pm_qos_class, name,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
if (ret >= 0)
|
||||
filp->private_data = (void *) pm_qos_add_request(pm_qos_class,
|
||||
PM_QOS_DEFAULT_VALUE);
|
||||
|
||||
if (filp->private_data)
|
||||
return 0;
|
||||
}
|
||||
return -EPERM;
|
||||
@ -365,32 +351,40 @@ static int pm_qos_power_open(struct inode *inode, struct file *filp)
|
||||
|
||||
static int pm_qos_power_release(struct inode *inode, struct file *filp)
|
||||
{
|
||||
int pm_qos_class;
|
||||
char name[PID_NAME_LEN];
|
||||
struct pm_qos_request_list *req;
|
||||
|
||||
pm_qos_class = (long)filp->private_data;
|
||||
snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
|
||||
pm_qos_remove_requirement(pm_qos_class, name);
|
||||
req = (struct pm_qos_request_list *)filp->private_data;
|
||||
pm_qos_remove_request(req);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
static ssize_t pm_qos_power_write(struct file *filp, const char __user *buf,
|
||||
size_t count, loff_t *f_pos)
|
||||
{
|
||||
s32 value;
|
||||
int pm_qos_class;
|
||||
char name[PID_NAME_LEN];
|
||||
int x;
|
||||
char ascii_value[11];
|
||||
struct pm_qos_request_list *pm_qos_req;
|
||||
|
||||
pm_qos_class = (long)filp->private_data;
|
||||
if (count != sizeof(s32))
|
||||
if (count == sizeof(s32)) {
|
||||
if (copy_from_user(&value, buf, sizeof(s32)))
|
||||
return -EFAULT;
|
||||
} else if (count == 11) { /* len('0x12345678/0') */
|
||||
if (copy_from_user(ascii_value, buf, 11))
|
||||
return -EFAULT;
|
||||
x = sscanf(ascii_value, "%x", &value);
|
||||
if (x != 1)
|
||||
return -EINVAL;
|
||||
pr_debug(KERN_ERR "%s, %d, 0x%x\n", ascii_value, x, value);
|
||||
} else
|
||||
return -EINVAL;
|
||||
if (copy_from_user(&value, buf, sizeof(s32)))
|
||||
return -EFAULT;
|
||||
snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
|
||||
pm_qos_update_requirement(pm_qos_class, name, value);
|
||||
|
||||
return sizeof(s32);
|
||||
pm_qos_req = (struct pm_qos_request_list *)filp->private_data;
|
||||
pm_qos_update_request(pm_qos_req, value);
|
||||
|
||||
return count;
|
||||
}
|
||||
|
||||
|
||||
|
@ -8,7 +8,8 @@ obj-$(CONFIG_PM_SLEEP) += console.o
|
||||
obj-$(CONFIG_FREEZER) += process.o
|
||||
obj-$(CONFIG_SUSPEND) += suspend.o
|
||||
obj-$(CONFIG_PM_TEST_SUSPEND) += suspend_test.o
|
||||
obj-$(CONFIG_HIBERNATION) += hibernate.o snapshot.o swap.o user.o
|
||||
obj-$(CONFIG_HIBERNATION) += hibernate.o snapshot.o swap.o user.o \
|
||||
block_io.o
|
||||
obj-$(CONFIG_HIBERNATION_NVS) += hibernate_nvs.o
|
||||
|
||||
obj-$(CONFIG_MAGIC_SYSRQ) += poweroff.o
|
||||
|
103
kernel/power/block_io.c
Normal file
103
kernel/power/block_io.c
Normal file
@ -0,0 +1,103 @@
|
||||
/*
|
||||
* This file provides functions for block I/O operations on swap/file.
|
||||
*
|
||||
* Copyright (C) 1998,2001-2005 Pavel Machek <pavel@ucw.cz>
|
||||
* Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
|
||||
*
|
||||
* This file is released under the GPLv2.
|
||||
*/
|
||||
|
||||
#include <linux/bio.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/pagemap.h>
|
||||
#include <linux/swap.h>
|
||||
|
||||
#include "power.h"
|
||||
|
||||
/**
|
||||
* submit - submit BIO request.
|
||||
* @rw: READ or WRITE.
|
||||
* @off physical offset of page.
|
||||
* @page: page we're reading or writing.
|
||||
* @bio_chain: list of pending biod (for async reading)
|
||||
*
|
||||
* Straight from the textbook - allocate and initialize the bio.
|
||||
* If we're reading, make sure the page is marked as dirty.
|
||||
* Then submit it and, if @bio_chain == NULL, wait.
|
||||
*/
|
||||
static int submit(int rw, struct block_device *bdev, sector_t sector,
|
||||
struct page *page, struct bio **bio_chain)
|
||||
{
|
||||
const int bio_rw = rw | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG);
|
||||
struct bio *bio;
|
||||
|
||||
bio = bio_alloc(__GFP_WAIT | __GFP_HIGH, 1);
|
||||
bio->bi_sector = sector;
|
||||
bio->bi_bdev = bdev;
|
||||
bio->bi_end_io = end_swap_bio_read;
|
||||
|
||||
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
|
||||
printk(KERN_ERR "PM: Adding page to bio failed at %llu\n",
|
||||
(unsigned long long)sector);
|
||||
bio_put(bio);
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
lock_page(page);
|
||||
bio_get(bio);
|
||||
|
||||
if (bio_chain == NULL) {
|
||||
submit_bio(bio_rw, bio);
|
||||
wait_on_page_locked(page);
|
||||
if (rw == READ)
|
||||
bio_set_pages_dirty(bio);
|
||||
bio_put(bio);
|
||||
} else {
|
||||
if (rw == READ)
|
||||
get_page(page); /* These pages are freed later */
|
||||
bio->bi_private = *bio_chain;
|
||||
*bio_chain = bio;
|
||||
submit_bio(bio_rw, bio);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
int hib_bio_read_page(pgoff_t page_off, void *addr, struct bio **bio_chain)
|
||||
{
|
||||
return submit(READ, hib_resume_bdev, page_off * (PAGE_SIZE >> 9),
|
||||
virt_to_page(addr), bio_chain);
|
||||
}
|
||||
|
||||
int hib_bio_write_page(pgoff_t page_off, void *addr, struct bio **bio_chain)
|
||||
{
|
||||
return submit(WRITE, hib_resume_bdev, page_off * (PAGE_SIZE >> 9),
|
||||
virt_to_page(addr), bio_chain);
|
||||
}
|
||||
|
||||
int hib_wait_on_bio_chain(struct bio **bio_chain)
|
||||
{
|
||||
struct bio *bio;
|
||||
struct bio *next_bio;
|
||||
int ret = 0;
|
||||
|
||||
if (bio_chain == NULL)
|
||||
return 0;
|
||||
|
||||
bio = *bio_chain;
|
||||
if (bio == NULL)
|
||||
return 0;
|
||||
while (bio) {
|
||||
struct page *page;
|
||||
|
||||
next_bio = bio->bi_private;
|
||||
page = bio->bi_io_vec[0].bv_page;
|
||||
wait_on_page_locked(page);
|
||||
if (!PageUptodate(page) || PageError(page))
|
||||
ret = -EIO;
|
||||
put_page(page);
|
||||
bio_put(bio);
|
||||
bio = next_bio;
|
||||
}
|
||||
*bio_chain = NULL;
|
||||
return ret;
|
||||
}
|
@ -97,24 +97,12 @@ extern int hibernate_preallocate_memory(void);
|
||||
*/
|
||||
|
||||
struct snapshot_handle {
|
||||
loff_t offset; /* number of the last byte ready for reading
|
||||
* or writing in the sequence
|
||||
*/
|
||||
unsigned int cur; /* number of the block of PAGE_SIZE bytes the
|
||||
* next operation will refer to (ie. current)
|
||||
*/
|
||||
unsigned int cur_offset; /* offset with respect to the current
|
||||
* block (for the next operation)
|
||||
*/
|
||||
unsigned int prev; /* number of the block of PAGE_SIZE bytes that
|
||||
* was the current one previously
|
||||
*/
|
||||
void *buffer; /* address of the block to read from
|
||||
* or write to
|
||||
*/
|
||||
unsigned int buf_offset; /* location to read from or write to,
|
||||
* given as a displacement from 'buffer'
|
||||
*/
|
||||
int sync_read; /* Set to one to notify the caller of
|
||||
* snapshot_write_next() that it may
|
||||
* need to call wait_on_bio_chain()
|
||||
@ -125,12 +113,12 @@ struct snapshot_handle {
|
||||
* snapshot_read_next()/snapshot_write_next() is allowed to
|
||||
* read/write data after the function returns
|
||||
*/
|
||||
#define data_of(handle) ((handle).buffer + (handle).buf_offset)
|
||||
#define data_of(handle) ((handle).buffer)
|
||||
|
||||
extern unsigned int snapshot_additional_pages(struct zone *zone);
|
||||
extern unsigned long snapshot_get_image_size(void);
|
||||
extern int snapshot_read_next(struct snapshot_handle *handle, size_t count);
|
||||
extern int snapshot_write_next(struct snapshot_handle *handle, size_t count);
|
||||
extern int snapshot_read_next(struct snapshot_handle *handle);
|
||||
extern int snapshot_write_next(struct snapshot_handle *handle);
|
||||
extern void snapshot_write_finalize(struct snapshot_handle *handle);
|
||||
extern int snapshot_image_loaded(struct snapshot_handle *handle);
|
||||
|
||||
@ -154,6 +142,15 @@ extern int swsusp_read(unsigned int *flags_p);
|
||||
extern int swsusp_write(unsigned int flags);
|
||||
extern void swsusp_close(fmode_t);
|
||||
|
||||
/* kernel/power/block_io.c */
|
||||
extern struct block_device *hib_resume_bdev;
|
||||
|
||||
extern int hib_bio_read_page(pgoff_t page_off, void *addr,
|
||||
struct bio **bio_chain);
|
||||
extern int hib_bio_write_page(pgoff_t page_off, void *addr,
|
||||
struct bio **bio_chain);
|
||||
extern int hib_wait_on_bio_chain(struct bio **bio_chain);
|
||||
|
||||
struct timeval;
|
||||
/* kernel/power/swsusp.c */
|
||||
extern void swsusp_show_speed(struct timeval *, struct timeval *,
|
||||
|
@ -1604,14 +1604,9 @@ pack_pfns(unsigned long *buf, struct memory_bitmap *bm)
|
||||
* snapshot_handle structure. The structure gets updated and a pointer
|
||||
* to it should be passed to this function every next time.
|
||||
*
|
||||
* The @count parameter should contain the number of bytes the caller
|
||||
* wants to read from the snapshot. It must not be zero.
|
||||
*
|
||||
* On success the function returns a positive number. Then, the caller
|
||||
* is allowed to read up to the returned number of bytes from the memory
|
||||
* location computed by the data_of() macro. The number returned
|
||||
* may be smaller than @count, but this only happens if the read would
|
||||
* cross a page boundary otherwise.
|
||||
* location computed by the data_of() macro.
|
||||
*
|
||||
* The function returns 0 to indicate the end of data stream condition,
|
||||
* and a negative number is returned on error. In such cases the
|
||||
@ -1619,7 +1614,7 @@ pack_pfns(unsigned long *buf, struct memory_bitmap *bm)
|
||||
* any more.
|
||||
*/
|
||||
|
||||
int snapshot_read_next(struct snapshot_handle *handle, size_t count)
|
||||
int snapshot_read_next(struct snapshot_handle *handle)
|
||||
{
|
||||
if (handle->cur > nr_meta_pages + nr_copy_pages)
|
||||
return 0;
|
||||
@ -1630,7 +1625,7 @@ int snapshot_read_next(struct snapshot_handle *handle, size_t count)
|
||||
if (!buffer)
|
||||
return -ENOMEM;
|
||||
}
|
||||
if (!handle->offset) {
|
||||
if (!handle->cur) {
|
||||
int error;
|
||||
|
||||
error = init_header((struct swsusp_info *)buffer);
|
||||
@ -1639,42 +1634,30 @@ int snapshot_read_next(struct snapshot_handle *handle, size_t count)
|
||||
handle->buffer = buffer;
|
||||
memory_bm_position_reset(&orig_bm);
|
||||
memory_bm_position_reset(©_bm);
|
||||
}
|
||||
if (handle->prev < handle->cur) {
|
||||
if (handle->cur <= nr_meta_pages) {
|
||||
memset(buffer, 0, PAGE_SIZE);
|
||||
pack_pfns(buffer, &orig_bm);
|
||||
} else {
|
||||
struct page *page;
|
||||
|
||||
page = pfn_to_page(memory_bm_next_pfn(©_bm));
|
||||
if (PageHighMem(page)) {
|
||||
/* Highmem pages are copied to the buffer,
|
||||
* because we can't return with a kmapped
|
||||
* highmem page (we may not be called again).
|
||||
*/
|
||||
void *kaddr;
|
||||
|
||||
kaddr = kmap_atomic(page, KM_USER0);
|
||||
memcpy(buffer, kaddr, PAGE_SIZE);
|
||||
kunmap_atomic(kaddr, KM_USER0);
|
||||
handle->buffer = buffer;
|
||||
} else {
|
||||
handle->buffer = page_address(page);
|
||||
}
|
||||
}
|
||||
handle->prev = handle->cur;
|
||||
}
|
||||
handle->buf_offset = handle->cur_offset;
|
||||
if (handle->cur_offset + count >= PAGE_SIZE) {
|
||||
count = PAGE_SIZE - handle->cur_offset;
|
||||
handle->cur_offset = 0;
|
||||
handle->cur++;
|
||||
} else if (handle->cur <= nr_meta_pages) {
|
||||
memset(buffer, 0, PAGE_SIZE);
|
||||
pack_pfns(buffer, &orig_bm);
|
||||
} else {
|
||||
handle->cur_offset += count;
|
||||
struct page *page;
|
||||
|
||||
page = pfn_to_page(memory_bm_next_pfn(©_bm));
|
||||
if (PageHighMem(page)) {
|
||||
/* Highmem pages are copied to the buffer,
|
||||
* because we can't return with a kmapped
|
||||
* highmem page (we may not be called again).
|
||||
*/
|
||||
void *kaddr;
|
||||
|
||||
kaddr = kmap_atomic(page, KM_USER0);
|
||||
memcpy(buffer, kaddr, PAGE_SIZE);
|
||||
kunmap_atomic(kaddr, KM_USER0);
|
||||
handle->buffer = buffer;
|
||||
} else {
|
||||
handle->buffer = page_address(page);
|
||||
}
|
||||
}
|
||||
handle->offset += count;
|
||||
return count;
|
||||
handle->cur++;
|
||||
return PAGE_SIZE;
|
||||
}
|
||||
|
||||
/**
|
||||
@ -2133,14 +2116,9 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
|
||||
* snapshot_handle structure. The structure gets updated and a pointer
|
||||
* to it should be passed to this function every next time.
|
||||
*
|
||||
* The @count parameter should contain the number of bytes the caller
|
||||
* wants to write to the image. It must not be zero.
|
||||
*
|
||||
* On success the function returns a positive number. Then, the caller
|
||||
* is allowed to write up to the returned number of bytes to the memory
|
||||
* location computed by the data_of() macro. The number returned
|
||||
* may be smaller than @count, but this only happens if the write would
|
||||
* cross a page boundary otherwise.
|
||||
* location computed by the data_of() macro.
|
||||
*
|
||||
* The function returns 0 to indicate the "end of file" condition,
|
||||
* and a negative number is returned on error. In such cases the
|
||||
@ -2148,16 +2126,18 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
|
||||
* any more.
|
||||
*/
|
||||
|
||||
int snapshot_write_next(struct snapshot_handle *handle, size_t count)
|
||||
int snapshot_write_next(struct snapshot_handle *handle)
|
||||
{
|
||||
static struct chain_allocator ca;
|
||||
int error = 0;
|
||||
|
||||
/* Check if we have already loaded the entire image */
|
||||
if (handle->prev && handle->cur > nr_meta_pages + nr_copy_pages)
|
||||
if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages)
|
||||
return 0;
|
||||
|
||||
if (handle->offset == 0) {
|
||||
handle->sync_read = 1;
|
||||
|
||||
if (!handle->cur) {
|
||||
if (!buffer)
|
||||
/* This makes the buffer be freed by swsusp_free() */
|
||||
buffer = get_image_page(GFP_ATOMIC, PG_ANY);
|
||||
@ -2166,56 +2146,43 @@ int snapshot_write_next(struct snapshot_handle *handle, size_t count)
|
||||
return -ENOMEM;
|
||||
|
||||
handle->buffer = buffer;
|
||||
}
|
||||
handle->sync_read = 1;
|
||||
if (handle->prev < handle->cur) {
|
||||
if (handle->prev == 0) {
|
||||
error = load_header(buffer);
|
||||
} else if (handle->cur == 1) {
|
||||
error = load_header(buffer);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
error = memory_bm_create(©_bm, GFP_ATOMIC, PG_ANY);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
} else if (handle->cur <= nr_meta_pages + 1) {
|
||||
error = unpack_orig_pfns(buffer, ©_bm);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
if (handle->cur == nr_meta_pages + 1) {
|
||||
error = prepare_image(&orig_bm, ©_bm);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
error = memory_bm_create(©_bm, GFP_ATOMIC, PG_ANY);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
} else if (handle->prev <= nr_meta_pages) {
|
||||
error = unpack_orig_pfns(buffer, ©_bm);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
if (handle->prev == nr_meta_pages) {
|
||||
error = prepare_image(&orig_bm, ©_bm);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
chain_init(&ca, GFP_ATOMIC, PG_SAFE);
|
||||
memory_bm_position_reset(&orig_bm);
|
||||
restore_pblist = NULL;
|
||||
handle->buffer = get_buffer(&orig_bm, &ca);
|
||||
handle->sync_read = 0;
|
||||
if (IS_ERR(handle->buffer))
|
||||
return PTR_ERR(handle->buffer);
|
||||
}
|
||||
} else {
|
||||
copy_last_highmem_page();
|
||||
chain_init(&ca, GFP_ATOMIC, PG_SAFE);
|
||||
memory_bm_position_reset(&orig_bm);
|
||||
restore_pblist = NULL;
|
||||
handle->buffer = get_buffer(&orig_bm, &ca);
|
||||
handle->sync_read = 0;
|
||||
if (IS_ERR(handle->buffer))
|
||||
return PTR_ERR(handle->buffer);
|
||||
if (handle->buffer != buffer)
|
||||
handle->sync_read = 0;
|
||||
}
|
||||
handle->prev = handle->cur;
|
||||
}
|
||||
handle->buf_offset = handle->cur_offset;
|
||||
if (handle->cur_offset + count >= PAGE_SIZE) {
|
||||
count = PAGE_SIZE - handle->cur_offset;
|
||||
handle->cur_offset = 0;
|
||||
handle->cur++;
|
||||
} else {
|
||||
handle->cur_offset += count;
|
||||
copy_last_highmem_page();
|
||||
handle->buffer = get_buffer(&orig_bm, &ca);
|
||||
if (IS_ERR(handle->buffer))
|
||||
return PTR_ERR(handle->buffer);
|
||||
if (handle->buffer != buffer)
|
||||
handle->sync_read = 0;
|
||||
}
|
||||
handle->offset += count;
|
||||
return count;
|
||||
handle->cur++;
|
||||
return PAGE_SIZE;
|
||||
}
|
||||
|
||||
/**
|
||||
@ -2230,7 +2197,7 @@ void snapshot_write_finalize(struct snapshot_handle *handle)
|
||||
{
|
||||
copy_last_highmem_page();
|
||||
/* Free only if we have loaded the image entirely */
|
||||
if (handle->prev && handle->cur > nr_meta_pages + nr_copy_pages) {
|
||||
if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages) {
|
||||
memory_bm_free(&orig_bm, PG_UNSAFE_CLEAR);
|
||||
free_highmem_data();
|
||||
}
|
||||
|
@ -29,6 +29,40 @@
|
||||
|
||||
#define SWSUSP_SIG "S1SUSPEND"
|
||||
|
||||
/*
|
||||
* The swap map is a data structure used for keeping track of each page
|
||||
* written to a swap partition. It consists of many swap_map_page
|
||||
* structures that contain each an array of MAP_PAGE_SIZE swap entries.
|
||||
* These structures are stored on the swap and linked together with the
|
||||
* help of the .next_swap member.
|
||||
*
|
||||
* The swap map is created during suspend. The swap map pages are
|
||||
* allocated and populated one at a time, so we only need one memory
|
||||
* page to set up the entire structure.
|
||||
*
|
||||
* During resume we also only need to use one swap_map_page structure
|
||||
* at a time.
|
||||
*/
|
||||
|
||||
#define MAP_PAGE_ENTRIES (PAGE_SIZE / sizeof(sector_t) - 1)
|
||||
|
||||
struct swap_map_page {
|
||||
sector_t entries[MAP_PAGE_ENTRIES];
|
||||
sector_t next_swap;
|
||||
};
|
||||
|
||||
/**
|
||||
* The swap_map_handle structure is used for handling swap in
|
||||
* a file-alike way
|
||||
*/
|
||||
|
||||
struct swap_map_handle {
|
||||
struct swap_map_page *cur;
|
||||
sector_t cur_swap;
|
||||
sector_t first_sector;
|
||||
unsigned int k;
|
||||
};
|
||||
|
||||
struct swsusp_header {
|
||||
char reserved[PAGE_SIZE - 20 - sizeof(sector_t) - sizeof(int)];
|
||||
sector_t image;
|
||||
@ -145,110 +179,24 @@ int swsusp_swap_in_use(void)
|
||||
*/
|
||||
|
||||
static unsigned short root_swap = 0xffff;
|
||||
static struct block_device *resume_bdev;
|
||||
|
||||
/**
|
||||
* submit - submit BIO request.
|
||||
* @rw: READ or WRITE.
|
||||
* @off physical offset of page.
|
||||
* @page: page we're reading or writing.
|
||||
* @bio_chain: list of pending biod (for async reading)
|
||||
*
|
||||
* Straight from the textbook - allocate and initialize the bio.
|
||||
* If we're reading, make sure the page is marked as dirty.
|
||||
* Then submit it and, if @bio_chain == NULL, wait.
|
||||
*/
|
||||
static int submit(int rw, pgoff_t page_off, struct page *page,
|
||||
struct bio **bio_chain)
|
||||
{
|
||||
const int bio_rw = rw | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG);
|
||||
struct bio *bio;
|
||||
|
||||
bio = bio_alloc(__GFP_WAIT | __GFP_HIGH, 1);
|
||||
bio->bi_sector = page_off * (PAGE_SIZE >> 9);
|
||||
bio->bi_bdev = resume_bdev;
|
||||
bio->bi_end_io = end_swap_bio_read;
|
||||
|
||||
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
|
||||
printk(KERN_ERR "PM: Adding page to bio failed at %ld\n",
|
||||
page_off);
|
||||
bio_put(bio);
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
lock_page(page);
|
||||
bio_get(bio);
|
||||
|
||||
if (bio_chain == NULL) {
|
||||
submit_bio(bio_rw, bio);
|
||||
wait_on_page_locked(page);
|
||||
if (rw == READ)
|
||||
bio_set_pages_dirty(bio);
|
||||
bio_put(bio);
|
||||
} else {
|
||||
if (rw == READ)
|
||||
get_page(page); /* These pages are freed later */
|
||||
bio->bi_private = *bio_chain;
|
||||
*bio_chain = bio;
|
||||
submit_bio(bio_rw, bio);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int bio_read_page(pgoff_t page_off, void *addr, struct bio **bio_chain)
|
||||
{
|
||||
return submit(READ, page_off, virt_to_page(addr), bio_chain);
|
||||
}
|
||||
|
||||
static int bio_write_page(pgoff_t page_off, void *addr, struct bio **bio_chain)
|
||||
{
|
||||
return submit(WRITE, page_off, virt_to_page(addr), bio_chain);
|
||||
}
|
||||
|
||||
static int wait_on_bio_chain(struct bio **bio_chain)
|
||||
{
|
||||
struct bio *bio;
|
||||
struct bio *next_bio;
|
||||
int ret = 0;
|
||||
|
||||
if (bio_chain == NULL)
|
||||
return 0;
|
||||
|
||||
bio = *bio_chain;
|
||||
if (bio == NULL)
|
||||
return 0;
|
||||
while (bio) {
|
||||
struct page *page;
|
||||
|
||||
next_bio = bio->bi_private;
|
||||
page = bio->bi_io_vec[0].bv_page;
|
||||
wait_on_page_locked(page);
|
||||
if (!PageUptodate(page) || PageError(page))
|
||||
ret = -EIO;
|
||||
put_page(page);
|
||||
bio_put(bio);
|
||||
bio = next_bio;
|
||||
}
|
||||
*bio_chain = NULL;
|
||||
return ret;
|
||||
}
|
||||
struct block_device *hib_resume_bdev;
|
||||
|
||||
/*
|
||||
* Saving part
|
||||
*/
|
||||
|
||||
static int mark_swapfiles(sector_t start, unsigned int flags)
|
||||
static int mark_swapfiles(struct swap_map_handle *handle, unsigned int flags)
|
||||
{
|
||||
int error;
|
||||
|
||||
bio_read_page(swsusp_resume_block, swsusp_header, NULL);
|
||||
hib_bio_read_page(swsusp_resume_block, swsusp_header, NULL);
|
||||
if (!memcmp("SWAP-SPACE",swsusp_header->sig, 10) ||
|
||||
!memcmp("SWAPSPACE2",swsusp_header->sig, 10)) {
|
||||
memcpy(swsusp_header->orig_sig,swsusp_header->sig, 10);
|
||||
memcpy(swsusp_header->sig,SWSUSP_SIG, 10);
|
||||
swsusp_header->image = start;
|
||||
swsusp_header->image = handle->first_sector;
|
||||
swsusp_header->flags = flags;
|
||||
error = bio_write_page(swsusp_resume_block,
|
||||
error = hib_bio_write_page(swsusp_resume_block,
|
||||
swsusp_header, NULL);
|
||||
} else {
|
||||
printk(KERN_ERR "PM: Swap header not found!\n");
|
||||
@ -260,25 +208,26 @@ static int mark_swapfiles(sector_t start, unsigned int flags)
|
||||
/**
|
||||
* swsusp_swap_check - check if the resume device is a swap device
|
||||
* and get its index (if so)
|
||||
*
|
||||
* This is called before saving image
|
||||
*/
|
||||
|
||||
static int swsusp_swap_check(void) /* This is called before saving image */
|
||||
static int swsusp_swap_check(void)
|
||||
{
|
||||
int res;
|
||||
|
||||
res = swap_type_of(swsusp_resume_device, swsusp_resume_block,
|
||||
&resume_bdev);
|
||||
&hib_resume_bdev);
|
||||
if (res < 0)
|
||||
return res;
|
||||
|
||||
root_swap = res;
|
||||
res = blkdev_get(resume_bdev, FMODE_WRITE);
|
||||
res = blkdev_get(hib_resume_bdev, FMODE_WRITE);
|
||||
if (res)
|
||||
return res;
|
||||
|
||||
res = set_blocksize(resume_bdev, PAGE_SIZE);
|
||||
res = set_blocksize(hib_resume_bdev, PAGE_SIZE);
|
||||
if (res < 0)
|
||||
blkdev_put(resume_bdev, FMODE_WRITE);
|
||||
blkdev_put(hib_resume_bdev, FMODE_WRITE);
|
||||
|
||||
return res;
|
||||
}
|
||||
@ -309,42 +258,9 @@ static int write_page(void *buf, sector_t offset, struct bio **bio_chain)
|
||||
} else {
|
||||
src = buf;
|
||||
}
|
||||
return bio_write_page(offset, src, bio_chain);
|
||||
return hib_bio_write_page(offset, src, bio_chain);
|
||||
}
|
||||
|
||||
/*
|
||||
* The swap map is a data structure used for keeping track of each page
|
||||
* written to a swap partition. It consists of many swap_map_page
|
||||
* structures that contain each an array of MAP_PAGE_SIZE swap entries.
|
||||
* These structures are stored on the swap and linked together with the
|
||||
* help of the .next_swap member.
|
||||
*
|
||||
* The swap map is created during suspend. The swap map pages are
|
||||
* allocated and populated one at a time, so we only need one memory
|
||||
* page to set up the entire structure.
|
||||
*
|
||||
* During resume we also only need to use one swap_map_page structure
|
||||
* at a time.
|
||||
*/
|
||||
|
||||
#define MAP_PAGE_ENTRIES (PAGE_SIZE / sizeof(sector_t) - 1)
|
||||
|
||||
struct swap_map_page {
|
||||
sector_t entries[MAP_PAGE_ENTRIES];
|
||||
sector_t next_swap;
|
||||
};
|
||||
|
||||
/**
|
||||
* The swap_map_handle structure is used for handling swap in
|
||||
* a file-alike way
|
||||
*/
|
||||
|
||||
struct swap_map_handle {
|
||||
struct swap_map_page *cur;
|
||||
sector_t cur_swap;
|
||||
unsigned int k;
|
||||
};
|
||||
|
||||
static void release_swap_writer(struct swap_map_handle *handle)
|
||||
{
|
||||
if (handle->cur)
|
||||
@ -354,16 +270,33 @@ static void release_swap_writer(struct swap_map_handle *handle)
|
||||
|
||||
static int get_swap_writer(struct swap_map_handle *handle)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = swsusp_swap_check();
|
||||
if (ret) {
|
||||
if (ret != -ENOSPC)
|
||||
printk(KERN_ERR "PM: Cannot find swap device, try "
|
||||
"swapon -a.\n");
|
||||
return ret;
|
||||
}
|
||||
handle->cur = (struct swap_map_page *)get_zeroed_page(GFP_KERNEL);
|
||||
if (!handle->cur)
|
||||
return -ENOMEM;
|
||||
if (!handle->cur) {
|
||||
ret = -ENOMEM;
|
||||
goto err_close;
|
||||
}
|
||||
handle->cur_swap = alloc_swapdev_block(root_swap);
|
||||
if (!handle->cur_swap) {
|
||||
release_swap_writer(handle);
|
||||
return -ENOSPC;
|
||||
ret = -ENOSPC;
|
||||
goto err_rel;
|
||||
}
|
||||
handle->k = 0;
|
||||
handle->first_sector = handle->cur_swap;
|
||||
return 0;
|
||||
err_rel:
|
||||
release_swap_writer(handle);
|
||||
err_close:
|
||||
swsusp_close(FMODE_WRITE);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int swap_write_page(struct swap_map_handle *handle, void *buf,
|
||||
@ -380,7 +313,7 @@ static int swap_write_page(struct swap_map_handle *handle, void *buf,
|
||||
return error;
|
||||
handle->cur->entries[handle->k++] = offset;
|
||||
if (handle->k >= MAP_PAGE_ENTRIES) {
|
||||
error = wait_on_bio_chain(bio_chain);
|
||||
error = hib_wait_on_bio_chain(bio_chain);
|
||||
if (error)
|
||||
goto out;
|
||||
offset = alloc_swapdev_block(root_swap);
|
||||
@ -406,6 +339,24 @@ static int flush_swap_writer(struct swap_map_handle *handle)
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
static int swap_writer_finish(struct swap_map_handle *handle,
|
||||
unsigned int flags, int error)
|
||||
{
|
||||
if (!error) {
|
||||
flush_swap_writer(handle);
|
||||
printk(KERN_INFO "PM: S");
|
||||
error = mark_swapfiles(handle, flags);
|
||||
printk("|\n");
|
||||
}
|
||||
|
||||
if (error)
|
||||
free_all_swap_pages(root_swap);
|
||||
release_swap_writer(handle);
|
||||
swsusp_close(FMODE_WRITE);
|
||||
|
||||
return error;
|
||||
}
|
||||
|
||||
/**
|
||||
* save_image - save the suspend image data
|
||||
*/
|
||||
@ -431,7 +382,7 @@ static int save_image(struct swap_map_handle *handle,
|
||||
bio = NULL;
|
||||
do_gettimeofday(&start);
|
||||
while (1) {
|
||||
ret = snapshot_read_next(snapshot, PAGE_SIZE);
|
||||
ret = snapshot_read_next(snapshot);
|
||||
if (ret <= 0)
|
||||
break;
|
||||
ret = swap_write_page(handle, data_of(*snapshot), &bio);
|
||||
@ -441,7 +392,7 @@ static int save_image(struct swap_map_handle *handle,
|
||||
printk(KERN_CONT "\b\b\b\b%3d%%", nr_pages / m);
|
||||
nr_pages++;
|
||||
}
|
||||
err2 = wait_on_bio_chain(&bio);
|
||||
err2 = hib_wait_on_bio_chain(&bio);
|
||||
do_gettimeofday(&stop);
|
||||
if (!ret)
|
||||
ret = err2;
|
||||
@ -483,50 +434,34 @@ int swsusp_write(unsigned int flags)
|
||||
struct swap_map_handle handle;
|
||||
struct snapshot_handle snapshot;
|
||||
struct swsusp_info *header;
|
||||
unsigned long pages;
|
||||
int error;
|
||||
|
||||
error = swsusp_swap_check();
|
||||
pages = snapshot_get_image_size();
|
||||
error = get_swap_writer(&handle);
|
||||
if (error) {
|
||||
printk(KERN_ERR "PM: Cannot find swap device, try "
|
||||
"swapon -a.\n");
|
||||
printk(KERN_ERR "PM: Cannot get swap writer\n");
|
||||
return error;
|
||||
}
|
||||
if (!enough_swap(pages)) {
|
||||
printk(KERN_ERR "PM: Not enough free swap\n");
|
||||
error = -ENOSPC;
|
||||
goto out_finish;
|
||||
}
|
||||
memset(&snapshot, 0, sizeof(struct snapshot_handle));
|
||||
error = snapshot_read_next(&snapshot, PAGE_SIZE);
|
||||
error = snapshot_read_next(&snapshot);
|
||||
if (error < PAGE_SIZE) {
|
||||
if (error >= 0)
|
||||
error = -EFAULT;
|
||||
|
||||
goto out;
|
||||
goto out_finish;
|
||||
}
|
||||
header = (struct swsusp_info *)data_of(snapshot);
|
||||
if (!enough_swap(header->pages)) {
|
||||
printk(KERN_ERR "PM: Not enough free swap\n");
|
||||
error = -ENOSPC;
|
||||
goto out;
|
||||
}
|
||||
error = get_swap_writer(&handle);
|
||||
if (!error) {
|
||||
sector_t start = handle.cur_swap;
|
||||
|
||||
error = swap_write_page(&handle, header, NULL);
|
||||
if (!error)
|
||||
error = save_image(&handle, &snapshot,
|
||||
header->pages - 1);
|
||||
|
||||
if (!error) {
|
||||
flush_swap_writer(&handle);
|
||||
printk(KERN_INFO "PM: S");
|
||||
error = mark_swapfiles(start, flags);
|
||||
printk("|\n");
|
||||
}
|
||||
}
|
||||
if (error)
|
||||
free_all_swap_pages(root_swap);
|
||||
|
||||
release_swap_writer(&handle);
|
||||
out:
|
||||
swsusp_close(FMODE_WRITE);
|
||||
error = swap_write_page(&handle, header, NULL);
|
||||
if (!error)
|
||||
error = save_image(&handle, &snapshot, pages - 1);
|
||||
out_finish:
|
||||
error = swap_writer_finish(&handle, flags, error);
|
||||
return error;
|
||||
}
|
||||
|
||||
@ -542,18 +477,21 @@ static void release_swap_reader(struct swap_map_handle *handle)
|
||||
handle->cur = NULL;
|
||||
}
|
||||
|
||||
static int get_swap_reader(struct swap_map_handle *handle, sector_t start)
|
||||
static int get_swap_reader(struct swap_map_handle *handle,
|
||||
unsigned int *flags_p)
|
||||
{
|
||||
int error;
|
||||
|
||||
if (!start)
|
||||
*flags_p = swsusp_header->flags;
|
||||
|
||||
if (!swsusp_header->image) /* how can this happen? */
|
||||
return -EINVAL;
|
||||
|
||||
handle->cur = (struct swap_map_page *)get_zeroed_page(__GFP_WAIT | __GFP_HIGH);
|
||||
if (!handle->cur)
|
||||
return -ENOMEM;
|
||||
|
||||
error = bio_read_page(start, handle->cur, NULL);
|
||||
error = hib_bio_read_page(swsusp_header->image, handle->cur, NULL);
|
||||
if (error) {
|
||||
release_swap_reader(handle);
|
||||
return error;
|
||||
@ -573,21 +511,28 @@ static int swap_read_page(struct swap_map_handle *handle, void *buf,
|
||||
offset = handle->cur->entries[handle->k];
|
||||
if (!offset)
|
||||
return -EFAULT;
|
||||
error = bio_read_page(offset, buf, bio_chain);
|
||||
error = hib_bio_read_page(offset, buf, bio_chain);
|
||||
if (error)
|
||||
return error;
|
||||
if (++handle->k >= MAP_PAGE_ENTRIES) {
|
||||
error = wait_on_bio_chain(bio_chain);
|
||||
error = hib_wait_on_bio_chain(bio_chain);
|
||||
handle->k = 0;
|
||||
offset = handle->cur->next_swap;
|
||||
if (!offset)
|
||||
release_swap_reader(handle);
|
||||
else if (!error)
|
||||
error = bio_read_page(offset, handle->cur, NULL);
|
||||
error = hib_bio_read_page(offset, handle->cur, NULL);
|
||||
}
|
||||
return error;
|
||||
}
|
||||
|
||||
static int swap_reader_finish(struct swap_map_handle *handle)
|
||||
{
|
||||
release_swap_reader(handle);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* load_image - load the image using the swap map handle
|
||||
* @handle and the snapshot handle @snapshot
|
||||
@ -615,21 +560,21 @@ static int load_image(struct swap_map_handle *handle,
|
||||
bio = NULL;
|
||||
do_gettimeofday(&start);
|
||||
for ( ; ; ) {
|
||||
error = snapshot_write_next(snapshot, PAGE_SIZE);
|
||||
error = snapshot_write_next(snapshot);
|
||||
if (error <= 0)
|
||||
break;
|
||||
error = swap_read_page(handle, data_of(*snapshot), &bio);
|
||||
if (error)
|
||||
break;
|
||||
if (snapshot->sync_read)
|
||||
error = wait_on_bio_chain(&bio);
|
||||
error = hib_wait_on_bio_chain(&bio);
|
||||
if (error)
|
||||
break;
|
||||
if (!(nr_pages % m))
|
||||
printk("\b\b\b\b%3d%%", nr_pages / m);
|
||||
nr_pages++;
|
||||
}
|
||||
err2 = wait_on_bio_chain(&bio);
|
||||
err2 = hib_wait_on_bio_chain(&bio);
|
||||
do_gettimeofday(&stop);
|
||||
if (!error)
|
||||
error = err2;
|
||||
@ -657,20 +602,20 @@ int swsusp_read(unsigned int *flags_p)
|
||||
struct snapshot_handle snapshot;
|
||||
struct swsusp_info *header;
|
||||
|
||||
*flags_p = swsusp_header->flags;
|
||||
|
||||
memset(&snapshot, 0, sizeof(struct snapshot_handle));
|
||||
error = snapshot_write_next(&snapshot, PAGE_SIZE);
|
||||
error = snapshot_write_next(&snapshot);
|
||||
if (error < PAGE_SIZE)
|
||||
return error < 0 ? error : -EFAULT;
|
||||
header = (struct swsusp_info *)data_of(snapshot);
|
||||
error = get_swap_reader(&handle, swsusp_header->image);
|
||||
error = get_swap_reader(&handle, flags_p);
|
||||
if (error)
|
||||
goto end;
|
||||
if (!error)
|
||||
error = swap_read_page(&handle, header, NULL);
|
||||
if (!error)
|
||||
error = load_image(&handle, &snapshot, header->pages - 1);
|
||||
release_swap_reader(&handle);
|
||||
|
||||
swap_reader_finish(&handle);
|
||||
end:
|
||||
if (!error)
|
||||
pr_debug("PM: Image successfully loaded\n");
|
||||
else
|
||||
@ -686,11 +631,11 @@ int swsusp_check(void)
|
||||
{
|
||||
int error;
|
||||
|
||||
resume_bdev = open_by_devnum(swsusp_resume_device, FMODE_READ);
|
||||
if (!IS_ERR(resume_bdev)) {
|
||||
set_blocksize(resume_bdev, PAGE_SIZE);
|
||||
hib_resume_bdev = open_by_devnum(swsusp_resume_device, FMODE_READ);
|
||||
if (!IS_ERR(hib_resume_bdev)) {
|
||||
set_blocksize(hib_resume_bdev, PAGE_SIZE);
|
||||
memset(swsusp_header, 0, PAGE_SIZE);
|
||||
error = bio_read_page(swsusp_resume_block,
|
||||
error = hib_bio_read_page(swsusp_resume_block,
|
||||
swsusp_header, NULL);
|
||||
if (error)
|
||||
goto put;
|
||||
@ -698,7 +643,7 @@ int swsusp_check(void)
|
||||
if (!memcmp(SWSUSP_SIG, swsusp_header->sig, 10)) {
|
||||
memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10);
|
||||
/* Reset swap signature now */
|
||||
error = bio_write_page(swsusp_resume_block,
|
||||
error = hib_bio_write_page(swsusp_resume_block,
|
||||
swsusp_header, NULL);
|
||||
} else {
|
||||
error = -EINVAL;
|
||||
@ -706,11 +651,11 @@ int swsusp_check(void)
|
||||
|
||||
put:
|
||||
if (error)
|
||||
blkdev_put(resume_bdev, FMODE_READ);
|
||||
blkdev_put(hib_resume_bdev, FMODE_READ);
|
||||
else
|
||||
pr_debug("PM: Signature found, resuming\n");
|
||||
} else {
|
||||
error = PTR_ERR(resume_bdev);
|
||||
error = PTR_ERR(hib_resume_bdev);
|
||||
}
|
||||
|
||||
if (error)
|
||||
@ -725,12 +670,12 @@ put:
|
||||
|
||||
void swsusp_close(fmode_t mode)
|
||||
{
|
||||
if (IS_ERR(resume_bdev)) {
|
||||
if (IS_ERR(hib_resume_bdev)) {
|
||||
pr_debug("PM: Image device not initialised\n");
|
||||
return;
|
||||
}
|
||||
|
||||
blkdev_put(resume_bdev, mode);
|
||||
blkdev_put(hib_resume_bdev, mode);
|
||||
}
|
||||
|
||||
static int swsusp_header_init(void)
|
||||
|
@ -151,6 +151,7 @@ static ssize_t snapshot_read(struct file *filp, char __user *buf,
|
||||
{
|
||||
struct snapshot_data *data;
|
||||
ssize_t res;
|
||||
loff_t pg_offp = *offp & ~PAGE_MASK;
|
||||
|
||||
mutex_lock(&pm_mutex);
|
||||
|
||||
@ -159,14 +160,19 @@ static ssize_t snapshot_read(struct file *filp, char __user *buf,
|
||||
res = -ENODATA;
|
||||
goto Unlock;
|
||||
}
|
||||
res = snapshot_read_next(&data->handle, count);
|
||||
if (res > 0) {
|
||||
if (copy_to_user(buf, data_of(data->handle), res))
|
||||
res = -EFAULT;
|
||||
else
|
||||
*offp = data->handle.offset;
|
||||
if (!pg_offp) { /* on page boundary? */
|
||||
res = snapshot_read_next(&data->handle);
|
||||
if (res <= 0)
|
||||
goto Unlock;
|
||||
} else {
|
||||
res = PAGE_SIZE - pg_offp;
|
||||
}
|
||||
|
||||
res = simple_read_from_buffer(buf, count, &pg_offp,
|
||||
data_of(data->handle), res);
|
||||
if (res > 0)
|
||||
*offp += res;
|
||||
|
||||
Unlock:
|
||||
mutex_unlock(&pm_mutex);
|
||||
|
||||
@ -178,18 +184,25 @@ static ssize_t snapshot_write(struct file *filp, const char __user *buf,
|
||||
{
|
||||
struct snapshot_data *data;
|
||||
ssize_t res;
|
||||
loff_t pg_offp = *offp & ~PAGE_MASK;
|
||||
|
||||
mutex_lock(&pm_mutex);
|
||||
|
||||
data = filp->private_data;
|
||||
res = snapshot_write_next(&data->handle, count);
|
||||
if (res > 0) {
|
||||
if (copy_from_user(data_of(data->handle), buf, res))
|
||||
res = -EFAULT;
|
||||
else
|
||||
*offp = data->handle.offset;
|
||||
|
||||
if (!pg_offp) {
|
||||
res = snapshot_write_next(&data->handle);
|
||||
if (res <= 0)
|
||||
goto unlock;
|
||||
} else {
|
||||
res = PAGE_SIZE - pg_offp;
|
||||
}
|
||||
|
||||
res = simple_write_to_buffer(data_of(data->handle), res, &pg_offp,
|
||||
buf, count);
|
||||
if (res > 0)
|
||||
*offp += res;
|
||||
unlock:
|
||||
mutex_unlock(&pm_mutex);
|
||||
|
||||
return res;
|
||||
|
@ -495,7 +495,7 @@ void ieee80211_recalc_ps(struct ieee80211_local *local, s32 latency)
|
||||
s32 beaconint_us;
|
||||
|
||||
if (latency < 0)
|
||||
latency = pm_qos_requirement(PM_QOS_NETWORK_LATENCY);
|
||||
latency = pm_qos_request(PM_QOS_NETWORK_LATENCY);
|
||||
|
||||
beaconint_us = ieee80211_tu_to_usec(
|
||||
found->vif.bss_conf.beacon_int);
|
||||
|
@ -648,9 +648,6 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count)
|
||||
substream->number = idx;
|
||||
substream->stream = stream;
|
||||
sprintf(substream->name, "subdevice #%i", idx);
|
||||
snprintf(substream->latency_id, sizeof(substream->latency_id),
|
||||
"ALSA-PCM%d-%d%c%d", pcm->card->number, pcm->device,
|
||||
(stream ? 'c' : 'p'), idx);
|
||||
substream->buffer_bytes_max = UINT_MAX;
|
||||
if (prev == NULL)
|
||||
pstr->substream = substream;
|
||||
|
@ -484,11 +484,13 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
|
||||
snd_pcm_timer_resolution_change(substream);
|
||||
runtime->status->state = SNDRV_PCM_STATE_SETUP;
|
||||
|
||||
pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
substream->latency_id);
|
||||
if (substream->latency_pm_qos_req) {
|
||||
pm_qos_remove_request(substream->latency_pm_qos_req);
|
||||
substream->latency_pm_qos_req = NULL;
|
||||
}
|
||||
if ((usecs = period_to_usecs(runtime)) >= 0)
|
||||
pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
substream->latency_id, usecs);
|
||||
substream->latency_pm_qos_req = pm_qos_add_request(
|
||||
PM_QOS_CPU_DMA_LATENCY, usecs);
|
||||
return 0;
|
||||
_error:
|
||||
/* hardware might be unuseable from this time,
|
||||
@ -543,8 +545,8 @@ static int snd_pcm_hw_free(struct snd_pcm_substream *substream)
|
||||
if (substream->ops->hw_free)
|
||||
result = substream->ops->hw_free(substream);
|
||||
runtime->status->state = SNDRV_PCM_STATE_OPEN;
|
||||
pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
|
||||
substream->latency_id);
|
||||
pm_qos_remove_request(substream->latency_pm_qos_req);
|
||||
substream->latency_pm_qos_req = NULL;
|
||||
return result;
|
||||
}
|
||||
|
||||
|
Loading…
x
Reference in New Issue
Block a user