Got the following oops just before reboot:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[<8028d300>] (__list_del_entry+0x44/0xac)
[<802e3320>] (__fw_load_abort.part.13+0x1c/0x50)
[<802e337c>] (fw_shutdown_notify+0x28/0x50)
[<80034f80>] (notifier_call_chain.isra.1+0x5c/0x9c)
[<800350ec>] (__blocking_notifier_call_chain+0x44/0x58)
[<80035114>] (blocking_notifier_call_chain+0x14/0x18)
[<80035d64>] (kernel_restart_prepare+0x14/0x38)
[<80035d94>] (kernel_restart+0xc/0x50)
The following race condition triggers here:
_request_firmware_load()
device_create_file(...)
kobject_uevent(...)
(schedule)
(resume)
firmware_loading_store(1)
firmware_loading_store(0)
list_del_init(&buf->pending_list)
(schedule)
(resume)
list_add(&buf->pending_list, &pending_fw_head);
wait_for_completion(&buf->completion);
causing an oops later when walking pending_list after the firmware has
been released.
The proposed fix is to move the list_add() before sysfs attribute
creation.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Acked-by: Ming Lei <ming.lei@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
device_hotplug_lock is held around the acpi_bus_trim() call in
acpi_scan_hot_remove() which generally removes devices (it removes
ACPI device objects at least, but it may also remove "physical"
device objects through .detach() callbacks of ACPI scan handlers).
Thus, potentially, device sysfs attributes are removed under that
lock and to remove those attributes it is necessary to hold the
s_active references of their directory entries for writing.
On the other hand, the execution of a .show() or .store() callback
from a sysfs attribute is carried out with that attribute's s_active
reference held for reading. Consequently, if any device sysfs
attribute that may be removed from within acpi_scan_hot_remove()
through acpi_bus_trim() has a .store() or .show() callback which
acquires device_hotplug_lock, the execution of that callback may
deadlock with the removal of the attribute. [Unfortunately, the
"online" device attribute of CPUs and memory blocks is one of them.]
To avoid such deadlocks, make all of the sysfs attribute callbacks
that need to lock device hotplug, for example store_online(), use
a special function, lock_device_hotplug_sysfs(), to lock device
hotplug and return the result of that function immediately if it is
not zero. This will cause the s_active reference of the directory
entry in question to be released and the syscall to be restarted
if device_hotplug_lock cannot be acquired.
[show_online() actually doesn't need to lock device hotplug, but
it is useful to serialize it with respect to device_offline() and
device_online() for the same device (in case user space attempts to
run them concurrently) which can be done with the help of
device_lock().]
Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-and-tested-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
With devices which have a dense and small register map but placed at a large
offset the global cache_present bitmap imposes a huge memory overhead. Making
the cache_present per rbtree node avoids the issue and easily reduces the memory
footprint by a factor of ten. For devices with a more sparse map or without a
large base register offset the memory usage might increase slightly by a few
bytes, but not significantly. E.g. for a device which has ~50 registers at
offset 0x4000 the memory footprint of the register cache goes down form 2496
bytes to 175 bytes.
Moving the bitmap to a per node basis means that the handling of the bitmap is
now cache implementation specific and can no longer be managed by the core. The
regcache_sync_block() function is extended by a additional parameter so that the
cache implementation can tell the core which registers in the block are set and
which are not. The parameter is optional and if NULL the core assumes that all
registers are set. The rbtree cache also needs to implement its own drop
callback instead of relying on the core to handle this.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Support for reducing the number of nodes and memory consumption of the rbtree
cache by allowing for small unused holes in the node's register cache block was
initially added in commit 0c7ed856 ("regmap: Cut down on the average # of nodes
in the rbtree cache"). But the commit had problems and so its effect was
reverted again in commit 4e67fb5 ("regmap: rbtree: Fix overlapping rbnodes.").
This patch brings the feature back of reducing the average number of nodes,
which will speedup node look-up, while at the same time also reducing the memory
usage of the rbtree cache. This patch takes a slightly different approach than
the original patch though. It modifies the adjacent node look-up to not only
consider nodes that are just one to the left or the right of the register but
any node that falls in a certain range around the register. The range is
calculated based on how much memory it would take to allocate a new node
compared to how much memory it takes adding a set of unused registers to an
existing node. E.g. if a node takes up 24 bytes and each register in a block
uses 1 byte the range will be from the register address - 24 to the register
address + 24. If we find a node that falls within this range it is cheaper or as
expensive to add the register to the existing node and have a couple of unused
registers in the node's cache compared to allocating a new node.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
A register which is adjacent to a node will either be left to the first
register or right to the last register. It will not be within the node's range,
so there is no point in checking for each register cached by the node whether
the new register is next to it. It is sufficient to check whether the register
comes before the first register or after the last register of the node.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
The regmap_debugfs_get_dump_start() function maps from a file offset to the
register that can be found at that position in the file. This is done using a
look-up table. Commit d6814a7d ("regmap: debugfs: Suppress cache for partial
register files") added a check to bypass the look-up table for partial register
files, since the offsets in that table are only correct for the full register
file. The check incorrectly uses the file offset instead of the register base
address and returns it. This will cause the file offset to be interpreted as a
register address which will result in a incorrect output from the registers file
for all reads except at position 0.
The issue can easily be reproduced by doing small reads the registers file, e.g.
`dd if=registers bs=10 count=5`.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
This is needed to fix the build on sh systems.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a couple of calculations, which convert between register addresses and
block indices, in regcache_rbtree_sync() and regcache_rbtree_node_alloc() which
assume that reg_stride is 1. This will break the rb cache for configurations
which do not use a reg_stride of 1.
Also rename 'base' in regcache_rbtree_sync() to 'start' to avoid confusion with
'base_reg'.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
This patch cleans the initialization of dma contiguous framework. The
all-in-one dma_declare_contiguous() function is now separated into
dma_contiguous_reserve_area() which only steals the the memory from
memblock allocator and dma_contiguous_add_device() function, which
assigns given device to the specified reserved memory area. This improves
the flexibility in defining contiguous memory areas and assigning device
to them, because now it is possible to assign more than one device to
the given contiguous memory area. Such split in initialization procedure
is also required for upcoming device tree support.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Tomasz Figa <t.figa@samsung.com>
* pm-cpufreq: (60 commits)
cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
cpufreq: arm_big_little: remove device tree parsing for cpu nodes
cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
ARM: mvebu: remove device tree parsing for cpu nodes
ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
of/device: add helper to get cpu device node from logical cpu index
driver/core: cpu: initialize of_node in cpu's device struture
ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
of: move of_get_cpu_node implementation to DT core library
powerpc: refactor of_get_cpu_node to support other architectures
openrisc: remove undefined of_get_cpu_node declaration
microblaze: remove undefined of_get_cpu_node declaration
cpufreq: fix bad unlock balance on !CONFIG_SMP
...
Use __ATTR_RW() instead of __ATTR() to make it more obvious what the
type of attribute is being created.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use DEVICE_ATTR_RO() instead of a "raw" __ATTR macro, making it easier
to audit exactly what is going on with the sysfs files.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two bus attributes that can better be defined using
DRIVER_ATTR_WO(), so convert them to the new macro, making it easier to
audit attribute permissions.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead. This converts the platform bus code to use
the correct field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gotta love a macro that doesn't reduce the typing you have to do.
Also, only the driver core, and one network driver uses this. The
driver core functions will be going away soon, and I'll convert the
network driver soon to not need this as well, so delete it for now
before anyone else gets some bright ideas and wants to use it.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These functions are being open-coded in 3 different places in the driver
core, and other driver subsystems will want to start doing this as well,
so move it to the sysfs core to keep it all in one place, where we know
it is written properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two ways to set the online/offline state for a memory block:
echo 0|1 > online and echo online|online_kernel|online_movable|offline >
state.
The state attribute can online a memory block with extra data, the
"online type", where the online attribute uses a default online type of
ONLINE_KEEP, same as echo online > state.
Currently there is a state_mutex that provides consistency between the
memory block state and the underlying memory.
The problem is that this code does a lot of things that the common
device layer can do for us, such as the serialization of the
online/offline handlers using the device lock, setting the dev->offline
field, and calling kobject_uevent().
This patch refactors the online/offline code to allow the common
device_[online|offline] functions to be used. The result is a simpler
and more common code path for the two state setting mechanisms. It also
removes the state_mutex from the struct memory_block as the memory block
device lock provides the state consistency.
No functional change is intended by this patch.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Right now memory_dev_init() maintains the memory block pointer
between iterations of add_memory_section(). This is nasty.
This patch refactors add_memory_section() to become add_memory_block().
The refactoring pulls the section scanning out of memory_dev_init()
and simplifies the signature.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The path through add_memory_section() when the memory block already
exists uses flawed refcounting logic. A get_device() is done on a
memory block using a pointer that might not be valid as we dropped
our previous reference and didn't obtain a new reference in the
proper way.
Lets stop pretending and just remove the get/put. The
mem_sysfs_mutex, which we hold over the entire init loop now, will
prevent the memory blocks from disappearing from under us.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that add_memory_section() is only called from boot time, reduce
the logic and remove the enum.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
add_memory_section() is currently called from both boot time and run
time via hotplug and there is a lot of nastiness to allow for shared
code including an enum parameter to convey the calling context to
add_memory_section().
This patch is the first step in breaking up the messy code sharing by
pulling the hotplug path for add_memory_section() directly into
register_new_memory().
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the [get|put]_device functions for ref'ing the memory block device
rather than the kobject functions which should be hidden away by the
device layer.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The error variable is not needed.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no point in releasing the mutex for each section that is added
during boot time. Just hold it over the entire initialization loop.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid overlapping register regions by making the initial blklen of a new
node 1. If a register write occurs to a yet uncached register, that is
lower than but near an existing node's base_reg, a new node is created
and it's blklen is set to an arbitrary value (sizeof(*rbnode)). That may
cause this node to overlap with another node. Those nodes should be merged,
but this merge doesn't happen yet, so this patch at least makes the initial
blklen small enough to avoid hitting the wrong node, which may otherwise
lead to severe breakage.
Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
CPUs are also registered as devices but the of_node in these cpu
devices are not initialized. Currently different drivers requiring
to access cpu device node are parsing the nodes themselves and
initialising the of_node in cpu device.
The of_node in all the cpu devices needs to be initialized properly
and at one place. The best place to update this is CPU subsystem
driver when registering the cpu devices.
The OF/DT core library now provides of_get_cpu_node to retrieve a cpu
device node for a given logical index by abstracting the architecture
specific details.
This patch uses of_get_cpu_node to assign of_node when registering the
cpu devices.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__init belongs after the return type on functions, not before it.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It may be useful to register multiple patches with regmap, for example
one that depends on the device revision and one that depends on the system
configuration. Add support for doing this, appending any new patches to
the existing patches.
Signed-off-by: Mark Brown <broonie@linaro.org>
There is a potential race condition between cpu_subsys_online()
and either acpi_processor_remove() or remove_memory() that execute
try_offline_node(). Namely, it is possible that cpu_subsys_online()
will run right after the CPUs NUMA node has been put offline and
cpu_to_node() executed by it will return NUMA_NO_NODE (-1). In
that case the CPU is gone and it doesn't make sense to call cpu_up()
for it, so make cpu_subsys_online() return -ENODEV then.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
attribute groups are much more flexible than just a list of attributes,
due to their support for visibility of the attributes, and binary
attributes. Add bus_groups to struct bus_type which should be used
instead of bus_attrs.
bus_attrs will be removed from the structure soon.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
attribute groups are much more flexible than just a list of attributes,
due to their support for visibility of the attributes, and binary
attributes. Add drv_groups to struct bus_type which should be used
instead of drv_attrs.
drv_attrs will be removed from the structure soon.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
attribute groups are much more flexible than just a list of attributes,
due to their support for visibility of the attributes, and binary
attributes. Add dev_groups to struct bus_type which should be used
instead of dev_attrs.
dev_attrs will be removed from the structure soon.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The regmap_writeable() check should not be done in
regcache_write() because this prevents read-only
registers to be cached. After a read on a read-only
register its value will not be stored in the cache
and the next time someone will try to read it the
value will be read from the bus instead of the
cache.
Instead the regmap_writeable() check should be done
in _regmap_write() to prevent callers from writing
to read-only registers.
Signed-off-by: Ionut Nicu <ioan.nicu.ext@nsn.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
In the initial case when no reg_defaults values are
provided and no register value was added to the cache
yet, the cache_present bitmap is NULL. If this function
is invoked for any register it should return false
(i.e. the register is not cached) instead of true.
Signed-off-by: Ionut Nicu <ioan.nicu.ext@nsn.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
I see no reason why a virtual range shouldn't be allowed to cover its
own data window if the page selection register is in the same place
on every page.
For chips which use paged access for all of their registers, but only
when connected via I2C, and which can access the whole register space
directly when connected via SPI, this allows to avoid acrobatics with
the register ranges by simply mapping the I2C ranges over the data
window beginning at 0x0, and then using linear access for the SPI
variant.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Export opp_add() so that modules can use it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
regcache_sync_block_raw_flush() expects the address of the register after last
register that needs to be synced as its parameter. But the last call to
regcache_sync_block_raw_flush() in regcache_sync_block_raw() passes the address
of the last register in the block. This effectively always skips over the last
register in a block, even if it needs to be synced. In order to fix it increase
the address by one register.
The issue was introduced in commit 75a5f89 ("regmap: cache: Write consecutive
registers in a single block write").
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>