29806 Commits

Author SHA1 Message Date
Hans de Goede
f4e9c82f64 watchdog: Add Locking support
This patch fixes some potential multithreading issues, despite only
allowing one process to open the /dev/watchdog device, we can still get
called multiple times at the same time, since a program could be using thread,
or could share the fd after a fork.

This causes 2 potential problems:
1) watchdog_start / open do an unlocked test_n_set / test_n_clear,
   if these 2 race, the watchdog could be stopped while the active
   bit indicates it is running or visa versa.

2) Most watchdog_dev drivers probably assume that only one
   watchdog-op will get called at a time, this is not necessary
   true atm.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-05-30 07:55:23 +02:00
Alan Cox
d6b469d915 watchdog: create all the proper device files
Create the watchdog class and it's associated devices.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-05-30 07:54:46 +02:00
Alan Cox
2bbeed016d watchdog: Add a flag to indicate the watchdog doesn't reboot things
Some watchdogs merely trigger external alarms and controls. In a managed
environment this is very useful but we want drivers to be able to figure
out which is which now multiple dogs can be loaded. Thus add an ALARMONLY
feature flag.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-05-30 07:54:40 +02:00
Alan Cox
45f5fed30a watchdog: Add multiple device support
We keep the old /dev/watchdog interface file for the first watchdog via
miscdev. This is basically a cut and paste of the relevant interface code
from the rtc driver layer tweaked for watchdog.

Revised to fix problems noted by Hans de Goede

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-05-30 07:54:25 +02:00
Viresh Kumar
257f8c4aae watchdog: Add watchdog_active() routine
Some watchdog may need to check if watchdog is ACTIVE or not, for example in
their suspend/resume hooks.

This patch adds this routine and changes the core drivers to use it.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-05-30 07:53:46 +02:00
Linus Torvalds
87a5af24e5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull EDAC internal API changes from Mauro Carvalho Chehab:
 "This changeset is the first part of a series of patches that fixes the
  EDAC sybsystem.  On this set, it changes the Kernel EDAC API in order
  to properly represent the Intel i3/i5/i7, Xeon 3xxx/5xxx/7xxx, and
  Intel E5-xxxx memory controllers.

  The EDAC core used to assume that:

       - the DRAM chip select pin is directly accessed by the memory
         controller

       - when multiple channels are used, they're all filled with the
         same type of memory.

  None of the above premises is true on Intel memory controllers since
  2002, when RAMBUS and FB-DIMMs were introduced, and Advanced Memory
  Buffer or by some similar technologies hides the direct access to the
  DRAM pins.

  So, the existing drivers for those chipsets had to lie to the EDAC
  core, in general telling that just one channel is filled.  That
  produces some hard to understand error messages like:

       EDAC MC0: CE row 3, channel 0, label "DIMM1": 1 Unknown error(s): memory read error on FATAL area : cpu=0 Err=0008:00c2 (ch=2), addr = 0xad1f73480 => socket=0, Channel=0(mask=2), rank=1

  The location information there (row3 channel 0) is completely bogus:
  it has no physical meaning, and are just some random values that the
  driver uses to talk with the EDAC core.  The error actually happened
  at CPU socket 0, channel 0, slot 1, but this is not reported anywhere,
  as the EDAC core doesn't know anything about the memory layout.  So,
  only advanced users that know how the EDAC driver works and that tests
  their systems to see how DIMMs are mapped can actually benefit for
  such error logs.

  This patch series fixes the error report logic, in order to allow the
  EDAC to expose the memory architecture used by them to the EDAC core.
  So, as the EDAC core now understands how the memory is organized, it
  can provide an useful report:

       EDAC MC0: CE memory read error on DIMM1 (channel:0 slot:1 page:0x364b1b offset:0x600 grain:32 syndrome:0x0 - count:1 area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:4)

  The location of the DIMM where the error happened is reported by "MC0"
  (cpu socket #0), at "channel:0 slot:1" location, and matches the
  physical location of the DIMM.

  There are two remaining issues not covered by this patch series:

       - The EDAC sysfs API will still report bogus values.  So,
         userspace tools like edac-utils will still use the bogus data;

       - Add a new tracepoint-based way to get the binary information
         about the errors.

  Those are on a second series of patches (also at -next), but will
  probably miss the train for 3.5, due to the slow review process."

Fix up trivial conflict (due to spelling correction of removed code) in
drivers/edac/edac_device.c

* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (42 commits)
  i7core: fix ranks information at the per-channel struct
  i5000: Fix the fatal error handling
  i5100_edac: Fix a warning when compiled with 32 bits
  i82975x_edac: Test nr_pages earlier to save a few CPU cycles
  e752x_edac: provide more info about how DIMMS/ranks are mapped
  i5000_edac: Fix the logic that retrieves memory information
  i5400_edac: improve debug messages to better represent the filled memory
  edac: Cleanup the logs for i7core and sb edac drivers
  edac: Initialize the dimm label with the known information
  edac: Remove the legacy EDAC ABI
  x38_edac: convert driver to use the new edac ABI
  tile_edac: convert driver to use the new edac ABI
  sb_edac: convert driver to use the new edac ABI
  r82600_edac: convert driver to use the new edac ABI
  ppc4xx_edac: convert driver to use the new edac ABI
  pasemi_edac: convert driver to use the new edac ABI
  mv64x60_edac: convert driver to use the new edac ABI
  mpc85xx_edac: convert driver to use the new edac ABI
  i82975x_edac: convert driver to use the new edac ABI
  i82875p_edac: convert driver to use the new edac ABI
  ...
2012-05-29 18:32:37 -07:00
Linus Torvalds
7e5b2db77b Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "The whole series has been sitting in -next for quite a while with no
  complaints.  The last change to the series was before the weekend the
  removal of an SPI patch which Grant - even though previously acked by
  himself - appeared to raise objections.  So I removed it until the
  situation is clarified.  Other than that all the patches have the acks
  from their respective maintainers, all MIPS and x86 defconfigs are
  building fine and I'm not aware of any problems introduced by this
  series.

  Among the key features for this patch series is a sizable patchset for
  Lantiq which among other things introduces support for Lantiq's
  flagship product, the FALCON SOC.  It also means that the opensource
  developers behind this patchset have overtaken Lantiq's competing
  inhouse development team that was working behind closed doors.

  Less noteworthy the ath79 patchset which adds support for a few more
  chip variants, cleanups and fixes.  Finally the usual dose of tweaking
  of generic code."

Fix up trivial conflicts in arch/mips/lantiq/xway/gpio_{ebu,stp}.c where
printk spelling fixes clashed with file move and eventual removal of the
printk.

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (81 commits)
  MIPS: lantiq: remove orphaned code
  MIPS: Remove all -Wall and almost all -Werror usage from arch/mips.
  MIPS: lantiq: implement support for FALCON soc
  MTD: MIPS: lantiq: verify that the NOR interface is available on falcon soc
  MTD: MIPS: lantiq: implement OF support
  watchdog: MIPS: lantiq: implement OF support and minor fixes
  SERIAL: MIPS: lantiq: implement OF support
  GPIO: MIPS: lantiq: convert gpio-stp-xway to OF
  GPIO: MIPS: lantiq: convert gpio-mm-lantiq to OF and of_mm_gpio
  GPIO: MIPS: lantiq: move gpio-stp and gpio-ebu to the subsystem folder
  MIPS: pci: convert lantiq driver to OF
  MIPS: lantiq: convert dma to platform driver
  MIPS: lantiq: implement support for clkdev api
  MIPS: lantiq: drop ltq_gpio_request() and gpio_to_irq()
  OF: MIPS: lantiq: implement irq_domain support
  OF: MIPS: lantiq: implement OF support
  MIPS: lantiq: drop mips_machine support
  OF: PCI: const usage needed by MIPS
  MIPS: Cavium: Remove smp_reserve_lock.
  MIPS: Move cache setup to setup_arch().
  ...
2012-05-29 18:27:19 -07:00
Wolfram Sang
eb86c3064b rtc: ds1307: add trickle charger support
Some DS13XX devices have "trickle chargers".  Its configuration register
is at different locations, the setup is the same, though.  Since the
configuration is board specific, introduce a platform_data to this driver.
Tested with a DS1339 on a custom board.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:33 -07:00
Alexander Stein
e311c92959 rtc: add ioctl to get/clear battery low voltage status
Currently there is no generic way to get the RTC battery status within an
application.  So add an ioctl to read the status bit.  The idea is that
the bit is set once a low voltage is detected.  It stays there until it is
reset using the RTC_VL_CLR ioctl.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:33 -07:00
Stephen Boyd
4796dd200d vsprintf: fix %ps on non symbols when using kallsyms
Using %ps in a printk format will sometimes fail silently and print the
empty string if the address passed in does not match a symbol that
kallsyms knows about.  But using %pS will fall back to printing the full
address if kallsyms can't find the symbol.  Make %ps act the same as %pS
by falling back to printing the address.

While we're here also make %ps print the module that a symbol comes from
so that it matches what %pS already does.  Take this simple function for
example (in a module):

	static void test_printk(void)
	{
		int test;
		pr_info("with pS: %pS\n", &test);
		pr_info("with ps: %ps\n", &test);
	}

Before this patch:

 with pS: 0xdff7df44
 with ps:

After this patch:

 with pS: 0xdff7df44
 with ps: 0xdff7df44

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:32 -07:00
Kim, Milo
8035a50224 include/linux/led-lm3530.h: comment correction about the range of brightness
max brightness is 127, so the range of brt_val should be from 0 to 127

Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Shreshtha Kumar SAHU <shreshthakumar.sahu@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:32 -07:00
Shuah Khan
b00961824a leds: add new field to led_classdev struct to save activation state
Add a new field to led_classdev to save activattion state after activate
routine is successful.  This saved state is used in deactivate routine to
do cleanup such as removing device files, and free memory allocated during
activation.  Currently trigger_data not being null is used for this
purpose.

Existing triggers will need changes to use this new field.

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:31 -07:00
H Hartley Sweeten
1615d210db drivers/video/backlight/apple_bl.c: include header for exported symbol prototypes
Include the header to pickup the exported symbol prototype.

Quiets the sparse warning:

  warning: symbol 'apple_bl_register' was not declared. Should it be static?
  warning: symbol 'apple_bl_unregister' was not declared. Should it be static?

[akpm@linux-foundation.org: fix resulting build error]
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:29 -07:00
Inki Dae
d54ad83f3d lcd: add callbacks for early fb event blank support
This patchset adds early fb blank feature that a callback of lcd panel
driver is called prior to specific fb driver's one.  In the case of
MIPI-DSI based video mode LCD Panel, for lcd power off, the power off
commands should be transferred to lcd panel with display and mipi-dsi
controller enabled because the commands is set to lcd panel at vsync porch
period.  and in opposite case, the callback of fb driver should be called
prior to lcd panel driver's one because of same issue.  Also if fb_blank
mode is changed to FB_BLANK_POWERDOWN then display controller would be
off(clock disable) but lcd panel would be still on.  at this time, you
could see some issue like sparkling on lcd panel because video clock to be
delivered to ldi module of lcd panel was disabled.  this issue could
occurs for all lcd panels.

The callback order is as the following:

at fb_blank function of fbmem.c
-> fb_notifier_call_chain(FB_EARLY_EVENT_BLANK)
       -> lcd panel driver's early_set_power()
-> info->fbops->fb_blank()
       -> spcefic fb driver's fb_blank()
-> fb_notifier_call_chain(FB_EVENT_BLANK)
       -> lcd panel driver's set_power()
   -> fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) if
info->fops->fb_blank() was failed.

fb_notifier_call_chain(FB_R_EARLY_EVENT_BLANK) would be called to revert
the effects of previous FB_EARLY_EVENT_BLANK call.  and note that if
early_set_power() of lcd_ops is NULL then early fb blank callback would be
ignored.

This patch:

Add early_set_power and r_early_set_power callbacks.  early_set_power
callback is called prior to fb_blank() of fbmem.c and r_early_set_power
callback is called if fb_blank() was failed to revert the effects of the
early_set_power call of lcd panel driver.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:29 -07:00
Inki Dae
bf05929f41 fbdev: add events for early fb event support
Add FB_EARLY_EVENT_BLANK and FB_R_EARLY_EVENT_BLANK event mode supports.
first, fb_notifier_call_chain() is called with FB_EARLY_EVENT_BLANK and
fb_blank() of specific fb driver is called and then
fb_notifier_call_chain() is called with FB_EVENT_BLANK again at
fb_blank().  and if fb_blank() was failed then fb_nitifier_call_chain()
would be called with FB_R_EARLY_EVENT_BLANK to revert the previous
effects.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:28 -07:00
Hugh Dickins
fa9add641b mm/memcg: apply add/del_page to lruvec
Take lruvec further: pass it instead of zone to add_page_to_lru_list() and
del_page_from_lru_list(); and pagevec_lru_move_fn() pass lruvec down to
its target functions.

This cleanup eliminates a swathe of cruft in memcontrol.c, including
mem_cgroup_lru_add_list(), mem_cgroup_lru_del_list() and
mem_cgroup_lru_move_lists() - which never actually touched the lists.

In their place, mem_cgroup_page_lruvec() to decide the lruvec, previously
a side-effect of add, and mem_cgroup_update_lru_size() to maintain the
lru_size stats.

Whilst these are simplifications in their own right, the goal is to bring
the evaluation of lruvec next to the spin_locking of the lrus, in
preparation for a future patch.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:28 -07:00
Hugh Dickins
4d7dcca213 mm/memcg: get_lru_size not get_lruvec_size
Konstantin just introduced mem_cgroup_get_lruvec_size() and
get_lruvec_size(), I'm about to add mem_cgroup_update_lru_size(): but
we're dealing with the same thing, lru_size[lru].  We ought to agree on
the naming, and I do think lru_size is the more correct: so rename his
ones to get_lru_size().

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:28 -07:00
Glauber Costa
04eac7ffde rescounter: remove __must_check from res_counter_charge_nofail()
Since we will succeed with the allocation no matter what, there isn't a
need to use __must_check with it.  It can very well be optional.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ying Han <yinghan@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:27 -07:00
Frederic Weisbecker
2bb2ba9d51 rescounters: add res_counter_uncharge_until()
When killing a res_counter which is a child of other counter, we need to
do

	res_counter_uncharge(child, xxx)
	res_counter_charge(parent, xxx)

This is not atomic and wastes CPU.  This patch adds
res_counter_uncharge_until().  This function's uncharge propagates to
ancestors until specified res_counter.

	res_counter_uncharge_until(child, parent, xxx)

Now the operation is atomic and efficient.

Signed-off-by: Frederic Weisbecker <fweisbec@redhat.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ying Han <yinghan@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:27 -07:00
Konstantin Khlebnikov
c56d5c7dfe mm/vmscan: push lruvec pointer into inactive_list_is_low()
Switch mem_cgroup_inactive_anon_is_low() to lruvec pointers,
mem_cgroup_get_lruvec_size() is more effective than
mem_cgroup_zone_nr_lru_pages()

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:26 -07:00
Konstantin Khlebnikov
074291fea8 mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size()
If memory cgroup is enabled we always use lruvecs which are embedded into
struct mem_cgroup_per_zone, so we can reach lru_size counters via
container_of().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:26 -07:00
Konstantin Khlebnikov
7f5e86c2cc mm: add link from struct lruvec to struct zone
This is the first stage of struct mem_cgroup_zone removal.  Further
patches replace struct mem_cgroup_zone with a pointer to struct lruvec.

If CONFIG_CGROUP_MEM_RES_CTLR=n lruvec_zone() is just container_of().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:26 -07:00
Konstantin Khlebnikov
bbf808ed7d mm/memcg: kill mem_cgroup_lru_del()
This patch kills mem_cgroup_lru_del(), we can use
mem_cgroup_lru_del_list() instead.  On 0-order isolation we already have
right lru list id.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:25 -07:00
Konstantin Khlebnikov
f3fd4a6192 mm: remove lru type checks from __isolate_lru_page()
After patch "mm: forbid lumpy-reclaim in shrink_active_list()" we can
completely remove anon/file and active/inactive lru type filters from
__isolate_lru_page(), because isolation for 0-order reclaim always
isolates pages from right lru list.  And pages-isolation for lumpy
shrink_inactive_list() or memory-compaction anyway allowed to isolate
pages from all evictable lru lists.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:25 -07:00
Konstantin Khlebnikov
014483bccc mm: mark mm-inline functions as __always_inline
GCC sometimes ignores "inline" directives even for small and simple functions.
This supposed to be fixed in gcc 4.7, but it was released only yesterday.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:25 -07:00
Hugh Dickins
89abfab133 mm/memcg: move reclaim_stat into lruvec
With mem_cgroup_disabled() now explicit, it becomes clear that the
zone_reclaim_stat structure actually belongs in lruvec, per-zone when
memcg is disabled but per-memcg per-zone when it's enabled.

We can delete mem_cgroup_get_reclaim_stat(), and change
update_page_reclaim_stat() to update just the one set of stats, the one
which get_scan_count() will actually use.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:25 -07:00
KAMEZAWA Hiroyuki
4b91355e9d memcg: fix/change behavior of shared anon at moving task
This patch changes memcg's behavior at task_move().

At task_move(), the kernel scans a task's page table and move the changes
for mapped pages from source cgroup to target cgroup.  There has been a
bug at handling shared anonymous pages for a long time.

Before patch:
  - The spec says 'shared anonymous pages are not moved.'
  - The implementation was 'shared anonymoys pages may be moved'.
    If page_mapcount <=2, shared anonymous pages's charge were moved.

After patch:
  - The spec says 'all anonymous pages are moved'.
  - The implementation is 'all anonymous pages are moved'.

Considering usage of memcg, this will not affect user's experience.
'shared anonymous' pages only exists between a tree of processes which
don't do exec().  Moving one of process without exec() seems not sane.
For example, libcgroup will not be affected by this change.  (Anyway, no
one noticed the implementation for a long time...)

Below is a discussion log:

 - current spec/implementation are complex
 - Now, shared file caches are moved
 - It adds unclear check as page_mapcount(). To do correct check,
   we should check swap users, etc.
 - No one notice this implementation behavior. So, no one get benefit
   from the design.
 - In general, once task is moved to a cgroup for running, it will not
   be moved....
 - Finally, we have control knob as memory.move_charge_at_immigrate.

Here is a patch to allow moving shared pages, completely. This makes
memcg simpler and fix current broken code.

Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:24 -07:00
Pravin B Shelar
5bf5f03c27 mm: fix slab->page flags corruption
Transparent huge pages can change page->flags (PG_compound_lock) without
taking Slab lock.  Since THP can not break slab pages we can safely access
compound page without taking compound lock.

Specifically this patch fixes a race between compound_unlock() and slab
functions which perform page-flags updates.  This can occur when
get_page()/put_page() is called on a page from slab.

[akpm@linux-foundation.org: tweak comment text, fix comment layout, fix label indenting]
Reported-by: Amey Bhide <abhide@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:24 -07:00
David Rientjes
a7f638f999 mm, oom: normalize oom scores to oom_score_adj scale only for userspace
The oom_score_adj scale ranges from -1000 to 1000 and represents the
proportion of memory available to the process at allocation time.  This
means an oom_score_adj value of 300, for example, will bias a process as
though it was using an extra 30.0% of available memory and a value of
-350 will discount 35.0% of available memory from its usage.

The oom killer badness heuristic also uses this scale to report the oom
score for each eligible process in determining the "best" process to
kill.  Thus, it can only differentiate each process's memory usage by
0.1% of system RAM.

On large systems, this can end up being a large amount of memory: 256MB
on 256GB systems, for example.

This can be fixed by having the badness heuristic to use the actual
memory usage in scoring threads and then normalizing it to the
oom_score_adj scale for userspace.  This results in better comparison
between eligible threads for kill and no change from the userspace
perspective.

Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:24 -07:00
Hugh Dickins
17cf28afea mm/fs: remove truncate_range
Remove vmtruncate_range(), and remove the truncate_range method from
struct inode_operations: only tmpfs ever supported it, and tmpfs has now
converted over to using the fallocate method of file_operations.

Update Documentation accordingly, adding (setlease and) fallocate lines.
And while we're in mm.h, remove duplicate declarations of shmem_lock() and
shmem_file_setup(): everyone is now using the ones in shmem_fs.h.

Based-on-patch-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Cong Wang <amwang@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:23 -07:00
Hugh Dickins
bde05d1ccd shmem: replace page if mapping excludes its zone
The GMA500 GPU driver uses GEM shmem objects, but with a new twist: the
backing RAM has to be below 4GB.  Not a problem while the boards
supported only 4GB: but now Intel's D2700MUD boards support 8GB, and
their GMA3600 is managed by the GMA500 driver.

shmem/tmpfs has never pretended to support hardware restrictions on the
backing memory, but it might have appeared to do so before v3.1, and
even now it works fine until a page is swapped out then back in.  When
read_cache_page_gfp() supplied a freshly allocated page for copy, that
compensated for whatever choice might have been made by earlier swapin
readahead; but swapoff was likely to destroy the illusion.

We'd like to continue to support GMA500, so now add a new
shmem_should_replace_page() check on the zone when about to move a page
from swapcache to filecache (in swapin and swapoff cases), with
shmem_replace_page() to allocate and substitute a suitable page (given
gma500/gem.c's mapping_set_gfp_mask GFP_KERNEL | __GFP_DMA32).

This does involve a minor extension to mem_cgroup_replace_page_cache()
(the page may or may not have already been charged); and I've removed a
comment and call to mem_cgroup_uncharge_cache_page(), which in fact is
always a no-op while PageSwapCache.

Also removed optimization of an unlikely path in shmem_getpage_gfp(),
now that we need to check PageSwapCache more carefully (a racing caller
might already have made the copy).  And at one point shmem_unuse_inode()
needs to use the hitherto private page_swapcount(), to guard against
racing with inode eviction.

It would make sense to extend shmem_should_replace_page(), to cover
cpuset and NUMA mempolicy restrictions too, but set that aside for now:
needs a cleanup of shmem mempolicy handling, and more testing, and ought
to handle swap faults in do_swap_page() as well as shmem.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephane Marchesin <marcheu@chromium.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rob Clark <rob.clark@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:22 -07:00
Bartlomiej Zolnierkiewicz
5ceb9ce6fe mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks
When MIGRATE_UNMOVABLE pages are freed from MIGRATE_UNMOVABLE type
pageblock (and some MIGRATE_MOVABLE pages are left in it) waiting until an
allocation takes ownership of the block may take too long.  The type of
the pageblock remains unchanged so the pageblock cannot be used as a
migration target during compaction.

Fix it by:

* Adding enum compact_mode (COMPACT_ASYNC_[MOVABLE,UNMOVABLE], and
  COMPACT_SYNC) and then converting sync field in struct compact_control
  to use it.

* Adding nr_pageblocks_skipped field to struct compact_control and
  tracking how many destination pageblocks were of MIGRATE_UNMOVABLE type.
   If COMPACT_ASYNC_MOVABLE mode compaction ran fully in
  try_to_compact_pages() (COMPACT_COMPLETE) it implies that there is not a
  suitable page for allocation.  In this case then check how if there were
  enough MIGRATE_UNMOVABLE pageblocks to try a second pass in
  COMPACT_ASYNC_UNMOVABLE mode.

* Scanning the MIGRATE_UNMOVABLE pageblocks (during COMPACT_SYNC and
  COMPACT_ASYNC_UNMOVABLE compaction modes) and building a count based on
  finding PageBuddy pages, page_count(page) == 0 or PageLRU pages.  If all
  pages within the MIGRATE_UNMOVABLE pageblock are in one of those three
  sets change the whole pageblock type to MIGRATE_MOVABLE.

My particular test case (on a ARM EXYNOS4 device with 512 MiB, which means
131072 standard 4KiB pages in 'Normal' zone) is to:

- allocate 120000 pages for kernel's usage
- free every second page (60000 pages) of memory just allocated
- allocate and use 60000 pages from user space
- free remaining 60000 pages of kernel memory
  (now we have fragmented memory occupied mostly by user space pages)
- try to allocate 100 order-9 (2048 KiB) pages for kernel's usage

The results:
- with compaction disabled I get 11 successful allocations
- with compaction enabled - 14 successful allocations
- with this patch I'm able to get all 100 successful allocations

NOTE: If we can make kswapd aware of order-0 request during compaction, we
can enhance kswapd with changing mode to COMPACT_ASYNC_FULL
(COMPACT_ASYNC_MOVABLE + COMPACT_ASYNC_UNMOVABLE).  Please see the
following thread:

	http://marc.info/?l=linux-mm&m=133552069417068&w=2

[minchan@kernel.org: minor cleanups]
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:22 -07:00
Johannes Weiner
238305bb4d mm: remove sparsemem allocation details from the bootmem allocator
alloc_bootmem_section() derives allocation area constraints from the
specified sparsemem section.  This is a bit specific for a generic memory
allocator like bootmem, though, so move it over to sparsemem.

As __alloc_bootmem_node_nopanic() already retries failed allocations with
relaxed area constraints, the fallback code in sparsemem.c can be removed
and the code becomes a bit more compact overall.

[akpm@linux-foundation.org: fix build]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:22 -07:00
Alex Shi
2099597401 mm: move is_vma_temporary_stack() declaration to huge_mm.h
When transparent_hugepage_enabled() is used outside mm/, such as in
arch/x86/xx/tlb.c:

+       if (!cpu_has_invlpg || vma->vm_flags & VM_HUGETLB
+                       || transparent_hugepage_enabled(vma)) {
+               flush_tlb_mm(vma->vm_mm);

is_vma_temporary_stack() isn't referenced in huge_mm.h, so it has compile
errors:

  arch/x86/mm/tlb.c: In function `flush_tlb_range':
  arch/x86/mm/tlb.c:324:4: error: implicit declaration of function `is_vma_temporary_stack' [-Werror=implicit-function-declaration]

Since is_vma_temporay_stack() is just used in rmap.c and huge_memory.c, it
is better to move it to huge_mm.h from rmap.h to avoid such errors.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:21 -07:00
Ulrich Drepper
9295b7a07c kbuild: install kernel-page-flags.h
Programs using /proc/kpageflags need to know about the various flags.  The
<linux/kernel-page-flags.h> provides them and the comments in the file
indicate that it is supposed to be used by user-level code.  But the file
is not installed.

Install the headers and mark the unstable flags as out-of-bounds.  The
page-type tool is also adjusted to not duplicate the definitions

Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:21 -07:00
Konstantin Khlebnikov
02602a18c3 bug: completely remove code generated by disabled VM_BUG_ON()
Even if CONFIG_DEBUG_VM=n gcc genereates code for some VM_BUG_ON()

for example VM_BUG_ON(!PageCompound(page) || !PageHead(page)); in
do_huge_pmd_wp_page() generates 114 bytes of code.

But they mostly disappears when I split this VM_BUG_ON into two:

  -VM_BUG_ON(!PageCompound(page) || !PageHead(page));
  +VM_BUG_ON(!PageCompound(page));
  +VM_BUG_ON(!PageHead(page));

weird... but anyway after this patch code disappears completely.

  add/remove: 0/0 grow/shrink: 7/97 up/down: 135/-1784 (-1649)

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:20 -07:00
Konstantin Khlebnikov
baf05aa927 bug: introduce BUILD_BUG_ON_INVALID() macro
Sometimes we want to check some expressions correctness at compile time.
"(void)(e);" or "if (e);" can be dangerous if the expression has
side-effects, and gcc sometimes generates a lot of code, even if the
expression has no effect.

This patch introduces macro BUILD_BUG_ON_INVALID() for such checks, it
forces a compilation error if expression is invalid without any extra
code.

[Cast to "long" required because sizeof does not work for bit-fields.]

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:20 -07:00
Johannes Weiner
c3ac9a8ade mm: memcg: count pte references from every member of the reclaimed hierarchy
The rmap walker checking page table references has historically ignored
references from VMAs that were not part of the memcg that was being
reclaimed during memcg hard limit reclaim.

When transitioning global reclaim to memcg hierarchy reclaim, I missed
that bit and now references from outside a memcg are ignored even during
global reclaim.

Reverting back to traditional behaviour - count all references during
global reclaim and only mind references of the memcg being reclaimed
during limit reclaim would be one option.

However, the more generic idea is to ignore references exactly then when
they are outside the hierarchy that is currently under reclaim; because
only then will their reclamation be of any use to help the pressure
situation.  It makes no sense to ignore references from a sibling memcg
and then evict a page that will be immediately refaulted by that sibling
which contributes to the same usage of the common ancestor under
reclaim.

The solution: make the rmap walker ignore references from VMAs that are
not part of the hierarchy that is being reclaimed.

Flat limit reclaim will stay the same, hierarchical limit reclaim will
mind the references only to pages that the hierarchy owns.  Global
reclaim, since it reclaims from all memcgs, will be fixed to regard all
references.

[akpm@linux-foundation.org: name the args in the declaration]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:20 -07:00
Andrew Morton
0ce72d4f73 mm: do_migrate_pages(): rename arguments
s/from_nodes/from and s/to_nodes/to/.  The "_nodes" is redundant - it
duplicates the argument's type.

Done in a fit of irritation over 80-col issues :(

Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <mkosaki@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:20 -07:00
Rik van Riel
e709ffd616 mm: remove swap token code
The swap token code no longer fits in with the current VM model.  It
does not play well with cgroups or the better NUMA placement code in
development, since we have only one swap token globally.

It also has the potential to mess with scalability of the system, by
increasing the number of non-reclaimable pages on the active and
inactive anon LRU lists.

Last but not least, the swap token code has been broken for a year
without complaints, as reported by Konstantin Khlebnikov.  This suggests
we no longer have much use for it.

The days of sub-1G memory systems with heavy use of swap are over.  If
we ever need thrashing reducing code in the future, we will have to
implement something that does scale.

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Bob Picco <bpicco@meloft.net>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:19 -07:00
Paul Gortmaker
af2e840971 pagemap.h: fix warning about possibly used before init var
Commit f56f821feb7b ("mm: extend prefault helpers to fault in more than
PAGE_SIZE") added in the new functions: fault_in_multipages_writeable()
and fault_in_multipages_readable().

However, we currently see:

  include/linux/pagemap.h:492: warning: 'ret' may be used uninitialized in this function
  include/linux/pagemap.h:492: note: 'ret' was declared here

Unlike a lot of gcc nags, this one appears somewhat legit.  i.e.  passing
in an invalid negative value of "size" does make it look like all the
conditionals in there would be bypassed and the uninitialized value would
be returned.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:18 -07:00
Linus Torvalds
4b78147468 MFD changes for 3.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPw2QKAAoJEIqAPN1PVmxKfv8P/2L5d38tc3+9wYuGI1l+k7Mz
 xQt2PdAx/kHQGTjLE1DSoeOD6dn4aodFbPaTcsLsU9Eo4IiJnT68b8adr/bqYHKU
 Cod6NSPJMaBxLBJZxXsA7nY69Z6O5SMjXxEQsiDc24gaP0jjwaeY35KJSfMug8nF
 DA6rvEpchkF8QXzBmkO2t2/uPYr1YWqDZQkauLDnLRG01JnGXFz5ajv9N5pYhiFt
 QyYtheg8yEnfwnQ6AlmRtGK75jZRVmrj0kOzRjE9UL7ZwtzswWJes+RE3tlgk89m
 JQ7KASRmmqLpvcVJ9fG9SlGX//yBO6OEp5Km06RTxgmt0XftBDVqBTjk1EG2tfMR
 SR0NIz6gJ0twKAe6U0d+5HMYalOU45H5ha9e3vCqZ8vl9JfmM95RS+TmWbGcRIqj
 04Y5x3I4zq6e9D0u+211BeuRfzkQiefwWJmdPpn0oac3u5LeYbRj/aQ85fqwJWzG
 f99D9VU5xgfFHPAtL3SLFiwgd9yOiMBar6eeIva+okDyOW3KaEUzs8Y4dgDyvYcg
 IU//JGK51vLVmI5kXtGCwYkgRLF/Y7WKZ8TwypT+SY6iv6tPQVvApOZljq7RC9GI
 mXx2z2slA90jlg3TlEFZfxr1WqbZ3TCbonU1riLeMEtkiXUpLtmKC8gbhOqfGvvn
 Nzgt+YqRJXafZdELb/S+
 =Rh0r
 -----END PGP SIGNATURE-----

Merge tag 'mfd-3.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

Pull MFD changes from Samuel Ortiz:
 "Besides the usual cleanups, this one brings:

   * Support for 5 new chipsets: Intel's ICH LPC and SCH Centerton,
     ST-E's STAX211, Samsung's MAX77693 and TI's LM3533.

   * Device tree support for the twl6040, tps65910, da9502 and ab8500
     drivers.

   * Fairly big tps56910, ab8500 and db8500 updates.

   * i2c support for mc13xxx.

   * Our regular update for the wm8xxx driver from Mark."

Fix up various conflicts with other trees, largely due to ab5500 removal
etc.

* tag 'mfd-3.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (106 commits)
  mfd: Fix build break of max77693 by adding REGMAP_I2C option
  mfd: Fix twl6040 build failure
  mfd: Fix max77693 build failure
  mfd: ab8500-core should depend on MFD_DB8500_PRCMU
  gpio: tps65910: dt: process gpio specific device node info
  mfd: Remove the parsing of dt info for tps65910 gpio
  mfd: Save device node parsed platform data for tps65910 sub devices
  mfd: Add r_select to lm3533 platform data
  gpio: Add Intel Centerton support to gpio-sch
  mfd: Emulate active low IRQs as well as active high IRQs for wm831x
  mfd: Mark two lm3533 zone registers as volatile
  mfd: Fix return type of lm533 attribute is_visible
  mfd: Enable Device Tree support in the ab8500-pwm driver
  mfd: Enable Device Tree support in the ab8500-sysctrl driver
  mfd: Add support for Device Tree to twl6040
  mfd: Register the twl6040 child for the ASoC codec unconditionally
  mfd: Allocate twl6040 IRQ numbers dynamically
  mfd: twl6040 code cleanup in interrupt initialization part
  mfd: Enable ab8500-gpadc driver for Device Tree
  mfd: Prevent unassigned pointer from being used in ab8500-gpadc driver
  ...
2012-05-29 11:53:11 -07:00
Linus Torvalds
53f2c4a8fd NFS client updates for Linux 3.5
New features include:
 - Rewrite the O_DIRECT code so that it can share the same coalescing and
   pNFS functionality as the page cache code.
 - Allow the server to provide hints as to when we should use pNFS, and
   when it is more efficient to read and write through the metadata
   server.
 - NFS cache consistency updates:
   - Use the ctime to emulate a change attribute for NFSv2/v3 so that
     all NFS versions can share the same cache management code.
   - New cache management code will only look at the change attribute
     and size attribute when deciding whether or not our cached data
     is still valid or not.
   - Don't request NFSv4 post-op attributes on writes in cases such as
     O_DIRECT, where we don't care about data cache consistency, or
     when we have a write delegation, and know that our cache is
     still consistent.
   - Don't request NFSv4 post-op attributes on operations such as
     COMMIT, where there are no expected metadata updates.
   - Don't request NFSv4 directory post-op attributes in cases where
     the operations themselves already return change attribute updates:
     i.e.  operations such as OPEN, CREATE, REMOVE, LINK and RENAME.
 - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
   if we detect no attempts to lookup filenames.
 - Improve the code sharing between NFSv2/v3 and v4 mounts
 - NFSv4.1 state management efficiency improvements
 - More patches in preparation for NFSv4/v4.1 migration functionality.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPw/MNAAoJEGcL54qWCgDyxU8P/2kKqhAlhoLEArBqo9FT3/OK
 YrNs5uO/erTgnCG8L0XQvTKjHB9F7TAeFXqTmBZuPlb1afRpHHt2vzPqzIvUCeOC
 ZXm8vzZf4nxWZgEFoTDdUBvqQi9lLdIzCRhSaVCKcRnNwiuaKDd/iwykbWGcHqmv
 jtR4lzXPllJdKCUL3yb3juVrpq6Vvn254ID2pqdnYcEtIJIHgaRZpwdp4Iz9+8b5
 Moishiw2rgCBJIhf+VCYd8B2oYfMgSDPxG1o3etkwY46qo+4s+CIls9Vu/6YzGXK
 3+NdLatRDqKhQpLm0/R+dI3rntnTZ8x6LgWnTGxUsiqb6pAaHZPK284rf2eh/s7M
 Q4G4203r0uw539kIt6eKOGqC9c8kZAPCHlQSPCaImZyCJsz+6OMShNlGB5bZpFPr
 tbdxaxudrhCF7UVKXicJCWgv2nIHtek6fNwey1jqFoYgZP5ipiBKymvXQC5WAMBw
 7RHJor/JEC+UJkVg/7Mkpg0UNw3E36CTYLeRJKlNCS6YO9NJQseCDxhhMNAy/ab7
 RGO8DVMkUsOUH20S+a19LyeFQtveWFIE0DiDqRn0KnNGhGwHrv2t4xFukjlrf4Sw
 8FQUBRdtFxfmspfA1IdoTY49XZQda5eagvTy1MyaWEh+jPSJ4G5j3sSjFiaKAJqw
 79iQKFGkxPOSHx2yCdAF
 =suVW
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "New features include:
   - Rewrite the O_DIRECT code so that it can share the same coalescing
     and pNFS functionality as the page cache code.
   - Allow the server to provide hints as to when we should use pNFS,
     and when it is more efficient to read and write through the
     metadata server.
   - NFS cache consistency updates:
     * Use the ctime to emulate a change attribute for NFSv2/v3 so that
       all NFS versions can share the same cache management code.
     * New cache management code will only look at the change attribute
       and size attribute when deciding whether or not our cached data
       is still valid or not.
     * Don't request NFSv4 post-op attributes on writes in cases such as
       O_DIRECT, where we don't care about data cache consistency, or
       when we have a write delegation, and know that our cache is still
       consistent.
     * Don't request NFSv4 post-op attributes on operations such as
       COMMIT, where there are no expected metadata updates.
     * Don't request NFSv4 directory post-op attributes in cases where
       the operations themselves already return change attribute
       updates: i.e. operations such as OPEN, CREATE, REMOVE, LINK and
       RENAME.
   - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
     if we detect no attempts to lookup filenames.
   - Improve the code sharing between NFSv2/v3 and v4 mounts
   - NFSv4.1 state management efficiency improvements
   - More patches in preparation for NFSv4/v4.1 migration functionality."

Fix trivial conflict in fs/nfs/nfs4proc.c that was due to the dcache
qstr name initialization changes (that made the length/hash a 64-bit
union)

* tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (146 commits)
  NFSv4: Add debugging printks to state manager
  NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO
  NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE
  NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error
  NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION
  NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager
  NFSv4.1: Handle errors in nfs4_bind_conn_to_session
  NFSv4.1: nfs4_bind_conn_to_session should drain the session
  NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid
  NFSv4.1: Add DESTROY_CLIENTID
  NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session
  NFSv4.1: Ensure we use the correct credentials for session create/destroy
  NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations
  NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease
  NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM
  NFSv4: Clean up the error handling for nfs4_reclaim_lease
  NFSv4.1: Exchange ID must use GFP_NOFS allocation mode
  nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN*
  nfs4.1: add BIND_CONN_TO_SESSION operation
  NFSv4.1 test the mdsthreshold hint parameters
  ...
2012-05-29 10:43:51 -07:00
Mauro Carvalho Chehab
5926ff502f edac: Initialize the dimm label with the known information
While userspace doesn't fill the dimm labels, add there the dimm location,
as described by the used memory model. This could eventually match what
is described at the dmidecode, making easier for people to identify the
memory.

For example, on an Intel motherboard where the DMI table is reliable,
the first memory stick is described as:

Memory Device
	Array Handle: 0x0029
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 2048 MB
	Form Factor: DIMM
	Set: 1
	Locator: A1_DIMM0
	Bank Locator: A1_Node0_Channel0_Dimm0
	Type: <OUT OF SPEC>
	Type Detail: Synchronous
	Speed: 800 MHz
	Manufacturer: A1_Manufacturer0
	Serial Number: A1_SerNum0
	Asset Tag: A1_AssetTagNum0
	Part Number: A1_PartNum0

The memory named as "A1_DIMM0" is physically located at the first
memory controller (node 0), at channel 0, dimm slot 0.

After this patch, the memory label will be filled with:
	/sys/devices/system/edac/mc/csrow0/ch0_dimm_label:mc#0channel#0slot#0

And (after the new EDAC API patches) as:
	/sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0

So, even if the memory label is not initialized on userspace, an useful
information with the error location is filled there, expecially since
several systems/motherboards are provided with enough info to map from
channel/slot (or branch/channel/slot) into the DIMM label. So, letting the
EDAC core fill it by default is a good thing.

It should noticed that, as the label filling happens at the
edac_mc_alloc(), drivers can override it to better describe the memories
(and some actually do it).

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:13:50 -03:00
Mauro Carvalho Chehab
4275be6355 edac: Change internal representation to work with layers
Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: a multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks at the same DIMM.
This bug is present since the beginning of the EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:10:59 -03:00
Mauro Carvalho Chehab
982216a429 edac.h: Add generic layers for describing a memory location
The edac core were written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrows select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrows
view. Instead, they view memories as DIMMs, instead of ranks, accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrows based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
			   (corrected, uncorrected, fatal)

To be used by the new edac_mc_handle_error function;

enum edac_mc_layer: describes the type of a given memory
architecture layer (branch, channel, slot, csrow).

struct edac_mc_layer: describes the properties of a memory
		      layer (type, size, and if the layer
		      will be used on a virtual csrow.

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts from an address with up to 3 layers into
a linear address.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:10:59 -03:00
Mauro Carvalho Chehab
a895bf8b1e edac: move nr_pages to dimm struct
The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMM's that
will properly represent the DIMM stick properties, including its size.

A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
the memory controller represents the memory via chip select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:10:58 -03:00
Mauro Carvalho Chehab
084a4fccef edac: move dimm properties to struct dimm_info
On systems based on chip select rows, all channels need to use memories
with the same properties, otherwise the memories on channels A and B
won't be recognized.

However, such assumption is not true for all types of memory
controllers.

Controllers for FB-DIMM's don't have such requirements.

Also, modern Intel controllers seem to be capable of handling such
differences.

So, we need to get rid of storing the DIMM information into a per-csrow
data, storing it, instead at the right place.

The first step is to move grain, mtype, dtype and edac_mode to the
per-dimm struct.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Mike Williams <mike@mikebwilliams.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:10:58 -03:00
Mauro Carvalho Chehab
a7d7d2e1a0 edac: Create a dimm struct and move the labels into it
The way a DIMM is currently represented implies that they're
linked into a per-csrow struct. However, some drivers don't see
csrows, as they're ridden behind some chip like the AMB's
on FBDIMM's, for example.

This forced drivers to fake^Wvirtualize a csrow struct, and to create
a mess under csrow/channel original's concept.

Move the DIMM labels into a per-DIMM struct, and add there
the real location of the socket, in terms of csrow/channel.
Latter patches will modify the location to properly represent the
memory architecture.

All other drivers will use a per-csrow type of location.
Some of those drivers will require a latter conversion, as
they also fake the csrows internally.

TODO: While this patch doesn't change the existing behavior, on
csrows-based memory controllers, a csrow/channel pair points to a memory
rank. There's a known bug at the EDAC core that allows having different
labels for the same DIMM, if it has more than one rank. A latter patch
is need to merge the several ranks for a DIMM into the same dimm_info
struct, in order to avoid having different labels for the same DIMM.

The edac_mc_alloc() will now contain a per-dimm initialization loop that
will be changed by latter patches in order to match other types of
memory architectures.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-05-28 19:10:57 -03:00
Linus Torvalds
90324cc1b1 avoid iput() from flusher thread
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPw2J/AAoJECvKgwp+S8Ja5jkP/3uMxkhf8XQpXCI3O1QVfaQr
 uZFfM8sINqIPDVm1dtFjFj7f8Bw9mhE2KAnnJ1rKT8tQwqq9yAse1QPlhCG1ZqoP
 +AnMDDXHtx7WmQZXhBvS9b+unpZ7Jr6r6pO5XrmTL2kRL3YJPUhZ2+xbTT5belTB
 KoAu4WqORZRxfXoC76S7U8K+D4NcAGhAOxCClsIjmY+oocCiCag4FZOyzYIFViqc
 ghUN/+rLQ3fqGGv2yO7Ylx1gUM7sxIwkZQ/h962jFAtxz9czImr2NmRoMliOaOkS
 tvcnIf+E3u0n/zIjzFvzhxKgHJPP8PkcPMk60d3jKmFngBkqFTzNUeVTP8md7HrV
 4DlXisWr+z7YVyWUCFaNcJLmjiWSwQ8DV/clRLobeBf9EJKan5F1PjFgl6PLJM5F
 Qr1+LHMNaetdulBwMRTyveZTzYqw9RmDnD9dWMo4mX/kTpvtC4jTPVV7hkRD+Qlv
 5vTRR+VXL3Q50yClLf0AQMSKTnH2gBuepM/b+7cShLGfsMln8DtUjmbigv+niL63
 BibcCIbIlP2uWGnl37VhsC34AT+RKt3lggrBOpn/7XJMq/wKR7IRP/7V9TfYgaUN
 NBa+wtnLDa1pZEn/X7izdcQP62PzDtmB+ObvYT0Yb40A4+2ud3qF/lB53c1A1ewF
 /9c4zxxekjHZnn2oooEa
 =oLXf
 -----END PGP SIGNATURE-----

Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux

Pull writeback tree from Wu Fengguang:
 "Mainly from Jan Kara to avoid iput() in the flusher threads."

* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: Avoid iput() from flusher thread
  vfs: Rename end_writeback() to clear_inode()
  vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
  writeback: Refactor writeback_single_inode()
  writeback: Remove wb->list_lock from writeback_single_inode()
  writeback: Separate inode requeueing after writeback
  writeback: Move I_DIRTY_PAGES handling
  writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
  writeback: Move clearing of I_SYNC into inode_sync_complete()
  writeback: initialize global_dirty_limit
  fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
  mm: page-writeback.c: local functions should not be exposed globally
2012-05-28 09:54:45 -07:00