SOFT-DIRTY PTEs

Soft-dirty is a bit on a PTE that helps track which pages a task writes
to. To do this tracking, one should:

1. Clear soft-dirty bits from the task's PTEs.

   This is done by writing "4" into the /proc/PID/clear_refs file of the
   task in question.

2. Wait some time.

3. Read soft-dirty bits from the PTEs.

   This is done by reading from /proc/PID/pagemap. Bit 55 of each 64-bit
   entry is the soft-dirty bit. If set, the respective PTE was written to
   since step 1.

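The three steps above can be sketched from user space roughly as follows.
This is a minimal sketch, not a reference implementation. The helper names
are hypothetical, and it assumes the pagemap layout of one 64-bit entry per
virtual page, indexed by virtual page number, which is not spelled out in
this file.

```python
import struct

PAGEMAP_ENTRY_SIZE = 8   # assumption: one 64-bit entry per virtual page
SOFT_DIRTY_BIT = 55      # bit 55 is the soft-dirty bit (step 3)


def clear_soft_dirty(pid="self"):
    """Step 1: write "4" into /proc/PID/clear_refs."""
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")


def pagemap_offset(vaddr, page_size):
    """Byte offset in /proc/PID/pagemap of the entry covering vaddr."""
    return (vaddr // page_size) * PAGEMAP_ENTRY_SIZE


def is_soft_dirty(entry):
    """True if bit 55 of a 64-bit pagemap entry is set."""
    return bool((entry >> SOFT_DIRTY_BIT) & 1)


def read_pagemap_entry(vaddr, page_size, pid="self"):
    """Step 3: read the 64-bit pagemap entry for one virtual address."""
    # Unbuffered, so exactly one 8-byte entry is requested from the kernel.
    with open(f"/proc/{pid}/pagemap", "rb", buffering=0) as f:
        f.seek(pagemap_offset(vaddr, page_size))
        data = f.read(PAGEMAP_ENTRY_SIZE)
    if len(data) != PAGEMAP_ENTRY_SIZE:
        raise OSError("short read from pagemap")
    return struct.unpack("=Q", data)[0]   # native byte order, 8 bytes
```

A tracker would call clear_soft_dirty(), wait, and then test
is_soft_dirty(read_pagemap_entry(addr, page_size)) for each address of
interest, with page_size taken from os.sysconf("SC_PAGE_SIZE"). The kernel
must be built with soft-dirty support for the bit to be meaningful.
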
Internally, to do this tracking, the writable bit is cleared from PTEs
when the soft-dirty bit is cleared. After this, when the task tries to
modify a page at some virtual address, a page fault (#PF) occurs and the
kernel sets the soft-dirty bit on the respective PTE.

Note that although the task's whole address space is marked read-only
after the soft-dirty bits are cleared, the #PFs that occur afterwards are
processed quickly. This is because the pages are still mapped to physical
memory, so all the kernel does is detect this and set both the writable and
soft-dirty bits on the PTE.

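The fault path described above can be illustrated with a toy user-space
model. This is only a sketch. ToyPTE, clear_refs() and write_page() are
hypothetical names invented for the illustration and do not correspond to
kernel code. The model shows that the first write after a clear takes one
fault that sets both bits, and that later writes to the same page take none.

```python
from dataclasses import dataclass


@dataclass
class ToyPTE:
    """Hypothetical stand-in for a PTE: only the two bits discussed here."""
    writable: bool = True
    soft_dirty: bool = True
    faults: int = 0   # bookkeeping for the demo, not a real PTE field


def clear_refs(ptes):
    """Model of clearing soft-dirty: the writable bit is cleared with it."""
    for pte in ptes:
        pte.soft_dirty = False
        pte.writable = False


def write_page(pte):
    """Model of a store: a read-only PTE faults once, then stays writable."""
    if not pte.writable:
        # "#PF": the page is still mapped, so the handler only flips bits.
        pte.faults += 1
        pte.writable = True
        pte.soft_dirty = True
    # The store itself would happen here.


ptes = [ToyPTE() for _ in range(4)]
clear_refs(ptes)
write_page(ptes[1])
write_page(ptes[1])   # second write to the same page: no further fault
dirty = [i for i, p in enumerate(ptes) if p.soft_dirty]
```

After this runs, only page 1 is reported as soft-dirty, it has taken
exactly one fault, and the untouched pages remain read-only.
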
While in most cases tracking memory changes by #PFs is more than enough,
there is still a scenario in which soft-dirty bits can be lost: a task
unmaps a previously mapped memory region and then maps a new one at exactly
the same place. When unmap is called, the kernel internally clears PTE
values, including soft-dirty bits. To notify user-space applications about
such memory region renewal, the kernel always marks new memory regions (and
expanded regions) as soft-dirty.

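The unmap-then-map scenario can also be sketched as a toy model. The class
and its region-level flag are assumptions made for the illustration, and the
real kernel bookkeeping may differ. The point is only that a freshly created
region is reported as soft-dirty even though its per-page bits were never
set by a write.

```python
class ToyRegion:
    """Hypothetical model of a mapped region with per-page soft-dirty bits."""

    def __init__(self, npages):
        # Assumption of the model: a new region is marked soft-dirty as a whole.
        self.region_soft_dirty = True
        self.page_soft_dirty = [False] * npages

    def clear_refs(self):
        self.region_soft_dirty = False
        self.page_soft_dirty = [False] * len(self.page_soft_dirty)

    def write(self, page):
        self.page_soft_dirty[page] = True

    def reported(self, page):
        # What a tracker would observe for this page.
        return self.region_soft_dirty or self.page_soft_dirty[page]


address_space = {0x1000: ToyRegion(2)}
address_space[0x1000].clear_refs()
before = address_space[0x1000].reported(0)   # clean after the clear

del address_space[0x1000]                # unmap: per-page bits are gone
address_space[0x1000] = ToyRegion(2)     # new mapping at the same address
after = address_space[0x1000].reported(0)
```

Without the region-level mark, the page would look clean both before and
after the remap, and the tracker would miss that the region's contents had
been replaced.
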
This feature is actively used by the checkpoint-restore project. You can
find more details about it at http://criu.org


-- Pavel Emelyanov, Apr 9, 2013