David Teigland b790c3b7c3 [DLM] can miss clearing resend flag
A long, complicated sequence of events, beginning with the RESEND flag not
being cleared on an lkb, can result in an unlock never completing.

- lkb on waiters list for remote lookup
- the remote node is both the dir node and the master node, so
  it optimizes the lookup into a request and sends a request
  reply back
- the request reply is saved on the requestqueue to be processed
  after recovery
- recovery runs dlm_recover_waiters_pre() which sets RESEND flag
  so the lookup will be resent after recovery
- end of recovery: process_requestqueue takes saved request reply
  which removes the lkb off the waiters list, _without_ clearing
  the RESEND flag
- end of recovery: dlm_recover_waiters_post() doesn't do anything
  with the now completed lookup lkb (would usually clear RESEND)
- later, the node unmounts, unlocks this lkb that still has RESEND
  flag set
- the lkb is on the waiters list again, now for unlock, when recovery
  occurs; dlm_recover_waiters_pre() sees the lkb for unlock with RESEND
  set, but doesn't do anything since the master still exists
- end of recovery: dlm_recover_waiters_post() takes this lkb off
  the waiters list because it has the RESEND flag set, then reports
  an error because unlocks are never supposed to be handled in
  recover_waiters_post().
- later, the unlock reply is received, doesn't find the lkb on
  the waiters list because recover_waiters_post() has wrongly
  removed it.
- the unlock operation has been lost, and we're left with a
  stray granted lock
- unmount spins waiting for the unlock to complete

The visible evidence of this problem is a node where gfs umount is
spinning, the dlm waiters list is empty, and the dlm locks list shows
a granted lock.

The fix is simply to clear the RESEND flag when taking an lkb off the
waiters list.
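
The sequence above can be replayed with a small user-space toy model. This is
only an illustrative sketch, not the kernel code: struct toy_lkb, the helper
functions and simulate_stray_lock() are hypothetical names invented here, and
the recovery steps are reduced to the few state changes described in this
message. The clear_resend argument stands in for the fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical, not the DLM API. */
enum wait_type { WAIT_NONE, WAIT_LOOKUP, WAIT_UNLOCK };

struct toy_lkb {
    bool on_waiters;          /* lkb is on the waiters list */
    bool resend;              /* stands in for the RESEND flag */
    bool granted;             /* lock is currently granted */
    enum wait_type wait_type; /* what reply the lkb is waiting for */
};

static void add_to_waiters(struct toy_lkb *lkb, enum wait_type type)
{
    assert(!lkb->on_waiters);
    lkb->on_waiters = true;
    lkb->wait_type = type;
}

/* clear_resend == true models the fix; false models the old behavior. */
static void remove_from_waiters(struct toy_lkb *lkb, bool clear_resend)
{
    lkb->on_waiters = false;
    lkb->wait_type = WAIT_NONE;
    if (clear_resend)
        lkb->resend = false;
}

/* Before recovery: a pending lookup is flagged so it is resent later.
 * A pending unlock to a master that still exists is left alone. */
static void recover_waiters_pre(struct toy_lkb *lkb)
{
    if (lkb->on_waiters && lkb->wait_type == WAIT_LOOKUP)
        lkb->resend = true;
}

/* After recovery: anything on the waiters list with RESEND set is taken
 * off the list. For an unlock this is the wrong thing to do. */
static void recover_waiters_post(struct toy_lkb *lkb)
{
    if (lkb->on_waiters && lkb->resend) {
        lkb->on_waiters = false;
        lkb->wait_type = WAIT_NONE;
        lkb->resend = false;
    }
}

/* The unlock reply only completes if the lkb is still waiting for it. */
static bool receive_unlock_reply(struct toy_lkb *lkb)
{
    if (!lkb->on_waiters || lkb->wait_type != WAIT_UNLOCK)
        return false; /* reply finds no waiter and is dropped */
    lkb->on_waiters = false;
    lkb->wait_type = WAIT_NONE;
    lkb->granted = false;
    return true;
}

/* Replays the sequence; returns true if a stray granted lock is left. */
static bool simulate_stray_lock(bool clear_resend)
{
    struct toy_lkb lkb = { false, false, false, WAIT_NONE };

    add_to_waiters(&lkb, WAIT_LOOKUP);       /* remote lookup pending */
    recover_waiters_pre(&lkb);               /* first recovery sets RESEND */
    remove_from_waiters(&lkb, clear_resend); /* saved request reply processed */
    lkb.granted = true;
    recover_waiters_post(&lkb);              /* lkb not on list: nothing to do */

    add_to_waiters(&lkb, WAIT_UNLOCK);       /* unmount unlocks the lkb */
    recover_waiters_pre(&lkb);               /* second recovery: master alive */
    recover_waiters_post(&lkb);              /* stale RESEND removes the unlock */
    receive_unlock_reply(&lkb);

    return lkb.granted;
}
```

In this model, simulate_stray_lock(false) leaves the lock granted with an
empty waiters list, matching the symptom described above, while
simulate_stray_lock(true) lets the unlock reply complete normally.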

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:50 -05:00