Sophie: kernel-2.6.18-238.el5 src

kernel-2.6.18-238.el5.src.rpm

From: Jonathan Brassow <jbrassow@redhat.com>
Subject: [RHEL5 PATCH 1 of 1] device-mapper mirroring - fix sync status 	change
Date: Tue, 28 Nov 2006 15:20:39 -0600
Bugzilla: 217582
Message-Id: <1164748839.20059.12.camel@hydrogen.msp.redhat.com>
Changelog: device-mapper mirroring - fix sync status change


Bug #217582 (a dup of RHEL4 bug #217581)

A previous patch designed to update the sync status of a mirror was to
hasty.  The original idea was to make a machine immediately aware of a
change from sync -> out-of-sync when dealing with cluster mirrors.
However, the machines need to discover this on their own, otherwise they
cannot tell if they can switch a primary device when it fails...

 brassow

We must only allow do_recovery to mark ms->in_sync as 1.  It is
the job of the fault handling code (like __bio_mark_nosync) to
mark ms->in_sync as 0, if necessary.

If do_recovery handles this, it is possible for us not to be able
to switch primary devices in the case of cluster mirroring.  The
scenario is:

0) Mirror is in-sync
1) Node1 writes to disk, but write fails to the primary device
2) Node1 increments the error count for that device
3) Node1 checks ms->in_sync to see if it is safe to switch the
   primary.  (We cannot switch the primary if other devices are
   not in-sync.  This would lead to bad data being read.)
4) Node1 switches the primary because the mirror is in-sync, then
   marks the region out-of-sync and ms->in_sync = 0.
5) Node2 writes and fails to the primary device
6) Node2 increments the error count for that device
7) Node2 checks ms->in_sync to see if it is safe to switch the
   primary.  It isn't because do_recovery has stepped in and changed
   ms->in_sync when it shouldn't have.
Index: linux-2.6.18-rhel5/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.18-rhel5.orig/drivers/md/dm-raid1.c	2006-11-07 15:41:53.000000000 -0600
+++ linux-2.6.18-rhel5/drivers/md/dm-raid1.c	2006-11-07 16:05:11.000000000 -0600
@@ -766,19 +766,14 @@ static void do_recovery(struct mirror_se
 	}
 
 	/*
-	 * Update the in sync flag if necessary.
-	 * Raise an event when the mirror becomes in-sync.
-	 *
-	 * After recovery completes, the mirror becomes in_sync.
-	 * Only an I/O failure can then take it back out-of-sync.
+	 * Update the in sync flag.
 	 */
-	if (log->type->get_sync_count(log) == ms->nr_regions) {
-		if (!ms->in_sync) {
-			dm_table_event(ms->ti->table);
-			ms->in_sync = 1;
-		}
-	} else if (ms->in_sync)
-		ms->in_sync = 0;
+	if (!ms->in_sync &&
+	    (log->type->get_sync_count(log) == ms->nr_regions)) {
+		/* the sync is complete */
+		dm_table_event(ms->ti->table);
+		ms->in_sync = 1;
+	}
 }
 
 /*-----------------------------------------------------------------
@@ -1008,10 +1003,9 @@ static void __bio_mark_nosync(struct mir
 	region_t region = bio_to_region(rh, bio);
 	int recovering = 0;
 
-	ms->in_sync = 0;
-
 	/* We must inform the log that the sync count has changed. */
 	log->type->set_region_sync(log, region, 0);
+	ms->in_sync = 0;
 
 	read_lock(&rh->hash_lock);
 	reg = __rh_find(rh, region);