Sophie: kernel-2.6.18-194.11.1.el5 src

kernel-2.6.18-194.11.1.el5.src.rpm

From: Markus Armbruster <armbru@redhat.com>
Date: Fri, 6 Feb 2009 10:17:09 +0100
Subject: [xen] fbfront dirty race
Message-id: 87ljsjapre.fsf@pike.pond.sub.org
O-Subject: [PATCH RHEL-5.4] Xen fbfront dirty race
Bugzilla: 456893
RH-Acked-by: Chris Lalancette <clalance@redhat.com>
RH-Acked-by: Don Dutile <ddutile@redhat.com>

The Xen PV framebuffer frontend contains a race condition that could
lead to a bogus dirty rectangle.  The code has a WARN_ON() for that.
Ancient backends can misbehave when they receive bogus rectangles (bug
443392, fixed in xen 3.0.3-63.el5).

The race was found to trigger occasionally during guest shutdown, at
least on some machines.

Here's how I think it bites:

xenfb_thread()          check dirty; yes, dirty = 0 (not holding lock)
|
| __xenfb_refresh()     extend dirty rect; dirty = 1 (holding lock)
|
xenfb_update_screen()   get & clear dirty rect (holding lock)
xenfb_thread()          loop around
xenfb_thread()          check dirty; yes, dirty = 0 (not holding lock)
xenfb_update_screen()   bogus rect

The bug was fixed in http://xenbits.xensource.com/linux-2.6.18-xen.hg
in the following changesets:

changeset:   789:25cc543a02e8
user:        Keir Fraser <keir.fraser@citrix.com>
date:        Tue Feb 03 13:59:17 2009 +0000
summary:     fbfront: Improve diagnostics when kthread_run() fails

changeset:   788:26ddc59c674d
user:        Keir Fraser <keir.fraser@citrix.com>
date:        Fri Jan 30 10:54:10 2009 +0000
summary:     xenfb: Revert mm_lock changes. They're not needed.

changeset:   783:8197c86e6729
user:        Keir Fraser <keir.fraser@citrix.com>
date:        Thu Jan 29 11:28:58 2009 +0000
summary:     xenfb: eliminate the update_wanted field.

changeset:   781:c9783c08495c
user:        Keir Fraser <keir.fraser@citrix.com>
date:        Wed Jan 28 13:41:33 2009 +0000
summary:     xenfb: fix xenfb_update_screen bogus rect

Pvops upstream is not affected, because it delegates the hairy parts
to fb_defio.

The appended patch is a backport of upstream's fix, except for
changeset 783.  That one changes the way the kthread is used, which I
didn't feel like risking in RHEL just for cleanup.

Note that 788 reverts a part of 781, namely an unnecessary and not
obviously correct locking change.  Locking is rather subtle there...

Bug 456893.  Please ACK.

diff --git a/drivers/xen/fbfront/xenfb.c b/drivers/xen/fbfront/xenfb.c
index f9aaf56..a4508ee 100644
--- a/drivers/xen/fbfront/xenfb.c
+++ b/drivers/xen/fbfront/xenfb.c
@@ -119,12 +119,19 @@ static void xenfb_update_screen(struct xenfb_info *info)
 	mutex_lock(&info->mm_lock);
 
 	spin_lock_irqsave(&info->dirty_lock, flags);
-	y1 = info->y1;
-	y2 = info->y2;
-	x1 = info->x1;
-	x2 = info->x2;
-	info->x1 = info->y1 = INT_MAX;
-	info->x2 = info->y2 = 0;
+	if (info->dirty){
+		info->dirty = 0;
+		y1 = info->y1;
+		y2 = info->y2;
+		x1 = info->x1;
+		x2 = info->x2;
+		info->x1 = info->y1 = INT_MAX;
+		info->x2 = info->y2 = 0;
+	} else {
+		spin_unlock_irqrestore(&info->dirty_lock, flags);
+		mutex_unlock(&info->mm_lock);
+		return;
+	}
 	spin_unlock_irqrestore(&info->dirty_lock, flags);
 
 	list_for_each_entry(map, &info->mappings, link) {
@@ -150,10 +157,7 @@ static int xenfb_thread(void *data)
 	struct xenfb_info *info = data;
 
 	while (!kthread_should_stop()) {
-		if (info->dirty) {
-			info->dirty = 0;
-			xenfb_update_screen(info);
-		}
+		xenfb_update_screen(info);
 		wait_event_interruptible(info->wq,
 			kthread_should_stop() || info->dirty);
 		try_to_freeze();
@@ -479,15 +483,6 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
 	}
 	info->fb_info = fb_info;
 
-	/* FIXME should this be delayed until backend XenbusStateConnected? */
-	info->kthread = kthread_run(xenfb_thread, info, "xenfb thread");
-	if (IS_ERR(info->kthread)) {
-		ret = PTR_ERR(info->kthread);
-		info->kthread = NULL;
-		xenbus_dev_fatal(dev, ret, "register_framebuffer");
-		goto error;
-	}
-
 	ret = xenfb_connect_backend(dev, info);
 	if (ret < 0)
 		goto error;
@@ -649,6 +644,13 @@ static void xenfb_backend_changed(struct xenbus_device *dev,
 			val = 0;
 		if (val)
 			info->update_wanted = 1;
+
+		info->kthread = kthread_run(xenfb_thread, info, "xenfb thread");
+		if (IS_ERR(info->kthread)) {
+			info->kthread = NULL;
+			xenbus_dev_fatal(dev, PTR_ERR(info->kthread),
+					"xenfb_thread");
+		}
 		break;
 
 	case XenbusStateClosing: