Sophie

Sophie

distrib > Scientific%20Linux > 5x > x86_64 > by-pkgid > 27922b4260f65d317aabda37e42bbbff > files > 1521

kernel-2.6.18-238.el5.src.rpm

From: George Beshers <gbeshers@redhat.com>
Date: Thu, 31 Jul 2008 15:33:50 -0400
Subject: [IA64] Remove needless delay in MCA rendezvous
Message-id: 20080731192755.4411.80652.sendpatchset@dhcp-100-2-194.bos.redhat.com
O-Subject: [RHEL5.3 PATCH 14/19] [IA64] Remove needless delay in MCA rendezvous
Bugzilla: 455308
RH-Acked-by: Prarit Bhargava <prarit@redhat.com>

[IA64] Remove needless delay in MCA rendezvous

BZ#455308

Upstream: http://git.kernel.org/?p=linux/kernel/git/aegl/linux-2.6.git;a=commit;h=2bc5c282999af41042c2b703bf3a58ca1d7e3ee2

While testing the MCA recovery code, noticed that some machines would have a
five second delay rendezvousing cpus.  What was happening is that
ia64_wait_for_slaves() would check to see if all the slave CPUs had
rendezvoused.  If any had not, it would wait 1 millisecond then check again.
If any CPUs had still not rendezvoused, it would wait 5 seconds before
checking again.

On some configs the rendezvous takes more than 1 millisecond, causing the code
to wait the full 5 seconds, even though the last CPU rendezvoused after only
a few milliseconds.

The fix is to check every 1 millisecond to see if all the cpus have
rendezvoused.  After 5 seconds the code concludes the CPUs will never
rendezvous (same as before).

The MCA code is, by definition, not performance critical, but a needless
delay of 5 seconds is senseless.  The 5 seconds also adds up quickly
when running the error injection code in a loop.

This patch both simplifies the code and removes the needless delay.

Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

diff --git a/arch/ia64/kernel/mca.c b/arch/ia64/kernel/mca.c
index b90c505..84289e3 100644
--- a/arch/ia64/kernel/mca.c
+++ b/arch/ia64/kernel/mca.c
@@ -1133,30 +1133,27 @@ no_mod:
 static void
 ia64_wait_for_slaves(int monarch, const char *type)
 {
-	int c, wait = 0, missing = 0;
-	for_each_online_cpu(c) {
-		if (c == monarch)
-			continue;
-		if (ia64_mc_info.imi_rendez_checkin[c] == IA64_MCA_RENDEZ_CHECKIN_NOTDONE) {
-			udelay(1000);		/* short wait first */
-			wait = 1;
-			break;
-		}
-	}
-	if (!wait)
-		goto all_in;
-	for_each_online_cpu(c) {
-		if (c == monarch)
-			continue;
-		if (ia64_mc_info.imi_rendez_checkin[c] == IA64_MCA_RENDEZ_CHECKIN_NOTDONE) {
-			udelay(5*1000000);	/* wait 5 seconds for slaves (arbitrary) */
-			if (ia64_mc_info.imi_rendez_checkin[c] == IA64_MCA_RENDEZ_CHECKIN_NOTDONE)
-				missing = 1;
-			break;
+	int c, i , wait;
+
+	/*
+	 * wait 5 seconds total for slaves (arbitrary)
+	 */
+	for (i = 0; i < 5000; i++) {
+		wait = 0;
+		for_each_online_cpu(c) {
+			if (c == monarch)
+				continue;
+			if (ia64_mc_info.imi_rendez_checkin[c]
+					== IA64_MCA_RENDEZ_CHECKIN_NOTDONE) {
+				udelay(1000);		/* short wait */
+				wait = 1;
+				break;
+			}
 		}
+		if (!wait)
+			goto all_in;
 	}
-	if (!missing)
-		goto all_in;
+
 	/*
 	 * Maybe slave(s) dead. Print buffered messages immediately.
 	 */