kernel-2.6.18-128.1.10.el5.src.rpm

From: Scott Moser <smoser@redhat.com>
Subject: [RHEL5.1 PATCH] bz225481 EEH is improperly enabled for some Power4 systems
Date: Wed, 30 May 2007 17:18:40 -0400 (EDT)
Bugzilla: 225481
Message-Id: <Pine.LNX.4.64.0705291602290.5134@squad5-lp1.lab.boston.redhat.com>
Changelog: [ppc] EEH is improperly enabled for some Power4 systems


RHBZ#: 225481
------
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=225481

Description:
------------
    [POWERPC] pSeries: EEH improperly enabled for some Power4 systems

    It appears that EEH is improperly enabled for some Power4 systems.
    On these systems, the ibm,set-eeh-option RTAS call returns success
    even when EEH is not supported on the given node. Thus, an explicit
    check for support is required.

    During boot, on power4, without this patch, one sees messages
    similar to:

    EEH: event on unsupported device, rc=0 dn=/pci@400000000110/IBM,sp@1
    EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2
    EEH: event on unsupported device, rc=0 dn=/pci@400000000110/pci@2,2
    etc.

    The patch makes these go away.

    Without this patch, EEH recovery does seem to work correctly for
    at least some devices (I tested ethernet e1000), but fails to
    recover others (the Emulex LightPulse LPFC, most notably).
    Off the top of my head, I don't remember why some devices are
    affected, but not others.

    The PAPR indicates that the correct way to test for EEH is as
    done in this patch; it's not clear to me if this was in the PAPR
    all along, or recently added; if it was there all along, it's not
    clear to me why this hadn't been fixed long ago. I suspect only
    certain firmware levels are affected.
    
    Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
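
The decision the patch introduces can be sketched in isolation as follows.
This is a minimal user-space illustration, not the kernel code: the
struct fw_result type and the eeh_supported() helper are hypothetical
names invented here, and the firmware return values are supplied by the
caller rather than obtained from RTAS. It assumes, as the diff below does,
that a return code of 0 means the call succeeded and that the second
output word of the slot reset state query equals 1 when EEH is supported.

```c
#include <stdbool.h>

/* Hypothetical container for the results of the two firmware queries.
 * In the kernel these values come from rtas_call() and
 * read_slot_reset_state(); here the caller fills them in by hand. */
struct fw_result {
	int set_option_rc;     /* return code of ibm,set-eeh-option */
	int reset_state_rc;    /* return code of the slot reset state query */
	unsigned int rets[3];  /* output words of the slot reset state query */
};

/* Report EEH as supported only when the enable call succeeded AND the
 * slot reset state query also succeeded and reports support in rets[1].
 * A successful ibm,set-eeh-option alone is not trusted. */
bool eeh_supported(const struct fw_result *r)
{
	if (r->set_option_rc != 0)
		return false;

	return r->reset_state_rc == 0 && r->rets[1] == 1;
}
```

With this check, a node where ibm,set-eeh-option returns 0 but the
follow-up query does not report support is left with EEH disabled, which
is the case described above for the affected Power4 systems.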

RHEL Version Found:
-------------------
Bug filed against 5.0

Upstream Status:
----------------
This code is upstream in 2.6.20
git commit [1] 25c4a46f0ed8ece9ac6699e200fcc83a4642dce7

Test Status:
------------
This patch has been built against 2.6.18-19.el5, and built
successfully in brew task 794835 [2].

Testing was done by Linas Vepstas of IBM.

Proposed Patch:
----------------
Please review and ACK for RHEL5.1

-- 
[1] http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=25c4a46f0ed8ece9ac6699e200fcc83a4642dce7
[2] http://brewweb.devel.redhat.com/brew/taskinfo?taskID=794835

---
 arch/powerpc/platforms/pseries/eeh.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Index: b/arch/powerpc/platforms/pseries/eeh.c
===================================================================
--- a/arch/powerpc/platforms/pseries/eeh.c
+++ b/arch/powerpc/platforms/pseries/eeh.c
@@ -746,6 +746,7 @@ struct eeh_early_enable_info {
 /* Enable eeh for the given device node. */
 static void *early_enable_eeh(struct device_node *dn, void *data)
 {
+	unsigned int rets[3];
 	struct eeh_early_enable_info *info = data;
 	int ret;
 	char *status = get_property(dn, "status", NULL);
@@ -802,16 +803,14 @@ static void *early_enable_eeh(struct dev
 		                regs[0], info->buid_hi, info->buid_lo,
 		                EEH_ENABLE);
 
+		enable = 0;
 		if (ret == 0) {
-			eeh_subsystem_enabled = 1;
-			pdn->eeh_mode |= EEH_MODE_SUPPORTED;
 			pdn->eeh_config_addr = regs[0];
 
 			/* If the newer, better, ibm,get-config-addr-info is supported, 
 			 * then use that instead. */
 			pdn->eeh_pe_config_addr = 0;
 			if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
-				unsigned int rets[2];
 				ret = rtas_call (ibm_get_config_addr_info, 4, 2, rets, 
 					pdn->eeh_config_addr, 
 					info->buid_hi, info->buid_lo,
@@ -819,6 +818,20 @@ static void *early_enable_eeh(struct dev
 				if (ret == 0)
 					pdn->eeh_pe_config_addr = rets[0];
 			}
+
+			/* Some older systems (Power4) allow the
+			 * ibm,set-eeh-option call to succeed even on nodes
+			 * where EEH is not supported. Verify support
+			 * explicitly. */
+			ret = read_slot_reset_state(pdn, rets);
+			if ((ret == 0) && (rets[1] == 1))
+				enable = 1;
+		}
+
+		if (enable) {
+			eeh_subsystem_enabled = 1;
+			pdn->eeh_mode |= EEH_MODE_SUPPORTED;
+
 #ifdef DEBUG
 			printk(KERN_DEBUG "EEH: %s: eeh enabled, config=%x pe_config=%x\n",
 			       dn->full_name, pdn->eeh_config_addr, pdn->eeh_pe_config_addr);