From: AMEET M. PARANJAPE <aparanja@redhat.com>
Date: Mon, 8 Jun 2009 16:08:42 -0400
Subject: [scsi] ipr: fix PCI permanent error handler
Message-id: 20090608200528.23882.84166.sendpatchset@squad5-lp1.lab.bos.redhat.com
O-Subject: [PATCH RHEL5.4 BZ503960] ipr: fix PCI permanent error handler
Bugzilla: 503960
RH-Acked-by: Pete Zaitcev <zaitcev@redhat.com>
RH-Acked-by: Mike Christie <mchristi@redhat.com>
RH-Acked-by: Stefan Assmann <sassmann@redhat.com>
RH-Acked-by: Prarit Bhargava <prarit@redhat.com>
RH-Acked-by: David Howells <dhowells@redhat.com>

RHBZ#:
======
https://bugzilla.redhat.com/show_bug.cgi?id=503960

Description:
===========
The ipr driver can hang if it encounters enough PCI errors to trigger the
permanent error handler. The driver will attempt to initiate a "bringdown"
of the adapter and fail all pending ops back. In this code path, we end up
failing back with allow_cmds still set to 1. This results in some commands
getting immediately re-issued to the adapter on the done call, which results
in an infinite loop in ipr_fail_all_ops. Fix this by setting allow_cmds to
zero in this path.

RHEL Version Found:
================
RHEL 5.3

kABI Status:
============
No symbols were harmed.

Brew:
=====
Built on all platforms.
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1831949

Upstream Status:
================
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=6e145ad73987cfc8375e5396073dd2692e07bd15

Test Status:
============
A testcase is given in the Bugzilla. Without the patch a PPC box with IPR
device will hang when trying to remove ipr module after the 6th EEH error
inject. With this patch the hang is not seen.

===============================================================
Ameet Paranjape
978-392-3903 ext 23903
IBM on-site partner

Proposed Patch:
===============
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index b6fec36..63049ce 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -7045,6 +7045,7 @@ static void ipr_pci_perm_failure(struct pci_dev *pdev)
 	ioa_cfg->sdt_state = ABORT_DUMP;
 	ioa_cfg->reset_retries = IPR_NUM_RESET_RELOAD_RETRIES;
 	ioa_cfg->in_ioa_bringdown = 1;
+	ioa_cfg->allow_cmds = 0;
 	ipr_initiate_ioa_reset(ioa_cfg, IPR_SHUTDOWN_NONE);
 	spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags);
 }
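
Illustration (not part of the patch):
=====================================
The requeue loop from the Description can be modelled in user space. This is
only a sketch under stated assumptions: struct model_ioa, model_queue(),
model_done(), model_fail_all_ops(), model_perm_failure() and the iteration
cap are hypothetical names invented for this note and are not taken from
ipr.c. Only the role of the allow_cmds flag comes from the Description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the ipr driver's structures. */
enum { MAX_PENDING = 8 };

struct model_ioa {
	bool allow_cmds;          /* plays the role of ioa_cfg->allow_cmds */
	int pending[MAX_PENDING]; /* ids of commands queued to the adapter */
	size_t npending;
	unsigned failed;          /* number of done callbacks made so far */
};

/* Queue a command unless the adapter is refusing new ones or is full. */
static bool model_queue(struct model_ioa *ioa, int id)
{
	if (!ioa->allow_cmds || ioa->npending == MAX_PENDING)
		return false;
	ioa->pending[ioa->npending++] = id;
	return true;
}

/* Done callback: the upper layer immediately retries the failed command. */
static void model_done(struct model_ioa *ioa, int id)
{
	ioa->failed++;
	model_queue(ioa, id);
}

/*
 * Fail every pending command back. Returns true once the pending list is
 * empty, or false if max_iter pops were not enough to drain it. The cap
 * exists only so the model terminates; it is an assumption of this sketch.
 */
static bool model_fail_all_ops(struct model_ioa *ioa, unsigned max_iter)
{
	unsigned iter = 0;

	assert(ioa != NULL);
	while (ioa->npending > 0) {
		if (iter++ == max_iter)
			return false;
		int id = ioa->pending[--ioa->npending];
		model_done(ioa, id);
	}
	return true;
}

/* Model the permanent-failure path with or without clearing allow_cmds. */
static bool model_perm_failure(bool clear_allow_cmds)
{
	struct model_ioa ioa = { .allow_cmds = true };

	model_queue(&ioa, 1);
	model_queue(&ioa, 2);
	if (clear_allow_cmds)
		ioa.allow_cmds = false;
	return model_fail_all_ops(&ioa, 1000);
}
```

In this model, model_perm_failure(false) never drains the list because each
done callback requeues the command it was handed, while
model_perm_failure(true) drains it after one pass. That matches the behaviour
the Description attributes to leaving allow_cmds set versus clearing it.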