kernel-2.6.18-128.1.10.el5.src.rpm

From: Doug Ledford <dledford@redhat.com>
Date: Thu, 11 Dec 2008 12:08:06 -0500
Subject: [openib] fix ipoib oops in unicast_arp_send
Message-id: 1229015286.32405.103.camel@firewall.xsintricity.com
O-Subject: [Patch RHEL5.3] Fix ipoib oops in unicast_arp_send
Bugzilla: 476005
RH-Acked-by: Peter Martuccelli <peterm@redhat.com>

This addresses https://bugzilla.redhat.com/show_bug.cgi?id=476005

After the last IPoIB oops patch, further testing turned up one more
crash, this time in unicast_arp_send().  The fix has already been
submitted and accepted upstream into the mainline kernel and into
OFED 1.4.  I can't reproduce the problem myself (my cluster is too
small), but IBM reports that this fix resolves an issue that caused
something like 1400 machines to drop out of a cluster simultaneously.

--
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband

commit ff79ae80837cf45cb703b34824dd3862d2ddcb24
Author: Yossi Etigin <yosefe@Voltaire.COM>
Date:   Wed Nov 12 10:24:39 2008 -0800

    IPoIB: Fix crash in path_rec_completion()

    Fix a crash in path_rec_completion() during an SM up/down loop.  If
    more than one path record request is issued, the first completion
    releases path->done, allowing ipoib_flush_paths() to free the path,
    and thus corrupting it for the second completion.

    Commit ee1e2c82 ("IPoIB: Refresh paths instead of flushing them on SM
    change events") added the field path->valid and changed the test "if
    (!path)" to "if (!path || !path->valid)".  This change made it
    possible for a path with an outstanding query to pass the test and
    issue another query on the same path.  Having two queries on the same
    path leads to a crash.

    This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1325>.

    Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
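
The guard added by this patch can be illustrated with a small user-space
model.  This is only a sketch under stated assumptions: the model_* names,
the struct layout and the queries_started counter are hypothetical
stand-ins rather than the driver's real definitions, and the model assumes
that the completion handler clears path->query.  It shows that a second
send on a path with an outstanding query no longer issues a second query.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real ipoib structs. */
struct model_query { int id; };

struct model_path {
	struct model_query *query;	/* non-NULL while a query is outstanding */
	int queries_started;		/* number of queries issued on this path */
};

static struct model_query the_query = { 1 };

/*
 * Models issuing a path record query.  The assert encodes the invariant
 * the commit message describes: at most one query per path at a time.
 * Returns 0 on success.
 */
static int model_path_rec_start(struct model_path *path)
{
	assert(path->query == NULL);
	path->query = &the_query;
	path->queries_started++;
	return 0;
}

/*
 * Models the patched test in unicast_arp_send(): a new query is started
 * only when none is outstanding.  Returns nonzero only if starting a
 * new query failed.
 */
static int model_send_guarded(struct model_path *path)
{
	if (!path->query && model_path_rec_start(path))
		return -1;
	return 0;
}

/* Models the completion handler clearing the outstanding query (assumed). */
static void model_path_rec_completion(struct model_path *path)
{
	path->query = NULL;
}
```

Calling model_send_guarded() twice in a row on the same model_path starts
only one query.  Without the !path->query test, the second call would reach
model_path_rec_start() with a query still outstanding and trip the assert,
which is the model's analogue of the double-query situation described above.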

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 0f03d20..f7028ff 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -669,7 +669,7 @@ static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
 			skb_push(skb, sizeof *phdr);
 			__skb_queue_tail(&path->queue, skb);
 
-			if (path_rec_start(dev, path)) {
+			if (!path->query && path_rec_start(dev, path)) {
 				spin_unlock(&priv->lock);
 				path_free(dev, path);
 				return;