kernel-2.6.18-128.1.10.el5.src.rpm

From: AMEET M. PARANJAPE <aparanja@redhat.com>
Date: Thu, 18 Dec 2008 15:15:21 -0500
Subject: [openib] restore traffic in connected mode on HCA
Message-id: 20081218201442.8117.91571.sendpatchset@squad5-lp1.lab.bos.redhat.com
O-Subject: [PATCH RHEL5.3 BZ477000] restore data transmission in connected mode on any HCA
Bugzilla: 477000
RH-Acked-by: Doug Ledford <dledford@redhat.com>
RH-Acked-by: David Howells <dhowells@redhat.com>

RHBZ#:
======
https://bugzilla.redhat.com/show_bug.cgi?id=477000

Description:
===========
This patch restores traffic between systems using IPoIB connected mode (CM). It
sets up the receive work-request array (rx_wr_arr) for CM mode and makes
ipoib_cm_init_rx_wr() use the sge list that is passed to it.
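
To illustrate the idea, here is a minimal userspace sketch, not the driver
code. The struct names, the constants RX_SG, RECVQ_SIZE, HEAD_SIZE and PAGE_SZ,
and both function names are hypothetical stand-ins for the kernel types and
values that appear in the patch. The sketch reuses one helper for every array
slot for brevity, whereas the patch open-codes the loop in ipoib_cm_dev_init().

```c
#include <assert.h>
#include <stddef.h>

#define RX_SG      4     /* hypothetical: max fragments per receive */
#define RECVQ_SIZE 8     /* hypothetical: stands in for ipoib_recvq_size */
#define HEAD_SIZE  64    /* hypothetical: stands in for IPOIB_CM_HEAD_SIZE */
#define PAGE_SZ    4096  /* hypothetical: stands in for PAGE_SIZE */

/* Simplified stand-ins for struct ib_sge and struct ib_recv_wr. */
struct sge {
	unsigned int length;
	unsigned int lkey;
};

struct recv_wr {
	struct recv_wr *next;
	struct sge     *sg_list;
	int             num_sge;
};

/* One slot of the receive array: a work request plus its own sge list. */
struct rx_wr_entry {
	struct recv_wr wr;
	struct sge     rx_sge[RX_SG];
};

/* Models the patched helper: the WR points at the sge array passed in. */
static void init_rx_wr(struct recv_wr *wr, struct sge *sge,
		       int num_frags, unsigned int lkey)
{
	int i;

	assert(num_frags >= 1 && num_frags <= RX_SG);

	for (i = 0; i < num_frags; ++i)
		sge[i].lkey = lkey;

	/* First fragment holds the head, the rest are page sized. */
	sge[0].length = HEAD_SIZE;
	for (i = 1; i < num_frags; ++i)
		sge[i].length = PAGE_SZ;

	wr->next    = NULL;
	wr->sg_list = sge;	/* the caller's list, not a fixed shared one */
	wr->num_sge = num_frags;
}

/* Models the new loop in ipoib_cm_dev_init(): set up every array slot. */
static void init_rx_wr_arr(struct rx_wr_entry *arr, int n,
			   int num_frags, unsigned int lkey)
{
	int j;

	for (j = 0; j < n; ++j)
		init_rx_wr(&arr[j].wr, arr[j].rx_sge, num_frags, lkey);
}
```

After init_rx_wr_arr() runs, each slot's wr.sg_list points at that slot's own
rx_sge array, which is the property the patch establishes for rx_wr_arr.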

I have tested this patch with netperf (multiple instances) on several
combinations of HCAs, between RHEL 5.3 (build 126) and RHEL 5.2, both between
two System p machines and between a System p and a System x machine.

RHEL Version Found:
================
RHEL 5.3 Beta Snapshot3

kABI Status:
============
No symbols were harmed.

Brew:
=====
Built on all platforms.
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1621075

Upstream Status:
================
This is specific to RHEL 5.3.  The problem is not seen in mainline.

Test Status:
============
To recreate the problem:
1. Install RHEL 5.3 snapshot 6 on any platform and reboot the node; the default
is ipoib-cm mode.
2. Run "ping" from one node to another node; the remote node is unreachable.
3. echo datagram >/sys/class/net/ib0/mode
4. Ping the remote node again; it works in ipoib-ud mode.

To validate the fix:
1. Apply this patch and rebuild/reload the ib_ipoib module.
2. Run "ping" from one node to another node; it works across different
platforms and different HCAs.
3. Run a "netperf/netserver" multiple-streams test; ipoib-cm works fine.
4. echo datagram >/sys/class/net/ib0/mode
5. Run the ping or netperf/netserver test again; it works in ipoib-ud mode.
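
When following the steps above it helps to confirm which mode the interface is
actually in before each test. A minimal sketch of a helper that reads the mode
attribute follows. The function name read_ipoib_mode is hypothetical, and the
sketch assumes the attribute file holds a single line of text such as
"connected" or "datagram".

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style attribute file into buf and strip
 * the trailing newline. Returns 0 on success, -1 on any failure.
 */
static int read_ipoib_mode(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Cut the string at the first newline, if there is one. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass "/sys/class/net/ib0/mode" as the path and compare the
result with the mode expected at that step.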

On RHEL 5.3, to change to UD mode, edit
/etc/sysconfig/network-scripts/ifcfg-ib* to comment out "CONNECTED_MODE" and
"MTU", then execute "/etc/init.d/openibd restart", instead of using
"echo datagram > ....".

The echo method is not supported with RHEL 5.3.
===============================================================
Ameet Paranjape 978-392-3903 ext 23903
IBM on-site partner

Proposed Patch:
===============

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 5f87d20..315a434 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -375,7 +375,7 @@ struct ib_sge *sge)
 			sge[i].length = PAGE_SIZE;
 
 	wr->next    = NULL;
-	wr->sg_list = priv->cm.rx_sge;
+	wr->sg_list = sge;
 	wr->num_sge = priv->cm.num_frags;
 }
 
@@ -1563,7 +1563,7 @@ destory_srq:
 int ipoib_cm_dev_init(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
-	int i, ret;
+	int i, ret, j;
 	struct ib_device_attr attr;
 
 	INIT_LIST_HEAD(&priv->cm.passive_ids);
@@ -1601,6 +1601,25 @@ int ipoib_cm_dev_init(struct net_device *dev)
 		priv->cm.num_frags  = IPOIB_CM_RX_SG;
 	}
 
+	if (ipoib_cm_has_srq(dev)) {
+		for (j = 0; j < ipoib_recvq_size; ++j) {
+			for (i = 0; i < priv->cm.num_frags; ++i)
+				priv->cm.rx_wr_arr[j].rx_sge[i].lkey =
+				priv->mr->lkey;
+
+			priv->cm.rx_wr_arr[j].rx_sge[0].length =
+			IPOIB_CM_HEAD_SIZE;
+			for (i = 1; i < priv->cm.num_frags; ++i)
+				priv->cm.rx_wr_arr[j].rx_sge[i].length =
+				PAGE_SIZE;
+
+			priv->cm.rx_wr_arr[j].wr.sg_list =
+			priv->cm.rx_wr_arr[j].rx_sge;
+			priv->cm.rx_wr_arr[j].wr.num_sge = priv->cm.num_frags;
+		}
+		priv->cm.head = &priv->cm.rx_wr_arr[0];
+	}
+
 	ipoib_cm_init_rx_wr(dev, &priv->cm.rx_wr, priv->cm.rx_sge);
 
 	if (ipoib_cm_has_srq(dev)) {