kernel-2.6.18-238.el5.src.rpm

From: Andy Gospodarek <gospo@redhat.com>
Date: Tue, 2 Jun 2009 10:48:54 -0400
Subject: [net] ixgbe: fix polling saturates CPU
Message-id: 20090602144854.GE10204@shell.devel.redhat.com
O-Subject: Re: [PATCH RHEL5.4 BZ503559] fix Ixgbe polling saturates CPU
Bugzilla: 503559

On Tue, Jun 02, 2009 at 10:09:20AM -0400, AMEET M. PARANJAPE wrote:
> RHBZ#:
> ======
> https://bugzilla.redhat.com/show_bug.cgi?id=503559
>
> Description:
> ===========
> This patch removes the need for tx_clean_complete=true in order to exit
> NAPI polling mode in the ixgbe driver.
>
> Currently, ixgbe never leaves NAPI poll mode, so ksoftirqd consumes about
> 100% of one processor all the time.  It also causes other problems, such as
> failures when disabling the device during an EEH event.
>
> This is a regression against RHEL 5.3, and was introduced in revision 142 with
> the following change:
> "- [net] ixgbe: update to upstream version 2.0.8-k2 (Andy Gospodarek )
> [472547]"
> (linux-2.6-net-ixgbe-update-to-upstream-version-2-0-8-k2.patch).
>
> RHEL Version Found:
> ================
> RHEL 5.3
>
> kABI Status:
> ============
> No symbols were harmed.
>
> Brew:
> =====
> Built on all platforms.
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1821902
>
> Upstream Status:
> ================
> The problem does not happen with the latest mainline kernel.
>
> Test Status:
> ============
> With the patch applied, top reports all instances of ksoftirqd as
> consuming 0% CPU. Even after an uptime of 2 days, top reports a value at or
> near 0 in the TIME+ column, which is what would be expected for a machine
> that is not heavily loaded.
>
> Also verified that the patch fixes the EEH recovery for the device, which
> was failing (another symptom of the same problem).
>
> Regular FVT (including several netperf, ping, ethtool, and statistics
> verifications) was also run successfully.
> ===============================================================
> Ameet Paranjape 978-392-3903 ext 23903
> IBM on-site partner
>

I've been planning a patch to fix this and a few other NAPI problems with
ixgbe and 5.4, and I would prefer to fix this the way upstream does.  I've
yet to post those fixes since I'm working on some ixgbe fixes for 82599,
but I'm happy to use this bug for the NAPI fixes specifically, and I'll
handle the other ones under the other bugs I already have.
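
To make the change to the exit test in ixgbe_poll() easier to follow, here
is a minimal user-space sketch of the old and new conditions taken from the
diff.  It is not driver code: the function names old_should_exit() and
new_should_exit() are hypothetical, and "running" stands in for the result
of netif_running().

```c
#include <stdbool.h>

/*
 * Hypothetical stand-alone model of the exit test in ixgbe_poll().
 * Each function returns true when the driver would leave polling mode.
 */

/* Condition before the patch. */
static bool old_should_exit(bool tx_clean_complete, int work_done,
			    bool running)
{
	return (!tx_clean_complete && work_done == 0) || !running;
}

/* Condition after the patch. */
static bool new_should_exit(bool tx_clean_complete, int work_done,
			    int work_to_do, bool running)
{
	/* Unfinished Tx cleanup is treated as a fully consumed budget. */
	if (!tx_clean_complete)
		work_done = work_to_do;

	return (work_done < work_to_do) || !running;
}
```

On an idle link (tx_clean_complete true, work_done 0, interface running),
old_should_exit() returns false, so polling never ends, while
new_should_exit() returns true for any positive work_to_do.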

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 166ae5d..ff5a79a 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -1049,8 +1049,11 @@ static int ixgbe_clean_rxonly(struct net_device *netdev, int *budget)
 
 	ixgbe_clean_rx_irq(adapter, rx_ring, &work_done, work_to_do);
 
+	*budget -= work_done;
+	netdev->quota -= work_done;
+
 	/* If all Rx work done, exit the polling mode */
-	if ((work_done < work_to_do) || !netif_running(netdev)) {
+	if ((work_done < work_to_do) || !netif_running(adapter->netdev)) {
 quit_polling:
 		netif_rx_complete(netdev);
 		if (adapter->itr_setting & 1)
@@ -1099,12 +1102,15 @@ static int ixgbe_clean_rxonly_many(struct net_device *netdev, int *budget)
 		                      r_idx + 1);
 	}
 
+	*budget -= work_done;
+	netdev->quota -= work_done;
+
 	r_idx = find_first_bit(q_vector->rxr_idx, adapter->num_rx_queues);
 	rx_ring = &(adapter->rx_ring[r_idx]);
 	/* If all Rx work done, exit the polling mode */
 	if ((work_done < work_to_do) || !netif_running(adapter->netdev)) {
 quit_polling:
-		netif_rx_complete(adapter->netdev);
+		netif_rx_complete(netdev);
 		if (adapter->itr_setting & 1)
 			ixgbe_set_itr_msix(q_vector);
 		if (!test_bit(__IXGBE_DOWN, &adapter->state))
@@ -2302,9 +2308,11 @@ static int ixgbe_poll(struct net_device *netdev, int *budget)
 	*budget -= work_done;
 	netdev->quota -= work_done;
 
-	/* If no Tx and not enough Rx work done, exit the polling mode */
-	if ((!tx_clean_complete && (work_done == 0)) ||
-	    !netif_running(adapter->netdev)) {
+	if (!tx_clean_complete)
+		work_done = work_to_do;
+
+	/* If budget not fully consumed, exit the polling mode */
+	if ((work_done < work_to_do) || !netif_running(adapter->netdev)) {
 quit_polling:
 		netif_rx_complete(netdev);
 		if (adapter->itr_setting & 1)
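
The budget and quota accounting that the first two hunks add to the
rx-only poll routines can be modelled in user space as follows.  This is a
sketch under assumptions, not kernel code: struct fake_netdev, fake_poll()
and rx_pending are hypothetical stand-ins, and the return convention (0 when
polling is finished, 1 when more work remains) is my understanding of the
poll interface in kernels of this vintage.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; only the fields used here. */
struct fake_netdev {
	int quota;	/* per-device share of the softirq budget */
	bool polling;	/* true while the device is on the poll list */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Model of a poll callback with the (netdev, int *budget) signature.
 * rx_pending is the number of packets waiting in an imaginary ring.
 * Returns 0 when polling is finished, 1 when the device should stay
 * on the poll list.
 */
static int fake_poll(struct fake_netdev *dev, int *budget, int *rx_pending)
{
	int work_to_do = min_int(*budget, dev->quota);
	int work_done = min_int(*rx_pending, work_to_do);

	*rx_pending -= work_done;

	/* Charge the work against both counters, as the patch adds. */
	*budget -= work_done;
	dev->quota -= work_done;

	/* If all Rx work is done, exit the polling mode. */
	if (work_done < work_to_do) {
		dev->polling = false;	/* stands in for netif_rx_complete() */
		return 0;
	}
	return 1;
}
```

With a quota of 64, a budget of 300 and 100 packets pending, the first call
consumes the whole quota and returns 1; once the quota is refilled, the
second call processes the remaining 36 packets, leaves polling mode and
returns 0.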