From: John Feeney <jfeeney@redhat.com>
Date: Thu, 30 Sep 2010 21:42:47 -0400
Subject: [net] bnx2: improve tx fast path performance
Message-id: <4CA50457.2040407@redhat.com>
Patchwork-id: 28531
O-Subject: [RHEL5.6 PATCH] bnx2: Improve tx fast path performance
Bugzilla: 632057
RH-Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
RH-Acked-by: Andy Gospodarek <gospo@redhat.com>
RH-Acked-by: David S. Miller <davem@redhat.com>

bz632057
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=632057
bnx2: Remove some unnecessary smp_mb() in tx fast path

Description of problem:
smp_mb() inside bnx2_tx_avail() is called twice in the normal
bnx2_start_xmit() path. The full memory barrier is only necessary
during race conditions with tx completion.

Solution:
From the bz: "We can speed up the tx path by replacing smp_mb() in
bnx2_tx_avail() with a compiler barrier. The compiler barrier is to
force the compiler to fetch the tx_prod and tx_cons from memory."

Upstream commit:
bnx2: Remove some unnecessary smp_mb() in tx fast path
commit 11848b964777af9c68d9160582628c2eb11f46d5

Brew:
Successfully built in Brew for all architectures (task_2785135).

Testing:
Sanity tested with Connectathon on several bnx2 NICs. Asked Broadcom
and Dell for testing feedback but have not received any word yet.
Will provide an update when it arrives.

Acks would be appreciated. Thanks.

Signed-off-by: Jarod Wilson <jarod@redhat.com>

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 1c3b3d5..8e99b53 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -251,7 +251,8 @@ static inline u32
 bnx2_tx_avail(struct bnx2 *bp, struct bnx2_tx_ring_info *txr)
 {
 	u32 diff;
-	smp_mb();
+	/* Tell compiler to fetch tx_prod and tx_cons from memory. */
+	barrier();
 
 	/* The ring uses 256 indices for 255 entries, one of them
 	 * needs to be skipped.
@@ -6534,6 +6535,13 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	if (unlikely(bnx2_tx_avail(bp, txr) <= MAX_SKB_FRAGS)) {
 		netif_stop_queue(dev);
+
+		/* netif_tx_stop_queue() must be done before checking
+		 * tx index in bnx2_tx_avail() below, because in
+		 * bnx2_tx_int(), we update tx index before checking for
+		 * netif_tx_queue_stopped().
+		 */
+		smp_mb();
 		if (bnx2_tx_avail(bp, txr) > bp->tx_wake_thresh)
 			netif_wake_queue(dev);
 	}
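For reviewers unfamiliar with the ring accounting the patch touches, the sketch below models bnx2_tx_avail() in userspace C. It is a simplified stand-in, not the driver's actual code: struct tx_ring and the field layout are hypothetical, barrier() is hand-defined as a GCC compiler barrier, and the kernel's unlikely() annotation is dropped. It shows why a compiler barrier suffices on the fast path (it only forces tx_prod and tx_cons to be re-read from memory, emitting no CPU fence), and how the free-running u16 indices wrap.

```c
#include <assert.h>
#include <stdint.h>

/* 256 ring indices back 255 usable entries; one index is skipped. */
#define TX_DESC_CNT      256U
#define MAX_TX_DESC_CNT  (TX_DESC_CNT - 1)

/* Compiler barrier: forces tx_prod/tx_cons to be re-fetched from
 * memory, but unlike smp_mb() emits no CPU fence instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the indices kept in bnx2_tx_ring_info. */
struct tx_ring {
	uint16_t tx_prod;      /* free-running; advanced by xmit path */
	uint16_t tx_cons;      /* free-running; advanced by completion */
	uint32_t tx_ring_size; /* usable entries, e.g. 255 */
};

static inline uint32_t tx_avail(const struct tx_ring *txr)
{
	uint32_t diff;

	barrier();	/* cheap on the fast path; no fence emitted */

	/* u16 indices wrap mod 65536; mask the difference back into
	 * range and account for the one skipped index. */
	diff = (uint32_t)(txr->tx_prod - txr->tx_cons);
	if (diff >= TX_DESC_CNT) {
		diff &= 0xffff;
		if (diff == TX_DESC_CNT)
			diff = MAX_TX_DESC_CNT;
	}
	return txr->tx_ring_size - diff;
}
```

The barrier() here only constrains the compiler; it does nothing for CPU ordering, which is why the second hunk still inserts a real smp_mb() between netif_stop_queue() and the re-check, pairing with the producer-side ordering in bnx2_tx_int().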