Sophie

Sophie

distrib > Scientific%20Linux > 5x > x86_64 > by-pkgid > 27922b4260f65d317aabda37e42bbbff > files > 2474

kernel-2.6.18-238.el5.src.rpm

From: Andy Gospodarek <gospo@redhat.com>
Date: Thu, 30 Oct 2008 11:14:07 -0400
Subject: [net] bonding: allow downed interface before mod remove
Message-id: 20081030151407.GA10438@gospo.rdu.redhat.com
O-Subject: [RHEL5.3 PATCH] bonding: fix panic when taking bond interface down before removing module
Bugzilla: 467244
RH-Acked-by: Neil Horman <nhorman@redhat.com>
RH-Acked-by: David Miller <davem@redhat.com>
RH-Acked-by: Prarit Bhargava <prarit@redhat.com>

A panic was discovered with bonding when using mode 5 or 6 and trying to
remove the slaves from the bond after the interface was taken down.
When calling 'ifconfig bond0 down' the following happens:

    bond_close()
        bond_alb_deinitialize()
            tlb_deinitialize()
                kfree(bond_info->tx_hashtbl)
                bond_info->tx_hashtbl = NULL

Unfortunately if there are still slaves in the bond, when removing the
module the following happens:

    bonding_exit()
        bond_free_all()
            bond_release_all()
                bond_alb_deinit_slave()
                    tlb_clear_slave()
                        tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl
                        u32 next_index = tx_hash_table[index].next

As you might guess we panic when trying to access a few entries into the
table that no longer exists.

I experimented with several options (like moving the calls to
tlb_deinitialize somewhere else), but it really makes the most sense to
be part of the bond_close routine.  It also didn't seem logical move
tlb_clear_slave around too much, so the simplest option seems to add a
check in tlb_clear_slave to make sure we haven't already wiped the
tx_hashtbl away before searching for all the non-existent hash-table
entries that used to point to the slave as the output interface.

I have posted this upstream, but it hasn't been included yet.  I talked
with the upstream bonding maintainer and he seems fine with it, but
would like to test it before approving it for inclusion.  Hopefully
he will have a chance to do this sometime soon....

This will resolve RHBZ 467244.

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 41f7b92..99ef35b 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -167,11 +167,14 @@ static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_
 	/* clear slave from tx_hashtbl */
 	tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl;
 
-	index = SLAVE_TLB_INFO(slave).head;
-	while (index != TLB_NULL_INDEX) {
-		u32 next_index = tx_hash_table[index].next;
-		tlb_init_table_entry(&tx_hash_table[index], save_load);
-		index = next_index;
+	/* skip this if we've already freed the tx hash table */
+	if (tx_hash_table) {
+		index = SLAVE_TLB_INFO(slave).head;
+		while (index != TLB_NULL_INDEX) {
+			u32 next_index = tx_hash_table[index].next;
+			tlb_init_table_entry(&tx_hash_table[index], save_load);
+			index = next_index;
+		}
 	}
 
 	tlb_init_slave(slave);