kernel-2.6.18-194.11.1.el5.src.rpm

From: Stanislaw Gruszka <sgruszka@redhat.com>
Date: Mon, 29 Jun 2009 13:34:15 +0200
Subject: [net] RTNL: assertion failed due to bonding notify
Message-id: 20090629133415.146afb7b@dhcp-lab-109.englab.brq.redhat.com
O-Subject: [RHEL5.4 PATCH] BZ508297: RTNL: assertion failed due to bonding notify
Bugzilla: 508297
RH-Acked-by: Jiri Pirko <jpirko@redhat.com>
RH-Acked-by: Ivan Vecera <ivecera@redhat.com>
RH-Acked-by: David Miller <davem@redhat.com>
RH-Acked-by: Thomas Graf <tgraf@redhat.com>
RH-Acked-by: Andy Gospodarek <gospo@redhat.com>

BZ#508297
=========
https://bugzilla.redhat.com/show_bug.cgi?id=508297

Description:
============
When I run bonding (with BONDING_OPTS="mode=balance-rr arp_interval=100
arp_ip_target=10.34.1.154") on my system, I get call traces caused by failed
RTNL assertions.

RTNL: assertion failed at net/core/fib_rules.c (388)

Call Trace:
 [<ffffffff802357f4>] fib_rules_event+0x3d/0xff
 [<ffffffff80067eaa>] notifier_call_chain+0x20/0x32
 [<ffffffff88663501>] :bonding:bond_select_active_slave+0xf6/0x10f
 [<ffffffff886659ae>] :bonding:bond_loadbalance_arp_mon+0x1a3/0x1da
 [<ffffffff8866580b>] :bonding:bond_loadbalance_arp_mon+0x0/0x1da
 [<ffffffff8004dbfc>] run_workqueue+0x94/0xe4
 [<ffffffff8004a460>] worker_thread+0x0/0x122
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a550>] worker_thread+0xf0/0x122
 [<ffffffff8008ccc7>] default_wake_function+0x0/0xe
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80033062>] kthread+0xfe/0x132
 [<ffffffff8005efb1>] child_rip+0xa/0x11
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032f64>] kthread+0x0/0x132
 [<ffffffff8005efa7>] child_rip+0x0/0x11

RTNL: assertion failed at net/ipv4/devinet.c (986)

Call Trace:
 [<ffffffff8025c095>] inetdev_event+0x48/0x282
 [<ffffffff80067eaa>] notifier_call_chain+0x20/0x32
 [<ffffffff88663501>] :bonding:bond_select_active_slave+0xf6/0x10f
 [<ffffffff886659ae>] :bonding:bond_loadbalance_arp_mon+0x1a3/0x1da
 [<ffffffff8866580b>] :bonding:bond_loadbalance_arp_mon+0x0/0x1da
 [<ffffffff8004dbfc>] run_workqueue+0x94/0xe4
 [<ffffffff8004a460>] worker_thread+0x0/0x122
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a550>] worker_thread+0xf0/0x122
 [<ffffffff8008ccc7>] default_wake_function+0x0/0xe
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80033062>] kthread+0xfe/0x132
 [<ffffffff8005efb1>] child_rip+0xa/0x11
 [<ffffffff800a02fa>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032f64>] kthread+0x0/0x132
 [<ffffffff8005efa7>] child_rip+0x0/0x11

This happens because netdev_bonding_change() is called without rtnl_lock()
held, via bond_loadbalance_arp_mon() -> bond_select_active_slave(). That call
was added in:

commit 47c4d639ac64ad423235c622306bc0bcba62b2d9
Author: Andy Gospodarek <gospo@redhat.com>
Date:   Thu Apr 23 14:44:45 2009 -0400

    [net] bonding: support for bonding of IPoIB interfaces
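
To illustrate why the traces above appear, here is a minimal user-space
sketch. It is not kernel code: every model_* name and the MODEL_ASSERT_RTNL()
macro are hypothetical stand-ins, and the lock is reduced to a simple flag.
It only shows the pattern in which a notifier checks for the lock and a
caller reaches it without taking the lock first.

```c
#include <stdio.h>

/* Hypothetical stand-in for the RTNL lock: 1 while "held". */
static int rtnl_held;
/* Counts how many times the modelled assertion fired. */
static int assertion_failures;

void model_rtnl_lock(void)   { rtnl_held = 1; }
void model_rtnl_unlock(void) { rtnl_held = 0; }

/* Plays the role of an RTNL assertion: complain if the lock is not held. */
#define MODEL_ASSERT_RTNL()                                          \
	do {                                                         \
		if (!rtnl_held) {                                    \
			assertion_failures++;                        \
			fprintf(stderr,                              \
				"RTNL: assertion failed at %s (%d)\n", \
				__FILE__, __LINE__);                 \
		}                                                    \
	} while (0)

/* Stand-in for a notifier such as fib_rules_event() or inetdev_event(). */
void model_notifier(void)
{
	MODEL_ASSERT_RTNL();
}

/* Stand-in for netdev_bonding_change(): runs the notifier chain. */
void model_bonding_change(void)
{
	model_notifier();
}

/* Models the ARP monitor path, which notifies without taking the lock.
 * Returns the number of assertions triggered by this call. */
int model_arp_mon_path(void)
{
	int before = assertion_failures;

	model_bonding_change();
	return assertion_failures - before;
}

/* Models a caller that takes the lock before notifying. */
int model_locked_path(void)
{
	int before = assertion_failures;

	model_rtnl_lock();
	model_bonding_change();
	model_rtnl_unlock();
	return assertion_failures - before;
}
```

In this model, model_arp_mon_path() trips the check once per call, while
model_locked_path() does not.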

This patch moves netdev_bonding_change() into bond_change_active_slave()
and calls it only if the mode is active-backup, which prevents the function
from being run from bond_loadbalance_arp_mon(). This is the same way it was
done in mainline, in commit:

commit 01f3109de49a889db8adf9116449727547ee497e
Author: Or Gerlitz <ogerlitz@voltaire.com>
Date:   Fri Jun 13 18:12:02 2008 -0700

    bonding: deliver netdev event for fail-over under the active-backup mode
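
The effect of the change described above can be sketched in user space as
follows. This is a simplified model under stated assumptions, not the driver
code: struct model_bond, the MODEL_MODE_* values and both functions are
hypothetical, slaves are reduced to integer ids, and the notification is
reduced to a counter. The real patch also drops and retakes bond->lock and
bond->curr_slave_lock around the call, which is not modelled here.

```c
/* Hypothetical subset of bonding modes, for illustration only. */
enum model_mode {
	MODEL_MODE_ROUNDROBIN,
	MODEL_MODE_ACTIVEBACKUP
};

/* Hypothetical, heavily reduced bond state. */
struct model_bond {
	enum model_mode mode;
	int active_slave;   /* id of the current active slave, -1 if none */
	int notifications;  /* times the bonding-change event was sent */
};

/* Models bond_change_active_slave() after the patch: the event is
 * delivered here, and only in active-backup mode. */
void model_change_active_slave(struct model_bond *bond, int new_active)
{
	bond->active_slave = new_active;

	if (bond->mode == MODEL_MODE_ACTIVEBACKUP)
		bond->notifications++; /* stands in for netdev_bonding_change() */
}

/* Models bond_select_active_slave() after the patch: it no longer
 * delivers the event itself. */
void model_select_active_slave(struct model_bond *bond, int best_slave)
{
	if (best_slave != bond->active_slave)
		model_change_active_slave(bond, best_slave);
}
```

With a balance-rr style bond, model_select_active_slave() changes the active
slave without sending the event, so the ARP monitor path in this model never
reaches the notifier chain. With an active-backup bond the event is still
sent on fail-over.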

Compared to the previous submission, this patch now has a BZ entry, info about
the upstream commit, and no trailing whitespace (at least I hope so).

kABI Status:
============
No symbols were harmed.

Brew:
=====
https://brewweb.devel.redhat.com/taskinfo?taskID=1865323

Upstream Status:
================
commit 01f3109de49a889db8adf9116449727547ee497e

Test Status:
============
I tested it on my system. No IPoIB testing.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d1accec..06ce365 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1231,6 +1231,14 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 			bond->send_grat_arp = bond->params.num_grat_arp;
 			bond_send_gratuitous_arp(bond);
 
+			write_unlock_bh(&bond->curr_slave_lock);
+			read_unlock(&bond->lock);
+
+			netdev_bonding_change(bond->dev);
+
+			read_lock(&bond->lock);
+			write_lock_bh(&bond->curr_slave_lock);
+
 			bond->send_unsol_na = bond->params.num_unsol_na;
 			bond_send_unsolicited_na(bond);
 		}
@@ -1269,13 +1277,6 @@ void bond_select_active_slave(struct bonding *bond)
 			       "now running without any active interface !\n",
 			       bond->dev->name);
 
-			write_unlock_bh(&bond->curr_slave_lock);
-			read_unlock(&bond->lock);
-
-			netdev_bonding_change(bond->dev);
-
-			read_lock(&bond->lock);
-			write_lock_bh(&bond->curr_slave_lock);
 		}
 	}
 }