Sophie

Sophie

distrib > Scientific%20Linux > 5x > x86_64 > by-pkgid > fc11cd6e1c513a17304da94a5390f3cd > files > 3706

kernel-2.6.18-194.11.1.el5.src.rpm

From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Fri, 27 Feb 2009 09:18:27 -0300
Subject: [serial] 8250: fix boot hang when using with SOL port
Message-id: 20090227091827.30bbac7c@pedra.chehab.org
O-Subject: [PATCH RHEL 5.4 v2] BZ#467124 8250: fix boot hang with serial console when using with Serial Over Lan port
Bugzilla: 467124
RH-Acked-by: Aristeu Rozanski <aris@redhat.com>

Upstream: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b6adea334c6c89d5e6c94f9196bbf3a279cb53bd

BZ description:

We use Serial Over LAN (SOL) function provided in Intel 82571 Ethernet
controller. From the Linux stand point, the SOL port is seen as a regular 8250
serial port.

In some cases, our serial port may get blocked while running commands like
automatic installation using anaconda (no ouput). A simple keypress on the
serial client brings the port back into service. When the serial port hangs,
the whole system is frozen as well.

Our investigations have shown that a workaround to missing TX interrupts is
activated by error (flag UART_BUG_TXEN)into the 8250 serial driver. Thus, in
serial8250_start_tx function the workaround and the standard interrupt based
transmission prevent each other from working.
The proposed patch we have attached modifies the test used by the driver to
determine if the workaround shall be used : it introduces a delay after TX
interrupts have been activated and before the IIR register is read. The problem
does not show up anymore when the patch is applied.

It's a very critical issue because when all the Linux console outputs are
redirected to the SOL port, the whole system may hang and only an external
operator can resume the system by hitting a key.

Patch description:

Intel 8257x Ethernet boards have a feature called Serial Over Lan.

This feature works by emulating a serial port, and it is detected by
kernel as a normal 8250 port.  However, this emulation is not perfect, as
also noticed on changeset 7500b1f602aad75901774a67a687ee985d85893f.

Before this patch, the kernel were trying to check if the serial TX is
capable of work using IRQ's.

This were done with a code similar this:

        serial_outp(up, UART_IER, UART_IER_THRI);
        lsr = serial_in(up, UART_LSR);
        iir = serial_in(up, UART_IIR);
        serial_outp(up, UART_IER, 0);

        if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
		up->bugs |= UART_BUG_TXEN;

This works fine for other 8250 ports, but, on 8250-emulated SoL port, the
chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register.

Due to that, UART_BUG_TXEN is sometimes enabled.  However, as TX IRQ keeps
working, and the TX polling is now enabled, the driver miss-interprets the
IRQ received later, hanging up the machine until a key is pressed at the
serial console.

This is the 6 version of this patch.  Previous versions were trying to
introduce a large enough delay between serial_outp and serial_in(up,
UART_IIR), but not taking forever.  However, the needed delay couldn't be
safely determined.

At the experimental tests, a delay of 1us solves most of the cases, but
still hangs sometimes.  Increasing the delay to 5us was better, but still
doesn't solve.  A very high delay of 50 ms seemed to work every time.

However, poking around with delays and pray for it to be enough doesn't
seem to be a good approach, even for a quirk.

So, instead of playing with random large arbitrary delays, let's just
disable UART_BUG_TXEN for all SoL ports.

The patch were successfully tested by the customer.

Version 2:
Fix a merge conflict with changeset 6b041dd34476c1ff6c38e9b82f072beb580ec6ee
	Author: Doug Chapman <dchapman@redhat.com>
	Date:   Fri Aug 10 17:49:18 2007 -0400

	serial: fix console hang on HP Integrity

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index c5e7cfb..c8240bc 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -1801,6 +1801,20 @@ static int serial8250_startup(struct uart_port *port)
 
 	serial8250_set_mctrl(&up->port, up->port.mctrl);
 
+	/* Serial over Lan (SoL) hack:
+	   Intel 8257x Gigabit ethernet chips have a
+	   16550 emulation, to be used for Serial Over Lan.
+	   Those chips take a longer time than a normal
+	   serial device to signalize that a transmission
+	   data was queued. Due to that, the above test generally
+	   fails. One solution would be to delay the reading of
+	   iir. However, this is not reliable, since the timeout
+	   is variable. So, let's just don't test if we receive
+	   TX irq. This way, we'll never enable UART_BUG_TXEN.
+	 */
+	if (up->port.flags & UPF_NO_TXEN_TEST)
+		goto dont_test_tx_en;
+
 	/*
 	 * Do a quick test to see if we receive an
 	 * interrupt when we enable the TX irq.
@@ -1820,6 +1834,7 @@ static int serial8250_startup(struct uart_port *port)
 		up->bugs &= ~UART_BUG_TXEN;
 	}
 
+dont_test_tx_en:
 	if (is_real_interrupt(up->port.irq)) {
 		spin_unlock(&up->port.lock);
 		spin_unlock_irqrestore(&irq_lists[up->port.irq].lock, flags);
diff --git a/drivers/serial/8250_pci.c b/drivers/serial/8250_pci.c
index 851e483..cca3d60 100644
--- a/drivers/serial/8250_pci.c
+++ b/drivers/serial/8250_pci.c
@@ -602,6 +602,21 @@ pci_default_setup(struct serial_private *priv, struct pciserial_board *board,
 	return setup_port(priv, port, bar, offset, board->reg_shift);
 }
 
+static int skip_tx_en_setup(struct serial_private *priv,
+			struct pciserial_board *board,
+			struct uart_port *port, int idx)
+{
+	port->flags |= UPF_NO_TXEN_TEST;
+	printk(KERN_DEBUG "serial8250: skipping TxEn test for device "
+			  "[%04x:%04x] subsystem [%04x:%04x]\n",
+			  priv->dev->vendor,
+			  priv->dev->device,
+			  priv->dev->subsystem_vendor,
+			  priv->dev->subsystem_device);
+
+	return pci_default_setup(priv, board, port, idx);
+}
+
 /* This should be in linux/pci_ids.h */
 #define PCI_VENDOR_ID_SBSMODULARIO	0x124B
 #define PCI_SUBVENDOR_ID_SBSMODULARIO	0x124B
@@ -653,6 +668,29 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
 		.init		= pci_inteli960ni_init,
 		.setup		= pci_default_setup,
 	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_8257X_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_82573L_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_82573E_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+
+
 	/*
 	 * Panacom
 	 */
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 3bd91ed..3ee4de4 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2152,6 +2152,9 @@
 #define PCI_DEVICE_ID_INTEL_82378	0x0484
 #define PCI_DEVICE_ID_INTEL_I960	0x0960
 #define PCI_DEVICE_ID_INTEL_I960RM	0x0962
+#define PCI_DEVICE_ID_INTEL_8257X_SOL	0x1062
+#define PCI_DEVICE_ID_INTEL_82573E_SOL	0x1085
+#define PCI_DEVICE_ID_INTEL_82573L_SOL	0x108F
 #define PCI_DEVICE_ID_INTEL_82815_MC	0x1130
 #define PCI_DEVICE_ID_INTEL_82815_CGC	0x1132
 #define PCI_DEVICE_ID_INTEL_82092AA_0	0x1221
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 00443cd..7d96865 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -254,6 +254,7 @@ struct uart_port {
 #define UPF_HARDPPS_CD		((__force upf_t) (1 << 11))
 #define UPF_LOW_LATENCY		((__force upf_t) (1 << 13))
 #define UPF_BUGGY_UART		((__force upf_t) (1 << 14))
+#define UPF_NO_TXEN_TEST	((__force upf_t) (1 << 15))
 #define UPF_MAGIC_MULTIPLIER	((__force upf_t) (1 << 16))
 #define UPF_CONS_FLOW		((__force upf_t) (1 << 23))
 #define UPF_SHARE_IRQ		((__force upf_t) (1 << 24))