From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Fri, 27 Feb 2009 09:18:27 -0300
Subject: [serial] 8250: fix boot hang when using with SOL port
Message-id: 20090227091827.30bbac7c@pedra.chehab.org
O-Subject: [PATCH RHEL 5.4 v2] BZ#467124 8250: fix boot hang with serial console when using with Serial Over Lan port
Bugzilla: 467124
RH-Acked-by: Aristeu Rozanski <aris@redhat.com>

Upstream: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b6adea334c6c89d5e6c94f9196bbf3a279cb53bd

BZ description:

We use the Serial Over LAN (SOL) function provided in the Intel 82571
Ethernet controller. From the Linux standpoint, the SOL port is seen as
a regular 8250 serial port.

In some cases, our serial port may get blocked while running commands
like automatic installation using anaconda (no output). A simple
keypress on the serial client brings the port back into service. When
the serial port hangs, the whole system is frozen as well.

Our investigations have shown that a workaround for missing TX
interrupts (flag UART_BUG_TXEN) is activated by error in the 8250 serial
driver. Thus, in the serial8250_start_tx function, the workaround and
the standard interrupt-based transmission prevent each other from
working.

The proposed patch we have attached modifies the test used by the driver
to determine whether the workaround shall be used: it introduces a delay
after TX interrupts have been activated and before the IIR register is
read. The problem does not show up anymore when the patch is applied.

It's a very critical issue because when all the Linux console outputs
are redirected to the SOL port, the whole system may hang and only an
external operator can resume the system by hitting a key.

Patch description:

Intel 8257x Ethernet boards have a feature called Serial Over Lan.
This feature works by emulating a serial port, and it is detected by
the kernel as a normal 8250 port.
However, this emulation is not perfect, as also noticed on changeset
7500b1f602aad75901774a67a687ee985d85893f.

Before this patch, the kernel was trying to check whether the serial TX
is capable of working with IRQs. This was done with code similar to
this:

	serial_outp(up, UART_IER, UART_IER_THRI);
	lsr = serial_in(up, UART_LSR);
	iir = serial_in(up, UART_IIR);
	serial_outp(up, UART_IER, 0);

	if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
		up->bugs |= UART_BUG_TXEN;

This works fine for other 8250 ports, but on the 8250-emulated SoL port
the chip is a little slow to clear UART_IIR_NO_INT in the UART_IIR
register. Due to that, UART_BUG_TXEN is sometimes enabled. However, as
the TX IRQ keeps working and TX polling is now enabled as well, the
driver misinterprets the IRQ received later, hanging the machine until
a key is pressed at the serial console.

This is the sixth version of this patch. Previous versions tried to
introduce a delay between serial_outp and serial_in(up, UART_IIR) that
was large enough but did not take forever. However, the needed delay
couldn't be safely determined.

In experimental tests, a delay of 1 us solved most cases, but the
system still hung sometimes. Increasing the delay to 5 us was better,
but still did not solve the problem. A very high delay of 50 ms seemed
to work every time.

However, poking around with delays and praying for them to be enough
doesn't seem to be a good approach, even for a quirk.

So, instead of playing with arbitrary large delays, let's just disable
UART_BUG_TXEN for all SoL ports.

The patch was successfully tested by the customer.

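
The failure mode described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
driver code: struct sim_port, sim_read_iir(), sim_startup_tx_test(),
sim_txen_flagged() and the iir_lag field are hypothetical names invented
for this illustration, and the "lag" is modelled as a count of IIR reads
that still return UART_IIR_NO_INT rather than as real time. The register
bit values are the usual ones from serial_reg.h and 8250.h, and
UPF_NO_TXEN_TEST is the flag added by this patch.

```c
/* Bit values as in the kernel's serial_reg.h and 8250.h, and in this patch. */
#define UART_IIR_NO_INT   0x01      /* IIR: no interrupt pending */
#define UART_IIR_THRI     0x02      /* IIR: transmitter holding register empty */
#define UART_LSR_TEMT     0x40      /* LSR: transmitter empty */
#define UART_BUG_TXEN     (1 << 1)  /* UART has buggy TX IIR status */
#define UPF_NO_TXEN_TEST  (1u << 15)

/* Hypothetical stand-in for struct uart_8250_port; not a kernel type. */
struct sim_port {
	unsigned int flags;     /* UPF_* flags */
	unsigned int bugs;      /* UART_BUG_* flags */
	int iir_lag;            /* assumed model: IIR reads that still say NO_INT */
};

/* Simulated IIR read: a slow device reports NO_INT for the first iir_lag reads. */
static unsigned char sim_read_iir(struct sim_port *p)
{
	if (p->iir_lag > 0) {
		p->iir_lag--;
		return UART_IIR_NO_INT;
	}
	return UART_IIR_THRI;
}

/* Mirrors the startup test quoted above, plus the skip added by this patch. */
static void sim_startup_tx_test(struct sim_port *p)
{
	unsigned char lsr, iir;

	if (p->flags & UPF_NO_TXEN_TEST)
		return;                 /* SoL port: test skipped, bug flag untouched */

	lsr = UART_LSR_TEMT;            /* transmitter is idle at startup */
	iir = sim_read_iir(p);

	if ((lsr & UART_LSR_TEMT) && (iir & UART_IIR_NO_INT))
		p->bugs |= UART_BUG_TXEN;
	else
		p->bugs &= ~UART_BUG_TXEN;
}

/* Convenience wrapper: nonzero if the test ends with UART_BUG_TXEN set. */
static int sim_txen_flagged(unsigned int flags, int iir_lag)
{
	struct sim_port p = { flags, 0, iir_lag };

	sim_startup_tx_test(&p);
	return (p.bugs & UART_BUG_TXEN) != 0;
}
```

In this model, sim_txen_flagged(0, 0) behaves like a normal 8250 and
leaves the bug flag clear, sim_txen_flagged(0, 1) behaves like the slow
SoL emulation and is wrongly flagged, and
sim_txen_flagged(UPF_NO_TXEN_TEST, 1) leaves the flag clear because the
test is never run.
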
Version 2: Fix a merge conflict with changeset
  6b041dd34476c1ff6c38e9b82f072beb580ec6ee
  Author: Doug Chapman <dchapman@redhat.com>
  Date:   Fri Aug 10 17:49:18 2007 -0400

      serial: fix console hang on HP Integrity

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index c5e7cfb..c8240bc 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -1801,6 +1801,20 @@ static int serial8250_startup(struct uart_port *port)
 
 	serial8250_set_mctrl(&up->port, up->port.mctrl);
 
+	/* Serial over Lan (SoL) hack:
+	   Intel 8257x Gigabit ethernet chips have a
+	   16550 emulation, to be used for Serial Over Lan.
+	   Those chips take a longer time than a normal
+	   serial device to signalize that a transmission
+	   data was queued. Due to that, the above test generally
+	   fails. One solution would be to delay the reading of
+	   iir. However, this is not reliable, since the timeout
+	   is variable. So, let's just don't test if we receive
+	   TX irq. This way, we'll never enable UART_BUG_TXEN.
+	 */
+	if (up->port.flags & UPF_NO_TXEN_TEST)
+		goto dont_test_tx_en;
+
 	/*
 	 * Do a quick test to see if we receive an
 	 * interrupt when we enable the TX irq.
@@ -1820,6 +1834,7 @@ static int serial8250_startup(struct uart_port *port)
 		up->bugs &= ~UART_BUG_TXEN;
 	}
 
+dont_test_tx_en:
 	if (is_real_interrupt(up->port.irq)) {
 		spin_unlock(&up->port.lock);
 		spin_unlock_irqrestore(&irq_lists[up->port.irq].lock, flags);
diff --git a/drivers/serial/8250_pci.c b/drivers/serial/8250_pci.c
index 851e483..cca3d60 100644
--- a/drivers/serial/8250_pci.c
+++ b/drivers/serial/8250_pci.c
@@ -602,6 +602,21 @@ pci_default_setup(struct serial_private *priv, struct pciserial_board *board,
 	return setup_port(priv, port, bar, offset, board->reg_shift);
 }
 
+static int skip_tx_en_setup(struct serial_private *priv,
+			struct pciserial_board *board,
+			struct uart_port *port, int idx)
+{
+	port->flags |= UPF_NO_TXEN_TEST;
+	printk(KERN_DEBUG "serial8250: skipping TxEn test for device "
+			  "[%04x:%04x] subsystem [%04x:%04x]\n",
+			  priv->dev->vendor,
+			  priv->dev->device,
+			  priv->dev->subsystem_vendor,
+			  priv->dev->subsystem_device);
+
+	return pci_default_setup(priv, board, port, idx);
+}
+
 /* This should be in linux/pci_ids.h */
 #define PCI_VENDOR_ID_SBSMODULARIO	0x124B
 #define PCI_SUBVENDOR_ID_SBSMODULARIO	0x124B
@@ -653,6 +668,29 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
 		.init		= pci_inteli960ni_init,
 		.setup		= pci_default_setup,
 	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_8257X_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_82573L_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+	{
+		.vendor		= PCI_VENDOR_ID_INTEL,
+		.device		= PCI_DEVICE_ID_INTEL_82573E_SOL,
+		.subvendor	= PCI_ANY_ID,
+		.subdevice	= PCI_ANY_ID,
+		.setup		= skip_tx_en_setup,
+	},
+
+
 	/*
 	 * Panacom
 	 */
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 3bd91ed..3ee4de4 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2152,6 +2152,9 @@
 #define PCI_DEVICE_ID_INTEL_82378	0x0484
 #define PCI_DEVICE_ID_INTEL_I960	0x0960
 #define PCI_DEVICE_ID_INTEL_I960RM	0x0962
+#define PCI_DEVICE_ID_INTEL_8257X_SOL	0x1062
+#define PCI_DEVICE_ID_INTEL_82573E_SOL	0x1085
+#define PCI_DEVICE_ID_INTEL_82573L_SOL	0x108F
 #define PCI_DEVICE_ID_INTEL_82815_MC	0x1130
 #define PCI_DEVICE_ID_INTEL_82815_CGC	0x1132
 #define PCI_DEVICE_ID_INTEL_82092AA_0	0x1221
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 00443cd..7d96865 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -254,6 +254,7 @@ struct uart_port {
 #define UPF_HARDPPS_CD		((__force upf_t) (1 << 11))
 #define UPF_LOW_LATENCY		((__force upf_t) (1 << 13))
 #define UPF_BUGGY_UART		((__force upf_t) (1 << 14))
+#define UPF_NO_TXEN_TEST	((__force upf_t) (1 << 15))
 #define UPF_MAGIC_MULTIPLIER	((__force upf_t) (1 << 16))
 #define UPF_CONS_FLOW		((__force upf_t) (1 << 23))
 #define UPF_SHARE_IRQ		((__force upf_t) (1 << 24))
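
Note on the quirk entries: the three new table entries use PCI_ANY_ID
for subvendor and subdevice, so they apply to any board carrying one of
the listed Intel SoL device IDs. The following is a simplified
user-space sketch of that wildcard matching, not the driver's lookup
code: struct sim_quirk, sim_quirks[], sim_id_matches() and
sim_find_quirk() are hypothetical names, and the skip_txen_test field
stands in for ".setup = skip_tx_en_setup". The device IDs are the ones
added to pci_ids.h by this patch.

```c
#include <stddef.h>

#define PCI_ANY_ID                      (~0u)
#define PCI_VENDOR_ID_INTEL             0x8086
#define PCI_DEVICE_ID_INTEL_8257X_SOL   0x1062
#define PCI_DEVICE_ID_INTEL_82573E_SOL  0x1085
#define PCI_DEVICE_ID_INTEL_82573L_SOL  0x108F

/* Hypothetical, reduced form of a quirk entry; not the kernel structure. */
struct sim_quirk {
	unsigned int vendor, device, subvendor, subdevice;
	int skip_txen_test;     /* stands in for .setup = skip_tx_en_setup */
};

static const struct sim_quirk sim_quirks[] = {
	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_8257X_SOL,
	  PCI_ANY_ID, PCI_ANY_ID, 1 },
	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82573L_SOL,
	  PCI_ANY_ID, PCI_ANY_ID, 1 },
	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82573E_SOL,
	  PCI_ANY_ID, PCI_ANY_ID, 1 },
};

/* A quirk field matches if it is the wildcard or equals the device's ID. */
static int sim_id_matches(unsigned int quirk_id, unsigned int dev_id)
{
	return quirk_id == PCI_ANY_ID || quirk_id == dev_id;
}

/* Returns the first matching entry, or NULL if no entry in this sketch matches. */
static const struct sim_quirk *sim_find_quirk(unsigned int vendor,
					      unsigned int device,
					      unsigned int subvendor,
					      unsigned int subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(sim_quirks) / sizeof(sim_quirks[0]); i++) {
		const struct sim_quirk *q = &sim_quirks[i];

		if (sim_id_matches(q->vendor, vendor) &&
		    sim_id_matches(q->device, device) &&
		    sim_id_matches(q->subvendor, subvendor) &&
		    sim_id_matches(q->subdevice, subdevice))
			return q;
	}
	return NULL;
}
```

With this sketch, sim_find_quirk(0x8086, 0x108F, 0x1234, 0x5678)
returns the 82573L entry regardless of the subsystem IDs, while an
unlisted device such as 0x0960 returns NULL here.
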