kernel-2.6.18-128.1.10.el5.src.rpm

From: Konrad Rzeszutek <konradr@redhat.com>
Subject: [RHEL5 PATCH] #RHBZ 232666 x86_64: wall time is not compensated for lost timer ticks
Date: Tue, 29 May 2007 12:36:13 -0400
Bugzilla: 232666
Message-Id: <20070529163613.GA1127@localhost.localdomain>
Changelog: [x86_64] wall time is not compensated for lost timer ticks


RHBZ#:
------
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=232666

Description:
------------
Problem:
For every timer interrupt the function update_wall_time() is called, which
updates the system's notion of time. Occasionally timer interrupts (timer ticks)
get lost, e.g. because an interrupt handler takes a long time or a lengthy SMI
occurs. The code in update_wall_time() should compensate for those lost ticks
and add extra time to the wall time. However, for the x86_64 architecture it
fails to do so.
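
To put a number on the effect, here is a minimal userspace sketch (not kernel
code) of tick-based accounting without compensation. The function names and
TICK_NS are hypothetical, and HZ=1000 is an assumption made only for the
arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: HZ=1000, so one tick is 1,000,000 ns. */
#define TICK_NS 1000000ULL

/*
 * Hypothetical model: wall time advances by exactly one tick for each
 * timer interrupt that is actually handled. Lost interrupts add nothing.
 */
uint64_t wall_ns_uncompensated(uint64_t elapsed_ticks, uint64_t lost_ticks)
{
	assert(lost_ticks <= elapsed_ticks);
	return (elapsed_ticks - lost_ticks) * TICK_NS;
}

/* Amount by which the wall clock lags real time in this model. */
uint64_t wall_lag_ns(uint64_t elapsed_ticks, uint64_t lost_ticks)
{
	return elapsed_ticks * TICK_NS
	       - wall_ns_uncompensated(elapsed_ticks, lost_ticks);
}
```

Under these assumptions, losing 5 ticks out of 1000 leaves the wall clock 5 ms
behind, and the lag only grows as further ticks are lost.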

Solution:
This patch removes an optimization for the non-CONFIG_GENERIC_TIME case that
broke lost-tick compensation on x86_64. The code that is used with
CONFIG_GENERIC_TIME should not be enabled for x86_64 in 2.6.20 (and earlier),
because the supporting code is simply not there. The upstream code (2.6.21)
avoids the need for lost-tick compensation because timekeeping is calculated
from continuous clock sources instead of being tick based.

In RHEL5, update_wall_time() is called at every timer tick. With this patch,
every call performs a clocksource_read() to compute the current offset.
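
The offset expression kept by the patch is a masked difference between the
current counter value and the value recorded at the previous update. The
following is a small userspace sketch of that arithmetic; struct clock_model
and elapsed_cycles() are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the clocksource fields used by the offset. */
struct clock_model {
	uint64_t cycle_last; /* counter value at the previous update */
	uint64_t mask;       /* bitmask of the valid counter bits */
};

/*
 * Cycles elapsed since the previous update. The unsigned subtraction
 * followed by the mask stays correct across a single counter wrap.
 */
uint64_t elapsed_cycles(const struct clock_model *c, uint64_t now)
{
	return (now - c->cycle_last) & c->mask;
}
```

For a 32-bit counter (mask 0xFFFFFFFF) that was last read at 0xFFFFFFF0 and now
reads 0x10, this yields 0x20 elapsed cycles rather than a huge bogus value.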

There are three cases:
case 1) CONFIG_GENERIC_TIME is set: In this case the patch makes no difference
because the clocksource_read() is forced anyway. This applies to i386.

If CONFIG_GENERIC_TIME is not set then clock is clocksource_jiffies.
clocksource_jiffies.cycle_interval is 1.

case 2) ppc64, ia64, s390: CONFIG_GENERIC_TIME is not set and jiffies is
incremented by exactly 1 on each clock tick. Since cycle_interval is 1 and
jiffies advances by 1 between calls to update_wall_time(), the code behaves the
same with the patch. The only difference is that we lose a small optimization.
This applies to the non-CONFIG_GENERIC_TIME architectures other than x86_64.

case 3) x86_64: CONFIG_GENERIC_TIME is not set and jiffies may be incremented by
more than one. This is the only case where we potentially need to execute the
loop multiple times. The patch enables that by forcing a read of the difference
in jiffies (which can be >1) instead of using cycle_interval (which is always 1).
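
The difference between the cases can be sketched in userspace C as follows.
This is a simplified model under stated assumptions, not the kernel source: the
function names are hypothetical, and the loop only counts how many intervals
would be credited to wall time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the accumulation loop: consume the offset one
 * cycle_interval at a time and count the intervals credited to wall time.
 */
unsigned int intervals_credited(uint64_t offset, uint64_t cycle_interval)
{
	unsigned int n = 0;

	assert(cycle_interval > 0);
	while (offset >= cycle_interval) {
		offset -= cycle_interval;
		n++;
	}
	return n;
}

/* Before the patch, without CONFIG_GENERIC_TIME: always one interval. */
uint64_t offset_before_patch(uint64_t cycle_interval)
{
	return cycle_interval;
}

/* After the patch: the measured jiffies delta since the last update. */
uint64_t offset_after_patch(uint64_t jiffies_now, uint64_t jiffies_last,
			    uint64_t mask)
{
	return (jiffies_now - jiffies_last) & mask;
}
```

With cycle_interval equal to 1 and jiffies advancing from 100 to 103 between
calls (case 3), the pre-patch offset credits one interval while the post-patch
offset credits three. When jiffies advances by exactly 1 (case 2), both credit
one interval, which is why the patch changes nothing there.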

RHEL Version Found:
------------------
RHEL5 

Upstream Status:
----------------
Upstream, the code that actually uses the CONFIG_GENERIC_TIME framework is
implemented, so this patch is not necessary in the upstream kernel. I've asked
the author of the 2.6.21 GTOD conversion code (John Stultz) about back-porting
it to 2.6.18, and his opinion is that it "is really too much change for me to
feel comfortable back porting it".

Test Status:
------------
Tested successfully on the following machines. The test involved running the
timeskew tests with a stock and a patched kernel; I found no regressions with
the patch:

IBM e326m
IBM x326
Dell PowerEdge 800
Dell PowerEdge 6800
Intel X7DB8
HP OptiPlex GX240
Athlon
IBM BladeCenter HS20 -[79813FZ]-
HP ProLiant DL380 G5
IBM System x3550 -[7978D5Z]-
IBM eServer BladeCenter HS21 -[8853ROZ]-
Dell PowerEdge SC430
Dell Precision WorkStation 380
Dell PowerEdge 830
HP ProLiant DL360 G4p
Dell PowerEdge 650
Dell PowerEdge 2850
NEC Express5800/120Eg [N8100-973]
SuperMicro X7DB8
Sun Microsystems, Inc. Sun Fire V40z
SGI Altix
IBM PowerPC (JS20)
IBM x3950 

I did not run the tests on all machines in RHTS: some were reserved, others
were killed by the watchdog (even when installing a stock RHEL5 distro!), and
some were duplicates of machines already tested. If there are specific machines
that you would like me to run the tests against, please respond.


Proposed Patch:
---------------
This patch is based on 2.6.18-18:

diff -uNr linux-2.6.18.i386.orig/kernel/timer.c linux-2.6.18.i386.time/kernel/timer.c
--- linux-2.6.18.i386.orig/kernel/timer.c	2007-05-08 14:06:09.000000000 -0400
+++ linux-2.6.18.i386.time/kernel/timer.c	2007-05-09 13:32:03.000000000 -0400
@@ -1119,11 +1119,8 @@
 	if (unlikely(timekeeping_suspended))
 		return;
 
-#ifdef CONFIG_GENERIC_TIME
 	offset = (clocksource_read(clock) - clock->cycle_last) & clock->mask;
-#else
-	offset = clock->cycle_interval;
-#endif
+
 	clock->xtime_nsec += (s64)xtime.tv_nsec << clock->shift;
 
 	/* normally this loop will run just once, however in the

-- 
Konrad Rzeszutek 1-(978)-392-3903 or 1-(617)-693-1718
IBM on-site partner.