kernel-2.6.18-194.11.1.el5.src.rpm

From: Glauber Costa <glommer@redhat.com>
Date: Thu, 17 Sep 2009 17:47:01 -0400
Subject: [x86] kvmclock: fix bogus wallclock value
Message-id: 1253224021-3163-1-git-send-email-glommer@redhat.com
O-Subject: [PATCH] RHEL5 BZ519771: fix bogus wallclock value at x86_64 kvmclock.
Bugzilla: 519771
RH-Acked-by: Rik van Riel <riel@redhat.com>
RH-Acked-by: Zachary Amsden <zamsden@redhat.com>
RH-Acked-by: Juan Quintela <quintela@redhat.com>
RH-Acked-by: Chris Lalancette <clalance@redhat.com>

[ to people who already acked this: ]
  I am respinning this to cover another source of bogus values: vsyscall uses
  tsc-based adjustments. We can't possibly call read_kvm_clock in there, so the
  best way is to do it via tsc too. But that path is buggy, since once we are
  using kvmclock, we never update last_tsc again. I am including a fix for that
  here, since we are not aware of any specific problems that are fixed by it.

In the RHEL5 kernel, __pa can miscalculate the address of C symbols. We have to
use __pa_symbol() for that. So we effectively register the wallclock time MSR at
the wrong address, and get a bogus value in return.
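
As a rough illustration (not part of the patch), the userspace sketch below
models why the two translations can disagree and how the result is split into
the low/high halves passed to wrmsr. The constants, the made-up symbol offset
used in the usage note, and the model_* helpers are assumptions chosen for the
example; they are not the actual RHEL5 definitions of __pa or __pa_symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout constants (assumed values, not taken from the RHEL5 tree). */
#define MODEL_PAGE_OFFSET      0xffff810000000000ULL /* base of the direct mapping */
#define MODEL_START_KERNEL_MAP 0xffffffff80000000ULL /* base of the kernel image mapping */

/* Hypothetical model of a __pa() that only understands direct-map addresses. */
static uint64_t model_pa(uint64_t vaddr)
{
	return vaddr - MODEL_PAGE_OFFSET;
}

/* Hypothetical model of __pa_symbol(): translate through the kernel image mapping. */
static uint64_t model_pa_symbol(uint64_t vaddr)
{
	/* Only addresses of C symbols (kernel image) are valid input here. */
	assert(vaddr >= MODEL_START_KERNEL_MAP);
	return vaddr - MODEL_START_KERNEL_MAP;
}

/* Split a 64-bit physical address into the two 32-bit halves that wrmsr takes. */
static void split_for_wrmsr(uint64_t paddr, uint32_t *low, uint32_t *high)
{
	*low = (uint32_t)paddr;
	*high = (uint32_t)(paddr >> 32);
}
```

With a made-up symbol address of MODEL_START_KERNEL_MAP + 0x5a3000, the
__pa_symbol model yields low=0x005a3000, high=0, while the __pa model yields
low=0x805a3000, high=0x00007eff, i.e. a different address for the hypervisor
to write to.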

I haven't seen this before, because userspace uses the hwclock program to adjust
the clock on bootup (which is harmful, because it increases the delta between
host and guest).

Furthermore, if we disable hwclock and fix the above issue, the wall clock will
still be wrong by 10 minutes. This is because I failed to set an initial base
for the monotonic clock, making the first interrupt after bootup try to
compensate for exactly 10 minutes of lost ticks.

Again, the presence of the hwclock utility papered over this.
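
The lost-tick effect can be sketched with a small userspace model (again, not
part of the patch). The struct, the helper name, the tick length (HZ=1000 is
assumed) and the 600 second starting value in the usage note are all
hypothetical and picked only to make the arithmetic visible; the real
accounting lives in do_timer_tsc_timekeeping().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NSEC_PER_TICK 1000000ULL /* assumes HZ=1000 */

/* Hypothetical stand-in for the "last reading" base kept by the timekeeping code. */
struct model_clock {
	uint64_t last_ns; /* clock reading at the last accounted tick */
};

/*
 * Return how many ticks an interrupt arriving at now_ns would account for,
 * and advance the base by that many whole ticks.
 */
static uint64_t model_ticks_to_account(struct model_clock *c, uint64_t now_ns)
{
	uint64_t ticks;

	assert(now_ns >= c->last_ns);
	ticks = (now_ns - c->last_ns) / MODEL_NSEC_PER_TICK;
	c->last_ns += ticks * MODEL_NSEC_PER_TICK;
	return ticks;
}
```

If the clock already reads 600 s when the guest boots and the base is left at
0, the first interrupt one tick later accounts for 600001 ticks. If the base is
seeded with the current reading at init time, which is what setting
vxtime.last_kvm in time_init() is meant to do, the same interrupt accounts for
a single tick.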

brew build at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1946898

Signed-off-by: Glauber Costa <glommer@redhat.com>

diff --git a/arch/i386/kernel/kvmclock.c b/arch/i386/kernel/kvmclock.c
index 0b8f5e8..a168db3 100644
--- a/arch/i386/kernel/kvmclock.c
+++ b/arch/i386/kernel/kvmclock.c
@@ -54,8 +54,8 @@ unsigned long kvm_get_wallclock(void)
 	struct timespec ts;
 	int low, high;
 
-	low = (int)__pa(&wall_clock);
-	high = ((u64)__pa(&wall_clock) >> 32);
+	low = (int)__pa_symbol(&wall_clock);
+	high = ((u64)__pa_symbol(&wall_clock) >> 32);
 	wrmsr(MSR_KVM_WALL_CLOCK, low, high);
 
 	vcpu_time = &get_cpu_var(hv_clock);
diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index b1f1e22..6b4d2cc 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -514,9 +514,10 @@ static void do_timer_tsc_timekeeping(struct pt_regs *regs)
 		tsc_accounted += cycles_per_tick;
 	}
 
-	if (use_kvm_time)
+	if (use_kvm_time) {
 		monotonic_base += (tsc_accounted - *last);
-	else
+		vxtime.last_tsc = get_cycles_sync();
+	} else
 		monotonic_base += ((tsc_accounted - *last) *
 					1000000 / cpu_khz);
 
@@ -1086,6 +1087,7 @@ void __init time_init(void)
 		timename = "KVM";
 		/* no need to get frequency here, since we'll skip the calibrate loop anyway */
 		timekeeping_use_tsc = 1;
+		vxtime.last_kvm = kvm_clock_read();
 	} else if (hpet_use_timer) {
 		/* set tick_nsec to use the proper rate for HPET */
 	  	tick_nsec = TICK_NSEC_HPET;