kernel-2.6.18-194.11.1.el5.src.rpm

From: Peter Staubach <staubach@redhat.com>
Date: Thu, 4 Dec 2008 11:13:55 -0500
Subject: [nfs] lockd: handle long grace periods correctly
Message-id: 493801C3.20101@redhat.com
O-Subject: Re: [RHEL-5.4 PATCH] lockd: return NLM_LCK_DENIED_GRACE_PERIOD after long periods
Bugzilla: 474590
RH-Acked-by: Jeff Layton <jlayton@redhat.com>
RH-Acked-by: Steve Dickson <SteveD@redhat.com>

Peter Staubach wrote:
> Hi.
>
> Attached is a patch to address bz474590, "lockd: return
> NLM_LCK_DENIED_GRACE_PERIOD after long periods".
>
> The problem is that the NFS server lock manager uses a grace
> period after it comes up to allow clients to reacquire locks
> that they were holding when the server went down.  The server
> computes when the grace period should end by adding a
> calculated number of jiffies to the current value of
> jiffies.  When the value of jiffies exceeds that computed
> value, the grace period is considered complete.
>
> The problem is that jiffies can wrap in a fairly short
> period of time, namely a few weeks.  This can lead to the
> lock manager assuming that it is once again back in the
> grace period, thus denying new lock requests.
>
> The solution is to set a flag indicating that the server is
> in the grace period and to use a timer to clear the flag
> when the grace period should be terminated.
>
> The upstream solution is essentially this, but using a
> bunch of things that RHEL-5 does not have.  This solution
> matches the RHEL-4 solution previously implemented.
>
>  Thanx...
>
>     ps
>

Still moving too fast...

This time, with the patch too...

       ps
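To make the wrap described above concrete, here is a minimal
userspace sketch (not part of the patch; the 32-bit counter width and
HZ=1000 are assumptions) of how a time_before()-style comparison
against a saved expiry flips back once jiffies advances more than
half the counter range past that expiry:

#include <stdio.h>
#include <stdint.h>

/* 32-bit analogue of the kernel's time_before(a, b): true when a is
 * "before" b, valid only while the two values differ by less than
 * 2^31 ticks. */
#define time_before32(a, b) ((int32_t)((a) - (b)) < 0)

int main(void)
{
	const uint32_t HZ = 1000;	/* assumed tick rate */
	uint32_t expire = 1000 * HZ;	/* grace ends ~1000s after boot */
	uint32_t now;

	/* Shortly after the grace period ends, the check is correct. */
	now = expire + 60 * HZ;
	printf("1 min later:    expired=%d\n", time_before32(expire, now)); /* 1 */

	/* Roughly 25 days later (at HZ=1000) the delta passes 2^31 and
	 * the comparison flips back, so lockd would believe the grace
	 * period is active again and deny new lock requests. */
	now = expire + 0x80000000u + 1;
	printf("~25 days later: expired=%d\n", time_before32(expire, now)); /* 0 */
	return 0;
}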

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 928721f..be186f4 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -78,6 +78,8 @@ static const int		nlm_port_min = 0, nlm_port_max = 65535;
 
 static struct ctl_table_header * nlm_sysctl_table;
 
+static struct timer_list	nlm_grace_period_timer;
+
 static unsigned long set_grace_period(void)
 {
 	unsigned long grace_period;
@@ -92,7 +94,7 @@ static unsigned long set_grace_period(void)
 	return grace_period + jiffies;
 }
 
-static inline void clear_grace_period(void)
+static inline void clear_grace_period(unsigned long not_used)
 {
 	nlmsvc_grace_period = 0;
 }
@@ -138,6 +140,12 @@ lockd(struct svc_rqst *rqstp)
 
 	grace_period_expire = set_grace_period();
 
+	init_timer(&nlm_grace_period_timer);
+	nlm_grace_period_timer.function = clear_grace_period;
+	nlm_grace_period_timer.expires = grace_period_expire;
+
+	add_timer(&nlm_grace_period_timer);
+
 	/*
 	 * The main request loop. We don't terminate until the last
 	 * NFS mount or NFS daemon has gone away, and we've been sent a
@@ -151,6 +159,8 @@ lockd(struct svc_rqst *rqstp)
 			if (nlmsvc_ops) {
 				nlmsvc_invalidate_all();
 				grace_period_expire = set_grace_period();
+				mod_timer(&nlm_grace_period_timer,
+					grace_period_expire);
 			}
 		}
 
@@ -160,10 +170,8 @@ lockd(struct svc_rqst *rqstp)
 		 * (Theoretically, there shouldn't even be blocked locks
 		 * during grace period).
 		 */
-		if (!nlmsvc_grace_period) {
+		if (!nlmsvc_grace_period)
 			timeout = nlmsvc_retry_blocked();
-		} else if (time_before(grace_period_expire, jiffies))
-			clear_grace_period();
 
 		/*
 		 * Find a socket with data available and call its
@@ -188,6 +196,8 @@ lockd(struct svc_rqst *rqstp)
 
 	flush_signals(current);
 
+	del_timer(&nlm_grace_period_timer);
+
 	/*
 	 * Check whether there's a new lockd process before
 	 * shutting down the hosts and clearing the slot.
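The timer lifecycle the patch relies on distills to the following
standalone pattern, using the pre-2.6.19 one-shot timer API that
RHEL-5 carries.  This is an illustrative sketch, not code from the
patch; the my_* names are hypothetical:

#include <linux/timer.h>
#include <linux/jiffies.h>

static int my_in_grace;
static struct timer_list my_grace_timer;

/* Old-style timer callback: receives the timer's ->data argument,
 * which this sketch (like the patch) does not use. */
static void my_grace_expired(unsigned long not_used)
{
	my_in_grace = 0;	/* one-shot: the flag flips exactly once */
}

/* Start the grace period and arm the timer that will end it. */
static void my_grace_start(unsigned long len_jiffies)
{
	my_in_grace = 1;
	init_timer(&my_grace_timer);
	my_grace_timer.function = my_grace_expired;
	my_grace_timer.expires = jiffies + len_jiffies;
	add_timer(&my_grace_timer);
}

/* On a restart signal, re-enter the grace period and push the
 * deadline out; mod_timer() re-arms an already-expired timer. */
static void my_grace_restart(unsigned long len_jiffies)
{
	my_in_grace = 1;
	mod_timer(&my_grace_timer, jiffies + len_jiffies);
}

/* On shutdown, make sure the callback cannot fire afterwards. */
static void my_grace_stop(void)
{
	del_timer(&my_grace_timer);
}

Because the flag is cleared by the timer exactly once, no later code
path ever compares the saved expiry against a possibly-wrapped
jiffies value, which is what made the old check unsafe.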