kernel-2.6.18-194.11.1.el5.src.rpm

From: Peter Staubach <staubach@redhat.com>
Date: Mon, 2 Feb 2009 16:14:46 -0500
Subject: [nfs] fix hangs during heavy write workloads
Message-id: 49876246.2060005@redhat.com
O-Subject: [RHEL-5.4 PATCH] bz469848 [RHEL5.2] nfs_getattr() hangs during heavy write workloads
Bugzilla: 469848
RH-Acked-by: Rik van Riel <riel@redhat.com>
RH-Acked-by: Jeff Layton <jlayton@redhat.com>
RH-Acked-by: Anton Arapov <aarapov@redhat.com>
RH-Acked-by: Jon Masters <jcm@redhat.com>

Hi.

Attached is a patch which addresses bz469848, "[RHEL5.2]
nfs_getattr() hangs during heavy write workloads".  The
problem described in the bugzilla is that an ls command,
run in a directory containing a large file which is being
actively written to, can take a very long time to complete.
The ls command blocks while attempting to stat the file
being written to.

The problem is due to a combination of required file system
semantics and the NFS client-side page caching implementation.

The file system semantics require that the modification time
of a file reflect a time greater than or equal to the time of
the last write system call made to the file.  In NFS, the file
times are maintained by the server.  Thus, the file
modification time is not updated until the server receives a
WRITE request to write data into the file.
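To make the requirement concrete, here is a minimal userspace
sketch (not part of the patch; the /mnt/nfs path is an
assumption).  On an NFS mount, satisfying the fstat() below is
what forces the client to flush its dirty pages, so that the
server can assign the new modification time:

/*
 * Sketch only: the stat results must reflect the preceding
 * write().  The mount point is hypothetical.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	struct stat st;
	time_t before = time(NULL);
	int fd = open("/mnt/nfs/testfile", O_WRONLY | O_CREAT, 0644);

	if (fd < 0)
		return 1;
	write(fd, "data", 4);

	/* The mtime reported here must be >= the time of the write. */
	fstat(fd, &st);
	printf("mtime %ld, write started at %ld\n",
	       (long)st.st_mtime, (long)before);
	close(fd);
	return 0;
}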

The NFS client attempts to cache as many pages as possible
before sending any data to the server.  Thus, there could
potentially be a large amount of cached data, not yet written
to the server, in the memory on the client.
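For example (again a sketch, not part of the patch, with an
assumed path), a write() on an NFS client normally completes as
soon as the pages are cached locally; nothing reaches the server
until those dirty pages are flushed:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[4096] = { 0 };
	int fd = open("/mnt/nfs/file", O_WRONLY | O_CREAT, 0644);

	if (fd < 0)
		return 1;
	/* Completes once the page is dirtied in the client's cache. */
	write(fd, buf, sizeof(buf));
	/* Only now are WRITE requests sent to the server and waited on. */
	fsync(fd);
	close(fd);
	return 0;
}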

The semantics require that all of this dirty data be written
to the server so that the correct modification time can be
retrieved from the server.  Since this can be a large amount
of data, the flush can take a very long time.  Additionally,
if the file is being actively written to, more pages may be
dirtied while earlier ones are still being written to the
server.  The writing process can effectively starve the
process doing the stat on the file, as the sketch below
illustrates.
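The starvation can be observed with a sketch like the following
(not part of the patch; the NFS path is an assumption): one
process keeps dirtying new pages while another times stat() on
the same file, which is essentially what ls does per file.  On
an affected client, the reported stat() latency grows without
bound while the writer runs:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/mnt/nfs/bigfile";
	static char buf[65536];
	struct stat st;
	struct timeval t0, t1;

	if (fork() == 0) {
		/* Writer: dirty new pages as fast as possible. */
		int fd = open(path, O_WRONLY | O_CREAT, 0644);

		if (fd < 0)
			exit(1);
		for (;;)
			write(fd, buf, sizeof(buf));
	}

	/* Stat-er: measure how long each stat() takes. */
	for (;;) {
		gettimeofday(&t0, NULL);
		stat(path, &st);
		gettimeofday(&t1, NULL);
		printf("stat took %ld us\n",
		       (t1.tv_sec - t0.tv_sec) * 1000000L +
		       (t1.tv_usec - t0.tv_usec));
		sleep(1);
	}
}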

The solution is to block the writing process while the page
flush is being done so that the process doing the stat is not
starved.  The page flush will eventually complete and the
stat processing can continue.

This is the solution backported from upstream.

    Thanx...

       ps

This patch is a backport of upstream commits:
commit 28c494c5c8d425e15b7b82571e4df6d6bc34594d
NFS: Prevent nfs_getattr() hang during heavy write workloads
commit 634707388baa440d9c9d082cfc4c950500c8952b
[PATCH] nfs: nfs_getattr() can't call nfs_sync_mapping_range() for non-regular files

Flavio Leitner

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 67a9bb9..33c1a52 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -445,8 +445,19 @@ int nfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
 	int need_atime = NFS_I(inode)->cache_validity & NFS_INO_INVALID_ATIME;
 	int err;
 
-	/* Flush out writes to the server in order to update c/mtime */
-	nfs_sync_inode_wait(inode, 0, 0, FLUSH_NOCOMMIT);
+	/*
+	 * Flush out writes to the server in order to update c/mtime.
+	 *
+	 * Hold the i_mutex to suspend application writes temporarily;
+	 * this prevents long-running writing applications from blocking
+	 * nfs_sync_inode_wait.
+	 */
+	if (S_ISREG(inode->i_mode)) {
+		mutex_lock(&inode->i_mutex);
+		nfs_sync_inode_wait(inode, 0, 0, FLUSH_NOCOMMIT);
+		mutex_unlock(&inode->i_mutex);
+	}
+
 
 	/*
 	 * We may force a getattr if the user cares about atime.