Sophie

kernel-2.6.18-238.el5.src.rpm

From: Larry Woodman <lwoodman@redhat.com>
Date: Tue, 21 Apr 2009 17:05:44 -0400
Subject: [mm] tweak vm dirty_ratio to prevent stalls on some DBs
Message-id: 1240347944.11613.20.camel@dhcp-100-19-198.bos.redhat.com
O-Subject: [RHEL5-U4 patch] repost - Provide reasonable work-around for DB2 server becomes unresponsive when performing backups
Bugzilla: 295291
RH-Acked-by: Rik van Riel <riel@redhat.com>
RH-Acked-by: Jeff Moyer <jmoyer@redhat.com>

We have a relatively old BZ (295291) that complains about systems running
DB2 servers becoming unresponsive when backups are performed.  After
investigation we realized the problem was deeper: the backups were
being performed over NFS and the NFS server would periodically stall.
When this happens the amount of dirty memory in the pagecache climbs
over /proc/sys/vm/dirty_ratio, which by default is 40% of memory.
Practically all of the dirty memory is from NFS-mounted files, and the
clean-up is stalled because the NFS server is stalled.  When a local
file system write occurs we end up calling balance_dirty_pages() because
more than dirty_ratio of the memory is dirty.  balance_dirty_pages()
calls get_dirty_limits() to get the dirty_ratio, then forces the process
to call writeback_inodes() and block because we are over the
dirty_ratio, even though it was a local file system write and the cause
of the dirty memory is NFS.

The real fix for this problem is upstream: per-device dirty limits.
However, this is a big change that is way too complicated and kABI
breaking for RHEL5.  Instead, we could work around this problem by
increasing the dirty_ratio so that the dirty memory does not go above
the dirty_ratio when the NFS stalls occur.  The problem is that
get_dirty_limits() will not obey /proc/sys/vm/dirty_ratio if it exceeds
1/2 of the unmapped ratio.  In other words, /proc/sys/vm/dirty_ratio
does not work as set; it's capped at 1/2 of the unmapped memory
(pagecache memory that's not mapped).  This has undesirable side
effects on some systems that run applications that map lots of memory,
like DB2.
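
To make the cap concrete, here is a minimal user-space sketch of the
clamp as it behaves before this patch.  It is a model, not the kernel
code: the function name effective_dirty_ratio() is hypothetical,
unmapped_ratio is passed in as a percentage rather than derived from
page counts, and the floor of 5 is inferred from the trailing
"if (dirty_ratio < 5)" context line in the diff.

```c
#include <assert.h>

/*
 * Hypothetical model of the unpatched clamp in get_dirty_limits().
 * Both arguments are percentages; unmapped_ratio is assumed to be the
 * share of memory that is not mapped.
 */
int effective_dirty_ratio(int vm_dirty_ratio, int unmapped_ratio)
{
	int dirty_ratio = vm_dirty_ratio;

	assert(unmapped_ratio >= 0 && unmapped_ratio <= 100);

	/* The tunable is capped at half of the unmapped ratio. */
	if (dirty_ratio > unmapped_ratio / 2)
		dirty_ratio = unmapped_ratio / 2;

	/* Assumed floor: never go below 5 percent. */
	if (dirty_ratio < 5)
		dirty_ratio = 5;

	return dirty_ratio;
}
```

Under these assumptions, a system with 70% of its memory mapped has an
unmapped_ratio of 30, so the default dirty_ratio of 40 is reduced to 15,
and raising the tunable all the way to 100 still yields 15.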

Since changing that behavior for all systems would be far too risky, we
can prevent the system from capping /proc/sys/vm/dirty_ratio if it has
been tuned to 100.  The attached patch allows one to tune dirty_ratio
to 100 and have it work as desired without affecting systems that have
not been tuned.
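
The effect of the change can be sketched in user space as follows.
This is only a model of the patched condition: the function name
patched_dirty_ratio() is hypothetical, unmapped_ratio is supplied as a
percentage instead of being computed from page counts, and the floor of
5 is inferred from the diff context.

```c
#include <assert.h>

/*
 * Hypothetical model of the clamp in get_dirty_limits() with this
 * patch applied.  Both arguments are percentages.
 */
int patched_dirty_ratio(int vm_dirty_ratio, int unmapped_ratio)
{
	int dirty_ratio = vm_dirty_ratio;

	assert(unmapped_ratio >= 0 && unmapped_ratio <= 100);

	/* A setting of exactly 100 opts out of the 1/2 unmapped cap. */
	if ((dirty_ratio > unmapped_ratio / 2) && (dirty_ratio != 100))
		dirty_ratio = unmapped_ratio / 2;

	/* Assumed floor: never go below 5 percent. */
	if (dirty_ratio < 5)
		dirty_ratio = 5;

	return dirty_ratio;
}
```

In this model, with an unmapped_ratio of 30, a dirty_ratio of 100 is
honored as 100, while 99 or the default 40 is still capped at 15.  Only
the exact value 100 bypasses the cap, so untuned systems behave as
before.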

Fixes BZ295291

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4fa0f54..4edf8e9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -145,7 +145,9 @@ get_dirty_limits(long *pbackground, long *pdirty,
 					total_pages;
 
 	dirty_ratio = vm_dirty_ratio;
-	if (dirty_ratio > unmapped_ratio / 2)
+
+	/* if vm_dirty_ratio is 100 dont limit to 1/2 unmapped_ratio */
+	if ((dirty_ratio > unmapped_ratio / 2) && (dirty_ratio != 100))
 		dirty_ratio = unmapped_ratio / 2;
 
 	if (dirty_ratio < 5)