kernel-2.6.18-238.el5.src.rpm

From: Milan Broz <mbroz@redhat.com>
Date: Tue, 2 Dec 2008 18:19:22 +0100
Subject: [block] fix max_segment_size, seg_boundary mask setting
Message-id: 49356E1A.1090208@redhat.com
O-Subject: [RHEL 5.3 PATCH] block: Fix setting of max_segment_size and seg_boundary mask
Bugzilla: 471639
RH-Acked-by: Alasdair G Kergon <agk@redhat.com>
RH-Acked-by: Mike Christie <mchristi@redhat.com>
RH-Acked-by: Tomas Henzl <thenzl@redhat.com>

RHEL5.3 block: Fix setting of max_segment_size and seg_boundary mask
Resolves: rhbz#471639

The patch is not yet upstream (it was sent to lkml this week), but customer
systems are hitting this issue.

When stacking devices (LVM over MD over SCSI), the max_segment_size
and seg_boundary_mask attributes of the request queue
are not set up correctly by default.

As a result, these attributes are not propagated correctly through
the stack of devices.

If you create an MD device over SCSI, the attributes are zeroed.

The problem arises when a further device-mapper mapping sits on top
of this mapping; DM then sets the queue attributes this way:

request_queue   max_segment_size  seg_boundary_mask
SCSI                65536             0xffffffff
MD RAID1                0                      0
LVM                 65536                 -1 (64bit)

Unfortunately, bio_add_page() (or rather bio_phys_segments()) calculates
the number of physical segments according to these parameters.

During generic_make_request() the segment count is recalculated, which
can increase bio->bi_phys_segments over the allowed limit
(after bio_clone() in the stacking operation).

This is especially a problem in the CCISS driver, where it produces an oops here:
    BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
(MAXSGENTRIES is 31 by default.)

Sometimes even this command is enough to cause an oops:
  dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10

This command generates bios of 250 sectors, allocated in 32 4k pages
(the last page uses only 1024 bytes).
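
The figures above follow from simple arithmetic on bs=128000. A minimal
sketch, assuming 512-byte sectors, 4096-byte pages and a page-aligned
buffer; the helper names are illustrative and are not kernel functions:

```c
#define SKETCH_SECTOR_SIZE 512u   /* assumed bytes per sector */
#define SKETCH_PAGE_SIZE   4096u  /* assumed bytes per page */

/* Number of whole sectors in a request of `bytes` bytes. */
static unsigned int sketch_sectors(unsigned int bytes)
{
	return bytes / SKETCH_SECTOR_SIZE;
}

/* Number of pages needed to hold `bytes` bytes, rounding up. */
static unsigned int sketch_pages(unsigned int bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Bytes used in the last page (a full page if `bytes` divides evenly). */
static unsigned int sketch_last_page_bytes(unsigned int bytes)
{
	unsigned int rem = bytes % SKETCH_PAGE_SIZE;

	return rem ? rem : SKETCH_PAGE_SIZE;
}
```

For 128000 bytes these helpers give 250 sectors, 32 pages and 1024 bytes
in the last page.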

At the LVM layer the bio is allocated with 31 segments (still OK for CCISS);
unfortunately, at a lower layer it is recalculated to 32 segments, which
violates the CCISS restriction and triggers the BUG_ON().
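
One way the count can move from 31 to 32 is sketched below. This is a
simplified model, not the kernel's recount code, and it rests on an
assumption that this report does not state: exactly two of the 32 pages
are physically adjacent, so they merge into one segment under the LVM
limits but cannot merge under the zeroed MD limits. All struct and
function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two queue attributes discussed above. */
struct sketch_limits {
	uint64_t max_segment_size;
	uint64_t seg_boundary_mask;
};

/* One page-sized piece of a bio: physical address and length in bytes. */
struct sketch_vec {
	uint64_t phys;
	uint32_t len;
};

/* True if [start, end) does not cross a boundary defined by `mask`. */
static int sketch_same_window(uint64_t start, uint64_t end, uint64_t mask)
{
	return (start | mask) == ((end - 1) | mask);
}

/* Count segments, merging adjacent pieces only when the limits allow it. */
static unsigned int sketch_count_segments(const struct sketch_vec *v, size_t n,
					  const struct sketch_limits *lim)
{
	unsigned int segs = 0;
	uint64_t seg_start = 0, seg_len = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int adjacent = i > 0 && v[i - 1].phys + v[i - 1].len == v[i].phys;

		if (adjacent &&
		    seg_len + v[i].len <= lim->max_segment_size &&
		    sketch_same_window(seg_start, v[i].phys + v[i].len,
				       lim->seg_boundary_mask)) {
			seg_len += v[i].len;	/* extend the current segment */
			continue;
		}
		segs++;				/* start a new segment */
		seg_start = v[i].phys;
		seg_len = v[i].len;
	}
	return segs;
}

/* Assumed layout: 32 scattered pages, except that pages 0 and 1 are adjacent. */
static void sketch_fill_example(struct sketch_vec v[32])
{
	size_t i;

	for (i = 0; i < 32; i++) {
		v[i].phys = UINT64_C(0x100000) + i * UINT64_C(0x2000);
		v[i].len = 4096;
	}
	v[1].phys = v[0].phys + 4096;	/* assumption: one adjacent pair */
	v[31].len = 1024;		/* last page is only partly used */
}
```

With the LVM row of the table (65536, all ones) the model counts 31
segments for this layout; with the MD row (0, 0) no merge passes the
size check, so it counts 32.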

The patch:
 * initializes the attributes above in the request queue constructor
   blk_queue_make_request()
 * makes sure that blk_queue_stack_limits() inherits the settings
 * sets the default seg_boundary value in one place (blkdev.h)
 * uses this mask default in DM (instead of -1, which differs on 64-bit)
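
The last two points can be illustrated with a small sketch. It models a
64-bit unsigned long with uint64_t so that it behaves the same on any
host, and it mirrors the "zero is infinity" rule of min_not_zero() with
a plain function. The sketch_ names are hypothetical and are not part of
the patch.

```c
#include <stdint.h>

/* Same value as the BLK_SEG_BOUNDARY_MASK default, as a 64-bit quantity. */
#define SKETCH_SEG_BOUNDARY_MASK UINT64_C(0xFFFFFFFF)

/* What the old DM default of -1 becomes in a 64-bit unsigned field. */
static uint64_t sketch_old_dm_default(void)
{
	return (uint64_t)-1;	/* 0xFFFFFFFFFFFFFFFF, not 0xFFFFFFFF */
}

/* Smaller of two limits, treating zero as "no limit set". */
static uint64_t sketch_min_not_zero(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}
```

Under this model a top queue with an unset (zero) mask inherits the mask
of the queue below it, and a top queue holding the old all-ones default
is narrowed to 0xFFFFFFFF when stacked on a queue that uses the shared
default.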

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 2b4e523..b336b06 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -247,6 +247,8 @@ void blk_queue_make_request(request_queue_t * q, make_request_fn * mfn)
 	q->nr_requests = BLKDEV_MAX_RQ;
 	blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
 	blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
+	blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK);
+	blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);
 	q->make_request_fn = mfn;
 	q->backing_dev_info.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
 	q->backing_dev_info.state = 0;
@@ -779,6 +781,7 @@ void blk_queue_stack_limits(request_queue_t *t, request_queue_t *b)
 	/* zero is "infinity" */
 	t->max_sectors = min_not_zero(t->max_sectors,b->max_sectors);
 	t->max_hw_sectors = min_not_zero(t->max_hw_sectors,b->max_hw_sectors);
+	t->seg_boundary_mask = min_not_zero(t->seg_boundary_mask, b->seg_boundary_mask);
 
 	t->max_phys_segments = min(t->max_phys_segments,b->max_phys_segments);
 	t->max_hw_segments = min(t->max_hw_segments,b->max_hw_segments);
@@ -2026,7 +2029,7 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
 	q->queue_flags		= (1 << QUEUE_FLAG_CLUSTER);
 	q->queue_lock		= lock;
 
-	blk_queue_segment_boundary(q, 0xffffffff);
+	blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK);
 
 	blk_queue_make_request(q, __make_request);
 	blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 86357a2..2a01dfd 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -704,7 +704,7 @@ static void check_for_valid_limits(struct io_restrictions *rs)
 	if (!rs->max_segment_size)
 		rs->max_segment_size = MAX_SEGMENT_SIZE;
 	if (!rs->seg_boundary_mask)
-		rs->seg_boundary_mask = -1;
+		rs->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
 }
 
 int dm_table_add_target(struct dm_table *t, const char *type,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3d43fd1..f819808 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -782,6 +782,8 @@ extern int blkdev_issue_flush(struct block_device *, sector_t *);
 
 #define MAX_SEGMENT_SIZE	65536
 
+#define BLK_SEG_BOUNDARY_MASK	0xFFFFFFFFUL
+
 #define blkdev_entry_to_request(entry) list_entry((entry), struct request, queuelist)
 
 static inline int queue_hardsect_size(request_queue_t *q)