kernel-2.6.18-128.1.10.el5.src.rpm

From: Mike Christie <mchristi@redhat.com>
Date: Thu, 7 Feb 2008 19:22:10 -0600
Subject: [scsi] fix medium error handling with bad devices
Message-id: 1202433730.8820.9.camel@max
O-Subject: [PATCH RHEL 5.2] fix medium error handling with bad devices
Bugzilla: 431365
RH-Acked-by: Alan Cox <alan@redhat.com>

This is for BZ 431365.

On a hardware or medium error, the disk can tell us which part of the
IO failed, and the scsi layer then reports only the part before the
failure as completing successfully. This patch fixes a problem where a
disk could return a medium or hardware error with a bad value for the
failing LBA, causing the scsi layer to incorrectly report the IO as
succeeding when it really failed.
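
To illustrate the check described above, here is a minimal user-space
sketch. It is not the kernel code: the helper name
good_bytes_before_error() and its parameters are hypothetical, plain
division stands in for do_div(), and the final good-bytes computation
(distance from the start of the request to the bad LBA, in device
sectors) is an assumption about how the validated value is used.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * start_512   - first sector of the request, in 512-byte units
 * xfer_size   - length of the request in bytes
 * sector_size - sector size of the device in bytes
 * bad_lba     - failing LBA reported by the device, in device sectors
 *
 * Returns the number of bytes that can be reported as good, or 0 if
 * the reported LBA lies outside the request and cannot be trusted.
 */
static uint64_t good_bytes_before_error(uint64_t start_512,
					uint32_t xfer_size,
					uint32_t sector_size,
					uint64_t bad_lba)
{
	uint64_t start_lba = start_512;
	uint64_t end_lba = start_512 + (xfer_size / 512);

	if (sector_size < 512) {
		/* 256-byte sectors: two device sectors per 512 bytes */
		start_lba <<= 1;
		end_lba <<= 1;
	} else {
		/* scale 512-byte units down to device sectors */
		uint64_t factor = sector_size / 512;

		start_lba /= factor;
		end_lba /= factor;
	}

	/* a bad LBA outside [start_lba, end_lba) tells us nothing */
	if (bad_lba < start_lba || bad_lba >= end_lba)
		return 0;

	return (bad_lba - start_lba) * sector_size;
}
```

For example, with 512-byte sectors and an 8-sector request starting at
sector 1000, a reported bad LBA of 1003 leaves 3 good sectors, while a
reported bad LBA of 2000 is rejected and no bytes are treated as good.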

The patch is in the SCSI maintainer's tree:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=b42abb39ca8a5414f039839866f0357725a53618
and will be sent to Linus for 2.6.25.

I tested the patch using the hardware error testing code in the
scsi_debug module, which I then hacked up to send bad values too. The
Red Hat Bugzilla bug reporter tested the patch against the upstream
kernel and verified that it fixed his problem.

 drivers/scsi/sd.c |   34 ++++++++++++++++------------------
 1 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 525864b..32e5a09 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -955,6 +955,7 @@ static void sd_rw_intr(struct scsi_cmnd * SCpnt)
  	unsigned int xfer_size = SCpnt->request_bufflen;
  	unsigned int good_bytes = result ? 0 : xfer_size;
  	u64 start_lba = SCpnt->request->sector;
+	u64 end_lba = SCpnt->request->sector + (xfer_size / 512);
  	u64 bad_lba;
 	struct scsi_sense_hdr sshdr;
 	int sense_valid = 0;
@@ -991,26 +992,23 @@ static void sd_rw_intr(struct scsi_cmnd * SCpnt)
 			goto out;
 		if (xfer_size <= SCpnt->device->sector_size)
 			goto out;
-		switch (SCpnt->device->sector_size) {
-		case 256:
+		if (SCpnt->device->sector_size < 512) {
+			/* only legitimate sector_size here is 256 */
 			start_lba <<= 1;
-			break;
-		case 512:
-			break;
-		case 1024:
-			start_lba >>= 1;
-			break;
-		case 2048:
-			start_lba >>= 2;
-			break;
-		case 4096:
-			start_lba >>= 3;
-			break;
-		default:
-			/* Print something here with limiting frequency. */
-			goto out;
-			break;
+			end_lba <<= 1;
+		} else {
+			/* be careful ... don't want any overflows */
+			u64 factor = SCpnt->device->sector_size / 512;
+			do_div(start_lba, factor);
+			do_div(end_lba, factor);
 		}
+
+		if (bad_lba < start_lba  || bad_lba >= end_lba)
+			/* the bad lba was reported incorrectly, we have
+			 * no idea where the error is
+			 */
+			goto out;
+
 		/* This computation should always be done in terms of
 		 * the resolution of the device's medium.
 		 */