kernel-2.6.18-194.11.1.el5.src.rpm

From: mchristi@redhat.com <mchristi@redhat.com>
Date: Thu, 18 Sep 2008 13:48:28 -0500
Subject: [scsi] modify failfast so it does not always fail fast
Message-id: 1221763708-18235-1-git-send-email-mchristi@redhat.com
O-Subject: [PATCH] RHEL 5.3: Modify block/scsi fail fast so it does not always fail fast
Bugzilla: 447586
RH-Acked-by: Jeff Garzik <jgarzik@redhat.com>

From: Mike Christie <mchristi@redhat.com>

This is for BZ 447586.

The problem we are trying to solve is that multipath sets the
failfast bit on its bios, but it really only wants to be told
about transport errors that the lower levels cannot handle quickly.

Currently we get hit with the following types of bzs a lot:
1. Customers get their logs filled with scsi and block errors
for problems that could be handled at the scsi layer. A common
case is that a frame is dropped; retrying at the scsi layer
normally handles the problem, but because failfast is set we
escalate the error when we do not need to. The path/transport/connection
is fine; the frame was just dropped for some reason that does not
mean the transport is bad.
2. Customers are getting their logs filled with errors for
transient transport problems. dm-multipath wants to know about
transport problems, but if the problem is handled quickly at
the scsi/fc/iscsi transport-class level then there is no need
to escalate it. The iscsi class has the replacement_timeout to
handle this, and the fc class has the fast_io_fail_tmo to determine
how quickly to fail transport errors (see the sketch after this list).
3. Logs getting filled up is not that bad on its own; we could add
some code to make the logging less verbose. The real problem is that
for many active-passive targets (targets that transfer resources from
one path to another), switching paths can be expensive
and slow. And even for some active-active boxes that Emulex was
hitting, switching paths is more costly than just retrying at the
scsi/fc layer when we can.
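
On the fc side, the fast_io_fail_tmo behaviour mentioned in item 2 ends
up working roughly like this once the patch is applied (a sketch only,
mirroring the scsi_transport_fc hunks below; the function name is made
up for illustration):

#include <scsi/scsi.h>
#include <scsi/scsi_transport_fc.h>

/* Sketch: what IO sent to a blocked rport gets back after this patch. */
static int fc_blocked_result_sketch(struct fc_rport *rport)
{
    if (rport->port_state != FC_PORTSTATE_BLOCKED)
        return 0;
    if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
        /* fast_io_fail_tmo fired: scsi_decide_disposition() sends
         * the IO straight up instead of requeueing it */
        return DID_TRANSPORT_FAILFAST << 16;
    /* still inside the timer window: the IO is requeued while the
     * transport class tries to recover the link */
    return DID_TRANSPORT_DISRUPTED << 16;
}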

Upstream we separated failfast into separate bits and
added new SCSI transport errors. RAID can
ask for device errors to be failed quickly so it can kick
in spares, multipath asks for transport errors, and operations
like read-ahead and some passthrough users ask for any error to be
failed quickly.
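
For illustration only (the helper names below are made up; the bits are
the ones added in the include/linux/bio.h hunk), a submitter picks the
class of errors it wants failed fast like this:

#include <linux/bio.h>

/* dm-multipath style: only transport errors should bypass retries */
static inline void set_failfast_transport(struct bio *bio)
{
    bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
}

/* RAID style: fail device errors quickly so spares can kick in */
static inline void set_failfast_dev(struct bio *bio)
{
    bio->bi_rw |= (1 << BIO_RW_FAILFAST_DEV);
}

/* read-ahead/passthrough style: fail any error quickly; the block
 * layer's init_request_from_bio() maps these onto the matching
 * REQ_FAILFAST_* request flags */
static inline void set_failfast_all(struct bio *bio)
{
    bio->bi_rw |= (1 << BIO_RW_FAILFAST_DEV) |
                  (1 << BIO_RW_FAILFAST_TRANSPORT) |
                  (1 << BIO_RW_FAILFAST_DRIVER);
}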

The block, dm/md, and initial scsi pieces are in the
SCSI maintainer's tree for the next feature window:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=3ba18818696112197332d5fed10103822ac94bbf
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=d187921f21d4a0b052ce4d9bd7237c9d8c2beab1
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=359dc19fd47343f606a4c93e242f121ecb46591e
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=9839526f5e5eb7e884cce47d01afaab4bf90071d
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=112d080af6b85a39b0a62790ac0cb6e60da0f207
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=f394fc57b23c3be42e8fd3a5711a99cad728c6ca
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-post-merge-2.6.git;a=commitdiff;h=ebf16cab8f89794bc9931ec10049d0b813c0957b

There is some code in this patch that is not yet upstream, like the
code related to the SCSI_MLQUEUE defines in scsi.h, due to patch
conflicts. The scsi maintainer tried to merge it, but because of
conflicts with the block layer maintainer's tree IBM has to redo its
part of the patchset:
http://marc.info/?l=linux-scsi&m=122056343804360&w=2
But the code has been reviewed and is heading upstream once we finish
fixing up other issues in the block tree (there are some buggy timer
changes upstream that we are fixing before resending the final
patches, to make sure the patchset goes in cleanly).

I tested this port with software iscsi and hardware iscsi, with and
without dm-multipath, by setting the queue depths small (hits the
SCSI_MLQUEUE changes), setting the command timeout to 1 second (hits
the scsi_eh path), and then pulling cables (hits the requeue logic).
I also tested with scsi_debug so we could hit errors like
recovered errors and medium errors (the latter is interesting because
it stresses the partial-completion retry/requeue code), etc.

Our partners IBM and Emulex have also tested, and internally
some of our engineers have been testing on their boxes.

I ran check-kabi over this patch a couple of days ago (against
whatever was in git then) and there were no errors. I ran
it again over the current git tree and there are tons of
errors, but this patch does not add new ones, so I think it
is OK KABI-wise.
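
One more sketch before the diff, since the KABI point matters for anyone
updating a driver: blk_noretry_request() still checks only the old
REQ_FAILFAST bit, so unconverted 3rd party drivers keep their current
behaviour, while converted drivers use blk_noretry_ff_request() or the
per-class macros from the blkdev.h hunk. The helper below is made up,
for illustration only:

#include <linux/blkdev.h>

static int skip_retries_sketch(struct request *rq, int transport_error)
{
    /* unchanged old check: only the original REQ_FAILFAST bit */
    if (blk_noretry_request(rq))
        return 1;

    /* converted drivers can honour just the class of error that
     * actually happened (or use blk_noretry_ff_request() to mean
     * "any failfast class, old or new") */
    if (transport_error)
        return blk_failfast_transport(rq) != 0;
    return blk_failfast_dev(rq) != 0;
}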

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 56ffc13..2b4e523 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2944,7 +2944,16 @@ static void init_request_from_bio(struct request *req, struct bio *bio)
 	/*
 	 * inherit FAILFAST from bio (for read-ahead, and explicit FAILFAST)
 	 */
-	if (bio_rw_ahead(bio) || bio_failfast(bio))
+	if (bio_rw_ahead(bio))
+		req->flags |= (REQ_FAILFAST | REQ_FAILFAST_DEV |
+				REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER);
+	if (bio_failfast_dev(bio))
+		req->flags |= REQ_FAILFAST_DEV;
+	if (bio_failfast_transport(bio))
+		req->flags |= REQ_FAILFAST_TRANSPORT;
+	if (bio_failfast_driver(bio))
+		req->flags |= REQ_FAILFAST_DRIVER;
+	if (bio_failfast(bio))
 		req->flags |= REQ_FAILFAST;
 
 	/*
diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 537ed07..05300b9 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -778,7 +778,7 @@ static int cdrom_decode_status(ide_drive_t *drive, int good_stat, int *stat_ret)
 
 		/* Handle errors from READ and WRITE requests. */
 
-		if (blk_noretry_request(rq))
+		if (blk_noretry_ff_request(rq))
 			do_end_request = 1;
 
 		if (sense_key == NOT_READY) {
diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index 72cb55c..082f5df 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -65,7 +65,7 @@ static int __ide_end_request(ide_drive_t *drive, struct request *rq,
 	 * if failfast is set on a request, override number of sectors and
 	 * complete the whole request right now
 	 */
-	if (blk_noretry_request(rq) && end_io_error(uptodate))
+	if (blk_noretry_ff_request(rq) && end_io_error(uptodate))
 		nr_sectors = rq->hard_nr_sectors;
 
 	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
@@ -250,7 +250,7 @@ int ide_end_dequeued_request(ide_drive_t *drive, struct request *rq,
 	 * if failfast is set on a request, override number of sectors and
 	 * complete the whole request right now
 	 */
-	if (blk_noretry_request(rq) && end_io_error(uptodate))
+	if (blk_noretry_ff_request(rq) && end_io_error(uptodate))
 		nr_sectors = rq->hard_nr_sectors;
 
 	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
@@ -511,7 +511,7 @@ static ide_startstop_t ide_ata_error(ide_drive_t *drive, struct request *rq, u8
 		/* force an abort */
 		hwif->OUTB(WIN_IDLEIMMEDIATE, IDE_COMMAND_REG);
 
-	if (rq->errors >= ERROR_MAX || blk_noretry_request(rq))
+	if (rq->errors >= ERROR_MAX || blk_noretry_ff_request(rq))
 		ide_kill_rq(drive, rq);
 	else {
 		if ((rq->errors & ERROR_RESET) == ERROR_RESET) {
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 1ca98a3..a82f4c3 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -866,7 +866,12 @@ static int multipath_map(struct dm_target *ti, struct bio *bio,
 	dm_bio_record(&mpio->details, bio);
 
 	map_context->ptr = mpio;
-	bio->bi_rw |= (1 << BIO_RW_FAILFAST);
+	/*
+	 * We set both in case 3rd party drivers were only checking
+	 * for REQ_FAILFAST.
+	 */
+	bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
+	bio->bi_rw |= (1 << BIO_RW_FAILFAST);
 	r = map_io(m, bio, mpio, 0);
 	if (r < 0 || r == DM_MAPIO_REQUEUE)
 		mempool_free(mpio, m->mpio_pool);
diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index 33f67ca..756117e 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -178,7 +178,12 @@ static int multipath_make_request (request_queue_t *q, struct bio * bio)
 	mp_bh->bio = *bio;
 	mp_bh->bio.bi_sector += multipath->rdev->data_offset;
 	mp_bh->bio.bi_bdev = multipath->rdev->bdev;
-	mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST);
+	/*
+	 * We set both in case 3rd party drivers were only checking
+	 * for REQ_FAILFAST.
+	 */
+	mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
+	mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST);
 	mp_bh->bio.bi_end_io = multipath_end_request;
 	mp_bh->bio.bi_private = mp_bh;
 	generic_make_request(&mp_bh->bio);
@@ -400,7 +405,7 @@ static void multipathd (mddev_t *mddev)
 			*bio = *(mp_bh->master_bio);
 			bio->bi_sector += conf->multipaths[mp_bh->path].rdev->data_offset;
 			bio->bi_bdev = conf->multipaths[mp_bh->path].rdev->bdev;
-			bio->bi_rw |= (1 << BIO_RW_FAILFAST);
+			bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
 			bio->bi_end_io = multipath_end_request;
 			bio->bi_private = mp_bh;
 			generic_make_request(bio);
diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
index aeb9e61..63bd607 100644
--- a/drivers/s390/block/dasd_diag.c
+++ b/drivers/s390/block/dasd_diag.c
@@ -548,7 +548,7 @@ dasd_diag_build_cp(struct dasd_device * device, struct request *req)
 	}
 	cqr->retries = DIAG_MAX_RETRIES;
 	cqr->buildclk = get_clock();
-	if (req->flags & REQ_FAILFAST)
+	if (blk_noretry_ff_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->device = device;
 	cqr->expires = DIAG_TIMEOUT;
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 802eb41..2ea8bf3 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -1383,7 +1383,7 @@ dasd_eckd_build_cp(struct dasd_device * device, struct request *req)
 			recid++;
 		}
 	}
-	if (req->flags & REQ_FAILFAST)
+	if (blk_noretry_ff_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->device = device;
 	cqr->expires = 5 * 60 * HZ;	/* 5 minutes */
diff --git a/drivers/s390/block/dasd_fba.c b/drivers/s390/block/dasd_fba.c
index e85015b..f4a1c28 100644
--- a/drivers/s390/block/dasd_fba.c
+++ b/drivers/s390/block/dasd_fba.c
@@ -344,7 +344,7 @@ dasd_fba_build_cp(struct dasd_device * device, struct request *req)
 			recid++;
 		}
 	}
-	if (req->flags & REQ_FAILFAST)
+	if (blk_noretry_ff_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->device = device;
 	cqr->expires = 5 * 60 * HZ;	/* 5 minutes */
diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
index 14336a8..f3cf8e1 100644
--- a/drivers/scsi/constants.c
+++ b/drivers/scsi/constants.c
@@ -1350,7 +1350,8 @@ EXPORT_SYMBOL(scsi_print_command);
 static const char * const hostbyte_table[]={
 "DID_OK", "DID_NO_CONNECT", "DID_BUS_BUSY", "DID_TIME_OUT", "DID_BAD_TARGET",
 "DID_ABORT", "DID_PARITY", "DID_ERROR", "DID_RESET", "DID_BAD_INTR",
-"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY"};
+"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY", "DID_REQUEUE",
+"DID_TRANSPORT_DISRUPTED", "DID_TRANSPORT_FAILFAST" };
 #define NUM_HOSTBYTE_STRS ARRAY_SIZE(hostbyte_table)
 
 void scsi_print_hostbyte(int scsiresult)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index edd36ef..3748b7b 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -110,7 +110,8 @@ static struct request *get_alua_req(struct scsi_device *sdev,
 		return NULL;
 	}
 
-	rq->flags |= REQ_FAILFAST | REQ_NOMERGE | REQ_BLOCK_PC;
+	rq->flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			REQ_FAILFAST_DRIVER | REQ_NOMERGE | REQ_BLOCK_PC;
 	rq->retries = ALUA_FAILOVER_RETRIES;
 	rq->timeout = ALUA_FAILOVER_TIMEOUT;
 
@@ -426,45 +427,45 @@ static int alua_check_sense(struct scsi_device *sdev,
 			/*
 			 * LUN Not Accessible - ALUA state transition
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0b)
 			/*
 			 * LUN Not Accessible -- Target port in standby state
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0c)
 			/*
 			 * LUN Not Accessible -- Target port in unavailable state
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x12)
 			/*
 			 * LUN Not Ready -- Offline
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		break;
 	case UNIT_ATTENTION:
 		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
 			/*
 			 * Power On, Reset, or Bus Device Reset, just retry.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		if (sense_hdr->asc == 0x2a && sense_hdr->ascq == 0x06) {
 			/*
 			 * ALUA state changed
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		}
 		if (sense_hdr->asc == 0x2a && sense_hdr->ascq == 0x07) {
 			/*
 			 * Implicit ALUA state transition failed
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		}
 		break;
 	}
 
-	return SCSI_RETURN_NOT_HANDLED;
+	return 0;
 }
 
 /*
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
index 896501c..3cdd97a 100644
--- a/drivers/scsi/device_handler/scsi_dh_rdac.c
+++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
@@ -230,7 +230,8 @@ static struct request *get_rdac_req(struct scsi_device *sdev,
 
 	memset(rq->cmd, 0, BLK_MAX_CDB);
 
-	rq->flags |= REQ_FAILFAST | REQ_NOMERGE | REQ_BLOCK_PC;
+	rq->flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			REQ_FAILFAST_DRIVER | REQ_NOMERGE | REQ_BLOCK_PC;
 	rq->retries = RDAC_RETRIES;
 	rq->timeout = RDAC_TIMEOUT;
 
@@ -549,13 +550,13 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			 *
 			 * Nothing we can do here. Try to bypass the path.
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0xA1)
 			/* LUN Not Ready - Quiescense in progress
 			 *
 			 * Just retry and wait.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		break;
 	case ILLEGAL_REQUEST:
 		if (sense_hdr->asc == 0x94 && sense_hdr->ascq == 0x01) {
@@ -564,7 +565,7 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			 * Fail the path, so that the other path be used.
 			 */
 			h->state = RDAC_STATE_PASSIVE;
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		}
 		break;
 	case UNIT_ATTENTION:
@@ -572,11 +573,11 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			/*
 			 * Power On, Reset, or Bus Device Reset, just retry.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		break;
 	}
 	/* success just means we do not care what scsi-ml does */
-	return SCSI_RETURN_NOT_HANDLED;
+	return 0;
 }
 
 const struct scsi_dh_devlist rdac_dev_list[] = {
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 9e7b9fe..c8f3f82 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2002,7 +2002,7 @@ flush_control_queues(struct iscsi_session *session, struct iscsi_conn *conn)
 }
 
 /* Fail commands. Mutex and session lock held and recv side suspended */
-static void fail_all_commands(struct iscsi_conn *conn)
+static void fail_all_commands(struct iscsi_conn *conn, int error)
 {
 	struct iscsi_cmd_task *ctask, *tmp;
 
@@ -2010,14 +2010,14 @@ static void fail_all_commands(struct iscsi_conn *conn)
 	list_for_each_entry_safe(ctask, tmp, &conn->xmitqueue, running) {
 		debug_scsi("failing pending sc %p itt 0x%x\n", ctask->sc,
 			   ctask->itt);
-		fail_command(conn, ctask, DID_BUS_BUSY << 16);
+		fail_command(conn, ctask, error << 16);
 	}
 
 	/* fail all other running */
 	list_for_each_entry_safe(ctask, tmp, &conn->run_list, running) {
 		debug_scsi("failing in progress sc %p itt 0x%x\n",
 			   ctask->sc, ctask->itt);
-		fail_command(conn, ctask, DID_BUS_BUSY << 16);
+		fail_command(conn, ctask, error << 16);
 	}
 
 	conn->ctask = NULL;
@@ -2090,7 +2090,10 @@ static void iscsi_start_session_recovery(struct iscsi_session *session,
 	 * flush queues.
 	 */
 	spin_lock_bh(&session->lock);
-	fail_all_commands(conn);
+	if (flag == STOP_CONN_RECOVER)
+		fail_all_commands(conn, DID_TRANSPORT_DISRUPTED);
+	else
+		fail_all_commands(conn, DID_ERROR);
 	flush_control_queues(session, conn);
 	spin_unlock_bh(&session->lock);
 	mutex_unlock(&session->eh_mutex);
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index e772a90..0a0b9bb 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -133,14 +133,6 @@ lpfc_terminate_rport_io(struct fc_rport *rport)
 			&phba->sli.ring[phba->sli.fcp_ring],
 			ndlp->nlp_sid, 0, LPFC_CTX_TGT);
 	}
-
-	/*
-	 * A device is normally blocked for rediscovery and unblocked when
-	 * devloss timeout happens.  In case a vport is removed or driver
-	 * unloaded before devloss timeout happens, we need to unblock here.
-	 */
-	scsi_target_unblock(&rport->dev);
-	return;
 }
 
 /*
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d871042..5b585b2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -358,7 +358,6 @@ void scsi_log_send(struct scsi_cmnd *cmd)
 void scsi_log_completion(struct scsi_cmnd *cmd, int disposition)
 {
 	unsigned int level;
-	struct scsi_device *sdev;
 
 	/*
 	 * If ML COMPLETE log level is greater than or equal to:
@@ -375,38 +374,40 @@ void scsi_log_completion(struct scsi_cmnd *cmd, int disposition)
 	if (unlikely(scsi_logging_level)) {
 		level = SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
 				       SCSI_LOG_MLCOMPLETE_BITS);
-		if (((level > 0) && (cmd->result || disposition != SUCCESS)) ||
+		if (((level > 0) &&
+		    (cmd->result || !scsi_disposition_finish(disposition))) ||
 		    (level > 1)) {
-			sdev = cmd->device;
-			sdev_printk(KERN_INFO, sdev, "done ");
+			scmd_printk(KERN_INFO, cmd, "Done: ");
 			if (level > 2)
 				printk("0x%p ", cmd);
+
 			/*
 			 * Dump truncated values, so we usually fit within
 			 * 80 chars.
 			 */
-			switch (disposition) {
-			case SUCCESS:
-				printk("SUCCESS");
-				break;
-			case NEEDS_RETRY:
-				printk("RETRY  ");
-				break;
-			case ADD_TO_MLQUEUE:
-				printk("MLQUEUE");
-				break;
-			case FAILED:
-				printk("FAILED ");
-				break;
-			case TIMEOUT_ERROR:
-				/* 
-				 * If called via scsi_times_out.
-				 */
-				printk("TIMEOUT");
-				break;
-			default:
-				printk("UNKNOWN");
+			if (scsi_disposition_finish(disposition))
+				printk("SUCCESS\n");
+			else if (scsi_disposition_retry(disposition))
+				printk("RETRY\n");
+			else if (scsi_disposition_fail(disposition))
+				printk("FAILED\n");
+			else {
+				switch (disposition) {
+				case SUCCESS:
+					printk("SUCCESS\n");
+					break;
+				case TIMEOUT_ERROR:
+					/*
+					 * If called via scsi_times_out.
+					 */
+					printk("TIMEOUT\n");
+					break;
+				default:
+					printk("UNKNOWN: 0x%x\n",
+					       disposition);
+				}
 			}
+
 			printk(" %8x ", cmd->result);
 			scsi_print_command(cmd);
 			if (status_byte(cmd->result) & CHECK_CONDITION) {
@@ -418,8 +419,8 @@ void scsi_log_completion(struct scsi_cmnd *cmd, int disposition)
 			}
 			if (level > 3) {
 				printk(KERN_INFO "scsi host busy %d failed %d\n",
-				       sdev->host->host_busy,
-				       sdev->host->host_failed);
+				       cmd->device->host->host_busy,
+				       cmd->device->host->host_failed);
 			}
 		}
 	}
@@ -477,7 +478,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 		 * future requests should not occur until the device 
 		 * transitions out of the suspend state.
 		 */
-		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
+		scsi_attempt_requeue_command(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 
 		SCSI_LOG_MLQUEUE(3, printk("queuecommand : device blocked \n"));
 
@@ -559,7 +560,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 	if (rtn) {
 		if (scsi_delete_timer(cmd)) {
 			atomic_inc(&cmd->device->iodone_cnt);
-			scsi_queue_insert(cmd,
+			scsi_attempt_requeue_command(cmd,
 					  (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ?
 					  rtn : SCSI_MLQUEUE_HOST_BUSY);
 		}
@@ -650,27 +651,6 @@ void __scsi_done(struct scsi_cmnd *cmd)
 }
 
 /*
- * Function:    scsi_retry_command
- *
- * Purpose:     Send a command back to the low level to be retried.
- *
- * Notes:       This command is always executed in the context of the
- *              bottom half handler, or the error handler thread. Low
- *              level drivers should not become re-entrant as a result of
- *              this.
- */
-int scsi_retry_command(struct scsi_cmnd *cmd)
-{
-        /*
-         * Zero the sense information from the last time we tried
-         * this command.
-         */
-	memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
-
-	return scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY);
-}
-
-/*
  * Function:    scsi_finish_command
  *
  * Purpose:     Pass command off to upper layer for finishing of I/O
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6cb4584..341a82a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -291,7 +291,9 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
  * @scmd:	Cmd to have sense checked.
  *
  * Return value:
- * 	SUCCESS or FAILED or NEEDS_RETRY
+ * 	SCSI_MLQUEUE_DIS_FINISH
+ * 	SCSI_MLQUEUE_DIS_RETRY
+ * 	SCSI_MLQUEUE_DIS_FAIL
  *
  * Notes:
  *	When a deferred error is detected the current command has
@@ -304,17 +306,17 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 	struct scsi_dh_data *scsi_dh_data = retrieve_scsi_dh_data(sdev);
 
 	if (! scsi_command_normalize_sense(scmd, &sshdr))
-		return FAILED;	/* no valid sense data */
+		return SCSI_MLQUEUE_DIS_FAIL;	/* no valid sense data */
 
 	if (scsi_sense_is_deferred(&sshdr))
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 
 	if (scsi_dh_data && scsi_dh_data->scsi_dh &&
 			scsi_dh_data->scsi_dh->check_sense) {
 		int rc;
 
 		rc = scsi_dh_data->scsi_dh->check_sense(sdev, &sshdr);
-		if (rc != SCSI_RETURN_NOT_HANDLED)
+		if (rc)
 			return rc;
 		/* handler does not care. Drop down to default handling */
 	}
@@ -326,7 +328,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 	if (sshdr.response_code == 0x70) {
 		/* fixed format */
 		if (scmd->sense_buffer[2] & 0xe0)
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 	} else {
 		/*
 		 * descriptor format: look for "stream commands sense data
@@ -336,17 +338,17 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if ((sshdr.additional_length > 3) &&
 		    (scmd->sense_buffer[8] == 0x4) &&
 		    (scmd->sense_buffer[11] & 0xe0))
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 	}
 
 	switch (sshdr.sense_key) {
 	case NO_SENSE:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case RECOVERED_ERROR:
-		return /* soft_error */ SUCCESS;
+		return /* soft_error */ SCSI_MLQUEUE_DIS_FINISH;
 
 	case ABORTED_COMMAND:
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 	case NOT_READY:
 	case UNIT_ATTENTION:
 		/*
@@ -357,43 +359,43 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		 */
 		if (scmd->device->expecting_cc_ua) {
 			scmd->device->expecting_cc_ua = 0;
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		}
 		/*
 		 * if the device is in the process of becoming ready, we 
 		 * should retry.
 		 */
 		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		/*
 		 * if the device is not started, we need to wake
 		 * the error handler to start the motor
 		 */
 		if (scmd->device->allow_restart &&
 		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
-			return FAILED;
-		return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FAIL;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 		/* these three are not supported */
 	case COPY_ABORTED:
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 	case MEDIUM_ERROR:
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 
 	case HARDWARE_ERROR:
 		if (scmd->device->retry_hwerror)
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		else
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 
 	case ILLEGAL_REQUEST:
 	case BLANK_CHECK:
 	case DATA_PROTECT:
 	default:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	}
 }
 
@@ -423,13 +425,13 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 		return scsi_check_sense(scmd);
 	}
 	if (host_byte(scmd->result) != DID_OK)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * next, check the message byte.
 	 */
 	if (msg_byte(scmd->result) != COMMAND_COMPLETE)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * now, check the status byte to see if this indicates
@@ -438,7 +440,7 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 	switch (status_byte(scmd->result)) {
 	case GOOD:
 	case COMMAND_TERMINATED:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case CHECK_CONDITION:
 		return scsi_check_sense(scmd);
 	case CONDITION_GOOD:
@@ -447,14 +449,14 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 		/*
 		 * who knows?  FIXME(eric)
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case BUSY:
 	case QUEUE_FULL:
 	case RESERVATION_CONFLICT:
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
-	return FAILED;
+	return SCSI_MLQUEUE_DIS_FAIL;
 }
 
 /**
@@ -713,15 +715,12 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 			printk("%s: scsi_eh_completed_normally %x\n",
 			       __FUNCTION__, rtn));
 
-		switch (rtn) {
-		case SUCCESS:
-		case NEEDS_RETRY:
-		case FAILED:
-			break;
-		default:
+		if (scsi_disposition_finish(rtn))
+			rtn = SUCCESS;
+		else if (scsi_disposition_retry(rtn))
+			rtn = NEEDS_RETRY;
+		else
 			rtn = FAILED;
-			break;
-		}
 	} else {
 		scsi_abort_eh_cmnd(scmd);
 		rtn = FAILED;
@@ -786,6 +785,8 @@ void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q)
 {
 	scmd->device->host->host_failed--;
 	scmd->eh_eflags = 0;
+	if (!scmd->result)
+		scmd->result |= (DRIVER_TIMEOUT << 24);
 	list_move_tail(&scmd->eh_entry, done_q);
 }
 EXPORT_SYMBOL(scsi_eh_finish_cmd);
@@ -973,7 +974,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 		stu_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
-			    scsi_check_sense(scmd) == FAILED ) {
+			    scsi_check_sense(scmd) == SCSI_MLQUEUE_DIS_FAIL) {
 				stu_scmd = scmd;
 				break;
 			}
@@ -1194,19 +1195,6 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q,
  **/
 int scsi_decide_disposition(struct scsi_cmnd *scmd)
 {
-	int rtn;
-
-	/*
-	 * if the device is offline, then we clearly just pass the result back
-	 * up to the top level.
-	 */
-	if (!scsi_device_online(scmd->device)) {
-		SCSI_LOG_ERROR_RECOVERY(5, printk("%s: device offline - report"
-						  " as SUCCESS\n",
-						  __FUNCTION__));
-		return SUCCESS;
-	}
-
 	/*
 	 * first check the host byte, to see if there is anything in there
 	 * that would indicate what we need to do.
@@ -1219,7 +1207,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * did_ok.
 		 */
 		scmd->result &= 0xff00ffff;
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case DID_OK:
 		/*
 		 * looks good.  drop through, and check the next byte.
@@ -1233,7 +1221,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * to the top level driver, not that we actually think
 		 * that it indicates SUCCESS.
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 		/*
 		 * when the low level driver returns did_soft_error,
 		 * it is responsible for keeping an internal retry counter 
@@ -1244,13 +1232,25 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * and not get stuck in a loop.
 		 */
 	case DID_SOFT_ERROR:
-		goto maybe_retry;
+		return SCSI_MLQUEUE_DIS_DRV_RETRY;
 	case DID_IMM_RETRY:
-		return NEEDS_RETRY;
-
+		return SCSI_MLQUEUE_IMM_RETRY;
 	case DID_REQUEUE:
-		return ADD_TO_MLQUEUE;
-
+		return SCSI_MLQUEUE_DEVICE_BUSY;
+	case DID_TRANSPORT_DISRUPTED:
+		/*
+		 * LLD/transport was disrupted during processing of the IO.
+		 * The transport class is now blocked/blocking,
+		 * and the transport will decide what to do with the IO
+		 * based on its timers and recovery capabilities.
+		 */
+		return SCSI_MLQUEUE_TARGET_BUSY2;
+	case DID_TRANSPORT_FAILFAST:
+		/*
+		 * The transport decided to failfast the IO (most likely
+		 * the fast io fail tmo fired), so send IO directly upwards.
+		 */
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case DID_ERROR:
 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
@@ -1259,11 +1259,11 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 			 * lower down
 			 */
 			break;
-		/* fallthrough */
-
-	case DID_BUS_BUSY:
+		/* fall through */
 	case DID_PARITY:
-		goto maybe_retry;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
+	case DID_BUS_BUSY:
+		return SCSI_MLQUEUE_DIS_XPT_RETRY;
 	case DID_TIME_OUT:
 		/*
 		 * when we scan the bus, we get timeout messages for
@@ -1272,21 +1272,21 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 */
 		if ((scmd->cmnd[0] == TEST_UNIT_READY ||
 		     scmd->cmnd[0] == INQUIRY)) {
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		} else {
-			return FAILED;
+			return SCSI_MLQUEUE_DIS_FAIL;
 		}
 	case DID_RESET:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
 
 	/*
 	 * next, check the message byte.
 	 */
 	if (msg_byte(scmd->result) != COMMAND_COMPLETE)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * check the status byte to see if this indicates anything special.
@@ -1304,20 +1304,13 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * the empty queue handling to trigger a stall in the
 		 * device.
 		 */
-		return ADD_TO_MLQUEUE;
+		return SCSI_MLQUEUE_DEVICE_BUSY;
 	case GOOD:
 	case COMMAND_TERMINATED:
 	case TASK_ABORTED:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case CHECK_CONDITION:
-		rtn = scsi_check_sense(scmd);
-		if (rtn == NEEDS_RETRY)
-			goto maybe_retry;
-		/* if rtn == FAILED, we have no sense information;
-		 * returning FAILED will wake the error handler thread
-		 * to collect the sense and redo the decide
-		 * disposition */
-		return rtn;
+		return scsi_check_sense(scmd);
 	case CONDITION_GOOD:
 	case INTERMEDIATE_GOOD:
 	case INTERMEDIATE_C_GOOD:
@@ -1325,32 +1318,17 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		/*
 		 * who knows?  FIXME(eric)
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 	case RESERVATION_CONFLICT:
 		sdev_printk(KERN_INFO, scmd->device,
 			    "reservation conflict\n");
-		return SUCCESS; /* causes immediate i/o error */
+		return SCSI_MLQUEUE_DIS_FINISH; /* causes immediate i/o error */
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
-	return FAILED;
-
-      maybe_retry:
+	return SCSI_MLQUEUE_DIS_FAIL;
 
-	/* we requeue for retry because the error was retryable, and
-	 * the request was not marked fast fail.  Note that above,
-	 * even if the request is marked fast fail, we still requeue
-	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
-	if ((++scmd->retries) <= scmd->allowed
-	    && !blk_noretry_request(scmd->request)) {
-		return NEEDS_RETRY;
-	} else {
-		/*
-		 * no more retries - report this one back to upper level.
-		 */
-		return SUCCESS;
-	}
 }
 
 /**
@@ -1466,27 +1444,10 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
-		if (scsi_device_online(scmd->device) &&
-		    !blk_noretry_request(scmd->request) &&
-		    (++scmd->retries <= scmd->allowed)) {
-			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
-							  " retry cmd: %p\n",
-							  current->comm,
-							  scmd));
-				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
-		} else {
-			/*
-			 * If just we got sense for the device (called
-			 * scsi_eh_get_sense), scmd->result is already
-			 * set, do not set DRIVER_TIMEOUT.
-			 */
-			if (!scmd->result)
-				scmd->result |= (DRIVER_TIMEOUT << 24);
-			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush finish"
-							" cmd: %p\n",
-							current->comm, scmd));
-			scsi_finish_command(scmd);
-		}
+		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
+						  "attempt retry cmd: %p\n",
+						  current->comm, scmd));
+		scsi_attempt_requeue_command(scmd, SCSI_MLQUEUE_DIS_RETRY);
 	}
 }
 EXPORT_SYMBOL(scsi_eh_flush_done_q);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 35332a4..c19492e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -90,9 +90,9 @@ static void scsi_unprep_request(struct request *req)
 }
 
 /*
- * Function:    scsi_queue_insert()
+ * Function:    scsi_attempt_requeue_command()
  *
- * Purpose:     Insert a command in the midlevel queue.
+ * Purpose:     Attempt to insert a command in the midlevel queue.
  *
  * Arguments:   cmd    - command that we are adding to queue.
  *              reason - why we are inserting command to queue.
@@ -101,23 +101,44 @@ static void scsi_unprep_request(struct request *req)
  *
  * Returns:     Nothing.
  *
- * Notes:       We do this for one of two cases.  Either the host is busy
- *              and it cannot accept any more commands for the time being,
- *              or the device returned QUEUE_FULL and can accept no more
- *              commands.
+ * Notes:       We do this for multiple cases.
+ *
+ *		Host or device queueing:
+ *		Either the host or device is busy and it cannot accept any more
+ *		commands for the time being.
+ *
+ * 		SCSI error processing:
+ * 		The scsi-eh has decided to requeue a command after getting
+ * 		a command it believes is retryable.
+ *
  * Notes:       This could be called either from an interrupt context or a
  *              normal process context.
  */
-int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
+int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason)
 {
 	struct Scsi_Host *host = cmd->device->host;
 	struct scsi_device *device = cmd->device;
 	struct request_queue *q = device->request_queue;
+	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
 	unsigned long flags;
 
 	SCSI_LOG_MLQUEUE(1,
 		 printk("Inserting command %p into mlqueue\n", cmd));
 
+	if (time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
+		sdev_printk(KERN_ERR, cmd->device, "timing out command, "
+			    "waited %lus\n", wait_for/HZ);
+		cmd->result |= DRIVER_TIMEOUT << 24;
+		scsi_finish_command(cmd);
+		return 0;
+	}
+
+	if (!scsi_device_online(cmd->device)) {
+		cmd->result |= DRIVER_HARD << 24;
+		scsi_finish_command(cmd);
+		return 0;
+	}
+
 	/*
 	 * Set the appropriate busy bit for the device/host.
 	 *
@@ -131,12 +152,60 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 	 * if a command is requeued with no other commands outstanding
 	 * either for the device or for the host.
 	 */
-	if (reason == SCSI_MLQUEUE_HOST_BUSY)
+	switch (reason) {
+	case SCSI_MLQUEUE_HOST_BUSY:
+	case SCSI_MLQUEUE_HOST_BUSY2:
 		host->host_blocked = host->max_host_blocked;
-	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
+		break;
+	case SCSI_MLQUEUE_DEVICE_BUSY:
+	case SCSI_MLQUEUE_DEVICE_BUSY2:
 		device->device_blocked = device->max_device_blocked;
+		break;
+	}
 
 	/*
+	 * If drivers are using the old values, then we
+	 * want to bypass the failfast and retry checks like we do
+	 * with the new ones.
+	 */
+	if (reason == SCSI_MLQUEUE_HOST_BUSY || reason == SCSI_MLQUEUE_EH_RETRY ||
+	    reason == SCSI_MLQUEUE_DEVICE_BUSY)
+		goto cleanup;
+
+	if (!scsi_ign_failfast(reason) && scsi_disposition_retry(reason)) {
+		if (reason & SCSI_MLQUEUE_DIS_XPT_RETRY) {
+			if (!blk_failfast_transport(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_DEV_RETRY) {
+			if (!blk_failfast_dev(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_DRV_RETRY) {
+			if (!blk_failfast_driver(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_RETRY) {
+			if (!blk_noretry_ff_request(cmd->request))
+				goto check_retries;
+		} else
+			goto check_retries;
+
+		if (!cmd->result)
+			cmd->result |= DRIVER_ERROR << 24;
+		scsi_finish_command(cmd);
+		return 0;
+	}
+
+check_retries:
+	if (!scsi_ign_cmd_retries(reason)) {
+		if (++cmd->retries > cmd->allowed) {
+			if (!cmd->result)
+				cmd->result |= DRIVER_ERROR << 24;
+			scsi_finish_command(cmd);
+			return 0;
+		}
+	}
+
+cleanup:
+	/*
 	 * Decrement the counters, since these commands are no longer
 	 * active on the host/device.
 	 */
@@ -671,7 +740,7 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int uptodate,
 			leftover = req->data_len;
 
 		/* kill remainder if no retrys */
-		if (!uptodate && blk_noretry_request(req))
+		if (!uptodate && blk_noretry_ff_request(req))
 			end_that_request_chunk(req, 0, leftover);
 		else {
 			if (requeue) {
@@ -1372,36 +1441,21 @@ static void scsi_kill_request(struct request *req, request_queue_t *q)
 static void scsi_softirq_done(struct request *rq)
 {
 	struct scsi_cmnd *cmd = rq->completion_data;
-	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
 	int disposition;
 
 	INIT_LIST_HEAD(&cmd->eh_entry);
 
 	disposition = scsi_decide_disposition(cmd);
-	if (disposition != SUCCESS &&
-	    time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
-		sdev_printk(KERN_ERR, cmd->device,
-			    "timing out command, waited %lus\n",
-			    wait_for/HZ);
-		disposition = SUCCESS;
-	}
 			
 	scsi_log_completion(cmd, disposition);
 
-	switch (disposition) {
-		case SUCCESS:
+	if (scsi_disposition_finish(disposition))
+		scsi_finish_command(cmd);
+	else if (scsi_disposition_retry(disposition))
+		scsi_attempt_requeue_command(cmd, disposition);
+	else
+		if (!scsi_eh_scmd_add(cmd, 0))
 			scsi_finish_command(cmd);
-			break;
-		case NEEDS_RETRY:
-			scsi_retry_command(cmd);
-			break;
-		case ADD_TO_MLQUEUE:
-			scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
-			break;
-		default:
-			if (!scsi_eh_scmd_add(cmd, 0))
-				scsi_finish_command(cmd);
-	}
 }
 
 /*
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index dc6d139..c207961 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -28,7 +28,6 @@ extern int scsi_dispatch_cmd(struct scsi_cmnd *cmd);
 extern int scsi_setup_command_freelist(struct Scsi_Host *shost);
 extern void scsi_destroy_command_freelist(struct Scsi_Host *shost);
 extern void __scsi_done(struct scsi_cmnd *cmd);
-extern int scsi_retry_command(struct scsi_cmnd *cmd);
 #ifdef CONFIG_SCSI_LOGGING
 void scsi_log_send(struct scsi_cmnd *cmd);
 void scsi_log_completion(struct scsi_cmnd *cmd, int disposition);
@@ -67,7 +66,7 @@ int scsi_eh_get_sense(struct list_head *work_q,
 /* scsi_lib.c */
 extern int scsi_maybe_unblock_host(struct scsi_device *sdev);
 extern void scsi_device_unbusy(struct scsi_device *sdev);
-extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
+extern int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason);
 extern void scsi_next_command(struct scsi_cmnd *cmd);
 extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index d0962c6..3893f7b 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -1569,8 +1569,7 @@ fc_attach_transport(struct fc_function_template *ft)
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(roles);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(port_state);
 	SETUP_PRIVATE_RPORT_ATTRIBUTE_RD(scsi_target_id);
-	if (ft->terminate_rport_io)
-		SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
+	SETUP_PRIVATE_RPORT_ATTRIBUTE_RW(fast_io_fail_tmo);
 
 	BUG_ON(count > FC_RPORT_NUM_ATTRS);
 
@@ -1739,6 +1738,22 @@ fc_remove_host(struct Scsi_Host *shost)
 }
 EXPORT_SYMBOL(fc_remove_host);
 
+static void fc_terminate_rport_io(struct fc_rport *rport)
+{
+	struct Scsi_Host *shost = rport_to_shost(rport);
+	struct fc_internal *i = to_fc_internal(shost->transportt);
+
+	/* Involve the LLDD if possible to terminate all io on the rport. */
+	if (i->f->terminate_rport_io)
+		i->f->terminate_rport_io(rport);
+
+	/*
+	 * must unblock to flush queued IO. The caller will have set
+	 * the port_state or flags, so that fc_remote_port_chkready will
+	 * fail IO.
+	 */
+	scsi_target_unblock(&rport->dev);
+}
 
 /**
  * fc_starget_delete - called to delete the scsi decendents of an rport
@@ -1761,8 +1776,7 @@ fc_starget_delete(void *data)
 	 */
 	if (i->f->dev_loss_tmo_callbk)
 		i->f->dev_loss_tmo_callbk(rport);
-	else if (i->f->terminate_rport_io)
-		i->f->terminate_rport_io(rport);
+	fc_terminate_rport_io(rport);
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (rport->flags & FC_RPORT_DEVLOSS_PENDING) {
@@ -1806,8 +1820,7 @@ fc_rport_final_delete(void *data)
 		fc_starget_delete(data);
 	else if (i->f->dev_loss_tmo_callbk)
 		i->f->dev_loss_tmo_callbk(rport);
-	else if (i->f->terminate_rport_io)
-		i->f->terminate_rport_io(rport);
+	fc_terminate_rport_io(rport);
 
 	transport_remove_device(dev);
 	device_del(dev);
@@ -2039,7 +2052,8 @@ fc_remote_port_add(struct Scsi_Host *shost, int channel,
 
 				spin_lock_irqsave(shost->host_lock, flags);
 
-				rport->flags &= ~FC_RPORT_DEVLOSS_PENDING;
+				rport->flags &= ~(FC_RPORT_FAST_FAIL_TIMEDOUT |
+						  FC_RPORT_DEVLOSS_PENDING);
 
 				/* initiate a scan of the target */
 				rport->flags |= FC_RPORT_SCAN_PENDING;
@@ -2095,6 +2109,7 @@ fc_remote_port_add(struct Scsi_Host *shost, int channel,
 			rport->port_id = ids->port_id;
 			rport->roles = ids->roles;
 			rport->port_state = FC_PORTSTATE_ONLINE;
+			rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
 
 			if (fci->f->dd_fcrport_size)
 				memset(rport->dd_data, 0,
@@ -2177,7 +2192,6 @@ void
 fc_remote_port_delete(struct fc_rport  *rport)
 {
 	struct Scsi_Host *shost = rport_to_shost(rport);
-	struct fc_internal *i = to_fc_internal(shost->transportt);
 	int timeout = rport->dev_loss_tmo;
 	unsigned long flags;
 
@@ -2210,7 +2224,7 @@ fc_remote_port_delete(struct fc_rport  *rport)
 
 	/* see if we need to kill io faster than waiting for device loss */
 	if ((rport->fast_io_fail_tmo != -1) &&
-	    (rport->fast_io_fail_tmo < timeout) && (i->f->terminate_rport_io))
+	    (rport->fast_io_fail_tmo < timeout))
 		fc_queue_devloss_work(shost, &rport->fail_io_work,
 					rport->fast_io_fail_tmo * HZ);
 
@@ -2279,7 +2293,8 @@ fc_remote_port_rolechg(struct fc_rport  *rport, u32 roles)
 			fc_flush_devloss(shost);
 
 		spin_lock_irqsave(shost->host_lock, flags);
-		rport->flags &= ~FC_RPORT_DEVLOSS_PENDING;
+		rport->flags &= ~(FC_RPORT_FAST_FAIL_TIMEDOUT |
+				  FC_RPORT_DEVLOSS_PENDING);
 		spin_unlock_irqrestore(shost->host_lock, flags);
 
 		/* ensure any stgt delete functions are done */
@@ -2369,6 +2384,7 @@ fc_timeout_deleted_rport(void  *data)
 	rport->supported_classes = FC_COS_UNSPECIFIED;
 	rport->roles = FC_RPORT_ROLE_UNKNOWN;
 	rport->port_state = FC_PORTSTATE_NOTPRESENT;
+	rport->flags &= ~FC_RPORT_FAST_FAIL_TIMEDOUT;
 
 	/* remove the identifiers that aren't used in the consisting binding */
 	switch (fc_host->tgtid_bind_type) {
@@ -2412,13 +2428,12 @@ static void
 fc_timeout_fail_rport_io(void  *data)
 {
 	struct fc_rport *rport = (struct fc_rport *)data;
-	struct Scsi_Host *shost = rport_to_shost(rport);
-	struct fc_internal *i = to_fc_internal(shost->transportt);
 
 	if (rport->port_state != FC_PORTSTATE_BLOCKED)
 		return;
 
-	i->f->terminate_rport_io(rport);
+	rport->flags |= FC_RPORT_FAST_FAIL_TIMEDOUT;
+	fc_terminate_rport_io(rport);
 }
 
 /**
diff --git a/drivers/scsi/scsi_transport_spi.c b/drivers/scsi/scsi_transport_spi.c
index 5409b46..bbb8cdf 100644
--- a/drivers/scsi/scsi_transport_spi.c
+++ b/drivers/scsi/scsi_transport_spi.c
@@ -116,7 +116,9 @@ static int spi_execute(struct scsi_device *sdev, const void *cmd,
 	for(i = 0; i < DV_RETRIES; i++) {
 		result = scsi_execute(sdev, cmd, dir, buffer, bufflen,
 				      sense, DV_TIMEOUT, /* retries */ 1,
-				      REQ_FAILFAST);
+				      REQ_FAILFAST_DEV |
+				      REQ_FAILFAST_TRANSPORT |
+				      REQ_FAILFAST_DRIVER);
 		if (result & DRIVER_SENSE) {
 			struct scsi_sense_hdr sshdr_tmp;
 			if (!sshdr)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 76bdaea..e8ef327 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -142,12 +142,18 @@ struct bio {
  * bit 2 -- barrier
  * bit 3 -- fail fast, don't want low level driver retries
  * bit 4 -- synchronous I/O hint: the block layer will unplug immediately
+ * bit 5 -- fail fast device errors
+ * bit 6 -- fail fast transport errors
+ * bit 7 -- fail fast driver errors
  */
 #define BIO_RW		0
 #define BIO_RW_AHEAD	1
 #define BIO_RW_BARRIER	2
 #define BIO_RW_FAILFAST	3
 #define BIO_RW_SYNC	4
+#define BIO_RW_FAILFAST_DEV		5
+#define BIO_RW_FAILFAST_TRANSPORT	6
+#define BIO_RW_FAILFAST_DRIVER		7
 
 /*
  * upper 16 bits of bi_rw define the io priority of this bio
@@ -176,7 +182,11 @@ struct bio {
 #define bio_data(bio)		(page_address(bio_page((bio))) + bio_offset((bio)))
 #define bio_barrier(bio)	((bio)->bi_rw & (1 << BIO_RW_BARRIER))
 #define bio_sync(bio)		((bio)->bi_rw & (1 << BIO_RW_SYNC))
-#define bio_failfast(bio)	((bio)->bi_rw & (1 << BIO_RW_FAILFAST))
+#define bio_failfast_dev(bio)	((bio)->bi_rw &	(1 << BIO_RW_FAILFAST_DEV))
+#define bio_failfast_transport(bio)	\
+	((bio)->bi_rw & (1 << BIO_RW_FAILFAST_TRANSPORT))
+#define bio_failfast_driver(bio) ((bio)->bi_rw & (1 << BIO_RW_FAILFAST_DRIVER))
+#define bio_failfast(bio)      ((bio)->bi_rw & (1 << BIO_RW_FAILFAST))
 #define bio_rw_ahead(bio)	((bio)->bi_rw & (1 << BIO_RW_AHEAD))
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 616f89a..3d43fd1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -237,10 +237,16 @@ enum rq_flag_bits {
 	__REQ_ORDERED_COLOR,	/* is before or after barrier */
 	__REQ_RW_SYNC,		/* request is sync (O_DIRECT) */
 	__REQ_NR_BITS,		/* stops here */
+	__REQ_FAILFAST_DEV,	/* no driver retries of device errors */
+	__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
+	__REQ_FAILFAST_DRIVER,	/* no driver retries of driver errors */
 };
 
 #define REQ_RW		(1 << __REQ_RW)
 #define REQ_FAILFAST	(1 << __REQ_FAILFAST)
+#define REQ_FAILFAST_DEV	(1 << __REQ_FAILFAST_DEV)
+#define REQ_FAILFAST_TRANSPORT	(1 << __REQ_FAILFAST_TRANSPORT)
+#define REQ_FAILFAST_DRIVER	(1 << __REQ_FAILFAST_DRIVER)
 #define REQ_SORTED	(1 << __REQ_SORTED)
 #define REQ_SOFTBARRIER	(1 << __REQ_SOFTBARRIER)
 #define REQ_HARDBARRIER	(1 << __REQ_HARDBARRIER)
@@ -492,9 +498,25 @@ enum {
 
 #define blk_fs_request(rq)	((rq)->flags & REQ_CMD)
 #define blk_pc_request(rq)	((rq)->flags & REQ_BLOCK_PC)
-#define blk_noretry_request(rq)	((rq)->flags & REQ_FAILFAST)
 #define blk_rq_started(rq)	((rq)->flags & REQ_STARTED)
 
+#define blk_noretry_request(rq)	((rq)->flags & REQ_FAILFAST)
+#define blk_failfast_dev(rq)	((rq)->flags & REQ_FAILFAST_DEV)
+#define blk_failfast_transport(rq) ((rq)->flags & REQ_FAILFAST_TRANSPORT)
+#define blk_failfast_driver(rq)	((rq)->flags & REQ_FAILFAST_DRIVER)
+/*
+ * For KABI compat reasons blk_noretry_request only checks REQ_FAILFAST.
+ * 3rd party SCSI drivers can continue to use this to check for
+ * upper layers requesting failfast for multipath or read ahead.
+ *
+ * Drivers that are updated should use blk_noretry_ff_request or
+ * one of the macros for a specific type of failfast request.
+ */
+#define blk_noretry_ff_request(rq) (blk_failfast_dev(rq) ||		\
+					blk_failfast_transport(rq) ||	\
+					blk_failfast_driver(rq) ||	\
+					blk_noretry_request(rq))
+
 #define blk_account_rq(rq)	(blk_rq_started(rq) && blk_fs_request(rq))
 
 #define blk_pm_suspend_request(rq)	((rq)->flags & REQ_PM_SUSPEND)
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5b7f526..4dd674e 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -316,6 +316,11 @@ struct scsi_lun {
 #define DID_IMM_RETRY   0x0c	/* Retry without decrementing retry count  */
 #define DID_REQUEUE	0x0d	/* Requeue command (no immediate retry) also
 				 * without decrementing the retry count	   */
+#define DID_TRANSPORT_DISRUPTED 0x0e /* Transport error disrupted execution
+				      * and the driver blocked the port to
+				      * recover the link. Transport class will
+				      * retry or fail IO */
+#define DID_TRANSPORT_FAILFAST	0x0f /* Transport class fastfailed the io */
 #define DRIVER_OK       0x00	/* Driver status                           */
 
 /*
@@ -363,6 +368,54 @@ struct scsi_lun {
 #define SCSI_MLQUEUE_EH_RETRY    0x1057
 
 /*
+ * New Midlevel queue return values. To maintain KABI we keep
+ * the old values above and rename the new ones with a version
+ * number 2.
+ */
+enum {
+	/*
+	 * Retry Constraints
+	 *
+	 * SCSI_IGN_ALLOWED		: Ignore cmd retries allowed check
+	 * SCSI_IGN_BLK_FAILFAST	: Ignore blk_failfast check.
+	 */
+	SCSI_IGN_ALLOWED	= 0x01,
+	SCSI_IGN_BLK_FAILFAST	= 0x02,
+
+	SCSI_MLQUEUE_DIS_SHIFT		= 4,
+	SCSI_MLQUEUE_DIS_FINISH		= 0x01 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_RETRY		= 0x02 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_XPT_RETRY	= 0x04 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_DEV_RETRY	= 0x08 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_DRV_RETRY	= 0x10 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_FAIL		= 0x20 << SCSI_MLQUEUE_DIS_SHIFT,
+
+	SCSI_MLQUEUE_BUSY_SHIFT		= 8,
+	SCSI_MLQUEUE_HOST_BUSY2		= (0x01 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_DEVICE_BUSY2	= (0x02 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_TARGET_BUSY2	= (0x04 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_IMM_RETRY		= (0x08 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+};
+
+#define scsi_disposition_finish(dis) (dis & SCSI_MLQUEUE_DIS_FINISH)
+#define scsi_disposition_fail(dis) (dis & SCSI_MLQUEUE_DIS_FAIL)
+#define scsi_disposition_retry(dis)			\
+	((dis & SCSI_MLQUEUE_DIS_RETRY)		||	\
+	 (dis & SCSI_MLQUEUE_DIS_XPT_RETRY)	||	\
+	 (dis & SCSI_MLQUEUE_DIS_DEV_RETRY)	||	\
+	 (dis & SCSI_MLQUEUE_DIS_DRV_RETRY))
+#define scsi_ign_cmd_retries(dis) (dis & SCSI_IGN_ALLOWED)
+#define scsi_ign_failfast(dis) (dis & SCSI_IGN_BLK_FAILFAST)
+
+/*
  *  Use these to separate status msg and our bytes
  *
  *  These are set by:
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 6833b96..dc33796 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -218,6 +218,7 @@ struct fc_rport {	/* aka fc_starget_attrs */
 /* bit field values for struct fc_rport "flags" field: */
 #define FC_RPORT_DEVLOSS_PENDING	0x01
 #define FC_RPORT_SCAN_PENDING		0x02
+#define FC_RPORT_FAST_FAIL_TIMEDOUT	0x04
 
 #define	dev_to_rport(d)				\
 	container_of(d, struct fc_rport, dev)
@@ -515,12 +516,15 @@ fc_remote_port_chkready(struct fc_rport *rport)
 		if (rport->roles & FC_RPORT_ROLE_FCP_TARGET)
 			result = 0;
 		else if (rport->flags & FC_RPORT_DEVLOSS_PENDING)
-			result = DID_IMM_RETRY << 16;
+			result = DID_TRANSPORT_DISRUPTED << 16;
 		else
 			result = DID_NO_CONNECT << 16;
 		break;
 	case FC_PORTSTATE_BLOCKED:
-		result = DID_IMM_RETRY << 16;
+		if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
+			result = DID_TRANSPORT_FAILFAST << 16;
+		else
+			result = DID_TRANSPORT_DISRUPTED << 16;
 		break;
 	default:
 		result = DID_NO_CONNECT << 16;