Sophie

Sophie

distrib > Scientific%20Linux > 5x > x86_64 > by-pkgid > 27922b4260f65d317aabda37e42bbbff > files > 3101

kernel-2.6.18-238.el5.src.rpm

From: mchristi@redhat.com <mchristi@redhat.com>
Date: Wed, 11 Feb 2009 18:11:43 -0600
Subject: [net] skbuff: fix oops in skb_seq_read
Message-id: 1234397503674-git-send-email-mchristi@redhat.com
O-Subject: [PATCH] RHEL 5.4: net: fix oops in skb_seq_read
Bugzilla: 483285
RH-Acked-by: Neil Horman <nhorman@redhat.com>
RH-Acked-by: David Miller <davem@redhat.com>
RH-Acked-by: Thomas Graf <tgraf@redhat.com>

From: Mike Christie <mchristi@redhat.com>

This is for BZ 483285.

This patch fixes multiple bugs in skb_seq_read.

>From the usptream commits:

-------------------------------------------------------
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=95e3b24cfb4ec0479d2c42f7a1780d68063a542a
The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
are of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset

-------------------------------------------------------
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=71b3346d182355f19509fadb8fe45114a35cc499

The st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

-------------------------------------------------------
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=5b5a60da281c767196427ce8144deae6ec46b389

Having walked through the entire skbuff, skb_seq_read would leave the
last fragment mapped.  As a consequence, the unwary caller would leak
kmaps, and proceed with preempt_count off by one. The only (kind of
non-intuitive) workaround is to use skb_seq_read_abort.

I have tested with a couple ixgbe drivers from intel's website with LRO
enabled. Some there do not build becuase of the missing net if lro, but
others do compile and have a warning to not use with iscsi because of
its use of skb_seq_read. Our iscsi driver in 5.3 does not use
skb_seq_read, but I am porting iscsi fixes for 5.4 that will use
it, so this is why I am sending the net fixes now.

With or without the intel LRO driver I did not hit the oops. Upstream,
I also did not hit the oops like intel and lsi did. I would see data
corruption every once and a while instead (did not see that with rhel
though).

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 750ec6f..836cc5f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1720,11 +1720,11 @@ void skb_prepare_seq_read(struct sk_buff *skb, unsigned int from,
  * of bytes already consumed and the next call to
  * skb_seq_read() will return the remaining part of the block.
  *
- * Note: The size of each block of data returned can be arbitary,
+ * Note 1: The size of each block of data returned can be arbitary,
  *       this limitation is the cost for zerocopy seqeuental
  *       reads of potentially non linear data.
  *
- * Note: Fragment lists within fragments are not implemented
+ * Note 2: Fragment lists within fragments are not implemented
  *       at the moment, state->root_skb could be replaced with
  *       a stack for this purpose.
  */
@@ -1738,10 +1738,10 @@ unsigned int skb_seq_read(unsigned int consumed, const u8 **data,
 		return 0;
 
 next_skb:
-	block_limit = skb_headlen(st->cur_skb);
+	block_limit = skb_headlen(st->cur_skb) + st->stepped_offset;
 
 	if (abs_offset < block_limit) {
-		*data = st->cur_skb->data + abs_offset;
+		*data = st->cur_skb->data + (abs_offset - st->stepped_offset);
 		return block_limit - abs_offset;
 	}
 
@@ -1771,13 +1771,19 @@ next_skb:
 		st->stepped_offset += frag->size;
 	}
 
-	if (st->cur_skb->next) {
-		st->cur_skb = st->cur_skb->next;
+	if (st->frag_data) {
+		kunmap_skb_frag(st->frag_data);
+		st->frag_data = NULL;
+	}
+
+	if (st->root_skb == st->cur_skb &&
+	    skb_shinfo(st->root_skb)->frag_list) {
+		st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
 		st->frag_idx = 0;
 		goto next_skb;
-	} else if (st->root_skb == st->cur_skb &&
-		   skb_shinfo(st->root_skb)->frag_list) {
-		st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
+	} else if (st->cur_skb->next) {
+		st->cur_skb = st->cur_skb->next;
+		st->frag_idx = 0;
 		goto next_skb;
 	}