kernel-2.6.18-194.11.1.el5.src.rpm

From: Bhavna Sarathy <bnagendr@redhat.com>
Date: Fri, 31 Jul 2009 14:04:39 -0400
Subject: [xen] amd iommu: crash with pass-through on large memory
Message-id: 4A733237.40103@redhat.com
O-Subject: Re: [RHEL5.4 Xen PATCH] Fix system crash issue on AMD IOMMU with large memory
Bugzilla: 514910
RH-Acked-by: Prarit Bhargava <prarit@redhat.com>
RH-Acked-by: Markus Armbruster <armbru@redhat.com>
RH-Acked-by: Rik van Riel <riel@redhat.com>

Posting the patch with a description, as requested by Don; it includes notes
regarding upstream.

Resolves BZ 514910

AMD QA found that passing a device through to an HVM guest on RHEL 5.4 Xen
with the IOMMU enabled crashes the system. The crash occurs when host
memory is >= 16G and guest memory is >= 1024. The entire system goes down
and reboots.

Root cause: alloc_domheap_page() can return a NULL pointer, and passing that
pointer on to map_domain_page() crashes the whole system. The solution is to
check the page pointer returned by alloc_domheap_page() in
alloc_amd_iommu_pgtable(). With the patch shown below, the crash disappears.

For now this is a RHEL-specific fix, as the upstream IOMMU implementation uses
the dom heap and RHEL (3.1.2) uses the xen heap. If upstream has a similar
issue (testing ongoing), the same patch will be submitted upstream.

With large memory sizes the IO page table is huge, but it has to be allocated
on the xen heap, which has a fixed size. When that heap runs out, the
allocation returns NULL. The fix is simple, contained in the AMD IOMMU code
base, and is good programming practice: check the pointer and return if it
is NULL.
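
The failure mode can be illustrated outside the hypervisor. The following is
only a minimal user-space sketch, not the Xen code: the fixed-size pool and the
names pool_alloc_page(), alloc_pgtable_checked() and build_tables() are
hypothetical stand-ins for the heap allocator and its callers. It shows a
fixed-size heap running dry and the caller surviving because the pointer is
checked before use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_PAGES 4       /* hypothetical: a tiny fixed-size heap */
#define PAGE_BYTES 4096

static unsigned char pool[POOL_PAGES][PAGE_BYTES];
static size_t pool_used;

/* Stand-in for the page allocator: returns NULL once the pool is exhausted. */
static void *pool_alloc_page(void)
{
    if (pool_used == POOL_PAGES)
        return NULL;
    return pool[pool_used++];
}

/* Same shape as the fix: check the pointer before touching the page. */
static void *alloc_pgtable_checked(void)
{
    void *pg = pool_alloc_page();

    if (pg == NULL)
        return NULL;               /* propagate the failure, do not dereference */
    memset(pg, 0, PAGE_BYTES);     /* safe: pg is known to be valid here */
    return pg;
}

/* Hypothetical caller: builds up to 'wanted' tables, stops at the first failure. */
static size_t build_tables(size_t wanted)
{
    size_t built = 0;

    while (built < wanted && alloc_pgtable_checked() != NULL)
        built++;
    assert(pool_used <= POOL_PAGES);   /* the pool is never overrun */
    return built;
}
```

Without the NULL check in alloc_pgtable_checked(), the memset() on an exhausted
pool would dereference NULL, which is the user-space analogue of the crash
described above.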

Patch tested on several Toonie IOMMU systems; it fixes the crash seen
with >= 16G host memory and 1024 guest memory. See BZ for further results.

Requested exception. Please review and ACK for inclusion in RHEL5.4.

diff --git a/include/asm-x86/hvm/svm/amd-iommu-proto.h b/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 4dc1577..cf2d60f 100644
--- a/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -122,6 +122,8 @@ static inline struct page_info* alloc_amd_iommu_pgtable(void)
     void *vaddr;
 
     pg = alloc_domheap_page(NULL);
+    if ( pg == NULL )
+        return 0;
     vaddr = map_domain_page(page_to_mfn(pg));
     if ( !vaddr )
         return 0;