kernel-2.6.18-194.11.1.el5.src.rpm

From: AMEET M. PARANJAPE <aparanja@redhat.com>
Date: Wed, 8 Oct 2008 12:10:52 -0400
Subject: [ppc64] fix race for a free SPU
Message-id: 20081008161052.31420.16764.sendpatchset@squad5-lp1.lab.bos.redhat.com
O-Subject: [PATCH RHEL5.3 BZ465581] Fix race for a free SPU
Bugzilla: 465581
RH-Acked-by: David Howells <dhowells@redhat.com>

RHBZ#
======
https://bugzilla.redhat.com/show_bug.cgi?id=465581

Description:
===========
There is currently a race for a free Synergistic Processing Element (SPE)
where one thread is doing a spu_yield() and another doing a spu_activate().

This change introduces a 'free_spu' flag to spu_unschedule() to indicate whether
the function should free the SPU after descheduling the context. We only set
this flag when we are not going to re-schedule another context on this SPU.
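
As a rough illustration of the pattern (a hypothetical, self-contained sketch;
none of the names below are spufs code), the idea is to keep a shared resource
out of the idle pool while its current user is being swapped for a specific
replacement, and to advertise it as free only when nothing else is about to be
bound to it:

/*
 * Toy model of the pattern, built with pthreads.  This is an invented
 * illustration, not spufs code: 'struct resource' stands in for struct spu,
 * res_unschedule() for spu_unschedule(), and res_schedule() for
 * spu_schedule().
 */
#include <pthread.h>
#include <stdio.h>

enum alloc_state { RES_USED, RES_FREE };

struct resource {
	enum alloc_state alloc_state;	/* analogue of spu->alloc_state */
	int owner;			/* id of the bound context, -1 if none */
};

static pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Unbind the current owner.  Only when 'free_res' is non-zero is the
 * resource returned to the idle pool; otherwise the caller promises to
 * bind a replacement itself, so concurrent allocators must not see it
 * as free in the meantime.
 */
static void res_unschedule(struct resource *res, int free_res)
{
	pthread_mutex_lock(&list_mutex);
	if (free_res)
		res->alloc_state = RES_FREE;
	res->owner = -1;
	pthread_mutex_unlock(&list_mutex);
}

/* Bind a new owner and mark the resource busy again. */
static void res_schedule(struct resource *res, int new_owner)
{
	pthread_mutex_lock(&list_mutex);
	res->alloc_state = RES_USED;
	res->owner = new_owner;
	pthread_mutex_unlock(&list_mutex);
}

int main(void)
{
	struct resource r = { RES_USED, 1 };

	/* A replacement is ready: keep the resource out of the idle pool
	 * across the switch, mirroring spu_unschedule(spu, ctx, 0). */
	res_unschedule(&r, 0);
	res_schedule(&r, 2);

	/* No replacement: make it visible to other threads again,
	 * mirroring spu_unschedule(spu, ctx, 1). */
	res_unschedule(&r, 1);

	printf("state=%s owner=%d\n",
	       r.alloc_state == RES_FREE ? "FREE" : "USED", r.owner);
	return 0;
}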

RHEL Version Found:
===================
RHEL 5.2

kABI Status:
============
No symbols were harmed.

Brew:
=====
Built on all platforms.
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1507204

Upstream Status:
================
This patch was accepted upstream in kernel 2.6.27-rc6:
http://kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.27-rc6

Test Status:
============
Without this patch, the system locks up and shows an oops message on shutdown.
A testcase is provided in the Red Hat Bugzilla that causes this behavior in
less than one hour.

With the patch applied, the testcase was started and the system ran normally
without any hangs.  The test was run for over 24 hours.

===============================================================

Ameet Paranjape 978-392-3903 ext 23903
IBM on-site partner

Proposed Patch:
===============

diff --git a/arch/powerpc/platforms/cell/spufs/sched.c b/arch/powerpc/platforms/cell/spufs/sched.c
index 53ca21b..864930a 100644
--- a/arch/powerpc/platforms/cell/spufs/sched.c
+++ b/arch/powerpc/platforms/cell/spufs/sched.c
@@ -713,13 +713,28 @@ static void spu_schedule(struct spu *spu, struct spu_context *ctx)
 	spu_release(ctx);
 }
 
-static void spu_unschedule(struct spu *spu, struct spu_context *ctx)
+/**
+ * spu_unschedule - remove a context from a spu, and possibly release it.
+ * @spu:	The SPU to unschedule from
+ * @ctx:	The context currently scheduled on the SPU
+ * @free_spu	Whether to free the SPU for other contexts
+ *
+ * Unbinds the context @ctx from the SPU @spu. If @free_spu is non-zero, the
+ * SPU is made available for other contexts (ie, may be returned by
+ * spu_get_idle). If this is zero, the caller is expected to schedule another
+ * context to this spu.
+ *
+ * Should be called with ctx->state_mutex held.
+ */
+static void spu_unschedule(struct spu *spu, struct spu_context *ctx,
+		int free_spu)
 {
 	int node = spu->node;
 
 	mutex_lock(&cbe_spu_info[node].list_mutex);
 	cbe_spu_info[node].nr_active--;
-	spu->alloc_state = SPU_FREE;
+	if (free_spu)
+		spu->alloc_state = SPU_FREE;
 	spu_unbind_context(spu, ctx);
 	ctx->stats.invol_ctx_switch++;
 	spu->stats.invol_ctx_switch++;
@@ -819,7 +834,7 @@ static int __spu_deactivate(struct spu_context *ctx, int force, int max_prio)
 	if (spu) {
 		new = grab_runnable_context(max_prio, spu->node);
 		if (new || force) {
-			spu_unschedule(spu, ctx);
+			spu_unschedule(spu, ctx, new == NULL);
 			if (new) {
 				if (new->flags & SPU_CREATE_NOSCHED)
 					wake_up(&new->stop_wq);
@@ -887,7 +902,7 @@ static noinline void spusched_tick(struct spu_context *ctx)
 	spu = ctx->spu;
 	new = grab_runnable_context(ctx->prio + 1, spu->node);
 	if (new) {
-		spu_unschedule(spu, ctx);
+		spu_unschedule(spu, ctx, 0);
 		if (test_bit(SPU_SCHED_SPU_RUN, &ctx->sched_flags))
 			spu_add_to_rq(ctx);
 	} else {