From: Roland McGrath <roland@redhat.com>
Subject: Re: [RHEL5.1 patch] utrace update
Date: Wed, 20 Jun 2007 05:26:34 -0700 (PDT)
Bugzilla: 229886 228397 217809 210693
Message-Id: <20070620122634.614194D05DC@magilla.localdomain>
Changelog: [misc] utrace update


This fixes lots of problems in the utrace layer and in utrace-based ptrace.
Related BZ#s:

229886
228397
225531
217809
210693

Deadline or no, there will be more utrace changes after this.  I am still
working on some known bugs, all of them either potential crashers or
regressions from upstream and RHEL4.  Since I know I will be freshening
this soon, I am not putting any stock in serious testing with this
version yet.  The troubled code will be changing more anyway.  But none
of the backporting issues should be affected by that, and those are
covered now, or close to it.

I think I may have created a mild kabi snafu.  I meant to keep utrace out
of kabi changes, but I never thought much about what genksyms does exactly.
There is no actual compatibility issue, just a symversions issue.  The
"struct utrace" type is entirely private to code in kernel/utrace.c, but
task_struct has a member "struct utrace *utrace".  The patches have changed
this type (which is now defined privately in one source file).  In fact,
everything would be happy if the field were "void *utrace".  But I get the
impression now that genksyms will have used the old "struct utrace"
definition in the signature of anything using task_struct.  Am I right that
this is now screwed up?  How do I fix it?  Where is the easy test to run to
see if you changed symversions?
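
To illustrate what I suspect genksyms is doing, here is a reduced
sketch, with invented names rather than the actual RHEL-5 headers (this
is also why the linux/ptrace.h hunk below hides the new tracehook.h
include behind #ifndef __GENKSYMS__).  My best guess at an easy test is
to build the old and new trees and diff their Module.symvers files,
which record a CRC for each exported symbol.

	/* Reduced sketch, invented names; not the real RHEL-5 headers. */

	/*
	 * Incomplete type: genksyms hashes only the name, so the CRCs
	 * of exported symbols whose prototypes involve task_struct stay
	 * stable no matter how the real struct utrace changes.
	 */
	struct utrace;

	struct task_struct {
		/* ... */
		struct utrace *utrace;	/* same CRC effect as "void *utrace" */
	};

	/*
	 * If genksyms instead sees a full definition in scope, e.g.
	 *
	 *	struct utrace { struct list_head attached, attaching; };
	 *
	 * it expands that definition into the signature of every exported
	 * function mentioning task_struct, so moving the definition into
	 * kernel/utrace.c (or changing any member) changes those CRCs
	 * even though nothing about the real ABI changed.
	 */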

This obsoletes linux-2.6-utrace-exploit-and-unkillable-cpu-fixes.patch.
It also obsoletes the last couple of hunks of linux-2.6-proc-self-maps-fix.patch,
so I'm appending a replacement version of that patch.

interdiff did surprisingly well, so what I'm posting here is the interdiff
between the old and new linux-2.6-utrace.patch files.  But I don't suggest
this be committed.  Instead, I'd replace linux-2.6-utrace.patch with the new
version, which you can find in ~roland/dist/kernel/RHEL-5/ on devserv.

In fact, what I would really prefer is to replace linux-2.6-utrace.patch
with my 12-patch series at http://people.redhat.com/roland/utrace/2.6.18/
as individual patches in the spec/cvs.  I've started using quilt to
maintain the backports now.  Future utrace fixes are likely to touch only
one or two of the patches in the series, which are the ones that are taken
just about verbatim from the "upstream" utrace series.  The earlier patches
in the series are the ones with all the backporting work, and they are
unlikely to change.  I think it would be easiest to manage their future
changes in dist cvs as separate patches.

Of course, we have diff and sh, so I can produce whatever you want for the
RHEL-5 cvs.  But the way I'm actually going to be doing the backport
maintenance (for some time to come) is by wholesale fresh backports of the
new "upstream" utrace core code after making fixes there, not by
incremental patches to old versions.  It's just the only reasonable way to
deal with the utrace core and ptrace code, which might take another update
cycle or two to really settle down.


Thanks,
Roland

--- linux-2.6.18/arch/arm/kernel/ptrace.c
+++ linux-2.6.18/arch/arm/kernel/ptrace.c
@@ -812,18 +812,19 @@ asmlinkage int syscall_trace(int why, st
 {
 	unsigned long ip;
 
-	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
-		/*
-		 * Save IP.  IP is used to denote syscall entry/exit:
-		 *  IP = 0 -> entry, = 1 -> exit
-		 */
-		ip = regs->ARM_ip;
-		regs->ARM_ip = why;
+	if (!test_thread_flag(TIF_SYSCALL_TRACE))
+		return scno;
 
-		tracehook_report_syscall(regs, why);
+	/*
+	 * Save IP.  IP is used to denote syscall entry/exit:
+	 *  IP = 0 -> entry, = 1 -> exit
+	 */
+	ip = regs->ARM_ip;
+	regs->ARM_ip = why;
 
-		regs->ARM_ip = ip;
-	}
+	current->ptrace_message = scno;
 
-	return scno;
+	tracehook_report_syscall(regs, why);
+
+	return current->ptrace_message;
 }
--- linux-2.6.18/arch/i386/kernel/ptrace.c
+++ linux-2.6.18/arch/i386/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <asm/debugreg.h>
 #include <asm/ldt.h>
 #include <asm/desc.h>
+#include <asm/tracehook.h>
 
 
 /*
@@ -85,33 +86,45 @@ static int putreg(struct task_struct *ch
 	unsigned long regno, unsigned long value)
 {
 	switch (regno >> 2) {
-		case FS:
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.fs = value;
-			return 0;
-		case GS:
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.gs = value;
-			return 0;
-		case DS:
-		case ES:
-			if (value && (value & 3) != 3)
-				return -EIO;
-			value &= 0xffff;
-			break;
-		case SS:
-		case CS:
-			if ((value & 3) != 3)
-				return -EIO;
-			value &= 0xffff;
-			break;
-		case EFL:
-			value &= FLAG_MASK;
-			value |= get_stack_long(child, EFL_OFFSET) & ~FLAG_MASK;
-			clear_tsk_thread_flag(child, TIF_FORCED_TF);
-			break;
+	case FS:
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.fs = value;
+		if (child == current)
+			/*
+			 * The user-mode %gs is not affected by
+			 * kernel entry, so we must update the CPU.
+			 */
+			loadsegment(fs, value);
+		return 0;
+	case GS:
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.gs = value;
+		if (child == current)
+			/*
+			 * The user-mode %gs is not affected by
+			 * kernel entry, so we must update the CPU.
+			 */
+			loadsegment(gs, value);
+		return 0;
+	case DS:
+	case ES:
+		if (value && (value & 3) != 3)
+			return -EIO;
+		value &= 0xffff;
+		break;
+	case SS:
+	case CS:
+		if ((value & 3) != 3)
+			return -EIO;
+		value &= 0xffff;
+		break;
+	case EFL:
+		value &= FLAG_MASK;
+		value |= get_stack_long(child, EFL_OFFSET) & ~FLAG_MASK;
+		clear_tsk_thread_flag(child, TIF_FORCED_TF);
+		break;
 	}
 	if (regno > GS*4)
 		regno -= 2*4;
@@ -125,29 +138,27 @@ static unsigned long getreg(struct task_
 	unsigned long retval = ~0UL;
 
 	switch (regno >> 2) {
-		case FS:
-			retval = child->thread.fs;
-			break;
-		case GS:
-			retval = child->thread.gs;
-			break;
-		case EFL:
-			if (test_tsk_thread_flag(child, TIF_FORCED_TF))
-				retval &= ~X86_EFLAGS_TF;
-			goto fetch;
-		case DS:
-		case ES:
-		case SS:
-		case CS:
-			retval = 0xffff;
-			/* fall through */
-		default:
-		fetch:
-			if (regno > GS*4)
-				regno -= 2*4;
-			regno = regno - sizeof(struct pt_regs);
-			retval &= get_stack_long(child, regno);
-			break;
+	case FS:
+		retval = child->thread.fs;
+		if (child == current)
+			savesegment(fs, retval);
+		break;
+	case GS:
+		retval = child->thread.gs;
+		if (child == current)
+			savesegment(gs, retval);
+		break;
+	case DS:
+	case ES:
+	case SS:
+	case CS:
+		retval = 0xffff;
+		/* fall through */
+	default:
+		if (regno > GS*4)
+			regno -= 2*4;
+		regno = regno - sizeof(struct pt_regs);
+		retval &= get_stack_long(child, regno);
 	}
 	return retval;
 }
@@ -313,7 +324,6 @@ genregs_set(struct task_struct *target,
 		}
 	}
 	else {
-		int ret = 0;
 		const unsigned long __user *up = ubuf;
 		while (!ret && count > 0) {
 			unsigned long val;
@@ -540,7 +550,7 @@ dbregs_set(struct task_struct *target,
 		else
 			clear_tsk_thread_flag(target, TIF_DEBUG);
 
-	set:
+set:
 		target->thread.debugreg[pos] = val;
 		if (target == current)
 			switch (pos) {
@@ -724,24 +734,29 @@ static const struct utrace_regset native
 	},
 };
 
-const struct utrace_regset_view utrace_i386_native = {
+
+static const struct utrace_regset_view utrace_i386_native = {
 	.name = "i386", .e_machine = EM_386,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_i386_native);
+
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+	return &utrace_i386_native;
+}
 
 #ifdef CONFIG_PTRACE
 static const struct ptrace_layout_segment i386_uarea[] = {
 	{0, FRAME_SIZE*4, 0, 0},
+	{FRAME_SIZE*4, offsetof(struct user, u_debugreg[0]), -1, 0},
 	{offsetof(struct user, u_debugreg[0]),
 	 offsetof(struct user, u_debugreg[8]), 4, 0},
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_ptrace(long *req, struct task_struct *child,
-			 struct utrace_attached_engine *engine,
-			 unsigned long addr, unsigned long data, long *val)
+int arch_ptrace(long *req, struct task_struct *child,
+		struct utrace_attached_engine *engine,
+		unsigned long addr, unsigned long data, long *val)
 {
 	switch (*req) {
 	case PTRACE_PEEKUSR:
--- linux-2.6.18/arch/i386/math-emu/fpu_entry.c
+++ linux-2.6.18/arch/i386/math-emu/fpu_entry.c
@@ -25,7 +25,6 @@
  +---------------------------------------------------------------------------*/
 
 #include <linux/signal.h>
-#include <linux/ptrace.h>
 
 #include <asm/uaccess.h>
 #include <asm/desc.h>
@@ -211,9 +210,8 @@ asmlinkage void math_emulate(long arg)
       if ( code_limit < code_base ) code_limit = 0xffffffff;
     }
 
-  FPU_lookahead = 1;
-  if (current->ptrace & PT_PTRACED)
-    FPU_lookahead = 0;
+  /* Don't run ahead if single-stepping.  */
+  FPU_lookahead = (FPU_EFLAGS & X86_EFLAGS_TF) == 0;
 
   if ( !valid_prefix(&byte1, (u_char __user **)&FPU_EIP,
 		     &addr_modes.override) )
--- linux-2.6.18/arch/ia64/ia32/sys_ia32.c
+++ linux-2.6.18/arch/ia64/ia32/sys_ia32.c
@@ -2332,10 +2332,8 @@ static const struct utrace_regset ia32_r
 
 const struct utrace_regset_view utrace_ia32_view = {
 	.name = "i386", .e_machine = EM_386,
-	.regsets = ia32_regsets,
-	.n = sizeof ia32_regsets / sizeof ia32_regsets[0],
+	.regsets = ia32_regsets, .n = ARRAY_SIZE(ia32_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_ia32_view);
 #endif
 
 #ifdef CONFIG_PTRACE
--- linux-2.6.18/arch/ia64/kernel/ptrace.c
+++ linux-2.6.18/arch/ia64/kernel/ptrace.c
@@ -763,6 +763,11 @@ syscall_trace_leave (long arg0, long arg
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(&regs, 1);
+
+	if (test_thread_flag(TIF_SINGLESTEP)) {
+		force_sig(SIGTRAP, current); /* XXX */
+		tracehook_report_syscall_step(&regs);
+	}
 }
 
 
@@ -1540,14 +1545,21 @@ static const struct utrace_regset native
 	}
 };
 
-const struct utrace_regset_view utrace_ia64_native = {
+static const struct utrace_regset_view utrace_ia64_native = {
 	.name = "ia64",
 	.e_machine = EM_IA_64,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_ia64_native);
 
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+#ifdef CONFIG_IA32_SUPPORT
+	extern const struct utrace_regset_view utrace_ia32_view;
+	if (IS_IA32_PROCESS(task_pt_regs(tsk)))
+		return &utrace_ia32_view;
+#endif
+	return &utrace_ia64_native;
+}
 #endif	/* CONFIG_UTRACE */
 
 
@@ -1562,22 +1574,26 @@ static const struct ptrace_layout_segmen
 	{WORD(cfm, 1),			0,	ELF_CFM_OFFSET},
 	{WORD(cr_ipsr, 1),		0,	ELF_CR_IPSR_OFFSET},
 	{WORD(pr, 1),			0,	ELF_PR_OFFSET},
-	{WORD(gr[0], 32),		0,	ELF_GR_OFFSET(0)},
+	{WORD(gr[0], 1),		-1,	-1},
+	{WORD(gr[1], 31),		0,	ELF_GR_OFFSET(1)},
 	{WORD(br[0], 8),		0, 	ELF_BR_OFFSET(0)},
-	{WORD(ar[0], 16),		-1,	0},
+	{WORD(ar[0], 16),		-1,	-1},
 	{WORD(ar[PT_AUR_RSC], 4),	0,	ELF_AR_RSC_OFFSET},
-	{WORD(ar[PT_AUR_RNAT+1], 12),	-1,	0},
+	{WORD(ar[PT_AUR_RNAT+1], 12),	-1,	-1},
 	{WORD(ar[PT_AUR_CCV], 1),	0,	ELF_AR_CCV_OFFSET},
-	{WORD(ar[PT_AUR_CCV+1], 3),	-1,	0},
+	{WORD(ar[PT_AUR_CCV+1], 3),	-1,	-1},
 	{WORD(ar[PT_AUR_UNAT], 1), 	0,	ELF_AR_UNAT_OFFSET},
-	{WORD(ar[PT_AUR_UNAT+1], 3),	-1,	0},
+	{WORD(ar[PT_AUR_UNAT+1], 3),	-1,	-1},
 	{WORD(ar[PT_AUR_FPSR], 1), 	0,	ELF_AR_FPSR_OFFSET},
-	{WORD(ar[PT_AUR_FPSR+1], 24), 	-1,	0},
+	{WORD(ar[PT_AUR_FPSR+1], 23), 	-1,	-1},
 	{WORD(ar[PT_AUR_PFS], 3),  	0,	ELF_AR_PFS_OFFSET},
-	{WORD(ar[PT_AUR_EC+1], 62),	-1,	0},
+	{WORD(ar[PT_AUR_EC+1], 62),	-1,	-1},
 	{offsetof(struct pt_all_user_regs, fr[0]),
+	 offsetof(struct pt_all_user_regs, fr[2]),
+	 -1, -1},
+	{offsetof(struct pt_all_user_regs, fr[2]),
 	 offsetof(struct pt_all_user_regs, fr[128]),
-	 1, 0},
+	 1, 2 * sizeof(elf_fpreg_t)},
 	{0, 0, -1, 0}
 };
 #undef WORD
@@ -1618,9 +1634,9 @@ static const struct ptrace_layout_segmen
 };
 #undef NEXT
 
-fastcall int arch_ptrace(long *request, struct task_struct *child,
-			 struct utrace_attached_engine *engine,
-			 unsigned long addr, unsigned long data, long *val)
+int arch_ptrace(long *request, struct task_struct *child,
+		struct utrace_attached_engine *engine,
+		unsigned long addr, unsigned long data, long *val)
 {
 	int ret = -ENOSYS;
 	switch (*request) {
--- linux-2.6.18/arch/powerpc/kernel/ptrace.c
+++ linux-2.6.18/arch/powerpc/kernel/ptrace.c
@@ -342,12 +342,10 @@ static const struct utrace_regset native
 #endif
 };
 
-const struct utrace_regset_view utrace_ppc_native_view = {
+static const struct utrace_regset_view utrace_ppc_native_view = {
 	.name = UTS_MACHINE, .e_machine = ELF_ARCH,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_ppc_native_view);
 
 
 #ifdef CONFIG_PPC64
@@ -455,14 +453,21 @@ static const struct utrace_regset ppc32_
 	},
 };
 
-const struct utrace_regset_view utrace_ppc32_view = {
+static const struct utrace_regset_view utrace_ppc32_view = {
 	.name = "ppc", .e_machine = EM_PPC,
-	.regsets = ppc32_regsets,
-	.n = sizeof ppc32_regsets / sizeof ppc32_regsets[0],
+	.regsets = ppc32_regsets, .n = ARRAY_SIZE(ppc32_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_ppc32_view);
 #endif
 
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+#ifdef CONFIG_PPC64
+	if (test_tsk_thread_flag(tsk, TIF_32BIT))
+		return &utrace_ppc32_view;
+#endif
+	return &utrace_ppc_native_view;
+}
+
 
 #ifdef CONFIG_PTRACE
 static const struct ptrace_layout_segment ppc_uarea[] = {
@@ -471,9 +476,9 @@ static const struct ptrace_layout_segmen
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_ptrace(long *request, struct task_struct *child,
-			 struct utrace_attached_engine *engine,
-			 unsigned long addr, unsigned long data, long *val)
+int arch_ptrace(long *request, struct task_struct *child,
+		struct utrace_attached_engine *engine,
+		unsigned long addr, unsigned long data, long *val)
 {
 	switch (*request) {
 	case PTRACE_PEEKUSR:
@@ -533,11 +538,11 @@ static const struct ptrace_layout_segmen
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_compat_ptrace(compat_long_t *request,
-				struct task_struct *child,
-				struct utrace_attached_engine *engine,
-				compat_ulong_t addr, compat_ulong_t data,
-				compat_long_t *val)
+int arch_compat_ptrace(compat_long_t *request,
+		       struct task_struct *child,
+		       struct utrace_attached_engine *engine,
+		       compat_ulong_t addr, compat_ulong_t data,
+		       compat_long_t *val)
 {
 	void __user *uaddr = (void __user *) (unsigned long) addr;
 	int ret = -ENOSYS;
--- linux-2.6.18/arch/powerpc/platforms/cell/spufs/run.c
+++ linux-2.6.18/arch/powerpc/platforms/cell/spufs/run.c
@@ -166,7 +166,6 @@ static inline int spu_run_fini(struct sp
 
 	if (signal_pending(current))
 		ret = -ERESTARTSYS;
-
 	return ret;
 }
 
--- linux-2.6.18/arch/s390/kernel/compat_wrapper.S
+++ linux-2.6.18/arch/s390/kernel/compat_wrapper.S
@@ -121,7 +121,7 @@ sys32_ptrace_wrapper:
 	lgfr	%r3,%r3			# long
 	llgtr	%r4,%r4			# long
 	llgfr	%r5,%r5			# long
-	jg	sys_ptrace		# branch to system call
+	jg	compat_sys_ptrace	# branch to system call
 
 	.globl  sys32_alarm_wrapper 
 sys32_alarm_wrapper:
--- linux-2.6.18/arch/s390/kernel/ptrace.c
+++ linux-2.6.18/arch/s390/kernel/ptrace.c
@@ -311,12 +311,10 @@ static const struct utrace_regset native
 	},
 };
 
-const struct utrace_regset_view utrace_s390_native_view = {
+static const struct utrace_regset_view utrace_s390_native_view = {
 	.name = UTS_MACHINE, .e_machine = ELF_ARCH,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_s390_native_view);
 
 
 #ifdef CONFIG_COMPAT
@@ -561,14 +559,21 @@ static const struct utrace_regset s390_c
 	},
 };
 
-const struct utrace_regset_view utrace_s390_compat_view = {
+static const struct utrace_regset_view utrace_s390_compat_view = {
 	.name = "s390", .e_machine = EM_S390,
-	.regsets = s390_compat_regsets,
-	.n = sizeof s390_compat_regsets / sizeof s390_compat_regsets[0],
+	.regsets = s390_compat_regsets, .n = ARRAY_SIZE(s390_compat_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_s390_compat_view);
 #endif	/* CONFIG_COMPAT */
 
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+#ifdef CONFIG_COMPAT
+        if (test_tsk_thread_flag(tsk, TIF_31BIT))
+                return &utrace_s390_compat_view;
+#endif
+        return &utrace_s390_native_view;
+}
+
 
 #ifdef CONFIG_PTRACE
 static const struct ptrace_layout_segment s390_uarea[] = {
@@ -579,9 +584,9 @@ static const struct ptrace_layout_segmen
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_ptrace(long *request, struct task_struct *child,
-			 struct utrace_attached_engine *engine,
-			 unsigned long addr, unsigned long data, long *val)
+int arch_ptrace(long *request, struct task_struct *child,
+		struct utrace_attached_engine *engine,
+		unsigned long addr, unsigned long data, long *val)
 {
 	ptrace_area parea;
 	unsigned long tmp;
@@ -589,8 +594,49 @@ fastcall int arch_ptrace(long *request, 
 
 	switch (*request) {
 	case PTRACE_PEEKUSR:
+#ifdef CONFIG_64BIT
+		/*
+		 * Stupid gdb peeks/pokes the access registers in 64 bit with
+		 * an alignment of 4. Programmers from hell...
+		 */
+		if (addr >= PT_ACR0 && addr < PT_ACR15) {
+			if (addr & 3)
+				return -EIO;
+			tmp = *(unsigned long *)
+				((char *) child->thread.acrs + addr - PT_ACR0);
+			return put_user(tmp, (unsigned long __user *) data);
+		}
+		else if (addr == PT_ACR15) {
+			/*
+			 * Very special case: old & broken 64 bit gdb reading
+			 * from acrs[15]. Result is a 64 bit value. Read the
+			 * 32 bit acrs[15] value and shift it by 32. Sick...
+			 */
+			tmp = ((unsigned long) child->thread.acrs[15]) << 32;
+			return put_user(tmp, (unsigned long __user *) data);
+		}
+#endif
 		return ptrace_peekusr(child, engine, s390_uarea, addr, data);
 	case PTRACE_POKEUSR:
+#ifdef CONFIG_64BIT
+		if (addr >= PT_ACR0 && addr < PT_ACR15) {
+			if (addr & 3)
+				return -EIO;
+			*(unsigned long *) ((char *) child->thread.acrs
+					    + addr - PT_ACR0) = data;
+			return 0;
+		}
+		else if (addr == PT_ACR15) {
+			/*
+			 * Very special case: old & broken 64 bit gdb writing
+			 * to acrs[15] with a 64 bit value. Ignore the lower
+			 * half of the value and write the upper 32 bit to
+			 * acrs[15]. Sick...
+			 */
+			child->thread.acrs[15] = data >> 32;
+			return 0;
+		}
+#endif
 		return ptrace_pokeusr(child, engine, s390_uarea, addr, data);
 
 	case PTRACE_PEEKUSR_AREA:
@@ -641,11 +687,11 @@ static const struct ptrace_layout_segmen
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_compat_ptrace(compat_long_t *request,
-				struct task_struct *child,
-				struct utrace_attached_engine *engine,
-				compat_ulong_t addr, compat_ulong_t data,
-				compat_long_t *val)
+int arch_compat_ptrace(compat_long_t *request,
+		       struct task_struct *child,
+		       struct utrace_attached_engine *engine,
+		       compat_ulong_t addr, compat_ulong_t data,
+		       compat_long_t *val)
 {
 	ptrace_area_emu31 parea;
 
--- linux-2.6.18/arch/sparc64/kernel/ptrace.c
+++ linux-2.6.18/arch/sparc64/kernel/ptrace.c
@@ -254,12 +254,10 @@ static const struct utrace_regset native
 	},
 };
 
-const struct utrace_regset_view utrace_sparc64_native_view = {
+static const struct utrace_regset_view utrace_sparc64_native_view = {
 	.name = UTS_MACHINE, .e_machine = ELF_ARCH,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_sparc64_native_view);
 
 #ifdef CONFIG_COMPAT
 
@@ -593,15 +591,23 @@ static const struct utrace_regset sparc3
 	},
 };
 
-const struct utrace_regset_view utrace_sparc32_view = {
+static const struct utrace_regset_view utrace_sparc32_view = {
 	.name = "sparc", .e_machine = EM_SPARC,
-	.regsets = sparc32_regsets,
-	.n = sizeof sparc32_regsets / sizeof sparc32_regsets[0],
+	.regsets = sparc32_regsets, .n = ARRAY_SIZE(sparc32_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_sparc32_view);
 
 #endif	/* CONFIG_COMPAT */
 
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+#ifdef CONFIG_COMPAT
+	if (test_tsk_thread_flag(tsk, TIF_32BIT))
+		return &utrace_sparc32_view;
+#endif
+	return &utrace_sparc64_native_view;
+}
+
+
 /* To get the necessary page struct, access_process_vm() first calls
  * get_user_pages().  This has done a flush_dcache_page() on the
  * accessed page.  Then our caller (copy_{to,from}_user_page()) did
--- linux-2.6.18/arch/x86_64/ia32/ia32entry.S
+++ linux-2.6.18/arch/x86_64/ia32/ia32entry.S
@@ -318,7 +318,7 @@ ENTRY(ia32_syscall)
 	jnz ia32_tracesys
 ia32_do_syscall:	
 	cmpl $(IA32_NR_syscalls-1),%eax
-	ja  ia32_badsys
+	ja  int_ret_from_sys_call	/* ia32_tracesys has set RAX(%rsp) */
 	IA32_ARG_FIXUP
 	call *ia32_sys_call_table(,%rax,8) # xxx: rip relative
 ia32_sysret:
@@ -327,7 +327,7 @@ ia32_sysret:
 
 ia32_tracesys:			 
 	SAVE_REST
-	movq $-ENOSYS,RAX(%rsp)	/* really needed? */
+	movq $-ENOSYS,RAX(%rsp)	/* ptrace can change this for a bad syscall */
 	movq %rsp,%rdi        /* &pt_regs -> arg1 */
 	call syscall_trace_enter
 	LOAD_ARGS ARGOFFSET  /* reload args from stack in case ptrace changed it */
--- linux-2.6.18/arch/x86_64/ia32/ia32_signal.c
+++ linux-2.6.18/arch/x86_64/ia32/ia32_signal.c
@@ -491,6 +491,7 @@ int ia32_setup_frame(int sig, struct k_s
 
 	regs->cs = __USER32_CS; 
 	regs->ss = __USER32_DS; 
+
 	set_fs(USER_DS);
 
 #if DEBUG_SIG
@@ -583,6 +584,7 @@ int ia32_setup_rt_frame(int sig, struct 
 	
 	regs->cs = __USER32_CS; 
 	regs->ss = __USER32_DS; 
+
 	set_fs(USER_DS);
 
 #if DEBUG_SIG
--- linux-2.6.18/arch/x86_64/ia32/ptrace32.c
+++ linux-2.6.18/arch/x86_64/ia32/ptrace32.c
@@ -49,22 +49,30 @@ static int putreg32(struct task_struct *
 	switch (regno) {
 	case offsetof(struct user_regs_struct32, fs):
 		if (val && (val & 3) != 3) return -EIO; 
-		child->thread.fsindex = val & 0xffff;
+		child->thread.fsindex = val &= 0xffff;
+		if (child == current)
+			loadsegment(fs, val);
 		break;
 	case offsetof(struct user_regs_struct32, gs):
 		if (val && (val & 3) != 3) return -EIO; 
-		child->thread.gsindex = val & 0xffff;
+		child->thread.gsindex = val &= 0xffff;
+		if (child == current)
+			loadsegment(gs, val);
 		break;
 	case offsetof(struct user_regs_struct32, ds):
 		if (val && (val & 3) != 3) return -EIO; 
-		child->thread.ds = val & 0xffff;
+		child->thread.ds = val &= 0xffff;
+		if (child == current)
+			loadsegment(ds, val);
 		break;
 	case offsetof(struct user_regs_struct32, es):
-		child->thread.es = val & 0xffff;
+		child->thread.es = val &= 0xffff;
+		if (child == current)
+			loadsegment(ds, val);
 		break;
 	case offsetof(struct user_regs_struct32, ss):
 		if ((val & 3) != 3) return -EIO;
-        	stack[offsetof(struct pt_regs, ss)/8] = val & 0xffff;
+		stack[offsetof(struct pt_regs, ss)/8] = val & 0xffff;
 		break;
 	case offsetof(struct user_regs_struct32, cs):
 		if ((val & 3) != 3) return -EIO;
@@ -108,16 +116,24 @@ static int getreg32(struct task_struct *
 
 	switch (regno) {
 	case offsetof(struct user_regs_struct32, fs):
-	        val = child->thread.fsindex;
+		val = child->thread.fsindex;
+		if (child == current)
+			asm("movl %%fs,%0" : "=r" (val));
 		break;
 	case offsetof(struct user_regs_struct32, gs):
 		val = child->thread.gsindex;
+		if (child == current)
+			asm("movl %%gs,%0" : "=r" (val));
 		break;
 	case offsetof(struct user_regs_struct32, ds):
 		val = child->thread.ds;
+		if (child == current)
+			asm("movl %%ds,%0" : "=r" (val));
 		break;
 	case offsetof(struct user_regs_struct32, es):
 		val = child->thread.es;
+		if (child == current)
+			asm("movl %%es,%0" : "=r" (val));
 		break;
 
 	R32(cs, cs);
@@ -382,7 +398,6 @@ ia32_dbregs_set(struct task_struct *targ
 	 * We'll just hijack the native setter to do the real work for us.
 	 */
 	const struct utrace_regset *dbregset = &utrace_x86_64_native.regsets[2];
-
 	int ret = 0;
 
 	for (pos >>= 2, count >>= 2; count > 0; --count, ++pos) {
@@ -428,7 +443,7 @@ ia32_tls_get(struct task_struct *target,
 #define GET_BASE(desc) ( \
 	(((desc)->a >> 16) & 0x0000ffff) | \
 	(((desc)->b << 16) & 0x00ff0000) | \
-	( (desc)->b        & 0xff000000)   )
+	( (desc)->b    	   & 0xff000000)   )
 
 #define GET_LIMIT(desc) ( \
 	((desc)->a & 0x0ffff) | \
@@ -578,10 +593,8 @@ static const struct utrace_regset ia32_r
 
 const struct utrace_regset_view utrace_ia32_view = {
 	.name = "i386", .e_machine = EM_386,
-	.regsets = ia32_regsets,
-	.n = sizeof ia32_regsets / sizeof ia32_regsets[0],
+	.regsets = ia32_regsets, .n = ARRAY_SIZE(ia32_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_ia32_view);
 
 
 #ifdef CONFIG_PTRACE
@@ -591,15 +604,17 @@ EXPORT_SYMBOL_GPL(utrace_ia32_view);
 
 static const struct ptrace_layout_segment ia32_uarea[] = {
 	{0, sizeof(struct user_regs_struct32), 0, 0},
+	{sizeof(struct user_regs_struct32),
+	 offsetof(struct user32, u_debugreg[0]), -1, 0},
 	{offsetof(struct user32, u_debugreg[0]),
 	 offsetof(struct user32, u_debugreg[8]), 4, 0},
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_compat_ptrace(compat_long_t *req, struct task_struct *child,
-				struct utrace_attached_engine *engine,
-				compat_ulong_t addr, compat_ulong_t data,
-				compat_long_t *val)
+int arch_compat_ptrace(compat_long_t *req, struct task_struct *child,
+		       struct utrace_attached_engine *engine,
+		       compat_ulong_t addr, compat_ulong_t data,
+		       compat_long_t *val)
 {
 	switch (*req) {
 	case PTRACE_PEEKUSR:
--- linux-2.6.18/arch/x86_64/kernel/entry.S
+++ linux-2.6.18/arch/x86_64/kernel/entry.S
@@ -298,17 +298,17 @@ badsys:
 tracesys:			 
 	CFI_RESTORE_STATE
 	SAVE_REST
-	movq $-ENOSYS,RAX(%rsp)
+	movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */
 	FIXUP_TOP_OF_STACK %rdi
 	movq %rsp,%rdi
 	call syscall_trace_enter
 	LOAD_ARGS ARGOFFSET  /* reload args from stack in case ptrace changed it */
 	RESTORE_REST
 	cmpq $__NR_syscall_max,%rax
-	ja  1f
+	ja   int_ret_from_sys_call	/* -ENOSYS already in RAX(%rsp) */
 	movq %r10,%rcx	/* fixup for C */
 	call *sys_call_table(,%rax,8)
-1:	movq %rax,RAX-ARGOFFSET(%rsp)
+	movq %rax,RAX-ARGOFFSET(%rsp)
 	/* Use IRET because user could have changed frame */
 	jmp int_ret_from_sys_call
 	CFI_ENDPROC
--- linux-2.6.18/arch/x86_64/kernel/ptrace.c
+++ linux-2.6.18/arch/x86_64/kernel/ptrace.c
@@ -22,6 +22,7 @@
 #include <linux/signal.h>
 #include <linux/module.h>
 
+#include <asm/tracehook.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
 #include <asm/system.h>
@@ -230,53 +231,61 @@ static int putreg(struct task_struct *ch
 	if (test_tsk_thread_flag(child, TIF_IA32))
 		value &= 0xffffffff;
 	switch (regno) {
-		case offsetof(struct user_regs_struct,fs):
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.fsindex = value & 0xffff; 
-			return 0;
-		case offsetof(struct user_regs_struct,gs):
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.gsindex = value & 0xffff;
-			return 0;
-		case offsetof(struct user_regs_struct,ds):
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.ds = value & 0xffff;
-			return 0;
-		case offsetof(struct user_regs_struct,es): 
-			if (value && (value & 3) != 3)
-				return -EIO;
-			child->thread.es = value & 0xffff;
-			return 0;
-		case offsetof(struct user_regs_struct,ss):
-			if ((value & 3) != 3)
-				return -EIO;
-			value &= 0xffff;
-			return 0;
-		case offsetof(struct user_regs_struct,fs_base):
-			if (value >= TASK_SIZE_OF(child))
-				return -EIO;
-			child->thread.fs = value;
-			return 0;
-		case offsetof(struct user_regs_struct,gs_base):
-			if (value >= TASK_SIZE_OF(child))
-				return -EIO;
-			child->thread.gs = value;
-			return 0;
-		case offsetof(struct user_regs_struct, eflags):
-			value &= FLAG_MASK;
-			tmp = get_stack_long(child, EFL_OFFSET); 
-			tmp &= ~FLAG_MASK; 
-			value |= tmp;
-			clear_tsk_thread_flag(child, TIF_FORCED_TF);
-			break;
-		case offsetof(struct user_regs_struct,cs): 
-			if ((value & 3) != 3)
-				return -EIO;
-			value &= 0xffff;
-			break;
+	case offsetof(struct user_regs_struct,fs):
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.fsindex = value &= 0xffff;
+		if (child == current)
+			loadsegment(fs, value);
+		return 0;
+	case offsetof(struct user_regs_struct,gs):
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.gsindex = value &= 0xffff;
+		if (child == current)
+			loadsegment(gs, value);
+		return 0;
+	case offsetof(struct user_regs_struct,ds):
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.ds = value &= 0xffff;
+		if (child == current)
+			loadsegment(ds, value);
+		return 0;
+	case offsetof(struct user_regs_struct,es):
+		if (value && (value & 3) != 3)
+			return -EIO;
+		child->thread.es = value &= 0xffff;
+		if (child == current)
+			loadsegment(es, value);
+		return 0;
+	case offsetof(struct user_regs_struct,ss):
+		if ((value & 3) != 3)
+			return -EIO;
+		value &= 0xffff;
+		return 0;
+	case offsetof(struct user_regs_struct,fs_base):
+		if (value >= TASK_SIZE_OF(child))
+			return -EIO;
+		child->thread.fs = value;
+		return 0;
+	case offsetof(struct user_regs_struct,gs_base):
+		if (value >= TASK_SIZE_OF(child))
+			return -EIO;
+		child->thread.gs = value;
+		return 0;
+	case offsetof(struct user_regs_struct, eflags):
+		value &= FLAG_MASK;
+		tmp = get_stack_long(child, EFL_OFFSET);
+		tmp &= ~FLAG_MASK;
+		value |= tmp;
+		clear_tsk_thread_flag(child, TIF_FORCED_TF);
+		break;
+	case offsetof(struct user_regs_struct,cs):
+		if ((value & 3) != 3)
+			return -EIO;
+		value &= 0xffff;
+		break;
 	}
 	put_stack_long(child, regno - sizeof(struct pt_regs), value);
 	return 0;
@@ -285,29 +294,47 @@ static int putreg(struct task_struct *ch
 static unsigned long getreg(struct task_struct *child, unsigned long regno)
 {
 	unsigned long val;
+	unsigned int seg;
 	switch (regno) {
-		case offsetof(struct user_regs_struct, fs):
-			return child->thread.fsindex;
-		case offsetof(struct user_regs_struct, gs):
-			return child->thread.gsindex;
-		case offsetof(struct user_regs_struct, ds):
-			return child->thread.ds;
-		case offsetof(struct user_regs_struct, es):
-			return child->thread.es; 
-		case offsetof(struct user_regs_struct, fs_base):
-			return child->thread.fs;
-		case offsetof(struct user_regs_struct, gs_base):
-			return child->thread.gs;
-		default:
-			regno = regno - sizeof(struct pt_regs);
-			val = get_stack_long(child, regno);
-			if (test_tsk_thread_flag(child, TIF_IA32))
-				val &= 0xffffffff;
-			if (regno == (offsetof(struct user_regs_struct, eflags)
-				      - sizeof(struct pt_regs))
-			    && test_tsk_thread_flag(child, TIF_FORCED_TF))
-				val &= ~X86_EFLAGS_TF;
-			return val;
+	case offsetof(struct user_regs_struct, fs):
+		if (child == current) {
+			/* Older gas can't assemble movq %?s,%r?? */
+			asm("movl %%fs,%0" : "=r" (seg));
+			return seg;
+		}
+		return child->thread.fsindex;
+	case offsetof(struct user_regs_struct, gs):
+		if (child == current) {
+			asm("movl %%gs,%0" : "=r" (seg));
+			return seg;
+		}
+		return child->thread.gsindex;
+	case offsetof(struct user_regs_struct, ds):
+		if (child == current) {
+			asm("movl %%ds,%0" : "=r" (seg));
+			return seg;
+		}
+		return child->thread.ds;
+	case offsetof(struct user_regs_struct, es):
+		if (child == current) {
+			asm("movl %%es,%0" : "=r" (seg));
+			return seg;
+		}
+		return child->thread.es;
+	case offsetof(struct user_regs_struct, fs_base):
+		return child->thread.fs;
+	case offsetof(struct user_regs_struct, gs_base):
+		return child->thread.gs;
+	default:
+		regno = regno - sizeof(struct pt_regs);
+		val = get_stack_long(child, regno);
+		if (test_tsk_thread_flag(child, TIF_IA32))
+			val &= 0xffffffff;
+		if (regno == (offsetof(struct user_regs_struct, eflags)
+			      - sizeof(struct pt_regs))
+		    && test_tsk_thread_flag(child, TIF_FORCED_TF))
+			val &= ~X86_EFLAGS_TF;
+		return val;
 	}
 
 }
@@ -424,6 +451,7 @@ dbregs_set(struct task_struct *target,
 	   unsigned int pos, unsigned int count,
 	   const void *kbuf, const void __user *ubuf)
 {
+
 	unsigned long maxaddr = TASK_SIZE_OF(target);
 	maxaddr -= test_tsk_thread_flag(target, TIF_IA32) ? 3 : 7;
 
@@ -669,25 +697,32 @@ static const struct utrace_regset native
 
 const struct utrace_regset_view utrace_x86_64_native = {
 	.name = "x86-64", .e_machine = EM_X86_64,
-	.regsets = native_regsets,
-	.n = sizeof native_regsets / sizeof native_regsets[0],
+	.regsets = native_regsets, .n = ARRAY_SIZE(native_regsets)
 };
-EXPORT_SYMBOL_GPL(utrace_x86_64_native);
+
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk)
+{
+#ifdef CONFIG_IA32_EMULATION
+	if (test_tsk_thread_flag(tsk, TIF_IA32))
+		return &utrace_ia32_view;
+#endif
+	return &utrace_x86_64_native;
+}
 
 
 #ifdef CONFIG_PTRACE
 static const struct ptrace_layout_segment x86_64_uarea[] = {
 	{0, sizeof(struct user_regs_struct), 0, 0},
+	{sizeof(struct user_regs_struct),
+	 offsetof(struct user, u_debugreg[0]), -1, 0},
 	{offsetof(struct user, u_debugreg[0]),
-	 offsetof(struct user, u_debugreg[4]), 3, 0},
-	{offsetof(struct user, u_debugreg[6]),
-	 offsetof(struct user, u_debugreg[8]), 3, 6 * sizeof(long)},
+	 offsetof(struct user, u_debugreg[8]), 3, 0},
 	{0, 0, -1, 0}
 };
 
-fastcall int arch_ptrace(long *req, struct task_struct *child,
-			 struct utrace_attached_engine *engine,
-			 unsigned long addr, unsigned long data, long *val)
+int arch_ptrace(long *req, struct task_struct *child,
+		struct utrace_attached_engine *engine,
+		unsigned long addr, unsigned long data, long *val)
 {
 	switch (*req) {
 	case PTRACE_PEEKUSR:
--- linux-2.6.18/Documentation/DocBook/Makefile
+++ linux-2.6.18/Documentation/DocBook/Makefile
@@ -9,7 +9,7 @@
 DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
 	    kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
 	    procfs-guide.xml writing_usb_driver.xml \
-	    kernel-api.xml journal-api.xml lsm.xml usb.xml \
+	    kernel-api.xml journal-api.xml lsm.xml utrace.xml usb.xml \
 	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
 	    genericirq.xml
 
--- linux-2.6.18/Documentation/DocBook/utrace.tmpl
+++ linux-2.6.18/Documentation/DocBook/utrace.tmpl
@@ -0,0 +1,23 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="utrace">
+ <bookinfo>
+  <title>The utrace User Debugging Infrastructure</title>
+ </bookinfo>
+
+<toc></toc>
+
+<chapter><title>The utrace core API</title>
+!Iinclude/linux/utrace.h
+!Ekernel/utrace.c
+    </chapter>
+
+<chapter><title>Machine state access via utrace</title>
+!Finclude/linux/tracehook.h struct utrace_regset
+!Finclude/linux/tracehook.h struct utrace_regset_view
+!Finclude/linux/tracehook.h utrace_native_view
+    </chapter>
+
+</book>
--- linux-2.6.18/Documentation/utrace.txt
+++ linux-2.6.18/Documentation/utrace.txt
@@ -51,7 +51,7 @@ code.  Using the UTRACE starts out by at
 
 	struct utrace_attached_engine *
 	utrace_attach(struct task_struct *target, int flags,
-		      const struct utrace_engine_ops *ops, unsigned long data);
+		      const struct utrace_engine_ops *ops, void *data);
 
 Calling utrace_attach is what sets up a tracing engine to trace a
 thread.  Use UTRACE_ATTACH_CREATE in flags, and pass your engine's ops.
--- linux-2.6.18/fs/proc/array.c
+++ linux-2.6.18/fs/proc/array.c
@@ -165,12 +165,10 @@ static inline char * task_state(struct t
 	int g;
 	struct fdtable *fdt = NULL;
 
-	rcu_read_lock();
+	read_lock(&tasklist_lock);
 	tracer = tracehook_tracer_task(p);
 	tracer_pid = tracer == NULL ? 0 : tracer->pid;
-	rcu_read_unlock();
 
-	read_lock(&tasklist_lock);
 	buffer += sprintf(buffer,
 		"State:\t%s\n"
 		"SleepAVG:\t%lu%%\n"
--- linux-2.6.18/fs/proc/base.c
+++ linux-2.6.18/fs/proc/base.c
@@ -364,6 +364,46 @@ static int get_nr_threads(struct task_st
 	return count;
 }
 
+static int __ptrace_may_attach(struct task_struct *task)
+{
+	/* May we inspect the given task?
+	 * This check is used both for attaching with ptrace
+	 * and for allowing access to sensitive information in /proc.
+	 *
+	 * ptrace_attach denies several cases that /proc allows
+	 * because setting up the necessary parent/child relationship
+	 * or halting the specified task is impossible.
+	 */
+	int dumpable = 0;
+	/* Don't let security modules deny introspection */
+	if (task == current)
+		return 0;
+	if (((current->uid != task->euid) ||
+	     (current->uid != task->suid) ||
+	     (current->uid != task->uid) ||
+	     (current->gid != task->egid) ||
+	     (current->gid != task->sgid) ||
+	     (current->gid != task->gid)) && !capable(CAP_SYS_PTRACE))
+		return -EPERM;
+	smp_rmb();
+	if (task->mm)
+		dumpable = task->mm->dumpable;
+	if (!dumpable && !capable(CAP_SYS_PTRACE))
+		return -EPERM;
+
+	return security_ptrace(current, task);
+}
+
+int ptrace_may_attach(struct task_struct *task)
+{
+	int err;
+	task_lock(task);
+	err = __ptrace_may_attach(task);
+	task_unlock(task);
+	return !err;
+}
+
+
 static int proc_cwd_link(struct inode *inode, struct dentry **dentry, struct vfsmount **mnt)
 {
 	struct task_struct *task = get_proc_task(inode);
--- linux-2.6.18/include/asm-i386/tracehook.h
+++ linux-2.6.18/include/asm-i386/tracehook.h
@@ -1,5 +1,13 @@
 /*
  * Tracing hooks, i386 CPU support
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  */
 
 #ifndef _ASM_TRACEHOOK_H
@@ -33,17 +41,12 @@ static inline void tracehook_disable_sys
 	clear_tsk_thread_flag(tsk, TIF_SYSCALL_TRACE);
 }
 
+#define tracehook_syscall_callno(regs)	(&(regs)->orig_eax)
+#define tracehook_syscall_retval(regs)	(&(regs)->eax)
 static inline void tracehook_abort_syscall(struct pt_regs *regs)
 {
 	regs->orig_eax = -1;
 }
 
-extern const struct utrace_regset_view utrace_i386_native;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
-	return &utrace_i386_native;
-}
-
 
 #endif
--- linux-2.6.18/include/asm-ia64/thread_info.h
+++ linux-2.6.18/include/asm-ia64/thread_info.h
@@ -84,6 +84,7 @@ struct thread_info {
 #define TIF_NEED_RESCHED	2	/* rescheduling necessary */
 #define TIF_SYSCALL_TRACE	3	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	4	/* syscall auditing active */
+#define TIF_SINGLESTEP		5	/* restore singlestep on return to user mode */
 #define TIF_POLLING_NRFLAG	16	/* true if poll_idle() is polling TIF_NEED_RESCHED */
 #define TIF_MEMDIE		17
 #define TIF_MCA_INIT		18	/* this task is processing MCA or INIT */
@@ -91,7 +92,8 @@ struct thread_info {
 
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
-#define _TIF_SYSCALL_TRACEAUDIT	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT)
+#define _TIF_SINGLESTEP		(1 << TIF_SINGLESTEP)
+#define _TIF_SYSCALL_TRACEAUDIT	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
--- linux-2.6.18/include/asm-ia64/tracehook.h
+++ linux-2.6.18/include/asm-ia64/tracehook.h
@@ -23,24 +23,30 @@ static inline void tracehook_enable_sing
 {
 	struct pt_regs *pt = task_pt_regs(tsk);
 	ia64_psr(pt)->ss = 1;
+	set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 }
 
 static inline void tracehook_disable_single_step(struct task_struct *tsk)
 {
 	struct pt_regs *pt = task_pt_regs(tsk);
 	ia64_psr(pt)->ss = 0;
+	if (ia64_psr(pt)->tb == 0)
+		clear_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 }
 
 static inline void tracehook_enable_block_step(struct task_struct *tsk)
 {
 	struct pt_regs *pt = task_pt_regs(tsk);
 	ia64_psr(pt)->tb = 1;
+	set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 }
 
 static inline void tracehook_disable_block_step(struct task_struct *tsk)
 {
 	struct pt_regs *pt = task_pt_regs(tsk);
 	ia64_psr(pt)->tb = 0;
+	if (ia64_psr(pt)->ss == 0)
+		clear_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 }
 
 static inline void tracehook_enable_syscall_trace(struct task_struct *tsk)
@@ -67,17 +73,4 @@ static inline void tracehook_abort_sysca
 		regs->r15 = -1UL;
 }
 
-extern const struct utrace_regset_view utrace_ia64_native;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
-#ifdef CONFIG_IA32_SUPPORT
-	extern const struct utrace_regset_view utrace_ia32_view;
-	if (IS_IA32_PROCESS(task_pt_regs(tsk)))
-		return &utrace_ia32_view;
-#endif
-	return &utrace_ia64_native;
-}
-
-
 #endif	/* asm/tracehook.h */
--- linux-2.6.18/include/asm-powerpc/tracehook.h
+++ linux-2.6.18/include/asm-powerpc/tracehook.h
@@ -1,5 +1,13 @@
 /*
  * Tracing hooks, PowerPC CPU support
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  */
 
 #ifndef _ASM_TRACEHOOK_H
@@ -63,18 +71,4 @@ static inline void tracehook_abort_sysca
 }
 
 
-extern const struct utrace_regset_view utrace_ppc_native_view;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
-#ifdef CONFIG_PPC64
-	extern const struct utrace_regset_view utrace_ppc32_view;
-
-	if (test_tsk_thread_flag(tsk, TIF_32BIT))
-		return &utrace_ppc32_view;
-#endif
-	return &utrace_ppc_native_view;
-}
-
-
 #endif
--- linux-2.6.18/include/asm-s390/tracehook.h
+++ linux-2.6.18/include/asm-s390/tracehook.h
@@ -1,5 +1,13 @@
 /*
  * Tracing hooks, s390/s390x support.
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  */
 
 #ifndef _ASM_TRACEHOOK_H
@@ -35,19 +43,4 @@ static inline void tracehook_abort_sysca
 	regs->gprs[2] = -1L;
 }
 
-
-extern const struct utrace_regset_view utrace_s390_native_view;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
-#ifdef CONFIG_COMPAT
-        extern const struct utrace_regset_view utrace_s390_compat_view;
-
-        if (test_tsk_thread_flag(tsk, TIF_31BIT))
-                return &utrace_s390_compat_view;
-#endif
-        return &utrace_s390_native_view;
-}
-
-
 #endif
--- linux-2.6.18/include/asm-sparc64/tracehook.h
+++ linux-2.6.18/include/asm-sparc64/tracehook.h
@@ -1,5 +1,13 @@
 /*
  * Tracing hooks, SPARC64 CPU support
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  */
 
 #ifndef _ASM_TRACEHOOK_H
@@ -29,16 +37,4 @@ static inline void tracehook_abort_sysca
 	regs->u_regs[UREG_G1] = -1L;
 }
 
-extern const struct utrace_regset_view utrace_sparc64_native_view;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
-#ifdef CONFIG_COMPAT
-	extern const struct utrace_regset_view utrace_sparc32_view;
-	if (test_tsk_thread_flag(tsk, TIF_32BIT))
-		return &utrace_sparc32_view;
-#endif
-	return &utrace_sparc64_native_view;
-}
-
 #endif
--- linux-2.6.18/include/asm-x86_64/ptrace.h
+++ linux-2.6.18/include/asm-x86_64/ptrace.h
@@ -80,6 +80,9 @@ struct pt_regs {
 
 #define PTRACE_ARCH_PRCTL	  30	/* arch_prctl for child */
 
+#define PTRACE_SYSEMU		  31
+#define PTRACE_SYSEMU_SINGLESTEP  32
+
 #if defined(__KERNEL__) && !defined(__ASSEMBLY__) 
 #define user_mode(regs) (!!((regs)->cs & 3))
 #define user_mode_vm(regs) user_mode(regs)
--- linux-2.6.18/include/asm-x86_64/tracehook.h
+++ linux-2.6.18/include/asm-x86_64/tracehook.h
@@ -1,5 +1,13 @@
 /*
  * Tracing hooks, x86-64 CPU support
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  */
 
 #ifndef _ASM_TRACEHOOK_H
@@ -34,21 +42,19 @@ static inline void tracehook_disable_sys
 	clear_tsk_thread_flag(tsk, TIF_SYSCALL_TRACE);
 }
 
+#define tracehook_syscall_callno(regs)	(&(regs)->orig_rax)
+#define tracehook_syscall_retval(regs)	(&(regs)->rax)
 static inline void tracehook_abort_syscall(struct pt_regs *regs)
 {
 	regs->orig_rax = -1L;
 }
 
-extern const struct utrace_regset_view utrace_x86_64_native, utrace_ia32_view;
-static inline const struct utrace_regset_view *
-utrace_native_view(struct task_struct *tsk)
-{
+/*
+ * These are used directly by some of the regset code.
+ */
+extern const struct utrace_regset_view utrace_x86_64_native;
 #ifdef CONFIG_IA32_EMULATION
-	if (test_tsk_thread_flag(tsk, TIF_IA32))
-		return &utrace_ia32_view;
+extern const struct utrace_regset_view utrace_ia32_view;
 #endif
-	return &utrace_x86_64_native;
-}
-
 
 #endif
--- linux-2.6.18/include/linux/ptrace.h
+++ linux-2.6.18/include/linux/ptrace.h
@@ -49,21 +49,22 @@
 #include <asm/ptrace.h>
 
 #ifdef __KERNEL__
-#include <linux/compiler.h>
+#include <linux/compiler.h>		/* For unlikely.  */
+#include <linux/sched.h>		/* For struct task_struct.  */
 #include <linux/types.h>
-struct task_struct;
+#include <linux/errno.h>
 struct siginfo;
 struct rusage;
 
 
 extern int ptrace_may_attach(struct task_struct *task);
-extern int __ptrace_may_attach(struct task_struct *task);
 
 
 #ifdef CONFIG_PTRACE
 #include <asm/tracehook.h>
-struct utrace_attached_engine;
-struct utrace_regset_view;
+#ifndef __GENKSYMS__
+#include <linux/tracehook.h>
+#endif
 
 /*
  * These must be defined by arch code to handle machine-specific ptrace
@@ -80,30 +81,21 @@ struct utrace_regset_view;
  * and sets *retval to the value--which might have any bit pattern at all,
  * including one that looks like -ENOSYS or another error code.
  */
-extern fastcall int arch_ptrace(long *request, struct task_struct *child,
-				struct utrace_attached_engine *engine,
-				unsigned long addr, unsigned long data,
-				long *retval);
-#ifdef CONFIG_COMPAT
-#include <linux/compat.h>
-
-extern fastcall int arch_compat_ptrace(compat_long_t *request,
-				       struct task_struct *child,
-				       struct utrace_attached_engine *engine,
-				       compat_ulong_t a, compat_ulong_t d,
-				       compat_long_t *retval);
-#endif
+extern int arch_ptrace(long *request, struct task_struct *child,
+		       struct utrace_attached_engine *engine,
+		       unsigned long addr, unsigned long data,
+		       long *retval);
 
 /*
  * Convenience function doing access to a single utrace_regset for ptrace.
  * The offset and size are in bytes, giving the location in the regset data.
  */
-extern fastcall int ptrace_regset_access(struct task_struct *child,
-					 struct utrace_attached_engine *engine,
-					 const struct utrace_regset_view *view,
-					 int setno, unsigned long offset,
-					 unsigned int size, void __user *data,
-					 int write);
+extern int ptrace_regset_access(struct task_struct *child,
+				struct utrace_attached_engine *engine,
+				const struct utrace_regset_view *view,
+				int setno, unsigned long offset,
+				unsigned int size, void __user *data,
+				int write);
 
 /*
  * Convenience wrapper for doing access to a whole utrace_regset for ptrace.
@@ -121,11 +113,11 @@ static inline int ptrace_whole_regset(st
  * The regno value gives a slot number plus regset->bias.
  * The value accessed is regset->size bytes long.
  */
-extern fastcall int ptrace_onereg_access(struct task_struct *child,
-					 struct utrace_attached_engine *engine,
-					 const struct utrace_regset_view *view,
-					 int setno, unsigned long regno,
-					 void __user *data, int write);
+extern int ptrace_onereg_access(struct task_struct *child,
+				struct utrace_attached_engine *engine,
+				const struct utrace_regset_view *view,
+				int setno, unsigned long regno,
+				void __user *data, int write);
 
 
 /*
@@ -135,7 +127,9 @@ extern fastcall int ptrace_onereg_access
  * An element describes the range [.start, .end) of struct user offsets,
  * measured in bytes; it maps to the regset in the view's regsets array
  * at the index given by .regset, at .offset bytes into that regset's data.
- * If .regset is -1, then the [.start, .end) range reads as zero.
+ * If .regset is -1, then the [.start, .end) range reads as zero
+ * if .offset is zero, and is skipped on read (user's buffer unchanged)
+ * if .offset is -1.
  */
 struct ptrace_layout_segment {
 	unsigned int start, end, regset, offset;
@@ -145,12 +139,12 @@ struct ptrace_layout_segment {
  * Convenience function for doing access to a ptrace compatibility layout.
  * The offset and size are in bytes.
  */
-extern fastcall int ptrace_layout_access(
-	struct task_struct *child, struct utrace_attached_engine *engine,
-	const struct utrace_regset_view *view,
-	const struct ptrace_layout_segment layout[],
-	unsigned long offset, unsigned int size,
-	void __user *data, void *kdata, int write);
+extern int ptrace_layout_access(struct task_struct *child,
+				struct utrace_attached_engine *engine,
+				const struct utrace_regset_view *view,
+				const struct ptrace_layout_segment layout[],
+				unsigned long offset, unsigned int size,
+				void __user *data, void *kdata, int write);
 
 
 /* Convenience wrapper for the common PTRACE_PEEKUSR implementation.  */
@@ -175,7 +169,41 @@ static inline int ptrace_pokeusr(struct 
 				    NULL, &data, 1);
 }
 
+/*
+ * Called in copy_process.
+ */
+static inline void ptrace_init_task(struct task_struct *tsk)
+{
+	INIT_LIST_HEAD(&tsk->ptracees);
+}
+
+/*
+ * Called in do_exit, after setting PF_EXITING, no locks are held.
+ */
+void ptrace_exit(struct task_struct *tsk);
+
+/*
+ * Called in do_wait, with tasklist_lock held for reading.
+ * This reports any ptrace-child that is ready as do_wait would a normal child.
+ * If there are no ptrace children, returns -ECHILD.
+ * If there are some ptrace children but none reporting now, returns 0.
+ * In those cases the tasklist_lock is still held so next_thread(tsk) works.
+ * For any other return value, tasklist_lock is released before return.
+ */
+int ptrace_do_wait(struct task_struct *tsk,
+		   pid_t pid, int options, struct siginfo __user *infop,
+		   int __user *stat_addr, struct rusage __user *rusagep);
+
+
 #ifdef CONFIG_COMPAT
+#include <linux/compat.h>
+
+extern int arch_compat_ptrace(compat_long_t *request,
+			      struct task_struct *child,
+			      struct utrace_attached_engine *engine,
+			      compat_ulong_t a, compat_ulong_t d,
+			      compat_long_t *retval);
+
 /* Convenience wrapper for the common PTRACE_PEEKUSR implementation.  */
 static inline int ptrace_compat_peekusr(
 	struct task_struct *child, struct utrace_attached_engine *engine,
@@ -198,26 +226,10 @@ static inline int ptrace_compat_pokeusr(
 				    layout, addr, sizeof(compat_ulong_t),
 				    NULL, &data, 1);
 }
-#endif
-
-
-/*
- * Called in do_exit, after setting PF_EXITING, no locks are held.
- */
-void ptrace_exit(struct task_struct *tsk);
+#endif	/* CONFIG_COMPAT */
 
-/*
- * Called in do_wait, with tasklist_lock held for reading.
- * This reports any ptrace-child that is ready as do_wait would a normal child.
- * If there are no ptrace children, returns -ECHILD.
- * If there are some ptrace children but none reporting now, returns 0.
- * In those cases the tasklist_lock is still held so next_thread(tsk) works.
- * For any other return value, tasklist_lock is released before return.
- */
-int ptrace_do_wait(struct task_struct *tsk,
-		   pid_t pid, int options, struct siginfo __user *infop,
-		   int __user *stat_addr, struct rusage __user *rusagep);
-#else
+#else  /* no CONFIG_PTRACE */
+static inline void ptrace_init_task(struct task_struct *tsk) { }
 static inline void ptrace_exit(struct task_struct *tsk) { }
 static inline int ptrace_do_wait(struct task_struct *tsk,
 				 pid_t pid, int options,
@@ -227,7 +239,7 @@ static inline int ptrace_do_wait(struct 
 {
 	return -ECHILD;
 }
-#endif
+#endif	/* CONFIG_PTRACE */
 
 
 #ifndef force_successful_syscall_return
--- linux-2.6.18/include/linux/sched.h
+++ linux-2.6.18/include/linux/sched.h
@@ -1500,11 +1500,13 @@ static inline int lock_need_resched(spin
 	return 0;
 }
 
-/* Reevaluate whether the task has signals pending delivery.
-   This is required every time the blocked sigset_t changes.
-   callers must hold sighand->siglock.  */
-
-extern FASTCALL(void recalc_sigpending_tsk(struct task_struct *t));
+/*
+ * Reevaluate whether the task has signals pending delivery.
+ * Wake the task if so.
+ * This is required every time the blocked sigset_t changes.
+ * callers must hold sighand->siglock.
+ */
+extern void recalc_sigpending_and_wake(struct task_struct *t);
 extern void recalc_sigpending(void);
 
 extern void signal_wake_up(struct task_struct *t, int resume_stopped);
--- linux-2.6.18/include/linux/signal.h
+++ linux-2.6.18/include/linux/signal.h
@@ -241,6 +241,131 @@ extern int sigprocmask(int, sigset_t *, 
 struct pt_regs;
 extern int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka, struct pt_regs *regs, void *cookie);
 
+/*
+ * In POSIX a signal is sent either to a specific thread (Linux task)
+ * or to the process as a whole (Linux thread group).  How the signal
+ * is sent determines whether it's to one thread or the whole group,
+ * which determines which signal mask(s) are involved in blocking it
+ * from being delivered until later.  When the signal is delivered,
+ * either it's caught or ignored by a user handler or it has a default
+ * effect that applies to the whole thread group (POSIX process).
+ *
+ * The possible effects an unblocked signal set to SIG_DFL can have are:
+ *   ignore	- Nothing Happens
+ *   terminate	- kill the process, i.e. all threads in the group,
+ * 		  similar to exit_group.  The group leader (only) reports
+ *		  WIFSIGNALED status to its parent.
+ *   coredump	- write a core dump file describing all threads using
+ *		  the same mm and then kill all those threads
+ *   stop 	- stop all the threads in the group, i.e. TASK_STOPPED state
+ *
+ * SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
+ * Other signals when not blocked and set to SIG_DFL behaves as follows.
+ * The job control signals also have other special effects.
+ *
+ *	+--------------------+------------------+
+ *	|  POSIX signal      |  default action  |
+ *	+--------------------+------------------+
+ *	|  SIGHUP            |  terminate	|
+ *	|  SIGINT            |	terminate	|
+ *	|  SIGQUIT           |	coredump 	|
+ *	|  SIGILL            |	coredump 	|
+ *	|  SIGTRAP           |	coredump 	|
+ *	|  SIGABRT/SIGIOT    |	coredump 	|
+ *	|  SIGBUS            |	coredump 	|
+ *	|  SIGFPE            |	coredump 	|
+ *	|  SIGKILL           |	terminate(+)	|
+ *	|  SIGUSR1           |	terminate	|
+ *	|  SIGSEGV           |	coredump 	|
+ *	|  SIGUSR2           |	terminate	|
+ *	|  SIGPIPE           |	terminate	|
+ *	|  SIGALRM           |	terminate	|
+ *	|  SIGTERM           |	terminate	|
+ *	|  SIGCHLD           |	ignore   	|
+ *	|  SIGCONT           |	ignore(*)	|
+ *	|  SIGSTOP           |	stop(*)(+)  	|
+ *	|  SIGTSTP           |	stop(*)  	|
+ *	|  SIGTTIN           |	stop(*)  	|
+ *	|  SIGTTOU           |	stop(*)  	|
+ *	|  SIGURG            |	ignore   	|
+ *	|  SIGXCPU           |	coredump 	|
+ *	|  SIGXFSZ           |	coredump 	|
+ *	|  SIGVTALRM         |	terminate	|
+ *	|  SIGPROF           |	terminate	|
+ *	|  SIGPOLL/SIGIO     |	terminate	|
+ *	|  SIGSYS/SIGUNUSED  |	coredump 	|
+ *	|  SIGSTKFLT         |	terminate	|
+ *	|  SIGWINCH          |	ignore   	|
+ *	|  SIGPWR            |	terminate	|
+ *	|  SIGRTMIN-SIGRTMAX |	terminate       |
+ *	+--------------------+------------------+
+ *	|  non-POSIX signal  |  default action  |
+ *	+--------------------+------------------+
+ *	|  SIGEMT            |  coredump	|
+ *	+--------------------+------------------+
+ *
+ * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
+ * (*) Special job control effects:
+ * When SIGCONT is sent, it resumes the process (all threads in the group)
+ * from TASK_STOPPED state and also clears any pending/queued stop signals
+ * (any of those marked with "stop(*)").  This happens regardless of blocking,
+ * catching, or ignoring SIGCONT.  When any stop signal is sent, it clears
+ * any pending/queued SIGCONT signals; this happens regardless of blocking,
+ * catching, or ignoring the stop signal, though (except for SIGSTOP) the
+ * default action of stopping the process may happen later or never.
+ */
+
+#ifdef SIGEMT
+#define SIGEMT_MASK	rt_sigmask(SIGEMT)
+#else
+#define SIGEMT_MASK	0
+#endif
+
+#if SIGRTMIN > BITS_PER_LONG
+#define rt_sigmask(sig)	(1ULL << ((sig)-1))
+#else
+#define rt_sigmask(sig)	sigmask(sig)
+#endif
+#define siginmask(sig, mask) (rt_sigmask(sig) & (mask))
+
+#define SIG_KERNEL_ONLY_MASK (\
+	rt_sigmask(SIGKILL)   |  rt_sigmask(SIGSTOP))
+
+#define SIG_KERNEL_STOP_MASK (\
+	rt_sigmask(SIGSTOP)   |  rt_sigmask(SIGTSTP)   | \
+	rt_sigmask(SIGTTIN)   |  rt_sigmask(SIGTTOU)   )
+
+#define SIG_KERNEL_COREDUMP_MASK (\
+        rt_sigmask(SIGQUIT)   |  rt_sigmask(SIGILL)    | \
+	rt_sigmask(SIGTRAP)   |  rt_sigmask(SIGABRT)   | \
+        rt_sigmask(SIGFPE)    |  rt_sigmask(SIGSEGV)   | \
+	rt_sigmask(SIGBUS)    |  rt_sigmask(SIGSYS)    | \
+        rt_sigmask(SIGXCPU)   |  rt_sigmask(SIGXFSZ)   | \
+	SIGEMT_MASK				       )
+
+#define SIG_KERNEL_IGNORE_MASK (\
+        rt_sigmask(SIGCONT)   |  rt_sigmask(SIGCHLD)   | \
+	rt_sigmask(SIGWINCH)  |  rt_sigmask(SIGURG)    )
+
+#define sig_kernel_only(sig) \
+	(((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_ONLY_MASK))
+#define sig_kernel_coredump(sig) \
+	(((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_COREDUMP_MASK))
+#define sig_kernel_ignore(sig) \
+	(((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_IGNORE_MASK))
+#define sig_kernel_stop(sig) \
+	(((sig) < SIGRTMIN) && siginmask(sig, SIG_KERNEL_STOP_MASK))
+
+#define sig_needs_tasklist(sig)	((sig) == SIGCONT)
+
+#define sig_user_defined(t, signr) \
+	(((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_DFL) &&	\
+	 ((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_IGN))
+
+#define sig_fatal(t, signr) \
+	(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
+	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
+
 #endif /* __KERNEL__ */
 
 #endif /* _LINUX_SIGNAL_H */
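
A compile-time illustration of the classification macros above (not part
of the patch; BUILD_BUG_ON comes from <linux/kernel.h>, and the function
name here is made up):

	static inline void signal_class_examples(void)
	{
		BUILD_BUG_ON(!sig_kernel_stop(SIGTSTP));	/* stop(*)  */
		BUILD_BUG_ON(!sig_kernel_coredump(SIGSEGV));	/* coredump */
		BUILD_BUG_ON(!sig_kernel_ignore(SIGCHLD));	/* ignore   */
		BUILD_BUG_ON(!sig_kernel_only(SIGKILL));	/* "always" */
	}

Each predicate is a constant mask test when its argument is a constant,
so the whole function compiles away.
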
--- linux-2.6.18/include/linux/tracehook.h
+++ linux-2.6.18/include/linux/tracehook.h
@@ -1,18 +1,26 @@
 /*
  * Tracing hooks
  *
- * This file defines hook entry points called by core code where
- * user tracing/debugging support might need to do something.
- * These entry points are called tracehook_*.  Each hook declared below
- * has a detailed comment giving the context (locking et al) from
- * which it is called, and the meaning of its return value (if any).
- *
- * We also declare here tracehook_* functions providing access to low-level
- * interrogation and control of threads.  These functions must be called
- * on either the current thread or on a quiescent thread.  We say a
- * thread is "quiescent" if it is in TASK_STOPPED or TASK_TRACED state,
- * we are guaranteed it will not be woken up and return to user mode, and
- * we have called wait_task_inactive on it.
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
+ *
+ * This file defines hook entry points called by core code where user
+ * tracing/debugging support might need to do something.  These entry
+ * points are called tracehook_*().  Each hook declared below has a
+ * detailed comment giving the context (locking et al) from which it is
+ * called, and the meaning of its return value (if any).
+ *
+ * We also declare here tracehook_*() functions providing access to
+ * low-level interrogation and control of threads.  These functions must
+ * be called on either the current thread or on a quiescent thread.  We
+ * say a thread is "quiescent" if it is in %TASK_STOPPED or %TASK_TRACED
+ * state, we are guaranteed it will not be woken up and return to user
+ * mode, and we have called wait_task_inactive() on it.
  */
 
 #ifndef _LINUX_TRACEHOOK_H
@@ -20,6 +28,7 @@
 
 #include <linux/sched.h>
 #include <linux/uaccess.h>
+#include <linux/utrace.h>
 struct linux_binprm;
 struct pt_regs;
 
@@ -44,9 +53,11 @@ struct pt_regs;
  * should be one that can be evaluated in modules, i.e. uses exported symbols.
  *
  * Block-step control (trap on control transfer), when available.
- * tracehook_disable_block_step will be called after tracehook_enable_single_step.
- * When enabled, the next jump, or other control transfer or syscall exit,
- * produces a SIGTRAP.  Enabling or disabling redundantly is harmless.
+ * If these are available, asm/tracehook.h does #define ARCH_HAS_BLOCK_STEP.
+ * tracehook_disable_block_step() will be called after
+ * tracehook_enable_single_step().  When enabled, the next jump, or other
+ * control transfer or syscall exit, produces a %SIGTRAP.
+ * Enabling or disabling redundantly is harmless.
  *
  *	void tracehook_enable_block_step(struct task_struct *tsk);
  *	void tracehook_disable_block_step(struct task_struct *tsk);
@@ -54,51 +65,63 @@ struct pt_regs;
  *
  * If those calls are defined, #define ARCH_HAS_BLOCK_STEP to nonzero.
  * Do not #define it if these calls are never available in this kernel config.
- * If defined, the value of ARCH_HAS_BLOCK_STEP can be constant or variable.
+ * If defined, the value of %ARCH_HAS_BLOCK_STEP can be constant or variable.
  * It should evaluate to nonzero if the hardware is able to support
- * tracehook_enable_block_step.  If it's a variable expression, it
+ * tracehook_enable_block_step().  If it's a variable expression, it
  * should be one that can be evaluated in modules, i.e. uses exported symbols.
  *
  * Control system call tracing.  When enabled a syscall entry or exit
- * produces a call to tracehook_report_syscall, below.
+ * produces a call to tracehook_report_syscall(), below.
  *
  *	void tracehook_enable_syscall_trace(struct task_struct *tsk);
  *	void tracehook_disable_syscall_trace(struct task_struct *tsk);
  *
- * When stopped in tracehook_report_syscall for syscall entry,
+ * When stopped in tracehook_report_syscall() for syscall entry,
  * abort the syscall so no kernel function is called.
  * If the register state was not otherwise updated before,
  * this produces an -ENOSYS error return as for an invalid syscall number.
  *
  *	void tracehook_abort_syscall(struct pt_regs *regs);
  *
- * Return the regset view (see below) that is native for the given process.
- * For example, what it would access when it called ptrace.
- * Throughout the life of the process, this only changes at exec.
- *
- *	const struct utrace_regset_view *utrace_native_view(struct task_struct *);
- *
- ***/
+ * When stopped in tracehook_report_syscall() for syscall entry or exit,
+ * return the address of the word in struct pt_regs that holds the
+ * syscall number, and the word that holds the return value.  These can be
+ * changed at entry to change the syscall that will be attempted, and
+ * at exit to change the results that will be seen by the thread.
+ *
+ *	long *tracehook_syscall_callno(struct pt_regs *regs);
+ *	long *tracehook_syscall_retval(struct pt_regs *regs);
+ */
 
 
-/*
+/**
+ * struct utrace_regset - accessible thread CPU state
+ * @n:		Number of slots (registers).
+ * @size:	Size in bytes of a slot (register).
+ * @align:	Required alignment, in bytes.
+ * @bias:	Bias from natural indexing.
+ * @get:	Function to fetch values.
+ * @set:	Function to store values.
+ * @active:	Function to report if regset is active.
+ * @writeback:	Function to write data back to user memory.
+ *
  * This data structure describes a machine resource we call a register set.
  * This is part of the state of an individual thread, not necessarily
  * actual CPU registers per se.  A register set consists of a number of
- * similar slots, given by ->n.  Each slot is ->size bytes, and aligned to
- * ->align bytes (which is at least ->size).
+ * similar slots, given by @n.  Each slot is @size bytes, and aligned to
+ * @align bytes (which is at least @size).
  *
  * As described above, these entry points can be called on the current
- * thread or on a quiescent thread.  The pos argument must be aligned
- * according to ->align; the count argument must be a multiple of ->size.
+ * thread or on a quiescent thread.  The @pos argument must be aligned
+ * according to @align; the @count argument must be a multiple of @size.
  * These functions are not responsible for checking for invalid arguments.
  *
- * When there is a natural value to use as an index, ->bias gives the
+ * When there is a natural value to use as an index, @bias gives the
  * difference between the natural index and the slot index for the
  * register set.  For example, x86 GDT segment descriptors form a regset;
  * the segment selector produces a natural index, but only a subset of
  * that index space is available as a regset (the TLS slots); subtracting
- * ->bias from a segment selector index value computes the regset slot.
+ * @bias from a segment selector index value computes the regset slot.
  */
 struct utrace_regset {
 	unsigned int n;		/* Number of slots (registers).  */
@@ -107,25 +130,25 @@ struct utrace_regset {
 	unsigned int bias;	/* Bias from natural indexing.  */
 
 	/*
-	 * Return -ENODEV if not available on the hardware found.
-	 * Return 0 if no interesting state in this thread.
-	 * Return >0 number of ->size units of interesting state.
+	 * Return -%ENODEV if not available on the hardware found.
+	 * Return %0 if no interesting state in this thread.
+	 * Return >%0 number of @size units of interesting state.
 	 * Any get call fetching state beyond that number will
 	 * see the default initialization state for this data,
 	 * so a caller that knows that the default state is need
 	 * not copy it all out.
-	 * This call is optional; the pointer is NULL if there
-	 * so no inexpensive check to yield a value < .n.
+	 * This call is optional; the pointer is %NULL if there
+	 * is no inexpensive check to yield a value < @n.
 	 */
 	int (*active)(struct task_struct *, const struct utrace_regset *);
 
 	/*
-	 * Fetch and store register values.  Return 0 on success; -EIO or
-	 * -ENODEV are usual failure returns.  The pos and count values are
-	 * in bytes, but must be properly aligned.  If kbuf is non-null,
-	 * that buffer is used and ubuf is ignored.  If kbuf is NULL, then
-	 * ubuf gives a userland pointer to access directly, and an -EFAULT
-	 * return value is possible.
+	 * Fetch and store register values.  Return %0 on success; -%EIO
+	 * or -%ENODEV are usual failure returns.  The @pos and @count
+	 * values are in bytes, but must be properly aligned.  If @kbuf
+	 * is non-null, that buffer is used and @ubuf is ignored.  If
+	 * @kbuf is %NULL, then @ubuf gives a userland pointer to access
+	 * directly, and an -%EFAULT return value is possible.
 	 */
 	int (*get)(struct task_struct *, const struct utrace_regset *,
 		   unsigned int pos, unsigned int count,
@@ -135,56 +158,72 @@ struct utrace_regset {
 		   const void *kbuf, const void __user *ubuf);
 
 	/*
-	 * This call is optional; usually the pointer is NULL.
-	 * When provided, there is some user memory associated
-	 * with this regset's hardware, such as memory backing
-	 * cached register data on register window machines; the
-	 * regset's data controls what user memory is used
-	 * (e.g. via the stack pointer value).
+	 * This call is optional; usually the pointer is %NULL.  When
+	 * provided, there is some user memory associated with this
+	 * regset's hardware, such as memory backing cached register
+	 * data on register window machines; the regset's data controls
+	 * what user memory is used (e.g. via the stack pointer value).
 	 *
-	 * Write register data back to user memory.  If the
-	 * immediate flag is nonzero, it must be written to the
-	 * user memory so uaccess/access_process_vm can see it
-	 * when this call returns; if zero, then it must be
-	 * written back by the time the task completes a context
-	 * switch (as synchronized with wait_task_inactive).
-	 * Return 0 on success or if there was nothing to do,
-	 * -EFAULT for a memory problem (bad stack pointer or
-	 * whatever), or -EIO for a hardware problem.
+	 * Write register data back to user memory.  If the @immediate
+	 * flag is nonzero, it must be written to the user memory so
+	 * uaccess/access_process_vm() can see it when this call
+	 * returns; if zero, then it must be written back by the time
+	 * the task completes a context switch (as synchronized with
+	 * wait_task_inactive()).  Return %0 on success or if there was
+	 * nothing to do, -%EFAULT for a memory problem (bad stack
+	 * pointer or whatever), or -%EIO for a hardware problem.
 	 */
 	int (*writeback)(struct task_struct *, const struct utrace_regset *,
 			 int immediate);
 };
 
-/*
- * A regset view is a collection of regsets (struct utrace_regset, above).
- * This describes all the state of a thread that can be seen from a given
- * architecture/ABI environment.  More than one view might refer to the
- * same utrace_regset, or more than one regset might refer to the same
- * machine-specific state in the thread.  For example, a 32-bit thread's
- * state could be examined from the 32-bit view or from the 64-bit view.
- * Either method reaches the same thread register state, doing appropriate
- * widening or truncation.
+/**
+ * struct utrace_regset_view - available regsets
+ * @name:	Identifier, e.g. ELF_PLATFORM string.
+ * @regsets:	Array of @n regsets available in this view.
+ * @n:		Number of elements in @regsets.
+ * @e_machine:	ELF %EM_* value for which this is the native view, if any.
+ *
+ * A regset view is a collection of regsets (&struct utrace_regset,
+ * above).  This describes all the state of a thread that can be seen
+ * from a given architecture/ABI environment.  More than one view might
+ * refer to the same &struct utrace_regset, or more than one regset
+ * might refer to the same machine-specific state in the thread.  For
+ * example, a 32-bit thread's state could be examined from the 32-bit
+ * view or from the 64-bit view.  Either method reaches the same thread
+ * register state, doing appropriate widening or truncation.
  */
 struct utrace_regset_view {
-	const char *name;	/* Identifier, e.g. ELF_PLATFORM string.  */
-
+	const char *name;
 	const struct utrace_regset *regsets;
 	unsigned int n;
-
-	/*
-	 * EM_* value for which this is the native view, if any.
-	 */
 	u16 e_machine;
 };
 
+/*
+ * This is documented here rather than at the definition sites because its
+ * implementation is machine-dependent but its interface is universal.
+ */
+/**
+ * utrace_native_view - Return the process's native regset view.
+ * @tsk: a thread of the process in question
+ *
+ * Return the &struct utrace_regset_view that is native for the given process.
+ * For example, what it would access when it called ptrace().
+ * Throughout the life of the process, this only changes at exec.
+ */
+const struct utrace_regset_view *utrace_native_view(struct task_struct *tsk);
+
 
 /*
- * These two are helpers for writing regset get/set functions in arch code.
+ * These are helpers for writing regset get/set functions in arch code.
+ * Because @start_pos and @end_pos are always compile-time constants,
+ * these are inlined into very little code though they look large.
+ *
  * Use one or more calls sequentially for each chunk of regset data stored
- * contiguously in memory.  Call with constants for start_pos and end_pos,
+ * contiguously in memory.  Call with constants for @start_pos and @end_pos,
  * giving the range of byte positions in the regset that data corresponds
- * to; end_pos can be -1 if this chunk is at the end of the regset layout.
+ * to; @end_pos can be -1 if this chunk is at the end of the regset layout.
  * Each call updates the arguments to point past its chunk.
  */
 
@@ -290,20 +329,12 @@ utrace_regset_copyin_ignore(unsigned int
 	return 0;
 }
 
-/**/
-
 
-/***
- ***
- *** Following are entry points from core code, where the user debugging
- *** support can affect the normal behavior.  The locking situation is
- *** described for each call.
- ***
- ***/
-
-#ifdef CONFIG_UTRACE
-#include <linux/utrace.h>
-#endif
+/*
+ * Following are entry points from core code, where the user debugging
+ * support can affect the normal behavior.  The locking situation is
+ * described for each call.
+ */
 
 
 /*
@@ -312,10 +343,7 @@ utrace_regset_copyin_ignore(unsigned int
  */
 static inline void tracehook_init_task(struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	child->utrace_flags = 0;
-	child->utrace = NULL;
-#endif
+	utrace_init_task(child);
 }
 
 /*
@@ -324,11 +352,9 @@ static inline void tracehook_init_task(s
  */
 static inline void tracehook_release_task(struct task_struct *p)
 {
-#ifdef CONFIG_UTRACE
 	smp_mb();
-	if (p->utrace != NULL)
+	if (tsk_utrace_struct(p) != NULL)
 		utrace_release_task(p);
-#endif
 }
 
 /*
@@ -339,10 +365,20 @@ static inline void tracehook_release_tas
  */
 static inline int tracehook_check_released(struct task_struct *p)
 {
-#ifdef CONFIG_UTRACE
-	return unlikely(p->utrace != NULL);
-#endif
-	return 0;
+	int bad = 0;
+	BUG_ON(p->exit_state != EXIT_DEAD);
+	if (unlikely(tsk_utrace_struct(p) != NULL)) {
+		/*
+		 * In a race condition, utrace_attach will temporarily set
+		 * it, but then check p->exit_state and clear it.  It does
+		 * all this under task_lock, so we take the lock to check
+		 * that there is really a bug and not just that known race.
+		 */
+		task_lock(p);
+		bad = unlikely(tsk_utrace_struct(p) != NULL);
+		task_unlock(p);
+	}
+	return bad;
 }
 
 /*
@@ -353,11 +389,7 @@ static inline int tracehook_check_releas
 static inline int tracehook_notify_cldstop(struct task_struct *tsk,
 					   const siginfo_t *info)
 {
-#ifdef CONFIG_UTRACE
-	if (tsk->utrace_flags & UTRACE_ACTION_NOREAP)
-		return 1;
-#endif
-	return 0;
+	return (tsk_utrace_flags(tsk) & UTRACE_ACTION_NOREAP);
 }
 
 /*
@@ -371,14 +403,11 @@ static inline int tracehook_notify_cldst
 static inline int tracehook_notify_death(struct task_struct *tsk,
 					 int *noreap, void **death_cookie)
 {
-	*death_cookie = NULL;
-#ifdef CONFIG_UTRACE
-	*death_cookie = tsk->utrace;
-	if (tsk->utrace_flags & UTRACE_ACTION_NOREAP) {
+	*death_cookie = tsk_utrace_struct(tsk);
+	if (tsk_utrace_flags(tsk) & UTRACE_ACTION_NOREAP) {
 		*noreap = 1;
 		return 1;
 	}
-#endif
 	*noreap = 0;
 	return 0;
 }
@@ -391,11 +420,8 @@ static inline int tracehook_notify_death
 static inline int tracehook_consider_fatal_signal(struct task_struct *tsk,
 						  int sig)
 {
-#ifdef CONFIG_UTRACE
-	return (tsk->utrace_flags & (UTRACE_EVENT(SIGNAL_TERM)
-				     | UTRACE_EVENT(SIGNAL_CORE)));
-#endif
-	return 0;
+	return (tsk_utrace_flags(tsk) & (UTRACE_EVENT(SIGNAL_TERM)
+					 | UTRACE_EVENT(SIGNAL_CORE)));
 }
 
 /*
@@ -405,12 +431,10 @@ static inline int tracehook_consider_fat
  * Called with tsk->sighand->siglock held.
  */
 static inline int tracehook_consider_ignored_signal(struct task_struct *tsk,
-						    int sig, void *handler)
+						    int sig,
+						    void __user *handler)
 {
-#ifdef CONFIG_UTRACE
-	return (tsk->utrace_flags & UTRACE_EVENT(SIGNAL_IGN));
-#endif
-	return 0;
+	return (tsk_utrace_flags(tsk) & UTRACE_EVENT(SIGNAL_IGN));
 }
 
 
@@ -421,10 +445,7 @@ static inline int tracehook_consider_ign
  */
 static inline int tracehook_induce_sigpending(struct task_struct *tsk)
 {
-#ifdef CONFIG_UTRACE
-	return unlikely(tsk->utrace_flags & UTRACE_ACTION_QUIESCE);
-#endif
-	return 0;
+	return unlikely(tsk_utrace_flags(tsk) & UTRACE_ACTION_QUIESCE);
 }
 
 /*
@@ -439,10 +460,8 @@ static inline int tracehook_get_signal(s
 				       siginfo_t *info,
 				       struct k_sigaction *return_ka)
 {
-#ifdef CONFIG_UTRACE
-	if (unlikely(tsk->utrace_flags))
+	if (unlikely(tsk_utrace_flags(tsk)))
 		return utrace_get_signal(tsk, regs, info, return_ka);
-#endif
 	return 0;
 }
 
@@ -455,21 +474,8 @@ static inline int tracehook_get_signal(s
  */
 static inline int tracehook_finish_stop(int last_one)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_EVENT(JCTL))
+	if (tsk_utrace_flags(current) & UTRACE_EVENT(JCTL))
 		return utrace_report_jctl(CLD_STOPPED);
-#endif
-
-	return 0;
-}
-
-/*
- * Called with tasklist_lock held for reading, for an event notification stop.
- * We are already in TASK_TRACED.  Return zero to go back to running,
- * or nonzero to actually stop until resumed.
- */
-static inline int tracehook_stop_now(void)
-{
 	return 0;
 }
 
@@ -481,10 +487,7 @@ static inline int tracehook_stop_now(voi
  */
 static inline int tracehook_inhibit_wait_stopped(struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	return (child->utrace_flags & UTRACE_ACTION_NOREAP);
-#endif
-	return 0;
+	return (tsk_utrace_flags(child) & UTRACE_ACTION_NOREAP);
 }
 
 /*
@@ -494,10 +497,7 @@ static inline int tracehook_inhibit_wait
  */
 static inline int tracehook_inhibit_wait_zombie(struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	return (child->utrace_flags & UTRACE_ACTION_NOREAP);
-#endif
-	return 0;
+	return (tsk_utrace_flags(child) & UTRACE_ACTION_NOREAP);
 }
 
 /*
@@ -507,10 +507,7 @@ static inline int tracehook_inhibit_wait
  */
 static inline int tracehook_inhibit_wait_continued(struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	return (child->utrace_flags & UTRACE_ACTION_NOREAP);
-#endif
-	return 0;
+	return (tsk_utrace_flags(child) & UTRACE_ACTION_NOREAP);
 }
 
 
@@ -520,10 +517,8 @@ static inline int tracehook_inhibit_wait
  */
 static inline int tracehook_unsafe_exec(struct task_struct *tsk)
 {
-#ifdef CONFIG_UTRACE
-	if (tsk->utrace_flags)
+	if (tsk_utrace_flags(tsk))
 		return utrace_unsafe_exec(tsk);
-#endif
 	return 0;
 }
 
@@ -539,10 +534,8 @@ static inline int tracehook_unsafe_exec(
  */
 static inline struct task_struct *tracehook_tracer_task(struct task_struct *p)
 {
-#ifdef CONFIG_UTRACE
-	if (p->utrace_flags)
+	if (tsk_utrace_flags(p))
 		return utrace_tracer_task(p);
-#endif
 	return NULL;
 }
 
@@ -554,24 +547,32 @@ static inline int tracehook_allow_access
 {
 	if (tsk == current)
 		return 1;
-#ifdef CONFIG_UTRACE
-	if (tsk->utrace_flags)
+	if (tsk_utrace_flags(tsk))
 		return utrace_allow_access_process_vm(tsk);
-#endif
 	return 0;
 }
 
+/*
+ * Return nonzero if the current task is expected to want breakpoint
+ * insertion in its memory at some point.  A zero return is no guarantee
+ * it won't be done, but this is a hint that it's known to be likely.
+ * May be called with tsk->mm->mmap_sem held for writing.
+ */
+static inline int tracehook_expect_breakpoints(struct task_struct *tsk)
+{
+	return (tsk_utrace_flags(tsk) & UTRACE_EVENT(SIGNAL_CORE));
+}
 
-/***
- ***
- *** Following decelarations are hook stubs where core code reports
- *** events.  These are called without locks, from the thread having the
- *** event.  In all tracehook_report_* calls, no locks are held and the thread
- *** is in a state close to returning to user mode with little baggage to
- *** unwind, except as noted below for tracehook_report_clone.  It is generally
- *** OK to block in these places if you want the user thread to be suspended.
- ***
- ***/
+
+/*
+ * The following declarations are hook stubs where core code reports
+ * events.  These are called without locks, from the thread having the
+ * event.  In all tracehook_report_*() calls, no locks are held and the
+ * thread is in a state close to returning to user mode with little
+ * baggage to unwind, except as noted below for tracehook_report_clone().
+ * It is generally OK to block in these places if you want the user
+ * thread to be suspended.
+ */
 
 /*
  * Thread has just become a zombie (exit_state==TASK_ZOMBIE) or is about to
@@ -582,11 +583,20 @@ static inline int tracehook_allow_access
 static inline void tracehook_report_death(struct task_struct *tsk,
 					  int exit_state, void *death_cookie)
 {
-#ifdef CONFIG_UTRACE
 	smp_mb();
-	if (tsk->utrace_flags & (UTRACE_EVENT(DEATH) | UTRACE_ACTION_QUIESCE))
+	if (tsk_utrace_flags(tsk) & (UTRACE_EVENT(DEATH)
+				     | UTRACE_EVENT(QUIESCE)))
 		utrace_report_death(tsk, death_cookie);
-#endif
+}
+
+/*
+ * This is called when tracehook_inhibit_wait_zombie(p) returned true
+ * and a previously delayed group_leader is now eligible for reaping.
+ * It's called from release_task, with no locks held, and p is not current.
+ */
+static inline void tracehook_report_delayed_group_leader(struct task_struct *p)
+{
+	utrace_report_delayed_group_leader(p);
 }
 
 /*
@@ -594,12 +604,10 @@ static inline void tracehook_report_deat
  * The freshly initialized register state can be seen and changed here.
  */
 static inline void tracehook_report_exec(struct linux_binprm *bprm,
-				    struct pt_regs *regs)
+					 struct pt_regs *regs)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_EVENT(EXEC))
+	if (tsk_utrace_flags(current) & UTRACE_EVENT(EXEC))
 		utrace_report_exec(bprm, regs);
-#endif
 }
 
 /*
@@ -608,10 +616,8 @@ static inline void tracehook_report_exec
  */
 static inline void tracehook_report_exit(long *exit_code)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_EVENT(EXIT))
+	if (tsk_utrace_flags(current) & UTRACE_EVENT(EXIT))
 		utrace_report_exit(exit_code);
-#endif
 }
 
 /*
@@ -626,28 +632,23 @@ static inline void tracehook_report_exit
 static inline void tracehook_report_clone(unsigned long clone_flags,
 					  struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_EVENT(CLONE))
+	if (tsk_utrace_flags(current) & UTRACE_EVENT(CLONE))
 		utrace_report_clone(clone_flags, child);
-#endif
 }
 
 /*
  * Called after the child has started running, shortly after
- * tracehook_report_clone.  This is just before the clone/fork syscall
- * returns, or blocks for vfork child completion if (clone_flags &
- * CLONE_VFORK).  The child pointer may be invalid if a self-reaping
- * child died and tracehook_report_clone took no action to prevent it
- * from self-reaping.
+ * tracehook_report_clone().  This is just before the clone/fork syscall
+ * returns, or blocks for vfork child completion if (clone_flags & CLONE_VFORK).
+ * The child pointer may be invalid if a self-reaping child died and
+ * tracehook_report_clone() took no action to prevent it from self-reaping.
  */
 static inline void tracehook_report_clone_complete(unsigned long clone_flags,
 						   pid_t pid,
 						   struct task_struct *child)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_ACTION_QUIESCE)
+	if (tsk_utrace_flags(current) & UTRACE_ACTION_QUIESCE)
 		utrace_quiescent(current, NULL);
-#endif
 }
 
 /*
@@ -659,10 +660,8 @@ static inline void tracehook_report_clon
 static inline void tracehook_report_vfork_done(struct task_struct *child,
 					       pid_t child_pid)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & UTRACE_EVENT(VFORK_DONE))
+	if (tsk_utrace_flags(current) & UTRACE_EVENT(VFORK_DONE))
 		utrace_report_vfork_done(child_pid);
-#endif
 }
 
 /*
@@ -670,11 +669,9 @@ static inline void tracehook_report_vfor
  */
 static inline void tracehook_report_syscall(struct pt_regs *regs, int is_exit)
 {
-#ifdef CONFIG_UTRACE
-	if (current->utrace_flags & (is_exit ? UTRACE_EVENT(SYSCALL_EXIT)
-				     : UTRACE_EVENT(SYSCALL_ENTRY)))
+	if (tsk_utrace_flags(current) & (is_exit ? UTRACE_EVENT(SYSCALL_EXIT)
+					 : UTRACE_EVENT(SYSCALL_ENTRY)))
 		utrace_report_syscall(regs, is_exit);
-#endif
 }
 
 /*
@@ -694,13 +691,11 @@ static inline void tracehook_report_hand
 						  const sigset_t *oldset,
 						  struct pt_regs *regs)
 {
-#ifdef CONFIG_UTRACE
 	struct task_struct *tsk = current;
-	if ((tsk->utrace_flags & UTRACE_EVENT_SIGNAL_ALL)
-	    && (tsk->utrace_flags & (UTRACE_ACTION_SINGLESTEP
-				     | UTRACE_ACTION_BLOCKSTEP)))
+	if ((tsk_utrace_flags(tsk) & UTRACE_EVENT_SIGNAL_ALL)
+	    && (tsk_utrace_flags(tsk) & (UTRACE_ACTION_SINGLESTEP
+					 | UTRACE_ACTION_BLOCKSTEP)))
 		utrace_signal_handler_singlestep(tsk, regs);
-#endif
 }
 
 
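
For a sense of how the regset interface is meant to be driven, here is a
rough sketch of fetching a quiescent thread's general registers.  It is
illustrative only: it assumes the last utrace_regset() argument selects a
regset by index within the view, that index 0 is the general-purpose set,
that failure surfaces as a NULL return, and that get()'s trailing
arguments mirror the kbuf/ubuf pair shown for set().

	static int fetch_gprs(struct task_struct *target,
			      struct utrace_attached_engine *engine,
			      void *buf)
	{
		const struct utrace_regset_view *view =
			utrace_native_view(target);
		const struct utrace_regset *rs =
			utrace_regset(target, engine, view, 0);

		if (rs == NULL)
			return -EIO;
		/* kbuf is non-null, so ubuf is ignored; pos/count are bytes. */
		return rs->get(target, rs, 0, rs->n * rs->size, buf, NULL);
	}
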
--- linux-2.6.18/include/linux/utrace.h
+++ linux-2.6.18/include/linux/utrace.h
@@ -1,35 +1,43 @@
 /*
- * User Debugging Data & Event Rendezvous
+ * utrace infrastructure interface for debugging user processes
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
  *
  * This interface allows for notification of interesting events in a thread.
  * It also mediates access to thread state such as registers.
  * Multiple unrelated users can be associated with a single thread.
  * We call each of these a tracing engine.
  *
- * A tracing engine starts by calling utrace_attach on the chosen thread,
- * passing in a set of hooks (struct utrace_engine_ops), and some associated
- * data.  This produces a struct utrace_attached_engine, which is the handle
- * used for all other operations.  An attached engine has its ops vector,
- * its data, and a flags word controlled by utrace_set_flags.
+ * A tracing engine starts by calling utrace_attach() on the chosen thread,
+ * passing in a set of hooks (&struct utrace_engine_ops), and some
+ * associated data.  This produces a &struct utrace_attached_engine, which
+ * is the handle used for all other operations.  An attached engine has its
+ * ops vector, its data, and a flags word controlled by utrace_set_flags().
  *
  * Each engine's flags word contains two kinds of flags: events of
  * interest, and action state flags.
  *
  * For each event flag that is set, that engine will get the
- * appropriate ops->report_* callback when the event occurs.  The
- * struct utrace_engine_ops need not provide callbacks for an event
+ * appropriate ops->report_*() callback when the event occurs.  The
+ * &struct utrace_engine_ops need not provide callbacks for an event
  * unless the engine sets one of the associated event flags.
  *
  * Action state flags change the normal behavior of the thread.
- * These bits are in UTRACE_ACTION_STATE_MASK; these can be OR'd into
- * flags set with utrace_set_flags.  Also, every callback that return
+ * These bits are in %UTRACE_ACTION_STATE_MASK; these can be OR'd into
+ * flags set with utrace_set_flags().  Also, every callback that returns
  * an action value can reset these bits for the engine (see below).
  *
- * The bits UTRACE_ACTION_STATE_MASK of all attached engines are OR'd
+ * The bits %UTRACE_ACTION_STATE_MASK of all attached engines are OR'd
  * together, so each action is in force as long as any engine requests it.
- * As long as some engine sets the UTRACE_ACTION_QUIESCE flag, the thread
+ * As long as some engine sets the %UTRACE_ACTION_QUIESCE flag, the thread
  * will block and not resume running user code.  When the last engine
- * clears its UTRACE_ACTION_QUIESCE flag, the thread will resume running.
+ * clears its %UTRACE_ACTION_QUIESCE flag, the thread will resume running.
  */
 
 #ifndef _LINUX_UTRACE_H
@@ -38,19 +46,41 @@
 #include <linux/list.h>
 #include <linux/rcupdate.h>
 #include <linux/signal.h>
+#include <linux/sched.h>
 
 struct linux_binprm;
 struct pt_regs;
+struct utrace;
+struct utrace_signal;
 struct utrace_regset;
 struct utrace_regset_view;
 
+#ifdef __GENKSYMS__		/* RHEL-5 GA KABI compatibility */
+struct utrace
+{
+	union {
+		struct rcu_head dead;
+		struct {
+			struct task_struct *cloning;
+			struct utrace_signal *signal;
+		} live;
+		struct {
+			int report_death; /* report_death running */
+			int reap; /* release_task called */
+		} exit;
+	} u;
+
+	struct list_head engines;
+	spinlock_t lock;
+};
+#endif
 
 /*
- * Flags in task_struct.utrace_flags and utrace_attached_engine.flags.
- * Low four bits are UTRACE_ACTION_STATE_MASK bits (below).
+ * Flags in &struct task_struct.utrace_flags and
+ * &struct utrace_attached_engine.flags.
+ * Low four bits are %UTRACE_ACTION_STATE_MASK bits (below).
  * Higher bits are events of interest.
  */
-
 #define UTRACE_FIRST_EVENT	4
 #define UTRACE_EVENT_BITS	(BITS_PER_LONG - UTRACE_FIRST_EVENT)
 #define UTRACE_EVENT_MASK	(-1UL &~ UTRACE_ACTION_STATE_MASK)
@@ -95,22 +125,23 @@ enum utrace_events {
 /*
  * Action flags, in return value of callbacks.
  *
- * UTRACE_ACTION_RESUME (zero) is the return value to do nothing special.
- * For each particular callback, some bits in UTRACE_ACTION_OP_MASK can
+ * %UTRACE_ACTION_RESUME (zero) is the return value to do nothing special.
+ * For each particular callback, some bits in %UTRACE_ACTION_OP_MASK can
  * be set in the return value to change the thread's behavior (see below).
  *
- * If UTRACE_ACTION_NEWSTATE is set, then the UTRACE_ACTION_STATE_MASK
+ * If %UTRACE_ACTION_NEWSTATE is set, then the %UTRACE_ACTION_STATE_MASK
  * bits in the return value replace the engine's flags as in utrace_set_flags
  * (but the event flags remained unchanged).
  *
- * If UTRACE_ACTION_HIDE is set, then the callbacks to other engines
+ * If %UTRACE_ACTION_HIDE is set, then the callbacks to other engines
  * should be suppressed for this event.  This is appropriate only when
  * the event was artificially provoked by something this engine did,
  * such as setting a breakpoint.
  *
- * If UTRACE_ACTION_DETACH is set, this engine is detached as by utrace_detach.
- * The action bits in UTRACE_ACTION_OP_MASK work as normal, but the engine's
- * UTRACE_ACTION_STATE_MASK bits will no longer affect the thread.
+ * If %UTRACE_ACTION_DETACH is set, this engine is detached as by
+ * utrace_detach().  The action bits in %UTRACE_ACTION_OP_MASK work as
+ * normal, but the engine's %UTRACE_ACTION_STATE_MASK bits will no longer
+ * affect the thread.
  */
 #define UTRACE_ACTION_RESUME	0x0000 /* Continue normally after event.  */
 #define UTRACE_ACTION_HIDE	0x0010 /* Hide event from other tracing.  */
@@ -119,8 +150,8 @@ enum utrace_events {
 
 /*
  * These flags affect the state of the thread until they are changed via
- * utrace_set_flags or by the next callback to the same engine that uses
- * UTRACE_ACTION_NEWSTATE.
+ * utrace_set_flags() or by the next callback to the same engine that uses
+ * %UTRACE_ACTION_NEWSTATE.
  */
 #define UTRACE_ACTION_QUIESCE	0x0001 /* Stay quiescent after callbacks.  */
 #define UTRACE_ACTION_SINGLESTEP 0x0002 /* Resume for one instruction.  */
@@ -128,11 +159,13 @@ enum utrace_events {
 #define UTRACE_ACTION_NOREAP	0x0008 /* Inhibit parent SIGCHLD and wait.  */
 #define UTRACE_ACTION_STATE_MASK 0x000f /* Lasting state bits.  */
 
-/* These flags have meanings specific to the particular event report hook.  */
+/*
+ * These flags have meanings specific to the particular event report hook.
+ */
 #define UTRACE_ACTION_OP_MASK	0xff00
 
 /*
- * Action flags in return value and argument of report_signal callback.
+ * Action flags in return value and argument of report_signal() callback.
  */
 #define UTRACE_SIGNAL_DELIVER	0x0100 /* Deliver according to sigaction.  */
 #define UTRACE_SIGNAL_IGN	0x0200 /* Ignore the signal.  */
@@ -142,23 +175,21 @@ enum utrace_events {
 #define UTRACE_SIGNAL_TSTP	0x0600 /* Deliver as job control stop.  */
 #define UTRACE_SIGNAL_HOLD	0x1000 /* Flag, push signal back on queue.  */
 /*
- * This value is passed to a report_signal callback after a signal
- * handler is entered while UTRACE_ACTION_SINGLESTEP is in force.
+ * This value is passed to a report_signal() callback after a signal
+ * handler is entered while %UTRACE_ACTION_SINGLESTEP is in force.
  * For this callback, no signal will never actually be delivered regardless
  * of the return value, and the other callback parameters are null.
  */
 #define UTRACE_SIGNAL_HANDLER	0x0700
 
-/* Action flag in return value of report_jctl.  */
+/*
+ * Action flag in return value of report_jctl().
+ */
 #define UTRACE_JCTL_NOSIGCHLD	0x0100 /* Do not notify the parent.  */
 
 
 /*
- * Flags for utrace_attach.  If UTRACE_ATTACH_CREATE is not specified,
- * you only look up an existing engine already attached to the
- * thread.  If UTRACE_ATTACH_MATCH_* bits are set, only consider
- * matching engines.  If UTRACE_ATTACH_EXCLUSIVE is set, attempting to
- * attach a second (matching) engine fails with -EEXIST.
+ * Flags for utrace_attach().
  */
 #define UTRACE_ATTACH_CREATE		0x0010 /* Attach a new engine.  */
 #define UTRACE_ATTACH_EXCLUSIVE		0x0020 /* Refuse if existing match.  */
@@ -167,56 +198,28 @@ enum utrace_events {
 #define UTRACE_ATTACH_MATCH_MASK	0x000f
 
 
-/*
- * Per-thread structure task_struct.utrace points to.
- *
- * The task itself never has to worry about this going away after
- * some event is found set in task_struct.utrace_flags.
- * Once created, this pointer is changed only when the task is quiescent
- * (TASK_TRACED or TASK_STOPPED with the siglock held, or dead).
- *
- * For other parties, the pointer to this is protected by RCU and
- * task_lock.  Since call_rcu is never used while the thread is alive and
- * using this struct utrace, we can overlay the RCU data structure used
- * only for a dead struct with some local state used only for a live utrace
- * on an active thread.
- */
-struct utrace
-{
-	union {
-		struct rcu_head dead;
-		struct {
-			struct task_struct *cloning;
-			struct utrace_signal *signal;
-		} live;
-		struct {
-			int report_death; /* report_death running */
-			int reap; /* release_task called */
-		} exit;
-	} u;
-
-	struct list_head engines;
-	spinlock_t lock;
-};
-#define utrace_lock(utrace)	spin_lock(&(utrace)->lock)
-#define utrace_unlock(utrace)	spin_unlock(&(utrace)->lock)
-
-
-/*
- * Per-engine per-thread structure.
+#ifdef CONFIG_UTRACE
+/**
+ * struct utrace_attached_engine - Per-engine per-thread structure.
+ * @ops: &struct utrace_engine_ops pointer passed to utrace_attach()
+ * @data: engine-private void * passed to utrace_attach()
+ * @flags: current flags set by utrace_set_flags()
  *
  * The task itself never has to worry about engines detaching while
  * it's doing event callbacks.  These structures are freed only when
  * the task is quiescent.  For other parties, the list is protected
- * by RCU and utrace_lock.
+ * by RCU and utrace->lock.
  */
 struct utrace_attached_engine
 {
+/* private: */
 	struct list_head entry;	/* Entry on thread's utrace.engines list.  */
 	struct rcu_head rhead;
+	atomic_t check_dead;
 
+/* public: */
 	const struct utrace_engine_ops *ops;
-	unsigned long data;
+	void *data;
 
 	unsigned long flags;
 };
@@ -227,21 +230,20 @@ struct utrace_engine_ops
 	/*
 	 * Event reporting hooks.
 	 *
-	 * Return values contain UTRACE_ACTION_* flag bits.
-	 * The UTRACE_ACTION_OP_MASK bits are specific to each kind of event.
+	 * Return values contain %UTRACE_ACTION_* flag bits.
+	 * The %UTRACE_ACTION_OP_MASK bits are specific to each kind of event.
 	 *
-	 * All report_* hooks are called with no locks held, in a generally
+	 * All report_*() hooks are called with no locks held, in a generally
 	 * safe environment when we will be returning to user mode soon.
 	 * It is fine to block for memory allocation and the like, but all
 	 * hooks are *asynchronous* and must not block on external events.
-	 * If you want the thread to block, request UTRACE_ACTION_QUIESCE in
-	 * your hook; then later wake it up with utrace_set_flags.
-	 *
+	 * If you want the thread to block, request %UTRACE_ACTION_QUIESCE in
+	 * your hook; then later wake it up with utrace_set_flags().
 	 */
 
 	/*
 	 * Event reported for parent, before child might run.
-	 * The PF_STARTING flag prevents other engines from attaching
+	 * The %PF_STARTING flag prevents other engines from attaching
 	 * before this one has its chance.
 	 */
 	u32 (*report_clone)(struct utrace_attached_engine *engine,
@@ -250,30 +252,30 @@ struct utrace_engine_ops
 			    struct task_struct *child);
 
 	/*
-	 * Event reported for parent using CLONE_VFORK or vfork system call.
+	 * Event reported for parent using %CLONE_VFORK or vfork() system call.
 	 * The child has died or exec'd, so the vfork parent has unblocked
-	 * and is about to return child_pid.
+	 * and is about to return @child_pid.
 	 */
 	u32 (*report_vfork_done)(struct utrace_attached_engine *engine,
 				 struct task_struct *parent, pid_t child_pid);
 
 	/*
-	 * Event reported after UTRACE_ACTION_QUIESCE is set, when the target
+	 * Event reported after %UTRACE_ACTION_QUIESCE is set, when the target
 	 * thread is quiescent.  Either it's the current thread, or it's in
-	 * TASK_TRACED or TASK_STOPPED and will not resume running until the
-	 * UTRACE_ACTION_QUIESCE flag is no longer asserted by any engine.
+	 * %TASK_TRACED or %TASK_STOPPED and will not resume running until the
+	 * %UTRACE_ACTION_QUIESCE flag is no longer asserted by any engine.
 	 */
 	u32 (*report_quiesce)(struct utrace_attached_engine *engine,
 			      struct task_struct *tsk);
 
 	/*
 	 * Thread dequeuing a signal to be delivered.
-	 * The action and *return_ka values say what UTRACE_ACTION_RESUME
+	 * The @action and @return_ka values say what %UTRACE_ACTION_RESUME
 	 * will do (possibly already influenced by another tracing engine).
-	 * An UTRACE_SIGNAL_* return value overrides the signal disposition.
-	 * The *info data (including info->si_signo) can be changed at will.
-	 * Changing *return_ka affects the sigaction that be used.
-	 * The *orig_ka value is the one in force before other tracing
+	 * An %UTRACE_SIGNAL_* return value overrides the signal disposition.
+	 * The @info data (including @info->si_signo) can be changed at will.
+	 * Changing @return_ka affects the sigaction that will be used.
+	 * The @orig_ka value is the one in force before other tracing
 	 * engines intervened.
 	 */
 	u32 (*report_signal)(struct utrace_attached_engine *engine,
@@ -284,9 +286,9 @@ struct utrace_engine_ops
 			     struct k_sigaction *return_ka);
 
 	/*
-	 * Job control event completing, about to send SIGCHLD to parent
-	 * with CLD_STOPPED or CLD_CONTINUED as given in type.
-	 * UTRACE_JOBSTOP_NOSIGCHLD in the return value inhibits that.
+	 * Job control event completing, about to send %SIGCHLD to parent
+	 * with %CLD_STOPPED or %CLD_CONTINUED as given in type.
+	 * %UTRACE_JCTL_NOSIGCHLD in the return value inhibits that.
 	 */
 	u32 (*report_jctl)(struct utrace_attached_engine *engine,
 			   struct task_struct *tsk,
@@ -319,9 +321,9 @@ struct utrace_engine_ops
 
 	/*
 	 * Thread is exiting and cannot be prevented from doing so,
-	 * but all its state is still live.  The *code value will be
+	 * but all its state is still live.  The @code value will be
 	 * the wait result seen by the parent, and can be changed by
-	 * this engine or others.  The orig_code value is the real
+	 * this engine or others.  The @orig_code value is the real
 	 * status, not changed by any tracing engine.
 	 */
 	u32 (*report_exit)(struct utrace_attached_engine *engine,
@@ -329,11 +331,23 @@ struct utrace_engine_ops
 			   long orig_code, long *code);
 
 	/*
-	 * Thread is really dead now.  If UTRACE_ACTION_NOREAP is in force,
+	 * Thread is really dead now.  If %UTRACE_ACTION_NOREAP is in force,
 	 * it remains an unreported zombie.  Otherwise, it might be reaped
 	 * by its parent, or self-reap immediately.  Though the actual
-	 * reaping may happen in parallel, a report_reap callback will
-	 * always be ordered after a report_death callback.
+	 * reaping may happen in parallel, a report_reap() callback will
+	 * always be ordered after a report_death() callback.
+	 *
+	 * If %UTRACE_ACTION_NOREAP is in force and this was a group_leader
+	 * dying with threads still in the group (delayed_group_leader()),
+	 * then there can be a second report_death() callback later when
+	 * the group_leader is no longer delayed.  This second callback can
+	 * be made from another thread's context, but it will always be
+	 * serialized after the first report_death() callback and before
+	 * the report_reap() callback.  It's possible that
+	 * delayed_group_leader() will already be true by the time it can
+	 * be checked inside the first report_death() callback made at the
+	 * time of death; in that case the second callback follows almost
+	 * immediately thereafter.
 	 */
 	u32 (*report_death)(struct utrace_attached_engine *engine,
 			    struct task_struct *tsk);
@@ -355,7 +369,7 @@ struct utrace_engine_ops
 	 */
 
 	/*
-	 * Return nonzero iff the caller task should be allowed to access
+	 * Return nonzero iff the @caller task should be allowed to access
 	 * the memory of the target task via /proc/PID/mem and so forth,
 	 * by dint of this engine's attachment to the target.
 	 */
@@ -364,117 +378,52 @@ struct utrace_engine_ops
 				       struct task_struct *caller);
 
 	/*
-	 * Return LSM_UNSAFE_* bits that apply to the exec in progress
+	 * Return %LSM_UNSAFE_* bits that apply to the exec in progress
 	 * due to tracing done by this engine.  These bits indicate that
 	 * someone is able to examine the process and so a set-UID or similar
 	 * privilege escalation may not be safe to permit.
 	 *
-	 * Called with task_lock held.
+	 * Called with task_lock() held.
 	 */
 	int (*unsafe_exec)(struct utrace_attached_engine *engine,
 			   struct task_struct *target);
 
 	/*
-	 * Return the task_struct for the task using ptrace on this one, or
-	 * NULL.  Always called with rcu_read_lock held to keep the
+	 * Return the &struct task_struct for the task using ptrace on this
+	 * one, or %NULL.  Always called with rcu_read_lock() held to keep the
 	 * returned struct alive.
 	 *
 	 * At exec time, this may be called with task_lock(target) still
-	 * held from when unsafe_exec was just called.  In that case it
-	 * must give results consistent with those unsafe_exec results,
-	 * i.e. non-NULL if any LSM_UNSAFE_PTRACE_* bits were set.
+	 * held from when unsafe_exec() was just called.  In that case it
+	 * must give results consistent with those unsafe_exec() results,
+	 * i.e. non-%NULL if any %LSM_UNSAFE_PTRACE_* bits were set.
 	 *
 	 * The value is also used to display after "TracerPid:" in
 	 * /proc/PID/status, where it is called with only rcu_read_lock held.
 	 *
-	 * If this engine returns NULL, another engine may supply the result.
+	 * If this engine returns %NULL, another engine may supply the result.
 	 */
 	struct task_struct *(*tracer_task)(struct utrace_attached_engine *,
 					   struct task_struct *target);
 };
 
 
-/***
- *** These are the exported entry points for tracing engines to use.
- ***/
-
 /*
- * Attach a new tracing engine to a thread, or look up attached engines.
- * See UTRACE_ATTACH_* flags, above.  The caller must ensure that the
- * target thread does not get freed, i.e. hold a ref or be its parent.
+ * These are the exported entry points for tracing engines to use.
  */
 struct utrace_attached_engine *utrace_attach(struct task_struct *target,
 					     int flags,
 					     const struct utrace_engine_ops *,
-					     unsigned long data);
-
-/*
- * Detach a tracing engine from a thread.  After this, the engine
- * data structure is no longer accessible, and the thread might be reaped.
- * The thread will start running again if it was being kept quiescent
- * and no longer has any attached engines asserting UTRACE_ACTION_QUIESCE.
- *
- * If the target thread is not already quiescent, then a callback to this
- * engine might be in progress or about to start on another CPU.  If it's
- * quiescent when utrace_detach is called, then after successful return
- * it's guaranteed that no more callbacks to the ops vector will be done.
- * The only exception is SIGKILL (and exec by another thread in the group),
- * which breaks quiescence and can cause asynchronous DEATH and/or REAP
- * callbacks even when UTRACE_ACTION_QUIESCE is set.  In that event,
- * utrace_detach fails with -ESRCH or -EALREADY to indicate that the
- * report_reap or report_death callbacks have begun or will run imminently.
- */
+					     void *data);
 int utrace_detach(struct task_struct *target,
 		  struct utrace_attached_engine *engine);
-
-/*
- * Change the flags for a tracing engine.
- * This resets the event flags and the action state flags.
- * If UTRACE_ACTION_QUIESCE and UTRACE_EVENT(QUIESCE) are set,
- * this will cause a report_quiesce callback soon, maybe immediately.
- * If UTRACE_ACTION_QUIESCE was set before and is no longer set by
- * any engine, this will wake the thread up.
- *
- * This fails with -EALREADY and does nothing if you try to clear
- * UTRACE_EVENT(DEATH) when the report_death callback may already have
- * begun, if you try to clear UTRACE_EVENT(REAP) when the report_reap
- * callback may already have begun, if you try to newly set
- * UTRACE_ACTION_NOREAP when the target may already have sent its
- * parent SIGCHLD, or if you try to newly set UTRACE_EVENT(DEATH),
- * UTRACE_EVENT(QUIESCE), or UTRACE_ACTION_QUIESCE, when the target is
- * already dead or dying.  It can fail with -ESRCH when the target has
- * already been detached (including forcible detach on reaping).  If
- * the target was quiescent before the call, then after a successful
- * call, no event callbacks not requested in the new flags will be
- * made, and a report_quiesce callback will always be made if
- * requested.  These rules provide for coherent synchronization based
- * on quiescence, even when SIGKILL is breaking quiescence.
- */
 int utrace_set_flags(struct task_struct *target,
 		     struct utrace_attached_engine *engine,
 		     unsigned long flags);
-
-/*
- * Cause a specified signal delivery in the target thread, which must be
- * quiescent (or the current thread).  The action has UTRACE_SIGNAL_* bits
- * as returned from a report_signal callback.  If ka is non-null, it gives
- * the sigaction to follow for UTRACE_SIGNAL_DELIVER; otherwise, the
- * installed sigaction at the time of delivery is used.
- */
 int utrace_inject_signal(struct task_struct *target,
 			 struct utrace_attached_engine *engine,
 			 u32 action, siginfo_t *info,
 			 const struct k_sigaction *ka);
-
-/*
- * Prepare to access thread's machine state, see <linux/tracehook.h>.
- * The given thread must be quiescent (or the current thread).
- * When this returns, the struct utrace_regset calls may be used to
- * interrogate or change the thread's state.  Do not cache the returned
- * pointer when the thread can resume.  You must call utrace_regset to
- * ensure that context switching has completed and consistent state is
- * available.
- */
 const struct utrace_regset *utrace_regset(struct task_struct *target,
 					  struct utrace_attached_engine *,
 					  const struct utrace_regset_view *,
@@ -492,6 +441,7 @@ void utrace_report_clone(unsigned long c
 void utrace_report_vfork_done(pid_t child_pid);
 void utrace_report_exit(long *exit_code);
 void utrace_report_death(struct task_struct *, struct utrace *);
+void utrace_report_delayed_group_leader(struct task_struct *);
 int utrace_report_jctl(int type);
 void utrace_report_exec(struct linux_binprm *bprm, struct pt_regs *regs);
 void utrace_report_syscall(struct pt_regs *regs, int is_exit);
@@ -500,5 +450,114 @@ int utrace_allow_access_process_vm(struc
 int utrace_unsafe_exec(struct task_struct *);
 void utrace_signal_handler_singlestep(struct task_struct *, struct pt_regs *);
 
+/*
+ * <linux/tracehook.h> uses these accessors to avoid #ifdef CONFIG_UTRACE.
+ */
+static inline unsigned long tsk_utrace_flags(struct task_struct *tsk)
+{
+	return tsk->utrace_flags;
+}
+static inline struct utrace *tsk_utrace_struct(struct task_struct *tsk)
+{
+	return tsk->utrace;
+}
+static inline void utrace_init_task(struct task_struct *child)
+{
+	child->utrace_flags = 0;
+	child->utrace = NULL;
+}
+
+#else  /* !CONFIG_UTRACE */
+
+static inline unsigned long tsk_utrace_flags(struct task_struct *tsk)
+{
+	return 0;
+}
+static inline struct utrace *tsk_utrace_struct(struct task_struct *tsk)
+{
+	return NULL;
+}
+static inline void utrace_init_task(struct task_struct *child)
+{
+}
+
+/*
+ * The calls to these should all be in if (0) and optimized out entirely.
+ * We have stubs here only so tracehook.h doesn't need to #ifdef them
+ * to avoid external references in case of unoptimized compilation.
+ */
+static inline int utrace_quiescent(struct task_struct *tsk, void *ignored)
+{
+	BUG();
+	return 0;
+}
+static inline void utrace_release_task(struct task_struct *tsk)
+{
+	BUG();
+}
+static inline int utrace_get_signal(struct task_struct *tsk,
+				    struct pt_regs *regs,
+				    siginfo_t *info, struct k_sigaction *ka)
+{
+	BUG();
+	return 0;
+}
+static inline void utrace_report_clone(unsigned long clone_flags,
+				       struct task_struct *child)
+{
+	BUG();
+}
+static inline void utrace_report_vfork_done(pid_t child_pid)
+{
+	BUG();
+}
+static inline void utrace_report_exit(long *exit_code)
+{
+	BUG();
+}
+static inline void utrace_report_death(struct task_struct *tsk, void *ignored)
+{
+	BUG();
+}
+static inline void utrace_report_delayed_group_leader(struct task_struct *tsk)
+{
+	BUG();
+}
+static inline int utrace_report_jctl(int type)
+{
+	BUG();
+	return 0;
+}
+static inline void utrace_report_exec(struct linux_binprm *bprm,
+				      struct pt_regs *regs)
+{
+	BUG();
+}
+static inline void utrace_report_syscall(struct pt_regs *regs, int is_exit)
+{
+	BUG();
+}
+static inline struct task_struct *utrace_tracer_task(struct task_struct *tsk)
+{
+	BUG();
+	return NULL;
+}
+static inline int utrace_allow_access_process_vm(struct task_struct *tsk)
+{
+	BUG();
+	return 0;
+}
+static inline int utrace_unsafe_exec(struct task_struct *tsk)
+{
+	BUG();
+	return 0;
+}
+static inline void utrace_signal_handler_singlestep(struct task_struct *tsk,
+						    struct pt_regs *regs)
+{
+	BUG();
+}
+
+#endif  /* CONFIG_UTRACE */
 
 #endif	/* linux/utrace.h */
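
To make the exported entry points concrete, a minimal engine sketch
(illustrative only; it assumes utrace_attach() reports failure as an
ERR_PTR value, which this header excerpt does not spell out):

	static u32 exit_logger(struct utrace_attached_engine *engine,
			       struct task_struct *tsk,
			       long orig_code, long *code)
	{
		printk(KERN_DEBUG "pid %d exit code %ld\n", tsk->pid, *code);
		return UTRACE_ACTION_DETACH;	/* one-shot engine */
	}

	static const struct utrace_engine_ops exit_logger_ops = {
		.report_exit = exit_logger,
	};

	static int watch_exit(struct task_struct *target)
	{
		struct utrace_attached_engine *e;

		e = utrace_attach(target, UTRACE_ATTACH_CREATE,
				  &exit_logger_ops, NULL);
		if (IS_ERR(e))
			return PTR_ERR(e);
		return utrace_set_flags(target, e, UTRACE_EVENT(EXIT));
	}
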
--- linux-2.6.18/init/Kconfig
+++ linux-2.6.18/init/Kconfig
@@ -530,9 +530,21 @@ endmenu
 
 menu "Process debugging support"
 
+config PTRACE
+	bool "Legacy ptrace system call interface"
+	default y
+	select UTRACE
+	depends on PROC_FS
+	help
+	  Enable the ptrace system call.
+	  This is traditionally used by debuggers like GDB,
+	  and is used by UML and some other applications.
+	  Unless you are very sure you won't run anything that needs it,
+	  say Y.
+
 config UTRACE
 	bool "Infrastructure for tracing and debugging user processes"
-	default y
+	default y if MODULES || PTRACE
 	help
 	  Enable the utrace process tracing interface.
 	  This is an internal kernel interface to track events in user
@@ -543,18 +555,6 @@ config UTRACE
 	  applications.  Unless you are making a specially stripped-down
 	  kernel and are very sure you don't need these facilitiies,
 	  say Y.
-
-config PTRACE
-	bool "Legacy ptrace system call interface"
-	default y
-	depends on UTRACE
-	help
-	  Enable the ptrace system call.
-	  This is traditionally used by debuggers like GDB,
-	  and is used by UML and some other applications.
-	  Unless you are very sure you won't run anything that needs it,
-	  say Y.
-
 endmenu
 
 menu "Block layer"
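
The Kconfig reshuffle flips the relationship: PTRACE now selects UTRACE
rather than depending on it, so turning ptrace on always drags the utrace
core in, while UTRACE defaults on only when something (modules or ptrace)
is likely to want it.  Illustrative .config outcomes:

	CONFIG_PTRACE=y			# forces CONFIG_UTRACE=y via select
	CONFIG_UTRACE=y

	# CONFIG_PTRACE is not set
	# CONFIG_UTRACE is not set	(default y only if MODULES || PTRACE)
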
--- linux-2.6.18/kernel/exit.c
+++ linux-2.6.18/kernel/exit.c
@@ -20,8 +20,8 @@
 #include <linux/acct.h>
 #include <linux/file.h>
 #include <linux/binfmts.h>
-#include <linux/ptrace.h>
 #include <linux/tracehook.h>
+#include <linux/ptrace.h>
 #include <linux/profile.h>
 #include <linux/mount.h>
 #include <linux/proc_fs.h>
@@ -139,6 +139,7 @@ void release_task(struct task_struct * p
 {
 	struct task_struct *leader;
 	int zap_leader;
+	int inhibited_leader;
 repeat:
 	tracehook_release_task(p);
 	atomic_dec(&p->user->processes);
@@ -152,10 +153,14 @@ repeat:
 	 * group leader's parent process. (if it wants notification.)
 	 */
 	zap_leader = 0;
+	inhibited_leader = 0;
 	leader = p->group_leader;
 	if (leader != p && thread_group_empty(leader) && leader->exit_state == EXIT_ZOMBIE) {
 		BUG_ON(leader->exit_signal == -1);
-		do_notify_parent(leader, leader->exit_signal);
+		if (tracehook_inhibit_wait_zombie(leader))
+			inhibited_leader = 1;
+		else
+			do_notify_parent(leader, leader->exit_signal);
 		/*
 		 * If we were the last child thread and the leader has
 		 * exited already, and the leader's parent ignores SIGCHLD,
@@ -176,6 +181,13 @@ repeat:
 	p = leader;
 	if (unlikely(zap_leader))
 		goto repeat;
+
+	/*
+	 * If tracing usurps normal reaping of the leader, tracing needs
+	 * to be notified it would normally be reapable now.
+	 */
+	if (unlikely(inhibited_leader))
+		tracehook_report_delayed_group_leader(leader);
 }
 
 /*
@@ -601,7 +613,8 @@ reparent_thread(struct task_struct *p, s
 	/* If we'd notified the old parent about this child's death,
 	 * also notify the new parent.
 	 */
-	if (p->exit_state == EXIT_ZOMBIE && p->exit_signal != -1 &&
+	if (!tracehook_inhibit_wait_zombie(p) &&
+	    p->exit_state == EXIT_ZOMBIE && p->exit_signal != -1 &&
 	    thread_group_empty(p))
 		do_notify_parent(p, p->exit_signal);
 
@@ -674,11 +687,8 @@ static void exit_notify(struct task_stru
 		read_lock(&tasklist_lock);
 		spin_lock_irq(&tsk->sighand->siglock);
 		for (t = next_thread(tsk); t != tsk; t = next_thread(t))
-			if (!signal_pending(t) && !(t->flags & PF_EXITING)) {
-				recalc_sigpending_tsk(t);
-				if (signal_pending(t))
-					signal_wake_up(t, 0);
-			}
+			if (!signal_pending(t) && !(t->flags & PF_EXITING))
+				recalc_sigpending_and_wake(t);
 		spin_unlock_irq(&tsk->sighand->siglock);
 		read_unlock(&tasklist_lock);
 	}
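
The release_task change above keys off tracehook_inhibit_wait_zombie(), which says whether a tracer has claimed the zombie so normal parent notification must be deferred.  That helper is defined earlier in the series; a plausible sketch of its shape, assuming it just asks utrace whether any attached engine set UTRACE_ACTION_NOREAP (utrace_task_noreap is an invented name for illustration):

	static inline int tracehook_inhibit_wait_zombie(struct task_struct *p)
	{
	#ifdef CONFIG_UTRACE
		/* Invented helper: nonzero if some engine holds NOREAP.  */
		return utrace_task_noreap(p);
	#else
		return 0;
	#endif
	}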
--- linux-2.6.18/kernel/fork.c
+++ linux-2.6.18/kernel/fork.c
@@ -45,6 +45,9 @@
 #include <linux/cn_proc.h>
 #include <linux/delayacct.h>
 #include <linux/taskstats_kern.h>
+#ifndef __GENKSYMS__
+#include <linux/ptrace.h>
+#endif
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -1026,9 +1029,7 @@ static struct task_struct *copy_process(
 	INIT_LIST_HEAD(&p->sibling);
 	p->vfork_done = NULL;
 	spin_lock_init(&p->alloc_lock);
-#ifdef CONFIG_PTRACE
-	INIT_LIST_HEAD(&p->ptracees);
-#endif
+	ptrace_init_task(p);
 
 	clear_tsk_thread_flag(p, TIF_SIGPENDING);
 	init_sigpending(&p->pending);
@@ -1345,6 +1346,10 @@ long do_fork(unsigned long clone_flags,
 	 * might get invalid after that point, if the thread exits quickly.
 	 */
 	if (!IS_ERR(p)) {
+		/*
+		 * When called from kernel_thread, skip the user tracing reports.
+		 */
+		int is_user = likely(user_mode(regs));
 		struct completion vfork;
 
 		if (clone_flags & CLONE_VFORK) {
@@ -1352,7 +1357,8 @@ long do_fork(unsigned long clone_flags,
 			init_completion(&vfork);
 		}
 
-		tracehook_report_clone(clone_flags, p);
+		if (likely(is_user))
+			tracehook_report_clone(clone_flags, p);
 
 		p->flags &= ~PF_STARTING;
 
@@ -1367,11 +1373,13 @@ long do_fork(unsigned long clone_flags,
 		else
 			wake_up_new_task(p, clone_flags);
 
-		tracehook_report_clone_complete(clone_flags, nr, p);
+		if (likely(is_user))
+			tracehook_report_clone_complete(clone_flags, nr, p);
 
 		if (clone_flags & CLONE_VFORK) {
 			wait_for_completion(&vfork);
-			tracehook_report_vfork_done(p, nr);
+			if (likely(is_user))
+				tracehook_report_vfork_done(p, nr);
 		}
 	} else {
 		free_pid(pid);
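
One detail in the fork.c hunk worth spelling out: genksyms defines __GENKSYMS__ while it scans sources to compute symbol-version CRCs, so wrapping the new include in #ifndef __GENKSYMS__ keeps it out of the checksummed view and avoids perturbing symversions for task_struct users.  The same trick in isolation:

	#ifndef __GENKSYMS__		/* defined only during CRC computation */
	#include <linux/ptrace.h>	/* new include, hidden from genksyms */
	#endif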
--- linux-2.6.18/kernel/Makefile
+++ linux-2.6.18/kernel/Makefile
@@ -4,7 +4,7 @@
 
 obj-y     = sched.o fork.o exec_domain.o panic.o printk.o profile.o \
 	    exit.o itimer.o time.o softirq.o resource.o \
-	    sysctl.o capability.o ptrace.o timer.o user.o \
+	    sysctl.o capability.o timer.o user.o \
 	    signal.o sys.o kmod.o workqueue.o pid.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
@@ -51,6 +51,7 @@ obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o
 obj-$(CONFIG_TASKSTATS) += taskstats.o
 obj-$(CONFIG_UTRACE) += utrace.o
+obj-$(CONFIG_PTRACE) += ptrace.o
 
 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
--- linux-2.6.18/kernel/power/process.c
+++ linux-2.6.18/kernel/power/process.c
@@ -74,7 +74,7 @@ static void cancel_freezing(struct task_
 		pr_debug("  clean up: %s\n", p->comm);
 		do_not_freeze(p);
 		spin_lock_irqsave(&p->sighand->siglock, flags);
-		recalc_sigpending_tsk(p);
+		recalc_sigpending_and_wake(p);
 		spin_unlock_irqrestore(&p->sighand->siglock, flags);
 	}
 }
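
Both recalc_sigpending_and_wake() call sites replace the same open-coded sequence that the exit.c hunk above deletes, so the new helper (added elsewhere in the series) is assumed to look like the sketch below, reconstructed from that removed code; the caller must hold the siglock:

	/* Assumed shape, reconstructed from the removed open-coded version;
	 * caller holds t->sighand->siglock.  */
	void recalc_sigpending_and_wake(struct task_struct *t)
	{
		recalc_sigpending_tsk(t);
		if (signal_pending(t))
			signal_wake_up(t, 0);
	}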
--- linux-2.6.18/kernel/ptrace.c
+++ linux-2.6.18/kernel/ptrace.c
@@ -18,59 +18,15 @@
 #include <linux/ptrace.h>
 #include <linux/security.h>
 #include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/utrace.h>
+#include <linux/tracehook.h>
 #include <linux/audit.h>
 
+#include <asm/tracehook.h>
 #include <asm/pgtable.h>
 #include <asm/uaccess.h>
 
-#ifdef CONFIG_PTRACE
-#include <linux/utrace.h>
-#include <linux/tracehook.h>
-#include <asm/tracehook.h>
-#endif
-
-int getrusage(struct task_struct *, int, struct rusage __user *);
-
-//#define PTRACE_DEBUG
-
-int __ptrace_may_attach(struct task_struct *task)
-{
-	/* May we inspect the given task?
-	 * This check is used both for attaching with ptrace
-	 * and for allowing access to sensitive information in /proc.
-	 *
-	 * ptrace_attach denies several cases that /proc allows
-	 * because setting up the necessary parent/child relationship
-	 * or halting the specified task is impossible.
-	 */
-	int dumpable = 0;
-	/* Don't let security modules deny introspection */
-	if (task == current)
-		return 0;
-	if (((current->uid != task->euid) ||
-	     (current->uid != task->suid) ||
-	     (current->uid != task->uid) ||
-	     (current->gid != task->egid) ||
-	     (current->gid != task->sgid) ||
-	     (current->gid != task->gid)) && !capable(CAP_SYS_PTRACE))
-		return -EPERM;
-	smp_rmb();
-	if (task->mm)
-		dumpable = task->mm->dumpable;
-	if (!dumpable && !capable(CAP_SYS_PTRACE))
-		return -EPERM;
-
-	return security_ptrace(current, task);
-}
-
-int ptrace_may_attach(struct task_struct *task)
-{
-	int err;
-	task_lock(task);
-	err = __ptrace_may_attach(task);
-	task_unlock(task);
-	return !err;
-}
 
 /*
  * Access another process' address space.
@@ -127,17 +83,32 @@ int access_process_vm(struct task_struct
 }
 
 
-#ifndef CONFIG_PTRACE
-
-asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
-{
-	return -ENOSYS;
-}
+#ifdef CONFIG_DEBUG_PREEMPT
+#define NO_LOCKS	WARN_ON(preempt_count() != 0)
+#define START_CHECK	do { int _dbg_preempt = preempt_count()
+#define	END_CHECK	BUG_ON(preempt_count() != _dbg_preempt); } while (0)
+#else
+#define NO_LOCKS	do { } while (0)
+#define START_CHECK	do { } while (0)
+#define	END_CHECK	do { } while (0)
+#endif
 
+#define PTRACE_DEBUG 1
+#ifdef PTRACE_DEBUG
+#define CHECK_INIT(p)	atomic_set(&(p)->check_dead, 1)
+#define CHECK_DEAD(p)	BUG_ON(!atomic_dec_and_test(&(p)->check_dead))
 #else
+#define CHECK_INIT(p)	do { } while (0)
+#define CHECK_DEAD(p)	do { } while (0)
+#endif
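
Note how the checking macros pair up: with CONFIG_DEBUG_PREEMPT, START_CHECK opens a do { scope that END_CHECK closes, so the two must appear in the same block, bracketing a region whose preempt_count must balance.  A sketch of the intended use (example_locked_work is a placeholder):

	static int example(struct task_struct *task)
	{
		int ret;

		NO_LOCKS;		/* nothing held on entry */
		START_CHECK;		/* records preempt_count */
		ret = example_locked_work(task);	/* placeholder */
		END_CHECK;		/* BUGs if the count changed */
		return ret;
	}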
 
 struct ptrace_state
 {
+	struct rcu_head rcu;
+#ifdef PTRACE_DEBUG
+	atomic_t check_dead;
+#endif
+
 	/*
 	 * These elements are always available, even when the struct is
 	 * awaiting destruction at the next RCU callback point.
@@ -147,23 +118,18 @@ struct ptrace_state
 	struct task_struct *parent; /* Whom we report to.  */
 	struct list_head entry;	/* Entry on parent->ptracees list.  */
 
-	union {
-		struct rcu_head dead;
-		struct {
-			u8 options; /* PTRACE_SETOPTIONS bits.  */
-			unsigned int syscall:1;	/* Reporting for syscall.  */
+	u8 options;		/* PTRACE_SETOPTIONS bits.  */
+	unsigned int syscall:1;	/* Reporting for syscall.  */
 #ifdef PTRACE_SYSEMU
-			unsigned int sysemu:1; /* PTRACE_SYSEMU in progress. */
+	unsigned int sysemu:1;	/* PTRACE_SYSEMU in progress. */
 #endif
-			unsigned int have_eventmsg:1; /* u.eventmsg valid. */
-			unsigned int cap_sys_ptrace:1; /* Tracer capable.  */
+	unsigned int have_eventmsg:1; /* u.eventmsg valid. */
+	unsigned int cap_sys_ptrace:1; /* Tracer capable.  */
 
-			union
-			{
-				unsigned long eventmsg;
-				siginfo_t *siginfo;
-			} u;
-		} live;
+	union
+	{
+		unsigned long eventmsg;
+		siginfo_t *siginfo;
 	} u;
 };
 
@@ -179,32 +145,47 @@ ptrace_state_unlink(struct ptrace_state 
 
 static struct ptrace_state *
 ptrace_setup(struct task_struct *target, struct utrace_attached_engine *engine,
-	     struct task_struct *parent, u8 options, int cap_sys_ptrace,
-	     struct ptrace_state *state)
+	     struct task_struct *parent, u8 options, int cap_sys_ptrace)
 {
-	if (state == NULL) {
-		state = kzalloc(sizeof *state, GFP_USER);
-		if (unlikely(state == NULL))
-			return ERR_PTR(-ENOMEM);
-	}
+	struct ptrace_state *state;
 
-	state->engine = engine;
+	NO_LOCKS;
+
+	state = kzalloc(sizeof *state, GFP_USER);
+	if (unlikely(state == NULL))
+		return ERR_PTR(-ENOMEM);
+
+	INIT_RCU_HEAD(&state->rcu);
+	CHECK_INIT(state);
 	state->task = target;
+	state->engine = engine;
+	state->options = options;
+	state->cap_sys_ptrace = cap_sys_ptrace;
+
+	rcu_read_lock();
+
+	/*
+	 * In ptrace_traceme, reading current->parent is only
+	 * safe inside rcu_read_lock.
+	 */
+	if (parent == NULL)
+		parent = current->parent;
+
 	state->parent = parent;
-	state->u.live.options = options;
-	state->u.live.cap_sys_ptrace = cap_sys_ptrace;
 
 	task_lock(parent);
 	if (unlikely(parent->flags & PF_EXITING)) {
 		task_unlock(parent);
 		kfree(state);
-		return ERR_PTR(-EALREADY);
+		state = ERR_PTR(-EALREADY);
+	}
+	else {
+		list_add_rcu(&state->entry, &state->parent->ptracees);
+		task_unlock(state->parent);
 	}
-	list_add_rcu(&state->entry, &state->parent->ptracees);
-	task_unlock(state->parent);
 
-	BUG_ON(engine->data != 0);
-	rcu_assign_pointer(engine->data, (unsigned long) state);
+	rcu_read_unlock();
+
+	NO_LOCKS;
 
 	return state;
 }
@@ -213,26 +194,29 @@ static void
 ptrace_state_free(struct rcu_head *rhead)
 {
 	struct ptrace_state *state = container_of(rhead,
-						  struct ptrace_state, u.dead);
+						  struct ptrace_state, rcu);
 	kfree(state);
 }
 
 static void
 ptrace_done(struct ptrace_state *state)
 {
-	INIT_RCU_HEAD(&state->u.dead);
-	call_rcu(&state->u.dead, ptrace_state_free);
+	CHECK_DEAD(state);
+	BUG_ON(state->rcu.func);
+	BUG_ON(state->rcu.next);
+	call_rcu(&state->rcu, ptrace_state_free);
 }
 
 /*
  * Update the tracing engine state to match the new ptrace state.
  */
 static int __must_check
-ptrace_update(struct task_struct *target,
-	      struct utrace_attached_engine *engine,
-	      unsigned long flags)
+ptrace_update(struct task_struct *target, struct ptrace_state *state,
+	      unsigned long flags, int from_stopped)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
+	int ret;
+
+	START_CHECK;
 
 	/*
 	 * These events are always reported.
@@ -248,9 +232,9 @@ ptrace_update(struct task_struct *target
 	/*
 	 * PTRACE_SETOPTIONS can request more events.
 	 */
-	if (state->u.live.options & PTRACE_O_TRACEEXIT)
+	if (state->options & PTRACE_O_TRACEEXIT)
 		flags |= UTRACE_EVENT(EXIT);
-	if (state->u.live.options & PTRACE_O_TRACEVFORKDONE)
+	if (state->options & PTRACE_O_TRACEVFORKDONE)
 		flags |= UTRACE_EVENT(VFORK_DONE);
 
 	/*
@@ -259,7 +243,7 @@ ptrace_update(struct task_struct *target
 	 */
 	flags |= UTRACE_ACTION_NOREAP | UTRACE_EVENT(REAP);
 
-	if (!(flags & UTRACE_ACTION_QUIESCE)) {
+	if (from_stopped && !(flags & UTRACE_ACTION_QUIESCE)) {
 		/*
 		 * We're letting the thread resume from ptrace stop.
 		 * If SIGKILL is waking it up, it can be racing with us here
@@ -269,8 +253,8 @@ ptrace_update(struct task_struct *target
 		if (!unlikely(target->flags & PF_SIGNALED))
 			target->exit_code = 0;
 
-		if (!state->u.live.have_eventmsg)
-			state->u.live.u.siginfo = NULL;
+		if (!state->have_eventmsg)
+			state->u.siginfo = NULL;
 
 		if (target->state == TASK_STOPPED) {
 			/*
@@ -289,20 +273,56 @@ ptrace_update(struct task_struct *target
 		}
 	}
 
-	return utrace_set_flags(target, engine, flags);
+	ret = utrace_set_flags(target, state->engine, flags);
+
+	END_CHECK;
+
+	return ret;
+}
+
+/*
+ * This does ptrace_update and also installs state in engine->data.
+ * Only after utrace_set_flags succeeds (in ptrace_update) inside
+ * rcu_read_lock() can we be sure state->engine is still valid.
+ * Otherwise a quick death could have come along and cleaned it up
+ * already.  Note that from ptrace_update we can get event callbacks
+ * that will see engine->data still NULL before we set it.  This is
+ * fine, as they will just act as if we had not been attached yet.
+ */
+static int __must_check
+ptrace_setup_finish(struct task_struct *target, struct ptrace_state *state)
+{
+	int ret;
+
+	NO_LOCKS;
+
+	rcu_read_lock();
+	ret = ptrace_update(target, state, 0, 0);
+	if (likely(ret == 0)) {
+		struct utrace_attached_engine *engine = state->engine;
+		BUG_ON(engine->data != NULL);
+		rcu_assign_pointer(engine->data, state);
+	}
+	rcu_read_unlock();
+
+	NO_LOCKS;
+
+	return ret;
 }
 
+
 static int ptrace_traceme(void)
 {
 	struct utrace_attached_engine *engine;
 	struct ptrace_state *state;
-	struct task_struct *parent;
 	int retval;
 
+	NO_LOCKS;
+
 	engine = utrace_attach(current, (UTRACE_ATTACH_CREATE
 					 | UTRACE_ATTACH_EXCLUSIVE
 					 | UTRACE_ATTACH_MATCH_OPS),
-			       &ptrace_utrace_ops, 0UL);
+			       &ptrace_utrace_ops, NULL);
 
 	if (IS_ERR(engine)) {
 		retval = PTR_ERR(engine);
@@ -310,45 +330,25 @@ static int ptrace_traceme(void)
 			retval = -EPERM;
 	}
 	else {
-		/*
-		 * We need to preallocate so that we can hold
-		 * rcu_read_lock from extracting ->parent through
-		 * ptrace_setup using it.
-		 */
-		state = kzalloc(sizeof *state, GFP_USER);
-		if (unlikely(state == NULL)) {
-			(void) utrace_detach(current, engine);
-			printk(KERN_ERR
-			       "ptrace out of memory, lost child %d of %d",
-			       current->pid, current->parent->pid);
-			return -ENOMEM;
-		}
-
-		rcu_read_lock();
-		parent = rcu_dereference(current->parent);
-
 		task_lock(current);
-		retval = security_ptrace(parent, current);
+		retval = security_ptrace(current->parent, current);
 		task_unlock(current);
 
 		if (retval) {
-			kfree(state);
 			(void) utrace_detach(current, engine);
 		}
 		else {
-			state = ptrace_setup(current, engine, parent, 0, 0,
-					     state);
+			state = ptrace_setup(current, engine, NULL, 0, 0);
 			if (IS_ERR(state))
 				retval = PTR_ERR(state);
 		}
-		rcu_read_unlock();
 
 		if (!retval) {
 			/*
 			 * This can't fail because we can't die while we
 			 * are here doing this.
 			 */
-			retval = ptrace_update(current, engine, 0);
+			retval = ptrace_setup_finish(current, state);
 			BUG_ON(retval);
 		}
 		else if (unlikely(retval == -EALREADY))
@@ -361,6 +361,8 @@ static int ptrace_traceme(void)
 			retval = 0;
 	}
 
+	NO_LOCKS;
+
 	return retval;
 }
 
@@ -372,6 +374,8 @@ static int ptrace_attach(struct task_str
 
 	audit_ptrace(task);
 
+	NO_LOCKS;
+
 	retval = -EPERM;
 	if (task->pid <= 1)
 		goto bad;
@@ -380,10 +384,13 @@ static int ptrace_attach(struct task_str
 	if (!task->mm)		/* kernel threads */
 		goto bad;
 
+	pr_debug("%d ptrace_attach %d state %lu exit_code %x\n",
+		 current->pid, task->pid, task->state, task->exit_code);
+
 	engine = utrace_attach(task, (UTRACE_ATTACH_CREATE
 				      | UTRACE_ATTACH_EXCLUSIVE
 				      | UTRACE_ATTACH_MATCH_OPS),
-			       &ptrace_utrace_ops, 0);
+			       &ptrace_utrace_ops, NULL);
 	if (IS_ERR(engine)) {
 		retval = PTR_ERR(engine);
 		if (retval == -EEXIST)
@@ -391,13 +398,23 @@ static int ptrace_attach(struct task_str
 		goto bad;
 	}
 
+	pr_debug("%d ptrace_attach %d after utrace_attach: %lu exit_code %x\n",
+		 current->pid, task->pid, task->state, task->exit_code);
+
+	NO_LOCKS;
 	if (ptrace_may_attach(task)) {
 		state = ptrace_setup(task, engine, current, 0,
-				     capable(CAP_SYS_PTRACE), NULL);
+				     capable(CAP_SYS_PTRACE));
 		if (IS_ERR(state))
 			retval = PTR_ERR(state);
 		else {
-			retval = ptrace_update(task, engine, 0);
+			retval = ptrace_setup_finish(task, state);
+
+			pr_debug("%d ptrace_attach %d after ptrace_update (%d)"
+				 " %lu exit_code %x\n",
+				 current->pid, task->pid, retval,
+				 task->state, task->exit_code);
+
 			if (retval) {
 				/*
 				 * It died before we enabled any callbacks.
@@ -410,11 +427,14 @@ static int ptrace_attach(struct task_str
 			}
 		}
 	}
+	NO_LOCKS;
 	if (retval)
 		(void) utrace_detach(task, engine);
 	else {
 		int stopped = 0;
 
+		NO_LOCKS;
+
 		/*
 		 * We must double-check that task has not just died and
 		 * been reaped (after ptrace_update succeeded).
@@ -432,29 +452,89 @@ static int ptrace_attach(struct task_str
 		read_unlock(&tasklist_lock);
 
 		if (stopped) {
+			const struct utrace_regset *regset;
+
+			/*
+			 * Set QUIESCE immediately, so we can allow
+			 * ptrace requests while he's in TASK_STOPPED.
+			 */
+			/*
+			 * XXX: a child death racing with another thread's
+			 * wait could already have freed state here.
+			 */
+			retval = ptrace_update(task, state,
+					       UTRACE_ACTION_QUIESCE, 0);
+			if (retval)
+				/*
+				 * Anything is possible here.  It might not
+				 * really have been quiescent yet.  It
+				 * might have just woken up and died.
+				 */
+				BUG_ON(retval != -ESRCH && retval != -EALREADY);
+			retval = 0;
+
 			/*
 			 * Do now the regset 0 writeback that we do on every
 			 * stop, since it's never been done.  On register
 			 * window machines, this makes sure the user memory
 			 * backing the register data is up to date.
 			 */
-			const struct utrace_regset *regset;
 			regset = utrace_regset(task, engine,
 					       utrace_native_view(task), 0);
 			if (regset->writeback)
 				(*regset->writeback)(task, regset, 1);
 		}
+
+		pr_debug("%d ptrace_attach %d complete (%sstopped)"
+			 " state %lu code %x\n",
+			 current->pid, task->pid, stopped ? "" : "not ",
+			 task->state, task->exit_code);
 	}
 
 bad:
+	NO_LOCKS;
 	return retval;
 }
 
+/*
+ * The task might be dying or being reaped in parallel, in which case
+ * engine and state may no longer be valid.  utrace_detach checks for us.
+ */
 static int ptrace_detach(struct task_struct *task,
-			 struct utrace_attached_engine *engine)
+			 struct utrace_attached_engine *engine,
+			 struct ptrace_state *state)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	int error = utrace_detach(task, engine);
+	int error;
+
+	NO_LOCKS;
+
+#ifdef HAVE_ARCH_PTRACE_DETACH
+	/*
+	 * Some funky compatibility code in arch_ptrace may have
+	 * needed to install special state it should clean up now.
+	 */
+	arch_ptrace_detach(task);
+#endif
+
+	/*
+	 * Traditional ptrace behavior does wake_up_process no matter what
+	 * in ptrace_detach.  But utrace_detach will not do a wakeup if
+	 * it's in a proper job control stop.  We need it to wake up from
+	 * TASK_STOPPED and either resume or process more signals.  A
+	 * pending stop signal will just leave it stopped again, but will
+	 * consume the signal, and reset task->exit_code for the next wait
+	 * call to see.  This is important to userland if ptrace_do_wait
+	 * "stole" the previous unwaited-for-ness (clearing exit_code), but
+	 * there is a pending SIGSTOP, e.g. sent by a PTRACE_ATTACH done
+	 * while already in job control stop.
+	 */
+	read_lock(&tasklist_lock);
+	if (likely(task->signal != NULL)) {
+		spin_lock_irq(&task->sighand->siglock);
+		task->signal->flags &= ~SIGNAL_STOP_STOPPED;
+		spin_unlock_irq(&task->sighand->siglock);
+	}
+	read_unlock(&tasklist_lock);
+
+	error = utrace_detach(task, engine);
+	NO_LOCKS;
 	if (!error) {
 		/*
 		 * We can only get here from the ptracer itself or via
@@ -488,6 +568,9 @@ void
 ptrace_exit(struct task_struct *tsk)
 {
 	struct list_head *pos, *n;
+	int restart;
+
+	NO_LOCKS;
 
 	/*
 	 * Taking the task_lock after PF_EXITING is set ensures that a
@@ -501,40 +584,58 @@ ptrace_exit(struct task_struct *tsk)
 	}
 	task_unlock(tsk);
 
-restart:
-	rcu_read_lock();
+	restart = 0;
+	do {
+		struct ptrace_state *state;
+		int error;
 
-	list_for_each_safe_rcu(pos, n, &tsk->ptracees) {
-		struct ptrace_state *state = list_entry(pos,
-							struct ptrace_state,
-							entry);
-		int error = utrace_detach(state->task, state->engine);
-		BUG_ON(state->parent != tsk);
-		if (likely(error == 0)) {
-			ptrace_state_unlink(state);
-			ptrace_done(state);
-		}
-		else if (unlikely(error == -EALREADY)) {
-			/*
-			 * It's still doing report_death callbacks.
-			 * Just wait for it to settle down.
-			 * Since wait_task_inactive might yield,
-			 * we must go out of rcu_read_lock and restart.
-			 */
-			struct task_struct *p = state->task;
-			get_task_struct(p);
-			rcu_read_unlock();
-			wait_task_inactive(p);
-			put_task_struct(p);
-			goto restart;
+		START_CHECK;
+
+		rcu_read_lock();
+
+		list_for_each_safe_rcu(pos, n, &tsk->ptracees) {
+			state = list_entry(pos, struct ptrace_state, entry);
+			error = utrace_detach(state->task, state->engine);
+			BUG_ON(state->parent != tsk);
+			if (likely(error == 0)) {
+				ptrace_state_unlink(state);
+				ptrace_done(state);
+			}
+			else if (unlikely(error == -EALREADY)) {
+				/*
+				 * It's still doing report_death callbacks.
+				 * Just wait for it to settle down.
+				 * Since wait_task_inactive might yield,
+				 * we must go out of rcu_read_lock and restart.
+				 */
+				struct task_struct *p = state->task;
+				get_task_struct(p);
+				rcu_read_unlock();
+				wait_task_inactive(p);
+				put_task_struct(p);
+				restart = 1;
+				break;
+			}
+			else {
+				BUG_ON(error != -ESRCH);
+				restart = -1;
+			}
 		}
-		else
-			BUG_ON(error != -ESRCH);
-	}
 
-	rcu_read_unlock();
+		rcu_read_unlock();
+
+		END_CHECK;
+
+		cond_resched();
+	} while (restart > 0);
 
-	BUG_ON(!list_empty(&tsk->ptracees));
+	if (likely(restart == 0))
+		/*
+		 * If we had an -ESRCH error from utrace_detach, we might
+		 * still be racing with the thread in ptrace_state_unlink,
+		 * but things are OK.
+		 */
+		BUG_ON(!list_empty(&tsk->ptracees));
 }
 
 static int
@@ -542,7 +643,7 @@ ptrace_induce_signal(struct task_struct 
 		     struct utrace_attached_engine *engine,
 		     long signr)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
+	struct ptrace_state *state = engine->data;
 
 	if (signr == 0)
 		return 0;
@@ -550,15 +651,15 @@ ptrace_induce_signal(struct task_struct 
 	if (!valid_signal(signr))
 		return -EIO;
 
-	if (state->u.live.syscall) {
+	if (state->syscall) {
 		/*
 		 * This is the traditional ptrace behavior when given
 		 * a signal to resume from a syscall tracing stop.
 		 */
 		send_sig(signr, target, 1);
 	}
-	else if (!state->u.live.have_eventmsg && state->u.live.u.siginfo) {
-		siginfo_t *info = state->u.live.u.siginfo;
+	else if (!state->have_eventmsg && state->u.siginfo) {
+		siginfo_t *info = state->u.siginfo;
 
 		/* Update the siginfo structure if the signal has
 		   changed.  If the debugger wanted something
@@ -579,7 +680,7 @@ ptrace_induce_signal(struct task_struct 
 	return 0;
 }
 
-fastcall int
+int
 ptrace_regset_access(struct task_struct *target,
 		     struct utrace_attached_engine *engine,
 		     const struct utrace_regset_view *view,
@@ -614,7 +715,7 @@ ptrace_regset_access(struct task_struct 
 	return ret;
 }
 
-fastcall int
+int
 ptrace_onereg_access(struct task_struct *target,
 		     struct utrace_attached_engine *engine,
 		     const struct utrace_regset_view *view,
@@ -652,7 +753,7 @@ ptrace_onereg_access(struct task_struct 
 	return ret;
 }
 
-fastcall int
+int
 ptrace_layout_access(struct task_struct *target,
 		     struct utrace_attached_engine *engine,
 		     const struct utrace_regset_view *view,
@@ -685,7 +786,7 @@ ptrace_layout_access(struct task_struct 
 			 * This is a no-op/zero-fill portion of struct user.
 			 */
 			ret = 0;
-			if (!write) {
+			if (!write && seg->offset == 0) {
 				if (kdata)
 					memset(kdata, 0, n);
 				else if (clear_user(udata, n))
@@ -743,6 +844,8 @@ ptrace_start(long pid, long request,
 	struct ptrace_state *state;
 	int ret;
 
+	NO_LOCKS;
+
 	if (request == PTRACE_TRACEME)
 		return ptrace_traceme();
 
@@ -752,9 +855,7 @@ ptrace_start(long pid, long request,
 	if (child)
 		get_task_struct(child);
 	read_unlock(&tasklist_lock);
-#ifdef PTRACE_DEBUG
-	printk("ptrace pid %ld => %p\n", pid, child);
-#endif
+	pr_debug("ptrace pid %ld => %p\n", pid, child);
 	if (!child)
 		goto out;
 
@@ -769,27 +870,52 @@ ptrace_start(long pid, long request,
 
 	rcu_read_lock();
 	engine = utrace_attach(child, UTRACE_ATTACH_MATCH_OPS,
-			       &ptrace_utrace_ops, 0);
+			       &ptrace_utrace_ops, NULL);
 	ret = -ESRCH;
 	if (IS_ERR(engine) || engine == NULL)
 		goto out_tsk_rcu;
-	state = rcu_dereference((struct ptrace_state *) engine->data);
+	state = rcu_dereference(engine->data);
 	if (state == NULL || state->parent != current)
 		goto out_tsk_rcu;
-	rcu_read_unlock();
-
 	/*
 	 * Traditional ptrace behavior demands that the target already be
 	 * quiescent, but not dead.
 	 */
 	if (request != PTRACE_KILL
 	    && !(engine->flags & UTRACE_ACTION_QUIESCE)) {
-#ifdef PTRACE_DEBUG
-		printk("%d not stopped (%lx)\n", child->pid, child->state);
-#endif
-		goto out_tsk;
+		/*
+		 * If it's in job control stop, turn it into proper quiescence.
+		 */
+		struct sighand_struct *sighand;
+		unsigned long flags;
+		sighand = lock_task_sighand(child, &flags);
+		if (likely(sighand != NULL)) {
+			if (child->state == TASK_STOPPED)
+				ret = 0;
+			unlock_task_sighand(child, &flags);
+		}
+		if (ret == 0) {
+			ret = ptrace_update(child, state,
+					    UTRACE_ACTION_QUIESCE, 0);
+			if (unlikely(ret == -EALREADY))
+				ret = -ESRCH;
+			if (unlikely(ret))
+				BUG_ON(ret != -ESRCH);
+		}
+
+		if (ret) {
+			pr_debug("%d not stopped (%lu)\n",
+				 child->pid, child->state);
+			goto out_tsk_rcu;
+		}
+
+		ret = -ESRCH;  /* Return value for exit_state bail-out.  */
 	}
 
+	rcu_read_unlock();
+
+	NO_LOCKS;
+
 	/*
 	 * We do this for all requests to match traditional ptrace behavior.
 	 * If the machine state synchronization done at context switch time
@@ -815,11 +941,43 @@ ptrace_start(long pid, long request,
 out_tsk_rcu:
 	rcu_read_unlock();
 out_tsk:
+	NO_LOCKS;
 	put_task_struct(child);
 out:
 	return ret;
 }
 
+static inline int is_sysemu(long req)
+{
+#ifdef PTRACE_SYSEMU
+	if (req == PTRACE_SYSEMU || req == PTRACE_SYSEMU_SINGLESTEP)
+		return 1;
+#endif
+	return 0;
+}
+
+static inline int is_singlestep(long req)
+{
+#ifdef PTRACE_SYSEMU_SINGLESTEP
+	if (req == PTRACE_SYSEMU_SINGLESTEP)
+		return 1;
+#endif
+#ifdef PTRACE_SINGLESTEP
+	if (req == PTRACE_SINGLESTEP)
+		return 1;
+#endif
+	return 0;
+}
+
+static inline int is_blockstep(long req)
+{
+#ifdef PTRACE_SINGLEBLOCK
+	if (req == PTRACE_SINGLEBLOCK)
+		return 1;
+#endif
+	return 0;
+}
+
 static int
 ptrace_common(long request, struct task_struct *child,
 	      struct utrace_attached_engine *engine,
@@ -829,6 +987,8 @@ ptrace_common(long request, struct task_
 	unsigned long flags;
 	int ret = -EIO;
 
+	NO_LOCKS;
+
 	switch (request) {
 	case PTRACE_DETACH:
 		/*
@@ -836,7 +996,7 @@ ptrace_common(long request, struct task_
 		 */
 		ret = ptrace_induce_signal(child, engine, data);
 		if (!ret) {
-			ret = ptrace_detach(child, engine);
+			ret = ptrace_detach(child, engine, state);
 			if (ret == -EALREADY) /* Already a zombie.  */
 				ret = -ESRCH;
 			if (ret)
@@ -860,51 +1020,34 @@ ptrace_common(long request, struct task_
 # ifdef ARCH_HAS_BLOCK_STEP
 		if (! ARCH_HAS_BLOCK_STEP)
 # endif
-			if (request == PTRACE_SINGLEBLOCK)
+			if (is_blockstep(request))
 				break;
 #endif
 	case PTRACE_SINGLESTEP:
 #ifdef ARCH_HAS_SINGLE_STEP
 		if (! ARCH_HAS_SINGLE_STEP)
 #endif
-			if (request == PTRACE_SINGLESTEP
-#ifdef PTRACE_SYSEMU_SINGLESTEP
-			    || request == PTRACE_SYSEMU_SINGLESTEP
-#endif
-				)
+			if (is_singlestep(request))
 				break;
 
 		ret = ptrace_induce_signal(child, engine, data);
 		if (ret)
 			break;
 
-
 		/*
 		 * Reset the action flags without QUIESCE, so it resumes.
 		 */
 		flags = 0;
 #ifdef PTRACE_SYSEMU
-		state->u.live.sysemu = (request == PTRACE_SYSEMU_SINGLESTEP
-					|| request == PTRACE_SYSEMU);
-#endif
-		if (request == PTRACE_SINGLESTEP
-#ifdef PTRACE_SYSEMU
-		    || request == PTRACE_SYSEMU_SINGLESTEP
+		state->sysemu = is_sysemu(request);
 #endif
-			)
+		if (request == PTRACE_SYSCALL || is_sysemu(request))
+			flags |= UTRACE_EVENT_SYSCALL;
+		if (is_singlestep(request))
 			flags |= UTRACE_ACTION_SINGLESTEP;
-#ifdef PTRACE_SINGLEBLOCK
-		else if (request == PTRACE_SINGLEBLOCK)
+		else if (is_blockstep(request))
 			flags |= UTRACE_ACTION_BLOCKSTEP;
-#endif
-		if (request == PTRACE_SYSCALL)
-			flags |= UTRACE_EVENT_SYSCALL;
-#ifdef PTRACE_SYSEMU
-		else if (request == PTRACE_SYSEMU
-			 || request == PTRACE_SYSEMU_SINGLESTEP)
-			flags |= UTRACE_EVENT(SYSCALL_ENTRY);
-#endif
-		ret = ptrace_update(child, engine, flags);
+		ret = ptrace_update(child, state, flags, 1);
 		if (ret)
 			BUG_ON(ret != -ESRCH);
 		ret = 0;
@@ -917,29 +1060,29 @@ ptrace_common(long request, struct task_
 		ret = -EINVAL;
 		if (data & ~PTRACE_O_MASK)
 			break;
-		state->u.live.options = data;
-		ret = ptrace_update(child, engine, UTRACE_ACTION_QUIESCE);
+		state->options = data;
+		ret = ptrace_update(child, state, UTRACE_ACTION_QUIESCE, 1);
 		if (ret)
 			BUG_ON(ret != -ESRCH);
 		ret = 0;
 		break;
 	}
 
+	NO_LOCKS;
+
 	return ret;
 }
 
 
 asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
 {
-	struct task_struct *child;
-	struct utrace_attached_engine *engine;
-	struct ptrace_state *state;
+	struct task_struct *child = NULL;
+	struct utrace_attached_engine *engine = NULL;
+	struct ptrace_state *state = NULL;
 	long ret, val;
 
-#ifdef PTRACE_DEBUG
-	printk("%d sys_ptrace(%ld, %ld, %lx, %lx)\n",
-	       current->pid, request, pid, addr, data);
-#endif
+	pr_debug("%d sys_ptrace(%ld, %ld, %lx, %lx)\n",
+		 current->pid, request, pid, addr, data);
 
 	ret = ptrace_start(pid, request, &child, &engine, &state);
 	if (ret != -EIO)
@@ -982,21 +1125,21 @@ asmlinkage long sys_ptrace(long request,
 		break;
 
 	case PTRACE_GETEVENTMSG:
-		ret = put_user(state->u.live.have_eventmsg
-			       ? state->u.live.u.eventmsg : 0L,
+		ret = put_user(state->have_eventmsg
+			       ? state->u.eventmsg : 0L,
 			       (unsigned long __user *) data);
 		break;
 	case PTRACE_GETSIGINFO:
 		ret = -EINVAL;
-		if (!state->u.live.have_eventmsg && state->u.live.u.siginfo)
+		if (!state->have_eventmsg && state->u.siginfo)
 			ret = copy_siginfo_to_user((siginfo_t __user *) data,
-						   state->u.live.u.siginfo);
+						   state->u.siginfo);
 		break;
 	case PTRACE_SETSIGINFO:
 		ret = -EINVAL;
-		if (!state->u.live.have_eventmsg && state->u.live.u.siginfo) {
+		if (!state->have_eventmsg && state->u.siginfo) {
 			ret = 0;
-			if (copy_from_user(state->u.live.u.siginfo,
+			if (copy_from_user(state->u.siginfo,
 					   (siginfo_t __user *) data,
 					   sizeof(siginfo_t)))
 				ret = -EFAULT;
@@ -1005,11 +1148,10 @@ asmlinkage long sys_ptrace(long request,
 	}
 
 out_tsk:
+	NO_LOCKS;
 	put_task_struct(child);
 out:
-#ifdef PTRACE_DEBUG
-	printk("%d ptrace -> %lx\n", current->pid, ret);
-#endif
+	pr_debug("%d ptrace -> %lx\n", current->pid, ret);
 	return ret;
 }
 
@@ -1026,10 +1168,8 @@ asmlinkage long compat_sys_ptrace(compat
 	struct ptrace_state *state;
 	compat_long_t ret, val;
 
-#ifdef PTRACE_DEBUG
-	printk("%d compat_sys_ptrace(%d, %d, %x, %x)\n",
-	       current->pid, request, pid, addr, cdata);
-#endif
+	pr_debug("%d compat_sys_ptrace(%d, %d, %x, %x)\n",
+		 current->pid, request, pid, addr, cdata);
 	ret = ptrace_start(pid, request, &child, &engine, &state);
 	if (ret != -EIO)
 		goto out;
@@ -1071,22 +1211,22 @@ asmlinkage long compat_sys_ptrace(compat
 		break;
 
 	case PTRACE_GETEVENTMSG:
-		ret = put_user(state->u.live.have_eventmsg
-			       ? state->u.live.u.eventmsg : 0L,
+		ret = put_user(state->have_eventmsg
+			       ? state->u.eventmsg : 0L,
 			       (compat_long_t __user *) data);
 		break;
 	case PTRACE_GETSIGINFO:
 		ret = -EINVAL;
-		if (!state->u.live.have_eventmsg && state->u.live.u.siginfo)
+		if (!state->have_eventmsg && state->u.siginfo)
 			ret = copy_siginfo_to_user32(
 				(struct compat_siginfo __user *) data,
-				state->u.live.u.siginfo);
+				state->u.siginfo);
 		break;
 	case PTRACE_SETSIGINFO:
 		ret = -EINVAL;
-		if (!state->u.live.have_eventmsg && state->u.live.u.siginfo
+		if (!state->have_eventmsg && state->u.siginfo
 		    && copy_siginfo_from_user32(
-			    state->u.live.u.siginfo,
+			    state->u.siginfo,
 			    (struct compat_siginfo __user *) data))
 			ret = -EFAULT;
 		break;
@@ -1095,9 +1235,7 @@ asmlinkage long compat_sys_ptrace(compat
 out_tsk:
 	put_task_struct(child);
 out:
-#ifdef PTRACE_DEBUG
-	printk("%d ptrace -> %lx\n", current->pid, ret);
-#endif
+	pr_debug("%d ptrace -> %lx\n", current->pid, (long)ret);
 	return ret;
 }
 #endif
@@ -1111,35 +1249,38 @@ detach_zombie(struct task_struct *tsk,
 	      struct task_struct *p, struct ptrace_state *state)
 {
 	int detach_error;
+	struct utrace_attached_engine *engine;
+
 restart:
+	NO_LOCKS;
 	detach_error = 0;
 	rcu_read_lock();
-	if (tsk != current) {
+	if (tsk == current)
+		engine = state->engine;
+	else {
 		/*
 		 * We've excluded other ptrace_do_wait calls.  But the
 		 * ptracer itself might have done ptrace_detach while we
 		 * did not have rcu_read_lock.  So double-check that state
 		 * is still valid.
 		 */
-		struct utrace_attached_engine *engine;
-		engine = utrace_attach(
-			p, (UTRACE_ATTACH_MATCH_OPS
-			    | UTRACE_ATTACH_MATCH_DATA),
-			&ptrace_utrace_ops,
-			(unsigned long) state);
+		engine = utrace_attach(p, (UTRACE_ATTACH_MATCH_OPS
+					   | UTRACE_ATTACH_MATCH_DATA),
+				       &ptrace_utrace_ops, state);
 		if (IS_ERR(engine) || state->parent != tsk)
 			detach_error = -ESRCH;
 		else
 			BUG_ON(state->engine != engine);
 	}
+	rcu_read_unlock();
+	NO_LOCKS;
 	if (likely(!detach_error))
-		detach_error = ptrace_detach(p, state->engine);
+		detach_error = ptrace_detach(p, engine, state);
 	if (unlikely(detach_error == -EALREADY)) {
 		/*
 		 * It's still doing report_death callbacks.
 		 * Just wait for it to settle down.
 		 */
-		rcu_read_unlock();
 		wait_task_inactive(p); /* Might block.  */
 		goto restart;
 	}
@@ -1152,7 +1293,7 @@ restart:
 	 */
 	if (detach_error)
 		BUG_ON(detach_error != -ESRCH);
-	rcu_read_unlock();
+	NO_LOCKS;
 }
 
 /*
@@ -1165,6 +1306,7 @@ int
 ptrace_do_wait(struct task_struct *tsk,
 	       pid_t pid, int options, struct siginfo __user *infop,
 	       int __user *stat_addr, struct rusage __user *rusagep)
+	__releases(tasklist_lock)
 {
 	struct ptrace_state *state;
 	struct task_struct *p;
@@ -1204,8 +1346,15 @@ ptrace_do_wait(struct task_struct *tsk,
 		case EXIT_ZOMBIE:
 			if (!likely(options & WEXITED))
 				continue;
-			if (delay_group_leader(p))
+			if (delay_group_leader(p)) {
+				struct task_struct *next = next_thread(p);
+				pr_debug("%d ptrace_do_wait leaving %d "
+					 "zombie code %x "
+					 "delay_group_leader (%d/%lu)\n",
+					 current->pid, p->pid, p->exit_code,
+					 next->pid, next->state);
 				continue;
+			}
 			exit_code = p->exit_code;
 			goto found;
 		case EXIT_DEAD:
@@ -1217,6 +1366,14 @@ ptrace_do_wait(struct task_struct *tsk,
 			 * guaranteed a wakeup on wait_chldexit after
 			 * any new deaths.
 			 */
+			if (p->flags & PF_EXITING)
+				/*
+				 * It's in do_exit and might have set
+				 * p->exit_code already, but it's not quite
+				 * dead yet.  It will get to report_death
+				 * and wake us up when it finishes.
+				 */
+				continue;
 			break;
 		}
 
@@ -1231,20 +1388,26 @@ ptrace_do_wait(struct task_struct *tsk,
 			goto found;
 
 		// XXX should handle WCONTINUED
+
+		pr_debug("%d ptrace_do_wait leaving %d state %lu code %x\n",
+			 current->pid, p->pid, p->state, p->exit_code);
 	}
 	rcu_read_unlock();
+	if (err == 0)
+		pr_debug("%d ptrace_do_wait blocking\n", current->pid);
+
 	return err;
 
 found:
 	BUG_ON(state->parent != tsk);
 	rcu_read_unlock();
 
-#ifdef PTRACE_DEBUG
-	printk("%d ptrace_do_wait (%d) found %d code %x (%lu)\n", current->pid, tsk->pid, p->pid, exit_code, p->exit_state);
-#endif
+	pr_debug("%d ptrace_do_wait (%d) found %d code %x (%u/%d)\n",
+		 current->pid, tsk->pid, p->pid, exit_code,
+		 p->exit_state, p->exit_signal);
 
 	if (p->exit_state) {
-		if (unlikely(p->parent == tsk))
+		if (unlikely(p->parent == tsk && p->exit_signal != -1))
 			/*
 			 * This is our natural child we were ptracing.
 			 * When it dies it detaches (see ptrace_report_death).
@@ -1253,6 +1416,14 @@ found:
 			 * the normal wait_task_zombie path instead.
 			 */
 			return 0;
+
+		/*
+		 * If there was a group exit in progress, all threads
+		 * report that status.  Most have SIGKILL in their exit_code.
+		 */
+		if (p->signal->flags & SIGNAL_GROUP_EXIT)
+			exit_code = p->signal->group_exit_code;
+
 		if ((exit_code & 0x7f) == 0) {
 			why = CLD_EXITED;
 			status = exit_code >> 8;
@@ -1275,6 +1446,8 @@ found:
 	get_task_struct(p);
 	read_unlock(&tasklist_lock);
 
+	NO_LOCKS;
+
 	if (rusagep)
 		err = getrusage(p, RUSAGE_BOTH, rusagep);
 	if (infop) {
@@ -1310,6 +1483,41 @@ found:
 	return err;
 }
 
+
+/*
+ * All the report callbacks (except death and reap) are subject to a race
+ * with ptrace_exit doing a quick detach and ptrace_done.  It can do this
+ * even when the target is not quiescent, so a callback may already be in
+ * progress when it does ptrace_done.  Callbacks use this function to fetch
+ * the struct ptrace_state while ensuring it doesn't disappear until
+ * put_ptrace_state is called.  This just uses RCU, since state and
+ * anything we try to do to state->parent is safe under rcu_read_lock.
+ */
+static struct ptrace_state *
+get_ptrace_state(struct utrace_attached_engine *engine,
+		 struct task_struct *tsk)
+	__acquires(RCU)
+{
+	struct ptrace_state *state;
+
+	rcu_read_lock();
+	state = rcu_dereference(engine->data);
+	if (likely(state != NULL))
+		return state;
+
+	rcu_read_unlock();
+	return NULL;
+}
+
+static inline void
+put_ptrace_state(struct ptrace_state *state)
+	__releases(RCU)
+{
+	BUG_ON(state == NULL);
+	rcu_read_unlock();
+}
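
Every report callback below then follows the same bracket: get_ptrace_state returns with rcu_read_lock held on success, and each path must reach put_ptrace_state (or hand state to a __releases(RCU) callee like ptrace_report) exactly once.  The canonical shape, mirroring ptrace_report_jctl below:

	static u32 example_callback(struct utrace_attached_engine *engine,
				    struct task_struct *tsk)
	{
		struct ptrace_state *state = get_ptrace_state(engine, tsk);
		if (unlikely(state == NULL))
			return UTRACE_ACTION_RESUME;	/* raced with detach */
		/* ... read state and state->parent under RCU ... */
		put_ptrace_state(state);	/* drops rcu_read_lock */
		return UTRACE_ACTION_RESUME;
	}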
+
+
 static void
 do_notify(struct task_struct *tsk, struct task_struct *parent, int why)
 {
@@ -1346,6 +1554,10 @@ do_notify(struct task_struct *tsk, struc
 		}
 	}
 
+	read_lock(&tasklist_lock);
+	if (unlikely(parent->signal == NULL))
+		goto out;
+
 	sighand = parent->sighand;
 	spin_lock_irqsave(&sighand->siglock, flags);
 	if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN &&
@@ -1356,26 +1568,30 @@ do_notify(struct task_struct *tsk, struc
 	 */
 	wake_up_interruptible_sync(&parent->signal->wait_chldexit);
 	spin_unlock_irqrestore(&sighand->siglock, flags);
+
+out:
+	read_unlock(&tasklist_lock);
 }
 
 static u32
-ptrace_report(struct utrace_attached_engine *engine, struct task_struct *tsk,
+ptrace_report(struct utrace_attached_engine *engine,
+	      struct task_struct *tsk,
+	      struct ptrace_state *state,
 	      int code)
+	__releases(RCU)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
 	const struct utrace_regset *regset;
 
-#ifdef PTRACE_DEBUG
-	printk("%d ptrace_report %d engine %p state %p code %x parent %d (%p)\n",
-	       current->pid, tsk->pid, engine, state, code,
-	       state->parent->pid, state->parent);
-	if (!state->u.live.have_eventmsg && state->u.live.u.siginfo) {
-		const siginfo_t *si = state->u.live.u.siginfo;
-		printk("  si %d code %x errno %d addr %p\n",
-		       si->si_signo, si->si_code, si->si_errno,
-		       si->si_addr);
+	pr_debug("%d ptrace_report %d engine %p"
+		 " state %p code %x parent %d (%p)\n",
+		 current->pid, tsk->pid, engine, state, code,
+		 state->parent->pid, state->parent);
+	if (!state->have_eventmsg && state->u.siginfo) {
+		const siginfo_t *si = state->u.siginfo;
+		pr_debug("  si %d code %x errno %d addr %p\n",
+			 si->si_signo, si->si_code, si->si_errno,
+			 si->si_addr);
 	}
-#endif
 
 	/*
 	 * Set our QUIESCE flag right now, before notifying the tracer.
@@ -1386,6 +1602,17 @@ ptrace_report(struct utrace_attached_eng
 	 */
 	utrace_set_flags(tsk, engine, engine->flags | UTRACE_ACTION_QUIESCE);
 
+	BUG_ON(code == 0);
+	tsk->exit_code = code;
+	do_notify(tsk, state->parent, CLD_TRAPPED);
+
+	pr_debug("%d ptrace_report quiescing exit_code %x\n",
+		 current->pid, current->exit_code);
+
+	put_ptrace_state(state);
+
+	NO_LOCKS;
+
 	/*
 	 * If regset 0 has a writeback call, do it now.  On register window
 	 * machines, this makes sure the user memory backing the register
@@ -1396,33 +1623,29 @@ ptrace_report(struct utrace_attached_eng
 	if (regset->writeback)
 		(*regset->writeback)(tsk, regset, 0);
 
-	BUG_ON(code == 0);
-	tsk->exit_code = code;
-	do_notify(tsk, state->parent, CLD_TRAPPED);
-
-#ifdef PTRACE_DEBUG
-	printk("%d ptrace_report quiescing exit_code %x\n",
-	       current->pid, current->exit_code);
-#endif
-
 	return UTRACE_ACTION_RESUME;
 }
 
 static inline u32
-ptrace_event(struct utrace_attached_engine *engine, struct task_struct *tsk,
+ptrace_event(struct utrace_attached_engine *engine,
+	     struct task_struct *tsk,
+	     struct ptrace_state *state,
 	     int event)
+	__releases(RCU)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	state->u.live.syscall = 0;
-	return ptrace_report(engine, tsk, (event << 8) | SIGTRAP);
+	state->syscall = 0;
+	return ptrace_report(engine, tsk, state, (event << 8) | SIGTRAP);
 }
 
-
+/*
+ * Unlike other report callbacks, this can't be called while ptrace_exit
+ * is doing ptrace_done in parallel, so we don't need get_ptrace_state.
+ */
 static u32
 ptrace_report_death(struct utrace_attached_engine *engine,
 		    struct task_struct *tsk)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
+	struct ptrace_state *state = engine->data;
 
 	if (tsk->exit_code == 0 && unlikely(tsk->flags & PF_SIGNALED))
 		/*
@@ -1432,22 +1655,44 @@ ptrace_report_death(struct utrace_attach
 		 */
 		tsk->exit_code = SIGKILL;
 
-	if (tsk->parent == state->parent) {
+	if (unlikely(state == NULL)) {
+		/*
+		 * We can be called before ptrace_setup_finish is done,
+		 * if we're dying before attaching really finished.
+		 */
+		printk(KERN_ERR "XXX ptrace_report_death leak\n");
+		return UTRACE_ACTION_RESUME;
+	}
+
+	if (tsk->parent == state->parent && tsk->exit_signal != -1) {
 		/*
-		 * This is a natural child, so we detach and let the normal
+		 * This is a natural child (excluding clone siblings of a
+		 * child group_leader), so we detach and let the normal
 		 * reporting happen once our NOREAP action is gone.  But
 		 * first, generate a SIGCHLD for those cases where normal
 		 * behavior won't.  A ptrace'd child always generates SIGCHLD.
 		 */
-		if (tsk->exit_signal == -1 || !thread_group_empty(tsk))
+		pr_debug("ptrace %d death natural parent %d exit_code %x\n",
+			 tsk->pid, state->parent->pid, tsk->exit_code);
+		if (!thread_group_empty(tsk))
 			do_notify(tsk, state->parent, CLD_EXITED);
 		ptrace_state_unlink(state);
-		rcu_assign_pointer(engine->data, 0UL);
+		rcu_assign_pointer(engine->data, NULL);
 		ptrace_done(state);
 		return UTRACE_ACTION_DETACH;
 	}
 
+	/*
+	 * This might be a second report_death callback for a group leader
+	 * that was delayed when its original report_death callback was made.
+	 * Repeating do_notify is exactly what we need for that case too.
+	 * After the wakeup, ptrace_do_wait will see delay_group_leader false.
+	 */
+
+	pr_debug("ptrace %d death notify %d exit_code %x: ",
+		 tsk->pid, state->parent->pid, tsk->exit_code);
 	do_notify(tsk, state->parent, CLD_EXITED);
+	pr_debug("%d notified %d\n", tsk->pid, state->parent->pid);
 	return UTRACE_ACTION_RESUME;
 }
 
@@ -1459,36 +1704,116 @@ static void
 ptrace_report_reap(struct utrace_attached_engine *engine,
 		   struct task_struct *tsk)
 {
-	struct ptrace_state *state;
-	rcu_read_lock();
-	state = rcu_dereference((struct ptrace_state *) engine->data);
-	if (state != NULL) {
-		ptrace_state_unlink(state);
-		rcu_assign_pointer(engine->data, 0UL);
-		ptrace_done(state);
+	struct ptrace_state *state = engine->data;
+
+	if (unlikely(state == NULL)) { /* Not fully attached.  */
+		printk(KERN_ERR "XXX ptrace_report_reap leak\n");
+		return;
 	}
-	rcu_read_unlock();
+
+	NO_LOCKS;
+
+	ptrace_state_unlink(state);
+	rcu_assign_pointer(engine->data, NULL);
+	ptrace_done(state);
+
+	NO_LOCKS;
 }
 
+/*
+ * Start tracing the child.  This has to do put_ptrace_state before it can
+ * do allocation that might block.
+ */
+static void
+ptrace_clone_setup(struct utrace_attached_engine *engine,
+		   struct task_struct *parent,
+		   struct ptrace_state *state,
+		   struct task_struct *child)
+	__releases(RCU)
+{
+	struct task_struct *tracer;
+	struct utrace_attached_engine *child_engine;
+	struct ptrace_state *child_state;
+	int ret;
+	u8 options;
+	int cap_sys_ptrace;
+
+	tracer = state->parent;
+	options = state->options;
+	cap_sys_ptrace = state->cap_sys_ptrace;
+	get_task_struct(tracer);
+	put_ptrace_state(state);
+
+	NO_LOCKS;
+
+	child_engine = utrace_attach(child, (UTRACE_ATTACH_CREATE
+					     | UTRACE_ATTACH_EXCLUSIVE
+					     | UTRACE_ATTACH_MATCH_OPS),
+				     &ptrace_utrace_ops, NULL);
+	if (unlikely(IS_ERR(child_engine))) {
+		BUG_ON(PTR_ERR(child_engine) != -ENOMEM);
+		put_task_struct(tracer);
+		goto nomem;
+	}
+
+	child_state = ptrace_setup(child, child_engine,
+				   tracer, options, cap_sys_ptrace);
+
+	put_task_struct(tracer);
+
+	if (unlikely(IS_ERR(child_state))) {
+		(void) utrace_detach(child, child_engine);
+
+		if (PTR_ERR(child_state) == -ENOMEM)
+			goto nomem;
+
+		/*
+		 * Our tracer has started exiting.  It's too late
+		 * to set it up to trace the child.
+		 */
+		BUG_ON(PTR_ERR(child_state) != -EALREADY);
+	}
+	else {
+		sigaddset(&child->pending.signal, SIGSTOP);
+		set_tsk_thread_flag(child, TIF_SIGPENDING);
+		ret = ptrace_setup_finish(child, child_state);
+
+		/*
+		 * The child hasn't run yet, it can't have died already.
+		 */
+		BUG_ON(ret);
+	}
+
+	NO_LOCKS;
+
+	return;
+
+nomem:
+	NO_LOCKS;
+	printk(KERN_ERR "ptrace out of memory, lost child %d of %d\n",
+	       child->pid, parent->pid);
+}
 
 static u32
 ptrace_report_clone(struct utrace_attached_engine *engine,
 		    struct task_struct *parent,
 		    unsigned long clone_flags, struct task_struct *child)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	struct utrace_attached_engine *child_engine;
-	int event = PTRACE_EVENT_FORK;
-	int option = PTRACE_O_TRACEFORK;
+	int event, option;
+	struct ptrace_state *state;
 
-#ifdef PTRACE_DEBUG
-	printk("%d (%p) engine %p ptrace_report_clone child %d (%p) fl %lx\n",
-	       parent->pid, parent, engine, child->pid, child, clone_flags);
-#endif
+	NO_LOCKS;
 
-	if (clone_flags & CLONE_UNTRACED)
-		goto out;
+	state = get_ptrace_state(engine, parent);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	pr_debug("%d (%p) engine %p"
+		 " ptrace_report_clone child %d (%p) fl %lx\n",
+		 parent->pid, parent, engine, child->pid, child, clone_flags);
 
+	event = PTRACE_EVENT_FORK;
+	option = PTRACE_O_TRACEFORK;
 	if (clone_flags & CLONE_VFORK) {
 		event = PTRACE_EVENT_VFORK;
 		option = PTRACE_O_TRACEVFORK;
@@ -1498,53 +1823,38 @@ ptrace_report_clone(struct utrace_attach
 		option = PTRACE_O_TRACECLONE;
 	}
 
-	if (!(clone_flags & CLONE_PTRACE) && !(state->u.live.options & option))
-		goto out;
-
-	child_engine = utrace_attach(child, (UTRACE_ATTACH_CREATE
-					     | UTRACE_ATTACH_EXCLUSIVE
-					     | UTRACE_ATTACH_MATCH_OPS),
-				     &ptrace_utrace_ops, 0UL);
-	if (unlikely(IS_ERR(child_engine))) {
-		BUG_ON(PTR_ERR(child_engine) != -ENOMEM);
-		printk(KERN_ERR
-		       "ptrace out of memory, lost child %d of %d",
-		       child->pid, parent->pid);
-	}
-	else {
-		struct ptrace_state *child_state;
-		child_state = ptrace_setup(child, child_engine,
-					   state->parent,
-					   state->u.live.options,
-					   state->u.live.cap_sys_ptrace,
-					   NULL);
-		if (unlikely(IS_ERR(child_state))) {
-			BUG_ON(PTR_ERR(child_state) != -ENOMEM);
-			(void) utrace_detach(child, child_engine);
-			printk(KERN_ERR
-			       "ptrace out of memory, lost child %d of %d",
-			       child->pid, parent->pid);
-		}
-		else {
-			int ret;
-			sigaddset(&child->pending.signal, SIGSTOP);
-			set_tsk_thread_flag(child, TIF_SIGPENDING);
-			ret = ptrace_update(child, child_engine, 0);
-			/*
-			 * The child hasn't run yet,
-			 * it can't have died already.
-			 */
-			BUG_ON(ret);
-		}
+	if (state->options & option) {
+		state->have_eventmsg = 1;
+		state->u.eventmsg = child->pid;
 	}
+	else
+		event = 0;
 
-	if (state->u.live.options & option) {
-		state->u.live.have_eventmsg = 1;
-		state->u.live.u.eventmsg = child->pid;
-		return ptrace_event(engine, parent, event);
+	if (!(clone_flags & CLONE_UNTRACED)
+	    && (event || (clone_flags & CLONE_PTRACE))) {
+		/*
+		 * Have our tracer start following the child too.
+		 */
+		ptrace_clone_setup(engine, parent, state, child);
+
+		NO_LOCKS;
+
+		/*
+		 * That did put_ptrace_state, so we have to check
+		 * again in case our tracer just started exiting.
+		 */
+		state = get_ptrace_state(engine, parent);
+		if (unlikely(state == NULL))
+			return UTRACE_ACTION_RESUME;
 	}
 
-out:
+	if (event)
+		return ptrace_event(engine, parent, state, event);
+
+	put_ptrace_state(state);
+
+	NO_LOCKS;
+
 	return UTRACE_ACTION_RESUME;
 }
 
@@ -1553,10 +1863,13 @@ static u32
 ptrace_report_vfork_done(struct utrace_attached_engine *engine,
 			 struct task_struct *parent, pid_t child_pid)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	state->u.live.have_eventmsg = 1;
-	state->u.live.u.eventmsg = child_pid;
-	return ptrace_event(engine, parent, PTRACE_EVENT_VFORK_DONE);
+	struct ptrace_state *state = get_ptrace_state(engine, parent);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	state->have_eventmsg = 1;
+	state->u.eventmsg = child_pid;
+	return ptrace_event(engine, parent, state, PTRACE_EVENT_VFORK_DONE);
 }
 
 
@@ -1567,24 +1880,31 @@ ptrace_report_signal(struct utrace_attac
 		     const struct k_sigaction *orig_ka,
 		     struct k_sigaction *return_ka)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
 	int signo = info == NULL ? SIGTRAP : info->si_signo;
-	state->u.live.syscall = 0;
-	state->u.live.have_eventmsg = 0;
-	state->u.live.u.siginfo = info;
-	return ptrace_report(engine, tsk, signo) | UTRACE_SIGNAL_IGN;
+	struct ptrace_state *state = get_ptrace_state(engine, tsk);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	state->syscall = 0;
+	state->have_eventmsg = 0;
+	state->u.siginfo = info;
+	return ptrace_report(engine, tsk, state, signo) | UTRACE_SIGNAL_IGN;
 }
 
 static u32
 ptrace_report_jctl(struct utrace_attached_engine *engine,
 		   struct task_struct *tsk, int type)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-#ifdef PTRACE_DEBUG
-	printk("ptrace %d jctl notify %d type %x exit_code %x\n",
-	       tsk->pid, state->parent->pid, type, tsk->exit_code);
-#endif
+	struct ptrace_state *state = get_ptrace_state(engine, tsk);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	pr_debug("ptrace %d jctl notify %d type %x exit_code %x\n",
+		 tsk->pid, state->parent->pid, type, tsk->exit_code);
+
 	do_notify(tsk, state->parent, type);
+	put_ptrace_state(state);
+
 	return UTRACE_JCTL_NOSIGCHLD;
 }
 
@@ -1594,11 +1914,13 @@ ptrace_report_exec(struct utrace_attache
 		   const struct linux_binprm *bprm,
 		   struct pt_regs *regs)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	if (state->u.live.options & PTRACE_O_TRACEEXEC)
-		return ptrace_event(engine, tsk, PTRACE_EVENT_EXEC);
-	state->u.live.syscall = 0;
-	return ptrace_report(engine, tsk, SIGTRAP);
+	struct ptrace_state *state = get_ptrace_state(engine, tsk);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	return ptrace_event(engine, tsk, state,
+			    (state->options & PTRACE_O_TRACEEXEC)
+			    ? PTRACE_EVENT_EXEC : 0);
 }
 
 static u32
@@ -1606,14 +1928,43 @@ ptrace_report_syscall(struct utrace_atta
 		      struct task_struct *tsk, struct pt_regs *regs,
 		      int entry)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
+	struct ptrace_state *state = get_ptrace_state(engine, tsk);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
 #ifdef PTRACE_SYSEMU
-	if (entry && state->u.live.sysemu)
-		tracehook_abort_syscall(regs);
+	if (state->sysemu) {
+		/*
+		 * A syscall under PTRACE_SYSEMU gets just one stop and
+		 * report.  But at that stop, the syscall number is
+		 * expected to reside in the pseudo-register.  We need to
+		 * reset it to prevent the actual syscall from happening.
+		 *
+		 * At the entry tracing stop, the return value register has
+		 * been primed to -ENOSYS, and the syscall pseudo-register
+		 * has the syscall number.  We squirrel away the syscall
+		 * number in the return value register long enough to skip
+		 * the actual syscall and get to the exit tracing stop.
+		 * There, we swap the registers back and do ptrace_report.
+		 */
+
+		long *scno = tracehook_syscall_callno(regs);
+		long *retval = tracehook_syscall_retval(regs);
+		if (entry) {
+			*retval = *scno;
+			*scno = -1;
+			return UTRACE_ACTION_RESUME;
+		}
+		else {
+			*scno = *retval;
+			*retval = -ENOSYS;
+		}
+	}
 #endif
-	state->u.live.syscall = 1;
-	return ptrace_report(engine, tsk,
-			     ((state->u.live.options & PTRACE_O_TRACESYSGOOD)
+
+	state->syscall = 1;
+	return ptrace_report(engine, tsk, state,
+			     ((state->options & PTRACE_O_TRACESYSGOOD)
 			      ? 0x80 : 0) | SIGTRAP);
 }
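
To make the SYSEMU comment above concrete, here is the register dance traced for a hypothetical syscall number 42, under the entry convention the comment states (callno pseudo-register holds the number, return-value register primed to -ENOSYS):

	/*
	 * syscall entry, before the swap:    *scno == 42   *retval == -ENOSYS
	 * after the entry-side swap:         *scno == -1   *retval == 42
	 *	(callno -1 makes the kernel skip the real syscall; the
	 *	 task resumes with no report at entry)
	 * syscall exit, after swapping back: *scno == 42   *retval == -ENOSYS
	 *	and only here does ptrace_report stop for the tracer.
	 */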
 
@@ -1626,7 +1977,7 @@ ptrace_report_syscall_entry(struct utrac
 
 static u32
 ptrace_report_syscall_exit(struct utrace_attached_engine *engine,
-			    struct task_struct *tsk, struct pt_regs *regs)
+			   struct task_struct *tsk, struct pt_regs *regs)
 {
 	return ptrace_report_syscall(engine, tsk, regs, 0);
 }
@@ -1635,20 +1986,33 @@ static u32
 ptrace_report_exit(struct utrace_attached_engine *engine,
 		   struct task_struct *tsk, long orig_code, long *code)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
-	state->u.live.have_eventmsg = 1;
-	state->u.live.u.eventmsg = *code;
-	return ptrace_event(engine, tsk, PTRACE_EVENT_EXIT);
+	struct ptrace_state *state = get_ptrace_state(engine, tsk);
+	if (unlikely(state == NULL))
+		return UTRACE_ACTION_RESUME;
+
+	state->have_eventmsg = 1;
+	state->u.eventmsg = *code;
+	return ptrace_event(engine, tsk, state, PTRACE_EVENT_EXIT);
 }
 
 static int
 ptrace_unsafe_exec(struct utrace_attached_engine *engine,
 		   struct task_struct *tsk)
 {
-	struct ptrace_state *state = (struct ptrace_state *) engine->data;
 	int unsafe = LSM_UNSAFE_PTRACE;
-	if (state->u.live.cap_sys_ptrace)
-		unsafe = LSM_UNSAFE_PTRACE_CAP;
+	struct ptrace_state *state;
+
+	START_CHECK;
+
+	state = get_ptrace_state(engine, tsk);
+	if (likely(state != NULL)) {
+		if (state->cap_sys_ptrace)
+			unsafe = LSM_UNSAFE_PTRACE_CAP;
+		put_ptrace_state(state);
+	}
+
+	END_CHECK;
+
 	return unsafe;
 }
 
@@ -1656,16 +2020,20 @@ static struct task_struct *
 ptrace_tracer_task(struct utrace_attached_engine *engine,
 		   struct task_struct *target)
 {
+	struct task_struct *parent = NULL;
 	struct ptrace_state *state;
 
-	/*
-	 * This call is not necessarily made by the target task,
-	 * so ptrace might be getting detached while we run here.
-	 * The state pointer will be NULL if that happens.
-	 */
-	state = rcu_dereference((struct ptrace_state *) engine->data);
+	START_CHECK;
+
+	state = get_ptrace_state(engine, target);
+	if (likely(state != NULL)) {
+		parent = state->parent;
+		put_ptrace_state(state);
+	}
 
-	return state == NULL ? NULL : state->parent;
+	END_CHECK;
+
+	return parent;
 }
 
 static int
@@ -1674,22 +2042,24 @@ ptrace_allow_access_process_vm(struct ut
 			       struct task_struct *caller)
 {
 	struct ptrace_state *state;
-	int ours;
+	int ours = 0;
 
-	/*
-	 * This call is not necessarily made by the target task,
-	 * so ptrace might be getting detached while we run here.
-	 * The state pointer will be NULL if that happens.
-	 */
-	rcu_read_lock();
-	state = rcu_dereference((struct ptrace_state *) engine->data);
-	ours = (state != NULL
-		&& ((engine->flags & UTRACE_ACTION_QUIESCE)
-		    || (target->state == TASK_STOPPED))
-		&& state->parent == caller);
-	rcu_read_unlock();
+	START_CHECK;
 
-	return ours && security_ptrace(caller, target) == 0;
+	state = get_ptrace_state(engine, target);
+	if (likely(state != NULL)) {
+		ours = (((engine->flags & UTRACE_ACTION_QUIESCE)
+			 || target->state == TASK_STOPPED)
+			&& state->parent == caller);
+		put_ptrace_state(state);
+	}
+
+	if (ours)
+		ours = security_ptrace(caller, target) == 0;
+
+	END_CHECK;
+
+	return ours;
 }
 
 
@@ -1709,5 +2079,3 @@ static const struct utrace_engine_ops pt
 	.tracer_task = ptrace_tracer_task,
 	.allow_access_process_vm = ptrace_allow_access_process_vm,
 };
-
-#endif
--- linux-2.6.18/kernel/signal.c
+++ linux-2.6.18/kernel/signal.c
@@ -35,125 +35,6 @@
 
 static kmem_cache_t *sigqueue_cachep;
 
-/*
- * In POSIX a signal is sent either to a specific thread (Linux task)
- * or to the process as a whole (Linux thread group).  How the signal
- * is sent determines whether it's to one thread or the whole group,
- * which determines which signal mask(s) are involved in blocking it
- * from being delivered until later.  When the signal is delivered,
- * either it's caught or ignored by a user handler or it has a default
- * effect that applies to the whole thread group (POSIX process).
- *
- * The possible effects an unblocked signal set to SIG_DFL can have are:
- *   ignore	- Nothing Happens
- *   terminate	- kill the process, i.e. all threads in the group,
- * 		  similar to exit_group.  The group leader (only) reports
- *		  WIFSIGNALED status to its parent.
- *   coredump	- write a core dump file describing all threads using
- *		  the same mm and then kill all those threads
- *   stop 	- stop all the threads in the group, i.e. TASK_STOPPED state
- *
- * SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
- * Other signals when not blocked and set to SIG_DFL behaves as follows.
- * The job control signals also have other special effects.
- *
- *	+--------------------+------------------+
- *	|  POSIX signal      |  default action  |
- *	+--------------------+------------------+
- *	|  SIGHUP            |  terminate	|
- *	|  SIGINT            |	terminate	|
- *	|  SIGQUIT           |	coredump 	|
- *	|  SIGILL            |	coredump 	|
- *	|  SIGTRAP           |	coredump 	|
- *	|  SIGABRT/SIGIOT    |	coredump 	|
- *	|  SIGBUS            |	coredump 	|
- *	|  SIGFPE            |	coredump 	|
- *	|  SIGKILL           |	terminate(+)	|
- *	|  SIGUSR1           |	terminate	|
- *	|  SIGSEGV           |	coredump 	|
- *	|  SIGUSR2           |	terminate	|
- *	|  SIGPIPE           |	terminate	|
- *	|  SIGALRM           |	terminate	|
- *	|  SIGTERM           |	terminate	|
- *	|  SIGCHLD           |	ignore   	|
- *	|  SIGCONT           |	ignore(*)	|
- *	|  SIGSTOP           |	stop(*)(+)  	|
- *	|  SIGTSTP           |	stop(*)  	|
- *	|  SIGTTIN           |	stop(*)  	|
- *	|  SIGTTOU           |	stop(*)  	|
- *	|  SIGURG            |	ignore   	|
- *	|  SIGXCPU           |	coredump 	|
- *	|  SIGXFSZ           |	coredump 	|
- *	|  SIGVTALRM         |	terminate	|
- *	|  SIGPROF           |	terminate	|
- *	|  SIGPOLL/SIGIO     |	terminate	|
- *	|  SIGSYS/SIGUNUSED  |	coredump 	|
- *	|  SIGSTKFLT         |	terminate	|
- *	|  SIGWINCH          |	ignore   	|
- *	|  SIGPWR            |	terminate	|
- *	|  SIGRTMIN-SIGRTMAX |	terminate       |
- *	+--------------------+------------------+
- *	|  non-POSIX signal  |  default action  |
- *	+--------------------+------------------+
- *	|  SIGEMT            |  coredump	|
- *	+--------------------+------------------+
- *
- * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
- * (*) Special job control effects:
- * When SIGCONT is sent, it resumes the process (all threads in the group)
- * from TASK_STOPPED state and also clears any pending/queued stop signals
- * (any of those marked with "stop(*)").  This happens regardless of blocking,
- * catching, or ignoring SIGCONT.  When any stop signal is sent, it clears
- * any pending/queued SIGCONT signals; this happens regardless of blocking,
- * catching, or ignored the stop signal, though (except for SIGSTOP) the
- * default action of stopping the process may happen later or never.
- */
-
-#ifdef SIGEMT
-#define M_SIGEMT	M(SIGEMT)
-#else
-#define M_SIGEMT	0
-#endif
-
-#if SIGRTMIN > BITS_PER_LONG
-#define M(sig) (1ULL << ((sig)-1))
-#else
-#define M(sig) (1UL << ((sig)-1))
-#endif
-#define T(sig, mask) (M(sig) & (mask))
-
-#define SIG_KERNEL_ONLY_MASK (\
-	M(SIGKILL)   |  M(SIGSTOP)                                   )
-
-#define SIG_KERNEL_STOP_MASK (\
-	M(SIGSTOP)   |  M(SIGTSTP)   |  M(SIGTTIN)   |  M(SIGTTOU)   )
-
-#define SIG_KERNEL_COREDUMP_MASK (\
-        M(SIGQUIT)   |  M(SIGILL)    |  M(SIGTRAP)   |  M(SIGABRT)   | \
-        M(SIGFPE)    |  M(SIGSEGV)   |  M(SIGBUS)    |  M(SIGSYS)    | \
-        M(SIGXCPU)   |  M(SIGXFSZ)   |  M_SIGEMT                     )
-
-#define SIG_KERNEL_IGNORE_MASK (\
-        M(SIGCONT)   |  M(SIGCHLD)   |  M(SIGWINCH)  |  M(SIGURG)    )
-
-#define sig_kernel_only(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_ONLY_MASK))
-#define sig_kernel_coredump(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_COREDUMP_MASK))
-#define sig_kernel_ignore(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_IGNORE_MASK))
-#define sig_kernel_stop(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_STOP_MASK))
-
-#define sig_needs_tasklist(sig)	((sig) == SIGCONT)
-
-#define sig_user_defined(t, signr) \
-	(((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_DFL) &&	\
-	 ((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_IGN))
-
-#define sig_fatal(t, signr) \
-	(!T(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
-	 (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
 
 static int sig_ignored(struct task_struct *t, int sig)
 {
@@ -209,16 +90,28 @@ static inline int has_pending_signals(si
 
 #define PENDING(p,b) has_pending_signals(&(p)->signal, (b))
 
-fastcall void recalc_sigpending_tsk(struct task_struct *t)
+static int recalc_sigpending_tsk(struct task_struct *t)
 {
 	if (t->signal->group_stop_count > 0 ||
 	    (freezing(t)) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked) ||
-	    tracehook_induce_sigpending(t))
+	    tracehook_induce_sigpending(t)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
-	else
-		clear_tsk_thread_flag(t, TIF_SIGPENDING);
+		return 1;
+	}
+	clear_tsk_thread_flag(t, TIF_SIGPENDING);
+	return 0;
+}
+
+/*
+ * After recalculating TIF_SIGPENDING, we need to make sure the task wakes up.
+ * This is superfluous when called on current; the wakeup is a harmless no-op.
+ */
+void recalc_sigpending_and_wake(struct task_struct *t)
+{
+	if (recalc_sigpending_tsk(t))
+		signal_wake_up(t, 0);
 }
 
 void recalc_sigpending(void)
@@ -844,7 +737,7 @@ force_sig_info(int sig, struct siginfo *
 		action->sa.sa_handler = SIG_DFL;
 		if (blocked) {
 			sigdelset(&t->blocked, sig);
-			recalc_sigpending_tsk(t);
+			recalc_sigpending_and_wake(t);
 		}
 	}
 	ret = specific_send_sig_info(sig, info, t);
@@ -2231,7 +2124,7 @@ int do_sigaction(int sig, struct k_sigac
 			rm_from_queue_full(&mask, &t->signal->shared_pending);
 			do {
 				rm_from_queue_full(&mask, &t->pending);
-				recalc_sigpending_tsk(t);
+				recalc_sigpending_and_wake(t);
 				t = next_thread(t);
 			} while (t != current);
 		}
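
The new recalc_sigpending_and_wake() wrapper above exists because recalc_sigpending_tsk() may newly set TIF_SIGPENDING on some *other* task that is sitting in an interruptible sleep and will never look at the flag unless it is kicked.  Both converted callers follow the same shape (a sketch, siglock held and t possibly != current):

	spin_lock_irq(&t->sighand->siglock);
	sigdelset(&t->blocked, sig);		/* may make sig deliverable */
	recalc_sigpending_and_wake(t);		/* if that set TIF_SIGPENDING,
						 * signal_wake_up(t, 0) makes a
						 * sleeping t notice it */
	spin_unlock_irq(&t->sighand->siglock);
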
--- linux-2.6.18/kernel/sys_ni.c
+++ linux-2.6.18/kernel/sys_ni.c
@@ -112,6 +112,10 @@ cond_syscall(sys_vm86);
 cond_syscall(compat_sys_ipc);
 cond_syscall(compat_sys_sysctl);
 
+/* CONFIG_PTRACE syscalls */
+cond_syscall(sys_ptrace);
+cond_syscall(compat_sys_ptrace);
+
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
 cond_syscall(sys_pciconfig_write);
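
cond_syscall() gives each listed entry point a weak alias to sys_ni_syscall(), so a kernel built without CONFIG_PTRACE still links and the two ptrace syscalls simply return -ENOSYS; when kernel/ptrace.c is built in, its strong definitions win.  Roughly, from this tree's include/linux/syscalls.h:

	#define cond_syscall(x) asm(".weak\t" #x "\n\t.set\t" #x ",sys_ni_syscall")
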
--- linux-2.6.18/kernel/timer.c
+++ linux-2.6.18/kernel/timer.c
@@ -1454,7 +1454,7 @@ asmlinkage long sys_getpid(void)
 /*
  * Accessing ->parent is not SMP-safe, it could
  * change from under us. However, we can use a stale
- * value of ->real_parent under rcu_read_lock(), see
+ * value of ->parent under rcu_read_lock(), see
  * release_task()->call_rcu(delayed_put_task_struct).
  */
 asmlinkage long sys_getppid(void)
--- linux-2.6.18/kernel/utrace.c
+++ linux-2.6.18/kernel/utrace.c
@@ -1,3 +1,15 @@
+/*
+ * utrace infrastructure interface for debugging user processes
+ *
+ * Copyright (C) 2006, 2007 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
+ */
+
 #include <linux/utrace.h>
 #include <linux/tracehook.h>
 #include <linux/err.h>
@@ -8,6 +20,49 @@
 #include <asm/tracehook.h>
 
 
+#define UTRACE_DEBUG 1
+#ifdef UTRACE_DEBUG
+#define CHECK_INIT(p)	atomic_set(&(p)->check_dead, 1)
+#define CHECK_DEAD(p)	BUG_ON(!atomic_dec_and_test(&(p)->check_dead))
+#else
+#define CHECK_INIT(p)	do { } while (0)
+#define CHECK_DEAD(p)	do { } while (0)
+#endif
+
+/*
+ * Per-thread structure task_struct.utrace points to.
+ *
+ * The task itself never has to worry about this going away after
+ * some event is found set in task_struct.utrace_flags.
+ * Once created, this pointer is changed only when the task is quiescent
+ * (TASK_TRACED or TASK_STOPPED with the siglock held, or dead).
+ *
+ * For other parties, the pointer to this is protected by RCU and
+ * task_lock.  Since call_rcu is never used while the thread is alive and
+ * using this struct utrace, we can overlay the RCU data structure used
+ * only for a dead struct with some local state used only for a live utrace
+ * on an active thread.
+ */
+struct utrace
+{
+	union {
+		struct rcu_head dead;
+		struct {
+			struct task_struct *cloning;
+			struct utrace_signal *signal;
+		} live;
+		struct {
+			unsigned long flags;
+		} exit;
+	} u;
+
+	struct list_head engines;
+	spinlock_t lock;
+#ifdef UTRACE_DEBUG
+	atomic_t check_dead;
+#endif
+};
+
 static struct kmem_cache *utrace_cachep;
 static struct kmem_cache *utrace_engine_cachep;
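
The union overlay in struct utrace above is safe because the three members are never live at once: u.live is touched only by the running task itself, u.exit.flags only once the task is on its way out, and u.dead only after the struct is unreachable except through RCU.  The RCU free path then has the usual container_of() shape; utrace_free itself is untouched by this interdiff, but assume roughly:

	static void utrace_free(struct rcu_head *rhead)
	{
		struct utrace *utrace =
			container_of(rhead, struct utrace, u.dead);
		kmem_cache_free(utrace_cachep, utrace);
	}
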
 
@@ -36,8 +91,9 @@ subsys_initcall(utrace_init);
 static struct utrace *
 utrace_first_engine(struct task_struct *target,
 		    struct utrace_attached_engine *engine)
+	__acquires(utrace->lock)
 {
-	struct utrace *utrace, *ret;
+	struct utrace *utrace;
 
 	/*
 	 * If this is a newborn thread and we are not the creator,
@@ -45,12 +101,13 @@ utrace_first_engine(struct task_struct *
 	 * to attach.  The PF_STARTING flag is cleared after its
 	 * report_clone hook has had a chance to run.
 	 */
-	if ((target->flags & PF_STARTING)
-	    && (current->utrace == NULL
-		|| current->utrace->u.live.cloning != target)) {
-		yield();
-		return (signal_pending(current)
-			? ERR_PTR(-ERESTARTNOINTR) : NULL);
+	if (target->flags & PF_STARTING) {
+		utrace = current->utrace;
+		if (utrace == NULL || utrace->u.live.cloning != target) {
+			yield();
+			return (signal_pending(current)
+				? ERR_PTR(-ERESTARTNOINTR) : NULL);
+		}
 	}
 
 	utrace = kmem_cache_alloc(utrace_cachep, GFP_KERNEL);
@@ -62,12 +119,13 @@ utrace_first_engine(struct task_struct *
 	INIT_LIST_HEAD(&utrace->engines);
 	list_add(&engine->entry, &utrace->engines);
 	spin_lock_init(&utrace->lock);
+	CHECK_INIT(utrace);
 
-	ret = utrace;
-	utrace_lock(utrace);
+	spin_lock(&utrace->lock);
 	task_lock(target);
 	if (likely(target->utrace == NULL)) {
 		rcu_assign_pointer(target->utrace, utrace);
+
 		/*
 		 * The task_lock protects us against another thread doing
 		 * the same thing.  We might still be racing against
@@ -80,28 +138,27 @@ utrace_first_engine(struct task_struct *
 		 * see our target->utrace pointer.
 		 */
 		smp_mb();
-		if (target->exit_state == EXIT_DEAD) {
-			/*
-			 * The target has already been through release_task.
-			 */
-			target->utrace = NULL;
-			goto cannot_attach;
+		if (likely(target->exit_state != EXIT_DEAD)) {
+			task_unlock(target);
+			return utrace;
 		}
-		task_unlock(target);
-	}
-	else {
+
 		/*
-		 * Another engine attached first, so there is a struct already.
-		 * A null return says to restart looking for the existing one.
+		 * The target has already been through release_task.
+		 * Our caller will restart and notice it's too late now.
 		 */
-	cannot_attach:
-		ret = NULL;
-		task_unlock(target);
-		utrace_unlock(utrace);
-		kmem_cache_free(utrace_cachep, utrace);
+		target->utrace = NULL;
 	}
 
-	return ret;
+	/*
+	 * Another engine attached first, so there is a struct already.
+	 * A null return says to restart looking for the existing one.
+	 */
+	task_unlock(target);
+	spin_unlock(&utrace->lock);
+	kmem_cache_free(utrace_cachep, utrace);
+
+	return NULL;
 }
 
 static void
@@ -116,8 +173,10 @@ utrace_free(struct rcu_head *rhead)
  */
 static void
 rcu_utrace_free(struct utrace *utrace)
+	__releases(utrace->lock)
 {
-	utrace_unlock(utrace);
+	CHECK_DEAD(utrace);
+	spin_unlock(&utrace->lock);
 	INIT_RCU_HEAD(&utrace->u.dead);
 	call_rcu(&utrace->u.dead, utrace_free);
 }
@@ -130,14 +189,24 @@ utrace_engine_free(struct rcu_head *rhea
 	kmem_cache_free(utrace_engine_cachep, engine);
 }
 
+static inline void
+rcu_engine_free(struct utrace_attached_engine *engine)
+{
+	CHECK_DEAD(engine);
+	call_rcu(&engine->rhead, utrace_engine_free);
+}
+
+
 /*
  * Remove the utrace pointer from the task, unless there is a pending
- * forced signal (or it's quiescent in utrace_get_signal).
+ * forced signal (or it's quiescent in utrace_get_signal).  We know it's
+ * quiescent now, and so are guaranteed it will have to take utrace->lock
+ * before it can set ->exit_state if it's not set now.
  */
 static inline void
 utrace_clear_tsk(struct task_struct *tsk, struct utrace *utrace)
 {
-	if (utrace->u.live.signal == NULL) {
+	if (tsk->exit_state || utrace->u.live.signal == NULL) {
 		task_lock(tsk);
 		if (likely(tsk->utrace != NULL)) {
 			rcu_assign_pointer(tsk->utrace, NULL);
@@ -159,10 +228,12 @@ remove_engine(struct utrace_attached_eng
 	list_del_rcu(&engine->entry);
 	if (list_empty(&utrace->engines))
 		utrace_clear_tsk(tsk, utrace);
-	call_rcu(&engine->rhead, utrace_engine_free);
+	rcu_engine_free(engine);
 }
 
 
+#define DEATH_EVENTS (UTRACE_EVENT(DEATH) | UTRACE_EVENT(QUIESCE))
+
 /*
  * Called with utrace locked, after remove_engine may have run.
  * Passed the flags from all remaining engines, i.e. zero if none
@@ -173,6 +244,7 @@ remove_engine(struct utrace_attached_eng
 static void
 check_dead_utrace(struct task_struct *tsk, struct utrace *utrace,
 		  unsigned long flags)
+	__releases(utrace->lock)
 {
 	long exit_state = 0;
 
@@ -235,14 +307,27 @@ check_dead_utrace(struct task_struct *ts
 			 * which will call release_task itself.
 			 */
 			read_unlock(&tasklist_lock);
+	}
 
+	/*
+	 * When it's in TASK_STOPPED state, do not set UTRACE_EVENT(JCTL).
+	 * That bit indicates utrace_report_jctl has not run yet, but it
+	 * may have.  Set UTRACE_ACTION_QUIESCE instead to be sure that
+	 * once it resumes it will recompute its flags in utrace_quiescent.
+	 */
+	if (((flags &~ tsk->utrace_flags) & UTRACE_EVENT(JCTL))
+	    && tsk->state == TASK_STOPPED) {
+		flags &= ~UTRACE_EVENT(JCTL);
+		flags |= UTRACE_ACTION_QUIESCE;
 	}
 
 	tsk->utrace_flags = flags;
 	if (flags)
-		utrace_unlock(utrace);
-	else
+		spin_unlock(&utrace->lock);
+	else {
+		BUG_ON(tsk->utrace == utrace);
 		rcu_utrace_free(utrace);
+	}
 
 	/*
 	 * Now we're finished updating the utrace state.
@@ -270,8 +355,6 @@ check_dead_utrace(struct task_struct *ts
 		release_task(tsk);
 }
 
-
-
 /*
  * Get the target thread to quiesce.  Return nonzero if it's already quiescent.
  * Return zero if it will report a QUIESCE event soon.
@@ -281,37 +364,83 @@ check_dead_utrace(struct task_struct *ts
 static int
 quiesce(struct task_struct *target, int interrupt)
 {
-	int quiescent;
+	int ret;
 
 	target->utrace_flags |= UTRACE_ACTION_QUIESCE;
 	read_barrier_depends();
 
-	quiescent = (target->exit_state
-		     || target->state & (TASK_TRACED | TASK_STOPPED));
+	if (target->exit_state)
+		goto dead;
+
+	/*
+	 * First a quick check without the siglock.  If it's in TASK_TRACED
+	 * or TASK_STOPPED already, we know it is going to go through
+	 * utrace_get_signal before it resumes.
+	 */
+	ret = 1;
+	switch (target->state) {
+	case TASK_TRACED:
+		break;
+
+	case TASK_STOPPED:
+		/*
+		 * If it will call utrace_report_jctl but has not gotten
+		 * through it yet, then don't consider it quiescent yet.
+		 * utrace_report_jctl will take target->utrace->lock and
+		 * clear UTRACE_EVENT(JCTL) once it finishes.  After that,
+		 * it is considered quiescent; when it wakes up, it will go
+		 * through utrace_get_signal before doing anything else.
+		 */
+		if (!(target->utrace_flags & UTRACE_EVENT(JCTL)))
+			break;
 
-	if (!quiescent) {
+	default:
+		/*
+		 * Now get the siglock and check again.
+		 */
 		spin_lock_irq(&target->sighand->siglock);
-		quiescent = (unlikely(target->exit_state)
-			     || unlikely(target->state
-					 & (TASK_TRACED | TASK_STOPPED)));
-		if (!quiescent) {
+		if (unlikely(target->exit_state)) {
+			spin_unlock_irq(&target->sighand->siglock);
+			goto dead;
+		}
+		switch (target->state) {
+		case TASK_TRACED:
+			break;
+
+		case TASK_STOPPED:
+			ret = !(target->utrace_flags & UTRACE_EVENT(JCTL));
+			break;
+
+		default:
+			/*
+			 * It is not stopped, so tell it to stop soon.
+			 */
+			ret = 0;
 			if (interrupt)
 				signal_wake_up(target, 0);
 			else {
 				set_tsk_thread_flag(target, TIF_SIGPENDING);
 				kick_process(target);
 			}
+			break;
 		}
 		spin_unlock_irq(&target->sighand->siglock);
 	}
 
-	return quiescent;
+	return ret;
+
+dead:
+	/*
+	 * On the exit path, it's only truly quiescent if it has
+	 * already been through utrace_report_death, or never will.
+	 */
+	return !(target->utrace_flags & DEATH_EVENTS);
 }
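
The contract for quiesce() callers, in sketch form (utrace->lock held): a nonzero return means the target is quiescent now and will not report, so the caller must make any QUIESCE callback itself; zero means a report is on the way.  utrace_set_flags below uses it exactly this way:

	if (flags & UTRACE_ACTION_QUIESCE) {
		int already = quiesce(target, 1); /* 1: wake via signal_wake_up */
		if (already && (flags & UTRACE_EVENT(QUIESCE)))
			report_quiesce_ourselves();	/* hypothetical stand-in
							 * for the report path */
	}
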
 
 
 static struct utrace_attached_engine *
 matching_engine(struct utrace *utrace, int flags,
-		const struct utrace_engine_ops *ops, unsigned long data)
+		const struct utrace_engine_ops *ops, void *data)
 {
 	struct utrace_attached_engine *engine;
 	list_for_each_entry_rcu(engine, &utrace->engines, entry) {
@@ -326,13 +455,25 @@ matching_engine(struct utrace *utrace, i
 	return ERR_PTR(-ENOENT);
 }
 
-/*
-  option to stop it?
-  option to match existing on ops, ops+data, return it; nocreate:lookup only
+
+/**
+ * utrace_attach - Attach new engine to a thread, or look up attached engines.
+ * @target: thread to attach to
+ * @flags: %UTRACE_ATTACH_* flags
+ * @ops: callback table for new engine
+ * @data: engine private data pointer
+ *
+ * The caller must ensure that the @target thread does not get freed,
+ * i.e. hold a ref or be its parent.
+ *
+ * If %UTRACE_ATTACH_CREATE is not specified, you only look up an existing
+ * engine already attached to the thread.  If %UTRACE_ATTACH_MATCH_* bits
+ * are set, only consider matching engines.  If %UTRACE_ATTACH_EXCLUSIVE is
+ * set, attempting to attach a second (matching) engine fails with -%EEXIST.
  */
 struct utrace_attached_engine *
 utrace_attach(struct task_struct *target, int flags,
-	     const struct utrace_engine_ops *ops, unsigned long data)
+	     const struct utrace_engine_ops *ops, void *data)
 {
 	struct utrace *utrace;
 	struct utrace_attached_engine *engine;
@@ -349,6 +490,7 @@ restart:
 		rcu_read_unlock();
 		return ERR_PTR(-ESRCH);
 	}
+
 	if (utrace == NULL) {
 		rcu_read_unlock();
 
@@ -359,67 +501,72 @@ restart:
 		if (unlikely(engine == NULL))
 			return ERR_PTR(-ENOMEM);
 		engine->flags = 0;
+		CHECK_INIT(engine);
 
-	first:
-		utrace = utrace_first_engine(target, engine);
-		if (IS_ERR(utrace) || unlikely(utrace == NULL)) {
-			kmem_cache_free(utrace_engine_cachep, engine);
-			if (unlikely(utrace == NULL)) /* Race condition.  */
-				goto restart;
-			return ERR_PTR(PTR_ERR(utrace));
-		}
+		goto first;
 	}
-	else {
-		if (!(flags & UTRACE_ATTACH_CREATE)) {
-			engine = matching_engine(utrace, flags, ops, data);
-			rcu_read_unlock();
-			return engine;
-		}
-		rcu_read_unlock();
 
-		engine = kmem_cache_alloc(utrace_engine_cachep, GFP_KERNEL);
-		if (unlikely(engine == NULL))
-			return ERR_PTR(-ENOMEM);
-		engine->flags = 0;
+	if (!(flags & UTRACE_ATTACH_CREATE)) {
+		engine = matching_engine(utrace, flags, ops, data);
+		rcu_read_unlock();
+		return engine;
+	}
+	rcu_read_unlock();
 
-		rcu_read_lock();
-		utrace = rcu_dereference(target->utrace);
-		if (unlikely(utrace == NULL)) { /* Race with detach.  */
-			rcu_read_unlock();
-			goto first;
-		}
-		utrace_lock(utrace);
+	engine = kmem_cache_alloc(utrace_engine_cachep, GFP_KERNEL);
+	if (unlikely(engine == NULL))
+		return ERR_PTR(-ENOMEM);
+	engine->flags = 0;
+	CHECK_INIT(engine);
 
-		if (flags & UTRACE_ATTACH_EXCLUSIVE) {
-			struct utrace_attached_engine *old;
-			old = matching_engine(utrace, flags, ops, data);
-			if (!IS_ERR(old)) {
-				utrace_unlock(utrace);
-				rcu_read_unlock();
-				kmem_cache_free(utrace_engine_cachep, engine);
-				return ERR_PTR(-EEXIST);
-			}
-		}
+	rcu_read_lock();
+	utrace = rcu_dereference(target->utrace);
+	if (unlikely(utrace == NULL)) { /* Race with detach.  */
+		rcu_read_unlock();
+		goto first;
+	}
+	spin_lock(&utrace->lock);
 
-		if (unlikely(rcu_dereference(target->utrace) != utrace)) {
-			/*
-			 * We lost a race with other CPUs doing a sequence
-			 * of detach and attach before we got in.
-			 */
-			utrace_unlock(utrace);
+	if (flags & UTRACE_ATTACH_EXCLUSIVE) {
+		struct utrace_attached_engine *old;
+		old = matching_engine(utrace, flags, ops, data);
+		if (!IS_ERR(old)) {
+			spin_unlock(&utrace->lock);
 			rcu_read_unlock();
 			kmem_cache_free(utrace_engine_cachep, engine);
-			goto restart;
+			return ERR_PTR(-EEXIST);
 		}
+	}
+
+	if (unlikely(rcu_dereference(target->utrace) != utrace)) {
+		/*
+		 * We lost a race with other CPUs doing a sequence
+		 * of detach and attach before we got in.
+		 */
+		spin_unlock(&utrace->lock);
 		rcu_read_unlock();
+		kmem_cache_free(utrace_engine_cachep, engine);
+		goto restart;
+	}
+	rcu_read_unlock();
 
-		list_add_tail_rcu(&engine->entry, &utrace->engines);
+	list_add_tail_rcu(&engine->entry, &utrace->engines);
+	goto finish;
+
+first:
+	utrace = utrace_first_engine(target, engine);
+	if (IS_ERR(utrace) || unlikely(utrace == NULL)) {
+		kmem_cache_free(utrace_engine_cachep, engine);
+		if (unlikely(utrace == NULL)) /* Race condition.  */
+			goto restart;
+		return ERR_PTR(PTR_ERR(utrace));
 	}
 
+finish:
 	engine->ops = ops;
 	engine->data = data;
 
-	utrace_unlock(utrace);
+	spin_unlock(&utrace->lock);
 
 	return engine;
 }
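
A hedged usage sketch of the attach API as documented above (my_ops and my_data are stand-ins for an engine's callback table and private pointer):

	struct utrace_attached_engine *engine;

	engine = utrace_attach(task, UTRACE_ATTACH_CREATE
			       | UTRACE_ATTACH_EXCLUSIVE
			       | UTRACE_ATTACH_MATCH_OPS,
			       &my_ops, my_data);
	if (IS_ERR(engine))
		return PTR_ERR(engine);	/* -EEXIST: already attached */
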
@@ -467,6 +614,16 @@ rescan_flags(struct utrace *utrace)
 #define DEAD_FLAGS_MASK	(UTRACE_EVENT(REAP) | UTRACE_ACTION_NOREAP)
 
 /*
+ * Flags bits in utrace->u.exit.flags word.  These are private
+ * communication among utrace_report_death, utrace_release_task,
+ * utrace_detach, and utrace_set_flags.
+ */
+#define	EXIT_FLAG_DEATH			1 /* utrace_report_death running */
+#define	EXIT_FLAG_DELAYED_GROUP_LEADER	2 /* utrace_delayed_group_leader ran */
+#define	EXIT_FLAG_REAP			4 /* release_task ran */
+
+
+/*
  * We may have been the one keeping the target thread quiescent.
  * Check if it should wake up now.
  * Called with utrace locked, and unlocks it on return.
@@ -476,6 +633,7 @@ rescan_flags(struct utrace *utrace)
 static void
 wake_quiescent(unsigned long old_flags,
 	       struct utrace *utrace, struct task_struct *target)
+	__releases(utrace->lock)
 {
 	unsigned long flags;
 
@@ -485,7 +643,7 @@ wake_quiescent(unsigned long old_flags,
 	 */
 	flags = rescan_flags(utrace);
 	if (target->exit_state) {
-		BUG_ON(utrace->u.exit.report_death);
+		BUG_ON(utrace->u.exit.flags & EXIT_FLAG_DEATH);
 		flags &= DEAD_FLAGS_MASK;
 	}
 	check_dead_utrace(target, utrace, flags);
@@ -517,7 +675,7 @@ wake_quiescent(unsigned long old_flags,
 			/*
 			 * Wake the task up.
 			 */
-			recalc_sigpending_tsk(target);
+			recalc_sigpending_and_wake(target);
 			wake_up_state(target, TASK_STOPPED | TASK_TRACED);
 			spin_unlock_irq(&target->sighand->siglock);
 		}
@@ -547,23 +705,25 @@ wake_quiescent(unsigned long old_flags,
 static struct utrace *
 get_utrace_lock_attached(struct task_struct *target,
 			 struct utrace_attached_engine *engine)
+	__acquires(utrace->lock)
 {
 	struct utrace *utrace;
 
 	rcu_read_lock();
 	utrace = rcu_dereference(target->utrace);
 	smp_rmb();
-	if (unlikely(target->exit_state == EXIT_DEAD)) {
+	if (unlikely(utrace == NULL)
+	    || unlikely(target->exit_state == EXIT_DEAD))
 		/*
-		 * Called after utrace_release_task might have started.
-		 * A call to this engine's report_reap callback might
-		 * already be in progress or engine might even have been
-		 * freed already.
+		 * If all engines detached already, utrace is clear.
+		 * Otherwise, we're called after utrace_release_task might
+		 * have started.  A call to this engine's report_reap
+		 * callback might already be in progress or engine might
+		 * even have been freed already.
 		 */
 		utrace = ERR_PTR(-ESRCH);
-	}
 	else {
-		utrace_lock(utrace);
+		spin_lock(&utrace->lock);
 		if (unlikely(rcu_dereference(target->utrace) != utrace)
 		    || unlikely(rcu_dereference(engine->ops)
 				== &dead_engine_ops)) {
@@ -571,7 +731,7 @@ get_utrace_lock_attached(struct task_str
 			 * By the time we got the utrace lock,
 			 * it had been reaped or detached already.
 			 */
-			utrace_unlock(utrace);
+			spin_unlock(&utrace->lock);
 			utrace = ERR_PTR(-ESRCH);
 		}
 	}
@@ -580,6 +740,26 @@ get_utrace_lock_attached(struct task_str
 	return utrace;
 }
 
+/**
+ * utrace_detach - Detach a tracing engine from a thread.
+ * @target: thread to detach from
+ * @engine: engine attached to @target
+ *
+ * After this, the engine data structure is no longer accessible, and the
+ * thread might be reaped.  The thread will start running again if it was
+ * being kept quiescent and no longer has any attached engines asserting
+ * %UTRACE_ACTION_QUIESCE.
+ *
+ * If the target thread is not already quiescent, then a callback to this
+ * engine might be in progress or about to start on another CPU.  If it's
+ * quiescent when utrace_detach() is called, then after successful return
+ * it's guaranteed that no more callbacks to the ops vector will be done.
+ * The only exception is %SIGKILL (and exec by another thread in the group),
+ * which breaks quiescence and can cause asynchronous %DEATH and/or %REAP
+ * callbacks even when %UTRACE_ACTION_QUIESCE is set.  In that event,
+ * utrace_detach() fails with -%ESRCH or -%EALREADY to indicate that the
+ * report_reap() or report_death() callbacks have begun or will run imminently.
+ */
 int
 utrace_detach(struct task_struct *target,
 	      struct utrace_attached_engine *engine)
@@ -591,16 +771,25 @@ utrace_detach(struct task_struct *target
 	if (unlikely(IS_ERR(utrace)))
 		return PTR_ERR(utrace);
 
+	/*
+	 * On the exit path, DEATH and QUIESCE event bits are set only
+	 * before utrace_report_death has taken the lock.  At that point,
+	 * the death report will come soon, so disallow detach until it's
+	 * done.  This prevents us from racing with it detaching itself.
+	 */
 	if (target->exit_state
-	    && unlikely(utrace->u.exit.reap || utrace->u.exit.report_death)) {
+	    && (unlikely(target->utrace_flags & DEATH_EVENTS)
+		|| unlikely(utrace->u.exit.flags & (EXIT_FLAG_DEATH
+						    | EXIT_FLAG_REAP)))) {
 		/*
 		 * We have already started the death report, or
 		 * even entered release_task.  We can't prevent
 		 * the report_death and report_reap callbacks,
 		 * so tell the caller they will happen.
 		 */
-		int ret = utrace->u.exit.reap ? -ESRCH : -EALREADY;
-		utrace_unlock(utrace);
+		int ret = ((utrace->u.exit.flags & EXIT_FLAG_REAP)
+			   ? -ESRCH : -EALREADY);
+		spin_unlock(&utrace->lock);
 		return ret;
 	}
 
@@ -613,7 +802,7 @@ utrace_detach(struct task_struct *target
 		wake_quiescent(flags, utrace, target);
 	}
 	else
-		utrace_unlock(utrace);
+		spin_unlock(&utrace->lock);
 
 
 	return 0;
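
So a caller of utrace_detach() has three outcomes to handle; in sketch form:

	switch (utrace_detach(task, engine)) {
	case 0:			/* detached; no further callbacks */
		break;
	case -EALREADY:		/* report_death running or imminent;
				 * finish detaching from that callback */
		break;
	case -ESRCH:		/* reaped or already detached; report_reap
				 * may run and engine may already be freed */
		break;
	}
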
@@ -627,6 +816,7 @@ EXPORT_SYMBOL_GPL(utrace_detach);
  */
 static void
 utrace_reap(struct task_struct *target, struct utrace *utrace)
+	__releases(utrace->lock)
 {
 	struct utrace_attached_engine *engine, *next;
 	const struct utrace_engine_ops *ops;
@@ -641,14 +831,14 @@ restart:
 		if (engine->flags & UTRACE_EVENT(REAP)) {
 			ops = rcu_dereference(engine->ops);
 			if (ops != &dead_engine_ops) {
-				utrace_unlock(utrace);
+				spin_unlock(&utrace->lock);
 				(*ops->report_reap)(engine, target);
-				call_rcu(&engine->rhead, utrace_engine_free);
-				utrace_lock(utrace);
+				rcu_engine_free(engine);
+				spin_lock(&utrace->lock);
 				goto restart;
 			}
 		}
-		call_rcu(&engine->rhead, utrace_engine_free);
+		rcu_engine_free(engine);
 	}
 
 	rcu_utrace_free(utrace);
@@ -663,18 +853,27 @@ utrace_release_task(struct task_struct *
 	struct utrace *utrace;
 
 	task_lock(target);
-	utrace = target->utrace;
+	utrace = rcu_dereference(target->utrace);
 	rcu_assign_pointer(target->utrace, NULL);
 	task_unlock(target);
 
 	if (unlikely(utrace == NULL))
 		return;
 
-	utrace_lock(utrace);
-	utrace->u.exit.reap = 1;
+	spin_lock(&utrace->lock);
+	/*
+	 * If the list is empty, utrace is already on its way to be freed.
+	 * We raced with detach and we won the task_lock race but lost the
+	 * utrace->lock race.  All we have to do is let RCU run.
+	 */
+	if (!unlikely(list_empty(&utrace->engines))) {
+		utrace->u.exit.flags |= EXIT_FLAG_REAP;
+
+		if (!(target->utrace_flags & DEATH_EVENTS)) {
+			utrace_reap(target, utrace); /* Unlocks and frees.  */
+			return;
+		}
 
-	if (target->utrace_flags & (UTRACE_EVENT(DEATH)
-				    | UTRACE_EVENT(QUIESCE)))
 		/*
 		 * The target will do some final callbacks but hasn't
 		 * finished them yet.  We know because it clears these
@@ -683,12 +882,37 @@ utrace_release_task(struct task_struct *
 		 * delay the REAP report and the teardown until after the
 		 * target finishes its death reports.
 		 */
-		utrace_unlock(utrace);
-	else
-		utrace_reap(target, utrace); /* Unlocks and frees.  */
+	}
+	spin_unlock(&utrace->lock);
 }
 
-
+/**
+ * utrace_set_flags - Change the flags for a tracing engine.
+ * @target: thread to affect
+ * @engine: attached engine to affect
+ * @flags: new flags value
+ *
+ * This resets the event flags and the action state flags.
+ * If %UTRACE_ACTION_QUIESCE and %UTRACE_EVENT(%QUIESCE) are set,
+ * this will cause a report_quiesce() callback soon, maybe immediately.
+ * If %UTRACE_ACTION_QUIESCE was set before and is no longer set by
+ * any engine, this will wake the thread up.
+ *
+ * This fails with -%EALREADY and does nothing if you try to clear
+ * %UTRACE_EVENT(%DEATH) when the report_death() callback may already have
+ * begun, if you try to clear %UTRACE_EVENT(%REAP) when the report_reap()
+ * callback may already have begun, if you try to newly set
+ * %UTRACE_ACTION_NOREAP when the target may already have sent its
+ * parent %SIGCHLD, or if you try to newly set %UTRACE_EVENT(%DEATH),
+ * %UTRACE_EVENT(%QUIESCE), or %UTRACE_ACTION_QUIESCE, when the target is
+ * already dead or dying.  It can fail with -%ESRCH when the target has
+ * already been detached (including forcible detach on reaping).  If
+ * the target was quiescent before the call, then after a successful
+ * call, no event callbacks not requested in the new flags will be
+ * made, and a report_quiesce() callback will always be made if
+ * requested.  These rules provide for coherent synchronization based
+ * on quiescence, even when %SIGKILL is breaking quiescence.
+ */
 int
 utrace_set_flags(struct task_struct *target,
 		 struct utrace_attached_engine *engine,
@@ -720,14 +944,12 @@ restart:			/* See below. */
 	if (target->exit_state
 	    && (((flags &~ old_flags) & (UTRACE_ACTION_QUIESCE
 					 | UTRACE_ACTION_NOREAP
-					 | UTRACE_EVENT(DEATH)
-					 | UTRACE_EVENT(QUIESCE)))
-		|| (utrace->u.exit.report_death
-		    && ((old_flags &~ flags) & (UTRACE_EVENT(DEATH) |
-						UTRACE_EVENT(QUIESCE))))
-		|| (utrace->u.exit.reap
+					 | DEATH_EVENTS))
+		|| ((utrace->u.exit.flags & EXIT_FLAG_DEATH)
+		    && ((old_flags &~ flags) & DEATH_EVENTS))
+		|| ((utrace->u.exit.flags & EXIT_FLAG_REAP)
 		    && ((old_flags &~ flags) & UTRACE_EVENT(REAP))))) {
-		utrace_unlock(utrace);
+		spin_unlock(&utrace->lock);
 		return ret;
 	}
 
@@ -742,12 +964,11 @@ restart:			/* See below. */
 	 * that it won't.
 	 */
 	if ((flags &~ old_utrace_flags) & (UTRACE_ACTION_NOREAP
-					   | UTRACE_EVENT(DEATH)
-					   | UTRACE_EVENT(QUIESCE))) {
+					   | DEATH_EVENTS)) {
 		read_lock(&tasklist_lock);
 		if (unlikely(target->exit_state)) {
 			read_unlock(&tasklist_lock);
-			utrace_unlock(utrace);
+			spin_unlock(&utrace->lock);
 			return ret;
 		}
 		target->utrace_flags |= flags;
@@ -763,14 +984,14 @@ restart:			/* See below. */
 		if (flags & UTRACE_ACTION_QUIESCE) {
 			report = (quiesce(target, 1)
 				  && (flags & UTRACE_EVENT(QUIESCE)));
-			utrace_unlock(utrace);
+			spin_unlock(&utrace->lock);
 		}
 		else
-			wake_quiescent(old_flags, utrace, target);
+			goto wake;
 	}
 	else if (((old_flags &~ flags) & UTRACE_ACTION_NOREAP)
 		 && target->exit_state)
-			wake_quiescent(old_flags, utrace, target);
+		goto wake;
 	else {
 		/*
 		 * If we're asking for single-stepping or syscall tracing,
@@ -784,7 +1005,7 @@ restart:			/* See below. */
 			& (UTRACE_ACTION_SINGLESTEP | UTRACE_ACTION_BLOCKSTEP
 			   | UTRACE_EVENT_SYSCALL)))
 			quiesce(target, 0);
-		utrace_unlock(utrace);
+		spin_unlock(&utrace->lock);
 	}
 
 	if (report) {	/* Already quiescent, won't report itself.  */
@@ -797,10 +1018,10 @@ restart:			/* See below. */
 			 * again.  Since we released the lock, they
 			 * could have changed asynchronously just now.
 			 * We must refetch the current flags to change
-			 * the UTRACE_ACTION_STATE_MASK bits.  If the
+			 * the %UTRACE_ACTION_STATE_MASK bits.  If the
 			 * target thread started dying, then there is
 			 * nothing we can do--but that failure is due
-			 * to the report_quiesce callback after the
+			 * to the report_quiesce() callback after the
 			 * original utrace_set_flags has already
 			 * succeeded, so we don't want to return
 			 * failure here (hence leave ret = 0).
@@ -816,6 +1037,21 @@ restart:			/* See below. */
 	}
 
 	return ret;
+
+wake:
+	/*
+	 * It's quiescent now and needs to wake up.
+	 *
+	 * On the exit path, it's only truly quiescent if it has
+	 * already been through utrace_report_death, or never will.
+	 */
+	if (unlikely(target->exit_state)
+	    && unlikely(target->utrace_flags & DEATH_EVENTS))
+		spin_unlock(&utrace->lock);
+	else
+		wake_quiescent(old_flags, utrace, target);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(utrace_set_flags);
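
For example, asking a live thread to quiesce and report, per the kerneldoc above (a sketch; the -EALREADY/-ESRCH cases follow the rules listed there):

	int err = utrace_set_flags(task, engine,
				   engine->flags
				   | UTRACE_ACTION_QUIESCE
				   | UTRACE_EVENT(QUIESCE));
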
 
@@ -841,7 +1077,7 @@ update_action(struct task_struct *tsk, s
 		if (! ARCH_HAS_BLOCK_STEP)
 #endif
 			WARN_ON(ret & UTRACE_ACTION_BLOCKSTEP);
-		utrace_lock(utrace);
+		spin_lock(&utrace->lock);
 		/*
 		 * If we're changing something other than just QUIESCE,
 		 * make sure we pass through utrace_quiescent before
@@ -855,7 +1091,7 @@ update_action(struct task_struct *tsk, s
 		engine->flags &= ~UTRACE_ACTION_STATE_MASK;
 		engine->flags |= ret & UTRACE_ACTION_STATE_MASK;
 		tsk->utrace_flags |= engine->flags;
-		utrace_unlock(utrace);
+		spin_unlock(&utrace->lock);
 	}
 	else
 		ret |= engine->flags & UTRACE_ACTION_STATE_MASK;
@@ -875,6 +1111,7 @@ update_action(struct task_struct *tsk, s
 static u32
 remove_detached(struct task_struct *tsk, struct utrace *utrace,
 		u32 action, unsigned long mask)
+	__releases(utrace->lock)
 {
 	struct utrace_attached_engine *engine, *next;
 	unsigned long flags = 0;
@@ -902,9 +1139,11 @@ check_detach(struct task_struct *tsk, u3
 		 * This must be current to be sure it's not possibly
 		 * getting into utrace_report_death.
 		 */
+		struct utrace *utrace;
 		BUG_ON(tsk != current);
-		utrace_lock(tsk->utrace);
-		action = remove_detached(tsk, tsk->utrace, action, ~0UL);
+		utrace = tsk->utrace;
+		spin_lock(&utrace->lock);
+		action = remove_detached(tsk, utrace, action, ~0UL);
 	}
 	return action;
 }
@@ -1008,6 +1247,18 @@ utrace_report_jctl(int what)
 		spin_unlock_irq(&tsk->sighand->siglock);
 	}
 
+	/*
+	 * We clear the UTRACE_EVENT(JCTL) bit to indicate that we are now
+	 * in a truly quiescent TASK_STOPPED state.  After this, we can be
+	 * detached by another thread.  Setting UTRACE_ACTION_QUIESCE
+	 * ensures that we will go through utrace_quiescent and recompute
+	 * flags after we resume.
+	 */
+	spin_lock(&utrace->lock);
+	tsk->utrace_flags &= ~UTRACE_EVENT(JCTL);
+	tsk->utrace_flags |= UTRACE_ACTION_QUIESCE;
+	spin_unlock(&utrace->lock);
+
 	return action & UTRACE_JCTL_NOSIGCHLD;
 }
 
@@ -1119,7 +1370,7 @@ restart:
 		 */
 		unsigned long flags;
 		utrace = rcu_dereference(tsk->utrace);
-		utrace_lock(utrace);
+		spin_lock(&utrace->lock);
 		flags = rescan_flags(utrace);
 		if (flags == 0)
 			utrace_clear_tsk(tsk, utrace);
@@ -1176,6 +1427,68 @@ utrace_report_exit(long *exit_code)
 }
 
 /*
+ * Called with utrace locked, unlocks it on return.  Unconditionally
+ * recompute the flags after report_death is finished.  This may notice
+ * that there are no engines left and free the utrace struct.
+ */
+static void
+finish_report_death(struct task_struct *tsk, struct utrace *utrace)
+	__releases(utrace->lock)
+{
+	/*
+	 * After we unlock (possibly inside utrace_reap for callbacks) with
+	 * this flag clear, competing utrace_detach/utrace_set_flags calls
+	 * know that we've finished our callbacks and any detach bookkeeping.
+	 */
+	utrace->u.exit.flags &= EXIT_FLAG_REAP;
+
+	if (utrace->u.exit.flags & EXIT_FLAG_REAP)
+		/*
+		 * utrace_release_task was already called in parallel.
+		 * We must complete its work now.
+		 */
+		utrace_reap(tsk, utrace);
+	else
+		/*
+		 * Clear out any detached engines and in the process
+		 * recompute the flags.  Mask off event bits we can't
+		 * see any more.  This tells utrace_release_task we
+		 * have already finished, if it comes along later.
+		 * Note this all happens on the already-locked utrace,
+		 * which might already be removed from the task.
+		 */
+		remove_detached(tsk, utrace, 0, DEAD_FLAGS_MASK);
+}
+
+/*
+ * Called with utrace locked, unlocks it on return.
+ * EXIT_FLAG_DELAYED_GROUP_LEADER is set.
+ * Do second report_death callbacks for engines using NOREAP.
+ */
+static void
+report_delayed_group_leader(struct task_struct *tsk, struct utrace *utrace)
+	__releases(utrace->lock)
+{
+	struct list_head *pos, *next;
+	struct utrace_attached_engine *engine;
+	u32 action;
+
+	utrace->u.exit.flags |= EXIT_FLAG_DEATH;
+	spin_unlock(&utrace->lock);
+
+	/* XXX must change for sharing */
+	list_for_each_safe_rcu(pos, next, &utrace->engines) {
+		engine = list_entry(pos, struct utrace_attached_engine, entry);
+#define NOREAP_DEATH (UTRACE_EVENT(DEATH) | UTRACE_ACTION_NOREAP)
+		if ((engine->flags & NOREAP_DEATH) == NOREAP_DEATH)
+			REPORT(report_death);
+	}
+
+	spin_lock(&utrace->lock);
+	finish_report_death(tsk, utrace);
+}
+
+/*
  * Called iff UTRACE_EVENT(DEATH) or UTRACE_ACTION_QUIESCE flag is set.
  *
  * It is always possible that we are racing with utrace_release_task here,
@@ -1204,10 +1517,11 @@ utrace_report_death(struct task_struct *
 	 * flag and know that we are not yet fully quiescent for purposes
 	 * of detach bookkeeping.
 	 */
-	utrace_lock(utrace);
-	BUG_ON(utrace->u.exit.report_death);
-	utrace->u.exit.report_death = 1;
-	utrace_unlock(utrace);
+	spin_lock(&utrace->lock);
+	BUG_ON(utrace->u.exit.flags & EXIT_FLAG_DEATH);
+	utrace->u.exit.flags &= EXIT_FLAG_REAP;
+	utrace->u.exit.flags |= EXIT_FLAG_DEATH;
+	spin_unlock(&utrace->lock);
 
 	/* XXX must change for sharing */
 	list_for_each_safe_rcu(pos, next, &utrace->engines) {
@@ -1218,36 +1532,49 @@ utrace_report_death(struct task_struct *
 			REPORT(report_quiesce);
 	}
 
-	/*
-	 * Unconditionally lock and recompute the flags.
-	 * This may notice that there are no engines left and
-	 * free the utrace struct.
-	 */
-	utrace_lock(utrace);
+	spin_lock(&utrace->lock);
+	if (unlikely(utrace->u.exit.flags & EXIT_FLAG_DELAYED_GROUP_LEADER))
+		/*
+		 * Another thread's release_task came along and
+		 * removed the delayed_group_leader condition,
+		 * but after we might have started callbacks.
+		 * Do the second report_death callback right now.
+		 */
+		report_delayed_group_leader(tsk, utrace);
+	else
+		finish_report_death(tsk, utrace);
+}
+
+/*
+ * We're called from release_task when delayed_group_leader(tsk) was
+ * previously true and is no longer true, and NOREAP was set.
+ * This means no parent notifications have happened for this zombie.
+ */
+void
+utrace_report_delayed_group_leader(struct task_struct *tsk)
+{
+	struct utrace *utrace;
+
+	rcu_read_lock();
+	utrace = rcu_dereference(tsk->utrace);
+	if (unlikely(utrace == NULL)) {
+		rcu_read_unlock();
+		return;
+	}
+	spin_lock(&utrace->lock);
+	rcu_read_unlock();
+
+	utrace->u.exit.flags |= EXIT_FLAG_DELAYED_GROUP_LEADER;
 
 	/*
-	 * After we unlock (possibly inside utrace_reap for callbacks) with
-	 * this flag clear, competing utrace_detach/utrace_set_flags calls
-	 * know that we've finished our callbacks and any detach bookkeeping.
+	 * If utrace_report_death is still running, or release_task has
+	 * started already, there is nothing more to do now.
 	 */
-	utrace->u.exit.report_death = 0;
-
-	if (utrace->u.exit.reap)
-		/*
-		 * utrace_release_task was already called in parallel.
-		 * We must complete its work now.
-		 */
-		utrace_reap(tsk, utrace);
+	if ((utrace->u.exit.flags & (EXIT_FLAG_DEATH | EXIT_FLAG_REAP))
+	    || !likely(tsk->utrace_flags & UTRACE_ACTION_NOREAP))
+		spin_unlock(&utrace->lock);
 	else
-		/*
-		 * Clear out any detached engines and in the process
-		 * recompute the flags.  Mask off event bits we can't
-		 * see any more.  This tells utrace_release_task we
-		 * have already finished, if it comes along later.
-		 * Note this all happens on the already-locked utrace,
-		 * which might already be removed from the task.
-		 */
-		remove_detached(tsk, utrace, 0, DEAD_FLAGS_MASK);
+		report_delayed_group_leader(tsk, utrace);
 }
 
 /*
@@ -1311,6 +1638,7 @@ utrace_report_syscall(struct pt_regs *re
 	struct list_head *pos, *next;
 	struct utrace_attached_engine *engine;
 	unsigned long action, ev;
+	int killed;
 
 /*
   XXX pass syscall # to engine hook directly, let it return inhibit-action
@@ -1334,12 +1662,32 @@ utrace_report_syscall(struct pt_regs *re
 			break;
 	}
 	action = check_detach(tsk, action);
-	if (unlikely(check_quiescent(tsk, action)) && !is_exit)
+	killed = check_quiescent(tsk, action);
+
+	if (!is_exit) {
+		if (unlikely(killed))
+			/*
+			 * We are continuing despite QUIESCE because of a
+			 * SIGKILL.  Don't let the system call actually
+			 * proceed.
+			 */
+			tracehook_abort_syscall(regs);
+
 		/*
-		 * We are continuing despite QUIESCE because of a SIGKILL.
-		 * Don't let the system call actually proceed.
+		 * Clear TIF_SIGPENDING if it no longer needs to be set.
+		 * It may have been set as part of quiescence, and won't
+		 * ever have been cleared by another thread.  For other
+		 * reports, we can just leave it set and will go through
+		 * utrace_get_signal to reset things.  But here we are
+		 * about to enter a syscall, which might bail out with an
+		 * -ERESTART* error if it's set now.
 		 */
-		tracehook_abort_syscall(regs);
+		if (signal_pending(tsk)) {
+			spin_lock_irq(&tsk->sighand->siglock);
+			recalc_sigpending();
+			spin_unlock_irq(&tsk->sighand->siglock);
+		}
+	}
 }
 
 
@@ -1355,44 +1703,6 @@ struct utrace_signal
 };
 
 
-// XXX copied from signal.c
-#ifdef SIGEMT
-#define M_SIGEMT	M(SIGEMT)
-#else
-#define M_SIGEMT	0
-#endif
-
-#if SIGRTMIN > BITS_PER_LONG
-#define M(sig) (1ULL << ((sig)-1))
-#else
-#define M(sig) (1UL << ((sig)-1))
-#endif
-#define T(sig, mask) (M(sig) & (mask))
-
-#define SIG_KERNEL_ONLY_MASK (\
-	M(SIGKILL)   |  M(SIGSTOP)                                   )
-
-#define SIG_KERNEL_STOP_MASK (\
-	M(SIGSTOP)   |  M(SIGTSTP)   |  M(SIGTTIN)   |  M(SIGTTOU)   )
-
-#define SIG_KERNEL_COREDUMP_MASK (\
-        M(SIGQUIT)   |  M(SIGILL)    |  M(SIGTRAP)   |  M(SIGABRT)   | \
-        M(SIGFPE)    |  M(SIGSEGV)   |  M(SIGBUS)    |  M(SIGSYS)    | \
-        M(SIGXCPU)   |  M(SIGXFSZ)   |  M_SIGEMT                     )
-
-#define SIG_KERNEL_IGNORE_MASK (\
-        M(SIGCONT)   |  M(SIGCHLD)   |  M(SIGWINCH)  |  M(SIGURG)    )
-
-#define sig_kernel_only(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_ONLY_MASK))
-#define sig_kernel_coredump(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_COREDUMP_MASK))
-#define sig_kernel_ignore(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_IGNORE_MASK))
-#define sig_kernel_stop(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_STOP_MASK))
-
-
 /*
  * Call each interested tracing engine's report_signal callback.
  */
@@ -1442,13 +1752,53 @@ utrace_signal_handler_singlestep(struct 
 int
 utrace_get_signal(struct task_struct *tsk, struct pt_regs *regs,
 		  siginfo_t *info, struct k_sigaction *return_ka)
+	__releases(tsk->sighand->siglock)
+	__acquires(tsk->sighand->siglock)
 {
-	struct utrace *utrace = tsk->utrace;
+	struct utrace *utrace;
 	struct utrace_signal signal = { info, return_ka, 0 };
 	struct k_sigaction *ka;
 	unsigned long action, event;
 
 	/*
+	 * We could have been considered quiescent while we were in
+	 * TASK_STOPPED, and detached asynchronously.  If we woke up
+	 * and checked tsk->utrace_flags before that was finished,
+	 * we might be here with utrace already removed or in the
+	 * middle of being removed.
+	 */
+	rcu_read_lock();
+	utrace = rcu_dereference(tsk->utrace);
+	if (unlikely(utrace == NULL)) {
+		rcu_read_unlock();
+		return 0;
+	}
+	if (!(tsk->utrace_flags & UTRACE_EVENT(JCTL))) {
+		/*
+		 * It's possible we might have just been in TASK_STOPPED
+		 * and subject to the aforementioned race.
+		 *
+		 * RCU makes it safe to get the utrace->lock even if it's
+		 * being freed.  Once we have that lock, either an external
+		 * detach has finished and this struct has been freed, or
+		 * else we know we are excluding any other detach attempt.
+		 * Since we are no longer in TASK_STOPPED now, all we
+		 * needed the lock for was to order any quiesce() call after us.
+		 */
+		spin_unlock_irq(&tsk->sighand->siglock);
+		spin_lock(&utrace->lock);
+		if (unlikely(tsk->utrace != utrace)) {
+			spin_unlock(&utrace->lock);
+			rcu_read_unlock();
+			cond_resched();
+			return -1;
+		}
+		spin_unlock(&utrace->lock);
+		spin_lock_irq(&tsk->sighand->siglock);
+	}
+	rcu_read_unlock();
+
+	/*
 	 * If a signal was injected previously, it could not use our
 	 * stack space directly.  It had to allocate a data structure,
 	 * which we can now copy out of and free.
@@ -1614,7 +1964,7 @@ utrace_get_signal(struct task_struct *ts
 		else
 			spin_lock_irq(&tsk->sighand->siglock);
 
-		recalc_sigpending_tsk(tsk);
+		recalc_sigpending();
 	}
 
 	/*
@@ -1670,12 +2020,19 @@ utrace_get_signal(struct task_struct *ts
 }
 
 
-/*
- * Cause a specified signal delivery in the target thread,
- * which must be quiescent.  The action has UTRACE_SIGNAL_* bits
- * as returned from a report_signal callback.  If ka is non-null,
- * it gives the sigaction to follow for UTRACE_SIGNAL_DELIVER;
- * otherwise, the installed sigaction at the time of delivery is used.
+/**
+ * utrace_inject_signal - Cause a specified signal delivery.
+ * @target: thread to process the signal
+ * @engine: engine attached to @target
+ * @action: signal disposition
+ * @info: signal number and details
+ * @ka: sigaction() settings to follow when @action is %UTRACE_SIGNAL_DELIVER
+ *
+ * The @target thread must be quiescent (or the current thread).
+ * The @action has %UTRACE_SIGNAL_* bits as returned from a report_signal()
+ * callback.  If @ka is non-null, it gives the sigaction to follow for
+ * %UTRACE_SIGNAL_DELIVER; otherwise, the installed sigaction at the time
+ * of delivery is used.
  */
 int
 utrace_inject_signal(struct task_struct *target,
@@ -1764,13 +2121,27 @@ utrace_inject_signal(struct task_struct 
 		}
 	}
 
-	utrace_unlock(utrace);
+	spin_unlock(&utrace->lock);
 
 	return ret;
 }
 EXPORT_SYMBOL_GPL(utrace_inject_signal);
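
Typical use from a quiescent context, in sketch form (SIGUSR1 is an arbitrary example):

	siginfo_t info;
	int err;

	memset(&info, 0, sizeof(info));
	info.si_signo = SIGUSR1;
	info.si_code = SI_KERNEL;

	err = utrace_inject_signal(target, engine,
				   UTRACE_SIGNAL_DELIVER, &info, NULL);
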
 
-
+/**
+ * utrace_regset - Prepare to access a thread's machine state.
+ * @target: thread to examine
+ * @engine: engine attached to @target
+ * @view: &struct utrace_regset_view providing machine state description
+ * @which: index into regsets provided by @view
+ *
+ * Prepare to access the thread's machine state;
+ * see &struct utrace_regset in <linux/tracehook.h>.
+ * The given thread must be quiescent (or the current thread).  When this
+ * returns, the &struct utrace_regset calls may be used to interrogate or
+ * change the thread's state.  Do not cache the returned pointer when the
+ * thread can resume.  You must call utrace_regset() to ensure that
+ * context switching has completed and consistent state is available.
+ */
 const struct utrace_regset *
 utrace_regset(struct task_struct *target,
 	      struct utrace_attached_engine *engine,
@@ -1786,18 +2157,28 @@ utrace_regset(struct task_struct *target
 }
 EXPORT_SYMBOL_GPL(utrace_regset);
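
In sketch form, reading a quiescent thread's general registers might look like the following; that regset 0 is the native general-purpose set, and the exact ->get signature, are assumptions about the arch code:

	const struct utrace_regset *rs;
	elf_gregset_t gregs;		/* assumes regset 0 is gregs-sized */
	int err;

	rs = utrace_regset(task, engine, utrace_native_view(task), 0);
	err = (*rs->get)(task, rs, 0, sizeof(gregs), &gregs, NULL);
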
 
-
 /*
- * Return the task_struct for the task using ptrace on this one, or NULL.
- * Must be called with rcu_read_lock held to keep the returned struct alive.
+ * This is declared in linux/tracehook.h and defined in machine-dependent
+ * code.  We put the export here to ensure no machine forgets it.
+ */
+EXPORT_SYMBOL_GPL(utrace_native_view);
+
+
+/**
+ * utrace_tracer_task - Find the task using ptrace on this one.
+ * @target: task in question
+ *
+ * Return the &struct task_struct for the task using ptrace on this one,
+ * or %NULL.  Must be called with rcu_read_lock() held to keep the returned
+ * struct alive.
  *
- * At exec time, this may be called with task_lock(p) still held from when
- * tracehook_unsafe_exec was just called.  In that case it must give
- * results consistent with those unsafe_exec results, i.e. non-NULL if
- * any LSM_UNSAFE_PTRACE_* bits were set.
+ * At exec time, this may be called with task_lock() still held from when
+ * tracehook_unsafe_exec() was just called.  In that case it must give
+ * results consistent with those unsafe_exec() results, i.e. non-%NULL if
+ * any %LSM_UNSAFE_PTRACE_* bits were set.
  *
  * The value is also used to display after "TracerPid:" in /proc/PID/status,
- * where it is called with only rcu_read_lock held.
+ * where it is called with only rcu_read_lock() held.
  */
 struct task_struct *
 utrace_tracer_task(struct task_struct *target)
@@ -1857,7 +2238,7 @@ utrace_allow_access_process_vm(struct ta
 
 /*
  * Called on the current task to return LSM_UNSAFE_* bits implied by tracing.
- * Called with task_lock held.
+ * Called with task_lock() held.
  */
 int
 utrace_unsafe_exec(struct task_struct *tsk)
--- linux-2.6.18/mm/nommu.c
+++ linux-2.6.18/mm/nommu.c
@@ -20,7 +20,7 @@
 #include <linux/pagemap.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
-#include <linux/ptrace.h>
+#include <linux/tracehook.h>
 #include <linux/blkdev.h>
 #include <linux/backing-dev.h>
 #include <linux/mount.h>
@@ -570,7 +570,7 @@ static unsigned long determine_vm_flags(
 	 * it's being traced - otherwise breakpoints set in it may interfere
 	 * with another untraced process
 	 */
-	if ((flags & MAP_PRIVATE) && (current->ptrace & PT_PTRACED))
+	if ((flags & MAP_PRIVATE) && tracehook_expect_breakpoints(current))
 		vm_flags &= ~VM_MAYSHARE;
 
 	return vm_flags;
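
For reference, tracehook_expect_breakpoints() is the <linux/tracehook.h> predicate standing in for the bare PT_PTRACED test; in the utrace version it reduces to a utrace_flags check, roughly:

	static inline int tracehook_expect_breakpoints(struct task_struct *task)
	{
		return (task->utrace_flags & UTRACE_EVENT(SIGNAL_CORE)) != 0;
	}
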