From: Roland McGrath <roland@redhat.com> Date: Thu, 27 Mar 2008 14:30:38 -0700 Subject: [utrace] race crash fixes Message-id: 20080327213038.C0FF226FA1C@magilla.localdomain O-Subject: [RHEL5.2 PATCH] utrace race crash fixes [RHBZ#245429, RHBZ#245735, RHBZ#312961, RHBZ#428693] Bugzilla: 428693 245429 245735 312961 This relates to or resolves RHBZ#245429, RHBZ#245735, RHBZ#312961, RHBZ#428693. It fixes some races with attach/detach. I've had this in Fedora utrace for some time, but failed to post it for RHEL5. commit cd9f54da4fc448ce0363320b4389e9985438c3b3 Author: Roland McGrath <roland@redhat.com> Date: Thu Jan 24 22:47:33 2008 -0800 Fix race with utrace->u.live.signal check vs reset. Signed-off-by: Roland McGrath <roland@redhat.com> diff --git a/kernel/ptrace.c b/kernel/ptrace.c index aed2c5c..2c363b8 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -213,6 +213,14 @@ ptrace_done(struct ptrace_state *state) CHECK_DEAD(state); BUG_ON(state->rcu.func); BUG_ON(state->rcu.next); + /* + * We clear @task here, while we are sure that the task_struct is + * still live, because our caller has to permit its release. + * By RCU rules, this means that inside rcu_read_lock(), + * rcu_dereference(state->task) will always produce either + * a pointer that is being kept alive by RCU, or NULL. + */ + rcu_assign_pointer(state->task, NULL); call_rcu(&state->rcu, ptrace_state_free); } @@ -559,6 +567,7 @@ ptrace_exit(struct task_struct *tsk) do { struct ptrace_state *state; + struct task_struct *p; int error; START_CHECK; @@ -568,7 +577,17 @@ ptrace_exit(struct task_struct *tsk) restart = 0; list_for_each_safe_rcu(pos, n, &tsk->ptracees) { state = list_entry(pos, struct ptrace_state, entry); - error = utrace_detach(state->task, state->engine); + /* + * Here rcu_read_lock() keeps live any task_struct + * that state->task still points to. If state->task + * was cleared already, then state itself is on the + * way to be freed by RCU and we are just seeing a + * stale list element here. + */ + p = rcu_dereference(state->task); + if (unlikely(p == NULL)) + continue; + error = utrace_detach(p, state->engine); BUG_ON(state->parent != tsk); if (likely(error == 0)) { ptrace_state_unlink(state); @@ -581,7 +600,6 @@ ptrace_exit(struct task_struct *tsk) * Since wait_task_inactive might yield, * we must go out of rcu_read_lock and restart. */ - struct task_struct *p = state->task; get_task_struct(p); rcu_read_unlock(); wait_task_inactive(p); @@ -1291,7 +1309,9 @@ ptrace_do_wait(struct task_struct *tsk, rcu_read_lock(); list_for_each_entry_rcu(state, &tsk->ptracees, entry) { - p = state->task; + p = rcu_dereference(state->task); + if (unlikely(p == NULL)) + continue; if (pid > 0) { if (p->pid != pid) diff --git a/kernel/utrace.c b/kernel/utrace.c index adece3a..5083288 100644 --- a/kernel/utrace.c +++ b/kernel/utrace.c @@ -1,7 +1,7 @@ /* * utrace infrastructure interface for debugging user processes * - * Copyright (C) 2006, 2007 Red Hat, Inc. All rights reserved. + * Copyright (C) 2006, 2007, 2008 Red Hat, Inc. All rights reserved. * * This copyrighted material is made available to anyone wishing to use, * modify, copy, or redistribute it subject to the terms and conditions @@ -1385,6 +1385,7 @@ restart: */ if (action & UTRACE_ACTION_QUIESCE) { int killed; + int stop; if (signal != NULL) { BUG_ON(utrace->u.live.signal != NULL); @@ -1401,7 +1402,8 @@ restart: * Never stop when there is a SIGKILL bringing us down. */ killed = sigkill_pending(tsk); - if (!killed && (tsk->utrace_flags & UTRACE_ACTION_QUIESCE)) { + stop = !killed && (tsk->utrace_flags & UTRACE_ACTION_QUIESCE); + if (likely(stop)) { set_current_state(TASK_TRACED); /* * If there is a group stop in progress, @@ -1423,10 +1425,32 @@ restart: */ BUG_ON(tsk->utrace != utrace); BUG_ON(utrace->u.live.signal != signal); + + if (likely(stop)) { + /* + * Synchronize with any utrace_detach + * that might be in progress. We were + * just now quiescent in TASK_TRACED, + * and it expected us to stay there. + * But SIGKILL might have broken that. + * Taking this lock here serializes its + * work so that if it had the lock and + * thought we were still in TASK_TRACED, + * we block until it has finished + * looking at utrace. A utrace_detach + * that gets the lock after we release + * it here will not think we are + * quiescent at all, since we are in + * TASK_RUNNING state now. + */ + spin_lock(&utrace->lock); + spin_unlock(&utrace->lock); + } + utrace->u.live.signal = NULL; } - if (killed) /* Game over, man! */ + if (unlikely(killed)) /* Game over, man! */ return 1; /*