<?xml version="1.0" encoding="ANSI_X3.4-1968" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968" /><title>Tear-down Races</title><meta name="generator" content="DocBook XSL Stylesheets V1.75.2" /><link rel="home" href="index.html" title="The utrace User Debugging Infrastructure" /><link rel="up" href="ch01.html" title="Chapter&#160;1.&#160;utrace concepts" /><link rel="prev" href="ch01s03.html" title="Stopping Safely" /><link rel="next" href="ch02.html" title="Chapter&#160;2.&#160;utrace core API" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Tear-down Races</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ch01s03.html">Prev</a>&#160;</td><th width="60%" align="center">Chapter&#160;1.&#160;utrace concepts</th><td width="20%" align="right">&#160;<a accesskey="n" href="ch02.html">Next</a></td></tr></table><hr /></div><div class="sect1" title="Tear-down Races"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="teardown"></a>Tear-down Races</h2></div></div></div><div class="toc"><dl><dt><span class="sect2"><a href="ch01s04.html#SIGKILL">Primacy of <code class="constant">SIGKILL</code></a></span></dt><dt><span class="sect2"><a href="ch01s04.html#reap">Final callbacks</a></span></dt><dt><span class="sect2"><a href="ch01s04.html#refcount">Engine and task pointers</a></span></dt><dt><span class="sect2"><a href="ch01s04.html#reap-after-death">
      Serialization of <code class="constant">DEATH</code> and <code class="constant">REAP</code>
    </a></span></dt><dt><span class="sect2"><a href="ch01s04.html#interlock">Interlock with final callbacks</a></span></dt><dt><span class="sect2"><a href="ch01s04.html#barrier">Using <code class="function">utrace_barrier</code></a></span></dt></dl></div><div class="sect2" title="Primacy of SIGKILL"><div class="titlepage"><div><div><h3 class="title"><a id="SIGKILL"></a>Primacy of <code class="constant">SIGKILL</code></h3></div></div></div><p>
    Ordinarily synchronization issues for tracing engines are kept fairly
    straightforward by using <code class="constant">UTRACE_STOP</code>.  You ask a
    thread to stop, and then once it makes the
    <code class="function">report_quiesce</code> callback it cannot do anything else
    that would result in another callback, until you let it with a
    <code class="function">utrace_control</code> call.  This simple arrangement
    avoids complex and error-prone code in each one of a tracing engine's
    event callbacks to keep them serialized with the engine's other
    operations done on that thread from another thread of control.
    However, giving tracing engines complete power to keep a traced thread
    stuck in place runs afoul of a more important kind of simplicity that
    the kernel overall guarantees: nothing can prevent or delay
    <code class="constant">SIGKILL</code> from making a thread die and release its
    resources.  To preserve this important property,
    <code class="constant">SIGKILL</code> is a special case: it can break
    <code class="constant">UTRACE_STOP</code> as nothing else normally can.  This
    includes both explicit <code class="constant">SIGKILL</code> signals and the
    implicit <code class="constant">SIGKILL</code> sent to each other thread in the
    same thread group by a thread doing an exec, or processing a fatal
    signal, or making an <code class="function">exit_group</code> system call.  A
    tracing engine attached to a thread can prevent that thread from
    beginning an exit or exec, or from dying by a signal other than
    <code class="constant">SIGKILL</code>.  But once the operation begins, no
    tracing engine can prevent or delay the death of the other threads in
    the same thread group.
  </p></div><div class="sect2" title="Final callbacks"><div class="titlepage"><div><div><h3 class="title"><a id="reap"></a>Final callbacks</h3></div></div></div><p>
    The <code class="function">report_reap</code> callback is always the final event
    in the life cycle of a traced thread.  Tracing engines can use this as
    the trigger to clean up their own data structures.  The
    <code class="function">report_death</code> callback is always the penultimate
    event a tracing engine might see; it's seen unless the thread was
    already in the midst of dying when the engine attached.  Many tracing
    engines will have no interest in when a parent reaps a dead process,
    and nothing they want to do with a zombie thread once it dies; for
    them, the <code class="function">report_death</code> callback is the natural
    place to clean up data structures and detach.  To facilitate writing
    such engines robustly, given the asynchrony of
    <code class="constant">SIGKILL</code>, and without error-prone manual
    implementation of synchronization schemes, the
    <span class="application">utrace</span> infrastructure provides some special
    guarantees about the <code class="function">report_death</code> and
    <code class="function">report_reap</code> callbacks.  It still takes some care
    to be sure your tracing engine is robust to tear-down races, but these
    rules make it reasonably straightforward and concise to handle a lot of
    corner cases correctly.
  </p></div><div class="sect2" title="Engine and task pointers"><div class="titlepage"><div><div><h3 class="title"><a id="refcount"></a>Engine and task pointers</h3></div></div></div><p>
    The first sort of guarantee concerns the core data structures
    themselves.  <span class="structname">struct utrace_engine</span> is
    a reference-counted data structure.  While you hold a reference, an
    engine pointer will always stay valid so that you can safely pass it to
    any <span class="application">utrace</span> call.  Each call to
    <code class="function">utrace_attach_task</code> or
    <code class="function">utrace_attach_pid</code> returns an engine pointer with a
    reference belonging to the caller.  You own that reference until you
    drop it using <code class="function">utrace_engine_put</code>.  There is an
    implicit reference on the engine while it is attached.  So if you drop
    your only reference, and then use
    <code class="function">utrace_attach_task</code> without
    <code class="constant">UTRACE_ATTACH_CREATE</code> to look up that same engine,
    you will get the same pointer with a new reference to replace the one
    you dropped, just like calling <code class="function">utrace_engine_get</code>.
    When an engine has been detached, either explicitly with
    <code class="constant">UTRACE_DETACH</code> or implicitly after
    <code class="function">report_reap</code>, then any references you hold are all
    that keep the old engine pointer alive.
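  </p><p>
    The reference rules above can be illustrated with a small userspace
    model.  This is only a sketch under stated assumptions:
    <span class="structname">struct model_engine</span> and the
    <code class="function">model_*</code> functions are hypothetical stand-ins
    invented for illustration, not part of the
    <span class="application">utrace</span> API, and the real implementation
    may track its references differently.
  </p><pre class="programlisting"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the rules described above;
 * none of these names exist in the kernel. */
struct model_engine {
    int refs;       /* caller references plus the implicit one */
    bool attached;  /* true while the implicit reference is held */
    bool freed;     /* set when the last reference goes away */
};

/* Models attaching with UTRACE_ATTACH_CREATE: one implicit reference
 * for being attached, plus one that belongs to the caller. */
static void model_attach(struct model_engine *e)
{
    e->refs = 2;
    e->attached = true;
    e->freed = false;
}

/* Models utrace_engine_put(): drop one reference. */
static void model_put(struct model_engine *e)
{
    assert(e->refs > 0);
    if (--e->refs == 0)
        e->freed = true;
}

/* Models a lookup without UTRACE_ATTACH_CREATE: an attached engine is
 * found and the caller gets a new reference.  Returning NULL for
 * "not found" is a simplification made by this model. */
static struct model_engine *model_lookup(struct model_engine *e)
{
    if (!e->attached)
        return NULL;
    e->refs++;
    return e;
}

/* Models detaching (UTRACE_DETACH, or implicitly after report_reap):
 * the implicit reference is dropped exactly once. */
static void model_detach(struct model_engine *e)
{
    if (e->attached) {
        e->attached = false;
        model_put(e);
    }
}
```
]]></pre><p>
    In this model, dropping the caller's only reference leaves the engine
    alive because of the implicit reference; after
    <code class="function">model_detach</code>, only references still held by
    callers keep it from being freed.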
  </p><p>
    There is nothing a kernel module can do to keep a <span class="structname">struct
    task_struct</span> alive outside of
    <code class="function">rcu_read_lock</code>.  When the task dies and is reaped
    by its parent (or itself), that structure can be freed so that any
    dangling pointers you have stored become invalid.
    <span class="application">utrace</span> will not prevent this, but it can
    help you detect it safely.  By definition, a task that has been reaped
    has had all its engines detached.  All
    <span class="application">utrace</span> calls can be safely called on a
    detached engine if the caller holds a reference on that engine pointer,
    even if the task pointer passed in the call is invalid.  All calls
    return <code class="constant">-ESRCH</code> for a detached engine, which tells
    you that the task pointer you passed could be invalid now.  Since
    <code class="function">utrace_control</code> and
    <code class="function">utrace_set_events</code> do not block, you can call them
    inside an <code class="function">rcu_read_lock</code> section; if they
    do not return <code class="constant">-ESRCH</code>, the task pointer is sure
    to remain valid until <code class="function">rcu_read_unlock</code>.  The
    infrastructure never holds task references of its own.  Though neither
    <code class="function">rcu_read_lock</code> nor any other lock is held while
    making a callback, it's always guaranteed that the <span class="structname">struct
    task_struct</span> and the <span class="structname">struct
    utrace_engine</span> passed as arguments remain valid
    until the callback function returns.
  </p><p>
    The common means for safely holding task pointers that is available to
    kernel modules is to use <span class="structname">struct pid</span>, which
    permits <code class="function">put_pid</code> from kernel modules.  When using
    that, the calls <code class="function">utrace_attach_pid</code>,
    <code class="function">utrace_control_pid</code>,
    <code class="function">utrace_set_events_pid</code>, and
    <code class="function">utrace_barrier_pid</code> are available.
  </p></div><div class="sect2" title="Serialization of DEATH and REAP"><div class="titlepage"><div><div><h3 class="title"><a id="reap-after-death"></a>
      Serialization of <code class="constant">DEATH</code> and <code class="constant">REAP</code>
    </h3></div></div></div><p>
      The second guarantee is the serialization of
      <code class="constant">DEATH</code> and <code class="constant">REAP</code> event
      callbacks for a given thread.  The actual reaping by the parent
      (<code class="function">release_task</code> call) can occur simultaneously
      while the thread is still doing the final steps of dying, including
      the <code class="function">report_death</code> callback.  If a tracing engine
      has requested both <code class="constant">DEATH</code> and
      <code class="constant">REAP</code> event reports, it's guaranteed that the
      <code class="function">report_reap</code> callback will not be made until
      after the <code class="function">report_death</code> callback has returned.
      If the <code class="function">report_death</code> callback itself detaches
      from the thread, then the <code class="function">report_reap</code> callback
      will never be made.  Thus it is safe for a
      <code class="function">report_death</code> callback to clean up data
      structures and detach.
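    </p><p>
      The ordering guarantee can be pictured with a userspace model of the
      two final callbacks.  This is a sketch only: the
      <code class="function">model_*</code> names and
      <span class="structname">struct model_engine</span> are hypothetical,
      and <code class="function">model_teardown</code> merely imitates the
      ordering that the text attributes to the infrastructure.
    </p><pre class="programlisting"><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical return values for the model's death callback. */
enum model_action { MODEL_RESUME, MODEL_DETACH };

struct model_engine {
    void *data;      /* engine-private data owned by the tracing engine */
    bool attached;
    int reap_calls;  /* how many times the reap callback ran */
};

/* A death callback that cleans up and asks to be detached. */
static enum model_action model_report_death(struct model_engine *e)
{
    free(e->data);
    e->data = NULL;
    return MODEL_DETACH;
}

/* A reap callback; it would clean up if the death callback had not. */
static void model_report_reap(struct model_engine *e)
{
    e->reap_calls++;
    free(e->data);   /* free(NULL) is a no-op */
    e->data = NULL;
}

/* Imitates the serialization described above: the reap callback is
 * made only after the death callback has returned, and not at all if
 * the death callback detached the engine. */
static void model_teardown(struct model_engine *e)
{
    assert(e->attached);
    if (model_report_death(e) == MODEL_DETACH)
        e->attached = false;
    if (e->attached) {
        model_report_reap(e);
        e->attached = false;  /* implicit detach after reap */
    }
}
```
]]></pre><p>
      Because the death callback in this model detaches, the reap callback
      never runs, so the private data is freed exactly once.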
    </p></div><div class="sect2" title="Interlock with final callbacks"><div class="titlepage"><div><div><h3 class="title"><a id="interlock"></a>Interlock with final callbacks</h3></div></div></div><p>
    The final sort of guarantee is that a tracing engine will know for sure
    whether or not the <code class="function">report_death</code> and/or
    <code class="function">report_reap</code> callbacks will be made for a certain
    thread.  These tear-down races are disambiguated by the error return
    values of <code class="function">utrace_set_events</code> and
    <code class="function">utrace_control</code>.  Normally
    <code class="function">utrace_control</code> called with
    <code class="constant">UTRACE_DETACH</code> returns zero, and this means that no
    more callbacks will be made.  If the thread is in the midst of dying,
    it returns <code class="constant">-EALREADY</code> to indicate that the
    <code class="function">report_death</code> callback may already be in progress;
    when you get this error, you know that any cleanup your
    <code class="function">report_death</code> callback does is about to happen or
    has just happened.  Note that if the <code class="function">report_death</code>
    callback does not detach, the engine remains attached until the thread
    gets reaped.  If the thread is in the midst of being reaped,
    <code class="function">utrace_control</code> returns <code class="constant">-ESRCH</code>
    to indicate that the <code class="function">report_reap</code> callback may
    already be in progress; this means the engine is implicitly detached
    when the callback completes.  This makes it possible for a tracing
    engine that has decided asynchronously to detach from a thread to
    safely clean up its data structures, knowing that no
    <code class="function">report_death</code> or <code class="function">report_reap</code>
    callback will try to do the same.  <code class="function">utrace_control</code>
    with <code class="constant">UTRACE_DETACH</code> also returns
    <code class="constant">-ESRCH</code> when the <span class="structname">struct
    utrace_engine</span> has already been detached, but is
    still a valid pointer because of its reference count.  A tracing engine
    can use this to safely synchronize its own independent multiple threads
    of control with each other and with its event callbacks that detach.
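  </p><p>
    One way to act on these return values is to turn the result of the
    detach call into a decision about who cleans up.  The following is a
    minimal sketch: <code class="function">classify_detach_result</code> and
    <span class="structname">enum cleanup_owner</span> are hypothetical
    names, and the mapping assumes an engine whose
    <code class="function">report_death</code> and
    <code class="function">report_reap</code> callbacks do their own cleanup.
  </p><pre class="programlisting"><![CDATA[
```c
#include <assert.h>
#include <errno.h>

/* Who is responsible for freeing the engine's private data. */
enum cleanup_owner {
    CLEANUP_BY_CALLER,       /* 0: no more callbacks will be made */
    CLEANUP_BY_REPORT_DEATH, /* -EALREADY: report_death may be running */
    CLEANUP_BY_FINAL_DETACH, /* -ESRCH: report_reap may be running, or
                                the engine was already detached */
    CLEANUP_UNDECIDED        /* any other result is outside this sketch */
};

/* Interpret the value returned by a detach request, as described in
 * the paragraph above. */
static enum cleanup_owner classify_detach_result(int ret)
{
    assert(ret <= 0);   /* zero or a negative error code */

    switch (ret) {
    case 0:
        return CLEANUP_BY_CALLER;
    case -EALREADY:
        return CLEANUP_BY_REPORT_DEATH;
    case -ESRCH:
        return CLEANUP_BY_FINAL_DETACH;
    default:
        return CLEANUP_UNDECIDED;
    }
}
```
]]></pre><p>
    A thread of control that gets <code class="constant">CLEANUP_BY_CALLER</code>
    may free its data structures immediately; for the other results it
    leaves the cleanup to whichever callback or other thread of control
    already owns it.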
  </p><p>
    In the same vein, <code class="function">utrace_set_events</code> normally
    returns zero; if the target thread was stopped before the call, then
    after a successful call only the event callbacks requested in the new
    flags will be made.  It fails with <code class="constant">-EALREADY</code> if
    you try to clear <code class="constant">UTRACE_EVENT(DEATH)</code> when the
    <code class="function">report_death</code> callback may already have begun, if
    you try to clear <code class="constant">UTRACE_EVENT(REAP)</code> when the
    <code class="function">report_reap</code> callback may already have begun, or if
    you try to newly set <code class="constant">UTRACE_EVENT(DEATH)</code> or
    <code class="constant">UTRACE_EVENT(QUIESCE)</code> when the target is already
    dead or dying.  Like <code class="function">utrace_control</code>, it returns
    <code class="constant">-ESRCH</code> when the thread has already been detached
    (including forcible detach on reaping).  This lets the tracing engine
    know for sure which event callbacks it will or won't see after
    <code class="function">utrace_set_events</code> has returned.  By checking for
    errors, it can know whether to clean up its data structures immediately
    or to let its callbacks do the work.
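  </p><p>
    The failure rules just listed can be restated as a small userspace
    model.  It is a sketch only: the
    <code class="constant">MODEL_EVENT_*</code> bits,
    <span class="structname">struct model_target</span>, and
    <code class="function">model_set_events</code> are hypothetical, and the
    real event masks built with <code class="constant">UTRACE_EVENT</code>
    have different values.
  </p><pre class="programlisting"><![CDATA[
```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical event bits used by this model only. */
#define MODEL_EVENT_QUIESCE 0x1UL
#define MODEL_EVENT_DEATH   0x2UL
#define MODEL_EVENT_REAP    0x4UL

struct model_target {
    bool detached;     /* engine already detached (e.g. by reaping) */
    bool dying;        /* target is already dead or dying */
    bool death_begun;  /* report_death may already have begun */
    bool reap_begun;   /* report_reap may already have begun */
    unsigned long events;
};

/* Applies the rules from the paragraph above; the mask is changed
 * only when the call succeeds. */
static int model_set_events(struct model_target *t, unsigned long new_events)
{
    unsigned long cleared, added;

    assert(t);
    cleared = t->events & ~new_events;
    added = new_events & ~t->events;

    if (t->detached)
        return -ESRCH;
    if ((cleared & MODEL_EVENT_DEATH) && t->death_begun)
        return -EALREADY;
    if ((cleared & MODEL_EVENT_REAP) && t->reap_begun)
        return -EALREADY;
    if ((added & (MODEL_EVENT_DEATH | MODEL_EVENT_QUIESCE)) && t->dying)
        return -EALREADY;

    t->events = new_events;
    return 0;
}
```
]]></pre><p>
    Checking the detached case first is a choice made by this model; the
    text does not say which error wins when several conditions hold at
    once.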
  </p></div><div class="sect2" title="Using utrace_barrier"><div class="titlepage"><div><div><h3 class="title"><a id="barrier"></a>Using <code class="function">utrace_barrier</code></h3></div></div></div><p>
    When a thread is safely stopped, calling
    <code class="function">utrace_control</code> with <code class="constant">UTRACE_DETACH</code>
    or calling <code class="function">utrace_set_events</code> to disable some events
    ensures synchronously that your engine won't get any more of the callbacks
    that have been disabled (none at all when detaching).  But these can also
    be used while the thread is not stopped, when it might be simultaneously
    making a callback to your engine.  For this situation, these calls return
    <code class="constant">-EINPROGRESS</code> when it's possible a callback is in
    progress.  If you are not prepared to have your old callbacks still run,
    then you can synchronize to be sure all the old callbacks are finished,
    using <code class="function">utrace_barrier</code>.  This is necessary if the
    kernel module containing your callback code is going to be unloaded.
  </p><p>
    After using <code class="constant">UTRACE_DETACH</code> once, further calls to
    <code class="function">utrace_control</code> with the same engine pointer will
    return <code class="constant">-ESRCH</code>.  In contrast, after getting
    <code class="constant">-EINPROGRESS</code> from
    <code class="function">utrace_set_events</code>, you can call
    <code class="function">utrace_set_events</code> again later; if it returns zero,
    you know the old callbacks have finished.
  </p><p>
    Unlike all other calls, <code class="function">utrace_barrier</code> (and
    <code class="function">utrace_barrier_pid</code>) will accept any engine pointer you
    hold a reference on, even if <code class="constant">UTRACE_DETACH</code> has already
    been used.  After any <code class="function">utrace_control</code> or
    <code class="function">utrace_set_events</code> call (these do not block), you can
    call <code class="function">utrace_barrier</code> to block until callbacks have
    finished.  This returns <code class="constant">-ESRCH</code> only if the engine is
    completely detached (finished all callbacks).  Otherwise it waits
    until the thread is definitely not in the midst of a callback to this
    engine and then returns zero, but can return
    <code class="constant">-ERESTARTSYS</code> if its wait is interrupted.
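  </p><p>
    The detach-then-wait sequence described in this section can be sketched
    with scripted stand-ins for the real calls.  Everything named
    <code class="function">mock_*</code> or <code class="function">model_*</code>
    below is hypothetical; the mocks only replay preset return values so
    that the control flow can be exercised outside the kernel.
  </p><pre class="programlisting"><![CDATA[
```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal error code that userspace headers do not expose;
 * defined here only so that the model compiles outside the kernel. */
#define MODEL_ERESTARTSYS 512

/* Scripted results standing in for the real calls (hypothetical). */
static int mock_detach_result;
static int mock_barrier_result;
static int mock_barrier_calls;

/* Stand-in for utrace_control() called with UTRACE_DETACH. */
static int mock_detach(void)
{
    return mock_detach_result;
}

/* Stand-in for utrace_barrier(). */
static int mock_barrier(void)
{
    mock_barrier_calls++;
    return mock_barrier_result;
}

/* Returns 0 once no callback of this engine can still be running.
 * Other results are passed back for the caller to interpret. */
static int model_detach_sync(void)
{
    int ret = mock_detach();

    assert(ret <= 0);
    if (ret != -EINPROGRESS)
        return ret;      /* 0, or an error such as -EALREADY or -ESRCH */

    ret = mock_barrier();
    if (ret == -ESRCH)
        return 0;        /* completely detached: callbacks finished */
    return ret;          /* 0, or -MODEL_ERESTARTSYS if interrupted */
}
```
]]></pre><p>
    A caller that gets the interrupted result back has not yet established
    that the old callbacks are finished and would need to wait again before,
    for example, unloading the module that contains them.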
  </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ch01s03.html">Prev</a>&#160;</td><td width="20%" align="center"><a accesskey="u" href="ch01.html">Up</a></td><td width="40%" align="right">&#160;<a accesskey="n" href="ch02.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Stopping Safely&#160;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&#160;Chapter&#160;2.&#160;utrace core API</td></tr></table></div></body></html>