<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Transitioning from IPython.kernel to IPython.parallel — IPython 2.3.0 documentation</title> <link rel="stylesheet" href="../_static/default.css" type="text/css" /> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../', VERSION: '2.3.0', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../_static/jquery.js"></script> <script type="text/javascript" src="../_static/underscore.js"></script> <script type="text/javascript" src="../_static/doctools.js"></script> <link rel="top" title="IPython 2.3.0 documentation" href="../index.html" /> <link rel="up" title="Using IPython for parallel computing" href="index.html" /> <link rel="next" title="Configuration and customization" href="../config/index.html" /> <link rel="prev" title="Details of Parallel Computing with IPython" href="parallel_details.html" /> </head> <body> <div style="background-color: white; text-align: left; padding: 10px 10px 15px 15px"> <a href="http://ipython.org/"><img src="../_static/logo.png" border="0" alt="IPython Documentation"/></a> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="../config/index.html" title="Configuration and customization" accesskey="N">next</a> |</li> <li class="right" > <a href="parallel_details.html" title="Details of Parallel Computing with IPython" accesskey="P">previous</a> |</li> <li><a href="http://ipython.org">home</a>| 
</li> <li><a href="../search.html">search</a>| </li> <li><a href="../index.html">documentation </a> »</li> <li><a href="index.html" accesskey="U">Using IPython for parallel computing</a> »</li> </ul> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="../index.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">Transitioning from IPython.kernel to IPython.parallel</a><ul> <li><a class="reference internal" href="#processes">Processes</a></li> <li><a class="reference internal" href="#creating-a-client">Creating a Client</a></li> <li><a class="reference internal" href="#apply">Apply</a></li> <li><a class="reference internal" href="#multiengine-to-directview">MultiEngine to DirectView</a></li> <li><a class="reference internal" href="#task-to-loadbalancedview">Task to LoadBalancedView</a><ul> <li><a class="reference internal" href="#pendingresults-to-asyncresults">PendingResults to AsyncResults</a></li> </ul> </li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="parallel_details.html" title="previous chapter">Details of Parallel Computing with IPython</a></p> <h4>Next topic</h4> <p class="topless"><a href="../config/index.html" title="next chapter">Configuration and customization</a></p> <h3>This Page</h3> <ul class="this-page-menu"> <li><a href="../_sources/parallel/parallel_transition.txt" rel="nofollow">Show Source</a></li> </ul> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="../search.html" method="get"> <input type="text" name="q" /> <input type="submit" value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. 
</p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="transitioning-from-ipython-kernel-to-ipython-parallel"> <span id="parallel-transition"></span><h1>Transitioning from IPython.kernel to IPython.parallel<a class="headerlink" href="#transitioning-from-ipython-kernel-to-ipython-parallel" title="Permalink to this headline">¶</a></h1> <p>We have rewritten our parallel computing tools to use <a class="reference external" href="http://zeromq.org">0MQ</a> and <a class="reference external" href="https://github.com/facebook/tornado">Tornado</a>. The redesign has resulted in dramatically improved performance, as well as (we think) an improved interface for executing code remotely. This document is meant to help users of IPython.kernel transition their code to IPython.parallel.</p> <div class="section" id="processes"> <h2>Processes<a class="headerlink" href="#processes" title="Permalink to this headline">¶</a></h2> <p>The process model for the new parallel code is very similar to that of IPython.kernel. There are still a Controller, Engines, and Clients. However, the Controller is now split into multiple processes, and can even be split across multiple machines.
There is still a single ipcontroller script for starting all of the controller processes.</p> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">TODO: fill this out after config system is updated</p> </div> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <p class="last">Detailed <a class="reference internal" href="parallel_process.html#parallel-process"><em>Parallel Process</em></a> doc for configuring and launching IPython processes.</p> </div> </div> <div class="section" id="creating-a-client"> <h2>Creating a Client<a class="headerlink" href="#creating-a-client" title="Permalink to this headline">¶</a></h2> <p>Creating a client with default settings has not changed much, though the extended options have. One significant change is that there are no longer multiple Client classes to represent the various execution models. There is just one low-level Client object for connecting to the cluster, and View objects created from that Client provide the different interfaces for execution.</p> <p>To create a new client and set up the default direct and load-balanced views:</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="go"># old</span>
<span class="gp">In [1]: </span><span class="kn">from</span> <span class="nn">IPython.kernel</span> <span class="kn">import</span> <span class="n">client</span> <span class="k">as</span> <span class="n">kclient</span>
<span class="gp">In [2]: </span><span class="n">mec</span> <span class="o">=</span> <span class="n">kclient</span><span class="o">.</span><span class="n">MultiEngineClient</span><span class="p">()</span>
<span class="gp">In [3]: </span><span class="n">tc</span> <span class="o">=</span> <span class="n">kclient</span><span class="o">.</span><span class="n">TaskClient</span><span class="p">()</span>

<span class="go"># new</span>
<span class="gp">In [1]: </span><span class="kn">from</span> <span class="nn">IPython.parallel</span> <span class="kn">import</span> <span class="n">Client</span>
<span class="gp">In [2]: </span><span class="n">rc</span> <span class="o">=</span> <span class="n">Client</span><span class="p">()</span>
<span class="gp">In [3]: </span><span class="n">dview</span> <span class="o">=</span> <span class="n">rc</span><span class="p">[:]</span>
<span class="gp">In [4]: </span><span class="n">lbview</span> <span class="o">=</span> <span class="n">rc</span><span class="o">.</span><span class="n">load_balanced_view</span><span class="p">()</span>
</pre></div> </div> </div> <div class="section" id="apply"> <h2>Apply<a class="headerlink" href="#apply" title="Permalink to this headline">¶</a></h2> <p>The main change to the API is the addition of the <tt class="xref py py-meth docutils literal"><span class="pre">apply()</span></tt> method to the View objects. Calling <cite>view.apply(f, *args, **kwargs)</cite> calls <cite>f(*args, **kwargs)</cite> remotely on one or more engines and returns the result. This means that the natural unit of remote execution is no longer a string of Python code, but rather a Python function.</p> <ul class="simple"> <li>non-copying sends (track)</li> <li>remote References</li> </ul> <p>The flags for execution have also changed. Previously, there was only <cite>block</cite>, denoting whether to wait for results.
This remains, but due to the addition of fully non-copying sends of arrays and buffers, there is also a <cite>track</cite> flag, which instructs PyZMQ to produce a <tt class="xref py py-class docutils literal"><span class="pre">MessageTracker</span></tt> that will let you know when it is safe again to edit arrays in-place.</p> <p>The result of a non-blocking call to <cite>apply</cite> is now an <a class="reference internal" href="asyncresult.html"><em>AsyncResult object</em></a>.</p> </div> <div class="section" id="multiengine-to-directview"> <h2>MultiEngine to DirectView<a class="headerlink" href="#multiengine-to-directview" title="Permalink to this headline">¶</a></h2> <p>The multiplexing interface previously provided by the MultiEngineClient is now provided by the DirectView. Once you have a Client connected, you can create a DirectView with index-access to the client (<tt class="docutils literal"><span class="pre">view</span> <span class="pre">=</span> <span class="pre">client[1:5]</span></tt>). The core methods for communicating with engines remain: <cite>execute</cite>, <cite>run</cite>, <cite>push</cite>, <cite>pull</cite>, <cite>scatter</cite>, <cite>gather</cite>. 
These methods all behave in much the same way as they did on a MultiEngineClient.</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="go"># old</span>
<span class="gp">In [2]: </span><span class="n">mec</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s">'a=5'</span><span class="p">,</span> <span class="n">targets</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">])</span>

<span class="go"># new</span>
<span class="gp">In [2]: </span><span class="n">view</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s">'a=5'</span><span class="p">,</span> <span class="n">targets</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">])</span>
<span class="go"># or</span>
<span class="gp">In [2]: </span><span class="n">rc</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">]</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s">'a=5'</span><span class="p">)</span>
</pre></div> </div> <p>This extends to any method that communicates with the engines.</p> <p>Requests of the Hub (queue status, etc.)
are no longer asynchronous, and do not take a <cite>block</cite> argument.</p> <ul> <li><p class="first"><tt class="xref py py-meth docutils literal"><span class="pre">get_ids()</span></tt> is now the property <tt class="xref py py-attr docutils literal"><span class="pre">ids</span></tt>, which is passively updated by the Hub (no need for network requests for an up-to-date list).</p> </li> <li><p class="first"><tt class="xref py py-meth docutils literal"><span class="pre">barrier()</span></tt> has been renamed to <tt class="xref py py-meth docutils literal"><span class="pre">wait()</span></tt>, and now takes an optional timeout. <tt class="xref py py-meth docutils literal"><span class="pre">flush()</span></tt> is removed, as it is redundant with <tt class="xref py py-meth docutils literal"><span class="pre">wait()</span></tt>.</p> </li> <li><p class="first"><tt class="xref py py-meth docutils literal"><span class="pre">zip_pull()</span></tt> has been removed.</p> </li> <li><p class="first"><tt class="xref py py-meth docutils literal"><span class="pre">keys()</span></tt> has been removed, but is easily implemented as:</p> <div class="highlight-python"><div class="highlight"><pre><span class="n">dview</span><span class="o">.</span><span class="n">apply</span><span class="p">(</span><span class="k">lambda</span> <span class="p">:</span> <span class="nb">globals</span><span class="p">()</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span> </pre></div> </div> </li> <li><p class="first"><tt class="xref py py-meth docutils literal"><span class="pre">push_function()</span></tt> and <tt class="xref py py-meth docutils literal"><span class="pre">push_serialized()</span></tt> are removed, as <tt class="xref py py-meth docutils literal"><span class="pre">push()</span></tt> handles functions without issue.</p> </li> </ul> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <p class="last"><a class="reference internal" 
href="parallel_multiengine.html#parallel-multiengine"><em>Our Direct Interface doc</em></a> for a simple tutorial with the DirectView.</p> </div> <p>The other major difference is the use of <tt class="xref py py-meth docutils literal"><span class="pre">apply()</span></tt>. When remote work consists simply of function calls, the natural return values are the actual Python objects. Using stdout to carry your results is no longer the recommended pattern, due to stream decoupling and the asynchronous way stdout streams are handled in the new system.</p> </div> <div class="section" id="task-to-loadbalancedview"> <h2>Task to LoadBalancedView<a class="headerlink" href="#task-to-loadbalancedview" title="Permalink to this headline">¶</a></h2> <p>Load-balancing has changed more than multiplexing. This is because there is no longer a notion of a StringTask or a MapTask; there are simply Python functions to call. Tasks are now simpler because they are no longer composites of push/execute/pull/clear calls; each is a single function that takes arguments and returns objects.</p> <p>The load-balanced interface is provided by the <tt class="xref py py-class docutils literal"><span class="pre">LoadBalancedView</span></tt> class, created by the client:</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="gp">In [10]: </span><span class="n">lbview</span> <span class="o">=</span> <span class="n">rc</span><span class="o">.</span><span class="n">load_balanced_view</span><span class="p">()</span>

<span class="go"># load-balancing can also be restricted to a subset of engines:</span>
<span class="gp">In [10]: </span><span class="n">lbview</span> <span class="o">=</span> <span class="n">rc</span><span class="o">.</span><span class="n">load_balanced_view</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">])</span>
</pre></div> </div> <p>A simple task 
would consist of sending some data, calling a function on that data plus some data that was already resident on the engine, and then pulling back some results. This can all be done with a single function.</p> <p>Let’s say you want to compute the dot product of two matrices, one of which resides on the engine while the other resides on the client. With IPython.kernel, you might construct a task that looks like this:</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="gp">In [10]: </span><span class="n">st</span> <span class="o">=</span> <span class="n">kclient</span><span class="o">.</span><span class="n">StringTask</span><span class="p">(</span><span class="s">"""</span>
<span class="go">            import numpy</span>
<span class="go">            C=numpy.dot(A,B)</span>
<span class="go">            """,</span>
<span class="go">            push=dict(B=B),</span>
<span class="go">            pull='C'</span>
<span class="go">            )</span>
<span class="gp">In [11]: </span><span class="n">tid</span> <span class="o">=</span> <span class="n">tc</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">st</span><span class="p">)</span>
<span class="gp">In [12]: </span><span class="n">tr</span> <span class="o">=</span> <span class="n">tc</span><span class="o">.</span><span class="n">get_task_result</span><span class="p">(</span><span class="n">tid</span><span class="p">)</span>
<span class="gp">In [13]: </span><span class="n">C</span> <span class="o">=</span> <span class="n">tr</span><span class="p">[</span><span class="s">'C'</span><span class="p">]</span>
</pre></div> </div> <p>In the new code, this is simpler:</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="gp">In [10]: </span><span class="kn">import</span> <span class="nn">numpy</span>
<span class="gp">In [11]: </span><span class="kn">from</span> <span class="nn">IPython.parallel</span> <span class="kn">import</span> <span class="n">Reference</span>
<span class="gp">In [12]: </span><span class="n">ar</span> <span class="o">=</span> <span class="n">lbview</span><span class="o">.</span><span class="n">apply</span><span class="p">(</span><span class="n">numpy</span><span class="o">.</span><span class="n">dot</span><span class="p">,</span> <span class="n">Reference</span><span class="p">(</span><span class="s">'A'</span><span class="p">),</span> <span class="n">B</span><span class="p">)</span>
<span class="gp">In [13]: </span><span class="n">C</span> <span class="o">=</span> <span class="n">ar</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
</pre></div> </div> <p>Note the use of <tt class="docutils literal"><span class="pre">Reference</span></tt>. This is a convenient representation of an object that exists in the engine’s namespace, so you can pass remote objects as arguments to your task functions.</p> <p>Also note that in the kernel model, after the task is run, ‘A’, ‘B’, and ‘C’ are all defined on the engine. To deal with this, IPython.kernel Tasks also have a <cite>clear_after</cite> flag to prevent pollution of the namespace and bloating of engine memory. This is not necessary with the new code, because only those objects explicitly pushed (or set via <cite>globals()</cite>) will be resident on the engine beyond the duration of the task.</p> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <p class="last">Dependencies also work very differently than in IPython.kernel. 
See our <a class="reference internal" href="parallel_task.html#parallel-dependencies"><em>doc on Dependencies</em></a> for details.</p> </div> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <p class="last"><a class="reference internal" href="parallel_task.html#parallel-task"><em>Our Task Interface doc</em></a> for a simple tutorial with the LoadBalancedView.</p> </div> <div class="section" id="pendingresults-to-asyncresults"> <h3>PendingResults to AsyncResults<a class="headerlink" href="#pendingresults-to-asyncresults" title="Permalink to this headline">¶</a></h3> <p>With the departure from Twisted, we no longer have the <tt class="xref py py-class docutils literal"><span class="pre">Deferred</span></tt> class for representing unfinished results. Instead, we have an AsyncResult object, based on the object of the same name in the built-in <a class="reference external" href="http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool" title="(in Python v2.7)"><tt class="xref py py-mod docutils literal"><span class="pre">multiprocessing.pool</span></tt></a> module. Our version provides a superset of that interface.</p> <p>Unlike IPython.kernel, we do not have PendingDeferred, PendingResult, or TaskResult objects. There is only this one object, the AsyncResult. 
Every asynchronous (<cite>block=False</cite>) call returns one.</p> <p>The basic methods of an AsyncResult are:</p> <div class="highlight-python"><div class="highlight"><pre><span class="n">AsyncResult</span><span class="o">.</span><span class="n">wait</span><span class="p">([</span><span class="n">timeout</span><span class="p">]):</span> <span class="c"># wait for the result to arrive</span>
<span class="n">AsyncResult</span><span class="o">.</span><span class="n">get</span><span class="p">([</span><span class="n">timeout</span><span class="p">]):</span> <span class="c"># wait for the result to arrive, and then return it</span>
<span class="n">AsyncResult</span><span class="o">.</span><span class="n">metadata</span><span class="p">:</span> <span class="c"># dict of extra information about execution.</span>
</pre></div> </div> <p>There are still some things that behave the same as IPython.kernel:</p> <div class="highlight-ipython"><div class="highlight"><pre><span class="go"># old</span>
<span class="gp">In [5]: </span><span class="n">pr</span> <span class="o">=</span> <span class="n">mec</span><span class="o">.</span><span class="n">pull</span><span class="p">(</span><span class="s">'a'</span><span class="p">,</span> <span class="n">targets</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span> <span class="n">block</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="gp">In [6]: </span><span class="n">pr</span><span class="o">.</span><span class="n">r</span>
<span class="gh">Out[6]: </span><span class="go">[5, 5]</span>

<span class="go"># new</span>
<span class="gp">In [5]: </span><span class="n">ar</span> <span class="o">=</span> <span class="n">dview</span><span class="o">.</span><span class="n">pull</span><span class="p">(</span><span class="s">'a'</span><span class="p">,</span> <span class="n">targets</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span> <span class="n">block</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="gp">In [6]: </span><span class="n">ar</span><span class="o">.</span><span class="n">r</span>
<span class="gh">Out[6]: </span><span class="go">[5, 5]</span>
</pre></div> </div> <p>The <tt class="docutils literal"><span class="pre">.r</span></tt> or <tt class="docutils literal"><span class="pre">.result</span></tt> property simply calls <tt class="xref py py-meth docutils literal"><span class="pre">get()</span></tt>, waiting for and returning the result.</p> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <p class="last"><a class="reference internal" href="asyncresult.html"><em>AsyncResult details</em></a></p> </div> </div> </div> </div> </div> </div> </div> <div class="clearer"></div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="../genindex.html" title="General Index" >index</a></li> <li class="right" > <a href="../py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="../config/index.html" title="Configuration and customization" >next</a> |</li> <li class="right" > <a href="parallel_details.html" title="Details of Parallel Computing with IPython" >previous</a> |</li> <li><a href="http://ipython.org">home</a>| </li> <li><a href="../search.html">search</a>| </li> <li><a href="../index.html">documentation </a> »</li> <li><a href="index.html" >Using IPython for parallel computing</a> »</li> </ul> </div> <div class="footer"> © Copyright The IPython Development Team. Last updated on Sep 02, 2015. Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.2.3. </div> </body> </html>