Sophie

Sophie

distrib > Fedora > 18 > x86_64 > media > updates > by-pkgid > 171636fb720078ab07822dd4a76f1938 > files > 2507

mlton-20130715-4.fc18.x86_64.rpm

<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="AsciiDoc 8.6.8">
<title>Regions</title>
<link rel="stylesheet" href="./asciidoc.css" type="text/css">
<link rel="stylesheet" href="./pygments.css" type="text/css">


<script type="text/javascript" src="./asciidoc.js"></script>
<script type="text/javascript">
/*<![CDATA[*/
asciidoc.install();
/*]]>*/
</script>
<link rel="stylesheet" href="./mlton.css" type="text/css"/>
</head>
<body class="article">
<div id="banner">
<div id="banner-home">
<a href="./Home">MLton 20130715</a>
</div>
</div>
<div id="header">
<h1>Regions</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph"><p>In region-based memory management, the heap is divided into a
collection of regions into which objects are allocated.  At compile
time, either in the source program or through automatic inference,
allocation points are annotated with the region in which the
allocation will occur.  Typically, although not always, the regions
are allocated and deallocated according to a stack discipline.</p></div>
<div class="paragraph"><p>MLton does not use region-based memory management; it uses traditional
<a href="GarbageCollection">GarbageCollection</a>.  We have considered integrating regions with
MLton, but in our opinion it is far from clear that regions would
provide MLton with improved performance, while they would certainly
add a lot of complexity to the compiler and complicate reasoning about
and achieving <a href="SpaceSafety">SpaceSafety</a>.  Region-based memory management and
garbage collection have different strengths and weaknesses; it&#8217;s
pretty easy to come up with programs that do significantly better
under regions than under GC, and vice versa.  We believe that it is
the case that common SML idioms tend to work better under GC than
under regions.</p></div>
<div class="paragraph"><p>One common argument for regions is that the region operations can all
be done in (approximately) constant time; therefore, you eliminate GC
pause times, leading to a real-time GC.  However, because of space
safety concerns (see below), we believe that region-based memory
management for SML must also include a traditional garbage collector.
Hence, to achieve real-time memory management for MLton/SML, we
believe that it would be both easier and more efficient to implement a
traditional real-time garbage collector than it would be to implement
a region system.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_regions_the_ml_kit_and_space_safety">Regions, the ML Kit, and space safety</h2>
<div class="sectionbody">
<div class="paragraph"><p>The <a href="MLKit">ML Kit</a> pioneered the use of regions for compiling
Standard ML.  The ML Kit maintains a stack of regions at run time.  At
compile time, it uses region inference to decide when data can be
allocated in a stack-like manner, assigning it to an appropriate
region.  The ML Kit has put a lot of effort into improving the
supporting analyses and representations of regions, which are all
necessary to improve the performance.</p></div>
<div class="paragraph"><p>Unfortunately, under a pure stack-based region system, space leaks are
inevitable in theory, and costly in practice.  Data for which region
inference can not determine the lifetime is moved into the "global
region" whose lifetime is the entire program.  There are two ways in
which region inference will place an object to the global region.</p></div>
<div class="ulist"><ul>
<li>
<p>
When the inference is too conservative, that is, when the data is
used in a stack-like manner but the region inference can&#8217;t figure it
out.
</p>
</li>
<li>
<p>
When data is not used in a stack-like manner.  In this case,
correctness requires region inference to place the object
</p>
</li>
</ul></div>
<div class="paragraph"><p>This global region is a source of space leaks.  No matter what region
system you use, there are some programs such that the global region
must exist, and its size will grow to an unbounded multiple of the
live data size.  For these programs one must have a GC to achieve
space safety.</p></div>
<div class="paragraph"><p>To solve this problem, the ML Kit has undergone work to combine
garbage collection with region-based memory management.
<a href="References#HallenbergEtAl02">HallenbergEtAl02</a> and <a href="References#Elsman03">Elsman03</a> describe the addition
of a garbage collector to the ML Kit&#8217;s region-based system.  These
papers provide convincing evidence for space leaks in the global
region.  They show a number of benchmarks where the memory usage of
the program running with just regions is a large multiple (2, 10, 50,
even 150) of the program running with regions plus GC.</p></div>
<div class="paragraph"><p>These papers also give some numbers to show the ML Kit with just
regions does better than either a system with just GC or a combined
system.  Unfortunately, a pure region system isn&#8217;t practical because
of the lack of space safety.  And the other performance numbers are
not so convincing, because they compare to an old version of SML/NJ
and not at all with MLton.  It would be interesting to see a
comparison with a more serious collector.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_regions_garbage_collection_and_cyclone">Regions, Garbage Collection, and Cyclone</h2>
<div class="sectionbody">
<div class="paragraph"><p>One possibility is to take Cyclone&#8217;s approach, and provide both
region-based memory management and garbage collection, but at the
programmer&#8217;s option (<a href="References#GrossmanEtAl02">GrossmanEtAl02</a>, <a href="References#HicksEtAl03">HicksEtAl03</a>).</p></div>
<div class="paragraph"><p>One might ask whether we might do the same thing&#8201;&#8212;&#8201;i.e., provide a
<span class="monospaced">MLton.Regions</span> structure with explicit region based memory
management operations, so that the programmer could use them when
appropriate.  <a href="MatthewFluet">MatthewFluet</a> has thought about this question</p></div>
<div class="ulist"><ul>
<li>
<p>
<a href="http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html">http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html</a>
</p>
</li>
</ul></div>
<div class="paragraph"><p>Unfortunately, his conclusion is that the SML type system is too weak
to support this option, although there might be a "poor-man&#8217;s" version
with dynamic checks.</p></div>
</div>
</div>
</div>
<div id="footnotes"><hr></div>
<div id="footer">
<div id="footer-text">
</div>
<div id="footer-badges">
</div>
</div>
</body>
</html>