<html lang="en">
<head>
<title>Aspell Suggestion Strategy - GNU Aspell 0.60.6.1</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="description" content="Aspell 0.60.6.1 spell checker user's manual.">
<meta name="generator" content="makeinfo 4.8">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Implementation-Notes.html#Implementation-Notes" title="Implementation Notes">
<link rel="next" href="Notes-on-8_002dbit-Characters.html#Notes-on-8_002dbit-Characters" title="Notes on 8-bit Characters">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This is the user's manual for Aspell

GNU Aspell is a spell checker designed to eventually replace Ispell.
It can either be used as a library or as an independent spell checker.

Copyright (C) 2000--2011 Kevin Atkinson.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.1 or any later version published by the Free Software
     Foundation; with no Invariant Sections, no Front-Cover Texts and
     no Back-Cover Texts.  A copy of the license is included in the
     section entitled "GNU Free Documentation License".
   -->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; } 
  span.sansserif { font-family:sans-serif; font-weight:normal; } 
--></style>
</head>
<body>
<div class="node">
<p>
<a name="Aspell-Suggestion-Strategy"></a>
Next:&nbsp;<a rel="next" accesskey="n" href="Notes-on-8_002dbit-Characters.html#Notes-on-8_002dbit-Characters">Notes on 8-bit Characters</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="Implementation-Notes.html#Implementation-Notes">Implementation Notes</a>
<hr>
</div>

<h3 class="appendixsec">A.1 Aspell Suggestion Strategy</h3>

<p>The magic behind my spell checker comes from merging Lawrence Philips'
excellent metaphone algorithm with Ispell's near-miss strategy, which
consists of inserting a space or hyphen, interchanging two adjacent
letters, changing one letter, deleting a letter, or adding a letter.
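
   <p>As a rough illustration of the near-miss strategy only, the five
edits can be enumerated as in the sketch below.  The names
<code>near_misses</code> and <code>near_miss_suggestions</code>, the
lowercase ASCII alphabet, and the plain Python set used as a dictionary
are all assumptions made for this example; they are not taken from
Ispell or Aspell.

```python
import string


def near_misses(word, alphabet=string.ascii_lowercase):
    """Return every string that is one near-miss edit away from `word`."""
    candidates = set()
    for i in range(len(word) + 1):
        head, tail = word[:i], word[i:]
        if head and tail:
            # Insert a space or hyphen between two letters.
            candidates.add(head + " " + tail)
            candidates.add(head + "-" + tail)
        for ch in alphabet:
            # Add a letter at position i.
            candidates.add(head + ch + tail)
        if tail:
            # Delete the letter at position i.
            candidates.add(head + tail[1:])
            for ch in alphabet:
                # Change the letter at position i.
                candidates.add(head + ch + tail[1:])
        if len(tail) >= 2:
            # Interchange the two adjacent letters at positions i and i+1.
            candidates.add(head + tail[1] + tail[0] + tail[2:])
    candidates.discard(word)
    return candidates


def near_miss_suggestions(word, dictionary):
    """Keep the candidates whose parts are all dictionary words."""
    found = []
    for candidate in near_misses(word):
        parts = candidate.replace("-", " ").split()
        if parts and all(part in dictionary for part in parts):
            found.append(candidate)
    return sorted(found)
```

   <p>For instance, <code>near_miss_suggestions("teh", {"the", "tea", "ten"})</code>
finds all three dictionary words, since each is a single swap or a
single changed letter away from the input.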

   <p>The process goes something like this.

     <ol type=1 start=1>
<li>Convert the misspelled word to its soundslike equivalent (its
metaphone for English words).

     <li>Find all words whose soundslike is within an edit distance of one or
two from the original word's soundslike.  The edit distance is the total
number of deletions, insertions, substitutions, or adjacent swaps needed
to make one string equivalent to the other.  When set to look only for
soundslikes within an edit distance of one, it tries all possible
soundslike combinations and checks whether each one is in the dictionary.
When set to find all soundslikes within an edit distance of two, it scans
through the entire dictionary and quickly scores each soundslike.  The
scoring is quick because it gives up as soon as the two soundslikes are
more than two edits apart.

     <li>Find misspelled words that have a correctly spelled replacement, by
the same criteria as steps 1 and 2.  That is, the misspelled word in
the word pair (such as &ldquo;teh -&gt; the&rdquo;) appears in the suggestion
list as if it were a correct spelling.

     <li>Score the result list and return the words with the lowest scores.  The
score is roughly the weighted average of two weighted edit distances:
that between the word and the misspelled word, and that between the
soundslike equivalents of the two words.  The weighted edit distance is
like the edit distance except that the various edits have weights
attached to them.

     <li>Replace the misspelled words that have correctly spelled replacements
with their replacements and remove any duplicates that might arise
because of this.
        </ol>
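
   <p>The steps above can be sketched roughly as follows.  This is a
minimal illustration under stated assumptions, not the code in
<samp><span class="file">suggest.cc</span></samp>: <code>toy_soundslike</code> is a crude
stand-in for metaphone, every edit costs 1 instead of carrying its own
weight, the 50/50 <code>word_weight</code> is invented, and only the
scan-the-whole-dictionary mode is shown.  All function names are
hypothetical.

```python
def edit_distance(a, b, limit=2):
    """Count deletions, insertions, substitutions and adjacent swaps.

    Returns None as soon as the distance is known to exceed `limit`.
    """
    if abs(len(a) - len(b)) > limit:
        return None
    prev2, prev = None, list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # adjacent swap
        if min(cur) > limit:
            return None  # give up early: no later row can recover
        prev2, prev = prev, cur
    return prev[-1] if prev[-1] <= limit else None


def toy_soundslike(word):
    """Toy stand-in for metaphone: drop non-initial vowels, collapse repeats."""
    out = []
    for i, ch in enumerate(word.upper()):
        if (i > 0 and ch in "AEIOU") or (out and out[-1] == ch):
            continue
        out.append(ch)
    return "".join(out)


def full_distance(a, b):
    """Edit distance with a limit large enough that it never gives up."""
    return edit_distance(a, b, limit=max(len(a), len(b)))


def score(misspelled, candidate, soundslike, word_weight=0.5):
    """Average the word distance and the soundslike distance (step 4)."""
    d_word = full_distance(misspelled, candidate)
    d_sound = full_distance(soundslike(misspelled), soundslike(candidate))
    return word_weight * d_word + (1 - word_weight) * d_sound


def rank_suggestions(misspelled, dictionary, soundslike, replacements=None, limit=2):
    replacements = replacements or {}  # known pairs such as {"teh": "the"}
    target = soundslike(misspelled)    # step 1
    # Steps 2 and 3: scan real words and known misspellings alike.
    pool = [w for w in list(dictionary) + list(replacements)
            if edit_distance(soundslike(w), target, limit) is not None]
    # Step 4: lowest score first (ties broken alphabetically).
    ranked = sorted(pool, key=lambda w: (score(misspelled, w, soundslike), w))
    # Step 5: swap in replacements and drop the duplicates this creates.
    result = []
    for w in ranked:
        w = replacements.get(w, w)
        if w not in result:
            result.append(w)
    return result
```

   <p>With these definitions,
<code>rank_suggestions("teh", ["the"], toy_soundslike, {"teh": "the"})</code>
returns <code>["the"]</code>: the known misspelling and the real word
both match, the misspelling is replaced by its correction in step 5,
and the resulting duplicate is removed.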

   <p>Please note that the soundslike equivalent is a rough approximation of
how the word sounds.  It is not a phonemic transcription of the word by any means.
For more details about exactly how each step is performed please see
the file <samp><span class="file">suggest.cc</span></samp>.  For more information on the metaphone
algorithm please see the data file <samp><span class="file">english_phonet.dat</span></samp>.

   </body></html>