
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!--Converted with LaTeX2HTML 98.1p1 release (March 2nd, 1998)
originally by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds
* revised and updated by:  Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
  Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Discriminant Analysis</TITLE>
<META NAME="description" CONTENT="Discriminant Analysis">
<META NAME="keywords" CONTENT="vol2">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<LINK REL="STYLESHEET" HREF="vol2.css">
<LINK REL="next" HREF="node215.html">
<LINK REL="previous" HREF="node213.html">
<LINK REL="up" HREF="node210.html">
<LINK REL="next" HREF="node215.html">
</HEAD>
<BODY >
<!--Navigation Panel-->
<A NAME="tex2html4072"
 HREF="node215.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
 SRC="icons.gif/next_motif.gif"></A> 
<A NAME="tex2html4069"
 HREF="node210.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
 SRC="icons.gif/up_motif.gif"></A> 
<A NAME="tex2html4063"
 HREF="node213.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
 SRC="icons.gif/previous_motif.gif"></A> 
<A NAME="tex2html4071"
 HREF="node1.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents"
 SRC="icons.gif/contents_motif.gif"></A>  
<BR>
<B> Next:</B> <A NAME="tex2html4073"
 HREF="node215.html">Correspondence Analysis</A>
<B> Up:</B> <A NAME="tex2html4070"
 HREF="node210.html">Multivariate Analysis Methods</A>
<B> Previous:</B> <A NAME="tex2html4064"
 HREF="node213.html">Cluster Analysis</A>
<BR>
<BR>
<!--End of Navigation Panel-->

<H1><A NAME="SECTION001540000000000000000">
Discriminant Analysis</A>
</H1>
Discriminant Analysis may be used for two objectives: either
<A NAME="10047">&#160;</A>
we want to <I>assess</I> the adequacy of classification, given
the group memberships of the objects under study; or we wish
to <I>assign</I> objects to one of a number of (known)
groups of objects. Discriminant Analysis may thus have a
descriptive or a predictive objective.

<P>
In both cases, some group assignments must be known before carrying
out the Discriminant Analysis.  Such group assignments, or labelling,
may be arrived at in any way.  Hence Discriminant Analysis can be
employed as a useful complement to Cluster Analysis (in order to judge
the results of the latter) or Principal Components Analysis.
Alternatively, in star-galaxy separation on digitised images, for
instance, the analyst may define group membership (star or galaxy)
visually for a conveniently small <I>training set</I> or
<I>design set</I>.
<A NAME="10052">&#160;</A>
<A NAME="10053">&#160;</A>

<P>
Methods implemented in this area are Multiple Discriminant Analysis,
Fisher's Linear Discriminant Analysis, and K-Nearest Neighbours
Discriminant Analysis.

<P>
<DL COMPACT><DT><I>Multiple Discriminant Analysis</I>
<DD>(MDA) is also termed Discriminant
<A NAME="10055">&#160;</A>
<A NAME="10056">&#160;</A>
<A NAME="10057">&#160;</A>
Factor Analysis and Canonical Discriminant Analysis.  It adopts a
similar perspective to PCA: the rows of the data matrix to be examined
constitute points in a multidimensional space, as also do the group
mean vectors.  Discriminating axes are determined in this space, in
such a way that optimal separation of the predefined groups is
attained.  As with PCA, the problem becomes mathematically the
eigenreduction of a real, symmetric matrix.  The eigenvalues represent
the discriminating power of the associated eigenvectors.  The
<I>n</I><SUB><I>Y</I></SUB> groups lie in a space of dimension at most
<I>n</I><SUB><I>Y</I></SUB> - 1.  This is the number of discriminant
axes or factors obtainable in the most common practical case, when
<!-- MATH: $n > m > n_Y$ -->
<I>n</I> &gt; <I>m</I> &gt; <I>n</I><SUB><I>Y</I></SUB> (where <I>n</I> is the number of
rows, and <I>m</I> the number of columns of the input data matrix).
A sketch of this eigenreduction follows the list below.

<P>
<DT><I>Linear Discriminant Analysis</I>
<DD>is the 2-group case of MDA.
<A NAME="10058">&#160;</A>
It optimally separates two groups, using the 
<I>Mahalanobis metric</I> or <I>generalized distance</I>.
<A NAME="10061">&#160;</A>
<A NAME="10062">&#160;</A>
It also gives the same linear separating decision surface as 
Bayesian maximum likelihood discrimination in the case of equal
class covariance matrices.

<P>
<DT><I>K-Nearest Neighbours (K-NN) Discriminant Analysis</I>
<DD>is a non-parametric (distribution-free) method, dispensing with
the need for assumptions regarding the probability density function.
Such methods have become very popular, especially in the image
processing area.  The K-NN method assigns an object of unknown
affiliation to the group to which the majority of its K nearest
neighbours belongs (a sketch also follows this list). </DL>
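<P>
The following Python fragment is a minimal sketch of the eigenreduction
underlying MDA, assuming a data matrix <TT>X</TT> (<I>n</I> rows of
objects, <I>m</I> columns of variables) and group labels <TT>y</TT>;
the function and variable names are illustrative, not MIDAS identifiers.
<PRE>
import numpy as np
from scipy.linalg import eigh

def discriminant_axes(X, y):
    """Illustrative sketch (not a MIDAS routine): discriminating axes
    from the between-groups scatter relative to the within-groups
    scatter, via a real symmetric generalized eigenproblem."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    groups = np.unique(y)
    mean = X.mean(axis=0)
    m = X.shape[1]
    S_w = np.zeros((m, m))            # within-groups scatter matrix
    S_b = np.zeros((m, m))            # between-groups scatter matrix
    for g in groups:
        Xg = X[y == g]
        mg = Xg.mean(axis=0)
        S_w += (Xg - mg).T @ (Xg - mg)
        d = (mg - mean).reshape(-1, 1)
        S_b += len(Xg) * (d @ d.T)
    # Solve S_b w = lambda S_w w; each eigenvalue measures the
    # discriminating power of its eigenvector.
    evals, evecs = eigh(S_b, S_w)
    order = np.argsort(evals)[::-1][:len(groups) - 1]   # at most n_Y - 1 axes
    return evals[order], evecs[:, order]
</PRE>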
<P>
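A similarly minimal sketch of the K-NN assignment rule, assuming
Euclidean distances in the feature space (<TT>knn_assign</TT> is a
hypothetical helper, not a MIDAS routine):
<PRE>
import numpy as np
from collections import Counter

def knn_assign(x, X_train, y_train, K=5):
    """Illustrative sketch: assign x to the group holding the majority
    among its K nearest neighbours in the training (design) set.
    K=5 is an arbitrary illustrative default."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    d = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(d)[:K]        # indices of the K closest objects
    return Counter(y_train[nearest]).most_common(1)[0][0]
</PRE>
For example, with a small visually labelled star/galaxy design set,
<TT>knn_assign</TT> applied to the feature vector of an unclassified
object would return the majority label among its five nearest
training objects.
<P>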
There is no single best discrimination method.  A few remarks on the
advantages and disadvantages of the methods described above follow.

<P>
<UL>
<P>
<LI>Analytical simplicity or computational cost may motivate an
      initial choice of linear discriminant analysis or the NN rule.

<P>
<LI>Linear discrimination is the most widely used method in practice.
      Often the 2-group method is applied repeatedly, to each pair of
      groups in multigroup data, yielding
<!-- MATH: ${  {k(k-1)} \over 2 }$ -->
<I>k</I>(<I>k</I>-1)/2 decision
      surfaces for <I>k</I> groups (for example, <I>k</I> = 4 groups
      give 6 pairwise surfaces; see the sketch following this list).

<P>
<LI>Estimating the parameters required in quadratic discrimination
      demands more computation and data than linear discrimination.
      If the group covariance matrices do not differ greatly, linear
      discrimination will perform as well as quadratic discrimination.

<P>
<LI>The <I>k</I>-NN rule is simple to define and implement, and is
      especially useful when there is insufficient data to adequately
      estimate sample means and covariance matrices.

<P>
<LI>MDA is most appropriately used for <I>feature selection</I>.  As in
<A NAME="10067">&#160;</A>
      the case of PCA, we may want to focus on the variables used in
      order to investigate the differences between groups; to create
      synthetic variables which improve the grouping ability of the data;
      to arrive at a similar objective by discarding irrelevant variables;
      or to determine the most parsimonious variables for graphical
      representation.
</UL>
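<P>
As a sketch of the pairwise strategy mentioned above (and of Fisher's
2-group discriminant), the fragment below trains one linear
discriminant per pair of groups and classifies by majority vote over
the <I>k</I>(<I>k</I>-1)/2 resulting decision surfaces.  The midpoint
threshold assumes roughly equal group covariances and priors, and the
helper names are again illustrative, not MIDAS routines.
<PRE>
import numpy as np
from itertools import combinations
from collections import Counter

def fisher_direction(X0, X1):
    """Illustrative sketch of Fisher's 2-group discriminant:
    w = S_w^-1 (m1 - m0), with a midpoint threshold between the
    projected group means (assumes similar covariances and priors)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    S_w = ((len(X0) - 1) * np.cov(X0, rowvar=False)
           + (len(X1) - 1) * np.cov(X1, rowvar=False))
    w = np.linalg.solve(S_w, m1 - m0)
    return w, w @ (m0 + m1) / 2.0

def one_vs_one_classify(x, X, y):
    """Majority vote over the k(k-1)/2 pairwise decision surfaces."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    votes = []
    for g0, g1 in combinations(np.unique(y), 2):
        w, c = fisher_direction(X[y == g0], X[y == g1])
        votes.append(g1 if w @ np.asarray(x, dtype=float) > c else g0)
    return Counter(votes).most_common(1)[0][0]
</PRE>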
<P>
<HR>
<ADDRESS>
<I>Petra Nass</I>
<BR><I>1999-06-15</I>
</ADDRESS>
</BODY>
</HTML>