NAME
       XML::Driver::HTML - SAX driver for HTML that is not well-formed.

SYNOPSIS
         use XML::Driver::HTML;

         $driver = new XML::Driver::HTML(
               'Handler' => $some_sax_filter_or_handler,
               'Source' => $some_PerlSAX_like_hash
               );

         $driver->parse();

       or

         use XML::Driver::HTML;

         $driver = new XML::Driver::HTML();

         $driver->parse(
               'Handler' => $some_sax_filter_or_handler,
               'Source' => $some_PerlSAX_like_hash
               );

         $driver->parse(
               'Handler' => $some_other_sax_filter_or_handler,
               'Source' => $some_other_source
               );


DESCRIPTION
       XML::Driver::HTML is a SAX driver for HTML. The HTML input
       does not need to be well-formed, because
       XML::Driver::HTML generates its SAX events by walking
       an HTML::TreeBuilder object. The simplest use is as a
       filter from HTML to XHTML, with XML::Handler::YAWriter as
       the SAX handler.

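           use IO::File;
           use XML::Handler::YAWriter;
           use XML::Driver::HTML;

           # Handler: write the resulting XHTML to STDOUT.
           my $ya = XML::Handler::YAWriter->new(
               'Output' => IO::File->new( ">-" ),
               'Pretty' => {
                   'NoWhiteSpace'=>1,
                   'NoComments'=>1,
                   'AddHiddenNewline'=>1,
                   'AddHiddenAttrTab'=>1,
                   }
               );

           # Driver: read the HTML from STDIN and send SAX events to $ya.
           my $html = XML::Driver::HTML->new(
               'Handler' => $ya,
               'Source' => { 'ByteStream' => IO::File->new( "<-" ) }
               );

           $html->parse();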
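
       Any object that implements the PerlSAX event methods can take the
       place of XML::Handler::YAWriter. The following is a minimal sketch
       of such a handler, not code from this distribution: the package
       name ElementCounter and its fields are hypothetical, and the
       events are fired by hand in the PerlSAX style (one hash reference
       per event) so that the example runs without the driver installed.

```perl
package ElementCounter;    # hypothetical handler, for illustration only
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { count => {}, text => '' }, $class;
}

# PerlSAX handlers receive a single hash reference per event.
sub start_document { }
sub end_document   { my ($self) = @_; return $self->{count}; }

# Count how often each element name is opened.
sub start_element {
    my ($self, $element) = @_;
    $self->{count}{ $element->{Name} }++;
}

sub end_element { }

# Collect all character data into one string.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

package main;

my $handler = ElementCounter->new;

# Events fired by hand here; with the driver installed, $handler would
# instead be passed as the 'Handler' option.
$handler->start_document({});
$handler->start_element({ Name => 'p', Attributes => {} });
$handler->characters({ Data => 'hello' });
$handler->end_element({ Name => 'p' });
$handler->end_document({});
```

       Such an object would be passed to the driver as the 'Handler'
       option, in the same position as $ya above.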


       METHODS

       new Creates a new XML::Driver::HTML object. Default
           options for parsing, described below, are passed as
           key-value pairs or as a single hash.  Options may be
           changed directly in the object.

       parse
           Parses a document.  Options, described below, are
           passed as key-value pairs or as a single hash.
           Options passed to parse() override the default options
           in the parser object for the duration of the parse.
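
       The override rule can be pictured as a hash overlay. The sketch
       below only illustrates the documented semantics and is not the
       module's actual implementation: the helper names
       normalize_options and effective_options are hypothetical, and
       "a single hash" is assumed to mean a hash reference.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a flat key-value list
# or a single hash reference.
sub normalize_options {
    return ( @_ == 1 && ref $_[0] eq 'HASH' ) ? %{ $_[0] } : @_;
}

# Hypothetical helper: the options for one parse are the object's
# defaults overlaid with the options given in the call.
sub effective_options {
    my ( $defaults, @call_args ) = @_;
    my %call = normalize_options(@call_args);
    return { %$defaults, %call };    # later keys win; $defaults is untouched
}

my $defaults = {
    'Handler' => 'default-handler',
    'Source'  => { 'String' => '<p>a' },
};

# Both calling styles give the same result.
my $once = effective_options( $defaults, 'Handler' => 'other-handler' );
my $same = effective_options( $defaults, { 'Handler' => 'other-handler' } );
```

       After the call, $defaults still names the default handler, which
       mirrors the statement that an override lasts only for the
       duration of one parse.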

       OPTIONS

       The following options are supported by XML::Driver::HTML:

       Handler
           Default SAX Handler to receive events

       Source
           Hash containing the input source for parsing.  The
           `Source' hash may contain the following parameters:

           ByteStream
               The raw byte stream (file handle) containing the
               document.

           String
               A string containing the document.

           SystemId
               The system identifier (URL) of the document.

           Encoding
               A string describing the character encoding.

           If more than one of `ByteStream', `String', or
           `SystemId' is given, preference goes first to
           `ByteStream', then to `String', then to `SystemId'.
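
       The preference order can be sketched as a simple lookup. The
       helper preferred_source below is hypothetical and only mirrors the
       rule stated above; it is not part of the module, and the file name
       and encoding value are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the `Source' parameter
# that would be used, following the documented preference order.
sub preferred_source {
    my ($source) = @_;
    for my $key (qw(ByteStream String SystemId)) {
        return $key if defined $source->{$key};
    }
    return;    # none of the three input parameters was supplied
}

my $picked = preferred_source({
    'String'   => '<p>unclosed paragraph',
    'SystemId' => 'example.html',    # illustrative value
    'Encoding' => 'ISO-8859-1',      # illustrative value
});
```

       Here $picked is 'String', because no `ByteStream' was supplied and
       `String' ranks above `SystemId'.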

NOTES
       XML::Driver::HTML requires Perl 5.6 to convert from
       ISO-8859-1 to UTF-8.

BUGS
       not yet implemented:

           Interpretation of SystemId as a URI
           XHTML document type

       other bugs:

           HTML::Parser and HTML::TreeBuilder bugs concerning DOCTYPE and CSS.
           Perl's handling of UTF-8 is incompatible between different versions,
           so you need exactly Perl 5.6.0, neither lower nor higher.


AUTHOR
         Michael Koehne, Kraehe@Copyleft.De
         (c) 2001 GNU General Public License


SEE ALSO
       the XML::Parser::PerlSAX manpage and the HTML::TreeBuilder
       manpage