XML::RSSLite.3pm

Language: en

Other versions - same language

2008-01-21 (debian - 07/07/09)

Version: 2003-02-24 (ubuntu - 08/07/09)

Section: 3 (Function libraries)

NAME

XML::RSSLite - lightweight, "relaxed" RSS (and XML-ish) parser

SYNOPSIS

   use XML::RSSLite;

   . . .

   parseRSS(\%result, \$content);

   print "=== Channel ===\n",
         "Title: $result{'title'}\n",
         "Desc:  $result{'description'}\n",
         "Link:  $result{'link'}\n\n";

   # Each item is a hash reference with the same three keys
   foreach my $item (@{$result{'item'}}) {
       print "  --- Item ---\n",
             "  Title: $item->{'title'}\n",
             "  Desc:  $item->{'description'}\n",
             "  Link:  $item->{'link'}\n\n";
   }

DESCRIPTION

This module attempts to extract the maximum amount of content from available documents, and is less concerned with XML compliance than alternatives. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure, and "aliases" certain tags so that when done, you can count on having the minimal data necessary for re-constructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items.

This module extracts more usable links by parsing "scriptingNews" and "weblog" formats in addition to RDF & RSS. It also "sanitizes" the output for best results. The munging includes:

Remove HTML tags to leave plain text
Remove characters other than 0-9~!@#$%^&*()-+=a-zA-Z[];',.:"<>?\s
Use <url> tags when <link> is empty
Use misplaced URLs in <title> when <link> is empty
Extract links from <a href=...> if required
Limit links to ftp and http
Join relative URLs to the site base
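
The first few munging steps can be approximated with plain regular expressions. The following is only a sketch under stated assumptions, not the module's own code: sanitize_text and pick_link are hypothetical helpers, https is treated as a form of http, and joining relative URLs to the site base is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSSLite): strip markup, then drop
# every character outside the whitelist quoted in the list above.
sub sanitize_text {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;    # remove anything that looks like an HTML tag
    $text =~ s/[^0-9~!\@#\$%^&*()\-+=a-zA-Z\[\];',.:"<>?\s]//g;
    return $text;
}

# Hypothetical helper: choose a link using the fallbacks listed above.
# Expects a list of key/value pairs: link, url, title, description.
sub pick_link {
    my (%f) = @_;
    my $link = $f{link} || $f{url} || '';

    # Fall back to a URL misplaced in the title
    if ($link eq '' && defined $f{title}
        && $f{title} =~ m{((?:https?|ftp)://\S+)}i) {
        $link = $1;
    }

    # Then fall back to the first <a href=...> in the description
    if ($link eq '' && defined $f{description}
        && $f{description} =~ m{<a\s[^>]*href\s*=\s*["']?([^"'\s>]+)}i) {
        $link = $1;
    }

    # Keep only ftp and http(s) links (assumption: https counts as http)
    return $link =~ m{^(?:ftp|https?)://}i ? $link : '';
}

print sanitize_text('<b>Hello</b> {world}'), "\n";
print pick_link(title => 'News http://example.org/x'), "\n";
```

Note that the whitelist does not include "/" or "_", so sanitized text loses those characters; the sketch applies it to text fields only, never to links.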

EXPORT

parseRSS($outHashRef, $inScalarRef): $inScalarRef is a reference to a scalar containing the document to be parsed; its contents will effectively be destroyed during parsing. $outHashRef is a reference to the hash in which the parsed content is stored.
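
Because the input scalar is consumed, a caller that still needs the raw document may want to hand parseRSS a copy. The sketch below assumes a made-up feed and example.org URLs, and it calls the function by its full package name so that it degrades gracefully when XML::RSSLite is not installed.

```perl
use strict;
use warnings;

# A made-up feed, for illustration only
my $feed = <<'RSS';
<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <link>http://example.org/</link>
    <description>A made-up feed</description>
    <item>
      <title>First item</title>
      <link>http://example.org/1</link>
      <description>Hello</description>
    </item>
  </channel>
</rss>
RSS

my $scratch = $feed;    # parseRSS may clobber this copy; $feed stays intact
my %result;

if (eval { require XML::RSSLite; 1 }) {
    XML::RSSLite::parseRSS(\%result, \$scratch);
    my $items = ref $result{item} eq 'ARRAY' ? $result{item} : [];
    print "Title: ", (defined $result{title} ? $result{title} : '(none)'), "\n";
    print "Items: ", scalar(@$items), "\n";
}
else {
    print "XML::RSSLite is not installed; skipping the parse\n";
}
```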

EXPORTABLE

parseXML(\%parsedTree, \$parseThis, 'topTag', $comments);

parsedTree - required

Reference to hash to store the parsed document within.

parseThis - required

Reference to scalar containing the document to parse.

topTag - optional

Tag to consider the root node. Leaving this undefined is not recommended.

comments - optional

false will remove comments from parseThis
true will not remove comments from parseThis
array reference is true, and the comments are stored in the referenced array

CAVEATS

This is not a conforming parser. It does not handle the following:

•

   <foo bar=">">

•

   <foo><bar> <bar></bar> <bar></bar> </bar></foo>

•

   <![CDATA[ ]]>

•

Processing instructions (PI)

It is non-validating; without a DTD, the following cannot be properly addressed:

entities
namespaces: This might be arriving in the next release.
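
To see why regular-expression parsing trips over the first two cases listed above, consider a deliberately naive pattern. This is an illustration only; the patterns are assumptions made for the example and are not the ones XML::RSSLite uses.

```perl
use strict;
use warnings;

# A naive tag pattern: name, then everything up to the next ">"
my $tag = qr/<(\w+)([^>]*)>/;

# The ">" inside the quoted attribute value ends the match too early
my ($name, $attrs) = '<foo bar=">">' =~ $tag;
print "attributes seen: [$attrs]\n";

# With nested elements of the same name, a non-greedy match pairs the
# outer opening tag with the first closing tag it meets
my $nested = '<foo><bar> <bar></bar> <bar></bar> </bar></foo>';
my ($inner) = $nested =~ m{<bar>(.*?)</bar>};
print "inner content: [$inner]\n";
```

In both cases the captured text is truncated, which is the kind of input a conforming parser handles and a heuristic one may not.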

AUTHOR

Jerrad Pierce <jpierce@cpan.org>.

Scott Thomason <scott@thomasons.org>

LICENSE

POD ERRORS

Hey! The above document had some coding errors, which are explained below:

Around line 407: You forgot a '=back' before '=head2'
Around line 443: =back without =over
