XML::RSSLite.3pm
Language: en
Version: 2003-02-24 (ubuntu - 08/07/09)
Section: 3 (Library functions)
NAME
XML::RSSLite - lightweight, "relaxed" RSS (and XML-ish) parser
SYNOPSIS
    use XML::RSSLite;

    . . .

    parseRSS(\%result, \$content);

    print "=== Channel ===\n",
          "Title: $result{'title'}\n",
          "Desc: $result{'description'}\n",
          "Link: $result{'link'}\n\n";

    foreach $item (@{$result{'item'}}) {
      print " --- Item ---\n",
            " Title: $item->{'title'}\n",
            " Desc: $item->{'description'}\n",
            " Link: $item->{'link'}\n\n";
    }
DESCRIPTION
This module attempts to extract the maximum amount of content from available documents, and is less concerned with XML compliance than alternatives. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure, and "aliases" certain tags so that when done, you can count on having the minimal data necessary for re-constructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items.

This module extracts more usable links by parsing "scriptingNews" and "weblog" formats in addition to RDF & RSS. It also "sanitizes" the output for best results. The munging includes:
- Remove HTML tags to leave plain text
- Remove characters other than 0-9~!@#$%^&*()-+=a-zA-Z[];',.:"<>?\s
- Use <url> tags when <link> is empty
- Use misplaced URLs in <title> when <link> is empty
- Extract links from <a href=...> if required
- Limit links to ftp and http
- Join relative URLs to the site base
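The first two munging steps can be approximated with plain regular expressions. The following is a minimal sketch, not the module's actual code: the helper name `sanitize_text` is hypothetical, and the character class is simply copied from the list above.

```perl
use strict;
use warnings;

# Characters kept by the second munging step, copied from the list above.
# q{} keeps the backslashes literal so they reach the regex unchanged.
my $allowed = q{0-9~!@#$%^&*()\-+=a-zA-Z\[\];',.:"<>?\s};

# Hypothetical helper approximating the first two munging steps.
sub sanitize_text {
    my ($text) = @_;

    # Step 1: remove anything that looks like an HTML tag.
    $text =~ s/<[^>]*>//g;

    # Step 2: drop every character outside the allowed set.
    $text =~ s/[^$allowed]//g;

    return $text;
}

print sanitize_text('<p>Hello, <b>world</b>!</p>'), "\n";
```

Stripping tags before filtering characters matters here, because `<` and `>` are themselves in the allowed set and would otherwise survive as tag debris.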
EXPORT
- parseRSS($outHashRef, $inScalarRef)
  $inScalarRef is a reference to a scalar containing the document to be parsed; its contents will effectively be destroyed. $outHashRef is a reference to the hash in which the parsed content is stored.
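Because parsing effectively destroys the input scalar, a caller who still needs the raw document afterwards can hand parseRSS a reference to a copy. The sketch below assumes XML::RSSLite is installed; the feed text is made up for illustration, and the `eval` guard is a defensive assumption in case the parser dies on input it cannot recognize.

```perl
use strict;
use warnings;
use XML::RSSLite;    # parseRSS is exported by default

# Made-up sample feed, for illustration only.
my $original = <<'RSS';
<rss version="0.91">
  <channel>
    <title>Example Channel</title>
    <link>http://www.example.com/</link>
    <description>A sample feed</description>
    <item>
      <title>First item</title>
      <link>http://www.example.com/first.html</link>
    </item>
  </channel>
</rss>
RSS

# Parse a copy so that $original survives untouched.
my $scratch = $original;
my %result;

# Assumption: guard the call in case parsing dies on unrecognized input.
my $ok = eval { parseRSS(\%result, \$scratch); 1 };
warn "parse failed: $@" unless $ok;

print "Channel: $result{'title'}\n" if defined $result{'title'};
for my $item (@{ $result{'item'} || [] }) {
    print "  Item: $item->{'title'}\n" if defined $item->{'title'};
}
```

Copying costs one extra string in memory, which is usually acceptable for feeds; when the raw text is not needed again, passing a reference to the original scalar avoids the copy.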
EXPORTABLE
- parseXML(\%parsedTree, \$parseThis, 'topTag', $comments);
  - parsedTree - required
    Reference to the hash in which to store the parsed document.
  - parseThis - required
    Reference to the scalar containing the document to parse.
  - topTag - optional
    Tag to consider the root node; leaving this undefined is not recommended.
  - comments - optional
    - false will remove comments from parseThis
    - true will not remove comments from parseThis
    - array reference is true, comments are stored here
CAVEATS
This is not a conforming parser. It does not handle the following:
- <foo bar=">">
- <foo><bar> <bar></bar> <bar></bar> </bar></foo>
- <![CDATA[ ]]>
- PI (processing instructions)
It is non-validating; without a DTD, the following cannot be properly addressed:
- entities
- namespaces
This might be arriving in the next release.
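The first caveat is typical of regex-based tag matching in general. The sketch below uses a deliberately naive pattern (an assumption for illustration, not the module's actual expression) to show how a `>` inside an attribute value ends the match early.

```perl
use strict;
use warnings;

my $xml = '<foo bar=">">text</foo>';

# Naive pattern: a tag runs from '<' up to the first '>'.
my ($tag) = $xml =~ /(<[^>]*>)/;

# The match stops at the '>' inside the attribute value,
# so the captured "tag" is cut short.
print "matched tag: $tag\n";    # prints: matched tag: <foo bar=">
```

A conforming parser tracks quoting state while scanning attributes, which is the kind of work this module trades away for simplicity.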
SEE ALSO
perl(1), "XML::RSS", "XML::SAX::PurePerl", "XML::Parser::Lite", "XML::Parser"
AUTHOR
Jerrad Pierce <jpierce@cpan.org>
Scott Thomason <scott@thomasons.org>
LICENSE
Portions Copyright (c) 2002 Jerrad Pierce, (c) 2000 Scott Thomason. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
POD ERRORS
Hey! The above document had some coding errors, which are explained below:
- Around line 407:
  You forgot a '=back' before '=head2'
- Around line 443:
  =back without =over