Parse::MediaWikiDump::Links.3pm
Language: en
Version: 2009-11-21 (ubuntu - 24/10/10)
Section: 3 (Library functions)
NAME
Parse::MediaWikiDump::Links - Object capable of processing link dump files
ABOUT
This object is used to access the content of the SQL-based page links dump files by providing an iterative interface for extracting the individual links between articles. Objects returned are instances of Parse::MediaWikiDump::link.
SYNOPSIS
    $pmwd = Parse::MediaWikiDump->new;
    $links = $pmwd->links('pagelinks.sql');
    $links = $pmwd->links(\*FILEHANDLE);

    #print the links between articles
    while(defined($link = $links->next)) {
        print 'from ', $link->from, ' to ', $link->namespace, ':', $link->to, "\n";
    }
METHODS
- Parse::MediaWikiDump::Links->new
- Create a new instance of a page links dump file parser
- $links->next
- Return the next available Parse::MediaWikiDump::link object or undef if there is no more data left
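The iterator contract described above (next returns an object until the data runs out, then undef) can be sketched as a tally of links per target namespace. This is a minimal sketch under stated assumptions: count_by_namespace is a hypothetical helper, not part of Parse::MediaWikiDump, and the StubLink and StubLinks packages are stand-ins that exist only so the snippet runs without a dump file. With the real module, the object returned by $pmwd->links(...) would presumably be passed in their place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::MediaWikiDump): tally links by the
# namespace id of their target. It accepts any object whose next() method
# returns link-like objects and then undef, as documented under METHODS.
sub count_by_namespace {
    my ($links) = @_;
    my %count;
    while (defined(my $link = $links->next)) {
        $count{ $link->namespace }++;
    }
    return \%count;
}

# Stand-in link object for illustration only; it mimics the from,
# namespace and to accessors used in the SYNOPSIS.
package StubLink {
    sub new       { my ($class, %args) = @_; return bless {%args}, $class }
    sub from      { $_[0]{from} }
    sub namespace { $_[0]{namespace} }
    sub to        { $_[0]{to} }
}

# Stand-in iterator for illustration only; next() hands out queued
# links one at a time and returns undef once the queue is empty.
package StubLinks {
    sub new  { my ($class, @links) = @_; return bless { queue => [@links] }, $class }
    sub next { my ($self) = @_; return shift @{ $self->{queue} } }
}

my $links = StubLinks->new(
    StubLink->new(from => 1, namespace => 0,  to => 'Perl'),
    StubLink->new(from => 1, namespace => 14, to => 'Programming_languages'),
    StubLink->new(from => 2, namespace => 0,  to => 'Wiki'),
);

my $count = count_by_namespace($links);
print "namespace $_: $count->{$_} link(s)\n"
    for sort { $a <=> $b } keys %$count;
```

Because the helper relies only on next and namespace, it does not care whether the links come from a file name or a file handle.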
EXAMPLE
List all links between articles in a friendly way
    #!/usr/bin/perl

    use strict;
    use warnings;

    use Parse::MediaWikiDump;

    my $pmwd = Parse::MediaWikiDump->new;
    my $links = $pmwd->links(shift) or die "must specify a pagelinks dump file";
    my $dump = $pmwd->pages(shift) or die "must specify an article dump file";
    my %id_to_namespace;
    my %id_to_pagename;

    binmode(STDOUT, ':utf8');

    #build a map between namespace ids to namespace names
    foreach (@{$dump->namespaces}) {
        my $id = $_->[0];
        my $name = $_->[1];

        $id_to_namespace{$id} = $name;
    }

    #build a map between article ids and article titles
    while(my $page = $dump->next) {
        my $id = $page->id;
        my $title = $page->title;

        $id_to_pagename{$id} = $title;
    }

    $dump = undef; #cleanup since we don't need it anymore

    while(my $link = $links->next) {
        my $namespace = $link->namespace;
        my $from = $link->from;
        my $to = $link->to;
        my $namespace_name = $id_to_namespace{$namespace};
        my $fully_qualified;
        my $from_name = $id_to_pagename{$from};

        if ($namespace_name eq '') {
            #default namespace
            $fully_qualified = $to;
        } else {
            $fully_qualified = "$namespace_name:$to";
        }

        print "Article \"$from_name\" links to \"$fully_qualified\"\n";
    }
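The namespace-qualification step of the example can be factored into a small function, which also makes it easy to handle a namespace id that is missing from the map. This is a sketch only: qualify_title is a hypothetical name, not part of Parse::MediaWikiDump, the sample map is made up for illustration, and treating an unknown namespace like the default one is an assumption rather than documented behaviour.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a display title from a namespace name and a
# page title. An empty name means the default namespace, as in the example.
sub qualify_title {
    my ($namespace_name, $title) = @_;

    # Assumption: an undefined name (id not present in the map) is treated
    # like the default namespace instead of triggering a warning.
    return $title if !defined $namespace_name || $namespace_name eq '';

    return "${namespace_name}:$title";
}

# Made-up sample map from namespace ids to namespace names.
my %id_to_namespace = (0 => '', 14 => 'Category');

print qualify_title($id_to_namespace{0},  'Perl'), "\n";
print qualify_title($id_to_namespace{14}, 'Programming_languages'), "\n";
print qualify_title($id_to_namespace{99}, 'Orphan'), "\n";
```

Checking for undef before the string comparison avoids an "uninitialized value" warning under use warnings when a link points into a namespace the article dump did not list.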