Bio::ASN1::EntrezGene::Indexer.3pm
Language: en
Version: 2005-05-04 (debian - 07/07/09)
Section: 3 (Function libraries)
NAME
Bio::ASN1::EntrezGene::Indexer - Indexes NCBI Entrez Gene files.
SYNOPSIS
  use Bio::ASN1::EntrezGene::Indexer;

  # creating & using the index is just a few lines
  my $inx = Bio::ASN1::EntrezGene::Indexer->new(
    -filename => 'entrezgene.idx',
    -write_flag => 'WRITE'); # needed for make_index call, but if opening
                             # existing index file, don't set write flag!
  $inx->make_index('Homo_sapiens', 'Mus_musculus', 'Rattus_norvegicus');
  my $seq = $inx->fetch(10); # Bio::Seq obj for Entrez Gene #10
  # alternatively, if one prefers just a data structure instead of objects
  $seq = $inx->fetch_hash(10); # a hash produced by Bio::ASN1::EntrezGene
                               # that contains all data in the Entrez Gene record
  # note that in case you wonder, you can get the files 'Homo_sapiens'
  # from NCBI Entrez Gene ftp download, DATA/ASN/Mammalia directory
PREREQUISITE
Bio::ASN1::EntrezGene, a Bioperl version that contains Stefan Kirov's entrezgene.pm, and all dependencies therein.
INSTALLATION
Same as Bio::ASN1::EntrezGene
DESCRIPTION
Bio::ASN1::EntrezGene::Indexer is a Perl indexer for NCBI Entrez Gene genome databases. It processes ASN.1-formatted Entrez Gene records and stores the file position of each record in a way compliant with the Bioperl standard (in fact it's a subclass of Bioperl's index objects).
Note that this module does not parse records, because it needs to run fast and grab only the gene ids. To parse records, use Bio::ASN1::EntrezGene, or better yet, use Bio::SeqIO, format 'entrezgene'.
It takes this module (version 1.07) 21 seconds to index the human genome Entrez Gene file (Apr. 5/2005 download) on one 2.4 GHz Intel Xeon processor.
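The idea of storing a file position per record can be illustrated with a minimal, self-contained sketch. It is not the module's own code: it assumes a toy text format in which every record starts with a line "ID <number>" (real Entrez Gene ASN.1 records look different), and the function names build_offset_index and fetch_record are hypothetical.

```perl
use strict;
use warnings;

# Build an in-memory index: record id => byte offset of the record start.
# ASSUMPTION: toy format, each record begins with a line "ID <n>".
sub build_offset_index {
    my ($path) = @_;
    my %offset_of;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (1) {
        my $pos  = tell($fh);          # position before reading the line
        my $line = <$fh>;
        last unless defined $line;
        $offset_of{$1} = $pos if $line =~ /^ID (\d+)/;
    }
    close $fh;
    return \%offset_of;
}

# Jump straight to one record and read it up to the next record start.
sub fetch_record {
    my ($path, $index, $id) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek($fh, $index->{$id}, 0);       # 0 means "from start of file"
    my $record = <$fh>;                # the "ID <n>" header line
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^ID \d+/;    # next record begins here
        $record .= $line;
    }
    close $fh;
    return $record;
}
```

Only the id lines are matched while indexing, so the scan never has to interpret the record body; a later lookup costs one seek plus reading a single record.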
SEE ALSO
For details on the various parsers I generated for Entrez Gene, and for example scripts that use or benchmark the modules, please see <http://sourceforge.net/projects/egparser/>. Those other parsers etc. are included in the V1.05 download.
AUTHOR
Dr. Mingyi Liu <mingyi.liu@gpc-biotech.com>
COPYRIGHT
The Bio::ASN1::EntrezGene module and its related modules and scripts are copyright (c) 2005 Mingyi Liu, GPC Biotech AG and Altana Research Institute. All rights reserved. I created these modules when working on a collaboration project between these two companies. Therefore, special thanks to the two companies for allowing the release of the code into the public domain.
You may use and distribute them under the terms of Perl itself or the GPL (<http://www.gnu.org/copyleft/gpl.html>).
CITATION
Liu, M and Grigoriev, A (2005) "Fast Parsers for Entrez Gene" Bioinformatics. In press
OPERATING SYSTEMS SUPPORTED
Any OS that Perl & Bioperl run on.
METHODS
fetch
Parameters: $geneid - id for the Entrez Gene record to be retrieved
Example:    my $seq = $indexer->fetch(10); # get Entrez Gene #10
Function:   fetch the data for the given Entrez Gene id.
Returns:    A Bio::Seq object produced by Bio::SeqIO::entrezgene
Notes:      One needs to have Bio::SeqIO::entrezgene installed before
            calling this function!
fetch_hash
Parameters: $geneid - id for the Entrez Gene record to be retrieved
Example:    my $hash = $indexer->fetch_hash(10); # get Entrez Gene #10
Function:   fetch a hash produced by Bio::ASN1::EntrezGene for given
            Entrez Gene id.
Returns:    A data structure containing all data items from the Entrez
            Gene record.
Notes:      Alternative to fetch()
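As a hedged usage sketch, the two fetch methods can be tried against an index that already exists, opened without the write flag as the SYNOPSIS advises. The helper open_index_readonly is hypothetical, and the file name 'entrezgene.idx' is taken from the SYNOPSIS. The sketch assumes that new() throws an exception when the index cannot be opened, so it returns undef if the module is not installed or the index is unavailable. Data::Dumper is used only to inspect whatever structure fetch_hash returns, since its layout is not described here.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: open an existing index read-only.
# Returns undef when the module or the index is not available.
sub open_index_readonly {
    my ($idx_path) = @_;
    return unless eval { require Bio::ASN1::EntrezGene::Indexer; 1 };
    # No -write_flag: the index is only read, never rebuilt.
    my $inx = eval {
        Bio::ASN1::EntrezGene::Indexer->new(-filename => $idx_path);
    };
    warn "Could not open index $idx_path: $@" unless $inx;
    return $inx;
}

my $inx = open_index_readonly('entrezgene.idx');
if ($inx) {
    # Object route: needs Bio::SeqIO::entrezgene to be installed.
    my $seq = $inx->fetch(10);
    print "fetch returned a ", ref($seq), "\n";

    # Data-structure route: print only the top levels of the result.
    my $data = $inx->fetch_hash(10);
    local $Data::Dumper::Maxdepth = 2;
    print Dumper($data);
}
```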
_file_handle
Title   : _file_handle
Usage   : $fh = $index->_file_handle( INT )
Function: Returns an open filehandle for the file index INT. On
          opening a new filehandle it caches it in the
          @{$index->_filehandle} array. If the requested filehandle
          is already open, it simply returns it from the array.
Example : $first_file_indexed = $index->_file_handle( 0 );
Returns : ref to a filehandle
Args    : INT
Notes   : This function is copied from Bio::Index::Abstract. Once that
          module changes its file handle code, as I do below, to fit
          perl 5.005_03, this sub will be removed from this module.
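The open-once-then-cache behaviour described for _file_handle can be sketched in plain Perl. This is an illustration of the pattern, not the code copied from Bio::Index::Abstract; the names new_cache and file_handle and the hash keys 'files' and 'handles' are assumptions made for the example.

```perl
use strict;
use warnings;

# A tiny stand-in for an index object: it knows its file names and keeps
# opened handles in an array, one slot per file number.
# ASSUMPTION: the keys 'files' and 'handles' are illustrative only.
sub new_cache {
    my (@files) = @_;
    return { files => [@files], handles => [] };
}

# Return the handle for file number $i, opening the file only on the
# first request and reusing the cached handle afterwards.
sub file_handle {
    my ($cache, $i) = @_;
    return $cache->{handles}[$i] if defined $cache->{handles}[$i];
    my $path = $cache->{files}[$i];
    die "No file with index $i\n" unless defined $path;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    $cache->{handles}[$i] = $fh;       # remember it for later calls
    return $fh;
}
```

Because the same handle is returned on every call, repeated fetches from one data file avoid the cost of reopening it; callers are expected to seek to the record position before reading.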