Bio::DB::SeqHound.3pm

Language: en

Version: 2010-05-19 (ubuntu - 24/10/10)

Section: 3 (Library functions)

NAME

Bio::DB::SeqHound - Database object interface to SeqHound

SYNOPSIS

     use Bio::DB::SeqHound;
     $sh = Bio::DB::SeqHound->new();
 
     $seq = $sh->get_Seq_by_acc("CAA28783"); # Accession Number
 
     # or ...
 
     $seq = $sh->get_Seq_by_gi(4557225); # GI Number
 
 

VERSION

1.1

DESCRIPTION

SeqHound is a database of biological sequences and structures. This module retrieves sequence objects (Bio::Seq) from the SeqHound database at the Blueprint Initiative.

This Bioperl extension permits use of the SeqHound Database System, developed by researchers at

  The Blueprint Initiative
  Samuel Lunenfeld Research Institute
  Mount Sinai Hospital
  Toronto, Canada
 
 

FEEDBACK/BUGS

Known bug: retrieval fails for some RefSeq records that contain a CONTIG entry, for example GI 34871762.

<seqhound@blueprint.org>

MAILING LISTS

User feedback is an integral part of the evolution of this Bioperl module. Please send your comments and suggestions, preferably to the seqhound.usergroup mailing list. Your participation is much appreciated.

<seqhound.usergroup@lists.blueprint.org>

WEBSITE

For more information on SeqHound, see http://dogboxonline.unleashedinformatics.com/

DISCLAIMER

This software is provided 'as is' without warranty of any kind.

AUTHOR

Rong Yao, Hao Lieu, Ian Donaldson

<seqhound@blueprint.org>

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually preceded by an underscore (_).

new

  Title   : new
  Usage   : $sh = Bio::DB::SeqHound->new(@options);
  Function: Creates a new SeqHound handle
  Returns : New SeqHound handle
  Args    :
 
 

Routines in Bio::DB::WebDBSeqI from Bio::DB::RandomAccessI

get_Seq_by_id

  Title   : get_Seq_by_id
  Usage   : $seq = $db->get_Seq_by_id('ROA1_HUMAN'); 
  Function: Gets a Bio::Seq object by its name
  Returns : a Bio::Seq object
  Args    : the id (as a string) of a sequence
  Throws  : "id does not exist" exception
  Example : Each of these calls retrieves the same sequence record
            $seq = $db->get_Seq_by_id(56);        #retrieval by GI
            $seq = $db->get_Seq_by_id("X02597");  #retrieval by NCBI accession
            $seq = $db->get_Seq_by_id("BTACHRE"); #retrieval by sequence "name"
            A sequence "name" is a secondary identifier (usually assigned by
            the submitting database, external to the NCBI) that may not be
            visible in the GenBank flat file version of the record but is
            always present in the ASN.1 format.
  Note    : Because this function in GenBank.pm accepts a GI, an accession
            number or a sequence name, the SeqHound version accepts the same
            inputs. If the input uid is a number, it is treated as a GI. If
            it is a string, it is tried as an accession number first and, if
            that search fails, as a sequence name.
            SeqHound stores biological data from several source sequence
            databases (GenBank, GenPept, SwissProt, EMBL, RefSeq), so you
            can pass ids from any of these databases to this function.
            The Bio::Seq object returned by this function is identical to
            the one generated by GenBank.pm and GenPept.pm, but sometimes
            differs slightly in its SeqFeatures from the one generated by
            RefSeq.pm.
            Bio::Seq objects created by this function carry the NCBI
            versions of the SwissProt and EMBL sequence data.
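
The lookup order described in the Note above can be sketched in plain Perl. This is an illustration only, not the module's actual code: the resolve_gi helper and the two in-memory hashes standing in for the remote SeqHound lookups are hypothetical. Their contents mirror the Example above, where GI 56, accession X02597 and name BTACHRE refer to the same record.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the remote SeqHound lookups.
# Values follow the Example above: 56, X02597 and BTACHRE are one record.
my %gi_by_acc  = ( X02597  => 56 );
my %gi_by_name = ( BTACHRE => 56 );

# Resolve a user-supplied id to a GI, in the order given in the Note:
# a number is a GI; a string is tried as an accession, then as a name.
sub resolve_gi {
    my ($uid) = @_;
    return $uid if $uid =~ /^\d+$/;                        # already a GI
    return $gi_by_acc{$uid}  if exists $gi_by_acc{$uid};   # accession first
    return $gi_by_name{$uid} if exists $gi_by_name{$uid};  # then name
    return;                                                # nothing matched
}

for my $uid (56, 'X02597', 'BTACHRE', 'NOSUCHID') {
    my $gi = resolve_gi($uid);
    print "$uid => ", (defined $gi ? $gi : 'not found'), "\n";
}
```

In the real module an unresolved id leads to the "id does not exist" exception rather than an undefined return value.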
 
 

get_Seq_by_acc

   Title   : get_Seq_by_acc
   Usage   : $seq = $db->get_Seq_by_acc('M34830');
   Function: Gets a Bio::Seq object by accession number
   Returns : a Bio::Seq object
   Args    : the accession number as a string
   Throws  : "id does not exist" exception
   Note    : Because this function in GenBank.pm accepts an accession number
             or a sequence name, the SeqHound version accepts the same
             inputs. The input uid is tried as an accession number first
             and, if that search fails, as a sequence name.
             SeqHound stores biological data from several source sequence
             databases (GenBank, GenPept, SwissProt, EMBL, RefSeq), so you
             can pass ids from any of these databases to this function.
             The Bio::Seq object returned by this function is identical to
             the one generated by GenBank.pm and GenPept.pm, but sometimes
             differs slightly in its SeqFeatures from the one generated by
             RefSeq.pm.
             Bio::Seq objects created by this function carry the NCBI
             versions of the SwissProt and EMBL sequence data.
 
 

get_Seq_by_gi

  Title   : get_Seq_by_gi
  Usage   : $seq = $sh->get_Seq_by_gi('405830');
  Function: Gets a Bio::Seq object by gi number
  Returns : A Bio::Seq object
  Args    : gi number (as a string)
  Throws  : "gi does not exist" exception
  Note    : Calls the same code as get_Seq_by_id
 
 

get_Seq_by_version

  Title   : get_Seq_by_version
  Usage   : $seq = $db->get_Seq_by_version('X77802');
  Function: Gets a Bio::Seq object by sequence version
  Returns : A Bio::Seq object
  Args    : accession.version (as a string)
  Throws  : "acc.version does not exist" exception
  Note    : SeqHound keeps only the most up-to-date version of a sequence,
            so for the above example use
            $seq = $db->get_Seq_by_acc('X77802');
            rather than the versioned identifier X77802.1
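
A caller holding an accession.version string may want to drop the version suffix before calling get_Seq_by_acc. The following is a minimal sketch in plain Perl; the strip_version helper is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a trailing ".<digits>" version suffix,
# leaving a bare accession suitable for get_Seq_by_acc().
sub strip_version {
    my ($acc_version) = @_;
    (my $acc = $acc_version) =~ s/\.\d+$//;
    return $acc;
}

print strip_version('X77802.1'), "\n";   # prints X77802
print strip_version('X77802'),   "\n";   # prints X77802 (already bare)
```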
 
 

get_Stream_by_query

   Title   : get_Stream_by_query
   Usage   : $stream = $db->get_Stream_by_query($query);
   Function: Retrieves Seq objects from Entrez 'en masse', rather than one
             at a time.  For large numbers of sequences, this is far superior
             to get_Stream_by_[id/acc]().
   Example : $query_string = 'Candida maltosa 26S ribosomal RNA gene'; 
             $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
                                         -query=>$query_string);
             $stream = $sh->get_Stream_by_query($query);
             or
             $query = Bio::DB::Query::GenBank->new (-db=> 'nucleotide',
                                         -ids=>['X02597', 'X63732', 11002, 4557284]);
             $stream = $sh->get_Stream_by_query($query);
   Returns : a Bio::SeqIO stream object
   Args    : $query :   A Bio::DB::Query::GenBank object. It is suggested that
             you create a Bio::DB::Query::GenBank object and get the entry
             count before you fetch a potentially large stream.
 
 

get_Stream_by_id

   Title   : get_Stream_by_id
   Usage   : $stream = $db->get_Stream_by_id(['J05128', 'S43442', 34996479]);
   Function: Gets a series of Seq objects by unique identifiers
   Returns : a Bio::SeqIO stream object
   Args    : $ref : a reference to an array of unique identifiers for
                    the desired sequence entries. As in GenBank.pm,
                    GIs, accession numbers and sequence names are
                    accepted.
   Note    : Because this function in GenBank.pm accepts a GI, an accession
             number or a sequence name, the SeqHound version accepts the
             same inputs. If the input uid is a number, it is treated as a
             GI. If it is a string, it is tried as an accession number first
             and, if that search fails, as a sequence name.
             SeqHound stores biological data from several source sequence
             databases (GenBank, GenPept, SwissProt, EMBL, RefSeq), so you
             can pass ids from any of these databases to this function.
             The Bio::Seq objects returned by this function are identical to
             those generated by GenBank.pm and GenPept.pm, but sometimes
             differ slightly in their SeqFeatures from those generated by
             RefSeq.pm.
             Bio::Seq objects created by this function carry the NCBI
             versions of the SwissProt and EMBL sequence data.
 
 

get_Stream_by_acc

   Title   : get_Stream_by_acc
   Usage   : $stream = $db->get_Stream_by_acc(['M98777', 'M34830']);
   Function: Gets a series of Seq objects by accession numbers
   Returns : a Bio::SeqIO stream object
   Args    : $ref : a reference to an array of accession numbers for
                    the desired sequence entries
   Note    : For SeqHound, this just calls the same code as get_Stream_by_id()
 
 

get_Stream_by_gi

   Title   : get_Stream_by_gi
   Usage   : $stream = $db->get_Stream_by_gi([161966, 255064]);
   Function: Gets a series of Seq objects by gi numbers
   Returns : a Bio::SeqIO stream object
   Args    : $ref : a reference to an array of gi numbers for
                    the desired sequence entries
   Note    : For SeqHound, this just calls the same code as get_Stream_by_id()
 
 

get_request

  Title   : get_request
  Usage   : my $lcontent = $self->get_request;
  Function: Gets the output of a SeqHound API HTTP call
  Returns : the result of the remote call from SeqHound
  Args    : %qualifiers = a hash of qualifiers 
            (SeqHound function name, id, query etc)
  Example : $lcontent = $self->get_request(-funcname=>'SeqHoundGetGenBankff',
                                         -query=>'gi',
                                         -uid=>555);
  Note    : this function overrides the implementation in Bio::DB::WebDBSeqI.
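
How named qualifiers like those in the Example might be turned into a request URL can be sketched as follows. This is not the module's code. The build_request_url helper, the host name, and the wire format (the "fnct" parameter and the "<query>=<uid>" pair) are all assumptions made for illustration; only the qualifier names and the "seqrem" call name come from this page.

```perl
use strict;
use warnings;

# Hypothetical endpoint: a placeholder, not the real SeqHound server.
my $BASE_URL = 'http://seqhound.example.org/cgi-bin/seqrem';

# Build a request URL from BioPerl-style named qualifiers.
# The "fnct" parameter name and "<query>=<uid>" layout are assumptions.
sub build_request_url {
    my (%q) = @_;
    for my $key (qw(-funcname -query -uid)) {
        die "missing qualifier $key\n" unless defined $q{$key};
    }
    return sprintf '%s?fnct=%s&%s=%s',
        $BASE_URL, $q{-funcname}, $q{-query}, $q{-uid};
}

print build_request_url(
    -funcname => 'SeqHoundGetGenBankff',
    -query    => 'gi',
    -uid      => 555,
), "\n";
```

The sketch does no URL escaping, which is adequate only for simple values such as a function name and a numeric GI.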
 
 

postprocess_data

  Title   : postprocess_data
  Usage   : $self->postprocess_data (-funcname => $funcname,
                                     -lcontent => $lcontent,
                                     -outtype  => $outtype);
  Function: Processes the string returned by the HTTP seqrem call.
            The output type can be a string or a Bio::SeqIO object.
  Returns : a string or a Bio::SeqIO object, depending on -outtype
  Args    : $funcname is the name of the SeqHound API function
            $lcontent is the string output of the SeqHound server HTTP call
            $outtype is the output type, 'string' or 'Bio::SeqIO'
  Example : $seqio = $self->postprocess_data ( -lcontent => $lcontent,
                                         -funcname => 'SeqHoundGetGenBankffList',
                                         -outtype => 'Bio::SeqIO');
            or
            $gi = $self->postprocess_data( -lcontent => $lcontent,
                                         -funcname => 'SeqHoundFindAcc',
                                         -outtype => 'string');
  Note    : This method overrides the one that works for GenBank/GenPept;
            this version is for SeqHound.
 
 

_get_gi_from_name

  Title   : _get_gi_from_name
  Usage   : $self->_get_gi_from_name('J05128');
  Function: Gets the GI (GenInfo Identifier) for a sequence name
            in the SeqHound database
  Returns : GI or undef
  Args    : a string representing a sequence name
 
 

_get_gi_from_acc

  Title   : _get_gi_from_acc
  Usage   : $self->_get_gi_from_acc('M34830')
  Function: Gets the GI (GenInfo Identifier) for an accession number
            in the SeqHound database
  Returns : GI or undef
  Args    : a string representing an accession number
 
 

_get_Seq_from_gbff

  Title   : _get_Seq_from_gbff
  Usage   : $self->_get_Seq_from_gbff($str)
  Function: Gets a Bio::SeqIO stream object for a GI or a list of GIs
            in the SeqHound database
  Returns : Bio::SeqIO or undef
  Args    : a string representing a GI, or
            a reference to a list of GIs
  Example : $seq = $self->_get_Seq_from_gbff(141740);
            or
            $seq = $self->_get_Seq_from_gbff([141740, 255064, 45185482]);
 
 

_init_SeqHound

  Title   : _init_SeqHound
  Usage   : $self->_init_SeqHound();
  Function: Calls SeqHoundInit on the Blueprint server
  Returns : $result (TRUE or FALSE)
  Args    :
 
 

_MaxSizeArray

  Title   : _MaxSizeArray
  Usage   : $self->_MaxSizeArray(\@arr)
  Function: Gets an array no longer than the size limit
  Returns : an array no longer than the size limit
  Args    : a reference to an array
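
One plausible reading of "an array with the limit size" is that long id lists are handled in bounded batches. The sketch below shows that idea in plain Perl. The take_batch helper and the batch size of 3 are hypothetical and do not reflect the module's actual limit or implementation; the GI values are taken from examples elsewhere on this page.

```perl
use strict;
use warnings;

# Hypothetical helper: remove up to $max elements from the front of the
# referenced array and return them as a list.
sub take_batch {
    my ($ids, $max) = @_;
    return splice @$ids, 0, $max;
}

my @gis = (141740, 255064, 45185482, 161966, 405830);

# Drain the list in batches; the last batch may be shorter than $max.
while (@gis) {
    my @batch = take_batch(\@gis, 3);
    print join(',', @batch), "\n";
}
```

Because take_batch shortens the caller's array in place, the loop ends once every id has been handed out.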