TFBS::Matrix::ICM.3pm

Version: 2008-01-24 (ubuntu - 24/10/10)

Section: 3 (Library functions)

NAME

TFBS::Matrix::ICM - class for information content matrices of nucleotide patterns

SYNOPSIS

creating a TFBS::Matrix::ICM object manually:
     my $matrixref = [ [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
                       [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
                       [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
                       [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ]
                     ];  
     my $icm = TFBS::Matrix::ICM->new(-matrix => $matrixref,
                                      -name   => "MyProfile",
                                      -ID     => "M0001"
                                     );
  
     # or
  
     my $matrixstring = <<ENDMATRIX
     2.00   0.30   0.00   0.00   0.24   0.00
     0.00   0.00   0.00   1.45   0.42   0.00
     0.00   0.89   2.00   0.00   0.00   0.00
     0.00   0.00   0.00   0.13   0.06   2.00
     ENDMATRIX
     ;
     my $icm = TFBS::Matrix::ICM->new(-matrixstring => $matrixstring,
                                      -name         => "MyProfile",
                                      -ID           => "M0001"
                                     );
 
 
retrieving a TFBS::Matrix::ICM object from a database:

(See documentation of individual TFBS::DB::* modules to learn how to connect to different types of pattern databases and retrieve TFBS::Matrix::* objects from them.)

     my $db_obj = TFBS::DB::JASPAR2->new
                     (-connect => ["dbi:mysql:JASPAR2:myhost",
                                   "myusername", "mypassword"]);
     my $icm = $db_obj->get_Matrix_by_ID("M0001", "ICM");
     # or
     my $icm = $db_obj->get_Matrix_by_name("MyProfile", "ICM");
 
 
retrieving a list of individual TFBS::Matrix::ICM objects from a TFBS::MatrixSet object:

(See documentation of TFBS::MatrixSet to learn how to create objects for storage and manipulation of multiple matrices.)

     my @icm_list = $matrixset->all_patterns(-sort_by=>"name");
 
 

drawing a sequence logo:

     $icm->draw_logo(-file        => "logo.png",
                     -full_scale  => 2.25,
                     -xsize       => 500,
                     -ysize       => 250,
                     -graph_title => "C/EBPalpha binding site logo",
                     -x_title     => "position",
                     -y_title     => "bits");
 
 

DESCRIPTION

TFBS::Matrix::ICM is a class whose instances are objects representing information content matrices (ICMs). An ICM is normally calculated from a raw position frequency matrix (see TFBS::Matrix::PFM for the explanation of position frequency matrices). For example, given the following position frequency matrix,
     A:[ 12     3     0     0     4     0  ]
     C:[  0     0     0    11     7     0  ]
     G:[  0     9    12     0     0     0  ]
     T:[  0     0     0     1     1    12  ]
 
 

the standard computational procedure is applied to convert it into the following information content matrix:

     A:[2.00  0.30  0.00  0.00  0.24  0.00]
     C:[0.00  0.00  0.00  1.45  0.42  0.00]
     G:[0.00  0.89  2.00  0.00  0.00  0.00]
     T:[0.00  0.00  0.00  0.13  0.06  2.00]
 
 

which contains the "weights" associated with the occurrence of each nucleotide at the given position in a pattern.
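
The conversion above can be sketched in plain Perl. This is a minimal illustration, not the TFBS implementation: it assumes the plain formula in which each column's information content is 2 bits minus its Shannon entropy, with no pseudocounts and no small-sample correction, and it scales each nucleotide's observed frequency by that value. The helper name pfm_to_icm is hypothetical. Under those assumptions it reproduces the numbers shown above.

```perl
use strict;
use warnings;

# Rows are A, C, G, T; columns are positions (the PFM from the example above).
my @pfm = (
    [ 12, 3,  0,  0, 4,  0 ],
    [  0, 0,  0, 11, 7,  0 ],
    [  0, 9, 12,  0, 0,  0 ],
    [  0, 0,  0,  1, 1, 12 ],
);

# Hypothetical helper, not part of TFBS.
# Assumes no pseudocounts and no small-sample correction.
sub pfm_to_icm {
    my ($pfm) = @_;
    my $width = scalar @{ $pfm->[0] };
    my @icm   = map { [ (0) x $width ] } 0 .. 3;

    for my $col (0 .. $width - 1) {
        my $total = 0;
        $total += $pfm->[$_][$col] for 0 .. 3;
        next unless $total > 0;    # leave an empty column at zero

        # Observed frequency of each nucleotide in this column
        my @p = map { $pfm->[$_][$col] / $total } 0 .. 3;

        # Shannon entropy in bits; terms with p == 0 contribute nothing
        my $entropy = 0;
        for my $p (@p) {
            $entropy -= $p * log($p) / log(2) if $p > 0;
        }

        # Column information content, shared out by frequency
        my $ic = 2 - $entropy;
        $icm[$_][$col] = $p[$_] * $ic for 0 .. 3;
    }
    return \@icm;
}

my $icm   = pfm_to_icm(\@pfm);
my @bases = qw(A C G T);
for my $row (0 .. 3) {
    my $cells = join "  ", map { sprintf("%.2f", $_) } @{ $icm->[$row] };
    print "$bases[$row]:[$cells]\n";
}
```

For the second column, for instance, the frequencies are 0.25 (A) and 0.75 (G), the entropy is about 0.81 bits, and the remaining 1.19 bits are split into 0.30 for A and 0.89 for G.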

An ICM can be converted with to_PWM() into a TFBS::Matrix::PWM object, which is equipped with methods to search nucleotide sequences and pairwise alignments of nucleotide sequences with the pattern it represents, and to return a set of sites in the nucleotide sequence (a TFBS::SiteSet object for a single-sequence search, and a TFBS::SitePairSet object for an alignment search).

FEEDBACK

Please send bug reports and other comments to the author.

AUTHOR - Boris Lenhard

Boris Lenhard <Boris.Lenhard@cgb.ki.se>

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.

new

  Title   : new
  Usage   : my $icm = TFBS::Matrix::ICM->new(%args)
  Function: constructor for the TFBS::Matrix::ICM object
  Returns : a new TFBS::Matrix::ICM object
  Args    : # you must specify exactly one of the following three:
  
            -matrix,      # reference to an array of arrays of numbers
               #or
            -matrixstring,# a string containing four lines
                          # of tab- or space-delimited numbers
               #or
            -matrixfile,  # the name of a file containing four lines
                          # of tab- or space-delimited numbers
            #######
  
            -name,        # string, OPTIONAL
            -ID,          # string, OPTIONAL
            -class,       # string, OPTIONAL
            -tags         # an array reference, OPTIONAL
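
To make the relationship between the -matrixstring and -matrix forms concrete, the following sketch turns a four-line, whitespace-delimited string into the array-of-arrays layout. It is only an illustration of the input format described above; parse_matrixstring is a hypothetical helper, and the constructor's own parsing may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of TFBS: converts the four-line text form
# into a reference to an array of arrays (rows A, C, G, T).
sub parse_matrixstring {
    my ($string) = @_;
    my @rows;
    for my $line (split /\n/, $string) {
        next unless $line =~ /\S/;        # skip blank lines
        my @values = split ' ', $line;    # tabs or spaces, leading space ignored
        push @rows, \@values;
    }
    die "expected 4 rows (A, C, G, T), got " . scalar(@rows) . "\n"
        unless @rows == 4;

    my $width = scalar @{ $rows[0] };
    for my $row (@rows) {
        die "rows differ in length\n" unless @$row == $width;
    }
    return \@rows;
}

my $matrixstring = <<'ENDMATRIX';
2.00   0.30   0.00   0.00   0.24   0.00
0.00   0.00   0.00   1.45   0.42   0.00
0.00   0.89   2.00   0.00   0.00   0.00
0.00   0.00   0.00   0.13   0.06   2.00
ENDMATRIX

my $matrixref = parse_matrixstring($matrixstring);
printf "%d rows x %d columns\n",
    scalar @$matrixref, scalar @{ $matrixref->[0] };
```

The resulting $matrixref has the same shape as the one passed to -matrix in the SYNOPSIS.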
 
 

to_PWM

  Title   : to_PWM
  Usage   : my $pwm = $icm->to_PWM()
  Function: converts an information content matrix (a TFBS::Matrix::ICM object)
            to a position weight matrix. At present it assumes a uniform
            background distribution of nucleotide frequencies.
  Returns : a new TFBS::Matrix::PWM object
  Args    : none; in future releases, it should be able to accept
            a user-defined background probability of the four
            nucleotides
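
As a rough illustration of what a uniform background means here, the sketch below derives log-odds weights from an information content matrix. It is not the code behind to_PWM(): the helper name icm_to_logodds, the way frequencies are recovered (dividing each cell by its column sum), and the pseudo-weight of 0.01 are all assumptions made for this example. The only element taken from the description above is the uniform background of 0.25 per nucleotide.

```perl
use strict;
use warnings;

# Rows are A, C, G, T (the ICM from the DESCRIPTION section).
my @icm = (
    [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
    [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
    [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
    [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ],
);

# Hypothetical helper, not part of TFBS.
sub icm_to_logodds {
    my ($icm, $pseudo) = @_;
    $pseudo = 0.01 unless defined $pseudo;    # assumed smoothing weight
    my $bg    = 0.25;                         # uniform background
    my $width = scalar @{ $icm->[0] };
    my @pwm;

    for my $col (0 .. $width - 1) {
        my $sum = 0;
        $sum += $icm->[$_][$col] for 0 .. 3;

        for my $row (0 .. 3) {
            # Each ICM cell is frequency * column information, so dividing by
            # the column sum recovers the frequency. A column with zero
            # information is treated as uniform.
            my $p = $sum > 0 ? $icm->[$row][$col] / $sum : $bg;

            # Mix in a little background so that log(0) never occurs
            $p = ($p + $pseudo * $bg) / (1 + $pseudo);

            $pwm[$row][$col] = log($p / $bg) / log(2);
        }
    }
    return \@pwm;
}

my $pwm = icm_to_logodds(\@icm);
for my $row (@$pwm) {
    print join("  ", map { sprintf("%6.2f", $_) } @$row), "\n";
}
```

Nucleotides seen more often than the background expectation get positive weights and the others get negative ones. Because the input values are rounded to two decimals, the recovered frequencies are approximate.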
 
 

draw_logo

  Title   : draw_logo
  Usage   : my $gdImageObj = $icm->draw_logo(%args)
  Function: Draws a "sequence logo", a graphical representation
            of a possibly degenerate fixed-width nucleotide
            sequence pattern, from the information content matrix
  Returns : a GD::Image object;
            if you only need the image file you can ignore it
  Args    : -file,       # the name of the output PNG image file
                         # OPTIONAL: default none
            -xsize       # width of the image in pixels
                         # OPTIONAL: default 600
            -ysize       # height of the image in pixels
                         # OPTIONAL: default 5/8 of -xsize
            -startpos    # start position in the logo for x axis
                         # OPTIONAL: default is 1
            -margin      # size of image margins in pixels
                         # OPTIONAL: default 15% of -ysize
            -full_scale  # the maximum value on the y-axis, in bits
                         # OPTIONAL: default 2.25
            -graph_title,# the graph title
                         # OPTIONAL: default none
            -x_title,    # x-axis title; OPTIONAL: default none
            -y_title     # y-axis title; OPTIONAL: default none
            -error_bars  # reference to an array of S.D. values for each column; OPTIONAL
            -ps          # if true, produces a postscript string instead of a GD::Image object
            -pdf         # if true AND the -file argument is used, produces an output pdf file
 
 

_draw_ps_logo

  Title   : _draw_ps_logo 
  Usage   : my $postscript_string = $icm->_draw_ps_logo(%args)
            Internal method, should be accessed using draw_logo()
  Function: Draws a "sequence logo", a graphical representation
            of a possibly degenerate fixed-width nucleotide
            sequence pattern, from the information content matrix
  Returns : a postscript string;
            if you only need the image file you can ignore it
  Args    : -file,       # the name of the output PNG image file
                         # OPTIONAL: default none
            -xsize       # width of the image in pixels
                         # OPTIONAL: default 600
            -ysize       # height of the image in pixels
                         # OPTIONAL: default 5/8 of -xsize
            -full_scale  # the maximum value on the y-axis, in bits
                         # OPTIONAL: default 2.25
            -graph_title,# the graph title
                         # OPTIONAL: default none
            -x_title,    # x-axis title; OPTIONAL: default none
            -y_title     # y-axis title; OPTIONAL: default none
 
 

_draw_svg_logo

name

ID

class

matrix

length

revcom

rawprint

prettyprint

The above methods are common to all matrix objects. Please consult TFBS::Matrix to find out how to use them.