# gp_mkmtx

Language: en

Version: 111685 (mandriva - 01/05/08)

Section: 1 (User commands)

# Contents

## NAME

gp_mkmtx - calculate frequencies of nucleotides

## SYNOPSIS

gp_mkmtx [-a] [-g value] [-l] [-q] [-v] [-d] [-h] [inputfile] [outputfile]

## OPTIONS

-a
print only the absolute numbers of occurrences
-g value
divide each frequency by the expected frequency at a GC content of value %.
-l
do not apply logarithmic scaling (by default, gp_mkmtx calculates the logarithm of the frequencies).
-v
Prints the version information.
-d
Prints lots of debugging information.
-h
Shows usage information.
inputfile
file to process; if not given, will use standard input
outputfile
file to write the data to; if not given, will use standard output

## DESCRIPTION

gp_mkmtx is intended as a tool for easily creating matrices for the gp_matrix program. It takes a set of sequences and calculates the frequency of each nucleotide at each position, starting from the first nucleotide and ending with the last nucleotide of the shortest sequence. For each position, four values are printed in a row, for A, C, G and T/U respectively. Each value is the logarithm of the calculated frequency (the logarithmic scaling can be suppressed with the -l option). If the -g option is used, the values are divided by the expected frequency at the given GC content before the logarithmic scaling (for example, at GC=50%, the expected frequency is 0.25 for each nucleotide).
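The calculation described above can be sketched in Python. This is only an illustration, not the gp_mkmtx source: the function name `position_matrix` and its parameters are hypothetical, the natural logarithm is an assumption (the base is not stated here), mapping a zero frequency to negative infinity is an assumption, and so is the background model for GC values other than 50% (G and C share the GC fraction equally, A and T share the remainder).

```python
import math

NUCLEOTIDES = "ACGT"  # column order A, C, G, T/U; U is counted as T


def position_matrix(sequences, gc_percent=None, use_log=True, absolute=False):
    """Return one row [A, C, G, T/U] per position of the shortest sequence.

    Hypothetical helper mirroring the -g, -l and -a options; not the
    actual gp_mkmtx implementation.
    """
    if not sequences:
        return []
    # Only positions present in every sequence are considered.
    length = min(len(seq) for seq in sequences)

    expected = None
    if gc_percent is not None:
        # Assumed background: G and C split the GC fraction, A and T the rest.
        gc = gc_percent / 100.0
        expected = {"A": (1 - gc) / 2, "C": gc / 2,
                    "G": gc / 2, "T": (1 - gc) / 2}

    rows = []
    for pos in range(length):
        counts = dict.fromkeys(NUCLEOTIDES, 0)
        for seq in sequences:
            base = seq[pos].upper().replace("U", "T")
            if base in counts:  # other symbols are ignored in this sketch
                counts[base] += 1
        if absolute:
            # Like -a: raw occurrence counts only.
            rows.append([counts[n] for n in NUCLEOTIDES])
            continue
        row = []
        for n in NUCLEOTIDES:
            value = counts[n] / len(sequences)
            if expected is not None:
                # Like -g: divide by the expected frequency before scaling.
                value /= expected[n]
            if use_log:
                # Assumption: natural log; zero frequency becomes -inf.
                value = math.log(value) if value > 0 else float("-inf")
            row.append(value)
        rows.append(row)
    return rows
```

Calling `position_matrix(["ACGT", "ACGA", "AC"], gc_percent=50)` yields two rows, because the shortest sequence has two nucleotides.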

## EXAMPLES

gp_mkmtx -g 50 somesequence.fasta somesequence.mtx

will produce a matrix file somesequence.mtx which, after some editing, will be directly suitable for the gp_matrix program.

## DIAGNOSTICS

All Genpak programs complain in situations where you would also complain, such as when they cannot find a sequence you gave them or when the sequence is not valid.

The Genpak programs do not write over existing files. I have found this feature very useful :-)

## BUGS

I'm sure there are plenty left, so please mail me if you find them. I tried to clean up every bug I could find.

## AUTHOR

January Weiner III <january@bioinformatics.org>

