gsnap
Language: en
Version: 337652 (ubuntu - 24/10/10)
Section: 1 (User commands)
NAME
gsnap - Genomic Short-read Nucleotide Alignment Program
SYNOPSIS
gsnap -dDB [OPTION]... [QUERY]...
DESCRIPTION
Align the sequences QUERY to the reference DB. With no QUERY, read standard input.
OPTIONS
Input options
- -D, --dir=directory
- Genome directory
- -d, --db=STRING
- Genome database
- -q, --part=INT/INT
- Process only the i-th out of every n sequences, e.g., 0/100 or 99/100
- -c, --circular-input
- Circular-end data (paired reads are on same strand)
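The --part selection can be pictured with a short sketch. This is an illustration only, not GSNAP's code: it assumes that sequences are counted from zero and that sequence number k is kept when k modulo n equals i. The helper name select_part is hypothetical.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def select_part(sequences: Iterable[T], part: str) -> Iterator[T]:
    """Yield the i-th out of every n sequences, given a spec such as "0/100".

    Assumption (not stated in the manual): sequences are numbered from 0
    and sequence k is kept when k % n == i.
    """
    i_text, n_text = part.split("/")
    i, n = int(i_text), int(n_text)
    if n <= 0 or not 0 <= i < n:
        raise ValueError(f"invalid part specification: {part!r}")
    for index, sequence in enumerate(sequences):
        if index % n == i:
            yield sequence


if __name__ == "__main__":
    reads = ["read0", "read1", "read2", "read3", "read4", "read5"]
    print(list(select_part(reads, "1/3")))
```

Under these assumptions, running the parts 0/n through (n-1)/n as separate jobs would cover every sequence exactly once.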
Computation options
Note: GSNAP has an ultrafast algorithm for calculating mismatches up to and including ((readlength+2)/12 - 2) ("ultrafast mismatches"). The program will run fastest if max-mismatches (plus suboptimal-levels) is within that value. Also, indels, especially end indels, take longer to compute, although the algorithm is still designed to be fast.
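The ultrafast threshold in the note above can be worked out for a given read length. The sketch below is a hedged illustration: it assumes the division is integer division (as in C) and clamps negative results to zero for very short reads. Neither detail is stated in this manual, and the function names are hypothetical.

```python
def ultrafast_mismatches(readlength: int) -> int:
    """Evaluate ((readlength + 2) / 12 - 2) from the note above.

    Assumptions: integer (floor) division, and a lower bound of 0
    for reads too short to give a positive value.
    """
    return max(0, (readlength + 2) // 12 - 2)


def is_ultrafast(readlength: int, max_mismatches: int, suboptimal: int = 0) -> bool:
    """True if max-mismatches plus the suboptimal setting stays within the ultrafast level."""
    return max_mismatches + suboptimal <= ultrafast_mismatches(readlength)


if __name__ == "__main__":
    for length in (36, 75, 100):
        print(length, ultrafast_mismatches(length))
```

With these assumptions, a 36-base read gives a threshold of 1, a 75-base read gives 4, and a 100-base read gives 6.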
- -B, --batch=INT
- Batch mode (0 = no pre-loading, 1 = pre-load only indices; 2 (default) = pre-load both indices and genome)
- -m, --max-mismatches=FLOAT
- Maximum number of mismatches allowed (if not specified, then defaults to the ultrafast level of ((readlength+2)/12 - 2)). If specified between 0.0 and 1.0, then treated as a fraction of each read length. Otherwise, treated as an integral number of mismatches (including indel and splicing penalties).
- -i, --indel-penalty=INT
- Penalty for an indel (default 1000, essentially turning it off). Counts against mismatches allowed. To find indels, make indel-penalty less than or equal to max-mismatches. For 2-base reads, need to set indel-penalty somewhat high.
- -I, --indel-endlength=INT
- Minimum length at end required for indel alignments (default 3)
- -y, --max-middle-insertions=INT
- Maximum number of middle insertions allowed (default 9)
- -z, --max-middle-deletions=INT
- Maximum number of middle deletions allowed (default 30)
- -Y, --max-end-insertions=INT
- Maximum number of end insertions allowed (default 3)
- -Z, --max-end-deletions=INT
- Maximum number of end deletions allowed (default 6)
- -M, --suboptimal-score=INT
- Report suboptimal hits beyond best hit (default 0). All hits with best score plus suboptimal-score are reported.
- -R, --masking=INT
- Masking of frequent/repetitive oligomers to avoid spending time on non-unique or repetitive reads
0 = no masking (will try to find non-unique or repetitive matches)
1 = mask frequent oligomers
2 = mask frequent and repetitive oligomers (fastest) (default)
3 = greedy frequent: mask frequent oligomers first, then try no masking if alignments not found
4 = greedy repetitive: mask frequent and repetitive oligomers first, then try no masking if alignments not found
- -T, --trim=INT
- Trim mismatches at ends (0 = no (default), 1 = yes)
- -2, --dibase
- Input is 2-base encoded (e.g., SOLiD), with database built previously using dibaseindex
- -C, --cmet
- Use database for methylcytosine experiments, built previously using cmetindex
- -V, --usesnps=STRING
- Use database containing known SNPs (in <STRING>.iit, built previously using snpindex) for tolerance to SNPs
- -g, --geneprob=STRING
- Use IIT file containing geneprob (in <STRING>.iit, of cumulative format >(count) (genomicpos)) to resolve ties
- -t, --nthreads=INT
- Number of worker threads
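How a --max-mismatches value is interpreted, as described for -m above, can be sketched as follows. This is a sketch under stated assumptions, not the program's implementation: values strictly between 0.0 and 1.0 are taken as fractions, the fractional result is rounded to the nearest integer, and the default uses integer division. The function name resolve_max_mismatches is hypothetical.

```python
from typing import Optional


def resolve_max_mismatches(value: Optional[float], readlength: int) -> int:
    """Turn a --max-mismatches setting into a mismatch count for one read.

    Assumptions (not confirmed by the manual): the fraction range is
    exclusive at both ends, fractions are rounded to the nearest integer,
    and the default formula uses integer division clamped at 0.
    """
    if value is None:
        # Not specified: fall back to the ultrafast level.
        return max(0, (readlength + 2) // 12 - 2)
    if 0.0 < value < 1.0:
        # Fraction of the read length.
        return round(value * readlength)
    # Otherwise an integral number of mismatches.
    return int(value)


if __name__ == "__main__":
    print(resolve_max_mismatches(None, 36))
    print(resolve_max_mismatches(0.05, 100))
    print(resolve_max_mismatches(3, 36))
```

Expressing the limit as a fraction lets one setting scale across reads of different lengths, while an integer applies the same limit to every read.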
Splicing options for RNA-Seq
- -s, --splicesites=STRING
- Look for splicing involving known splice sites (in <STRING>.iit), at short or long distances
- -N, --novelsplicing=INT
- Look for novel splicing, not in known splice sites (if -s provided) within shortsplicedist (-w flag) or with novelspliceprob (-x flag)
- -w, --localsplicedist=INT
- Definition of local novel splicing event (default 200000)
- -e, --local-splice-penalty=INT
- Penalty for a local splice (default 2). Counts against mismatches allowed
- -E, --distant-splice-penalty=INT
- Penalty for a distant splice (default 3). Counts against mismatches allowed
- -k, --local-splice-endlength=INT
- Minimum length at end required for local spliced alignments (default 15, min is 14)
- -K, --distant-splice-endlength=INT
- Minimum length at end required for distant spliced alignments (default 16, min is 14)
- -J, --distant-splice-identity=FLOAT
- Minimum identity at end required for distant spliced alignments (default 0.95)
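Because the indel and splice penalties count against the mismatches allowed, an alignment can be thought of as spending from a single budget. The sketch below illustrates that accounting with the defaults listed in this manual (indel 1000, local splice 2, distant splice 3). It assumes that each event is charged once and that the charges add linearly, which the manual does not spell out. All names are hypothetical.

```python
def alignment_cost(
    mismatches: int,
    indels: int = 0,
    local_splices: int = 0,
    distant_splices: int = 0,
    indel_penalty: int = 1000,        # default of -i
    local_splice_penalty: int = 2,    # default of -e
    distant_splice_penalty: int = 3,  # default of -E
) -> int:
    """Total cost charged against max-mismatches (assumed to be a plain sum)."""
    return (
        mismatches
        + indels * indel_penalty
        + local_splices * local_splice_penalty
        + distant_splices * distant_splice_penalty
    )


def is_allowed(cost: int, max_mismatches: int) -> bool:
    """An alignment is kept only if its cost fits within the budget."""
    return cost <= max_mismatches


if __name__ == "__main__":
    # One mismatch plus one local splice costs 3 with the defaults.
    print(alignment_cost(1, local_splices=1))
    # With the default indel penalty, any indel exceeds a small budget.
    print(is_allowed(alignment_cost(0, indels=1), max_mismatches=4))
```

Seen this way, the default indel penalty of 1000 switches indels off simply because no realistic budget can cover it, and lowering it to at most max-mismatches makes indels affordable.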
Options for paired-end reads
- -P, --pairmax=INT
- Max total genomic length for paired reads (default 1000). Should increase for RNA-Seq reads.
- -p, --pairlength=INT
- Expected paired-end length (default 200)
Output options
- -n, --npaths=INT
- Maximum number of paths to print (default 100).
- -Q, --quiet-if-excessive
- If more than maximum number of paths are found, then nothing is printed.
- -O, --ordered
- Print output in same order as input (relevant only if there is more than one worker thread)
- -S, --print-snps=INT
- Print detailed information about SNPs in reads (works only if -V also selected) (0=no (default), 1=positions and labels)
- -F, --failsonly
- Print only failed alignments, those with no results
- -f, --nofails
- Exclude printing of failed alignments
- -A, --format=STRING
- Another format type, other than default. Currently implemented: sam
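As an illustration of how the options combine, the following sketch assembles a gsnap argument list using only long options documented above. The database name, query file name, and helper name are hypothetical placeholders. The sketch only builds and prints the command; it does not run gsnap.

```python
import shlex
from typing import List, Optional


def build_gsnap_command(
    db: str,
    queries: List[str],
    genome_dir: Optional[str] = None,
    nthreads: Optional[int] = None,
    npaths: Optional[int] = None,
    ordered: bool = False,
    sam: bool = False,
) -> List[str]:
    """Build an argument list for gsnap from options documented in this page."""
    cmd = ["gsnap", f"--db={db}"]
    if genome_dir is not None:
        cmd.append(f"--dir={genome_dir}")
    if nthreads is not None:
        cmd.append(f"--nthreads={nthreads}")
    if npaths is not None:
        cmd.append(f"--npaths={npaths}")
    if ordered:
        # Only relevant with more than one worker thread.
        cmd.append("--ordered")
    if sam:
        cmd.append("--format=sam")
    cmd.extend(queries)
    return cmd


if __name__ == "__main__":
    # "mygenome" and "reads.fa" are placeholder names.
    command = build_gsnap_command("mygenome", ["reads.fa"], nthreads=4, ordered=True, sam=True)
    print(shlex.join(command))
```

Passing the list to a process launcher rather than a shell string avoids quoting problems with file names that contain spaces.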
Help options
- -v, --version
- Show version
- -?, --help
- Show this help message
ENVIRONMENT
- GMAPDB
- genome directory (equivalent to -D)
FILES
- ~/.gmaprc
- configuration file
AUTHOR
Thomas D. Wu and Colin K. Watanabe
REPORTING BUGS
Report bugs to Thomas Wu <twu@gene.com>.
COPYRIGHT
Copyright 2005 Genentech, Inc. All rights reserved.
SEE ALSO
gmap_setup(1), gmap(1)
http://research-pub.gene.com/gmap/