sphinx_fe
Language: en
Version: 367005 (MeeGo - 06/11/10)
Section: 1 (User commands)
NAME
sphinx_fe - Convert audio files to acoustic feature files
SYNOPSIS
sphinx_fe [ options ]...
DESCRIPTION
This program converts audio files (in Microsoft WAV, NIST Sphere, or raw format) to acoustic feature files for input to batch-mode speech recognition. The resulting files are also useful for various other things. A list of options follows:
- -alpha
- Preemphasis parameter
- -argfile
- File (e.g. feat.params from an acoustic model) to read parameters from. Values read from this file override anything set in other command line arguments.
- -blocksize
- Number of samples to read at a time.
- -build_outdirs
- Create missing subdirectories in output directory
- -c
- Control file for batch processing
- -cep2spec
- Input is cepstral files, output is log spectral files
- -di
- Input directory; input file names are relative to this, if defined
- -dither
- Add 1/2-bit noise
- -do
- Output directory; output files are relative to this
- -doublebw
- Use double bandwidth filters (same center freq)
- -ei
- Extension to be applied to all input files
- -eo
- Extension to be applied to all output files
- -example
- Shows example of how to use the tool
- -frate
- Frame rate
- -help
- Shows the usage of the tool
- -i
- Audio input file
- -input_endian
- Endianness of input data, big or little, ignored if NIST or MS Wav
- -lifter
- Length of sin-curve for liftering, or 0 for no liftering.
- -logspec
- Write out logspectral files instead of cepstra
- -lowerf
- Lower edge of filters
- -mach_endian
- Endianness of machine, big or little
- -mswav
- Defines input format as Microsoft Wav (RIFF)
- -ncep
- Number of cep coefficients
- -nchans
- Number of channels of data (interleaved samples assumed)
- -nfft
- Size of FFT
- -nfilt
- Number of filter banks
- -nist
- Defines input format as NIST sphere
- -npart
- Number of parts to run in (supersedes -nskip and -runlen if non-zero)
- -nskip
- If a control file was specified, the number of utterances to skip at the head of the file
- -o
- Cepstral output file
- -ofmt
- Format of output files - one of sphinx, htk, text.
- -part
- Index of the part to run (supersedes -nskip and -runlen if non-zero)
- -raw
- Defines input format as raw binary data
- -remove_dc
- Remove DC offset from each frame
- -round_filters
- Round mel filter frequencies to DFT points
- -runlen
- If a control file was specified, the number of utterances to process, or -1 for all
- -samprate
- Sampling rate
- -seed
- Seed for random number generator; if less than zero, a seed is chosen automatically
- -smoothspec
- Write out cepstral-smoothed logspectral files
- -spec2cep
- Input is log spectral files, output is cepstral files
- -transform
- Which type of transform to use to calculate cepstra (legacy, dct, or htk)
- -unit_area
- Normalize mel filters to unit area
- -upperf
- Upper edge of filters
- -verbose
- Show input filenames
- -warp_params
- Parameters defining the warping function
- -warp_type
- Warping function type (or shape)
- -whichchan
- Channel to process
- -wlen
- Hamming window length
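As a rough illustration of how the batch options above combine (-c, -di, -do, -ei, -eo, -mswav, -samprate), the following Python sketch assembles a command line without running it. The helper name build_sphinx_fe_cmd and the file and directory names are hypothetical. Passing boolean options with an explicit "yes" value is an assumption not stated on this page.

```python
import shlex


def build_sphinx_fe_cmd(ctl_file, in_dir, out_dir,
                        in_ext="wav", out_ext="mfc", samprate=16000):
    """Return an argv list for a batch-mode sphinx_fe run (hypothetical helper)."""
    return [
        "sphinx_fe",
        "-c", ctl_file,        # control file listing the utterances
        "-di", in_dir,         # input names are relative to this directory
        "-do", out_dir,        # output files are written relative to this one
        "-ei", in_ext,         # extension applied to every input file
        "-eo", out_ext,        # extension applied to every output file
        "-mswav", "yes",       # assumption: boolean flags take a yes/no value
        "-samprate", str(samprate),
    ]


# Placeholder names; substitute your own control file and directories.
cmd = build_sphinx_fe_cmd("utts.fileids", "wav", "feat")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

To execute the command, the list could be handed to subprocess.run(cmd, check=True), which requires sphinx_fe to be installed and on the PATH.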
Currently, the only kind of feature supported is MFCCs (mel-frequency cepstral coefficients). Numerous options control the properties of the output features. It is VERY important that you document the specific set of flags used to create any given set of feature files, since this information is NOT recorded in the files themselves. Any mismatch between the parameters used to extract features for recognition and those used to extract features for training will cause recognition to fail.
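Because the flags are not stored in the feature files, one possible way to document them is a small record file kept alongside the features. The Python sketch below is not part of sphinx_fe. The function names, the JSON layout, the file name feat_flags.json, and the flag values are all illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path


def save_feature_flags(flags, path):
    """Write the extraction flags to a JSON file kept next to the features."""
    Path(path).write_text(json.dumps(flags, indent=2, sort_keys=True))


def load_feature_flags(path):
    """Read back a flag record written by save_feature_flags."""
    return json.loads(Path(path).read_text())


def mismatched_flags(train_flags, decode_flags):
    """Return {flag: (training value, recognition value)} for every difference."""
    keys = sorted(set(train_flags) | set(decode_flags))
    return {
        k: (train_flags.get(k), decode_flags.get(k))
        for k in keys
        if train_flags.get(k) != decode_flags.get(k)
    }


# Illustrative values only; record whatever flags you actually passed.
train_flags = {"-samprate": 16000, "-nfilt": 40, "-lowerf": 130, "-upperf": 6800}
decode_flags = {**train_flags, "-nfilt": 31}

with tempfile.TemporaryDirectory() as tmp:
    record = Path(tmp) / "feat_flags.json"
    save_feature_flags(train_flags, record)
    recorded = load_feature_flags(record)

# Any entry reported here means training and recognition features differ.
print(mismatched_flags(recorded, decode_flags))
```

A check like this before recognition reports a differing flag, here -nfilt, so the mismatch does not go unnoticed. An alternative is to keep the flags in a parameter file and pass it with -argfile for both training and recognition.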
AUTHOR
Written by numerous people at CMU from 1994 onwards. This manual page by David Huggins-Daines <dhuggins@cs.cmu.edu>
COPYRIGHT
Copyright © 1994-2007 Carnegie Mellon University. See the file COPYING included with this package for more information.