slmseg

Language: en

Version: 2010-08-16 (ubuntu - 24/10/10)

Section: 1 (User commands)

Contents

NAME
SYNOPSIS
DESCRIPTION
OPTIONS
NOTES
AUTHOR
SEE ALSO

NAME

slmseg - segment Chinese text using maximum matching.

SYNOPSIS

slmseg -d dict_file [option]... [corpus_file]...

DESCRIPTION

slmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. slmseg segments corpus_file, or standard input if no filename is specified, and writes the segmented result to standard output.
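The general idea of maximum matching can be illustrated with a minimal sketch. This is a generic forward, greedy variant and not slmseg's actual implementation; the function name max_match, the sample lexicon, and the fallback of emitting a single character when nothing matches are all assumptions made for illustration.

```python
def max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching (illustrative only, not slmseg's code).

    At each position, take the longest substring found in the lexicon.
    If nothing matches, fall back to a single character (an assumption).
    """
    if max_len is None:
        max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical lexicon; a real one would be loaded from a dictionary file.
lexicon = {"北京", "大学", "北京大学", "学生"}
print(max_match("北京大学生", lexicon))  # ['北京大学', '生']
```

The example also shows the known weakness of the greedy strategy: the longest match "北京大学" is taken first, so the remaining "生" is left on its own even though "学生" is in the lexicon.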

OPTIONS

-d dict_file: Use dict_file as the lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f, --format (text|bin): Output format, either 'text' or 'bin'. Default: 'bin'. In text mode, the words themselves are output; in binary mode, the word ids are written to standard output as binary short integers.
-s, --stok STOK_ID: Sentence token id. Default: 10. In binary mode, it is written to the output after every sentence.
-i, --show-id: Show id information. In text mode, append the id after each known word. In binary mode, print the ids as text.
-m, --model language-model-file: Specify the language model file. This file is generated by slmthread.

NOTES

In binary mode, consecutive ids of 0 are merged into a single 0. In text mode, no space is inserted between unknown words.
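The merging rule for binary mode can be sketched as follows. This is only an illustration of the rule as stated, not slmseg's code. It assumes that id 0 marks a word not found in the lexicon, which the parallel text-mode rule suggests but this page does not state; the function name merge_zero_ids and the sample ids are hypothetical.

```python
def merge_zero_ids(ids):
    """Collapse each run of consecutive 0 ids into a single 0 (illustrative)."""
    merged = []
    for word_id in ids:
        # Skip a 0 that directly follows another 0.
        if word_id == 0 and merged and merged[-1] == 0:
            continue
        merged.append(word_id)
    return merged


# Hypothetical id sequence; 10 is the default sentence token id.
print(merge_zero_ids([12, 0, 0, 0, 47, 0, 10]))  # [12, 0, 47, 0, 10]
```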

AUTHOR

Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
