Plucene::Analysis::LetterTokenizer.3pm

Language: en

Version: 2008-03-01 (debian - 07/07/09)

Section: 3 (Library functions)

NAME

Plucene::Analysis::LetterTokenizer - Letter tokenizer

SYNOPSIS

         # isa Plucene::Analysis::CharTokenizer
 
 

DESCRIPTION

This is the letter tokenizer class, which divides text at non-letters.

Note: this does a decent job for most European languages, but a poor job for some Asian languages, where words are not separated by spaces.
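
As a rough illustration of what "dividing text at non-letters" means, the sketch below extracts maximal runs of letters with a plain regular expression. It does not use Plucene at all: letter_tokens is a hypothetical helper written for this example, and the choice of the Unicode property \p{L} to define "letter" is an assumption, not necessarily the pattern the module itself uses.

```perl
use strict;
use warnings;
use utf8;    # the source contains a literal non-ASCII string

# Hypothetical helper (not part of Plucene): return every maximal run
# of letters in the text. Anything that is not a letter (digits,
# punctuation, whitespace) acts as a separator and is discarded.
sub letter_tokens {
    my ($text) = @_;
    return $text =~ /(\p{L}+)/g;
}

# European-style text: the apostrophe and the digits split or drop tokens.
my @english = letter_tokens("Hello, world! It's 2008.");
print join("|", @english), "\n";    # prints "Hello|world|It|s"

# Text written without spaces between words comes back as one long
# token, which is why this approach suits such languages poorly.
my @japanese = letter_tokens("東京都に住む");
print scalar(@japanese), "\n";      # prints "1"
```

Note that the apostrophe in "It's" is treated as a separator, so the contraction yields two tokens, and the number is dropped entirely.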