hocr2djvused
Language: en
Version: 05/24/2010 (ubuntu - 24/10/10)
Section: 1 (User commands)
NAME
- hocr2djvused - hOCR to djvused script converter
SYNOPSIS
- hocr2djvused [option...]
DESCRIPTION
- hocr2djvused reads a hOCR[1] file (as produced by OCRopus[2] or Cuneiform[3]) from the standard input and converts it to a djvused script.
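As a rough illustration of the pipeline described above, the following Python sketch feeds a hOCR file to hocr2djvused on standard input and captures whatever the tool prints. It assumes that the djvused script is written to standard output (the description only states where the input is read from) and that hocr2djvused is available on PATH. The helper names build_argv and hocr_to_script are hypothetical, and only options documented on this page are used.

```python
import subprocess


def build_argv(details="words", page_size=None):
    """Build a hocr2djvused command line from options documented on this page."""
    argv = ["hocr2djvused", f"--details={details}"]
    if page_size is not None:
        width, height = page_size
        argv.append(f"--page-size={width}x{height}")
    return argv


def hocr_to_script(hocr_path, **options):
    """Run hocr2djvused with the hOCR file on stdin and return its output.

    Assumption: the djvused script is emitted on standard output.
    """
    with open(hocr_path, "rb") as hocr:
        result = subprocess.run(
            build_argv(**options),
            stdin=hocr,
            stdout=subprocess.PIPE,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
    return result.stdout
```

For instance, hocr_to_script("page.hocr", page_size=(2550, 3300)) would run hocr2djvused --details=words --page-size=2550x3300 with the (hypothetical) file page.hocr on standard input.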
OPTIONS
Text segmentation options
-t lines, --details=lines
- Record the location of every line. Don't record locations of individual words or characters.
-t words, --details=words
- Record the location of every line and every word. Don't record locations of individual characters.
This is the default.
-t chars, --details=chars
- Record location of every line, every word and every character.
--word-segmentation=simple
- Consider each non-empty sequence of non-whitespace characters a single word.
This is the default, despite being linguistically incorrect.
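The rule stated above can be sketched in a few lines of Python. This is an illustration of the rule only, not the tool's own code, and simple_words is a hypothetical name.

```python
def simple_words(line):
    """Split a line into maximal runs of non-whitespace characters."""
    # str.split() with no arguments splits on runs of whitespace and never
    # yields empty strings, so every result is a non-empty "word".
    return line.split()


print(simple_words("Hello, world!  It's 3.14."))
# prints ['Hello,', 'world!', "It's", '3.14.']
```

Note that punctuation stays attached to the neighbouring word ("Hello," and "world!"), which is what makes this rule linguistically incorrect.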
--word-segmentation=uax29
- Use the Unicode Text Segmentation[4] algorithm to break lines into words.
This option breaks the assumption made by some DjVu tools that words are separated by spaces, and is therefore not recommended.
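The following Python sketch illustrates why boundary-based segmentation can conflict with the "words are separated by spaces" assumption. It does not implement Unicode Text Segmentation: approx_boundary_words is a hypothetical, deliberately crude regular-expression stand-in that only mimics one effect of such algorithms, namely that punctuation becomes a word of its own.

```python
import re


def simple_words(line):
    """Whitespace-delimited words, as with --word-segmentation=simple."""
    return line.split()


def approx_boundary_words(line):
    """Crude stand-in for boundary-based segmentation (NOT real UAX #29).

    Runs of letters/digits and single punctuation marks each become a word.
    """
    return re.findall(r"\w+|[^\w\s]", line)


def rejoin(words):
    """What a consumer does if it assumes words are separated by spaces."""
    return " ".join(words)


line = "Hello, world!"
print(rejoin(simple_words(line)))           # prints Hello, world!
print(rejoin(approx_boundary_words(line)))  # prints Hello , world !
```

With whitespace-delimited words the line is recovered unchanged, whereas the boundary-based word list gains spurious spaces before the punctuation when a consumer joins words with spaces.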
Other options
--rotation=n
- Assume that DjVu pages are rotated by n degrees.
--page-size=widthxheight
- Specify that the page size is width pixels × height pixels.
This option is required for hOCR generated by Cuneiform and superfluous otherwise.
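As a sketch of how a widthxheight value might be interpreted, the Python below parses such a string and uses the height to convert a bounding box between coordinate systems. The helper names are hypothetical, and the conversion rests on an assumption not stated on this page: that hOCR bounding boxes are measured from the top-left corner of the page while DjVu text zones are measured from the bottom-left, so the page height is needed to flip the vertical axis. This is one plausible reason the page size matters, not a description of the tool's internals.

```python
def parse_page_size(value):
    """Parse a 'widthxheight' string such as '2550x3300' into two integers."""
    width, sep, height = value.partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {value!r}")
    return int(width), int(height)


def flip_bbox(bbox, page_height):
    """Convert (x0, y0, x1, y1) from a top-left to a bottom-left origin.

    Assumption: the input follows the hOCR convention (y grows downwards)
    and the output follows the DjVu convention (y grows upwards).
    """
    x0, y0, x1, y1 = bbox
    return (x0, page_height - y1, x1, page_height - y0)


width, height = parse_page_size("2550x3300")
print(flip_bbox((100, 200, 500, 260), height))
# prints (100, 3040, 500, 3100)
```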
--version
- Output version information and exit.
-h, --help
- Display help and exit.
SEE ALSO
AUTHOR
Jakub Wilk <jwilk@jwilk.net>
- Author.
COPYRIGHT
Copyright © 2008, 2009, 2010 Jakub Wilk
NOTES
- 1. hOCR: http://docs.google.com/View?docid=dfxcv4vc_67g844kf
- 2. OCRopus: http://ocropus.googlecode.com/
- 3. Cuneiform: http://launchpad.net/cuneiform-linux
- 4. Unicode Text Segmentation: http://unicode.org/reports/tr29/