cstocs

Language: en

Version: 2007-12-19 (mandriva - 01/05/08)

Section: 1 (User commands)

NAME

cstocs -- charset encoding convertor for the Czech and Slovak languages.

FORMAT

         cstocs [options] src_encoding dst_encoding [files ...]
 
 

SYNOPSIS

         cstocs il2 ascii < file | less
         cstocs -i utf8 il2 file1 file2 file3
         cstocs --help
 
 

DESCRIPTION

Cstocs is a simple conversion utility that changes the charset encoding of a text. It reads either the specified files or, if none are specified, the standard input, assumes that the input is encoded in "src_encoding", and tries to reencode it into "dst_encoding". The result is written to the standard output.

Run "cstocs" without parameters to get a short help message and a list of available encodings.

Characters that are not defined in "src_encoding" are passed to the output unchanged.

If the source text contains a character that is defined in "src_encoding" but not in "dst_encoding", it can be handled in several ways. For example, the characters ``e with caron'' (symbol ecaron) and ``d with caron'' (symbol dcaron) are included in the iso-8859-2 encoding but not in iso-8859-1. If you reencode 8859-2 text to 8859-1, you may want to take one of the following actions:

1. Keep the character unchanged: option "--nofillstring".
2. Produce no output in place of the character: option "--null".
3. Substitute a fixed string (possibly a space) for both ecaron and dcaron: option "--fillstring".
4. Substitute the letter ``d'' for dcaron and ``e'' for ecaron. A string can also be substituted for a symbol, so you can replace the Latin ``AE'' ligature character with the string ``AE'' (letter ``A'' followed by letter ``E''), or a ``plusminus sign'' with the string ``+/-''. These substitutions are described in the accent file.
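
The four actions above can be pictured with a small Python model. This is only a sketch and not how cstocs is implemented: the names DST_CHARS, ACCENT and convert are hypothetical, the destination encoding is assumed to hold printable ASCII only, and the rule table is a hand-written stand-in for the accent file. It also assumes that accent rules are consulted before the fill string, as the description of "--fillstring" suggests, and it ignores "src_encoding" entirely.

```python
# Toy model of the fallback choices; NOT the real cstocs implementation.

# Assumption: the destination encoding can represent printable ASCII only.
DST_CHARS = set(map(chr, range(32, 127)))

# Assumption: a hand-written stand-in for the accent file.
ACCENT = {
    "\u011b": "e",    # ecaron
    "\u010f": "d",    # dcaron
    "\u00c6": "AE",   # AE ligature
    "\u00b1": "+/-",  # plus-minus sign
}


def convert(text, use_accent=True, fillstring=" "):
    """Rewrite characters that are missing from the destination encoding.

    use_accent=False models --noaccent.
    fillstring=None  models --nofillstring (keep the character as it is).
    fillstring=""    models --null (drop the character).
    """
    out = []
    for ch in text:
        if ch in DST_CHARS:
            out.append(ch)            # representable: copy through
        elif use_accent and ch in ACCENT:
            out.append(ACCENT[ch])    # action 4: accent-file substitution
        elif fillstring is None:
            out.append(ch)            # action 1: leave unchanged
        else:
            out.append(fillstring)    # actions 2 and 3: empty or fixed string
    return "".join(out)


sample = "\u010fe\u011b \u00b11"      # dcaron, e, ecaron, space, plus-minus, 1
print(convert(sample))                                   # accent rules
print(convert(sample, use_accent=False, fillstring="?")) # fixed fill string
print(convert(sample, use_accent=False, fillstring=""))  # like --null
```

In this model the fill string only matters for characters that no accent rule covers, which is why the last two calls switch the accent table off.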

OPTIONS

-i, -i.ext, --inplace.ext
The specified files are converted in place, using Perl's "-i" facility. Optionally, an extension for backup copies may be specified after the dot. If used, this option has to be the first one.
--dir directory
Encoding files are taken from directory instead of the default, which is Cz/Cstocs/enc in the Perl lib tree. The location of encoding files can also be changed using the CSTOCSDIR environment variable, but the --dir option has the highest priority.
--fillstring string
If the source text contains a character that is defined in "src_encoding" but not in "dst_encoding" nor in the accent file (or the accent file is not used), it is replaced by "string". The default is a single space.
--nofillstring
Leave unchanged the characters that would otherwise be replaced by the fill string. This differs from "--null", which removes such characters from the output.
--null
Completely equivalent to --fillstring "".
--nochange or --noaccent
Do not use the accent file at all.
--onebyone
Use only those rules from the accent file that rewrite one character to one character. If this option is specified, the character ``ecaron'' will be rewritten to ``e'', but the ``AE'' character will not be rewritten to the ``AE'' string.
--onebymore
Use all rules from the accent file. This is the default option.
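
Two of the rules stated in this section, the lookup order for the encoding directory and the effect of "--onebyone", can be sketched in Python as follows. The function names encoding_dir and usable_rules are hypothetical and the code is an illustration, not cstocs source. Treating an empty CSTOCSDIR the same as an unset one is an assumption of this sketch.

```python
import os

# Default location from the --dir description, relative to the Perl lib tree.
DEFAULT_DIR = "Cz/Cstocs/enc"


def encoding_dir(cli_dir=None, environ=None):
    """Pick the encoding directory: --dir first, then CSTOCSDIR, then default."""
    environ = os.environ if environ is None else environ
    if cli_dir:
        return cli_dir
    # Assumption: an empty CSTOCSDIR is treated like an unset one.
    return environ.get("CSTOCSDIR") or DEFAULT_DIR


def usable_rules(accent, onebyone=False):
    """Return the accent rules in effect.

    onebyone=True keeps only rules whose replacement is a single character;
    the default (like --onebymore) keeps every rule.
    """
    if not onebyone:
        return dict(accent)
    return {ch: repl for ch, repl in accent.items() if len(repl) == 1}


rules = {"\u011b": "e", "\u00c6": "AE"}   # ecaron -> e, AE ligature -> "AE"
print(encoding_dir("/tmp/enc", {"CSTOCSDIR": "/opt/enc"}))
print(sorted(usable_rules(rules, onebyone=True).values()))
```

Here the paths /tmp/enc and /opt/enc are placeholders chosen for the example.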

SEE ALSO

Cz::Cstocs(3).

AUTHOR

Jan ``Yenya'' Kasprzak wrote the original Un*x implementation.

Jan Pazdziora, adelton@fi.muni.cz, created the Perl module version.