feeds2lrf

Language: en

Version: April 2009 (ubuntu - 07/07/09)

Section: 1 (User commands)

NAME

feeds2lrf - part of calibre

SYNOPSIS

feeds2lrf [options] ARG

DESCRIPTION

feeds2lrf parses an online source of articles, such as an RSS or Atom feed, and fetches the article contents, organized in a hierarchy.

ARG can be one of:

file name - feeds2lrf will try to load a recipe from the file

builtin recipe title - feeds2lrf will load the builtin recipe and use it to fetch the feed. For example: Newsweek, "The BBC" or "The New York Times"

recipe as a string - feeds2lrf will load the recipe directly from the string arg.

Available builtin recipes are: 'The Age', 'Al Jazeera in English', 'Ambito.com', 'The American Spectator', 'Ars Technica', 'Associated Press', 'The Atlantic', 'The Australian', 'B92', 'The BBC', "Barron's", 'Blic', 'Borba Online', 'Business Week', 'CNN', 'Chicago Breaking News', 'Chicago Tribune', 'Christian Science Monitor', 'Cincinnati Enquirer', 'Clarin', 'Common Dreams', 'Courrier International', 'Critica de la Argentina', 'Cyberpresse', 'DNA India', 'Daily Telegraph', 'Danas', 'De Morgen', 'De Standaard', 'Diario Granma', 'Die Zeit Nachrichten', 'Discover Magazine', 'E-Novine', 'EL PAIS', 'ESPN', 'The Economist', 'El Cronista', 'El Mercurio online', 'El Mundo', 'El Universal', 'ElArgentino.com', 'Engadget', 'Exiled Online', 'FAZ NET', 'FTD', 'Financial Times', 'Forbes', 'Freakonomics Blog', 'Fudzilla', 'Glasgow Herald', 'Globe and Mail', 'Google Reader', 'The Guardian', "Harper's Magazine", "Harper's Magazine - articles from printed edition", 'The Hindu', 'Honolulu Advertiser', 'Honolulu Star-Bulletin', 'Indianapolis Star', 'Infobae.com', 'The International Herald Tribune', 'The Irish Times', 'The Japan Times', 'Jerusalem Post', 'Joel on Software', 'Jornal Brasileiro Online', 'Jutarnji', 'Juventud Rebelde', 'Juventud Rebelde in english', 'La Cuarta', 'La Mujer de mi Vida', 'La Nacion', 'La Nacion Chile', 'La Prensa', 'La Segunda', 'La Tercera', 'LeMonde.fr', 'Liberation', 'Linux Magazine', 'Linuxdevices', 'London Review of Books', 'The Los Angeles Times', 'The Market Ticker', 'The Moscow Times', 'NASA', 'NIN online', 'NSPM in English', 'The Nation', 'New Scientist - Online News', 'New York Review of Books', 'The New York Times', 'The New York Times (subscription)', 'The New Yorker', 'Newstimes', 'Newsweek', 'Nova srpska politicka misao', 'O Estado de S. Paulo', 'O Globo', 'The Onion', 'Outlook India', 'Pagina/12', 'Pescanik', 'Physicstoday', 'Physicsworld', 'Pobjeda Online', 'Politika Online', 'Portfolio', 'Press Online', 'Reuters', 'San Francisco Chronicle', 'Science AAAS', 'ScienceDaily', 'ScienceNews', 'Scientific American', 'The Scotsman', 'Shacknews', 'Soldiers', 'Spiegel Online', 'Spiegel Online International', 'The St. Petersburg Times', 'Sueddeutsche', 'Supersite for Windows', 'Sydney Morning Herald', 'Telegraph.co.uk', 'Telepolis', 'Teleread Blog', 'Time', 'The Times Online', "Tom's Hardware German", "Tom's Hardware US", 'USA Today', 'United Press International', 'Utne reader', 'Vecernje Novosti', 'Vijesti', 'Vreme', 'The Wall Street Journal', 'Washington Post', 'Wired.com', 'heise', 'la Repubblica', 'securitywatch', 'xkcd', 'zdnet'

Whenever you pass arguments to feeds2lrf that have spaces in them, enclose the arguments in quotation marks.
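To see why the quotation marks matter, the following Python sketch uses the standard shlex module to split a command line roughly the way a POSIX shell would. It only illustrates shell word splitting; it is not part of feeds2lrf.

```python
import shlex

# Without quotes the shell hands feeds2lrf two separate arguments.
unquoted = shlex.split("feeds2lrf The BBC")

# With quotes the recipe title arrives as a single argument.
quoted = shlex.split('feeds2lrf "The BBC"')

print(unquoted)
print(quoted)
```

In the first case feeds2lrf would see "The" and "BBC" as two arguments instead of one recipe title.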

OPTIONS

--version
show program's version number and exit
-h, --help
show this help message and exit
FEEDS2DISK OPTIONS:
Options to control the behavior of feeds2disk
--feeds=FEEDS
Specify a list of feeds to download. For example: "['http://feeds.newsweek.com/newsweek/TopNews', 'http://feeds.newsweek.com/headlines/politics']". If you specify this option, any argument to feeds2lrf is ignored and a default recipe is used to download the feeds.
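The example value for --feeds has the shape of a Python list literal. The sketch below, which assumes that reading (the man page does not say how feeds2lrf parses the value internally), builds such a string from a list of URLs and checks that it round-trips through ast.literal_eval.

```python
import ast

feeds = [
    "http://feeds.newsweek.com/newsweek/TopNews",
    "http://feeds.newsweek.com/headlines/politics",
]

# repr() of a list of strings gives the bracketed, single-quoted form
# used in the example for --feeds.
feeds_arg = repr(feeds)

# Assumption: the value can be read back as a Python list literal.
parsed = ast.literal_eval(feeds_arg)

print(feeds_arg)
print(parsed == feeds)
```

Because the value contains spaces and quote characters, the whole string still has to be enclosed in double quotes on the command line.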
--verbose
Be more verbose while processing.
--title=TITLE
The title for this recipe. Used as the title for any ebooks created from the downloaded feeds.
--username=USERNAME
Username for sites that require a login to access content.
--password=PASSWORD
Password for sites that require a login to access content.
--recursions=RECURSIONS
Number of levels of links to follow on webpages that are linked to from feeds. Default: 0
--no-progress-bar
Don't show the progress bar.
--debug
Very verbose output, useful for debugging.
--test
Useful for recipe development. Forces max_articles_per_feed to 2 and downloads at most 2 feeds.
-t TIMEOUT, --timeout=TIMEOUT
Timeout in seconds to wait for a response from the server. Default: 10.0 s
--delay=DELAY
Minimum interval in seconds between consecutive fetches. Default is 0 s
--encoding=ENCODING
The character encoding for the websites you are trying to download. The default is to try and guess the encoding.
--match-regexp=MATCH_REGEXPS
Only links that match this regular expression will be followed. This option can be specified multiple times, in which case as long as a link matches any one regexp, it will be followed. By default all links are followed.
--filter-regexp=FILTER_REGEXPS
Any link that matches this regular expression will be ignored. This option can be specified multiple times, in which case as long as any regexp matches a link, it will be ignored. By default, no links are ignored. If both --filter-regexp and --match-regexp are specified, then --filter-regexp is applied first.
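The interaction of --match-regexp and --filter-regexp can be sketched in Python as follows. The function name should_follow is hypothetical, and the use of re.search (match anywhere in the URL) is an assumption; the text only states the order in which the two kinds of pattern are applied.

```python
import re

def should_follow(url, match_regexps=(), filter_regexps=()):
    """Decide whether a link is followed (illustrative sketch only)."""
    # Filter patterns are applied first: one hit is enough to ignore the link.
    if any(re.search(pattern, url) for pattern in filter_regexps):
        return False
    # With no match patterns, every remaining link is followed.
    if not match_regexps:
        return True
    # Otherwise the link must match at least one of the match patterns.
    return any(re.search(pattern, url) for pattern in match_regexps)

print(should_follow("http://example.com/news/1", [r"/news/"], [r"/ads/"]))
print(should_follow("http://example.com/news/ads/1", [r"/news/"], [r"/ads/"]))
```

The second call is rejected even though the URL matches the --match-regexp pattern, because the filter is checked first.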
--dont-download-stylesheets
Do not download CSS stylesheets.
HTML2LRF OPTIONS:
-o OUTPUT, --output=OUTPUT
Output file name. Default is derived from input filename
--ignore-tables
Render HTML tables as blocks of text instead of actual tables. This is necessary if the HTML contains very large or complex tables.
--minimize-memory-usage
Minimize memory usage at the cost of longer processing times. Use this option if you are on a memory constrained machine.
-a AUTHOR, --author=AUTHOR
Set the author(s). Multiple authors should be set as a comma separated list. Default: Unknown
--comment=FREETEXT
Set the comment.
--category=CATEGORY
Set the category
--title-sort=TITLE_SORT
Sort key for the title
--author-sort=AUTHOR_SORT
Sort key for the author
--publisher=PUBLISHER
Publisher
--cover=COVER
Path to file containing image to be used as cover
--use-metadata-cover
If there is a cover graphic detected in the source file, use that instead of the specified cover.
--base-font-size=BASE_FONT_SIZE
Specify the base font size in pts. All fonts are rescaled accordingly. This option obsoletes the --font-delta option and takes precedence over it. To use --font-delta, set this to 0. Default: 10.0pt
--enable-autorotation
Enable autorotation of images that are wider than the screen width.
--wordspace=WORDSPACE
Set the space between words in pts. Default is 2.5
--blank-after-para
Separate paragraphs by blank lines.
--header
Add a header to all the pages with title and author.
--headerformat=HEADERFORMAT
Set the format of the header. %a is replaced by the author and %t by the title. Default is %t by %a
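The placeholder substitution described for --headerformat can be illustrated with a short Python sketch. The function render_header is hypothetical and not taken from calibre; it only shows the effect of replacing %t and %a.

```python
import re

def render_header(title, author, fmt="%t by %a"):
    """Expand %t (title) and %a (author) in a header format string."""
    values = {"%t": title, "%a": author}
    # A single pass avoids re-expanding placeholders that happen to occur
    # inside the title or author text itself.
    return re.sub(r"%[ta]", lambda m: values[m.group(0)], fmt)

print(render_header("Newsweek", "calibre"))
print(render_header("Newsweek", "calibre", fmt="%a: %t"))
```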
--header-separation=HEADER_SEPARATION
Add extra spacing below the header. Default is 0 px.
--override-css=_OVERRIDE_CSS
Override the CSS. Can be either a path to a CSS stylesheet or a string. If it is a string it is interpreted as CSS.
--use-spine
Use the <spine> element from the OPF file to determine the order in which the HTML files are appended to the LRF. The .opf file must be in the same directory as the base HTML file.
--minimum-indent=MINIMUM_INDENT
Minimum paragraph indent (the indent of the first line of a paragraph) in pts. Default: 0
--font-delta=FONT_DELTA
Increase the font size by 2 * FONT_DELTA pts and the line spacing by FONT_DELTA pts. FONT_DELTA can be a fraction. If FONT_DELTA is negative, the font size is decreased.
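As a worked example of the --font-delta arithmetic, the Python sketch below applies the stated rule to a starting font size and line spacing. The starting values are made up for illustration and are not calibre defaults.

```python
def apply_font_delta(font_size_pt, line_spacing_pt, font_delta):
    """Return (font size, line spacing) after applying FONT_DELTA."""
    # Font size changes by twice the delta, line spacing by the delta itself.
    return font_size_pt + 2 * font_delta, line_spacing_pt + font_delta

# Hypothetical starting point: 10 pt text with 12 pt line spacing.
print(apply_font_delta(10.0, 12.0, 1.5))
print(apply_font_delta(10.0, 12.0, -1))
```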
--ignore-colors
Render all content as black on white instead of the colors specified by the HTML or CSS.
-p PROFILE, --profile=PROFILE
Profile of the target device for which this LRF is being generated. The profile determines things like the resolution and screen size of the target device. Default: prs500 Supported profiles: prs500
--left-margin=LEFT_MARGIN
Left margin of page. Default is 20 px.
--right-margin=RIGHT_MARGIN
Right margin of page. Default is 20 px.
--top-margin=TOP_MARGIN
Top margin of page. Default is 10 px.
--bottom-margin=BOTTOM_MARGIN
Bottom margin of page. Default is 0 px.
--render-tables-as-images
Render tables in the HTML as images (useful if the document has large or complex tables)
--text-size-multiplier-for-rendered-tables=TEXT_SIZE_MULTIPLIER_FOR_RENDERED_TABLES
Multiply the size of text in rendered tables by this factor. Default is 1.0
--link-levels=LINK_LEVELS
The maximum number of levels to recursively process links. A value of 0 means that links are not followed. A negative value means that <a> tags are ignored.
--link-exclude=LINK_EXCLUDE
A regular expression. <a> tags whose href matches will be ignored. Defaults to @
--no-links-in-toc
Don't add links to the table of contents.
--disable-chapter-detection
Prevent the automatic detection of chapters.
--chapter-regex=CHAPTER_REGEX
The regular expression used to detect chapter titles. It is searched for in heading tags (h1-h6). Defaults to chapter|book|appendix
--chapter-attr=CHAPTER_ATTR
Detect a chapter beginning at an element having the specified attribute. The format for this option is tagname regexp,attribute name,attribute value regexp. For example to match all heading tags that have the attribute class="chapter" you would use "h\d,class,chapter". You can set the attribute to "none" to match only on tag names. So for example, to match all h2 tags, you would use "h2,none,". Default is $,,$
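The three-part format "tagname regexp,attribute name,attribute value regexp" can be read as in the Python sketch below. The function spec_matches is hypothetical, and the choice of re.match for the tag name and re.search for the attribute value is an assumption; the text does not say how the patterns are anchored.

```python
import re

def spec_matches(spec, tag, attrs):
    """Check an element against a 'tag regexp,attr name,value regexp' spec."""
    tag_re, attr_name, value_re = spec.split(",", 2)
    if not re.match(tag_re, tag):
        return False
    # "none" as the attribute name means: match on the tag name only.
    if attr_name == "none":
        return True
    value = attrs.get(attr_name)
    return value is not None and re.search(value_re, value) is not None

print(spec_matches(r"h\d,class,chapter", "h2", {"class": "chapter"}))
print(spec_matches("h2,none,", "h2", {}))
print(spec_matches("$,,$", "h1", {"class": "chapter"}))
```

Under these assumptions the default value $,,$ never matches a real tag name, so nothing is detected unless the option is set.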
--page-break-before-tag=PAGE_BREAK
If html2lrf does not find any page breaks in the html file and cannot detect chapter headings, it will automatically insert page-breaks before the tags whose names match this regular expression. Defaults to h[12]. You can disable it by setting the regexp to "$". The purpose of this option is to try to ensure that there are no really long pages as this degrades the page turn performance of the LRF. Thus this option is ignored if the current page has only a few elements.
--force-page-break-before-tag=FORCE_PAGE_BREAK
Force a page break before tags whose names match this regular expression.
--force-page-break-before-attr=FORCE_PAGE_BREAK_ATTR
Force a page break before an element having the specified attribute. The format for this option is tagname regexp,attribute name,attribute value regexp. For example to match all heading tags that have the attribute class="chapter" you would use "h\d,class,chapter". Default is $,,$
--add-chapters-to-toc
Add detected chapters to the table of contents.
--baen
Preprocess Baen HTML files to improve generated LRF.
--pdftohtml
You must add this option if processing files generated by pdftohtml, otherwise conversion will fail.
--book-designer
Use this option on html0 files from Book Designer.
--serif-family=SERIF_FAMILY
The serif family of fonts to embed
--sans-family=SANS_FAMILY
The sans-serif family of fonts to embed
--mono-family=MONO_FAMILY
The monospace family of fonts to embed
--lrs
Convert to LRS

Created by Kovid Goyal <kovid@kovidgoyal.net>

SEE ALSO

http://calibre.kovidgoyal.net