blnkcheck

Language: en

Version: Jan 2000 (mandriva - 01/05/08)

Section: 1 (User commands)

NAME

blnkcheck - search html pages for broken links

SYNOPSIS

blnkcheck [-AafhOs] [-n ignore-list ] [-w list ] html-files

DESCRIPTION

blnkcheck searches html files for broken links. It searches only the relative links and does not need a web server. As it does not need web access it is very fast. The output of blnkcheck is in the same format as gcc error messages and can therefore be interpreted by many common editors (e.g. emacs or vim). After editing some html pages you can just type:
blnkcheck page1.html ../somewhere/page2.html
and blnkcheck will check that the links in these pages are correct.

blnkcheck checks the relative filesystem links. These are links of the form:
href="index.html"
href="../somepage.html#anchor1"
etc...

but not
href="/notchecked.html"
href="http://server.somewhere/something.html"
href="javascript:history.back();"

All tags containing relative links with href=..., src=..., and background=... are checked.
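
The distinction between checked and unchecked links can be sketched as a small shell function. This is only an illustration of the rule described above, not blnkcheck's own code: the name classify_link and the category labels are hypothetical, and the scheme list covers only the schemes mentioned on this page.

```shell
# classify_link VALUE: print the category of a link value.
# Hypothetical helper for illustration; blnkcheck's real parsing may differ.
classify_link() {
    case "$1" in
        /*)                   echo "absolute-filesystem" ;;  # not checked
        *://*)                echo "absolute-url" ;;         # not checked
        javascript:*|mailto:*) echo "other-scheme" ;;         # not checked
        *)                    echo "relative" ;;             # checked
    esac
}

classify_link "../somepage.html#anchor1"
classify_link "/notchecked.html"
```

Only values in the last category correspond to the relative filesystem links that blnkcheck validates.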

OPTIONS

-A
Do not open any files other than those given on the command line. Normally blnkcheck will validate all references to named anchors (something like href="index.html#anchor"). With this option set, named anchors are checked only in the files that are read anyway; for all other files only the existence of the file (index.html in the above case) is verified.
-a
Display a list of the absolute links of the form "protocol://". These links are not checked by blnkcheck. The output produced with this option can be post-processed by httpcheck, e.g.:
blnkcheck -a *.html | httpcheck -e
Note: -a does not list href="mailto:xx".
-f
Print only absolute filesystem links. Note: blnkcheck does not validate these links, but you can post-process them, e.g. with a shell script. In this shell script you can also take care of "Alias" definitions for web pages on your server. (Such a shell script is currently not provided together with blnkcheck. Let me know if you wrote a good generic script that could be included.)
-h
Print brief help/usage information.
-n list
Ignore links that match any of the sub-strings in the list. The list is a comma-separated list of strings. The match is case-insensitive.
-O
Don't print a warning when a file is not readable by others (not world readable).
-s
Print some statistics about the checked links at the end of the search. This option cannot be used together with -a or -f.
-w list
Warn about absolute links (ftp://, http://, https://, /home/ etc.) which match any of the sub-strings in the list. The list is a comma-separated list of sub-strings. This is useful to detect pages that use absolute links within your own server, a strategy that should be avoided because such pages cannot be mirrored or ported to other sites. The match is case-insensitive.
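
The list matching used by -n and -w (comma-separated sub-strings, compared without regard to case) can be sketched in portable shell. This is a minimal illustration of the matching rule as described here, not blnkcheck's implementation; the function name matches_list and the example values are hypothetical.

```shell
# matches_list LINK LIST: return 0 if LINK contains any of the
# comma-separated sub-strings in LIST, ignoring case; return 1 otherwise.
# Hypothetical helper for illustration only.
matches_list() {
    link=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    rest=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    while [ -n "$rest" ]; do
        sub=${rest%%,*}                 # first entry of the remaining list
        case "$rest" in
            *,*) rest=${rest#*,} ;;     # drop the first entry and its comma
            *)   rest= ;;               # that was the last entry
        esac
        [ -z "$sub" ] && continue       # skip empty entries such as "a,,b"
        case "$link" in
            *"$sub"*) return 0 ;;       # literal sub-string match
        esac
    done
    return 1
}

if matches_list "HTTP://WWW.Example.org/page.html" "example.org,/home/"; then
    echo "link matches the list"
fi
```

Both arguments are lower-cased before comparison, which is one simple way to get a case-insensitive match.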

EXAMPLE

Check links in html files in the web server root directory (/home/httpd/html) and in all directories one level down:
(cd /home/httpd/html; blnkcheck *.html */*.html)

Check links in all html files on the server:
(cd /home/httpd/html; blnkcheck `find . -name '*.htm*' -print` | sort)

You can use the Quickfix mode of the vim editor or M-x compile in emacs/xemacs to parse the output of blnkcheck. This lets you open the affected web page and jump directly to the line that contains the broken link. To do this you can write a Makefile that looks, for example, as follows:

all:
       blnkcheck `find . -name '*.htm*' -print` | sort
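
Because the output follows the gcc convention, it can also be summarised with standard tools. The sketch below counts reported lines per file, assuming each output line has the form file:line: message. The function name count_per_file is hypothetical and the sample messages are made up for the demonstration; they are not real blnkcheck output.

```shell
# count_per_file: read gcc-style "file:line: message" lines on stdin and
# print "file count" for each file, sorted by file name.
# Hypothetical helper for illustration only.
count_per_file() {
    awk -F: 'NF >= 3 { n[$1]++ }
             END { for (f in n) printf "%s %d\n", f, n[f] }' | sort
}

# Made-up sample input standing in for real output:
printf '%s\n' \
    'index.html:12: sample message' \
    'index.html:40: sample message' \
    'docs/faq.html:7: sample message' | count_per_file
```

In real use the input would come from a pipe such as blnkcheck *.html | count_per_file.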

BUGS

No known bugs.

AUTHOR

Guido Socher (guido.socher@linuxfocus.org)

SEE ALSO

blnkcheck is designed as a fast checker for web masters who have shell and file system level access to their web pages. It can also be used if you are able to keep a mirror of the web site on your local disk.

Other programs such as curl (http://curl.haxx.nu/) can be used if you want to check your web pages only remotely via a web server. curl also comes with a dead link checker called checklinks.pl.

hrefgrep(1), srcgrep(1), webfgrep(1), httpcheck(1), lshtmlref(1)