QA: Use HTML Proofer To Check Internal Links

Uses the Ruby html-proofer gem to check internal links. This commit also
fixes the problems it found and works around some of its complaints that
are not real problems (it rejects anchor (a) tags that have no href,
name, or id).

Running HTML proofer takes about 12 minutes on my system (with up to two
threads), during which it prints no text. Travis CI times out after 10
minutes of nothing being written to stdout, so this commit also adds a
background process to the Makefile that prints a line every minute while
make runs.
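
The keep-alive idea can be sketched in Ruby as well. This is only an illustration of the technique: the commit itself does this with a background process in the Makefile, and the helper name `with_heartbeat`, its parameters, and the message text are hypothetical.

```ruby
# Sketch only: NOT the Makefile change from this commit.
# `with_heartbeat` is a hypothetical helper that prints a line at a fixed
# interval while a long, silent task runs, so that a CI system watching
# stdout keeps seeing output.
def with_heartbeat(interval: 60, out: $stdout, message: "Still running...")
  ticker = Thread.new do
    loop do
      sleep interval
      out.puts message
      out.flush
    end
  end
  yield # run the long task; its return value is returned to the caller
ensure
  # Stop the ticker whether the task finished or raised.
  ticker&.kill
  ticker&.join
end

# Example: a short stand-in for the 12 minute link check.
with_heartbeat(interval: 0.05) { sleep 0.2 }
```

A thread is enough here because the slow work would happen inside the block; a separate background process, as in the Makefile, also covers commands that Ruby does not run itself.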
David A. Harding 2015-03-14 14:39:33 -04:00
parent d44692c75b
commit d954708ef1
No known key found for this signature in database
GPG key ID: 4B29C30FF29EC4B7
42 changed files with 229 additions and 145 deletions

_contrib/bco-htmlproof Executable file

@@ -0,0 +1,52 @@
#!/usr/bin/env ruby
require 'html/proofer'
## Will throw an exception (exiting false) if any internal links don't
## work. The Makefile will terminate on the failure
HTML::Proofer.new(
  ## To test, uncomment the array below and comment out ./_site and :disable_external
  #[ "/foo/bar#baz", "/foo/bar", "#", "#wallet", "/foo.css", "/bar.png", "/zh_TW/bitcoin-for-businesses" ],
  "./_site",
  {
    ## Disable external link checking by default to avoid spurious
    ## Travis CI failures. TODO: take an argument to optionally
    ## enable external link checking as part of the Makefile
    ## manual checks
    :disable_external => true,
    ## Links to ignore
    :href_ignore => [
      '#', ## hrefs pointing to the current page (htmlproofer fails them)
      /^\/bin/, ## /bin dir is not part of repository; holds Bitcoin Core binaries
      /^\/stats/ ## /stats dir is not part of repository; generated by separate stats script
    ],
    ## Mangle links. If we enable external link checking, this will
    ## require updating
    :href_swap => {
      ## (Hack) Append '$' to the ends of filenames we don't want to append .html to
      /(css|png|rss|pdf|jpg|asc)$/ => '\1$',
      ## Append .html to end of most URLs so proofer can find the local files
      /^(
        [^#]+ ## Don't match URL containing a hash, we'll deal with them separately
        [^\/$] ## Don't match URLs ending in a slash or $
      )$/x => '\1.html',
      ## Insert .html between page and anchor, but only if there's a
      ## page name
      /^(.+)(#.+)/ => '\1.html\2',
      ## (Un-hack) Remove previously-appended '$' from URLs
      /\$$/ => '',
    },
    ## It'd be nice if we had a _local_config.yaml file or something
    ## for settings specific to particular systems, but for now I
    ## think 2 is a good setting for Travis CI ("1.5 processors")
    ## and me (usually 2 processors)
    :parallel => { :in_processes => 2 }
  }
).run
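
To see what the :href_swap rules do to a link, the sketch below replays them outside html-proofer on the sample links from the commented-out test array. It assumes the rules are applied one after another, in hash order, using gsub; that is an assumption about html-proofer's behaviour rather than something this commit shows. `HREF_SWAP` and `swap_href` are hypothetical names used only for this illustration.

```ruby
# Sketch only: a standalone copy of the :href_swap rules from the script,
# written without the /x comments.
# ASSUMPTION: each rule is applied in hash order with String#gsub.
HREF_SWAP = {
  /(css|png|rss|pdf|jpg|asc)$/ => '\1$',        # mark files that must not get .html
  /^([^#]+[^\/$])$/            => '\1.html',    # plain page links get .html
  /^(.+)(#.+)/                 => '\1.html\2',  # page + anchor: .html goes before '#'
  /\$$/                        => ''            # remove the temporary '$' marker
}.freeze

# Hypothetical helper, not part of html-proofer: run every rule in order.
def swap_href(href, rules = HREF_SWAP)
  rules.reduce(href) { |url, (pattern, replacement)| url.gsub(pattern, replacement) }
end

samples = [
  '/foo/bar#baz', '/foo/bar', '#', '#wallet',
  '/foo.css', '/bar.png', '/zh_TW/bitcoin-for-businesses'
]
samples.each { |href| puts "#{href} -> #{swap_href(href)}" }
```

Under that assumption, page links gain .html (before the anchor when there is one), while bare anchors such as #wallet and asset paths such as /foo.css come out unchanged.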