Create an offline-only HTML file, mirroring external resources.
Jaidyn Ann 359696e5e2 Update README.md, mention new --source argument 2024-05-31 22:37:03 -05:00

mirror-img

mirror-img is a command-line tool that creates a “local” version of an HTML file, mirroring its remote images, stylesheets, and other resources.

Usage

usage: mirror-img [-h] [-d DIR] [-b BASE] [-s URL] HTML_FILE
       mirror-img [-h] [-d DIR] [-b BASE] [-s URL]
Available options:
  -h, --help          print this help text.
  -b, --base ARG      path to mirror directory used in URLs
  -s, --source ARG    URL used to resolve & mirror relative URLs
  -d, --downloads ARG directory for all mirrored files

Examples

Remote

In order to mirror a webpage, you can simply download it and pipe it into mirror-img:

$ curl https://www.gnu.org/philosophy/philosophy.html | mirror-img > philosophy.html

And now philosophy.html is a fully-local HTML file with no external resources!

… at least, it would be. Notice how some resources, like the CSS, don’t load. This is because they are defined as relative links (e.g., “../style.css” rather than “https://invalid.tld/style.css”). For these to be mirrored as well, mirror-img needs to know the source URL they are relative to.

You can use the --source argument to provide the source URL, so relatively-linked resources can be mirrored, too:

$ SOURCE_URL="https://www.gnu.org/philosophy/philosophy.html"
$ curl "$SOURCE_URL" | mirror-img --source "$SOURCE_URL" > philosophy.html

Now we’re done! All mirrored content can be found in the mirror/ directory, and all links have been adjusted accordingly.
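
To double-check the result, one rough approach is to list any src attributes in the output that still point at an http(s) URL. The remote_srcs helper below is a hypothetical name, not part of mirror-img, and it assumes double-quoted attribute values; treat it as a quick sanity check rather than a real HTML parser.

```shell
# remote_srcs FILE
# Hypothetical helper (not part of mirror-img): print every double-quoted
# src attribute in FILE whose value is still an absolute http(s) URL.
# Prints nothing when no such attribute is found.
remote_srcs() {
    # -E: extended regex, -o: print only the matching part, -i: ignore case.
    # "|| true" keeps a no-match result from counting as a failure.
    grep -Eoi 'src="https?://[^"]*"' "$1" || true
}
```

Running remote_srcs philosophy.html after the commands above lists whatever is left; any line it prints is a resource that was presumably not mirrored.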

Local

If you’d like to change the download directory, you can use the --downloads argument. To change the directory used in the output HTML’s URLs, you can use --base.

For example, if you’d like to mirror files into /tmp/mirrors/ but have URLs start with mirrors/ rather than /tmp/mirrors/:

$ mirror-img --base "mirrors/" --downloads /tmp/mirrors/ index.html > new-index.html

… now new-index.html contains that local version of index.html!
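
Keep in mind that a browser resolves a relative base such as mirrors/ against wherever the output HTML lives, so mirrors/ next to new-index.html has to lead to the download directory. One way to arrange that, sketched here with a hypothetical helper named link_base (not part of mirror-img), is a symbolic link:

```shell
# link_base DOWNLOADS BASE
# Hypothetical helper (not part of mirror-img): make the relative path BASE
# (e.g. "mirrors/") in the current directory a symlink to the DOWNLOADS
# directory (e.g. "/tmp/mirrors/"), so that URLs starting with BASE find
# the mirrored files. DOWNLOADS should be an absolute path.
# Fails if BASE already exists.
link_base() {
    downloads=${1%/}   # strip one trailing slash, if any
    base=${2%/}
    ln -s "$downloads" "$base"
}
```

For the example above, that would be link_base /tmp/mirrors/ mirrors/, run from the directory that holds new-index.html.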

Installation

Building a binary requires an installed implementation of Common Lisp: Steel Bank Common Lisp is our implementation of choice. It’s available on most operating systems under the package name sbcl.

You also need the library-manager Quicklisp, which can be installed quite easily, including via our Makefile.
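
Before building, it can help to confirm that the needed tools are on your PATH. The missing_tools function below is a hypothetical helper, not something shipped with mirror-img; it only reports which of the named commands cannot be found.

```shell
# missing_tools TOOL...
# Hypothetical helper: print the name of each TOOL that is not found on
# PATH, one per line. Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For instance, missing_tools sbcl make prints nothing when both SBCL and make are installed.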

To install Quicklisp, build a binary, and install it, simply run:

$ make quicklisp
$ make build
$ sudo cp mirror-img /usr/local/bin/mirror-img

Bam, you've made and installed a binary! Cool!

Tests

mirror-img’s tests can be run from a REPL using ASDF:TEST-SYSTEM, or from the Makefile target “test”.

* (asdf:test-system :mirror-img)
$ make test

Misc