Skip to content

Instantly share code, notes, and snippets.

@crittermike
Last active March 26, 2024 22:49
Show Gist options
  • Save crittermike/fe02c59fed1aeebd0a9697cf7e9f5c0c to your computer and use it in GitHub Desktop.
Save crittermike/fe02c59fed1aeebd0a9697cf7e9f5c0c to your computer and use it in GitHub Desktop.
Download an entire website with wget, along with assets.
# One liner
wget --recursive --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains yoursite.com --no-parent yoursite.com
# Explained
wget \
--recursive \ # Download the whole site.
--page-requisites \ # Get all assets/elements (CSS/JS/images).
--adjust-extension \ # Save files with .html on the end.
--span-hosts \ # Include necessary assets from offsite as well.
--convert-links \ # Update links to still work in the static version.
--restrict-file-names=windows \ # Modify filenames to work in Windows as well.
--domains yoursite.com \ # Do not follow links outside this domain.
--no-parent \ # Don't follow links outside the directory you pass in.
yoursite.com/whatever/path # The URL to download
@BradKML
Copy link

BradKML commented Apr 2, 2023

Recently discovered random-wait as an option from here, should be included to make things less sus https://gist.github.com/stvhwrd/985dedbe1d3329e68d70

@BradKML
Copy link

BradKML commented Apr 10, 2023

Just realized that --no-cobbler and --mirror conflicted, and such should use -l inf --recursive instead? https://stackoverflow.com/questions/13092229/cant-resume-wget-mirror-with-no-clobber-c-f-b-unhelpful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment