@nfriedly
Last active April 27, 2024 11:37
Chirp Audiobook Download Script


⚠️ Not currently working. Chirp changed something that broke this script.


This script eases the process of downloading the audio files from Chirp Audiobooks. It uses the browser's console to generate a list of URLs, and then provides a list of curl or wget commands to download them.

Tested with Firefox + Terminal on macOS, and Firefox + PowerShell on Windows 10.

As an aside, I want to give a shout out to Libro.fm for providing a simple download button for each purchase. Then you don't need a script like this!

Instructions

  1. Find the book in your Chirp Library.
    • If you've already listened to it, you may need to move it back from your Archive.
  2. Click the book to open Chirp's web player.
  3. Open the browser's Web Developer Tools.
  4. Copy-paste the script.js contents into the console and press [enter].
  5. Initiate the script:
    • If the book is already at the start, click Play (▶).
    • If the book is on any other track, open the Chapters menu (top left) and select the first Track.
  6. Wait while the script advances through each track; it's saving the URLs in the background.
    • It may say "There was an error loading your audiobook, please reload the page." under the Play button; ignore this.
    • It may also show a number of URLs in red in the console, along with a warning after each one. Ignore these also.
  7. When it reaches the final track, the script will show a list of commands on the screen in a white box.
    • Click once to highlight the complete list.
    • Copy-paste it to a command line (Terminal, PowerShell, etc.) and press [enter] to execute it.
      • Some command lines will begin executing immediately; however, you still need to press [enter] to execute the final command.
  8. Once the commands finish, you should have a new folder with a cover image and each of the tracks as .m4a files.
    • On macOS, type open . and press [enter] to view the files.
    • On Windows, type explorer . and press [enter] to view the files.
  9. Check the file size of each track:
    • If any are 0 bytes, the download URL may have expired.
      • In that case, go through the process again, but in step 7, first paste the commands into a text editor and delete everything except for the ones to download the 0-byte files.
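
The manual editing in step 9 can also be done with a few lines of JavaScript. This is a minimal sketch and not part of script.js: `commandsForFiles` is a hypothetical helper, and it assumes each generated command contains the output filename in double quotes. The filenames and URLs below are made up.

```javascript
// Hypothetical helper (not part of script.js): keep only the download
// commands whose quoted output filename is in the list of files to retry.
function commandsForFiles(commandText, filenames) {
  return commandText
    .split('\n')
    .filter(line => filenames.some(name => line.includes(`"${name}"`)))
    .join('\n');
}

// Example input: made-up commands in the same shape the script generates.
const allCommands = [
  'mkdir "My Book - Some Author"',
  'cd "My Book - Some Author"',
  'curl -L -o "My Book - 01 - Chapter 1.m4a" "https://example.com/a"',
  'curl -L -o "My Book - 02 - Chapter 2.m4a" "https://example.com/b"',
].join('\n');

// Only track 02 came out as 0 bytes, so only its command is kept.
const retry = commandsForFiles(allCommands, ['My Book - 02 - Chapter 2.m4a']);
console.log(retry);
```

Because the mkdir and cd lines are dropped, the filtered commands should be run from inside the book's folder.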

Enjoy!

const $ = document.querySelector.bind(document);
function filename(name) {
  return name.replaceAll('&', 'and').replaceAll(':', ' -').replaceAll(/[^a-z0-9 ._-]+/ig, '');
}
const title = filename($('h1.book-title').textContent);
const credits = [].slice.call(document.querySelectorAll('.credit'))
  .map(n => filename(n.textContent))
  .join(' - ');
const dirname = `${title} - ${credits}`;
const commands = [
  `mkdir "${dirname}"`,
  `cd "${dirname}"`,
  // note: unlike the audio files, this one doesn't need to follow redirects, so we can use the same curl command everywhere.
  `curl -o "cover.jpg" "${$('.cover-image').src }"`
];
const tracks = [];
let count = 0;
function addUrl(url) {
  count += 1;
  const chapter = filename($('div.chapter').textContent);
  tracks.push({
    count,
    chapter,
    url
  })
}
function showCommands() {
  const padSize = tracks.length.toString().length;
  // MacOS comes with curl but not wget. Windows powershell has fake versions of both.
  // The "real" curl needs the -L (--location) flag set to know to follow redirects.
  // The fake windows version turns it on by default and *refuses to work if you set the flag manually*.
  // So, we generate correct curl commands for mac and correct wget commands for Windows/Linux/etc.
  const isMac = navigator.userAgent.includes('Macintosh');
  const cmd = isMac ? 'curl -L -o' : 'wget -O';
  tracks.forEach(({count, chapter, url}) => {
    let trackNum = count.toString().padStart(padSize, "0");
    commands.push(`${cmd} "${title} - ${trackNum} - ${chapter}.m4a" "${url}"`);
  })
  const div = document.createElement('div');
  div.innerHTML = '<div style="position: absolute; top: 100px; left: 100px; z-index: 100000; background: white; padding: 10px;"><p>Copy these commands to PowerShell/Terminal/etc:</p><textarea id="dl-commands" style="min-height:20em; min-width:30em"></textarea></div>';
  document.body.appendChild(div);
  const textarea = document.querySelector('#dl-commands');
  textarea.value = commands.join('\n');
  textarea.onfocus = function(){this.select()};
}
function next() {
  const btn = $('button.next-chapter')
  if (btn.disabled) {
    showCommands()
  } else {
    btn.click();
  }
}
const audio = $('audio');
Object.defineProperty(audio, "src", {
  get() {
    return '';
  },
  set(url) {
    setTimeout(() => {
      addUrl(url);
      next();
    }, 500);
  },
});
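
To see what the filename() helper in the script does to a title, here is a small standalone sketch. The function body is copied from the script; the sample title is made up for illustration.

```javascript
// Same sanitizer as in script.js: '&' becomes 'and', ':' becomes ' -',
// and anything other than letters, digits, space, '.', '_' or '-' is dropped.
function filename(name) {
  return name.replaceAll('&', 'and').replaceAll(':', ' -').replaceAll(/[^a-z0-9 ._-]+/ig, '');
}

// Made-up example title containing characters that are awkward in file names.
const cleaned = filename('Tom & Jerry: The Movie?');
console.log(cleaned); // prints "Tom and Jerry - The Movie"
```

Note that accented and other non-ASCII letters are also removed by the final pattern, so some titles may come out shorter than expected.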
@nfriedly
Author

@Sbackus65 it might be something about those books that the script doesn't account for; do you mind sending me an email - nathan@[my GitHub username].com ?

@Sbackus65

@nfriedly email sent. I included links to a couple of books and screenshots of the results. Thanks again!

@SteveG-23

SteveG-23 commented Dec 15, 2023

I am using Windows 11. I have little expertise in Javascript or the "Web developer tools" console.
I tried this repeatedly yesterday in Firefox and Chrome and had a maddening series of problems:

  1. The script would run only through generating the MKDIR command and the WGET commands for the first couple of chapters, then would stop.
  2. I used editing tools to extrapolate the entire series of commands, ran them in Terminal and in various versions of Powershell. In one version I would get "WGET is not a recognized command." In others I would get an error that the & (ampersand) character (which is contained in the generated URLs) was not a permitted character and had to be enclosed in double quotes. (Fixing that didn't help.)
  3. Each time I tried to run the JS I would get an error "Cannot redefine property: src" or other elements, although I had cleared the console, and closed and restarted the browser, in between. This happened in both Chrome and Firefox.

Today I tried again, using Chrome, and it worked perfectly for one audiobook. But when I try again with another - after clearing the console and, for good measure, closing Chrome and starting from scratch - I get nothing but this:
<audio preload="none" class="embedded-player" src="[the URL from the previous audiobook]"></audio>
What am I missing?

@nfriedly
Author

Hey @SteveG-23, sorry it's giving you so much trouble.

The script would run only through generating the MKDIR command and the WGET commands for the first couple of chapters, then would stop.

wget timing out is something I've experienced intermittently. I'm not really sure what the cause is, but it usually worked for me on the second try.

I used editing tools to extrapolate the entire series of commands, ran them in Terminal and in various versions of Powershell. In one version I would get "WGET is not a recognized command." In others I would get an error that the & (ampersand) character (which is contained in the generated URLs) was not a permitted character and had to be enclosed in double quotes. (Fixing that didn't help.)

Normally PowerShell has a wget alias that runs a Microsoft-made compatible equivalent. The script should generate commands with quotes around all of the URLs; maybe a quote got added or removed somewhere, which in turn broke the rest?

error "Cannot redefine property: src"

Uh-oh. Redefining that property is basically how the script intercepts the URLs for the audio files. Without it, the script won't be able to accomplish much.

I wonder if chirp noticed this script and is trying to block it :(
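
One possible cause of "Cannot redefine property: src", offered as an assumption rather than a confirmed diagnosis: Object.defineProperty creates non-configurable properties by default, so pasting the script a second time into the same page tries to redefine a locked property and throws a TypeError. The sketch below uses plain objects in place of the page's audio element so it can run outside a browser. The `configurable: true` option is an addition for illustration and is not in the original script.

```javascript
// Plain objects stand in for the <audio> element (assumption for this sketch).
const captured = [];

function intercept(target, configurable) {
  Object.defineProperty(target, 'src', {
    configurable, // false is the default when this option is omitted
    get() { return ''; },
    set(url) { captured.push(url); },
  });
}

// Without configurable: true, a second definition throws a TypeError.
const locked = {};
intercept(locked, false);
let threw = false;
try {
  intercept(locked, false);
} catch (err) {
  threw = err instanceof TypeError;
}

// With configurable: true, the property can be redefined safely.
const fakeAudio = {};
intercept(fakeAudio, true);
intercept(fakeAudio, true);
fakeAudio.src = 'https://example.com/track-1';

console.log(threw, captured);
```

Reloading the page creates a fresh audio element, which should also avoid the error if this is indeed the cause.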

@nfriedly
Author

I just made a couple of tweaks to the script that should help:

  1. The list of commands is now displayed in a text box in the web page. Clicking it will select the entire list, making it easier to copy.
    • I've noticed that browsers sometimes collapse longer lists of commands, and can also stick other junk in the console, both of which can cause an incorrect list of commands to get copied. This should avoid that.
  2. I tweaked the filenames to ensure the track numbers are always padded with 0's to make them sortable by filename.
    • (Previously this only worked for books with <100 tracks.)

I tested on the Chronicles of Narnia collection, which is over 100 tracks, and it seems to work well. (Tested on Firefox on macOS, with uBlock origin blocking whatever tracking scripts Chirp happens to use.)
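
As an illustration of the padding change, the sketch below derives the pad width from the total number of tracks, the same way the script's `padSize` does. `padTrackNumbers` is a hypothetical helper written only for this example.

```javascript
// Hypothetical helper: return zero-padded track numbers for a book with
// `total` tracks. The width comes from the digit count of the total.
function padTrackNumbers(total) {
  const padSize = total.toString().length;
  return Array.from({ length: total }, (_, i) =>
    (i + 1).toString().padStart(padSize, '0'));
}

const small = padTrackNumbers(12);  // two digits wide
const large = padTrackNumbers(120); // three digits wide
console.log(small[0], small[11], large[0], large[119]);
```

Because every number has the same width, sorting the filenames alphabetically gives the same order as sorting by track number.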

@Bostwickenator

Bostwickenator commented Dec 15, 2023

Hey I've banged together a python script that uses FFmpeg to repack all the files returned by your implementation into a single m4a with all the chapter metadata and cover included. I can share a gist if that's useful.

Shared as a fork of this repo. It's pretty rough and almost certainly will need some path management tweaks for unix systems but it runs on windows if ffmpeg and ffprobe are available. You should be able to drag and drop the output folders from your script @nfriedly onto this (or otherwise pass them as a command line parameter)

@Bostwickenator

Can you add a cd .. at the end of the generated commands so we can run one set after another without getting into a tree of books within books.

ps. Thank you so much for this script

@CommanderJoy

CommanderJoy commented Feb 7, 2024

Just getting back to this process now. On Step 8: Once the commands finish, you should have a new folder with a cover image and each of the tracks as .m4a files.
On macOS, type open . and press [enter] to view the files. WHERE are you typing open .? While still in Terminal on a new line?
At the moment I can't find the file. I know where audiobooks are kept in Books (Library/Containers/ etc.), but I am not sure where the folder created by the code is located so I can import it into Books.
Update: Well I found where the folder winds up—in the top home folder. However it is totally empty-no files.

@nfriedly
Author

nfriedly commented Feb 7, 2024

@CommanderJoy

WHERE are you typing open .? While still in Terminal on a new line?

Yes, on a new line in Terminal

However it is totally empty-no files.

That sounds like something must have gone wrong. If you still have the Terminal open, can you copy-paste it into an email to me?(nathan@[my github username].com)

@CommanderJoy

That sounds like something must have gone wrong. If you still have the Terminal open, can you copy-paste it into an email to me?(nathan@[my github username].com)

Thanks for the help, I did the process again, which actually was challenging because after clearing history, emptying cache, etc., when I posted the code into the console it turned blue and wouldn't process. Took several times quitting Safari, etc. Finally got it to process, input the code it spit out into Terminal, copied and pasted it all in an email to you. Hope I got your email address correct!

@nfriedly
Author

nfriedly commented Feb 8, 2024

@CommanderJoy got your email, and I see the problem (posting here in case anyone else hits the same issue):

-bash: wget: command not found

wget is the command that downloads the files and gives them proper file names; it's apparently not installed on your computer.

To install wget on your mac, first install Homebrew:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then use the brew command to install wget:

brew install wget

(There are other ways to install wget, but this is the one I use and would recommend.)

Edit: I just added an abbreviated version of this to step 7 in the instructions.

@CommanderJoy

CommanderJoy commented Feb 8, 2024

wget is the command that downloads the files and gives them proper file names; it's apparently not installed on your computer.

I am VERY cautious about what I install on my computer, perhaps I am old school, and I know you guys that are more tech-involved don't think twice. What is wget, how does it access the OS/system and does it/can it be used in ways I might not like? Do I need to give it permissions? Apologies if this seems silly/ignorant, but I don't like to assume. Please understand I am not a developer, and have only opened Mac Terminal TWICE before to do something incredibly simple, so installing an entire package just to install wget for this purpose is causing me to pause. Is there another way?

@nfriedly
Author

nfriedly commented Feb 8, 2024

wget is a command-line download tool. It downloads files and saves them to wherever you tell it. It's open source; it comes from the GNU project, which includes many of the core tools that are usually bundled with Linux. But, because it's open source, you can also install it on macos.

Its homepage is at https://www.gnu.org/software/wget/ and you can also read a bit more about it on Wikipedia: https://en.wikipedia.org/wiki/Wget

It does not need any special permissions. It doesn't really access your OS. If someone else had access to your computer, I suppose they could use wget to download something, but if they had access to run it, then they presumably would also have the access to install it if it wasn't there.

@CommanderJoy

I downloaded wget2-latest.tar.lz.sig, however Macs can't open that file. What do I need to open and install it?

@nfriedly
Author

nfriedly commented Feb 8, 2024

I downloaded wget2-latest.tar.lz.sig, however Macs can't open that file. What do I need to open and install it?

I believe the GNU website only serves up source code, not a compiled executable that you can just install and use. You could go that route, but it's the hard way, and I'm not even sure I could guide you through all of the steps. (Also, I think the .sig file is just a signature that can be used to verify the download wasn't corrupted.)

To install wget, you should follow the steps I outlined earlier. Homebrew is a package manager that installs other software for you, and it provides a compiled version of wget that works on macOS.

@nfriedly
Author

nfriedly commented Feb 8, 2024

I wonder if I could switch wget for curl? curl is a bit more complex, but it looks like it's pre-installed on macOS. Let me test a little bit...

@nfriedly
Author

nfriedly commented Feb 8, 2024

Ok, I think that works! I've switched the script from wget to curl, which appears to be built-in on MacOS, and has a wrapper that makes it work in Windows PowerShell.

@CommanderJoy Please refresh the page and then try again from the start. The generated commands should all be curl instead of wget now, and that should work without installing anything additional on your mac.

@CommanderJoy

CommanderJoy commented Feb 8, 2024

Hooray!! That worked!! Thank you so much for spending the time on this so that those of us with Macs who might not have all the tech-savvy can accomplish this. You are very generous with your help, it is MUCH appreciated! I really like getting to the bottom of things like this to understand why something might not work, and hopefully, finally resolve it. On behalf of us Mac folks, thank you!

Update: well, it looks like there is still an issue. The files show up in the folder, but there is no audio in them. Each file is 249 KB. I emailed the Terminal output so you could see and maybe figure it out.

@nfriedly
Author

nfriedly commented Feb 9, 2024

Apologies @CommanderJoy, in my swapping back and forth between windows and mac, I ended up testing slightly different versions of the script. The "real" curl needs a -L (or --location) flag to know to follow redirects, which is what each of those 249kb files was. But, windows doesn't have the real curl tool, it has some microsoft-developed lookalike that doesn't work quite the same. It follows redirects by default, and throws an error if you add the -L or --location flag.

So, I ended up changing the script to emit a curl command with -L for Mac users, and a wget command for everyone else.

I've now tested the updated script on both windows and mac (the same version of it on both this time!), and was able to successfully download a book on each.
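
The platform switch described above can be sketched as a standalone function. It mirrors the `isMac` check in the script, but takes the user agent as a parameter so it can run outside a browser. `downloadCommand` is a hypothetical name, and the user agent strings, filename, and URL are example values only.

```javascript
// Hypothetical helper mirroring the script's logic: curl with -L (follow
// redirects) on macOS, wget everywhere else.
function downloadCommand(userAgent, filename, url) {
  const isMac = userAgent.includes('Macintosh');
  const cmd = isMac ? 'curl -L -o' : 'wget -O';
  return `${cmd} "${filename}" "${url}"`;
}

// Example user agent strings (illustrative values).
const macUA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:122.0) Gecko/20100101 Firefox/122.0';
const winUA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:122.0) Gecko/20100101 Firefox/122.0';

const macCommand = downloadCommand(macUA, 'Book - 01 - Intro.m4a', 'https://example.com/t1');
const winCommand = downloadCommand(winUA, 'Book - 01 - Intro.m4a', 'https://example.com/t1');
console.log(macCommand);
console.log(winCommand);
```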

@CommanderJoy

CommanderJoy commented Feb 9, 2024 via email

@CommanderJoy

CommanderJoy commented Feb 11, 2024 via email

@Bostwickenator

Bostwickenator commented Feb 11, 2024 via email

@nfriedly
Author

nfriedly commented Feb 12, 2024

Huh, apparently Chirp is a little inconsistent about their audio formats. I just checked 15 books from my library and 14 were indeed m4a's (AAC codec), but the last one was actually .mp3 (mislabeled as .m4a by this script.) Unfortunately, I'm not sure if there is a straightforward way to handle that, as we're setting the filename before the file is downloaded, but it has to be downloaded before we could do anything to determine the file type.

A renaming script for the ones that turn out to be mp3's would be a good idea, but probably not something I'm going to have done today.
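
For anyone who wants to attempt such a renaming script, one possible starting point is to look at the first few bytes of each downloaded file. This is a rough sketch under stated assumptions, not a tested tool: `detectAudioExtension` is a hypothetical helper, it relies on MP4-family files (including m4a) carrying the ASCII marker "ftyp" at bytes 4-7, and on MP3 files starting with either an "ID3" tag or an MPEG Layer III frame header. Files that match neither pattern return null.

```javascript
// Hypothetical helper: guess the extension from the leading bytes of a file.
// `bytes` is a Uint8Array (or plain array) holding at least the first 8 bytes.
function detectAudioExtension(bytes) {
  const ascii = (start, end) => String.fromCharCode(...bytes.slice(start, end));
  // MP4 container (m4a, mp4, ...): "ftyp" box type at offset 4.
  if (bytes.length >= 8 && ascii(4, 8) === 'ftyp') return 'm4a';
  // MP3 with an ID3v2 tag at the start.
  if (bytes.length >= 3 && ascii(0, 3) === 'ID3') return 'mp3';
  // Bare MPEG audio frame: 11 sync bits set, layer bits 01 (Layer III).
  if (bytes.length >= 2 && bytes[0] === 0xff &&
      (bytes[1] & 0xe0) === 0xe0 && (bytes[1] & 0x06) === 0x02) return 'mp3';
  return null; // unknown: leave the file name alone
}

// Sample headers built by hand for illustration.
const m4aHeader = Uint8Array.from([0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70]);
const id3Header = Uint8Array.from([0x49, 0x44, 0x33, 0x04, 0x00, 0x00, 0x00, 0x00]);
const frameHeader = Uint8Array.from([0xff, 0xfb, 0x90, 0x00, 0x00, 0x00, 0x00, 0x00]);
console.log(detectAudioExtension(m4aHeader), detectAudioExtension(id3Header), detectAudioExtension(frameHeader));
```

A full script would still need to read each file's first bytes and rename the file, which is left out here.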

@Bostwickenator I just took a look at your combining script - that's pretty cool! (I don't want to add the cd .. to the main script, because I think leaving the user in the book directory is a bit easier for new users, but of course you're welcome to add it to your fork :)

I think what ultimately needs to happen is that someone needs to turn this into a browser extension that downloads all of the files to the user's Downloads folder, figures out the correct filename, and potentially even combines them to a single file with a wasm'd version of ffmpeg. I might do it eventually, but not today...

@Bostwickenator

Bostwickenator commented Feb 12, 2024 via email

@nfriedly
Author

Yeah, the browser extension should probably normalize the metadata also...

Heads up the FFmpeg step isn't super fast, only 70-100x playback speed if I remember correctly.

I think there's a way to tell ffmpeg to copy over the audio to the new file without re-encoding it, which should be faster. Something like https://superuser.com/a/1156327/351654
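
As a rough sketch of what that could look like (an assumption on my part, not necessarily what the linked answer or the combining script does): ffmpeg's concat demuxer reads a text file listing the inputs, and `-c copy` copies the streams without re-encoding. `buildConcatPlan` and the `tracks.txt` file name are hypothetical, and this does not handle chapter metadata or the cover image.

```javascript
// Hypothetical helper: build the contents of an ffmpeg concat list file and
// the matching command line. Nothing is executed; both are returned as text.
function buildConcatPlan(trackFiles, outputFile) {
  // Inside the list file, a single quote in a path is written as '\''.
  const quote = path => `'${path.replaceAll("'", "'\\''")}'`;
  const listFile = trackFiles.map(path => `file ${quote(path)}`).join('\n');
  // -f concat: use the concat demuxer; -safe 0: accept arbitrary paths;
  // -c copy: copy audio as-is instead of re-encoding it.
  const command = `ffmpeg -f concat -safe 0 -i tracks.txt -c copy "${outputFile}"`;
  return { listFile, command };
}

// Made-up file names for illustration.
const plan = buildConcatPlan(['Book - 01 - Intro.m4a', 'Book - 02 - Chapter 1.m4a'], 'Book.m4a');
console.log(plan.listFile);
console.log(plan.command);
```

Stream copying generally requires that all tracks share the same codec and parameters, so a book with mixed formats would likely still need re-encoding.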

@Bostwickenator

Bostwickenator commented Feb 12, 2024 via email

@CommanderJoy

Just as an update, on a Mac and because I use Audiobook Builder which seems to be finicky about filenames, in the code I substituted mp3 and that works perfectly.

@Sbackus65

In the last day or so I have gotten the following comment after each wget command: "HTTP request sent, awaiting response... 401 Unauthorized

Username/Password Authentication Failed." I have tried with Chrome and Firefox, and with Windows 10 PowerShell and with Ubuntu Konsole...

@nfriedly
Author

@Sbackus65 Yep, it seems broken for me too now. Chirp must have changed something.

@CommanderJoy

Well the code is no longer working with Chirp, and I wonder if this is no longer possible. While I could insert the code into the javascript console and I could put the code into Terminal, all files are 37 kb and do not play. A sad day for sure. Is there any way to come up with a fix for whatever they did?
