v0.1.0
The first public release of goodread: the full command surface, the goodread library, open-endpoint routing, and the crawl pipeline.
goodread is a single pure-Go binary that turns public
Goodreads pages into structured records: look up a book, an author, a series, a
list, a genre, a user, or a quote, search the catalog, read a shelf, and crawl
in bulk. It talks to www.goodreads.com over plain HTTPS with no API key, so
there is nothing to sign up for and nothing to pay for.
What you get
- Search the catalog. goodread search queries the open autocomplete endpoint for books and authors, with --books for rich book records and --html for the full search page.
- Look up records. book, author, series, list, genre, user, and quote each take an id or a URL and return a structured record, JSON-LD first with an HTML-selector fallback.
- Read shelves. goodread shelf reads a reader's bookshelf from the public RSS feed by default, with --html and --max-pages to walk the paginated shelf when you need more.
- Find related work. similar and reviews read what a book page links to, and id classifies a URL into (entity, id) without fetching.
- Crawl in bulk. seed discovers URLs from the sitemap, crawl drains the queue into a local SQLite store, and db inspects and exports what you collected. cache manages the on-disk page cache.
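To make the offline classification done by the id command concrete, here is a minimal sketch of mapping a URL to an (entity, id) pair with no network access. The classify function and the path patterns are assumptions for illustration, based on commonly seen Goodreads URL shapes; they are not taken from goodread's source, and the real command may recognise more entities or different forms.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// patterns pairs an entity name with a path pattern. The patterns are
// assumptions for this sketch, not goodread's own routing table.
var patterns = []struct {
	entity string
	re     *regexp.Regexp
}{
	{"book", regexp.MustCompile(`^/book/show/(\d+)`)},
	{"author", regexp.MustCompile(`^/author/show/(\d+)`)},
	{"series", regexp.MustCompile(`^/series/(\d+)`)},
	{"list", regexp.MustCompile(`^/list/show/(\d+)`)},
	{"user", regexp.MustCompile(`^/user/show/(\d+)`)},
	{"genre", regexp.MustCompile(`^/genres/([a-z0-9-]+)`)},
}

// classify (hypothetical) returns the entity kind and id encoded in a URL
// path. It only parses the string; nothing is fetched.
func classify(raw string) (entity, id string, ok bool) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", false
	}
	for _, p := range patterns {
		if m := p.re.FindStringSubmatch(u.Path); m != nil {
			return p.entity, m[1], true
		}
	}
	return "", "", false
}

func main() {
	entity, id, ok := classify("https://www.goodreads.com/book/show/2767052-the-hunger-games")
	fmt.Println(entity, id, ok) // prints: book 2767052 true
}
```

Because the match is purely on the path, a step like this can run before anything is queued or fetched.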
Open-endpoint routing
Goodreads sits behind an AWS WAF that intermittently challenges some HTML pages.
goodread routes around it where it can: search uses the autocomplete JSON
endpoint and shelf uses the public RSS feed, neither of which is challenged.
The commands that read /book/show/ (book, similar, reviews) can still meet a
challenge; when they do, goodread exits cleanly with code 5 and prints a hint
suggesting --cookies to lend a signed-in session. See troubleshooting.
The crawl pipeline
For more than a page at a time, the pipeline is seed to discover, crawl to
fetch and parse, and db to export. Everything lands in one SQLite file under
the data dir, with a content-addressed gzip page cache beside it so re-runs do
not re-fetch unchanged pages. goodread is polite by default: a two-second delay
between requests and two workers.
The goodread library
The parsing and fetching live in their own package so you can read Goodreads pages from your own program without the CLI:
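import "github.com/tamnd/goodread-cli/pkg/goodread"

// ctx is supplied by the caller, presumably a context.Context
// such as context.Background().
c := goodread.New()
book, err := c.Book(ctx, "2767052") // look up a book by id
if err != nil {
	log.Fatal(err)
}
fmt.Println(book.Title, book.AvgRating)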
Independent and public-data only
goodread is an independent, open-source tool. It is not affiliated with, endorsed by, or sponsored by Goodreads or Amazon. It reads only public pages, at a polite default rate.
Install
go install github.com/tamnd/goodread-cli/cmd/goodread@latest
Prebuilt archives for Linux, macOS, Windows, and FreeBSD, plus Linux packages (deb, rpm, apk), SBOMs, and cosign-signed checksums, are on the release page. There is also a Homebrew cask and a Scoop entry:
brew install --cask tamnd/tap/goodread
The multi-arch container image is on GHCR:
docker run --rm ghcr.io/tamnd/goodread:0.1.0 search "the hunger games"
The binary is pure Go (CGO_ENABLED=0) with no runtime dependencies.