Add Hashgraph Online docs scraper (standards-sdk + registry-broker) by kantorcodes · Pull Request #2660 · freeCodeCamp/devdocs

Closes #2642

This adds a new hol scraper for Hashgraph Online documentation under https://hol.org/docs/, scoped to:

  • libraries/standards-sdk/*
  • registry-broker/*
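
As a rough illustration of the scoping above, the following stdlib-only Ruby sketch checks whether a URL falls under one of the two paths. It is not the code from lib/docs/scrapers/hol.rb: the names BASE_URL, ONLY_PATTERNS, and in_scope? are hypothetical, and the real scraper presumably expresses the same restriction through DevDocs scraper options rather than a standalone method.

```ruby
# Hypothetical sketch: the constants and method below are illustrative names,
# not taken from the PR. The patterns mirror the two paths listed above.
BASE_URL = 'https://hol.org/docs/'

ONLY_PATTERNS = [
  %r{\Alibraries/standards-sdk/},
  %r{\Aregistry-broker/}
].freeze

# Returns true when the URL sits under BASE_URL and its remaining path
# starts with one of the in-scope prefixes.
def in_scope?(url)
  return false unless url.start_with?(BASE_URL)

  path = url.delete_prefix(BASE_URL)
  ONLY_PATTERNS.any? { |pattern| pattern.match?(path) }
end
```

Anchoring each pattern with \A keeps a page that merely mentions registry-broker deeper in its path from slipping into the crawl.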

What was added

  • lib/docs/scrapers/hol.rb
  • lib/docs/filters/hol/clean_html.rb
  • lib/docs/filters/hol/entries.rb
  • public/icons/docs/hol/16.png
  • public/icons/docs/hol/16@2x.png
  • public/icons/docs/hol/SOURCE

The scraper applies Docusaurus-specific HTML cleanup, extracts section-level entries from h2/h3 ids, and limits the crawl scope to the two paths above to avoid non-reference content.
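
To make the section-level extraction concrete, here is a minimal sketch of the idea, assuming entries are built from h2/h3 headings that carry an id. It is not the code from lib/docs/filters/hol/entries.rb: HEADING and section_entries are hypothetical names, and a regular expression stands in for the parsed HTML document that an actual DevDocs filter would work on, so that the example runs with the Ruby standard library alone.

```ruby
# Hypothetical sketch: HEADING and section_entries are illustrative names,
# not taken from the PR. A regex replaces real HTML parsing for brevity.
HEADING = %r{<h([23])[^>]*\bid="([^"]+)"[^>]*>(.*?)</h\1>}m

# Returns [name, fragment_id] pairs for every h2/h3 that has an id.
# Headings without an id, and other heading levels, are skipped.
def section_entries(html)
  html.scan(HEADING).map do |_level, id, inner|
    name = inner.gsub(/<[^>]+>/, '').strip # drop inline tags such as <code>
    [name, id]
  end
end

sample = '<h2 id="create-client">Create <code>Client</code></h2>' \
         '<h3 class="anchor" id="options">Options</h3>' \
         '<h2>No id</h2><h4 id="too-deep">Too deep</h4>'

section_entries(sample).each { |name, id| puts "#{name} -> ##{id}" }
```

Each pair maps an entry name to the fragment that deep-links into the page, which is what lets individual sections, and not only whole pages, show up in search.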

Validation

Validated in Docker with Ruby 3.4.8:

  • bundle exec thor docs:list | grep -i hol
  • bundle exec thor docs:page hol libraries/standards-sdk/hcs-10/api/
  • bundle exec thor docs:page hol registry-broker/api/client/
  • bundle exec thor docs:generate hol --force

If you’re adding a new scraper, please ensure that you have:

  • Tested the scraper on a local copy of DevDocs
  • Ensured that the docs are styled similarly to other docs on DevDocs
  • Added these files to the public/icons/docs/hol/ directory:
    • 16.png: a 16×16 pixel icon for the doc
    • 16@2x.png: a 32×32 pixel icon for the doc
    • SOURCE: a text file containing the URL to the page the image can be found on or the URL of the original image itself