x-ray
The next web scraper. See through the <html> noise
Financial Contributions
Top financial contributors
Organizations
$140 USD since Apr 2019
$128 USD since Oct 2019
$108 USD since Feb 2019
$98 USD since Dec 2020
$62 USD since Aug 2018
$12 USD since Dec 2020
$10 USD since May 2019
$4 USD since May 2019
$2 USD since Aug 2018
$2 USD since Jun 2020
Individuals
$20 USD since Jan 2017
$1 USD since Jul 2024
x-ray is all of us
Our contributors 13
Thank you for supporting x-ray.
Matthew Mueller
Nethome.wiki
Backers
$140 USD
ScrapingBee
Backers
$128 USD
Crawlbase
Backers
$108 USD
DateiWiki
Backers
$98 USD
John Packel
Backers
$20 USD
Itqna
Backers
$12 USD
Bluehost vs S...
Backers
$10 USD
Unhype
Backers
$2 USD
Scrapingdog
Backers
$2 USD
Budget
Transparent and open finances.
Credit from ScrapingBee to x-ray
Credit from Nethome.wiki to x-ray
$432.45 USD
$432.45 USD
$73.00 USD
About
Features
Flexible schema: Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing.
Composable: The API is entirely composable, giving you great flexibility in how you scrape each page.
Pagination support: Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose what you've already scraped.
Crawler support: Start on one page and move to the next easily. The flow is predictable, following a breadth-first crawl through each of the pages.
Responsible: X-ray has support for concurrency, throttles, delays, timeouts and limits to help you scrape any page responsibly.
Pluggable drivers: Swap in different scrapers depending on your needs. Currently supports HTTP and PhantomJS drivers. In the future, I'd like to see a Tor driver for requesting pages through the Tor network.
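To make the crawler and pagination-limit features above more concrete, here is a minimal, dependency-free sketch of a breadth-first crawl that stops after a fixed number of pages. It is not x-ray's implementation and does not use x-ray's API: the in-memory `site` map, the `crawlBreadthFirst` function, and the `fetchPage` callback are all hypothetical names invented for illustration, and real scraping would fetch and parse HTML instead of reading from a map.

```javascript
// Hypothetical in-memory "site": each URL maps to the items found on that
// page and the links it points to. This stands in for real HTTP requests.
const site = new Map([
  ['/a', { items: ['a1', 'a2'], links: ['/b', '/c'] }],
  ['/b', { items: ['b1'], links: ['/d'] }],
  ['/c', { items: ['c1'], links: ['/a'] }], // links back to the start page
  ['/d', { items: ['d1'], links: [] }],
]);

// Visit pages breadth-first from `start`, stopping once `limit` pages
// have been visited. `fetchPage` is an assumed callback returning
// { items, links } for a URL, or undefined if the page is missing.
function crawlBreadthFirst(start, fetchPage, limit = Infinity) {
  const queue = [start];          // FIFO queue gives breadth-first order
  const seen = new Set([start]);  // avoids re-queuing pages already found
  const visited = [];
  const items = [];

  while (queue.length > 0 && visited.length < limit) {
    const url = queue.shift();
    const page = fetchPage(url);
    if (!page) continue; // skip pages that could not be loaded

    visited.push(url);
    items.push(...page.items);

    for (const link of page.links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return { visited, items };
}

const result = crawlBreadthFirst('/a', (url) => site.get(url), 3);
console.log(result.visited); // [ '/a', '/b', '/c' ]
console.log(result.items);   // [ 'a1', 'a2', 'b1', 'c1' ]
```

Because items are collected page by page, everything gathered before the limit is reached (or before a page fails) is still available, which mirrors the idea of not losing what has already been scraped.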
Our team
Matthew Mueller