Open Collective

x-ray

COLLECTIVE
Open source

The next web scraper. See through the <html> noise

Contribute


Become a financial contributor.

Financial Contributions

Backers (recurring contribution): Support us with a monthly donation and help us continue our activities. Starts at $2 USD / month.

Sponsors (recurring contribution): Become a sponsor and get your logo on our website and on our README on GitHub with a link to your site. Starts at $100 USD / month.

Donation (custom contribution): Make a custom one-time or recurring contribution.

Top financial contributors

Organizations

1. Nethome.wiki: $120 USD since Apr 2019
2. Crawlbase: $108 USD since Feb 2019
3. ScrapingBee: $108 USD since Oct 2019
4. DateiWiki: $78 USD since Dec 2020
5. Scraper API: $62 USD since Aug 2018
6. Itqna: $12 USD since Dec 2020
7. Bluehost vs Squarespace: $10 USD since May 2019
8. Scraloud: $4 USD since May 2019
9. Unhype: $2 USD since Aug 2018
10. Scrapingdog: $2 USD since Jun 2020

Individuals

1. John Packel: $20 USD since Jan 2017

x-ray is all of us

Our contributors 12

Thank you for supporting x-ray.

Nethome.wiki (Backers): $120 USD
Crawlbase (Backers): $108 USD
ScrapingBee (Backers): $108 USD
DateiWiki (Backers): $78 USD
Scraper API (Backers): $62 USD
    "Awesome project!"
John Packel (Backers): $20 USD
Itqna (Backers): $12 USD
Bluehost vs S... (Backers): $10 USD
Scraloud (Backers): $4 USD
    "Give value they deserve."
Unhype (Backers): $2 USD
Scrapingdog (Backers): $2 USD

Budget


Transparent and open finances.

Recent transactions:

+$2.00 USD, Completed, Contribution #110664
+$2.00 USD, Completed, Contribution #54355
+$2.00 USD, Completed, Contribution #44064

Today’s balance: $389.59 USD
Total raised: $389.59 USD
Total disbursed: --.-- USD
Estimated annual budget: $80.00 USD

About


Features

Flexible schema: Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing.

Composable: The API is entirely composable, giving you great flexibility in how you scrape each page.

Pagination support: Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose what you've already scraped.

Crawler support: Start on one page and move to the next easily. The flow is predictable, following a breadth-first crawl through each of the pages.

Responsible: X-ray has support for concurrency, throttles, delays, timeouts and limits to help you scrape any page responsibly.

Pluggable drivers: Swap in different scrapers depending on your needs. Currently supports HTTP and PhantomJS drivers. In the future, I'd like to see a Tor driver for requesting pages through the Tor network.
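
To make the crawler and "responsible" features above more concrete, here is a minimal sketch of a breadth-first crawl that honors a page limit and a delay between visits. It is not x-ray's implementation or API: `site`, `sleep`, `crawl`, `limit`, and `delayMs` are hypothetical names chosen for illustration, and the "site" is an in-memory link map rather than real HTTP requests.

```javascript
// Hypothetical in-memory "site": each page maps to the pages it links to.
// A real crawl would fetch each URL and extract its links instead.
const site = {
  "/": ["/a", "/b"],
  "/a": ["/c"],
  "/b": ["/c", "/d"],
  "/c": [],
  "/d": ["/"],
};

// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Visit pages breadth-first from `start`, stopping after `limit` pages
// and pausing `delayMs` milliseconds between visits.
async function crawl(start, { limit = Infinity, delayMs = 0 } = {}) {
  const visited = [];
  const seen = new Set([start]); // pages already queued, to avoid revisits
  const queue = [start]; // first-in, first-out gives breadth-first order

  while (queue.length > 0 && visited.length < limit) {
    const page = queue.shift();
    visited.push(page);

    for (const link of site[page] ?? []) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }

    // Only wait if another page is actually going to be visited.
    if (delayMs > 0 && queue.length > 0 && visited.length < limit) {
      await sleep(delayMs);
    }
  }
  return visited;
}
```

With this sample map, `crawl("/", { limit: 3, delayMs: 100 })` resolves to `["/", "/a", "/b"]`: both pages linked from the start page are visited before any page one level deeper, and the limit stops the crawl before `/c` is reached.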

Our team