Real-Time Crawler
A real-time crawler is a data collection tool built specifically for search engines and e-commerce websites. In other words, it is an advanced form of web scraper designed for extracting large volumes of data.
How Does It Work?
The client sends a request to the real-time crawler
The real-time crawler collects the requested information
The requested web data is sent back to the client
Data Delivery
With the real-time data delivery method, the requested data is returned on the same connection.
That is, the HTTPS connection you use to submit your request is the same one through which you receive your data, so the extracted data reaches you in real time.
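As a rough illustration of same-connection delivery, the following sketch sends one POST request and blocks until the response arrives on that same connection. The endpoint URL and the `url` payload field are hypothetical placeholders, not any particular provider's API; check your crawler's documentation for the real request format.

```python
import json
import urllib.request

# Hypothetical endpoint, used for illustration only.
CRAWLER_ENDPOINT = "https://crawler.example.com/v1/queries"


def build_realtime_request(target_url, endpoint=CRAWLER_ENDPOINT):
    """Build a POST request asking the crawler to fetch target_url.

    The payload shape ({"url": ...}) is an assumption for this sketch.
    """
    payload = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_realtime(request, timeout=60.0):
    """Send the request and wait for the data on the same connection."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_realtime(build_realtime_request("https://shop.example.com/item/1"))` would keep the connection open until the crawler responds, which is why a generous timeout is used here.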
Callback Data Delivery Method
The callback data delivery method removes the need to keep a connection open or to check your task status. It is more convenient because the real-time crawler sends you a notification when the data you need is ready.
Note that to use this delivery method, you need to set up a callback server. You can then create a job request and send it to the real-time crawler, which returns the job info and begins collecting the requested data.
Once the requested data is ready, the real-time crawler notifies you by sending a POST request to your callback server with a URL for downloading the data in JSON or HTML format.
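A callback server can be very small. The sketch below, using only the Python standard library, accepts the crawler's POST notification and stores its JSON body. It assumes the notification is a JSON object; field names such as `download_url` are hypothetical, since the actual notification format depends on the crawler you use.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Notifications received so far; a real service would queue or persist them.
received_notifications = []


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            notification = json.loads(body)
        except ValueError:
            # Body was not valid JSON: reject it.
            self.send_response(400)
            self.end_headers()
            return
        received_notifications.append(notification)
        # Acknowledge the notification with an empty 200 response.
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def start_callback_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

In practice the server must be reachable from the crawler, so it would listen on a public address rather than `127.0.0.1`, and your code would then download the data from the URL contained in each notification.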
If you need IPs for e-commerce platforms or social media, consider roxlabs dedicated datacenter IPs: fast, easy to set up, and with unlimited traffic.