
Tuesday, July 11, 2017

Laravel 5.4: using Goutte to scrape large pages

Asked by: Hwong Alex


I need to scrape data from an outdated web system. The page looks like this: (screenshot not included)

I can fetch the data fine when I test a single page, but when I call it in a loop I hit an nginx 504 timeout. Pseudocode:

function scrapePage($pageNum) { /* fetch data from one page */ }
// <= so that the last page is scraped too
for ($currentPage = 1; $currentPage <= $totalPages; $currentPage++) { scrapePage($currentPage); }

I think the best way is to make it 'asynchronous': once one page has been scraped successfully, output something and go on to the next page. How can I do that?
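One possible way to picture the "one page, report, next page" idea is to split the loop into small batches, so that no single web request runs long enough to hit the nginx timeout. The sketch below is only an illustration under assumptions: `scrapeBatch`, its `$batchSize` parameter, and the stubbed `scrapePage` are hypothetical names, not part of Goutte or Laravel, and the real Goutte call would replace the stub.

```php
<?php
// Hypothetical stand-in for the real Goutte-based scraper of a single page.
function scrapePage(int $pageNum): array
{
    return ['page' => $pageNum, 'rows' => []];
}

/**
 * Scrape at most $batchSize pages starting at $startPage.
 * Returns the scraped results plus the page to resume from
 * (null once the last page has been processed).
 */
function scrapeBatch(int $startPage, int $totalPages, int $batchSize, callable $scraper): array
{
    // Never run past the last page.
    $lastPage = min($startPage + $batchSize - 1, $totalPages);

    $results = [];
    for ($page = $startPage; $page <= $lastPage; $page++) {
        $results[$page] = $scraper($page);
    }

    // Tell the caller where the next batch should start, if anywhere.
    $nextPage = $lastPage < $totalPages ? $lastPage + 1 : null;

    return ['results' => $results, 'nextPage' => $nextPage];
}
```

Each request (or each queued job) would then handle one batch, report progress, and pass `nextPage` on to the following call until it comes back as `null`. Running the whole loop from the command line instead of through nginx is another option, since the 504 comes from the web server's timeout rather than from the scraping itself.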


