Advanced Souq.com Scraper

I was working on a personal machine learning project that required scraping public data from Souq.com to train my model.

After an intensive search on the web, I didn’t find any scraper powerful enough to get the information I wanted, so I decided to develop one.

My scraper is advanced because it scrapes almost all of the public information on Souq.com very quickly, and it can capture the whole website for offline use in BI analysis, machine learning, or any other purpose.

The scraper can scrape the whole of Souq.com in 1 to 2 days at most on regular internet speeds (4 to 16 Mbps); with more bandwidth it will finish faster.

Scraper Model

Souq.com Scraped Model
  • Categories
  • Products (Bundles, Related Products, Attributes, Configurations)
  • Sellers
  • Reviews & Ratings (Sellers, Products)
  • Product Delivery
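
To make the model above more concrete, here is a minimal Python sketch of how these entities could relate to each other. It is only an illustration: every class name, field name, and the 1 to 5 rating scale are my assumptions for this example and are not taken from the actual project code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Review:
    rating: int        # assumed 1-5 star scale (hypothetical)
    text: str = ""


@dataclass
class Seller:
    name: str
    reviews: List[Review] = field(default_factory=list)


@dataclass
class Product:
    title: str
    seller: Optional[Seller] = None
    attributes: Dict[str, str] = field(default_factory=dict)
    configurations: List[str] = field(default_factory=list)
    bundles: List["Product"] = field(default_factory=list)
    related: List["Product"] = field(default_factory=list)
    reviews: List[Review] = field(default_factory=list)
    delivery: Optional[str] = None  # free-text delivery information

    def average_rating(self) -> Optional[float]:
        """Return the mean review rating, or None if there are no reviews."""
        if not self.reviews:
            return None
        return sum(r.rating for r in self.reviews) / len(self.reviews)


@dataclass
class Category:
    name: str
    products: List[Product] = field(default_factory=list)
    subcategories: List["Category"] = field(default_factory=list)


# Usage: build a tiny, made-up catalogue and query it.
seller = Seller(name="Example Seller")
phone = Product(
    title="Example Phone",
    seller=seller,
    attributes={"color": "black"},
    reviews=[Review(rating=5), Review(rating=4)],
)
electronics = Category(name="Electronics", products=[phone])
```

Keeping reviews on both sellers and products mirrors the "Reviews & Rating (Seller, Products)" item in the list, and nesting categories is one simple way to represent a category tree.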

The project is functional but not yet complete; still, it can serve as a reference for building more advanced scrapers.

It’s now open source and can be accessed on GitHub:

https://github.com/roofman2008/AdvancedSouqScrapper

Scraper in Operation (Video)
