Scraping data from other directory sites to a site of my own... legal?

movietub

Free Member
Nov 6, 2008
4,858
1,106
Hi all,

A new project I'm researching...

I want to build a directory listing site for a niche sector that is currently dominated by 3 established sites.

Initially I want to add several thousand free listings to my database, essentially an aggregate of businesses identified by scraping the existing listing sites, and also by automated general web searching for relevant contact and company details.

Once the site is established, the businesses listed would be contacted individually and invited to 'upgrade' their listing for a small charge. But initially, and for quite some time, the listings would be free and also relatively worthless unless and until the directory site gains some prominence.

So the question is: If I scrape sites that already have business profiles, and replicate only what is publicly available and not copyright protected, is that legal?

The data I would scrape and use would be:
-company name
-contact number
-email
-web address
-type of business
-address

(in a nutshell, everything you might find on a business card)

What I would not scrape is:
-business description (as this may be considered the copyright of the person who wrote it)
-any ranking information or other content generated by the site that is being scraped

So basically, any data that could have been submitted specifically to be shown on the original site will be left out. Anything that can (or could) be gained simply by visiting the business's own website or looking in a local business directory will be scraped.
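
A minimal sketch of that split, assuming each scraped listing has already been parsed into a Python dict. The field names, the helper name and the sample record are all hypothetical and only illustrate keeping the "business card" fields while dropping descriptions and rankings:

```python
# Hypothetical key names for the "business card" fields listed above.
ALLOWED_FIELDS = (
    "company_name",
    "contact_number",
    "email",
    "web_address",
    "business_type",
    "address",
)


def keep_business_card_fields(listing):
    """Return a copy of the listing holding only allowed, non-empty fields."""
    return {key: listing[key] for key in ALLOWED_FIELDS if listing.get(key)}


# Made-up record standing in for one parsed listing.
sample = {
    "company_name": "Example Ltd",
    "contact_number": "01234 567890",
    "email": "info@example.com",
    "web_address": "https://example.com",
    "business_type": "Plumber",
    "address": "1 High Street, Anytown",
    "description": "Text written for the source site",  # dropped
    "ranking": 4.5,  # dropped
}

filtered = keep_business_card_fields(sample)
print(sorted(filtered))
```

Using an allow-list rather than a block-list means any extra field the source site adds later is dropped by default.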

Thoughts? :)
 

movietub

"let me know your IP number before you do it."
Well, I wouldn't even know the IP address myself! You wouldn't scrape entire sites from a single, static IP.

But even if I did use my own static IP, your site hasn't taken the most basic steps to a) prevent scraping or b) be able to take any form of action or bring a claim against anyone.

Your site is perhaps the easiest to scrape that I have ever encountered. There isn't even pagination to navigate around; you could get every single listing you have on there by scraping just 3 clicks deep, and it could be done in less than 10 minutes. Feel free to PM me about that if it concerns you!

Also, your terms of use do not prohibit commercial use or replication, and don't mention any usage policy with regard to scraping. As far as I can see from just a few minutes' looking, all the information is in the public domain in any case, so there are no copyright limitations.

If you detailed prohibited usage scenarios in your terms, you could at least claim for trespass on your server if anyone ever did replicate all your content.
 