Do hosts block bots without asking?

UKSBD

Moderator
  • Dec 30, 2005
    13,042
    1
    2,840
    I recently moved the hosting on one of my sites

    Looking at the raw stats, from 19:00 until 22:00 this evening it's had 23,000 hits:

    6,000 return 200

    1,500 return 301

    15,000+ return 404

    The majority of the 404s are Ahrefs, Semrush, Majestic and Moz.

    I have none of these blocked in my robots.txt or .htaccess file.

    Do hosts block bots without the client's permission?
     

    WESH.UK

    Free Member
  • Aug 11, 2018
    142
    40
    Greater London
    wesh.uk
    Do hosts block bots without the client's permission?
    Typically, no, but it depends on your hosting company. Some of the many "unlimited" hosts do put severe limits and restrictions on known crawlers, so that their "unlimited" offerings aren't chewed up by sites being indexed or audited.

    What has your hosting company said about this?

    Have you looked at the URLs that these 404s are for? Could they be stale sitemaps or broken links from other sites that are sending the bots to the 404s repeatedly? As those are all SEO tools, it could also be a competitor checking their competitors with incorrect or outdated links and data.
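One way to check this is from the raw access log itself. The sketch below is only a rough example: it assumes the log is in the common Apache/Nginx "combined" format, and the user-agent fragments in `BOTS` and the lines in `SAMPLE` are made up for illustration, not taken from the real logs. It counts status codes per crawler and lists which paths each crawler got a 404 for.

```python
import re
from collections import Counter, defaultdict

# Assumption: Apache/Nginx "combined" log format.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: user-agent fragments for the crawlers named in the thread.
BOTS = {
    "ahrefsbot": "Ahrefs",
    "semrushbot": "Semrush",
    "mj12bot": "Majestic",
    "dotbot": "Moz",
}


def summarise(lines):
    """Return (status counts per bot, 404 path counts per bot)."""
    status_by_bot = defaultdict(Counter)
    paths_404 = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        bot = next((name for frag, name in BOTS.items() if frag in agent), None)
        if bot is None:
            continue  # not one of the crawlers we are interested in
        status = match.group("status")
        status_by_bot[bot][status] += 1
        if status == "404":
            paths_404[bot][match.group("path")] += 1
    return status_by_bot, paths_404


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2024:19:00:01 +0000] "GET /about/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '203.0.113.5 - - [01/Jan/2024:19:00:09 +0000] "GET /about/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '198.51.100.7 - - [01/Jan/2024:19:01:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)"',
    '192.0.2.44 - - [01/Jan/2024:19:02:00 +0000] "GET /about/ HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

status_by_bot, paths_404 = summarise(SAMPLE)
for bot, counts in status_by_bot.items():
    print(bot, dict(counts), paths_404[bot].most_common(5))
```

To run it on a real log, pass an open file instead of `SAMPLE`, e.g. `summarise(open("access.log"))` (the file name is a placeholder). If the 404 paths turn out to be pages that load fine in a browser, stale links are unlikely to be the cause.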

    Keep in mind that "hits" are very different from visitors and page views, too.
     
    We certainly have a bad-bots block list. It's not just about restricting bandwidth usage, although it certainly helps with that; it is very much part of a good anti-malware setup. Most bad bots are crawling to find backdoors into WP etc., and all of them take resources away from customers. The level of blocking will depend on the host, though. We don't block legitimate crawlers like Ahrefs or other SEO tools, just known bad bots. A bad bot would be sent a 406, not a 301 or 404.
     

    UKSBD

    Have you looked at the URLs that these 404s are for? Could they be stale sitemaps or broken links from other sites that are sending the bots to the 404s repeatedly?

    The vast majority are existing URLs that are visitable by others; something just appears to be blocking the bots.

    I don't particularly mind, it would just be nice to know which are being blocked.
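A quick way to get a rough idea of which ones are affected is to request the same page with different user-agent strings and compare the status codes. This is only a sketch under assumptions: the user-agent strings below are approximate and may be out of date, and `compare` and `differing` are illustrative helper names. It will not reproduce a block that is based on the crawler's IP address rather than its user agent, and `urlopen` follows redirects, so a 301 shows up as the status of the final page.

```python
import urllib.error
import urllib.request

# Assumption: approximate user-agent strings; check each crawler's own
# documentation for the current ones.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "AhrefsBot": "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
    "SemrushBot": "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)",
    "MJ12bot": "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)",
    "DotBot": "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)",
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for a GET request sent with this user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as HTTPError


def compare(url, fetch=fetch_status):
    """Map each user-agent name to the status code it receives for url."""
    return {name: fetch(url, agent) for name, agent in USER_AGENTS.items()}


def differing(results, baseline="browser"):
    """Names whose status code differs from the baseline user agent's."""
    return sorted(
        name
        for name, status in results.items()
        if name != baseline and status != results[baseline]
    )


# Example (placeholder URL, makes real network requests):
# results = compare("https://example.com/some-page/")
# print(results, differing(results))
```

If a page returns 200 for the browser string but 404 or 406 for a crawler string, something between the request and the site is treating that user agent differently.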
     