High Physical Memory Usage

UKSBD

Moderator
  • Dec 30, 2005
    13,033
    1
    2,831
Can traffic (bots, scrapers, crawlers, etc.) cause high physical memory usage on its own, or should I be looking for other problems?

My site is frequently hit hard by multiple bots, scrapers, etc. and has very high physical memory usage. I block ranges of IPs using the IP Blocker in cPanel, but then the bots just receive 404 errors. The more persistent ones get blocked from the whole server.

I've slowed down Bingbot, Slurp and Googlebot, but when they are all crawling at the same time I can get 100,000+ hits in a few hours. Would the traffic alone be causing the high memory usage?
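(For reference, slowing crawlers down in robots.txt typically looks like the sketch below. The delay values and bot names are illustrative; Bing and Yahoo's Slurp honour Crawl-delay, but Googlebot does not - its rate has to be set in Google's Webmaster Tools instead.)

```text
# Illustrative robots.txt throttling - values are placeholders.
User-agent: bingbot
Crawl-delay: 10

User-agent: Slurp
Crawl-delay: 10
```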
     

    jacobc

    Free Member
    Jan 28, 2012
    253
    49
Yes, that is possible, but that does seem like a lot of activity. Have you done a WHOIS on the IPs to see if they look legitimate?

If you haven't already, I would suggest looking at any additional caching you can do. We actually blogged today about a client having resource issues where adding some caching made a massive difference - https://www.catalyst2.com/blog/want-make-wordpress-site-faster

    Regards

    Jacob
     
    Upvote 0

    UKSBD

    Moderator
  • Dec 30, 2005
    13,033
    1
    2,831

The Bingbot, Slurp and Googlebot IPs are all genuine.

I get a few others faking it to look like Googlebot, but I tend to block those.
    I also get hit a lot by Moz, Ahrefs, Majestic, Baidu and Exabot, which I try to slow down.

It's not a WordPress site.
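(On the fake-Googlebot point: the usual way to check a claimed Googlebot IP is reverse DNS, a hostname suffix check, then a forward-DNS confirmation. A minimal sketch, with illustrative hostnames:)

```python
# Sketch: verify a claimed Googlebot IP via reverse DNS, then
# forward-confirm the hostname. Example names are illustrative.
import socket

def valid_google_host(host):
    """Pure suffix check on the reverse-DNS hostname."""
    return host.endswith(".googlebot.com") or host.endswith(".google.com")

def is_googlebot(ip):
    try:
        host = socket.gethostbyaddr(ip)[0]  # reverse DNS lookup
    except socket.herror:
        return False
    if not valid_google_host(host):
        return False
    # Forward-confirm: the hostname must resolve back to the same IP.
    return ip in {info[4][0] for info in socket.getaddrinfo(host, None)}
```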
     
    Upvote 0

    UKSBD

    Moderator
  • Dec 30, 2005
    13,033
    1
    2,831
    My limits are all set really high as I get 508 errors when they are too low.

I block via the IP Blocker in cPanel, via .htaccess and with my robots.txt.
    The really bad ones my host blocks from the whole server.
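(Blocking IPs via .htaccess typically looks like the fragment below - Apache 2.2-style syntax, and the addresses are placeholders:)

```apacheconf
# Deny specific IPs/ranges in .htaccess; addresses are illustrative.
order allow,deny
deny from 203.0.113.0/24
deny from 198.51.100.7
allow from all
```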

    It's a constant battle though, checking logs and weeding out the bad bots :(
    Could the traffic alone cause the high memory and process usage, or should I be looking at something else as well?
     
    Upvote 0

    jacobc

    Free Member
    Jan 28, 2012
    253
    49
Yes, realistically your host should set up something like mod_security rules to block all the bad user agents, and see if that helps. They should also be rate limiting the number of connections a single IP can make over a given port.
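(For what it's worth, a user-agent block in mod_security rules might look something like this - ModSecurity 2.x syntax, with the rule id and bot names as placeholders:)

```apacheconf
# Deny requests whose User-Agent matches a list of bad bots.
# Rule id and bot names are illustrative - adjust to taste.
SecRule REQUEST_HEADERS:User-Agent "@pm MJ12bot SemrushBot BadBot" \
    "id:100001,phase:1,deny,status:403,msg:'Blocked bad bot'"
```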

Out of interest, what software are you using? There may be plugins that can automate the quite manual process you are currently going through.
     
    Upvote 0

    UKSBD

    Moderator
  • Dec 30, 2005
    13,033
    1
    2,831
The problem with limiting IPs automatically is that most are legitimate bots, as they crawl my site so regularly; I wouldn't want to risk blocking the legitimate ones.
    I do slow them down in my robots.txt and most honour that.

I'm using cPanel and WHM. I grab the raw access log, save it as a CSV and check for bad traffic in there.
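(That log-checking step can be partly scripted - a small sketch that tallies hits per client IP to spot the heaviest crawlers; the sample log lines are made up for illustration:)

```python
# Sketch: count hits per client IP in a raw access log to spot
# heavy crawlers. Assumes the IP is the first field of each line.
from collections import Counter

def top_ips(log_lines, n=10):
    """Return the n most frequent client IPs, heaviest first."""
    counts = Counter(line.split()[0] for line in log_lines if line.strip())
    return counts.most_common(n)

sample = [
    '66.249.66.1 - - "GET / HTTP/1.1" 200',
    '66.249.66.1 - - "GET /about HTTP/1.1" 200',
    '157.55.39.5 - - "GET / HTTP/1.1" 200',
]
print(top_ips(sample, 2))  # heaviest IP first
```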

I can then block IPs with the IP Blocker in cPanel. I also have a plugin, "ConfigServer Security & Firewall", which I can use to block from the whole server; I seldom use this, though, and just let my host know about the bad ones every now and then, which he then blocks.

I don't like things to be too automated, as there is a risk of blocking things I don't want to.
     
    Upvote 0

    jacobc

    Free Member
    Jan 28, 2012
    253
    49
Sounds like it will work, but it's a bit of an endless task. Happy to look at your site/hosting in more detail if you want, and see if I can recommend a solution that requires less manual work to reduce the load on your site.
     
    Upvote 0

    UKSBD

    Moderator
  • Dec 30, 2005
    13,033
    1
    2,831
    To be honest I'm getting pretty good at monitoring and controlling the bots.

The main question, really, was: is it likely to be the traffic causing the problems, or should I not be too concerned about limiting and look elsewhere for a problem?

If it is just traffic, I will carry on limiting and slowing them down.
     
    Upvote 0
Wherever the traffic is coming from, it will have a similar impact on the hosting, so if you know you are getting high volumes of bots, reducing their activity is the first thing to look at to reduce server load. If you do that and resource usage doesn't drop much, then look at the application driving the site: have you changed or added anything recently that could cause high resource usage? You should also have it scanned for malware, if you haven't already.
     
    Upvote 0

    Edith@TerraNetwork

Bots can cause high memory usage if pages are assembled via PHP/MySQL, as they can move through the site very fast. But it is not easy to find the actual cause of high RAM usage, as it could also be a combination of bots and resource-intensive site scripts.

    On cPanel servers, System Health & Server Status in WHM will give you additional info on how the server is being used at any given time.

Generally, caching will help, as it will lower the PHP/MySQL usage and therefore RAM, so if you can, cache all pages as static HTML. Using nginx as a reverse proxy can then further lower the load on the Apache server. In short, optimising the site is always a good call, even if bots are part of the problem.
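(A minimal sketch of nginx caching in front of Apache - ports, paths and timings are all illustrative:)

```nginx
# nginx as a caching reverse proxy in front of Apache; values illustrative.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=site:10m max_size=1g;

server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8080;   # Apache listening behind nginx
        proxy_cache site;
        proxy_cache_valid 200 10m;          # cache successful pages for 10 min
    }
}
```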
     
    Upvote 0

    astutiumRob

    Free Member
    May 5, 2004
    1,312
    241
    London
    Main question really was, is it likely to be the traffic causing the problems or should I not be too concerned about limiting and look elsewhere for a problem?
If blocking the intensive access drops the load significantly, then yes, it's probably the volume of "visitors" causing the resource spikes. If it happens even with the blocking, then the server is overloaded and further investigation is needed: what do top/atop/mtop tell you?

Caching or switching to a more "static" site could help, as will "controlling" the traffic. Do you really want Majestic or Baidu listing your site? Do they send you traffic that converts (generates income), or are they just using up your transfer (costing you money)?

Another option is to upgrade the hardware. Depending on how close "normal" usage is to the server's capacity, throwing more RAM/CPU at it may well solve the issue.
     
    Upvote 0

    Join UK Business Forums for free business advice