Googlebot freezes Apache and server load increases




Arrrrgghh!
This pretty much sums up my research over the last couple of days. For several days now I have been seeing weird behavior from Apache: the load suddenly increases and some Apache child processes use up to 100% of the CPU.

Top shows that three Apache processes use the most CPU:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5439 www-data  20   0  626m 170m  64m S   76  4.3  51:43.32 apache2
 4355 www-data  20   0  680m 173m  58m S   43  4.4  34:08.04 apache2
 3522 www-data  20   0  630m 205m  64m S   39  5.2  39:03.98 apache2
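
To avoid picking the busy workers out by eye every time, the same check can be scripted. The following is only a rough sketch: the function name `busy_processes` and the `min_cpu` threshold are made up for illustration, and it assumes the default 12-column layout shown above (for example from `top -b -n 1`), which may differ with a customized top configuration.

```python
def busy_processes(top_output, command="apache2", min_cpu=30.0):
    """Return (pid, cpu_percent) for matching processes at or above min_cpu."""
    busy = []
    for line in top_output.splitlines():
        fields = line.split()
        # A process row has 12 columns and starts with a numeric PID;
        # this skips the header and any blank or summary lines.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        pid, cpu, name = int(fields[0]), float(fields[8]), fields[11]
        if name == command and cpu >= min_cpu:
            busy.append((pid, cpu))
    # Busiest process first.
    return sorted(busy, key=lambda item: item[1], reverse=True)


# Sample rows copied from the top output above.
top_sample = """\
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5439 www-data  20   0  626m 170m  64m S   76  4.3  51:43.32 apache2
 4355 www-data  20   0  680m 173m  58m S   43  4.4  34:08.04 apache2
 3522 www-data  20   0  630m 205m  64m S   39  5.2  39:03.98 apache2
"""

for pid, cpu in busy_processes(top_sample):
    print(pid, cpu)
```

The threshold of 30% is arbitrary; pick whatever separates a stuck worker from normal traffic on your server.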

If we take a closer look at the open connections using the lsof command, we see the following:

# lsof -i :80
COMMAND   PID     USER   FD   TYPE     DEVICE SIZE NODE NAME
apache2  3522 www-data   26u  IPv6 1210315466       TCP server:www->crawl-66-249-71-78.googlebot.com:54107 (CLOSE_WAIT)
apache2  4355 www-data   39u  IPv6 1210322237       TCP server:www->crawl-66-249-66-136.googlebot.com:36722 (CLOSE_WAIT)
apache2  5439 www-data   26u  IPv6 1210335205       TCP server:www->crawl-66-249-66-136.googlebot.com:62305 (CLOSE_WAIT)
apache2  5439 www-data   30u  IPv6 1210345350       TCP server:www->crawl-66-249-66-136.googlebot.com:40885 (CLOSE_WAIT)
apache2 13904 www-data    3u  IPv6 1210044289       TCP *:www (LISTEN)
apache2 14633 www-data    3u  IPv6 1210044289       TCP *:www (LISTEN)
apache2 14633 www-data   28u  IPv6 1210440119       TCP server:www->195.188.250.137:17518 (ESTABLISHED)
apache2 16314     root    3u  IPv6 1210044289       TCP *:www (LISTEN)

Surprise, surprise. The same PIDs from the top output (3522, 4355 and 5439) show up again. We also see that they no longer listen for new HTTP connections (three new child processes were spawned in the meantime). The old processes stay alive, though, because each of them still holds at least one connection to Googlebot in the CLOSE_WAIT state.
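
Matching PIDs between top and lsof by hand gets tedious, so here is a small sketch that groups the CLOSE_WAIT connections by process. It assumes the `lsof -i :80` column layout shown above (PID in the second column, connection and state at the end of the line); `close_wait_by_pid` and `NAME_RE` are names invented for this example.

```python
import re
from collections import defaultdict

# Matches the end of an lsof TCP line, e.g.
# "server:www->crawl-66-249-71-78.googlebot.com:54107 (CLOSE_WAIT)"
NAME_RE = re.compile(r"->(?P<remote>\S+):(?P<port>\d+) \((?P<state>[A-Z_0-9]+)\)$")


def close_wait_by_pid(lsof_output):
    """Map each PID to the remote hosts it still holds in CLOSE_WAIT."""
    stuck = defaultdict(list)
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and anything without a numeric PID.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        match = NAME_RE.search(line.strip())
        if match and match.group("state") == "CLOSE_WAIT":
            stuck[int(fields[1])].append(match.group("remote"))
    return dict(stuck)


# A few lines copied from the lsof output above.
lsof_sample = """\
COMMAND   PID     USER   FD   TYPE     DEVICE SIZE NODE NAME
apache2  3522 www-data   26u  IPv6 1210315466       TCP server:www->crawl-66-249-71-78.googlebot.com:54107 (CLOSE_WAIT)
apache2  5439 www-data   26u  IPv6 1210335205       TCP server:www->crawl-66-249-66-136.googlebot.com:62305 (CLOSE_WAIT)
apache2  5439 www-data   30u  IPv6 1210345350       TCP server:www->crawl-66-249-66-136.googlebot.com:40885 (CLOSE_WAIT)
apache2 14633 www-data   28u  IPv6 1210440119       TCP server:www->195.188.250.137:17518 (ESTABLISHED)
apache2 16314     root    3u  IPv6 1210044289       TCP *:www (LISTEN)
"""

for pid, remotes in sorted(close_wait_by_pid(lsof_sample).items()):
    print(pid, remotes)
```

Listening sockets and established connections are ignored, so only the processes that are actually stuck end up in the result.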

The question now is: what can I (and anyone else who experiences this problem) do? By definition, CLOSE_WAIT means that the remote side has closed the connection but the local process has not yet closed its end of the socket. Why does it only happen with Googlebot (which could point to an improper close from the remote side)?
If anyone has a solution to this problem, please let me know. And no, blocking Googlebot is not an option.
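
To make the definition of CLOSE_WAIT more concrete, here is a minimal loopback sketch in plain Python. It has nothing to do with Apache or Googlebot; it only shows the sequence of events: the peer closes, the local side reads end-of-file, and the socket stays in CLOSE_WAIT until the local side calls `close()`. The function name `demo_close_wait` is made up for this example.

```python
import socket


def demo_close_wait():
    """Return what the server side reads after the client has closed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # do not hang forever if something goes wrong

    client.close()                 # the "remote" side closes and sends FIN
    data = server_side.recv(1024)  # an empty read means the peer has closed

    # From here until the close() call below, the kernel keeps server_side
    # in CLOSE_WAIT. A process that never reaches its close() call leaves
    # the connection in that state indefinitely.
    server_side.close()
    listener.close()
    return data


print(repr(demo_close_wait()))
```

If you pause the script just before `server_side.close()` and run lsof against the Python process, the connection should show up as CLOSE_WAIT, just like the Apache children above.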

For now, the only temporary workaround is to kill the affected child processes. This is not dangerous, since all other HTTP connections are handled by the newly spawned processes, but it is not nice (remember, killing is not nice).
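
If the killing has to be done repeatedly, it can be wrapped in a few lines. This is a hedged sketch, not a recommendation: `terminate_workers` is a hypothetical helper, it defaults to a dry run, and it assumes that SIGTERM is enough to get rid of a stuck child, which may not hold for every worker. The `send_signal` parameter exists only so the function can be tried without touching real processes.

```python
import os
import signal


def terminate_workers(pids, dry_run=True, send_signal=os.kill):
    """Send SIGTERM to each PID; with dry_run=True only report the PIDs."""
    handled = []
    for pid in pids:
        if not dry_run:
            try:
                send_signal(pid, signal.SIGTERM)
            except ProcessLookupError:
                # The process has already exited on its own; nothing to do.
                continue
        handled.append(pid)
    return handled


# Dry run with the PIDs from the top output: nothing is actually killed.
print(terminate_workers([3522, 4355, 5439]))
```

Only pass `dry_run=False` after double-checking the PID list, and run it as a user that is allowed to signal the Apache children.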

Update February 7th 2011: I was able to identify the cause and solve this; see "Googlebot and Apache CLOSE_WAIT's: SOLVED!"

