Handling different timezones of HAProxy logs in ELK stack

Written by - 1 comments

Published on - last updated on March 19th 2020 - Listed in Docker Rancher Linux ELK HAProxy Containers


This is a follow-up article on my previous post (Docker container syslog logs not appearing in ELK Kibana (different timezone)). I had the problem that a Docker container, which runs HAProxy, sent its logs in UTC (because the container's timezone was set in UTC). But the receiving Logstash server (running in UTC+2) thought it had to subtract two hours again. Which resulted in a time gap of 4 hours. The log was received at 13:38 UTC+2, the entry was inserted into Elasticseach as 09:38 UTC:

Aug 25 13:38:00 inf-elk01-t logstash[8657]:      "@timestamp" => 2017-08-25T09:38:00.000Z,

To tackle this problem, I had to get myself a bit into the Logstash filters and their possibilities. But first let me outline the setup so the configs make sense at last.

Usually my HAProxy machines run as VM's or inside a Linux container (LXC). HAProxy writes local log files which are then parsed by Filebeat and sent to Logstash (see article Install and configure Elastic Filebeat through Ansible for additional information on that). As you can see on that article, Filebeat appends the "document_type" with the value "haproxy".
In the Docker environment, which runs on Rancher, we use the Rancher-internal load balancer service. As this is based on a container image we cannot interfere with, the only way to get the HAProxy logs from within the container is to modify the HAProxy config. In Rancher this can be done by using the column "custom haproxy.cfg". Entries in there will be merged with the dynamically created configuration. I added the relevant parts for the logging:

Rancher custom haproxy.cfg

global
  log logstash.example.com:514 local0
  log logstash.example.com:514 local1 notice

defaults
  log global
  option httplog
  mode http

HAProxy allows logging to a central syslog server via IP and UDP port. So in this case I configured rsyslog on the Logstash machine (logstash.example.com) to listen to and receive syslog messages.

On the Logstash machine, the central syslog server forwards into Logstash (which listens on port 10514):

root@logstash:~# cat /etc/rsyslog.d/01-json.conf
template(name="json-template"
  type="list") {
    constant(value="{")
      constant(value="\"@timestamp\":\"")     property(name="timereported" dateFormat="rfc3339")
      constant(value="\",\"@version\":\"1")
      constant(value="\",\"message\":\"")     property(name="msg" format="json")
      constant(value="\",\"sysloghost\":\"")  property(name="hostname")
      constant(value="\",\"severity\":\"")    property(name="syslogseverity-text")
      constant(value="\",\"facility\":\"")    property(name="syslogfacility-text")
      constant(value="\",\"programname\":\"") property(name="programname")
      constant(value="\",\"procid\":\"")      property(name="procid")
    constant(value="\"}\n")
}

root@logstash:~# cat /etc/rsyslog.d/99-remote.conf
*.*                       @localhost:10514;json-template

On the Logstash server, the listener port 10514 is defined as a listener (input) and received messages are labeled as type "rsyslog":

# This input block will listen on port 10514 for logs to come in.
# host should be an IP on the Logstash server.
# codec => "json" indicates that we expect the lines we're receiving to be in JSON format
# type => "rsyslog" is an optional identifier to help identify messaging streams in the pipeline.
input {
  udp {
    host => "0.0.0.0"
    port => 10514
    codec => "json"
    type => "rsyslog"
  }
}

Therefore in this ELK stack there are two ways HAProxy logs are being received and written into Elasticseach:

  1. Through Filebeat using the type "haproxy"; used by manually installed HAProxy
  2. Through central syslog using the type "rsyslog"; used by HAProxy installations inside Docker containers, managed by Rancher

As I wrote at the begin, the Docker containers run in UTC (and the image cannot be changed), all other machines run in UTC+2. With that knowledge, the HAProxy config for Logstash can be adjusted (original grok filter for HAProxy found on logz.io):

root@logstash:~# cat /etc/logstash/conf.d/13-haproxy-filter.conf
filter {

    grok {
    match => [
    "message", "%{IP:client_ip}:%{NUMBER:client_port:int} \[%{NOTSPACE:haproxy_timestamp}] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_duration:int} %{NUMBER:bytes_read:int} %{NOTSPACE:termination_state} %{NUMBER:actconn:int}/%{NUMBER:feconn:int}/%{NUMBER:beconn:int}/%{NUMBER:srvconn:int}/%{NUMBER:retries:int} %{NUMBER:srv_queue:int}/%{NUMBER:backend_queue:int}" ,
    "message" , "%{IP:client_ip}:%{NUMBER:client_port:int} \[%{NOTSPACE:haproxy_timestamp}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_client_req:int}/%{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_server_response:int}/%{NUMBER:time_duration:int} %{NUMBER:status_code:int} %{NUMBER:bytes_read:int} %{NOTSPACE:captured_request_cookie} %{NOTSPACE:captured_response_cookie} %{NOTSPACE:termination_state_with_cookie_status} %{NUMBER:actconn:int}/%{NUMBER:feconn:int}/%{NUMBER:beconn:int}/%{NUMBER:srvconn:int}/%{NUMBER:retries:int} %{NUMBER:srv_queue:int}/%{NUMBER:backend_queue:int}?( \"%{GREEDYDATA:full_http_request}\")?( %{NOTSPACE:captured_response_headers})?"
    ]
    }

  if [type] == "rsyslog" {
    date {
      match => [ "haproxy_timestamp", "dd/MMM/yyyy:HH:mm:ss.SSS" ]
      timezone => "UTC"
    }
  }

  if [type] == "haproxy" {
    date {
      match => [ "haproxy_timestamp", "dd/MMM/yyyy:HH:mm:ss.SSS" ]
    }
  }

    grok {
    match => [
    "full_http_request", "%{WORD:http_verb} %{URIPATHPARAM:http_request}?( HTTP/%{NUMBER:http_version})" ,
    "full_http_request", "<%{WORD:http_request}>"
    ]
    remove_field => [ "full_http_request" ]
    }

}

As you can see I added two if conditions into the config file:

  • If the received document type is "rsyslog" (therefore coming from a Docker container) tell Logstash that the logged time is UTC.
  • If the received document type is "haproxy" (therefore coming through Filebeat) use system (default) timezone.

Also important to note here: The haproxy_timestamp field is being used here as date/time field (not the actual message time).

Since that modification, I have now both HAProxy logs in the same time zone and can compare them directly in graphs.

Using central syslog server for all HAProxy installations

Update from March 19th 2020

Back when I wrote the article in 2017, I was using different log handlers to separate HAProxy logs: Local logging and filebeat for HAProxy installations on VM's, Direct logging to a central syslog server for HAProxy installations in Docker containers.

In the meantime I moved away from that setup and unified the logging of all HAProxy installations: They all log to a central syslog server now.

To identify whether or not the log came from a container, I added a condition in Logstash filter which checks for the field "sysloghost". All our Docker hosts follow a certain nomenclature (*rado*), therefore syslog logs from these hosts can be identified using a regular expression on the sysloghost field:

ck@logstash:~$ cat /etc/logstash/conf.d/13-filter-haproxy.conf
filter {

  if [programname] == "haproxy" or [type] == "haproxy" {

    # Make sure we use the same type "syslog" for all HAProxy logs, otherwise we run into ElasticSearch 6 problem:
    # Rejecting mapping update to [haproxy-2018.09.05] as the final mapping would have more than 1 type: [log, syslog]
    mutate { replace => { "type" => "syslog" } }

    grok {
      match => [
    "message", "%{IP:client_ip}:%{NUMBER:client_port:int} \[%{NOTSPACE:haproxy_timestamp}] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_duration:int} %{NUMBER:bytes_read:int} %{NOTSPACE:termination_state} %{NUMBER:actconn:int}/%{NUMBER:feconn:int}/%{NUMBER:beconn:int}/%{NUMBER:srvconn:int}/%{NUMBER:retries:int} %{NUMBER:srv_queue:int}/%{NUMBER:backend_queue:int}" ,
    "message" , "%{IP:client_ip}:%{NUMBER:client_port:int} \[%{NOTSPACE:haproxy_timestamp}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{NUMBER:time_client_req:int}/%{NUMBER:time_queue:int}/%{NUMBER:time_backend_connect:int}/%{NUMBER:time_server_response:int}/%{NUMBER:time_duration:int} %{NUMBER:status_code:int} %{NUMBER:bytes_read:int} %{NOTSPACE:captured_request_cookie} %{NOTSPACE:captured_response_cookie} %{NOTSPACE:termination_state_with_cookie_status} %{NUMBER:actconn:int}/%{NUMBER:feconn:int}/%{NUMBER:beconn:int}/%{NUMBER:srvconn:int}/%{NUMBER:retries:int} %{NUMBER:srv_queue:int}/%{NUMBER:backend_queue:int}?( \"%{GREEDYDATA:full_http_request}\")?( %{NOTSPACE:captured_response_headers})?"
      ]
    }

    grok {
      match => [
      "full_http_request", "%{WORD:http_verb} %{URIPATHPARAM:http_request}?( HTTP/%{NUMBER:http_version})" ,
      "full_http_request", "<%{WORD:http_request}>"
      ]
      remove_field => [ "full_http_request" ]
    }

    if [sysloghost] =~ /^.*rado.*$/ {
      # We received this HAProxy log from Docker where they all run with UTC
      date {
        match => [ "haproxy_timestamp", "dd/MMM/yyyy:HH:mm:ss.SSS" ]
        timezone => "UTC"
      }
    }


  }
}


Add a comment

Show form to leave a comment

Comments (newest first)

Ferchimi from Argentina wrote on Oct 28th, 2021:

This was very useful, you saved me on openshift router logs: they're UTC and we are on ART (UTC-3). Thank you very much, Fernando


RSS feed

Blog Tags:

  AWS   Android   Ansible   Apache   Apple   Atlassian   BSD   Backup   Bash   Bluecoat   CMS   Chef   Cloud   Coding   Consul   Containers   CouchDB   DB   DNS   Database   Databases   Docker   ELK   Elasticsearch   Filebeat   FreeBSD   Galera   Git   GlusterFS   Grafana   Graphics   HAProxy   HTML   Hacks   Hardware   Icinga   Influx   Internet   Java   KVM   Kibana   Kodi   Kubernetes   LVM   LXC   Linux   Logstash   Mac   Macintosh   Mail   MariaDB   Minio   MongoDB   Monitoring   Multimedia   MySQL   NFS   Nagios   Network   Nginx   OSSEC   OTRS   Observability   Office   OpenSearch   PGSQL   PHP   Perl   Personal   PostgreSQL   Postgres   PowerDNS   Proxmox   Proxy   Python   Rancher   Rant   Redis   Roundcube   SSL   Samba   Seafile   Security   Shell   SmartOS   Solaris   Surveillance   Systemd   TLS   Tomcat   Ubuntu   Unix   VMWare   VMware   Varnish   Virtualization   Windows   Wireless   Wordpress   Wyse   ZFS   Zoneminder