Logstash is mostly known in combination with Elasticsearch, but it can also be used as a listener that centralizes logs from all kinds of applications and saves them into local log files. A typical use case for such a setup is an application that writes huge (multi-line) log files which cannot easily be split into fields, or a situation where you simply want a central "archive" log file.
With a simple UDP input, Logstash listens on the given port (here udp/5000):
root@logstash:~# cat /etc/logstash/conf.d/01-inputs.conf
# Listener
input {
  udp {
    port => 5000
    codec => "plain"
  }
}
Once the "events" are received by Logstash, they are written to the destination defined by the output configuration. In the following example, the output is a local file:
root@logstash:~# cat /etc/logstash/conf.d/99-outputs.conf
output {
  file {
    path => "/var/log/applications.log"
    file_mode => 0644
    codec => rubydebug
  }
}
Note: Make sure the log file defined in the "path" option is writable by the logstash user, or that the directory allows the logstash user to create the file.
After a Logstash restart, the listener can be seen with netstat:
root@logstash:~# netstat -lntup|grep 5000
udp 0 0 0.0.0.0:5000 0.0.0.0:* 90978/java
UDP is a fire-and-forget protocol. The sender just sends data; the recipient (Logstash in this case) does not confirm that the logs were received. This can lead to missing logs, for example when network communication problems occur between sender and recipient.
TCP, on the other hand, is a connection-oriented protocol that starts with a handshake; every transmission of data needs to be acknowledged (acked). This ensures data is transmitted correctly. When data sent by the sender is not confirmed by the recipient, it is usually re-transmitted. However, when a major network problem occurs while many log events are waiting to be sent, this can lead to a "queuing" of data. The queue builds up, using up available TCP slots (yes, there is a limit). When all of these slots are used, the application will (most likely) stop working, as all communication stops. For a real-life situation where this happened, you can check out the article "Docker logging with GELF using tcp: The good, the bad, the ugly".
Note: You can use the ss command on Linux to monitor your receive and send queues (the Recv-Q and Send-Q columns).
Although you risk losing some logs when you use UDP and a communication error to Logstash occurs, you don't risk your application grinding to a halt.
To send data to Logstash, you don't have to use a fancy program. A simple line sent via netcat is already enough:
app@appserver:~$ echo "Sending log event to Logstash" | nc -u logstashserver 5000
An application should use a different approach of course. Depending on the application code, this could be a local socket which connects to the listener or a specific library used for sending logs/events to a remote listener.
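As an illustration of the library approach, here is a minimal sketch in Python. It assumes the application is written in Python and uses the standard logging module; the class UdpLineHandler and the function build_logger are hypothetical names made up for this example, not part of any Logstash client library. Each log record is sent as one plain-text UDP datagram to the listener configured above.

```python
import logging
import socket


class UdpLineHandler(logging.Handler):
    """Hypothetical handler: sends each formatted record as one UDP datagram."""

    def __init__(self, host, port):
        super().__init__()
        self.address = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def emit(self, record):
        try:
            payload = self.format(record).encode("utf-8")
            # Fire and forget: no acknowledgement is expected from Logstash.
            self.sock.sendto(payload, self.address)
        except Exception:
            # A logging failure must never stop the application itself.
            self.handleError(record)

    def close(self):
        self.sock.close()
        super().close()


def build_logger(host="logstashserver", port=5000, name="app"):
    """Return a logger that ships INFO and above to the UDP listener."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = UdpLineHandler(host, port)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    build_logger().info("Sending log event to Logstash")
```

The handler deliberately swallows send errors: in line with the UDP trade-off described above, a lost log line is preferred over a blocked application.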
The log event is then saved in the output file:
root@logstash:~# cat /var/log/applications.log
{
"@version" => "1",
"@timestamp" => 2022-01-24T14:41:29.540Z,
"message" => "Sending log event to Logstash\n",
"host" => "192.168.253.110"
}
Logstash has received the log message and stored it as an event with several fields. The fields are shown in this structured form because of the rubydebug codec used in the file output definition. This output helps to view all available fields, which can be used to create filters.
The file output plugin supports a couple of different output formats, which can be configured with the codec option. In the following example, the output codec is changed to "plain".
root@logstash:~# cat /etc/logstash/conf.d/99-outputs.conf
output {
  file {
    path => "/var/log/applications.log"
    file_mode => 0644
    codec => plain
  }
}
This results in a simplified log file, using one line per event:
root@logstash:~# cat /var/log/applications.log
2022-01-24T15:25:44.669Z 192.168.253.110 Sending first log event to Logstash
Logstash supports the usage of variables, including variables referring to date and time. The file path can combine a static part with a dynamic variable. The following example shows a variable based on the date (taken from the event's @timestamp), including the hour:
root@logstash:~# cat /etc/logstash/conf.d/99-outputs.conf
output {
  file {
    path => "/var/log/applications-%{+YYYY-MM-dd-HH}.log"
    file_mode => 0644
    codec => plain
  }
}
With this configuration, Logstash automatically creates a new log file every hour.
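To make the resulting file names more tangible, the following Python sketch mimics how the path pattern resolves for a given event. It is only an illustration, not Logstash code: the function hourly_log_path is a hypothetical helper, and it assumes the date part is derived from the event's @timestamp, which Logstash keeps in UTC.

```python
from datetime import datetime, timezone


def hourly_log_path(timestamp, prefix="/var/log/applications"):
    """Mimic the pattern applications-%{+YYYY-MM-dd-HH}.log for one event."""
    # Convert to UTC first, because @timestamp is stored in UTC.
    utc = timestamp.astimezone(timezone.utc)
    return f"{prefix}-{utc.strftime('%Y-%m-%d-%H')}.log"


if __name__ == "__main__":
    event_time = datetime(2022, 1, 24, 15, 25, 44, tzinfo=timezone.utc)
    print(hourly_log_path(event_time))
    # prints /var/log/applications-2022-01-24-15.log
```

Keep the UTC conversion in mind when searching for an event in the hourly files on a server running in a different time zone.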
An incoming event contains multiple fields, which Logstash is able to read and use as variables in its config.
Let's say you have an application sending logs with a field "debug-file" in a nested JSON object "app". The Logstash output can be adjusted to check for the existence of this field and use its value as a variable:
root@logstash:~# cat /etc/logstash/conf.d/99-outputs.conf
output {
  if [app][debug-file] {
    file {
      path => "/var/log/%{[app][debug-file]}-%{+YYYY-MM-dd-HH}.log"
      file_mode => 0644
    }
  } else {
    file {
      path => "/var/log/applications-%{+YYYY-MM-dd-HH}.log"
      file_mode => 0644
      codec => plain
    }
  }
}
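For testing such a conditional output, an application could send events like the ones in the following Python sketch. Note the assumptions: the udp input would need to use the json codec instead of "plain" (with "plain", the whole JSON document stays a string in the "message" field and [app][debug-file] does not exist), and the function names, the message texts and the value "billing" are made up for this example.

```python
import json
import socket


def build_event(message, debug_file=None):
    """Return the JSON payload for one event as UTF-8 bytes."""
    event = {"message": message}
    if debug_file is not None:
        # Nested object; Logstash addresses it as [app][debug-file].
        event["app"] = {"debug-file": debug_file}
    return json.dumps(event).encode("utf-8")


def send_event(payload, host="logstashserver", port=5000):
    """Send one event as a single UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    try:
        # Carries app.debug-file, so the first branch of the output applies.
        send_event(build_event("Payment failed", debug_file="billing"))
        # No app object, so the event falls through to the else branch.
        send_event(build_event("Regular event"))
    except OSError as exc:
        # For example when the host name cannot be resolved.
        print(f"Could not send event: {exc}")
```

With the output configuration above, the first event would be written to a file whose name starts with "billing", the second one to the hourly applications log.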