This is a quick reminder to myself for the next time I run into this problem: regular expressions do not work exactly the same way in sed!
In my previous article "How to manually clean up Zoneminder events" I wrote a shell script in which I wanted to remove a certain part of a path:
/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12
should become:
/var/cache/zoneminder/events/5/18/12/14/06/45/12
Simple, right? Just use sed's substitute command to remove ".448512/" from the string.
But see for yourself:
$ echo "/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12" | sed "s/\.\d+\///g"
/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12
The old path is still shown. Nothing was replaced. My first thought was of course that I had made a mistake in my regular expression, but all the online regex checkers I tried, for example https://regexr.com/, confirmed my regex was correct.
I was able to narrow it down to the part of the regular expression that matches the number (\d+), because simply replacing the dot character works:
$ echo "/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12" | sed "s/\.//g"
/var/cache/zoneminder/events/5/18/12/14/448512/06/45/12
And then I received the final hint from a friend: some typical regex constructs don't work in sed! Excerpt from sed's documentation:
* Matches a sequence of zero or more instances of matches for the preceding regular expression, which must be an ordinary character, a special character preceded by \, a ., a grouped regexp (see below), or a bracket expression. As a GNU extension, a postfixed regular expression can also be followed by *; for example, a** is equivalent to a*. POSIX 1003.1-2001 says that * stands for itself when it appears at the start of a regular expression or subexpression, but many nonGNU implementations do not support this and portable scripts should instead use \* in these contexts.
\+ As *, but matches one or more. It is a GNU extension.
[...]
‘[a-zA-Z0-9]’ In the C locale, this matches any ASCII letters or digits.
So first of all, the plus sign (+) must be escaped as \+. And second, \d doesn't work for matching a digit; a bracket expression such as [0-9] must be used instead!
With these adjustments, sed finally performs the replacement:
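To see both points in isolation, here is a small sketch. The variable names are made up for illustration, and [[:digit:]] is the POSIX character class for digits, which is not part of the excerpt above but should behave like [0-9] here:

```shell
# In sed's default (basic) syntax an unescaped + is just a literal plus sign,
# so this pattern only matches the text "a+b".
literal_plus=$(printf '%s\n' 'a+b' | sed 's/a+b/matched/')

# Digits are matched with a bracket expression; [[:digit:]] is an
# alternative spelling of [0-9].
digit_class=$(printf '%s\n' '.448512/' | sed 's/[[:digit:]]/N/g')

echo "$literal_plus"   # matched
echo "$digit_class"    # .NNNNNN/
```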
$ echo "/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12" | sed "s/\.[0-9]\+\///g"
/var/cache/zoneminder/events/5/18/12/14/06/45/12
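For the record, two other ways to write the same replacement, as a sketch only. The first assumes a sed that supports the -E option for extended regular expressions (GNU sed and BSD sed do), where + needs no backslash. The second sticks to basic syntax without the GNU \+ extension by spelling "one or more" as [0-9][0-9]*. Both use | as the delimiter so the slash doesn't need escaping; the variable names are made up for this example:

```shell
path="/var/cache/zoneminder/events/5/18/12/14/.448512/06/45/12"

# Extended syntax (-E): + means "one or more" without a backslash.
with_ere=$(printf '%s\n' "$path" | sed -E 's|\.[0-9]+/||')

# Basic syntax without GNU extensions: one digit followed by zero or more digits.
with_bre=$(printf '%s\n' "$path" | sed 's|\.[0-9][0-9]*/||')

echo "$with_ere"   # /var/cache/zoneminder/events/5/18/12/14/06/45/12
echo "$with_bre"   # /var/cache/zoneminder/events/5/18/12/14/06/45/12
```

The second form is probably the safer choice for a script that might run on a non-GNU sed.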
Dang it, I am sure that I ran into this at least once already in my Linux career. Hence this post, so I don't lose as much time the next time it happens.