There are two types of HAProxy log lines: TCP and HTTP.
May 30 10:16:41 zv2096-abc nbid-12345[3046181]: 2601:1418:2101::1638:a7db:57784 [30/May/2024:10:16:40.960] listener12345 backend12345/node12345678 1/0/315 2989 -- 9/9/8/0/0 0/0
May 30 10:16:41 zv2104-def nbid-45678[156515]: 136.252.25.169:1033 [30/May/2024:10:16:34.441] listener56789~ backend456789/node98876543 0/0/0/1/6616 200 1087673 - - ---- 26/26/22/14/0 0/0 "GET /blahblahblabload/Stasdfasdfer/1masdfIf.zip HTTP/1.1"
This is the TCP regex:
(?P<date_time>\w+ \d+ \S+) (?P<nil>\S+) nbid-(?P<nbid>\d+)\[(?P<pid>\d+)\]: (?P<client_ip>\S+):(?P<client_port>\d+) \[(?P<request_date>\S+)\] (?P<frontend_name>\S+) (?P<backend_name>\S+)/(?P<server_name>\S+) (?P<Tw>\d+)/(?P<Tc>\d+)/(?P<Tt>\d+) (?P<bytes_read>\S+) (?P<termination_state>\S+) (?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/(?P<retries>\d+) (?P<srv_queue>\d+)/(?P<backend_queue>\d+)*$
This is the HTTP regex:
(?P<date_time>\w+ \d+ \S+) (?P<nil>\S+) nbid-(?P<nbid>\d+)\[(?P<pid>\d+)\]: (?P<client_ip>\S+):(?P<client_port>\d+) \[(?P<request_date>\S+)\] (?P<frontend_name>\S+) (?P<backend_name>\S+)/(?P<server_name>\S+) (?P<TR>\d+)/(?P<Tw>\d+)/(?P<Tc>\d+)/(?P<Tr>\d+)/(?P<Ta>\d+) (?P<status_code>\S+) (?P<bytes_read>\S+) *(?P<request_cookie>\S+) (?P<response_cookie>\S+) (?P<termination_state>\S+) (?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/(?P<retries>\d+) (?P<srv_queue>\d+)/(?P<backend_queue>\d+) "(?P<method>\S+) (?P<url_path>[^"]+) (?P<version>\S+)" *$
From what I have read, patterns are the preferred and recommended way of parsing log lines, rather than regex. Here is the pattern I use for TCP lines:
<_> <_> <_> <nil> nbid-<nbid>[<pid>]: <client_ip>:<client_port> [<request_date>] <frontend_name> <backend_name>/<server_name> <Tw>/<Tc>/<Tt> <bytes_read> <termination_state> <actconn>/<feconn>/<beconn>/<srv_conn>/<retries> <srv_queue>/<backend_queue> <_>
It works well for TCP lines, but it also matches HTTP lines, and I get unexpected values for some of the variables.
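To make the difference between the two line types concrete, here is a minimal Python sketch of the distinction I would like the pattern to make. It is only an illustration under my own assumptions, not something the pattern parser does: the names `TIMERS` and `classify` are hypothetical, and the only rule it uses is that the timers field has three slash-separated values on TCP lines (Tw/Tc/Tt) and five on HTTP lines (TR/Tw/Tc/Tr/Ta).

```python
import re

# Hypothetical helper, not part of any log pipeline: find the timers token
# that follows "[request_date] frontend backend/server".
TIMERS = re.compile(r"\] \S+ \S+/\S+ (?P<timers>[\d/+-]+) ")


def classify(line: str) -> str:
    """Return "tcp", "http", or "unknown" based on the number of timers."""
    match = TIMERS.search(line)
    if match is None:
        return "unknown"
    count = match.group("timers").count("/") + 1
    # Assumption: 3 timers (Tw/Tc/Tt) means TCP, 5 (TR/Tw/Tc/Tr/Ta) means HTTP.
    return {3: "tcp", 5: "http"}.get(count, "unknown")


# The two sample lines from above.
TCP_LINE = (
    "May 30 10:16:41 zv2096-abc nbid-12345[3046181]: "
    "2601:1418:2101::1638:a7db:57784 [30/May/2024:10:16:40.960] "
    "listener12345 backend12345/node12345678 1/0/315 2989 -- 9/9/8/0/0 0/0"
)
HTTP_LINE = (
    "May 30 10:16:41 zv2104-def nbid-45678[156515]: "
    "136.252.25.169:1033 [30/May/2024:10:16:34.441] "
    "listener56789~ backend456789/node98876543 0/0/0/1/6616 200 1087673 "
    '- - ---- 26/26/22/14/0 0/0 "GET /blahblahblabload/Stasdfasdfer/1masdfIf.zip HTTP/1.1"'
)

if __name__ == "__main__":
    print(classify(TCP_LINE))
    print(classify(HTTP_LINE))
```

With the pattern, my guess is that <Tt> captures everything up to the next space, so on an HTTP line it presumably ends up holding the remaining timers (0/1/6616) and every later field is shifted.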
Is there a way to rewrite the pattern so that it skips HTTP log lines and filters them out?
Thank you