The StreamSets Log Parser lets you parse and ingest log files from a server.
There are multiple pre-defined "Log Formats" to choose from, such as Common Log Format or Combined Log Format for Apache access logs.
However, if you have defined your own log format, GROK patterns are a great way to configure the Log Parser to consume it.
The real challenge, however, is how to define your GROK pattern.
Test Grok Patterns (https://grokconstructor.appspot.com/do/match) is a great website where you can enter your GROK pattern and a log line and check whether they match.
It also provides an "Automatic" mode (https://grokconstructor.appspot.com/do/automatic).
This generates the GROK pattern for you based on the log line that you provide.
However, if you are using a customized version of the Apache access log format, you can combine the standard GROK patterns to match your log line.
For example, my access log line and the GROK pattern that matches it are given below.
Log Line
103.107.92.250 - - [21/Apr/2019:17:34:35 +0530] "GET /form/track-shipment/ HTTP/1.1" 200 8324 "http://onlinexpress.co.in/form/track-shipment/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36" 400
Grok Pattern
%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:responseTime}
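To sanity-check a pattern like this outside the pipeline, it can be approximated in plain Python. This is only a rough sketch and not how StreamSets evaluates GROK: the regular expressions in `GROK_SUBPATTERNS` are simplified stand-ins chosen for illustration (the official Grok definitions are stricter), and the helper name `grok_to_regex` is hypothetical.

```python
import re

# Simplified stand-ins for the Grok patterns used above.
# Assumption: these are looser than the official Grok definitions.
GROK_SUBPATTERNS = {
    "IPORHOST": r"[\w.:-]+",
    "USER": r"[a-zA-Z0-9._-]+",
    "HTTPDATE": r"\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}",
    "WORD": r"\w+",
    "NOTSPACE": r"\S+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "DATA": r".*?",
    "QS": r'"(?:[^"\\]|\\.)*"',
}

GROK = (
    r'%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] '
    r'"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" '
    r'%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:responseTime}'
)

LOG_LINE = (
    '103.107.92.250 - - [21/Apr/2019:17:34:35 +0530] '
    '"GET /form/track-shipment/ HTTP/1.1" 200 8324 '
    '"http://onlinexpress.co.in/form/track-shipment/" '
    '"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36" 400'
)


def grok_to_regex(grok: str) -> str:
    """Turn each %{NAME:field} into a named regex group."""
    def expand(m: re.Match) -> str:
        name, field = m.group(1), m.group(2)
        body = GROK_SUBPATTERNS[name]
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return re.sub(r"%\{(\w+)(?::(\w+))?\}", expand, grok)


match = re.match(grok_to_regex(GROK), LOG_LINE)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Fields on the unused side of an alternation, such as `rawrequest` here, come back as `None`. The two `QS` fields keep their surrounding double quotes.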
StreamSets Log Parser Configuration
In the screenshot, MYPATTERN is the custom name that I have given my pattern in the "GROK Pattern Definition" field.
The first word of the definition is always the pattern name, and that name is what you enter in the "GROK Pattern" field.
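The "first word is the name" convention can be illustrated with a small Python sketch. The field contents below are hypothetical (the definition is shortened for readability), and it assumes the "GROK Pattern" field refers to the name with the usual Grok reference syntax `%{MYPATTERN}`. The helper names are made up for this example and are not part of StreamSets.

```python
import re

# Hypothetical content of the "GROK Pattern Definition" field:
# the first word is the name, the rest of the line is the pattern.
DEFINITION = r"MYPATTERN %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\]"

# Hypothetical content of the "GROK Pattern" field (assumed reference syntax).
GROK_PATTERN = "%{MYPATTERN}"


def parse_definitions(text: str) -> dict:
    """Split each non-empty line into (name, pattern) on the first whitespace."""
    definitions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, body = line.split(None, 1)
        definitions[name] = body
    return definitions


def resolve(pattern: str, definitions: dict) -> str:
    """Replace %{NAME} references to custom definitions; leave others untouched."""
    def expand(m: re.Match) -> str:
        return definitions.get(m.group(1), m.group(0))

    return re.sub(r"%\{(\w+)\}", expand, pattern)


definitions = parse_definitions(DEFINITION)
resolved = resolve(GROK_PATTERN, definitions)
print(resolved)
```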