Reading a log file from a given path using Logstash

input {
  file {
    path => ["D:/logstash-2.3.4/temp/logs/localhost_access_log.2016-08-24.log"]
    start_position => "beginning"
  }
}
filter {
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  stdout { codec => rubydebug }
}
Now, after running Logstash, I am unable to see any output in the Logstash command window; the logs inside the given file are not being fetched.
Some sample logs from my localhost_access_log.2016-08-24.log file are below:
127.0.0.1 - - [24/Aug/2016:10:07:54 +0530] "GET / HTTP/1.1" 200 11452
0:0:0:0:0:0:0:1 - - [24/Aug/2016:10:08:09 +0530] "GET /Migration/firstpage.jsp HTTP/1.1" 404 1040
127.0.0.1 - - [24/Aug/2016:10:08:39 +0530] "GET / HTTP/1.1" 200 11452
0:0:0:0:0:0:0:1 - - [24/Aug/2016:10:08:41 +0530] "GET /Migration/firstpage.jsp HTTP/1.1" 500 3750
0:0:0:0:0:0:0:1 - - [24/Aug/2016:10:09:38 +0530] "GET /Mortgage/faces/NewFile.jsp HTTP/1.1" 404 1046
Is there any problem with the input code or the date filter code?
Can anyone help me find where I am making a mistake?

Did you try keeping the stdout {} empty within the output section of your conf file, in order to check the output on your Logstash console?
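As a minimal sketch of that suggestion, the output section would simply be the following (everything else left as in the question):
output {
  stdout {}
}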
As @baudsp mentioned, it's better to use the grok filter when you're dealing with log files. Something like this:
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
Source: Parsing Logs with Logstash
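Putting the two suggestions together, here is a minimal sketch of a full pipeline for the sample lines above. Two notes: the sample lines carry no referrer/agent fields, so %{COMMONAPACHELOG} is used here instead of %{COMBINEDAPACHELOG}; and the file input remembers how far it has already read each file in a sincedb, so start_position => "beginning" only applies to files Logstash has never seen, which is a common reason for seeing no output on re-runs. The sincedb_path => "NUL" setting below is an assumption useful only for repeatable testing on Windows, not something taken from the answers.
input {
  file {
    path => ["D:/logstash-2.3.4/temp/logs/localhost_access_log.2016-08-24.log"]
    start_position => "beginning"
    # Forget read positions between runs so the file is re-read while testing ("NUL" on Windows, "/dev/null" on Linux).
    sincedb_path => "NUL"
  }
}
filter {
  grok {
    # COMMONAPACHELOG matches lines without referrer/agent, like the samples in the question.
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
  date {
    # COMMONAPACHELOG puts the bracketed time into the "timestamp" field that the date filter parses here.
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  stdout { codec => rubydebug }
}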

Related

Gunicorn access log format not applied

I'm using Gunicorn to run a FastAPI script. The access log file is created via accesslog in gunicorn.conf.py, yet access_log_format is not applied. I tried applying the example from GitHub and it is still not working.
My gunicorn.conf.py:
accesslog = '/home/ossbod/chunhueitest/supervisor_log/accesslog.log'
loglevel = 'info'
access_log_format = '%(h)s %(l)s %(t)s "%(r)s" %(s)s %(q)s %(b)s "%(f)s" "%(a)s" %(M)s'
The result I got:
<IP>:54668 - "GET /docs HTTP/1.1" 200
<IP>:54668 - "GET /openapi.json HTTP/1.1" 200
<IP>:54668 - "POST /api/v1/add_user HTTP/1.1" 201
How can I get the format to apply to the log?

Django + Vue: static file not found

When I use Django + Vue to build a web application, the static files are never found, even though I placed all the files correctly.
The logs from the server look like this:
WARNING Not Found: /static/js/app.4c2224dc.js
WARNING Not Found: /static/css/app.a5d43e07.css
WARNING Not Found: /static/css/chunk-vendors.2ccfa1b8.css
WARNING Not Found: /static/js/chunk-vendors.c2488b8d.js
WARNING "GET /static/js/app.4c2224dc.js HTTP/1.1" 404 179
WARNING "GET /static/css/app.a5d43e07.css HTTP/1.1" 404 179
WARNING "GET /static/css/chunk-vendors.2ccfa1b8.css HTTP/1.1" 404 179
WARNING "GET /static/js/chunk-vendors.c2488b8d.js HTTP/1.1" 404 179
WARNING Not Found: /login
WARNING "GET /login HTTP/1.1" 404 179
WARNING Not Found: //performanceProject
WARNING "GET //performanceProject HTTP/1.1" 404 179
WARNING Not Found: /performanceProject
WARNING "GET /performanceProject HTTP/1.1" 404 179
I solved the problem with the help of https://www.cnpython.com/qa/1400788:
1. pip3 install whitenoise
2. Modify settings.py and add 'whitenoise.middleware.WhiteNoiseMiddleware' to MIDDLEWARE, for example:
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',  # Add WhiteNoise here
    ...
]
# when DEBUG=True
# STATICFILES_DIRS = [
#     os.path.join(BASE_DIR, 'dist/static')
# ]
# when DEBUG=False
STATIC_ROOT = os.path.join(BASE_DIR, 'dist/static')
# always needs to be set, regardless of whether DEBUG is True or False
STATICFILES_STORAGE = 'whitenoise.storage.CompressedStaticFilesStorage'
3. python3 manage.py collectstatic
4. python3 manage.py runserver 0.0.0.0:XXXX

How to parse a stream of logs aggregated from multiple files with Logstash?

I have logs from GitLab installed on Kubernetes. Among other pods, there is Sidekiq, which has a very peculiar log structure: it gathers multiple files that all then go to stdout (see the example at the end or the official documentation). I want to collect all these logs with Filebeat, send them to Logstash, and process them in a sane way (parse the JSON entries, extract the important data from plain-line logs, and also add information about the original file), then send the output to Elasticsearch.
However, I am struggling with how to do that. As a Logstash newbie I am not sure how it works under the hood, and so far I have only been able to come up with a grok pattern that matches the line containing the file name.
From one perspective it should be relatively easy: I just need some sort of state to mark which file is currently being processed in the log stream. But I am not sure, first, whether Filebeat passes any information about the stream to Logstash (important to distinguish which pod the logs came from), and second, whether Logstash allows this kind of state-based processing of a log stream.
Is it possible to parse these logs and add the original filename as a field this state-based way? Could you possibly point me in the right direction?
filter {
  grok {
    match => { "message" => "\*\*\* %{PATH:file} \*\*\*" }
  }
  if [file] == "/var/log/gitlab/production_json.log" {
    json {
      match => { ... }
    }
  }
  else if [file] == "/var/log/gitlab/application_json.log" {
    grok {
      match => { ... }
    }
  }
}
Please note that even within a single file there may be multiple types of log lines (see /var/log/gitlab/sidekiq_exporter.log).
*** /var/log/gitlab/application.log ***
2020-11-18T10:08:28.568Z: Cannot obtain an exclusive lease for Namespace::AggregationSchedule. There must be another instance already in execution.
*** /var/log/gitlab/application_json.log ***
{"severity":"ERROR","time":"2020-11-18T10:08:28.568Z","correlation_id":"BsVuSTdkM45","message":"Cannot obtain an exclusive lease for Namespace::AggregationSchedule. There must be another instance already in execution."}
*** /var/log/gitlab/sidekiq_exporter.log ***
[2020-11-18T10:08:32.076+0000] 10.103.149.75 - - [18/Nov/2020:10:08:32 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:08:42.076+0000] 10.103.149.75 - - [18/Nov/2020:10:08:42 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:08:43.771+0000] 10.103.149.75 - - [18/Nov/2020:10:08:43 UTC] "GET /liveness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:08:52.076+0000] 10.103.149.75 - - [18/Nov/2020:10:08:52 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:02.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:02 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:12.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:12 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:22.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:22 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:32.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:32 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:42.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:42 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:43.771+0000] 10.103.149.75 - - [18/Nov/2020:10:09:43 UTC] "GET /liveness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:09:52.076+0000] 10.103.149.75 - - [18/Nov/2020:10:09:52 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:10:02.076+0000] 10.103.149.75 - - [18/Nov/2020:10:10:02 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:10:12.076+0000] 10.103.149.75 - - [18/Nov/2020:10:10:12 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
2020-11-18T10:10:15.783Z 10 TID-oslmgxbxm PagesDomainSslRenewalCronWorker JID-e4891c8d6d57d73f401da697 INFO: start
2020-11-18T10:10:15.807Z 10 TID-oslmgxbxm PagesDomainSslRenewalCronWorker JID-e4891c8d6d57d73f401da697 INFO: done: 0.024 sec
[2020-11-18T10:10:22.076+0000] 10.103.149.75 - - [18/Nov/2020:10:10:22 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:10:32.076+0000] 10.103.149.75 - - [18/Nov/2020:10:10:32 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:10:42.076+0000] 10.103.149.75 - - [18/Nov/2020:10:10:42 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:10:43.771+0000] 10.103.149.75 - - [18/Nov/2020:10:10:43 UTC] "GET /liveness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
*** /var/log/gitlab/application_json.log ***
{"severity":"ERROR","time":"2020-11-18T10:49:11.565Z","correlation_id":"H9wDObekY74","message":"Cannot obtain an exclusive lease for Ci::PipelineProcessing::AtomicProcessingService. There must be another instance already in execution."}
*** /var/log/gitlab/application.log ***
2020-11-18T10:49:11.564Z: Cannot obtain an exclusive lease for Ci::PipelineProcessing::AtomicProcessingService. There must be another instance already in execution.
2020-11-18T10:49:11.828Z 10 TID-gn2cjsz0a ProjectServiceWorker JID-ccb9b5b0f74ced684e15af75 INFO: done: 0.275 sec
2020-11-18T10:49:11.835Z 10 TID-gn2dwudy2 Namespaces::ScheduleAggregationWorker JID-7db9fe9200701bbc7dc7360c INFO: start
2020-11-18T10:49:11.844Z 10 TID-gn2dwudy2 Namespaces::ScheduleAggregationWorker JID-7db9fe9200701bbc7dc7360c INFO: done: 0.009 sec
2020-11-18T10:49:11.888Z 10 TID-oslmgxbxm ArchiveTraceWorker JID-999cc768143b644d051cfe82 INFO: done: 0.21 sec
*** /var/log/gitlab/sidekiq_exporter.log ***
[2020-11-18T10:49:12.076+0000] 10.103.149.75 - - [18/Nov/2020:10:49:12 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:49:22.076+0000] 10.103.149.75 - - [18/Nov/2020:10:49:22 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:49:32.076+0000] 10.103.149.75 - - [18/Nov/2020:10:49:32 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
[2020-11-18T10:49:42.076+0000] 10.103.149.75 - - [18/Nov/2020:10:49:42 UTC] "GET /readiness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
2020-11-18T10:49:43.216Z 10 TID-gn2cjsz0a Namespaces::RootStatisticsWorker JID-c277b38f3daa09648934d99f INFO: start
2020-11-18T10:49:43.243Z 10 TID-gn2cjsz0a Namespaces::RootStatisticsWorker JID-c277b38f3daa09648934d99f INFO: done: 0.027 sec
[2020-11-18T10:49:43.771+0000] 10.103.149.75 - - [18/Nov/2020:10:49:43 UTC] "GET /liveness HTTP/1.1" 200 15 "-" "kube-probe/1.17+"
You can give all the log paths in filebeat.yml for Filebeat to read the logs and send them to Logstash.
Example filebeat.yml for GitLab:
###################### Filebeat Configuration Example #########################

#=========================== Filebeat inputs =============================
filebeat.inputs:
-
  paths:
    - /var/log/gitlab/gitlab-rails/application_json.log
  fields:
    type: gitlab-application-json
  fields_under_root: true
  encoding: utf-8
-
  paths:
    - /var/log/gitlab/sidekiq_exporter.log
  fields:
    type: gitlab-sidekiq-exporter
  fields_under_root: true
  encoding: utf-8
-
  paths:
    - /var/log/gitlab/gitlab-rails/api_json.log
  fields:
    type: gitlab-api-json
  fields_under_root: true
  encoding: utf-8
-
  paths:
    - /var/log/gitlab/gitlab-rails/application.log
  fields:
    type: gitlab-application
  fields_under_root: true
  encoding: utf-8

#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["10.127.55.155:5066"]

#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
Now, in Logstash, you can create different grok patterns to filter these logs.
Here is a sample Logstash pipeline config:
input {
  beats {
    port => "5066"
  }
}
filter {
  if [type] == "gitlab-sidekiq-exporter" {
    grok {
      # Single quotes around the pattern avoid having to escape the embedded double quotes.
      match => { "message" => '\[%{TIMESTAMP_ISO8601:timestamp}\] %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[(?<timestamp>%{MONTHDAY}/%{MONTH}/%{YEAR}\:%{TIME}) %{TZ:timezone}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}' }
      overwrite => [ "message" ]
    }
  }
}
filter {
  mutate {
    remove_tag => [
      "_grokparsefailure"
    ]
  }
}
output {
  # filtered logs are indexed in Elasticsearch
  elasticsearch {
    hosts => ["10.127.55.155:9200"]
    user => elastic
    password => elastic
    action => "index"
    index => "gitlab"
  }
  # filtered logs can also be seen as console output; this is for debugging only, so you can comment it out
  stdout { codec => rubydebug }
}
Note: The beats input port in the Logstash config must be the same as the one given under output.logstash in filebeat.yml.
You can extend the Logstash config with filters for application_json.log and application.log, similar to the one for sidekiq_exporter.log; a sketch is given after the pattern below.
For creating and validating grok patterns to filter the logs, you can use the online Grok Debugger.
Here, I have used the Grok Debugger to create a pattern for filtering sidekiq_exporter.log:
Pattern: %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}
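Following the same approach, here is a sketch of what the extra filter branches mentioned above might look like. The type values are the ones set in the filebeat.yml earlier; the grok field name msg for application.log is a hypothetical choice based on the sample lines in the question, not something from the answer:
filter {
  if [type] == "gitlab-application-json" {
    # application_json.log lines are already JSON, so parse them directly.
    json {
      source => "message"
    }
  }
  if [type] == "gitlab-application" {
    # Sample line: 2020-11-18T10:08:28.568Z: Cannot obtain an exclusive lease ...
    # "msg" is a hypothetical field name for the free-text part.
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}: %{GREEDYDATA:msg}" }
    }
  }
}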

Filebeat to Logstash - InvalidFrameProtocolException

I am trying to load data from Filebeat into Logstash. While running the command
bin/logstash -f first-pipeline.conf --config.reload.automatic
the following error is encountered:
[2018-06-05T11:30:43,987][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2018-06-05T11:30:44,047][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x969dfe run>"}
[2018-06-05T11:30:44,083][INFO ][org.logstash.beats.Server] Starting server on port: 5044
[2018-06-05T11:30:44,112][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2018-06-05T11:32:05,045][INFO ][org.logstash.beats.BeatsHandler] [local: 0:0:0:0:0:0:0:1:5044, remote: 0:0:0:0:0:0:0:1:31903] Handling exception: org.logstash.beats.BeatsParser$InvalidFrameProtocolException: Invalid Frame Type, received: 69
The first-pipeline.conf file is:
# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
  beats {
    port => "5044"
  }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
  stdout { codec => rubydebug }
}
filebeat.yml file:
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - \C:\PATH-TO-DOC\elasticDoc\logstash-tutorial-dataset.log
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]
Sample data from logstash-tutorial-dataset.log:
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-dashboard3.png HTTP/1.1" 200 171717 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
What is the cause of this error? This question has been asked before, but there were no replies. Please also let me know where I can polish my Logstash and Filebeat concepts further. I am a beginner.
The problem was with my file name in filebeat.yml; the extension was not needed.
Also, in the first-pipeline.conf file I removed the codec and sent my logs directly to Elasticsearch, and it started working for me. A rough sketch of that adjusted pipeline is below.
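This is only a sketch of the adjustment described above, not the exact config used; the Elasticsearch host and index name are assumptions for illustration:
input {
  beats {
    port => "5044"
  }
}
output {
  # Assumed local Elasticsearch instance and index name, for illustration only.
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-tutorial"
  }
}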

Grok parse error while parsing Apache logs

OK, so I need some help figuring out why Logstash gives me a parse error when the same pattern passes in the Grok Debugger. This involves a custom Apache log.
Below is the raw log entry:
57.85.212.139 tst.testing.com [13/Mar/2015:10:10:55 -0600] "POST /app/cp/authenticate/updateLog HTTP/1.1" 200 444 195268 "-" "-"
Here is the Grok pattern:
(%{IP:clientip}|-) %{HOSTNAME:host} \[%{HTTPDATE:timestamp}\] "((?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))|-)" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NUMBER:duration} "(%{DATA:referer}|-)" "(%{DATA:useragent}|-)"
Below is the error from Logstash:
{"tags":["_grokparsefailure"],"#version":1,"#timestamp":"2015-03-13T16:10:55.650Z","host":"EOA-ELB-TEST","file":"/var/log/apache2/access.log","message":"57.85.212.139 tst.testing.com [13/Mar/2015:10:10:55 -0600] \"POST /app/cp/authenticate/updateLog HTTP/1.1\" 200 444 216340 \"-\" \...
This makes no sense to me. Why would it pass on the validator but fail in Logstash?
Any help would be greatly appreciated.
Thanks!
