I'm trying to set up a fluentd service to collect logs and send them to Elasticsearch.
Everything works, except that I cannot get a custom index name AND keep a timestamped suffix similar to what logstash_format true would produce.
Here is my fluent.conf file:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    hosts hostaddressandport
    user theuser
    password password
    include_tag_key true
    tag_key @log_name
    index_name myindex-%Y.%m
    <buffer>
      flush_interval 3s
    </buffer>
  </store>
</match>
The index gets created in Elasticsearch literally as myindex-%Y.%m. I've tried myindex-${%Y.%m} and get the same behaviour.
If I use logstash_format true instead, I get an index like logstash-2019.07.09, but I don't want that.
This is where I'm getting my idea from: https://docs.fluentd.org/output/elasticsearch, but I don't see the expected behaviour.
I have found the following in the docs mentioned above:
<buffer tag, time>
  timekey 1h # chunks per hours ("3600" also available)
</buffer>
But it's pretty vague and I don't understand what chunk_keys are.
You can use logstash_format and logstash_prefix to change the index prefix. This will not use the date format you require though.
logstash_format true
logstash_prefix myindex
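With this config the indices come out like myindex-2019.07.09, since the default logstash_dateformat is %Y.%m.%d.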
Please use this config; you will get your custom date format in your index name.
Config file
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    hosts hostaddressandport
    user theuser
    password password
    include_tag_key true
    tag_key @log_name
    logstash_format true
    logstash_dateformat %Y.%m
    logstash_prefix index_name
    <buffer>
      flush_interval 3s
    </buffer>
  </store>
</match>
Output:
index_name-2021.08
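If you would rather keep index_name with date placeholders instead of the logstash_* options, the elasticsearch output can expand %Y.%m itself, but only when time is one of the buffer's chunk keys (chunk keys tell the buffer which event attributes to group chunks by; adding time gives each chunk a timestamp for the placeholder to resolve against). A sketch based on the docs page linked in the question, with an assumed timekey:
index_name myindex-%Y.%m
<buffer tag, time>
  timekey 1h
  flush_interval 3s
</buffer>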
I already have a custom rsyslog configuration defined, like:
:msg, regex, "myappname", /appl/logs/myappname.log
How can I prevent the logs being written both to /var/log/messages and /appl/logs/myappname.log?
Figured this out; I should just add:
:msg, regex, "myappname" ~
as the 2nd line
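Putting it together, the relevant rsyslog snippet looks like this; the trailing ~ is the legacy discard action, so matching messages never reach the later /var/log/messages rule (newer rsyslog versions prefer stop):
:msg, regex, "myappname" /appl/logs/myappname.log
:msg, regex, "myappname" ~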
This is sort of a complex question, I hope I can explain it clearly:
I have a long blacklist of domains added to my /etc/hosts file and it works perfectly. Now I want to be able to "monitor" with a "simple" Bash script every time a domain/entry is blocked, e.g.:
Let's say I'm running this hypothetical script in my Terminal and I try to access Facebook in my browser (which is blocked in my hosts file); I'd like to see something like this in my Terminal:
0.0.0.0 facebook.com
Then, I try to access LinkedIn (also blocked), and now I want to see in my Terminal:
0.0.0.0 facebook.com
0.0.0.0 linkedin.com
Then, I try to access Instagram (blacklisted as well) and I see:
0.0.0.0 facebook.com
0.0.0.0 linkedin.com
0.0.0.0 instagram.com
And so on...
Is that possible? I've spent days looking for an existing program that does this, but no luck.
It's possible; whether you find it simple is a different question.
Caveat: the script reads /etc/hosts only once at startup, so if you edit the hosts file you'll need to restart the pipeline.
Save
BEGIN {
    while (getline < "/etc/hosts") {  # read the hosts file line by line
        split($0, fields, " +")       # break each line on runs of spaces and assign to the array "fields"
        hosts[fields[2] "."]++        # store e.g. "www.google.com." as a key; the trailing dot matches how tcpdump prints DNS names
    }
    close("/etc/hosts")               # close the hosts file, be tidy
}
{
    for (a = 7; a <= NF; a++) {       # for each line of input (now coming from tcpdump) iterate over the 7th to last (NF) fields
        if ($a in hosts) {            # if the field is present in the hosts array (matches a key)
            print $a                  # print it
        }
    }
}
as e.g. hosts.awk ...
You can then run
sudo tcpdump -l port 53 | awk -f hosts.awk
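Note that print $a emits the queried name exactly as tcpdump prints it, with a trailing dot (e.g. facebook.com.), which is why the keys are stored with "." appended; it is not the full hosts-file line.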
So I whipped up a docker-based fluentd TCP log collector.
Following the examples at https://docs.fluentd.org/input/tcp, I successfully sent a line from my host Win 10 WSL (Debian) by saying
echo "my_service: 08:03:10 INFO [my_py_file:343]: My valuable log info." | netcat 127.0.0.1 5170
This arrived in fluentd as nice JSON, as hoped for. But I want to do it from Python 3.7! So:
import socket

def netcat(hn: str, p: int, content: bytes):
    """https://www.instructables.com/id/Netcat-in-Python/"""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((hn, p))
    sock.sendall(content)
    sock.close()

msg_raw = "my_service: 08:03:10 INFO [my_py_file:343]: My valuable log info."
netcat('127.0.0.1', 5170, bytes(msg_raw, 'utf-8'))
WSL or not: this Python script runs through with no exceptions, but there is no reaction at all from fluentd, which I cannot explain. Could and would any of you?
In case it is of any consequence: Here is the relevant section from my fluentd.conf.
<source>
  @type tcp
  @label @mainstream
  @id pawc_tcp
  tag paws.tcp
  port 5170
  bind 0.0.0.0
  # https://docs.fluentd.org/parser/regexp
  <parse>
    @type regexp
    expression /^(?<service_uuid>[a-zA-Z0-9_-]+): (?<logtime>[^\s]+) (?<loglvl>[^\s]+) \[(?<file>[^\]:]+):(?<line>\d+)\]: (?<msg>.*)$/
    time_key logtime
    time_format %H:%M:%S
    types line:integer
  </parse>
</source>
<label @mainstream>
  <match paws.tcp>
    @type file
    @id output_tcp
    path /fluentd/log/tcp.*.log
    symlink_path /fluentd/log/tcp.log
  </match>
</label>
Try sending a \r\n or \0 at the end of your message. The message is sent as bytes over the network, so it probably sits in a buffer, and the code reading the buffer needs a way to know the message is over. The regex also matches on line terminators, so I think one will be necessary there as well.
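For the question's snippet, that just means appending a newline before sending:
netcat('127.0.0.1', 5170, bytes(msg_raw + "\n", 'utf-8'))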
As Alex W states above, a \n is needed for the TCP line to be accepted by the fluentd regex I use. I'd like to add a second answer to improve the Python code from the original question.
There actually is a ready-made logging.handlers.SocketHandler class! However, it pickles its output, since it expects a Python log server on the receiving end. To use it with fluentd, one has to override its emit function. After that, all works fine.
import logging, logging.handlers

class SocketHandlerBytes(logging.handlers.SocketHandler):
    def emit(self, record):
        try:
            # format the record as a plain UTF-8 line terminated by \n
            # (instead of the default pickled payload), so fluentd's
            # TCP parser can match it
            msg = bytes(self.format(record) + "\n", 'utf-8')
            self.send(msg)
        except Exception:
            self.handleError(record)

sh = SocketHandlerBytes(host, port)
sh.setFormatter(logger_format_appropriate_for_your_fluentd_tcp_regex)
logging.root.addHandler(sh)
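For completeness, a formatter that should satisfy the regexp in the question's <parse> section could look like the following; the hard-coded service name is an assumption:
fmt = logging.Formatter("my_service: %(asctime)s %(levelname)s [%(module)s:%(lineno)d]: %(message)s", datefmt="%H:%M:%S")  # yields service: HH:MM:SS LEVEL [file:line]: msg
sh.setFormatter(fmt)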
I am using Filebeat 6.4.2 and Logstash 6.3.1, and I want to combine all log files on the Filebeat input path. The logs don't have any specific pattern to start or end with.
I want to ship all logs to Logstash combined into batches of the specified maximum number of lines.
I tried multiple regexes in the pattern section, but it's not working; the problem is that the logs don't come in any specific pattern.
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/application.log
  fields:
    type: admin
  tags: admin
  fields_under_root: true
  multiline.pattern: '.'
  multiline.negate: true
  multiline.match: after
  multiline.max_lines: 1000

output.logstash:
  # The Logstash hosts
  hosts: ["xxx.20.x.xxx:5043"]
I want to combine all the multiline logs together as per the max_lines configuration.
You can specify a pattern that would not be found in your logs like
'^HeLlO$€(^_^)€$bYe'
and it should do the trick.
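A sketch of the question's multiline section with such a never-matching pattern in place; since negate is true and the pattern never matches, every line is treated as a continuation of the previous one until max_lines (or the multiline timeout) is reached:
multiline.pattern: '^HeLlO$€(^_^)€$bYe'
multiline.negate: true
multiline.match: after
multiline.max_lines: 1000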
I can't figure out how to write log files using fluentd; any help is welcome.
P.S.: I know that my config file is probably full of redundancies, it's because I was trying many things.
I'm executing it from the Td-Agent prompt with the following command:
fluentd -c etc\td-agent\td-agent.conf
<source>
  @type forward
  bind 0.0.0.0
  port 24230
</source>
<match **>
  @type stdout
</match>
<match **>
  @type file
  path logs
  add_path_suffix true
  path_suffix ".txt"
  flush_interval 1
  flush_mode immediate
  flush_at_shutdown
  compress text
  append true
  <buffer>
    @type file
    path logb/logs.*.txt
  </buffer>
</match>
Use the out_copy plugin to combine multiple output plugins. In your config the first <match **> consumes every event, so the file output never receives anything.
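A minimal sketch merging the question's two match blocks into one, with the paths and settings carried over from the question:
<match **>
  @type copy
  <store>
    @type stdout
  </store>
  <store>
    @type file
    path logs
    add_path_suffix true
    path_suffix ".txt"
    append true
    <buffer>
      @type file
      path logb/logs.*.txt
      flush_interval 1
    </buffer>
  </store>
</match>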