Azure adds a timestamp at the beginning of log lines - Azure

I have a problem with the logs retrieved from my Docker containers by Azure Log Analytics. All logs come through fine, but Azure adds a date at the beginning of each line of the log, which means an entry is created for every line and I can't analyze my logs correctly because they are split up...
For example, in this image the black rectangle shows the added date (added by Azure, I think) and the red rectangle shows the date that appears in my own logs:
Also, even if there is no date on a line of my logs, a date is still added to every line, including the empty ones.
The problem is that Azure splits my log file line by line, adding a date to each line, when I would like it to delimit entries using the dates already present in my log files.
Do you have any solutions?

One solution I can think of is that, when you query the logs, you can use the replace() function to strip the redundant date (replace it with an empty string, etc.). You will need to write a regular expression that matches your format.
A skeleton query would look like this:
ContainerLog
| extend new_logEntry=replace(@'xxx', @'xxx', LogEntry)
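For instance, a hypothetical pattern that strips a leading ISO-8601 timestamp (a sketch only; the exact regular expression depends on the format of the date Azure prepends):
ContainerLog
| extend new_logEntry=replace(@'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z?\s*', @'', LogEntry)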

Currently Azure Monitor for containers doesn’t support multi-line logging, but there are workarounds available. You can configure all the services to write in JSON format and then Docker/Moby will write them as a single line.
https://learn.microsoft.com/fr-fr/azure/azure-monitor/insights/container-insights-faq#how-do-i-enable-multi-line-logging
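As a rough illustration (the field names here are invented), a multi-line stack trace logged as a single JSON object ends up as one Docker log line, for example:
{"time":"2021-03-01T12:00:00Z","level":"ERROR","message":"Request failed\n  at Service.handle()\n  at Main.run()"}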

Related

Azure Data Factory removing spaces from column names of csv file

I'm a bit new to Azure Data Factory, so apologies if I'm missing anything obvious. I've done several searches and I can't find anything that quite fits.
So the situation is that we have an existing pipeline that will take the path to a csv file and pass this in as a delimited data set. As a sink it is using a parquet data set. This is a generic process that we can pass any delimited file into and it will output it as parquet.
This has been working well but now we have started receiving files with spaces and special characters in the header which causes the output to parquet to fail. Unfortunately we don't have control over the format of the files we receive so I can't handle this at source.
What I would like to do is, on ingestion of the file, replace any spaces and other special characters in the header with an underscore. If I were doing this on premises I could quickly create a PowerShell script to do it. I had thought about creating a custom task in ADF to call a PowerShell script to do this in the blob storage, but that seems more complicated than it should be. Is there something else I can do to get this process working while keeping it generic?
As @Joel Cochran mentioned, you can use the expression below in the Select transformation to replace spaces and special characters in the header.
regexReplace($$,'[^a-zA-Z]','_')
In the Select transformation, remove the auto mappings and add a new rule-based mapping that uses this expression.
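For example, a header such as First Name (Local) would come out as First_Name__Local_ with this pattern; note that digits would be replaced as well, since [^a-zA-Z] keeps only ASCII letters.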
You cannot change the output filename directly in the Copy activity, assuming you are using that activity.
The workaround is to use a parameter for the output filename that you can clean up.
You can use the Get Metadata activity to get all filenames from the source csv files.
Then loop over these files with a foreach activity.
Within the foreach activity you can set the output filename to the cleaned-up value.
The function could look like this:
@replace(item().name, ' ', '_')
More information on the replace function
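The pipeline expression language has no regular-expression replace as far as I know, so handling several characters means nesting calls; a hypothetical sketch for spaces and parentheses:
@replace(replace(replace(item().name, ' ', '_'), '(', '_'), ')', '_')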

Truncate AWS logs using parse command

I'm having a hard time putting together a simple AWS Logs Insights statement. At the moment I have it like this:
filter @message like /product-id/
| parse "product_id=" as id
What I want to achieve is to have only the product ids in the id column, with nothing else after them. The product id is currently a 10-character string, followed by some other log values that I am not interested in at the moment.
I know that I have to use the substr function but haven't managed to integrate it into my query in AWS CloudWatch.
Thanks.
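One way to combine parse with substr might look like the sketch below (it assumes the id is exactly the 10 characters that follow product_id= and that the literal in parse matches the actual log format):
filter @message like /product_id/
| parse @message "product_id=*" as raw_id
| display substr(raw_id, 0, 10)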

U-SQL extractor. Can we retrieve the ignored lines when using the silent option?

I am using the silent option with U-SQL extractors. Is there a way to retrieve and monitor the ignored lines?
Regards
Unfortunately you cannot get the ignored lines. If you want to capture them, you would have to write a custom extractor that returns the ignored lines using one of these two approaches:
Use the DiagnosticStream object to write them to a diagnostic file. Note that you will have to turn diagnostics on when running the script to get the diagnostic output.
Add an extra column of type byte[] to your output that captures the ignored line, and set all other column values to null (see the sketch below).
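A rough sketch of the second approach (the column names, the tab-delimited parsing, and the two-column schema are placeholders; a real extractor would mirror your own schema):
using System.Collections.Generic;
using System.IO;
using System.Text;
using Microsoft.Analytics.Interfaces;

[SqlUserDefinedExtractor(AtomicFileProcessing = false)]
public class CapturingExtractor : IExtractor
{
    public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
    {
        using (var reader = new StreamReader(input.BaseStream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                var parts = line.Split('\t');
                if (parts.Length == 2)   // placeholder for "the line matches the expected schema"
                {
                    output.Set<string>("col1", parts[0]);
                    output.Set<string>("col2", parts[1]);
                    output.Set<byte[]>("ignored", null);
                }
                else
                {
                    // Keep the raw bytes of the line that would otherwise be silently dropped.
                    output.Set<string>("col1", null);
                    output.Set<string>("col2", null);
                    output.Set<byte[]>("ignored", Encoding.UTF8.GetBytes(line));
                }
                yield return output.AsReadOnly();
            }
        }
    }
}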

What are the usual problems that we face with sincedb in Logstash?

I am using the ELK stack, and I am working with the file input plugin in Logstash.
At first I used file*.txt to match the file pattern.
Later I used masterfile.txt, a single file that contains the data of all the matching files.
Now I am going back to file*.txt, but here I see the problem: in Kibana I only see the data from after file*.txt was replaced with masterfile.txt, not the history.
I feel like I need to understand the behavior of Logstash's sincedb here,
and also find a possible solution to get the history data.
Logstash stores the position of the last byte read in each log file in the file pointed to by sincedb_path. On execution, Logstash resumes reading the input file from that position.
Take 'start_position' and the name of the index (in the Logstash output) into account if you want to create a new index with different logs.
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-sincedb_path
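To re-ingest the history after switching back to the file pattern, one common approach is to read the files from the beginning with a throwaway sincedb (the path below is hypothetical, and discarding sincedb means everything is re-read on every restart):
input {
  file {
    path => "/var/log/app/file*.txt"   # hypothetical path
    start_position => "beginning"      # read new files from the start instead of the end
    sincedb_path => "/dev/null"        # do not persist read positions, so history is re-read
  }
}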

tailLines and sinceTime in the logging API both don't work simultaneously

I am using Container Engine, and my pods are hosted there.
I am trying to fetch logs using the log API:
http://localhost:8000/api/v1/namespaces/app-test/pods/designer-0/log?tailLines=100&sinceTime=2017-09-17T10:47:58Z
If I use the two query params separately, each works and shows the proper result, but if I use them simultaneously only the last 100 lines are returned and the sinceTime param seems to be ignored.
My scenario is that I need the logs from a specific time, in chunks of 100 lines at a time.
I am not sure whether it is a bug or just not implemented.
I found this in the API reference manual:
https://kubernetes.io/docs/api-reference/v1.6/
tailLines - If set, the number of lines from the end of the logs to
show. If not specified, logs are shown from the creation of the
container or sinceSeconds or sinceTime
So that means if you specify tailLines, it starts from the end. I don't see any option explicitly mentioned other than limitBytes, but you will have to play around with it, as it does not guarantee a number of lines.
tailLines=X tells the server to start that many lines from the end
sinceTime tells the server to start from the specified time
the options are mutually exclusive
Thanks all,
I later realized that it is not ignoring sinceTime; the intended functionality of tailLines is to return the lines from the end.
So if I set sinceTime to 10 PM yesterday, it returns the records from that time, and if tailLines is also specified, it returns the most recent lines from that window.
So it was working as expected. I need to play with limitBytes to get the logs in chunks from that time, instead of the full logs.
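A request of that kind might look like the following (the limitBytes value is just an example):
http://localhost:8000/api/v1/namespaces/app-test/pods/designer-0/log?sinceTime=2017-09-17T10:47:58Z&limitBytes=102400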
