Splunk splitting xml log event - log4net

We have log4net logs where events are written to a single file. Each log entry looks something like this:
<LogEntry>
<UserName>IIS APPPOOL\ASP.NET v4.0</UserName>
<TimeStamp>02/28/2014 13:54:17</TimeStamp>
<ThreadName>20</ThreadName>
<CorrelationId>7a0d464d-556c-4d47-820f-0cf01322e54c</CorrelationId>
<LoggerName>-Api-booking</LoggerName>
<Level>INFO</Level>
<Identity></Identity>
<Domain>API-1-130380690118132000</Domain>
<CreatedOn>02/28/2014 13:54:22</CreatedOn>
<ExceptionObject />
<RenderedMessage>"7a0d464d-556c-4d47-820f-0cf01322e54c" - "GET https://myapi.com/booking" - API-"Response":
"Unauthorized"</RenderedMessage>
</LogEntry>
When we import these logs into Splunk, each log entry is incorrectly split into three parts, e.g.:
1-
<LogEntry>
<UserName>IIS APPPOOL\ASP.NET v4.0</UserName>
2-
<CreatedOn>02/28/2014 02:57:55</CreatedOn>
<ExceptionObject />
<RenderedMessage>"66d8cdda-ff62-480a-b7d2-ec175b151e5f" - "POST https://myapi.com/booking" - API-"Response":
"Bad Request"</RenderedMessage>
</LogEntry>
3-
<TimeStamp>02/28/2014 02:57:29</TimeStamp>
<ThreadName>21</ThreadName>
<CorrelationId>66d8cdda-ff62-480a-b7d2-ec175b151e5f</CorrelationId>
<LoggerName>-Api-booking</LoggerName>
<Level>INFO</Level>
<Identity></Identity>
<Domain>/LM/W3SVC/1/ROOT/Api-1-130380256918440000</Domain>
How can I configure Splunk to see these as a single log event?

props.conf (pay attention to LINE_BREAKER)
[your_xml_sourcetype]
TIME_PREFIX = <TimeStamp>
MAX_TIMESTAMP_LOOKAHEAD = 19
TZ = GMT
# A performance tweak is to disable SHOULD_LINEMERGE and instead set
# LINE_BREAKER to "line-ending characters coming immediately before the start
# of a new <LogEntry> element", so each XML entry becomes one event.
TIME_FORMAT = %m/%d/%Y %T
LINE_BREAKER = ([\r\n]+)<LogEntry>
SHOULD_LINEMERGE = False
# 10000 is default, should be set on a case by case basis
TRUNCATE = 5000
# This data is XML, so KV_MODE = xml lets Splunk extract fields from the tags
# at search time. If the data did not have nice key=value pairs (or some other
# readily machine-parseable format, like JSON or XML), you would set
# KV_MODE = none so that Splunk doesn't spin its wheels attempting to look for
# key=value pairs which don't exist.
KV_MODE = xml
# Leaving PUNCT enabled can impact indexing performance. Customers can
# comment this line if they need to use PUNCT
ANNOTATE_PUNCT = false
More information here: http://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf
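To sanity-check the break logic outside Splunk, here is a minimal Python sketch that emulates the same split; the file name api.log is only a placeholder, and Splunk itself discards the newlines captured by group 1 of LINE_BREAKER:
import re

# Emulate LINE_BREAKER = ([\r\n]+)<LogEntry>: break at the newlines that come
# immediately before a new <LogEntry> element, keeping the tag with the event.
raw = open("api.log", encoding="utf-8").read()   # placeholder file name
events = [e for e in re.split(r"[\r\n]+(?=<LogEntry>)", raw) if e.strip()]
for event in events:
    print(event.splitlines()[0], "...")          # each event starts with <LogEntry>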


Regex to find text & value in large text

As I SSH into CM, run commands and start reading the CLI output, I get the following
back:
# * A lot more output above, removed for brevity *
terminal_output = """
[24;1H [79b[1GCommand: disp sys cust<<[23;0H[0;7m [79b[1G[0m[24;0H [79b[1G[1;0H[0;7m [79b[1G[0m[2;0H [79b[1G[3;1H[0J7[1;1H[0;7mdisplay system-parameters customer-options [0m8[1;65H[0;7mPage 1 of 12[0m[2;33HOPTIONAL FEATURES[4;8HG3 Version: [4;20HV20 [4;50HSoftware Package: [4;68HEnterprise [5;10HLocation: [5;20H2[6;10HPlatform: [6;20H28 [5;51HSystem ID (SID): [5;68H9990093751 [6;51HModule ID (MID): [6;68H1 [8;60HUSED[9;29HPlatform Maximum Ports: [9;53H 81000[9;60H 436[10;35HMaximum Stations: [10;53H 135[10;60H 110[11;27HMaximum XMOBILE Stations: [11;53H 41000[11;60H 0[12;17HMaximum Off-PBX Telephones - EC500: [12;53H 135[12;60H 2[13;17HMaximum Off-PBX Telephones - OPS: [13;53H 135[13;60H 40[14;17HMaximum Off-PBX Telephones - PBFMC: [14;53H 135[14;60H 0[15;17HMaximum Off-PBX Telephones - PVFMC: [15;53H 135[15;60H 0[16;17HMaximum Off-PBX Telephones - SCCAN: [16;53H 0[16;60H 0[17;22HMaximum Survivable Processors: [17;53H 313[17;62H 1[22;9H(NOTE: You must logoff & login to effect the permission changes.)[2;50H[0m
"""
It's a lot of ANSI escape codes (I think?), which makes the output hard to read. Anyway, what I'm trying to get back from the text above is the following:
Maximum Stations: 135 110
From my understanding, a regex is required for this.
The regexes I tried, which did not work:
r'Maximum Stations:\s*(\d+)(\d+)'
r'Maximum Stations: \d+'
If anyone knows how to filter out these ANSI escape codes so they don't appear in the final output, that'd be great too.
Thank you.
You can try the following:
"(Maximum Stations:)\s\[\d*;\d*H\s*(\d*)\[\d*;\d*H\s*(\d*)"gm
It produces three groups: the first with the "Maximum Stations:" text, then two more, each with one of the numbers you want to capture. You would have to combine the groups to get your final output.
I don't know if this will be generic enough for your application though.
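For reference, a minimal Python sketch that applies this pattern to the terminal_output string from the question and joins the groups; the ANSI-stripping part assumes the real CLI output still carries the ESC byte (\x1b) in front of each bracket sequence, which the pasted sample has lost:
import re

# Reuses the terminal_output string defined in the question.
m = re.search(r"(Maximum Stations:)\s\[\d*;\d*H\s*(\d*)\[\d*;\d*H\s*(\d*)", terminal_output)
if m:
    print(m.group(1), m.group(2), m.group(3))   # -> Maximum Stations: 135 110

# Common way to remove ANSI control sequences; adjust if your raw output
# differs (this pattern expects the ESC byte before each "[...H" sequence).
ansi = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
clean = ansi.sub("", terminal_output)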

How do I in FloPy Modflow6 output MAW head values for all timesteps?

I am creating a MAW well and want to use it as an observation well to compare later to field data; it should be screened over multiple layers. However, I am only getting the head value in the well for the very last timestep in my output file. Any ideas on how to get all timesteps in the output?
The FloPy manual says something about it needing to be in Output Control, but I can't figure out how to do that:
print_head (boolean) – keyword to indicate that the list of multi-aquifer well heads will be printed to the listing file for every stress period in which “HEAD PRINT” is specified in Output Control. If there is no Output Control option and PRINT_HEAD is specified, then heads are printed for the last time step of each stress period.
In the MODFLOW6 manual I see that it is possible to make a continuous output:
modflow6
My MAW definition looks like this:
maw = flopy.mf6.ModflowGwfmaw(gwf,
                              nmawwells=1,
                              packagedata=[0, Rwell, minbot, wellhead, 'MEAN', OBS1welllayers],
                              connectiondata=OBS1connectiondata,
                              perioddata=[(0, 'STATUS', 'ACTIVE')],
                              flowing_wells=False,
                              save_flows=True,
                              mover=True,
                              flow_correction=True,
                              budget_filerecord='OBS1wellbudget',
                              print_flows=True,
                              print_head=True,
                              head_filerecord='OBS1wellhead',
                              )
My output control looks like this:
oc = flopy.mf6.ModflowGwfoc(gwf,
                            budget_filerecord=budget_file,
                            head_filerecord=head_file,
                            saverecord=[('HEAD', 'ALL'), ('BUDGET', 'ALL')],
                            )
Hope this is all clear and someone can help me, thanks!
You need to initialise the MAW observations file... it's not done in the OC package.
You can find the scripts for the three MAW examples in the MF6 documentation here:
https://github.com/MODFLOW-USGS/modflow6-examples/tree/master/notebooks
It looks something like this:
obs_file = "{}.maw.obs".format(name)
csv_file = obs_file + ".csv"
obs_dict = {csv_file: [
    ("head", "head", (0,)),
    ("Q1", "maw", (0,), (0,)),
    ("Q2", "maw", (0,), (1,)),
    ("Q3", "maw", (0,), (2,)),
]}
maw.obs.initialize(filename=obs_file, digits=10, print_input=True, continuous=obs_dict)
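Once the model has run, the continuous observation output is just a CSV that can be read back, for example with pandas. A minimal sketch; the exact column names and casing can vary between MODFLOW 6 versions, so inspect obs.columns first:
import pandas as pd

# csv_file is the "{}.maw.obs.csv" name built above.
obs = pd.read_csv(csv_file)
print(obs.columns.tolist())   # typically a time column plus "head", "Q1", "Q2", "Q3"
print(obs.head())             # well head (and flows) for every saved time step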

How to Generate Grok Patterns automatically using LogMine

I am trying to generate Grok patterns automatically using LogMine.
Log sample:
Error IGXL error [Slot 2, Chan 16, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow. Please check the timing programming. Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.
Error IGXL error [Slot 2, Chan 18, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow. Please check the timing programming. Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.
For the above logs, I am getting the following pattern:
re.compile('^(?P<Event>.*?)\\s+(?P<Tester>.*?)\\s+(?P<State>.*?)\\s+(?P<Slot>.*?)\\s+(?P<Instrument>.*?)\\s+(?P<Content1>.*?):\\s+(?P<Content>.*?)$')
But I expect a Grok pattern (Logstash) that looks like this:
%{LOGLEVEL:level} *%{DATA:Instrument} %{LOGLEVEL:State} \[%{DATA:slot} %{DATA:slot} %{DATA:channel} %{DATA:channel} %{DATA:Site}] %{DATA:Tester} : %{DATA:Content}
Code: LogMine is imported from the following link: https://github.com/logpai/logparser/tree/master/logparser/LogMine
import sys
import os
sys.path.append('../')
import LogMine

input_dir = r'E:\LogMine\LogMine'                    # The input directory of the log file
output_dir = r'E:\LogMine\LogMine/output/'           # The output directory of parsing results
log_file = r'E:\LogMine\LogMine/log_teradyne.txt'    # The input log file name
log_format = '<Event> <Tester> <State> <Slot> <Instrument><content> <contents> <context> <desc> <junk> '  # Format of the IGXL log lines above
levels = 1        # The levels of hierarchy of patterns
max_dist = 0.001  # The maximum distance between any log message in a cluster and the cluster representative
k = 1             # The message distance weight (default: 1)
regex = []        # Regular expression list for optional preprocessing (default: [])
print(os.getcwd())
parser = LogMine.LogParser(input_dir, output_dir, log_format, rex=regex, levels=levels, max_dist=max_dist, k=k)
parser.parse(log_file)
This code returns only the parsed CSV file; I am looking to generate the Grok patterns and use them later in a Logstash application to parse the logs.
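As far as I know LogMine has no built-in Grok export, but since the generated template is a regex with named groups, one rough approach is to rewrite each (?P<Name>.*?) group as a %{DATA:Name} token and refine the result by hand. A minimal sketch; the blanket use of DATA is an assumption, so tighter types such as LOGLEVEL still have to be chosen manually:
import re

# Template produced by LogMine for the sample logs above.
logmine_pattern = (r'^(?P<Event>.*?)\s+(?P<Tester>.*?)\s+(?P<State>.*?)\s+'
                   r'(?P<Slot>.*?)\s+(?P<Instrument>.*?)\s+(?P<Content1>.*?):\s+(?P<Content>.*?)$')

def regex_to_grok(pattern):
    # Replace every lazy named group with a %{DATA:name} token; anchors are
    # stripped and the remaining regex (\s+, literals) is left for manual cleanup.
    grok = re.sub(r"\(\?P<(\w+)>\.\*\?\)", r"%{DATA:\1}", pattern)
    return grok.strip("^$")

print(regex_to_grok(logmine_pattern))
# %{DATA:Event}\s+%{DATA:Tester}\s+%{DATA:State}\s+%{DATA:Slot}\s+%{DATA:Instrument}\s+%{DATA:Content1}:\s+%{DATA:Content}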

Minimal self-compiling to .pdf Rmarkdown file

I need to compose a simple rmarkdown file, with text, code and the results of executed code included in a resulting PDF file. I would prefer if the source file is executable and self-sufficient, avoiding the need for a makefile.
This is the best I have been able to achieve, and it is far from good:
#!/usr/bin/env Rscript
library(knitr)
pandoc('hw_ch4.rmd', format='latex')
# TODO: how to NOT print the above commands to the resulting .pdf?
# TODO: how to avoid putting everything from here on in ""s?
# TODO: how to avoid mentioning the file name above?
# TODO: how to render special symbols, such as tilde, miu, sigma?
# Unicode character (U+3BC) not set up for use with LaTeX.
# See the inputenc package documentation for explanation.
# nano hw_ch4.rmd && ./hw_ch4.rmd && evince hw_ch4.pdf
"
4E1. In the model definition below, which line is the likelihood?
A: y_i is the likelihood, based on the expectation and deviation.
4M1. For the model definition below, simulate observed heights from the prior (not the posterior).
A:
```{r}
points <- 10
rnorm(points, mean=rnorm(points, 0, 10), sd=runif(points, 0, 10))
```
4M3. Translate the map model formula below into a mathematical model definition.
A:
```{r}
flist <- alist(
y tilda dnorm( mu , sigma ),
miu tilda dnorm( 0 , 10 ),
sigma tilda dunif( 0 , 10 )
)
```
"
Result:
What I eventually came to use is the following header. At first it sounded neat, but later I realized:
+ it is indeed easy to compile in one step
- it duplicates code
- mixing an executable script and presentation data in one file is a security risk.
Code:
#!/usr/bin/env Rscript
#<!---
library(rmarkdown)
argv <- commandArgs(trailingOnly=FALSE)
fname <- sub("--file=", "", argv[grep("--file=", argv)])
render(fname, output_format="pdf_document")
quit(status=0)
#-->
---
title:
author:
date: "compiled on: `r Sys.time()`"
---
The quit() call is supposed to guarantee that the rest of the file is treated as data. The <!--- and --> comments make the executable code render as a comment when the file is interpreted as markdown; they are, in turn, hidden from the shell by the leading #s.

Why does ncl not find netcdf files?

I use ncl-ncarg 6.1.2-7 from Trusty under Ubuntu 14.04. I created a soft link from usr/share/ncarg to usr/lib and set the environment and path by:
export NCARG_ROOT="/usr"
export PATH=$NCARG_ROOT/bin:$PATH
I have a simple_plot_pr.ncl which creates a panel plot from 3 netCDF files.
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
begin
;-- read data and set variable references
f1 = addfile ("home/robert/Dokumenty/climatological monthly mean pr_1971-2000.nc","r")
f2 = addfile ("home/robert/Dokumenty/climatological monthly mean pr_2021-2050.nc","r")
f3 = addfile ("home/robert/Dokumenty/climatological monthly mean pr_2071-2100.nc","r")
pr1 = f1->pr
pr2 = f2->pr
pr3 = f3->pr
;-- open a PNG file
wks = gsn_open_wks("png","panel_plot")
;-- create plot array
plot = new(3,graphic)
;-- set resources for contour plots
res = True
res@gsnMaximize = True
res@cnFillOn = True
res@tiMainString = "Climatological mean monthly precipitation amount"
gsn_define_colormap(wks,"rainbow")
plot(0) = gsn_csm_contour_map(wks,pr1(:,:),res)
res@tiMainString = ""
plot(1) = gsn_csm_contour_map(wks,pr2(:,:),res)
res@tiMainString = ""
plot(2) = gsn_csm_contour_map(wks,pr3(:,:),res)
;-- create panel plot
gsn_panel(wks,plot,(/3,1/),False)
end
When I run this .ncl file I get the following error messages:
Copyright (C) 1995-2013 - All Rights Reserved
University Corporation for Atmospheric Research
NCAR Command Language Version 6.1.2
The use of this software is governed by a License Agreement.
See http://www.ncl.ucar.edu/ for more details.
fatal:["FileSupport.c":2761]:_NclFindFileExt: Requested file <home/Dokumenty/climatological monthly mean pr_1971-2000.nc> or <home/Dokumenty/climatological monthly mean pr_1971-2000> does not exist
fatal:["FileSupport.c":3106]:(home/Dokumenty/climatological monthly mean pr_1971-2000.nc) has no file extension, can't determine type of file to open
fatal:["FileSupport.c":2761]:_NclFindFileExt: Requested file <home/robert/Dokumenty/climatological monthly mean pr_2021-2050.nc> or <home/robert/Dokumenty/climatological monthly mean pr_2021-2050> does not exist
fatal:["FileSupport.c":3106]:(home/robert/Dokumenty/climatological monthly mean pr_2021-2050.nc) has no file extension, can't determine type of file to open
fatal:["FileSupport.c":2761]:_NclFindFileExt: Requested file <home/robert/Dokumenty/climatological monthly mean pr_2071-2100.nc> or <home/robert/Dokumenty/climatological monthly mean pr_2071-2100> does not exist
fatal:["FileSupport.c":3106]:(home/robert/Dokumenty/climatological monthly mean pr_2071-2100.nc) has no file extension, can't determine type of file to open
fatal:file (f1) isn't defined
fatal:["Execute.c":8128]:Execute: Error occurred at or near line 11 in file simple_plot_pr.ncl
I checked these files and they exist. I do not understand why NCL cannot find them. Can someone give me a suggestion to solve this issue?
The error is definitely arising due to the spaces in the file names. Please rename the files so that they contain no spaces.
You can also escape each space in the file name with a backslash. For example, instead of writing "climatological monthly mean pr_1971-2000.nc", you can write the file name in the following format:
"climatological\ monthly\ mean\ pr_1971-2000.nc"
