Convert linux log into csv


I'm new to Linux, and I'm sorry for asking this question again, but I would really appreciate it if someone could help me. I'm having trouble converting my Linux log to a CSV file to make it more readable.
I have an Apache log like the one below:
[Sun Mar 01 06:01:30 2015] [error] [client 123.456.789.012] File does not exist: /var/www/html/
How can I separate them by column, using: Date (Sun Mar 01 06:01:30 2015), IP (123.456.789.012) only IP, Error Message (File does not exist) and Target (/var/www/html/)?
Thank you

There are many ways to achieve this in a shell script. I will describe the method and give a sample.
You have to identify a delimiter to split the line on; then you can use either awk or sed to split the fields on that delimiter.
For example, in your case you can use ']' as the delimiter. The command to break up each line is as follows (each field keeps its leading '[', and the error message ends up in $4):
awk -F']' '{print $1 "," $2 "," $3}' logfile > new_log_file.csv
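For the exact four columns asked for in the question, a minimal sketch with sed might look like the following. It assumes every line has exactly the shape shown in the question; the function name apache_error_to_csv is made up for illustration, and lines that do not match are passed through unchanged.

```shell
# Hypothetical helper: reads Apache error-log lines on stdin and writes
# CSV (date, client IP, error message, target) on stdout.
# Assumes the layout "[date] [level] [client IP] message: target".
# Fields are wrapped in double quotes; embedded double quotes are NOT escaped.
apache_error_to_csv() {
  sed -E 's/^\[([^]]*)\] \[[^]]*\] \[client ([^]]*)\] ([^:]*): (.*)$/"\1","\2","\3","\4"/'
}
```

Usage would be something like apache_error_to_csv < error_log > error_log.csv.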

The easiest way would be to use your own LogFormat string. You can modify the standard LogFormat to use TAB instead of space as the separator. The standard Combined Log Format, usually defined under the nickname combined, looks like this:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
If you want a TAB separated file simply use this CustomLog statement in your server configuration file:
CustomLog logs/tabbed-logfile "%h\t%l\t%u\t%t\t\"%r\"\t%>s\t%b\t\"%{Referer}i\"\t\"%{User-Agent}i\""

Related

Puppet File_Line resource always triggering a refresh each time

I am modifying a file using the Puppet file_line resource, but each time Puppet runs it triggers a refresh, even though no other change has been made since the first Puppet run.
file_line { 'log_format_combined':
  ensure => present,
  path   => '/etc/apache2/apache2.conf',
  line   => 'LogFormat "%a %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined',
  match  => '^LogFormat "%h %l %u %t ."%r." %>s %b ."%{Referer}i." ."%{User-Agent}i."" combined',
}
What can I do to only trigger a refresh if a change is made to the file?

I tested your code and it works perfectly; file_line makes sure the change is applied only once if nothing has changed since.
So when I run Puppet for the first time I get this:
Notice: /Stage[main]/Main/Node[default]/File_line[log_format_combined]/ensure: created
and then nothing when I run the Puppet agent again.
Are you sure your file hasn't changed after the Puppet agent run? Don't you have another Puppet resource that changes that file?

I used the same code to test in my setup. There is no replacement happening after the first run, and no refresh either. I suspect the refresh is the outcome of some other code snippet.

How to color highlight text log entries (Apache, Log4J) during terminal viewing

I'm frequently eye-balling apache and jboss logs on Linux with "less" and "tail -f" and would like to have particular lines that match a string to be highlighted in a color of my choice. Is there a way to do this?
I am typically connected via ssh from an MS-DOS command window.
Edit: Preferably, the solution would not modify the log file itself.

Don't have access to a terminal right now, but you can try this,
In Apache config file when you define the LogFormat, try using shell color codes. E.g.,
LogFormat "${START}%h %l %u %t \"%r\" %>s %b ${END}" common_color
where
START="\e[1;34m" # for a blue text, you can use other colors as well
END="\e[0m"
This should work in shell terminals
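Since the question prefers not to touch the log itself, an alternative (not the LogFormat approach above) is to colour lines on the fly while viewing. This is only a sketch: the function name highlight and its arguments are invented for illustration, and it assumes a terminal that understands ANSI escape codes.

```shell
# Hypothetical filter: wraps every line matching a regex in an ANSI colour.
#   $1: extended regular expression to look for
#   $2: ANSI colour code (assumed default "1;34", bold blue)
# Non-matching lines are printed unchanged; the log file is never modified.
highlight() {
  awk -v pat="$1" -v color="${2:-1;34}" '
    $0 ~ pat { printf "\033[%sm%s\033[0m\n", color, $0; next }
    { print }
  '
}
```

Usage would be along the lines of tail -f logfile | highlight 'ERROR' '1;31', or highlight 'ERROR' < logfile | less -R. Some awk implementations buffer piped input, so with tail -f the output may lag.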

filename last modification date shell in script

I'm using bash to build a script where I will get a filename in a variable and then use this variable to get the file's Unix last modification date.
I need to get this modification date value and I can't use stat command.
Do you know any way to get it with the common available *nix commands?

Why you shouldn't use ls:
Parsing ls is a bad idea. Not only is the behaviour of certain characters in filenames undefined and platform-dependent; for your purposes, it will also mess with dates when they are more than six months in the past. In short, yes, it will probably work for you in your limited testing. It will not be platform-independent (so no portability), and the behaviour of your parsing is not guaranteed given the range of 'legal' filenames on various systems (ext4, for example, allows spaces and newlines in filenames).
Having said all that, personally, I'd use ls because it's fast and easy ;)
Edit
As pointed out by Hugo in the comments, the OP doesn't want to use stat. In addition, I should point out that the below section is BSD-stat specific (the %Sm flag doesn't work when I test on Ubuntu; Linux has a stat command, if you're interested in it read the man page).
So, a non-stat solution: use date
date, at least on Linux, has a flag: -r, which according to the man page:
display the last modification time of FILE
So, the scripted solution would be similar to this:
date -r "${MY_FILE_VARIABLE}"
which would return you something similar to this:
zsh% date -r MyFile.foo
Thu Feb 23 07:41:27 CST 2012
To address the OP's comment:
If possible with a configurable date format
date has a rather extensive set of time-format variables; read the man page for more information.
I'm not 100% sure how portable date is across all 'UNIX-like systems'. On BSD-based systems (such as OS X), this will not work; the -r flag for BSD date does something completely different. The question doesn't specify exactly how portable a solution is required to be. For a BSD-based solution, see the section below ;)
A better solution, BSD systems (tested on OS X, using BSD-stat; GNU stat is slightly different but could be made to work in the same way).
Use stat. You can format the output of stat with the -f flag, and you can select to display only the file modification data (which, for this question, is nice).
For example, stat -f "%m%t%Sm %N" ./*:
1340738054 Jun 26 21:14:14 2012 ./build
1340738921 Jun 26 21:28:41 2012 ./build.xml
1340738140 Jun 26 21:15:40 2012 ./lib
1340657124 Jun 25 22:45:24 2012 ./tests
Where the first bit is the UNIX epoch time, the date is the file modification time, and the rest is the filename.
Breakdown of the example command
stat -f "%m%t%Sm %N" ./*
stat -f: call stat, and specify the format (-f).
%m: The UNIX epoch time.
%t: A tab separator in the output.
%Sm: S says to display the output as a string, m says to use the file modification data.
%N: Display the name of the file in question.
A command in your script along the lines of the following:
stat -f "%Sm" "${FILE_VARIABLE}"
will give you output such as:
Jun 26 21:28:41 2012
Read the man page for stat for further information; timestamp formatting is done by strftime.

Have Perl?
perl -MFile::stat -e "print scalar localtime stat('FileName.txt')->mtime"

How about:
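find "$DIR" -maxdepth 1 -name "$FILE" -printf '%Tc\n'
Here $DIR is the directory containing the file (avoid naming the variable PATH, which the shell uses to locate commands). See the find manpage for other values you can use with %T.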

You can use the "date" command, adding the desired format option:
date +%Y-%m-%d -r /root/foo.txt
2013-05-27
date +%H:%M -r /root/foo.txt
23:02
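To capture that value in a script, a small wrapper along these lines could work. This is a sketch assuming GNU date (where -r FILE reports the file's modification time); the function name file_mtime and the default format are made up for illustration.

```shell
# Hypothetical helper: prints the last modification time of a file.
#   $1: path to the file
#   $2: optional strftime-style format (assumed default: %Y-%m-%d %H:%M:%S)
# Relies on GNU date's "-r FILE"; BSD date interprets -r differently.
file_mtime() {
  date -r "$1" "+${2:-%Y-%m-%d %H:%M:%S}"
}
```

In a script this would be used as mod_date=$(file_mtime "$file_name" '%Y-%m-%d'), with the variable quoted so that filenames containing spaces survive.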

You can use ls -l, which lists the last modification time, and then use cut to cut out the modification date:
mod_date=$(ls -l "$file_name" | cut -c35-46)
This works on my system because the date appears in columns 35 to 46. You might have to adjust the columns on your system.
The date comes in two different formats:
Mmm dd hh:mm
Mmm dd yyyy
Files modified more than about six months ago will have the latter format. Files modified more recently will have the former. You could search for a ":" to tell which format the date is in:
if echo "$mod_date" | grep -q ":"
then
    echo "File was modified within the last six months"
else
    echo "File was modified more than six months ago"
fi

Find string using grep

How can I find all strings in a file which are alphanumeric, may contain the symbols _ or #, and end in the hex code 0x00? I've tried using grep with the following options but it doesn't seem to work:
-z [a-zA-Z0-9_]*
Update
Here's an example of some of the strings I'm trying to extract, as you can see they end with the hex code 0x00, vary in length and although this specific example doesn't show they can contain 0-9, an underscore (_) or a hash (#).
http://i42.tinypic.com/23kos5w.jpg

When I run this all I get is 'Binary file /cygdrive/d/dump.bin matches'? I'm using grep in cygwin. – Twisted89 Apr 4 at 10:09
MrJames's answer does not include the -a option. Plus, putting \w in brackets simply doesn't work.
grep -Eaoz "(\w|_|#)*" FILE

need linux equivalent to windows "echo %date% %time% %COMPUTERNAME%"

In Linux,
"echo %date% %time% %COMPUTERNAME%"
returns
%date% %time% %COMPUTERNAME%
not
Fri 09/24/2010 10:46:25.42 WXP2010043001
as Windows does.
I need to be able to do this for the logs I'm setting up.

Use the date command with a format like this:
date +"%m/%d/%Y %H:%M:%S $HOSTNAME"
To get hundredths of seconds, you may need to do some text processing like this:
DATE=$(date +'%m/%d/%Y %H:%M:%S.%N')
DATE=${DATE%???????}
DATE="$DATE $HOSTNAME"
This is because date offers seconds, nanoseconds, and nothing in between!
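Put together for logging, the steps above might be wrapped in a function like this. It is only a sketch: the name log_stamp is invented, %N is assumed to be available (it is a GNU date extension), and uname -n is used as a fallback when $HOSTNAME is not set.

```shell
# Hypothetical helper: prints "Day MM/DD/YYYY HH:MM:SS.hh hostname",
# roughly what Windows gives for "echo %date% %time% %COMPUTERNAME%".
log_stamp() {
  # %N yields 9 nanosecond digits (GNU date assumed).
  stamp=$(date '+%a %m/%d/%Y %H:%M:%S.%N')
  # Drop the last 7 digits so only hundredths of a second remain.
  stamp=${stamp%???????}
  # $HOSTNAME is set by bash; fall back to uname -n in other shells.
  printf '%s %s\n' "$stamp" "${HOSTNAME:-$(uname -n)}"
}
```

A log line could then be written with echo "$(log_stamp) something happened" >> my.log, where my.log is a placeholder path.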

You can do:
dt=$(date)
echo $dt $HOSTNAME

echo $(date '+%Y %b %d %H:%M') Your output $HOSTNAME
Outputs:
2013 Nov 01 09:11 Your output PEGASUS-SYDNEY-CL2

It is also possible to use backtick characters for this:
echo `date` `hostname`
or with (localised) date formatting:
echo `date +"%a %x %X"` `hostname`

As a complement: the percent character is not used to reference variables in any Linux shell. You should use the dollar sign for this.
You should probably read an introduction to Bash.

Several people have provided answers based on date, but your question requires the short day name (although my UK Win 7 installation doesn't provide this with the ECHO command you specified), which no one has (so far) addressed.
To get this, you will probably want to include %a in the format string:
date "+%a %m/%d/%Y %H:%M:%S $HOSTNAME"

In Linux, there is the date command. If you don't like the default format, it can be modified; see the manpage of date.
For hostname, you can use hostname command, or $HOSTNAME environment variable, if it is set.
With system name, it is more complicated. You can use uname -a, sometimes it contains the OS name. Some distributions also have lsb-release, but not all of them.
