I've written a script to get the browser version of users, but I need to clean up the output. The script greps the Apache logs for @ and IE8, then emails me the information. The problem is the output: when the grep finds an email address and IE8, it gives me the full request path, i.e. /page/code/user@foobar.com/home.php, whereas the output I'm looking for is just the email address, recorded only once a day:
Example:
user@foobar IE8
Thanks
#!/bin/bash
# Date and time (x, y and z aren't being used at the moment)
x="$(date +'%d/%b/%Y')"
y="$(date +'%T')"
z="$(date +'%T' | awk 'BEGIN { FS = ":" } ; { print $1 }')"
# Human-readable date for the email subject
emaildate=$(date +"%d%b%Y--Hour--%H")
# Date and time for the grep and the filename
beta="$(date +'%d/%b/%Y:%H')"
sigma="$(date +'%d-%b-%Y-%H')"
# Current access log
log='/var/logs/access.log'
# Saved log location
newlogs=/home/user/Scripts/browser/logs
# Perform the grep for the current day and keep the request field ($7)
grep '@' "$log" | grep "$beta" | awk '{ print $7 }' | sort -u >> "$newlogs/browserusage$sigma.txt"
mail -s "IE8 usage for $emaildate" user@example.com < "$newlogs/browserusage$sigma.txt"
I'm a beginner at bash scripting and have been writing a script to check different log files, and I'm a bit stuck here.
clientlist=/path/to/logfile/which/consists/of/client/names
#i will grep only the client name from the file which has multiple log lines
clients=$(grep --color -i 'list of client assets:' $clientlist | cut -d":" -f2 )
echo "Clients : $clients"
#For example "Clients: Apple
# Samsung
# Nokia"
#number of clients may vary from time to time
assets=("$clients".log)
echo assets: "$assets"
The code above greps the client names from the log file, and I'm trying to use each grepped client name to construct a logfile name.
The number of clients is indefinite and may vary from time to time.
The code I have returns the client names as a whole:
assets: Apple
Samsung
Nokia.log
and I'm a bit unsure how to split the string and pass the names on one by one to get a .log asset for each client name. How can I do this? Desired output:
Apple.log
Samsung.log
Nokia.log
(Apologies if I have misunderstood the task)
Using awk
If your input file (I'll call it clients.txt) is:
Clients: Apple
Samsung
Nokia
The following awk step:
awk '{print $NF".log"}' clients.txt
outputs:
Apple.log
Samsung.log
Nokia.log
(You can pipe straight into awk and omit the file name if the piped stream looks like the file contents in the example above.)
It is highly likely that a simple awk procedure could perform the entire task, beginning with the clientlist you process with grep (awk has all of grep's functionality built in), but I'd need to know the structure of the original file to extract the client names.
One awk idea:
assets=( $(awk -F: '/list of client assets:/ {print $2".log"}' "${clientlist}") )
# or
mapfile -t assets < <(awk -F: '/list of client assets:/ {print $2".log"}' "${clientlist}")
Where:
-F: - define input field delimiter as :
/list of client assets:/ - for lines that contain the string list of client assets:, print the 2nd :-delimited field and append the string .log to the end
One sed idea:
assets=( $(sed 's/.*://; s/$/.log/' "${clientlist}") )
# or
mapfile -t assets < <(sed 's/.*://; s/$/.log/' "${clientlist}")
Where:
s/.*:// - strip off everything up to the :
s/$/.log/ - replace end of line with .log
Both generate:
$ typeset -p assets
declare -a assets=([0]="Apple.log" [1]="Samsung.log" [2]="Nokia.log")
$ echo "${assets[#]}"
Apple.log Samsung.log Nokia.log
$ printf "%s\n" "${assets[#]}"
Apple.log
Samsung.log
Nokia.log
$ for i in "${!assets[@]}"; do echo "assets[$i] = ${assets[$i]}"; done
assets[0] = Apple.log
assets[1] = Samsung.log
assets[2] = Nokia.log
NOTE: the alternative answers using mapfile address the issue referenced in Charles Duffy's comment (see bash pitfall #50); readarray is a synonym for mapfile.
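To see the pitfall in action, here is a small demonstration (the client name with a space is hypothetical): the unquoted $( ) assignment word-splits each line, while mapfile keeps one line per element:
$ printf 'Big Phone.log\n' > /tmp/clients.demo
$ assets=( $(cat /tmp/clients.demo) )    # word splitting: two elements
$ typeset -p assets
declare -a assets=([0]="Big" [1]="Phone.log")
$ mapfile -t assets < /tmp/clients.demo  # one line = one element
$ typeset -p assets
declare -a assets=([0]="Big Phone.log")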
I can run the command below in a shell script with no problem on Ubuntu 21.04:
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
but when I run the exact same script on Ubuntu 20.04.2, I get this error :
/bin/sh: 1: Syntax error: Unterminated quoted string
It's definitely the exact same script because I scp'd it from the 21.04 machine to the 20.04.2 machine. I couldn't find any topics on Stack Overflow or the wider internet that address this difference. Both Ubuntu installs are on Linux cloud servers. About the only way to run the script with no error is to take out this part of the awk line: "date +%s.%3N -d\""$1"\""|getline actualTime;
I tried playing around with the reference to the $1 field, but nothing worked. I tried nawk instead of awk, with no luck. Maybe as a last resort I can upgrade the OS from 20.04 to 21.04.
Has anyone seen this before?
Added: Thanks, all, for the quick replies. Here are the first lines of the log file that the script runs against:
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:14.36 UTC] Chain extended, new tip: de56b9f458e8942ca74c6a1913dc58fa896823dc19b366285e15481f434ed337 at slot 33453323
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:15.17 UTC] Chain extended, new tip: e88ea4f438944bd15186fe93f321c117ec769cfbd33667654634f4510cfd3780 at slot 33453324
Just to make sure it's not a data issue: I ran the script on the Ubuntu 21.04 server against the file (it worked), then copied the file to the Ubuntu 20.04 server and ran the exact same script against the copied file, and got the error.
I'll try out the suggestions on this topic and will let everyone know the answer.
New update: after a laptop crash and replacement, I remembered to come back to this post. I ended up using mktime() as Ed suggested, and it's working now.
The shell script:
#!/bin/bash
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/[(.)].[(.)].* ([0-9]+)/\2,\3/p" | mawk -F, -v threshold=2 -f check_delay.awk
The awk script:
BEGIN { ENVIRON["TZ"] = "UTC" }
{
    # $1 looks like "2021-06-30 02:20:14.36 UTC"; pick the pieces apart
    year  = substr($1,1,4)
    month = substr($1,6,2)
    day   = substr($1,9,2)
    hour  = substr($1,12,2)
    min   = substr($1,15,2)
    sec   = substr($1,18,2)
    timestamp = year " " month " " day " " hour " " min " " sec
    actualTime = mktime(timestamp) + 7200
    delay = actualTime - $2 - 1591566291
    sum += delay
    if (delay >= threshold)
        print $1 " " delay
}
END { print "AVG=", sum/NR }
You're spawning a shell to call date using whatever value happens to be in $1 in your data, so the result will depend on your data. Look:
$ echo '3/27/2021' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
1616821200.000
$ echo 'a"b' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
a"b
and what this command outputs from a log file:
sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
will vary greatly depending on the contents of specific lines in the log file, since the parts you're trying to isolate aren't anchored and use .*s where you presumably meant [^]]*s. For example:
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50'
[foo] [3/27/2021] 15 something [probably] happened at line 50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
3/27/2021] 15 something [probably,50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
date: invalid date ‘3/27/2021] 15 something [probably’
AVG= -1591566341
If you want to keep that approach, you could introduce a check for a valid date to avoid THAT specific error, e.g. (but obviously create a better date-verification regexp than this):
$ echo 'a"b' | awk '$1 ~ "[0-9]/[0-9]+/[0-9]" {"date +%s.%3N -d\""$1"\"" | getline; print}'
$
but it's still fragile and extremely slow.
You're using GNU sed (for -r), so you have or can get GNU awk, which has built-in time functions. That means you shouldn't be spawning a subshell to call date in the first place; just use mktime() (see https://stackoverflow.com/a/68180908/1745001), which avoids cryptic errors like that and runs orders of magnitude faster.
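For instance, a minimal sketch (assuming TZ is set to UTC, as in the log) of converting one of those bracketed timestamps with mktime() instead of date:
$ echo "2021-06-30 02:20:14.36 UTC,33453324" |
  TZ=UTC gawk -F, '{
      ts = substr($1,1,19)     # "2021-06-30 02:20:14"
      gsub(/[-:]/, " ", ts)    # "2021 06 30 02 20 14" -- the format mktime() wants
      print mktime(ts)         # seconds since the epoch
  }'
1625019614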
I am working on shellscript with excel sheet. Till now I have done as shown in screenshot by using below command:
bash execution.sh BehatIPOP.xls| awk '/Script|scenario/' | awk 'BEGIN{print "Title\tResult"}1' | awk '0 == NR%2{printf "%s",$0;next;}1' >> BehatIPOP.xls
My requirement is along with the heading Result I want to add(concat) current date also. So I am getting date by using below command:
$(date +"%d-%m-%y %H:%M:%S")
So date will display like this : 25-08-2016 17:00:00
But I am not getting how can use date command in the above mentioned command to achieve heading like below:
| Title | Result # 25-08-2016 17:00:00|
Thanks for any suggestions..
You can pick up the date inside awk and store it in a variable d like this, if that is what you mean:
awk 'BEGIN{cmd="date +\"%d-%m-%y %H:%M:%S\""; cmd |getline d; close(cmd);print "Result # " d}'
Result # 25-08-16 13:44:05
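If GNU awk is available, its built-in strftime() avoids the getline round-trip to an external date entirely; a minimal sketch (the output shown is a sample, and strftime() is a gawk extension, not POSIX awk):
$ gawk 'BEGIN{ print "Result # " strftime("%d-%m-%y %H:%M:%S") }'
Result # 25-08-16 13:44:05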
Don't use awk at all for the header; just use date directly:
{ printf "Title\tResult # "; date +"%d-%m-%y %H:%M:%S"; bash execution.sh BehatIPOP.xls |
awk '/Script|scenario/' |
awk '1 == NR%2{printf "%s",$0;next;}1'; } >> BehatIPOP.xls
Note that there's no need for two awk calls, but I'm keeping them here to minimize the diff. Since I've pulled the header out of the awk, the comparison changes from 0==NR%2 to 1==NR%2.
I am working on a shell script which contains the following piece of code.
I don't understand these lines, mostly the cut command and the export command. Can anyone help me?
Also, please point me to a good reference for Linux commands.
Thanks in advance!
# determine sum of 60 records
awk '{
if (substr($0,12,2) == "60" || substr($0,12,2) == "78") \
print $0
}'< /tmp/checks$$.1 > /tmp/checks$$.2
rec_sum=`cut -c 151-160 /tmp/checks$$.2 | /u/fourgen/cashnet/bin/sumit`
export rec_sum
Inside my sumit script is the following code:
# sum the first field of every input line and print the total
awk '{ total += $1 }
END { print total }' $1
Let me show my main script, prep_chk:
awk 'BEGIN{OFS=""} {if (substr($0,12,2) == "60" && substr($0,151,1) == "-") \
{ print substr($0,1,11), "78", substr($0,14) } \
else \
{ print $0 } \
}' > /tmp/checks$$.1
# determine count of non-header record
rec_cnt=`wc -l < /tmp/checks$$.1`
rec_cnt=`expr "$rec_cnt" - 1`
export rec_cnt
# determine sum of 60 records
awk '{ if (substr($0,12,2) == "60" || substr($0,12,2) == "78") \
print $0 }'< /tmp/checks$$.1 > /tmp/checks$$.2
rec_sum=`cut -c 151-160 /tmp/checks$$.2 | /u/fourgen/cashnet/bin/sumit`
export rec_sum
# make a new header record and output it
head -1 /tmp/checks$$.1 | awk '{ printf("%s%011.11d%05.5d%s\n", \
substr($0,1,45), rec_sum, rec_cnt, substr($0,62)) }' \
rec_sum="$rec_sum" rec_cnt="$rec_cnt"
# output everything else sorted by tran code
grep -v "%%%%%%%%%%%" /tmp/checks$$.1 | cut -c 1-150 | sort -k 1.12,13
cut -c extracts characters by position from each line of a file, in this case characters 151 to 160 of /tmp/checks$$.2. That string is piped to a script called sumit, which produces some output.
That output is then assigned to the variable rec_sum. The export command makes the variable available to child processes of this shell, for example another script it runs.
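A quick illustration of cut -c by itself:
$ echo "abcdefghij" | cut -c 3-5
cde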
Edit:
If that's all there is inside your sumit script, it simply adds each value you pass it (which must be a number, one per line) to a running total and prints that total at the end. It seems like there must be more code in that script; otherwise it would be a slightly over-complicated way to do it.
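For example, feeding sumit's awk body some numbers shows what it does:
$ printf '10\n20\n30\n' | awk '{ total += $1 } END { print total }'
60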
I need to take a file and count the number of occurrences of $7 - I've done this with awk (because I need to run this through more awk)
What I want to do is combine this into one script - so far I have
#! /usr/bin/awk -f
# get the filename, count the number of occurs
# <no occurs> <filename>
{ print $7 | "grep /datasheets/ | sort | uniq -c"}
How do I grab that output and run it through more awk commands, in the same file?
Eventually, I need to be able to run
./process.awk <filename>
so it can be a drop-in replacement for a previous setup which would take too much time/effort to change.
If you want to forward the output of an awk script to another awk script, just pipe it to awk:
awk 'foobar...' file|awk 'new awkcmd'
And your current awk|grep|sort|uniq could be done within awk itself, saving three processes. You want the repeated counts, don't you?
awk '$7~/datasheets/{a[$7]++} END{for(x in a)print x": "a[x]}' file
should work.
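Packaged as a self-contained script so it can be run as ./process.awk <filename> like the previous setup, the same counting idea looks like this (a sketch; the <no occurs> <value> output format follows the comment in the question):
#!/usr/bin/awk -f
# count how many times each /datasheets/ value appears in field 7
$7 ~ /\/datasheets\// { count[$7]++ }
# print "<no occurs> <value>" for each distinct value, like uniq -c
END { for (v in count) print count[v], v }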
If you use gawk, you could use its two-way communication facility to push the data to the external command and then read the results back:
#!/usr/bin/gawk -f
BEGIN {
COMMAND = "sort | uniq -c"
SEEN = 0
PROCINFO[ COMMAND, "pty" ] = 1
}
/datasheets/ {
print $7 |& COMMAND
SEEN = 1
}
END {
# Don't read sort output if no input was provided
if ( SEEN == 1 ) {
# Tell sort no more input data is available
close( COMMAND, "to" )
# Read the sorted data
while( ( COMMAND |& getline SORTED ) > 0 ) {
# Do whatever you want on the sorted data
print SORTED
}
close( COMMAND, "from" )
}
}
See https://www.gnu.org/software/gawk/manual/gawk.html#Two_002dway-I_002fO
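Saved as process.awk (per the question) and made executable, it runs the same way as the previous setup:
$ chmod +x process.awk
$ ./process.awk <filename>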