How to get the file name of Linux awk and find results

I have this Linux command using pipes which finds files in directories and runs grep and awk on the results. The command works just fine; what I'm missing is the ability to get the file name of each result, so I know which source file it came from.
$ find . -name "*.log" | xargs grep -i TypeOf | grep -v 'Error=APP1' | awk '{split($0,a,"Name="); print a[2]}' | sort -h
How do I add something to this command to print the file names?

You don't need to add much, but you do need to remove several things. You never need grep when you're using awk, and splitting input into fields is what awk does by default (so the explicit split() is unnecessary), so your existing command line should just be:
find . -name "*.log" | xargs awk -F'Name=' 'tolower($0)~/typeof/ && !/Error=APP1/ {print $2}' | sort -h
and then to print the file name and line number just add them to the print statement:
find . -name "*.log" | xargs awk -F'Name=' 'tolower($0)~/typeof/ && !/Error=APP1/ {print FILENAME, FNR, $2}' | sort -h

The problem is your awk program: it discards everything before "Name=", including the file name that grep prefixes to each matching line. Since grep separates that file name from the line with a colon, you can recover it as the first colon-delimited field:
awk -F: '{split($0,a,"Name="); print $1, a[2]}'
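For instance, with a hypothetical matching line from the grep stage (the file name and values here are made up):
$ echo './app.log:2021-03-28 TypeOf=X Name=bar' | awk -F: '{split($0,a,"Name="); print $1, a[2]}'
./app.log bar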

Related

Using STDIN from pipe in sed command to replace value in a file

I've got a pipeline of commands that produces a variable output string such as 123456. I want to pipe that to a sed command replacing a known string in a csv file that looks like this:
Fred,Wilma,Betty,Barney
However, the command below does not work, and I haven't found any other references to using piped values as the replacement in a substitution.
How does this code change if the values in the csv are in a random order and I always want to change the second value?
Example code:
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 > test.txt |
sed -i -e '/Wilma/ r test.txt' -e 's/Wilma//' input.csv
Contents of input.csv should become: Fred,0d522cd316,Betty,Barney
Okay, in
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 > test.txt | sed -i -e '/Wilma/ r test.txt' -e 's/Wilma//' input.csv
you have a bug. That "> test.txt" after cut sends cut's output to the file instead of the pipe, so nothing reaches sed on stdin (and sed -i ignores stdin anyway, since it edits input.csv in place). You don't want a pipe there, or you don't want to redirect to a file.
The way to take piped stdin and use it as a parameter in a command is through xargs.
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 | xargs --replace=INSERTED -- sed -i -e 's/Wilma/INSERTED/' input.csv
(...though that find|sha256sum is suspect too, in that the order of files from find is random(ish) and it matters for a reproducible sum. You probably mean to add "| sort" after find.)
(Some would sed -i -e "s/Wilma/$(find|sort|shasum|cut)" f, but I ain't among them. Animals.)
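Spelling out that sort suggestion, the hashing part of the pipeline might look like this (just a sketch, otherwise unchanged):
find / -iname awk 2>/dev/null | sort | sha256sum | cut -c1-10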
For replacing a fixed string like "Wilma", try:
sed -i 's/Wilma/'"$(find / -iname awk 2>/dev/null |
sha256sum | cut -c1-10)"'/' input.csv
To replace the 2nd field no matter what's in it, try:
sed -i 's/[^,]*/'"$(find / -iname awk 2>/dev/null |
sha256sum | cut -c1-10)"'/2' input.csv
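As a quick illustration of the /2 flag (GNU sed) on the sample line from the question, using the expected checksum in place of the command substitution:
$ echo 'Fred,Wilma,Betty,Barney' | sed 's/[^,]*/0d522cd316/2'
Fred,0d522cd316,Betty,Barney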

Replacing unknown amount of blank spaces for X amount

Hey, so I'm writing a Linux script and I came across an interesting finding.
I've got a command that sorts the files inside a directory by size and prints the largest one. The command is as follows:
find . -type f -ls | sort -r -n -k7 | head -n 1
This will print something along the lines of
895918591 8 -r-w-x 1 user01 xdf 1931 28 march 23:21 ./myscript.sh
So I want to get the largest file's size alone and print it. To separate it I used cut -d' ' -f2; the issue is, this leaves only empty output. That is because the number of spaces is inconsistent.
So I tried doing something like this
find . -type f -ls | sort -r -n -k7 | head -n 1 | tr -d [:blank:] | cut -d' ' -f2
The issue is, this removes all the blank spaces, so now I can't split the line on a common separator. So I'm asking: is there a way to collapse all the runs of blanks into a single blank space?
If not, at least any other way to get to that number of bytes?
Sed and Awk are great tools for this kind of thing. Sed is a regex-based language that modifies the contents of each line the Sed program receives, and Awk is also a line-oriented tool that automatically splits its input into fields.
To turn sequences of blanks into one blank (substitute all matches of /\s\+/ with a single space) in sed:
$ find ... | sed 's/\s\+/ /g'
To just print the first "word" (sequence of nonspaces) of each line in Awk:
$ find ... | awk '{print $1}'
http://tldp.org/LDP/abs/html/sedawk.html can get you started with these languages.
Instead of cut you can use awk:
find . -type f -ls | sort -r -n -k7 | head -n 1 | awk '{print $2}'
However, you can avoid head as well by using awk:
find . -type f -ls | sort -r -n -k7 | awk '{print $2; exit}'
The tool to convert multiple spaces to just one is called tr -s:
tr translates
s squeezes
Sample:
$ cat a
hello    this   is  a    sample   text  with   multiple     spaces
$ tr -s " " < a
hello this is a sample text with multiple spaces
If you then want to convert every space into X, just pipe to sed 's/ /X/g'.
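For example, continuing with the sample file a from above:
$ tr -s " " < a | sed 's/ /X/g'
helloXthisXisXaXsampleXtextXwithXmultipleXspaces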
I think you're overthinking the issue at hand:
find -type f -printf "%s\n"|sort -n|tail -n1
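If you also want to see which file that size belongs to, a small variation of the same idea (GNU find; prints size and path):
find . -type f -printf "%s %p\n" | sort -n | tail -n 1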
Instead of using cut, you can use find's -printf action, which gives you control over what is displayed:
find . -type f -printf "%s\n" | sort -r -n | head -n 1
You're doing it wrong.
Parsing ls in any form (like find's -ls option) is a bad approach.
Do not use ls output for anything. ls is a tool for interactively looking at directory metadata. Any attempts at parsing ls output with code are broken.
I strongly suggest you read further about this subject. Read "Parsing ls".
Instead, use the following function:
# Usage: largest [dir]
largest() {
  local f size largest
  while read -rd '' f; do
    size=$(wc -c < "$f")
    if (( size > largest[0] )); then
      largest=("$size" "$f")
    fi
  done < <(find "${1-.}" -type f -print0)
  printf '%s is the largest file in %s\n' "${largest[1]}" "${1-.}"
}
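A hypothetical invocation (file names made up; the output depends on your directory):
$ largest /var/log
/var/log/syslog is the largest file in /var/log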

Bash grep output filename and line no without matches

I need to get a list of matches with grep, including the filename and line number but without the matched string.
I know that grep -Hl gives only file names and grep -Hno gives the filename with only the matching string, but those are not ideal for me. I need a list without the match but with the line number. For this, grep -Hln doesn't work. I tried grep -Hn 'pattern' | cut -d " " -f 1, but it doesn't cut the filename and line number properly.
awk can do that in a single command:
awk '/pattern/ {print FILENAME ":" FNR}' *.txt
(FNR is the line number within the current file; NR would keep counting across files.)
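For example, with hypothetical files app1.txt and app2.txt containing the pattern on lines 3 and 7 respectively, the output would look like:
app1.txt:3
app2.txt:7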
You were on the right track with cut; you just need : as the field separator. Also, you want the first and second fields. Hence, use:
grep -Hn 'pattern' files* | cut -d: -f1,2
Sample
$ grep -Hn a a*
a:3:are
a:10:bar
a:11:that
a23:1:hiya
$ grep -Hn a a* | cut -d: -f1,2
a:3
a:10
a:11
a23:1
I guess you want this, just line numbers:
grep -nh PATTERN /path/to/file | cut -d: -f1
example output:
12
23
234
...
Unfortunately you'll need to use cut here. There is no way to do it with pure grep.
Try
grep -RHn 'pattern' . | awk -F: '{print $1 ":" $2}'

How to run grep inside awk?

Suppose I have a file input.txt with a few columns and a few rows, where the first column is the key, and a directory dir with files that contain some of these keys. I want to find all lines in the files in dir which contain these key words. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because it thinks the keys are paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either; eventually I had to admit that even this doesn't work:
cat input.txt | awk '{system("echo $1")}'
After trying to use \ to escape the whitespace and the $ sign, I came here to ask for your advice. Any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough, because it takes two commands, while I want only one. It also shows why xargs doesn't work: the parameter is not the last argument.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
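A sketch with made-up file contents, just to show the output shape:
$ cat input.txt
foo 1
bar 2
$ cat dir/x.txt
hello foo world
nothing here
$ awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
dir/x.txt hello foo world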
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
First thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.
Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. In FreeBSD or OSX, you would use a -J option instead.
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz@szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz@szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
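One concrete way to prepare that keyword list is sketched below; it assumes the keys contain no regex metacharacters and uses extended regexps (-E with |) instead of \|:
grep -rE "$(awk '{print $1}' input.txt | paste -sd '|')" dir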
In case you still want to use grep inside awk, make sure $1, $2 etc. are outside the quotes.
e.g. this works perfectly:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
# notice the space after grep and before the file name

Grep - returning both the line number and the name of the file

I have a number of log files in a directory. I am trying to write a script to search all the log files for a string and echo the name of the files and the line number on which the string is found.
I figure I will probably have to use two greps, piping the output of one into the other, since the -l option only returns the name of the file and nothing about the line numbers. Any insight into how I can achieve this would be much appreciated.
Many thanks,
Alex
$ grep -Hn root /etc/passwd
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash
combining -H and -n does what you expect.
If you want to echo the required information without the matched string:
$ grep -Hn root /etc/passwd | cut -d: -f1,2
/etc/passwd:1
or with awk :
$ awk -F: '/root/{print "file=" ARGV[1] "\nline=" NR}' /etc/passwd
file=/etc/passwd
line=1
If you want to create shell variables, evaluate the output in the current shell:
$ eval "$(awk -F: '/root/{print "file=" ARGV[1] "\nline=" NR}' /etc/passwd)"
$ echo $line
1
$ echo $file
/etc/passwd
Use -H. If you are using a grep that does not have -H, specify two filenames. For example:
grep -n pattern file /dev/null
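For example, reusing the /etc/passwd example from above: with a single file grep omits the file name, and adding /dev/null as a second file forces it to be printed:
$ grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
$ grep -n root /etc/passwd /dev/null
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash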
My version of grep kept returning text from the matching line, which I wasn't sure you were after... You can also pipe the output to an awk command to have it ONLY print the file name and line number:
grep -rHn "text" . | awk -F: '{print $1 ":" $2}'
