Looping a Linux command with input and multiple pipes

This command works, but I want to run it on every document (input.txt) in every subdirectory.
tr -d '\n' < input.txt | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> SingleOutput.txt
The code takes the input file and divides it into sentences, one per line. Then it finds all the sentences that contain a "[" and appends them to a single file.
I tried several looping techniques with find and for loops, but couldn't get anything to work in this case. First I tried
for dir in ./*; do
(cd "$dir" && tr -d '\n' < $dir | awk '{gsub(/\. /,".\n");print}' | grep “\[" >> /home/dan/SingleOutput.txt);
done;
and also
find ./ -execdir tr -d '\n' < . | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> /home/dan/SingleOutput.txt;
but they didn't execute, just giving me > marks. Any ideas?

Try this:
cd $dir
find ./ | grep "input.txt$" | while read file; do tr -d '\n' < "$file" | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> SingleOutput.txt; done
This will find all files called input.txt under $dir, then run the pipeline you already have working on each one, sending the output to $dir/SingleOutput.txt.
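A slightly tighter variant (just a sketch, assuming file names contain no embedded newlines) lets find do the matching itself instead of piping through grep:
find . -name 'input.txt' | while read -r file; do
    tr -d '\n' < "$file" | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> SingleOutput.txt
done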

Why not just something like this?
tr -d '\n' < */input.txt | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> SingleOutput.txt
Or are you interested in keeping the output for each input.txt separate?
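Note that a redirect like < */input.txt only works when the glob matches a single file; with several matches bash reports an "ambiguous redirect". A sketch of a workaround, assuming it is acceptable to simply concatenate the inputs:
cat */input.txt | tr -d '\n' | awk '{gsub(/\. /,".\n");print}' | grep "\[" >> SingleOutput.txt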

Related

Created directory with for loop in bash

I have these files. Imagine that each "test" represents the name of one server:
test10.txt
test11.txt
test12.txt
test13.txt
test14.txt
test15.txt
test16.txt
test17.txt
test18.txt
test19.txt
test1.txt
test20.txt
test21.txt
test22.txt
test23.txt
test24.txt
test25.txt
test26.txt
test27.txt
test28.txt
test29.txt
test2.txt
test30.txt
test31.txt
test32.txt
test33.txt
test34.txt
test35.txt
test36.txt
test37.txt
test38.txt
test39.txt
test3.txt
test40.txt
test4.txt
test5.txt
test6.txt
test7.txt
test8.txt
test9.txt
In each txt file, I have this type of data:
2019-10-14-00-00;/dev/hd1;1024.00;136.37;/
2019-10-14-00-00;/dev/hd2;5248.00;4230.53;/usr
2019-10-14-00-00;/dev/hd3;2560.00;481.66;/var
2019-10-14-00-00;/dev/hd4;3584.00;67.65;/tmp
2019-10-14-00-00;/dev/hd5;256.00;26.13;/home
2019-10-14-00-00;/dev/hd1;1024.00;476.04;/opt
2019-10-14-00-00;/dev/hd5;384.00;0.38;/usr/xxx
2019-10-14-00-00;/dev/hd4;256.00;21.39;/xxx
2019-10-14-00-00;/dev/hd2;512.00;216.84;/opt
2019-10-14-00-00;/dev/hd3;128.00;21.46;/var/
2019-10-14-00-00;/dev/hd8;256.00;75.21;/usr/
2019-10-14-00-00;/dev/hd7;384.00;186.87;/var/
2019-10-14-00-00;/dev/hd6;256.00;0.63;/var/
2019-10-14-00-00;/dev/hd1;128.00;0.37;/admin
2019-10-14-00-00;/dev/hd4;256.00;179.14;/opt/
2019-10-14-00-00;/dev/hd3;2176.00;492.93;/opt/
2019-10-14-00-00;/dev/hd1;256.00;114.83;/opt/
2019-10-14-00-00;/dev/hd9;256.00;41.73;/var/
2019-10-14-00-00;/dev/hd1;3200.00;954.28;/var/
2019-10-14-00-00;/dev/hd10;256.00;0.93;/var/
2019-10-14-00-00;/dev/hd10;64.00;1.33;/
2019-10-14-00-00;/dev/hd2;1664.00;501.64;/opt/
2019-10-14-00-00;/dev/hd4;256.00;112.32;/opt/
2019-10-14-00-00;/dev/hd9;2176.00;1223.1;/opt/
2019-10-14-00-00;/dev/hd11;22784.00;12325.8;/opt/
2019-10-14-00-00;/dev/hd12;256.00;2.36;/
2019-10-14-06-00;/dev/hd12;1024.00;137.18;/
2019-10-14-06-00;/dev/hd1;256.00;2.36;/
2019-10-14-00-00;/dev/hd1;1024.00;136.37;/
2019-10-14-00-00;/dev/hd2;5248.00;4230.53;/usr
2019-10-14-00-00;/dev/hd3;2560.00;481.66;/var
2019-10-14-00-00;/dev/hd4;3584.00;67.65;/tmp
2019-10-14-00-00;/dev/hd5;256.00;26.13;/home
2019-10-14-00-00;/dev/hd1;1024.00;476.04;/opt
2019-10-14-00-00;/dev/hd5;384.00;0.38;/usr/xxx
2019-10-14-00-00;/dev/hd4;256.00;21.39;/xxx
2019-10-14-00-00;/dev/hd2;512.00;216.84;/opt
2019-10-14-00-00;/dev/hd3;128.00;21.46;/var/
2019-10-14-00-00;/dev/hd8;256.00;75.21;/usr/
2019-10-14-00-00;/dev/hd7;384.00;186.87;/var/
2019-10-14-00-00;/dev/hd6;256.00;0.63;/var/
2019-10-14-00-00;/dev/hd1;128.00;0.37;/admin
2019-10-14-00-00;/dev/hd4;256.00;179.14;/opt/
2019-10-14-00-00;/dev/hd3;2176.00;492.93;/opt/
2019-10-14-00-00;/dev/hd1;256.00;114.83;/opt/
2019-10-14-00-00;/dev/hd9;256.00;41.73;/var/
2019-10-14-00-00;/dev/hd1;3200.00;954.28;/var/
2019-10-14-00-00;/dev/hd10;256.00;0.93;/var/
2019-10-14-00-00;/dev/hd10;64.00;1.33;/
2019-10-14-00-00;/dev/hd2;1664.00;501.64;/opt/
2019-10-14-00-00;/dev/hd4;256.00;112.32;/opt/
I would like to create a directory for each server, create in each directory a txt file for each FS, and put in these txt files each line that corresponds to that FS.
For that, I've tried this loop:
#!/bin/bash
directory=(ls *.txt | cut -d'.' -f1)
for d in $directory
do
if [ ! -d $d ]
then
mkdir $d
fi
done
for i in $(cat *.txt)
do
file=$(echo $i | awk -F';' '{print $2}' | sort | uniq | cut -d'/' -f3 )
data=$(echo $i | awk -F';' '{print $2}' )
echo $i | grep -w $data >> /xx/xx/xx/xx/xx/${directory/${file}.txt
done
But this loop doesn't work properly. The directories are created, but not the files inside each directory.
I would like something like :
test1/hd1.txt (with each line for the hd1 FS in hd1.txt)
And the same thing for each server.
Can you show me how to do that?
#!/bin/bash
for src in *.txt; do
    # start a subshell so we don't need to cd back afterwards
    # make "$src" be stdin before cd, so we don't need the full path
    # be careful that in the subshell only awk reads from stdin
    (
        # extract the server name to use as the directory
        dir=/xx/xx/xx/xx/xx/"${src%.txt}"
        # chain with "&&" so failures don't cause bad files
        mkdir -p "$dir" &&
        cd "$dir" &&
        awk -F \; '{ split($2, dev, "/"); print > (dev[3] ".txt") }'
    ) < "$src"
done
The awk script splits each line into fields delimited by semicolons.
It splits the second field on slashes to extract the device name (the assumption is that the devices always have the form /dev/name).
Finally, the > sends each line to the relevant file.
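For example, given the test1.txt data above, you should end up with a layout like this (keeping the /xx/... prefix from the question):
/xx/xx/xx/xx/xx/test1/hd1.txt
/xx/xx/xx/xx/xx/test1/hd2.txt
/xx/xx/xx/xx/xx/test1/hd3.txt
...
with each hdN.txt holding only the lines whose second field is /dev/hdN.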
For reference, you can make your script work by doing directory=$(...); adding the prefix to mkdir (assuming the prefix directories already exist); closing the brace in ${directory}; and quoting variable references for safety:
#!/bin/bash
directory=$(ls *.txt | cut -d'.' -f1)
for d in $directory    # unquoted on purpose: word splitting yields one name per file
do
    if [ ! -d "$d" ]
    then
        mkdir /xx/xx/xx/xx/xx/"$d"
    fi
done
for i in $(cat *.txt)
do
    file=$(echo "$i" | awk -F';' '{print $2}' | sort | uniq | cut -d'/' -f3 )
    data=$(echo "$i" | awk -F';' '{print $2}' )
    echo "$i" | grep -w "$data" >> /xx/xx/xx/xx/xx/"${directory}"/"${file}".txt
done
for file in `ls *.txt`
do
    echo ${file}
    directory=`echo ${file} | cut -d'.' -f1`
    #echo ${directory}
    if [ ! -d ${directory} ]
    then
        mkdir ${directory}
    fi
    FS=`cat ${file} | awk -F';' '{print $2}' | sort | uniq | cut -d'/' -f3`
    #echo $FS
    for f in $FS
    do
        cat ${file} | grep -w -e $f > ${directory}/${f}.txt
    done
done
Explanation:
For each file in the current directory, the outer for loop runs once.
Inside the loop, a corresponding directory is created first for the selected file.
Next, using the FS variable, we collect all the distinct file systems from that file.
Finally, an inner loop runs over the FS types to grep out and create separate file-system files in the directory.

Extract number from first line of a file in Linux

I have a file which has contents like the below
SPEC.2.ATTRID=REVISION&
SPEC.2.VALUE=5&
SPEC.3.ATTRID=NUM&
SPEC.3.VALUE=VS&
I am using the below command to extract only the numbers from the first line. Is this efficient, or can you think of an alternate way?
cat ticketspecdata | tr -d " " | tr -s "[:alpha:]" "~" | tr -d "[=.=]" | cut -d "~" -f2
Using grep :
$ grep -om1 '[0-9]\+' file
2
Or
head -n1 file | tr -cd '[:digit:]'
You may also want to read about UUOC:
http://porkmail.org/era/unix/award.html

How do I append some text to a pipe without a temporary file

I am trying to get the max version number from a directory where I have several versions of one program.
For example, if the output of ls is
something01_1.sh
something02_0.1.2.sh
something02_0.1.sh
something02_1.1.sh
something02_1.2.sh
something02_2.0.sh
something02_2.1.sh
something02_2.3.sh
something02_3.1.2.sh
something.sh
I am getting the max version number with the following:
ls somedir | grep some_prefix | cut -d '_' -f2 | sort -t '.' -k1 -r | head -n 1
Now if at the same time I want to check it against the version number already on the system, what's the best way to do it?
In bash I got this working (if 2.5 is the current version):
(ls somedir | grep some_prefix | cut -d '_' -f2; echo 2.5) | sort -t '.' -k1 -r | head -n 1
Is there any other correct way to do it?
EDIT: In the above example some_prefix is something02.
EDIT: The actual problem here is:
(ls smthing; echo more) | sort
Is this the best way to merge the output of two commands/programs for piping into a third?
I have found the solution. It seems the best way is to use process substitution.
cat <(ls smthing) <(echo more) | sort
For my version example:
cat <(ls somedir | grep some_prefix | cut -d '_' -f2) <(echo 2.5) | sort -t '.' -k1 -r | head -n 1
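As a side note, a brace group is another way to merge the two outputs, equivalent here to the parenthesised subshell from the question:
{ ls smthing; echo more; } | sort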
For the benefit of future readers, I recommend dropping the lure of the one-liner and using a glob as chepner suggested.
An almost identical question was asked on Super User.
More info about process substitution.
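To see what process substitution actually does, note that <(cmd) expands to a readable file name (a sketch; the exact /dev/fd number varies by shell and system):
echo <(true)              # prints a path like /dev/fd/63
cat <(echo a) <(echo b)   # reads both "files" in order and prints: a, then b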
Is the following code closer to what you're looking for?
#!/bin/bash
highest_version=$(ls something* | sort -V | tail -1 | sed "s/something02_\|\.sh//g")
current_version=$(echo "$0" | sed "s/something02_\|\.sh//g")
# [[ ]] does a string comparison; [ ... > ... ] would be parsed as a redirect
if [[ "$current_version" > "$highest_version" ]]; then
    echo "Uh oh! Looks like we need to update!"
fi
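Note that the [[ ... > ... ]] comparison is lexicographic, which mis-orders versions like 2.10 and 2.9. A more robust sketch, assuming GNU sort with -V is available, asks which version sorts last:
if [ "$(printf '%s\n' "$current_version" "$highest_version" | sort -V | tail -n1)" = "$current_version" ] \
   && [ "$current_version" != "$highest_version" ]; then
    echo "Uh oh! Looks like we need to update!"
fi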
You can try something like this:
#! /bin/bash
lastversion() { # prefix
    local prefix="$1" a=0 b=0 c=0 r f v vmax=0
    for f in "$prefix"* ; do
        test -f "$f" || continue
        read a b c r <<< $(echo "${f#$prefix} 0 0 0" | tr -C '[0-9]' ' ')
        v=$(((a*100+b)*100+c))
        if ((v>vmax)); then vmax=$v; fi
    done
    echo $vmax
}
lastversion "something02"
It will print: 30102
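To turn the packed value back into a dotted version, a small sketch that just inverts the (a*100+b)*100+c packing:
v=30102
printf '%d.%d.%d\n' $((v / 10000)) $((v / 100 % 100)) $((v % 100))   # prints 3.1.2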

Calculate word occurrences from a file in bash

I'm sorry for the very noob question, but I'm kind of new to bash programming (started a few days ago). Basically what I want to do is keep one file with all the word occurrences from another file.
I know I can do this:
sort | uniq -c | sort
The thing is that after that I want to take a second file, calculate the occurrences again, and update the first one. Then I take a third file, and so on.
What I'm doing at the moment works without any problem (I'm using grep, sed and awk), but it looks pretty slow.
I'm pretty sure there is a very efficient way with just a command or so, using uniq, but I can't figure it out.
Could you please lead me to the right way?
I'm also pasting the code I wrote:
#!/bin/bash
# counts the word occurrences from a file and writes them to another file #
# the words are listed from the most frequent to the least frequent #
touch .check            # used to check the occurrences. Temporary file
touch distribution.txt  # final file with all the occurrences calculated
page=$1                 # contains the file I'm calculating
occurrences=$2          # temporary file for the occurrences
# takes all the words from the file $page and orders them by occurrences
cat $page | tr -cs A-Za-z\' '\n'| tr A-Z a-z > .check
# loop to update the old file with the new information
# basically what I do is check word by word and add them to the old file as an update
cat .check | while read words
do
    word=${words}       # word I'm calculating
    strlen=${#word}     # word's length
    # I use a blacklist to skip banned words (for example very short ones or unimportant words, like articles and prepositions)
    if ! grep -Fxq $word .blacklist && [ $strlen -gt 2 ]
    then
        # if the word was never found before, write it with 1 occurrence
        if [ `egrep -c -i "^$word: " $occurrences` -eq 0 ]
        then
            echo "$word: 1" >> $occurrences
        # else update the count
        else
            old=`awk -v words=$word -F": " '$1==words { print $2 }' $occurrences`
            let "new=old+1"
            sed -i "s/^$word: $old$/$word: $new/g" $occurrences
        fi
    fi
done
rm .check
# finally it orders the words
awk -F": " '{print $2" "$1}' $occurrences | sort -rn | awk -F" " '{print $2": "$1}' > distribution.txt
Well, I'm not sure I've got the point of what you are trying to do, but I would do it this way:
while read file
do
    cat $file | tr -cs A-Za-z\' '\n'| tr A-Z a-z | sort | uniq -c > stat.$file
done < file-list
Now you have statistics for all your files, and you simply aggregate them:
while read file
do
    cat stat.$file
done < file-list \
    | sort -k2 \
    | awk '{if ($2!=prev) {print s" "prev; s=0;}s+=$1;prev=$2;}END{print s" "prev;}'
Example of usage:
$ for i in ls bash cp; do man $i > $i.txt ; done
$ cat <<EOF > file-list
> ls.txt
> bash.txt
> cp.txt
> EOF
$ while read file; do
> cat $file | tr -cs A-Za-z\' '\n'| tr A-Z a-z | sort | uniq -c > stat.$file
> done < file-list
$ while read file
> do
> cat stat.$file
> done < file-list \
> | sort -k2 \
> | awk '{if ($2!=prev) {print s" "prev; s=0;}s+=$1;prev=$2;}END{print s" "prev;}' | sort -rn | head
3875 the
1671 is
1137 to
1118 a
1072 of
793 if
744 and
533 command
514 in
507 shell
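A hedged alternative for the aggregation step, assuming the stat.* counts fit comfortably in memory: an awk associative array sums the counts directly, with no pre-sort needed:
awk '{count[$2] += $1} END {for (w in count) print count[w], w}' stat.* | sort -rn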

Get the Nth line from unzip -l

I have a jar file; I need to execute the files in it on Linux.
So I need to get the result of the unzip -l command line by line.
I have managed to extract the files names with this command :
unzip -l package.jar | awk '{print $NF}' | grep com/tests/[A-Za-Z] | cut -d "/" -f3 ;
But I can't figure out how to obtain the file names one after another to execute them.
How can I do it, please?
Thanks a lot.
If all you need is the first row, add a pipe and get the first line using head -1.
So your one-liner will look like:
unzip -l package.jar | awk '{print $NF}' | grep com/tests/[A-Za-Z] | cut -d "/" -f3 |head -1;
That will give you the first line.
Now combine head and tail to get the second line:
unzip -l package.jar | awk '{print $NF}' | grep com/tests/[A-Za-Z] | cut -d "/" -f3 |head -2 | tail -1;
That gets you the second line.
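For reference, sed can also pull out the Nth line directly, which avoids pairing head with tail (here N=2):
unzip -l package.jar | awk '{print $NF}' | grep com/tests/[A-Za-Z] | cut -d "/" -f3 | sed -n '2p'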
But from a scripting point of view this is not a good approach. What you need is a loop, as below:
for class in `unzip -l el-api.jar | awk '{print $NF}' | grep javax/el/[A-Za-Z] | cut -d "/" -f3`; do echo $class; done;
You can replace echo $class with whatever command you wish, and use $class to get the current class name.
HTH
Here is my attempt, which also takes into account Daddou's request to remove the .class extension:
unzip -l package.jar | \
awk -F'/' '/com\/tests\/[A-Za-z]/ {sub(/\.class/, "", $NF); print $NF}' | \
while read baseName
do
echo " $baseName"
done
Notes:
The awk command also handles the tasks of grep and cut
The awk command also handles the removal of the .class extension
The result of the awk command is piped into the while read... command
baseName represents the name of the class file, with the .class extension removed
Now, you can do something with that $baseName
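For instance, if the end goal is to run each class, a hypothetical invocation (assuming the classes live under the com.tests package and each has a main method) could replace the echo:
java -cp package.jar "com.tests.$baseName"   # hypothetical: package name assumed from the grep pattern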
