I have a problem. I have a file in this format:
Hi / Tom /
Be / Nice /
...
I need to delete the "/" characters and the extra " " (space) characters, then sort the lines:
Be Nice
Hi Tom
...
for sorted_word in $(for word in $(sed -e 's/\/ //g' path_to_file); do printf "%s\n" ${word}; done | sort); do printf "%s " ${sorted_word}; done ; printf "%s\n"
You can use the tr command, like this (note that this sorts the individual words and prints them on one line):
cat inputfile.txt | tr -d "/" | tr -s " " "\n" | sort | tr "\n" " "
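If the goal is to keep each line intact and sort whole lines, as the expected output in the question suggests, a sed-based variant may be closer. This is only a sketch: the sample file is a hypothetical stand-in created with mktemp, and it assumes the fields are separated by "/" with optional spaces around it.

```shell
# Hypothetical sample input, standing in for the real file
input=$(mktemp)
printf '%s\n' 'Hi / Tom /' 'Be / Nice /' > "$input"

# Collapse each "/" (with any surrounding spaces) into one space,
# trim trailing spaces, then sort whole lines
sorted=$(sed -e 's/ *\/ */ /g' -e 's/ *$//' "$input" | sort)
printf '%s\n' "$sorted"
# prints:
# Be Nice
# Hi Tom

rm -f "$input"
```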
So I have these two functions: cMinOption and cMaxOption.
cMinOption puts all words from pt.stop_words.txt into myArray.
cMaxOption shows a list of words from ficha01.pdf.txt, and that list also includes the words from pt.stop_words.txt.
I want to know how I can remove the words stored in the array from the list that cMaxOption displays.
cMinOption(){
declare -a myArray
mapfile -t myArray < pt.stop_words.txt
sed -i 's/myArray/""/g' ficha01.pdf.txt
sed -e 's/[^[:alpha:]]/ /g' ficha01.pdf.txt | tr '\n' " " | tr -s " " | tr " " '\n'| tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | nl | head -n 7
for ((i=0; i<204; i++)); do
echo "Element [$i]: ${myArray[$i]}";
done
}
# Called when -C is passed as argument
cMaxOption(){
# Check if the file passed in $1 is a PDF file
if [ $(head -c 4 "$1") = "%PDF" ]; then
pdftotext $1 $1.txt
echo "'$1': PDF file";
file="$1.txt"
else
echo "'$1': TXT file";
file="$1"
fi
echo "[INFO] processing '$file'"
echo "STOP WORDS will be counted"
echo "COUNT MODE"
sed -e 's/[^[:alpha:]]/ /g' $file | tr '\n' " " | tr -s " " | tr " " '\n'| tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | nl | head -n 7
mv $file result---$file
echo "RESULTS: 'result---$file' "
ls -l result---$file
echo -e " $(sort -u result---$file | wc -l) distinct words"
}
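One possible way to drop the stop words is to filter the word stream with grep before counting, rather than editing the text with sed. The sketch below is not taken from the question: the two temporary files are hypothetical stand-ins for pt.stop_words.txt and ficha01.pdf.txt, and it assumes the stop-word file holds one lowercase word per line.

```shell
# Hypothetical stand-ins for pt.stop_words.txt and ficha01.pdf.txt
stop_words=$(mktemp)
text=$(mktemp)
printf '%s\n' de a o > "$stop_words"
printf '%s\n' 'O gato de casa' 'A casa de campo' > "$text"

# Pipeline modeled on the one in the question, with one extra grep:
#   -f FILE  read patterns from FILE    -F  treat them as fixed strings
#   -x       match whole lines only     -v  keep the lines that do NOT match
counts=$(sed -e 's/[^[:alpha:]]/ /g' "$text" \
    | tr -s ' ' '\n' \
    | tr 'A-Z' 'a-z' \
    | grep -vxFf "$stop_words" \
    | sort | uniq -c | sort -nr)
printf '%s\n' "$counts"

rm -f "$stop_words" "$text"
```

With -x, a stop word such as "a" removes only the standalone word and leaves words like "casa" untouched.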
I want to create a file, for.sh, from within a gnuplot script while it is running. The gnuplot script will then call the new file as
set xtics (`(sh for.sh)`)
The contents of for.sh should be:
#!/bin/bash
rm Final-X-ticks.dat
KPATH=data1
TICK_position=data2
sed -n '/crystal coordinates with respect/,/REPRODUCED(TRANSFORMED) DATASET/p' $KPATH | sed '/crystal/d' | sed '/REPRODUCED/d' | awk '{print $4}' | sed -r '/^\s*$/d' > symbl.dat
line=$(cat symbl.dat | wc -l)
for i in `seq 1 1 $line`
do
echo " '{/Times-New " > symbl-$i.dat
cat symbl.dat | tail -n $line | head -n "$i" | tail -n 1 >> symbl-$i.dat
echo " }'" >> symbl-$i.dat
grep -i "coordinate" $TICK_position | awk '{print $NF}' | head -n "$i" | tail -n 1 >> kpoints-$i.dat
cat kpoints-$i.dat | tail -n 1 | awk 'BEGIN { ORS = " " } { print }' >> symbl-$i.dat
echo " ," >> symbl-$i.dat
cat symbl-$i.dat | awk 'BEGIN { ORS = " " } { print }' | awk '{gsub(/Times-New[ \t]+(G|\$\\Gamma\$)/, "Symbol G")} 1' | awk 'BEGIN { ORS = " " } { print }' >> Final-X-ticks.dat
done
rm symbl-* symbl.dat kpoints-*
I tried with
cat > for.sh <<EOF
above contents
EOF
but it gives me an error:
cat for.sh << EOF
^
"plot.gnu", line 19: invalid command
You can use a datablock to hold the text, then print it out. For example,
$MYSCRIPT <<EOF
#!/bin/bash
rm Final-X-ticks.dat
...
rm symbl-* symbl.dat kpoints-*
EOF
set print "for.sh"
print $MYSCRIPT
set print
set xtics (`(bash ./for.sh)`)
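For comparison, the here-document in the question is shell syntax, which is why gnuplot reports an invalid command. Run from a shell instead of from gnuplot, it would look roughly like the sketch below. The scratch directory and the shortened script body are hypothetical stand-ins, not the real for.sh. Quoting the delimiter ('EOF') is an assumption about what is wanted here: it keeps the shell from expanding $KPATH and $(...) while the file is written.

```shell
# Hypothetical scratch directory for the demo
workdir=$(mktemp -d)

# Quoted delimiter: the body is written out literally, with no expansion
cat > "$workdir/for.sh" <<'EOF'
#!/bin/bash
KPATH=data1
line=$(wc -l < "$KPATH")
echo "$line"
EOF

chmod +x "$workdir/for.sh"
```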
My aim is to echo a character, for example #, a number of times given by a value such as num=6, so that # is printed 6 times on the screen.
I'm not sure how to do this.
You could do something like:
printf '#%.0s' {1..6}
or, in the more general case,
printf '#%.0s' $(seq 1 $num)
printf "%*s" "$num" " " | tr " " "#"
or
yes '#' | head -n "$num" | tr -d "\n"
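The seq form is needed for a variable count because brace expansion happens before variable expansion, so {1..$num} is not expanded into a sequence. One caveat with the %.0s approach: when printf receives no arguments at all, it still prints the format once, so a count of 0 would output a single #. A minimal sketch with a guard for that case follows; the function name hashes is a hypothetical helper, not part of the answers above.

```shell
# hashes N: print '#' N times (hypothetical helper name)
hashes() {
    # Guard: with no arguments, printf '#%.0s' would still print one '#'
    if [ "$1" -gt 0 ]; then
        # %.0s consumes one argument and prints zero characters of it,
        # so the format (and its '#') is reused once per argument
        printf '#%.0s' $(seq 1 "$1")
    fi
}

num=6
bar=$(hashes "$num")
echo "$bar"    # prints ######
```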
I am trying to figure out why this does not work and how to fix it. I have a long list of dates in a variable and would like to count the number of occurrences using grep, but splitting the variable onto new lines does not seem to work as expected. For example:
$ list="2015-a 2015-b 2016-a" ; count=`echo $list | tr " " \\n | grep 2015 | wc -l` ; echo $count
1
$ list="2015-a,2015-b,2016-a" ; count=`echo $list | tr , \\n | grep 2015 | wc -l` ; echo $count
1
$ list="2015-a,2015-b,2016-a" ; count=`echo $list | sed s/,/\\n/g | grep 2015 | wc -l` ; echo $count
1
Any ideas?
The problem is with the way backticks interpret \\:
Backslashes (\) inside backticks are handled in a non-obvious manner:
$ echo "`echo \\a`" "$(echo \\a)"
a \a
$ echo "`echo \\\\a`" "$(echo \\\\a)"
\a \\a
# Note that this is true for *single quotes* too!
$ foo=`echo '\\'`; bar=$(echo '\\'); echo "foo is $foo, bar is $bar"
foo is \, bar is \\
So instead of saying:
$ echo "`echo $list | tr " " \\n`"
2015-an2015-bn2016-a
You have to say:
$ echo "`echo $list | tr " " \\\\n`"
2015-a
2015-b
2016-a
That said, it is best to use $(...), because backticks are the legacy form:
$ echo "$(echo $list | tr " " '\n')"
2015-a
2015-b
2016-a
If you still want to use backticks, the cleanest solution is to wrap \n in double quotes instead of escaping it as \\\\n:
$ echo "`echo $list | tr " " "\n"`"
2015-a
2015-b
2016-a
All of this can be read in "Why is $(...) preferred over `...` (backticks)?".
All in all, if you just want to count how many words contain 2015 you may consider using grep -o as suggested in the comments or maybe something more robust like this awk:
awk '{for (i=1;i<=NF;i++) if ($i~2015) count++; print count}'
See some examples:
$ awk '{for (i=1;i<=NF;i++) if ($i~2015) s++; print s}' <<< "2015-a 2015-b 2016-a"
2
$ awk '{for (i=1;i<=NF;i++) if ($i~2015) s++; print s}' <<< "2015-a 2015-b 2016-a 20152015-c"
3
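Applied to the original question, the same count can be written with $(...) so that '\n' reaches tr unchanged. This is a minimal sketch that assumes the comma-separated form of the list and uses grep -c to count matching lines instead of piping to wc -l:

```shell
list="2015-a,2015-b,2016-a"

# Inside $(...) the quoted '\n' is passed to tr exactly as written
count=$(printf '%s\n' "$list" | tr ',' '\n' | grep -c 2015)
echo "$count"    # prints 2
```

Like the awk version, this counts entries rather than occurrences, so an entry such as 20152015-c would be counted once.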
I want changes_summary to always be in the format <x> files changed, <y> insertion(+), <z> deletions(-), where <x>, <y> and <z> are numbers. However, diffstat omits the insertions and/or deletions part when <y> and/or <z> is zero. I tried to make it always print <x> files changed, 0 insertion(+), 0 deletions(-). Is there a better or easier way to do this? I would like to change the $changes_summary variable itself so I can use it in a later part of the script.
changes_summary=`diff -ur ./dir1 ./dir2 | diffstat | tail -1`
if ! echo $changes_summary | grep -q "insertions" && ! echo $changes_summary | grep -q "deletions" ; then
echo $changes_summary | awk '{print $1 " " $2 " " $3 " " "0 insertion(+)," " " "0 deletions(-)"}'
elif ! echo $changes_summary | grep -q "insertions" && echo $changes_summary | grep -q "deletions" ; then
echo $changes_summary | awk '{print $1 " " $2 " " $3 " " "0 insertion(+), "$4 " " $5 }'
elif echo $changes_summary | grep -q "insertions" && ! echo $changes_summary | grep -q "deletions" ; then
echo $changes_summary | awk '{print $1 " " $2 " " $3 " " $4 " " $5 ", 0 deletions(-)" }'
fi
Probably the closest you can get without some serious bash magic or another language is something like the following.
changes_summary=`diff -ur ./dir1 ./dir2 | diffstat -s`
CC=$(echo "$changes_summary" | sed -n 's:\(.*[0-9]\+ .* changed\).*:\1:p')
II=$(echo "$changes_summary" | sed -n 's:.*[^0-9]\([0-9]\+ insertions\?\).*:\1:p')
DD=$(echo "$changes_summary" | sed -n 's:.*[^0-9]\([0-9]\+ deletions\?\).*:\1:p')
echo "${CC}, ${II:-0 insertions}(+), ${DD:-0 deletions}(-)"
sed extracts the part of the message corresponding to each stat. The -n suppresses the normal output, and p prints only if a match is found. The [^0-9] in front of the insertions and deletions groups stops the greedy .* from swallowing the leading digits of a multi-digit count. If there is no match, CC, II or DD will be empty, in which case the ${II:-...} expansion substitutes a default value.
From man bash:
${parameter:-word} Use Default Values. If parameter is unset or null,
the expansion of word is substituted. Otherwise, the value of
parameter is substituted.
Note that handling the optional plural s with s\? might be overkill for you.
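The default-value expansion quoted from man bash can be tried in isolation. In this small sketch the values assigned to II and DD are made up for illustration: one variable is empty and the other is set.

```shell
II=""               # empty: no insertions were matched
DD="2 deletions"    # set: the matched text is used as-is

# ${var:-word} expands to word when var is unset or null
summary="${II:-0 insertions}(+), ${DD:-0 deletions}(-)"
echo "$summary"     # prints 0 insertions(+), 2 deletions(-)
```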
The other option is that in bash you can test for a substring with [[ $a =~ "b" ]] and keep your original approach. That at least spares you the greps, and "b" is treated as a regex if you drop the quotes.
if ! [[ "$changes_summary" =~ "insert" ]]; then
awk ...
fi
You can also find the =~ in man bash.
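Taking the =~ idea a step further, the numbers can be captured with BASH_REMATCH and the line rebuilt from scratch. This is a bash-only sketch, not part of the answer above: the function name normalize_summary and the fixed plural wording of the output are assumptions, and the patterns are kept in variables so the spaces in them need no escaping.

```shell
#!/bin/bash
# normalize_summary LINE: hypothetical helper that rebuilds a diffstat
# summary line, filling in 0 for any part that diffstat left out
normalize_summary() {
    local s=$1 files=0 ins=0 del=0
    local re_files='([0-9]+) files? changed'
    local re_ins='([0-9]+) insertions?'
    local re_del='([0-9]+) deletions?'
    # An unquoted right-hand side of =~ is treated as an extended regex;
    # the first parenthesized group lands in BASH_REMATCH[1]
    [[ $s =~ $re_files ]] && files=${BASH_REMATCH[1]}
    [[ $s =~ $re_ins ]] && ins=${BASH_REMATCH[1]}
    [[ $s =~ $re_del ]] && del=${BASH_REMATCH[1]}
    printf '%s files changed, %s insertions(+), %s deletions(-)\n' \
        "$files" "$ins" "$del"
}

changes_summary=$(normalize_summary ' 3 files changed, 10 insertions(+)')
echo "$changes_summary"
# prints 3 files changed, 10 insertions(+), 0 deletions(-)
```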