I would like to calculate a percentage in the shell, but I can't get it to work. My script is:
#n1=$(wc -l < input.txt) #input.txt is a text file with 10000 lines
n2=$(awk '{printf "%.2f", $n1*0.05/100}')
echo 0.05% of $n1 is $n2
When I run this script, it neither prints a value nor terminates.

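Because $n1 is inside single quotes, the shell never expands it: awk sees its own unset variable n1 and may complain about an illegal field name.
Your command also never terminates because awk, given no input file, waits on stdin. Pass the value with -v and do the arithmetic in a BEGIN block, which runs before any input is read; /dev/null as the file is an extra safeguard. Then: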
n2=$(awk -v n1="$n1" 'BEGIN {printf "%.2f", n1*0.05/100}' /dev/null)

Rather than starting a separate process to count the records in the file and then passing the count to awk, I would suggest letting awk count the records itself, which it does anyway in the variable NR. Your entire script then becomes:
percentage=$(awk 'END{print NR*0.05/100}' input.txt)
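To check that the two approaches agree, here is a minimal sketch. The temporary sample file (generated with seq) and the awk variable name pct are assumptions made for illustration; they are not part of the original script.

```shell
# Build a throwaway 10000-line input file (stand-in for input.txt).
input=$(mktemp)
seq 1 10000 > "$input"

# Approach 1: count lines in the shell, hand the count to awk with -v.
n1=$(wc -l < "$input")
n2=$(awk -v n1="$n1" 'BEGIN { printf "%.2f", n1 * 0.05 / 100 }')

# Approach 2: let awk count the records itself via NR.
# pct is a hypothetical variable name for the percentage.
percentage=$(awk -v pct=0.05 'END { printf "%.2f", NR * pct / 100 }' "$input")

echo "0.05% of $n1 is $n2 (awk-only result: $percentage)"
rm -f "$input"
```

Passing the percentage with -v keeps the awk program itself free of shell expansions, so it can stay inside single quotes.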


Rename file as third word on it (bash)

I have several autogenerated files and I want to rename each of them according to the third word in its first line (in this case, that would be 42.txt).
First line:
ligand CC##HOc3ccccc3 42 P10000001
Is there a way to do it?
Say you have file.txt containing:
ligand CC##HOc3ccccc3 42 P10000001
and you want to rename file.txt to 42.txt based on the 3rd field in the file.
Using awk
The easiest way is to use mv with awk in a command substitution, e.g.:
mv file.txt "$(awk 'NR==1 {print $3; exit}' file.txt).txt"
Here the command substitution $(...) runs awk 'NR==1 {print $3; exit}' file.txt, which simply outputs the third field (e.g. 42); the double quotes keep the result as a single argument to mv. Specifying NR==1 ensures only the first line is considered, and the exit at the end of that rule stops awk from reading further lines, which saves time if file.txt is a 100000-line file.
file.txt is now renamed to 42.txt, e.g.
$ cat 42.txt
ligand CC##HOc3ccccc3 42 P10000001
Using read
You can also use read to simply read the first line and take the 3rd word as the name there and then mv the file, e.g.
$ read -r a a name a <file.txt; mv file.txt "$name".txt
The temporary variable a above is just used to read and discard the other words in the first line of the file.
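Since the question mentions several files, the read approach can be wrapped in a loop. This is only a sketch: the temporary directory, the sample file contents, and the decision to skip a file when the target name already exists are all assumptions, not something the question specifies.

```shell
# Throwaway directory with two sample files (contents are illustrative).
workdir=$(mktemp -d)
printf 'ligand CC##HOc3ccccc3 42 P10000001\nmore data\n' > "$workdir/a.txt"
printf 'ligand CCO 7 P10000002\n' > "$workdir/b.txt"

for f in "$workdir"/*.txt; do
    # Read only the first line; the third word lands in "name".
    read -r first second name rest < "$f"
    if [ -z "$name" ]; then
        echo "skipping $f: no third word on the first line" >&2
        continue
    fi
    target="$workdir/$name.txt"
    # Assumed policy: never overwrite an existing file.
    if [ -e "$target" ]; then
        echo "skipping $f: $target already exists" >&2
        continue
    fi
    mv -- "$f" "$target"
done

ls "$workdir"
```

The glob is expanded once before the loop starts, so files that have already been renamed are not visited a second time.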

numeric variable in egrep regular expression bash script

So I am trying to make a script that contains egrep and accepts a numeric variable
list="egrep "^.{$var}$ /usr/share/dict/words"
cat list
For example, if var is 5, I would like this script to print out every line with 5 characters. For some reason the script does not do that. Help would be greatly appreciated!
Your script doesn't work because there are several problems with these lines:
list="egrep "^.{$var}$ /usr/share/dict/words"
cat list
The first line is incomplete: it is missing a closing quote.
Even if you fixed that, you would be assigning a literal string to list, not the output of a command.
The regular expression and the filename must be separate arguments.
cat does not print a variable's content; echo does.
list="$(egrep '^.{'"$var"'}$' /usr/share/dict/words)"
echo "$list"
should work.
Or even better, you can use just an awk command:
awk 'length==5' /usr/share/dict/words
or, with the length taken from $1 or any other shell variable:
awk -v n="$1" 'length==n' /usr/share/dict/words
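As a sketch of the grep variant with a basic sanity check on the variable: the small word list below is a hypothetical stand-in for /usr/share/dict/words, and grep -E is used as the POSIX spelling of egrep. The validation step is an addition for illustration, not part of the answer above.

```shell
# Small stand-in word list so the sketch does not depend on /usr/share/dict/words.
words=$(mktemp)
printf '%s\n' apple pear grape fig melon > "$words"

var=5

# Only build the pattern if var is a non-negative integer.
case $var in
    ''|*[!0-9]*)
        echo "var must be a non-negative integer" >&2
        list=
        ;;
    *)
        # Double quotes let the shell expand $var; \$ keeps the regex anchor literal.
        list=$(grep -E "^.{$var}\$" "$words")
        ;;
esac

printf '%s\n' "$list"
rm -f "$words"
```

Checking the variable first avoids handing grep a malformed interval expression such as ^.{abc}$.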

Length comparison of one specific field in linux

I was trying to check the length of the second field of a TSV file (hundreds of thousands of lines). However, it runs very slowly. I guess something is wrong with "echo", but I'm not sure how to fix it.
Input file:
prob name
1.0 Claire
1.0 Mark
... ...
So I need to print out what went wrong in the name. I tested with a small example using "head -100" and it worked, but it just can't cope with the original file.
This is what I ran:
for title in `cat filename | cut -f2`;do
length=`echo -n $line | wc -m`
if [ "$length" -gt 10 ];then
echo $line
awk to the rescue:
awk 'length($2)>10' file
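This prints every line whose second field is longer than 10 characters.
Note that it doesn't require an action block {...}: when the condition is met, awk prints the line by default. Also note that awk splits fields on any whitespace by default; for a tab-separated file whose names may contain spaces, add -F'\t'.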
Try this:
awk '{if (length($2) > 10) print $0;}' file.tsv
This should be a bit faster since the whole processing is done by the single awk process, while your solution starts 2 processes per loop iteration to make that comparison.
We can use awk if that helps.
awk '{if(length($2) > 10){print}}' filename
Here $2 is the second field of each line in filename. awk evaluates the condition once per line within a single process, so it is faster than the shell loop.
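Because the input is described as a TSV file, a minimal sketch with an explicit tab separator may be useful. The sample rows (including a name that contains spaces) and the choice to skip the header line with NR > 1 are assumptions for illustration.

```shell
# Sample TSV with a header row; one name contains spaces.
tsv=$(mktemp)
printf 'prob\tname\n1.0\tClaire\n1.0\tMark\n1.0\tAnna Maria Lopez\n' > "$tsv"

# -F'\t' splits on tabs only, so "Anna Maria Lopez" stays one field.
# NR > 1 skips the header line.
long_names=$(awk -F'\t' 'NR > 1 && length($2) > 10' "$tsv")

printf '%s\n' "$long_names"
rm -f "$tsv"
```

With the default whitespace splitting, $2 for the last row would be just "Anna", and the row would not be reported.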

Using awk command in Bash

I'm trying to loop an awk command using bash script and I'm having a hard time including a variable within the single quotes for the awk command. I'm thinking I should be doing this completely in awk, but I feel more comfortable with bash right now.
while [ $index -le 13 ]
awk "'"/^$index/ {print}"'" text.txt
Use the standard approach -- -v option of awk to set/pass the variable:
awk -v idx="$index" '$0 ~ "^"idx' text.txt
Here I have set the awk variable idx to the value of the shell variable $index. Inside awk, I simply use idx as an awk variable.
$0 ~ "^"idx matches if the record starts with (^) whatever the variable idx contains; if so, awk prints the record.
awk '/^'"$index"'/' text.txt
# The awk program is split into two single-quoted pieces, with the
# double-quoted shell variable sandwiched in between.
# awk prints matching records by default, so idiomatic awk omits the '{print}' too.
should do. Alternatively, use grep:
grep "^$index" text.txt # Mind the double quotes
Note: -le compares integers, so you may change index="1" to index=1.
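Putting the -v approach back into a complete loop might look like the sketch below. The sample file and the upper bound of 3 (instead of 13, to keep the sample small) are assumptions. Keep in mind that this is a prefix match, so an index of 1 would also match lines starting with 10, 11, and so on.

```shell
# Small stand-in for text.txt.
sample=$(mktemp)
printf '%s\n' '1 alpha' '2 beta' '3 gamma' '4 delta' > "$sample"

index=1
matches=$(
    while [ "$index" -le 3 ]; do
        # Print every record that starts with the current index.
        awk -v idx="$index" '$0 ~ "^" idx' "$sample"
        index=$((index + 1))
    done
)

printf '%s\n' "$matches"
rm -f "$sample"
```

Each iteration starts a new awk process and rereads the file, which is fine for small inputs; for large files a single awk pass that handles the whole range would be cheaper.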

echo"What is the record ID?"
read rID
numA= awk -f "%" '{print $1'}< practice.txt
I cannot figure out how to set numA = to the output of the awk in order to compare rID and numA. numA is equal to the first field of a txt file which is separated by %. Any suggestions?
You can capture the output of any command in a variable via command substitution:
numA=$(awk -F '%' '{print $1}' < practice.txt)
Unless your file contains only one line, however, the awk command you presented (as corrected above) is unlikely to be what you want to use. If the practice.txt file contains, say, answers to multiple questions, one per line, then you probably want to structure the script altogether differently.
You don't need to use awk, just use parameter expansion:
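A minimal sketch of that idea, assuming only the first line of practice.txt matters and that fields are separated by %. The variable name first_line and the sample contents are hypothetical.

```shell
# Sample file with %-separated fields (illustrative contents).
practice=$(mktemp)
printf '101%%Alice%%A\n102%%Bob%%B\n' > "$practice"

# Read the first line verbatim; no awk process is needed.
IFS= read -r first_line < "$practice"

# Strip the longest suffix that starts at a literal %, leaving the first field.
numA=${first_line%%\%*}

echo "$numA"
rm -f "$practice"
```

The backslash in \%* makes it explicit that the % is part of the pattern rather than part of the %% operator.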
This is the correct syntax:
numA=$(awk -F'%' '{print $1}' practice.txt)
However, it will be easier to do the comparison in awk by passing the bash variable in:
awk -F'%' -v r="$rID" '$1==r{... do something ...}' practice.txt
Since you didn't specify any details, it's difficult to suggest more.
To remove the line matching rID from the file, do this:
awk -F'%' -v r="$rID" '$1!=r' practice.txt > output
This prints the lines where the condition is met ($1 not equal to rID), which is equivalent to deleting the ones where it is equal. You can mimic in-place replacement with
awk ... practice.txt > temp && mv temp practice.txt
where you fill in ... from the line above.
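Combining the comparison and the removal, a sketch under stated assumptions could look like this. The sample records, the hard-coded rID (in place of the interactive read), and the status variable are made up for illustration.

```shell
# Sample file with %-separated records (illustrative contents).
practice=$(mktemp)
printf '101%%Alice\n102%%Bob\n103%%Carol\n' > "$practice"

rID=102   # stands in for the value the user would type at the prompt

# Exit status 0 if some record has rID as its first field, 1 otherwise.
if awk -F'%' -v r="$rID" '$1 == r { found = 1; exit } END { exit !found }' "$practice"; then
    tmp=$(mktemp)
    # Keep every record whose first field differs from rID, then swap the files.
    awk -F'%' -v r="$rID" '$1 != r' "$practice" > "$tmp" && mv "$tmp" "$practice"
    status="removed"
else
    status="not found"
fi

echo "record $rID: $status"
remaining=$(cat "$practice")
```

Using awk's exit status for the check means the shell never has to parse the file contents itself.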
Try using
$ numA=`awk -F'%' '{ if($1 != $0) { print $1; exit; }}' practice.txt`
From the question, "numA is equal to the first field of a txt file which is separated by %"
-F'%', meaning % is the only separator we care about
if($1 != $0), meaning ignore lines that don't have the separator
print $1; exit;, meaning exit after printing the first field that we encounter separated by %. Remove the exit if you don't want to stop after the first field.
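To see how the $1 != $0 guard behaves, here is a small sketch. The sample file, which starts with a line that has no separator, and the hard-coded rID are assumptions; $(...) is used instead of backticks but does the same job.

```shell
# First line has no % separator, so it should be ignored.
practice=$(mktemp)
printf 'header without separator\n101%%Alice\n102%%Bob\n' > "$practice"

# Print the first field of the first line that actually contains %, then stop.
numA=$(awk -F'%' '$1 != $0 { print $1; exit }' "$practice")

rID=101   # stands in for the value read from the user
if [ "$numA" = "$rID" ]; then
    result="match"
else
    result="no match"
fi

echo "$result"
rm -f "$practice"
```

On a line without %, the whole line is the first field, so $1 equals $0 and the rule does not fire.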
