size=$(wc -l < "$1")
if [ "$size" -gt 0 ]
then
tr "[:lower:]" "[:upper:]" < $1 > output
for (( i=1; i <= "$size"; ++i ))
do
echo "Line " $i $(head -"$i" > output | tail -1 > output)
done
Hi, guys!
I have a problem with this little code. Everything works fine except the head-tail thing. What I want to do is simply display line number i from a file.
The result I receive is just the last line ($size).
I think maybe something is wrong with the input of tail. The head -"$i" doesn't stop at the specified line. :(
Any thoughts?
Ohhhh... I just realised: as input for my tail I give the same input as for head.
The solution is to give tail the result from head. How do I do that? :-/
You don't need to redirect the output from head to a file; if you do, the pipe doesn't get any input at all. Also, use >> to append results, otherwise each iteration of the loop will overwrite the file. But make sure to delete the output file before each new call to the script, or you will keep appending to it indefinitely.
echo "Line " $i $(head -"$i" $infile | tail -1 >> output)
Use read to fetch a line of input from the file.
# Since `1` is always true, essentially count up forever
for ((i=1; 1; ++i)); do
# break when a read fails to read a line
IFS= read -r line || break
echo "Line $i: $(tr [:lower:] [:upper:])"
done < "$1" > output
A more standard approach is to iterate over the file and maintain i explicitly.
i=1
while IFS= read -r line; do
echo "Line $i: $(tr [:lower:] [:upper:])"
((i++))
done < "$1" > output
I think you're re-implementing cat -n with prefix "Line ". If so, awk to the rescue!
awk '{print "Line "NR, tolower($0)}'
I made it. :D
The trick is to send the output of head to another file that then serves as the input for tail, like this:
echo "Line " $i $(head -"$i" < output >outputF | tail -1 < outputF)
Your questions made me think differently. Thank you!
I need a shell script that reads in a number and compares it with the numbers in another file.
Here is an example:
I have a file called numbers.txt, which contains the following:
name;type;value;description
samsung;s5;1500;blue
iphone;6;1000;silver
I read in a number, for example 1200, and it should print out the values from the file which are less than 1200 (in my example it should print out 1000).
Here is the code that I started to write, but I don't know how to finish it.
echo " Enter a number"
read num
if [ $numbersinthefile -le $num ]; then
echo "$numbersinthefile"
I hope I defined my question properly. Can somebody help me?
Use:
#!/bin/bash
echo -n "Enter the number: "
read num
awk -F';' -v num="$num" 'NR > 1 && $3 < num' numbers.txt
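With the numbers.txt shown above and 1200 as input, this skips the header line (that is the NR > 1 test) and should print just the matching line (assuming you saved the script as script.sh):
$ ./script.sh
Enter the number: 1200
iphone;6;1000;silver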
Try this: first use sed to remove the first line, then use cut to get the actual number from each line and compare that number to the input.
echo " Enter a number"
read num
sed '1d' numbers.txt | while read line; do
numbersinthefile=$(echo "$line" | cut -d';' -f3)
if [ "$numbersinthefile" -lt "$num" ]; then
echo "$line"
fi
done
I have a file example.txt with about 3000 lines with a string in each line. A small file example would be:
>cat example.txt
saudifh
sometestPOIFJEJ
sometextASLKJND
saudifh
sometextASLKJND
IHFEW
foo
bar
I want to check all repeated lines in this file and output them. The desired output would be:
>checkRepetitions.sh
found two equal lines: index1=1 , index2=4 , value=saudifh
found two equal lines: index1=3 , index2=5 , value=sometextASLKJND
I made a script checkRepetitions.sh:
#!/bin/bash
size=$(cat example.txt | wc -l)
for i in $(seq 1 $size); do
i_next=$((i+1))
line1=$(cat example.txt | head -n$i | tail -n1)
for j in $(seq $i_next $size); do
line2=$(cat example.txt | head -n$j | tail -n1)
if [ "$line1" = "$line2" ]; then
echo "found two equal lines: index1=$i , index2=$j , value=$line1"
fi
done
done
However this script is very slow, it takes more than 10 minutes to run. In python it takes less than 5 seconds... I tried to store the file in memory by doing lines=$(cat example.txt) and doing line1=$(cat $lines | cut -d',' -f$i) but this is still very slow...
When you do not want to use awk (a good tool for this job, since it parses the input only once),
you can run through the file several times instead. Sorting is expensive, but this solution avoids the nested loops you tried.
grep -Fnxf <(uniq -d <(sort example.txt)) example.txt
With uniq -d <(sort example.txt) you find all lines that occur more than once. Next, grep will search for these (option -f) complete (-x) lines without regular expressions (-F) and show the line number where each occurs (-n).
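Against the example.txt above, the pipeline prints each duplicated line together with its line number:
$ grep -Fnxf <(uniq -d <(sort example.txt)) example.txt
1:saudifh
3:sometextASLKJND
4:saudifh
5:sometextASLKJND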
See why-is-using-a-shell-loop-to-process-text-considered-bad-practice for some of the reasons why your script is so slow.
$ cat tst.awk
{ val2hits[$0] = val2hits[$0] FS NR }
END {
for (val in val2hits) {
numHits = split(val2hits[val],hits)
if ( numHits > 1 ) {
printf "found %d equal lines:", numHits
for ( hitNr=1; hitNr<=numHits; hitNr++ ) {
printf " index%d=%d ,", hitNr, hits[hitNr]
}
print " value=" val
}
}
}
$ awk -f tst.awk file
found 2 equal lines: index1=1 , index2=4 , value=saudifh
found 2 equal lines: index1=3 , index2=5 , value=sometextASLKJND
To give you an idea of the performance difference, here is a bash script written to be as efficient as possible and an equivalent awk script:
bash:
$ cat tst.sh
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
(( ++lineNum ))
if [[ ${lines[$line]} ]]; then
printf 'Content previously seen on line %s also seen on line %s: %s\n' \
"${lines[$line]}" "$lineNum" "$line"
fi
lines[$line]=$lineNum
done < "$1"
$ time ./tst.sh file100k > ou.sh
real 0m15.631s
user 0m13.806s
sys 0m1.029s
awk:
$ cat tst.awk
lines[$0] {
printf "Content previously seen on line %s also seen on line %s: %s\n", \
lines[$0], NR, $0
}
{ lines[$0]=NR }
$ time awk -f tst.awk file100k > ou.awk
real 0m0.234s
user 0m0.218s
sys 0m0.016s
There are no differences between the outputs of the two scripts:
$ diff ou.sh ou.awk
$
The above uses third-run timing to avoid caching issues and was tested against a file generated by the following awk script:
awk 'BEGIN{for (i=1; i<=10000; i++) for (j=1; j<=10; j++) print j}' > file100k
When the input file had zero duplicate lines (generated by seq 100000 > nodups100k), the bash script executed in about the same amount of time as it did above, while the awk script executed much faster than it did above:
$ time ./tst.sh nodups100k > ou.sh
real 0m15.179s
user 0m13.322s
sys 0m1.278s
$ time awk -f tst.awk nodups100k > ou.awk
real 0m0.078s
user 0m0.046s
sys 0m0.015s
To demonstrate a relatively efficient (within the limits of the language and runtime) native-bash approach, which you can see running in an online interpreter at https://ideone.com/iFpJr7:
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
lineNum=$(( lineNum + 1 ))
if [[ ${lines[$line]} ]]; then
printf 'found two equal lines: index1=%s, index2=%s, value=%s\n' \
"${lines[$line]}" "$lineNum" "$line"
fi
lines[$line]=$lineNum
done <example.txt
Note the use of while read to iterate line-by-line, as described in BashFAQ #1: How can I read a file line-by-line (or field-by-field)?; this permits us to open the file only once and read through it without needing any command substitutions (which fork off subshells) or external commands (which need to be individually started up by the operating system every time they're invoked, and are likewise expensive).
The other part of the improvement here is that we're reading the whole file only once -- implementing an O(n) algorithm -- as opposed to running O(n^2) comparisons as the original code did.
I have a program that reads data from a file in this way
root#root# myprogram < inputfile.txt
Now I want my program to read the input file from the 3rd line and not from the beginning of the file.
I have to use < inputfile.txt; I cannot use a pipe because of variable scope issues.
Is there a way to do that in Linux?
Maybe this will work for you (process substitution):
program < <(sed -n '3,$p' inputfile.txt)
Pure shell, no extra processes:
{ read -r; read -r; program; } < inputfile.txt
The first two calls to read each consume a line of input from inputfile.txt, so that those lines are not seen by program.
You can generalize this to skip the first $n lines of input.
{
while [ "$((i++))" -lt "$n" ]; do read -r; done
program
} < inputfile.txt
This becomes a little more readable with the use of some bash extensions:
{ while (( i++ < n )); do read -r; done; program; } < inputfile.txt
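For example, to start program at the 3rd line as in your question, skip the first two lines:
n=2
{ while (( i++ < n )); do read -r; done; program; } < inputfile.txt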
You can use tail:
tail -n +3 inputfile.txt | myprogram
In bash, you can also use process substitution, which keeps the < redirection form your program requires:
myprogram < <(tail -n +3 inputfile.txt)
Try this command: sed -n '3,$p' inputfile.txt | myprogram
I have this code in my shell(bash) script for splitting a file into smaller parts:
for (( i=$start; i<=$lineCount; i=i+$interval))
do
temp=`expr $i + $interval`;
if [ $temp -le $lineCount ]
then
command="head -$temp $fileName | tail -$interval > $tmpFileName";
echo "Creating Temp File: $command";
else
lastLines=`expr $lineCount - $i`;
command="head -$temp $fileName | tail -$lastLines > tmpFileName";
echo "Creating Temp File: $command";
fi
`$command`;
done
It prints the following output:
Creating Temp File: head -10 tmp.txt | tail -10 > tmp.txt_TMP
head: invalid trailing option -- 1
Try `head --help' for more information.
But the command printed: head -10 tmp.txt | tail -10 > tmp.txt_TMP runs correctly on the command line.
What am I doing wrong?
When you put the pipe | in a variable, the shell interprets it as an ordinary character and not as a pipe. Ditto for redirection operators like >, <, ...
An ugly way would be to use eval.
A better approach would be to split your command into different parts so as to get rid of pipes and redirection operators in it.
For example:
command="head -$temp $fileName | tail -$lastLines > tmpFileName";
would be written as:
cmd1="head -$temp $fileName";
cmd2="tail -$lastLines";
and executed by saying (the variables must be unquoted here so the shell word-splits them back into commands and arguments):
$cmd1 | $cmd2 > tmpFileName;
Moreover, you don't need backticks to execute a command that is stored in a variable. Simply say:
$command
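If you want to keep the whole pipeline in one named, reusable unit, a shell function is a cleaner fit than a string variable, because pipes and redirections work normally inside it. A sketch (make_chunk is an illustrative name, not from your script):
# make_chunk LINES FILE COUNT OUTFILE: emit the last COUNT of the first LINES lines
make_chunk() {
    head -n "$1" "$2" | tail -n "$3" > "$4"
}
echo "Creating Temp File: head -n $temp $fileName | tail -n $interval > $tmpFileName"
make_chunk "$temp" "$fileName" "$interval" "$tmpFileName"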
Problem is here:
command="head -$temp $fileName | tail -$interval > $tmpFileName"
and later:
`$command`
Instead of storing the whole piped command in a string, you can directly execute the command:
head -$temp "$fileName" | tail -$interval > "$tmpFileName"
There are several issues here. First of all, let's see our refactored version with correct quoting and many other improvements:
for (( i=start; i<=lineCount; i+=interval)); do
((temp = i + interval))
if (( temp <= lineCount )); then
echo "Creating Temp File using 'tail -n $interval'"
head -n "$temp" "$fileName" | tail -n "$interval" > "$tmpFileName"
else
((lastLines = lineCount - i))
echo "Creating Temp File using 'tail -n $lastLines'"
head -n "$temp" "$fileName" | tail -n "$lastLines" > "$tmpFileName"
fi
done
I have changed all arithmetic expressions to the correct syntax, which is also more readable.
Then, it seems like you want to put a command into a variable and then run it. To cut a long story short, you simply should not do that; see BashFAQ #50 ("I'm trying to put a command in a variable, but the complex cases always fail!") for why.
Also, this is not C++; you don't have to place ; at the end of every line.
You have also missed the $ character in tmpFileName on the 10th line of your code.
I am not a Linux scripting expert and I have exhausted my knowledge on this matter. Here is my situation.
I have a list of states passed as a command-line argument to a shell script (e.g. "AL,AK,AS,AZ,AR,CA..."). The shell script needs to extract each state code and write it to a file (states.txt), with each state on one line. See below:
AL
AK
AS
AZ
AR
CA
..
..
How can this be achieved using a Linux shell script?
Thanks in advance.
Use tr:
echo "AL,AK,AS,AZ,AR,CA" | tr ',' '\n' > states.txt
echo "AL,AK,AS,AZ,AR,CA" | awk -F, '{for (i = 1; i <= NF; i++) print $i}';
Naive solution (note that \n in the replacement is a GNU sed extension):
echo "AL,AK,AS,AZ,AR,CA" | sed 's/,/\n/g'
I think awk is the simplest solution, but you could try using cut in a loop.
Sample script (outputs to stdout, but you can just redirect it):
#!/bin/bash
# Check for input
if (( ${#1} == 0 )); then
echo No input data supplied
exit
fi
# Initialise first input
i=$1
# While $i still contains commas
while echo "$i" | grep -q ,; do
# Get first item of $i
j=$(echo "$i" | cut -d',' -f1)
# Shift off the first item of $i
i=$(echo "$i" | cut --complement -d',' -f1)
echo "$j"
done
# Display the last item
echo "$i"
Then you can just run it as ./script.sh "AL,AK,AS,AZ,AR,CA" > states.txt (assuming you save it as script.sh in the local directory and give it execute permission)
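For completeness, here is a pure-bash sketch (assuming the comma-separated list arrives as $1): let read split the string on commas into an array, then print one element per line.
# assumes the comma-separated list is in $1
IFS=',' read -ra states <<< "$1"
printf '%s\n' "${states[@]}" > states.txt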