I was trying out the head command on macOS with zsh; the code is below.
a.txt:
1
2
3
4
5
6
7
8
9
10
tail -n +5 a.txt   # from line 5 to the end
tail -n -5 a.txt   # the last 5 lines
head -n +5 a.txt   # lines 1 to 5
head -n -5 a.txt   # what does this do?
The last command shows an error.
head: illegal line count -- -5
What did head -n -5 actually do?
Some implementations of head, such as GNU head, support negative arguments to -n.
But that's not standard! Your implementation clearly does not support it.
Where it is supported, a negative argument such as -n -5 prints everything except the last 5 lines.
It becomes clearer with 3 instead of 5. Note the signs!
# print 10 lines:
seq 10
1
2
3
4
5
6
7
8
9
10
#-------------------------
# get the last 3 lines:
seq 10 | tail -n 3
8
9
10
#--------------------------------------
# start at line 3 (skip first 2 lines)
seq 10 | tail -n +3
3
4
5
6
7
8
9
10
#-------------------------
# get the first 3 lines:
seq 10 | head -n 3
1
2
3
#-------------------------
# skip the last 3 lines:
seq 10 | head -n -3
1
2
3
4
5
6
7
By the way, man tail and man head explain this behavior.
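On systems where head rejects negative counts (as with the macOS error in the question), the same effect can be had with awk. The following is only a sketch: drop_last is a name made up for this example, and it assumes the count is at least 1.

```shell
# drop_last N: print stdin without its last N lines.
# Portable stand-in for GNU `head -n -N`; the function name is hypothetical.
# Assumes N >= 1 (N = 0 would divide by zero in the modulo below).
drop_last() {
  awk -v n="$1" '
    NR > n { print buf[NR % n] }   # the line read n lines ago can no longer be in the tail
           { buf[NR % n] = $0 }    # keep the current line in a ring buffer of size n
  '
}

seq 10 | drop_last 3   # prints 1 through 7
```

Only the last N lines are held in memory at any time, so this also works on long streams.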
Related
For example: how can I print specific lines of a .txt file, between line 5 and line 8, using only tail and head?
Copied from here
infile.txt contains a numerical value on each line.
➜ X=3
➜ Y=10
➜ < infile.txt tail -n +"$X" | head -n "$((Y - X))"
3
4
5
6
7
8
9
➜
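Note that the pipeline above prints Y - X lines, so it stops at line Y - 1 (3 through 9 in the run shown). If both ends should be included, the count needs one more line. A minimal sketch, where print_range is a hypothetical helper name and infile.txt is regenerated as sample data:

```shell
# print_range X Y FILE: print lines X through Y of FILE, both inclusive.
# print_range is a made-up helper, not a standard command.
print_range() {
  # tail starts at line X; head then keeps Y - X + 1 lines
  tail -n +"$1" "$3" | head -n "$(( $2 - $1 + 1 ))"
}

seq 10 > infile.txt          # sample input: one number per line
print_range 5 8 infile.txt   # prints 5, 6, 7 and 8
```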
I have a group of data like the attached raw data. When I sort the raw data with sort -n, the data are sorted line by line, and the output looks like this:
3 6 9 22
2 3 4 5
1 7 16 20
I want to sort the data column by column, so that the output looks like this:
1 2 4 3
3 6 9 16
5 7 20 22
OK, I did try something.
My initial idea was to extract the data column by column, sort each column, and then paste the columns back together, but I can't get it to work. Here is my script:
for ((i=1; i<=4; i=i+1))
do
awk '{print $i}' file | sort -n >>output
done
The output:
1 7 20 16
3 6 9 22
5 2 4 3
1 7 20 16
3 6 9 22
5 2 4 3
1 7 20 16
3 6 9 22
5 2 4 3
1 7 20 16
3 6 9 22
5 2 4 3
It seems that $i never changes and is always equal to $0.
Thanks a lot.
raw data1
3 6 9 22
5 2 4 3
1 7 20 16
raw data2
488.000000 1236.000000 984.000000 2388.000000 788.000000 704.000000
600.000000 1348.000000 872.000000 2500.000000 900.000000 816.000000
232.000000 516.000000 1704.000000 1668.000000 68.000000 16.000000
244.000000 504.000000 1716.000000 1656.000000 56.000000 28.000000
2340.000000 3088.000000 868.000000 4240.000000 2640.000000 2556.000000
2588.000000 3336.000000 1116.000000 4488.000000 2888.000000 2804.000000
Let me introduce a flexible solution using cut and sort that works on a tab-delimited input matrix of any size M x N.
$ cat -vTE data_to_sort.in
3^I6^I9^I22$
5^I2^I4^I3$
1^I7^I20^I16$
$ col=4; line=3;
$ for i in $(seq ${col}); do cut -f$i data_to_sort.in |\
> sort -n; done | paste $(for i in $(seq ${line}); do echo -n "- "; done) |\
> datamash transpose
1 2 4 3
3 6 9 16
5 7 20 22
If the input file is not tab-delimited, pass the proper delimiter to cut with -d"$DELIM_CHAR" so that it splits the fields correctly.
for i in $(seq ${col}); do cut -f$i data_to_sort.in | sort -n; done extracts each column of the file and sorts it.
paste $(for i in $(seq ${line}); do echo -n "- "; done) then rebuilds a matrix structure from the sorted columns.
datamash transpose transposes that intermediate matrix.
Thanks to the feedback from Sundeep, here is a better solution that uses pr instead of paste to generate the columns:
$ col=4; line=3
$ for i in $(seq ${col}); do cut -f$i data_to_sort.in |\
> sort -n; done | pr -${line}ats | datamash transpose
Last but not least, the following solution lets us avoid datamash altogether (many thanks to Sundeep):
$ col=4; for i in $(seq ${col}); do cut -f$i data_to_sort.in |\
> sort -n; done | pr -${col}ts
1 2 4 3
3 6 9 16
5 7 20 22
Proof that it works, for the skeptics and the downvoters. A second run with 6 columns:
$ col=6; for i in $(seq ${col}); do cut -f$i <(sed 's/^ \+//g;s/ \+/\t/g' data2) | sort -n; done | pr -${col}ts | tr '\t' ' '
232.000000 504.000000 868.000000 1656.000000 56.000000 16.000000
244.000000 516.000000 872.000000 1668.000000 68.000000 28.000000
488.000000 1236.000000 984.000000 2388.000000 788.000000 704.000000
600.000000 1348.000000 1116.000000 2500.000000 900.000000 816.000000
2340.000000 3088.000000 1704.000000 4240.000000 2640.000000 2556.000000
2588.000000 3336.000000 1716.000000 4488.000000 2888.000000 2804.000000
awk to the rescue! (asort is a GNU awk extension, so this needs gawk.)
awk '{f1[NR]=$1; f2[NR]=$2; f3[NR]=$3; f4[NR]=$4}
END{asort(f1); asort(f2); asort(f3); asort(f4);
for(i=1;i<=NR;i++) print f1[i],f2[i],f3[i],f4[i]}' file
1 2 4 3
3 6 9 16
5 7 20 22
There may be a smarter way of doing this as well...
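The loop in the question fails because $i sits inside single quotes, so awk sees its own unset variable i and prints $0. Passing the shell counter in with -v is one way around that. The sketch below keeps the extract, sort and paste idea from the question; sort_columns and the temporary file names are made up for illustration, and the column count is assumed to be known.

```shell
# sort_columns FILE NCOLS: sort each whitespace-separated column numerically
# and print the re-assembled matrix. The helper name is hypothetical.
sort_columns() {
  file=$1 cols=$2
  tmpdir=$(mktemp -d)
  set --                                   # reuse "$@" as the list of column files
  for i in $(seq "$cols"); do
    # -v c="$i" hands the shell loop counter to awk, so $c is field number i
    awk -v c="$i" '{ print $c }' "$file" | sort -n > "$tmpdir/col$i"
    set -- "$@" "$tmpdir/col$i"
  done
  paste -d ' ' "$@"                        # one sorted file per output column
  rm -r "$tmpdir"
}

# Sample input taken from "raw data1" in the question
printf '%s\n' '3 6 9 22' '5 2 4 3' '1 7 20 16' > file
sort_columns file 4 > output
cat output
```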
I would like to extract every n lines from a file and save them to a new file. For example, I have index.txt:
cat index.txt
1 AAAGCGT
2 ACGAAGT
3 ACCTTGT
4 ATAATGT
5 AGGGTGT
6 AGCCAGT
7 AGTTCGT
8 AATGCAG
9 AAAGCGT
10 ACGAAGT
and output should be
cat index.1.txt:
1 AAAGCGT
2 ACGAAGT
cat index.2.txt:
3 ACCTTGT
4 ATAATGT
cat index.3.txt:
5 AGGGTGT
6 AGCCAGT
And so on. So I would like to extract the rows from the input file two at a time, in a loop, and save each pair to a new file.
It doesn't give you exactly the names you want, but:
split -l 2 index.txt index.
seems like the easiest solution. It will create files with names beginning with the final argument, so you will get names like 'index.aa' and 'index.ab'.
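If GNU split is available, its long options can get closer to the requested names. This is a sketch that assumes GNU coreutils (the long options below are not in BSD or macOS split), and -a 1 limits it to at most nine output files.

```shell
# Sample input: the ten lines from the question
printf '%s\n' '1 AAAGCGT' '2 ACGAAGT' '3 ACCTTGT' '4 ATAATGT' '5 AGGGTGT' \
              '6 AGCCAGT' '7 AGTTCGT' '8 AATGCAG' '9 AAAGCGT' '10 ACGAAGT' > index.txt

# GNU split only:
#   -l 2                       two lines per output file
#   -a 1                       one-character suffix
#   --numeric-suffixes=1       suffixes are digits, starting at 1
#   --additional-suffix=.txt   appended after the digit
split -l 2 -a 1 --numeric-suffixes=1 --additional-suffix=.txt index.txt index.

ls index.?.txt   # index.1.txt through index.5.txt
```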
This will work for any number of grouped lines just by changing the 2 to a 3 or whatever number you like:
$ awk 'NR%2==1{++i} {print > ("index." i ".txt")}' index.txt
$ ls index.?.txt
index.1.txt index.2.txt index.3.txt index.4.txt index.5.txt
$ tail index.?.txt
==> index.1.txt <==
1 AAAGCGT
2 ACGAAGT
==> index.2.txt <==
3 ACCTTGT
4 ATAATGT
==> index.3.txt <==
5 AGGGTGT
6 AGCCAGT
==> index.4.txt <==
7 AGTTCGT
8 AATGCAG
==> index.5.txt <==
9 AAAGCGT
10 ACGAAGT
awk '{print >"index."(x+=NR%2)".txt"}' file
This increments x on every odd-numbered line, so x goes up by one every two lines starting from 1, and each line is printed into the file with that number in its name.
cat index.1.txt:
1 AAAGCGT
2 ACGAAGT
cat index.2.txt:
3 ACCTTGT
4 ATAATGT
cat index.3.txt:
5 AGGGTGT
6 AGCCAGT
In some awks, extra parens may be required as shown below (As commented by Ed Morton)
awk '{print >("index."(x+=NR%2)".txt")}' file
I would say:
awk '{file=int((NR+1)/2)".txt"; print > file}' file
int((NR+1)/2) maps each line number to a file number:
1 --> 1
2 --> 1
3 --> 2
x --> int((x+1) / 2)
So you get these files:
$ cat 1.txt
1 AAAGCGT
2 ACGAAGT
or
$ cat 3.txt
5 AGGGTGT
6 AGCCAGT
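The awk answers above can be generalized by passing the group size in as a variable and closing each file once its group is complete, which may help when the input produces many output files. A sketch only: split_every, numbers.txt and the part. prefix are names chosen for this example.

```shell
# split_every N FILE PREFIX: write FILE in groups of N lines to
# PREFIX1.txt, PREFIX2.txt, ... (hypothetical helper name).
split_every() {
  awk -v n="$1" -v prefix="$3" '
    (NR - 1) % n == 0 {            # first line of a new group
      if (out != "") close(out)    # finish the previous file before opening the next
      out = prefix (++k) ".txt"
    }
    { print > out }
  ' "$2"
}

seq 7 > numbers.txt
split_every 3 numbers.txt part.   # part.1.txt: 1-3, part.2.txt: 4-6, part.3.txt: 7
```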
I have a .txt file with 25,000 lines. Each line contains a number from 1 to 20. I want to count the total occurrences of each number in the file. I don't know whether I should use grep or awk, or how to use them. I'm also worried about confusing 1 and 11, which both contain 1's. Thank you very much for helping!
I tried this, but it double counts my numbers:
grep -o '1' degreeDistirbution.txt | wc -l
With grep you can match the beginning and end of a line with '^' and '$' respectively. For the whole thing I'll use an array, but to illustrate this point I'll just use one variable:
one="$(grep -c '^1$' "./$inputfile")"
Then we put that together with a bash loop and go through all the numbers with a while, like so:
i=1
while [[ $i -le 20 ]]
do
arr[i]="$(grep -c "^$i$" "./$inputfile")"
i=$((i + 1))
done
If you like, you can of course use a for loop as well.
An easier method is:
sort -n file | uniq -c
Which will count the occurrences of each number in the sorted file and display the results like:
$ sort -n dat/twenty.txt | uniq -c
3 1
3 2
3 3
4 4
4 5
4 6
4 7
4 8
4 9
4 10
4 11
3 12
2 13
2 14
4 15
4 16
4 17
2 18
2 19
2 20
This shows that I have 3 ones, 3 twos, etc. in the sample file.
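Since the question also mentions awk, here is a sketch of the same count done in a single pass, without sorting. It compares whole field values, so 1 and 11 are never confused. count_values and sample.txt are names made up for the example, and the range 1 to 20 is taken from the question.

```shell
# count_values FILE: print "number count" for every number from 1 to 20,
# including numbers that never occur. The helper name is hypothetical.
count_values() {
  awk '
    { count[$1]++ }                       # tally the value on each line
    END {
      for (n = 1; n <= 20; n++)
        print n, count[n] + 0             # + 0 turns a missing entry into 0
    }
  ' "$1"
}

printf '%s\n' 1 11 1 20 11 1 > sample.txt
count_values sample.txt
```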
Say I have a text file called "demo.txt" that looks like this:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Now I want to read a certain line, say line 2, with a command that looks something like this:
Line2 = read 2 "demo.txt"
So when I print it:
echo "$Line2"
I'll get:
5 6 7 8
I know how to use the 'sed' command to print the n-th line of a file, but not how to read it into a variable. I also know the 'read' command, but I don't know how to use it to read a certain line.
Thanks in advance for the help.
Using head and tail
$ head -2 inputFile | tail -1
5 6 7 8
OR
a generalized version
$ line=2
$ head -"$line" input | tail -1
5 6 7 8
Using sed
$ sed -n '2 p' input
5 6 7 8
$ sed -n "$line p" input
5 6 7 8
What does it do?
-n suppresses the normal printing of the pattern space.
'2 p' gives the line number, 2 (or $line in the general version), followed by p, the command that prints the current pattern space.
input is the input file.
Edit
To store the output in a variable, use command substitution.
$ content=`sed -n "$line p" input`
$ echo $content
5 6 7 8
OR
$ content=$(sed -n "$line p" input)
$ echo $content
5 6 7 8
To store the output in a bash array (note that there must be no space after the =):
$ content=( $(sed -n "$line p" input) )
$ echo ${content[0]}
5
$ echo ${content[1]}
6
Using awk
Perhaps an awk solution might look like
$ awk -v line="$line" 'NR==line' input
5 6 7 8
Thanks to Fredrik Pihl for the suggestion.
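To get close to the `Line2 = read 2 "demo.txt"` form asked for in the question, the sed call can be wrapped in a small function. This is a sketch, and nth_line is a hypothetical name. The q command makes sed stop right after the requested line instead of reading the rest of the file.

```shell
# nth_line N FILE: print only line N of FILE (hypothetical helper name).
nth_line() {
  sed -n "${1}{p;q;}" "$2"   # print line N, then quit immediately
}

# Sample file from the question
printf '%s\n' '1 2 3 4' '5 6 7 8' '9 10 11 12' '13 14 15 16' > demo.txt

Line2=$(nth_line 2 demo.txt)
echo "$Line2"
```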
Perl has convenient support for this, too, and it's actually the most intuitive!
The flip-flop operator can be used with line numbers:
$ printf "0\n1\n2\n3\n4" | perl -ne 'print if 2 .. 4'
1
2
3
Note that it's 1-based.
You can also mix regular expressions:
$ printf "0\n1\nfoo\n3\n4" | perl -ne 'print if /foo/ .. -1'
foo
3
4
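(a constant operand is compared with the current line number, and -1 never matches it, so the range stays open through the last line; eof can be used to say "until the end" explicitly)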