Compare two files in Linux: ignoring first and last lines - linux

I´d like to compare two files, but I don´t want to take into account the first 10 lines, and the last 3 lines of both files. I tried to do it with diff and tail commands, like in here, but without success.
how can I do it?

Use GNU tail and head:
To ignore the first 10 lines of a file, use tail like this:
tail -n +11 file
To ignore the last 3 lines of a file, use head like this:
head -n -4 file
You can then construct your diff command using process substitution as follows:
diff <(tail -n +11 file | head -n -4) <(tail -n +11 file2 | head -n -4)

Related

compare text between specific line linux using diff

I have two java files as text file and I want to compare the two text file only between specific set of lines. So how do i specify the line number to search for in the diff command.
Diff can't do that, but you can use head and tail to extract the lines, and use process substitution to use the results in diff:
diff <(head -n 100 file1.java | tail -n 20) <(head -n 120 file2.java | tail -n25)

Retrieve last 100 lines logs

I need to retrieve last 100 lines of logs from the log file.
I tried the sed command
sed -n -e '100,$p' logfilename
Please let me know how can I change this command to specifically retrieve the last 100 lines.
You can use tail command as follows:
tail -100 <log file> > newLogfile
Now last 100 lines will be present in newLogfile
EDIT:
More recent versions of tail as mentioned by twalberg use command:
tail -n 100 <log file> > newLogfile
"tail" is command to display the last part of a file, using proper available switches helps us to get more specific output. the most used switch for me is -n and -f
SYNOPSIS
tail [-F | -f | -r] [-q] [-b number | -c number | -n number] [file ...]
Here
-n number :
The location is number lines.
-f : The -f option causes tail to not stop when end of file is
reached, but rather to wait for additional data to be appended to the
input. The -f option is ignored if the
standard input is a pipe, but not if it is a FIFO.
Retrieve last 100 lines logs
To get last static 100 lines
tail -n 100 <file path>
To get real time last 100 lines
tail -f -n 100 <file path>
You can simply use the following command:-
tail -NUMBER_OF_LINES FILE_NAME
e.g tail -100 test.log
will fetch the last 100 lines from test.log
In case, if you want the output of the above in a separate file then you can pipes as follows:-
tail -NUMBER_OF_LINES FILE_NAME > OUTPUT_FILE_NAME
e.g tail -100 test.log > output.log
will fetch the last 100 lines from test.log and store them into a new file output.log)
Look, the sed script that prints the 100 last lines you can find in the documentation for sed (https://www.gnu.org/software/sed/manual/sed.html#tail):
$ cat sed.cmd
1! {; H; g; }
1,100 !s/[^\n]*\n//
$p
$ sed -nf sed.cmd logfilename
For me it is way more difficult than your script so
tail -n 100 logfilename
is much much simpler. And it is quite efficient, it will not read all file if it is not necessary. See my answer with strace report for tail ./huge-file: https://unix.stackexchange.com/questions/102905/does-tail-read-the-whole-file/102910#102910
I know this is very old, but, for whoever it may helps.
less +F my_log_file.log
that's just basic, with less you can do lot more powerful things. once you start seeing logs you can do search, go to line number, search for pattern, much more plus it is faster for large files.
its like vim for logs[totally my opinion]
original less's documentation : https://linux.die.net/man/1/less
less cheatsheet : https://gist.github.com/glnds/8862214
len=`cat filename | wc -l`
len=$(( $len + 1 ))
l=$(( $len - 99 ))
sed -n "${l},${len}p" filename
first line takes the length (Total lines) of file
then +1 in the total lines
after that we have to fatch 100 records so, -99 from total length
then just put the variables in the sed command to fetch the last 100 lines from file
I hope this will help you.

How do i copy every line X line from a bunch of files to another file?

So my problem is as follows:
I have a bunch of files and i need only the information from a certain line in each of these files (the same line for all files).
Example:
I want the content of the line 10 from file example_1.dat~example_10.dat and then i want to save it on > test.dat
I tried using: head -n 5 example_*.dat > test.dat. But this gives me all the information from the top till the line i have chosen instead of just the line.
Please help.
$ for f in *.dat ; do sed -n '5p' $f >> test.dat ; done
This code will do the following:
Foreach file f in the directory that ends with .dat.
Use sed on the 5:th row in file and write to test.dat.
The ">>" will add the row at the bottom of the file if existing.
Use a combination of head and tail to zoom to the needed line. For example, head -n 5 file | tail -n 1
You can use a for loop to get it done over several files
for f in *.dat ; do head -n 5 $f | tail -n 1 >> test.dat ; done
PS: Don't forget to clean the test.dat file (> test.dat) before running the loop. Otherwise you'll get results from previous runs as well.
You can use sed or awk:
sed -n "5p"
awk "NR == 5"
This might work for you (GNU sed):
sed -sn '5wtest.dat' example_*.dat

first two results from ls command

I am using ls -l -t to get a list of files in a directory ordered by time.
I would like to limit the search result to the top 2 files in the list.
Is this possible?
I've tried with grep and I struggled.
You can pipe it into head:
ls -l -t | head -3
Will give you top 3 lines (2 files and the total).
This will just give you the first 2 lines of files, skipping the size line:
ls -l -t | tail -n +2 | head -2
tail strips the first line, then head outputs the next 2 lines.
To avoid dealing with the top output line you can reverse the sort and get the last two lines
ls -ltr | tail -2
This is pretty safe, but depending what you'll do with those two file entries after you find them, you should read Parsing ls on the problems with using ls to get files and file information.
Or you could try just this
ls -1 -t | head -2
The -1 switch skips the title line.
You can use the head command to grab only the first two lines of output:
ls -l -t | head -2
You have to pipe through head.
ls -l -t | head -n 3
will output the two first results.
Try this:
ls -td -- * | head -n 2

Print a file, skipping the first X lines, in Bash [duplicate]

This question already has answers here:
How can I remove the first line of a text file using bash/sed script?
(19 answers)
Closed 3 years ago.
I have a very long file which I want to print, skipping the first 1,000,000 lines, for example.
I looked into the cat man page, but I did not see any option to do this. I am looking for a command to do this or a simple Bash program.
You'll need tail. Some examples:
$ tail great-big-file.log
< Last 10 lines of great-big-file.log >
If you really need to SKIP a particular number of "first" lines, use
$ tail -n +<N+1> <filename>
< filename, excluding first N lines. >
That is, if you want to skip N lines, you start printing line N+1. Example:
$ tail -n +11 /tmp/myfile
< /tmp/myfile, starting at line 11, or skipping the first 10 lines. >
If you want to just see the last so many lines, omit the "+":
$ tail -n <N> <filename>
< last N lines of file. >
Easiest way I found to remove the first ten lines of a file:
$ sed 1,10d file.txt
In the general case where X is the number of initial lines to delete, credit to commenters and editors for this:
$ sed 1,Xd file.txt
If you have GNU tail available on your system, you can do the following:
tail -n +1000001 huge-file.log
It's the + character that does what you want. To quote from the man page:
If the first character of K (the number of bytes or lines) is a
`+', print beginning with the Kth item from the start of each file.
Thus, as noted in the comment, putting +1000001 starts printing with the first item after the first 1,000,000 lines.
If you want to skip first two line:
tail -n +3 <filename>
If you want to skip first x line:
tail -n +$((x+1)) <filename>
A less verbose version with AWK:
awk 'NR > 1e6' myfile.txt
But I would recommend using integer numbers.
Use the sed delete command with a range address. For example:
sed 1,100d file.txt # Print file.txt omitting lines 1-100.
Alternatively, if you want to only print a known range, use the print command with the -n flag:
sed -n 201,300p file.txt # Print lines 201-300 from file.txt
This solution should work reliably on all Unix systems, regardless of the presence of GNU utilities.
Use:
sed -n '1d;p'
This command will delete the first line and print the rest.
If you want to see the first 10 lines you can use sed as below:
sed -n '1,10 p' myFile.txt
Or if you want to see lines from 20 to 30 you can use:
sed -n '20,30 p' myFile.txt
Just to propose a sed alternative. :) To skip first one million lines, try |sed '1,1000000d'.
Example:
$ perl -wle 'print for (1..1_000_005)'|sed '1,1000000d'
1000001
1000002
1000003
1000004
1000005
You can do this using the head and tail commands:
head -n <num> | tail -n <lines to print>
where num is 1e6 + the number of lines you want to print.
This shell script works fine for me:
#!/bin/bash
awk -v initial_line=$1 -v end_line=$2 '{
if (NR >= initial_line && NR <= end_line)
print $0
}' $3
Used with this sample file (file.txt):
one
two
three
four
five
six
The command (it will extract from second to fourth line in the file):
edu#debian5:~$./script.sh 2 4 file.txt
Output of this command:
two
three
four
Of course, you can improve it, for example by testing that all argument values are the expected :-)
cat < File > | awk '{if(NR > 6) print $0}'
I needed to do the same and found this thread.
I tried "tail -n +, but it just printed everything.
The more +lines worked nicely on the prompt, but it turned out it behaved totally different when run in headless mode (cronjob).
I finally wrote this myself:
skip=5
FILE="/tmp/filetoprint"
tail -n$((`cat "${FILE}" | wc -l` - skip)) "${FILE}"

Resources