How to compare two large files in Linux? [closed] - linux

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
I have two large files, each containing 9600000 float values written one per line by a C program. I know they are similar; they should actually be identical. How can I compare them and see if there is any difference?
I have tried
diff --unchanged-group-format='' base.txt base4.txt
But this does not work: it prints the second file to the screen.
With
cmp base.txt base4.txt
base.txt base4.txt differ: byte 811221, line 62402
What does this mean? That 62402 lines are different?

The output from cmp means that the first difference between the files is at byte position 811221 in the files, which is on line 62402. For example, if the two files are:
abcd
1234
wxyz
9876
and
abcd
1234
wqyz
9812
the output is:
file1.txt file2.txt differ: char 12, line 3
because on line 3 one file has x and the other file has q, and these are at byte position 12 (the newline characters are included in the byte count).
If you want to see all the differences, use the -l option.
$ cmp -l file1.txt file2.txt
12 170 161
18 67 61
19 66 62
Note that unlike diff, cmp isn't smart about insertions and deletions; it just compares the bytes at each position. So if you insert or delete a character early in the file, everything after that will be shown as a mismatch.
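To summarize the differences rather than list every byte, something like the following sketch may help. It rebuilds the two sample files from this answer in a scratch directory (the workdir, ndiff and difflines names are only for illustration), counts the lines printed by cmp -l, and uses paste with awk to report which line numbers differ. One assumption to be aware of: awk compares fields that look numeric as numbers, so values such as 1.5 and 1.50 would count as equal, which may or may not be what you want for float data.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf 'abcd\n1234\nwxyz\n9876\n' > "$workdir/file1.txt"
printf 'abcd\n1234\nwqyz\n9812\n' > "$workdir/file2.txt"

# cmp -l prints one line per mismatched byte, so counting lines counts bytes.
ndiff=$(cmp -l "$workdir/file1.txt" "$workdir/file2.txt" | wc -l | tr -d ' ')
echo "differing bytes: $ndiff"

# paste joins line N of both files with a tab; awk prints N when the halves differ.
difflines=$(paste "$workdir/file1.txt" "$workdir/file2.txt" |
    awk -F '\t' '$1 != $2 { print NR }')
echo "differing lines:" $difflines
```

For the sample files this reports 3 differing bytes on lines 3 and 4, matching the cmp -l output above.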

Related

Unix - Finding third largest file in a folder and read first 5 lines [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
Closed 3 months ago.
How do I find the third largest file in a folder and read its first 5 lines?
I tried the command below to get the third largest file:
ls -s /file/path | head 3 | tail -1
How do I use this file name to read the first 5 lines and write them to a new file?
You are very close: you just forgot to capitalize -s, and you forgot to put a dash in front of the number in head 3.
ls -S /file/path
That outputs a list of files sorted by size, largest first.
head -3
You need to put a dash in front of the option for head, just as you did with tail -1.
To read a certain number of lines from a file, you can use head in the same way and give it a filename.
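Putting the pieces together, a minimal sketch might look like the following. The function name third_largest_head and its two arguments are hypothetical, chosen only for this example. It assumes the directory holds at least three entries, that none of the names contain newlines, and that the third entry is a regular file (ls -S lists subdirectories too).

```shell
# Hypothetical helper: copy the first 5 lines of the third largest
# entry in directory $1 into the file $2.
third_largest_head() {
    dir=$1
    out=$2
    # ls -S sorts largest first; head/tail pick out the third name.
    third=$(ls -S "$dir" | head -n 3 | tail -n 1)
    head -n 5 "$dir/$third" > "$out"
}

# Example call (paths are placeholders):
# third_largest_head /file/path newfile.txt
```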

compare two files and get the positions in third file in linux [closed]

Closed 7 years ago.
I need help comparing two files and writing the positions to a third file. Both files contain the same values, but the second file is in a different order. For each line of the first file, the third file should give the line number where that value is found in the second file.
eg. file1.txt
A
B
C
D
file2.txt
B
D
A
C
outputfileposition.txt
3
1
4
2
Any help appreciated, thanks in advance
In awk:
awk 'FNR==NR{a[$0]=FNR;next}{print a[$0] > "outputfileposition.txt"}' file{2,1}.txt
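The same awk logic, spelled out with comments and run against the sample data from the question, looks roughly like this. The scratch directory and the array name pos are only for illustration. The files are passed in the order file2.txt then file1.txt, which is what the file{2,1}.txt brace expansion produces in bash.

```shell
workdir=$(mktemp -d)
printf 'A\nB\nC\nD\n' > "$workdir/file1.txt"
printf 'B\nD\nA\nC\n' > "$workdir/file2.txt"

awk '
    # First file on the command line (file2.txt): FNR equals NR only here,
    # so remember the line number at which each value appears.
    FNR == NR { pos[$0] = FNR; next }
    # Second file (file1.txt): print where each value was seen in file2.txt.
    { print pos[$0] }
' "$workdir/file2.txt" "$workdir/file1.txt" > "$workdir/outputfileposition.txt"

cat "$workdir/outputfileposition.txt"
```

With the sample input this prints 3, 1, 4 and 2 on separate lines, as the question expects.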
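This will do the trick. The -x and -F options make grep match whole lines as fixed strings, so a value is not found inside a longer one, and cut keeps only the line number:
while IFS= read -r line
do
grep -nxF -- "$line" file2.txt | cut -d: -f1 >> outputfileposition.txt
done < file1.txt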

List files by using UNIX/Linux commands [closed]

Closed 8 years ago.
Using UNIX/Linux commands, pipes (“|”) and redirections (“>”, “>>”),
make a listing of the 5 smallest files in the “/etc” directory whose names
contain the string “.conf”, sorted by increasing file size.
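This will work:
ls -l /etc | sort -k 5 -n | grep -F ".conf" | head -n 5
First produce a long listing, then sort numerically by the 5th column (the file size), then keep only the lines containing the literal string ".conf", and finally show only the first 5 lines. The -S option to ls is not needed because sort reorders the lines anyway, and grep -F stops the dot from matching any character.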
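As an alternative sketch, ls can do the size ordering itself: -S sorts by size and -r reverses that to smallest first, so no separate sort step is needed. The function name smallest_conf is hypothetical, and the output is bare file names rather than a long listing. Like the pipeline above, it matches ".conf" anywhere in the name and also lists directories.

```shell
# Hypothetical helper: print the names of the 5 smallest entries in
# directory $1 whose names contain the literal string ".conf".
smallest_conf() {
    ls -Sr "$1" | grep -F '.conf' | head -n 5
}

# Example call:
# smallest_conf /etc
```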

Find pattern from one file listed in another [closed]

Closed 9 years ago.
I have patterns listed in one file and want to find them in another file.
In the second file, some of those patterns are separated by commas.
For example, the first file, F1, has genes:
ENSG00000187546
ENSG00000113492
ENSG00000166971
and the second file, F2, has those genes along with some more columns which I need:
ENSG00000164252
ENSG00000187546
ENSG00000113492
ENSG00000166971,ENSG00000186106
The gene ENSG00000166971, which is present in the second file, does not show up in the grep output because it has another gene with it, separated by a comma.
My code is:
grep -f "F1.txt" "F2.txt" >output.txt
I want those lines, and the data associated with them, even if only one of the genes is present. Is there any way to do this?
I tried to reproduce the same situation, and I do get ENSG00000166971 in the grep result.
Maybe this is due to a different version.
I am using Fedora release 20 with grep 2.14.56-1e3d.
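A quick way to check this yourself is to rebuild the sample files from the question and run the same command, as in the sketch below. The scratch directory is only for illustration, and -F is an addition that treats each pattern as a fixed string rather than a regular expression. Since grep matches a pattern anywhere in a line, the comma-separated line should be reported as well. If your real files behave differently, it may be worth checking F1.txt for stray characters such as trailing spaces or carriage returns; that is only a guess, as the question does not confirm it.

```shell
workdir=$(mktemp -d)
printf 'ENSG00000187546\nENSG00000113492\nENSG00000166971\n' > "$workdir/F1.txt"
printf 'ENSG00000164252\nENSG00000187546\nENSG00000113492\nENSG00000166971,ENSG00000186106\n' \
    > "$workdir/F2.txt"

# Each line of F1.txt is a pattern; any F2.txt line containing one is printed.
grep -F -f "$workdir/F1.txt" "$workdir/F2.txt" > "$workdir/output.txt"
cat "$workdir/output.txt"
```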

rename all files in folder to numbered list 1.jpg 2.jpg [closed]

Closed 9 years ago.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
I have a folder full of images with random file names. To help organize this mess, I would like to rename all of them sequentially in one command, so that if I have 100 files, the first is named file-1.jpg, the second file-2.jpg, and so on. Is this possible in one command?
The most concise command line to do this I can think of is
ls | cat -n | while read n f; do mv "$f" "file-$n.jpg"; done
ls lists the files in the current directory and cat -n adds line numbers. The while loop reads the resulting numbered list of files line by line, stores the line number in the variable n and the filename in the variable f and performs the rename.
I was able to solve my problem by writing a bash script:
#!/bin/bash
# Rename every .jpg in the current directory to 1.jpg, 2.jpg, ...
num=1
for file in *.jpg; do
mv "$file" "$num.jpg"
num=$((num + 1))
done
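One thing neither approach guards against is a target name that already exists, for example a file that is already called file-1.jpg. A possible sketch that sidesteps this renames in two passes, first to temporary names and then to the final ones. The function name rename_sequential and the .tmp suffix are assumptions made for this example, and it assumes the directory contains no file-N.tmp files beforehand.

```shell
# Hypothetical helper: rename every .jpg in directory $1 to
# file-1.jpg, file-2.jpg, ... without overwriting existing files.
rename_sequential() {
    dir=$1
    n=1
    # Pass 1: move each image to a temporary name that cannot clash
    # with any of the .jpg files still waiting to be renamed.
    for f in "$dir"/*.jpg; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "$dir/file-$n.tmp"
        n=$((n + 1))
    done
    # Pass 2: swap the temporary suffix for the real one.
    for f in "$dir"/file-*.tmp; do
        [ -e "$f" ] || continue
        mv -- "$f" "${f%.tmp}.jpg"
    done
}

# Example call (path is a placeholder):
# rename_sequential /path/to/images
```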
