gvim: Traces of previously opened files seen when reading a file - vim

I am seeing a strange issue when I open a file in gvim on Linux.
For example, if I am reading file2 in gvim, some of the lines of file2 show traces of file1, which I had opened earlier. Here are screenshots of the original contents of file1 and file2, and a screenshot showing file1 overlapping file2.
As seen in the screenshots, traces of file1 are present while reading file2 (highlighted).
How do I fix this?


Is there a way to compare N files at once, and only leave lines unique to each file?

Background
I have five files that I am trying to make unique relative to each other. In other words, I want to make it so that the lines of text in each file have no commonality with each other.
Attempted solution
So far, I have been able to run the grep -vf command, comparing one file with the other four, like so:
grep -vf file2.txt file1.txt
grep -vf file3.txt file1.txt
...
Each of these prints the lines in file1 that are not in file2, not in file3, and so on. However, this becomes cumbersome, because I would need to do it across the whole set of files: to truly reduce each file to the lines of text found only in that file, I would have to feed every combination of files into the grep -vf command. Given that this sounds cumbersome to me, I wanted to know...
Question
What command, or series of commands, in Linux finds the lines of text in each file that are unique relative to all the other files?
You could just do:
awk '!a[$0]++ { out=sprintf("%s.out", FILENAME); print > out}' file*
This will write the lines that are unique in file to file.out. Each line will be written to the output file associated with the input file in which it first appears, and subsequent duplicates of that same line will be suppressed.
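As a quick illustration with two small sample files (the names and contents here are just made up for the demo):
printf 'apple\nbanana\n' > file1
printf 'banana\ncherry\n' > file2
awk '!a[$0]++ { out=sprintf("%s.out", FILENAME); print > out }' file*
cat file1.out    # apple, banana
cat file2.out    # cherry
Note that the shared line banana ends up only in file1.out, the file where it is seen first, which may or may not be what you want.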

Vim: Tabe multiple files?

I know about running
vim -p file1 file2 file3
But is there a way to do something similar from within vim?
What I've thought about wanting to do is
:tabe file1 file2 file3
or (IMO worse, but still an improvement):
:edit file1 file2 file3
...but neither of those is possible, by default at least.
Try this:
:args file1 file2 file3 | tab sall
This will only put the newly opened files (file1, file2, file3) in tabs, not all buffers.
If you want to open all buffers in tabs, replace sall with sball.
By the way, do you find tabs convenient to work with? Personally, I prefer working with buffers...
After browsing the web more extensively while researching this question, the best solution I've found so far is:
:arga file1 file2 file3 (...)
Which will create buffers for all the input files.
Then follow that by:
:tab sball
Which will open all the buffers into separate tabs.
But maybe there's an even better/shorter way? If not, maybe I'm helping someone else out there who has the same problem/question.
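If you want this as a single command, one option (an untested sketch; the command name Tabe is just an assumption) is to define a user command in your vimrc that combines the two steps above:
command! -nargs=+ -complete=file Tabe args <args> | tab sball
Then :Tabe file1 file2 file3 sets the argument list and opens each buffer in its own tab.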

Why doesn't "sort file1 > file1" work?

When I am trying to sort a file and save the sorted output in itself, like this
sort file1 > file1;
the contents of file1 are erased altogether, whereas when I try to do the same with the tee command, like this:
sort file1 | tee file1;
it appears to work [ed: it "works" only for small files with lucky timing; it will lose data on large files or with unhelpful process scheduling], i.e. it overwrites file1 with its own sorted output and also shows it on standard output.
Can someone explain why the first case is not working?
As other people explained, the problem is that the I/O redirection is done before the sort command is executed, so the file is truncated before sort gets a chance to read it. If you think for a bit, the reason why is obvious - the shell handles the I/O redirection, and must do that before running the command.
The sort command has 'always' (since at least Version 7 UNIX) supported a -o option to make it safe to output to one of the input files:
sort -o file1 file1 file2 file3
The trick with tee depends on timing and luck (and probably a small data file). If you had a megabyte or larger file, I expect it would be clobbered, at least in part, by the tee command. That is, if the file is large enough, the tee command would open the file for output and truncate it before sort finished reading it.
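If you want to see the race for yourself, a rough sketch (the exact outcome depends on timing and pipe buffering on your system, and it deliberately clobbers big.txt):
seq 1 2000000 > big.txt
sort big.txt | tee big.txt > /dev/null
wc -l big.txt    # often far fewer than 2000000 lines survive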
It doesn't work because > redirection implies truncation, and bash truncates and redirects the output before running sort, rather than buffering the whole output of sort in memory. Thus, the contents of file1 are truncated before sort has a chance to read them.
It's unwise to depend on either of these commands to work the way you expect.
The way to modify a file in place is to write the modified version to a new file, then rename the new file to the original name:
sort file1 > file1.tmp && mv file1.tmp file1
This avoids the problem of reading the file after it's been partially modified, which is likely to mess up the results. It also makes it possible to deal gracefully with errors; if the file is N bytes long, and you only have N/2 bytes of space available on the file system, you can detect the failure creating the temporary file and not do the rename.
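A slightly more robust variation on the same idea uses mktemp, so that concurrent runs cannot collide on the temporary name (a sketch, assuming a mktemp that accepts a template argument, as GNU coreutils' does):
tmp=$(mktemp file1.XXXXXX) && sort file1 > "$tmp" && mv "$tmp" file1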
Or you can rename the original file, then read it and write to a new file with the same name:
mv file1 file1.bak && sort file1.bak > file1
Some commands have options to modify files in place (for example, perl and sed both have -i options; note that the syntax of sed's -i option varies between implementations). But these options work by creating temporary files; it's just done internally.
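For instance (GNU syntax shown; BSD sed requires an explicit backup-suffix argument after -i, e.g. -i ''):
sed -i 's/old/new/' file1
perl -i -pe 's/old/new/' file1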
The redirection is processed first. So in the first case, > file1 takes effect before sort runs and empties the file.
The first command doesn't work (sort file1 > file1) because, when the redirection operator (> or >>) is used, the shell creates or truncates the file before the sort command is even invoked, since the redirection is set up first.
The second command appears to work (sort file1 | tee file1) because sort happens to read the lines from the file before tee truncates it, and then writes the sorted data to standard output (as noted above, this depends on timing).
So when using any similar command, you should avoid using the redirection operator when reading from and writing to the same file; instead, use a relevant in-place editor (e.g. ex, ed, sed), for example:
ex '+%!sort' -cwq file1
or use other utils such as sponge.
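With sponge, from the moreutils package (assuming it is installed), the usage is simply:
sort file1 | sponge file1
sponge soaks up all of its input before opening the output file, which is exactly what avoids the truncation problem described above.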
Luckily, sort has the -o parameter, which writes the results to a file (as suggested by @Jonathan), so the solution is straightforward: sort -o file1 file1.
Bash opens and truncates the file when it sets up the redirection, and only then runs sort.
In the second case, tee opens the file after sort has already read the contents (when the timing works out).
You can use this method
sort file1 -o file1
This will sort the file and store the result back in the original file. Also, you can use this command to remove duplicate lines:
sort -u file1 -o file1
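A quick demonstration of both forms (the file name and contents here are just made up for the example):
printf 'b\na\nb\n' > file1
sort -u file1 -o file1
cat file1    # prints "a" then "b": sorted, with duplicates removed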

How to store result of diff in Linux

How do I get the result into another file after applying diff to files A.txt and B.txt?
Suppose File A.txt has:
a
b
c
File B.txt has:
a
b
on running
diff A.txt B.txt
It shows the difference (the line c), but how do I store that in a file C.txt?
The diff utility produces its output on standard output (usually the console). Like any UNIX utility that does this, its output may very simply be redirected into a file like this:
diff A.txt B.txt >C.txt
This means "execute the command diff with two arguments (the files A.txt and B.txt) and put everything that would otherwise be displayed on the console into the file C.txt". Error messages will still go to the console.
To save the output of diff to a file and also send it to the terminal, use tee like so:
diff A.txt B.txt | tee C.txt
tee will duplicate the data to all named files (only C.txt here) and also to standard output (most likely the terminal).
Using >, you can redirect output to a file. E.g.:
diff A.txt B.txt > C.txt
This will result in the output from the diff command being saved in a file called C.txt.
Use Output Redirection.
diff file1 file2 > output
will store the diff of file1 and file2 in the file output.
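Since diff's documented exit status distinguishes the outcomes (0 means no differences, 1 means differences were found, 2 means trouble), the redirection combines naturally with a test in scripts, for example:
if diff A.txt B.txt > C.txt; then
    echo "files are identical"
else
    echo "differences saved to C.txt"
fi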
There are some files that diff may not handle well, like block special files, character special files, and broken links; the output describing differences in these may go to standard error.
Interestingly, when I redirected standard error, I still missed some things:
diff -qr <DirA> <DirB> 2>&1 > mydiff.txt
(With this ordering, 2>&1 is applied before stdout is redirected, so standard error still goes to the terminal; writing > mydiff.txt 2>&1 instead would capture both streams in the file.)
The only way I found to see the results of everything was:
diff -qr <DirA> <DirB> | tee mydiff.txt
I was comparing the contents of a live-CD-mounted directory after copying it to an external HD.

Editing large text files on linux ( 5 - 10gb)

Basically, I need a file of a specified format and large size (around 10 GB). To get this, I am copying the contents of my original file into the same file, multiple times, to increase its size. I don't care about the contents of the file as long as they have the required format.
Initially, I tried to do this using gedit, which failed miserably after a few hundred MB. I'm looking for an editor that will help me do this, or perhaps a suggestion of alternative ways.
You could make two files and repeatedly append them to each other:
cp file1 file2
for x in $(seq 1 200); do
    # each pass appends file1 to file2 and then file2 to file1,
    # so the sizes grow exponentially; far fewer than 200 iterations
    # will reach 10 GB, so stop once the file is big enough
    cat file1 >> file2
    cat file2 >> file1
done
In Windows, from the command line:
copy file1.txt+file2.txt file3.txt
concatenates files 1 and 2 and places the result in file3; repeat, or add more +args, until you get the size you need.
For Unix,
cat file1.txt file2.txt >> file3.txt
concatenates files 1 and 2 and appends the result to file3; repeat, or add more input files, until you get the size you need.
There are probably many other ways to do this in Unix.
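One such way, a minimal sketch assuming GNU coreutils on Linux (stat -c %s prints a file's size in bytes; the file names here are just placeholders):
cp original.txt big.txt
# keep doubling the file until it reaches roughly 10 GB
while [ "$(stat -c %s big.txt)" -lt $((10 * 1024 * 1024 * 1024)) ]; do
    cat big.txt big.txt > tmp.txt && mv tmp.txt big.txt
done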
