Using diff command, ignore character at end of line - linux

I'm not entirely sure what sort of diff command would do what I need. Basically I have two directories full of files that I need to compare and outline the changes between. But in one set of files the lines have a '1' tacked onto the end.
An example would be comparing these two files:
File1/1.txt
I AM IDENTICAL
File2/1.txt
I AM IDENTICAL 1
So I'd just want the diff command to leave out the '1' at the end of the line and show me the files which actually have changes. So far I've come up with something like
diff file1/ file2/ -rw -I "$1" | more
but that doesn't work.
Apologies if this is an easy obvious question.

If the number of files and/or their size is not that large, you can eyeball the differences and simply use the vimdiff command to compare two files side by side:
vimdiff File1/1.txt File2/1.txt
Otherwise, as arkascha suggested, you first need to modify your files to eliminate the trailing character(s) before comparing them.
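For example, a minimal sketch in bash using process substitution, assuming the stray character is a literal trailing ' 1' as in the example above (directory names taken from the question):

# compare each file in file1/ against its counterpart in file2/,
# stripping a trailing " 1" from every line first (the pattern is an assumption)
for f in file1/*; do
    diff <(sed 's/ 1$//' "$f") <(sed 's/ 1$//' "file2/$(basename "$f")")
done

This leaves the original files untouched, since the stripped copies exist only in the pipeline.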

Related

Is there a way to adjust vimdiff's line position for a change?

I have fixed some indentation issues in my project and I'm looking at the vimdiff output for the before and after. I notice that vimdiff seems to be very confused as to what the actual changes are, rendering a pretty much useless output in this case:
For example, it seems to think that the very first line is a newly added line:
<div class="text-xs-center p-4">
In reality, all that has changed is the indentation. Vimdiff is not recognizing the changes properly.
In another, similar file, the diff works much better:
I think the difference is that in the second file I did not remove the first line break.
Is there a way to manually fix this sort of thing, so that the diff shows properly? I don't want to change either file, the changes are correct. But I'd like to tell vimdiff that it's comparing the wrong lines to one another.
Is this possible?
The underlying diff tool compares individual lines, regardless of whether "only" indentation changed or something more fundamental. In your first case there is one unindented line that is identical in both versions, so diff treats it as unchanged and aligns the two files around it, which throws off the entire diff.
If you want to ensure that only indentation got changed, you can ignore whitespace changes via
:set diffopt+=iwhite
Then, the diff should show no changes at all (or, in your first example, only the added line 5).
Maybe there are also other diff utilities that handle these cases more intelligently. If you find such a tool, you can configure Vim to use it via 'diffexpr'.
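For reference, a minimal 'diffexpr' sketch modeled on the example in Vim's help; it shells out to plain diff with -b, so the external tool itself ignores whitespace changes:

set diffexpr=MyDiff()
function MyDiff()
  " Vim passes the two files to compare as v:fname_in and v:fname_new
  " and expects normal diff output to be written to v:fname_out
  silent execute '!diff -b ' . v:fname_in . ' ' . v:fname_new .
        \ ' > ' . v:fname_out
  redraw!
endfunction

Any smarter external diff tool could be substituted on the execute line.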

Keep only rows in a range using vim

I was wondering if there's a way in Vim to keep only rows in a certain range, i.e. say I wanted to keep only rows 1:20 in a file and discard everything else. Better yet, say I wanted to keep lines 1-20 and 40-60; is there a way to do this?
Is there a way to do this without manually deleting stuff?
If you mean entire lines by "rows", just use the :delete command with the inverted range:
:21,$delete
removes all lines except 1-20.
If the ranges are non-consecutive, an alternative is the :vglobal command with regular expression atoms that match only in certain lines. For example, to only keep lines 3 and 7:
:g!/\%3l\|\%7l/delete
There are also atoms for "lines less/larger than", so you can build ranges with them, too.
In order to keep lines 1 through 20 and 40 through 60, the following construct should do:
:v/\%>0l\%<21l\|\%>39l\%<61l/d
If you want to (as I now understand from your comments) save different parts of the buffer as new files, it's best not to modify the original file, but to write the fragments as separate files. In fact, Vi(m) supports this well, because you can just pass a range to the :write command:
:1,20w newfile1
:40,60w newfile2
Append works, too:
:40,60w >> newfile1
There's more than one way to achieve what you want:
If this question is really about the first rows in a file:
head -20 <filename> > newfile
If you want a vim solution:
:21<Enter>
dG
That is, jump to line 21, then delete from there to the end of the file.
However, you mention that you want to split up a large file into smaller pieces. The tool for this is split: it lets you break a file into chunks of equal line count or equal size.
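For example (the file name and prefix are placeholders), this writes 1000-line chunks named chunk_aa, chunk_ab, and so on:

split -l 1000 largefile.txt chunk_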

Using sed to delete lines present in similar file

I have file listings from an original and a duplicate drive consisting of 985257 lines and 984997 lines respectively.
As the numbers of lines do not match, I am certain that some of the files have not been duplicated.
In order to establish which files are not present I wish to use sed to filter the original file listing by deleting any lines present in the duplicate listing from the source listing.
I had thought about using a match formula in Excel, but due to the number of lines the program crashes. I thought this approach in sed would be a viable option.
I have had no success with my approach so far however.
echo "Start"
# Cat the passed argument which is the duplicate file listing
for line in $(cat $1)
do
#sed the $line variable over the larger file and remove
#sed "${line}/d" LiveList.csv
#sed -i "${line}/d" LiveList.csv
#sed -i '${line}' 'd' LiveList.csv
sed -i "s/'${line}'//" /home/listings/LiveList.csv
done
A temporary file is created and fills up to the 103.4MB size of the listing file, however the listing file itself is not altered at all.
My other concern is that, as the listing was created in Windows, the '\' characters may be escaping parts of the string, leading to no matches and therefore no alteration.
Example path:
Path,Length,Extension
Jimmy\tail\images\Jimmy\0001\0014\Text\A0\20\A056TH01-01.html,71982,.html
Please help.
This might work for you:
sort original_list.txt duplicate_list.txt | uniq -u
The first thing that comes to my mind is just using rsync to take care of copying the missing files as fast as possible. It really works wonders.
If not, you can first sort both files to identify where they differ. You can use some paste trickery to put the differences side by side, or even use diff's side-by-side output. When the files are sorted, I think diff finds it easier to identify which lines have been added.
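For instance, a sketch of the sorted comparison (file names as above; --suppress-common-lines leaves only the differing lines):

# sort both listings, then show only the lines that differ, side by side
sort original_list.txt > orig.sorted
sort duplicate_list.txt > dup.sorted
diff --side-by-side --suppress-common-lines orig.sorted dup.sorted

A fixed-string approach like grep -Fxvf duplicate_list.txt original_list.txt would also sidestep the backslash-escaping worry from the question, since -F disables regular expression interpretation.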

Compare 2 files with shell script

I was trying to find a way to know whether two files are the same, and found this post...
Parsing result of Diff in Shell Script
I used the code in the first answer, but I think it's not working, or at least I can't get it to work properly...
I even tried making a copy of a file and comparing both (copy and original), and I still get the answer that they are different, when they shouldn't be.
Could someone give me a hand, or explain what's happening?
Thanks so much;
peixe
Are you trying to compare if two files have the same content, or are you trying to find if they are the same file (two hard links)?
If you are just comparing two files, then try:
diff "$source_file" "$dest_file" # without -q
or
cmp "$source_file" "$dest_file" # without -s
in order to see the supposed differences.
You can also try md5sum:
md5sum "$source_file" "$dest_file"
If both files return the same checksum, then they are identical.
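In a script, cmp -s is handy because it reports the result purely through its exit status (variable names as in the snippets above):

# exit status 0 means identical content, non-zero means different
if cmp -s "$source_file" "$dest_file"; then
    echo "files are identical"
else
    echo "files differ"
fi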
comm is a useful tool for comparing files.
The comm utility will read file1 and file2, which should be ordered in the current collating sequence, and produce three text columns as output: lines only in file1; lines only in file2; and lines in both files.
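So, for example, to list only the lines present in file1 but missing from file2 (a bash sketch; comm expects sorted input, and -2 -3 suppress the second and third columns):

comm -23 <(sort file1) <(sort file2)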

How can I view log files in Linux and apply custom filters while viewing?

I need to read through some gigantic log files on a Linux system. There's a lot of clutter in the logs. At the moment I'm doing something like this:
cat logfile.txt | grep -v "IgnoreThis\|IgnoreThat" | less
But it's cumbersome -- every time I want to add another filter, I need to quit less and edit the command line. Some of the filters are relatively complicated and may be multi-line.
I'd like some way to apply filters as I am reading through the log, and a way to save these filters somewhere.
Is there a tool that can do this for me? I can't install new software so hopefully it's something that would already be installed -- e.g., less, vi, something in a Python or Perl lib, etc.
Changing the code that generates the log to generate less is not an option.
Use the &pattern command within less.
From the man page for less
&pattern
Display only lines which match the pattern; lines which do not match the pattern are not displayed. If pattern is empty (if you type & immediately followed by ENTER), any filtering is turned off, and all lines are displayed. While filtering is in effect, an ampersand is displayed at the beginning of the prompt, as a reminder that some lines in the file may be hidden.
Certain characters are special as in the / command:
^N or ! Display only lines which do NOT match the pattern.
^R Don't interpret regular expression metacharacters; that is, do a simple textual comparison.
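For example, while viewing the log you could type (the patterns are placeholders):

&ERROR        show only lines matching ERROR
&!IgnoreThis  hide lines matching IgnoreThis (the ! prefix described above)
&             followed by ENTER: turn filtering off again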
Try the multitail tool - as well as letting you view multiple logs at once, I'm pretty sure it lets you apply regex filters interactively.
Based on ghostdog74's answer and the less manpage, I came up with this:
~/.bashrc:
export LESSOPEN='|~/less-filter.sh %s'
export LESS=-R # to allow ANSI colors
~/less-filter.sh:
#!/bin/sh
case "$1" in
    *logfile*.log*) sed -f ~/less-filter.sed "$1"
    ;;
esac
~/less-filter.sed:
# filter out lines like this
/deleteLinesLikeThis/d
# change text on lines (useful to colorize using ANSI escapes)
s/this/that/
Then:
less logfileFooBar.log.1 -- the filter is applied automatically.
cat logfileFooBar.log.1 | less -- to see the log without filtering
This is adequate for now but I would still like to be able to edit the filters on the fly.
See the man page of less; there are some options you can use, for example to search for words. It has a line-editing mode as well.
There's an application by Casstor Software Solutions called LogFilter (www.casstor.com) that can edit Windows/Mac/Linux text files and can easily perform file filtering. It supports multiple filters as well as regular expressions. I think it might be what you're looking for.
