SED or other editor - remove strings from file on Windows

I need to find a string in a text file, delete the line containing it, and save the file. Each string is read from another text file, which contains hundreds of different strings, one per line. The process should run from the first to the last string in that file.
Are there any (hopefully easy-to-use) text editors on Windows recommended to achieve this task?
I am not into serious day-to-day editing, so I'd be ever so happy if the task could be accomplished with an easy-to-use but still reliable editor.
Thanks a bunch,
Frank

You can try Notepad++, since it has a lot of plugins and a great search feature. I did a similar task where I had to do a lot of search/replace work, and used a plugin I dug up from the internet; I can't remember the name exactly (try googling, I think it's "replaacc" for Notepad++ or something similar).

On unix/linux/cygwin:
grep -v -f pattern_file unmodified_file > new_file
Remove all lines containing the patterns in pattern_file from unmodified_file, write to new_file.
grep -v outputs lines not matching any pattern. -f reads patterns from a file.
On Windows, the equivalent appears to be running the following at the command prompt:
FINDSTR /V /G:pattern_file unmodified_file > new_file
That's it. If you already have the two source files, it's a one-liner.
pattern_file matching is going to be whitespace- and case-sensitive unless you delve into other options, which are described with FINDSTR /?
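For instance, a quick sanity check with made-up file contents (hypothetical names) looks like this:
$ cat pattern_file    # made-up sample blacklist
foo
bar
$ cat unmodified_file    # made-up sample input
keep this line
a line with foo in it
bar appears here
another keeper
$ grep -v -f pattern_file unmodified_file > new_file
$ cat new_file
keep this line
another keeper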

Using sed:
sed -n '/PATTERN/!p' FILE > FILE.new # then copy FILE.new to FILE
This tells sed to not output anything by default (-n) and to print (p) only those lines that do not match the pattern (/PATTERN/!). If you have GNU sed you can just call
sed -i -n '/PATTERN/!p' FILE
which automatically updates the file thanks to -i / --in-place.
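As a quick illustration on a throwaway file (contents made up):
$ printf 'keep me\nhas PATTERN here\nkeep me too\n' > FILE    # made-up sample
$ sed -i -n '/PATTERN/!p' FILE
$ cat FILE
keep me
keep me too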

Related

How to conditionally edit files in vim

I have a requirement to batch edit a bunch of files using vim based on their content. The simplest example is that I'd like to perform a series of, let's say, substitutions on files, but only if the first line of the file matches a certain pattern.
I'm trying to do this kind of thing:
vim -e -s $file < changes.vim
I should add that I have no access to tools like sed and awk and would like to perform the entire operation in vim.
I recommend that you find the list of files you need, and pass that list into the command you want. For this, a combination of awk and xargs would seem useful. There are probably clever shorter things you can do…
awk 'FNR>1 {nextfile} /pattern/ { print FILENAME ; nextfile }' filePattern | xargs -I{} sh -c 'vim -e -s "$1" < changes.vim' _ {}
In the above, filePattern gives all the files you want (maybe *.c), and /pattern/ is the regex of the match you are looking for. xargs will take "one output at a time" and substitute it into the following command at the place where I put the {}. The inner sh -c wrapper is there so that the < changes.vim redirection applies to each vim invocation rather than to xargs itself.
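If the -I{} substitution is unfamiliar, here is a harmless sketch of what xargs does with it (echo standing in for vim):
$ printf 'a.c\nb.c\n' | xargs -I{} echo would edit {}
would edit a.c
would edit b.c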
I want to give a tip of the hat to this link where I found the inspiration for this answer.
vim only solution
EDIT - after I posted this you said you need a "vim only" solution. Here it is…
Step 1: create a conditionalEdits.vim file with the following lines at the start:
let line_num = search('searchExpression')   " any regex
if line_num == 1                            " first line matched
  center                                    " put your editing commands here...
  update                                    " save changes
endif
quit
Of course, instead of just centering the first line, you will want to put all your editing commands inside the if statement.
Now, you execute this with
vim -S '/path/to/my/conditionalEdits.vim' filePattern
(-S sources the script file on startup)
where filePattern matches all the files you might be interested in (but you will know for sure after you have looked at line 1 inside…)
Obviously you can navigate through the file in the usual way and look for matches / patterns etc to your heart's content - but this is the basic idea.
Helpful links: http://www.ibm.com/developerworks/library/l-vim-script-1/
and http://learnvimscriptthehardway.stevelosh.com
I highly recommend that you do this in a separate directory, using copies of a handful of files first, to make sure this actually does what you think it does. I would hate to be responsible for a bunch of files being overwritten (you do back up, right?)
You can loop over all files; whenever the pattern matches, open vim. Once the file is modified to your needs and closed, the next one will open.
#!/usr/bin/env bash
for file in *; do
  if [[ "$(sed '1q' "$file")" == "pattern" ]]; then
    vim "$file"
  fi
done
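As a sketch of what the first-line test selects (throwaway files, made-up names):
$ printf 'pattern\nmore text\n' > match.txt    # made-up sample
$ printf 'something else\nmore text\n' > skip.txt
$ for f in *.txt; do [[ "$(sed '1q' "$f")" == "pattern" ]] && echo "$f"; done
match.txt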
Within Vim, you can determine the matching files via :vimgrep; to check for a match in the first line, the \%l atom is handy:
:vimgrep /\%1lcertain pattern/ {file-glob}
Then, you can iterate through all matches with :cnext, or use the :QFDo command from here.
You can pass those commands either via vim -c {cmd} -c {cmd} ..., or in a separate script, as you outline in your question.

Terminal command to find lines containing a specific word?

I was just wondering what command I need to put into the terminal to read a text file, eliminate all lines that do not contain a certain keyword, and then print those lines to a new file. For example, the keyword is "system": I want to be able to print all lines that contain "system" to a new, separate file. Thanks
grep is your friend.
For example, you can do:
grep system <filename> > systemlines.out
man grep will give you additional useful info as well (e.g. line numbers, one or more lines of context before or after a match, negation, i.e. all lines that do not contain the pattern, etc.).
If you are running Windows, you can either install cygwin or you can find a win32 binary for grep as well.
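A quick illustration with a made-up file:
$ cat example.log    # hypothetical sample file
system started
user logged in
system halted
$ grep system example.log > systemlines.out
$ cat systemlines.out
system started
system halted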
grep '\<system\>'
will search for lines that contain the word system, and not system as a substring.
The grep command below will solve your problem:
grep -i yourword filename1 > filename2
Use -i for case insensitivity; omit it for case sensitivity.
To learn how grep works on your server, refer to its man page via the following command:
man grep
grep "system" filename > new-filename
You might want to make it a bit cleverer to not include lines with words like "dysystemic", but it's a good place to start.
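One refinement (my sketch, not part of the original answer) is grep's -w flag, which matches whole words only:
$ printf 'the system works\na dysystemic case\n' | grep -w system    # made-up sample
the system works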

grep based on blacklist -- without procedural code?

It's a well-known task, simple to describe:
Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.
A common application is filtering compiler warnings from a build log, but to ignore warnings on files that are not yours. The file foo.txt is the warnings file (itself filtered from the build log), and a blacklist file excluded_filenames.txt with file names, one per line.
I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.
But I feel that I should be really close with xargs, and just can't see the last step.
I know that if excluded_filenames.txt has only 1 file name in it, then
grep -v "`cat excluded_filenames.txt`" foo.txt
will do it.
And I know that I can get the filenames one per line with
xargs -L1 -a excluded_filenames.txt
So how do I combine those two into a single solution, without explicit loops in a procedural language?
Looking for the simple and elegant solution.
You should use the -f option, which reads the patterns from a file:
grep -vf excluded_filenames.txt foo.txt
You could also use -F which is more directly the answer to what you asked:
grep -vF "`cat excluded_filenames.txt`" foo.txt
from man grep
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
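The difference matters for this use case: without -F, an unescaped dot in a blacklisted filename is a regex wildcard and can exclude more lines than intended. A small sketch with made-up warnings:
$ cat excluded_filenames.txt    # hypothetical blacklist
foo.c
$ printf 'warning in foo.c\nwarning in fooXc\n' | grep -vf excluded_filenames.txt
$ printf 'warning in foo.c\nwarning in fooXc\n' | grep -vFf excluded_filenames.txt
warning in fooXc
The first command prints nothing, because the pattern foo.c (where the dot matches any character) also matches fooXc; the second treats the blacklist entry literally and keeps the fooXc line.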

Linux command to replace string in LARGE file with another string

I have a huge SQL file that gets executed on the server. The dump is from my machine, and it contains a few settings relating to my machine. So basically, I want every occurrence of "c://temp" to be replaced by "//home//some//blah".
How can this be done from the command line?
sed is a good choice for large files.
sed -i.bak -e 's%C://temp%//home//some//blah%' large_file.sql
It is a good choice because it doesn't read the whole file at once in order to change it. Quoting the manual:
A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
The relevant manual section is here. A small explanation follows
-i.bak enables in-place editing, leaving a backup copy with the .bak extension
s%foo%bar% uses s, the substitution command, which substitutes matches of the first string between the % signs ('foo') with the second string ('bar'). It's usually written as s/foo/bar/, but because your strings contain plenty of slashes, it's more convenient to use a different delimiter so you avoid having to escape them.
Example
vinko@mithril:~$ sed -i.bak -e 's%C://temp%//home//some//blah%' a.txt
vinko@mithril:~$ more a.txt
//home//some//blah
D://temp
//home//some//blah
D://temp
vinko@mithril:~$ more a.txt.bak
C://temp
D://temp
C://temp
D://temp
Just for completeness: in-place replacement using Perl.
perl -i -p -e 's{c://temp}{//home//some//blah}g' mysql.dmp
No backslash escapes required either. ;)
Try sed? Something like:
sed 's/c:\/\/temp/\/\/home\/\/some\/\/blah/' mydump.sql > fixeddump.sql
Escaping all those slashes makes this look horrible, though; here's a simpler example, which changes foo to bar.
sed 's/foo/bar/' mydump.sql > fixeddump.sql
As others have noted, you can choose your own delimiter, which would prevent the leaning toothpick syndrome in this case:
sed 's|c://temp|//home//some//blah|' mydump.sql > fixeddump.sql
The clever thing about sed is that it operates on a stream rather than on the file all at once, so you can process huge files using only a modest amount of memory.
There's also a non-standard UNIX utility, rpl, which does the exact same thing that the sed examples do; however, I'm not sure whether rpl operates streamwise, so sed may be the better option here.
The sed command can do that.
Rather than escaping the slashes, you can choose a different delimiter (_ in this case):
sed -e 's_c://temp/_/home//some//blah/_' file1.txt > file2.txt
perl -pi -e 's#c://temp#//home//some//blah#g' yourfilename
-p treats the script as a loop: it reads the specified file line by line, running the regex search and replace on each line.
-i should be used in conjunction with the -p flag; it tells Perl to edit the file in place.
-e just means: execute this Perl code.
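For example, a dry run on a scratch copy (made-up contents) behaves like this:
$ printf 'c://temp/dump\nuntouched line\n' > test.dmp    # hypothetical scratch file
$ perl -pi -e 's#c://temp#//home//some//blah#g' test.dmp
$ cat test.dmp
//home//some//blah/dump
untouched line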
Good luck
gawk
awk '{gsub("c://temp","//home//some//blah")}1' file
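Note that awk writes to standard output rather than editing in place, so redirect to a new file. A quick sketch:
$ printf 'c://temp/x\nleave me alone\n' | awk '{gsub("c://temp","//home//some//blah")}1'
//home//some//blah/x
leave me alone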

Replacing a line in a csv file?

I have a set of 10 CSV files, which normally have entries of this kind:
a,b,c,d
d,e,f,g
Now, due to some error, entries in these files have become of this kind:
a,b,c,d
d,e,f,g
,,,
h,i,j,k
Now I want to remove the lines consisting only of commas, in all the files. These files are on a Linux filesystem.
Is there any command you'd recommend that replaces the erroneous lines in all the files?
It depends on what you mean by replace. If you mean 'remove', then a trivial variant on @wnoise's solution is:
grep -v '^,,,$' old-file.csv > new-file.csv
Note that this deletes just those lines with exactly three commas. If you want to delete mal-formed lines with any number of commas (including zero) - and no other characters on the line, then:
grep -v '^,*$' ...
There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.
Check out Mastering Regular Expressions.
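For the sample data from the question, the broader '^,*$' pattern behaves like this (a quick sketch):
$ printf 'a,b,c,d\n,,,\nh,i,j,k\n' | grep -v '^,*$'
a,b,c,d
h,i,j,k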
sed 's/,,,/replacement/' < old-file.csv > new-file.csv
optionally followed by
mv new-file.csv old-file.csv
Replace or remove? Your post is not clear... For replacement, see wnoise's answer. For removing, you could use
awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
What about trying to keep only the lines that match the desired format, instead of handling the one exception?
If the provided input is what you really want to match:
grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv
If the input is different, provide it; the regular expression should not be too hard to write.
Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:
sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...
To replace: well, see wnoise's answer, or if you don't want to create new files with the output,
sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...
or
sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...
(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:
egrep -v '^,+$' < oldfile.csv > newfile.csv
I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.
EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile ) and the \+ modifier (which can be replaced with \{1,\} ).
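A portable variant of the delete command, sketched on the sample data from the question:
$ printf 'a,b,c,d\n,,,\nh,i,j,k\n' | sed '/^,\{1,\}$/d'
a,b,c,d
h,i,j,k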
Most simply:
$ grep -v ,,, oldfile > newfile
$ mv newfile oldfile
Yes, awk or grep are very good options if you are working on a Linux platform. On other platforms you can use Perl regexes, together with join and split.
