Move lines from one .txt file to another - string

I am trying to move certain lines from one .txt file to another. These lines all follow a certain pattern. I have been looking at using the find command in a batch file, but this does not delete the line from the original file.
For example:
find /i "pattern" "d:\example1.txt" >> "d:\example2.txt"
Is there any way to achieve this?
Thanks in advance.

Using findstr you can print lines that don't match, too. So you can do it in several steps, pseudocoded like this:
findstr pattern input > output
findstr /v pattern input > input-inverse
move /y input-inverse input
This should leave you with all lines matching pattern in output, and an input without those lines.
EDIT: Made the last step use move with an option to overwrite, so no need to remove the input before. I guess I (being mainly a Linux person) think of "rename" and "move" as the same thing, and took that overwrite for granted. So, thanks for the heads-up.
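For readers on a Unix-like shell (Cygwin, Git Bash, Linux), the same three steps can be sketched with grep. This is only an illustration: the function name move_matching_lines and its argument order are assumptions, not something from the answer above.

```shell
#!/usr/bin/env bash
# Sketch: move every line matching a pattern from one file to another.
# move_matching_lines is a hypothetical helper name.
move_matching_lines() {
    local pattern=$1 src=$2 dst=$3 tmp
    tmp=$(mktemp) || return 1
    # Step 1: append the matching lines to the destination file.
    grep -e "$pattern" -- "$src" >> "$dst"
    # Step 2: collect the non-matching lines in a temporary file.
    grep -v -e "$pattern" -- "$src" > "$tmp"
    # Step 3: overwrite the source with the non-matching lines
    # (cat keeps the permissions of the original file).
    cat -- "$tmp" > "$src" && rm -f -- "$tmp"
}

# Usage: move_matching_lines 'pattern' input output
```

As in the findstr version, the source file is only overwritten after both passes over it have finished.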

If you can use external programs, one way would be using awk or sed.
Awk example:
awk '/pattern/ { print }' input
Sed example:
sed '/inverse_pattern/d' input   # deletes the lines that match inverse_pattern

How about creating two files, then replacing the original?
find /i "pattern" < "d:\example1.txt" >> "d:\example2.txt"
find /v /i "pattern" < "d:\example1.txt" >> "d:\example3.txt"
del "d:\example1.txt"
ren "d:\example3.txt" example1.txt
Deleting lines from files is hard. Typically, even in a genuine programming environment, you'd be using an extra file here.
Here's a slightly different implementation:
ren "d:\example1.txt" source.txt
find /i "pattern" < "d:\source.txt" >> "d:\example2.txt"
find /v /i "pattern" < "d:\source.txt" >> "d:\example1.txt"
del "d:\source.txt"

Related

How do I replace ".net" with space using sed in Linux?

I'm using for loop, with arguments i. Each argument contains ".net" at the end and in directory they are in one line, divided by some space. Now I need to get rid of these ".net" using substitution of sed, but it's not working. I went through different options, the most recent one is
sed 's/\.(net)//g' $i;
which is obviously not correct, but I just can't find anything online about this.
To make it clear, lets say I have a directory with 5 files with names
file1.net
file2.net
file3.net
file4.net
file5.net
I would like my output to be
file1
file2
file3
file4
file5
...Could somebody give me some advice?
You can use
for f in *.net; do mv "$f" "${f%.*}"; done
Details:
for f in *.net; - iterates over files with net extension
mv "$f" "${f%.*}" - renames each file to its name without the net extension (${f%.*} removes the shortest suffix matching .*, that is, everything from the last . to the end of f; see Parameter expansion).
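To make the difference between the shortest and the longest suffix match concrete, here is a small sketch. The helper names strip_last_ext and strip_all_ext and the sample file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: % removes the shortest matching suffix, %% the longest.
# strip_last_ext and strip_all_ext are hypothetical helper names.
strip_last_ext() { printf '%s\n' "${1%.*}"; }
strip_all_ext()  { printf '%s\n' "${1%%.*}"; }

strip_last_ext "file1.net"        # prints: file1
strip_last_ext "archive.tar.net"  # prints: archive.tar
strip_all_ext  "archive.tar.net"  # prints: archive
```

For names with a single dot, such as file1.net, both forms give the same result.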
This is a job for Perl's rename:
rename -n 's/\.net$//' *.net
The -n is for testing: it only prints what would be renamed. Remove it if the output looks good to you.
This way:
sed -i.backup 's/\.net$//g' "$1";
This edits the contents of the file given as $1 in place and keeps a backup copy with the .backup suffix for safety.

SED or other editor - remove strings from file on Windows

I need to find a string in a textfile, delete the line containing it, and save the file. The string is found (read from) another textfile, containing hundreds of different strings, one per row. The process is to go on from the first to the last string in the file.
Are there any (hopefully easy-to-use) text editors on Windows you would recommend to achieve the task?
I am not into serious day-to-day editing, so I'd be ever so happy if the task could be accomplished with an easy-to-use but still reliable editor.
Thanks a bunch,
Frank
You can try notepad++ since it has a lot of plugins, also a great search algorithm. I did a similar task where I had to do a lot of search/replace stuff, and used a plugin I dug up from the internet, can't remember the name exactly (try google-ing I think it's replaacc for notepad++ or something similar).
On unix/linux/cygwin:
grep -v -f pattern_file unmodified_file > new_file
Remove all lines containing the patterns in pattern_file from unmodified_file, write to new_file.
grep -v outputs lines not matching any pattern. -f reads patterns from a file.
On windows this appears equivalent to running this at the command prompt:
FINDSTR /V /G:pattern_file unmodified_file > new_file
That's it. If you already have the two source files, it's a one-liner.
pattern_file is going to be whitespace and case sensitive unless you delve into other options, which are described with FINDSTR /?
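Note that grep -f treats each row of pattern_file as a regular expression. If the rows are meant as literal strings, adding -F may be the safer choice. A minimal sketch, where the function name remove_listed_lines is an assumption for illustration:

```shell
#!/usr/bin/env bash
# Sketch: drop every line of a file that contains any string listed in
# another file (one string per row). remove_listed_lines is a hypothetical name.
remove_listed_lines() {
    local patterns=$1 input=$2 output=$3
    # -F: rows are fixed strings, not regexes; -f: read them from a file;
    # -v: print only the lines that match none of them.
    # Caution: a blank row in the list matches every line.
    grep -v -F -f "$patterns" -- "$input" > "$output"
}

# Usage: remove_listed_lines pattern_file unmodified_file new_file
```

With -F, a row such as a.b only removes lines containing the literal text a.b, not lines like axb.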
Using sed:
sed '/PATTERN/d' FILE > FILE.new # then copy FILE.new to FILE
This tells sed to delete (d) every line that matches the pattern (/PATTERN/); all other lines are printed unchanged. If you have GNU sed, you can just call
sed -i '/PATTERN/d' FILE
which updates the file in place (-i / --in-place).

How to conditionally edit files in vim

I have a requirement to batch edit a bunch of files using vim based on their content. The simplest example is that I'd like to perform a series of let's say substitutions on files but only if the first line of the file matches a certain pattern.
I'm trying to do this kind of thing:
vim -e -s $file < changes.vim
I should add that I have no access to tools like sed and awk and would like to perform the entire operation in vim.
I recommend that you find the list of files you need, and pass that list into the command you want. For this, a combination of awk and xargs would seem useful. There are probably clever shorter things you can do…
awk 'FNR>1 {nextfile} /pattern/ { print FILENAME ; nextfile }' filePattern | xargs -I{} sh -c 'vim -e -s "$1" < changes.vim' sh {}
In the above, filePattern gives all the files you want (maybe *.c), /pattern/ is the regex of the match you are looking for. xargs will take "one output at a time" and substitute it into the following command at the place where I put the {}. The sh -c wrapper gives each vim its own input from changes.vim; redirecting the input of xargs itself would replace the file list coming from the pipe.
I want to give a tip of the hat to this link where I found the inspiration for this answer.
vim only solution
EDIT - after I posted this you said you need a "vim only" solution. Here it is…
Step 1: create a conditionalEdits.vim file with the following lines at the start:
let line_num = search('searchExpression', 'cnW') " any regex; flags: accept a match at the cursor, do not move, do not wrap
if line_num == 1 " the first match is on the first line
center " put your editing commands here...
update " save changes
endif
quit
Of course, instead of just centering the first line, you will want to put all your editing commands inside the if statement.
Now, you execute this command with
for file in filePattern; do vim -e -s -c 'source /path/to/my/conditionalEdits.vim' "$file"; done
where filePattern matches all the files you might be interested in (but you will know for sure after you have looked at line 1 inside…)
Obviously you can navigate through the file in the usual way and look for matches / patterns etc to your heart's content - but this is the basic idea.
Helpful links: http://www.ibm.com/developerworks/library/l-vim-script-1/
and http://learnvimscriptthehardway.stevelosh.com
I highly recommend that you do this in a separate directory, using copies of a handful of files first, to make sure this actually does what you think it does. I would hate to be responsible for a bunch of files being overwritten (you do back up, right?)
You can loop over all files, if you find the pattern, open vim. Once it is modified to your needs and closed, the next one will open.
#!/usr/bin/env bash
for file in *; do
    # the quoted right-hand side is compared literally with the first line
    if [[ "$(sed '1q' "${file}")" == "pattern" ]]; then
        vim "${file}"
    fi
done
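Since the question rules out sed and awk, the first-line check can also be sketched with bash builtins only. The function name first_line_matches and the example regex are assumptions for illustration; the regex is a bash extended regular expression, not a Vim pattern.

```shell
#!/usr/bin/env bash
# Sketch: test the first line of a file against a regex using only bash
# builtins. first_line_matches is a hypothetical helper name.
first_line_matches() {
    local file=$1 regex=$2 first
    [[ -f $file ]] || return 1
    # read stops at the first newline; IFS= and -r keep the line verbatim.
    # A non-empty line without a trailing newline is still accepted.
    IFS= read -r first < "$file" || [[ -n $first ]] || return 1
    [[ $first =~ $regex ]]
}

# Usage (changes.vim as in the question):
#   for file in *; do
#       if first_line_matches "$file" '^HEADER'; then
#           vim -e -s "$file" < changes.vim
#       fi
#   done
```

Directories and empty files are skipped because the function returns a non-zero status for them.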
Within Vim, you can determine the matching files via :vimgrep; to check for a match in the first line, the \%l atom is handy:
:vimgrep /\%1lcertain pattern/ {file-glob}
Then, you can iterate through all matching files with :cnfile, or use the :QFDo command from here.
You can pass those commands either via vim -c {cmd} -c {cmd} ..., or in a separate script, as you outline in your question.

Extracting sub-strings in Unix

I'm using cygwin on Windows 7. I want to loop through a folder consisting of about 10,000 files and perform a signal processing tool's operation on each file. The problem is that the files names have some excess characters that are not compatible with the operation. Hence, I need to extract just a certain part of the file names.
For example if the file name is abc123456_justlike.txt.rna I need to use abc123456_justlike.txt. How should I write a loop to go through each file and perform the operation on the shortened file names?
I tried the cut -b1-10 command, but that doesn't let my tool perform the necessary operation. I'd appreciate help with this problem.
Try some shell scripting, using the ${NAME%TAIL} parameter substitution: the contents of variable NAME are expanded, but any suffix material which matches the TAIL glob pattern is chopped off.
$ NAME=abc12345.txt.rna
$ echo "${NAME%.rna}"
abc12345.txt
# process all files in the directory, taking off their .rna suffix
$ for x in *; do signal_processing_tool "${x%.rna}" ; done
If there are variations among the file names, you can classify them with a case:
for x in * ; do
case $x in
*.rna )
# do something with .rna files
;;
*.txt )
# do something else with .txt files
;;
* )
# default catch-all-else case
;;
esac
done
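Putting the suffix removal and the loop together, a sketch might look like the following. The function name for_each_trimmed is hypothetical, and the command to run is passed in as arguments, so signal_processing_tool from the question can be substituted for echo.

```shell
#!/usr/bin/env bash
# Sketch: run a command once per *.rna file in a directory, passing the
# file name with the .rna suffix removed. for_each_trimmed is a hypothetical name.
for_each_trimmed() {
    local dir=$1; shift
    local path name
    for path in "$dir"/*.rna; do
        [[ -e $path ]] || continue   # the glob matched nothing
        name=${path##*/}             # drop the directory part
        "$@" "${name%.rna}"          # e.g. signal_processing_tool abc.txt
    done
}

# Usage: for_each_trimmed . echo
```

Restricting the glob to *.rna means files with other extensions are never passed to the tool, so no case statement is needed for this simple situation.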
Try sed:
echo a.b.c | sed 's/\.[^.]*$//'
The s command in sed performs a search-and-replace operation, in this case it replaces the regular expression \.[^.]*$ (meaning: a dot, followed by any number of non-dots, at the end of the string) with the empty string.
If you are not yet familiar with regular expressions, this is a good point to learn them. I find manipulating strings using regular expressions much more straightforward than using tools like cut (or their equivalents).
If you are trying to extract the list of filenames from a directory, use the command below.
ls -ltr | awk -F " " '{print $9}' | cut -c1-10

Extract Directory from Log File with sed

I'm trying to parse through an application.log that has many lines that follow the same syntax below.
"Error","jrpp-237","10/13/11","02:55:04",,"File not found: /indexUsa~.cfm The specific sequence of files included or processed is: c:\websites\pj7fe4\indexUsa~.cfm '' "
I need to use some type of command to pull out what is listed between c:\websites\ and the next \
e.g. in this case it would be pj7fe4
I thought that the following command would work..
bin/sed -n '/c:\\websites\\/,/\\/p' upload/test.log
Unfortunately from reading further I now understand that this will return the entire line containing c:\websites through the \ and I need to know the in between, not the whole line.
To be more difficult I need to match all of the directory sub paths, not just one particular line as this is for multiple sites.
You're using range patterns incorrectly. You can't use them to limit the command (print in this case) to a part of the line, only to a range of lines. You also don't escape the backslashes.
Try this: sed -n 's/.*c:\\websites\\\([0-9a-zA-Z]*\)\\.*/\1/p' upload/test.log
There's a good sed tutorial here: Sed - An Introduction and Tutorial by Bruce Barnett
grep way:
grep -Po "(?<=c:\\\websites\\\)[^\\\]+(?=\\\)" yourFile
test:
kent$ echo '"Error","jrpp-237","10/13/11","02:55:04",,"File not found: /indexUsa~.cfm The specific sequence of files included or processed is: c:\websites\pj7fe4\indexUsa~.cfm '' "'|grep -Po "(?<=c:\\\websites\\\)[^\\\]+(?=\\\)"
pj7fe4
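Because the log covers multiple sites, it may help to print each directory only once. A sketch under that assumption follows; the function name list_site_dirs is made up, and the pattern accepts any characters except a backslash in the directory name.

```shell
#!/usr/bin/env bash
# Sketch: list each directory that follows c:\websites\ in a log file once.
# list_site_dirs is a hypothetical helper name.
list_site_dirs() {
    # -n with the p flag prints only the lines where the substitution
    # succeeded; \1 is the path component right after c:\websites\
    sed -n 's/.*c:\\websites\\\([^\\]*\)\\.*/\1/p' "$1" | sort -u
}

# Usage: list_site_dirs upload/test.log
```

Lines that do not mention c:\websites\ produce no output, and sort -u removes the repeats.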
