I have a big text file in which I search for lines I need to delete. I manually copy one line, press the Mark button, and then repeat this for the next line. Is it possible to enter all the lines, or at least more than one, and mark them all at once? Maybe something like word1>word2>word3, etc. I don't know the syntax, sorry.
All words (lines) are unique, so recording a macro is not a solution.
My text
word1
word2
word3
word4
word5
word2
word5
word4
I need to mark lines - word1, word2, word5
I do it manually one by one, but I want to mark them all at once. Maybe there is a regex for this, something like [word1, word2, word5] in the Find what field? I'm not sure.
If the number of words is not too big, you can do:
Ctrl+M
Find what: \b(?:word1|word2|word5)\b
TICK Match case
Mark all
Explanation:
\b # word boundary
(?: # non capture group
word1 # word1
| # OR
word2 # word2
| # OR
word5 # word5
) # end group
\b # word boundary
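The same alternation can be tried outside the editor. The following is a minimal Python sketch, not part of the original answer: the `lines` and `targets` lists are just the sample from the question, and the pattern is built the same way as the one above. It reports the line numbers that would be marked.

```python
import re

# Sample text from the question, one entry per line.
lines = ["word1", "word2", "word3", "word4", "word5", "word2", "word5", "word4"]
# The words whose lines should be marked.
targets = ["word1", "word2", "word5"]

# Build \b(?:word1|word2|word5)\b. re.escape makes any regex
# metacharacters inside a word match literally.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, targets)) + r")\b")

# Collect the 1-based numbers of the lines that contain a target word.
marked = [number for number, line in enumerate(lines, start=1) if pattern.search(line)]
print(marked)
```

If the word list is long, building the pattern from a list like this is probably easier than typing the alternation by hand.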
I have a pdf which has word meanings in the following format:
word1 meaning1
_____________________
word2 meaning2 in
multiple lines.
_____________________
word3 meaning3
_____________________
I want to store this information in Excel as
| column1 | column2                     |
+---------+-----------------------------+
| word1   | meaning1                    |
| word2   | meaning2 in multiple lines. |
| word3   | meaning3                    |
For each pair, I want only one cell for the word and one for the meaning. I tried converting to Excel via online tools, but the structure is not identified correctly. Copy-pasting does not work because multiple lines in the PDF become multiple rows in Excel, and merging them (via a macro) with a full stop as the delimiter fails because some meanings contain full stops. Is there an easy way to achieve this?
This is a sample; leading and trailing spaces have been trimmed from all lines.
word word
word word word
word word word word
word
Result would be
word word
If you want to find lines that just have one space (with some 'word' before and after), try with this:
^\S+ \S+$
Explained:
^ # Begin of line
\S+ # non-blank character repeated 1 or more times
[ ] # a space (I'm using [ ] to make it more clear)
\S+ # non-blank character repeated 1 or more times
$ # end of line
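As a quick check of the pattern, here is a small Python sketch (an illustration only; the `lines` list is the sample from the question) that keeps the lines made of exactly two space-separated words.

```python
import re

# Sample lines from the question, already trimmed.
lines = ["word word", "word word word", "word word word word", "word"]

# Non-blank run, one space, non-blank run, then end of line.
one_space = re.compile(r"^\S+ \S+$")

# Keep only the lines that contain exactly one space between two words.
result = [line for line in lines if one_space.match(line)]
print(result)
```

Because \S cannot match a space, any line with a second space fails at the end-of-line anchor.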
I am trying to filter for multiple words using a loop, but the following is not working:
function! Myfilter (...)
for s in a:000
v/s/d
endfor
endfunction
It deletes all lines that do not contain the letter s rather than the value of s. How can I get value of s in statement v/s/d?
This function deletes all lines except those that contain all of the function's arguments:
function! Myfilter0 (...)
exec 'v/\(.*' . join(a:000, '\)\@=\(.*') . '\)\@=/d'
endfunction
Example buffer
word1 beta word2
a word1 b word2 c
a word2 b word1 c word3 d
a word2 b word3 c word1 d
word1 delta
epsilon
Example function call
:call Myfilter("word1", "word2", "word3")
Example result
a word2 b word1 c word3 d
a word2 b word3 c word1 d
Note
This uses regex lookahead (\@= in Vim) to match the words in any order. This is what the example regex looks like after substitution, shown without the escape characters for clarity:
:v/(.*word1)@=(.*word2)@=(.*word3)@=/d
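The lookahead logic can also be checked outside Vim. The sketch below is only an illustration in Python, whose `(?=...)` plays the role of Vim's lookahead. It applies one lookahead per word to the example buffer; the variable names are made up for the example.

```python
import re

# The example buffer from above, one entry per line.
buffer_lines = [
    "word1 beta word2",
    "a word1 b word2 c",
    "a word2 b word1 c word3 d",
    "a word2 b word3 c word1 d",
    "word1 delta",
    "epsilon",
]
words = ["word1", "word2", "word3"]

# One zero-width lookahead per word: (?=.*word1)(?=.*word2)(?=.*word3).
# Each lookahead starts from the same position, so word order is irrelevant.
pattern = re.compile("".join(f"(?=.*{re.escape(word)})" for word in words))

# Keep only the lines in which every lookahead succeeds.
kept = [line for line in buffer_lines if pattern.search(line)]
print(kept)
```

The two surviving lines are the ones that contain all three words, which agrees with the example result.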
The :execute command is designed precisely for this task of
evaluating an expression and executing it as an Ex command:
exec 'v/' . s . '/d'
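A much more efficient solution would be to compose a single regex from your parameters and use it to remove non-matching lines. Note that this variant keeps lines that match any of the arguments, whereas the loop keeps only lines that match all of them.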
exe 'v/\('.join(a:000, '\|').'\)/d_'
Can you help me figure out this problem?
I have a text document with a list, one item per line:
text1
text2
text3
text4
How can I remove the last 3 lines of the text document, that is, drop the last three of all existing lines, given that the list is constantly updated with new lines?
I also want to remember the first of the three deleted lines (maybe with append, I'm not sure):
word1
word2 <--- must be removed but remember for further manipulation
word3 <--- must be removed
word4 <--- must be removed
Then I want to put word2, which was remembered, back into my text document this way:
word1
word2 <--- back it to the list where it was before with two symbols
new word <--- new word comes here and second two lines space and X
X
But I'm not really sure how to find the last three lines and pick out the first of them.
To tackle your first issue, finding the last 3 lines of your list, you can use the slice [-3:].
In code it might look something like this:
list1 = []
list1.append('The first text item')
list1.append('Now the second text item')
list1.append('This one is the third text item')
list1.append('Finally the fourth text item')
list2 = list1[-3:]
You can also delete the last 3 items from the first list with
del list1[-3:]
If you use the above and put the last 3 text items into a separate list, obtaining the first word from each is straightforward.
So, read in the text document to a list, adjust the list and output the new list to a temp file.
Delete the original text document and rename the temp file.
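Put together, the steps above might look like the following sketch. The function name `drop_last_lines` and its parameters are hypothetical, chosen only for illustration. It also uses `os.replace` to swap the temp file in, instead of a separate delete and rename, which is a substitution on my part.

```python
import os
import tempfile


def drop_last_lines(path, count=3):
    """Remove the last `count` lines of the file and return them as a list."""
    if count <= 0:
        return []  # guard: lines[-0:] would select the whole list

    # Read the text document into a list, one entry per line.
    with open(path, encoding="utf-8") as source:
        lines = source.read().splitlines()

    removed = lines[-count:]  # remember the lines that are about to go
    del lines[-count:]        # adjust the list

    # Write the new list to a temp file in the same directory.
    directory = os.path.dirname(os.path.abspath(path))
    handle, temp_path = tempfile.mkstemp(dir=directory, text=True)
    with os.fdopen(handle, "w", encoding="utf-8") as temp:
        temp.writelines(line + "\n" for line in lines)

    # Swap the temp file in place of the original document.
    os.replace(temp_path, path)
    return removed
```

With the four-line list from the question, the first element of the returned list would be word2, which can then be appended back to the file later.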
I'm not very good at Linux, and am trying to use grep to count five-letter words.
You can use:
grep -o -w "\w\{5\}" your_file | wc -w
With -o, only the matched parts are printed, one per line; -w requires each match to be a whole word; and \w\{5\} is the regex itself (it matches 5 consecutive word characters). So, with your_file containing
word1 word2 word3
long_word 123 word4
Output of grep -o -w "\w\{5\}" your_file will be
word1
word2
word3
word4
Piping the output into wc -w then counts these words.
Note: If you don't want to match all alphanumeric characters, replace the \w meta-character with something more specific, for example [a-z] for lowercase English letters.
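For comparison, the same count can be sketched in Python. This is an illustration, not a replacement for the grep command: the `text` variable holds the sample file contents from above, and `\b` stands in for grep's -w. Python's `\w` also matches digits and the underscore, and for str patterns it additionally matches non-ASCII letters.

```python
import re

# Contents of the sample file from the answer.
text = "word1 word2 word3\nlong_word 123 word4\n"

# Exactly five word characters with a word boundary on each side.
five_letter_words = re.findall(r"\b\w{5}\b", text)

print(five_letter_words)
print(len(five_letter_words))
```

As with grep, long_word is skipped because the underscore counts as a word character, so no five-character run inside it sits between two word boundaries.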
This GNU awk command (GNU awk is needed because the record separator RS is more than one character) counts how many words have 5 letters. It ignores punctuation such as . and ,
awk -v RS="[ .,?!]|\n" 'length($0)==5 {a++} END {print a}' file
Put each word on its own line with tr, then use grep's -c flag to count the lines that consist of exactly five letters:
$ cat file
some text file containing many words and sentences.
$ tr ' ' '\n' < file | grep -c '^[ \t]*[a-zA-Z]\{5\}[ \t]*$'
1