I am writing a parser for PyMOL files (PyMOL is a tool used in bioinformatics).
I know that the double-quote character delimits a string, as in "text". But the single-quote character ' behaves differently. Here are some example PyMOL lines that use this strange symbol:
load dat/names.pdb
select test,name O4'
select test,*/O4'
select test,*/O4'+O3'
select test,(*/O4',O3')
select test,name O4'+O3'
select test,name "O4'+O3'"
select test,name O4'+Na\+
select test,(name Na\+,O4')
select test,name Na\++O4'
select test,*/Na\++O4'
select test,*/O4'+O4
select test,*/O2\*+O2
select test,*/O2\*+O2'
Which language tokens does this quote belong to? How should such lines be colorized? Is the quote character a word character, or a separator character? In one example file I saw the quote used to delimit a string token, 'text':
iterate (all),resn = 'NON'
This is valid code taken from the PyMOL GitHub repo.
PyMOL is more a piece of software than a language. It does, however, provide a set of commands that support some Python scripting. Your file contains a set of such commands.
The first command, load dat/names.pdb, loads a PDB file (a text file containing the 3D coordinates, names and other data about the atoms of a molecule, usually - but not always - a protein). The full documentation about PDB files can be found here.
The second and subsequent commands create a PyMOL selection (basically, a list of atoms) following a specific syntax. The keyword name indicates that you want to select atoms by name, here the atoms named O4' (for the first select command). Note that the single quote is part of the atom name and NOT a language token. Atom names containing a single quote usually denote atoms from nucleic acids (DNA or RNA).
The command iterate (all),resn = 'NON' is a PyMOL command that iterates over all atoms of a selection; here, the selection is (all), which means all the atoms loaded in the session. But the syntax doesn't look correct to me. I assume that you want to iterate over all atoms that belong to residues named NON, in which case your command should look like iterate (resn NON), print name, which for instance prints the name of every atom in the selection. If you want to change the residue name of all atoms to NON, you should consider using the PyMOL command alter instead.
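For the colorizing question, one way to act on this is to treat ' as an ordinary name character inside selection expressions. Here is a minimal Python sketch under that assumption. The token classes (string, name, op) and the tokenize helper are hypothetical names chosen for illustration, not PyMOL's actual grammar, and the sketch deliberately does not handle Python-style 'text' strings such as the one in the iterate example.

```python
import re

# Hypothetical token classes for illustration only; this is not PyMOL's real grammar.
TOKEN_RE = re.compile(r'''
    (?P<string>"[^"]*")                 # double-quoted string
  | (?P<name>(?:\\.|[A-Za-z0-9_'])+)    # name: word chars, single quote, or a backslash-escaped char
  | (?P<op>[+,()/*=])                   # operators and separators
  | (?P<space>\s+)                      # whitespace (skipped)
''', re.VERBOSE)

def tokenize(text):
    """Split one selection expression into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            # Anything the sketch does not know about becomes a one-char token.
            tokens.append(("other", text[pos]))
            pos += 1
            continue
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

for kind, value in tokenize("name O4'+Na\\+"):
    print(kind, value)
```

With this approach O4' and Na\+ each come out as a single name token, and the + between them is an operator, which matches the reading of the select lines given above.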
I'm trying to convert multiple instances of Unicode codes to their corresponding characters.
I have some text with this format:
U+00A9
And I want to generate the following next to it:
©
I have tried to select the code in visual mode and use the selection range '<,'> in command mode as input for i_CTRL_V but I don't know how to use special keys on a command.
I haven't found anything useful in the manual with :help command-mode . I could solve this problem using other tools but I want to improve my vim knowledge. Any hint is appreciated.
Edit:
As @m_mlvx has pointed out, my goal is to visually select the code and then run some command that looks up the Unicode character and does the substitution. Manually typing a substitution like :s/U+00A9/U+00A9 ©/g is not what I'm interested in, as it would require typing each of the special characters by hand for every substitution.
Any hint is appreciated.
Here are a whole lot of them…
:help i_ctrl-v is about insert mode and ranges matter in command-line mode so :help command-mode is totally irrelevant.
When they work on text, Ex commands only work on lines, not arbitrary text. This makes ranges like '<,'> irrelevant in this case.
After carefully reading :help i_ctrl-v_digit, linked from :help i_ctrl-v, we can conclude that it is supposed to be used:
with a lowercase u,
without the +,
without worrying about the case of the value.
So both of these should be correct:
<C-v>u00a9
<C-v>u00A9
But your input is U+00A9 so, even if you somehow manage to "capture" that U+00A9, you won't be able to use it as-is: it must be sanitized first. I would go with a substitution but, depending on how you want to use that value in the end, there are probably dozens of methods:
substitute('U+00A9', '\(\a\)+\(.*\)', '\L\1\2', '')
Explanation:
\(\a\) captures an alphabetic character.
+ matches a literal +.
\(.*\) captures the rest.
\L lowercases everything that comes after it.
\1\2 reuses the two capture groups above.
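For readers more at home outside Vim, the same sanitizing step can be sketched in Python. This is only an illustration of what the substitution does; the helper name to_ctrl_v_form is made up for this example.

```python
import re

def to_ctrl_v_form(code):
    """Turn 'U+00A9' into 'u00a9': drop the '+' and lowercase the rest."""
    # Group 1 is the leading letter, group 2 is everything after the '+';
    # lowercasing both mirrors \L\1\2 in the Vim substitution.
    return re.sub(
        r'([A-Za-z])\+(.*)',
        lambda m: (m.group(1) + m.group(2)).lower(),
        code,
        count=1,
    )

print(to_ctrl_v_form("U+00A9"))
```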
From there, we can imagine a substitution-based method. Assuming "And I want to generate the following next to it" means that you want to obtain:
U+00A9©
you could do:
v<motion>
y
:call feedkeys("'>a\<C-v>" . substitute(@", '\(\a\)+\(.*\)', '\L\1\2', '') . "\<Esc>")<CR>
Explanation:
v<motion> visually selects the text covered by <motion>.
y yanks it to the "unnamed register" @".
feedkeys() (see :help feedkeys()) is used as a low-level way to send a complex series of characters to Vim's input queue. It allows us to build the macro programmatically before executing it.
'> moves the cursor to the end of the visual selection.
a starts insert mode after the cursor.
<C-v> + the output of the substitution inserts the appropriate character.
That snippet begs for being turned into a mapping, though.
In case you just want to convert Unicode code points to their corresponding characters, you could use the nr2char function:
:%s/U+\(\x\{4\}\)/\=nr2char('0x'.submatch(1))/g
Brief explanation
U+\(\x\{4\}\) - search for a specific pattern (U+ and four hexadecimal characters which are stored in group 1)
\= - substitute with result of expression
'0x'.submatch(1) - prepend 0x to our group (U+00A9 -> 0x00A9)
If you would like to have the Unicode character next to the text, you need to slightly modify the right-hand side (use submatch(0) to get the full match and . to append).
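The logic of that substitution (match U+ plus four hex digits, then replace it with the character for that number) can be sketched in Python as follows. The function name codes_to_chars is only an illustrative choice.

```python
import re

def codes_to_chars(text):
    """Replace every U+XXXX (exactly four hex digits) with its character."""
    return re.sub(
        r'U\+([0-9A-Fa-f]{4})',
        lambda m: chr(int(m.group(1), 16)),  # hex string -> code point -> character
        text,
    )

print(codes_to_chars("U+00A9 U+00AE"))
```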
In case someone wonders how to compose the substitution command:
'<,'>s/\<[uU]+\(\x\+\)\>/\=submatch(0)..' '..nr2char(str2nr(submatch(1), 16), 1)/g
The regex is:
word start
Letter "U" or "u"
Literal "plus"
One or more hex digits (put into "capture group")
word end
Then substituted by (:h sub-replace-expression) concatenation of:
the whole matched string
single space
the character for the hexadecimal code point taken from the "capture group"
This is to be executed from command-line mode after a Visual selection, and it works over the selected line range.
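As a rough Python sketch of the same idea, assuming 1-based line numbers stand in for the '<,'> range: the names CODE_RE and annotate_range are invented for this example. Note that chr() raises ValueError for values above 0x10FFFF, which this sketch does not guard against.

```python
import re

# Word boundary, "U" or "u", a literal "+", one or more hex digits, word boundary.
CODE_RE = re.compile(r'\b[uU]\+([0-9A-Fa-f]+)\b')

def annotate_range(text, first, last):
    """Append ' <char>' after each code on lines first..last (1-based, inclusive)."""
    lines = text.split("\n")
    for i in range(first - 1, min(last, len(lines))):
        lines[i] = CODE_RE.sub(
            lambda m: m.group(0) + " " + chr(int(m.group(1), 16)),
            lines[i],
        )
    return "\n".join(lines)

print(annotate_range("U+00A9\nU+00AE\nU+2122", 1, 2))
```

Only the first two lines are changed here, the third is left alone because it is outside the range.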
I would like to define a text object like iw, aB and the other ones listed in :help text-objects that defines an area beginning with some pattern and ending with another. More precisely, I would like to define a text object which starts with some {pattern1} and ends with some {pattern2}. The patterns included. It is important that it can stretch over multiple lines (like aB but unlike a").
The examples I have in mind are for selecting in-line equations in LaTeX, that is, everything between one $ and the next $ (including the $'s), and for selecting LaTeX environments like between \begin{*} and the following \end{*}, where the * here is just any string of characters (but non-greedy like \{-} in Vim regex).
I have tried to look at this guide at the Vim Tips Wiki, but I do not know how to replace [z and ]z with something that searches backwards for some pattern and forwards for some pattern, respectively, so that it works as I want it to.
So to give the example of the inline equation (let's say the text object is called ad), then, if the cursor was placed somewhere between the $'s in the following line:
it follows that $ \sum_{n=0}^\infty 2^{-n} $ is two
in normal mode, and vad was pressed, then $ \sum_{n=0}^\infty 2^{-n} $ should be visually selected, or if dad was pressed it should be deleted.
The mentioned Vim Tips Wiki page lists the two plugins (under "Related scripts") that make defining new text objects very easy:
textobj-user is very flexible and generic
CountJump plugin (by me) is specially written for text objects defined by start and end patterns
The following call defines an ad text object for text inside $...$:
call CountJump#TextObject#MakeWithCountSearch('', 'd', 'a', 'v', '\$', '\$')
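To make the "search backwards for the start pattern, forwards for the end pattern" idea concrete, here is a naive Python sketch. It is not how CountJump or textobj-user are implemented; enclosing_span is a hypothetical helper, and it does not handle nesting or a cursor sitting on the closing delimiter.

```python
import re

def enclosing_span(text, cursor, start_pattern, end_pattern):
    """Return (start, end) offsets of the region around cursor, delimiters included."""
    # Last start-pattern match that begins at or before the cursor.
    starts = [m for m in re.finditer(start_pattern, text) if m.start() <= cursor]
    if not starts:
        return None
    start = starts[-1]
    # First end-pattern match after both the start delimiter and the cursor.
    end = re.compile(end_pattern).search(text, max(start.end(), cursor))
    if end is None:
        return None
    return start.start(), end.end()

line = r"it follows that $ \sum_{n=0}^\infty 2^{-n} $ is two"
span = enclosing_span(line, line.index("sum"), r"\$", r"\$")
print(line[span[0]:span[1]])
```

Because the search runs over the whole string rather than line by line, the same function also works when the region stretches over multiple lines.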
I'm doing some string manipulation and want to ask if the following is possible in Notepad++:
I have text containing dates:
10-Jan-13
22-Feb-14
10-Jan-13
10-Mar-13
I want
10-JAN-13
22-FEB-14
10-JAN-13
10-MAR-13
(There's more data on each line there but I am just showing a simplified example).
I know I can search for alternatives with the | character, e.g. find JAN|FEB|MAR..., but how do I replace according to what was found?
(Just trying to save some time)
Thanks.
Not sure if it's a plugin or built-in, but you can use the TextFX Characters plugin, to select the text, and then in the textfx characters dropdown, click UPPER CASE.
Update
Looks like it is a plugin:
TextFX menu is missing in Notepad++
Multiple Files
I found this site which gives a way to convert text to uppercase with regular expressions: http://vim.wikia.com/wiki/Changing_case_with_regular_expressions
So, what you can do is bring up the find in files dialog (CTRL+SHIFT+F), change search mode to Regular Expression, and then use something like this:
Find: (\d{2}-\w{3}-\d{2})
Replace with: \U\1
Directory: Whichever directory your files are in (and only the files you want changed).
\U is an uppercase flag, and the brackets in the Find regex correspond with the \1 backreference, which will basically replace it with itself (but uppercase).
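If the files can also be processed with a script, the same find-and-uppercase step can be sketched in Python. Python's re module has no \U flag in replacement strings, so this version uses a replacement function instead; uppercase_dates is just an illustrative name.

```python
import re

def uppercase_dates(text):
    """Uppercase every dd-Mon-yy style token, leaving the rest of each line alone."""
    return re.sub(
        r'(\d{2}-\w{3}-\d{2})',
        lambda m: m.group(1).upper(),  # replace the match with itself, uppercased
        text,
    )

print(uppercase_dates("10-Jan-13\n22-Feb-14\n10-Mar-13"))
```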
Using Sublime Text 2 - Is it possible to insert a line break/text return after a specific String in a text file e.g. by using the Find ‣ Replace tool?
(Bonus question: Is it possible to remove all line breaks after a specific String)
Here's how you'd do it on a Mac:
Command+F > type string > Control+Command+G > ESC > Right Arrow > line break
and Windows/Linux (untested):
Control+F > type string > Alt+F3 > ESC > Right Arrow > line break
The important part being Control+Command+G to select all matches.
Once you've selected the text you're looking for, you can use the provided multiple cursors to do whatever text manipulation you want.
Protip: you can manually instantiate multiple cursors by using Command+click (or Control+click) to achieve similar results.
Using the Find - Replace tool, this can be accomplished in two different ways:
Click in the Replace field and press Ctrl + Enter to insert a newline (the field should resize but it doesn't, so it is hard to see the newline inserted).
Inside the Find - Replace tool, activate the S&R regex mode (first icon on the left .*, keyboard shortcut is Alt + Ctrl/Cmd + R to activate/deactivate it).
Type \n in the Replace field wherever you want to insert a newline.
Both solutions also work if you want to find newlines, just do it in the Find field.
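Outside the editor, both operations (adding a line break after a string, and the bonus case of removing line breaks after it) can be sketched in Python. The helper names break_after and join_after are made up for this example.

```python
import re

def break_after(text, marker):
    """Insert a newline after every occurrence of marker."""
    return text.replace(marker, marker + "\n")

def join_after(text, marker):
    """Remove all line breaks that directly follow marker."""
    # re.escape keeps characters like "." or "$" in marker literal;
    # the lambda avoids backslash processing in the replacement.
    return re.sub(re.escape(marker) + r'(?:\r?\n)+', lambda m: marker, text)

print(break_after("a;b;c", ";"))
print(join_after("a;\nb;\n\nc", ";"))
```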
Edit->Lines->Join Line (Ctrl+J)
You should probably use multiple cursors. See the unofficial documentation, or this nice tutorial. Here's some brief instructions to set you on your way:
Put the cursor on the string of interest.
Type Command+D (Mac) or Control+D (Windows/Linux) to select the current instance of the string.
Type Command+D (Mac) or Control+D (Windows/Linux) to select successive instances of the string.
Alternatively, type Control+Command+G (Mac) or Alt+F3 (Windows/Linux) to select all instances of your string.
Now you have multiple cursors, so insert or remove your newline as you please.
(type esc to exit multiple cursor mode.)
Have fun!
I want to search for a string and find the number of occurrences in a file using the vi editor.
The way to do it is:
:%s/pattern//gn
You need the n flag. To count words use:
:%s/\i\+/&/gn
and a particular word:
:%s/the/&/gn
See count-items documentation section.
If you simply type in:
%s/pattern/pattern/g
then the status line will give you the number of matches in vi as well.
:%s/string/string/g
will give the answer.
(Similar to what Gustavo said, with one addition.)
For any previous search, you can simply do:
:%s///gn
A pattern is not needed, because it is already in the search register (@/).
"%" - do s/ in the whole file
"g" - search global (with multiple hits in one line)
"n" - prevents any replacement of s/ -- nothing is deleted! nothing must be undone!
(see :help :s_flags for more information)
(This way, it works perfectly with "Search for visually selected text", as described in vim-wikia tip171)
:g/xxxx/d
This will delete all the lines matching the pattern and report how many lines were deleted (lines, not individual occurrences). Undo afterwards to get them back.
Short answer:
:%s/string-to-be-searched//gn
For learning:
Here is what each part of the command means:
: switches from Normal (command) mode to Command-line mode. Whatever you type after : is entered on the command line.
% specifies all lines, and s is the substitute command. Giving the range as % means the substitution covers the entire file. The syntax for substituting all occurrences is :%s/old-text/new-text/g
g specifies all occurrences in the line. With the g flag, every occurrence in a line is substituted. Without it, only the first occurrence in each line is substituted.
n specifies that the number of occurrences is reported and that no substitution is actually performed.
// (double slash) means the replacement text is omitted, because we only want to find.
Once you have the number of occurrences, you can press the n key to step through them one by one.
For finding and counting in particular range of line number 1 to 10:
:1,10s/hello//gn
Please note that % (the whole file) is replaced by comma-separated line numbers.
For finding and replacing in particular range of line number 1 to 10:
:1,10s/helo/hello/g
use
:%s/pattern/\0/g
when pattern string is too long and you don't like to type it all again.
I suggest doing:
Search either with * to do a "bounded search" for what's under the cursor, or do a standard /pattern search.
Use :%s///gn to get the number of occurrences. Or you can use :%s///n to get the number of lines with occurrences.
** I really wish I could find a plug-in that would show a message like "match N of N1 on N2 lines" with every search, but alas.
Note:
Don't be confused by the tricky wording of the output. The former command might give you something like 4 matches on 3 lines where the latter might give you 3 matches on 3 lines. While technically accurate, the latter is misleading and should say '3 lines match'. So, as you can see, there really is never any need to use the latter ('n' only) form. You get the same info, more clearly, and more by using the 'gn' form.
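The difference between the two counts can be illustrated with a small Python sketch that computes both numbers for a piece of text. It is only an analogy to what the status line reports, not Vim's implementation, and count_matches is a hypothetical helper.

```python
import re

def count_matches(text, pattern):
    """Return (total matches, number of lines with at least one match)."""
    regex = re.compile(pattern)
    per_line = [len(regex.findall(line)) for line in text.splitlines()]
    total = sum(per_line)                          # what the 'gn' form counts
    matching_lines = sum(1 for n in per_line if n) # what the 'n'-only form counts
    return total, matching_lines

sample = "the cat and the dog\nthe end\nnothing here\nover the hill"
print(count_matches(sample, "the"))
```

For this sample the result corresponds to "4 matches on 3 lines", the kind of output described above for the gn form.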