How do I select a pymol subunit by name? - pymol

I'm working with 5IVW downloaded from PDB. This has 5 subunits; each has its own residue numbering--if I color residues 1-100, 5 sets of residues get colored. If I select atoms with the mouse I see the identification of the subunit: e.g. 5IVW/B/... ; i.e. I've selected atoms in subunit B. But how do I select the subunit, or anything in pymol, by its internal name--so I can just color the residues in subunit B? The documentation has extensive information on selecting by atom, element, residue, etc.

Have a look at the Macromolecules section for 5IVW. There,
the six chains (or subunits) in the structure are named V, W, 0, 1, 2, and 3,
so the syntax to make a selection called "subunit2" is:
select subunit2, chain W
Note that in /5ivw/B, the B is the segment identifier (segi), not the chain or subunit.
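To make the field order concrete, here is a minimal Python sketch of how a slash-prefixed selection macro breaks down into object, segi, chain, resi and name. The helper parse_selection_macro is hypothetical (it is not part of PyMOL), and the macro string is made up for illustration.

```python
# Field order of a PyMOL selection macro that begins with a slash:
# /object/segi/chain/resi/name
MACRO_FIELDS = ("object", "segi", "chain", "resi", "name")


def parse_selection_macro(macro):
    """Split a leading-slash macro such as '/5ivw/B/W' into named fields.

    Hypothetical helper for illustration only; empty or missing
    fields are left out of the result.
    """
    if not macro.startswith("/"):
        raise ValueError("this sketch only handles macros that begin with '/'")
    parts = macro[1:].split("/")
    if len(parts) > len(MACRO_FIELDS):
        raise ValueError("too many fields in macro")
    return {field: value for field, value in zip(MACRO_FIELDS, parts) if value}


# Illustrative macro: the second field is segi, the third is the chain.
fields = parse_selection_macro("/5ivw/B/W")
print(fields)
```

So the letter right after the object name is the segment identifier, and the chain only appears in the field after that.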

Related

Excel: How to find six different combinations of words in string?

I have been working for several days on this and have researched everything looking for this answer. I'd appreciate any help you can give.
In Excel I am searching a string of text in column A:
Bought 1 HD Sep 3 2021 325.0 Call # 2.75
I am detecting the first word (in this case "Bought") and detecting the last word before "#" symbol (in this case "Call").
I am then detecting the price following the "#" symbol (in this case "2.75"). This number will go into column B (header "Open") or column C (header "Close") depending on the combination of words found:
Sold/Put=Close
Sold/Call=Open
Bought/Put=Open
Bought/Call=Close
Sold (by itself)=Open
Sold (by itself)=Close.
Bought 1 HD Sep 3 2021 325.0 Call # 2.75
The combination found in the above string is: "Bought Call". Therefore the number at the end ("2.75"), goes into "Open" column.
Here's another example:
Sold 4 AI Sep 17 2021 50.0 Put # 1.5
The combination found in the above string is: "Sold Put". Therefore the number at the end ("1.5") goes into "Close" column.
I am currently using this formula to determine if the string contains "Sold" and "Call" and get the desired number and it does work:
=IF(AND(
ISNUMBER(SEARCH({"Sold","Call"},A10))),
TRIM(MID(A10,SEARCH("#",A10)+LEN("#"),255))," ")
But, I don't know how to search for all the other possible combinations.
The point behind this is to be able to paste the transaction from the broker and have most of the entry process automated. I'm sure many will benefit from this as I've not found anything like this.
I'd appreciate any help and if possible, an explanation of the formula so I can better learn.
Thanks!
I think you have the right idea, but would just extend the IF statement.
Something like the below might work for you:
=IF(ISNUMBER(SEARCH("Call", $A1)),
IF(ISNUMBER(SEARCH({"Bought","Sold"}, $A1)),
NUMBERVALUE(RIGHT($A1, LEN($A1)-SEARCH("#", $A1))),""),
IF(ISNUMBER(SEARCH({"!!!","!!!","Bought","Sold"}, $A1)),
NUMBERVALUE(RIGHT($A1, LEN($A1)-SEARCH("#", $A1))),""))
Just enter in column B and drag down; columns B through E should fill as needed.
For example:
Note that "!!!" is just a placeholder made of arbitrary characters; it can be anything that is unlikely to appear in the string.
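For readers who want to check the logic outside Excel, here is a rough Python sketch of the same idea: take the first word, the last word before "#", and the number after "#", then look the word pair up in a rule table. The function names and EXAMPLE_RULES are hypothetical, and the table holds only the two pairs from the question's worked examples; fill in the rest to suit your own bookkeeping.

```python
def parse_transaction(text):
    """Return (action, option_type, price) from one broker line.

    action is the first word, option_type is the last word before '#',
    and price is the number after '#'. Assumes exactly one '#'.
    """
    before, after = text.split("#")
    words = before.split()
    return words[0], words[-1], float(after)


def place_price(text, rules):
    """Return (column, price); column is None when no rule matches."""
    action, option_type, price = parse_transaction(text)
    return rules.get((action, option_type)), price


# Hypothetical rule table, taken from the two worked examples only.
EXAMPLE_RULES = {
    ("Bought", "Call"): "Open",
    ("Sold", "Put"): "Close",
}

print(place_price("Bought 1 HD Sep 3 2021 325.0 Call # 2.75", EXAMPLE_RULES))
print(place_price("Sold 4 AI Sep 17 2021 50.0 Put # 1.5", EXAMPLE_RULES))
```

Keeping the combinations in a table rather than in nested conditions means a new combination is one more entry, not another level of IF.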
Screenshots for reference:
(requires an Office 365 compatible version of Excel)
Main lookup
=LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($J$7:$J$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))
EDIT:
Other Excel versions:
=IFERROR(INDEX($J$7:$J$12,MATCH(1,IF($I$7:$I$12="",MATCH("*"&$H$7:$H$12&"*",B4,0)*ISNUMBER(MATCH("*"&$I$7:$I$12&"*",B4,0)),MATCH("*"&$H$7:$H$12&"*",B4,0)*MATCH("*"&$I$7:$I$12&"*",B4,0)),0)),)
(All that changes is that LET is dropped: fn_1 and fn_2 are replaced by their respective functions inside the INDEX formula, which makes the formula somewhat longer but otherwise identical.)
Example applications
I have provided two examples of how one might customize this to insert a number in one of the columns. (The key part of this question is really how to do the lookup in the first place; from there on it is a matter of fine-tuning and taking the appropriate action.)
Assuming calls/buys are "long" positions whose strike price goes in the first column (here, D), and puts/sales are "short" positions whose strike price goes in the second column (here, E):
Long - insert strike price col D
=IF(LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))=1,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
EDIT
Other Excel versions:
=IF(IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",MATCH("*"&$H$7:$H$12&"*",B4,0)*ISNUMBER(MATCH("*"&$I$7:$I$12&"*",B4,0)),MATCH("*"&$H$7:$H$12&"*",B4,0)*MATCH("*"&$I$7:$I$12&"*",B4,0)),0)),)=1,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
Short - insert strike price col E
=IF(LET(fn_1,MATCH("*"&$H$7:$H$12&"*",B4,0),fn_2,MATCH("*"&$I$7:$I$12&"*",B4,0),IFERROR(INDEX($K$7:$K$12,MATCH(1,IF($I$7:$I$12="",fn_1*ISNUMBER(fn_2),fn_1*fn_2),0)),))=2,MID(SUBSTITUTE(B4," ",""),SEARCH("#",SUBSTITUTE(B4," ",""))+1,LEN(SUBSTITUTE(B4," ",""))),"")
EDIT
Other Excel versions:
Follow the same routine as in the previous edits (remove LET and replace fn_1 and fn_2 with their respective formulae).
Note the similarity among the three equations above: the second and third contain the first. Effectively they just wrap an IF statement around the first, use the second lookup column (here, column K), and use MID/SEARCH to extract the rate after the "#".
This assumes there are no other "#" characters in the text.
Customize as required.

Is there a way to compare text strings in Excel and output a complete/partial/no match column (with the information missing listed)?

I have a large spreadsheet (upwards of 119K rows) of mismatched data. Column A contains a list of names in full (and occasionally a Trustee or company name), and Column B contains initialized first/middle names with full names (and occasionally Trustee or company names).
I do not currently have a way to compare them short of doing so manually, as there are many variables, and I am looking for some assistance.
So far I have tried using a VBA script from (How do I fuzzy match just adjacent cells?) to see if it can output the difference (which would allow me to eliminate the cells in Column 2 that had no matching data), but this did not function as intended.
I have also tried various LEFT/RIGHT to trim the names from Column A and then match this to Column B, but this has also not worked due to variance in text in Column A.
Here are some examples of the cells. Note that the names in Column A are not always in alphabetical order, but Column B is:
Example (complete match):
Column A | Column B
Smith Marcus John | J M Smith
Page Binder Book, Quoth Nevermore Raven | B B Page, R N Quoth
Orange Apple Banana, Orange Pear Plum | A B & P P Orange
Koala Bear, Koala Marsupial Pouch, Koala Gum Tree | B, P M & T G Koala
S & P Limited | S & P Limited
S & P Limited | A D Cumin (S & P Limited)
Example (partial):
Column A | Column B
Page Binder Book, Quoth Nevermore Raven | B B Page
Orange Apple Banana, Orange Pear Plum | A B & P P Orange (Fruit 2019 Limited)
Koala Bear, Koala Marsupial Pouch, Goanna Gumtree, Koala Gum Tree | B, P M & T G Koala
Example (no match):
Column A | Column B
Smith Marcus John | H J Hyde
Sheppard Garrus Thane | B B Page, R N Quoth
What I am hoping to do:
Firstly, I am hoping to correctly mark each cell in Column B as complete/partial/no match with a fill (green/yellow/red). Secondly, for partial matches (whether Column A has extra information, or Column B is missing information) I want to output in Column C the missing information, like so:
Column A | Column B | Column C
Page Binder Book, Quoth Nevermore Raven | B B Page | Quoth Nevermore Raven
Orange Apple Banana, Orange Pear Plum | A B & P P Orange (Fruit 2019 Limited) | (Fruit 2019 Limited)
Is this kind of thing even possible, or are there just too many variations in the way the data is presented in each column?
Very new to both this site and excel functions in general, this is my first task!
Thank you for your assistance/knowledge/time.
Importing and using this VBA module, which contains several User Defined Functions (UDFs), will get you a near-optimal solution (you will still have to review matches): https://github.com/kyledeer-32/vba_fuzzymatching
With it you can fuzzy match, calculate the similarity between strings, and then rank the results with a simple "=IF" function. Using this module, I got the following results:
I noted that "Koala Bear..." in Column A matched to "S & P..." in Column B. I expected the value in Column B with "...Koala" to match. I checked the script, and the Levenshtein edit distance was actually equal for both. Ties like this are rare, but they mean you will need to review your matches; you can do this quickly by ranking your results by string similarity. Here is a formula view of what I did:
To import the VBA module linked in the beginning of this answer - here is a guide: https://www.excelcampus.com/vba/copy-import-vba-code/
Note: after importing this module, you will need to enable the "Microsoft Scripting Runtime" library in the Visual Basic Editor window for it to run. Steps to do this (takes less than a minute):
From Excel Workbook:
Select Developer tab on ribbon
Select Visual Basic
Select Tools on the Toolbar
Select References
Scroll down until you see Microsoft Scripting Runtime, then check the box
Press OK
Then you're all set! You can use the UDFs (as in my second image above) just as you would use normal Excel functions. Hope this helps!
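If you want to see what an edit-distance ranking does without importing anything, here is a small Python sketch of Levenshtein distance plus a normalised similarity score. This is a textbook implementation, not the code from the linked module; similarity and best_match are hypothetical names, and the module's UDFs may score strings differently.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to a 0..1 score, where 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def best_match(name, candidates):
    """Return the candidate most similar to name, ignoring case."""
    return max(candidates, key=lambda c: similarity(name.lower(), c.lower()))


print(levenshtein("kitten", "sitting"))
print(best_match("S & P Limited", ["J M Smith", "S & P Limited"]))
```

Because initials such as "J M Smith" share few characters with "Smith Marcus John", plain edit distance gives low scores for names that a person would call a match, which is why the results still need a manual review.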

Excel Return 0,1 or 2 if cell equals A, B or C

I have a cell (say B16) with a dropdown list of three options. Depending on which one is picked, I would like cell D16 to return a value: if you pick A it returns 0, B returns 1, and C returns 2.
I have tried multiple IF, OR, LOOKUP but nothing is working.
Any help would be fantastic.
Make a hidden sheet called mapping with the following:
Now it's just =VLOOKUP(B16,mapping!A:B,2,0). The advantage of this over the nested IF solution is that it's trivial to add more options, and it's easier to read and edit, in my opinion.
You can also use column A of mapping to populate your dropdown list.
Personally I don't like nested IFs as they are hard to read. If you want the transformation to be to {0, 1, 2} then you could use
=CODE(B16)-CODE("A")
This is idiomatic in programming languages using ASCII encoding. You can generalise this if you use CHOOSE. If you want {a, b, c} then use
=CHOOSE(1 + CODE(B16) - CODE("A"), a, b, c)
where a, b, and c are the values that you want: {0, 1, 2} in your case.
How about:
=IF(B16="A",1,IF(B16="B",2,IF(B16="C",3,"")))
This is for a return of {1,2,3}, modify for any three values you would like.
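The character-code arithmetic and the mapping-table idea from the answers above look like this in Python. It is only a sketch: letter_to_index, MAPPING and lookup are hypothetical names, and MAPPING holds the A/B/C to 0/1/2 pairs from the question.

```python
def letter_to_index(letter):
    """Same arithmetic as =CODE(B16)-CODE("A"): 'A' -> 0, 'B' -> 1, 'C' -> 2."""
    return ord(letter) - ord("A")


# Table-driven alternative, analogous to the hidden "mapping" sheet.
MAPPING = {"A": 0, "B": 1, "C": 2}


def lookup(choice, mapping=MAPPING, default=""):
    """Return the mapped value, or default when the choice is unknown."""
    return mapping.get(choice, default)


print([letter_to_index(c) for c in "ABC"])
print(lookup("B"), repr(lookup("Z")))
```

The arithmetic version only works while the options are consecutive letters; the table version keeps working if the dropdown values are renamed or extended.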

Concatenate part of cell 1 by multiple criteria with all of cell 2, finishing with cell 1’s remainder

Introduction
This is a continuation of the question posted and answered here; @Jordan advised posting it as a separate question.
Goal: Join part of cell 1’s contents with all of cell 2’s, finishing with the remainder of cell 1.
Twist: Multiple criteria have to be applied to cell 1.
Problem
After successfully altering Jordan’s excellent answer to accommodate concatenated names for joined Thinglag, the following function performs the task, as long as there is only a single criterion to identify:
=IF(F2="",E2,CONCATENATE(LEFT(E2,SEARCH(" T",E2))&"("&F2&")"&MID(E2,SEARCH(" T",E2),LEN(E2)-SEARCH(" T",E2)+1)))
However, for parishes and annexes, multiple criteria are needed, viz. the following:
Sogn
Hovedsogn
Annex
Præstegjeld
Structure
AB–AH: Sogn_anx_[1–7]
AI–AO: Sogn_anx_[1–7]_altnvn
AP–AV: Sogn_anx_[1–7]_hele
As with the original post, I have a source giving current official names for the area, as well as previously used names (providing etymological information for the current name). In the source, where the old name is included, it is given as a parenthetical remark, e.g.:
‘Søndeløvs (Sundaleid) Annex’
‘Tromø (Thrumø) Annex’
‘Hvitesø (Hviteseids) Hovedsogn’
‘Attraa (Attrod) Hovedsogn’
‘Thjølings (Thjodaling) Sogn’
These have been entered into the database using three columns:
One for the official name
One for the old name
One showing the name as printed
This is to allow for better searchability when this information is to be made publicly available.
Example data:
Sogn_anx_1 | Sogn_anx_2 | Sogn_anx_3 | … | Sogn_anx_1_altnvn | Sogn_anx_2_altnvn | Sogn_anx_3_altnvn | … | Sogn_anx_1_hele | Sogn_anx_2_hele | Sogn_anx_3_hele
AB | AC | AD | … | AI | AJ | AK | … | AP | AQ | AR
Soleims Hovedsogn | | | … | Solheims | | | … | Soleims (Solheims) Hovedsogn | |
Meleims Annex | | | … | Medelheims | | | … | Meleims (Medelheims) Hovedsogn | |
Holdens Hovedsogn | Romenæs Annex | Holdens Hovedsogn | … | | Rumenæs | Hollen | … | Holdens Hovedsogn | Romenæs (Rumenæs) Annex | Holdens (Hollen) Hovedsogn
As can be seen, the first set of columns contains the official name; the second set of columns (altnvn = alt_name) contains the old name, which in the source is written as a parenthetical remark; and the third set of columns contains the full, concatenated name (hele = entire/whole), which, in those cases where there is an alternative name, includes it in parentheses.
Desired result
I would like to perform the same task in the third column as done in the post referenced, only this time it has to be able to perform the search by looking for any of the four criteria, so " T" would have to be replaced by all four variants: " So", " Ho", " An" or " Pr" (note: spaces are intentional). I have tried editing the original function using OR, but this—to no surprise—fails.
It might be simpler with a VBA solution. But your immediate problem, to find one of several defined words using SEARCH, can be accomplished by using an array constant for find_text and appending the terms to within_text. If you are not guaranteed that find_text will always appear, you'll need to check that the result is less than the length of the original within_text.
You might also consider using the case-sensitive FIND function, or longer find_text strings, in case there is some ambiguity.
=MIN(SEARCH({" So"," Ho"," An"," Pr"},AB3&" So Ho An Pr"))
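The same "earliest of several markers" idea can be sketched in Python, followed by the insertion of the old name in parentheses. The names MARKERS, first_marker_position and with_alt_name are hypothetical. Note that str.find is case-sensitive, so it behaves like Excel's FIND rather than SEARCH, and it returns -1 for a missing marker, so no appended sentinel text is needed.

```python
# Markers that start the parish-type word (leading spaces are intentional).
MARKERS = (" So", " Ho", " An", " Pr")


def first_marker_position(name, markers=MARKERS):
    """Return the 0-based index of the earliest marker, or -1 if none occurs."""
    positions = [p for p in (name.find(m) for m in markers) if p != -1]
    return min(positions) if positions else -1


def with_alt_name(name, alt_name):
    """Insert '(alt_name)' before the parish-type word, if both exist."""
    if not alt_name:
        return name
    pos = first_marker_position(name)
    if pos == -1:
        return name  # no marker found: leave the official name unchanged
    return f"{name[:pos]} ({alt_name}){name[pos:]}"


print(with_alt_name("Soleims Hovedsogn", "Solheims"))
print(with_alt_name("Romenæs Annex", "Rumenæs"))
print(with_alt_name("Holdens Hovedsogn", ""))
```

The explicit "not found" branch corresponds to the check mentioned above that the result is less than the length of the original within_text.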

Excel Solver - Prevent Identically Named Results

This may sound a bit odd and maybe I'm just missing the forest for the trees on this question, but is there a way to force the Excel Solver to return only one instance of a result? As a short example imagine that we have some results on the likability of various objects (colors, animals, and shapes). We want the solver to return the three most preferred objects from this list.
Red (400) | Dog (120) | Circle (100)
Red (400) | Cat (90) | Square (75)
Blue (90) | Horse (60) | Triangle (70)
Green (80) | Snake (30) | Rectangle (40)
Yellow (40) | Rabbit (20) | Pentagon (15)
The problem is, of course, simplified in this example. Basically, my issue arises in that I want one of each type, namely Red, Dog, and Circle but I keep getting Red, Red (again), and Dog because the total is higher. I want to define a way to prevent Solver from returning two values named the same. I just can't seem to figure it out and Google doesn't seem to produce any viable responses either.
It's unclear how your data is set up, and this could affect how you set up the Solver problem, but here is one method (NB: this method only works if you have 200 or fewer values to choose from).
Make Column A for "Category". This would have values such as "Color", "Animal", and "Shape".
Column B would be for "Type", and contain the information you provided. (e.g. Dog, Cat, ... Red, Blue, ... Circle, Square, ...)
Column C is the Value or Score for the type shown in Column B, again the information you provided.
Column D has fields that Solver will manipulate, let's call it "Selected". Selected will be a 0 or a 1.
Column E is the result of selection, a simple calculation, =C2*D2, filled down.
Make Cell H2 the sum of Column E. This will be your objective for Solver.
Make G3 through G5 the values in "Category" (Color, Animal, Shape).
Make H3 through H5 the total selected values in each category. That is =SUMIF($A$2:$A$16,"="&G3,$D$2:$D$16) filled down.
The workbook looks like this ...
... from this, you can set up Solver with the following ...
Set Objective: is $H$2
To: is set to Max. (i.e. you are looking for the most preferred)
By Changing Variable Cells: is set to $D$2:$D$16
Subject to the Constraints: has four entries. $D$2:$D$16 = binary; $H$3 = 1; $H$4 = 1; $H$5 = 1
Select a Solving Method: is set to Evolutionary. You can use GRG Nonlinear, but it takes longer.
The dialog looks like this ...
... with the following result, which meets your criteria ...
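As a sanity check on what Solver should return, the same selection can be sketched in Python. With a binary choice per row and exactly one pick per category, the best total is simply the highest score within each category, so no optimiser is needed for this simplified case. ITEMS is the question's data with category labels assumed from the answer's Color/Animal/Shape columns; best_per_category is a hypothetical helper.

```python
# (category, type, score) rows; the duplicate "Red" is kept on purpose.
ITEMS = [
    ("Color", "Red", 400), ("Animal", "Dog", 120), ("Shape", "Circle", 100),
    ("Color", "Red", 400), ("Animal", "Cat", 90), ("Shape", "Square", 75),
    ("Color", "Blue", 90), ("Animal", "Horse", 60), ("Shape", "Triangle", 70),
    ("Color", "Green", 80), ("Animal", "Snake", 30), ("Shape", "Rectangle", 40),
    ("Color", "Yellow", 40), ("Animal", "Rabbit", 20), ("Shape", "Pentagon", 15),
]


def best_per_category(items):
    """Pick the single highest-scoring item in each category."""
    best = {}
    for category, name, score in items:
        if category not in best or score > best[category][1]:
            best[category] = (name, score)
    return best


picks = best_per_category(ITEMS)
total = sum(score for _, score in picks.values())
print(picks, total)
```

Solver earns its keep once extra constraints couple the categories (a budget, for example); for the constraints listed above this per-category maximum gives the same objective value.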
