Identify/highlight the most significant digits/characters from data arranged in two columns using the Vim editor - vim

Is it possible in Vim editor to identify or highlight a common sequence of characters/digits from data arranged in two columns?
For instance,
0.0470013487688 40989223 0.0470013487688 002292
0.0421698758 73493412044 0.0421698758 476354659
0.0417166986 15951258722 0.0417166986 257990344
0.04167166 8474116192737 0.04167166 69861257942
0.041667 018771432653979 0.041667 1666698611258
0.0416 78177953892309171 0.0416 667166666986111
0.04 4004728342134522001 0.04 16666716666669861
0.04 0846598100993794511 0.04 16666671666666699
The first location where the digits in the two columns are different is shown with a space.
The goal is to highlight the most significant digits obtained in a computation (left column) that agree with the respective exact values (right column).

Based on source data like this:
0.047001348768840989223 0.0470013487688002292
0.042169875873493412044 0.0421698758476354659
0.041716698615951258722 0.0417166986257990344
0.041671668474116192737 0.0416716669861257942
0.041667018771432653979 0.0416671666698611258
0.041678177953892309171 0.0416667166666986111
0.044004728342134522001 0.0416666716666669861
0.040846598100993794511 0.0416666671666666699
the following pattern will match as many digits in the first column as are identical to the ones in the second column:
/^\(\S\+\)\ze\S*\s\+\1
This captures non-whitespace characters (\S; you can refine that part), ends the match there (\ze), and then asserts that what follows is the rest of the first column (possibly empty), some whitespace, and the same captured characters at the start of the next column.
I hope this is what you meant; it wasn't entirely clear to me.
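For checking the result outside Vim, the same idea can be sketched in Python. This is only an illustration, assuming whitespace-separated columns; the name common_prefix is made up here, and a lookahead stands in for Vim's \ze.

```python
import re

# Capture a prefix of column 1, then require (via lookahead) the rest of
# column 1, some whitespace, and the same prefix at the start of column 2.
COMMON_PREFIX = re.compile(r"^(\S+)(?=\S*\s+\1)")


def common_prefix(line: str) -> str:
    """Return the longest leading run shared by the first two columns."""
    m = COMMON_PREFIX.match(line)
    return m.group(1) if m else ""


print(common_prefix("0.047001348768840989223 0.0470013487688002292"))
# 0.0470013487688
```

Because \S+ is greedy, the regex engine backtracks from the full first column down to the longest prefix that also starts the second column.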

Related

How to extract text from a string where multiple entries meet the criteria, and return all values

This is an example of the string, and it can be longer:
1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)GL: Contract Services (510000), COT: Network (N), CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)
The result needs to be: column1 600007056, column2 600007070, column3 600007053.
I am working in Spotfire and creating calculated columns through transformations, as I need the columns to join to other data sets.
I have tried the expression below, but it only picks up the first 600... number, not the others, and there can be any number of those.
Account is the column with the string.
Mid([Account],
Find("(000",[Account]) + Len("(000"),
Find("0001)",[Account]) - Find("(000",[Account]) - Len("(000"))
Thank you!
Assuming my guess is correct, and the pattern to look for is:
9 digits, starting with 6, preceded by an opening parenthesis and 3 zeros, and followed by a space, 4 digits and a closing parenthesis
you can grab individual occurrences by:
column1: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',1)
column2: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',2)
etc.
The tricky bit is knowing how many columns to define, since you say there can be many. One way is to first calculate the maximum number of occurrences like this:
maxn: Max((Len([Account]) - Len(RXReplace([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))','','g'))) / 9)
This still assumes that each number to extract has 9 digits. It takes the difference between the length of the original [Account] and its length with the matched patterns replaced by an empty string, and divides it by 9.
Then you know you can define up to maxn columns; the extra ones will be empty for rows with fewer instances.
Note that Spotfire always wants two backslashes for escaping (I had to add more in the editor to make it render correctly; I hope I have not missed any).
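To sanity-check the pattern outside Spotfire, here is a minimal Python sketch. It is an illustration rather than Spotfire syntax: the pattern is the one above written with single backslashes in a raw string, and the variable names are made up.

```python
import re

# Same pattern as in the Spotfire expressions, as a Python raw string.
PATTERN = re.compile(r"(?<=\(000)6\d{8}(?=\s\d{4}\))")

# The sample string from the question.
account = (
    "1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), "
    "PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)GL: Contract Services (510000), "
    "COT: Network (N), CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)"
)

matches = PATTERN.findall(account)
print(matches)  # ['600007056', '600007070', '600007053']

# Same counting trick as maxn: length removed by the replacement, divided by 9.
count = (len(account) - len(PATTERN.sub("", account))) // 9
print(count)  # 3
```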

Macro to split up text into different rows depending on a keyword

I have a varying number of rows of text which I paste into Excel, like the two in the example below. The content will vary slightly but the overall structure will stay the same.
I need to split these up, so I would like a macro that searches for the word "maturity", selects this word and all the text to its right, and moves it one cell to the right.
I tried splitting it up via text to row, but the position of the word varies, and splitting on a space or comma destroys the rest of the data.
example:
1/ Worst Of Put K UN, WS UQ, XYZ YX maturity 22May2019, 80% strike, size q€7M (€ quanto), BID
2/ Worst Of Put xyz xy, TSLA UQ, KK BK maturity 20Nov2021, 100% strike, size €3.5M (€ quanto), BID
The macro should keep "2/ Worst Of Put xyz xy, TSLA UQ, KK BK" in one cell and move "maturity 20Nov2021, 100% strike, size €3.5M (€ quanto), BID" one cell to the right.
Many thanks for the help,
Ragnar
Step 1) Do a Find/Replace on your data. Find maturity, replace with ~maturity. [Note: This assumes you won't have ~ anywhere in the strings. Use another character if you have ~ somewhere.]
Step 2) Highlight your data, go to Text to Columns, and split on a delimiter ~
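The logic of the split itself, independent of Excel, can be sketched in Python. This is not a macro, only an illustration of splitting at the first occurrence of the keyword; the function name split_at_keyword is made up.

```python
from typing import Tuple


def split_at_keyword(text: str, keyword: str = "maturity") -> Tuple[str, str]:
    """Split text into (part before keyword, keyword plus everything after)."""
    before, found, after = text.partition(keyword)
    if not found:
        # Keyword absent: keep the whole text in the first cell.
        return text, ""
    return before.rstrip(), found + after


row = ("2/ Worst Of Put xyz xy, TSLA UQ, KK BK maturity 20Nov2021, "
       "100% strike, size €3.5M (€ quanto), BID")
left, right = split_at_keyword(row)
print(left)
print(right)
```

Unlike the ~ delimiter trick, this needs no character that is guaranteed to be absent from the data.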

How to find numeric values Before & After a String in an Excel Cell

I hope I can get some assistance as to which formula to use. In the three rows below, I am trying to pull values from the right.
In the first line you can see 10X50, meaning 10 packages with 50 items each, so I need to extract the values before and after the X.
The result could go in two cells, with the value before the X in one cell and the value after the X in the next. Sometimes the X is located a few spaces before the last word. I'm wondering if any kind soul can help, please?
DEXTROSE 50% 2G/ML 10X50 LSSYR
LEVETIRACETAM INJ USP 500MG SSOL 25X5
DOBUTAMINE 100 INJ 1X5 ML AMP SAM (PF)
This should work for you. It assumes the measurement is at or near the end, and it looks for the last occurrence of "X", so it will not work if there is another X after the measurement. SUBSTITUTE is case-sensitive, so the formula uses an uppercase "X" to match your data. Also, your examples had only numbers between 1 and 99 (no more than two digits), so this formula will not work if the measurement is longer than 5 characters: aaXbb is OK, aaaXbb is not.
=TRIM(RIGHT(LEFT(A1,SEARCH("^^",SUBSTITUTE(A1,"X","^^",LEN(A1)-LEN(SUBSTITUTE(A1,"X",""))))+2),5))
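As a cross-check of what the formula is meant to find, here is a small Python sketch that pulls the two numbers apart directly. It assumes the measurement always looks like digits, X, digits with no spaces, and it takes the last such occurrence; the name pack_sizes is hypothetical.

```python
import re

# digits, an X in either case, digits
PACK = re.compile(r"(\d+)X(\d+)", re.IGNORECASE)


def pack_sizes(text):
    """Return (value before X, value after X) for the last match, or None."""
    matches = PACK.findall(text)
    if not matches:
        return None
    before, after = matches[-1]
    return int(before), int(after)


for line in (
    "DEXTROSE 50% 2G/ML 10X50 LSSYR",
    "LEVETIRACETAM INJ USP 500MG SSOL 25X5",
    "DOBUTAMINE 100 INJ 1X5 ML AMP SAM (PF)",
):
    print(pack_sizes(line))
```

Requiring digits on both sides of the X means the X in DEXTROSE is ignored, and the number length is not limited to two digits.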

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
I am using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]). The values in the valuesToFind column have a consistent 3 characters on the left followed by varying characters (e.g. 908-123456 or 908-321654, i.e. 908 is always the same).
How can I tell the function to look up the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument. However, tbl.Lookup() also looks for just one match: it doesn't know how to combine values if more than one row in the table matches. So I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")
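The three steps (derive the prefix, group and sum, then look up) can be illustrated in plain Python. This is not Schematiq code, just a sketch of the same logic. The two 908 codes and the amounts 500 and 300 come from the question; which amount belongs to which code, and the third row, are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical input table (the pairing of amounts to codes is assumed).
rows = [
    {"code": "908-123456", "amount": 500},
    {"code": "908-321654", "amount": 300},
    {"code": "777-000001", "amount": 40},  # made-up extra row
]

# Steps 1 and 2: compute the 3-character prefix and subtotal amount per prefix.
totals = defaultdict(int)
for row in rows:
    start_of_code = row["code"][:3]
    totals[start_of_code] += row["amount"]

# Step 3: an exact-match lookup on the grouped table.
print(totals["908"])  # 800
```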

Excel SUMIF with exact word but not case sensitive match

I have a simplified table for this problem example
Column A       |  Column B  |  Column C
war            | 1          | war
War            | 2
warred         | 3
war and peace  | 4
awful war      | 5
dead war horse | 6
Now I need to find all rows containing the word "war", case-insensitively, but only as a separate word, not as part of another word.
For example
=SUMIF(A1:A6;C1;B1:B6)
right now finds only values "war" and "War" and SUM is 3.
I want it to find also values "war and peace", "awful war" and "dead war horse" since they all contain the word "war" and the SUM value should be 18.
I can't use search term
"*war*"
since this also includes the value "warred" and this is a separate word and shouldn't match.
One possibility is to create 4 different SUMIF-s with terms
war
war_*
*_war
*_war_*
_ is space
and then sum those four, but this is not that elegant.
I thought SUMPRODUCT with EXACT would work, but this seems to work over columns, not rows, and EXACT isn't suitable, I think.
Is there a way to match rows based on a word, case-insensitively, and then sum all the values in Column B for the matching rows?
You could use:
=SUMPRODUCT((ISNUMBER(SEARCH(" "&C1&" "," "&A1:A6&" ")))*B1:B6)
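The formula pads both the search word and each cell with a space on either side, so the word only matches when it is delimited by spaces or by the start or end of the cell, and SEARCH ignores case. A short Python sketch of the same padding trick on the table from the question (the function name contains_word is made up):

```python
rows = [
    ("war", 1),
    ("War", 2),
    ("warred", 3),
    ("war and peace", 4),
    ("awful war", 5),
    ("dead war horse", 6),
]
word = "war"


def contains_word(text: str, word: str) -> bool:
    """Case-insensitive whole-word test using space padding on both sides."""
    return f" {word.lower()} " in f" {text.lower()} "


total = sum(value for text, value in rows if contains_word(text, word))
print(total)  # 18
```

Note that this treats only spaces as word boundaries, so a cell such as "war, peace" would not match in either the formula or the sketch.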
