Macro to split up text into different rows depending on a keyword - excel

I have a varying number of rows of text which I paste into excel like those two below. The content will vary slightly but the overall structure will stay the same:
Now I need to split these up and would therefore like a macro which searches for the word "maturity" and selects this word and all the text on the right side of the word and moves it one cell to the right.
I tried splitting it up via Text to Columns, but the position of the word varies, and splitting on a space or comma destroys the rest of the data.
example:
1/ Worst Of Put K UN, WS UQ, XYZ YX maturity 22May2019, 80% strike, size q€7M (€ quanto), BID
2/ Worst Of Put xyz xy, TSLA UQ, KK BK maturity 20Nov2021, 100% strike, size €3.5M (€ quanto), BID
the macro should keep "2/ Worst Of Put xyz xy, TSLA UQ, KK BK" in one cell and move "maturity 20Nov2021, 100% strike, size €3.5M (€ quanto), BID" one cell to the right.
Many thanks for the help,
Ragnar

Step 1) Do a Find/Replace on your data. Find maturity, replace with ~maturity. [Note: This assumes you won't have ~ anywhere in the strings. Use another character if you have ~ somewhere.]
Step 2) Highlight your data, go to Text to Columns, and split on a delimiter ~
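For anyone who wants to check the split logic outside Excel, here is a minimal sketch in Python. It is not a macro and not part of the steps above; the function name split_at_keyword is made up for illustration. It keeps everything before the keyword in one piece and puts the keyword plus everything after it in a second piece, which is what the Find/Replace plus Text to Columns steps produce.

```python
def split_at_keyword(text, keyword="maturity"):
    """Return (left, right), where right starts at the first occurrence of keyword.

    If the keyword is missing, the whole text stays on the left.
    """
    pos = text.find(keyword)
    if pos == -1:
        return text, ""
    # Drop the trailing space on the left part; keep the keyword on the right.
    return text[:pos].rstrip(), text[pos:]


rows = [
    "1/ Worst Of Put K UN, WS UQ, XYZ YX maturity 22May2019, 80% strike, size q€7M (€ quanto), BID",
    "2/ Worst Of Put xyz xy, TSLA UQ, KK BK maturity 20Nov2021, 100% strike, size €3.5M (€ quanto), BID",
]

for row in rows:
    left, right = split_at_keyword(row)
    print(left, "|", right)
```

Note that the search is case-sensitive here, so "Maturity" would not be found; lower-casing a copy of the text before calling find would be one way around that.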

Related

Is there a way to find any one of a set of characters using an excel formula

I have data that uses a range, or a less than symbol to denote 'between 0 and number'. But multiple characters are used for the same purpose.
It looks like below (first two columns), plus a column showing the results I want:
|Country |Average hotdog consumption |Desired output |
|------- |-------------------------- |-------------- |
|Madeupaland |10-200 |105 |
|Exampledesh |50—1000 |525 |
|Republic of Notreal |<1000 |500 |
|Inventia |≤5000 |2500 |
Plus many rows where the data in the second column is purely numerical and doesn't need finessing into a number
I can use this formula to calculate the midpoint where there is a range:
=IFERROR(AVERAGE(LEFT(C2,FIND("–",C2)-1),RIGHT(C2, LEN(C2)-FIND("–",C2))), A2)
But this only covers one kind of dash (- and not —). Similarly, if I want to halve the numbers in rows with < and ≤, I'd need to replicate a formula for each symbol.
Is there a way of finding any one of several different characters from a set? My understanding is that FIND looks for the whole string of characters. SUBSTITUTE is a workaround, but I'd have to substitute every different value in the 'character set'.
In regex this would just be [-—].
I'm using Excel 2013 if that matters
It's not a perfect solution but you can try the following. This replaces those patterns of text with replacements representing which formula to use:
Create a Reference Table (I have made this in I1:K5)
|Pattern |Pattern Name |Substitution Rule |
|------- |------------ |----------------- |
|— |double dash |/2+0.5* |
|- |dash |/2+0.5* |
|< |lt |0.5* |
|≤ |lte |0.5* |
In your third column, enter the following array formula (confirm it with Ctrl + Shift + Enter):
=IF(ISNUMBER(B2),B2,"'="&SUBSTITUTE(B2,INDEX($I$2:$I$5,MIN(IF(ISNUMBER(FIND($I$2:$I$5,B2)),ROW($I$2:$I$5)-1,99))),INDEX($K$2:$K$5,MIN(IF(ISNUMBER(FIND($I$2:$I$5,B2)),ROW($I$2:$I$5),99)-1))))
Copy your third column and paste values into a fourth column.
Use Ctrl + H to replace every ' with nothing, so that Excel evaluates the expressions.
My Result:
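|Country |Average hotdog consumption |Desired output |Formula Paste |Output after replacing 's |
|------- |-------------------------- |-------------- |------------- |------------------------- |
|Madeupaland |10-200 |105 |'=10/2+0.5*200 |105 |
|Exampledesh |50—1000 |525 |'=50/2+0.5*1000 |525 |
|Republic of Notreal |<1000 |500 |'=0.5*1000 |500 |
|Inventia |≤5000 |2500 |'=0.5*5000 |2500 |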
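The question mentions that a regex character class would handle this directly. As a way to cross-check the expected values outside Excel, here is a rough Python sketch of that idea. It is not part of the answer above, and the names RANGE, UPPER_BOUND and midpoint are made up for illustration. It assumes a range is two non-negative numbers separated by a hyphen, en dash or em dash, and that < or ≤ means "between 0 and the number".

```python
import re

# Two numbers separated by a hyphen, en dash or em dash.
RANGE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*[-–—]\s*(\d+(?:\.\d+)?)\s*$")
# A single number preceded by < or ≤.
UPPER_BOUND = re.compile(r"^\s*[<≤]\s*(\d+(?:\.\d+)?)\s*$")


def midpoint(value):
    """Return the midpoint of a range, half of an upper bound, or the number itself."""
    text = str(value)
    m = RANGE.match(text)
    if m:
        return (float(m.group(1)) + float(m.group(2))) / 2
    m = UPPER_BOUND.match(text)
    if m:
        return float(m.group(1)) / 2
    # Purely numerical cells are passed through unchanged.
    return float(text)


for cell in ["10-200", "50—1000", "<1000", "≤5000", 42]:
    print(cell, "->", midpoint(cell))
```

Adding another symbol to the "character set" only means adding it inside the square brackets, which is the convenience the reference table tries to imitate.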

Splitting text in a cell everytime it finds a special character, in vba, and pasting values, starting at a specific cell, and every row below that

I have this string of text, it could be larger, this is an example:
2.01g 10k gold wedding band~15xps3 games~14.01 14k diamond solitaire with .30pt diamond~2ps3 games~14dvds
Every time it sees the "~", for example, I want it to paste values in cell g34. If there is more than one row, I want it to continue pasting the values into g35, g36, g37 and so on until the list is exhausted.
I want this to be done in VBA so I can attach it to a button. I do not want to do it by text to columns.
The result should look like this:
2.01g 10k gold wedding band
15xps3 games
14.01 14k diamond solitaire with .30pt diamond
2ps3 games
14dvds
Any help GREATLY appreciated...I can find similar solutions, but most want to paste it into new columns.
Consider:
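Sub parse()
    Dim s As String
    Dim arr As Variant
    s = "2.01g 10k gold wedding band~15xps3 games~14.01 14k diamond solitaire with .30pt diamond~2ps3 games~14dvds"
    ' Split the text into a zero-based array at every "~"
    arr = Split(s, "~")
    ' Write the pieces down column G, one per row, starting at G34
    Range("G34").Resize(UBound(arr) + 1, 1) = Application.Transpose(arr)
End Sub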
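To make it clear which piece ends up in which cell, here is the same split sketched in Python. It is only an illustration of the logic, not something to attach to a button; the function split_to_cells and its parameters are made up for the example.

```python
def split_to_cells(text, delimiter="~", column="G", first_row=34):
    """Map each piece of text to the cell address it would be pasted into."""
    parts = text.split(delimiter)
    # The first piece goes to G34, the next to G35, and so on down the column.
    return {f"{column}{first_row + i}": part for i, part in enumerate(parts)}


s = ("2.01g 10k gold wedding band~15xps3 games~"
     "14.01 14k diamond solitaire with .30pt diamond~2ps3 games~14dvds")

cells = split_to_cells(s)
for address, value in cells.items():
    print(address, value)
```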

How to find numeric values Before & After a String in an Excel Cell

I hope I can get some assistance as to which formula to use. In the three rows below, I am trying to pull values from the right.
First line you can see that we have 10x50 meaning 10 packages have 50 items each. So I need to extract values Before and After X
It could be two cells, where I have values Before X and then next cell values After X. Sometimes the X is located a few spaces before the last word. I'm wondering if any kind soul can help please?
DEXTROSE 50% 2G/ML 10X50 LSSYR
LEVETIRACETAM INJ USP 500MG SSOL 25X5
DOBUTAMINE 100 INJ 1X5 ML AMP SAM (PF)
This should work for you. It assumes the measurement is at or near the end of the text and looks for the last occurrence of "X", so it will not work if there is another X after the measurement. Your examples only had numbers between 1 and 99 (no more than two digits), and the formula relies on that: it will not work if the measurement is longer than 5 characters. aaXbb is OK; aaaXbb is not. SUBSTITUTE is case-sensitive, so the text is converted with UPPER before looking for the X.
=TRIM(RIGHT(LEFT(A1,SEARCH("^^",SUBSTITUTE(UPPER(A1),"X","^^",LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),"X",""))))+2),5))
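The formula returns the whole measurement (for example 10X50), while the question asks for the values before and after the X. As a sketch of that last step, here is the idea in Python using a regular expression. This is a different technique from the formula, shown only for comparison; the names PACK and pack_size are made up. It assumes the measurement is digits, an X in either case, then digits, and takes the last such match in the text.

```python
import re

# Digits, an X (either case), then digits, e.g. 10X50 or 1x5.
PACK = re.compile(r"(\d+)\s*[xX]\s*(\d+)")


def pack_size(description):
    """Return (packages, items_per_package) from the last NxM pattern, or None."""
    matches = PACK.findall(description)
    if not matches:
        return None
    before, after = matches[-1]
    return int(before), int(after)


samples = [
    "DEXTROSE 50% 2G/ML 10X50 LSSYR",
    "LEVETIRACETAM INJ USP 500MG SSOL 25X5",
    "DOBUTAMINE 100 INJ 1X5 ML AMP SAM (PF)",
]

for line in samples:
    print(line, "->", pack_size(line))
```

Because the pattern requires digits on both sides of the X, the X in DEXTROSE is ignored, and the two-digit limit of the formula does not apply.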

How do I sum data based on a PART of the headers name?

Say I have columns
/670 - White | /650 - black | /680 - Red | /800 - Whitest
These have data in their rows. Basically, I want to SUM their values together if their headers contain my desired string.
For modularity's sake, I wanted to merely specify to sum /670, /650, and /680 without having to mention the rest of the header text.
So, something like =SUMIF(a1:c1; "/NUM & /NUM & /NUM"; a2:c2)
That doesn't work, and honestly I don't know what i should be looking for.
Additional stuff:
I'm trying to think of the answer myself, is it possible to mention the header text as condition for ifs? Like: if A2="/650 - Black" then proceed to sum the next header. Is this possible?
Possibility it would not involve VBA, a draggable formula would be preferable!
At this point, I may as well request a version which handles the complete header name rather than just a part of it as I believe it to be difficult for formula code alone.
Thanks for having a look!
Let me know if I need to elaborate.
EDIT: In regards to data samples, any positive number will do actually, damn shame stack overflow doesn't support table markdown. Anyway, for example then..:
+---+-------------+-------------+-------------+-------------+-------------+
|   | A           | B           | C           | D           | E           |
+---+-------------+-------------+-------------+-------------+-------------+
| 1 |/650 - Black |/670 - White |/800 - White |/680 - Red   |/650 - Black |
+---+-------------+-------------+-------------+-------------+-------------+
| 2 | 250         | 400         | 100         | 300         | 125         |
+---+-------------+-------------+-------------+-------------+-------------+
I should have clarified:
The number range for these headers would go from /100 - /9999 and no more than that.
EDIT:
Progress so far:
https://docs.google.com/spreadsheets/d/1GiJKFcPWzG5bDsNt93eG7WS_M5uuVk9cvkt2VGSbpxY/edit?usp=sharing
Formula:
=SUMPRODUCT((A2:D2*
(MID($A$1:$D$1,2,4)=IF(LEN($H$1)=4,$H$1&"",$H$1&" ")))+(A2:D2*
(MID($A$1:$D$1,2,4)=IF(LEN($I$1)=4,$I$1&"",$I$1&" ")))+(A2:D2*
(MID($A$1:$D$1,2,4)=IF(LEN($J$1)=4,$J$1&"",$J$1&" "))))
Apparently, each MID function is returning false with each F9 calculation.
EDIT EDIT:
Okay! I found my issue: it's the / being read, when you ALSO mentioned that it wasn't required. Man, I should stop skimming!
Final Edit:
=SUMPRODUCT((RETURNSUM*
(MID(HEADER,2,4)=IF(LEN(Match5)=4,Match5&"",Match5&" ")))+(RETURNSUM*
(MID(HEADER,2,4)=IF(LEN(Match6)=4,Match6&"",Match6&" ")))+(RETURNSUM*
(MID(HEADER,2,4)=IF(LEN(Match7)=4,Match7&"",Match7&" "))))
The idea is that Header and RETURNSUM will become match criteria like the matches written above, that way it would be easier to punch new criterion into the search table. As of the moment, it doesn't support multiple rows/dragging.
I have knocked up a couple of formulas that will achieve what you are looking for. For ease I have made the search input require the number only as pressing / does not automatically type into the formula bar. I apologise for the length of the answer, I got a little carried away with the explanation.
I have set this up for 3 criteria located in J1, K1 and L1.
Here is the output I achieved:
Formula 1 - SUMPRODUCT():
=SUMPRODUCT((A4:G4*(MID($A$1:$G$1,2,4)=IF(LEN($J$1)=4,$J$1&"",$J$1&" ")))+(A4:G4*(MID($A$1:$G$1,2,4)=IF(LEN($K$1)=4,$K$1&"",$K$1&" ")))+(A4:G4*(MID($A$1:$G$1,2,4)=IF(LEN($L$1)=4,$L$1&"",$L$1&" "))))
SUMPRODUCT(array1,[array2]) behaves as an array formula without needing to be entered as one. Array formulas break down ranges and calculate them cell by cell (in this example we are using single rows, so the formula will assess the columns separately).
(A4:G4*(MID($A$1:$G$1,2,4)=IF(LEN($J$1)=4,$J$1&"",$J$1&" ")))
Essentially I have broken the Sumproduct() formula into 3 identical parts - 1 for each search condition. (A4:G4*: Now, as the formula behaves like an array, we will multiply each individual cell by either 1 or 0 and add the results together.
1 is produced when the next part of the formula is true and 0 for when it is false (default numeric values for TRUE/FALSE).
(MID($A$1:$G$1,2,4)=IF(LEN($J$1)=4,$J$1&"",$J$1&" "))
MID(text,start_num,num_chars) is being used here to assess the 4 digits after the "/" and see whether they match with the number in the 3 cells that we are searching from (in this case the first one: J1). Again, as SUMPRODUCT() works very much like an array formula, each cell in the range will be assessed individually.
I have then used the IF(logical_test,[value_if_true],[value_if_false]) to check the length of the number that I am searching. As we are searching for a 4 digit text string, if the number is 4 digits then add nothing ("") to force it to a text string and if it is not (as it will have to be 3 digits) add 1 space to the end (" ") again forcing it to become a text string.
The formula will then perform the calculation like so:
The MID() formula produces the array: {"650 ","670 ","800 ","680 ","977 ","9999","143 "}. This combined with the first search produces {TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE} which when multiplied by A4:G4
(remember 0 for false and 1 for true) produces this array: {250,0,0,0,0,0,0} essentially pulling the desired result ready to be summed together.
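To see the same TRUE/FALSE multiplication outside Excel, here is a rough Python sketch. It uses the sample headers and values from the question's table rather than the A1:G1 range of this answer, and the function names header_code, pad and sum_by_codes are made up for illustration.

```python
def header_code(header):
    """Mimic MID(header, 2, 4): the four characters after the leading "/"."""
    return header[1:5]


def pad(code):
    """Mimic IF(LEN(x)=4, x&"", x&" "): turn the code into 4 characters of text."""
    text = str(code)
    return text if len(text) == 4 else text + " "


def sum_by_codes(headers, values, codes):
    """Add up the values whose header code matches any of the search codes."""
    wanted = {pad(code) for code in codes}
    # True/False multiplies as 1/0, just like in the SUMPRODUCT formula.
    return sum(value * (header_code(header) in wanted)
               for header, value in zip(headers, values))


headers = ["/650 - Black", "/670 - White", "/800 - White", "/680 - Red", "/650 - Black"]
values = [250, 400, 100, 300, 125]

print(sum_by_codes(headers, values, [650, 670, 680]))
```

One difference to be aware of: the sketch collects the search codes in a set, so entering the same code twice counts each column once, whereas the formula adds one term per search cell and would count that column twice.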
Formula 2: =SUM(IF(Array)): [This formula does not work for 3 digit numbers as they will exist within the 4 digit numbers! I have included it for educational purposes only]
=SUM(IF(ISNUMBER(SEARCH($J$1,$A$1:$G$1)),A8:G8),IF(ISNUMBER(SEARCH($K$1,$A$1:$G$1)),A8:G8),IF(ISNUMBER(SEARCH($L$1,$A$1:$G$1)),A8:G8))
The formula will need to be entered as an array formula (once copied and pasted, while still in the formula bar, hit CTRL+SHIFT+ENTER).
This formula works in a similar way, SUM() will add together the array values produced where IF(ISNUMBER(SEARCH() columns match the result column.
SEARCH() returns a number when it finds the characters in a cell; that number is the position of the match, counted in characters. By using ISNUMBER() I avoid the whole MID() and IF(LEN()=4,""," ") step I used in the previous formula, as TRUE/FALSE will be produced when a match is found regardless of its position or cell formatting.
As previously mentioned, this poses a problem as 999 can be found within 9999 etc.
The resulting array for the first part is: {250,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE} (if you would like to see the array you can highlight that part of the formula and calculate with F9 but be sure to highlight the exact brackets for that part of the formula).
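The 999 versus 9999 problem is easy to reproduce. Here is a small Python sketch, with two made-up headers, that compares a "contains" test (what SEARCH does) with an exact comparison of the code (what the MID approach in Formula 1 does). The function names are hypothetical.

```python
def contains_match(headers, code):
    """Like ISNUMBER(SEARCH(code, header)): true if the code appears anywhere."""
    return [str(code) in header for header in headers]


def exact_match(headers, code):
    """Compare only the number between the leading "/" and the " - " separator."""
    return [header.split(" - ")[0].lstrip("/") == str(code) for header in headers]


# Hypothetical headers chosen to trigger the overlap.
headers = ["/999 - Blue", "/9999 - Green"]

print(contains_match(headers, 999))
print(exact_match(headers, 999))
```

The "contains" test flags both columns for 999, which is why Formula 2 overcounts when a 3 digit code is part of a 4 digit one.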
I hope I have explained this well, feel free to ask any questions about stuff that you don't understand. It is good to see people keen to learn and not just fishing for a fast answer. I would be more than happy to help and explain in more depth.
I start this solution with the names in an array, you can read the header names into an array with not too much difficulty.
Sub test()
    Dim myArray(1 To 4) As String
    Dim ArrayValue As Variant
    Dim endposition As Long, NumberValue As Long, Total As Long
    Dim stringvalue As String
    myArray(1) = "/670 - White"
    myArray(2) = "/650 - black"
    myArray(3) = "/680 - Red"
    myArray(4) = "/800 - Whitest"
    For Each ArrayValue In myArray
        'Find the position of the " - " separator that follows the number
        endposition = InStr(1, ArrayValue, " - ", vbTextCompare)
        'Grab the number section from the string, based on starting and ending positions
        stringvalue = Mid(ArrayValue, 2, endposition - 2)
        'Convert to number
        NumberValue = CLng(stringvalue)
        'Add to total
        Total = Total + NumberValue
    Next ArrayValue
    'Print total
    Debug.Print Total
End Sub
This will print the answer to the debug window.

Identify/highlight the most significant digits/characters from data arranged in two columns using the Vim editor

Is it possible in Vim editor to identify or highlight a common sequence of characters/digits from data arranged in two columns?
For instance,
0.0470013487688 40989223 0.0470013487688 002292
0.0421698758 73493412044 0.0421698758 476354659
0.0417166986 15951258722 0.0417166986 257990344
0.04167166 8474116192737 0.04167166 69861257942
0.041667 018771432653979 0.041667 1666698611258
0.0416 78177953892309171 0.0416 667166666986111
0.04 4004728342134522001 0.04 16666716666669861
0.04 0846598100993794511 0.04 16666671666666699
The first location where the digits in the two columns are different is shown with a space.
The goal is to highlight the most significant digits obtained in a computation (left column) that agree with the respective exact values (right column).
Based on source data like this:
0.047001348768840989223 0.0470013487688002292
0.042169875873493412044 0.0421698758476354659
0.041716698615951258722 0.0417166986257990344
0.041671668474116192737 0.0416716669861257942
0.041667018771432653979 0.0416671666698611258
0.041678177953892309171 0.0416667166666986111
0.044004728342134522001 0.0416666716666669861
0.040846598100993794511 0.0416666671666666699
the following pattern will match as many digits in the first column that are identical with the ones in the second column:
/^\(\S\+\)\ze\S*\s\+\1
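This captures a run of non-whitespace characters (\S; you can refine that part) at the start of the line and ends the match there (\ze). It then requires that the rest of the first column (\S*), some whitespace (\s\+), and the same captured characters (\1) follow. Because \+ is greedy, the match is the longest prefix that the two columns share.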
I hope this is what you meant; it wasn't entirely clear to me.
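To double-check what the pattern should highlight, the shared leading digits can also be computed outside Vim. Here is a minimal Python sketch using os.path.commonprefix, which compares strings character by character; the function name common_digits is made up for illustration.

```python
import os


def common_digits(computed, exact):
    """Return the leading characters that both columns share."""
    return os.path.commonprefix([computed, exact])


lines = [
    "0.047001348768840989223 0.0470013487688002292",
    "0.042169875873493412044 0.0421698758476354659",
    "0.040846598100993794511 0.0416666671666666699",
]

for line in lines:
    computed, exact = line.split()
    print(common_digits(computed, exact))
```

For these three lines the results agree with the spaced-out version in the question (0.0470013487688, 0.0421698758 and 0.04).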

Resources