RFM Segmentation Excel Take Mid Value - excel

I am trying to segment customers with RFM segmentation, scoring each of the columns R, F and M from 1 to 5. After I combine the three columns, there are many possible codes, such as 151, 555 or 254.
Code 555 is the best customer, and X5X is the loyal customer. "X" stands for any digit, e.g. code 454 also falls in the Loyal Customer segment.
The problem is that I cannot get the IF function in Excel to work correctly. Here is my attempt for 555:
=IF(O14="555","Best Customer",IF(MID(O14,2,1)="5","Loyal Customer"))
The conditions overlap and the formula seems to take the last IF, so the result for 555 is Loyal Customer when it should be Best Customer. There are more segments, such as XX5 for the big spenders, but since the conditions overlap I cannot continue with the rest. Thank you for your help.

If you only want 1 result per number, then you just need to put the highest priority first. The first IF that is TRUE will be the one that is used. From your example, it looks like "455" would be both a Loyal Customer and a Big Spender. We can't tell from your explanation what the result should be in this case. But whichever is the higher priority should just come earlier in your nested IF statements.
Your formula looks correct that 555 should return "Best Customer". If it is returning Loyal Customer, then it seems like you've got 555 stored as a number rather than text in O14. If it is a number, your formula should instead be:
=IF(O14=555,"Best Customer",IF(MID(O14,2,1)="5","Loyal Customer"))
The only difference is removing the quotes around 555. If O14 is stored as a number, then O14="555" is comparing a number to a string (three "5" characters rather than the number 555), which will always return FALSE, hence it moves on to the next IF statement. To get a TRUE result at the start, you need to compare O14 to the number 555 instead.
You may then be confused about why the 2nd part of the formula works. This is because the MID function will accept a number as input and then force a type conversion.
When you use the = operator, Excel only treats like values as equal. It can compare strings to strings or numbers to numbers, but a number compared to a string is never equal.
However, the MID function will accept either strings or numbers. When it is given a number, it will first convert it to a string and then output a string.
If it is given MID(555,2,1), it first changes 555 to "555" and gives the same result as MID("555",2,1), which is the character "5" rather than the number 5.
So, even if O14 has the number 555, MID(O14,2,1) will return the character "5" and the comparison MID(O14,2,1)="5" will return TRUE.
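The two points of this answer (check the highest-priority segment first, and convert the code to text before looking at single digits) can be sketched outside Excel. The Python below is only an illustration: the function name `segment`, the "Other" fallback, and the choice to rank Loyal Customer above Big Spender are assumptions, since the question does not say which of the two wins for a code like 455.

```python
def segment(code):
    """Return the RFM segment for a 3-digit code given as a number or as text."""
    text = str(code)  # like MID, turn a number into text before slicing it

    # The first matching test wins, so the most specific rule comes first.
    if text == "555":
        return "Best Customer"
    if text[1] == "5":      # X5X, same test as MID(O14,2,1)="5"
        return "Loyal Customer"
    if text[2] == "5":      # XX5 (assumed to rank below Loyal Customer)
        return "Big Spender"
    return "Other"          # assumed fallback, not part of the question


print(segment(555))    # Best Customer
print(segment("454"))  # Loyal Customer
print(segment(125))    # Big Spender
```

Because the code is converted to text once at the top, the same tests work whether the value arrives as the number 555 or the string "555".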

Related

Excel Formula Extract any number greater than x characters from a string

I have a file which contains a list of data. Each cell holds a name, a number and a date. The date is either mm/yy, mm-yy, mm-yyyy, etc. (never the day, just month and year).
The number I need is always going to be longer than 5 characters. Is there a way to get just that number from the string?
xx company holding - 96923432 -02-22. (number required 96923432)
yy Company (HOLDINGS) LTD - 131002204 - 02/2023 (number required 131002204)
ab HOLDINGS LIMITED / 115472907 / Feb-23 (number required 115472907)
... prior removed
=========UPDATE=========
This formula will work for you. It splits your data by space, converts each piece to a number, and then takes the max. If some cells may not contain a number longer than 5 digits, adjust as needed by wrapping it in an IF().
=MAX(IFERROR(NUMBERVALUE(TEXTSPLIT(A2," ")),0))
This is interesting since you use 2 different delimiters. No worries, though: you can use the following to capture both cases. If you have more possible delimiters, just add them between the {} in both the TEXTBEFORE and TEXTAFTER functions. Here is an example of the formula:
=TEXTBEFORE(TEXTAFTER(A2, {"-","/"}), {"-","/"})
If you want to return nothing when the output is not longer than 5 characters, this should work for you:
=IF(LEN(TEXTBEFORE(TEXTAFTER(A1,{"-","/"}),{"-","/"}))>5,TEXTBEFORE(TEXTAFTER(A1,{"-","/"}),{"-","/"}),"")
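For comparison, the same extraction can be expressed as a single regular expression. This is a Python sketch rather than an Excel formula, and it swaps in a different technique from the formulas above: it takes the first run of at least 6 digits instead of splitting on delimiters. The function name `extract_long_number` is made up for the example.

```python
import re

# "Greater than 5 characters" is read here as a run of 6 or more digits.
LONG_NUMBER = re.compile(r"\d{6,}")


def extract_long_number(text):
    """Return the first run of 6+ digits in text, or '' if there is none."""
    match = LONG_NUMBER.search(text)
    return match.group(0) if match else ""


samples = [
    "xx company holding - 96923432 -02-22.",
    "yy Company (HOLDINGS) LTD - 131002204 - 02/2023",
    "ab HOLDINGS LIMITED / 115472907 / Feb-23",
]
for line in samples:
    print(extract_long_number(line))
```

The date parts never exceed 4 digits in a row in the formats listed in the question, so they are skipped without any delimiter handling.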

How do I find the most common string in a column by replacing?

Suppose I have a column of wind directions ("N","S","W","E"). Each cell only contains 1 letter. If I want to find the most common wind direction, =CHAR(MODE(CODE(range))) will do the job.
But if I handle wind directions like "SW","NE", the above function would not work. I know that =INDEX(range, MODE(MATCH(range, range, 0 ))) will work.
Just curious: somewhat similar to the first function, is there a way to substitute strings with numbers of my choice only when passing the column into the MODE() function, so that it returns a number that I can MATCH() and INDEX() to get the result?
Clarify: Say that I have the following data
And I would like to substitute "N" with 0, "NE" with 45, "SW" with 225 and so on, so that MODE() will be applicable. If needed, I can then use functions like INDEX(MATCH()) to return the actual letter representation of the wind direction.
=SWITCH(MODE(SWITCH(A:A,"N",0,"NE",45,"E",90,"SE",135,"S",180,"SW",225,"W",270,"NW",315,"")),0,"N",45,"NE",90,"E",135,"SE",180,"S",225,"SW",270,"W",315,"NW","")
Or similar:
=CHOOSE(1+MODE(SWITCH(A:A,"N",0,"NE",45,"E",90,"SE",135,"S",180,"SW",225,"W",270,"NW",315,""))/45,"N","NE","E","SE","S","SW","W","NW")
This first translates the strings to values, then translates the MODE result back to its string.
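The translate, take the mode, translate back idea can be written out step by step. The Python below is a rough equivalent under a few assumptions: the names `DEGREES`, `NAMES` and `most_common_direction` are invented for the sketch, unknown labels are skipped (much as MODE ignores the "" default), and ties go to the direction seen first.

```python
from collections import Counter

# Same mapping as the SWITCH call: compass label -> bearing in degrees.
DEGREES = {"N": 0, "NE": 45, "E": 90, "SE": 135,
           "S": 180, "SW": 225, "W": 270, "NW": 315}
# Reverse mapping, used to turn the winning number back into a label.
NAMES = {deg: name for name, deg in DEGREES.items()}


def most_common_direction(directions):
    """Return the most frequent direction label, or '' if none is recognised."""
    codes = [DEGREES[d] for d in directions if d in DEGREES]
    if not codes:
        return ""
    mode_code, _count = Counter(codes).most_common(1)[0]
    return NAMES[mode_code]


print(most_common_direction(["SW", "NE", "SW", "N", "SW", "NE"]))  # SW
```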

How to extract text from a string between where there are multiple entries that meet the criteria and return all values

This is an example of the string, and it can be longer
1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)GL: Contract Services (510000), COT: Network (N), CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)
The result needs to be column1 600007056 column2 600007070 column3 600007053
I am working in Spotfire and creating calculated columns through transformations, as I need the columns to join to other data sets
I have tried the below, but it is only picking up the 1st 600.. number not the others, and there can be an undefined amount of those.
Account is the column with the string
Mid([Account],
Find("(000",[Account]) + Len("(000"),
Find("0001)",[Account]) - Find("(000",[Account]) - Len("(000"))
Thank you!
Assuming my guess is correct, and the pattern to look for is:
9 digits, starting with 6, preceded by an opening parenthesis and 3 zeros, and followed by a space, 4 digits and a closing parenthesis
you can grab individual occurrences with:
column1: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',1)
column2: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',2)
etc.
The tricky bit is knowing how many columns to define, since you say there can be many. One way to find out is to first calculate the maximum number of occurrences, like this:
maxn: Max((Len([Account]) - Len(RXReplace([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))','','g'))) / 9)
still assuming each number to extract has 9 digits. For each row, this takes the length of the original [Account], subtracts the length of the same string with every matched pattern replaced by an empty string, and divides the difference by 9. Max then gives the largest count over all rows.
You can then define up to maxn columns; the extra ones will be empty for rows with fewer occurrences.
Note that Spotfire always wants two backslashes for escaping (I had to add more in the editor to make it render correctly; I hope I have not missed any).
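To check the pattern away from Spotfire, it can be run against the sample string from the question. The Python below is only a test harness for the regular expression, not Spotfire syntax: the names `extract_codes` and `count_by_length` are invented, and a Python raw string needs single backslashes where Spotfire wants two.

```python
import re

# Same pattern as in the RXExtract calls, with single backslashes.
PATTERN = re.compile(r"(?<=\(000)6\d{8}(?=\s\d{4}\))")


def extract_codes(account):
    """Return every 9-digit code that matches the assumed pattern, in order."""
    return PATTERN.findall(account)


def count_by_length(account):
    """Count matches the way the maxn expression does: removed characters / 9."""
    stripped = PATTERN.sub("", account)
    return (len(account) - len(stripped)) // 9


sample = ("1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), "
          "PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)"
          "GL: Contract Services (510000), COT: Network (N), "
          "CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)")

print(extract_codes(sample))    # ['600007056', '600007070', '600007053']
print(count_by_length(sample))  # 3
```

The group "(510000)" is not picked up because it is not preceded by "(000", which is what the lookbehind enforces.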

Trying to increment a 4 character alphanumeric code in Excel

I'm trying to create a CSV file of one of my customer's serial numbers. We print them as barcodes for them to use, and normally I'd use our barcode software to generate the numbers. However, we're using a different method of printing, and it requires a CSV/Excel file of all the numbers to be printed. The barcode is as follows:
MC100VGVA.
The last digit is a check digit created from the rest of the string.
Now, my problem comes with the "VGVA" bit. Column A is the prefix (MC), Column B is the number (100), Column C is the incrementing 4 characters (VGVA), and Column D is the check digit.
I need the VGVA bit to increment alphanumerically. So, when it gets to VGVZ, I need it to go to VGW0. Then when it gets to VGZZ, it needs to go to VH00, and so on until it reaches ZZZZ, at which point Column B would increase to 101 and Column C would become 0000.
I've attempted to use the CHAR formula, as well as CONCATENATE, and MID. But, because I'm not well versed in these formulas, my attempts at editing them to work with 4 digits have been failing me.
I'm not opposed to using VBA if needed, but it's not something I've ever worked with, so you'll have to forgive any ignorance on my part.
Please let me know if you need more information. Thanks!
It looks like you are trying to create a new base, one with 27 digits (0 and all letters from 'A' to 'Z'). So I'd advise you to create a conversion to and from that 27-digit system.
Let me first explain what I mean using octal numbering (8 digits, from 0 to 7). In that system, take these values (just some examples):
a=0011
b=1237
c=1277
The meaning of those numbers is:
a equals 0*8^3 + 0*8^2 + 1*8^1 + 1*8^0 = 9, so:
a+1 equals 10, and converting this to octal numbering yields:
0012
b equals 1*8^3+2*8^2+3*8^1+7*8^0 = 671, so:
b+1 equals 672, and converting this to octal numbering yields:
1240
c equals 1*8^3 + 2*8^2 + 7*8^1 + 7*8^0 = 703, so:
c+1 equals 704, and converting this to octal numbering yields:
1300
I propose to do exactly the same for your 27-digit system, with the following example:
VGZZ equals 22*27^3 + 7*27^2 + 26*27^1 + 26 = 438857
VGZZ+1 equals 438858, and converting this to 27-digit numbering yields:
VH00
You can do this with a VBA function that you need to develop yourself. Converting from the string to a normal number is straightforward; for the other direction, you can use =MOD(...,27^3) and similar functions.
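The convert, add one, convert back procedure described here can be sketched as follows. This is Python rather than the VBA the answer suggests, and the names `ALPHABET`, `to_int`, `to_code` and `increment` are invented for the illustration. It follows this answer's assumption of 27 symbols (0 plus A to Z).

```python
# The 27 "digits" assumed by this answer: 0, then A..Z (so A=1, ..., Z=26).
ALPHABET = "0ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_int(code, alphabet=ALPHABET):
    """Convert a code such as 'VGZZ' to an ordinary integer."""
    value = 0
    for ch in code:
        value = value * len(alphabet) + alphabet.index(ch)
    return value


def to_code(value, width, alphabet=ALPHABET):
    """Convert an integer back to a code of exactly `width` symbols."""
    base = len(alphabet)
    symbols = []
    for _ in range(width):
        value, remainder = divmod(value, base)  # peel off the lowest digit
        symbols.append(alphabet[remainder])
    return "".join(reversed(symbols))           # anything above `width` is dropped


def increment(code, alphabet=ALPHABET):
    """Return the next code; the highest code wraps around to all zeros."""
    return to_code(to_int(code, alphabet) + 1, len(code), alphabet)


print(to_int("VGZZ"))     # 438857
print(increment("VGZZ"))  # VH00
print(increment("ZZZZ"))  # 0000
```

Since the alphabet is a parameter, passing a longer symbol string gives the same behaviour in another base.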
I believe I've found a non-VBA answer to this question, thanks to someone on another forum.
Here's what they suggested and it seems to be working perfectly:
B2
=B1+(C2="0000")
C2
=RIGHT(BASE(DECIMAL(C1,36)+1,36,4),4)
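To see how the B2 and C2 formulas work together, here is the same logic as a small Python sketch. The function name `next_row` and the constant `DIGITS36` are invented for the example; the comments point at the part of each formula that a line mirrors.

```python
DIGITS36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def next_row(number, suffix):
    """Return the (number, suffix) pair that follows the given row."""
    value = int(suffix, 36) + 1            # DECIMAL(C1,36)+1
    symbols = []
    for _ in range(len(suffix)):           # BASE(...,36,4), keeping the RIGHT 4
        value, remainder = divmod(value, 36)
        symbols.append(DIGITS36[remainder])
    new_suffix = "".join(reversed(symbols))
    wrapped = new_suffix == "0" * len(suffix)   # the (C2="0000") test
    return number + int(wrapped), new_suffix    # B1 + 1 only after a wrap


print(next_row(100, "VGVZ"))  # (100, 'VGW0')
print(next_row(100, "VGZZ"))  # (100, 'VH00')
print(next_row(100, "ZZZZ"))  # (101, '0000')
```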
and maybe try this at D1
=MID("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%",MOD(SUMPRODUCT(SEARCH(MID((A1&B1&C1),ROW($1:$99),1),
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%") )-99,43)+1,1)

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
Using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]) the values in the valuesToFind column have a consistent 3 characters to the left and then varying characters after (e.g. 908-123456 or 908-321654 - i.e. 908 is always consistent)
How can I tell the function to lookup the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")
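The three steps (derive the prefix, group and subtotal, then look up) can also be pictured in plain Python. This is not the Schematiq API, only an illustration of the data flow: the function name `subtotal_by_prefix` and the third sample row are made up, while the two 908 rows use the amounts from the question.

```python
from collections import defaultdict


def subtotal_by_prefix(rows, prefix_len=3):
    """Sum amounts per code prefix; rows are (code, amount) pairs."""
    totals = defaultdict(int)
    for code, amount in rows:
        totals[code[:prefix_len]] += amount   # LEFT(x, 3), then group and add
    return dict(totals)


rows = [
    ("908-123456", 500),
    ("908-321654", 300),
    ("777-000001", 50),   # hypothetical extra row to show a second group
]

totals = subtotal_by_prefix(rows)
print(totals["908"])  # 800
```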
