Advanced sorting in Excel - excel

In our warehouse we have an even/odd system of locations.
Here is an example:
1-101-1
1-103-1
1-105-1
....
1-285-1
and
2-102-1
2-104-1
2-116-1
2-240-1
....
2-286-1
and there are levels too:
1-101-2
1-101-3
1-101-4
There is a lot of data, and I need to sort it in a particular order.
Example numbers:
1-101-1
2-130-1
1-131-1
1-150-2
2-132-3
3-229-5
4-262-1
4-286-5
7-267-1
5-239-1
6-270-1
7-267-3
I need them sorted like this:
1-101-1
2-130-1
1-131-1
2-132-3
4-286-5
4-262-1
3-229-5
5-239-1
6-270-1
7-267-1
7-267-3
The point is that the first two numbers (1-101-1; 2-102-1) go from smallest to biggest, the next two (3-285-1; 4-286) go from biggest to smallest, 5 and 6 go from smallest to biggest again, and so on with that system to the end.
The second sort criterion is the middle number: it goes first from smallest to biggest, then from biggest to smallest. The last number is the level; a higher level belongs to the same location as level 1, so it must be sorted together with level 1 or placed next to it, as with 7-267-1 and 7-267-3.
Is there any solution? Thanks.
Edit:
Here is an image for easier understanding, because it is hard to explain.
Thanks all for the answers, especially Daniel, who is an expert in Excel and understood what I need.
I thought there was no solution for a sort like that without VBA, but Daniel showed me that I was wrong. Thanks again.
That is what I need, but there are some errors, if you can help me with that.
This is another example with other locations:
these are the unsorted locations with the formulas you gave me,
this is the sorted result, but in the wrong order,
and here it is with errors.
We have 120 rows; numbers bigger than 99 display an error, and the number 22-250-1 shows up as -25 in the second row.
I tried the formulas with the numbers you entered in this example and got the same good sort as you, but after entering other locations the sort goes wrong.

Welcome to StackOverflow!
I think I understand what is being requested. It's a bit difficult to explain but I'll give it a try.
The primary sorting is to be as follows:
1. If the first digit is either 3 or 4, the order should be descending; otherwise ascending.
2. If the middle three digits belong to a sequence whose first digit is 3 or 4 (see #1 above), the middle number should be in descending order.
3. All sequences should be in ascending order based on their final digit.
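The rules above can also be expressed as a single sort key. The following is a minimal Python sketch, not the formula approach of this answer. It assumes the asker's fuller description (first numbers 1-2 ascending, 3-4 descending, 5-6 ascending, and so on in alternation), and the names `location_key` and `sort_locations` are hypothetical.

```python
def location_key(location: str) -> tuple[int, int, int]:
    """Build a sort key for a location code such as '4-262-1'."""
    first, middle, level = (int(part) for part in location.split("-"))
    pair = (first - 1) // 2         # first numbers 1-2 -> 0, 3-4 -> 1, 5-6 -> 2, ...
    descending = pair % 2 == 1      # assumption: every second pair runs backwards
    # Negating the middle number reverses its order inside a descending pair.
    return (pair, -middle if descending else middle, level)


def sort_locations(locations: list[str]) -> list[str]:
    """Return the locations in the alternating (serpentine) order."""
    return sorted(locations, key=location_key)


sample = ["1-101-1", "2-130-1", "1-131-1", "1-150-2", "2-132-3", "3-229-5",
          "4-262-1", "4-286-5", "7-267-1", "5-239-1", "6-270-1", "7-267-3"]
for code in sort_locations(sample):
    print(code)
```

Because the level is the last element of the key, codes that differ only in level (such as 7-267-1 and 7-267-3) always end up next to each other.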
My solution breaks the sequence into distinct columns:
For example, create three columns: First, Second, Third.
Formula for First:
=INT(LEFT(A2, 1))
Formula for Second:
=INT(RIGHT(LEFT(A2,5), 3))
Formula for Third:
=INT(RIGHT(A2,1))
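Note that the three formulas above take fixed character positions, so they assume a one-digit first number and a three-digit middle number. For a code such as 22-250-1, RIGHT(LEFT(A2,5),3) yields "-25". As a hedged illustration of a width-independent split (in Python rather than Excel; the helper name `split_location` is hypothetical), the fields can be separated on the hyphen instead:

```python
def split_location(location: str) -> tuple[int, int, int]:
    """Split 'first-middle-level' on hyphens, whatever the width of each part."""
    first, second, third = location.split("-")
    return int(first), int(second), int(third)


print(split_location("1-101-1"))
print(split_location("22-250-1"))
```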
Next, we assign values for sorting these three fields:
Create a column labeled First_Sort_Pair:
=IF(OR(B2=1,B2=2),1,
IF(B2=3,3,
IF(B2=4,2,
IF(OR(B2=5,B2=6),4,
IF(OR(B2=7,B2=8),5,6)))))
Create a column labeled First_Sort:
=IF(OR(B2=3, B2=4), 2, 1)
Create a column labeled Second_Sort:
=IF(E2=4, 2, IF(E2=3, 3, 1))
Create a column labeled Sort_3_4:
=IF(OR(B2=3,B2=4),RANK(C2,C:C,0),)
You can now begin sorting:
Result:
You will now have your data sorted as intended:

Related

Counting if part of string is within interval

I am currently trying to check if a number in a comma-separated string is within a number interval. What I am trying to do is to check if an area code (from the comma-separated string) is within the interval of an area.
The data:
AREAS
Area interval | Name | Number of locations
1000-1499 | Area 1 | ?
1500-1799 | Area 2 | ?
1800-1999 | Area 3 | ?
GEOLOCATIONS
Name | Areas List
Location A | 1200, 1400
Location B | 1020, 1720
Location C | 1700, 1920
Location D | 1940, 1950, 1730
The result I want is the number of unique locations whose "Areas list" contains at least one code within the area interval. So Location D should count only ONCE in the 1800-1999 area, and likewise Location A only once in the 1000-1499 area. But Location B should count once in both 1000-1499 and 1500-1799 (because a number from each interval is in its comma-separated "Areas list" string):
Area interval | Name | Number of locations
1000-1499 | Area 1 | 2
1500-1799 | Area 2 | 3
1800-1999 | Area 3 | 2
How is this possible?
I have tried with COUNTIFS, but it doesn't seem to do the job.
Here is one option using FILTERXML():
Formula in C2:
=SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]"))
Where:
"<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>" - This is the part where we construct a valid piece of XML. The theory is that we use three axes. Each t-node is given a literal 1 as its value, so that once we return the nodes with XPath we can sum the result. The outer x-node is there to make sure Excel handles the inner axes correctly. If you are curious how this XML looks at the end, it's best to step through it using the 'Evaluate Formula' function on the Formulas tab;
//t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]")) - Basically means that we collect all t-nodes where the count of child s-nodes that are >= the leftmost number and <= the rightmost number is larger than zero. For A2 the XPath would look like //t[count(.//*[.>=1000][.<=1499])>0] after substitution. In short: //t selects t-nodes, and count(.//*[.>=1000][.<=1499])>0 keeps those where the count of child nodes that fulfill both requirements is larger than zero;
Since all t-nodes equal the number 1, the SUM() of these t-nodes equals the number of unique locations that have at least one area code from their Areas List inside the interval;
Important to note that FILTERXML() will result into an error if no t-nodes could be found. That would mean we need to wrap the FILTERXML() in an IFERROR(...., 0) to counter that and make the SUM() still work correctly.
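To make the constructed XML easier to picture, here is a minimal Python sketch that builds the same string and applies the same selection. It is an illustration only: the function names are hypothetical, and because Python's ElementTree supports only a subset of XPath (no count()), the predicate is evaluated in plain Python rather than with the XPath expression from the formula.

```python
import xml.etree.ElementTree as ET

areas_lists = ["1200, 1400", "1020, 1720", "1700, 1920", "1940, 1950, 1730"]


def build_xml(rows: list[str]) -> str:
    # Mirrors "1<s>" & SUBSTITUTE(row, ", ", "</s><s>") joined with "</s></t><t>".
    items = ["1<s>" + row.replace(", ", "</s><s>") for row in rows]
    return "<x><t>" + "</s></t><t>".join(items) + "</s></t></x>"


def count_locations(rows: list[str], interval: str) -> int:
    low, high = (int(n) for n in interval.split("-"))
    root = ET.fromstring(build_xml(rows))
    # Keep each t-node that has at least one s-child inside [low, high].
    hits = [t for t in root.findall("t")
            if any(low <= int(s.text) <= high for s in t.findall("s"))]
    # Every kept t-node carries the literal text "1", so summing counts locations.
    return sum(int(t.text) for t in hits)


print(build_xml(areas_lists[:2]))
for interval in ("1000-1499", "1500-1799", "1800-1999"):
    print(interval, count_locations(areas_lists, interval))
```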
Or, wrap the FILTERXML() formula in BYROW():
Formula in C2:
=BYROW(A2:A4,LAMBDA(a,SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(a,"-","][.<=")&"])>0]"))))
Using MMULT and TEXTSPLIT:
=LET(rng,TEXTSPLIT(D2,"-"),
tarr,IFERROR(--TRIM(TEXTSPLIT(TEXTJOIN(";",,$B$2:$B$5),",",";")),0),
SUM(--(MMULT((tarr>=--TAKE(rng,,1))*(tarr<=--TAKE(rng,,-1)),SEQUENCE(COLUMNS(tarr),,1,0))>0)))
I am in very distinguished company but will add my version anyway, as BYROW is probably a slightly different approach:
=LET(range,B$2:B$5,
lowerLimit,--INDEX(TEXTSPLIT(E2,"-"),1),
upperLimit,--INDEX(TEXTSPLIT(E2,"-"),2),
counts,BYROW(range,LAMBDA(r,SUM((--TEXTSPLIT(r,",")>=lowerLimit)*(--TEXTSPLIT(r,",")<=upperLimit)))),
SUM(--(counts>0))
)
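The logic of this LET formula (split the interval, count matching codes per row, then count the rows with at least one match) can be restated as a short Python sketch. The function name `count_in_interval` is hypothetical; the data is taken from the question.

```python
def count_in_interval(areas_lists: list[str], interval: str) -> int:
    """Count rows that contain at least one code inside 'low-high'."""
    lower, upper = (int(n) for n in interval.split("-"))
    # One count per row, like the BYROW step; int() ignores the space after commas.
    counts = [
        sum(lower <= int(code) <= upper for code in row.split(","))
        for row in areas_lists
    ]
    # A row counts once, however many of its codes fall inside the interval.
    return sum(1 for c in counts if c > 0)


geolocations = ["1200, 1400", "1020, 1720", "1700, 1920", "1940, 1950, 1730"]
for interval in ("1000-1499", "1500-1799", "1800-1999"):
    print(interval, count_in_interval(geolocations, interval))
```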
Here is the ugly way to do it, with A LOT of helper columns. But it is not so complicated 🙂
F4= =TRANSPOSE(FILTERXML("<m><r>"&SUBSTITUTE(B4;",";"</r><r>")&"</r></m>";"//r"))
F11= =TRANSPOSE(FILTERXML("<m><r>"&SUBSTITUTE(A11;"-";"</r><r>")&"</r></m>";"//r"))
F16= =SUM(F18:F21)
F18= =IF(SUM(($F4:$O4>=$F$11)*($F4:$O4<=$G$11))>0;1;"")
G18= =IF(SUM(($F4:$O4>=$F$12)*($F4:$O4<=$G$12))>0;1;"")
H18= =IF(SUM(($F4:$O4>=$F$13)*($F4:$O4<=$G$13))>0;1;"")

How to extract text from a string between where there are multiple entries that meet the criteria and return all values

This is an example of the string, and it can be longer:
1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)GL: Contract Services (510000), COT: Network (N), CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)
The result needs to be column1 600007056 column2 600007070 column3 600007053
I am working in Spotfire and creating calculated columns through transformations, as I need the columns to join to other data sets.
I have tried the expression below, but it only picks up the first 600... number and not the others, and there can be an undefined number of those.
Account is the column with the string
Mid([Account],
Find("(000",[Account]) + Len("(000"),
Find("0001)",[Account]) - Find("(000",[Account]) - Len("(000"))
Thank you!
Assuming my guess is correct, and the pattern to look for is:
9 digits, starting with 6, preceded by 1 opening parenthesis and 3 zeros, followed by a space, 4 digits and a closing parenthesis
you can grab individual occurrences by:
column1: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',1)
column2: RXExtract([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))',2)
etc.
The tricky bit is to find how many columns to define, as you say there can be many. One way to know would be to first calculate a max number of occurrences like this:
maxn: Max((Len([Account]) - Len(RXReplace([Account],'(?<=\\(000)6\\d{8}(?=\\s\\d{4}\\))','','g'))) / 9)
still assuming the number of digits in each column to extract is 9. This takes the difference in length between the original [Account] and the version with the matched patterns replaced by an empty string, and divides it by 9.
Then you know you can define up to maxn columns, the extra ones for the rows with fewer instances will be empty.
Note that Spotfire always wants two backslashes for escaping (I had to add more in the editor to make it render correctly; I hope I have not missed any).
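For checking the pattern outside Spotfire, the same regular expression can be tried in Python. This is only a sketch for testing the regex: Python's `re` needs single backslashes in a raw string, and the variable names are hypothetical. The sample text is the string from the question.

```python
import re

# Same pattern as in the RXExtract calls, with single backslashes for Python.
PATTERN = re.compile(r"(?<=\(000)6\d{8}(?=\s\d{4}\))")

account = ("1160752 Meranji Oil Sats -Mt(MA) (000600007056 0001), "
           "PE:Toolachee Gas Sats -Mt(MA) (000600007070 0003)"
           "GL: Contract Services (510000), COT: Network (N), "
           "CO: OM-A00009.0723,Oil Sats -Mt(MA) (000600007053 0003)")

# All occurrences at once; RXExtract returns them one at a time by index.
matches = PATTERN.findall(account)

# The length trick from the maxn expression: removed characters divided by 9.
occurrences = (len(account) - len(PATTERN.sub("", account))) // 9

print(matches)
print(occurrences)
```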

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
I am using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]). The values in the valuesToFind column have a consistent 3 characters on the left and then varying characters after (e.g. 908-123456 or 908-321654, i.e. 908 is always consistent).
How can I tell the function to look up the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")
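The three steps (derive a prefix column, group and subtotal, then look up) can be pictured with a plain Python sketch. This does not use Schematiq; the function name and the sample rows are hypothetical, apart from the two 908 codes and their amounts of 500 and 300 from the question.

```python
from collections import defaultdict

# Hypothetical input table of (code, amount) rows.
rows = [("908-123456", 500), ("908-321654", 300), ("417-555000", 120)]


def subtotal_by_prefix(table, width=3):
    """Group rows by the first `width` characters of the code and sum the amounts."""
    totals = defaultdict(int)
    for code, amount in table:
        totals[code[:width]] += amount   # same idea as LEFT(x, 3) followed by grouping
    return dict(totals)


totals = subtotal_by_prefix(rows)
print(totals["908"])   # exact-match lookup on the grouped table
```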

Obtain every nth row of filtered records

I'm looking for information on how to copy every nth row of records from one Excel sheet to the next, and I am wondering if there is a way to do this for filtered data. For example, I have 400 students enrolled at school, and I want every 15th male whose parents have not graduated from college (flags have been created for both gender and parent education, which I am using to filter on). Are there any ideas on how to do this? If not, I could just use the OFFSET function for each combination of variables I am filtering on, but that's over 30-40 combinations if I did my math right. Thanks for any help you can provide.
There are a few standard formulas used for retrieving the first, second, third, etc set of values that match criteria. I prefer a standard formula model using the INDEX function and SMALL function. By throwing a little maths at the increment to change it from 1, 2, 3 ... to 1, 16, 31, 46, ... you should be able to achieve your offset results. In the following example image, I've used a stagger of 4 rather than 15 in order to accommodate sample data vertically while still producing more than a single result.
        
The formula in F2 is,
=IFERROR(INDEX(A$2:A$999, SMALL(INDEX(ROW($1:$998)+((C$2:C$999<>"M")+(D$2:D$999<>"N"))*1E+99, , ), 1+(ROW(1:1)-1)*4)), "")
For your purposes the 4 in 1+(ROW(1:1)-1)*4 will need to be changed to 15.
=IFERROR(INDEX(A$2:A$999, SMALL(INDEX(ROW($1:$998)+((C$2:C$999<>"M")+(D$2:D$999<>"N"))*1E+99, , ), 1+(ROW(1:1)-1)*15)), "")
Fill down as necessary.
Once you have retrieved a unique identifier, the remainder can be retrieved with a simple VLOOKUP function.
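The idea behind the formula (keep only the rows that pass the filter, then take the 1st, (1+n)th, (1+2n)th, ... of those) can be sketched in Python. The function name, the row layout and the sample data below are all hypothetical.

```python
def every_nth_match(rows, predicate, step):
    """Return the 1st, (1+step)th, (1+2*step)th, ... row that satisfies predicate."""
    matches = [row for row in rows if predicate(row)]
    return matches[::step]


# Hypothetical rows: (student_id, gender, parent_graduated_flag)
students = [
    (i, "M" if i % 2 else "F", "Y" if i % 3 == 0 else "N")
    for i in range(1, 21)
]

picked = every_nth_match(
    students,
    lambda row: row[1] == "M" and row[2] == "N",
    step=4,   # the formula's stagger; use 15 for every 15th match
)
print([row[0] for row in picked])
```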

Using VLookUp for a partial search

I have two tables in Excel.
In table 1, one column contains a list of order numbers. These are in the format XXXX-YYYY, where X is a digit and Y is a letter. For example 3485-XTIP
Table 2 also has an order number column but this time it's in the format XXXX-YYYY (ZZ) where Z is the initials of the customer who made the order. Example: 3485-XTIP (KN)
How can I use a VLookUp to search for the order number in Table 2 but only using the XXXX-YYYY part? I tried using TRUE for an approximate search but it still failed for some reason.
This is what I have
=VLOOKUP("I3",'Table2 '!A:B,2,FALSE)
I am open to any alternatives other than VLookup for this situation.
Note that there are hundreds of order numbers and entering the strings manually will take forever.
You can use * as wildcard and add it at the end of the order number so that your VLOOKUP will match any order plus any other characters that come after it:
=VLOOKUP(I3&"*", 'Table2 '!A:B, 2, 0)
* will match anything after the order number.
Note: 0 and False have the same behaviour here.
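What the trailing wildcard does can be pictured as a prefix match that returns the first hit. Below is a minimal Python sketch of that behaviour; the function name and the table contents are hypothetical. The comparison ignores case, as VLOOKUP does.

```python
def lookup_prefix(key, table):
    """Return the value of the first row whose label starts with key, else None."""
    key = key.lower()
    for label, value in table:
        if label.lower().startswith(key):   # the role of the trailing "*"
            return value
    return None


# Hypothetical Table2 rows: (order number with initials, value to return)
table2 = [("3485-XTIP (KN)", "shipped"), ("3490-ABCD (JS)", "pending")]

print(lookup_prefix("3485-XTIP", table2))
```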
