Hi Stackoverflow Community,
This is my first post, so apologies in advance if the question could be better structured; I'll try to improve it later.
I have two Excel columns, one with street address numbers and one with street addresses. Neither column is unique, so using VLOOKUP or INDEX/MATCH is tricky.
I am trying to populate another column with the minimum and maximum street address number using TEXTJOIN, and it works. However, I need the min/max for each specific street address group, and there are close to 1 million rows of data.
For example: min=1, max=13 on florence st; min=3, max=53 on gibson st.
Would a formula like this help?
=MINIFS(B$2:B$10,A$2:A$10,"Ramsay street")
As you see, I take the minimum value of the "B" column, based on a criterion on the "A" column.
If you are on Excel 365 - current channel you can use this formula:
=LET(streetnamesUnique,UNIQUE(data[StreetName]),
minNumber,BYROW(streetnamesUnique,LAMBDA(s, MINIFS(data[Number],data[StreetName],s))),
maxNumber,BYROW(streetnamesUnique,LAMBDA(s, MAXIFS(data[Number],data[StreetName],s))),
HSTACK(streetnamesUnique,minNumber,maxNumber))
If you are on the Excel 365 semi-annual channel:
=LET(streetnamesUnique,UNIQUE(data[StreetName]),
minNumber,BYROW(streetnamesUnique,LAMBDA(s, MINIFS(data[Number],data[StreetName],s))),
maxNumber,BYROW(streetnamesUnique,LAMBDA(s, MAXIFS(data[Number],data[StreetName],s))),
MAKEARRAY(ROWS(streetnamesUnique),3,LAMBDA(r,c,
IF(c=1,INDEX(streetnamesUnique,r),
IF(c=2,INDEX(minNumber,r),
INDEX(maxNumber,r))))))
Both formulas first retrieve the unique street names, then retrieve the min and max number for each street name.
Finally, the new range is built from these values, either with HSTACK or with MAKEARRAY.
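To check the grouping logic outside the spreadsheet, here is a minimal Python sketch of the same idea: one pass over the rows, keeping a running min and max per street. The function name and the sample rows are made up for illustration; only the two street names and their min/max values come from the question.

```python
def min_max_by_street(rows):
    """Return {street: (min_number, max_number)} for (street, number) pairs."""
    result = {}
    for street, number in rows:
        if street in result:
            lo, hi = result[street]
            result[street] = (min(lo, number), max(hi, number))
        else:
            # First time this street is seen: it is both min and max so far.
            result[street] = (number, number)
    return result


# Hypothetical sample rows; the real sheet has close to a million of them.
rows = [
    ("florence st", 1),
    ("gibson st", 53),
    ("florence st", 13),
    ("gibson st", 3),
    ("florence st", 7),
]
summary = min_max_by_street(rows)
```

A single pass like this touches each row once, which may matter at a million rows, whereas MINIFS/MAXIFS rescans the whole column for every unique street.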
I am trying to separate information copied from a PDF table. I'd usually use Text to Columns, but the only delimiter is spaces, and this then splits the data into multiple unusable columns.
The data comes like this:
Raw Data
A1 Company 0
Company2 40000
name a 1
name b 15
name c 184
Big 17 Company 1887
I need the output to be:
Company          Units
A1 Company       0
Company2         40000
name a           1
name b           15
name c           184
Big 17 Company   1887
So the company name (which might contain numbers) is separated from the unit number (which could be 1-5 digits long).
I haven't been able to figure out a way that uses =LEN(), because the string length isn't constant and the trailing number doesn't have a consistent number of digits.
I'm currently using:
=SUMPRODUCT(MID(0&A2, LARGE(INDEX(ISNUMBER(--MID(A2, ROW(INDIRECT("1:"&LEN(A2))), 1)) * ROW(INDIRECT("1:"&LEN(A2))), 0), ROW(INDIRECT("1:"&LEN(A2))))+1, 1) * 10^ROW(INDIRECT("1:"&LEN(A2)))/10)
This gives me all the numbers in the cell, which works for 90% of the data, as most of the companies don't have numbers in their name. But for something like 'A1 Company 0' it gives 10 as the output, not just the 0. I then go and manually edit the small number of companies that this happens to.
I then use a mixture of =LEN() =LEFT and =RIGHT to split the information up as required for the further automated analysis.
I'd prefer a formula over VBA/macro
I can't provide the actual data, but I hope I've given enough examples in the table above to show the main problems (different company name lengths, companies with numbers in their name, different numbers of digits representing the units).
I'm using LibreOffice, but this formula finds the last space in the cell and returns everything after it:
=RIGHT(A1,LEN(A1)-FIND("#",SUBSTITUTE(A1," ","#",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))),1))
Taken from: https://trumpexcel.com/find-characters-last-position/
FILTERXML() would be the best choice for this case. Try:
=FILTERXML("<t><s>"&SUBSTITUTE(A1:A6," ","</s><s>")&"</s></t>","//s[last()]")
Details about FILTERXML() from JvdV here.
See if the following works for you:
Formula in B2:
=LEFT(A2,LEN(A2)-1-LEN(C2))
In C2:
=-LOOKUP(1,-RIGHT(A2,ROW($1:$5)))
For users who have the newest Microsoft 365 functions:
=HSTACK(TEXTBEFORE(A2," ",-1),TEXTAFTER(A2," ",-1))
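All of the answers above rely on the same observation: the unit count is whatever follows the last space. A minimal Python sketch of that split, using the sample rows from the question (the function name is hypothetical, and it assumes every line ends in a space followed by digits):

```python
def split_company_units(line):
    """Split 'Company name 123' at the LAST space into (name, units)."""
    # rsplit with maxsplit=1 only cuts at the final space, so digits
    # inside the company name ("A1", "Big 17") stay with the name.
    name, units = line.rstrip().rsplit(" ", 1)
    return name, int(units)


raw_data = [
    "A1 Company 0",
    "Company2 40000",
    "name a 1",
    "name b 15",
    "name c 184",
    "Big 17 Company 1887",
]
parsed = [split_company_units(line) for line in raw_data]
```

Splitting from the right avoids having to know how long the name or the number is, which is the part that made the LEN-based attempts awkward.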
I am writing a series of queries against my workbook's data model to retrieve the number of documents by Category_Name that are more than a certain number of days old (e.g. >=650).
Currently this formula (entered in cell C3) returns the correct number for a single Days Old value (=3).
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[34]")
How do I return the number of documents for Days Old values >=650?
The worksheet looks like:
A B C
1 Date PL Count of Docs
2 10/1/2018 ALD 3
3 ...
UPDATE: I tried the approach suggested in ama's answer below, but the expression in step B did not work.
However, I created a subset of the Days Old values using
=CUBESET("ThisWorkbookDataModel",
"{[EDD_Report_10-01-18].[Days Old].[all].[650]:[EDD_Report_10-01-18].[Days Old].[All].[3647]}")
The cell containing this cubeset is referenced as the third Member_expression of the original CUBEVALUE formula. The limitation is now that the values for the beginning and end must be members of the Days Old set.
This is limiting: I was hoping for a more general test for >=650, and there is no way to guarantee that specific values of Days Old will be in the query.
First time I hear about CUBE, so you got me curious and I did some digging. Definitely not an expert, but here is what I found:
MDX language should allow you to provide value ranges in the form of {[Table].[Field].[All].[LowerBound]:[Table].[Field].[All].[UpperBound]}.
A. Get the total number of entries:
D3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[All]")
B. Get the number of entries less than 650:
E3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[0]:[EDD_Report_10-01-18].[Days Old].[All].[649]}")
Note I found something about using .[All].[650].lag(1)} but I think for it to work properly your data might need to be sorted?
C. Subtract
C3 =D3-E3
Alternatively, go for the quick and dirty:
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[650]:[EDD_Report_10-01-18].[Days Old].[All].[99999]}")
Hope this helps and do let me know, I am still curious!
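The A/B/C steps above rest on a simple identity: the count at or above a threshold equals the total minus the count below it. A small Python sketch of that arithmetic, outside the data model, with made-up Days Old values (the function names and the sample list are assumptions for illustration, not part of the workbook):

```python
def count_at_least(days_old, threshold):
    """Direct test: how many values are >= threshold."""
    return sum(1 for d in days_old if d >= threshold)


def count_by_subtraction(days_old, threshold):
    """Steps A-C: total count minus the count strictly below threshold."""
    total = len(days_old)                               # step A
    below = sum(1 for d in days_old if d < threshold)   # step B
    return total - below                                # step C


# Hypothetical Days Old values; note 650 itself is not in the list.
days_old = [3, 34, 120, 649, 651, 700, 3647]
direct = count_at_least(days_old, 650)
subtracted = count_by_subtraction(days_old, 650)
```

In this sketch the threshold does not need to be one of the values present, which is the property the question is looking for; whether an MDX range can be written with bounds that are not existing members is a separate matter.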
I have multiple records as below in an excel file say Col A:
Infogain India (P) Ltd. 3-6 yrs Noida
ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi
Red Lemon 0-3 yrs Noida(Sector-7 Noida)
Within the data there is a range of years mentioned e.g. 3-6 yrs in the first list item.
I want to extract the data 3-6, 0-2, 0-3 etc from above 3 list items. I understand a search for " yrs " in all the strings will give me the end position. However, I am unable to determine how to find the starting position of the Number of years.
I require the excel formula which will give me the year range.
I do not want to use any VBA for the solution.
If there are no spaces within the year range (e.g. 3-6), you can use the following formula.
=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(A3," yrs",REPT(" ",99)),99))," ",REPT(" ",99)),99))
Try,
=TRIM(RIGHT(REPLACE(A1, FIND(" yrs", A1), LEN(A1), TEXT(,)), 4))
Try the following, though I'm pretty sure it can be condensed. I have attempted to handle additional white space potentially being present and also the years being multi-digit in length, e.g. 12-15. It incorporates a method by Raystafarian to find the last occurrence of a character.
=RIGHT(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),LEN(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)))-LOOKUP(9.9999999999E+307,FIND(" ",TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),ROW($1:$1024))))
Try the formula below:
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1),LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))-SEARCH("|",SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))))
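For comparison, the same extraction is short with a regular expression. This is a minimal Python sketch rather than an Excel solution; it assumes the range is always written as digits, a hyphen, digits, followed by "yrs", as in the three sample records (the function name is hypothetical):

```python
import re

# One or more digits, a hyphen, one or more digits, then optional
# whitespace and the literal "yrs". Only the range itself is captured.
YEAR_RANGE = re.compile(r"(\d+-\d+)\s*yrs")


def extract_year_range(text):
    """Return the 'N-M' year range that precedes 'yrs', or None."""
    match = YEAR_RANGE.search(text)
    return match.group(1) if match else None


records = [
    "Infogain India (P) Ltd. 3-6 yrs Noida",
    "ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi",
    "Red Lemon 0-3 yrs Noida(Sector-7 Noida)",
]
ranges = [extract_year_range(r) for r in records]
```

Anchoring on "yrs" keeps the "Sector-7" in the third record from being mistaken for a range, and multi-digit ranges such as 12-15 need no special handling.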
I'm making an Excel sheet for calculating the z-score for infant weight/age (inputs: "Baby Month Age" and "Baby weight"). To do that, I first need to get the LMS parameters for a specific month from the table below.
http://www.who.int/childgrowth/standards/tab_wfa_boys_p_0_5.txt
(For an integer month number, this can be done with VLOOKUP without issue.) For a non-integer month number, I need some kind of "linear interpolation" approach to get approximate LMS data.
The problem is that neither the TREND method nor the VLOOKUP method works for me. TREND does not work because the raw data, such as the L parameter, is not linear, so for the first several months the returned values are far from the existing data. VLOOKUP just finds the closest month's data.
I had to combine multiple MATCH and INDEX calls to do the linear interpolation myself. However, I wonder whether there is an existing function for that?
My current formula for the L parameter is below:
=MOD([Month Age],1)*(INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A)+1,2)-INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A),2))+INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A),2)
If we assume that months always increment by 1 (no gaps in the month data), you can use something like this formula to interpolate between the two values surrounding the given non-integer value:
=(1-MOD(2.3, 1))*VLOOKUP(2.3,A:S,2)+MOD(2.3, 1)*VLOOKUP(2.3+1,A:S, 2)
Which interpolates L(2.3) from data of L(2) = .197 and L(3) = .1738, resulting in .19004.
You can replace 2.3 by any cell reference. You can also change the lookup column 2 for L into 3 for M, 4 for S etc.
As for whether Excel has a direct "interpolate" function: not that I know of, although there is good artillery for statistical estimation.
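The weighted average in the formula above can be checked with a minimal Python sketch. It assumes, as the answer does, that the table has an entry for every whole month; the function name is hypothetical, and the two table values are the L(2) and L(3) figures quoted above.

```python
import math


def interpolate(table, month):
    """Linearly interpolate between the whole months surrounding `month`.

    `table` maps integer month -> parameter value and is assumed to have
    no gaps, mirroring the assumption in the VLOOKUP formula.
    """
    lower = math.floor(month)
    frac = month - lower            # same role as MOD(month, 1)
    if frac == 0:
        return table[lower]         # exact month: no interpolation needed
    return (1 - frac) * table[lower] + frac * table[lower + 1]


# L values for months 2 and 3 as quoted in the answer.
l_table = {2: 0.197, 3: 0.1738}
l_at_2_3 = interpolate(l_table, 2.3)
```

The result agrees with the .19004 given above, up to floating-point rounding.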
I'm looking for information on how to copy every nth row of records from one Excel sheet to the next, and now I am wondering if there is a way to do this for filtered data. For example, I have 400 students enrolled at school, and I want every 15th male whose parents have not graduated from college (flags have been created for both gender and parent education, which I am using to filter on). Are there any ideas on how to do this? If not, I could just use the OFFSET function for each combination of variables I am filtering on, but that's over 30-40 combinations if I did my math right. Thanks for any help you can provide.
There are a few standard formulas used for retrieving the first, second, third, etc set of values that match criteria. I prefer a standard formula model using the INDEX function and SMALL function. By throwing a little maths at the increment to change it from 1, 2, 3 ... to 1, 16, 31, 46, ... you should be able to achieve your offset results. In the following example image, I've used a stagger of 4 rather than 15 in order to accommodate sample data vertically while still producing more than a single result.
The formula in F2 is,
=IFERROR(INDEX(A$2:A$999, SMALL(INDEX(ROW($1:$998)+((C$2:C$999<>"M")+(D$2:D$999<>"N"))*1E+99, , ), 1+(ROW(1:1)-1)*4)), "")
For your purposes the 4 in 1+(ROW(1:1)-1)*4 will need to be changed to 15.
=IFERROR(INDEX(A$2:A$999, SMALL(INDEX(ROW($1:$998)+((C$2:C$999<>"M")+(D$2:D$999<>"N"))*1E+99, , ), 1+(ROW(1:1)-1)*15)), "")
Fill down as necessary.
Once you have retrieved a unique identifier, the remainder can be retrieved with a simple VLOOKUP function.
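The INDEX/SMALL approach amounts to "filter first, then take every nth match". A minimal Python sketch of that logic, using a stagger of 4 like the example above; the records, their layout, and the function name are made up for illustration:

```python
def every_nth_match(records, predicate, step):
    """Return the 1st, (1+step)th, (1+2*step)th, ... records that match."""
    matches = [r for r in records if predicate(r)]
    # Slicing with a step picks positions 1, 1+step, 1+2*step, ... (1-based),
    # the same sequence as 1+(ROW(1:1)-1)*step in the formula.
    return matches[::step]


# Hypothetical students: (id, gender, parent graduated from college).
students = [
    (i, "M" if i % 2 else "F", "Y" if i % 3 == 0 else "N")
    for i in range(1, 41)
]


def is_male_non_grad(student):
    return student[1] == "M" and student[2] == "N"


picked_ids = [s[0] for s in every_nth_match(students, is_male_non_grad, 4)]
```

Changing the predicate covers each filter combination without writing a separate formula per combination; for every 15th match, pass 15 as the step.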