How to automatically sort the results of VLOOKUP (Google Sheets) - excel

I have this formula in Google Sheets:
VLOOKUP(upper(J2:J),colorState!A:B,{2}*sign(row(J2:J)),FALSE)
and I want it to sort the results in ascending order automatically when I add or edit data (like ARRAYFORMULA).
Is there any way or any formula to do that? (I know there is a SORT function, but I'm not sure how to use it together with this.)
Thanks.

I believe I understand what you need :)
Essentially, you would like to recreate the "main" sheet but have it automatically ordered by the 'color' column when new data is added. I don't know how to do this to the raw data, but you can mirror the raw data by creating another sheet (named 'mainmirror') and entering this formula in cell A1:
=query(main!$A:$R,"select * order by P ASC",-1)
It will take you 2 seconds to reformat with a filter view, and you'll be left with a mirror of 'main' that is always sorted by column P and should remain current as data is added.
Hopefully this is an acceptable workaround. The other option would be to use a script, but this is less tedious if it's suitable.
Side note: this method will turn your values into strings to mirror them on the duplicate sheet, so on the 'main' sheet I would recommend changing the cell format of column P to a custom number format, 00, which ensures there is a leading 0 if there is only one digit. This will cause the strings in the mirror to sort correctly, instead of 1, 11, 12, 2, 3, 4, etc. If you expect column P to hold 3-digit values, make the number format 000 accordingly.
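To see why the leading zero matters, here is a minimal Python sketch (not Sheets code) of how text values sort compared with zero-padded ones. The sample numbers are made up for illustration.

```python
# Hypothetical sample of values as they might appear in column P.
values = [1, 11, 12, 2, 3, 4]

# Plain text forms are compared character by character, so "11" sorts before "2".
as_text = sorted(str(v) for v in values)

# Zero-padding to two digits (what the "00" number format displays) restores numeric order.
padded = sorted(f"{v:02d}" for v in values)

print(as_text)  # ['1', '11', '12', '2', '3', '4']
print(padded)   # ['01', '02', '03', '04', '11', '12']
```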

Related

Transpose multiple occurrences

EDIT: I have revised the source data to remove the ambiguity of my last screenshots.
I am trying to transpose spreadsheet data where there are many rows where the customer name may be duplicated but each row contains a different product.
For instance
revised original data source
to
revised proposed data format
I would like to do it with formulae if possible as I struggle with VB
Thank you for any help
I realise this is a huge answer, apologies but I wanted to be clear. If you need anything from me, drop me a comment and I'll help out.
Here's the output from my formula:
EDITED ANSWER - Named ranges used for ease of understanding:
These are just an example of a few of the named ranges I have used, you can reference the ranges directly or name them yourself (simplest way is to highlight the data then put the name in the drop down next to the formula bar [top left])
Be wary that as we will be using Array formulas for AccNum and AccType, you will not want to select the entire column and instead opt for either the exact data length or overshoot it by 100 or so. Large array formulas tend to slow down calculation and will calculate every cell individually regardless of it being empty.
First formula
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Account Number ",LEFT((COLUMN(A:A)+1)/2,1)),"")
This formula is identical to the one in the original answer apart from the adjusted heading title.
=IF(Condition,True,False) - There are so many uses for the IF logic; it is the best formula in Excel in my opinion. I have used IF with COUNTIF to check whether more than 0 cells in the range are greater than blank (""). This is just a trick to avoid ISBLANK() and other blank tests, which get confused when a formula is present.
If the result is TRUE, I use CONCATENATE(Text1,Text2,etc.) to build a text string for the column header. ROW(1:1) or COLUMN(A:A) is commonly used to generate an automatically increasing integer for formulas, depending on whether the count needs to increase vertically or horizontally. I add 1 to this increasing integer and divide it by 2 so that the increase for each column is 0.5 (1 > 1.5 > 2 > 2.5). I then use the LEFT formula to take just the first digit of this decimal answer, so the number increases only once every 2 columns.
If the result is FALSE, the cell is left blank (the ,"") part). Standard stuff here, no explanation needed.
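The column-counter trick can be sketched in Python to show the sequence it produces. This is only an illustration of the arithmetic, not Excel code; the function names are made up. It also shows a limit of LEFT: it keeps a single character, so it would stop matching a rounded-down value once the counter reaches 10.

```python
import math

def pair_index_left(column: int) -> str:
    """Mimic LEFT((COLUMN()+1)/2, 1): first character of the number's text form."""
    value = (column + 1) / 2
    # Excel displays 1.0 as "1", so drop a trailing ".0" before taking a character.
    text = str(int(value)) if value == int(value) else str(value)
    return text[0]

def pair_index_rounddown(column: int) -> int:
    """Mimic ROUNDDOWN((COLUMN()+1)/2, 0) for positive column numbers."""
    return math.floor((column + 1) / 2)

left = [pair_index_left(c) for c in range(1, 7)]
down = [pair_index_rounddown(c) for c in range(1, 7)]

print(left)  # ['1', '1', '2', '2', '3', '3']
print(down)  # [1, 1, 2, 2, 3, 3]

# At column 19 the counter is 10: LEFT keeps only "1", ROUNDDOWN gives 10.
print(pair_index_left(19), pair_index_rounddown(19))  # 1 10
```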
Second Formula
=CONCATENATE(INDEX(Forename,MATCH(Sheet4!$A2,Reference,0)))
=CONCATENATE(INDEX(Surname,MATCH(Sheet4!$A2,Reference,0)))
CONCATENATE has only been used here to force blank cells to remain blank when pulled by INDEX. INDEX will read blank cells as values and therefore 0's whereas CONCATENATE will read them as text and therefore "".
INDEX(Range,Row,Column): This is a lookup formula that is much more advanced than VLOOKUP or HLOOKUP and not limited in the way that they are.
The range I have used is the expected output range - Forename or Surname.
The row is then calculated using MATCH(Criteria,Range,Match Type). Match will look through a range and return the position as an integer where a match occurs. For this I have set the criteria to the unique reference number in column A for that row, the range to the named range Reference and the match type as 0 (1 Less than, 0 Exact Match, -1 Greater than).
I did not define a column number for INDEX as it defaults to the first column and I am only giving it one column of data to output from anyway.
Third Formula
Remember these need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter)
=IFERROR(INDEX(AccNum,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0))),"")
=IFERROR(INDEX(AccType,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))),"")
As you can see, one of these is used for AccNum and the other for AccType.
IFERROR(Value,Value_if_error): This is used because we do not expect the formula to always return something. When the formula cannot return a result, or SMALL has run out of matches to go through, an error occurs (usually #VALUE! or #NUM!), so I use ,"") to force a blank result instead (again, standard stuff).
I have already explained the INDEX formula above so let's just dive in to how I have worked out the rows that match what we are looking for:
SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))
The IF statement here is fairly self-explanatory, but because it is entered as an array formula, it tests every cell in the named range Reference individually against Sheet4!$A2, the unique reference. In your mock data this returns a result of: {FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} for the first entry (I included titles in the range, hence the initial FALSE). IF will do my row calculation* for every TRUE but leave the FALSEs as they are.
This leaves a result of {FALSE;2;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} that SMALL(array,k) will use. SMALL only works on numeric values and returns the k-th smallest one. Again the column trick has been used, but to cover more ground I used another method: ROUNDDOWN(Number,Digits) as opposed to LEFT(). Digits here means decimal places, so I used 0 to round down to a whole integer for the same result. As this copies across the columns like so: 1, 1, 2, 2, 3, 3, SMALL will alternately (as the formulas alternate) grab the AccNum from the 1st matching row, then the AccType from the 1st matching row, before grabbing the 2nd AccNum and AccType, and so forth.
*(Row number of the match minus the first row number of the range, then plus 1. Again, this is fairly common as a foolproof way to always get the correct row regardless of where the data starts; actually, as your data starts on row 1 we could just do ROW(Reference), but I left it as is in case you had data in a different format.)
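For readers who find the array logic easier to follow in code, here is a rough Python sketch of what IFERROR(INDEX(..., SMALL(IF(...), k)), "") does. The function name and sample data are hypothetical; the title row is included in the ranges, as in the answer.

```python
def kth_match(reference, values, key, k):
    """Return the value on the k-th row (1-based) whose reference equals key, or ""."""
    # IF(Reference=key, relative row): collect the 1-based positions of matching rows.
    positions = [i for i, ref in enumerate(reference, start=1) if ref == key]
    # SMALL(positions, k) picks the k-th smallest position; IFERROR turns a miss into "".
    if k < 1 or k > len(positions):
        return ""
    return values[sorted(positions)[k - 1] - 1]

# Hypothetical data; the first entry of each list is the title row.
reference = ["Ref", 101, 102, 101, 103]
acc_num = ["AccNum", "A-1", "B-1", "A-2", "C-1"]

first = kth_match(reference, acc_num, 101, 1)
second = kth_match(reference, acc_num, 101, 2)
third = kth_match(reference, acc_num, 101, 3)

print(first, second, repr(third))  # A-1 A-2 ''
```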
ORIGINAL ANSWER - Same logic as above
Here's your solution in 3 parts
Part 1 is a trick for auto-completing the titles so that they hide when not used (in case you just copy and paste the whole lot as values to speed up use again).
=IF(COUNTIF(C2:C11,">""")>0,CONCATENATE("Product ",LEFT((COLUMN(A:A)+1)/2,1)),"") in C
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Prod code ",LEFT((COLUMN(B:B)+1)/2,1)),"") in D
Highlight both of the cells and drag across to stagger the outputs "Product " and "Prod code "
Part 2 would be inputting the unique IDs to the new sheet, I would suggest copying your entire column A across to a new sheet and using DATA > REMOVE DUPLICATES > Continue with current selection to trim out the multiple occurrences of unique IDs.
In column B use =INDEX(Sheet2!$B$1:$B$7,MATCH(Sheet4!$A2,Sheet2!$A$1:$A$7,0)) to get the names pulled across.
Part 3, the INDEX
Once again, we are doing a staggered input here before copying the formula across the page to cover the entirety of the data.
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0)),1),"") in C
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0)),2),"") in D
The formulas of Part 3 need to be entered as array formulas (press Ctrl+Shift+Enter while in the formula bar). Do this before copying the formulas across.
These formulas can now be dragged / copied in all directions and will feed off of the unique ID in column A.
My Answer is already rather long so I haven't gone on to break the formula down. If you have any trouble understanding how this works, let me know and I will be happy to write up a quick guide, breaking it down chunk by chunk for you.

Excel: Sorting Data that is Custom Formatted to add a Prefix

Hi, I have a list of values in a table that I would like to sort. The data in the column has been compiled from three separate tables where a custom format was applied. For example:
When I enter the value "2", the formatting adds the prefix "MCRY-", so the value I get is "MCRY-2".
But in another table it adds the prefix "ACCRY-".
When I copy this over into the "master" sheet, the formatting is copied over too, which is perfect. However, when I sort the column the formatting is ignored, as the cell values are only the numbers and not the prefixes.
My question is, how do I get the sorting process to acknowledge the prefixes as if they are a part of the cell value?
So that if I have "MCRY-01", "ACCRY-01", "MCRY-02", "ACCRY-02", it will order ACCRY first and then MCRY.
I have tried paste special as values, but that doesn't work. Any help?
Thanks
I wouldn't use conditional formatting for this process. You will need to hardcode. If "2" is in A1 then put the following into B1 to get the required result
="MCRY-"&TEXT($A1,"00")
Will give you "MCRY-02"
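As a rough illustration of why hardcoding helps, this Python sketch builds the labels as real text (the equivalent of "MCRY-"&TEXT($A1,"00")) and sorts them. The helper name and sample rows are made up.

```python
def label(prefix: str, number: int) -> str:
    """Build the text value, zero-padded to two digits like TEXT(A1, "00")."""
    return f"{prefix}{number:02d}"

# Hypothetical rows: the prefix each source table applies, plus the typed number.
rows = [("MCRY-", 1), ("ACCRY-", 1), ("MCRY-", 2), ("ACCRY-", 2)]

# Once the prefix is part of the value, an ordinary sort groups ACCRY before MCRY.
labels = sorted(label(prefix, number) for prefix, number in rows)

print(labels)  # ['ACCRY-01', 'ACCRY-02', 'MCRY-01', 'MCRY-02']
```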
I’ve had some ‘lulus’ but this may count as my nastiest hack ever (#CallumDS33 is basically correct).
Format the “ACCRY-” table with "ACCRY-"#;"ACCRY-"# then proceed as normal but add a helper column in your “master” sheet with:
=CELL("format",B1)
copied down to suit, where B is assumed to be the column to be ordered. Now sort by the helper column first, and then by the column to be ordered within it.

Perl Excel::Writer::XLSX - set column format/merge cells dynamically

I am using Excel::Writer::XLSX to create an Excel file from an array of arrays. Right now I'm trying to create a formatted table from the data (as much as I can, as opposed to just spitting it back into another file).
First off, when I use set_column() to set the background color, that color is formatted for the entire column. Is there a way to specify to only go as far as the content in the file goes? Unfortunately, when the program is run it is dynamic each time and unknown what the final row in the table should be.
Second, is there a way to merge cells based on the content inside of them? This has to do with the dynamic problem again: there is an optimal output if all the data I am gathering is online. If that were the case I could easily set a range of what these merged cells should be. But for example, if I have 10 rows of column 2 saying 'A' and then 10 rows of column 2 saying 'B', I would like to merge the A's and B's together. The issue is that it is unknown if it will always have 10 rows with that value inside of it.
Thanks for your input!
First off, when I use set_column() to set the background color, that color is formatted for the entire column. Is there a way to specify to only go as far as the content in the file goes?
No. You will have to add the format to the cells as you write them.
But for example, if I have 10 rows of column 2 saying 'A' and then 10 rows of column 2 saying 'B', I would like to merge the A's and B's together.
This isn't possible with Excel::Writer::XLSX. (In fact I don't think it is possible in Excel without using macros).
Since both of your issues relate to not knowing the size and value of the data beforehand then perhaps you could first read your data into an array of arrays, process it to find the required format dimensions and merge ranges and then write them out.
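The pre-processing step could look something like the sketch below. It is written in Python rather than Perl purely to illustrate the logic, and the function name and sample column are hypothetical. It scans one column of the array and reports each run of identical consecutive values as a (first_row, last_row, value) tuple, which could then be handed to a merge call such as the module's merge_range() when writing.

```python
from itertools import groupby

def merge_ranges(column_values, first_row=0):
    """Find runs of equal consecutive values; return (first_row, last_row, value) tuples."""
    ranges = []
    row = first_row
    for value, group in groupby(column_values):
        length = len(list(group))
        # Only runs longer than one cell need merging.
        if length > 1:
            ranges.append((row, row + length - 1, value))
        row += length
    return ranges

# Hypothetical column 2 of the data: three A's, two B's, one C.
column = ["A"] * 3 + ["B"] * 2 + ["C"]
ranges = merge_ranges(column)

print(ranges)  # [(0, 2, 'A'), (3, 4, 'B')]
```

The same pass also gives the last populated row (the length of the column), which answers the first problem of knowing how far down to apply the cell format.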

Vlookup and get the min value (date)

TOP Table is Input, and bottom table is preview for required output.
For Each ID I need to find earliest datetime. I also need other information from other columns (please see image below).
My current solution is:
In Cell E2 =A2
Cell E3 drag down =IF(E2<>A3,IF(E1=A3,"",A3),"")
In Cell F2 drag down =IF(E2<>"",MIN(IF($A$2:$A$14=E2,$C$2:$C$14)),"") Ctrl+Shift+Enter
One more option without any intermediate calculations:
Select the whole range starting E2 and to the last row where IDs are located - for the sample given it's row 14, so select range E2:E14: =IFERROR(INDEX($A$2:$A$14,SMALL(IF(MATCH($A$2:$A$14,$A$2:$A$14,0)=ROW(INDIRECT("1:"&ROWS($A$2:$A$14))),MATCH($A$2:$A$14,$A$2:$A$14,0),""),ROW(INDIRECT("1:"&ROWS($A$2:$A$14))))),"") and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define a Multicell ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
F2 (ID2): =IF(E2="","",SUMPRODUCT(--(E2=$A$2:$A$14),--(G2=$C$2:$C$14),$B$2:$B$14)) - normal formula.
G2 (Min Date): =IF(E2="","",MIN(IF(E2=$A$2:$A$14,$C$2:$C$14,2^100))) and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define an ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
H2 (InCh): =IF(E2="","",INDEX($D$2:$D$14,SUMPRODUCT(--(E2=$A$2:$A$14),--(F2=$B$2:$B$14),--(G2=$C$2:$C$14),ROW(INDIRECT("1:"&ROWS($D$2:$D$14)))))) - normal formula.
Remarks:
To make the solution more compact and easy to read, define named range for ID column, and then reference other data columns using OFFSET.
ID2 values may not be unique - as they are on the sample for IDs 1...3.
Resulting set for Min Date should be formatted the same way as source Date row.
The key formula of the solution is the multicell monster which returns unique IDs without empty rows (as the OP requested).
Sample file: https://www.dropbox.com/s/d2098updfh8djnf/MinDateIDs.xlsx
This is quite a challenge... I think I have found an approach that works. For the sake of clarity, I used a few helper columns. Also, I did not use any named ranges but stuck with the column-row indications. You might want to change that.
It looks like this:
and zooming in to the relevant columns:
Column F contains an array formula to filter out duplicates. An approach is explained here. The formula I used in F2 is
=INDEX($A$2:$A$14, MATCH(MIN(IF(COUNTIF($F$1:F1,$A$2:$A$14)=0, 1, MAX((COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)*2))*(COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)), COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1, 0))
Use Ctrl-Shift-Enter to confirm as array formula. Drag this down or copy into column F. Then columns G and H contain the starting and ending indices of the duplicate ID values. This answer helped, please upvote it :-). The two formulas used are:
=MATCH(2,1/FREQUENCY($F2,$A$2:$A$14))
in G2, and
=FREQUENCY($A$2:$A$14,$F2)
in H2. Again, drag them down to get the full column filled. Next, column I is for clarification only -- and for sanity checking. It contains the desired minimum date from each sub-array. Column J substitutes that formula into a MATCH to find the actual index of the desired date.
=MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in I2 and
=$G2-1+MATCH(2,1/FREQUENCY(MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)), OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)))
in J2. Finally, columns L, M and N index into the original set of data via
=INDEX(B$2:B$14,$J2)
in L2, which you can drag horizontally and then vertically.
When you are done, you can hide the helper columns, or fold everything into big formulas. Good luck with that... There might be an easier way to achieve this, but I did not find it.
If you want the value from column D in G then assuming that column C values are unique you could just use a VLOOKUP, i.e. in G2 copied down
=VLOOKUP(F2,C$2:D$14,2,0)
Per your picture, they're all in the same sheet. Just sort by ID, then Date (ascending). As you work your way down the ID column, each time the ID changes, you know you've found the row with the minimum Date for that specific ID. Create an extra column to signify where ID changes occur, and filter for those rows (hide the column if you so desire).
And... voila.
I know this link is old, but there is a much shorter and easier way!
How about using a pivot table with Minimum as the field setting and then using =GETPIVOTDATA() to get the information back?
Seems a lot simpler than these formulas!
Actually, I just realized I've been overthinking this...Excel keeps the top item and removes all that follow when removing duplicates.
So if you are going to create an extra working table anyway, why not just copy the range/columns you want to keep, then use the basic sort.
Sort first by ID, then by the column you want as the second filter. Be sure the sorts are in the order you want (e.g. newest to oldest, oldest to newest, A to Z, Largest to smallest, etc).
Once the data is sorted, remove duplicates based on ID. You are left with all of your columns of data, filtered by newest/oldest/largest/smallest per individual.
This worked for my table with 30,000+ records, filtered down to 1500 unique individuals with most recent (plus associated amount), and with a second filter, the largest (plus associated date) for each person.
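The sort-then-remove-duplicates idea can be sketched in Python as follows. The row layout (ID, other info, datetime) and the sample values are assumptions for illustration, not the data from the question.

```python
from datetime import datetime

def earliest_per_id(rows):
    """Sort by ID then datetime and keep only the first row seen for each ID."""
    kept = {}
    for row in sorted(rows, key=lambda r: (r[0], r[2])):
        # setdefault keeps the first (earliest) row, like Remove Duplicates keeps the top item.
        kept.setdefault(row[0], row)
    return list(kept.values())

# Hypothetical rows: (ID, other info, datetime).
rows = [
    (1, "x1", datetime(2014, 3, 5, 10, 0)),
    (2, "y1", datetime(2014, 3, 1, 9, 30)),
    (1, "x2", datetime(2014, 3, 2, 8, 15)),
    (2, "y2", datetime(2014, 3, 4, 12, 0)),
    (3, "z1", datetime(2014, 3, 3, 7, 45)),
]

earliest = earliest_per_id(rows)
for row in earliest:
    print(row[0], row[1], row[2].isoformat())
```

Because the whole row is carried along, the other columns for the earliest date come back without a separate lookup.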

Question regarding optimal excel function implementation

I have a question about Excel! I hope that isn't too unconventional for this site...
So I have an Excel table with several thousand rows. It is kind of set up like a db, in that the first three of my four columns have numerical values identifying the sequence or order of the content that the fourth column contains.
I am running into some possible duplication issues, and I am remembering back to my college days something about there being a function for the type of test I need to do. I need to verify that there are no two rows that have the same values for column 1-3. There should never be a time where all three columns' values match exactly that of another row.
Is VLookUp the function I need? Any excel experts out there that know of a function I could look into? Thanks so much!
The quick one-off solution I employ for this kind of task is the following:
Create a single key in one temporary column, say F: "=A2 & B2 & C2 ..." for a combined key. Copy this formula all the way down.
Create a group counter for that single key, say G: "=IF(F2=F1,G1+1,1)". The header row can safely be included here because it sends the formula into the FALSE part. This relies on identical keys sitting on adjacent rows, so sort by F first if they do not.
This formula in G numbers all identical keys from 1 to N and restarts at 1 for a new key. Copy this formula all the way down.
Important: convert the G formulae into values (copy / paste special onto itself).
Sort descending by G and delete/manipulate all rows where the counter <> 1, or use AutoFilter.
Later on, delete columns F and G.
This may sound a bit complicated, but especially in large tables VLOOKUPs, COUNTIFs, etc. can be very time consuming.
Hope that helps
You could create another column that concatenates the first 3, then do a countif on that. Let's say the concatenation column is D and your data begins in the second row:
=countif(D:D,D2)
Copy the formula down, then filter on >1.
I think what you need is a countifs function.
Assume you add this formula in a cell in row 4:
=COUNTIFS(A:A,A4,B:B,B4,C:C,C4)
and copy the formula to the whole column.
Then the rows with value 1 are unique, while those with a value larger than 1 have duplicates.
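The same count can be sketched outside Excel. This Python example, with made-up rows, counts how often each combination of the first three columns occurs, which is what the COUNTIFS column reports per row.

```python
from collections import Counter

# Hypothetical rows: three numeric key columns followed by the content column.
rows = [
    (1, 1, 1, "alpha"),
    (1, 1, 2, "beta"),
    (1, 1, 1, "gamma"),
    (2, 1, 1, "delta"),
]

# Count each (col1, col2, col3) combination once over the whole table.
key_counts = Counter(row[:3] for row in rows)

# Per-row count, equivalent to the COUNTIFS result copied down the column.
counts = [key_counts[row[:3]] for row in rows]

# Rows whose key appears more than once are the duplicates to inspect.
duplicates = [row for row in rows if key_counts[row[:3]] > 1]

print(counts)      # [2, 1, 2, 1]
print(duplicates)  # [(1, 1, 1, 'alpha'), (1, 1, 1, 'gamma')]
```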
If you only need to check the data once, try the "Remove duplicates" functionality. This can be found in the Data tab -> Data Tools -> Remove Duplicates. Just unselect all but the first three columns in the dialog and Excel will do the rest.

Resources