Do something with sequence of cells in an Excel cell? - excel

I want to sum every 5th row between A9 and A54 in Excel, i.e. A(5x + 9) for x = 0, 1, 2, ... up to A54.
There's also a sub-question here. Is it possible to sum a range of cells using a function or VBA?
Here are some test questions, based on the answer:
=SUMPRODUCT(--(MOD(ROW(Range)-MIN(ROW(Range))+1,1)=0),A1:A50) -- sums rows A1 through A50?
=SUMPRODUCT(--(MOD(ROW(A50)-MIN(ROW(A50))+1,2)=0),A1:A50) -- sums every other row, A1 through A50?
=SUMPRODUCT(--(MOD(ROW(A5:A50)-MIN(ROW(A1:A50))+1,5)=0),A1:A50) -- sums every 5th row between A5 and A50?
Based on:
Substitute A9:A54 for Range and 5 for n for your specific query. Only those parts change. Answer will remain the same even if you delete rows below Range.

I would use a more robust version of RocketDonkey's answer. The -8 in his comment is there because the first row of the range is 9, but generically, for any single-column range, you can use this formula to sum every nth row (starting with the first row of the range):
=SUMPRODUCT(--(MOD(ROW(Range)-MIN(ROW(Range)),n)=0),Range)
Substitute A9:A54 for Range and 5 for n for your specific query. Only those parts change.
Answer will remain the same even if you delete rows below Range.
Explanation
The part ROW(Range)-MIN(ROW(Range)) returns an array of consecutive integers starting at 0, one for each cell in the range, so
=ROW(A9:A54)-MIN(ROW(A9:A54))
produces this array
{0;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45}
When you feed that into the MOD function with divisor 5, MOD(0,5) = 0, so the first cell (A9 in this case) always satisfies the condition, and so does every 5th (or nth) cell after it in the range, so
=SUMPRODUCT(--(MOD(ROW(A9:A54)-MIN(ROW(A9:A54)),5)=0),A9:A54)
sums every 5th cell starting with the first cell in the range, i.e. it sums A9, A14, A19, A24, etc.
You could replace MIN(ROW(A9:A54)) with 9, but then if you delete some of the rows before row 9 (rows 1 to 8), the range moves up while the hard-coded 9 does not, and the formula results will change, so using that construction is a little more robust.
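For readers who find it easier to follow in code, here is a rough Python model of what the formula computes. It is only a sketch: the function name sum_every_nth and the sample data are made up for illustration, and enumerate() stands in for ROW(Range)-MIN(ROW(Range)).

```python
def sum_every_nth(values, n):
    """Sum every nth item, starting with the first item of the range."""
    # enumerate() yields offsets 0, 1, 2, ... like ROW(Range)-MIN(ROW(Range));
    # offset % n == 0 is the MOD(..., n)=0 test.
    return sum(v for offset, v in enumerate(values) if offset % n == 0)


# Hypothetical data: A9:A54 holds 9..54, so each value equals its row number.
cells = list(range(9, 55))
print(sum_every_nth(cells, 5))  # adds the values from rows 9, 14, 19, ..., 54
```

Because the offsets always start at 0, the first cell is always included, no matter which row the range starts on.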

I guess ROW(A9:A54)-MIN(ROW(A9:A54) in MOD(ROW(A9:A54)-MIN(ROW(A9:A54)),5 produces the number -9. In this equation, MOD, the value of x in -x is the location on the number line away from cell 0. So, the result of ROW(A9:A54)-MIN(ROW(A9:A54), -9, makes A9 the 0th entry in the array we run MOD over. It's possible that's what the =0 is for....
=SUMPRODUCT(--(MOD(-9,5)),A9:A54) apparently has the same effect. For some reason, the long version is preferred, which is: =SUMPRODUCT(--(MOD(ROW(Range)-MIN(ROW(Range)),n)=0),Range). I don't see the advantage, but it has something to do with either shifting or deleting cells. I imagine it's shifting because having an empty cell shouldn't matter.
I'm also a bit perplexed as to why we use =SUMPRODUCT and not SUM and -- rather than SUM. -- does addition, and it also converts bools to numbers, I guess. I don't see why either of those things would be necessary, here. Perhaps it's a just-in-case thing. For example, perhaps every 15th row is an undesired string constant, rather than a desired number. I think that it would be more universally useful to just enclose one series within another, in that case, though.

Related

Is there a way to scan an entire column based on one cell in another column and pull out a value of the corresponding column?

A B C D
4 1 6 5649
3 8 10 9853
5 2 7 1354
I have two worksheets, for example column A in sheet 1 and columns B-D in sheet 2.
What I want to do is take one value in column A, scan columns B and C, and if the value is between those two values, display the corresponding value from column D in a new worksheet.
There could be multiple matches for each cell in column A, and if there is no match, I want to skip it and display nothing. Is there a way to code this and somehow create a loop to do all of column A? I tried using this formula, but I think it only compares values on the same row, which is not what I want.
=IF(AND([PQ.xlsx]Sheet1!$A2>=[PQ.xlsx]Sheet2!$B2,[PQ.xlsx]Sheet1!$A2<[PQ.xlsx]Sheet2!$C2),[PQ.xlsx]Sheet2!$D$2,"")
How do I do this?
Thank you.
I'm not positive if I understood exactly what you intended. In this sheet, I have taken each value in A:A and checked to see if it was between any pair of values in B:C, and then returned each value from D:D where that is true. I did keep this all on a single tab for ease of demonstration, but you can easily change the references to match your own layout. I tested in Excel and then transferred to this Google Sheet, but the functions should work the same.
https://docs.google.com/spreadsheets/d/1-RR1UZC8-AVnRoj1h8JLbnXewmzyDQKuKU49Ef-1F1Y/edit#gid=0
=IFERROR(TRANSPOSE(FILTER($D$2:$D$15, ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15))), "")
So what I have done is FILTER column D on the two conditions that Ax is >= B:B and <= C:C, then TRANSPOSE the result so that it lays out horizontally instead of vertically, and finally wrap it in an error trap to avoid #CALC! where there are no results returned.
I added some random data to test with. Let me know if this is what you were looking at, or if I misunderstood your intent.
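If it helps to see the logic outside of Excel, this is roughly what the FILTER step does for one value of column A. It is a sketch only: matches_for is a made-up name, and the table is the three sample rows from the question.

```python
def matches_for(a, rows):
    """Return every D value whose B..C interval contains a (both ends inclusive)."""
    return [d for b, c, d in rows if b <= a <= c]


# (B, C, D) rows taken from the question's sample table
table = [(1, 6, 5649), (8, 10, 9853), (2, 7, 1354)]

for a in (4, 3, 5):
    # An empty list corresponds to the "" that IFERROR returns
    print(a, matches_for(a, table))
```

The list comprehension plays the part of multiplying the two conditions together: a row is kept only when both comparisons are true.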
SUPPORT FOR EXCEL VERSIONS WITHOUT DYNAMIC ARRAY FUNCTIONS
You can duplicate this effect with array formulas in pre-dynamic array versions of Excel. This is an array formula, so it has to be entered with CTRL+SHIFT+ENTER. Put it in F2, press CTRL+SHIFT+ENTER, and then drag it to fill F2:O15:
=IFERROR(INDEX($D$2:$D$15, SMALL(IF(($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15), ROW($A$2:$A$15)-MIN(ROW($A$2:$A$15))+1), COLUMNS($F$2:F2))),"")
reformatted for easier explanation:
=IFERROR(
    INDEX(
        $D$2:$D$15,
        SMALL(
            IF(
                ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15),
                ROW($A$2:$A$15) - MIN(ROW($A$2:$A$15))+1
            ),
            COLUMNS($F$2:F2)
        )
    ),
"")
From the inside out: ROW($A$2:$A$15) creates an array from 2 to 15. Subtracting MIN(ROW($A$2:$A$15)) and adding 1 shifts it so that, no matter which row the range starts in, the numbers start from 1, so ROW($A$2:$A$15) - MIN(ROW($A$2:$A$15))+1 returns an array from 1 to 14.
We use this as the second argument in the IF clause, what to return if TRUE. For the first argument, the logical conditions, we take the same two conditions from the original formula: ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15). As before, this returns an array of true/false values. So the output of the entire IF clause is an array that consists of the row numbers where the conditions are true or FALSE where the conditions aren't met.
Take that array and pass it to SMALL. SMALL takes an array and returns the kth smallest value from the array. You'll use COLUMNS($F$2:F2) to determine k. COLUMNS returns the number of columns in the range, and since the first cell in the range reference is fixed and the second cell is dynamic, the range will expand when you drag the formula. What this will do is give you the 1st, 2nd, ... kth row numbers that contain matches, since FALSE values aren't returned by SMALL (as a matter of fact they generate an error, which is why the whole formula is wrapped in IFERROR).
Finally, we pass the range with the numbers we want to return (D2:D15 in this case) to INDEX along with the row number we got from SMALL, and INDEX will return the value from that row.
So FILTER is a lot simpler to look at, but you can get it done in an older version. This will also work in Google Sheets, and I added a second tab there with this formula, but array formulas work a little differently there. Instead of using CTRL+SHIFT+ENTER to indicate an array formula, Sheets wraps the formula in ARRAYFORMULA(). Other than that, the two formulas are the same.
Since FALSE values aren't considered, it will skip those.
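The INDEX/SMALL/IF chain can be modelled in a few lines of Python. Treat this as an illustration rather than a translation: kth_match is a hypothetical name, k plays the role of COLUMNS($F$2:F2), and the sample rows are the ones from the question.

```python
def kth_match(a, rows, k):
    """Return the D value of the k-th matching row (k starts at 1), or "" if there is none."""
    # IF(...): keep the 1-based position of each row where both conditions hold
    positions = [i for i, (b, c, _d) in enumerate(rows, start=1) if b <= a <= c]
    # SMALL(..., k) fails when there are fewer than k matches; IFERROR turns that into ""
    if k > len(positions):
        return ""
    # INDEX(D, position): positions are 1-based, Python lists are 0-based
    return rows[positions[k - 1] - 1][2]


table = [(1, 6, 5649), (8, 10, 9853), (2, 7, 1354)]
print([kth_match(4, table, k) for k in (1, 2, 3)])
```

Dragging the formula to the right corresponds to calling the function with k = 1, 2, 3, and so on for the same value of A.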

How to double the number of elements of the row in Excel?

In the 1st row I generated random numbers. I want to keep the range of the elements while doubling the number of elements in each following row. For example, if the 1st row has 5 elements, the 2nd row has 5x2=10, the 3rd has 10x2=20, and so on.
If you have your value in A1, then write in A2: =2*A1.
Therefore you will have Ai = 2 * A(i-1), where i is any natural number (up to 2^20 rows, which is the limit in an Excel sheet).
If this works remember to mark this answer as a solution.
Have a good day.
This is a confusing question. I think what you want is to take a formula that happens X number of times in the first row, and cause that formula to occur in twice as many cells in the second row, twice that again in the third, and so on.
If that is the case you can wrap your formula in these functions:
=IF(COUNT(1:1)*2>=COLUMN(),Put your formula here,"")
Put that in the second row and drag across and down as far as you need it to.
Let me know if that doesn't do what you wanted.
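To check that the IF/COUNT idea really doubles the width of each row, here is a small Python model. It is a sketch under assumptions: build_rows and make_value are invented names, and make_value stands for whatever formula you put in place of "Put your formula here".

```python
import random


def build_rows(first_row, n_rows, make_value):
    """Build n_rows rows where each row holds twice as many values as the one above."""
    rows = [list(first_row)]
    for _ in range(n_rows - 1):
        width = 2 * len(rows[-1])  # COUNT(row above) * 2
        # Columns 1..width pass the ">= COLUMN()" test and get a value; the rest stay blank
        rows.append([make_value() for _column in range(1, width + 1)])
    return rows


# Hypothetical first row of 5 random numbers between 1 and 100
first = [random.randint(1, 100) for _ in range(5)]
rows = build_rows(first, 4, lambda: random.randint(1, 100))
print([len(r) for r in rows])  # widths double from one row to the next
```

Note that the Excel version relies on COUNT, which only counts numeric cells, so the formula you wrap has to return numbers for the doubling to work.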

How can you exclude a row from a SUM based on a cell's value?

I have a range that I want to sum, which is A2:M35. However, if column 'N' has the number 1 in it, I want to exclude that entire row from the sum. So, If N3 contains 1 I want to exclude the range A3:M3 from the sum calculation. Is this possible?
UPDATE:
I should also include that the 1 or 0 in column N is a flag to state whether this row should be excluded or not (1 = yes, 0 = no). However, this value is derived by checking whether any values in that row = "excluded". So, the additional complication here appears to be that even though the rows with "excluded" in them should be excluded, the sum calculation will show '#VALUE' as it believes some of the values are of the wrong data type (even though they shouldn't be included).
SIMPLE SOLUTION (with helper column)
If you can, to keep it simple, I'd just add a helper column.
So In cell O2:
=IF($N2=1,0,SUM($A2:$M2))
Drag that down to cell O35.
Then you can simply:
=SUM($O$2:$O$35)
COMPLEX SOLUTION (no helper column)
If you would like to avoid having to have a helper column cluttering up your sheet, you could use a SUMPRODUCT formula:
=SUMPRODUCT($A$2:$M$35,(LEN($A$2:$M$35)-LEN($A$2:$M$35))+NOT($N$2:$N$35))
HOW IT WORKS:
The first range (A2:M35) is the array (or in this case a range of excel cells) that you want to sum.
The SUMPRODUCT is going to take each value in that array and multiply it by each corresponding value in the next array we supply, then sum all the results together.
The problem is that the first array is a table, 13 values across and 34 values down. The second array (column N) is only 1 value across. SUMPRODUCT requires that all arrays are the same size.
To work around this, we first create an array of the correct size:
(LEN(A2:M35)-LEN(A2:M35))
LEN returns an array containing the number of characters in each cell supplied to it. If we subtract it from itself, we are left with an array of the correct size, filled with zeros.
We can then add the values in our smaller array (column N) to the zeros in the array of the correct size, this will fill all the columns with the correct value.
+NOT(N2:N35))
The NOT is there because we want to sum the rows which have a zero. All it is doing is swapping the zeros and ones in column N. So, all 1's become 0's and vice versa.
I hope you can follow my explanation. If not, please let me know and I will elaborate.
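In case code is easier to follow than the description, this is a rough Python model of the calculation. The name sum_unflagged and the sample data are made up, and treating text as 0 mirrors how SUMPRODUCT handles non-numeric entries in its arrays.

```python
def sum_unflagged(rows, flags):
    """Sum all numeric values in rows whose flag is not 1."""
    total = 0
    for row, flag in zip(rows, flags):
        keep = 0 if flag == 1 else 1  # NOT(N): a 1 becomes 0 and a 0 becomes 1
        for value in row:
            # Text such as "excluded" contributes 0 instead of raising an error
            number = value if isinstance(value, (int, float)) else 0
            total += number * keep  # each cell is multiplied by its row's keep value
    return total


# Hypothetical 3-row sample; the second row is flagged for exclusion
data = [[1, 2, 3], ["excluded", 5, 6], [7, 8, 9]]
flags = [0, 1, 0]
print(sum_unflagged(data, flags))
```

Stretching the single flag column across all 13 columns is what the LEN-LEN trick achieves in the formula; here the inner loop reuses the same keep value for every cell in the row.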

Excel formula to lookup the last value in a column and return the value of the adjacent cell

I have the following formula to return the value of the last value in a column:
=LOOKUP(2,1/(D:D<>""),D:D)
What I need now is to return the value of the cell adjacent to it as well. (It will not necessarily be the last value in that column, and the info in Column D could have duplicates.)
If your data looks like this:
A 1
A 2
A 3
B 4
B 5
B 6
C 7
To get last value this will do the trick:
=INDIRECT("B"&COUNTA(A:A))
And to get last where value is A:
=INDIRECT("B"&MATCH("A",A1:A7,0)+COUNTIF(A1:A7,"A")-1)
Just use next column:
=LOOKUP(2,1/(D:D<>""),E:E)
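The reason this works: 1/(D:D<>"") gives 1 for every non-empty cell and an error for every empty one, and LOOKUP, unable to find 2, settles on the last 1. A minimal Python model of that behaviour, with a made-up function name and sample data:

```python
def lookup_last(keys, results):
    """Return the item of results that sits beside the last non-empty item of keys."""
    last = None
    for position, key in enumerate(keys):
        if key != "":        # D:D<>"" is TRUE here, so 1/TRUE = 1
            last = position  # keep overwriting, so the final match wins
    return None if last is None else results[last]


# Hypothetical columns D (supplier, often empty) and E (date, always filled)
column_d = ["Acme", "", "Bolt", "", ""]
column_e = ["01-Mar", "02-Mar", "03-Mar", "04-Mar", "05-Mar"]
print(lookup_last(column_d, column_e))
```

Passing the same list for both arguments gives the original formula, which returns the last value in column D itself.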
Ok, So I have found an answer by playing around with array formulas.
The problem was that this is a stock control sheet where there are changes made at multiple times, each recorded in the next available row. There is always a date (Column E) but not necessarily a Supplier, as it might be stock moving out. When a Supplier delivers, the Supplier name is recorded in Column D. In D1 the last supplier is then shown with the following formula.
=LOOKUP(2,1/(D:D<>""),D:D)
I want to then see what date it was last received. The formula I found that works is as follows (Array Formula):
=INDEX(E:E,MAX(IF(D:D=D1,ROW(D:D)-ROW(INDEX(D:D,1,1))+1)))
This is generally how I do it:
=XMATCH(FALSE,ISBLANK(A:A),0,-1)
This is what each part does:
Parameter | Explanation
FALSE | Instructs Excel to find the first instance of FALSE that it finds
ISBLANK(A:A) | Takes in the column A:A and evaluates every cell in it, giving TRUE for an empty cell and FALSE otherwise
0 | Means we want an exact match. Probably not necessary to put in, but I think it's good practice anyway
-1 | Instructs Excel to start the search at the bottom/right of the range and work up/left. If you change this to 1 (the default), Excel will begin the search at the top/left and work down/right
So, taken together, this will search from the bottom of the column A:A until Excel finds the first cell that is not blank, and return the position of that cell within the range.
Also, yes, this equation can be changed to a row format (e.g. 1:1), and can take a smaller range (e.g. A1:A20), but it cannot take a 2-dimensional range (e.g. A1:B20).
As a practical matter, this approach is much faster than other approaches (and much faster than you'd think, given it's evaluating against every row/column in the range), and won't get fooled by columns that have empty spaces in them (like with a COUNTA style approach).
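To illustrate the point about gaps, here is a small Python comparison of a bottom-up search against a COUNTA-style count. Both function names are made up, and None stands in for a truly empty cell.

```python
def last_nonblank_position(cells):
    """1-based position of the last non-empty cell, searching from the bottom up."""
    for position in range(len(cells), 0, -1):  # search mode -1: last to first
        if cells[position - 1] is not None:    # ISBLANK(...) is FALSE for this cell
            return position
    return None  # nothing found; XMATCH would return #N/A


def counta(cells):
    """Number of non-empty cells, like COUNTA."""
    return sum(1 for c in cells if c is not None)


column = ["a", None, "b", "c", None]
print(last_nonblank_position(column), counta(column))
```

With a gap in the column the two numbers differ, which is why a COUNTA-based row number can point at the wrong cell while the reverse search does not.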

Find a range of value in excel

I have two different sheets with 300,000 rows of data in Excel.
First sheet contains:
S2_Symbol Start_Pos End Position
STE 254857 267891
PRI 748578 758962
ILA 852741 963369
VIS 789456 796325
Second:
S1_Location
789460
852898
748678
My output should be like this:
S1_Location Symbol
789460 VIS
852898 ILA
748678 PRI
I have to find which S2 range (Start_Pos to End Position) each S1_Location falls in, and its corresponding Symbol. I have used the INDEX formula in Excel, but for each cell I have to change the reference cell manually. I can't do that for 300,000 rows.
How can I do this in Excel, or should I use a script?
This solution assumes the following:
Start and End Positions for each S2 Symbol are unique (i.e. there is no intersection between the ranges allocated to each symbol)
Data in first sheet is located at A1:D17 (adjust ranges in formulas as needed)
Data in second sheet is located at A1:B300010 (adjust ranges in formulas as needed)
The solution requires:
To add a working column in worksheet one. Enter this formula in D2 and copy till last record.
=ROWS($A$1:$A2)
Fig. 1
Then in second worksheet enter this formula at B2 and copy till last record.
=INDEX( Sheet1!$A$1:$A$17,
SUMIFS( Sheet1!$D$1:$D$17,
Sheet1!$B$1:$B$17, "<=" & $A2, Sheet1!$C$1:$C$17, ">=" & $A2 ) )
Fig. 2
It took less than approx. 14 seconds to copy downwards and calculate the formulas in sheet 2.
As it can be seen in figures 1 and 2 none of the tables need to be sorted.
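The INDEX/SUMIFS trick can be pictured like this in Python. It is a sketch only: symbol_for is a made-up name, the rows are the four sample rows from the question, and the 1-based row number stands in for the working column.

```python
def symbol_for(location, rows):
    """Find the symbol whose Start..End range contains location, without sorting."""
    # SUMIFS adds up the helper row numbers of all rows with start <= location <= end.
    # With non-overlapping ranges at most one row matches, so the sum is that row's number.
    row_sum = sum(
        i for i, (_symbol, start, end) in enumerate(rows, start=1)
        if start <= location <= end
    )
    if row_sum == 0:
        return None  # no range contains the location
    return rows[row_sum - 1][0]  # INDEX(symbols, row number)


sheet1 = [("STE", 254857, 267891), ("PRI", 748578, 758962),
          ("ILA", 852741, 963369), ("VIS", 789456, 796325)]
for loc in (789460, 852898, 748678):
    print(loc, symbol_for(loc, sheet1))
```

This also shows why the first assumption matters: if two ranges overlapped, their row numbers would be added together and the result would point at an unrelated row.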
Assuming both sheets start in A1, and First sheet ColumnB is sorted ascending, in Second sheet B2 please try:
=INDEX(First!A:A,MATCH(A2,First!B:B))
copied down to suit. It relies on inexact matching.
Assuming we have a Sheet1 like this:
Note that Sheet1 is sorted by Start_Pos, End_Pos in ascending order.
and a Sheet2 like this:
Then the formula in Sheet2!B2 downwards could be:
=INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-(10^-10),Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
See MATCH: https://support.office.com/en-us/article/MATCH-function-e8dffd45-c762-47d6-bf89-533f4a37673a
The idea is: MATCH without exact matching (without the match_type parameter) gets the row of the largest value which is less than or equal to the search value. So in the Start_Pos column it will get the row from which we can get the S2_Symbol. In the End_Pos column it should get the row before that one, provided the value is not outside the given ranges.
There is only one exception. If the value is exactly equal to the value in the End_Pos column, then it will return the same row as in the Start_Pos column. To handle this exception, we can search in the End_Pos column with a slightly smaller value. Thanks to Tom Sharpe for his comment.
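Approximate MATCH on a sorted column is essentially a binary search, so the idea can be sketched in Python with the standard bisect module. This is not a literal translation: symbol_for_sorted is a made-up name, and instead of a second MATCH on End_Pos it simply compares against the end value of the row it found, which is inclusive and so needs no small offset.

```python
from bisect import bisect_right


def symbol_for_sorted(location, rows):
    """rows must be sorted by start position; returns the matching symbol or None."""
    starts = [start for _symbol, start, _end in rows]
    # Index of the last row with start <= location, like MATCH(A2, Sheet1!B:B)
    i = bisect_right(starts, location) - 1
    if i < 0:
        return None  # location lies before the first start
    symbol, _start, end = rows[i]
    # Simplified range check in place of the second MATCH on End_Pos
    return symbol if location <= end else None


# The question's sample rows, sorted by Start_Pos
sheet1 = [("STE", 254857, 267891), ("PRI", 748578, 758962),
          ("VIS", 789456, 796325), ("ILA", 852741, 963369)]
for loc in (789460, 852898, 748678, 300000):
    print(loc, symbol_for_sorted(loc, sheet1))
```

Each lookup only inspects a handful of rows, which is the reason the sorted approach stays fast on 300,000 rows.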
The formula in Sheet2!D2 downwards is:
{=INDEX(Sheet1!A:A,MIN(IF($A2>=Sheet1!$B$2:$B$300000,IF($A2<=Sheet1!$C$2:$C$300000,ROW(Sheet1!$A$2:$A$300000),2^20+1))))}
This is an array formula that implements the requirements exactly. Its performance is very poor when it is used in many cells, but with it Sheet1 does not need to be sorted.
Benchmark test:
Have the following Sheet1:
Formulas:
A2:A300002: ="S"&(ROW(A1)-1)*10&"-"&(ROW(A1)-1)*10+7
B2:B300002: =(ROW(A1)-1)*10
C2:C300002: =B2+7
and the following Sheet2:
Formulas:
A2:A300002: =RANDBETWEEN(0,3000007)
B2:B300002: =INDEX(Sheet1!A:A,IF(MATCH(A2,Sheet1!B:B)>IFERROR(MATCH(A2-10^-9,Sheet1!C:C),0),MATCH(A2,Sheet1!B:B),NA()))
Note the -10^-9 instead of -10^-10 in previous version. This is because we have only 16 digits precision. In previous version this was maximum 6 digits integer part and then 10 digits decimal part. Now it is maximum 7 digits integer part and then 9 digits decimal part.
Calculation after pressing F9 in Sheet2 takes ca. 2 s. (Excel 2007, Windows 7, 4 core processor).
I would have gone for something like this which gives you the first match if there is one:-
=INDEX(First!A:A,MATCH(1,(First!B:B<=A2)*(First!C:C>=A2),0))
assuming keys and start and end values are in a sheet called First and lookup values start in A2.
Array formula which must be entered with Ctrl+Shift+Enter
In response to the question from @pnuts about how long it will take, I have set up a similar benchmark with 300,000 rows in each sheet and it has reached 1% after 90 minutes, so it should take about 150 hours to reach 100% or roughly one week. This is to be expected as the number of computations required is (rows in sheet 1) X (rows in sheet 2)
300,000 X 300,000
but in fact because the multiplication applies to complete columns, I believe it is more correctly
300,000 X 1,048,576
i.e. > 300 billion.
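The arithmetic behind those figures is easy to check. The variable names below are just labels for the numbers quoted above.

```python
lookup_rows = 300_000      # formulas in the second sheet
data_rows = 300_000        # rows actually used in the first sheet
full_column = 2 ** 20      # 1,048,576 rows in a whole-column reference

comparisons_used_rows = lookup_rows * data_rows       # 90,000,000,000
comparisons_full_column = lookup_rows * full_column   # 314,572,800,000

# 1% took 90 minutes, so 100% takes 90 * 100 minutes
estimated_hours = 90 * 100 / 60   # 150.0
estimated_days = estimated_hours / 24

print(comparisons_used_rows, comparisons_full_column, estimated_hours, estimated_days)
```

Restricting the references to the used rows would still leave 90 billion comparisons, so the quadratic growth, not the whole-column reference alone, is what makes this approach impractical at this size.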
A practical version which gives good response for smaller ranges is as follows:-
I define three named ranges Range1, Range2 and Range3
=First!$A$1:INDEX(First!$A:$A,MATCH("ZZZ",First!$A:$A))
=First!$B$1:INDEX(First!$B:$B,MATCH(9.9E+307,First!$B:$B))
=First!$C$1:INDEX(First!$C:$C,MATCH(9.9E+307,First!$C:$C))
and the modified formula is
=INDEX(Range1,MATCH(1,(Range2<=A2)*(Range3>=A2),0))
I was thinking of deleting this answer, but would rather it stood as a counter-example.

Resources