Get count of days from last date over time? (non-VBA) - excel

I have two columns like so:
Item | Date
Item1 | 1/1/20
Item2 | 1/2/20
Item1 | 1/3/20
Item2 | 1/4/20
Item1 | 1/6/20
Item2 | 1/8/20
For each row, I want the number of days that have passed since the same item last appeared, like so:
Item | Date | Days passed
Item1 | 1/1/20 | 0
Item2 | 1/2/20 | 0
Item1 | 1/3/20 | 2
Item2 | 1/4/20 | 2
Item1 | 1/6/20 | 3
Item2 | 1/8/20 | 4
Any ideas for a non-VBA solution?

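LOOKUP finds the date of the last earlier row with the same item. Wrapping it in IFERROR returns 0 for an item's first occurrence, where LOOKUP finds no match and returns #N/A (formula shown for row 10; copy it to the other rows):
=IFERROR(B10-LOOKUP(2,1/($A$4:A9=A10),$B$4:B9),0)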
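The logic behind the formula (find the most recent earlier row for the same item and subtract its date) can be sketched outside Excel as well. The following is a minimal Python sketch, not part of the original answer. It assumes the rows are already in chronological order, reads the dates as month/day/year, and treats an item's first occurrence as 0 days, as in the desired output. The function name `days_since_last` is made up for illustration.

```python
from datetime import date


def days_since_last(rows):
    """For each (item, date) row, return the days since that item's previous row.

    An item's first occurrence yields 0. Rows are assumed to be sorted by date.
    """
    last_seen = {}  # item -> date of its most recent earlier row
    result = []
    for item, day in rows:
        previous = last_seen.get(item)
        result.append(0 if previous is None else (day - previous).days)
        last_seen[item] = day
    return result


rows = [
    ("Item1", date(2020, 1, 1)),
    ("Item2", date(2020, 1, 2)),
    ("Item1", date(2020, 1, 3)),
    ("Item2", date(2020, 1, 4)),
    ("Item1", date(2020, 1, 6)),
    ("Item2", date(2020, 1, 8)),
]
print(days_since_last(rows))  # [0, 0, 2, 2, 3, 4]
```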

Related

SUM values in one column if criteria exists at least once per value in another column

| A B C D | E F | G H
----|----------------------------------------------------|-----------------------|-------------------
1 | | |
2 | Products date quantity | |
----|----------------------------------------------------|-----------------------|-------------------
3 | Product_A 2020-01-08 0 | From 2020-01-01 | Result: 800
4 | Product_A 2020-12-15 0 | to 2020-10-31 |
5 | Product_A 2020-12-23 0 | |
6 | Product_A 500 | |
----|----------------------------------------------------|-----------------------|------------------
7 | Product_B 2020-11-09 0 | |
8 | Product_B 2021-03-14 0 | |
9 | Product_B 700 | |
----|----------------------------------------------------|-----------------------|------------------
10 | Product_C 2020-02-05 0 | |
11 | Product_C 2020-07-19 0 | |
12 | Product_C 2020-09-18 0 | |
13 | Product_C 2020-09-25 0 | |
14 | Product_C 300 | |
15 | | |
16 | | |
The table lists different products with multiple dates per product.
Below each product there is a row that shows a quantity.
In cell H3 I want the sum of the quantities of all products that have at least one date between the dates in cell F3 and cell F4. In the example this applies to Product_A and Product_C, so the sum is 500+300=800.
I have no idea what kind of formula I need to achieve this.
I guess it must be something like this:
SUMIFS(Date in Cell F3 OR in Cell F4 exists for Product in Column C THEN SUM over Column D)
Do you have any idea what this formula should look like?
One way would be with SUMPRODUCT() combined with COUNTIFS():
=SUMPRODUCT((COUNTIFS(B3:B14,B3:B14,C3:C14,">="&F3,C3:C14,"<="&F4)>0)*D3:D14)
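To make the logic of the formula explicit: the COUNTIFS part flags every row whose product has at least one date in the range, and the SUMPRODUCT part adds up the quantities of the flagged rows. Here is a minimal Python sketch of the same two steps, not part of the original answer. The function name and the `(product, date, quantity)` tuple layout are assumptions for illustration, with `None` standing in for the empty date cell on each quantity row.

```python
from datetime import date


def sum_for_products_in_range(rows, start, end):
    """Sum the quantities of all products with at least one date in [start, end]."""
    # Step 1 (the COUNTIFS part): products having at least one date in range.
    matching = {
        product
        for product, day, _ in rows
        if day is not None and start <= day <= end
    }
    # Step 2 (the SUMPRODUCT part): add up quantities for those products.
    return sum(qty for product, _, qty in rows if product in matching)


rows = [
    ("Product_A", date(2020, 1, 8), 0),
    ("Product_A", date(2020, 12, 15), 0),
    ("Product_A", date(2020, 12, 23), 0),
    ("Product_A", None, 500),
    ("Product_B", date(2020, 11, 9), 0),
    ("Product_B", date(2021, 3, 14), 0),
    ("Product_B", None, 700),
    ("Product_C", date(2020, 2, 5), 0),
    ("Product_C", date(2020, 7, 19), 0),
    ("Product_C", date(2020, 9, 18), 0),
    ("Product_C", date(2020, 9, 25), 0),
    ("Product_C", None, 300),
]
print(sum_for_products_in_range(rows, date(2020, 1, 1), date(2020, 10, 31)))  # 800
```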

Cube/ Roll up Function of dataframe but to skip the summing of a column for few records in spark

I have the following dataframe:
+--------+------+---------+---------+
| Col1 | col2 | values1 | Values2 |
+--------+------+---------+---------+
| item1 | A1 | 5 | 11 |
| item1 | A2 | 5 | 25 |
| item1 | A3 | 5 | 33 |
| item1 | na | | 18 |
| item2 | A1 | 6 | 12 |
| item2 | A2 | 6 | 26 |
| item2 | A3 | 6 | 34 |
| item2 | na | 6 | |
+--------+------+---------+---------+
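which can be created with this code:
// toDF needs the implicits of an existing SparkSession (assumed here to be named spark)
import spark.implicits._
val df = Seq(
  ("item1", "A1", 5, 11),
  ("item1", "A2", 5, 25),
  ("item1", "A3", 5, 33),
  ("item1", "na", 0, 18),
  ("item2", "A1", 6, 12),
  ("item2", "A2", 6, 26),
  ("item2", "A3", 6, 34),
  ("item2", "na", 6, 0)
).toDF("Col1", "col2", "values1", "Values2")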
When doing a rollup or cube on it, I do not want column values1 to be summed over all the records.
My desired output:
+-------+------+---------+---------+
| Col1 | col2 | values1 | values2 |
+-------+------+---------+---------+
| null | null | 17 | 159 |
| item1 | null | 5 | 87 |
| item1 | A1 | 5 | 11 |
| item1 | A2 | 5 | 25 |
| item1 | A3 | 5 | 33 |
| item1 | na | 0 | 18 |
| item2 | null | 12 | 72 |
| item2 | A1 | 6 | 12 |
| item2 | A2 | 6 | 26 |
| item2 | A3 | 6 | 34 |
| item2 | na | 6 | |
+-------+------+---------+---------+
How can I apply a rollup or cube function to this dataset so that, at the Col1 level, values1 is the value from one of the A1/A2/A3 rows plus the value from the na row, rather than the sum over all rows?
For example, the second row of the desired output shows
values1 = 5 = 5+0 and values2 = 87 = 11+25+33+18, and the 7th row shows
values1 = 12 = 6+6 and values2 = 72 = 12+26+34+0.
The rollup operation below instead sums values1 over every row, which I don't want:
df.rollup("Col1","col2").agg(sum("values1") as "values1",sum("values2") as "values2");
Current Output:
+-------+------+---------+---------+
| Col1 | col2 | values1 | values2 |
+-------+------+---------+---------+
| null | null | 39 | 159 |
| item1 | null | 15 | 87 |
| item1 | A1 | 5 | 11 |
| item1 | A2 | 5 | 25 |
| item1 | A3 | 5 | 33 |
| item1 | na | 0 | 18 |
| item2 | null | 24 | 72 |
| item2 | A1 | 6 | 12 |
| item2 | A2 | 6 | 26 |
| item2 | A3 | 6 | 34 |
| item2 | na | 6 | |
+-------+------+---------+---------+
(The question linked as a duplicate does not cover what is asked here. The desired output differs from the answers in that link.)
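The arithmetic behind the desired output can be pinned down with a small plain-Python sketch. This is not Spark code and not an answer from the thread; it only restates the rule from the question under one assumption: within an item, the A1/A2/A3 rows all carry the same values1, so the first non-"na" row can be taken as the representative. The function name `desired_rollup` is hypothetical.

```python
from collections import defaultdict

NA = "na"  # marker used in col2 for the extra row of each item


def desired_rollup(rows):
    """Return rollup rows where the item-level values1 is
    (values1 of one non-na row) + (values1 of the na row),
    while values2 is an ordinary sum."""
    by_item = defaultdict(list)
    for col1, col2, v1, v2 in rows:
        by_item[col1].append((col2, v1, v2))

    out = []
    total_v1 = total_v2 = 0
    for col1, group in by_item.items():
        # Assumption: all non-na rows share the same values1, so take the first.
        detail_v1 = next((v1 for col2, v1, _ in group if col2 != NA), 0)
        na_v1 = sum(v1 for col2, v1, _ in group if col2 == NA)
        item_v1 = detail_v1 + na_v1
        item_v2 = sum(v2 for _, _, v2 in group)
        out.append((col1, None, item_v1, item_v2))
        out.extend((col1, col2, v1, v2) for col2, v1, v2 in group)
        total_v1 += item_v1
        total_v2 += item_v2
    # The grand total is built from the item-level subtotals.
    return [(None, None, total_v1, total_v2)] + out


rows = [
    ("item1", "A1", 5, 11), ("item1", "A2", 5, 25),
    ("item1", "A3", 5, 33), ("item1", "na", 0, 18),
    ("item2", "A1", 6, 12), ("item2", "A2", 6, 26),
    ("item2", "A3", 6, 34), ("item2", "na", 6, 0),
]
for row in desired_rollup(rows):
    print(row)
```

The subtotal rows come out as (None, None, 17, 159), ("item1", None, 5, 87) and ("item2", None, 12, 72), matching the desired output table.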

Statistical significance test for ranked data

I have a list of rankings in the following format:
Item | Score | Rank
item1 | 0.97 | 6
item2 | 0.53 | 4
item3 | 0.05 | 1
item4 | 0.68 | 5
item5 | 0.10 | 2
item6 | 0.29 | 3
I want to determine whether the difference between each pair of ranked items is significant given the scores. What statistical test should I conduct? Thank you!

How to return columns with unique counts per value in other column

So I have a pandas dataframe containing names in 3 columns. Looking something like this:
+-------------+-------------+-------------+
| NameColumn1 | NameColumn2 | NameColumn3 |
+-------------+-------------+-------------+
| Name1 | Name2 | Name3 |
| Name1 | Name2 | Name6 |
| Name1 | Name2 | Name8 |
| Name1 | Name4 | Name5 |
+-------------+-------------+-------------+
Now I would like to add 3 new columns containing counts of the unique values per name in the column to the left of it.
For example, the first column I would like to add is the count of unique names in NameColumn2 for each unique name in NameColumn1. For Name1 that is 2 (Name2 and Name4).
Likewise, for NameColumn3 per name in NameColumn2, the count for Name2 is 3 (Name3, Name6 and Name8).
So the example would become:
+----------+----------+----------+-------------+-------------+--+
| NameCol1 | NameCol2 | NameCol3 | CountOfCol2 | CountOfCol3 | |
+----------+----------+----------+-------------+-------------+--+
| Name1 | Name2 | Name3 | 2 | 3 | |
| Name1 | Name2 | Name6 | 2 | 3 | |
| Name1 | Name2 | Name8 | 2 | 3 | |
| Name1 | Name4 | Name5 | 2 | 1 | |
+----------+----------+----------+-------------+-------------+--+
This is how to get the answer for columns 2 and 3: group by NameColumn2, count the unique values of NameColumn3 in each group, and use transform to broadcast each group's count back onto its rows.
In [60]: df.groupby('NameColumn2')['NameColumn3'].transform('nunique')
Out[60]:
0 3
1 3
2 3
3 1
Name: NameColumn3, dtype: int64
To get the count for any other pair of columns, replace NameColumn2 with the grouping column x and NameColumn3 with the counted column y in the expression above.
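For readers who want to see what the grouped unique count computes row by row, here is a minimal sketch in plain Python without pandas. It is an illustration only, and the helper name `unique_count_per_group` and the tuple-based rows are assumptions. It collects the distinct values of one column per value of another, then maps each row to the size of its group's set.

```python
from collections import defaultdict


def unique_count_per_group(rows, key_index, value_index):
    """For every row, count the distinct values at value_index among
    all rows that share the same value at key_index."""
    seen = defaultdict(set)  # key -> set of distinct values
    for row in rows:
        seen[row[key_index]].add(row[value_index])
    # Broadcast each group's count back to its rows.
    return [len(seen[row[key_index]]) for row in rows]


rows = [
    ("Name1", "Name2", "Name3"),
    ("Name1", "Name2", "Name6"),
    ("Name1", "Name2", "Name8"),
    ("Name1", "Name4", "Name5"),
]
print(unique_count_per_group(rows, 0, 1))  # [2, 2, 2, 2]
print(unique_count_per_group(rows, 1, 2))  # [3, 3, 3, 1]
```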

Copy value N times in Excel

I have a simple list:
A B
item1 3
item2 2
item3 4
item4 1
Need to output:
A
item1
item1
item1
item2
item2
item3
item3
item3
item3
item4
Here is one way of doing it without VBA:
Insert a column to the left of A, so your current A and B columns are now B and C.
Put 1 in A1
Put =A1+C1 in A2 and copy down to A5
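Put an empty string in B5 by entering a single quote (') in the cell
Put 1 in E1 and 2 in E2, then copy down to get 1, 2, ..., 10
Put =VLOOKUP(E1,$A$1:$B$5,2) in F1 and copy down. With the fourth argument omitted, VLOOKUP does an approximate match, so it returns the item whose start position in column A is the largest one not exceeding the value in column E.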
It should look like this:
| A | B | C | D | E | F |
|----|-------|---|---|----|-------|
| 1 | item1 | 3 | | 1 | item1 |
| 4 | item2 | 2 | | 2 | item1 |
| 6 | item3 | 4 | | 3 | item1 |
| 10 | item4 | 1 | | 4 | item2 |
| 11 | | | | 5 | item2 |
| | | | | 6 | item3 |
| | | | | 7 | item3 |
| | | | | 8 | item3 |
| | | | | 9 | item3 |
| | | | | 10 | item4 |
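The helper column in A holds the output row at which each item starts, and the approximate-match lookup picks the last start position that does not exceed the current output row. The same idea can be sketched in Python with a binary search. This is a minimal sketch, not part of the original answer, and the function name `repeat_items` is made up for illustration.

```python
from bisect import bisect_right


def repeat_items(items, counts):
    """Repeat each item counts[i] times using cumulative start positions,
    mirroring the helper column plus approximate-match lookup."""
    starts = []   # 1-based output row where each item begins (column A)
    position = 1
    for count in counts:
        starts.append(position)
        position += count
    total = position - 1  # number of output rows (column E runs 1..total)

    output = []
    for n in range(1, total + 1):
        # Index of the last start position <= n, like the approximate VLOOKUP.
        index = bisect_right(starts, n) - 1
        output.append(items[index])
    return output


items = ["item1", "item2", "item3", "item4"]
counts = [3, 2, 4, 1]
print(repeat_items(items, counts))
```

For the sample list the start positions are 1, 4, 6 and 10, the same values as in column A of the table above.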
Here's the VBA solution. I don't quite understand the comment that VBA won't be dynamic. It's as dynamic as you make it, just like a formula. Note that this macro will erase all data on Sheet1 and replace it with the new output. If you want the desired output on a different sheet, then change the reference to Sheet2 or what have you.
Option Explicit
Sub MultiCopy()
    Dim arr As Variant
    Dim r As Range
    Dim i As Long
    Dim currRow As Long
    Dim nCopy As Long
    Dim item As String
    'store cell values in array
    arr = Sheet1.UsedRange
    currRow = 2
    'remove all values
    Sheet1.Cells.ClearContents
    Sheet1.Range("A1") = "A"
    For i = 2 To UBound(arr, 1)
        item = arr(i, 1)
        nCopy = arr(i, 2) - 1
        If nCopy > -1 Then
            Sheet1.Range("A" & currRow & ":A" & (currRow + nCopy)).Value = item
            currRow = currRow + nCopy + 1
        End If
    Next
End Sub
