Compare two Excel columns: which products occur most frequently in a specific date range - excel

I would like to compare a few columns to find the top 5 most popular products in the year 2015.
I have this kind of data to work with:
Client | Product | Date of buy
------------------------------
client1 | A | 15.06.2015
client3 | A | 04.12.2015
client5 | F | 15.06.2015
client9 | G | 15.01.2015
client2 | G | 15.01.2015
client1 | R | 05.07.2015
client3 | G | 15.06.2015
client1 | F | 05.07.2015
client3 | F | 15.06.2016
Result wanted: which products clients bought most often together (on the same date), i.e. the top 5 product combinations. E.g.:
1. Product A + Product H 222 times
2. Product A + Product E 77 times
3. Product B + Product O 70 times
4. etc
5. ...
Greetz,

I am making these assumptions:
You can use helper columns.
Your columns above are A, B and C.
You have two header rows and data starts in row 3.
Your dates are stored in an Excel date format and not as string values.
In E3 I generated a list of unique product items using the following formula:
=INDEX($B$3:$B$11,MATCH(0,INDEX(COUNTIF($E$2:E2,$B$3:$B$11),0,0),0))
I copied it down to match the number of rows in the initial list. It starts spitting #N/A when all the unique items in the list have been listed. If you want to avoid this you could put the formula inside of:
=IFERROR(insert formula,"")
Now in column F I counted each item against your criteria, i.e. within the year 2015. I used COUNTIFS, the multiple-criteria version of COUNTIF; the formula in F3 is:
=COUNTIFS($C$3:$C$11,"<"&DATE(2016,1,1),
$C$3:$C$11,">"&DATE(2014,12,31),
$B$3:$B$11,E3)
I just split that across lines for easier reading, so you will have to edit it slightly if you want to copy and paste. If you don't like seeing 0 when there is no product in the adjacent column, you could wrap the formula in:
=IF(E3="","", insert formula )
I then skipped a column and ranked the counts from largest to smallest, returning them in sequence. I only went down two rows, but you could technically do the whole list. The LARGE function does this, and the formula in H3 looks like:
=LARGE($F$3:$F$11,ROWS($1:1))
I then went back one column and put in the product name that corresponds to each count, taking the next name in the list when products have equal counts. I put that in column G because when I read I normally want to see the product name first and then the quantity. If you want it the other way around, just swap the columns. The formula in G3 is:
=INDEX($E$3:$E$11,MATCH(H3,$F$3:$F$11,0)+COUNTIF($H$3:$H3,H3)-1)
Copy E3 and F3 down as far as you need. Copy G3 and H3 down one row and you will have the top two; down two rows and you have the top three, and so on.
This is how it looks. The dates are displayed according to my computer's date format.
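As a cross-check outside Excel, the same count-and-rank logic can be sketched in Python. This is only an illustration under assumptions: Python is not part of the answer above, the function name top_products is hypothetical, and the dates are assumed to be day.month.year strings as in the sample table.

```python
from collections import Counter
from datetime import datetime

# Sample rows from the question: (client, product, date of purchase).
rows = [
    ("client1", "A", "15.06.2015"),
    ("client3", "A", "04.12.2015"),
    ("client5", "F", "15.06.2015"),
    ("client9", "G", "15.01.2015"),
    ("client2", "G", "15.01.2015"),
    ("client1", "R", "05.07.2015"),
    ("client3", "G", "15.06.2015"),
    ("client1", "F", "05.07.2015"),
    ("client3", "F", "15.06.2016"),
]


def top_products(rows, year, n):
    """Return the n most frequent products bought in the given year."""
    counts = Counter()
    for _client, product, date_text in rows:
        # Assumption: dates are text in day.month.year form.
        if datetime.strptime(date_text, "%d.%m.%Y").year == year:
            counts[product] += 1
    # sorted() is stable, so products with equal counts keep first-seen
    # order, similar to how the helper columns break ties on this sample.
    ranked = sorted(counts.items(), key=lambda item: -item[1])
    return ranked[:n]


print(top_products(rows, 2015, 2))
```

The purchase of F dated 15.06.2016 is excluded by the year filter, which is what the two date criteria in the COUNTIFS formula do.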

Related

Build 1D array / list in formula by multiplying values for use in AVERAGE()

I have an Excel spreadsheet with a list of values. Column A contains the grading and column B contains the number of occurrences:
A | B
---------------
Grading | Count
1 | 1
2 | 1
3 | 2
4 | 3
5 | 5
I would like to find the average grading weighted by the count, but to do this I need to build a list based on these values, i.e. the above table should translate into:
=AVERAGE(1,2,3,3,4,4,4,5,5,5,5,5).
I have managed to reach a solution through a very convoluted method: creating a new table, using IF and COUNTIF to print out an array, and then taking the AVERAGE of the entire range. But this is time-consuming to repeat, and I'm sure there is a much simpler way of doing this.
If I'm not mistaken, you can just take the sum of product of columns A and B, then divide by the sum of the Count column:
=SUMPRODUCT(A2:A6, B2:B6) / SUM(B2:B6)
Note that your hand-written expanded formula yields the same result:
=AVERAGE(1,2,3,3,4,4,4,5,5,5,5,5)
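To see why the two formulas agree, here is a small Python sketch (not part of the original answer; the function name weighted_average is hypothetical). It computes the sum of products divided by the sum of counts and compares it with the plain average of the expanded list.

```python
def weighted_average(values, weights):
    """Sum of value*weight divided by the sum of weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


gradings = [1, 2, 3, 4, 5]
counts = [1, 1, 2, 3, 5]

# Expand each grading by its count: 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5.
expanded = [g for g, c in zip(gradings, counts) for _ in range(c)]

print(weighted_average(gradings, counts))
print(sum(expanded) / len(expanded))
```

Both lines print the same number, 46 divided by 12, because repeating a grading c times adds it to the total c times, which is exactly what multiplying by the count does.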

How to copy and process data in the columns based on texts in the Header but header is not in first row

I am new to Excel and have been told that I may find the solution in VBA.
I am working on a system-generated report from which I need to remove a few columns, but the report contains some important information in the first 25 rows. After this information come 15 to 40 rows of actual report data (the number of rows varies by centre).
I need to process this actual data by removing blank columns, sorting it A to Z, and then inserting the 'Average' in the last column.
Currently, I copy the actual report data, trim the unnecessary columns, make the necessary changes, apply VLOOKUP against last month's report to verify that the opening and closing counts match (they almost always do, but why take chances?), and then paste it back at the same location. Can this be done using VBA?
Format of report
> For the period 01-11-2011 to 30-11-2011
> *Parameter Selection List
>From 01-11-2011
>Date 30-11-2011
> Partner SAM
>Code TWO
>Location 999
>Report For All
>Code : TWO
>Location : ABC
Product Name |LastCount|AddedInPeriod |Left In The Period |Net Total | Average
SUPER GLUE |123456 |0 | 0 | 234567 |
CRICKET BAT |345678 |0 | 0 | 346899 |
NICON |2345 |0 | 0 | 2456 |
OLD STICKS |45689 |0 | 0 | 56778 |
Total |517168 |0 | 0 | 640700 |
Product Name is in column B, column C is blank, and Last Count is in column D. The AddedInPeriod header is merged across columns D and F, but its data is in column F; likewise, the Left In The Period header is merged across columns G and H, but its data is in column H. Column I is blank, Net Total is in J, column K is blank, and Average is in column L.
Only the data under the headers Product Name, Last Count and Net Total is needed; the rest of the range should be removed. (Please note that a few cells are merged.)
Final Report should look like this
For the period 01-11-2011 to 30-11-2011
*Parameter Selection List
From 01-11-2011
Date 30-11-2011
Partner SAM
Code TWO
Location 999
Report For All
Code :TWO
Location : ABC
Product Name|Last Count |Net Total|Average
CRICKET BAT |345678 |346899 |346288.5
NICON |2345 |2456 |2400.5
OLD STICKS |45689 |56778 |51233.5
SUPER GLUE |123456 |234567 |179011.5
**Total |517168 |640700 |578934**
How to do that?
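Here is a function that returns the letter of the column whose header matches the text you pass in.
Function Letter(oSheet As Worksheet, name As String, Optional num As Long = 1) As String
    ' Find the position of the header text in row num of oSheet.
    Dim pos As Variant
    pos = Application.Match(name, oSheet.Rows(num), 0)
    If IsError(pos) Then Exit Function ' header not found: returns ""
    ' The address looks like "$C$1"; the part between the "$" signs is the column letter.
    Letter = Split(oSheet.Cells(1, pos).Address, "$")(1)
End Function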
Just create a new variable and pass the sheet, the header text and the header row to the function, e.g.:
newvar = Letter(Sheets("Sheet1"), "SUPER GLUE", 4)
Here 4 is the row in which the header can be found. Once you have the letter, use
Sheets("Sheet1").Range(newvar & "4:" & newvar & Sheets("Sheet1").Range("A1").End(xlDown).Row).Copy Destination:=Sheets("Sheet2").Range("A1")
This copies the column under that header, from the header row down, and pastes it into Sheet2 starting at A1.
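The overall job described in the question (find the header row wherever it sits, keep only the wanted columns, sort, add the average) can be sketched in Python as a way to reason about the steps before writing the VBA. Everything here is an assumption for illustration: the function name tidy_report is hypothetical, the rows are plain lists instead of worksheet cells, and "Average" is taken to be the mean of Last Count and Net Total because that matches the numbers in the expected final report.

```python
def tidy_report(rows, header_key="Product Name",
                keep=("Product Name", "LastCount", "Net Total")):
    """Split off the preamble, keep selected columns, sort, add an average."""
    # Locate the header row: the first row containing the key header text.
    header_at = next(i for i, row in enumerate(rows) if header_key in row)
    header = rows[header_at]
    cols = [header.index(name) for name in keep]

    body = [[row[c] for c in cols] for row in rows[header_at + 1:]]
    # Keep the "Total" row last; sort the product rows A to Z by name.
    totals = [r for r in body if r[0] == "Total"]
    products = sorted((r for r in body if r[0] != "Total"), key=lambda r: r[0])

    out = [list(keep) + ["Average"]]
    for name, last_count, net_total in products + totals:
        # Assumption: Average = mean of Last Count and Net Total.
        out.append([name, last_count, net_total, (last_count + net_total) / 2])
    return rows[:header_at], out


report = [
    ["For the period 01-11-2011 to 30-11-2011"],
    ["Partner", "SAM"],
    ["Product Name", "", "LastCount", "AddedInPeriod",
     "Left In The Period", "", "Net Total", "", "Average"],
    ["SUPER GLUE", "", 123456, 0, 0, "", 234567, "", ""],
    ["CRICKET BAT", "", 345678, 0, 0, "", 346899, "", ""],
    ["NICON", "", 2345, 0, 0, "", 2456, "", ""],
    ["OLD STICKS", "", 45689, 0, 0, "", 56778, "", ""],
    ["Total", "", 517168, 0, 0, "", 640700, "", ""],
]

preamble, table = tidy_report(report)
for line in table:
    print(line)
```

Searching for the header text instead of hard-coding a row number is the same idea as passing the header row to the VBA function, and it keeps working when the number of preamble rows changes.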

How to find parent in an indented hierarchy?

I currently have a sheet in excel with an indented hierarchy of items as shown below. Each item is indented (four spaces per indent) to show how it fits into the overall hierarchy. I have been able to create a "Level" column that translates the indentation level into a number.
+------------+-------+--------+
| Item | Level | Parent |
+------------+-------+--------+
| P1 | 1 | N/A |
| P2 | 2 | P1 |
| P3 | 2 | P1 |
| P4 | 3 | P3 |
| P5 | 2 | P1 |
| P6 | 3 | P5 |
+------------+-------+--------+
What I want to do is generate the "Parent" column above, which uses the "Level" information to display each item's parent. I think that this would need to be done with a loop that does the following for each item X:
-Find level info for X
-Find (levelx-1) which would equal the parent item's level
-Search upward for the first row with a level equal to (levelx-1)
-Find the item number in that row
-Write item number in adjacent cell to X
Unfortunately, I'm not sure how to translate this idea into VBA. Thanks in advance for any assistance.
OK, assuming the above table starts in cell A1, useful data starts in row 2. This formula will do the trick:
=INDEX($A$1:$A$7,MAX(IF($B$2:$B2=B2-1,ROW($B$2:$B2),"")))
Enter this in cell C2 as an array formula (Ctrl+Shift+Enter), then fill it down. The first one will obviously be an error (#VALUE! rather than #N/A), because the top-level item has no parent.
How it works:
IF($B$2:$B2=B2-1,ROW($B$2:$B2),"")
This creates an array holding the row numbers of the rows whose level is one less than the current row's level. To examine only the rows up to the current one, you need expanding ranges, hence the $B$2:$B2 style references.
The MAX function gets the maximum of these rows, which is the closest to our current cell. Now we have the row number. All we need now is a formula to extract the data from column A from the indicated row. This is what INDEX does.
It took me a while to understand how this formula works, so after figuring it out (OK, my wife helped me a bit) I'd like to share an idiot-proof explanation for other Excel dummies like me. Here we go:
=INDEX($A$1:$A$7,MAX(IF($B$2:$B2=B2-1,ROW($B$2:$B2),"")))
means:
Among the values in range $B$2:$B2, find all values equal to B2-1.
For the values you find, list their row numbers. (ROW)
From the list of row numbers, pick the highest one (let's call it X). (MAX)
Return the value in line number X of the range $A$1:$A$7.
(Warning! Your range has to start in row 1, so that the row number is the same as the line number in your range. Otherwise you have to adapt the formula.)
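The loop described in the question and the logic of the formula can both be illustrated with a short Python sketch. This is not the VBA that was asked for, and the function name find_parents is hypothetical; it only shows the "search upward for the nearest row that is one level higher" step.

```python
def find_parents(items, levels):
    """For each item, return the nearest item above it that is one level higher."""
    parents = []
    for i, level in enumerate(levels):
        parent = None  # top-level items have no parent
        # Walk upward from the row above until a row with level-1 is found.
        for j in range(i - 1, -1, -1):
            if levels[j] == level - 1:
                parent = items[j]
                break
        parents.append(parent)
    return parents


items = ["P1", "P2", "P3", "P4", "P5", "P6"]
levels = [1, 2, 2, 3, 2, 3]
print(find_parents(items, levels))
```

Stopping at the first match while walking upward gives the same row that MAX picks in the formula, since the highest matching row number above the current row is the closest one.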

Counting the number of older siblings in an Excel spreadsheet

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I want to do is add a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)
Create a column named Distinct with this formula
=1/COUNTIF([ID],[@ID])
Then you can find all the older 0-sexed siblings like this
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used table notation. If you're not familiar with it, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] (Excel 2010 and later) refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A.
The first formula gives you a value to count that will always add up to 1 for a particular ID. So ID=1 has three lines and Distinct will result in .33333 for each line. When you add up the three lines you get 1. This is similar to a SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB (an older sibling was born first), the Mother is the same as the current Mother, and the Sex is zero.
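The fractional-weight idea can be checked with a small Python sketch. It is an illustration only: the function name older_siblings is hypothetical, the rows are the sample data from the question with the dates typed as date objects, and exact fractions are used so that three weights of 1/3 add up to exactly 1.

```python
from collections import Counter
from datetime import date
from fractions import Fraction

# One tuple per measurement row: (ID, DOB, mother ID, sex).
rows = [
    (1, date(1990, 4, 3), 12, 1),
    (1, date(1990, 4, 3), 12, 1),
    (1, date(1990, 4, 3), 12, 1),
    (2, date(1991, 12, 21), 12, 0),
    (2, date(1991, 12, 21), 12, 0),
    (2, date(1991, 12, 21), 12, 0),
    (3, date(1989, 1, 30), 23, 0),
    (3, date(1989, 1, 30), 23, 0),
]


def older_siblings(rows, current, sex):
    """Count distinct older siblings of the given sex for one row."""
    _cur_id, cur_dob, cur_mother, _cur_sex = current
    rows_per_id = Counter(r[0] for r in rows)
    total = sum(
        Fraction(1, rows_per_id[r_id])  # the "Distinct" weight, 1/COUNTIF
        for r_id, dob, mother, r_sex in rows
        if mother == cur_mother and dob < cur_dob and r_sex == sex
    )
    return int(total)


# Individual 2 shares mother 12 with individual 1, who was born earlier.
print(older_siblings(rows, rows[3], 1))
```

Each individual contributes weights that sum to 1 no matter how many measurement rows they have, so the total is a count of people and not of rows.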
I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.
You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
Assuming E:E is column for sex, D:D is column for mother ID and C:C is column for DOB.
Write this formula in H2 cell for example and drag it down.

Excel Sophisticated Sort - Return low/high values

I am trying to sort data imported from a csv file. The data comes in like such:
Columns
A | B
--------
t1 | 1
t3 | 9
t1 | 2
t2 | 5
t1 | 1
t3 | 13
t1 | 3
t3 | 11
t2 | 4
t2 | 7
t3 | 10
t3 | 10
and i want output similar to this:
Columns
D | E | F
----------------
t1 | 1 | 3
t2 | 4 | 7
t3 | 9 | 13
Explanation: Basically what I need to do is find the lowest and highest values from column B for each different value in column A, and list them neatly as shown in the second example.
I've worked with VBA before, so if this has to be done via VBA that's fine. I'm just at a loss as to how to accomplish this task. Any help would be appreciated.
EDIT: Forgot to mention, if it would make the task simpler, it's fine if I have to manually sort the data alphabetically based on column A (thus putting the same values together).
I agree with @chrisneilsen that a Pivot Table is the best way to go. If you are set on using formulas, you can try using the following (both entered as array formulas with Ctrl+Shift+Enter):
In cell E1, which will represent the minimum value:
=MIN(IF($A$1:$A$12=D1,1,MAX($B$1:$B$12)+1)*$B$1:$B$12)
And in cell F1, which will represent the maximum value:
=MAX(IF($A$1:$A$12=D1,1,MIN($B$1:$B$12)-1)*$B$1:$B$12)
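The general idea is to check which values in column A equal the target value in column D. The IF returns 1 where there is a match and a penalty multiplier otherwise; for MIN, the multiplier is the maximum of column B plus 1. Multiplying by column B then leaves matching values unchanged and pushes non-matching values above anything that can occur in the data, so MIN returns a legitimate value. Note that this relies on the values in column B being at least 1, and the MAX version is only safe for data like the sample, where MIN($B$1:$B$12)-1 is 0 and the values are positive. A more robust array form is =MIN(IF($A$1:$A$12=D1,$B$1:$B$12)), and likewise with MAX, because MIN and MAX ignore the FALSE entries the IF produces for non-matching rows.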
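For comparison, the grouping itself is easy to express outside Excel. The following Python sketch (not part of the original answer; the function name min_max_by_key is hypothetical) computes the lowest and highest value per key in a single pass over the sample data, with no need to sort the input first.

```python
def min_max_by_key(pairs):
    """Return {key: (lowest, highest)} for an iterable of (key, value) pairs."""
    result = {}
    for key, value in pairs:
        if key in result:
            low, high = result[key]
            result[key] = (min(low, value), max(high, value))
        else:
            # First time this key is seen: it is both the min and the max.
            result[key] = (value, value)
    # Sort by key so the output is listed neatly, as in the question.
    return dict(sorted(result.items()))


data = [
    ("t1", 1), ("t3", 9), ("t1", 2), ("t2", 5), ("t1", 1), ("t3", 13),
    ("t1", 3), ("t3", 11), ("t2", 4), ("t2", 7), ("t3", 10), ("t3", 10),
]

for key, (low, high) in min_max_by_key(data).items():
    print(key, low, high)
```

This works for any numeric values, including zero and negatives, because values are only compared within their own group and never multiplied by a penalty.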
Here is a Pivot Table using Excel 2007. To create, add column headers to your data, select your data and then in the Ribbon click Insert -> Pivot Table. In the dialog box, you decide where you want to put it (it is commonly put in a New Worksheet, so you can leave the default if you want - I left it in the same worksheet for illustration purposes). From there, you can arrange it by dragging each field so it matches the pictures. For the Max/Min fields, just drag the Value field into the Values section twice. Then, in the actual Pivot Table, you can right-click on one of the values in the column and select Summarize Data By -> Min to summarize by the minimum value for each key:

Resources