Multilevel roll-up sum (child levels into their parent level) without a macro - Excel

In one of my workbooks I will have a dynamic, Bill of Materials (BoM)-like structure. The table is dynamic, so the indenture levels can change over time.
The constants are:
Only children can hold a value; parents get their value from their children.
Parent value = sum of its children's values
The aim is:
Any row labeled parent is the sum of its children's values. Rows at level (n+1) are children of level n.
for example
1 {30+50} parent
2 {10+10+10} parent
3 10 child
3 10 child
3 10 child
2 50 parent
What I tried:
I managed to get the parent/child labeling working. Assuming the Level heading is in cell A1, the formula is:
=IF(NOT(ISBLANK(A3));IF(A3>A2;"parent";"child");IF(A46=A45;"child";"parent"))
I searched the web, especially SO, but couldn't find a useful example.
After a lot of searching, I concluded that it can probably be achieved with OFFSET, MATCH, INDEX, ROW and perhaps array formulas, but I couldn't get it to work.
This dynamic structure has lots of rows, so automating it would really help me. I can't run a macro (not allowed).
I think the solution will be a long formula. If someone can also help with some explanation (the logic, a very brief aim of each substep), it would be much appreciated.
I like it here because I learn a lot (rather than just getting a ready-made solution that only lasts until the next similar case).
I worked with the example data below. (Edit: the "Manually Calculated Vals" column was added to clarify what is being asked for. That column is what I need Excel to calculate automatically.)
Edit 2: there was an error in the "Manually Calculated Vals" column. Thanks to XORLX, I corrected it.
Regards
Level  Value  Parent/Child  Manually Calculated Vals
0             parent        1815
1             parent        668
2             parent        110
3      19     child
3      91     child
2             parent        330
3             parent        200
4      40     child
4      79     child
4      81     child
3      60     child
3      42     child
3      28     child
2      3      child
2      35     child
2             parent        137
3             parent        113
4      46     child
4      67     child
3      24     child
2      53     child
1             parent        1147
2             parent        195
3      96     child
3      99     child
2             parent        325
3             parent        142
4      59     child
4      83     child
3      40     child
3      79     child
3      64     child
2             parent        240
3             parent        151
4      80     child
4      71     child
3      89     child
2             parent        157
3      57     child
3      100    child
2             parent        169
3      91     child
3      20     child
3      58     child
2      61     child

Assuming your table is in A1:C46 (with headers in row 1), put this array formula** in D2:
=IF(C2="child","",SUM(B3:INDEX(B3:B$46,LOOKUP(10^10,MATCH({6,1},SEARCH("T",(A3:A$46<=A2)&"T"),0))-1)))
Copy down as required.
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
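On the logic the question asks about: as far as I can tell, for a parent row the formula sums column B from the row below it down to the row just before the next row whose level is less than or equal to the parent's level (or to the end of the table if there is no such row). Because parent rows have no value of their own, this adds up only the leaf values of the subtree. Below is a minimal Python sketch of that same logic, not a replacement for the formula; roll_up and the sample lists are hypothetical names used only for illustration, and the data is the 110 / 330 / 200 block from the question's table.

```python
def roll_up(levels, values):
    """Return the subtree total for each parent row and None for each leaf row."""
    totals = []
    n = len(levels)
    for i in range(n):
        # The subtree ends just before the first later row whose level
        # is <= this row's level (or at the end of the table).
        end = n
        for j in range(i + 1, n):
            if levels[j] <= levels[i]:
                end = j
                break
        if end == i + 1:
            totals.append(None)  # no deeper rows follow: this row is a leaf
        else:
            # Parent rows carry no value (None), so only leaf values are summed.
            totals.append(sum(v for v in values[i + 1:end] if v is not None))
    return totals


# Rows taken from the question's sample table (levels 2..4 under the first level-1 parent).
levels = [2, 3, 3, 2, 3, 4, 4, 4, 3, 3, 3]
values = [None, 19, 91, None, None, 40, 79, 81, 60, 42, 28]
totals = roll_up(levels, values)
print(totals)
```

The three non-empty entries correspond to the three parent rows of that block in the "Manually Calculated Vals" column.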


Is there a way you can determine the minimum value for each independent variable to meet the desired value of a dependent variable in Python?

I am currently working on a decision support system for licensure examination performance using Python, but I have run into a problem. I want to determine which AREA an examinee should concentrate on to be able to pass, or at least get a Rating of 75.
Suppose I have the following dataframe in Python:
Age Sex_M HS_GWA Col_GWA Major Passed_P Rating
21 1 85 90 1 1 85
23 0 87 88 3 1 75
19 0 91 92 2 1 77
20 0 86 85 0 1 80
20 1 76 86 1 0 65
22 1 88 75 2 0 70
I have SUCCESSFULLY implemented the classification (when the target is Passed_P) and regression (target variable is Rating) algorithms of SKLearn in creating prediction models and used them for prediction.
THE CHALLENGE:
What if HS_GWA is now the TARGET variable while Rating is held constant at 75 and becomes one of the independent variables? How can we use the independent variables to determine/forecast the minimum value of the DEPENDENT variable?
What if HS_GWA is now the TARGET variable while Passed_P is held constant at 1 and becomes one of the independent variables? How can we use the independent variables to determine/forecast the minimum value of the DEPENDENT variable?

sort pyspark dataframe within groups

I would like to sort column "time" within each "id" group.
The data looks like:
id time name
132 12 Lucy
132 10 John
132 15 Sam
78 11 Kate
78 7 Julia
78 2 Vivien
245 22 Tom
I would like to get this:
id time name
132 10 John
132 12 Lucy
132 15 Sam
78 2 Vivien
78 7 Julia
78 11 Kate
245 22 Tom
I tried
df.orderBy(['id','time'])
But I don't need to sort "id".
I have two questions:
Can I sort just "time" within the same "id", and how?
Would it be more efficient to sort only "time" than to use orderBy() to sort both columns?
This is exactly what windowing is for.
You can create a window partitioned by the "id" column and sorted by the "time" column. Next you can apply any function on that window.
# Create a Window
from pyspark.sql.window import Window
w = Window.partitionBy(df.id).orderBy(df.time)
Now use this window over any function:
For example, say you want to create a column holding the time delta between each row and the previous row within the same group:
import pyspark.sql.functions as f
df = df.withColumn("timeDelta", df.time - f.lag(df.time,1).over(w))
I hope this gives you an idea. The window defines the order within each group, so any window function you apply sees the rows of one id sorted by time.
If you just want to view the result, you could add the row number within each group and sort by id and then by that number (sorting by the row number alone would interleave the groups).
df.withColumn("order", f.row_number().over(w)).sort("id", "order").show()
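To make the target result concrete without a Spark session, here is a minimal plain-Python sketch of "sort time within each id, keep the ids in their original order". It only illustrates the expected output and is not PySpark code; sort_within_groups is a hypothetical helper name, and the rows are the sample data from the question.

```python
def sort_within_groups(rows):
    """Sort rows by time inside each id group, keeping ids in first-seen order."""
    groups = {}  # dicts preserve insertion order, so ids stay in first-seen order
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    result = []
    for group in groups.values():
        result.extend(sorted(group, key=lambda r: r[1]))  # r[1] is the time column
    return result


rows = [
    (132, 12, "Lucy"), (132, 10, "John"), (132, 15, "Sam"),
    (78, 11, "Kate"), (78, 7, "Julia"), (78, 2, "Vivien"),
    (245, 22, "Tom"),
]
sorted_rows = sort_within_groups(rows)
for row in sorted_rows:
    print(*row)
```

The group order 132, 78, 245 is preserved here only because the input already lists each id contiguously; a distributed dataframe gives no such ordering guarantee unless you sort on it explicitly.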

Ranking Dates Based on Another Column - Spotfire

Does anyone know of a way to circumvent the Spotfire limitation on using the OVER function to RANK or order dates in a custom expression?
As a little background, I am trying to identify or mark each lease in the data below as 1, 2, 3 etc. For example, since 63 appears twice in the left column, I would like to return a 1 and a 2 to identify the two different leases, starting on 1/1/2016 and 8/1/2016. Then a 1 and a 2 for 72, a 1 for 140 and so on. Unfortunately, OVER functions can only be used with aggregation methods, and I don't know of another method to produce the result that I am looking for.
Tenant Lease_From Lease_To Tenant_status
63 1/1/2016 1/31/2017 Current
63 8/1/2017 7/31/2018 Current
72 10/1/2016 7/31/2017 Current
72 8/1/2017 7/31/2018 Current
140 2/1/2017 7/31/2018 Current
149 8/1/2016 7/31/2017 Current
149 8/1/2017 7/31/2018 Current
156 1/15/2017 3/31/2018 Current
156 4/1/2018 3/31/2019 Current
Use this:
Rank([Lease_From], [Tenant])
Gives this as the result:
Tenant Lease_From Lease_To Tenant_status Rank([Lease_From], [Tenant])
63 1/1/2016 1/31/2017 Current 1
63 8/1/2017 7/31/2018 Current 2
72 10/1/2016 7/31/2017 Current 1
72 8/1/2017 7/31/2018 Current 2
140 2/1/2017 7/31/2018 Current 1
149 8/1/2016 7/31/2017 Current 1
149 8/1/2017 7/31/2018 Current 2
156 1/15/2017 3/31/2018 Current 1
156 4/1/2018 3/31/2019 Current 2
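For readers who want to check the numbers outside Spotfire, the per-tenant ranking shown in the table can be sketched in plain Python. This is only an illustration of the idea (rank = 1 + number of earlier lease start dates for the same tenant), not how Spotfire implements Rank(); rank_within_group is a hypothetical name and tie handling is an assumption.

```python
from datetime import datetime


def rank_within_group(rows):
    """Rank each (tenant, lease_from) row by lease start date within its tenant."""
    parsed = [(tenant, datetime.strptime(start, "%m/%d/%Y")) for tenant, start in rows]
    ranks = []
    for tenant, start in parsed:
        # Count leases of the same tenant that start strictly earlier.
        earlier = sum(1 for t, s in parsed if t == tenant and s < start)
        ranks.append(earlier + 1)
    return ranks


# Tenant and Lease_From columns from the table above.
leases = [
    (63, "1/1/2016"), (63, "8/1/2017"),
    (72, "10/1/2016"), (72, "8/1/2017"),
    (140, "2/1/2017"),
    (149, "8/1/2016"), (149, "8/1/2017"),
    (156, "1/15/2017"), (156, "4/1/2018"),
]
ranks = rank_within_group(leases)
print(ranks)
```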
please consider @blakeoft's answer as the correct one!
that said, as an FYI, First() is considered an aggregation method, and OVER statements can be included inside of an If()! so you can accomplish the same thing with an expression like:
If([Lease_From] = First([Lease_From]) OVER ([Tenant]), 1, 2)
when you combine If() and OVER in this way, you can get some really cool and powerful visualizations, BUT you do lose the ability to mark data effectively. this is because the expression is evaluated from the context of the If() rather than the OVER; in other words, all rows are considered instead of only the ones selected.
you can get around this with some black magic (AKA data functions) but it's a bit contrived.
again, in this situation, Rank() is absolutely the correct solution.

How can I align columns where the biggest number or greatest string is the align indicator?

How can I right align (and left align?) a block of numbers or text in vim like this:
from:
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
to this:
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
That means the biggest number or greatest string in every column doesn't move.
In the first column these are 45 and 34, in the second column 209 and 120, in the third column 300 and in the last column 12.
Have a look at the align plugin, it can do this and much more. Great tool in your utility belt!
Found here
After some serious vimhelp/reading I found the correct AlignCtrl mapping...
Visually select the table, e.g. by using ggVG, then do a \Tsp i.e. <leader>Tsp
Then I get this:
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
From vimhelp:
\Tsp : use Align to make a table separated by blanks |alignmap-Tsp|
(right justified)
You can look into the Tabularize plugin. So if you have something like
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
just select those lines in the visual mode and type :Tab/ and it will format it as
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
Also, it looks like you don't have an equal number of spaces separating the numbers at the moment. So before you use the plugin, replace all the multiple spaces with a single space with the following regex:
%s![^ ]\zs \+! !g
With the Align plugin you can select the rows you want to align and hit :
<Leader>Tsp
From Align.txt
\Tsp : use Align to make a table separated by blanks |alignmap-Tsp|
(right justified)
(The help mentions \ because it is the default leader; if you have changed it to something else, adapt accordingly.)
Just trying on my install, I got the following result :
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
In my opinion the Align plugin is great, but the "align maps" and the various commands are not really easy to remember.
With the Align and AlignMaps plugins: select using V, then \anum (AlignMaps comes with Align). One advantage of \anum is that it also handles decimal points (commas) and scientific notation.
I think the best thing to do is to first eat all multiple spaces with
:{range}s/ \+/ /g
And then call Tabularize
:Tab / /r1
Or change that r to l.
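For comparison, the right-justified layout that these plugins produce can be sketched outside Vim in a few lines of Python: split each line on whitespace, find the widest entry per column, and pad every cell on the left to that width. This is only an illustration of the alignment rule, not what Align or Tabularize do internally; right_align is a hypothetical name and the irregular spacing in the sample input is an assumption.

```python
def right_align(lines):
    """Right-justify whitespace-separated columns to the widest entry per column."""
    rows = [line.split() for line in lines]
    ncols = max(len(row) for row in rows)
    # Width of each column = length of its longest cell.
    widths = [max(len(row[i]) for row in rows if i < len(row)) for i in range(ncols)]
    return [
        " ".join(cell.rjust(widths[i]) for i, cell in enumerate(row))
        for row in rows
    ]


# Same numbers as in the question; the uneven spacing is made up for the example.
table = ["45 209   25 1", "2   4 2  3", "34 5  300    5", "34  120 34 12"]
aligned = right_align(table)
print("\n".join(aligned))
```

Swapping rjust for ljust gives the left-aligned variant.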

Excel date/product count to specified limit

Column A "Sales Dates", Column B "=A2-A1" for "Date Diff", Column C "Customer Name", Column D "Item", Column E "Items Ordered Count"
My issue is I have to do a running 30 day total for each customer to see that specific items are not being ordered above "x" number within any 30-day period.
Does anyone have any ideas?
I may not be fully understanding your question, but I don't think you can do what you ask in excel. This might be a situation where a database that can do SQL might come in handy.
The best I can come up with in excel is a Pivot Table, with the customers as rows, dates as columns (group by month), and sum of Items Ordered in the data area. Then conditional format the data area to highlight values > your limit.
Perhaps if you provide some sample data & output I can come up with something more like what you need.
The formula would look something like this:
{=SUM(IF((A$2:A2>=A2-29)*(D$2:D2=D2),E$2:E2,0))}
It should be entered into cell F2 and copied down to the last row of your data. I pasted in a test spreadsheet below so you can see where things go (sorry for the formatting--hopefully it will look better if you paste it into Excel).
IMPORTANT: This is an array formula, so after you type in the formula (and don't type in the braces {} when you do), you must press Ctrl-Shift-Enter instead of just Enter (see this link for more details).
What does the formula do? It does two loops:
First, it loops through all the Sales Dates from the beginning of the log to the current row and checks if each date is between the date of the current row and 29 days earlier (which makes a 30-day window). (By "current row" I mean the row where the formula is located.)
Second, it loops through all the Items from the beginning of the log to the current row and checks if there is a match with the Item of the current row.
For any row where both checks are true (the "*" in the formula does an "and" operation), Items Ordered Count is added to the sum, otherwise zero is added to the sum. So, when it's finished, you have a count for each row of how many orders there were in the past 30 days for that item.
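The two checks described above can be mirrored in a short Python sketch, which may help when verifying the formula's output by hand. It assumes the same rule as the formula (rows from the start of the log up to the current row, same Item, date no more than 29 days before the current row's date) and, like the formula, does not filter by customer; rolling_30_day is a hypothetical name, and the rows are a subset of the test spreadsheet below.

```python
from datetime import datetime, timedelta


def rolling_30_day(rows):
    """rows: (date, item, count) tuples in log order; returns the 30-day total per row."""
    parsed = [(datetime.strptime(d, "%m/%d/%Y"), item, count) for d, item, count in rows]
    totals = []
    for i, (day, item, _) in enumerate(parsed):
        window_start = day - timedelta(days=29)  # 29 days back + today = 30-day window
        totals.append(sum(
            count
            for d, it, count in parsed[:i + 1]  # only rows up to the current one
            if d >= window_start and it == item
        ))
    return totals


# Rows for items 11336 and 10480 taken from the test spreadsheet.
log = [
    ("1/1/2009", 11336, 70),
    ("1/7/2009", 10480, 127),
    ("1/9/2009", 11336, 143),
    ("1/15/2009", 10480, 113),
    ("1/20/2009", 11336, 128),
    ("2/2/2009", 11336, 119),
]
totals = rolling_30_day(log)
print(totals)
```

The last row shows the window at work: by 2/2/2009 the 1/1/2009 order has dropped out, so only the three most recent orders of item 11336 are counted.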
HTH,
-Dan
Sales Dates Date Diff Customer Name Item Items Ordered Count 30-Day Count
1/1/2009 0 dfsadf 11336 70 70
1/2/2009 1 asdfd 10218 121 121
1/3/2009 1 fsdfjkfl 10942 101 101
1/6/2009 3 slkdjflsk 13710 80 80
1/7/2009 1 slkdjls 10480 127 127
1/9/2009 2 sdjjf 11336 143 213
1/11/2009 2 woieuriwe 11501 84 84
1/14/2009 3 owqieyurtn 10191 78 78
1/15/2009 1 weisd 10480 113 240
1/16/2009 1 woieuriwe 12024 133 133
1/17/2009 1 vkcjl 13818 125 125
1/20/2009 3 sdflkj 11336 128 341
1/23/2009 3 jnbkdl 10480 141 381
1/25/2009 2 pqcvnlz 10480 137 518
1/27/2009 2 hwodkjgfh 12878 80 80
1/28/2009 1 zjdnfg;pwlkd 10942 123 224
1/31/2009 3 zlkdjnf;psod 13173 93 93
2/2/2009 2 zlknpdodfg 11336 119 390
2/4/2009 2 zjhdfpwskjh 12004 57 57
2/5/2009 1 asdfd 10218 121 121
2/8/2009 3 fsdfjkfl 10942 101 224
2/11/2009 3 slkdjflsk 13710 80 80
2/14/2009 3 slkdjls 10480 127 405
2/16/2009 2 sdjjf 11336 143 390
2/18/2009 2 woieuriwe 11501 84 84
2/21/2009 3 owqieyurtn 10191 78 78
2/24/2009 3 weisd 10480 113 240
2/25/2009 1 woieuriwe 12024 133 133
2/27/2009 2 vkcjl 13818 125 125
2/28/2009 1 sdflkj 11336 128 390
