Consider that I have a table of salesmen, and I would like to know the average sales value when the current salesman is not counted:
|Salesman|sum(sales)|avg[other](sales)|
|--------|----------|-----------------|
|A | 100 | 50 |
|B | 50 | 66.6 |
|C | 50 | 66.6 |
|D | 50 | 66.6 |
Is that easily possible with set analysis? My real case is a bit more complicated, and I will go through the aggregate function, but I don't know how to limit the set analysis to ignore the current row in the pivot and take all the other rows for the current format.
In reality there are three dimensions by which the result is delimited, and I would like to get the average over two of the dimensions, but with the third dimension being other than the current one.
E.g. imagine the dimensions are Sales_City, Sales_Branch and Salesman; then for each combination of Sales_City, Sales_Branch and Salesman I want to get the average of Sales in the given Sales_City and Sales_Branch, but over all Salesman values other than the Salesman from the current row.
I hope it is at least somewhat understandable what I want to achieve.
Thanks in advance!
I think this can be solved without set analysis. The calculation below will give you the same result:
( sum(total Sales) - sum(Sales) ) / ( count(total Salesman) - 1 )
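The arithmetic behind that expression can be checked outside Qlik. The following is only a minimal Python sketch of the same "total minus own, divided by count minus one" idea, assuming one sales total per salesman; the function name `leave_one_out_averages` and the dictionary layout are made up for illustration and are not part of the original answer.

```python
def leave_one_out_averages(sales_by_salesman):
    """Return, for each salesman, the average sales of all the other salesmen."""
    total = sum(sales_by_salesman.values())   # plays the role of sum(total Sales)
    others = len(sales_by_salesman) - 1       # plays the role of count(total Salesman) - 1
    if others < 1:
        raise ValueError("need at least two salesmen to average the others")
    return {name: (total - own) / others
            for name, own in sales_by_salesman.items()}


# Sample data from the question's table.
sales = {"A": 100, "B": 50, "C": 50, "D": 50}
averages = leave_one_out_averages(sales)
for name, avg in averages.items():
    print(name, round(avg, 2))
```

For salesman A this gives (250 - 100) / 3 = 50, and for B, C and D it gives (250 - 50) / 3, which the question's table shows truncated as 66.6.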
I am calculating the fraction of orders that were not filled from a large list of order lines. I am using the datafusion crate to perform the analysis. I want to build a table that looks like the one shown below:
+--------+--------------+---------------+--------------+
| Month | Total Orders | Missed Orders | Missed Ratio |
+--------+--------------+---------------+--------------+
| 201803 | 10 | 3 | 0.3 |
+--------+--------------+---------------+--------------+
To achieve this I have written the following code:
let result = record_count
    .select(vec![
        col("Month"),
        col("Total Orders"),
        col("Missed Orders"),
        (col("Missed Orders")
            .cast_to(&DataType::Float64, &m_order_schema)
            .unwrap()
            / col("Total Orders")
                .cast_to(&DataType::Float64, &t_order_schema)
                .unwrap())
        .alias("Service Level"),
    ])?;
The Total Orders and Missed Orders columns are integers, so I am casting them to float to get a fraction. But the Service Level column comes out as an integer with all zeros. The result looks as shown below:
+--------+--------------+---------------+--------------+
| Month | Total Orders | Missed Orders | Missed Ratio |
+--------+--------------+---------------+--------------+
| 201803 | 10 | 3 | 0 |
+--------+--------------+---------------+--------------+
Question: How do I perform float operations with integer columns?
I don't think many people are monitoring Stack Overflow for DataFusion issues, and you might get a quicker response by filing an issue at https://github.com/apache/arrow-datafusion/issues
I have two tables (and more coming) of items, each with a unique id.
However, in table 1 (month 1) there might be 5 items.
In table 2 there might be 4 of the 5 previous ones (still with the same id) and 1 other one.
And in table 3 there might be 2 left of those (also still with the same id) and 8 new ones.
What I am trying to do is compare the values for those same IDs, preferably looking at the latest table and only using the current IDs, as the old ones that are no longer in the lists have become irrelevant.
Table 1
+-----+--------+------------------+
| id | name | times sold total |
+-----+--------+------------------+
| 11 | banana | 100 |
| 23 | apple | 0 |
| 66 | bread | 80 |
+-----+--------+------------------+
Table 2
+-----+--------+------------------+
| id | name | times sold total |
+-----+--------+------------------+
| 11 | banana | 192 |
| 23 | apple | 0 |
| 71 | cookie | 30 |
+-----+--------+------------------+
Actually, the most relevant information I am trying to retrieve is:
which item is in the newest table that is ALSO in the first table and has not sold anything since the first table (or has not sold a lot).
I tried using Power Pivot in Excel, but I just cannot get it to work.
IDs always stay the same. All other columns can vary (the name can change as well).
It seems very simple to do, but I just need someone to tell me the steps in Power Pivot.
Or maybe even another tool, as long as I reach my goal.
(I know I could paste every new column of "times sold total" next to the old table and compare with a formula, but the items are not always the same, so I don't think I can easily copy and paste it.)
I hope I explained clearly what I am trying to do.
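Since another tool is acceptable, here is a minimal Python sketch of the comparison described above. It is only an illustration under assumptions: each table is held as a dictionary keyed by id, the helper name `stale_items` and the `max_increase` threshold are invented for this example, and "times sold total" is treated as a running total so that the difference between tables is the number sold in between.

```python
def stale_items(first, latest, max_increase=0):
    """Return (id, name, increase) for IDs that are in both tables and whose
    running total grew by at most max_increase between the two tables."""
    result = []
    for item_id, (name, sold_latest) in latest.items():
        if item_id not in first:
            continue  # new item: nothing in the first table to compare against
        sold_first = first[item_id][1]
        increase = sold_latest - sold_first
        if increase <= max_increase:
            result.append((item_id, name, increase))
    return result


# Sample data from Table 1 and Table 2 in the question: id -> (name, times sold total).
table_1 = {11: ("banana", 100), 23: ("apple", 0), 66: ("bread", 80)}
table_2 = {11: ("banana", 192), 23: ("apple", 0), 71: ("cookie", 30)}

slow_sellers = stale_items(table_1, table_2)
print(slow_sellers)
```

Only the ids drive the match, so a changed name does not matter; the name reported is the one from the newest table. Raising `max_increase` widens the result to items that have "not sold a lot".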
I am trying to create an MDX measure in Excel (in OLAP Tools) that will count how many members there are for every other item in another dimension. As I don't know the exact syntax and notation for MDX and OLAP cubes I will try to simply explain what I want to do:
I have a pivot table based on an OLAP cube. I have a Machine Number field stored in one dimension, which is the "parent", and for every machine number there are a number of articles that were produced (in a certain period of time). Those articles are represented by Order Numbers, which are stored in another dimension. I would like the measure to count how many order numbers there are for every machine number.
So the table looks like this:
+------------------+----------------+
| [Machine Number] | [Order Number] |
+------------------+----------------+
| Machine001 | |
| | 111111111 |
| | 222222222 |
| | 333333333 |
| Machine002 | |
| | 444444444 |
| | 555555555 |
| | 666666666 |
| | 777777777 |
+------------------+----------------+
and I would like the result to be:
+------------------+----------------+------------+
| [Machine Number] | [Order Number] | [Measure1] |
+------------------+----------------+------------+
| Machine001 | | 3 |
| | 111111111 | |
| | 222222222 | |
| | 333333333 | |
| Machine002 | | 4 |
| | 444444444 | |
| | 555555555 | |
| | 666666666 | |
| | 777777777 | |
+------------------+----------------+------------+
I've tried using the COUNT function, with EXISTING as well, but it wouldn't work (it always shows 1, or the same wrong number for every machine). I believe that I have to somehow connect those two dimensions so that the Order Number is dependent on the Machine Number, but lacking knowledge about MDX and OLAP cubes I don't even know how to ask Google how to do that.
Thanks in advance for any tips and solutions.
Your problem basically is that you have two attributes in different dimensions. You want to retrieve the valid combinations of these attributes, and further you want to count the number of attribute values available in the second attribute based on the value of the first attribute.
Based on the above problem statement: in an OLAP cube, a fact table or a measure defines the relations between attributes of the different dimensions linked to that measure/fact table. Take a look at the example below. (I have used the SSAS sample database Adventure Works.)
-- I am trying to find the promotions that were offered for each product category.
select
[Measures].[Internet Sales Amount]
on columns,
([Product].[Category].[Category],[Promotion].[Promotion].[Promotion])
on rows
from
[Adventure Works]
Result
The result is the cross-product of all the product categories and the promotions. Now let's make the cube return the valid combinations only.
select
[Measures].[Internet Sales Amount]
on columns,
nonempty(
([Product].[Category].[Category],[Promotion].[Promotion].[Promotion])
,[Measures].[Internet Sales Amount])
on rows
from
[Adventure Works]
Result
Now we have indicated that it needs to return only valid combinations. Note that we provided a measure that belongs to the fact table connecting the two dimensions. Now let's count them.
with member
[Measures].[test]
as
count(
nonempty(([Product].[Category].currentmember,[Promotion].[Promotion].[Promotion]),[Measures].[Internet Sales Amount])
)
select
[Measures].[Test]
on columns,
[Product].[Category].[Category]
on rows
from
[Adventure Works]
Result
Alternate query
with member
[Measures].[test]
as
{nonempty(([Product].[Category].currentmember,[Promotion].[Promotion].[Promotion]),[Measures].[Internet Sales Amount]) }.count
select
[Measures].[Test]
on columns,
[Product].[Category].[Category]
on rows
from
[Adventure Works]
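To see what the count over nonempty combinations amounts to, here is a small Python sketch of the same logic applied to the question's machines and orders. It is a conceptual illustration only, not MDX: the fact rows, the quantity values and the function name `count_orders_per_machine` are all hypothetical, and a `None` quantity stands in for a combination that has no fact data.

```python
from collections import defaultdict


def count_orders_per_machine(fact_rows):
    """Count distinct order numbers per machine, ignoring rows whose measure is empty."""
    orders = defaultdict(set)
    for machine, order, quantity in fact_rows:
        if quantity is None:
            continue  # like nonempty(): no measure value means no valid combination
        orders[machine].add(order)
    return {machine: len(order_set) for machine, order_set in orders.items()}


# Hypothetical fact rows: (machine number, order number, produced quantity).
fact_rows = [
    ("Machine001", "111111111", 10),
    ("Machine001", "222222222", 5),
    ("Machine001", "333333333", 7),
    ("Machine001", "444444444", None),  # no fact data, so not a valid combination
    ("Machine002", "444444444", 3),
    ("Machine002", "555555555", 8),
    ("Machine002", "666666666", 2),
    ("Machine002", "777777777", 4),
]

counts = count_orders_per_machine(fact_rows)
print(counts)
```

The point mirrored from the queries above is that the two dimensions are only related through the fact rows, so the count has to be taken over combinations that actually carry a measure value.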
First of all, thank you in advance.
The problem I am facing is that I have two different values I need to combine when I look up against a different table. However, I do not know which columns those two values will be in, and they can be different per row. Hopefully, the example will help.
Lookup table
ID  | Benefit | Option | Tier | Benefit | Option | Tier
123 | 1       | 1      | 3    | 2       | 7      | 3
456 | 2       | 3      | 1    | 1       | 3      | 2
current table
ID | Benefit |
123 | 1
123 | 2
456 | 1
456 | 2
In the example I am giving there are only two possible positions it can be in, but in my actual case it could be in maybe 20 different locations. The one positive is that it will always be under a Benefit column, so what I was thinking is to concatenate "Benefit" & O4 and use INDEX/MATCH. I would like to concatenate dynamically based on the row my lookup is on.
Here is what I have so far, but it's not working:
=INDEX(T3:X4,MATCH(N4,$S$3:$S$4,0),MATCH($O$3&O4,T2:X2&ADDRESS(ROW(INDEX($S$3:$S$4,MATCH(N4,$S$3:$S$4,0))),20):ADDRESS(ROW(INDEX($S$3:$S$4,MATCH(N4,$S$3:$S$4,0))),24),0))
where
ADDRESS(ROW(INDEX($S$3:$S$4,MATCH(N4,$S$3:$S$4,0))),20) does return T3
and ADDRESS(ROW(INDEX($S$3:$S$4,MATCH(N4,$S$3:$S$4,0))),24) returns X3,
so I was hoping it would combine Benefit & 1 and see that it is a match on T3.
I guess you are trying to find a formula to put in P4 to P7?
=INDEX($S$2:$X$4,MATCH(N4,$S$2:$S$4,0),SUMPRODUCT(($S$2:$X$2="wtwben")*(OFFSET($S$2:$X$2,MATCH(N4,$S$3:$S$4,0),0)=O4)*(COLUMN($S$2:$X$2)-COLUMN($S$2)+1))+1)
If the values to return are always numeric and there is only one match for each ID/Benefit combination (as it appears in your sample) then you can get the Option value with this formula in P4 copied down
=SUMPRODUCT((S$3:S$4=N4)*(T$2:W$2="Benefit")*(T$3:W$4=O4),U$3:X$4)
[assumes the headers are per the first table shown in your question, i.e. where T2 value is "Benefit"]
Notice how the ranges change
Or, to return text values, or if the ID/Benefit combination repeats, this will give you the "first" match, where "first" means by row:
=INDIRECT(TEXT(AGGREGATE(15,6,(ROW(U$3:X$4)*1000+COLUMN(U$3:X$4))/(S$3:S$4=N4)/(T$2:W$2="Benefit")/(T$3:W$4=O4),1),"R0C000"),FALSE)
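The lookup these formulas perform can be sketched in Python for comparison. This is a minimal illustration under assumptions: each lookup row is a flat list of repeating (Benefit, Option, Tier) triples as in the question's sample, and the names `find_option`, `lookup` and `current` are invented for the example.

```python
def find_option(lookup_row, benefit):
    """Scan the repeating (Benefit, Option, Tier) triples of one lookup row and
    return the Option that sits next to the first matching Benefit."""
    for start in range(0, len(lookup_row), 3):
        row_benefit, option, _tier = lookup_row[start:start + 3]
        if row_benefit == benefit:
            return option
    return None  # this ID has no such benefit


# Sample data from the question: ID -> values under Benefit, Option, Tier, Benefit, Option, Tier.
lookup = {
    123: [1, 1, 3, 2, 7, 3],
    456: [2, 3, 1, 1, 3, 2],
}
current = [(123, 1), (123, 2), (456, 1), (456, 2)]

options = [find_option(lookup[row_id], benefit) for row_id, benefit in current]
print(options)
```

As in the formulas, the match is made on the row for the ID first and then on whichever Benefit column holds the wanted value, so the number of triples per row can grow to 20 or more without changing the logic.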
Maybe I'm going about it all wrong so I'm open to all suggestions.
I am trying to assign a grade (A-F) to a row based on a percentage. The criteria are as follows:
A = 0
B = +/-2
C = +/-15 (between 2 and 15 %)
D = +/-25 (between 15 and 25 %)
F = +/-26 or more
| **Percent Remaining** | **Grade** |
| :-------------------: | :-------: |
| 0.00% | A |
| -1.77% | B |
| 5.5% | C |
| -18.53% | D |
| 27.4% | F |
These are percentages of budgets spent, so the criteria need to apply to both positive and negative values. For example, overspending by 1.77 percent would return a value of -1.77%, and a "B" grade would need to be assigned; underspending returns a positive number, which should yield the same result. I don't know why the markdown tips are not working for me for tables.
Thanks in advance
Create a lookup table with the absolute thresholds and the corresponding results.
Then use VLOOKUP on the absolute value:
=VLOOKUP(ABS(A1),C:D,2,TRUE)
Or you can use a hard coded INDEX/MATCH:
=INDEX({"A","B","C","D","F"},MATCH(ABS(A1),{0,.0001,.0201,.1501,.2501}))
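For readers who want to check the threshold logic outside Excel, here is a minimal Python sketch of the same approximate-match lookup. It reuses the threshold array from the INDEX/MATCH formula; the names `THRESHOLDS`, `GRADES` and `grade` are chosen for this example, and percentages are assumed to be stored as fractions (so -1.77% is -0.0177), as Excel stores them.

```python
from bisect import bisect_right

# Lower bound of each grade band on the absolute value, as in the INDEX/MATCH formula.
THRESHOLDS = [0, 0.0001, 0.0201, 0.1501, 0.2501]
GRADES = ["A", "B", "C", "D", "F"]


def grade(percent_remaining):
    """Return the grade of the last threshold that is <= abs(percent_remaining)."""
    position = bisect_right(THRESHOLDS, abs(percent_remaining)) - 1
    return GRADES[position]


# Sample values from the question's table, written as fractions.
for value in (0.0, -0.0177, 0.055, -0.1853, 0.274):
    print(value, grade(value))
```

Taking the absolute value first is what lets overspending and underspending share one set of thresholds, and picking the last threshold not exceeding the value is the same behaviour as MATCH with its default match type or VLOOKUP with TRUE.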