Hi, I have data in SharePoint like this:
Title ID
S1 101
S1 102
S2 103
S3 104
S3 105
Now I have to create a pie chart in PowerApps showing the percentage of each title.
So I am creating a new column using AddColumns(GroupBy(dataSource, "Title", "Grouped"), "Titles", CountRows(Grouped)).
Now I want to add another column that holds, for each distinct title, its percentage of the total number of IDs.
How can I do that?
The AddColumns function can add multiple columns at once. For example, the expression below can be used to add a percentage in addition to the number of titles in the grouping that you have.
With(
{ totalCount: CountRows(dataSource) },
AddColumns(
GroupBy(dataSource, "Title", "Grouped"),
"Titles", CountRows(Grouped),
"TitlePercent", 100.0 * CountRows(Grouped) / totalCount))
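To make the arithmetic behind the PowerApps expression concrete, here is a minimal Python sketch of the same calculation on the sample rows from the question. The function name title_percentages is only illustrative; this is not Power Fx, just the same count-and-divide logic.

```python
from collections import Counter

# Sample rows copied from the question: (Title, ID)
rows = [("S1", 101), ("S1", 102), ("S2", 103), ("S3", 104), ("S3", 105)]

def title_percentages(rows):
    """Return {title: (count, percent_of_all_rows)}."""
    total_count = len(rows)                        # plays the role of CountRows(dataSource)
    counts = Counter(title for title, _ in rows)   # plays the role of CountRows(Grouped)
    return {t: (n, 100.0 * n / total_count) for t, n in counts.items()}

result = title_percentages(rows)
for title, (count, percent) in result.items():
    print(title, count, percent)
```

With the five sample rows, S1 and S3 each account for 40% and S2 for 20%, which is what the pie chart should display.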
I have two different data frames pertaining to sales analytics. I would like to merge them together to make a new data frame with the columns customer_id, name, and total_spend. The two data frames are as follows:
import pandas as pd
import numpy as np
customers = pd.DataFrame([[100, 'Prometheus Barwis', 'prometheus.barwis@me.com',
'(533) 072-2779'],[101, 'Alain Hennesey', 'alain.hennesey@facebook.com',
'(942) 208-8460'],[102, 'Chao Peachy', 'chao.peachy@me.com',
'(510) 121-0098'],[103, 'Somtochukwu Mouritsen',
'somtochukwu.mouritsen@me.com','(669) 504-8080'],[104,
'Elisabeth Berry', 'elisabeth.berry@facebook.com','(802) 973-8267']],
columns = ['customer_id', 'name', 'email', 'phone'])
orders = pd.DataFrame([[1000, 100, 144.82], [1001, 100, 140.93],
[1002, 102, 104.26], [1003, 100, 194.6 ], [1004, 100, 307.72],
[1005, 101, 36.69], [1006, 104, 39.59], [1007, 104, 430.94],
[1008, 103, 31.4 ], [1009, 104, 180.69], [1010, 102, 383.35],
[1011, 101, 256.2 ], [1012, 103, 930.56], [1013, 100, 423.77],
[1014, 101, 309.53], [1015, 102, 299.19]],
columns = ['order_id', 'customer_id', 'order_total'])
When I group by customer_id and order_id I get the following table:
customer_id order_id order_total
100 1000 144.82
1001 140.93
1003 194.60
1004 307.72
1013 423.77
101 1005 36.69
1011 256.20
1014 309.53
102 1002 104.26
1010 383.35
1015 299.19
103 1008 31.40
1012 930.56
104 1006 39.59
1007 430.94
1009 180.69
This is where I get stuck. I do not know how to sum up all of the orders for each customer_id in order to make a total_spent column. If anyone knows of a way to do this it would be much appreciated!
IIUC, you can do something like below
orders.groupby('customer_id')['order_total'].sum().reset_index(name='Customer_Total')
Output
customer_id Customer_Total
0 100 1211.84
1 101 602.42
2 102 786.80
3 103 961.96
4 104 651.22
You can create an additional table then merge back to your current output.
# group by customer id and order id to match your current output
df = orders.groupby(['customer_id', 'order_id']).sum()
# create a new lookup table called total by customer
totalbycust = orders.groupby('customer_id').sum()
totalbycust = totalbycust.reset_index()
# only keep the columns you want
totalbycust = totalbycust[['customer_id', 'order_total']]
# merge back to your current table
df = df.merge(totalbycust, left_on='customer_id', right_on='customer_id')
df = df.rename(columns = {"order_total_x": "order_total", "order_total_y": "order_amount_by_cust"})
# expected output
df
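As a possible alternative to building a lookup table and merging it back, groupby(...).transform("sum") returns one value per original row, so the per-customer total can be assigned directly as a new column. This is a sketch on a few rows from the question's orders table; the column name order_amount_by_cust is reused from the answer above.

```python
import pandas as pd

# A few order rows taken from the question's orders table
orders = pd.DataFrame(
    [[1000, 100, 144.82], [1001, 100, 140.93], [1002, 102, 104.26], [1005, 101, 36.69]],
    columns=["order_id", "customer_id", "order_total"],
)

# transform("sum") keeps the original row count and index,
# repeating each customer's total on every one of that customer's rows
orders["order_amount_by_cust"] = (
    orders.groupby("customer_id")["order_total"].transform("sum")
)
print(orders)
```

Every order row is kept, and both orders of customer 100 carry the same customer total.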
df_merge = customers.merge(orders, how='left', left_on='customer_id', right_on='customer_id').filter(['customer_id','name','order_total'])
df_merge = df_merge.groupby(['customer_id','name']).sum()
df_merge = df_merge.rename(columns={'order_total':'total_spend'})
df_merge.sort_values(['total_spend'], ascending=False)
Results in:
total_spend
customer_id name
100 Prometheus Barwis 1211.84
103 Somtochukwu Mouritsen 961.96
102 Chao Peachy 786.80
104 Elisabeth Berry 651.22
101 Alain Hennesey 602.42
A step-by-step explanation:
Start by merging your orders table onto your customers table using a left join. For this you will need pandas' .merge() method. Be sure to set the how argument to left because the default merge type is inner (which would ignore customers with no orders).
This step requires some basic understanding of SQL-style merge methods. You can find a good visual overview of the various merge types in this thread.
You can append your merge with the .filter() method to only keep your columns of interest (in your case: customer_id, name and order_total).
Now that you have your merged table, you still need to sum up all the order_total values per customer. To achieve this, group by the columns that identify a customer (customer_id and name) using .groupby() and then apply an aggregation method to the remaining numeric column (.sum() in this case).
The .groupby() documentation link above provides some more examples on this. It is also worth knowing that this is a pattern referred to as "split-apply-combine" in the pandas documentation.
Next you will need to rename your numeric column from order_total to total_spend using the .rename() method and setting its columns argument.
And last, but not least, sort your customers by your total_spend column using .sort_values().
I hope that helps.
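To illustrate why the how argument matters in the first step, here is a small sketch comparing a left and an inner merge. Customer 105 ("No Orders Yet") is a hypothetical row added only for this illustration, and total_spend is just a helper name; the other values come from the question.

```python
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [100, 101, 105],
    # customer 105 is hypothetical: it has no matching rows in orders
    "name": ["Prometheus Barwis", "Alain Hennesey", "No Orders Yet"],
})
orders = pd.DataFrame({
    "order_id": [1000, 1001, 1005],
    "customer_id": [100, 100, 101],
    "order_total": [144.82, 140.93, 36.69],
})

def total_spend(how):
    """Merge, then sum order_total per customer, for the given join type."""
    merged = customers.merge(orders, how=how, on="customer_id")
    return (merged.groupby(["customer_id", "name"])["order_total"]
                  .sum()
                  .rename("total_spend")
                  .reset_index())

left = total_spend("left")    # keeps customer 105
inner = total_spend("inner")  # drops customer 105
print(left)
print(inner)
```

With the left join, the customer without orders stays in the result with a total_spend of 0.0, because .sum() skips the missing value. With the inner join that customer does not appear at all.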
I have:
order amount
105 2€
105 4€
105 5.50€
108 1€
108 1€
124 25€
Using Excel Power Query, I want to create the column "total-order". The desired result is:
order amount total-order
105 2€ 11.50€
105 4€ 11.50€
105 5.50€ 11.50€
108 1€ 2.00€
108 1€ 2.00€
124 25€ 25.00€
The "total-order" column is the sum of the "amount" values of all rows with the same "order".
I want to keep all rows, so the following result is not what I need:
order total-order
105 11.50€
108 2.00€
124 25.00€
I already know how to produce this last result using the "Group By" option.
Thank you!
Full code, assuming the data is in a table named Table1:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "total-order", (i) => List.Sum(Table.SelectRows(Source, each ([order] = i[order]))[amount]), type number )
in #"Added Custom"
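The M expression above re-scans the whole table for every row to find the rows with the same order. The logic can be sketched in Python with one pre-pass that accumulates a total per order, followed by a second pass that attaches it to each row. This is only an illustration of the idea, not Power Query code; it assumes the amounts are already numeric (no € suffix), and add_total_order is a made-up name.

```python
from collections import defaultdict

# Rows from the question as (order, amount); amounts assumed to be plain numbers
rows = [(105, 2.0), (105, 4.0), (105, 5.5), (108, 1.0), (108, 1.0), (124, 25.0)]

def add_total_order(rows):
    """Return (order, amount, total_order) for every input row, keeping all rows."""
    totals = defaultdict(float)
    for order, amount in rows:          # first pass: sum per order
        totals[order] += amount
    # second pass: attach the matching total to each original row
    return [(order, amount, totals[order]) for order, amount in rows]

for row in add_total_order(rows):
    print(row)
```

All six rows are kept, and each carries the total of its own order (11.5, 2.0 or 25.0).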
You can do this with the PowerQuery UI. Begin with your data:
Define your range as a new table.
PowerQuery -> From Table/Range
Select your data.
Give the query a name -- let's say, SourceData.
Home -> Close & Load
Create a new query that produces the groups and totals.
Select the original data, including the headers.
Power Query -> From Table/Range
Group By as follows:
Merge the source data into the grouping query.
Merge Queries
Select SourceData as the second table from the dropdown.
Choose Right Outer from the Join Kind dropdown.
Select the order columns in both tables.
Currently, you'll still have the grouped rows. But if you click on the Table entry in the SourceData column, you'll see the original records that are related to this row.
Now, expand each row in the related data to a row in the main data (the groups).
Click on the button on the right of the header of the SourceData column:
Hide the order column:
and press OK.
The final results:
Use the formula below.
=SUMIF(A:A,A2,B:B)
Column A holds the order and column B the amount, with headers in row 1. Enter the formula in C2 and fill it down.
I have the following:
Each table represents a game (in this case of CS:GO).
What I want to do is get the sum of all kills, by all players, for each map, like:
Train: 208
Mirage: 103
I'm having some trouble with discriminating for each map. I can either do this in Google Sheets or in Excel.
=QUERY(QUERY(ARRAYFORMULA(SPLIT(TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN(" ", 1, B:B),
"Map", "♦"), "♦")), " ")),
"select Col1, Col3+Col4+Col5+Col6+Col7"),
"select Col1, sum(Col2) group by Col1 label sum(Col2)''")
Table A: Product Attributes
This table contains two columns; the first one is a unique product ID represented by an integer, the second is a string containing a collection of attributes assigned to that product.
product tags
100 chocolate, sprinkles
101 chocolate, sprinkles
102 glazed
Table B: Customer Attributes
The second table contains two columns as well; the first one is a string that contains a customer name, the second is an integer that contains a product number. The product IDs from column two are the same as the product IDs from column one of Table A.
customer product
A 100
A 101
B 101
C 100
C 102
B 101
A 100
C 102
Generated Table
I want to create a table matching this format, where the contents of the cells represent the count of occurrences of product attribute by customer.
customer chocolate sprinkles glazed
A ? ? ?
B ? ? ?
C ? ? ?
I want count instead of ?.
And I want to do this in python.
One more question: If the two starting tables were in a relational database or Hadoop cluster and each had 100 million rows, how might my approach change?
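One possible way to build the generated table in plain Python is to split each product's tag string and count tags per customer. This is a sketch using the sample data above; the names product_tags, purchases and tag_counts_by_customer are illustrative only.

```python
from collections import Counter, defaultdict

# Table A: product -> comma-separated tag string
product_tags = {100: "chocolate, sprinkles", 101: "chocolate, sprinkles", 102: "glazed"}
# Table B: (customer, product) rows, in the order given in the question
purchases = [("A", 100), ("A", 101), ("B", 101), ("C", 100),
             ("C", 102), ("B", 101), ("A", 100), ("C", 102)]

def tag_counts_by_customer(product_tags, purchases):
    """Count how often each tag occurs among each customer's products."""
    counts = defaultdict(Counter)
    for customer, product in purchases:
        for tag in product_tags[product].split(","):
            counts[customer][tag.strip()] += 1
    return counts

counts = tag_counts_by_customer(product_tags, purchases)
tags = ["chocolate", "sprinkles", "glazed"]
print("customer", *tags)
for customer in sorted(counts):
    print(customer, *(counts[customer][t] for t in tags))
```

For the sample data this gives A: 3, 3, 0; B: 2, 2, 0; C: 1, 1, 2. For tables with 100 million rows, the same split-then-count shape would more likely be expressed as a join plus a GROUP BY inside the database or cluster rather than in local memory, but that is a suggestion, not something the sample data can confirm.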
I have a PowerPivot Data Model in Excel 2013. There are several measures that I have grouped into a named set using MDX - something like this:
{[Measures].[Sum of Value1],
[Measures].[Sum of Value2],
[Measures].[Sum of Value3]}
By using this named set, I can place multiple measures on the rows or columns of an Excel PivotTable in a single action. My question is: is there any way, using MDX (or DAX in the PowerPivot screen when working with the individual measures), to filter out or hide the entire set based on a single measure value (whether that measure is included in the set or not)? Preferably, I'm looking for a way to do this without including another member in the set (i.e. not a measure).
For example, if the Sum of Value3 in the above example was zero, I'd want the entire set to be hidden from the pivot table.
I know I could edit the DAX in the Data Model to return BLANK() for each measure included in the set based on the value of another measure, but there may be times I want to show those measures in all cases. This would require writing at least 2 measures for every one I have now which I don't like the thought of doing.
UPDATE:
Sourav's answer looks great, but unfortunately won't work in my particular scenario, I believe, because I'm using the "Create Set using MDX" function (under the Manage Sets option in the Fields, Items, & Sets ribbon menu) within Excel. It will only let me write the MDX as:
IIF([Measures].[Sum of Value3]=0,
{},
{[Measures].[Sum of Value1],[Measures].[Sum of Value2],[Measures].[Sum of Value3]})
And once I add that new set to the PivotTable, it will still display all 3 measures for any members where [Sum of Value3] is 0.
I think I'm going to have to find an approach using DAX and the Excel Data Model measures.
UPDATE 2:
Below is a screenshot to help illustrate. Keep in mind the data source in my example is not an external cube, it's simply an Excel file linked in the Data Model against which MDX queries (with limitations?) can be run. In this example, I would like the set to return only Rows A and C because Sum of Value3 is not zero. However, as you can see, all rows are being returned. Thanks!
You can't choose to hide/unhide members/sets on the fly. Instead, you can use IIF to conditionally return an empty set
WITH SET MyNamedSet AS
IIF([Measures].[Sum of Value3] = 0,
{},
{[Measures].[Sum of Value1],[Measures].[Sum of Value2], [Measures].[Sum of Value3]}
)
Working example in AdventureWorks for @whytheq (DISCLAIMER: the cube was created by me for testing purposes)
with set abc as
iif([Measures].[Fact Internet Sales Count]>34229,
{
[Measures].[Fact Internet Sales Count],
[Measures].[Extended Amount - Fact Internet Sales]
},
{}
)
SELECT
abc
on 0
from [AdventureWorksDW]
where [Due Date].[Year].&[2004]
As you can see, the scope IS changing the results.
An alternative would be to create a dummy measure that returns null or 1 depending on your [Measures].[Sum of Value3]. Then multiply all other target measures by this dummy measure.
Here is an example of your scenario in AdvWrks:
SELECT
[Product].[Product Categories].[Category].[Components] ON 0
,{
[Measures].[Internet Sales Amount]
,[Measures].[Sales Amount]
,[Measures].[Standard Product Cost]
,[Measures].[Total Product Cost]
} ON 1
FROM [Adventure Works];
Returns this:
Adding the dummy measure and amending the other measures:
WITH
MEMBER [Measures].[isItZero] AS
IIF
(
[Measures].[Internet Sales Amount] = 0
,null
,1
)
MEMBER [Measures].[Sales Amount NEW] AS
[Measures].[Sales Amount] * [Measures].[isItZero]
MEMBER [Measures].[Standard Product Cost NEW] AS
[Measures].[Standard Product Cost] * [Measures].[isItZero]
MEMBER [Measures].[Total Product Cost NEW] AS
[Measures].[Total Product Cost] * [Measures].[isItZero]
SELECT
NON EMPTY //<<<<this is required
{
[Measures].[Internet Sales Amount]
,[Measures].[Sales Amount NEW]
,[Measures].[Standard Product Cost NEW]
,[Measures].[Total Product Cost NEW]
} ON 0
,{} ON 1
FROM [Adventure Works]
WHERE
[Product].[Product Categories].[Category].[Components];
Now this returns:
EDIT
According to your latest edit please just try this (I'm assuming you're using Excel 2013):
Create two new measures to replace two of the existing ones:
Name: "Sum of Value1 NEW"
Definition:
IIF
(
[Measures].[Sum of Value3] = 0
,null
,[Measures].[Sum of Value1]
)
Name: "Sum of Value2 NEW"
Definition:
IIF
(
[Measures].[Sum of Value3] = 0
,null
,[Measures].[Sum of Value2]
)
Now use only these three measures in your pivot and just use the ID dimension in a normal way on rows i.e. do not use the custom set you have already tried.
[Measures].[Sum of Value1 NEW]
[Measures].[Sum of Value2 NEW]
[Measures].[Sum of Value3]
Has ID B now disappeared?
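The idea behind the replacement measures can be sketched in Python: blank out the dependent measures whenever the gate measure is zero, then drop any row whose displayed cells are all empty. The row values below are hypothetical (the screenshot's numbers are not available), and the sketch assumes that an empty cell compares as 0 in the gate test, as it would in an MDX numeric comparison.

```python
GATE = "Sum of Value3"

def is_zero(value):
    # Assumption: an empty cell (None) is treated like 0 in the comparison
    return value is None or value == 0

def masked_row(measures):
    """Blank every measure except the gate when the gate is zero or empty."""
    blank = is_zero(measures[GATE])
    return {name: (value if name == GATE or not blank else None)
            for name, value in measures.items()}

def non_empty(table):
    """Keep a row only if at least one displayed cell is not empty."""
    masked = {row_id: masked_row(m) for row_id, m in table.items()}
    return {row_id: m for row_id, m in masked.items()
            if any(v is not None for v in m.values())}

# Hypothetical values for IDs A, B and C; B has no value at all for the gate
table = {
    "A": {"Sum of Value1": 10, "Sum of Value2": 20, "Sum of Value3": 5},
    "B": {"Sum of Value1": 7, "Sum of Value2": 3, "Sum of Value3": None},
    "C": {"Sum of Value1": 1, "Sum of Value2": 2, "Sum of Value3": 4},
}
visible = non_empty(table)
print(sorted(visible))
```

In this sketch, row B only disappears because its gate cell is empty. If Sum of Value3 were a literal 0 for B, that cell would still count as non-empty and the row would stay visible, which is worth checking in the actual pivot.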