I am working with Excel 2010, Power Query, and PowerPivot.
I have a query named Database that consists of 60+ merged tables containing a total of 2M+ rows. I also have a separate query that consists of two columns, PrimaryKey3 and Members (a count of members per month). The entries in PrimaryKey3 are unique and take the form ID-MMM-YY.
Both queries have PrimaryKey3 in common; however, in Database there can be multiple rows with the same PrimaryKey3.
In order to match a member count to each row in Database, I tried a Left Outer join. There were no errors, but when I try to load the result to PowerPivot it says there are only 169K rows. I then tried a Full Outer join and an Inner join, and received the error "could not convert value to number," coming from a column already formatted as text in Database. This column contains plain numbers and numbers preceded by a letter: 1234, A234. Every non-blank row has a PrimaryKey3. Why is it trying to reformat my columns, and how do I get around that?
Should I be using a different type of join, or is there another way besides merging to do this?
Hope this makes sense, thank you for any help in advance!
I uploaded both queries to PowerPivot, and created a relationship through PrimaryKey3. I then created a new column in Database with =Related(Enrollment[Members]).
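The many-to-one lookup that this relationship provides can be pictured outside of Excel. The following is a minimal Python sketch, not PowerPivot itself; the sample keys, the Code column, and the helper name add_members are hypothetical. It shows that looking up Members by PrimaryKey3 keeps every row of Database, including repeated keys and keys with no match.

```python
# Hypothetical sample data: Enrollment has one row per PrimaryKey3,
# while Database may repeat the same key on several rows.
enrollment = {"101-Jan-14": 250, "101-Feb-14": 260}

database = [
    {"PrimaryKey3": "101-Jan-14", "Code": "1234"},
    {"PrimaryKey3": "101-Jan-14", "Code": "A234"},
    {"PrimaryKey3": "101-Feb-14", "Code": "5678"},
    {"PrimaryKey3": "102-Jan-14", "Code": "9999"},  # no enrollment row
]


def add_members(rows, members_by_key):
    """Return copies of rows with a Members value looked up by PrimaryKey3."""
    return [
        dict(row, Members=members_by_key.get(row["PrimaryKey3"]))
        for row in rows
    ]


with_members = add_members(database, enrollment)
for row in with_members:
    print(row)
```

The text column (Code) is carried through untouched, so values such as A234 never need to be converted to numbers.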
How to get data from 2 columns, in different tables within Excel Power Query, to sum into 1 column?
For example: Table names - Main_Company_HR_Data and Company1_HR_Data -- Fields I want to combine: Main_Company_HR_Data.Gross Company1_HR_Data.Gross.
Within the Data Model, I have established the connection to my data warehouse and have linked the tables in Power Query as well. I added a column and attempted several suggestions from multiple sites. None work. I have been using Calculate(SUMX(Main_Company_HR_Data.Gross))+ SUMX(Company1_HR_Data.Gross) and other iterations of the sort, and none of them work.
I had this issue, and the way I tackle it is to make it as simple as possible.
Try this:
Enter this code as a formula, adding a column, in the tab which will have the field summed.
=LOOKUPVALUE('Company1_HR_Data'[Gross],'Company1_HR_Data'[Unique_ID],'Main_Company_HR_Data'[Unique_ID])
Create a column to sum the two in the same tab you created the above formula.
='Main_Company_HR_Data'[Gross]+'Company1_HR_Data'[Gross]
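The two steps (look up the other table's Gross by Unique_ID, then add it to the row's own Gross) can be sketched in plain Python. This is only an illustration of the logic, not DAX; the sample values and the column names Company1_Gross and Total_Gross are hypothetical, and treating a missing match as zero is an assumption of the sketch.

```python
# Hypothetical sample rows for the two tables.
main_rows = [
    {"Unique_ID": 1, "Gross": 1000.0},
    {"Unique_ID": 2, "Gross": 2000.0},
    {"Unique_ID": 3, "Gross": 500.0},  # no matching row in Company1
]
company1_rows = [
    {"Unique_ID": 1, "Gross": 150.0},
    {"Unique_ID": 2, "Gross": 250.0},
]

# Step 1: build a lookup from Unique_ID to the Company1 gross amount.
company1_gross = {row["Unique_ID"]: row["Gross"] for row in company1_rows}


def combined_gross(rows, lookup):
    """Add the looked-up gross and a row-level total to each main row."""
    result = []
    for row in rows:
        other = lookup.get(row["Unique_ID"])
        # Step 2: sum the two amounts (assumption: no match counts as zero).
        total = row["Gross"] + (other if other is not None else 0.0)
        result.append({**row, "Company1_Gross": other, "Total_Gross": total})
    return result


totals = combined_gross(main_rows, company1_gross)
```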
I have mismatched data lines in Power Query, so I am attempting to renumber/reorganize the data and then merge the information to realign it.
Here, I want the data in Column Answer 2 to go into column Answer, cells 6,7,11,12.
I've indexed each of my files and merged the queries. However, when I expand the merged queries, PQ seems to randomize my data.
I'm new to PQ so I don't really write the 'code', just use the user interface.
As you can see from the second image, the data comes out in the wrong order.
I merged two tables, then added an index column and moved it to the beginning, then expanded the merged table and deleted the index column. The order of the rows remained the same as in the source table.
I have 2 big tables (1 has 690K Rows, 2nd one has 890K rows).
They have the same format and columns:
Username - Points - Bonuses - COLUMN D... COLUMN - K.
Let's say the first table holds the "Original" usernames and the second table holds the "New" usernames plus some of the "Original" usernames (so people who are still playing plus people who are new to the game).
What I'm trying to do is merge them so that I have a single table with their values summed up.
I've already made my tables proper System Tables.
I created their connection in the workbook.
I've tried to merge them, but I keep getting fewer rows than I expect, so some records are being left out or not being summed.
I've tried Left Outer, Right Outer, and Full Outer with no success.
This is where I stand now:
As #Jenn said, I had to append the tables instead of merging them, and I also used a filter inside Power Query to remove all blanks/zeros before loading into Excel. I was left with 500K unique rows instead of 1.6 million. Thanks for the comment!
I would append the tables, as indicated above. First load each table separately into Power Query, and then append one table to the other. The column names look a little long, and it may make sense to simplify them so that an inadvertent typo doesn't cause the system to read them as different columns.
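The difference from a merge is that an append stacks the rows of both tables, after which rows sharing a username can be grouped and summed. A minimal Python sketch of that idea, assuming hypothetical sample rows and a hypothetical helper named append_and_sum (only the Username, Points, and Bonuses columns from the question are used):

```python
from collections import defaultdict

# Hypothetical sample rows for the "Original" and "New" tables.
original = [
    {"Username": "alice", "Points": 10, "Bonuses": 1},
    {"Username": "bob", "Points": 5, "Bonuses": 0},
]
new = [
    {"Username": "alice", "Points": 7, "Bonuses": 2},
    {"Username": "carol", "Points": 3, "Bonuses": 4},
]


def append_and_sum(tables, key, value_columns):
    """Stack all tables, then sum the value columns per key."""
    totals = defaultdict(lambda: dict.fromkeys(value_columns, 0))
    for table in tables:          # append: walk every row of every table
        for row in table:
            for column in value_columns:
                totals[row[key]][column] += row[column]
    return dict(totals)


combined = append_and_sum([original, new], "Username", ["Points", "Bonuses"])
```

Every username from either table appears exactly once in the result, which is why the row count after grouping is lower than the sum of the two inputs.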
I have a single-column table of customer account numbers and a main table containing 400,000 records pulled from an Access database. I want to remove all records from the main table where the customer account number can be found in the single-column table.
The merge query capability in Power Query allows me to return only the records where there is a match on the customer list (in addition to a variety of other variations on this theme), but I would like to know whether there is a way to invert this so that I return all records where the customer number does not appear in this list.
I have achieved this already by using the List.Contains function and adding a custom column to identify the rows to exclude and then filtering them out, but I think this is severely impacting the performance of my workbook. Refreshing the table that initially has 400,000 rows prior to this series of transformations takes a very long time, and all queries that depend on this table then also take a long time to refresh.
Thank you
If you do a Left Anti join of your main table with the single-column table, this will give you the main table filtered down to only the rows that have no match in the single-column table.
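What a left anti join returns can be sketched in a few lines of Python. This illustrates the semantics only, not Power Query itself; the column name Account, the sample values, and the helper name left_anti_join are hypothetical.

```python
# Hypothetical sample data: the main table and the single-column exclusion list.
main_table = [
    {"Account": "C001", "Amount": 120},
    {"Account": "C002", "Amount": 75},
    {"Account": "C003", "Amount": 310},
    {"Account": "C002", "Amount": 40},
]
excluded_accounts = ["C002", "C999"]


def left_anti_join(rows, key, excluded_values):
    """Keep only the rows whose key value is absent from excluded_values."""
    excluded = set(excluded_values)  # set membership is a constant-time check
    return [row for row in rows if row[key] not in excluded]


kept = left_anti_join(main_table, "Account", excluded_accounts)
```

In this sketch the exclusion list is turned into a set once and each row is checked against it, instead of scanning the whole list for every one of the 400,000 rows.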
I have imported a bunch of data using PowerQuery into a single table and am building dashboard reporting. I have been using Pivot Tables to build my reports, which has worked fine so far.
However, I've come to a point where I simply want to show the count of multiple columns (calculated fields). So I have columns A, B, C, and D, and want to show the count of each. But I don't want them to be subsets (or children) of one another, and I don't want to build a bunch of Pivot Tables (the file is already getting pretty big, and I want them row by row for easy viewing). Any suggestions?
Also, I am using the "Columns" field already to show the counts by certain weeks (week one, week two, etc.).
Thanks,
-A
Thanks for the follow-up. Within PowerPivot, I have four calculated fields/columns that are True/False for each column. I want to know how many times each of those columns was marked "True" (I can rename the "True" field to distinguish which field it's referencing). But I don't want four pivot tables. Right now I can only think of making four pivot tables, filtering out the false values in each one, then hiding the rows so the "True" values stack on top of one another. If I put all four fields together in the same Pivot, the three below the first become subsets. I don't want subsets, just occurrence counts.
Does this help provide clarification?
If I understand you correctly, here's an example that shows what you're trying to achieve:
The table on the left has the TRUE/FALSE entries and the PivotTable on the right just shows the number of true items in each of those columns.
The format of the DAX measure to produce these count totals is:
[Count of A]=CALCULATE(COUNTROWS(PetFacts),PetFacts[A]=TRUE)
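The idea behind one measure per column (count the rows where that column is true, independently of the other columns) can be sketched in Python. This is an illustration of the logic rather than DAX; the sample rows, the Pet column, and the helper name count_true are hypothetical.

```python
# Hypothetical sample rows standing in for the PetFacts table.
pet_facts = [
    {"Pet": "Dog",    "A": True,  "B": False, "C": True,  "D": False},
    {"Pet": "Cat",    "A": True,  "B": True,  "C": False, "D": False},
    {"Pet": "Parrot", "A": False, "B": True,  "C": True,  "D": False},
]


def count_true(rows, columns):
    """Count, for each column separately, the rows where it is True."""
    return {
        column: sum(1 for row in rows if row[column] is True)
        for column in columns
    }


counts = count_true(pet_facts, ["A", "B", "C", "D"])
```

Each count is computed on its own, so the results sit side by side rather than nesting as subsets of one another.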
(Apologies to any parrot owners who may get upset that I have inadvertently re-classified their pets as cold-blooded!)