EXCEL - Combine rows based on ID with no loss of data which is spread across columns - excel

I have received an export from a database which contains a huge amount of duplicated records.
There are approx. 8000 records with over 100 columns. The data relating to each unique ID is spread across about 5 columns, which is causing the duplication. I expect about 1500 actual unique records.
I have attached a simplified version of what I have and what I'm trying to achieve.
I feel like there could be a solution along the lines of: merge the rows; if the data is the same, OK, otherwise take the non-nulls. Is there something that could be done in Power Query?
Thanks!
Helen

Apply a simple Group By in Power Query: right-click the id column and select Group By, then choose an aggregation for each of the remaining columns.
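For readers who want to see the merge rule from the question ("if the data is the same OK, otherwise take non-nulls") spelled out, here is a minimal sketch in Python rather than Power Query. The function name `merge_rows_by_id`, the column names, and the sample rows are all hypothetical, and keeping the first non-null value while reporting disagreements is an assumption about the desired behaviour, not something the answer specifies.

```python
def merge_rows_by_id(rows, id_key="id"):
    """Collapse rows that share an ID into one row, keeping non-null values.

    Returns (merged_rows, conflicts). A conflict is recorded whenever two
    rows with the same ID hold different non-null values in one column;
    in that case the first value seen is kept (an assumption).
    """
    merged = {}
    conflicts = []
    for row in rows:
        key = row[id_key]
        if key not in merged:
            # Start with every column set to None, then fill in the ID.
            merged[key] = dict.fromkeys(row)
            merged[key][id_key] = key
        target = merged[key]
        for column, value in row.items():
            if column == id_key or value in (None, ""):
                continue  # nothing to merge for the ID itself or for blanks
            existing = target.get(column)
            if existing in (None, ""):
                target[column] = value
            elif existing != value:
                conflicts.append((key, column, existing, value))
    return list(merged.values()), conflicts


# Hypothetical sample data: the same record exported several times,
# with its values scattered across the duplicates.
rows = [
    {"id": 1, "name": "Ann", "email": None, "phone": None},
    {"id": 1, "name": "Ann", "email": "ann@example.com", "phone": None},
    {"id": 2, "name": None, "email": None, "phone": "555-0100"},
    {"id": 2, "name": "Bob", "email": None, "phone": "555-0199"},
]

merged_rows, conflicts = merge_rows_by_id(rows)
```

Here `merged_rows` holds one row per ID, and `conflicts` lists the one disagreement (the two different phone numbers for ID 2), so nothing is dropped silently.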

Related

How to get data from 2 columns, in different tables within Power Query, to sum into 1 column?

How to get data from 2 columns, in different tables within Excel Power Query, to sum into 1 column?
For example: Table names - Main_Company_HR_Data and Company1_HR_Data -- Fields I want to combine: Main_Company_HR_Data.Gross Company1_HR_Data.Gross.
Within the Data Model, I have established the connection to my data warehouse and have linked the tables in Power Query as well. I added a column and tried several suggestions from multiple sites, but none of them work. I have been using: Calculate(SUMX(Main_Company_HR_Data.Gross))+ SUMX(Company1_HR_Data.Gross), and other variations of the sort, without success.
I had this issue, and the way I tackle it is to keep it as simple as possible.
Try this:
Add a column in the table that will hold the summed field, and enter this code as its formula.
=LOOKUPVALUE('Company1_HR_Data'[Gross],'Company1_HR_Data'[Unique_ID],'Main_Company_HR_Data'[Unique_ID])
Then create a second column, in the same table as the formula above, to sum the two values.
='Main_Company_HR_Data'[Gross]+'Company1_HR_Data'[Gross]
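The same two-step idea (look up the other table's Gross by Unique_ID, then add it to the row's own Gross) can be sketched outside DAX. This is only an illustration in Python: the function name `add_combined_gross`, the output column names `Company1_Gross` and `Total_Gross`, and the sample values are hypothetical. It assumes Unique_ID is unique in Company1_HR_Data and treats a missing match as 0.

```python
def add_combined_gross(main_rows, other_rows, key="Unique_ID", field="Gross"):
    """Return copies of main_rows with the looked-up Gross and the total added."""
    # Step 1: build the lookup, one Gross value per Unique_ID (assumed unique).
    lookup = {row[key]: row[field] for row in other_rows}
    result = []
    for row in main_rows:
        other = lookup.get(row[key])  # None when there is no matching ID
        # Step 2: sum the two values, treating a missing match as 0.
        total = row[field] + (other if other is not None else 0)
        result.append({**row, "Company1_Gross": other, "Total_Gross": total})
    return result


# Hypothetical sample data for the two tables.
main_company_hr_data = [
    {"Unique_ID": "E1", "Gross": 1000.0},
    {"Unique_ID": "E2", "Gross": 2000.0},
]
company1_hr_data = [
    {"Unique_ID": "E1", "Gross": 250.0},
]

combined = add_combined_gross(main_company_hr_data, company1_hr_data)
```

With this sample data, `E1` gets a total of 1250.0 and `E2`, which has no match in the second table, keeps its own 2000.0.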

Comparing two tables of different size, with multiple columns in VBA

Looking to use VBA to compare two tables, with three columns each against each other. Beginner here and very lost.
They may have a different amount of entries each, and there may be some in table A that aren't in table B, and vice versa
Some of the individual columns may match, but I'm trying to work out how to make sure all three columns are compared as one unit against all three columns in the other table
For example
xyz123 55.50 12/07/21 if compared with XYZ123 54.55 12/07/21 will show up as not a match, because the middle column is a different number.
Have attached a picture below. For the most part, and unlike the photo, each table will be in a completely random order, and it's unlikely that the entry in table 1, row 1 will be the same as the entry in table 2, row 1
Ideally, I'm trying to create two new tables to the right of the original tables, the first one being the entries table 1 has that table 2 does not have. The second one being the entries table 2 has that table 1 does not have.
Have attached an example below of the end result I'm looking for out of this. The four rows on the left are entries that the first table has but the second table doesn't, and the rows to the right are all entries that the second table has, but the first table does not.
I've tried to search on this but haven't found something that matches what I've got, and I'm struggling to adapt someone else's code to my specific problem
Any help on this would be greatly appreciated
Maybe not a direct answer to your problem, but is this data also in a database somewhere, or are you familiar with MS Access? You could open the tables in Access, and this kind of comparison is pretty easy to do with a database.
If not, then yes, it is doable with VBA, and there are numerous ways of doing it.
The simplest is to step through one table a row at a time and compare that row with every row in the other table to see whether it matches. This works well for small tables and is quick to write, but for large tables it is wasteful and may take a long time to complete.
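As a rough illustration of the comparison logic (in Python rather than VBA), the sketch below treats each row's three columns as one combined key and uses set lookups instead of the row-by-row nested loop, which avoids the slowdown on large tables. The function names and sample rows are hypothetical. Comparing the first column case-insensitively is an assumption taken from the question's example, where xyz123 and XYZ123 differ only because of the middle column. Duplicate rows within one table are not counted separately.

```python
def normalize(row):
    """Turn a (code, amount, date) row into a comparable key."""
    code, amount, date = row
    # Assumption: codes match regardless of case; amounts compare to 2 decimals.
    return (code.strip().upper(), round(float(amount), 2), date.strip())


def table_differences(table_a, table_b):
    """Return (rows only in table_a, rows only in table_b), in original order."""
    keys_a = {normalize(row) for row in table_a}
    keys_b = {normalize(row) for row in table_b}
    only_in_a = [row for row in table_a if normalize(row) not in keys_b]
    only_in_b = [row for row in table_b if normalize(row) not in keys_a]
    return only_in_a, only_in_b


# Hypothetical sample tables, deliberately in different orders.
table_1 = [
    ("xyz123", 55.50, "12/07/21"),
    ("abc001", 10.00, "01/07/21"),
]
table_2 = [
    ("ABC001", 10.00, "01/07/21"),
    ("XYZ123", 54.55, "12/07/21"),
]

only_in_1, only_in_2 = table_differences(table_1, table_2)
```

Here `only_in_1` contains the xyz123 row with 55.50 and `only_in_2` contains the XYZ123 row with 54.55, because all three columns must agree for a row to count as a match.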

SSRS Cells auto-merge

I'm having trouble unmerging cells on the report.
3 Suppliers for the query
I have a SQL query that shows 3 instances of a supplier (left joined to contact) as shown below. However, when running the report for the query, the 3 instances of the supplier are merged into one. This is not desirable in my case because, when exporting the report to Excel, I'd like to be able to sort columns based on other properties, and that is not possible due to the merging of the rows. How can I get the results to show individually?
Cells are Merged on the report
Within the properties of each Row Group you can specify which columns to group on. You generally don't need a separate group for each field, but that's OK. In your last group, the one called "(Details)", if it is not grouped by anything, it will show one row per line of results from the query. So take a look at what it's grouped by. As long as the rows are in your dataset, the report will group or show them based on how you configure the grouping here. Grouping on nothing means it will show all rows.
Another tip is to align the end of your header textbox with the line of one of your columns. This will prevent it from creating an extra column in Excel for the "City" field.
Your report does not need all of those groupings - the SSRS grouping is not like SQL. You should only group when you want to aggregate data on that field. Normally you might have a company with its address in various fields in one group but you only need to group once on the Company Name or (preferably) ID - not on each field and not a separate group for each. You could then show details of various invoices in other columns that aren't grouped.
But since you want to display the company data on each row, you would not want ANY grouping on the company.
To fix your issues, remove all the groupings (but not the rows) and just leave the detail group (which doesn't have a Grouping).
You can check out MS Docs: Understanding Groups for a better explanation.

Pivot table with registers duplicate in 2 row

I have a question about how to summarize a list of data, I attached an image of how the data is presented.
The question is how to determine (summarize) the total of activities per unit considering that a person works in 2 units.
You could define that the person working in A / B does 50% of activities for each unit.
As the list of registers is very extensive, the idea is to automate this. I tried a PivotTable, but it did not give me the result I need.
Any suggestions would be appreciated (xls, sql, etc).
data
http://ge.tt/381feDj2 -> Excel file
I'm really not the best of people when it comes to Pivot Tables, but assuming I understand what you are asking for:
1) Add one extra column (dummy) and just put 1 in it, this will be used to sum the number of events that the criteria occurs.
2) Select the whole table and then Insert=>Pivot Tables
3) Click Ok
4) Set Rows (fecha); Values (dummy); and the rest to Filters.
Then you can choose to filter your output the way you want. If you want A and A/B you can select multiple in the filter options.
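The 50% rule suggested in the question (a person working in A / B contributes half of their activities to each unit) is not covered by the filter approach, so here is a minimal sketch of that split in Python. The layout of the data is not visible in the post, so the `(unit, activities)` pairs, the function name `activities_per_unit`, and the sample values are all hypothetical. It assumes multiple units are written in one cell separated by "/".

```python
from collections import defaultdict


def activities_per_unit(records):
    """Sum activities per unit, splitting shared rows evenly between units."""
    totals = defaultdict(float)
    for unit_field, activities in records:
        # "A / B" becomes ["A", "B"]; a single unit stays a one-item list.
        units = [u.strip() for u in unit_field.split("/") if u.strip()]
        if not units:
            continue  # skip rows with no unit recorded
        share = activities / len(units)  # 50% each when there are two units
        for unit in units:
            totals[unit] += share
    return dict(totals)


# Hypothetical sample data: (unit, number of activities) per person.
records = [
    ("A", 4),
    ("A/B", 2),
    ("B", 3),
    ("A / B", 4),
]

unit_totals = activities_per_unit(records)
```

With this sample data, unit A totals 7.0 (4 + 1 + 2) and unit B totals 6.0 (1 + 3 + 2), and the overall total of 13 activities is preserved.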

Missing rows when merging

I am working with Excel 2010, Power Query, and PowerPivot.
I have a query named Database that consists of 60+ merged tables containing a total of 2m+ rows. I also have a separate query that consists of two columns PrimaryKey3 and Members (a count of members per month). The entries in PrimaryKey3 are unique, consisting of ID-MMM-YY.
Both queries have PrimaryKey3 in common, however in Database there can be multiple rows with the same PrimaryKey3.
In order to match a member amount to each row in Database, I tried a Left Outer join. There were no errors, but when I try to upload to PowerPivot it says there are only 169K rows. I then tried Full Outer join and Inner Join, and received the error "could not convert value to number," coming from a column already formatted as text in Database. This column contains plain numbers and numbers preceded by a letter: 1234, A234. Every non-blank row has a PrimaryKey3. Why is it trying to reformat my columns, and how do I get around that?
Should I be using a different type of join, or is there another way besides merging to do this?
Hope this makes sense, thank you for any help in advance!
I uploaded both queries to PowerPivot, and created a relationship through PrimaryKey3. I then created a new column in Database with =Related(Enrollment[Members]).
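What the relationship plus Related column does can be pictured as a many-to-one lookup: every row of Database keeps its place and simply gains the Members value for its PrimaryKey3. The Python sketch below is only an illustration of that behaviour, with a hypothetical function name and sample rows. Keys are compared as text, which is an assumption chosen to avoid the number-conversion problem described in the question.

```python
def attach_members(database_rows, enrollment_rows):
    """Add a Members value to every Database row without dropping any rows."""
    # One Members value per unique PrimaryKey3, with keys kept as text.
    members_by_key = {
        str(row["PrimaryKey3"]): row["Members"] for row in enrollment_rows
    }
    return [
        {**row, "Members": members_by_key.get(str(row["PrimaryKey3"]))}
        for row in database_rows
    ]


# Hypothetical sample data: PrimaryKey3 repeats in Database, is unique in Enrollment.
database = [
    {"PrimaryKey3": "1234-Jan-15", "Claim": "c1"},
    {"PrimaryKey3": "1234-Jan-15", "Claim": "c2"},
    {"PrimaryKey3": "A234-Jan-15", "Claim": "c3"},
]
enrollment = [
    {"PrimaryKey3": "1234-Jan-15", "Members": 120},
    {"PrimaryKey3": "A234-Jan-15", "Members": 45},
]

database_with_members = attach_members(database, enrollment)
```

The output has exactly as many rows as `database`, which is the property the Left Outer join in the question was expected to have.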
