Merging two tables with low-quality data - Excel

I've been spinning my wheels on this one for a while and have had trouble finding a relatively pertinent answer elsewhere. This seems to me to be pretty convoluted and maybe not even possible, so if that's the case please let me know and I can find other workarounds. And if it would be easier to just refer me to reading materials, etc. please feel free as well.
My goal is to merge two tables together via Power Query in Excel, where one of the tables has typos and empty fields where there shouldn't be any. I've attached a picture here for reference:
The second table is a "master table" of sorts, that the first table should cross-reference against. The desired process/result is attached here:
To get the names to match, I've been trying to fine-tune the fuzzy match's similarity threshold, though I'm not quite there yet, as some of the typos are too different from the correct names. It seems to me that the really difficult part is reconciling the cities based on the correct customer name.
Maybe there is a better way to do this outside of Power Query, but that's just where I started as it made the most sense to me. Any tips or guidance would be greatly appreciated! Thank you in advance.
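For anyone trying this outside Power Query, here is a minimal Python sketch of the same idea: match each typed name to the closest name in the master table, then take the city from the master row instead of trusting the typed city. The table contents, the `reconcile` helper and the `cutoff` of 0.8 are all made up for illustration, and `difflib` is just one of several possible similarity measures.

```python
from difflib import get_close_matches

# Hypothetical master table: correct customer name -> correct city.
master = {
    "Acme Corporation": "Chicago",
    "Globex Industries": "Springfield",
    "Initech": "Austin",
}

# Hypothetical messy rows: (name as typed, city as typed or "" when missing).
messy = [
    ("Acme Corparation", ""),
    ("Globex Industrys", "Springfeld"),
    ("Initech", "Austin"),
    ("Unknown Co", "Nowhere"),
]


def reconcile(rows, master, cutoff=0.8):
    """Return (name, city, matched) tuples, using the master city on a match."""
    # Compare in lower case so capitalization differences do not count as typos.
    lowered = {name.lower(): name for name in master}
    result = []
    for typed_name, typed_city in rows:
        hits = get_close_matches(typed_name.lower(), list(lowered), n=1, cutoff=cutoff)
        if hits:
            name = lowered[hits[0]]
            result.append((name, master[name], True))
        else:
            # No master name is close enough: keep the row as typed and flag it.
            result.append((typed_name, typed_city, False))
    return result


cleaned = reconcile(messy, master)
for row in cleaned:
    print(row)
```

Rows that come back with `matched` set to `False` are the ones to review by hand. Lowering `cutoff` catches worse typos but also raises the risk of matching the wrong customer.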

Related

Applying multiple filters with Tabulator table

I am not well-versed in code. Would someone (for hire) be willing to help me place one of these beautiful tables on my website?
It should be able to do the following:
Sort per column
Work with multiple filters
Have the ability to update the numbers by linking to a Google Sheet, etc.
Thank you.
Rick Gonsalves
rick_gonsalves#hotmail.com
Have not tried yet as I am not familiar with code and how to use it.

How can I run a vlookup on vba array

Fairly new to posting here but not new to the site.
I have done a fair bit of Googling on this one but still don’t seem to have the answer so thought I’d post here.
This is high level, with no code included for now, although I can provide some later.
I have a template that is completed by customers and within that template are hidden tabs, one of which has a table that is used as part of a vlookup.
What I am trying to achieve here is to read the customer's submitted data into an array (done), then do some kind of equivalent to a vlookup on a column in the array, and then add the results to a new column in the array.
I’m happy to also read the lookup table into an array or dictionary as I understand this is a far better approach.
I’m just stuck on what to use instead of vlookup and how to achieve the above.
Any thoughts would be appreciated. Thanks in advance.
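The dictionary approach mentioned above can be sketched as follows. This is Python rather than VBA, purely to show the shape of the logic, and the product codes, prices and variable names are invented. In VBA the usual counterpart would be a `Scripting.Dictionary` filled from the lookup table and checked with `.Exists` before reading a value.

```python
# Hypothetical lookup table (the hidden tab): product code -> unit price.
lookup_rows = [("A100", 2.50), ("B200", 4.00), ("C300", 1.25)]

# Hypothetical customer data already read into an array: [code, quantity].
data = [["B200", 3], ["A100", 10], ["Z999", 1]]

# Build the dictionary once; every lookup after that is a single key access
# instead of a scan through the lookup table.
price_by_code = dict(lookup_rows)

# Add the looked-up value as a new column; None marks a code with no match,
# which plays the role of #N/A in a worksheet vlookup.
for row in data:
    row.append(price_by_code.get(row[0]))

print(data)
```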

Huge Excel table - sorting into another Excel file automatically

What I want to do
I'm not too sure how to formulate my question so here it goes.
I am helping with a big Excel spreadsheet of 20k+ rows that has data sorted by ID. What I want to do is sort that data by company name and export it to another Excel file.
What I would like to know is: what are the best methods to do that with the least manual work? It sounds weird, but I am helping my father's friend, and he does this manually every quarter.
My idea is to import that spreadsheet into a database and then export it as I like, but I feel that there is a simpler way of doing that.
I'm not sure I understand your question: 20,000 rows is really not that much (the maximum is about one million). As far as sorting is concerned, I've made a small screenshot of how to sort an Excel table, based on company value and ID (sorry for the Dutch language). You might record this and turn it into a VBA macro:
Is this what you are looking for or do you need extra information?
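If a script outside Excel is an option, the same sort-and-export can be sketched in a few lines of Python. This assumes the sheet has first been saved as CSV; the column names `ID`, `Company` and `Amount` and the sample rows are invented, and `io.StringIO` stands in for the real input and output files.

```python
import csv
import io

# Hypothetical CSV export of the sheet, sorted by ID as in the question.
source = io.StringIO(
    "ID,Company,Amount\n"
    "1,Alpha BV,25\n"
    "2,Zeta Ltd,5\n"
    "3,Zeta Ltd,10\n"
    "4,Alpha BV,7\n"
)

reader = csv.DictReader(source)
rows = list(reader)

# Sort by company name (ignoring case), then by numeric ID within a company.
rows.sort(key=lambda r: (r["Company"].lower(), int(r["ID"])))

# Write the sorted rows out; with a real file this would be open(..., "w", newline="").
target = io.StringIO()
writer = csv.DictWriter(target, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

sorted_ids = [r["ID"] for r in rows]
print(target.getvalue())
```

The resulting CSV opens directly in Excel, so the quarterly job becomes a matter of exporting the sheet and running the script.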

Practical tips on documenting Excel Queries, data model tables, pivot tables?

We're building a BI system (dashboards) in Excel using imported tables (from Excel files). We're using Excel 2016 queries, the data model, and measures written as DAX expressions, resulting in more pivot tables (some of which are reloaded into the data model), etc.
My question: is there a "best practice" for 1) naming these data elements and 2) documenting these bits to have more complete system documentation?
Background: I'm the senior "hacker" munging these things together. But I need to move this towards being sustainable. I did some prototyping work and when I went back a week later it was challenging to reconstruct my thoughts and relationships...
I've seen folks refer to the use of Power BI flow diagrams to support documentation, but that seems to be more of the "icing on the cake" than the "cake" itself.
So what "bread and butter" documentation approaches have you, more experienced developers, taken to ensure that your systems are clearly documented so that others can pick up where you left off?
For naming, I follow the Kimball Group's advice for data warehouses/marts, e.g.
https://www.kimballgroup.com/2014/07/design-tip-168-whats-name/
I rename many/most Query steps to reference the column or table name, e.g. Added Custom => Added Customer Name, Append Queries => Append Customers. The idea is to be able to pick the right step first time when coming back for maintenance.
You can select all the Queries in the Query Editor window and copy their code, then paste it into Word, etc., as the starting point for your documentation. You can also screenshot the Query Editor's Query Dependencies pop-up.
For the Power Pivot logic, try this solution:
https://powerpivotpro.com/2014/03/automatically-create-data-dictionary-for-your-power-pivot-model/

Format table names differently than column names in dropdown?

This may seem like a silly question, but it's taking me way too long to find the answer, so I'm hoping I can get some help here.
If I'm understanding correctly, the sample RedQueryBuilder formats the table names that are related to the initial table in "Proper Case", while the columns of the initial table are all caps. This is wonderfully helpful in distinguishing the tables from the columns.
Developers at my place of work (who are not currently available to ask) implemented the RedQueryBuilder in one of their web apps, and somehow, the table names AND the column names are all "Proper Case." This makes it impossible to tell which are tables and which are columns. I've scoured the code, searched through any available CSS, stepped through the app in the JS debugger, etc, to try to figure out how to format the list of tables + columns the way it's done in the original, but it's taking forever, and I can't seem to find it. Any help pointing me in the right direction would be fantastic!
Thank you!!
In the metadata that is sent to the client, you should be able to send "name" and "label" values. The "name" is used when reading/writing the SQL, and the "label" is displayed to the user.
You should be able to control it there.
