VB: Filtering data on an Excel table

In Python, using libraries for working with Excel files, I could do what I want.
But now, because I'm trying to learn VBA, I need to ask this question.
I'm working on a worksheet that has around 12 columns and 50,000 rows.
This data represents requests sent to the company.
Column 5 holds the request's code, and column 10 the time it took to finish it.
But, for example, rows 5, 10 and 12 could belong to the same request, which was just split for organizational purposes.
I need to process this data so that I can:
1. Column 6 represents the person who answered the request. So, I need to put each request on that person's worksheet, and create this worksheet for them before starting to add requests to it.
2. For each person (worksheet), count the request types (column 2) handled by them, i.e. create another table on their worksheet showing:
Type_Of_Request | Number_of_occurrences
3. Create a final report worksheet showing the same table as above, but counting all requests (without the person filter).
Note: I know that most questions on Stack Overflow are about solving a specific problem, but I'm asking for starting points here.
Or even solutions, if possible.
For explanation purposes, I think that describing the algorithm I used in Python will help people who know a little of both Python and VBA to help me here.
So, for each item:
1. Create a dict that manages the column 6 data.
This dict has each person's unique name as a key, and each request that person answered is added to a list stored under that name.
Something like:
{person1: [request1, request2, request3, ...], ... }
2. Another dict that manages the column 2 data (the request type).
Now I have a dict where each entry holds a list of the requests of that type.
After placing all requests, I did a simple sum over each list and filled a table with (key, sum(dict[key])),
where dict[key] is the list of requests of the same type, and a sum over it gives the total number of requests of that type.
Something like:
{request_type1: [request1, request2, request3, ...], ... }
3. Well, the same as 2, but applying the algorithm to the complete initial table.
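The grouping described above can be sketched in Python roughly as follows. This is a minimal illustration, not the original script: the sample rows, the tuple layout and the function names are all hypothetical, and it counts rows as they are, without merging rows that belong to the same split request.

```python
from collections import defaultdict

# Hypothetical sample rows; only three of the ~12 columns are modelled.
# Each tuple is (request_code, request_type, person), standing in for
# columns 5, 2 and 6 of the worksheet.
rows = [
    ("R1", "install", "alice"),
    ("R2", "repair", "bob"),
    ("R3", "install", "alice"),
    ("R4", "repair", "alice"),
]


def group_requests(rows):
    """Build {person: [codes]} and {request_type: [codes]} in one pass."""
    by_person = defaultdict(list)
    by_type = defaultdict(list)
    for code, req_type, person in rows:
        by_person[person].append(code)
        by_type[req_type].append(code)
    return dict(by_person), dict(by_type)


def count_types(rows):
    """Return {request_type: number_of_rows} for the given rows."""
    counts = defaultdict(int)
    for _code, req_type, _person in rows:
        counts[req_type] += 1
    return dict(counts)


by_person, by_type = group_requests(rows)
# Final report: counts over every row, with no person filter.
report = count_types(rows)
# Per-person table: the same count restricted to one person's rows.
alice_report = count_types([r for r in rows if r[2] == "alice"])
```

In VBA, the closest counterpart to these dicts would be a dictionary object keyed by person name or request type.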
I don't know if VB has a dict type like Python's (which helps a lot!), since I'm new to VB.
Thanks a lot for any help.

VBA does indeed have a dictionary type, but its usage may not mirror Python's implementation (see: http://msdn.microsoft.com/en-us/library/aa164502%28v=office.10%29.aspx ).
You can also create a user-defined type (see: http://msdn.microsoft.com/en-us/library/aa189637%28v=office.10%29.aspx ).
If you have a working solution, that is your best jumping-off point. Many of the Python string functions, etc., are probably named the same, or close enough, for you to find them easily in the language reference.

You may find this easier with ADO, which works quite well with Excel using the Jet/ACE connection. It will also let you use the Range.CopyFromRecordset method to write suitable recordsets to worksheets.

Related

Dynamically generate list of payment dates considering first/last date and client - ideal for controlling receivables of SaaS and recurrent contracts

I'm building an accounts receivable sheet in Google Sheets.
I would like to register the clients and their contract characteristics (client, payment frequency and price) in one sheet and I would like to dynamically generate the payment dates in another sheet.
The input sheet would look like this:
The output sheet would look like this:
I think it might be something in the QUERY and ARRAYFORMULA universe but I don't know how to configure it. Is there a way to dynamically generate the combination of Date and Client, taking into consideration first and last payment dates?
Sample in this link. If you'd like to use it, please feel free to create a copy for yourself and post it in your answer.
Creating a 2D array of concatenated strings of dates and values can be a good first step in these kinds of problems.
I've demonstrated the idea in a tab called MK.Help on this sheet that I also shared in the comment above. This formula can be found in cell A2 and is generating the whole list:
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(Input!A2:A5&"|"&Input!D2:D5+SEQUENCE(1,CEILING(MAX(IFERROR((Input!E2:E5-Input!D2:D5)/Input!C2:C5))),0)*Input!C2:C5&"|"&Input!E2:E5),"|",0,0),"select Col2, Col1 where Col2<=Col3 order by Col2"))
Once you have the data in a big 2D array, you can flatten it out and then split it into its component parts to make it queryable. I've tried to outline the process to the right of the solution.
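For readers who find the formula dense, the same expansion can be sketched in Python. This is only an illustration of the idea, assuming (as the formula's date arithmetic suggests) that the frequency column holds a number of days; the contract data and the function name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical contracts:
# (client, frequency in days, first payment date, last payment date).
contracts = [
    ("Acme", 30, date(2021, 1, 1), date(2021, 3, 15)),
    ("Beta", 7, date(2021, 1, 10), date(2021, 1, 24)),
]


def payment_schedule(contracts):
    """Return (due_date, client) pairs for every contract, sorted by date."""
    rows = []
    for client, every_days, first, last in contracts:
        if every_days <= 0:
            raise ValueError("payment frequency must be a positive number of days")
        due = first
        # Keep stepping by the frequency until the last payment date is passed,
        # which mirrors the "where Col2<=Col3" filter in the QUERY.
        while due <= last:
            rows.append((due, client))
            due += timedelta(days=every_days)
    # Tuples sort by date first, matching "order by Col2".
    return sorted(rows)


schedule = payment_schedule(contracts)
```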
@MattKing's answer was really good, but I had problems with it because all my inputs have dynamic sizes, and going through his steps I couldn't figure out how to adapt it to this situation.
So, drawing a lot of inspiration from Matt and doing some extra research (including this new question), I came to a solution that worked better for me, using multiple pages to reach the final result. Not so classy, but it works.
I left my solution available in this sheet.
Even so, I've chosen to accept Matt's answer, since it worked, it helped me, it looks "more pythonic", and maybe the need to be so dynamic wasn't so clear in the question.

How does one extract information from a dynamic table automatically through Excel functions?

I have been searching high and low for a way to solve my dilemma, in different ways, so I am trying to post both of the things I've been trying to do:
The challenge version 1:
I want to extract the entire row of information tied to a name, taking the latest entry for that name in the table. So from the table below I would want to collect the entire row which contains the information: "A, Jack Black, 01.01.2029, 10:20". I simply want to copy the entire row to another sheet. But one important factor is that it has to happen automatically.
So I need functions which can check: is there another entry with the same name higher up in the table? If so, DO NOT COPY THE ROW. If there isn't another entry with the exact same name higher up in the table, COPY THE ENTIRE ROW to another table in another sheet.
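To make the rule concrete, here is a minimal Python sketch of the check described above. It is not an Excel solution, only a statement of the logic. The sample table is hypothetical (apart from the Jack Black row quoted in the question), and it assumes the name sits in the second column and that newer entries appear higher up.

```python
# Hypothetical table, newest entries first: (department, name, date, time).
table = [
    ("A", "Jack Black", "01.01.2029", "10:20"),
    ("B", "Jane Doe", "01.01.2029", "10:05"),
    ("A", "Jack Black", "31.12.2028", "16:40"),
    ("B", "Jane Doe", "30.12.2028", "09:15"),
    ("A", "John Smith", "29.12.2028", "11:00"),
]


def latest_rows(table):
    """Keep a row only if no row above it has the same name."""
    seen = set()
    kept = []
    for row in table:
        name = row[1]
        if name not in seen:
            seen.add(name)
            kept.append(row)
    return kept


copied = latest_rows(table)
```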
The challenge version 2:
What I really want to do is count the number of unique people (unique names) per department, and summarize this in another table. Basically this means that "Jack Black" should be counted as 1 person in department A.
So the result I want is a table looking like the one beneath, where the number of people does not contain any duplicate people (names). It also has to work with a dynamic table, which updates the information it contains on the fly. I can make this happen if I am copying from a static table, but as stated above, the table is dynamic and updates with new information every minute...
So far I've tried Excel's built-in filtering, but this does not work automatically. I've also tried using functions like in this guide: https://excel-bytes.com/how-to-extract-a-dynamic-list-from-a-data-range-based-on-a-criteria-without-filters-in-excel/. However, every solution I find seems to need criteria for filtering out duplicates, or does not function when copying information from a dynamic table.
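The count being asked for can be stated precisely with a short Python sketch. It only illustrates the logic, not an Excel formula; the sample rows are hypothetical and it assumes department and name are the first two columns.

```python
from collections import defaultdict

# Hypothetical rows: (department, name, date, time).
table = [
    ("A", "Jack Black", "01.01.2029", "10:20"),
    ("B", "Jane Doe", "01.01.2029", "10:05"),
    ("A", "Jack Black", "31.12.2028", "16:40"),
    ("A", "John Smith", "29.12.2028", "11:00"),
]


def unique_people_per_department(table):
    """Return {department: number of distinct names seen in it}."""
    names = defaultdict(set)
    for department, name, *_rest in table:
        # A set ignores repeats, so each person is counted once per department.
        names[department].add(name)
    return {dept: len(people) for dept, people in names.items()}


summary = unique_people_per_department(table)
```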
Does anyone know how to reach my desired result, without implementing criteria for selecting the rows or counting rows as stated above? VBA code is not an option at the moment :(
In advance, THANK YOU, I've really tried solving this, but I feel like this just might break my head wide open soon if I can't solve it. HEEEEELP!
Sincerely
haakonlu

How to identify a string source using a standalone list of identifiers

I have a really quick question about which function I need to use for my current conundrum:
I'm building a tool that automatically identifies a retailer from the first 5 digits of the account number (their "code" so to speak).
To illustrate in account number "1111122222" the "11111" will be the retailer code and the "22222" will be the customer's unique ID.
Each retailer can have several dozen unique codes so I have a separate sheet with a code table in it. (Separated because it will be split off into a standalone workbook later on)
Codes table looks like this:
Bobs Burgers | Johns Chicken | Ali's Shwarma
12345 | 56784 | 77774
45698 | 33333 | 44444
12398 | 99999 | 55555
As we receive data in blocks of 20~30 accounts at a time, all I'd like this thing to do is check the accounts against the code list and output the name of the retailer. And maybe yell "conflict, abort and run for the border!" if more than one retailer is identified :)
Apologies for the stupid question, but by this point I'm on my ninth cup of coffee and I just can't remember what functions I need to use.
P.S. The reason why I'm making my life difficult and not using a standard lookup table is that the higher-ups want no manual involvement from the end users with the data, so it all has to be identified and forwarded to the relevant parties without them touching the data or destinations. I've already got the importing automated and have the distribution ready to go; it's just the middle part that threw me for a loop. I'll post the full code of the tool once it's complete in case anyone needs something like this.
Apologies for the brain fart - I figured out the solution. I was trying to set the codes up as a table with the retailer as the header, with each retailer in its own column, which just wasn't working in any way. My less than elegant solution was to reformat the codebook as a "code:Retailer" table, which allowed a VLookup to actually pull the data properly, and to have the codes extracted via the =LEFT(TEXT(cell),5) function inside a hidden buffer sheet in the workbook rather than via VBA.
I then set up a pivot table in the hidden sheet that gave me a nice percentage value to work off of and set up a data refresh gate at every step in the macros.
Whole thing is a bit slow and will require a bit of manual installation on everyone's PCs but it works now.
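The "code:Retailer" lookup can be sketched in Python as follows. This only illustrates the logic and is not the tool's actual code: the codes come from the example table in the question, while the function name, the account numbers, and the choice to ignore unrecognized codes are assumptions.

```python
# Codebook in "code: retailer" form, using the example codes from the question.
CODEBOOK = {
    "12345": "Bobs Burgers", "45698": "Bobs Burgers", "12398": "Bobs Burgers",
    "56784": "Johns Chicken", "33333": "Johns Chicken", "99999": "Johns Chicken",
    "77774": "Ali's Shwarma", "44444": "Ali's Shwarma", "55555": "Ali's Shwarma",
}


def identify_retailer(accounts, codebook):
    """Return the single retailer for a block of accounts, or None if unknown.

    Raises ValueError when the block maps to more than one retailer.
    """
    found = set()
    for account in accounts:
        code = str(account)[:5]  # first five characters are the retailer code
        retailer = codebook.get(code)
        if retailer is not None:  # assumption: unrecognized codes are skipped
            found.add(retailer)
    if len(found) > 1:
        raise ValueError("conflict: block matches " + ", ".join(sorted(found)))
    return found.pop() if found else None


retailer = identify_retailer(["1234522222", "4569800001"], CODEBOOK)
```

Keeping account numbers as text rather than numbers avoids losing leading zeros before the first five characters are taken.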
P.S. Thanks @Cyril for reminding me of Index - it made another one of my projects ten times easier!

Excel: Returning values from a range that haven't already been specified

I have a dataset in long format with a unique ID representing people at different points in time. I already have a list of IDs with the first code, but would like to add a column for the second and third code, if available.
An example of the data, with what the output should look like at the bottom:
I already have the first two columns in this example. I was thinking I would specify that I want to look at the values that are not the first code, and output a different code in the range if available, otherwise return blank. I also want to be able to use the command with few modifications in a third cell for a third code, again as long as it exists.
Ideally this can all be done without the use of VBA; however, if that would be easier, then by all means go ahead. Any solutions that don't follow my logic are appreciated as well. Thanks in advance for the help!
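The intended output can be pinned down with a small Python sketch. It is one interpretation of the request, assuming "second" and "third" mean the next distinct codes in order of appearance for each ID; the sample records and the function name are hypothetical.

```python
# Hypothetical long-format data: one (id, code) pair per row.
records = [
    (1, "A"), (1, "B"), (1, "A"),
    (2, "C"),
    (1, "D"),
    (3, "E"), (3, "F"),
]


def codes_per_id(records, max_codes=3):
    """Return {id: [first, second, third]} with "" where no code exists."""
    distinct = {}
    for person_id, code in records:
        codes = distinct.setdefault(person_id, [])
        if code not in codes:  # skip codes already listed for this ID
            codes.append(code)
    # Pad with blanks, then cut to the requested number of columns.
    return {
        person_id: (codes + [""] * max_codes)[:max_codes]
        for person_id, codes in distinct.items()
    }


wide = codes_per_id(records)
```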

Building a customized, fuzzy and multiple Vlookup

Ok so, twice a month I receive a large file of about 100 rows, which contains 4 columns:
Building name - value - county - state
I have to complete 2 other columns based on a master list that has thousands of entries.
I want to produce something very similar to this fabulous add-in (http://www.microsoft.com/en-us/download/details.aspx?id=15011), but a bit simpler and that I could use at work without problems.
What I need to do is the following:
To match my input with the master file, I know the county and state must match, but the building names can change a bit from file to file for the same building (e.g. "John Miller #34" can be "Miller, John 34 A"), and the values may vary, but not by much.
Based on that, I want to bring from the master into my file all the entries that may match each of my rows, filtering by county and state first, and then by similarity in name and value.
Could you please share your thoughts on how you'd approach this?
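One way to think about this two-stage filter is sketched below in Python, using the standard library's difflib for name similarity. It is only an illustration of the approach, not a tested matching strategy: the master entries, thresholds, function names and the token-sorting normalization are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical master list: (building name, value, county, state).
master = [
    ("Miller, John 34 A", 105.0, "Kent", "MD"),
    ("John Miller 34", 100.0, "Essex", "NJ"),   # same name, wrong county/state
    ("Jane Smith 12", 900.0, "Kent", "MD"),     # right place, different building
]


def normalize(name):
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(tokens))


def candidates(row, master, min_name_score=0.6, max_value_gap=0.2):
    """Return (score, name, value) for master rows that may match `row`."""
    name, value, county, state = row
    out = []
    for m_name, m_value, m_county, m_state in master:
        # Stage 1: county and state must match exactly.
        if (m_county, m_state) != (county, state):
            continue
        # Stage 2: name similarity and relative difference in value.
        score = SequenceMatcher(None, normalize(name), normalize(m_name)).ratio()
        gap = abs(m_value - value) / max(abs(value), 1e-9)
        if score >= min_name_score and gap <= max_value_gap:
            out.append((round(score, 2), m_name, m_value))
    return sorted(out, reverse=True)  # best name score first


matches = candidates(("John Miller #34", 100.0, "Kent", "MD"), master)
```

Returning every candidate with its score, rather than a single best guess, leaves the final decision to a person reviewing borderline matches.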
I know this is not a simple thing, but anything may help!
You could also use wildcards to try to match on the primary identifier within the name. From your example, that might be "Miller".
Unfortunately for you, the VLOOKUP "fuzzy logic" is nowhere near reliable enough for your purpose (see the comment on my answer below for details), and you won't have any indicator as to whether the returned result is accurate or not.
It's possible to get 100% of what you want through some heavy coding in a user-defined function, but this is probably well beyond your comfort zone.
A clunky solution, although somewhat easy to explain and adopt, is to create an "identity column" for every unique scenario that can occur. So, for example:
Then you can import your master sheet, add the same identity column to the left, and perform your VLOOKUP. When a new configuration is added, you can just add it to the master list and it will populate in your imported file in future instances.
That said, if you are interested in learning, there have been many people who have walked in your shoes and felt your pain. You may want to indulge in this:
http://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html
What you are truly requesting is an algorithm. It's not a simple thing, but it's very possible. And if you take the time to learn, you not only solve your immediate problem but also make yourself marketable as an Excel wiz. Good luck!

Resources