MATCH-formula where 'lookup_value' is array - excel

I have 3 Excel-files (automated exports) that contain the following information:
1. The total list of shelves in one particular store:
| Shelf_code |
|------------|
| AB01 |
| AA02 |
2. The total list of all shelves linked to each article
| SKU_code | Shelf_code |
|----------|------------|
| 111 | AA01 |
| 111 | AB01 |
| 111 | AC01 |
| 112 | AA01 |
3. The list of all available SKUs
| SKU_code | Other stuff |
|----------|-------------|
| 111 | ... |
| 112 | ... |
| 113 | ... |
| 114 | ... |
And what I want to do is to link the Shelf_codes from that specific store to the total available SKU-list, so it will look like this:
| SKU_code | Other stuff | Shelf_code_store1 |
|----------|-------------|-------------------|
| 111 | ... | AB01 |
| 112 | ... | |
| 113 | ... | |
| 114 | ... | AB01 |
I have tried nesting an INDEX/MATCH formula inside another MATCH formula (see code below). This is only partially successful: it works only if the store's shelf_code happens to be the first row in file 2 that matches the SKU_code.
Since that is usually not the case, the formula mostly returns a #N/A error:
MATCH(
INDEX({file2_shelfcode},MATCH(file3_skucode,{file2_skucode},0)),
{file1_shelfcode}
)
Does anyone have a solution for this?
Since these files contain over 1000 articles, 200 shelves, and 6 stores, and will be updated frequently, I don't think using a PivotTable on file 2 will fit my needs.
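A likely reason the nested formula fails is that the inner MATCH stops at the first row in file 2 with that SKU_code, so only one shelf per SKU is ever tested against the store's list. The logic that seems to be needed instead is: keep only the file 2 rows whose shelf exists in file 1, then look each SKU up in that reduced list. Below is a minimal sketch of that logic in Python rather than Excel, using only the sample rows from the tables above. The variable and function names are made up for illustration, and joining multiple hits with a comma is an assumption.

```python
# Sample rows copied from the three tables in the question.
store_shelves = {"AB01", "AA02"}                      # file 1
sku_shelf_links = [("111", "AA01"), ("111", "AB01"),  # file 2
                   ("111", "AC01"), ("112", "AA01")]
all_skus = ["111", "112", "113", "114"]               # file 3


def shelves_in_store(links, shelves):
    """Map each SKU to the shelf codes that exist in this store."""
    result = {}
    for sku, shelf in links:
        if shelf in shelves:  # test every link, not just the first per SKU
            result.setdefault(sku, []).append(shelf)
    return result


store_lookup = shelves_in_store(sku_shelf_links, store_shelves)

# One row per SKU; an empty string where the store has no shelf for it.
# Joining several shelves with ", " is an assumption for illustration.
linked = [(sku, ", ".join(store_lookup.get(sku, []))) for sku in all_skus]

for row in linked:
    print(row)
```

With the sample rows shown, only SKU 111 ends up with a shelf code (AB01), because it is the only SKU linked to a shelf that appears in the store's list.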

Related

How does efficient_apriori package code itemsets?

I am doing association rule mining using the efficient_apriori package in Python. I found a very useful answer on how to convert the output to a dataframe.
However, I am struggling with the itemset output and hope someone can help me parse it correctly. I am guessing the LHS is an index value, but I can't make sense of the decimal values in the RHS. Does anyone know how the encoding is done? I have tried the same with SKU descriptions and get the same output.
Input dataframe looks like this:
| SKU | Count | Percent |
|----------------------------------------------------------------------|-------|-------------|
| "('000000009100000749',)" | 110 | 0.029633621 |
| "('000000009100000749', '000000009100000776')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000776', '000000009100002260')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000777', '000000009100002260')" | 1 | 0.000269397 |
| "('000000009100000749', '000000009100000777', '000000009100002530')" | 1 | 0.000269397 |
Output looks like this:
| | lhs | rhs | count_full | count_lhs | count_rhs | num_transactions | confidence | support |
|---|-----------------------------|-----------------------------|------------|-----------|-----------|------------------|------------|-------------|
| 0 | "(1,)" | "(0.00026939655172413793,)" | 168 | 168 | 168 | 297 | 1 | 0.565656566 |
| 1 | "(0.00026939655172413793,)" | "(1,)" | 168 | 168 | 168 | 297 | 1 | 0.565656566 |
| 2 | "(2,)" | "(0.0005387931034482759,)" | 36 | 36 | 36 | 297 | 1 | 0.121212121 |
| 3 | "(0.0005387931034482759,)" | "(2,)" | 36 | 36 | 36 | 297 | 1 | 0.121212121 |
| 4 | "(3,)" | "(0.0008081896551724138,)" | 21 | 21 | 21 | 297 | 1 | 0.070707071 |
Could someone help me understand what's outputting in the LHS and RHS columns, and how to join that back to the 'SKU'? Ideally would like the output to have the 'SKU' instead of whatever is showing up.
I have looked at the documentation and it is quite sparse.
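One possible reading of the output (an assumption, not something the question confirms): the values 1 and 0.00026939655172413793 in lhs and rhs look like the Count and Percent values of the input rows themselves, which is what would happen if each dataframe row was handed to the algorithm as a transaction. If that is the case, the transactions would need to be rebuilt from the SKU column instead. A minimal sketch of that step, using only the standard library; the function name is hypothetical, and treating Count as "number of baskets with exactly this combination" is also an assumption:

```python
import ast

# (SKU string, Count) pairs copied from the first rows of the input table.
rows = [
    ("('000000009100000749',)", 110),
    ("('000000009100000749', '000000009100000776')", 1),
    ("('000000009100000749', '000000009100000776', '000000009100002260')", 1),
]


def to_transactions(rows):
    """Turn stringified SKU tuples into a list of real tuples of SKU codes."""
    transactions = []
    for sku_text, count in rows:
        items = ast.literal_eval(sku_text)    # "('a', 'b')" -> ('a', 'b')
        transactions.extend([items] * count)  # repeat once per basket (assumed)
    return transactions


transactions = to_transactions(rows)
print(len(transactions), transactions[0])
```

A list of tuples like this is the kind of input that can then be mined, so that the rules come back in terms of SKU codes rather than Count and Percent values.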

How can I compare two spreadsheets to see if column a is a match AND they overlap ranges =true?

I have two spreadsheets: one represents locations where the road was recently repaired, and the second shows all eligible roads based on the road's speed limit. The first spreadsheet has a list of IDs (Column B) and a beginning point (Column E) and ending point (Column F) for each repair location. The second spreadsheet may have multiple matches for each ID (Column A), with the eligible beginning points (Column P) and ending points (Column Q).
I want to compare to see if any portions of the eligible roads are already on the recently repaired list.
Completed repairs = 18SealCoatMap where B=Highway Name, E=Beginning Limit, and F=Ending Limit.
| County | Highway | BDFO | EDFO |
|-----------|-----------|--------|--------|
| Guadalupe | FM0078-KG | 13.064 | 14.018 |
| Guadalupe | FM0078-KG | 14.018 | 14.848 |
| Guadalupe | FM0078-KG | 14.848 | 18.991 |
| Guadalupe | FM0465-KG | 0 | 3.342 |
Eligible repairs = MLOVER45 where A=Highway Name, B=Line ID, P=Beginning Limit, and Q=Ending Limit.
| Lane | ID | Highway | SpeedLimit | Begin_DFO | End_DFO |
|-----------|----|---------|------------|-----------|---------|
| FM0078-KG | 1 | FM0078 | 50 | 13.064 | 14.018 |
| FM0078-KG | 2 | FM0078 | 55 | 14.845 | 14.848 |
| FM0078-KG | 3 | FM0078 | 50 | 14.018 | 14.845 |
| FM0078-KG | 4 | FM0078 | 55 | 14.848 | 15.006 |
So far, I'm only working with the beginning point of each eligible location. When I get a working formula, I'll copy it for the ending location.
Here's a more varied example...
Eligible Locations:
| Lane | ID | Highway | SpeedLimit | Begin_DFO | End_DFO |
|-----------|-----|---------|------------|-----------|---------|
| FM0791-KG | 369 | FM0791 | 70 | 0 | 6.909 |
| FM0791-KG | 372 | FM0791 | 70 | 6.909 | 18.603 |
| FM0791-KG | 377 | FM0791 | 55 | 19.286 | 19.486 |
| FM0791-KG | 378 | FM0791 | 70 | 19.486 | 30.971 |
Completed Locations:
| County | Highway | BDFO | EDFO |
|----------|-----------|--------|--------|
| Atascosa | FM0791-KG | 21.619 | 23.196 |
| Atascosa | FM0791-KG | 21.619 | 23.196 |
| McMullen | FM0791-KG | 0.000 | 7.017 |
| McMullen | FM0791-KG | 0.000 | 7.017 |
| McMullen | FM0791-KG | 2.190 | 2.760 |
| McMullen | FM0791-KG | 2.190 | 2.760 |
I tried the following formula but every location came back true:
=IF(A2='18SealCoatMap'!B2:B345,AND(MLOVER45!P2>'18SealCoatMap'!E2:E345,MLOVER45!P2<'18SealCoatMap'!F2:F345),TRUE)
Then I tried:
=INDEX('18SealCoatMap'!B2:B345,MATCH(A2,IF(P2>'18SealCoatMap'!E2:E345,P2<'18SealCoatMap'!F2:F345)),2)
but all of the results came back #N/A
I expect the outcome to be the ID number for the eligible location (or TRUE) if there's a match so that I can schedule repairs for all locations that do not already fall within the limits. Based on the results, I will then schedule locations that are entirely or partially due for repair.
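The underlying test can be stated independently of Excel: two segments on the same highway overlap when each one begins before the other ends. Here is a minimal sketch of that check in Python, run against the "more varied example" rows above. The function names are made up for illustration, and using strict inequalities (so that segments that merely touch at an endpoint do not count as overlapping) is an assumption about what is wanted.

```python
def overlaps(begin_a, end_a, begin_b, end_b):
    """True when the two ranges share more than a single endpoint."""
    return begin_a < end_b and begin_b < end_a


# (Highway, BDFO, EDFO) rows from the Completed Locations table, duplicates dropped.
completed = [
    ("FM0791-KG", 21.619, 23.196),
    ("FM0791-KG", 0.000, 7.017),
    ("FM0791-KG", 2.190, 2.760),
]

# (Lane, ID, Begin_DFO, End_DFO) rows from the Eligible Locations table.
eligible = [
    ("FM0791-KG", 369, 0, 6.909),
    ("FM0791-KG", 372, 6.909, 18.603),
    ("FM0791-KG", 377, 19.286, 19.486),
    ("FM0791-KG", 378, 19.486, 30.971),
]


def already_repaired(eligible, completed):
    """IDs of eligible segments that overlap any completed repair on the same highway."""
    flagged = []
    for lane, line_id, begin, end in eligible:
        if any(lane == hwy and overlaps(begin, end, b, e) for hwy, b, e in completed):
            flagged.append(line_id)
    return flagged


print(already_repaired(eligible, completed))  # [369, 372, 378]
```

Note that both the beginning and the ending point take part in the comparison at once; testing the beginning point alone would miss segment 378, which starts before a repaired stretch and ends after it.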

How to transpose all subfields in front of the parent field of a pivot table in Excel

I have data on automotive spare parts and their multiple storage locations in a warehouse.
All I want to do is put the locations next to the part number, so that it is easy to see all the locations of a specific part number.
The current pivot data looks like this
I've manually transposed a few rows in the image below, but the data contains around 70K rows, so I'm looking for a better solution.
Kindly refer to the table below
+--------------+-----+-------+-------------+
| Item name | Qty | UoM | Stock |
+--------------+-----+-------+-------------+
| '0450000115 | 324 | piece | G12B04 |
| '0450000A61 | 312 | piece | G12B05 |
| '0450000115 | 336 | piece | G12B06 |
| '0450000A61 | 228 | piece | G12B07 |
| '0450000115 | 336 | piece | G12B08 |
| '0450000115 | 192 | piece | G12B09 |
| '087902E200A | 470 | piece | G12B10 |
| '087902E200A | 760 | piece | G12B13 |
| '087902E200A | 759 | piece | G12B14 |
| '0450000115 | 336 | piece | G12B15 |
| '087902E200A | 400 | piece | G12B16 |
| '087902E200A | 10 | piece | G3B32 |
| '084B410426 | 100 | piece | G3B32 |
| '087902E200A | 300 | piece | G4B08 |
| '0450000A61 | 2 | piece | GDB01 |
| '084B410426 | 60 | piece | GR.04.C.04. |
| '087902E200A | 327 | piece | HD.03.K.05. |
+--------------+-----+-------+-------------+
You need to create a measure using the CONCATENATEX function. For this, you need to add your data to the Data Model. You can do this by checking the box "Add this data to the Data Model" at the bottom of the Create PivotTable dialog box.
Right-click the table in the PivotTable Fields pane and select Add Measure. Then create the following measure, named StockText: = CONCATENATEX('table','table'[Stock],", ")
Now put [Item name] on Rows and the measure [StockText] on Values. This should be the result:
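The shape of that result, one row per part number with its locations joined by ", ", can also be sketched outside Excel. The snippet below is only an illustration of what the measure computes per row, written in Python with a few rows from the table above; the function name is hypothetical.

```python
# (Item name, Stock location) pairs taken from the sample table.
rows = [
    ("0450000115", "G12B04"),
    ("0450000A61", "G12B05"),
    ("0450000115", "G12B06"),
    ("084B410426", "G3B32"),
    ("084B410426", "GR.04.C.04."),
]


def locations_per_item(rows):
    """Group locations by item and join them with ', ' like the measure does."""
    grouped = {}
    for item, location in rows:
        grouped.setdefault(item, []).append(location)
    return {item: ", ".join(locations) for item, locations in grouped.items()}


for item, text in locations_per_item(rows).items():
    print(item, "->", text)
```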

How to use grep for such a complicated expression?

+----+-------+-----+
| ID | STORE | QTY |
+----+-------+-----+
| | | |
| 9 | 101 | 18 |
| | | |
| 8 | 154 | 19 |
| | | |
| 7 | 111 | 13 |
| | | |
| 9 | 154 | 18 |
| | | |
| 8 | 101 | 19 |
| | | |
| 7 | 101 | 13 |
| | | |
| 9 | 111 | 18 |
| | | |
| 8 | 111 | 19 |
| | | |
| 7 | 154 | 14 |
+----+-------+-----+
Suppose that I have 3 stores, and I'd like to pick out every ID whose QTY is the same in every store.
E.g. ID 9 is in 3 stores and has a QTY of 18 in every store,
but ID 7 has an equal QTY in only two stores (111 and 101; in store 154 it has a QTY of 14). How can I get that result using grep?
Do you think it is impossible to do this in one expression? I thought about a regex, but I don't know how I would capture QTY and compare it to another row. In my file it looks like:
Extract the first and last columns (ID and QTY) with cut, count how often each unique combination occurs, and output only those whose count is 3 (i.e. the value is the same for all three stores):
$ cut -d\| -f2,4 | sort | uniq -c | grep '^ *3 '
3 8 | 19
3 9 | 18
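The same counting idea can be written out step by step, which may make it clearer why a count of 3 means "same QTY in all three stores". This is a minimal sketch in Python using the rows from the table above; the function name and the store_count parameter are made up for illustration.

```python
from collections import Counter

# (ID, STORE, QTY) rows from the table in the question.
rows = [
    (9, 101, 18), (8, 154, 19), (7, 111, 13),
    (9, 154, 18), (8, 101, 19), (7, 101, 13),
    (9, 111, 18), (8, 111, 19), (7, 154, 14),
]


def ids_with_same_qty(rows, store_count=3):
    """Return (ID, QTY) pairs that occur once per store, i.e. store_count times."""
    # Dropping STORE mirrors `cut -f2,4`; Counter mirrors `sort | uniq -c`.
    counts = Counter((item_id, qty) for item_id, _store, qty in rows)
    # Keeping only full counts mirrors the final grep on the count column.
    return sorted(pair for pair, n in counts.items() if n == store_count)


print(ids_with_same_qty(rows))  # [(8, 19), (9, 18)]
```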

Adding Columns to Excel As List From Other Sheet Grows

Background
I'm creating a grade book in Excel for my wife. I have sheets for the overall grade, classwork, exams, and participation.
The three sections of work (classwork, exams, and participation) each have a variable number of items, and each item has a different number of points possible. Each section has a weight in the overall grade.
I have this up and running with a fixed number of items per section, but I'd like to create a template that can be updated from class to class and year to year.
Here's the problem:
On the classwork sheet, I'd like to be able to enter new assignments and their point value and have that automatically update the master grade sheet on my first sheet tab. Is there any way to add columns in a section of one worksheet (the master grade sheet) when new rows are added to another worksheet (the list of assignments)?
It is possible to achieve this without using VBA. The reason you will have difficulty achieving this, however, is that you've violated normal form in the table you've already built. It appears the pertinent data you're looking for is each student's score on each assignment. If this is correct, the level of granularity you will want is the Assignment, not the Student.
There are some fairly quick ways to modify your existing work to account for this. I've written out some sample data below. Take a look and see if it helps.
Sample Original Table
+---------+------+------------+------------+
| Student | Quiz | Thumbnails | Watercolor |
+---------+------+------------+------------+
| Paul | 3 | 10 | 90 |
| Frank | 4 | 10 | 95 |
| Mary | 5 | 10 | 70 |
| Ellen | | 10 | 85 |
| Sue | 6 | 10 | 92 |
| Anton | 5 | 10 | 87 |
+---------+------+------------+------------+
Image of the data is below ( note I have highlighted the blank value ).
Sample Normal Table
+---------+-------------+-----------+-------+
| Student | Assignment | New_Score | Score |
+---------+-------------+-----------+-------+
| Paul | Quiz | | 3 |
| Frank | Quiz | | 4 |
| Mary | Quiz | | 5 |
| Ellen | Quiz | | 0 |
| Sue | Quiz | | 6 |
| Anton | Quiz | | 5 |
| Paul | Thumbnails | | 10 |
| Frank | Thumbnails | | 10 |
| Mary | Thumbnails | | 10 |
| Ellen | Thumbnails | | 10 |
| Sue | Thumbnails | | 10 |
| Anton | Thumbnails | | 10 |
| Paul | Watercolor | | 90 |
| Frank | Watercolor | | 95 |
| Mary | Watercolor | | 70 |
| Ellen | Watercolor | | 85 |
| Sue | Watercolor | | 92 |
| Anton | Watercolor | | 87 |
| Mary | ExtraCredit | 10 | 10 |
| Ellen | ExtraCredit | 8 | 8 |
| Sue | ExtraCredit | 9 | 9 |
| Anton | ExtraCredit | 10 | 10 |
+---------+-------------+-----------+-------+
Image of the data is below. The Score column reaches back to your old table and grabs the score you've already entered for each student, so you won't have to do this all manually. The formula for this is =INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0)).
This assumes you've formatted the old data into an Excel DataTable ( ctrl+t ) and named it non_normal ( alt+j+t+i ). Note the unsubmitted assignment for Ellen comes through with a score of zero using this method. I've added a column named New_Score so that you are able to add new student-assignment submission combinations to the table without having to modify your old non_normal table ( which was the trouble in the OP ). With this column added, the formula in the Score column can be changed to =IF(NOT(ISBLANK([@[New_Score]])),[@[New_Score]],INDEX(non_normal,MATCH([@Student],non_normal[Student],0),MATCH([@Assignment],non_normal[#Headers],0))) which will take the New_Score value if available and the original score if not.
The orange cells are new student-assignment submission combinations. Note you do not need to add a row for every student, just add a row whenever a student submits an assignment.
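The rule behind the Score column (use New_Score when it is filled in, otherwise fall back to the old wide table, with a blank counting as zero) can be sketched as follows. This is an illustration in Python, not the workbook itself; the data is a subset of the sample tables above and the names wide, new_scores and score are hypothetical.

```python
# Old "wide" table: one row per student, one column per assignment.
wide = {
    "Paul":  {"Quiz": 3,    "Thumbnails": 10, "Watercolor": 90},
    "Ellen": {"Quiz": None, "Thumbnails": 10, "Watercolor": 85},  # blank Quiz
}

# New submissions entered directly in the normalized table.
new_scores = {("Ellen", "ExtraCredit"): 8}


def score(student, assignment):
    """New_Score wins if present; otherwise look up the old table (blank -> 0)."""
    new = new_scores.get((student, assignment))
    if new is not None:
        return new
    old = wide.get(student, {}).get(assignment)
    return 0 if old is None else old


# Build the normalized (Student, Assignment, Score) rows.
pairs = [(s, a) for s, cols in wide.items() for a in cols] + list(new_scores)
normal = [(s, a, score(s, a)) for s, a in pairs]

for row in normal:
    print(row)
```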
Sample Assignments Table
+-------------+-----------------+
| Assignment | Points_Possible |
+-------------+-----------------+
| Quiz | 6 |
| Thumbnails | 10 |
| Watercolor  | 100             |
| ExtraCredit | |
+-------------+-----------------+
I've added the ExtraCredit assignment with a possible max score of zero/blank ( since not completing extra credit shouldn't count against a student )
Payoff - Back to the Original Table
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Sum of Score | Column Labels | | | | | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Row Labels | Quiz | Thumbnails | Watercolor | ExtraCredit | Grand Total | |
+--------------+---------------+------------+------------+-------------+-------------+--------+
| Anton | 5 | 10 | 87 | 10 | 112 | 96.6% |
| Ellen | 0 | 10 | 85 | 8 | 103 | 88.8% |
| Frank | 4 | 10 | 95 | | 109 | 94.0% |
| Mary | 5 | 10 | 70 | 10 | 95 | 81.9% |
| Paul | 3 | 10 | 90 | | 103 | 88.8% |
| Sue | 6 | 10 | 92 | 9 | 117 | 100.9% |
+--------------+---------------+------------+------------+-------------+-------------+--------+
Using the image below, you pivot your newly normalized data into a Pivot Table. ( alt+n+v ). Now, simply adding a new assignment to the normal_assignment DataTable will cause that assignment to appear in a new column when you refresh the Pivot Table ( alt+a+r+a ).
The % score on the right of the Pivot Table is calculated using the following formula ( with the sample Pivot Table starting in cell $M$2 ): =GETPIVOTDATA("Score",$M$2,"Student",M4)/SUM(assignment[Points_Possible])
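As a quick check of that formula, the percentage is the student's total score divided by the sum of Points_Possible, where the blank ExtraCredit entry adds nothing to the denominator. A small sketch in Python with two rows from the sample pivot table; the names are made up for illustration.

```python
# Points_Possible per assignment; ExtraCredit is blank, so it adds nothing.
points_possible = {"Quiz": 6, "Thumbnails": 10, "Watercolor": 100, "ExtraCredit": None}

# Scores for two students, taken from the sample pivot table.
scores = {
    "Anton": {"Quiz": 5, "Thumbnails": 10, "Watercolor": 87, "ExtraCredit": 10},
    "Sue":   {"Quiz": 6, "Thumbnails": 10, "Watercolor": 92, "ExtraCredit": 9},
}


def percent(student):
    """Total score over total possible points, as a percentage with one decimal."""
    total = sum(scores[student].values())
    possible = sum(p for p in points_possible.values() if p is not None)
    return round(100 * total / possible, 1)


print(percent("Anton"))  # 96.6
print(percent("Sue"))    # 100.9
```

Because extra credit adds to the numerator only, a student can end up above 100%, as Sue does in the sample.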
I've uploaded the raw sample file for this to my public repo if you'd like to pull it and take a peek at the source. Credit to sensefulsolutions for text-to-table conversion.
Hope this is what you need!

Resources