Merge two related data sets by common value, LibreOffice/Excel or CSV

I have two datasets that are stored separately but are related: they describe the same phenomenon from different perspectives and in different ways.
The encoding is not really consequential; here they are rendered as Excel/LibreOffice sheets, but I can also get them as CSV.
One "sheet", Sheet I, looks like this:
and Sheet II:
Using the field submission # as the unifier, I want to create a single sheet that associates the related blue fields with the corresponding pink field.
For example, the final result should look like this:
Here is a link to those toy examples.

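In cell H2 of Sheet1, enter:
=VLOOKUP($B2,Sheet2!$A:$F,COLUMN(H2)-6,0)
The $ signs fix the lookup table (Sheet2!$A:$F) and the lookup column (B), while the row number stays relative. COLUMN(H2)-6 evaluates to 2, so H2 returns the second column of the table, and the index grows by one for each column you fill to the right.
You can drag the formula down and to the right with the fill handle, the black plus sign that appears when you point at the bottom-right corner of the cell.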
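Since the data is also available as CSV, the same lookup can be sketched outside the spreadsheet. This is a minimal sketch, not the asker's actual data: only the key name "submission #" comes from the question, and every other column name and value is a made-up placeholder. Unlike VLOOKUP, which returns #N/A for a missing key, this sketch leaves the cell empty.

```python
import csv
import io

# Hypothetical toy data standing in for the two sheets exported as CSV.
SHEET_1 = """id,submission #,score
1,101,7
2,102,9
3,103,4
"""
SHEET_2 = """submission #,title,author
102,Beta,Lee
101,Alpha,Kim
"""

def merge_by_key(left_text, right_text, key):
    """Left-join two CSV texts on `key`, similar to an exact-match VLOOKUP."""
    right_rows = list(csv.DictReader(io.StringIO(right_text)))
    extra_fields = [f for f in right_rows[0] if f != key] if right_rows else []
    lookup = {}
    for row in right_rows:
        lookup.setdefault(row[key], row)  # keep the first match, as VLOOKUP does
    merged = []
    for row in csv.DictReader(io.StringIO(left_text)):
        match = lookup.get(row[key], {})
        for field in extra_fields:
            row[field] = match.get(field, "")  # empty string when no match
        merged.append(row)
    return merged

merged_rows = merge_by_key(SHEET_1, SHEET_2, "submission #")
for row in merged_rows:
    print(row)
```

The dictionary lookup plays the role of the fixed table reference, so the order of rows in the second sheet does not matter.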


Aligning vertically a series of tables with text

Hi, I need the text in a specific format in a spreadsheet so that I can upload it to a translation tool.
I have already used the text split function to separate the text in a cell with bullet points, moving each bullet point to a separate cell.
Then I used the transpose function to separate each set of data. For context, you are looking at fashion products.
The name of the product is on the first row, followed by a list of features (e.g. "Bracciale" means bracelet and it is followed by the list of materials)
Now for the last step, I need these sets to be vertical, not horizontal. Like this:
I would like to set up an automatic system so that every time we receive a list with hundreds of these products we do not need to copy-paste them one below the other.
With pivot tables maybe? Keep in mind that if it is too complex it might be hard to train the translators to do it each time. Please let me know your suggestions. Thank you!
I am not a programmer. I tried pivot tables but the data was in the wrong order and I am not sure how to get the data out from the pivot table with values only without the sub-menus.
My suggestion would be to use the 'Unpivot Columns' feature in the Power Query Editor - it would be really simple.
Steps:
1. Select the whole range
2. Go to Data // Get & Transform Data // From Table/Range
3. Uncheck 'My Table has headers' (unless it does, but it doesn't look like it)
4. Press OK. This opens the Power Query Editor, which will have given you column names Column1, Column2, Column3, etc., but ignore that.
5. Go to Add Column // Index column
6. Select all columns EXCEPT the new index column by Shift+clicking on those headers
7. Go to Transform // Unpivot Columns
8. Assuming the order is important, click in the Attribute column and Sort Ascending
9. Click in the Index column and Sort Ascending
10. Remove the Attribute and Index columns if you want (right click header)
11. Go to File // Close & Load
You will get a new table, dynamically linked to the first (i.e. it can be updated/refreshed), in the unpivoted format.
Let me know if you need more details / screenshot?
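For readers who want to see what the unpivot produces, here is a rough Python sketch of the same reshaping. It assumes each product occupies one column with its name in the first row, which is how I read the question. The sample values other than "Bracciale" are invented, and `stack_columns` is just an illustrative name.

```python
def stack_columns(table):
    """Return the non-empty cells of `table`, column by column, as one vertical list."""
    width = max((len(row) for row in table), default=0)
    stacked = []
    for col in range(width):          # corresponds to sorting by Attribute first
        for row in table:             # then by Index within each column
            value = row[col] if col < len(row) else None
            if value not in (None, ""):   # unpivoting drops empty cells
                stacked.append(value)
    return stacked

# Hypothetical sample: each column is one product, names in the first row.
sample = [
    ["Bracciale", "Collana"],
    ["Oro", "Argento"],
    ["Perle", ""],
]
stacked_sample = stack_columns(sample)
print("\n".join(stacked_sample))
```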
Based on this trick, maybe the following is helpful:
Formula in A5:
=DROP(REDUCE(0,A1:A3,LAMBDA(a,b,VSTACK(a,TEXTSPLIT(b,,HSTACK(CHAR(10),"^"),1)))),1)
TEXTSPLIT() will use a combination of newline chars and the circumflex to split the input directly into a vertical array;
Iteration in REDUCE() will allow for stacked results;
DROP() the initial value from results.
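To make the three steps concrete, here is a small Python sketch of the same idea: split every cell on newlines and the circumflex, skip empty pieces, and stack the results. The input strings are invented for illustration, and `split_and_stack` is a hypothetical helper name, not a spreadsheet function.

```python
import re

def split_and_stack(cells, delimiters=("\n", "^")):
    """Split each cell on any of the delimiters and stack the non-empty parts."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    stacked = []
    for cell in cells:
        # Skipping empty strings mirrors the ignore_empty argument of TEXTSPLIT
        stacked.extend(part for part in re.split(pattern, cell) if part != "")
    return stacked

# Hypothetical cell contents standing in for A1:A3.
cells = ["Bracciale^Oro\nPerle", "Collana^^Argento"]
stacked_parts = split_and_stack(cells)
print(stacked_parts)
```

Starting from an empty list means there is no initial value to drop, which is the one place where this differs from the REDUCE() version.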

How to Perform Row-by-Row List Operations in Power Query

I'm trying to compare the comma-separated values in one column with the comma-separated values in another column of the same row in Power Query. I need to ensure that all the values in one column are also in the other.
I tried using List.ContainsAll, but it seems like the syntax I'm using is not working. The solution shared here is very close to what I need, but it's comparing all values in a column, not the cell's values.
Here is my sample code, but I think this picture explains the parent-child columns better. This picture shows another scenario where the function also needs to work.
Table.AddColumn(#"Replaced Value", "Contractual and Technical Types Match?", each List.ContainsAll({[Technical Turbine Type]},{[Contractual Turbine Type]}))
You say your column contains a list, but your image is not showing a list. Your image is showing text with commas separating the values. This is what a list looks like:
Assuming you really have columns of comma-separated text, the following ensures that everything in the Contractual Turbine Type column is also in the Technical Turbine Type column.
Add a custom column with the formula:
= List.ContainsAll(
List.Transform(Text.Split([Technical Turbine Type],","), each Text.Trim(_)),
List.Transform(Text.Split([Contractual Turbine Type],","), each Text.Trim(_))
)
You could just use this if you are not worried about spaces after the commas:
= List.ContainsAll(
Text.Split([Technical Turbine Type],","),
Text.Split([Contractual Turbine Type],",")
)
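If it helps to see the logic outside of M, the row-by-row check is a subset test on the trimmed items. This is only an illustrative Python sketch: the two column names come from the question, while the sample values are made up.

```python
def contains_all(technical, contractual):
    """True when every comma-separated item of `contractual` appears in `technical`."""
    technical_items = {item.strip() for item in technical.split(",")}
    contractual_items = {item.strip() for item in contractual.split(",")}
    return contractual_items <= technical_items

# Hypothetical rows; each dict is one row of the table.
rows = [
    {"Technical Turbine Type": "T1, T2, T3", "Contractual Turbine Type": "T1,T3"},
    {"Technical Turbine Type": "T1", "Contractual Turbine Type": "T1, T4"},
]
matches = [
    contains_all(r["Technical Turbine Type"], r["Contractual Turbine Type"])
    for r in rows
]
print(matches)
```

The `strip()` calls correspond to Text.Trim; without them "T1" and " T1" would count as different items.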

How do I sum all the columns with the same header into one column in excel?

I am using a football dataset in which I have replaced the countries in the column headers with geographical areas. I want to sum all the columns that share the same geographical header into a single column.
This is what my data looks like:
The value of #players should still remain the same after condensing the data. I tried using the data function, but I could not figure it out.
My output should ideally look like a column with all the EUs added up, the AFRs added up etc.
It's actually unclear as to exactly what it is you're trying to do.
Here is an example use of SUMIF that addresses one possible interpretation:
=SUMIF($B$1:$E$1,F$1,$B2:$E2)
And here is an image showing how and where that formula would be used in this sample table:
If this doesn't address what you're aiming for, could you make an effort to do as I have done and show a small sample with the expected results?
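Under the same interpretation, the SUMIF can be pictured as grouping each row's values by their header. A minimal Python sketch, with invented numbers and only the EU/AFR labels taken from the question:

```python
def sum_by_header(headers, rows):
    """For each row, add up the values that share the same column header."""
    totals = []
    for row in rows:
        row_totals = {}
        for header, value in zip(headers, row):
            row_totals[header] = row_totals.get(header, 0) + value
        totals.append(row_totals)
    return totals

# Hypothetical sample: four columns, two distinct geographic headers.
headers = ["EU", "AFR", "EU", "AFR"]
rows = [
    [3, 1, 2, 0],
    [0, 4, 5, 1],
]
condensed = sum_by_header(headers, rows)
print(condensed)
```

Because every value lands in exactly one group, the row total is unchanged after condensing, which is the property the question asks for.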
I was able to solve it using the SUM function.
I used SUM(C2,C6,C7....) over every column that had EU in its header, and did the same for the other headers.

Match and Conditional Formatting from Matrix Table

I am looking for help with my matrix table: is there a good approach for matching dependent instances in a matrix using drop-downs?
This picture represents my matrix table (Picture 1):
As you can see, there are a lot of instances, but horizontally and vertically they have the same "headers". In my case the 1s represent incompatibility, but let's simply call it a "match". The matrix is on one sheet that will be populated with new values from time to time.
Another sheet, which shows the data and their compatibility possibilities, is equipped with drop-downs. There you have "Groups (Group1, Group2...)", meaning main parts, and "dependent groups (AA1, BB2..)", the small components that belong to the main parts. To avoid misunderstanding, here are the explanations; for the sake of this example I used fictional values:
Groups aka. Main Parts
Dependent groups aka. components
Below is my fictional table, which follows exactly the same concept as my real case.
I PUT AN EXPLANATION IN THE PICTURE 2 SO YOU CAN FOLLOW ALONG AND SEE EXACTLY WHERE/WHAT I DID!
First I used =match functions, one for the vertical position (A3) and one for the horizontal position (B4). The boolean row is computed with =or(index), referring to the match positions, as you can see. From there I use true/false to color my group boxes in case compatibility is possible; that's all the science.
So my question is whether there is another approach to this problem. As you can see, I have 3 rows of functions in one place, and with more "groups" that could grow into many more rows and calculations.
Picture 2
EDITED:
This is a screenshot of the original sheet. I hid some rows that contained info, which is why the row numbers are not consistent. As you can see, it is almost the same as the dummy example I provided above. Underneath every "box" there are three rows of calculations, as I mentioned before. The two occurrences of the number "2" that you see here are the positions of a value that I found using the =match function, one for the horizontal and one for the vertical lookup. In this case it is the model type: 070FX is position 2, 100FX is position 3 and 200FX is position 4 in the matrix table, and so on for all the other groups. Those groups (Model, Endpoint, Gas sensor...) are defined separately on another sheet, where I had to make a unique list and a dependent list so that I can reference them in my drop-down lists.
EDIT Nr 4! This is the formula I used for true/false:
=SUMPRODUCT(('0359-matrix'!$A$2:$A$101=F10)*(('0359-matrix'!$B$1:$CW$1=$B$10)+('0359-matrix'!$B$1:$CW$1=$C$10)+('0359-matrix'!$B$1:$CW$1=$D$10)+('0359-matrix'!$B$1:$CW$1=$E$10)+('0359-matrix'!$B$1:$CW$1=$F$10)+('0359-matrix'!$B$1:$CW$1=$G$10)+('0359-matrix'!$B$1:$CW$1=$H$10)+('0359-matrix'!$B$1:$CW$1=$I$10)+('0359-matrix'!$B$1:$CW$1=$J$10)+('0359-matrix'!$B$1:$CW$1=$K$10)+('0359-matrix'!$B$1:$CW$1=$L$10)+('0359-matrix'!$B$1:$CW$1=$M$10)+('0359-matrix'!$B$1:$CW$1=$N$10)+('0359-matrix'!$B$1:$CW$1=$O$10)+('0359-matrix'!$B$1:$CW$1=$P$10)+('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=F13)+('0359-matrix'!$B$1:$CW$1=G13)+('0359-matrix'!$B$1:$CW$1=H13)+('0359-matrix'!$B$1:$CW$1=I13)+('0359-matrix'!$B$1:$CW$1=J13))*'0359-matrix'!$B$2:$CW$101)>0
I copied only the last part, starting from the second row, because the whole function is too long to write out and gets cut off automatically.
('0359-matrix'!$B$1:$CW$1=$Q$10)+('0359-matrix'!$B$1:$CW$1=$B$13)+('0359-matrix'!$B$1:$CW$1=$C$13)+('0359-matrix'!$B$1:$CW$1=$D$13)+('0359-matrix'!$B$1:$CW$1=$E$13)+('0359-matrix'!$B$1:$CW$1=$F$13))*'0359-matrix'!$B$2:$CW$101)>0
But in the marked cells I am getting the same results: B22 - F22 show the same boolean values as B21 - F21, which should not happen. Going by the colors, green is False. It must have something to do with an array reference.
Check out the following. A1 to E5 is the matrix that shows which pieces are incompatible (=1). The other cells have to be empty or 0.
In cell I8 I used the following formula (and copied it down up to I11):
=SUMPRODUCT(($A$2:$A$5=H8)*(($B$1:$E$1=$H$8)+($B$1:$E$1=$H$9)+($B$1:$E$1=$H$10)+($B$1:$E$1=$H$11))*$B$2:$E$5)
The formula result shows the number of incompatibilities a part has. E.g. AA1 has one incompatibility (with BB2), but BB2 has two (with AA1 and CC3).
To get TRUE/FALSE, use the same formula and append >0, like =SUMPRODUCT(…)>0
For any additional "group" (Model, Endpoint, …) you need to add another +($B$1:$E$1=$H$12), where $B$1:$E$1 points to your matrix headers and $H$12 to your selected group value.
Overview of the formula ranges:
Note that this kind of calculation can only tell you the number of incompatibilities a part has, but not the names of the parts that are incompatible.
Edited horizontal version
Formula in the selected cell is
=SUMPRODUCT(($A$2:$A$5=G8)*(($B$1:$E$1=$G$8)+($B$1:$E$1=$H$8)+($B$1:$E$1=$I$8)+($B$1:$E$1=$J$8))*$B$2:$E$5)
You can drag it to the right.
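What the SUMPRODUCT does can be written out as a plain lookup, which may make it easier to reason about when adding groups. This is a sketch under assumptions: the matrix below encodes only the AA1/BB2/CC3 example from the answer, DD4 is a made-up fourth part, and the function name is illustrative.

```python
def count_incompatibilities(matrix, part, selected):
    """Sum the matrix flags in `part`'s row for every currently selected value."""
    row = matrix.get(part, {})
    # One lookup per selected value, like one +(header=cell) term in the formula
    return sum(row.get(other, 0) for other in selected)

# Hypothetical matrix: 1 marks an incompatible pair, missing entries count as 0.
matrix = {
    "AA1": {"BB2": 1},
    "BB2": {"AA1": 1, "CC3": 1},
    "CC3": {"BB2": 1},
    "DD4": {},
}
selected = ["AA1", "BB2", "CC3", "DD4"]  # values chosen in the drop-downs

counts = {part: count_incompatibilities(matrix, part, selected) for part in selected}
flags = {part: count > 0 for part, count in counts.items()}  # the ">0" variant
print(counts)
print(flags)
```

Adding another group then just means appending its selected value to the list, instead of adding one more term to the formula.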

Chart always complains about invalid references - Excel 2007

I made an XY plot that shows points from one data set in two different colors, depending on a set of conditions. I achieved this by making the source table three columns instead of two. The first column is X. The second column is Y if one set of conditions applies; the third column is Y if the other set of conditions applies. So the second and third columns have formulas like this in them, respectively:
=IF(ConditionApplies,YValue,"")
=IF(ConditionApplies,"",YValue)
(So the graph actually has two series, each of which is not a contiguous block of numbers - each is interspersed with "nothing")
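To spell out the layout, here is the same split as a small Python sketch. The points and the threshold condition are invented placeholders for the real ConditionApplies logic, and None stands in for the empty string the formulas return.

```python
def split_series(points, condition_applies):
    """Split (x, y) points into two Y columns; the unused slot holds None."""
    first, second = [], []
    for x, y in points:
        if condition_applies(x, y):
            first.append(y)
            second.append(None)
        else:
            first.append(None)
            second.append(y)
    return first, second

# Hypothetical data and condition, for illustration only.
points = [(1, 10), (2, 25), (3, 5)]

def above_threshold(x, y):
    return y > 8

series_a, series_b = split_series(points, above_threshold)
print(series_a)
print(series_b)
```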
When I make a change that affects the ConditionApplies, the table reacts properly. Then I switch to the chart (on a different sheet) and it always says: "A formula in this worksheet contains one or more invalid references...". Click OK.
The chart itself always looks the way I would expect, with two different sets of points according to the Conditions I devised. If I inspect the data source fields, all the references are intact and proper.
Basically everything works, I would just like to avoid this annoying pop-up.
Had the same problem. Deleted a data column and the chart that referenced it kept complaining.
The solution was to move the chart to its own sheet, then copy the chart and put it back into the worksheet.
Hope it helps.
I 100% understand everything you've said here and, on the surface, it sounds like it's not any kind of bug. It seems like you are actually referencing something you shouldn't. If that's, in fact, the case that's obviously something you want to fix.
My first guess would be to look at your "ConditionApplies" formulas. Under certain cases, would they create invalid references (referencing data of the wrong type, dividing by zero, circular references, etc.)? The most common cause of problems like that is dragging formulas without having the "$" signs in the appropriate places, so your cell references change when you expected them to stay the same.
For example:
=SUM(A1:G25)
should be something like the following to prevent the column and row from incrementing when dragged:
=SUM($A$1:$G$25)
Recommendation
Look at the "ConditionApplies" formulas (or better yet, post them here) and aggressively place $ wherever it doesn't break things. Then "re-drag" your new formulas, updating the previous ones.
There is a Microsoft KB article, 931389, about this problem, with status "Confirmed, not fixed".
In my situation, with a chart and two series, the problem was solved by adding code that deletes every series in the SeriesCollection before adding new data:
' Delete series from the last to the first until none remain
While Sheets(3).ChartObjects(1).Chart.SeriesCollection.Count > 0
    Sheets(3).ChartObjects(1).Chart.SeriesCollection(Sheets(3).ChartObjects(1).Chart.SeriesCollection.Count).Delete
Wend
