Aligning vertically a series of tables with text - excel

Hi, I need the text in a spreadsheet to be in a specific format so that I can upload it to a translation tool.
I have already used the text split function to separate the text in a cell with bullet points, moving each bullet point to a separate cell.
Then I used the transpose function to separate each set of data. For context, you are looking at fashion products.
The name of the product is on the first row, followed by a list of features (e.g. "Bracciale" means bracelet and it is followed by the list of materials)
Now for the last step, I need these sets to be vertical, not horizontal, with each set placed below the previous one.
I would like to set up an automatic system so that every time we receive a list with hundreds of these products we do not need to copy-paste them one below the other.
With pivot tables maybe? Keep in mind that if it is too complex it might be hard to train the translators to do it each time. Please let me know your suggestions. Thank you!
I am not a programmer. I tried pivot tables but the data was in the wrong order and I am not sure how to get the data out from the pivot table with values only without the sub-menus.

My suggestion would be to use the 'Unpivot Columns' feature in the Power Query Editor - it would be really simple.
Steps:
Select the whole range
Go to Data // Get & Transform Data // From Table/Range
Uncheck 'My Table has headers' (unless it does - but doesn't look like it?)
Press OK. This opens the Power Query Editor, which will have assigned default column names (Column1, Column2, Column3, etc.), but ignore that.
Go to Add Column // Index column
Select all columns EXCEPT the new index column by Shift+clicking on those headers
Go to Transform // Unpivot Columns
Assuming the order is important, click in the Attribute column and Sort Ascending
Click in the Index column and Sort Ascending
Remove the Attribute and Index columns if you want (right click header)
Go to File // Close & Load
You will get a new table - dynamically linked to the first (ie. can be updated/refreshed) - in the unpivoted format.
Let me know if you need more details / screenshot?
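For readers who want to see what the unpivot and the two sorts amount to, here is a minimal Python sketch. It is not Power Query itself: the function name `stack_columns` and the sample data are made up for illustration, and it assumes each column holds one product set and that blank cells should be skipped.

```python
from typing import List, Optional


def stack_columns(grid: List[List[Optional[str]]]) -> List[str]:
    """Return the non-empty cells of grid, column by column, top to bottom."""
    n_cols = max((len(row) for row in grid), default=0)
    stacked = []
    for col in range(n_cols):      # outer loop: same role as sorting by Attribute
        for row in grid:           # inner loop: same role as sorting by Index
            value = row[col] if col < len(row) else None
            # Assumption: blank cells (None or "") are skipped, much like
            # Unpivot Columns drops null cells.
            if value not in (None, ""):
                stacked.append(value)
    return stacked


# Hypothetical sample: each column is one product, as in the transposed layout.
sample = [
    ["Bracciale", "Collana"],
    ["Oro", "Argento"],
    ["Perle", None],
]
result = stack_columns(sample)
print(result)
```

The first product and its features come out first, followed by the second product and its features, which is the vertical layout asked for in the question.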

Based on this trick, maybe the following is helpful:
Formula in A5:
=DROP(REDUCE(0,A1:A3,LAMBDA(a,b,VSTACK(a,TEXTSPLIT(b,,HSTACK(CHAR(10),"^"),1)))),1)
TEXTSPLIT() will use a combination of newline chars and the circumflex to split the input directly into a vertical array;
Iteration in REDUCE() will allow for stacked results;
DROP() the initial value from results.
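As a rough illustration of what the formula does, the same split-and-stack logic can be sketched in Python. The function name `split_and_stack` and the sample cells are hypothetical. The sketch assumes, as in the formula, that newline and "^" are the delimiters and that empty pieces are ignored.

```python
import re
from typing import List


def split_and_stack(cells: List[str]) -> List[str]:
    """Split every cell on newline or '^' and stack the pieces vertically."""
    stacked = []
    for cell in cells:                      # one pass per cell, like REDUCE()
        parts = re.split(r"[\n^]", cell)    # like TEXTSPLIT() with two delimiters
        # Skip empty pieces, like the ignore_empty argument set to 1.
        stacked.extend(part for part in parts if part != "")
    return stacked


# Hypothetical input standing in for A1:A3.
cells = ["Bracciale\nOro^Perle", "Collana^^Argento", "Anello"]
stacked = split_and_stack(cells)
print(stacked)
```

No DROP() equivalent is needed here because the Python list starts empty instead of starting from a placeholder value.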

Related

Extracting text in excel

I have some text which I receive daily that I need to separate. I have hundreds of lines similar to the extract below:
COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): COMPANY A OFFERS 1000KB AT $0.40
I need to extract individual snippets from this text, each in a separate cell. The result needs to be the date, month, company, size, and price. In this case, the result would be:
FEB50-40
APR
COMPANY A
100
0.40
The issue I'm struggling with is uniformity. For example, one line might have FEB50-FEB40, another FEB5-FEB40, or FEB50-FEB4. Another example giving me difficulty is that some rows might have 'COMPANY A' and others 'COMPANYA' (one word instead of two).
Any ideas? I've been trying combinations of the below but I'm not able to have uniform results.
=TRIM(MID(SUBSTITUTE($D7," ",REPT(" ",LEN($D7))), (5)*LEN($D7)+1,LEN($D7)))
=MID($D7,20,21-10)
=TRIM(RIGHT(SUBSTITUTE($D6,"$",REPT("$",2)),4))
Sometimes I get
'FEB40-50(' or 'FEB40-FEB5'
when it should be
'FEB40-FEB50'
Thank you to anyone who is able to help.
You might get to the limits of formulas with this scenario, but with Power Query you can still work.
As I see it, you want to apply the following logic to extract text from this string:
COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): COMPANY A OFFERS 1000KB AT $0.40
text after the first : and before the first (
text between the brackets
text after the word OFFERS and before AT
text after AT
These can be easily translated into several "Split" scenarios inside Power Query.
Split by custom delimiter : - that's colon and space - for each occurrence
Remove first column
Split new first column by ( - that's space and bracket - for leftmost
Replace ) with nothing in second column
Split third column by delimiter OFFERS
Split new fourth column by delimiter AT
The screenshot shows the input data and the result in the Power Query editor after renaming the columns and before loading the query into the worksheet.
Once you have loaded the query, you can add / remove data in the input table and simply refresh the query to get your results. No formulas, just clicking ribbon commands.
You can take this further by removing the "KB" from the column, convert it to a number, divide it by 100. Your business processing logic will drive what you want to do. Just take it one step at a time.
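The split logic above does not depend on field widths, which is why it copes with FEB5 versus FEB50 or COMPANYA versus COMPANY A. A minimal Python sketch of the same delimiter-based splits follows. It is an illustration rather than the Power Query steps themselves: the function name `parse_offer` and the dictionary keys are invented here, and it assumes every line follows the pattern of the sample.

```python
from typing import Dict


def parse_offer(line: str) -> Dict[str, str]:
    """Split one line on its delimiters instead of on fixed positions."""
    # Split on colon-plus-space and discard the leading description.
    _, spread_and_month, offer = line.split(": ", 2)
    # Split on space-plus-bracket, then drop the closing bracket.
    spread, _, month = spread_and_month.partition(" (")
    month = month.replace(")", "")
    # Split on the words OFFERS and AT, including the spaces around them.
    company, _, size_and_price = offer.partition(" OFFERS ")
    size, _, price = size_and_price.partition(" AT ")
    return {
        "spread": spread,
        "month": month,
        "company": company,
        "size": size,
        "price": price,
    }


line = "COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): COMPANY A OFFERS 1000KB AT $0.40"
parsed = parse_offer(line)
print(parsed)
```

Further clean-up, such as stripping "KB" or the dollar sign and converting to numbers, can then be applied to the individual fields one step at a time.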

Taking means of irregular amounts of data

I'm not able to take the means for a large dataset given that the amount of attributes is irregular.
I have posted a simplified case for the problem. It explains the problem very well.
An idea that I came up with: make a filter to condition on a single attribute. However, I still don't see a way to do this efficiently (other than doing it all by hand).
see excel file:
All help is much appreciated.
I'm basically looking for a function/method to achieve taking means of all different attributes conditioned on each person for a large dataset without doing it by hand.
You can use AVERAGEIFS() inside an IF:
=IF(OR(A2<>A1,B2<>B1),AVERAGEIFS(C:C,A:A,A2,B:B,B2),"")
The first part of the IF tests whether the row starts a new group, that is, whether either the person or the attribute has changed from the row above. If so, it uses AVERAGEIFS() to return the average of that group; otherwise it returns a blank.
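To make the behaviour of that formula concrete, here is a small Python sketch of the same idea. The function name `group_means_on_first_row` and the sample rows are hypothetical, and the sketch assumes the data is laid out as (person, attribute, value) rows, matching columns A, B and C.

```python
from collections import defaultdict
from typing import List, Tuple, Union

Row = Tuple[str, str, float]


def group_means_on_first_row(rows: List[Row]) -> List[Union[float, str]]:
    """Return the group mean on the first row of each run, and "" elsewhere."""
    # Sum and count per (person, attribute) pair: the AVERAGEIFS() part.
    totals = defaultdict(lambda: [0.0, 0])
    for person, attribute, value in rows:
        entry = totals[(person, attribute)]
        entry[0] += value
        entry[1] += 1

    output: List[Union[float, str]] = []
    previous = None
    for person, attribute, _ in rows:
        key = (person, attribute)
        if key != previous:          # the OR(A2<>A1, B2<>B1) test
            total, count = totals[key]
            output.append(total / count)
        else:
            output.append("")
        previous = key
    return output


# Hypothetical sample with an irregular number of values per attribute.
rows = [
    ("Ann", "height", 1.0),
    ("Ann", "height", 3.0),
    ("Ann", "weight", 10.0),
    ("Bob", "height", 4.0),
]
means = group_means_on_first_row(rows)
print(means)
```

As with the formula, this works best when the rows are sorted so that each person and attribute combination forms one contiguous block.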
What you want to do can be accomplished very simply with a pivot table.
Simply select one of the cells inside the range of data you want to process. (See the video for general use of a pivot table: https://www.youtube.com/watch?v=iCiayB6GrpQ )
Go to the Insert tab and insert a pivot table.
Once you have it, simply check people, attribute, and values. Then drag people and attribute into rows, drag value into the values window, select the drop-down list, and change it from sum of value to average, and you should be done. https://i.stack.imgur.com/nYEzw.png

Column to rows and highlight difference between values in the same group

I have a huge table with data structured like this:
And I would like to display them in Spotfire Analyst 7.11 as follows:
Basically I need to display the columns that contain "ANTE" below the others in order to make a comparison. Values that have variations for the same ID must be highlighted.
I also have the fields "START_DATE_ANTE" and "END_DATE_ANTE" which have been omitted in the example image.
Amusingly, if you were limited to just what the title asks, this would be a very simple answer.
If you wanted this in a table where the rows are displayed as usual and the cells are highlighted, you can do this by going to properties, adding a new Grouping where you select VAL_1 and VAL_1_ANTE, and adding a Rule with Rule type "Boolean expression", where the value is:
[VAL_1] - [VAL_1_ANTE] <> 0
This will highlight the affected cells, which you can place next to each other. You can even throw in a calculated column showing the difference between the two columns, and slap it on right next to it. This gives you the further option to filter down to only showing rows with discrepancies, or sorting by these values.
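Outside Spotfire, the same rule plus the suggested difference column can be sketched in a few lines of Python. This is only an illustration of the logic: the function name `add_difference` and the sample rows are made up, and the column names VAL_1 and VAL_1_ANTE are taken from the expression above.

```python
from typing import Dict, List


def add_difference(rows: List[Dict[str, float]]) -> List[Dict[str, object]]:
    """Add a DIFF column and a CHANGED flag to every row."""
    result = []
    for row in rows:
        diff = row["VAL_1"] - row["VAL_1_ANTE"]
        # CHANGED mirrors the rule [VAL_1] - [VAL_1_ANTE] <> 0.
        result.append({**row, "DIFF": diff, "CHANGED": diff != 0})
    return result


# Hypothetical rows; real data would come from the table in the question.
rows = [
    {"ID": 1, "VAL_1": 10, "VAL_1_ANTE": 10},
    {"ID": 2, "VAL_1": 7, "VAL_1_ANTE": 5},
]
flagged = add_difference(rows)
only_changed = [row["ID"] for row in flagged if row["CHANGED"]]
print(only_changed)
```

Filtering on the flag gives the "only rows with discrepancies" view, and sorting on DIFF gives the ordering mentioned above.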
However, if you actually need it to display the POSTs on different lines from the ANTEs, as formatted above, things get a little tricky.
My personal preference would be to pivot (split/union/etc) the data before pulling it in to Spotfire, with an indicator flag on "is this different", yes/no. However, I know a lot of Spotfire users either aren't using a database or don't have leeway to perform the SQL themselves.
In fact, if you try to do it in Spotfire using custom expressions alone, it becomes so tricky, I'm not sure how to answer it right off. I'm inclined to think you should be able to do it in a cross table, using Subsets, but I haven't figured out a way to identify which subset you're in while inside the custom expressions.
Other options include generating a table using IronPython, if you're up to that.

Pentaho Kettle - Loading excel with almost blank rows

I got an Excel file from an uncontrolled source that comes with a row with all the fields filled, followed by several rows with all fields blank except one (always the same one, a comment).
The comments belong to the ID of the "row with data".
I would like to make a new field "COMENTARY AGREGATED" with the concatenation of all the comments that belong to the ID, but I don't know how to do it. As far as I know, you can't interact with the order of the rows, as they are treated as independent. Am I right that this is impossible to do inside Kettle, and that I should resort to a VB macro in Excel as a preprocessing step?
Thanks for your time.
You can use a group by step, group by all fields except the comment one, and on aggregations choose “concatenate values separated by” and use a whitespace as value for the concatenation ( or nothing if you prefer).
The excel input can’t do all that on its own.
For now I've advanced a little.
I found that in the Excel input step, in the Fields tab, the Repeat column can be set to Y, and if so, it fills the blank rows with the previous value.
I still don't know how to aggregate the others, but it's a step in the right direction, I guess.
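Taken together, the Repeat setting (fill down) and the Group by step (concatenate) amount to the logic sketched below in Python. This is not Kettle code. The function name `aggregate_comments` and the sample rows are hypothetical, and the sketch assumes each row is an (ID, comment) pair where the ID is blank on the continuation rows.

```python
from typing import Dict, List, Optional, Tuple


def aggregate_comments(
    rows: List[Tuple[Optional[str], Optional[str]]], sep: str = " "
) -> Dict[str, str]:
    """Fill the ID down over blank rows, then join the comments per ID."""
    comments: Dict[str, List[str]] = {}   # dicts keep insertion order
    current_id: Optional[str] = None
    for row_id, comment in rows:
        if row_id not in (None, ""):
            current_id = row_id            # a "row with data" starts a new group
            comments.setdefault(current_id, [])
        if current_id is None:
            continue                       # comment rows before any ID are skipped
        if comment:
            comments[current_id].append(comment)
    return {row_id: sep.join(parts) for row_id, parts in comments.items()}


# Hypothetical rows: one full row per ID, followed by comment-only rows.
rows = [
    ("A1", "first"),
    (None, "second"),
    (None, "third"),
    ("B2", None),
    ("", "only comment"),
]
aggregated = aggregate_comments(rows)
print(aggregated)
```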

Can I get relational data into an Excel Pivot Table

I have a sheet (let's go with wines as an example) that lists every bottle of wine in my cellar, when I bought it, how much I paid etc.
There's a column that describes the wine in comma-separated tags such as "Fruity, White".
I've created a pivot table from that data, with the description as a filter column. However I can't filter it by "White". I have to find every description that contains "White" such as "Dry, White", "White, Crisp" etc.
Being from an RDBMS background, my natural inclination is to put the tags in their own table keyed against the wine row so there's zero-or-more tag rows per wine row.
Now, how on earth can I use that to filter the wine rows?
Yes you can do it within Excel and the description fields can remain as "Dry, White" etc as you do not need to split the comma separated values.
Lets say the Table source comprises a text column for Description, a number column for Value and a number column for Year Bought.
Your pivot is set up with the following:
Fields: Description, Value and Year Bought.
Column labels: Year Bought
Row Labels: Description
Sum of values: Sum of Value
There is a drop down label filter on the row labels - click on this and there should be an option to select Label Filters. Select this and then select Contains. You can enter say "White" which will select all your descriptions that contain white e.g. "Dry, White", "White, Crisp". The filter includes ? to represent a single character and * to represent any series of characters.
There are similar label filters for "begins with" and "ends with", as well as their negations.
I tried this in Excel 2007 and it should also work in 2003. I think in Excel 2003 you could even combine the filters e.g. contains "White" and does not contain "Dry" but in 2007 I could not find a way of doing this.
Forgive me if I'm stating the obvious, but the reason you're having problems here is that the description column is not in 1NF, and the Excel pivot interface isn't flexible enough to allow pattern-based searching.
The simplest option will be to normalise the CSV into a series of columns, each of which represents a single attribute - one column for wine colour, one for sweetness, one for country of origin and so on - and apply the filter across multiple columns. However, if (as your comment on the question suggests) wine is a metaphor for your real problem, you may not have the luxury of revisiting the design of the source data.
Another possibility might be to use a macro (or a database query - I'm not clear from your question whether you have implemented the tag system already) to pre-filter the input data on the pivot table's source sheet based on the tag values you want to search for, then re-refresh the pivot table based on that data.
A third possibility is the VBA used in this question, which looks like it will custom-filter the pivot table's visible rows.
=IF(ISERR(FIND("WHITE",UPPER(B5))),0,1)
Create an extra column and add the formula above. There are two tricks to this. One is to search for WHITE in the description column using UPPER, to get around the fact that Excel's FIND is case sensitive. The other is that FIND returns a #VALUE! error if the string does not exist, so ISERR allows you to trap that and return, in this example, 0 if it doesn't or 1 if it does. You could substitute white and blank for 1 and 0.
You could write a script that loops through the data and adds new lines for each comma-separated item in the description column. This would allow the pivot table to filter better.
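A minimal Python sketch of such a script is shown below. The function name `explode_tags` and the sample wines are hypothetical, and the sketch assumes each row is a (wine, description) pair with comma-separated tags in the description.

```python
from typing import List, Tuple


def explode_tags(rows: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return one (wine, tag) row per comma-separated tag in the description."""
    exploded = []
    for wine, description in rows:
        for tag in description.split(","):
            tag = tag.strip()              # drop the space after each comma
            if tag:                        # ignore empty tags, e.g. a trailing comma
                exploded.append((wine, tag))
    return exploded


# Hypothetical rows standing in for the wine sheet.
wines = [
    ("Wine 1", "Dry, White"),
    ("Wine 2", "White, Crisp"),
    ("Wine 3", "Fruity, Red"),
]
exploded = explode_tags(wines)
white_wines = [wine for wine, tag in exploded if tag == "White"]
print(white_wines)
```

The exploded rows have the zero-or-more tag rows per wine shape described in the question, so a pivot table built on them can filter on a single tag directly.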

Resources