Highlight Specific Word in Associated String and String Variable in Tableau


My goal is to create a highlight function for keywords contained within an associated string, and the overall string variable.
I have tried the "contains" function and variations of the logic in these links (1st - https://community.tableau.com/thread/214410, 2nd - https://community.tableau.com/message/846896#846896), and Tableau Support told me they don't know how to highlight keywords contained within a string, so I wanted to try the Stack Overflow community before giving up on this one.
To illustrate, below is a table showing my goal with a matrix that contains a keyword and an associated string:
The next shows the ability to select a keyword that is then highlighted within all observations in the string variable:
The closest I have achieved is the following where only the associated string and its entirety are highlighted, as opposed to the broader string variable and just the keyword within the relevant string:
The logic is the following:
Create a parameter for users to enter their search keyword, and create a calculated field to see if the keyword is contained in the Comment field
Create and show a Highlighter for the Comment field.
To use a parameter to call attention to the comment using color:
Create a Parameter called "Search Keywords" with data type string. Select "All" for allowable values.
Create a calculated field called Matches or Color Matches, with the following formula:
CONTAINS([Key Words], [Search Keywords])
OR CONTAINS([Comments], [Search Keywords])
Drag this calculated field onto Color on the Marks card
Right-click the parameter and select "Show Parameter Control"
Type the keyword to search and highlight.
To use a Highlighter:
Once the dashboard with keywords and comments has been created, navigate to the options menu for the Comments sheet and select Highlighters > Comments
This now displays a Highlight control which will highlight the row of a comment, instead of changing the text color like the parameter does.
This option will also allow for clicking on keywords, but clicking will only highlight the corresponding comment rather than all comments with the keyword.
As a potential third alternative, if viewers only want to see the matching words and not the entire string, we can modify the parameter method to add an IF statement to the calculated field we created earlier:
IF CONTAINS([Key Words], [Search Keywords])
OR CONTAINS([Comments], [Search Keywords])
THEN [Search Keywords]
END
Do you have any suggestions on how to tweak what I have, or even take a different approach? Any help would be greatly appreciated

As I am sure you know, Tableau is going to colour the entire text string as the CONTAINS condition results in TRUE for the entire string. A different approach could be to restructure your data to a 'long' format with 1 row per word (as below).
Doing this will ensure that Tableau knows each word should be evaluated separately and that the Color Marks Card will partition each word. You can then structure your worksheet like this. To ensure the words are showing in the correct order, you'll need a calculated field to create a unique row (I have called it sort_order): right("000000" + str([sentence_id]),7) + right("000000" + str([Position]), 7). Note that the Text Marks Card is sorted by sort_order, and also that the order in which you drag on/order the Marks Cards is important.
The colour_keyword formula then is simply something like [word] = [Keyword Parameter] (maybe check for upper/lowercase variants).
I would recommend maintaining your original table's data structure as well as this 'long' table format, linking the two data sources via a Relationship (Data > Edit Relationships) and using Dashboard Actions. This would hopefully satisfy your highlight requirements and mean less rework for your other worksheets.
I've published the demo Tableau workbook to Tableau Public here
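If it helps to prepare the 'long' table outside Tableau, here is a minimal Python sketch of the reshaping described above. The function names and the sample comments are made up for illustration, and stripping punctuation before comparing is my own assumption (the answer only suggests checking upper/lowercase variants).

```python
def to_long_format(sentences):
    """Explode {sentence_id: text} into one row per word."""
    rows = []
    for sentence_id, text in sentences.items():
        for position, word in enumerate(text.split(), start=1):
            rows.append({
                "sentence_id": sentence_id,
                "Position": position,
                "word": word,
                # Mirrors right("000000" + str(id), 7) + right("000000" + str(pos), 7)
                "sort_order": ("000000" + str(sentence_id))[-7:]
                              + ("000000" + str(position))[-7:],
            })
    return rows


def colour_keyword(word, keyword):
    """Word-level match; ignores case and surrounding punctuation (assumption)."""
    return word.strip(".,;:!?\"'()").lower() == keyword.lower()


# Hypothetical sample comments, not taken from the question's data
rows = to_long_format({1: "The delivery was late.", 2: "Late again, but friendly staff"})
flags = [(r["word"], colour_keyword(r["word"], "late")) for r in rows]
```

Each row carries its own TRUE/FALSE flag, which is what lets the Color Marks Card colour a single word rather than the whole comment.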

Related

Aligning vertically a series of tables with text

Hi I need the text to be in a specific format in a spreadsheet to be able to upload it on a translation tool.
I have already used the text split function to separate the text in a cell with bullet points, moving each bullet point to a separate cell.
Then I used the transpose function to separate each set of data. For context, you are looking at fashion products.
The name of the product is on the first row, followed by a list of features (e.g. "Bracciale" means bracelet and it is followed by the list of materials)
Now for the last step, I need these sets to be vertical, not horizontal. Like this:
I would like to set up an automatic system so that every time we receive a list with hundreds of these products we do not need to copy-paste them one below the other.
With pivot tables maybe? Keep in mind that if it is too complex it might be hard to train the translators to do it each time. Please let me know your suggestions. Thank you!
I am not a programmer. I tried pivot tables but the data was in the wrong order and I am not sure how to get the data out from the pivot table with values only without the sub-menus.

My suggestion would be to use the 'Unpivot Columns' feature in the Power Query Editor - it would be really simple.
Steps:
Select the whole range
Go to Data // Get & Transform Data // From Table/Range
Uncheck 'My Table has headers' (unless it does - but doesn't look like it?)
Press OK. This will open Power Query Editor and will have actually given you column names Col1/2/3 etc, but ignore that.
Go to Add Column // Index column
Select all columns EXCEPT the new index column by Shift+clicking on those headers
Go to Transform // Unpivot Columns
Assuming the order is important, click in the Attribute column and Sort Ascending
Click in the Index column and Sort Ascending
Remove the Attribute and Index columns if you want (right click header)
Go to File // Close & Load
You will get a new table - dynamically linked to the first (ie. can be updated/refreshed) - in the unpivoted format.
Let me know if you need more details / screenshot?
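For anyone who wants to see what the unpivot-and-sort steps produce, here is a rough Python sketch of the same reshaping. The function name and the sample grid are hypothetical (only "Bracciale" comes from the question), and I am assuming blank cells should be dropped, as Unpivot Columns does with nulls.

```python
def stack_columns(grid):
    """Stack the columns of a row-based grid one below the other, skipping blanks."""
    stacked = []
    n_cols = max(len(row) for row in grid)
    for col in range(n_cols):      # outer loop: one column (Attribute) at a time
        for row in grid:           # inner loop: rows top to bottom (Index)
            if col < len(row) and row[col] not in (None, ""):
                stacked.append(row[col])
    return stacked


# Hypothetical sample: product name on the first row, features below it
grid = [
    ["Bracciale", "Collana"],
    ["Oro", "Argento"],
    ["Pelle", ""],
]
result = stack_columns(grid)
```

Here `result` lists the first product and its features, followed by the second product and its features, in a single vertical list.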

Based on this trick, maybe the following is helpful:
Formula in A5:
=DROP(REDUCE(0,A1:A3,LAMBDA(a,b,VSTACK(a,TEXTSPLIT(b,,HSTACK(CHAR(10),"^"),1)))),1)
TEXTSPLIT() will use a combination of newline chars and the circumflex to split the input directly into a vertical array;
Iteration in REDUCE() will allow for stacked results;
DROP() the initial value from results.
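The same idea in a short Python sketch, in case the logic is easier to follow outside a formula. The function name and sample cells are assumptions; the delimiters (newline and "^") and the skipping of empty pieces follow the TEXTSPLIT call above.

```python
import re


def split_and_stack(cells):
    """Split each cell on newline or '^' and stack all pieces vertically."""
    stacked = []
    for cell in cells:
        parts = re.split(r"[\n^]", cell)              # '^' is literal inside the class here
        stacked.extend(p for p in parts if p != "")   # like ignore_empty = 1
    return stacked


# Hypothetical sample cells
result = split_and_stack(["a\nb^c", "d^^e"])
```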

Extract part of a string in excel

String:
"Department=Acc:2";"Classes=Accessoire";"Suppliers=xxx23";"Category=Décor";"Discount=no";Related_Carousel_Products=[23043]";"Accessory Type=Crinolines et Shorts";
My excel cells are filled with data like this and I want to extract a specific part of it, for example I would like to extract Accessory Type="Crinoline" into a new column so that I can edit them separately. I've tried this article it has many creative ways to extract the data but I cannot find a way to extract in the way I want, I want to extract part of the string, including the quotes.
https://www.extendoffice.com/documents/excel/3639-excel-extract-part-of-string.html

UPDATED - screenshot showing breakdown of each key function
You can do this using mid + search as follows (screenshot below/this sheet refer):
=MID(B2,SEARCH($F$2,B2),SEARCH(";",MID(B2,SEARCH($F$2,B2)+1,LEN(B2))))
where:
B2: the raw text
F2 = 'Accessory Type' (or any other thing you specify that satisfies final bullet)
Entire string you want to return (with or without quotation marks) falls after 'Accessory Type' and before the very next semi-colon (;) - per your example/below screenshot/above link.
How does this work?
We need to find the part of text that starts with the selected word(s) (e.g. "Accessory Type" in this case) and ends after the description of that accessory type (in this case, it's made up "asdfhadhgk")
Working from inside out mid function (A) returns everything after the words "Accessory Type"
Great, now we just need it to 'stop' a bit sooner, i.e. at the first semi-colon that appears after the words "Accessory Type". This is exactly what the outer Mid function (D) achieves (it returns the string starting with "Accessory Type" up to the semi-colon)
Screenshots below refer.
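The find-then-cut logic of the formula can be sketched in Python as follows. The function name is made up, the sample is the string from the question, and note two differences from Excel that I am assuming are acceptable: this sketch is case-sensitive (SEARCH is not), and it returns the rest of the text instead of an error when no semi-colon follows.

```python
def extract_field(text, key):
    """Return the text from `key` up to (not including) the next semi-colon."""
    start = text.find(key)               # like SEARCH($F$2, B2), but case-sensitive
    if start == -1:
        return None                      # key not present
    end = text.find(";", start + 1)      # first semi-colon after the key
    if end == -1:
        return text[start:]              # assumption: no semi-colon, keep the rest
    return text[start:end]


raw = ('"Department=Acc:2";"Classes=Accessoire";"Suppliers=xxx23";"Category=Décor";'
       '"Discount=no";Related_Carousel_Products=[23043]";'
       '"Accessory Type=Crinolines et Shorts";')
result = extract_field(raw, "Accessory Type")
```

As with the formula, the result keeps the closing quotation mark but not the opening one, because the match starts at the key itself.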

Create a search option in Power BI dashboard based on keywords table

I have two tables
With complete data, including a keywords column where the keywords are comma separated (around 25 keywords)
Unique keywords extracted from the keywords column. (single column with each keyword in each observation)
Task is, based on the keyword in the second table, search the observations that have similar keywords and display on the report.
Looks something like this:
This is a filter, which is not fulfilling my task.
(or)
I am back of https://ideas.powerbi.com/ideas/idea/?ideaid=a586deac-c465-48da-978b-30ac2a4a3245 this activity. if someone can provide any solution related to this, will be helpful :).

I'm not sure what you are trying to achieve. If you just want to filter a visualization by selecting one of the keywords, then create a measure (returning 0/1, which we can use as the filter in the visualization) using SELECTEDVALUE to grab the value selected in the slicer, and PATHCONTAINS (you need to replace the commas ", " with pipes "|").
https://dax.guide/pathcontains/
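To illustrate the logic of that measure outside DAX, here is a small Python sketch. The function names and sample keywords are hypothetical; it only mimics the idea of turning the comma-separated keywords into a pipe-delimited path and testing for an exact item.

```python
def path_contains(path, item):
    """True if `item` is exactly one of the pipe-delimited entries in `path`."""
    return item in path.split("|")


def matches_selected(keywords_cell, selected):
    """Return 1 if the selected keyword is in the cell's keyword list, else 0."""
    # Replace ", " separators with "|" so the cell looks like a path
    path = "|".join(k.strip() for k in keywords_cell.split(","))
    return 1 if path_contains(path, selected) else 0


# Hypothetical sample row
hit = matches_selected("sales, marketing, finance", "marketing")
miss = matches_selected("sales, marketing, finance", "market")
```

The point of going through a path rather than a plain "contains" test is that partial words such as "market" do not count as a match.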

Extracting text from complex string in excel

The attached image (link: https://i.stack.imgur.com/w0pEw.png) shows a range of cells (B1:B7) from a table I imported from the web. I need a formula that allows me to extract the names from each cell. In this case, my objective is to generate the following list of names, where each name is in its own cell: Erik Karlsson, P.K. Subban, John Tavares, Matthew Tkachuk, Steven Stamkos, Dustin Brown, Shea Weber.
I have been reading about left, right, and mid functions, but I'm confused by the irregular spacing and special characters (i.e. the box with question mark beside some names).
Can anyone help me extract the names? Thanks

Assuming that your cells follow the same format, you can use a variety of text functions to get the name.
This function requires the following format:
Some initial text, followed by
2 new lines in Excel (each represented by CHAR(10))
The name, which consists of a first name, a space, then a last name
A second space on the same line as the name, followed by some additional text.
With this format, you can use the following formula (assuming your data is in an Excel table, with the column of initial data named Text):
=MID([@Text],SEARCH(CHAR(10),[@Text],SEARCH(CHAR(10),[@Text])+1)+1,SEARCH(" ",MID([@Text],SEARCH(CHAR(10),[@Text],SEARCH(CHAR(10),[@Text])+1)+1,LEN([@Text])),SEARCH(" ",MID([@Text],SEARCH(CHAR(10),[@Text],SEARCH(CHAR(10),[@Text])+1)+1,LEN([@Text])))+1)-1)
To come up with this formula, we take the following steps:
First, we figure out where the name starts. We know this occurs after the 2 new lines, so we use:
=SEARCH(CHAR(10),[@Text],SEARCH(CHAR(10),[@Text])+1)+1
The inner (occurring second) SEARCH finds the first new line, and the outer (occurring first) finds the 2nd new line.
Now that we have that value, we can use it to determine the rest of the string (after the 2 new lines). Let's say that the previous formula was stored in a table column called Start of Name. The 2nd formula, stored in a column called Rest of String, will then be:
=MID([@Text],[@[Start of Name]],LEN([@Text]))
Note that we're using the length of the entire text, which by definition is more than we need. However, that's not an issue, since Excel returns the smaller amount between the last argument to MID and the actual length of the text.
Once we have the text from the start of the name on, we need to calculate the position of the 2nd space (where the name ends). To do that, we first need the position of the first space. This is similar to how we calculated the start of the name earlier (which starts after 2 new lines). The function we need is:
=SEARCH(" ",[@[Rest of String]],SEARCH(" ",[@[Rest of String]])+1)-1
So now, we know where the name starts (after 2 new lines), and where it ends (just before the 2nd space). Assuming we have these numbers stored in columns named Start of Name and To Second Space respectively, we can use the following formula to get the name:
=MID([@Text],[@[Start of Name]],[@[To Second Space]])
This is equivalent to the first formula: The difference is that the first formula doesn't use any "helper columns".
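The same steps can be sketched in Python, which may make the helper-column logic easier to check. The function name and the sample cell are made up (I don't have the original data), and the sketch assumes the format listed above: two new lines, then a first name, a space, a last name, and a second space.

```python
def extract_name(text):
    """Return the text between the 2nd new line and the 2nd space after it."""
    first_nl = text.find("\n")
    second_nl = text.find("\n", first_nl + 1)
    if first_nl == -1 or second_nl == -1:
        return None                          # format not matched
    rest = text[second_nl + 1:]              # "Rest of String"
    first_space = rest.find(" ")
    second_space = rest.find(" ", first_space + 1)
    if first_space == -1 or second_space == -1:
        return None                          # format not matched
    return rest[:second_space]               # up to, not including, the 2nd space


# Hypothetical cell content in the assumed format
sample = "1\n\nErik Karlsson OTT, D"
name = extract_name(sample)
```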
Of course, if any cell doesn't match this format, then you'll be out of luck. Using Excel formulas to parse text can be finicky and inflexible. For example, if someone has a middle name, or someone has initials with spaces (e.g. P.K. Subban was P. K. Subban), or there was a Jr. or something, your job would be a lot harder.
Another alternative is to use regular expressions to get the data you want. I would recommend this thorough answer as a primer. Although you still have the same issues with name formats.
Finally, there's the obligatory Falsehoods Programmers Believe About Names as a warning against assuming any kind of standardized name format.

Excel, Numberplate Clarification

I am working on an Excel document for fuel cards at the minute and my current issue is to write a formula for validating number plates based on UK standard plates (two letters followed by two numbers then three letters, e.g. BK08JWZ). At this point in time we are not considering personal plates, just to keep things simple.
Ideally I need Excel to look at the text in the box and confirm that it conforms to an agreed layout, but I am struggling to find the right formula. The plates are in column 'I' and I have already added another column after it, titled 'approved plates', in column 'J', but this can be deleted if it's not needed.
Results wise, I can do this one of two ways: either get the Excel document to highlight any number plates that do not match the DVLA standard, or have a column next to the number plate column that registers a boolean response, i.e. true if the plate is valid or false if not.
Either way the plate needs to be able to be seen as it was currently, so if there is something wrong with it, it needs to be visible, not throw up an error message.
Any help would be very welcome.
All the information on UK standard number plates are on this site:
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/359317/INF104_160914.pdf

I would do it like this:
1) create a lookup sheet with data from the booklet. One column for allowed "memory tag" identifiers (first two letters), one column for the allowed "age identifiers" (the two numbers), and one column for allowed random letters (last three letters, full alphabet except I and Q)
2) strip spaces from the number plate for comparison
3) Use MID(numberplate,1,2), MID(numberplate,3,2) and MID(numberplate,5,3) to compare to each lookup list respectively (using INDEX()>0).
4) when all 3 parts are found in lookup lists the number plate is valid.
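A minimal Python sketch of these four steps, just to show the logic. The function name is made up, and the memory tag and age identifier sets are small placeholders I invented for the example, not the lists from the booklet; only the "full alphabet except I and Q" rule comes from the steps above. Upper-casing the input is also my own assumption.

```python
import string

# Placeholder lookup lists: fill these from the booklet, these values are NOT complete
MEMORY_TAGS = {"BK", "LA", "YA"}
AGE_IDENTIFIERS = {"08", "58", "14", "64"}
# Last three letters: full alphabet except I and Q
RANDOM_LETTERS = set(string.ascii_uppercase) - {"I", "Q"}


def is_valid_plate(plate):
    """Check a plate against the three lookup lists after stripping spaces."""
    p = plate.replace(" ", "").upper()       # step 2 (upper-casing is an assumption)
    if len(p) != 7:
        return False
    return (
        p[0:2] in MEMORY_TAGS                # like MID(numberplate,1,2)
        and p[2:4] in AGE_IDENTIFIERS        # like MID(numberplate,3,2)
        and all(ch in RANDOM_LETTERS for ch in p[4:7])   # like MID(numberplate,5,3)
    )


ok = is_valid_plate("BK08 JWZ")
bad = is_valid_plate("BK08JQZ")
```

Because the function only returns true or false, the plate itself stays visible as typed, which was one of the requirements in the question.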

Try researching Regular Expressions or RegEx. This is a powerful programming tool to determine whether strings match specific patterns. You can use RegEx expressions to extract the pattern, replace the pattern or test for the pattern. Very efficient but not for the faint-hearted although there is plenty of help on-line. Try this article for starters.
The following RegEx may be what you need..
(^[A-Z]{2}[0-9]{2}[A-Z]{3}$)|(^[A-Z][0-9]{1,3}[A-Z]{3}$)|(^[A-Z]{3}[0-9]{1,3}[A-Z]$)|(^[0-9]{1,4}[A-Z]{1,2}$)|(^[0-9]{1,3}[A-Z]{1,3}$)|(^[A-Z]{1,2}[0-9]{1,4}$)|(^[A-Z]{1,3}[0-9]{1,3}$)
This was copied from this article which gives a very full explanation using DVLA rules.
EDIT:
To use RegEx within Excel, open the VBA IDE and, from the Tools menu, select References and add the Microsoft VBScript Regular Expressions 5.5 reference.
With acknowledgement to user3616725's helpful observation.
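If you want to try the pattern before wiring it into VBA, here is a small Python sketch. It writes each alternative as a plain capturing group, which is my assumption about the intended form of the expression, and it strips spaces first as suggested in the other answer. The function name is made up.

```python
import re

# Each alternative is anchored with ^ and $, so the whole plate must match one layout
PLATE_PATTERN = re.compile(
    r"(^[A-Z]{2}[0-9]{2}[A-Z]{3}$)"      # current style, e.g. BK08JWZ
    r"|(^[A-Z][0-9]{1,3}[A-Z]{3}$)"
    r"|(^[A-Z]{3}[0-9]{1,3}[A-Z]$)"
    r"|(^[0-9]{1,4}[A-Z]{1,2}$)"
    r"|(^[0-9]{1,3}[A-Z]{1,3}$)"
    r"|(^[A-Z]{1,2}[0-9]{1,4}$)"
    r"|(^[A-Z]{1,3}[0-9]{1,3}$)"
)


def plate_matches(plate):
    """True if the plate, with spaces removed, matches one of the layouts."""
    return PLATE_PATTERN.match(plate.replace(" ", "")) is not None


standard = plate_matches("BK08 JWZ")
digits_only = plate_matches("12345678")
```

Note that the pattern only accepts upper-case letters, so lower-case input would need converting first, and it accepts older plate layouts as well as the current two-letters, two-numbers, three-letters format.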
