Alteryx Analyse the similarity of the words - alteryx

I am currently building a chart of the top 10 fault types. Users key in what the fault is about, e.g. "light bulb fused". Because it is a free-text box, the wording may differ from one entry to the next. Is there any way to make Alteryx understand that some words mean the same thing, so that I can find the top 10 fault types? Thank you.

There are a couple of ways to do this. You can use the Fuzzy Match tools in the Join category to sort out slight spelling mistakes. You can find Alteryx examples of Fuzzy Match on YouTube.
You can also use the Record ID tool followed by Text to Columns (Split to Rows based on space) to get a list of single words.
For what you are trying to do, I would advise building up a bit of a lookup table. You can then use the Find-Replace Tool to Append the Category from the lookup depending on the words that are found.
How far down the above paths you should go depends on how clean your data is and how distinct the categories are.
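Outside Alteryx, the same idea (split into words, fuzzy-match against a lookup table, count categories) can be sketched in Python. This is only an illustration, not an Alteryx workflow: the `LOOKUP` keywords, the category names, and the 0.75 cutoff are assumptions made up for the example, and the matching uses Python's standard `difflib` rather than Alteryx's Fuzzy Match tool.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical lookup table: keyword -> fault category.
LOOKUP = {
    "bulb": "Lighting",
    "light": "Lighting",
    "lamp": "Lighting",
    "leak": "Plumbing",
    "pipe": "Plumbing",
    "tap": "Plumbing",
    "aircon": "Air conditioning",
}


def categorise(text, lookup=LOOKUP, cutoff=0.75):
    """Return the category of the first word that fuzzily matches a keyword."""
    for word in text.lower().split():
        word = word.strip(".,;:!?")
        # get_close_matches tolerates small spelling mistakes such as "ligth".
        match = get_close_matches(word, list(lookup), n=1, cutoff=cutoff)
        if match:
            return lookup[match[0]]
    return "Uncategorised"


def top_faults(descriptions, n=10):
    """Count categories across all free-text descriptions, most common first."""
    return Counter(categorise(d) for d in descriptions).most_common(n)


faults = [
    "light bulb fused",
    "Ligth not working",
    "pipe leaking",
    "bulbs blown",
    "door stuck",
]
print(top_faults(faults))
```

Anything that lands in "Uncategorised" is a hint that the lookup table needs another keyword.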

Related

How can I find text between two headings from docx in python

I want to extract information from a resume. To do this, I have to identify the headings and take the text underneath each heading.
I think you need to be more specific about your issue and the approach you want to take. For now, for heading extraction, you can first build a corpus of all the headings after reading the document in with Beautiful Soup. Once that corpus exists, you can match it against the headings of the resume and extract each section by defining its start and end points. You can then match skills, etc., or whatever else you want to do with it.
This is the simplest approach based on your current question. Be more specific so I can guide you towards a more precise approach.
Best,
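A minimal sketch of the heading-corpus idea, assuming the paragraph texts have already been read out of the document into a list of strings. The `HEADINGS` corpus and the sample paragraphs are hypothetical; the commented lines show one way the list could be filled with the python-docx package, which is an assumption and not something the question specifies.

```python
# Hypothetical heading corpus; extend it with the headings you expect to see.
HEADINGS = {"summary", "experience", "education", "skills"}


def split_sections(paragraphs, headings=HEADINGS):
    """Group paragraph texts under the most recent known heading."""
    sections = {}
    current = None
    for text in paragraphs:
        key = text.strip().rstrip(":").lower()
        if key in headings:
            # A known heading starts a new section.
            current = key
            sections[current] = []
        elif current is not None and text.strip():
            # Non-empty text belongs to the section opened last.
            sections[current].append(text.strip())
    return sections


# With python-docx installed, the list could be built like this
# ("resume.docx" is a placeholder file name):
#   from docx import Document
#   paragraphs = [p.text for p in Document("resume.docx").paragraphs]
paragraphs = [
    "Jane Doe",
    "Experience",
    "Data analyst, 2019 to 2022",
    "Skills:",
    "Python, SQL",
    "",
    "Education",
    "BSc Statistics",
]

sections = split_sections(paragraphs)
print(sections["skills"])
```

Text that appears before the first recognised heading (the name line here) is skipped; keep it under a default key if you need it.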

LibreOffice or Excel: Randomization of items across columns without repetition

I have 100 people and I want them to judge words as either positive or negative (e.g. 'insurance' and 'car accident'). I have a total of 100 of such words. I also want each person to do three words as I am interested in some statistical properties (i.e. seeing how well people agree).
I want to assign words to people by creating three columns with the same words in each column. However, I want the words to be randomized in such a way that there is no repetition in any row. Randomization is obviously important as I want to avoid any bias, but it would be silly to ask the same person the same word twice (or worse, three times).
So, here is the data structure that I try to achieve:
person1, word1, word65, word33;
person2, word55, word56, word44;
person3, word23, word23, word3; <--- This should not happen
Is there a simple formula or other way to do this form of column-spanning randomization without repetition in LibreOffice Calc or Excel?
Thanks in advance!
What you need is a random permutation of the words that you type in different cells. You can do this task using the LibreOffice extension Permutate! (download here: https://sourceforge.net/projects/permutate/). Since I am the developer of this simple extension, please do not hesitate to ask for any clarifications.
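If you are happy to generate the table outside the spreadsheet and paste it in, the "permutation per column, no repeats in a row" requirement can be sketched in Python. This is not how the Permutate! extension works internally; it is a simple reshuffle-until-valid approach, and the function name `assign_words` and the seed are made up for the illustration.

```python
import random


def assign_words(words, columns=3, seed=None):
    """Return one row per person; each column is a permutation of words
    and no row contains the same word twice."""
    rng = random.Random(seed)
    cols = []
    for _ in range(columns):
        while True:
            candidate = words[:]
            rng.shuffle(candidate)
            # Accept the shuffle only if it clashes with no earlier column
            # in any row; otherwise shuffle again.
            if all(candidate[i] != prev[i]
                   for prev in cols
                   for i in range(len(words))):
                cols.append(candidate)
                break
    return list(zip(*cols))


words = [f"word{i}" for i in range(1, 101)]
rows = assign_words(words, seed=1)
for person, row in enumerate(rows[:3], start=1):
    print(f"person{person}", *row)
```

Because every column is a full permutation, each word is judged exactly three times. With 100 words the reshuffle loop normally finishes after a handful of attempts.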

Problems with Excel showing products when inserting measurements with multiple results

Hello, I'm having problems getting this to work. When you insert a set of measurements, I want Excel to show the items (in this case products) that are closest to those measurements.
Here is a picture:
The result I'm trying to reach is that when you type in the measurements, you get the product(s) and the manufacturer that are closest to those measurements.
Any help is greatly appreciated.
In essence, what you are after is an INDEX + MATCH combination. It will allow you to find a value in one list, given a corresponding value in another. In this case, given a measurement, it will find a manufacturer and product combo in your list.
Your problem is that you will need to adapt your data to allow for this. For example, you need to decide whether you only want the closest match for measurements or if you need the closest match that is greater than the measurement you provide.
It is also possible that you'll need to split your measurement column into two different columns (unless all you need is the total area irrespective of individual lengths).
You could potentially avoid the index+match by using conditional formatting, but that would still require the data manipulation.
Given the information you provided, the answer will never be much more informative than this. But this should get you started, and the following steps can be made easier with help from Google.
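To make the two decisions above concrete (closest match versus closest match that is at least as large, and length and width kept in separate columns), here is a rough Python sketch of the lookup logic. It is not an Excel formula; the product list, the column layout, and the distance measure (sum of absolute differences) are all assumptions invented for the example.

```python
# Hypothetical product list: (manufacturer, product, length, width).
PRODUCTS = [
    ("Acme", "Panel A", 100, 50),
    ("Acme", "Panel B", 120, 60),
    ("Bolt", "Sheet X", 90, 45),
]


def closest_products(length, width, products=PRODUCTS, only_larger=False):
    """Return (manufacturer, product) pairs closest to the measurements.

    With only_larger=True, products smaller than the requested size in
    either dimension are ignored. Ties return every equally close product.
    """
    candidates = [
        p for p in products
        if not only_larger or (p[2] >= length and p[3] >= width)
    ]
    if not candidates:
        return []

    def distance(p):
        # Sum of absolute differences over both dimensions.
        return abs(p[2] - length) + abs(p[3] - width)

    best = min(distance(p) for p in candidates)
    return [(p[0], p[1]) for p in candidates if distance(p) == best]


print(closest_products(110, 55))
print(closest_products(110, 55, only_larger=True))
```

The first call has two equally close products, which is the "multiple results" case from the title; the second keeps only the product that is large enough.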

Replacing numeric values in Excel sheet with text values from other sheet

I am using Surveymonkey for a questionnaire. Most of my data has a regular scale from 0-6, and additionally an "Other" option that people can use in case they choose not to answer the item. However, when I download the data, Surveymonkey automatically assigns a value of 0 to that no-answer category, and it appears this cannot be changed.
As a result, I do not know whether a zero in my numeric dataset actually means zero or means that the participant chose not to answer the question. I can only figure that out by looking at another file that includes the labels of participants' answers (all answers are provided by the corresponding labels, so this datafile misses all non-labeled answers...).
This leads me to my problem: I have two Excel files of the same size. I need a way to find certain values in one dataset (text values, scattered randomly over the dataset) and replace the corresponding numeric values in the other dataset (at the same position in the dataset) with those values.
I thought it would just be possible to find all values and copy paste in the same pattern, but I cannot seem to find a way to do that. I feel like I am missing an obvious solution, but after searching for quite a while I really could not find an answer to my specific question.
I have never worked with macros or more advanced Excel programming before, but have a bit of knowledge about programming in itself. I hope I explained this well. I would be very thankful for any suggestions or scripts that could help me out here!
Thank you!
Alex
I don't know how your Excel file is organised, but if it's like the legacy Condensed format, all you should need to do is to select the column corresponding to a given question (if that's what you have), and search and replace all 0 (match entire cell) with the text you want.
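If the two files really do line up cell for cell, the position-based replacement described in the question can be sketched in Python as follows. The grids stand in for the two sheets once they have been loaded (for example from CSV exports); the marker text "Other" and the sample values are assumptions for the illustration.

```python
def merge_labels(numeric, labels, marker="Other"):
    """Return a copy of numeric in which every cell whose counterpart in
    labels equals marker is replaced by that marker text."""
    merged = []
    for num_row, label_row in zip(numeric, labels):
        merged.append([
            label if label == marker else number
            for number, label in zip(num_row, label_row)
        ])
    return merged


# Hypothetical data: same shape, same cell positions in both sheets.
numeric = [
    [0, 3, 6],
    [2, 0, 0],
]
labels = [
    ["Never", "Sometimes", "Always"],
    ["Rarely", "Other", "Never"],
]

print(merge_labels(numeric, labels))
```

Only the zero whose label is "Other" is replaced; genuine zeros (labelled "Never" here) stay numeric.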

Building a customized, fuzzy and multiple Vlookup

Ok so, twice a month I receive a large file of about 100 rows, which contains 4 columns:
Building name - value - county - state
I have to complete two other columns based on a master list that has thousands of entries.
I want to produce something very similar to this fabulous add-in (http://www.microsoft.com/en-us/download/details.aspx?id=15011), but a bit simpler and that I could use at work without problems.
What I need to do is the following:
To match my input with the master file, I know the county and state must match, but the building name can change a bit from file to file for the same building (e.g. "John Miller #34" can be "Miller, John 34 A"), and the values may vary, but not by much.
Based on that, I want to bring from the master to my file all the entries that may match each of my rows, filtering by county and state first, and then by similarity in name and value.
Could you please share your thoughts on how you'd approach this?
I know this is not a simple thing, but anything may help!
You could also use wildcards to try to match on the primary identifier within the name. From your example, that might be "Miller".
Unfortunately for you, the vlookup "fuzzy logic" is nowhere near reliable for your purpose (see the comment on my answer below for details), and you won't have any indicator as to whether the returned result is accurate or not.
It's possible to get 100% of what you want through some heavy coding in a user-defined function, but this is probably well beyond your comfort zone.
A clunky solution, although somewhat easy to explain and adopt, is to create an "identity column" for every unique scenario that can occur. So, for example:
Then you can import your master sheet and add the same identity column to the left, and perform your vlookup. When a new configuration is added you can just add that to the master list and it will populate in your imported file in future instances.
That said, if you are interested in learning, there have been many people who have walked in your shoes and felt your pain. You may want to indulge in this:
http://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html
Because what you are truly requesting is an algorithm. It's not a simple thing, but it's very possible. And if you take the time to learn you not only solve your immediate problem, but make yourself marketable as an Excel wiz. Good luck!
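To give a feel for what such an algorithm looks like, here is a small Python sketch of the steps described in the question: filter the master list by county and state, then score name similarity and check that the value is close. It is not the approach from the linked thread or the Microsoft add-in; the similarity measure (`difflib.SequenceMatcher` on sorted tokens), the thresholds, and the sample data are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case, drop punctuation, and sort tokens so word order is ignored."""
    return " ".join(sorted(re.findall(r"[a-z0-9]+", name.lower())))


def candidates(row, master, name_cutoff=0.6, value_tolerance=0.2):
    """Return (score, entry) pairs from master that may match row, best first."""
    matches = []
    for entry in master:
        # County and state must match exactly.
        if (entry["county"], entry["state"]) != (row["county"], row["state"]):
            continue
        score = SequenceMatcher(
            None, normalise(row["name"]), normalise(entry["name"])
        ).ratio()
        # Relative difference between the two values.
        value_gap = abs(entry["value"] - row["value"]) / max(abs(row["value"]), 1)
        if score >= name_cutoff and value_gap <= value_tolerance:
            matches.append((round(score, 2), entry))
    return sorted(matches, key=lambda m: m[0], reverse=True)


# Hypothetical master list and input row.
master = [
    {"id": 1, "name": "Miller, John 34 A", "value": 1100, "county": "Kent", "state": "DE"},
    {"id": 2, "name": "Oak Tower", "value": 1000, "county": "Kent", "state": "DE"},
    {"id": 3, "name": "John Miller 34", "value": 1000, "county": "Sussex", "state": "DE"},
]
row = {"name": "John Miller #34", "value": 1000, "county": "Kent", "state": "DE"}

result = candidates(row, master)
for score, entry in result:
    print(score, entry["id"], entry["name"])
```

Returning every candidate with its score, rather than a single best guess, gives you the accuracy indicator that a plain approximate VLOOKUP lacks.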

Resources