How to query a text field looking for at least one string from a list of strings in Presto - presto

How do I query a string field in Presto looking for occurrences of at least one string in a list?
text
This is a good day
This is a good week
day one: project starts
Examples:
Select * from tbl
where text CONTAINS ANY of ['day','project']
output:
This is a good day
day one: project starts
I tried any_match but it fails.

Related

Extracting text in excel

I have some text which I receive daily that I need to separate. I have hundreds of lines similar to the extract below:
COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): COMPANY A OFFERS 1000KB AT $0.40
I need to extract individual snippets from this text, each into a separate cell. The result needs to be the date, month, company, size, and price. In this case, the result would be:
FEB50-40
APR
COMPANY A
100
0.40
The issue I'm struggling with is uniformity. For example, one line might have FEB50-FEB40, another FEB5-FEB40, or FEB50-FEB4. Another example giving me difficulty is that some rows might have 'COMPANY A' and others 'COMPANYA' (one word instead of two).
Any ideas? I've been trying combinations of the below but I'm not able to have uniform results.
=TRIM(MID(SUBSTITUTE($D7," ",REPT(" ",LEN($D7))), (5)*LEN($D7)+1,LEN($D7)))
=MID($D7,20,21-10)
=TRIM(RIGHT(SUBSTITUTE($D6,"$",REPT("$",2)),4))
Sometimes I get
'FEB40-50(' or 'FEB40-FEB5'
when it should be
'FEB40-FEB50'
Thank you to anyone who is able to help.
You might get to the limits of formulas with this scenario, but with Power Query you can still work.
As I see it, you want to apply the following logic to extract text from this string:
COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): COMPANY A OFFERS 1000KB AT $0.40
text after the first : and before the first (
text between the brackets
text after the word OFFERS and before AT
text after AT
These can be easily translated into several "Split" scenarios inside Power Query.
split by custom delimiter : - that's colon and space - for each occurrence
remove first column
Split new first column by ( - that's space and bracket - for leftmost
Replace ) with nothing in second column
Split third column by delimiter OFFERS
split new fourth column by delimiter AT
The screenshot shows the input data and the result in the Power Query editor after renaming the columns and before loading the query into the worksheet.
Once you have loaded the query, you can add / remove data in the input table and simply refresh the query to get your results. No formulas, just clicking ribbon commands.
You can take this further by removing the "KB" from the column, converting it to a number, and dividing it by 100. Your business processing logic will drive what you want to do. Just take it one step at a time.
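For anyone who wants to see the same split logic outside Power Query, here is a minimal Python sketch of the steps above. It is not Power Query M and not part of the original answer; the function name split_offer and the dictionary keys are made up for illustration, and it assumes every line uses the same delimiters as the sample.

```python
def split_offer(line):
    """Split one offer line using the same delimiters as the Power Query steps."""
    # Split on ": " at each occurrence and drop the first piece (the label).
    parts = line.split(": ")
    spread_and_month, rest = parts[1], parts[2]
    # Split on " (" at the leftmost occurrence, then remove the closing bracket.
    spread, month = spread_and_month.split(" (", 1)
    month = month.replace(")", "")
    # Split on " OFFERS ", then split the remainder on " AT ".
    company, size_and_price = rest.split(" OFFERS ", 1)
    size, price = size_and_price.split(" AT ", 1)
    return {
        "spread": spread,
        "month": month,
        "company": company,
        "size": size,
        "price": price,
    }


sample = ("COMMODITY PRICE DIFFERENTIAL: FEB50-FEB40 (APR): "
          "COMPANY A OFFERS 1000KB AT $0.40")
print(split_offer(sample))
```

Because the splits key on delimiters rather than character positions, FEB5-FEB40 or COMPANYA should come through unchanged, which is the uniformity problem the fixed-position MID formulas run into.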

How to extract certain words from a string based on the existence of other words?

So I have what appears to be a rather basic problem that probably has a simple solution, but I probably need a function I am not aware of (new to programming). For this assignment I need to get a certain word or words relative to the position of some other "key" word.
Example of what I need to retrieve:
A valid person's information in a description will contain the following: 1. a string "name is" after which a single space and a name (first name and last name), separated by single space.
2. single space before and after that two digits representing the person's age after which we have single space and the word "years"
3. birthday date - before the date we should have the word "on" after that single space and then date in exact format dd-mm-yyyy (months will be digits and the date should be separated with "-" {dd-mm-yyyy})
This is given as the practice string:
Hello,everyone my name is Maria Mariova. I am 22 years old. was born on 22-06-1994.
I've been going at this for a few hours but no luck. I looked mainly into re but didn't find what I need. The goal is to get the data I need into a variable so that I can pass it in wherever needed.
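One possible approach with re, sketched under the assumption that names are made of plain ASCII letters: use one pattern per rule, each anchored on its "key" word, and read the wanted part from a capture group. The function name extract_person and the dictionary keys are illustrative, not part of the assignment.

```python
import re


def extract_person(text):
    """Return name, age and birthday found in text; None for any missing part."""
    patterns = {
        # "name is", a single space, then first and last name separated by a space.
        "name": r"name is ([A-Za-z]+ [A-Za-z]+)",
        # A space, two digits, a space, then the word "years".
        "age": r" (\d{2}) years",
        # The word "on", a space, then a date in dd-mm-yyyy format.
        "birthday": r"\bon (\d{2}-\d{2}-\d{4})\b",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        result[key] = match.group(1) if match else None
    return result


practice = ("Hello,everyone my name is Maria Mariova. I am 22 years old. "
            "was born on 22-06-1994.")
info = extract_person(practice)
print(info)
# {'name': 'Maria Mariova', 'age': '22', 'birthday': '22-06-1994'}
```

Each value ends up in the info dictionary, so info["name"] or info["age"] can be passed wherever it is needed. A description is valid only if none of the three values is None.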

Get the date from Excel File name in a Variable

I need to create an SSIS package that would extract data from an Excel source and load it into a SQL Server Destination.
The Excel file name would have a date, typically the file name would look like emp_20110909.xls where 11 is the Month, 09 is the Day and 09 is the Year. Now I want to capture this date and in the destination table add another column named "Extracted_Date" and populate the captured date for all the records extracted from this excel.
Can anyone tell me how to do that process?
Excel as a data source offers no explicit functionality for this whereas the Flat File Source does. I blogged about this under What is the name of a file
What you're looking to do is have a Foreach File Enumerator look in a folder for your Excel file(s). Assign the value of the currently found file to a variable like @[User::CurrentFileName]. That would look something like C:\ssisdata\mySource\Input\emp_110909.xls
You would update the Excel Connection Manager to have an expression on the ExcelFilePath property so now as the value of @[User::CurrentFileName] changes, so does the actual referenced file. You can find plenty of references to using the foreach enumerator on the web or search my answers
The last bit you need is to parse the value of CurrentFileName to find the year (11), month (09) and day (09) elements - or maybe you want it as one big value (110909). For this, I would create 4 variables: FileDate, FileYear, FileMonth, FileDay, all of type string. Yes, they're numbers but for our usage, treating them as strings is going to be easier.
FileDate will correspond to everything between the underscore following emp up until the period of xls. We're going to use the Expression language of SSIS to do this and the particular elements will be SUBSTRING, FINDSTRING and LEN
SUBSTRING(@[User::CurrentFileName], FINDSTRING(@[User::CurrentFileName], "emp_", 1) + LEN("emp_"), 6)
Here, I was lazy and just "knew" the length was 6 and hardcoded it as such. In the event that someone gives us an emp_20110909.xls this will fail. To handle that, the preceding expression would be modified to find the position of the period and then calculate the length from the emp_ position.
Now that we know FileDate, we can use SUBSTRING to slice out the first 2 elements for year, next 2 for month and final two for day.
You can then inject those values into your Data Flow via a Derived Column transformation or push them into an audit table via an Execute SQL Task.
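To make the parsing logic concrete, here is a rough Python sketch of the same idea. It is not SSIS expression syntax and is only meant to show the arithmetic. The function name parse_file_date is hypothetical, and taking the last six digits as yymmdd (so that an eight-digit date still works) is an assumption added here, not something the answer specifies.

```python
def parse_file_date(path, prefix="emp_"):
    """Return (file_date, year, month, day) as strings taken from the file name."""
    start = path.find(prefix)   # position of "emp_", like FINDSTRING
    end = path.rfind(".")       # position of the period before the extension
    if start == -1 or end == -1 or end < start:
        raise ValueError("unexpected file name: " + path)
    # Everything between the prefix and the period, whatever its length.
    file_date = path[start + len(prefix):end]
    # Assumption: the last six characters are yymmdd.
    short = file_date[-6:]
    return file_date, short[0:2], short[2:4], short[4:6]


print(parse_file_date(r"C:\ssisdata\mySource\Input\emp_110909.xls"))
# ('110909', '11', '09', '09')
```

Computing the length from the period position is what removes the hardcoded 6 from the SUBSTRING call.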

How can I replace specific text inside an excel cell with text from a different column

I have a very large spreadsheet where I need to replace a word in one column with text from another on a large scale.
I need to replace one word (in this case it is [Rate]) with information from a different column.
ex: Fixed rate of [Rate] per kilowatt hour.
finished product: Fixed rate of 0.0652 (number found in different column) per kilowatt hour.
Is this possible on a large scale? There are 800-something of these that I need to update, but my work application is rather slow, and if I can streamline this it will save me hours of time.
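Use the SUBSTITUTE function. The information page is found at:
http://office.microsoft.com/en-nz/excel-help/substitute-function-HP010062578.aspx
Syntax
SUBSTITUTE(text,old_text,new_text,instance_num)
text: the text (as a string), or a reference to the cell containing the text, in which you want to replace.
old_text: the text to be replaced in the string.
new_text: the text or reference you want to replace old_text with.
instance_num: optional. Specifies which occurrence of old_text you want to replace; if omitted, every occurrence is replaced.
For example, assuming the sentence is in A2 and the rate is in B2 (hypothetical cell references), =SUBSTITUTE(A2,"[Rate]",B2) returns the sentence with [Rate] replaced by the rate. Fill the formula down to cover the remaining rows.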

Get first ten characters in a string

I'm adding a notepad feature to my application, and I want to read the first ten characters of a string and use that string as the title for a cell in my table view.
I think I could use substringFromIndex:.
Can someone elaborate on how I could go about doing this?
[@"1234567890" substringToIndex:6]
gives you
123456
substringFromIndex: - returns a new string containing the characters of the receiver from the one at a given index to the end (Apple docs)
substringToIndex: - returns a new string containing the characters of the receiver up to, but not including, the one at a given index (Apple docs)
