Presto: how to extract items in a string to columns

I'm looking to use Presto to extract the filename and URL from a string like the one below. Any advice? Here's an example:
{name=filename.pdf, url:https://url.com}
Thanks!

You can use regexp_extract:
WITH t(value) AS (
VALUES '{name=filename.pdf, url:https://url.com}'
)
SELECT
regexp_extract(value, 'name=(.*), url:(.*)}', 1) AS name,
regexp_extract(value, 'name=(.*), url:(.*)}', 2) AS url
FROM t
=>
name | url
--------------+-----------------
filename.pdf | https://url.com
(1 row)
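To see what the two capture groups pick up outside of Presto, here is a minimal Python sketch using the same pattern. Python is used only for illustration, and the function name extract_name_and_url is hypothetical. The closing brace is escaped here for clarity.

```python
import re

# Same pattern as in the Presto query above; group 1 is the name, group 2 the URL.
PATTERN = re.compile(r"name=(.*), url:(.*)\}")

def extract_name_and_url(value):
    """Return (name, url) from a string like '{name=..., url:...}', or None if it does not match."""
    match = PATTERN.search(value)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(extract_name_and_url("{name=filename.pdf, url:https://url.com}"))
# prints ('filename.pdf', 'https://url.com')
```

Because both groups use a greedy `.*`, a value that itself contains `, url:` would shift the split point, so this sketch assumes the separator appears only once.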

Related

Kusto query language split # character and take last item

If I have a string for example:
"this.is.a.string.and.I.need.the.last.part"
I am trying to get the last part of the string after the last ".", which in this case is "part"
How do I achieve this?
One way I tried was to split the string on ".". I get an array back, but then I don't know how to retrieve the last item in the array.
| extend ToSplitstring = split("this.is.a.string.and.I.need.the.last.part", ".")
gives me:
["this", "is","a","string","and","I","need","the","last", "part"]
As a second attempt, I tried this:
| extend ToSubstring = substring(myString, lastindexof(myString, ".")+1)
but Kusto does not have a lastindexof function.
Anyone with tips?
You can access the last member of the array using the negative index -1.
For example, this:
print split("this.is.a.string.and.I.need.the.last.part", ".")[-1]
returns a table with a single column and a single record, whose value is "part".
You can also try the code below, and feel free to change it to meet your needs:
let lastIndexof = (input:string, lookup: string) {
indexof(input, lookup, 0, -1, countof(input,lookup))
};
your_table_name
| extend ToSubstring = substring("this.is.a.string.and.I.need.the.last.part", lastIndexof("this.is.a.string.and.I.need.the.last.part", ".")+1)
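The two approaches above (take the last element after splitting, or cut at the last index of the separator) can be compared in a small Python sketch. This is an illustration only, not Kusto; the function names last_part and last_part_by_index are hypothetical.

```python
def last_part(text, sep="."):
    """Return the text after the last occurrence of sep (the whole text if sep is absent)."""
    # rsplit with maxsplit=1 splits only once, starting from the right.
    return text.rsplit(sep, 1)[-1]

def last_part_by_index(text, sep="."):
    """Same result, computed from the index of the last separator."""
    index = text.rfind(sep)  # -1 when sep does not occur
    if index == -1:
        return text
    return text[index + len(sep):]

print(last_part("this.is.a.string.and.I.need.the.last.part"))           # prints part
print(last_part_by_index("this.is.a.string.and.I.need.the.last.part"))  # prints part
```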

How to extract specific sub directory names from URL

Given the following request URLs:
https://example.com/api/foos/123/bars/456
https://example.com/api/foos/123/bars/456
https://example.com/api/foos/123/bars/456/details
Common structure: https://example.com/api/foos/{foo-id}/bars/{bar-id}
I wish to get separate columns for the values of {foo-id} and {bar-id}
What I tried
requests
| where timestamp > ago(1d)
| extend parsed_url=parse_url(url)
| extend path = tostring(parsed_url["Path"])
| extend: foo = "value of foo-id"
| extend: bar = "value of bar-id"
This gives me /api/foos/{foo-id}/bars/{bar-id} as a new path column.
Can I solve this question without using regular expressions?
Related, but not the same question:
Application Insights: Analytics - how to extract string at specific position
Splitting on the '/' character gives you an array, and you can then extract the elements you are looking for by index, as long as the path structure stays consistent. Using parse_url() is optional; you could use substring() instead, or just adjust the indexes you retrieve.
requests
| extend path = parse_url(url)
| extend elements = split(substring(tostring(path.Path), 1), "/") // substring(..., 1) drops the leading slash
| extend foo=tostring(elements[2]), bar=tostring(elements[4])
| summarize count() by foo, bar
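The index arithmetic is easier to check outside the query. Below is a minimal Python sketch of the same split-by-slash idea, with no regular expressions; it is not Kusto, and the function name extract_ids is hypothetical. It assumes the path always follows /api/foos/{foo-id}/bars/{bar-id}.

```python
from urllib.parse import urlparse

def extract_ids(url):
    """Return (foo_id, bar_id) from .../api/foos/{foo-id}/bars/{bar-id}[/...], or None."""
    # strip("/") drops the leading slash, so index 0 is "api".
    elements = urlparse(url).path.strip("/").split("/")
    if len(elements) < 5:
        return None  # path is shorter than the expected structure
    return elements[2], elements[4]

print(extract_ids("https://example.com/api/foos/123/bars/456/details"))
# prints ('123', '456')
```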

HiveQL string function questions

I'm using HiveQL to run the below query.
The intention is that the case statement removes the last XX characters from the end of the domain, dependent on the suffix (.com, .co.uk).
This doesn't seem to work as there is no change to the strings in the 'domainnew' column in the output.
Can anyone advise how I would make this work?
I then also need to take the output of 'domainnew' and keep only the characters to the right of the first '.' when reading from the right-hand side. For example:
domain = mobile.domain.facebook.com
domainnew = mobile.domain.facebook
newcalc = facebook
Any advice on this would be brilliant!!
Thank you
select domain, catid, apnid, sum(optimisedsize) as bytes,
CASE domain
WHEN instr(domain, '.co.uk') THEN substr(domain,LENGTH(domain)-6)
WHEN instr(domain, '.com') THEN substr(domain,LENGTH(domain)-6)
ELSE domain
END as domainnew
from udsapp.web
where dt = 20170330 and hour = 04 and loc = 'FAR1' and catid <> "0:0" group by domain, catid, apnid sort by bytes desc;
with t as (select 'mobile.domain.facebook.com' as domain)
select regexp_extract(domain,'(.*?)(\\.com|\\.co\\.uk|)$',1) as domainnew
,regexp_extract(domain,'.*?([^.]+)(\\.com|\\.co\\.uk|)$',1) as new_calc
from t
;
+------------------------+----------+
| domainnew | new_calc |
+------------------------+----------+
| mobile.domain.facebook | facebook |
+------------------------+----------+
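For readers who want to try the two patterns without a Hive session, here is a rough Python equivalent. It is a sketch for illustration; the function name split_domain is hypothetical, and Python's re module is assumed to treat these particular patterns the same way as Hive's Java regex engine.

```python
import re

# Group 1 is everything before an optional .com / .co.uk suffix.
DOMAINNEW = re.compile(r"(.*?)(\.com|\.co\.uk|)$")
# Group 1 is the last dot-free label before that optional suffix.
NEW_CALC = re.compile(r".*?([^.]+)(\.com|\.co\.uk|)$")

def split_domain(domain):
    """Return (domainnew, new_calc) for a domain string."""
    domainnew = DOMAINNEW.search(domain).group(1)
    match = NEW_CALC.search(domain)
    new_calc = match.group(1) if match else None
    return domainnew, new_calc

print(split_domain("mobile.domain.facebook.com"))
# prints ('mobile.domain.facebook', 'facebook')
```

The empty alternative at the end of `(\.com|\.co\.uk|)` is what lets domains with other suffixes pass through unchanged in the first pattern.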

Find a string using PHPMyAdmin

I have a table dle_post in my DB with the columns id and full_story. I want to check whether full_story starts with "1." and, if so, list its id. The big problem is that full_story sometimes begins with spaces: sometimes one, sometimes two, sometimes three. How can I list all ids whose full_story starts with "1."?
You want to execute some SQL like this, which you can also do in phpMyAdmin:
SELECT id FROM dle_post WHERE LTRIM(full_story) LIKE '1.%';
I think this will work!
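The same check can be sketched in Python: remove the leading spaces, then test for the "1." prefix. Including the dot in the prefix matters, since a bare "1" would also match stories starting with "10." or "12.". The rows below are made-up sample data standing in for dle_post, and the function name is hypothetical.

```python
def starts_with_one_dot(full_story):
    """True if the text begins with '1.' once leading spaces are ignored."""
    # lstrip(" ") removes only leading spaces, mirroring what LTRIM does.
    return full_story.lstrip(" ").startswith("1.")

# Hypothetical (id, full_story) rows standing in for the dle_post table.
rows = [(1, "1. first"), (2, "   1. indented"), (3, "10. ten"), (4, "2. two")]
matching_ids = [row_id for row_id, story in rows if starts_with_one_dot(story)]
print(matching_ids)  # prints [1, 2]
```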
Would this query help:
$id = fetch id here;
mysql_query("SELECT * FROM YOUR_TABLE WHERE id LIKE '%".$id."%'", $someconnection);
Replace YOUR_TABLE with your table name.

URL Query String Variable Count in PHP

I am developing accounting software as a project. I want to count the sets of variables in a URL query string, e.g.:
http://localhost/xampp/khata2/test/tableaddrow_nw.php?accname1=1&DrAmount1=1&CrAmount1=1&accname2=2+was2&DrAmount2=2&CrAmount2=2&accname3=3+was3&DrAmount3=3&CrAmount3=3&accname4=4+was4&DrAmount4=4&CrAmount4=4&accname5=5+was5&DrAmount5=5&CrAmount5=5&accname6=6+was6&DrAmount6=6&CrAmount6=6
The query string contains one set of parameters per row (accname1, DrAmount1, CrAmount1 ... accname6, DrAmount6, CrAmount6, and so on), generated dynamically as the user enters data in rows.
How can I find out how many rows (sets of data) are in the query string?
Any help is appreciated.
$rows = explode("&", $_SERVER['QUERY_STRING']);
This will give you an array of:
accname1=1
DrAmount1=1
CrAmount1=1
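So count($rows) gives you the total number of parameters. Each row contributes three of them (accname, DrAmount, CrAmount), so count($rows) / 3 is the number of rows.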
Is this what you meant?
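An alternative to dividing the total by three is to count only the parameters that open each set. The sketch below shows that idea in Python rather than PHP, purely as an illustration; the function name count_rows is hypothetical, and it assumes every row has exactly one accnameN parameter.

```python
from urllib.parse import parse_qsl

def count_rows(query_string, prefix="accname"):
    """Count row sets by counting the parameters whose name starts with prefix."""
    # keep_blank_values=True so that rows with an empty value are still counted.
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return sum(1 for name, _ in pairs if name.startswith(prefix))

# Shortened version of the query string from the question (two rows).
query = "accname1=1&DrAmount1=1&CrAmount1=1&accname2=2+was2&DrAmount2=2&CrAmount2=2"
print(count_rows(query))  # prints 2
```

Counting by prefix keeps working if more parameters per row are added later, whereas dividing by a fixed number would not.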
