The details column has string data in the following format (enclosed in curly braces as shown):
{"id":"350876","Time":"Aug 22 2022, 12:41:57 PM" ,"Session":"NO","teamPercentage":89}
How do I add id, Time, Session, and teamPercentage as new columns in the same row?
My thought process:
Do I need a pivot of some sort? But I don't know whether a pivot can be done with string values, or more specifically how to detect id, Time, Session, etc. and pivot those values.
This seems to be a JSON value, so just treat it as one; then you can use the JSON functions to extract the values for the keys:
select details::json ->> 'id' as id,
details::json ->> 'Time' as time,
details::json ->> 'Session' as session,
details::json ->> 'teamPercentage' as percentage
from the_table;
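If you want these values as persistent columns on the same row rather than just query output, here is a minimal sketch, assuming the table is named the_table and choosing illustrative column names and types (all hypothetical):
-- Hypothetical sketch: add real columns, then backfill them from the JSON string.
ALTER TABLE the_table
    ADD COLUMN id_value text,
    ADD COLUMN time_value text,
    ADD COLUMN session_value text,
    ADD COLUMN team_percentage numeric;

UPDATE the_table
SET id_value        = details::json ->> 'id',
    time_value      = details::json ->> 'Time',
    session_value   = details::json ->> 'Session',
    team_percentage = (details::json ->> 'teamPercentage')::numeric;
Note that ->> returns text, which is why teamPercentage needs an explicit cast to numeric.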
Context: I have a data flow that extracts data from a SQL database. The data arrives as a single column containing a tab-separated string, so in order to manipulate it properly I have tried to split out every column with its corresponding data.
First, to 'rebuild' the table properly, I used a 'Derived Column' transformation to replace tabs (and line breaks) with semicolons (1):
dropLeft(regexReplace(regexReplace(regexReplace(descripcion, '[\t]', ';'), '[\n]', ';'), '[\r]', ';'), 1)
Then I used the split() function to get an array and build the columns (2):
split(descripcion, ';')
Problem: When I try to use the 'Flatten' transformation (as described at https://learn.microsoft.com/en-us/azure/data-factory/data-flow-flatten), it simply does not work: the data flow gives me just one column, or if I add an additional column in the 'Flatten' settings I just get another column with the same data as the first one.
Expected output:
column1   column2                              column3
2000017   ENVASE CORONA CLARA 24/355 ML GRAB   PC13
2004297   ENVASE V FAM GRAB 12/940 ML USADO    PC15
Could you tell me what I'm doing wrong? Thanks in advance.
You can use the derived column transformation itself; try it as below.
After the first derived column, what you have is a delimited string, which can simply be split again using another derived column schema modifier. Note that array indexes in data flow expressions are 1-based.
Here firstc represents the source column, equivalent to your column descripcion:
Column1: split(firstc, ';')[1]
Column2: split(firstc, ';')[2]
Column3: split(firstc, ';')[3]
Optionally, you can then select just the columns you need to write to the SQL sink.
I have an Excel file that I am trying to import into an Oracle database table.
Some of the values in the Excel file are time-like strings containing colons, for example 14:39.5. What data type should I give the Oracle table column to store such a value?
I currently have it as a VARCHAR data type, and the import throws this error:
Conversion error! Value: "00:12:01.615518000" to data type: "Number". Row ignored! Value is '00:12:01.615518000'. Cannot be converted to a decimal number object. Valid format: 'Unformatted'
You can store it as an INTERVAL DAY(0) TO SECOND(9) data type:
CREATE TABLE table_name (
  time INTERVAL DAY(0) TO SECOND(9)
);
Then you can use TO_DSINTERVAL, prepending '0 ' (zero days) to the start of your value:
INSERT INTO table_name (time)
VALUES ( TO_DSINTERVAL('0 ' || '00:12:01.615518000') );
If it is part of a date/time stamp then you could store it as a DATE or TIMESTAMP, provided you can add the date component; Oracle does not have a standalone TIME data type.
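For example, a minimal sketch of that approach, where the table name and the 2022-01-01 date component are purely illustrative assumptions:
-- Hypothetical sketch: attach an arbitrary date component and store a TIMESTAMP.
CREATE TABLE timestamp_table ( ts TIMESTAMP(9) );

INSERT INTO timestamp_table (ts)
VALUES ( TO_TIMESTAMP('2022-01-01 00:12:01.615518000', 'YYYY-MM-DD HH24:MI:SS.FF9') );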
If you can't add a date component to it, then, assuming it is a time interval, you could convert it to seconds or microseconds (losing the colons) and store it as a NUMBER, as sketched below.
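A minimal sketch of that conversion, assuming a hypothetical number_table with an elapsed_seconds NUMBER column and reusing TO_DSINTERVAL from above:
-- Hypothetical sketch: turn the interval into a total number of seconds.
INSERT INTO number_table (elapsed_seconds)
VALUES (
    EXTRACT(HOUR   FROM TO_DSINTERVAL('0 00:12:01.615518000')) * 3600
  + EXTRACT(MINUTE FROM TO_DSINTERVAL('0 00:12:01.615518000')) * 60
  + EXTRACT(SECOND FROM TO_DSINTERVAL('0 00:12:01.615518000'))
);
EXTRACT(SECOND FROM ...) keeps the fractional seconds, so no precision is lost.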
If you want to maintain the exact formatting as shown, your only option is to store it as text using VARCHAR2 or something similar.
I'm working on a website in Node.js and am using SQLite as a database for the first time.
I want to be able to use REAL for some form data, but I noticed that every REAL in my database gets converted to an integer once the query is made.
To visualize the database I am using DB Browser, and I checked that the columns are defined as REAL, which they are.
If I try to query a value stored as 0.1 in my DB I get this:
sqlite> select step_variable
from variables
where id=38;
0.0
After trying the command TYPEOF(step_variable) as suggested, it returned:
0.0|real
In the SQLite CREATE TABLE command, one defines a data type affinity, not a data type. SQLite supports the following five column affinities: TEXT, NUMERIC, INTEGER, REAL, and BLOB (formerly documented as NONE).
Thus the type name you specify when creating a table does not enforce a certain data type on the stored values. You can supply any type name you want, or even omit it entirely:
CREATE TABLE table1(
column1 ABC,
column2 Others,
column3 WHATEVER);
CREATE TABLE table2(column1, column2, column3);
Populate tables:
INSERT INTO table1 VALUES( 1, 'my text', 123.45);
INSERT INTO table2 VALUES( 1, 'my text', 123.45);
Now let us check what SQLite made out of it:
SELECT column1, TYPEOF(column1) FROM table1;
SELECT column2, TYPEOF(column2) FROM table1;
SELECT column3, TYPEOF(column3) FROM table1;
which returns:
column    TYPEOF(column)
------------------------
1         integer
my text   text
123.45    real
When you step through a query result, e.g. by using sqlite3_step, you can use the sqlite3_column_type function to confirm the value's type, unless you know the result anyway and simply cast it to the data type expected.
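The same check can also be done at the SQL level; here is a small sketch against the asker's table (table and column names taken from the question):
-- Inspect the stored type of a value and coerce it explicitly if needed.
SELECT step_variable,
       TYPEOF(step_variable)       AS stored_type,
       CAST(step_variable AS REAL) AS as_real
FROM variables
WHERE id = 38;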
Martin
I found the solution: it was simply that I didn't save my file after modifying it.
We have a spreadsheet that gets updated monthly, which queries some data from our server.
The query url looks like this:
http://example.com/?2016-01-31
The returned data is in a json format, like below:
{"CID":"1160","date":"2016-01-31","rate":{"USD":1.22}}
We only need the value of 1.22 from the above and I can get that inserted into the worksheet with no problem.
My questions:
1. How can I use a cell value [containing the date] to pass the date parameter [2016-01-31] to the query, and display the result in the cell next to it?
2. There's a long list of dates in a column; can this query be filled down automatically for each date?
3. When I load the query result to the worksheet, it always loads in pairs [taking up two cells: one says "Value", the other contains the value, which is "1.22" in my case]. Ideally I would only need "1.22", not the title; can this be removed? [Del won't work, it will give you a "Column1" header instead, and hiding the entire row will mess up the layout.]
I know this is a lot to ask, but I've done a lot of searching and reading in the last few days, and I have to say the M language beats me.
Thanks in advance.
Convert your Web.Contents() request into a function:
let
    myFunct = (param as date) => let
        x = Web.Contents(.... & Date.ToText(param) & ....)
    in
        x
in
    myFunct
Reference your data request function from a new query and include any transformations you need (in this case Json.Document, table expansions, and removing extraneous data). Feel free to delete all the extra data here, including columns that just contain the label 'Value'.
(Assuming your table of domain values already exists) add a custom column like:
=Expand(myFunct( [someparameter] ))
Edit: got home and got into my bookmarks. Here is a more detailed reference for what you are looking to do: http://datachix.com/2014/05/22/power-query-functions-some-scenarios/
For a table: add a column where you get the data and parse the JSON:
let
tt=#table(
{"date"},{
{"2017-01-01"},
{"2017-01-02"},
{"2017-01-03"}
}),
add_col = Table.AddColumn(tt, "USD", each Json.Document(Web.Contents("http://example.com/?date="&[date]))[rate][USD])
in
add_col
If you need only one value:
Json.Document(Web.Contents("http://example.com/?date="&YOUR_DATE_STRING))[rate][USD]
Short version: Is it possible to query for all timeuuid columns corresponding to a particular date?
More details:
I have a table defined as follows:
CREATE TABLE timetest(
key uuid,
activation_time timeuuid,
value text,
PRIMARY KEY(key,activation_time)
);
I have populated this with a single row, as follows (f0532ef0-2a15-11e3-b292-51843b245f21 is a timeuuid corresponding to the date 2013-09-30 22:19:06+0100):
insert into timetest (key, activation_time, value) VALUES (7daecb80-29b0-11e3-92ec-e291eb9d325e, f0532ef0-2a15-11e3-b292-51843b245f21, 'some value');
And I can query for that row as follows:
select activation_time,dateof(activation_time) from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e
which results in the following (using cqlsh)
activation_time | dateof(activation_time)
--------------------------------------+--------------------------
f0532ef0-2a15-11e3-b292-51843b245f21 | 2013-09-30 22:19:06+0100
Now let's assume there's a lot of data in my table and I want to retrieve all rows where activation_time corresponds to a particular date, say 2013-09-30 22:19:06+0100.
I would have expected to be able to query for the range of all timeuuids between minTimeuuid('2013-09-30 22:19:06+0100') and maxTimeuuid('2013-09-30 22:19:06+0100') but this doesn't seem possible (the following query returns zero rows):
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100') and activation_time<=maxTimeuuid('2013-09-30 22:19:06+0100');
It seems I need to use a hack whereby I increment the second date in my query (by a second) to catch the row(s), i.e.,
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100') and activation_time<=maxTimeuuid('2013-09-30 22:19:07+0100');
This feels wrong. Am I missing something? Is there a cleaner way to do this?
The CQL documentation discusses timeuuid functions but it's pretty short on gte/lte expressions with timeuuids, beyond:
The min/maxTimeuuid example selects all rows where the timeuuid column, t, is strictly later than 2013-01-01 00:05+0000 but strictly earlier than 2013-02-02 10:00+0000. The t >= maxTimeuuid('2013-01-01 00:05+0000') does not select a timeuuid generated exactly at 2013-01-01 00:05+0000 and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').
p.s. the following query also returns zero rows:
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time<=maxTimeuuid('2013-09-30 22:19:06+0100');
and the following query returns the row(s):
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100');
I'm sure the problem is that cqlsh does not display milliseconds for your timestamps.
So the real timestamp is something like '2013-09-30 22:19:06.123+0100'.
When you call maxTimeuuid('2013-09-30 22:19:06+0100'), the milliseconds are missing, so zero is assumed; it is the same as calling maxTimeuuid('2013-09-30 22:19:06.000+0100').
And since 22:19:06.123 > 22:19:06.000, the record gets filtered out.
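Given that, a cleaner alternative to incrementing the end time is to bound the range with minTimeuuid of the next second, exclusive; here is a sketch using the values from the question:
-- Catches every timeuuid generated within the 22:19:06 second, whatever its millisecond part.
select * from timetest
where key = 7daecb80-29b0-11e3-92ec-e291eb9d325e
  and activation_time >= minTimeuuid('2013-09-30 22:19:06+0100')
  and activation_time < minTimeuuid('2013-09-30 22:19:07+0100');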
Not directly related to the answer, but as an additional add-on to @dimas's answer:
cqlsh (version 5.0.1) seems to show the milliseconds now:
system.dateof(id)
---------------------------------
2016-06-03 02:42:09.990000+0000
2016-05-28 17:07:30.244000+0000