ADF Understanding the Case Statement


Given the following Derived column Expression
case(Rolling =='A'||Rolling == 'B'||Rolling == 'C'|| Rolling =="S"
, "
, case(Alpha== 'EE'
, toString(toDate(Manu_Date, 'yyyy-MM-dd'))
, case(Alpha=='CW', Del_Date,"))
)
Two questions:
Is there a better way to write this code?
What is this code trying to do?
I am trying to understand what this expression is meant to achieve.

In the given expression, the value after Rolling == "S" should not be a single double-quote character ("). It should be two single quotes (''), that is, an empty string.
Similarly, the value after Del_Date should also be two single quotes.
case(Rolling =='A'||Rolling == 'B'||Rolling == 'C'|| Rolling =="S", '',
case(Alpha== 'EE', toString(toDate(Manu_Date, 'yyyy-MM-dd')),
case(Alpha=='CW', Del_Date,'' )))
What is this code trying to do?
The syntax for the case statement is
case(condition, true_expression, false_expression)
The expression first checks whether Rolling is 'A', 'B', 'C', or 'S'. If so, it assigns '' (an empty string) to the derived column.
When that condition is false, it checks whether Alpha is 'EE'. If so, it assigns Manu_Date, converted to a date with the 'yyyy-MM-dd' format and then back to a string.
When the second condition also fails, it checks whether Alpha is 'CW'. If so, it assigns the value of the Del_Date column.
When none of the conditions is met, '' (an empty string) is assigned. This is the default value.
I reproduced this with sample input.
img1: input data
In the derived column transformation, a new column is added with the expression below.
case(Rolling =='A'||Rolling == 'B'||Rolling == 'C'|| Rolling =="S", '',
case(Alpha== 'EE', toString(toDate(Manu_Date, 'yyyy-MM-dd')),
case(Alpha=='CW', Del_Date,'' )))
img2: Derived column transformation output
Is there a better way to write this code?
Since the order of the conditions determines which value is assigned to the new column, a case statement is a good fit.
However, instead of nesting case statements, a single case statement can achieve the same result.
Syntax:
case(condition_1, expression_1, condition_2, expression_2, ..., condition_n, expression_n, default_expression)
When the default expression is omitted, the default value is null.
Modified expression
case(Rolling =='A'||Rolling == 'B'||Rolling == 'C'|| Rolling =="S", '',
Alpha== 'EE', toString(toDate(Manu_Date, 'yyyy-MM-dd')),
Alpha=='CW', Del_Date,'' )
img 3: Results of both case statements
Both expressions were added to the derived column transformation, and the results are the same in both cases.
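To make the equivalence of the nested and flat forms easier to see outside of a data flow, here is a minimal Python sketch. The case helper, the nested and flat functions, and the sample rows are all hypothetical stand-ins written for this illustration; they only mimic the evaluation order described above and skip the toDate/toString conversion.

```python
def case(*args):
    """Rough stand-in for the data flow case() function (hypothetical helper)."""
    # Walk the (condition, value) pairs in order; the first true condition wins.
    for i in range(0, len(args) - 1, 2):
        if args[i]:
            return args[i + 1]
    # An odd argument count means the last argument is the default;
    # with an even count there is no default, so return None (null).
    return args[-1] if len(args) % 2 == 1 else None


ROLLING_CODES = ("A", "B", "C", "S")


def nested(rolling, alpha, manu_date, del_date):
    # Mirrors the original nested expression.
    return case(rolling in ROLLING_CODES, "",
                case(alpha == "EE", manu_date,
                     case(alpha == "CW", del_date, "")))


def flat(rolling, alpha, manu_date, del_date):
    # Mirrors the modified single case expression.
    return case(rolling in ROLLING_CODES, "",
                alpha == "EE", manu_date,
                alpha == "CW", del_date,
                "")


# Made-up sample rows: (Rolling, Alpha, Manu_Date, Del_Date)
rows = [
    ("A", "EE", "2023-01-15", "2023-02-01"),
    ("X", "EE", "2023-01-15", "2023-02-01"),
    ("X", "CW", "2023-01-15", "2023-02-01"),
    ("X", "ZZ", "2023-01-15", "2023-02-01"),
]
for row in rows:
    print(row, repr(nested(*row)), repr(flat(*row)))
```

Both functions return the same value for every row, because the flat form checks the conditions in the same order as the nested one.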

Related

SELECT statement returning the column name instead of the VALUE (for that said column)

I'm trying to pass information into a SELECT statement using the two column names 'id' and 'easy_high_score' so I can manipulate the values of those two columns in my program. But when I try to get the value of the column 'easy_high_score', which should be an integer like 46 or 20, it instead returns the string ('easy_high_score',).
Even though [('easy_high_score',)] does not appear anywhere in the table, it still prints this out. In the table, id 1 has the proper values that I'm trying to get, but to no avail. I am fairly new to SQLite3.
if mode == "Easy":
    mode = 'easy_high_score'
if mode == "Normal":
    mode = "normal_high_score"
if mode == 'Hard':
    mode = "hard_high_score"
incrementor = 1  # This is used in a for loop but not necessary for this post
c.execute("SELECT ? FROM players WHERE id=?", (mode, incrementor))
allPlayers = c.fetchall()
print(allPlayers)  # This is printing [('easy_high_score',)], when it should be printing an integer.
Expected Result: 20 (or an integer which represents the high score for easy mode)
Actual Result: [('easy_high_score',)]

A column name cannot be specified using a parameter; it must be present verbatim in the query. Modify the line that executes the query like this:
c.execute("SELECT %s FROM players WHERE id=?" % mode, (incrementor,))
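Because the column name ends up formatted into the SQL text, it is worth making sure it can only ever be one of the known column names. The following is a minimal sketch of that idea, assuming an in-memory database and a made-up row; MODE_COLUMNS and get_high_score are hypothetical names, not part of the original code.

```python
import sqlite3

# Hypothetical allow-list mapping the mode label to the real column name.
MODE_COLUMNS = {
    "Easy": "easy_high_score",
    "Normal": "normal_high_score",
    "Hard": "hard_high_score",
}


def get_high_score(conn, mode, player_id):
    # Raises KeyError for anything that is not in the allow-list.
    column = MODE_COLUMNS[mode]
    # The column name is formatted into the SQL text;
    # the id is still passed as a bound parameter.
    query = f"SELECT {column} FROM players WHERE id = ?"
    row = conn.execute(query, (player_id,)).fetchone()
    return row[0] if row else None


# Made-up sample data, just to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players (id INTEGER PRIMARY KEY, easy_high_score INTEGER,"
    " normal_high_score INTEGER, hard_high_score INTEGER)"
)
conn.execute("INSERT INTO players VALUES (1, 46, 20, 7)")
print(get_high_score(conn, "Easy", 1))  # prints 46
```

The lookup also replaces the chain of if statements, so a mistyped mode fails loudly instead of silently querying the wrong thing.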

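A possible cause of this is double quotes vs. single quotes: in SQL, double quotes delimit identifiers such as column names, while single quotes delimit string literals.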
'SELECT "COLUMN_NAME" FROM TABLE_NAME' # will give values as desired
"SELECT 'COLUMN_NAME' FROM TABLE_NAME" # will give column name like what you got
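The difference is easy to check with an in-memory SQLite database. This is a small sketch with a made-up players table. It also shows that a bound parameter is always treated as a value, which is why the original query printed the column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, easy_high_score INTEGER)")
conn.execute("INSERT INTO players VALUES (1, 46)")

# Double quotes: treated as an identifier, so the column value comes back.
as_identifier = conn.execute(
    'SELECT "easy_high_score" FROM players WHERE id = 1'
).fetchall()

# Single quotes: a string literal, so the text itself comes back.
as_literal = conn.execute(
    "SELECT 'easy_high_score' FROM players WHERE id = 1"
).fetchall()

# A bound parameter is a value too, never a column name.
bound = conn.execute(
    "SELECT ? FROM players WHERE id = ?", ("easy_high_score", 1)
).fetchall()

print(as_identifier)  # [(46,)]
print(as_literal)     # [('easy_high_score',)]
print(bound)          # [('easy_high_score',)]
```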

postgresql, select empty string

To query empty fields I have seen this answer:
Postgresql, select empty fields
(unfortunately I don't have enough reputation points to reply to @wildplasser on that post, so here we go)
Wildplasser's answer:
SELECT mystr, mystr1
FROM mytable
WHERE COALESCE(mystr, '') = ''
OR COALESCE(mystr1, '') = ''
;
I am not sure I get the COALESCE method, but it also works for me this way (specific for my string data type):
SELECT mystr, mystr1
FROM mytable
WHERE mystr = '' ;
My questions are:
Does COALESCE work for any data type?
Is there any better way to query empty strings? i.e., column_value = ' '

First you need to understand the difference between NULL and "empty".
NULL is the absence of a value. Any (or at least almost any) data type can be NULL. When you have a column of type integer, and you don't want to put a value in that field, you put NULL.
"Empty" is a string/text concept. It's a string with an empty value, i.e. ''. A text field with an empty string contains a value: the empty string. It is not the same as containing NULL, i.e. no value. Other data types e.g. integer, boolean, json, whatever, can't have an empty string.
Now to COALESCE. That function works on any data type, and basically it returns the first not-NULL result of its arguments. So COALESCE(NULL, TRUE) returns TRUE because the first argument is NULL; COALESCE(FALSE, TRUE) returns FALSE because the first argument is not NULL; and COALESCE(NULL, NULL) returns NULL because there are no not-NULL arguments.
So, COALESCE(field, '') returns the value of field if it's not NULL, and otherwise returns an empty string. When used in COALESCE(field, '') = '' when trying to find any rows where field is "empty", this is basically saying "if field is NULL then use an empty string in its place, then see if it equals an empty string". This is because NULL and an empty string are not equivalent, and "you" are trying to find any rows where fields are NULL or empty.
In your version of the query, where you just do field = '', that will ONLY return results where field is actually an empty string, not where field is NULL. Which behaviour you desire is up to you.
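The two behaviours can be compared side by side. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained; COALESCE and NULL comparison behave the same way for this case), with a made-up three-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, mystr TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "abc"), (2, ""), (3, None)],  # a value, an empty string, a NULL
)

# Matches only the row holding an actual empty string.
empty_only = [
    r[0] for r in conn.execute("SELECT id FROM mytable WHERE mystr = '' ORDER BY id")
]

# Matches the empty string and the NULL, because NULL is replaced by '' first.
empty_or_null = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM mytable WHERE COALESCE(mystr, '') = '' ORDER BY id"
    )
]

print(empty_only)     # [2]
print(empty_or_null)  # [2, 3]
```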

With COALESCE, the first query also returns the rows where the value is NULL.
1- In PostgreSQL, you can't mix data types within COALESCE, but you can use the to_char function to convert values so they can be mixed.
2- I don't understand your question.

I think that, based on the definition of COALESCE itself,
"The COALESCE() function returns the first non-null value in a list."
it works for any data type.
I don't really understand the second question, but I think yes, comparing against '' is already the most efficient way to query for an empty string.

How to convert matlab table [Inf], '' entry to char string

I have a Matlab table and want to create an SQL INSERT statement of this line(s).
K>> obj.ConditionTable
obj.ConditionTable =
Name              Data      Category           Description
________________  ________  _________________  ___________
'Layout'          'STR'     ''                 ''
'Radius'          [ Inf]    'Radius_2000_inf'  ''
'aq'              [   0]    '0'                ''
'VehicleSpeed'    [ 200]    'Speed_160_230'    ''
Errors when conditionTable = obj.ConditionTable(1,:);
K>> char(conditionTable.Data)
Error using char
Cell elements must be character arrays.
K>> char(conditionTable.Description)
ans =
Empty matrix: 1-by-0
problem: the [Inf] entry
problem: possibly [123] number entries
problem: '' entries
Additionally, following commands are also useless in this matter:
K>> length(conditionTable.Data)
ans =
1
K>> isempty(conditionTable.Description)
ans =
0
Target Statement would be something like this:
INSERT INTO `ConditionTable` (`Name`, `Data`, `Category`, `Description`, `etfmiso_id`) VALUES ("Layout", "STR", "", "", 618);

Yes, num2str accepts a single variable of any type and will return a string, so all these operations are valid:
>> num2str('123')
ans =
123
>> num2str('chop')
ans =
chop
>> num2str(Inf)
ans =
Inf
It can also deal with purely numeric arrays (e.g. num2str([5 456]) is also valid), but it will bomb out if you try to throw a cell array at it (even if all your cells are numeric).
There are two possible ways to work around that and convert all your values to character arrays:
1) Use an intermediate cell array
I recreated a table [T] with the same data as in your example. Then running:
%% Intermediate Cell array
T3 = cell2table( cellfun( @num2str , table2cell(T) , 'uni',0) ) ;
T3.Properties.VariableNames = T.Properties.VariableNames
T3 =
Name              Data     Category           Description
______________    _____    _________________  ___________
'Layout'          'STR'    ''                 ''
'Radius'          'Inf'    'Radius_2000_inf'  ''
'aq'              '0'      '0'                ''
'VehicleSpeed'    '200'    'Speed_160_230'    ''
produces a new table containing only strings. Notice that we had to recreate the column names (copied from the initial table), as these are not transferred into the cell array during conversion.
This method is suitable for relatively small tables, as the table/cell array/table round trip plus the call to cellfun will probably be quite slow for larger tables.
2) Use varfun function
varfun is for tables the equivalent of cellfun for cell arrays. You'd think that a simple
T2 = varfun( @num2str , T )
would do the job then ... well no. This will error too. If you look at the varfun code at the line indicated by the error, you'll notice that internally, the data in your table are converted to cell arrays and the function is applied to those. As we saw above, num2str errors when met with a cell array. The trick to overcome that is to pass a customised version of num2str which will accept cell arrays. For example:
cellnum2str = @(x) cellfun(@num2str,x,'uni',0)
Armed with that, you can now use it to convert your table:
%% Use "varfun"
cellnum2str = @(x) cellfun(@num2str,x,'uni',0) ;
T2 = varfun( cellnum2str , T ) ;
T2.Properties.VariableNames = T.Properties.VariableNames ;
This will produce the same table as in example 1 above. Notice that again we had to reassign the column headers on the newly created table (the irony is that varfun choked trying to apply the function on the column headers, but does not re-use or return them in the output ... go figure.)
Discussion: Initially I tried to make the varfun solution work (hence the T2 name of the result), and wanted to recommend this one, because I didn't like the table/cell/table conversion of the other solution. Now that I have seen what goes on inside varfun, I am not so sure that this solution will be faster. It might be slightly more readable in a semantic way, but if speed is a concern you'll have to try both versions and choose whichever gives you the best results.

For the record: num2str(cell2mat(conditionTable.Data)) works regardless of whether the entry is 'abc', [Inf], [0], or [123.123], apparently.

Using REGEXP_SUBSTR to get key-value pair data

I have a column with below values,
User_Id=446^User_Input=L307-60#/25" AP^^
I am trying to get each individual value based on a specified key.
Everything after User_Id= until it encounters ^
Everything after User_Input= until it encounters ^
This is what I have tried so far:
SELECT LTRIM(REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^'
,'[0-9]+',1,1),'^') User_Id
from dual
How do I get the value for the User_Input??
P.S: User input can have anything, like ',", *,% including a ^ in the middle of the string (that is, not as a delimiter).
Any help would be greatly appreciated..

This can be easily solved using boring old INSTR to calculate the offsets of the start and end points for the KEY and VALUE strings.
The trick is to use the optional occurrence parameter to identify the correct instance of each =. Because the input can contain carets which aren't intended as delimiters, we need to use a negative position to identify the last ^.
with cte as (
select kv
, instr(kv, '=', 1, 1)+1 as k_st -- first occurrence
, instr(kv, '^', 1) as k_end
, instr(kv, '=', 1, 2)+1 as v_st -- second occurrence
, instr(kv, '^', -1) as v_end -- counting from back
from t23
)
select substr(kv, k_st, k_end - k_st) as user_id
, substr(kv, v_st, v_end - v_st) as user_input
from cte
/
Here is the requisite SQL Fiddle to prove it works. I think it's much easier to understand than any regex equivalent.
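For readers who want to trace the offset arithmetic without a database, here is a rough Python equivalent of the same idea. The function name split_key_values is hypothetical, and str.find / str.rfind stand in for INSTR with a positive and a negative position. Note that Python offsets are 0-based, while INSTR positions are 1-based.

```python
def split_key_values(kv):
    """Slice out the two values using delimiter offsets (hypothetical helper)."""
    k_st = kv.find("=") + 1         # just after the first '='
    k_end = kv.find("^")            # first '^' ends the User_Id value
    v_st = kv.find("=", k_st) + 1   # just after the second '='
    v_end = kv.rfind("^")           # last '^', searching from the back
    return kv[k_st:k_end], kv[v_st:v_end]


print(split_key_values('User_Id=446^User_Input=L307-60#/25" AP^'))
# ('446', 'L307-60#/25" AP')

# A caret inside the user input is kept, since only the last one is the delimiter.
print(split_key_values("User_Id=1^User_Input=a^b^"))
# ('1', 'a^b')
```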

If there is no particular need to use Regex, something like this returns the value.
WITH rslt AS (
SELECT 'User_Id=446^User_Input=L307-60#/25" AP^' val
FROM dual
)
SELECT LTRIM(SUBSTR(val
,INSTR(val, '=', 1, 2) + 1
,INSTR(val, '^', 1, 2) - (INSTR(val, '=', 1, 2) + 1)))
FROM rslt;
Of course, if you can't guarantee that there will not be any carets that are valid text characters, this will possibly return partial results.

Assuming that you will always have 'User_Id=' and 'User_Input=' in your string, I would use a character group approach to parsing
Use the starting anchor,^, and ending anchor, $. Look for 'User_Id=' and 'User_Input='
Associate the value you are searching for with a character group.
SCOTT@dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 1) User_Id
2* FROM dual
SCOTT@dev> /
USER
====
446^
SCOTT@dev>
1 SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 2) User_Input
2* FROM dual
SCOTT@dev> /
USER_INPUT
================
L307-60#/25" AP^
SCOTT@dev>

Got this answer from a friend of mine.. Looks simple and works great...
SELECT
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Id=([^\^]+).*', '\1') User_Id,
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Input=(.*)[\^]$', '\1') User_Input
FROM dual
Posting here just in case any of you find it interesting..
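The capture-group idea can also be tried out in Python's re module. This is only a sketch under two assumptions: the string always has the form User_Id=...^User_Input=...^, and one or more trailing carets are delimiters rather than data. A value that genuinely ends in a caret would lose it. PATTERN and parse_pairs are hypothetical names.

```python
import re

# User_Id value: anything without a caret.
# User_Input value: anything, matched lazily, followed by the trailing caret(s).
PATTERN = re.compile(r'User_Id=([^^]*)\^User_Input=(.*?)\^+')


def parse_pairs(text):
    # fullmatch requires the pattern to cover the whole string.
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_pairs('User_Id=446^User_Input=L307-60#/25" AP^^'))
# ('446', 'L307-60#/25" AP')
```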

Strange SELECT behavior

I have a strange problem. I have a table with 10 columns of type character varying.
I need a function that searches all records and returns the id of the record that contains all of the given strings. Let's say the records are:
1. a,b,c,d,e
2. a,k,l,h
3. f,t,r,e,w,q
If I call func(a,d) it should return 1; if I call func(e,w,q) it should return 3.
The function is
CREATE OR REPLACE FUNCTION func(ma1 character varying,ma2 character varying,ma3 character varying,ma4 character varying)
DECLARE name numeric;
BEGIN
SELECT Id INTO name from Table WHERE
ma1 IN (col1,col2,col3,col4) AND
ma2 IN (col1,col2,col3,col4) AND
ma3 IN (col1,col2,col3,col4) AND
ma4 IN (col1,col2,col3,col4);
RETURN name;
END;
It works 90% of the time; the weird problem is that some rows are not found.
It's not an uppercase or lowercase problem.
What can be wrong? It's version 9.1 on 64-bit Windows 7. I feel it's an encoding or string problem, but I can't see where or what.
//OK, I found the problem: it has to do with all the columns. If all 24 columns are filled in, then it's not working. But why? Are there limitations because there are 24 columns that I must compare with?//
Can someone help me, please?
Thanks.

The problem is (probably) that some of your columns or arguments are null.
In SQL, an equality comparison with a null is never true: it evaluates to NULL (unknown), which a WHERE clause treats as false. This extends to the list of values used with the IN (...) condition.
If the value being sought is null, or if it matches none of the non-null values and the list contains a null, the IN test evaluates to NULL rather than true, so the row is not returned.
The work-around is to make sure no values are null, which unfortunately results in a verbose solution:
WHERE ma1 IN (COALESCE(col1, ''), COALESCE(col2, ''), ...)
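How IN treats nulls is worth verifying rather than assuming. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained; both follow the standard three-valued logic here). The evaluate helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expr):
    """Evaluate a constant SQL expression and return its single result."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]


found = evaluate("'a' IN ('a', NULL)")        # a real match still succeeds
not_found = evaluate("'b' IN ('a', NULL)")    # no match plus a NULL in the list
null_sought = evaluate("NULL IN ('a', 'b')")  # the sought value itself is NULL
patched = evaluate("COALESCE(NULL, '') IN ('a', '')")  # NULL replaced by ''

print(found)        # 1
print(not_found)    # None (NULL: neither true nor false)
print(null_sought)  # None
print(patched)      # 1
```

In a WHERE clause both 0 and NULL filter the row out, so the case that usually bites is a null argument, such as an unused ma parameter.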

I suspect Bohemian is correct that the problem is related to nulls in your IN clauses. An alternative approach is to use Postgres's array "is contained by" operator (<@) to perform your test.
where ARRAY[ma1,ma2,ma3,ma4] <@ ARRAY[col1,col2,...,colN]
