Can we write an INSERT query in a Lookup activity in ADF (Azure)?

At the end of the pipeline I want to run the query below:
INSERT INTO [TestTable] (Job_name, status) VALUES (Job_name, current_timestamp())
Job_name will be passed as a parameter.
Can this be written in a Lookup? Please let me know.

Definitely possible, but should you write massive inserts/scripts inside a Lookup? Probably not a great idea, but see below (a truncate example, which works the same way with an INSERT).
I use this method for small things like truncating a table, but never for big code, which should be stored as source in the database.
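For illustration, the truncate example would be something like this in the Lookup's query box (a sketch; the trailing SELECT is there because a Lookup expects its query to return at least one row):
TRUNCATE TABLE dbo.MyTable;
SELECT 1 AS Done;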
EDIT:
If you need to pass parameters or variables into the Lookup, you should use string interpolation, like so:
INSERT INTO dbo.MyTable
SELECT '@{variables('YourVariable')}' AS Variable1,
       '@{pipeline().Pipeline}' AS PipelineName
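Applied to the question, the Lookup query could then look something like this (a sketch assuming the job name arrives as a pipeline parameter named JobName; again, the trailing SELECT gives the Lookup a row to return):
INSERT INTO [TestTable] (Job_name, status)
VALUES ('@{pipeline().parameters.JobName}', CURRENT_TIMESTAMP);
SELECT 1 AS Done;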

Related

ADF - passing parameters within string to SQL Lookup

I'm writing a pipeline where I fetch SQL queries from a metadata database in a Lookup, with the hope of executing them later in the pipeline. Imagine a string stored in a database:
"SELECT * FROM @{pipeline().parameters.SchemaName}.@{pipeline().parameters.TableName}"
My hope was that when this string is passed to another Lookup activity, it would pick up the necessary parameters. However, it's passed to the activity as-is, without parameter substitution, and I'm getting errors as a result. Is there any clean fix for this, or am I trying to implement something not natively supported by ADF?
I found that a workaround is wrapping the string in a series of replace() calls, but I'm hoping something simpler exists.
Can you try the below query in the Dynamic Content text box:
@concat('SELECT * FROM ', pipeline().parameters.SchemaName, '.', pipeline().parameters.TableName)
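The same query written with string interpolation, which ADF evaluates before sending the text to SQL, would be:
SELECT * FROM @{pipeline().parameters.SchemaName}.@{pipeline().parameters.TableName}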

Why aren't the values inserted into the mysql2 query?

Can you tell me, please, why mysql2's value substitution doesn't work inside the IN operator?
I do this, but nothing works. Only the first element of the array (the number 6) is substituted:
connection.query("SELECT * FROM products_categories WHERE category_id IN (?)", [6, 3]);
You can do it like this, of course:
IN (?,?,?,?,?,?,?,?,?,?) with [6,3,1,1,1,1,1,1,1,1,1]
But that doesn't feel right; I thought the IN list would be filled automatically from an array =(
I haven't used this, but my gut feeling tells me that array items map to question marks by index, so in your case 6 binds to the first ? and 3 looks for another one but doesn't find it.
If I were you, I'd make sure that my first array item is itself an array, so I'd rewrite it:
connection.query("SELECT * FROM products_categories WHERE category_id IN (?)", [[6, 3]]);
I suspect you are using this with .execute(), which is shorthand for prepared statements: "prepare first if never executed before" + execute. While the API is very similar to .query(), one big difference is that with a prepared statement only the parameters are sent at execution time, unlike .query(), where the whole query text is interpolated with all parameters on the client. As a result, you need to send exactly as many parameters as there are placeholders in the original query text (in your example, one ?). The whole [6,3,1,1,1,1,1,1,1,1,1] is treated as one parameter and sent to the server as the string "6,3,1,1,1,1,1,1,1,1,1" (because during the prepare step that parameter was likely reported by the server as VAR_CHAR).
The solution is to 1) use .query() and interpolate on the client, or 2) build enough ?s dynamically and prepare a different statement for each number of IN parameters.
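A minimal sketch of option 2, assuming a mysql2/promise connection named connection:
const ids = [6, 3];
// One placeholder per value, e.g. "?,?" for two ids.
const placeholders = ids.map(() => '?').join(',');
const [rows] = await connection.execute(
  `SELECT * FROM products_categories WHERE category_id IN (${placeholders})`,
  ids
);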

What is the best way to check the existence of filtered rows in Cassandra? By a user-defined aggregate?

I have a Cassandra table,
CREATE TABLE read_locks (
parent_path text,
filename text,
instance text,
PRIMARY KEY ((parent_path, filename), instance)
);
Logically I want to check the existence of any locks on a file with the following statement:
select count(*)>0 as result from read_locks where parent_path='...' and filename='...';
Of course, I can think of at least two implementations.
select count(*) as result from read_locks where parent_path='...' and filename='...';
and then use other code, e.g. C++, to check the value of result.
Or
select * from read_locks where parent_path='...' and filename='...';
and then use other code, e.g. C++, to check the bool value of the following call:
cass_iterator_next(rows)
I am not sure which is better.
And I guess a user-defined aggregate function could do this, but I couldn't figure it out.
Please share your comments.
Thank you in advance,
Ying
If you only care whether there are any locks, and not how many locks there are, then it's probably more efficient to add a LIMIT clause, like this:
SELECT * FROM read_locks WHERE parent_path='...' and filename='...' LIMIT 1;
If that returns a row, then you know there is at least one lock on the file; if it returns nothing, then there are no locks on the file.
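Since the question mentions the DataStax C/C++ driver, here is a minimal sketch of that check (assuming an already-connected CassSession* session and that parent_path and filename are NUL-terminated strings; error handling omitted):
CassStatement* statement = cass_statement_new(
    "SELECT instance FROM read_locks "
    "WHERE parent_path = ? AND filename = ? LIMIT 1", 2);
cass_statement_bind_string(statement, 0, parent_path);
cass_statement_bind_string(statement, 1, filename);

CassFuture* future = cass_session_execute(session, statement);
const CassResult* result = cass_future_get_result(future);

/* At least one lock exists iff the result contains a row. */
int locked = (result != NULL && cass_result_first_row(result) != NULL);

if (result != NULL) cass_result_free(result);
cass_future_free(future);
cass_statement_free(statement);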

Separating fields out of a string in Hive

I have the following problem...
I work with Hive and want to load a file with several (different) kinds of rows of strings. These contain fixed-width fields, like this:
A20130420bcd 34 fgh
where the fields have lengths 1, 8, 6, 4 and 3.
Separated it would look like this:
"A,20130420,bcd,fgh"
Is there any way to read the string and split it into fields other than taking a substring for every field, like
substring(col_value,1,1) Field1
etc.?
I would imagine that cutting off the already-read part of the string would improve performance, but I couldn't think of any way to do this with the given functions.
Secondly, as stated before, there are different types of strings, ordered and identified by the first character. Right now I just select those with the WHERE clause, but it's horrible, as it runs through the whole file just to find only the first kind of string. Is there any way to read specific lines by their number? If I know that the first string will be of a certain kind, can I read it directly?
Right now it looks like this:
insert overwrite table TEST
SELECT
substring(col_value,1,1) field1,
...
substring(col_value,10,3) field5
from temp_data WHERE substring(col_value,1,1) = 'A';
Any ideas on this? I would love to hear some =)
You need to write your own generic UDF parser that outputs a struct or map or whatever is appropriate. You can refer to existing UDFs that output multiple values.
Then you can write something like:
insert overwrite table output
select p.first, p.second
from (
  select parse(target) as p
  from input
) parsed
where p.first = 'X';
About the second question: you may want to check Hive's EXPLAIN command to see whether Hive does the filter push-down for you (just see how many MapReduce jobs it takes; theoretically it should be a single map, depending on 1. the Hive version and 2. the output table format).
In a general sense, this is why databases are popular -- they take optimization into consideration for you.
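If a custom UDF is more than you need, a possible built-in alternative (a sketch using Hive's regexp_extract and the field widths 1, 8, 6, 4, 3 from the question; one regular expression captures all five fields) would be:
insert overwrite table TEST
select
  regexp_extract(col_value, '^(.{1})(.{8})(.{6})(.{4})(.{3})', 1) field1,
  regexp_extract(col_value, '^(.{1})(.{8})(.{6})(.{4})(.{3})', 2) field2,
  regexp_extract(col_value, '^(.{1})(.{8})(.{6})(.{4})(.{3})', 3) field3,
  regexp_extract(col_value, '^(.{1})(.{8})(.{6})(.{4})(.{3})', 4) field4,
  regexp_extract(col_value, '^(.{1})(.{8})(.{6})(.{4})(.{3})', 5) field5
from temp_data where substring(col_value,1,1) = 'A';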

SSIS: Filtering Multiple GUIDs from String Variable as Parameter In Data Flow OLE Source

I have an SSIS package that obtains a list of new GUIDs from a SQL table. I then shred the GUIDs into a string variable so that they are separated by commas. An example of how they appear in the variable:
'5f661168-aed2-4659-86ba-fd864ca341bc','f5ba6d28-7283-4bed-9f11-e8f6bef225c5'
The problem is in the data flow task. I use the variable as a parameter in a SQL query to get my source data, and I cannot get any results. When the WHERE clause looks like:
WHERE [GUID] IN (?)
I get an invalid-character error, so I found out the implicit conversion doesn't work with GUIDs like I thought it would. I could resolve this by putting {} around the GUID if it were a single GUID, but there are potentially 4 or 5 different GUIDs this will need to retrieve at runtime.
I figured I could get around it with this:
WHERE CAST([GUID] AS VARCHAR(50)) IN (?)
But this simply produces no results, and there should be two in my current test.
I figure there must be a way to accomplish this... What am I missing?
You can't, at least not using the mechanics you have described.
You cannot concatenate values and make that work with a parameter: the whole comma-separated string is bound as a single value, which is why the CAST version matches nothing.
I'm open to being proven wrong on this point, but I'll be damned if I can make it work.
How can you make it work?
The trick is to just go old school and build your query via string concatenation.
In my package, I defined two variables, filter and query. filter is the concatenation you are already performing.
query is an expression (right-click, Properties: set EvaluateAsExpression to True; the Expression would be something like):
"SELECT * FROM dbo.RefData R WHERE R.refkey IN (" + @[User::filter] + ")"
In your data flow, then change your source to "SQL command from variable". No parameter mapping is required there.
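At runtime, the query variable would then evaluate to something like this (using the two GUIDs from the question):
SELECT * FROM dbo.RefData R WHERE R.refkey IN ('5f661168-aed2-4659-86ba-fd864ca341bc','f5ba6d28-7283-4bed-9f11-e8f6bef225c5')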
Basic look and feel:
[screenshot: OLE DB Source with the SQL command supplied from the variable]
