How to use backslash in Azure Data Factory? - azure

I have an 'Execute SSIS Package' activity in my ADF, that requires from me passing the parameter for Environment. In my case the paramater value is .\Dev. Unfortunately, the dot - backslash sentence is required by SSIS.
I am trying to provide the parameter dynamically, but every time my value .\Dev is being changed to .\\Dev (Each backslash is doubled inside Azure Data Factory). I have tried different solutions like looking up the parameter in the database or setting a variable and even replacing double backslash with single one, but it didn't work - every time I should get .\Dev i was getting .\\Dev. It also looks like this double backslash is treated as one character by ADF (in case of replace or substring functions).
Can anyone help with that?

Related

Insert values with single quotes in a Postgres table column [duplicate]

I have a table test(id,name).
I need to insert values like: user's log, 'my user', customer's.
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
I am getting an error if I run any of the above statements.
If there is any method to do this correctly please share. I don't want any prepared statements.
Is it possible using sql escaping mechanism?
String literals
Escaping single quotes ' by doubling them up → '' is the standard way and works of course:
'user's log' -- incorrect syntax (unbalanced quote)
'user''s log'
Plain single quotes (ASCII / UTF-8 code 39), mind you, not backticks `, which have no special purpose in Postgres (unlike certain other RDBMS) and not double-quotes ", used for identifiers.
In old versions or if you still run with standard_conforming_strings = off or, generally, if you prepend your string with E to declare Posix escape string syntax, you can also escape with the backslash \:
E'user\'s log'
Backslash itself is escaped with another backslash. But that's generally not preferable.
If you have to deal with many single quotes or multiple layers of escaping, you can avoid quoting hell in PostgreSQL with dollar-quoted strings:
'escape '' with '''''
$$escape ' with ''$$
To further avoid confusion among dollar-quotes, add a unique token to each pair:
$token$escape ' with ''$token$
Which can be nested any number of levels:
$token2$Inner string: $token1$escape ' with ''$token1$ is nested$token2$
Pay attention if the $ character should have special meaning in your client software. You may have to escape it in addition. This is not the case with standard PostgreSQL clients like psql or pgAdmin.
That is all very useful for writing PL/pgSQL functions or ad-hoc SQL commands. It cannot alleviate the need to use prepared statements or some other method to safeguard against SQL injection in your application when user input is possible, though. #Craig's answer has more on that. More details:
SQL injection in Postgres functions vs prepared queries
Values inside Postgres
When dealing with values inside the database, there are a couple of useful functions to quote strings properly:
quote_literal() or quote_nullable() - the latter outputs the unquoted string NULL for null input.
There is also quote_ident() to double-quote strings where needed to get valid SQL identifiers.
format() with the format specifier %L is equivalent to quote_nullable().
Like: format('%L', string_var)
concat() or concat_ws() are typically no good for this purpose as those do not escape nested single quotes and backslashes.
According to PostgreSQL documentation (4.1.2.1. String Constants):
To include a single-quote character within a string constant, write
two adjacent single quotes, e.g. 'Dianne''s horse'.
See also the standard_conforming_strings parameter, which controls whether escaping with backslashes works.
This is so many worlds of bad, because your question implies that you probably have gaping SQL injection holes in your application.
You should be using parameterized statements. For Java, use PreparedStatement with placeholders. You say you don't want to use parameterised statements, but you don't explain why, and frankly it has to be a very good reason not to use them because they're the simplest, safest way to fix the problem you are trying to solve.
See Preventing SQL Injection in Java. Don't be Bobby's next victim.
There is no public function in PgJDBC for string quoting and escaping. That's partly because it might make it seem like a good idea.
There are built-in quoting functions quote_literal and quote_ident in PostgreSQL, but they are for PL/PgSQL functions that use EXECUTE. These days quote_literal is mostly obsoleted by EXECUTE ... USING, which is the parameterised version, because it's safer and easier. You cannot use them for the purpose you explain here, because they're server-side functions.
Imagine what happens if you get the value ');DROP SCHEMA public;-- from a malicious user. You'd produce:
insert into test values (1,'');DROP SCHEMA public;--');
which breaks down to two statements and a comment that gets ignored:
insert into test values (1,'');
DROP SCHEMA public;
--');
Whoops, there goes your database.
In postgresql if you want to insert values with ' in it then for this you have to give extra '
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');
you can use the postrgesql chr(int) function:
insert into test values (2,'|| chr(39)||'my users'||chr(39)||');
When I used Python to insert values into PostgreSQL, I also met the question: column "xxx" does not exist.
The I find the reason in wiki.postgresql:
PostgreSQL uses only single quotes for this (i.e. WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith').
MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard.
It means PostgreSQL can use only single quote for field names, table names, etc. So you can not use single quote in value.
My situation is: I want to insert values "the difference of it’s adj for sb and it's adj of sb" into PostgreSQL.
How I figure out this problem:
I replace ' with ’, and I replace " with '. Because PostgreSQL value does not support double quote.
So I think you can use following codes to insert values:
insert into test values (1,'user’s log');
insert into test values (2,'my users');
insert into test values (3,'customer’s');
If you need to get the work done inside Pg:
to_json(value)
https://www.postgresql.org/docs/9.3/static/functions-json.html#FUNCTIONS-JSON-TABLE
You must have to add an extra single quotes -> ' and make doubling quote them up like below examples -> ' ' is the standard way and works of course:
Wrong way: 'user's log'
Right way: 'user''s log'
problem:
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
Solutions:
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');

How to remove escape character from the output of Lookup activity in Azure Data Factory?

I am reading JSON data from SQL Database in Azure Data Factory.
I have Azure Data Factory (ADF) pipeline, contains "Lookup" activity, which reads the JSON Data from SQL DB and bring into ADF Pipeline. Somehow the escape character (" \ ") get inserted in JSON data when I see at the output of Lookup activity of ADF.
For Example, the output of Lookup activity become like this:
{\ "resourceType\ ":\ "Sales","id" :\ "9i5W6tp-JTd-24252\ "
Any idea how to remove the escape character from JSON in pipeline?
Update:
Thanks for the update Joseph. When I try your steps, It doesn't work for me.
In lookup am reading data from SQL DB.
This is my Append variable:
After running it, I still see escape character
{
"firstRow": {
"JSONData": "{\"resourceType\":\"counter\",\"id\":\"9i5W6tp-JTd- and more
As we know, '\' is an escape character. In your case, this symbol appears because it is used to escape one double quote inside a pair of double quotes.
For example, "\"" => """.
But it doesn't matter, we only need to convert it from string type to json type, it will automatically remove escape characters.
I've created a test to verify it.
First, I defined an Array type variable.
My Lookup activity's output is as follows:
Then I used an AppendVariable activity and used an expression #json(activity('Lookup1').output.firstRow.value) to convert it from string type to json type.
After I run debug, we can see the result as follows, there is no '\'.

How do I extract a string between two characters using ADFs expression builder?

I'm trying to extract part of a file name using expressions in ADF expression builder. The part I'm trying to extract is dynamic in size but always appears between "_" and "-".
How can I go about doing this extraction?
Thanks!
Suppose there's a pipeline parameter named filename, you could use the below expression to extract value between '_' and '-', e.g. input 'ab_cd-', you would get 'cd' as output:
#{substring(pipeline().parameters.fileName, add(indexOf(pipeline().parameters.fileName, '_'),1),sub(indexOf(pipeline().parameters.fileName, '-'),3))}
You may want to check the documentation of Expressions and functions in Azure Data Factory for more details: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions#string-functions

U-SQL Error - Change the identifier to use at least one lower case letter

I am fairly new to U-SQL and trying to run a U-SQL script in Azure Data Lake Analytics to process a parquet file using the Parquet extractor functionality. I am getting the below error and I don't find a way to get around it.
Error - Change the identifier to use at least one lower case letter. If that is not possible, then escape that identifier (for example: '[ACTIVITY]'), or embed it in a CSHARP() block (e.g CSHARP(ACTIVITY)).
Unfortunately all the different fields generated in the Parquet file are capitalized and I don't want to to escape these identifiers. I have tried if I could wrap the identifier with CSHARP block and it fails as well (E_CSC_USER_RESERVEDKEYWORDASIDENTIFIER: Reserved keyword CSHARP is used as an identifier.) Is there anyway I could extract the parquet file? Thanks for your help!
Code Snippet:
SET ##FeaturePreviews = "EnableParquetUdos:on";
#var1 =
EXTRACT ACTIVITY string,
AUTHOR_NAME string,
AFFLIATION string
FROM "adl://xxx.azuredatalakestore.net/Abstracts/FY2018_028"
USING Extractors.Parquet();
#var2 =
SELECT *
FROM #var1
ORDER BY ACTIVITY ASC
FETCH 5 ROWS;
OUTPUT #var2
TO "adl://xxx.azuredatalakestore.net/Results/AbstractsResults.csv"
USING Outputters.Csv();
Based on your description you try to say
EXTRACT ALLCAPSNAME int FROM "/data.parquet" USING Extractors.Parquet();
In U-SQL, we reserve all caps identifiers so we can add new keywords in the future without invalidating old scripts.
To work around, you just have to quote the name (escape it) like in any other SQL dialect:
EXTRACT [ALLCAPSNAME] int FROM "/data.parquet" USING Extractors.Parquet();
Note that this is not changing the name of the field. It is just the syntactic way to address the field.
Also note, that in most SQL communities, it is considered a best practice to always quote identifiers to avoid reserved keyword clashes.
If all fields in the Parquet file are all caps, you will have to quote them all... In a future update you will be able to say EXTRACT * FROM … for Parquet (and Orc) files, but you still will need to quote the columns when you refer to them explicitly.

SSIS: Filtering Multiple GUIDs from String Variable as Parameter In Data Flow OLE Source

I have an SSIS package that obtains a list of new GUIDs from a SQL table. I then shred the GUIDs into a string variable so that I have them separated out by comma. An example of how they appear in the variable is:
'5f661168-aed2-4659-86ba-fd864ca341bc','f5ba6d28-7283-4bed-9f11-e8f6bef225c5'
The problem is in the data flow task. I use the variable as a parameter in a SQL query to get my source data and I cannot get my results. When the WHERE clause looks like:
WHERE [GUID] IN (?)
I get an invalid character error so I found out the implicit conversion doesn't work with the GUIDs like I thought they would. I could resolve this by putting {} around the GUID if this were a single GUID but there are a potential 4 or 5 different GUIDs this will need to retrieve at runtime.
Figuring I could get around it with this:
WHERE CAST([GUID] AS VARCHAR(50)) IN (?)
But this simply produces no results and there should be two in my current test.
I figure there must be a way to accomplish this... What am I missing?
You can't, at least not using the mechanics you have provided.
You cannot concatenate values and make that work with a parameter.
I'm open to being proven wrong on this point but I'll be damned if I can make it work.
How can I make it work?
The trick is to just go old school and make your query via string building/concatenation.
In my package, I defined two variables, filter and query. filter will be the concatenation you are already performing.
query will be an expression (right click, properties: set EvaluateAsExpression to True, Expression would be something like "SELECT * FROM dbo.RefData R WHERE R.refkey IN (" + #[User::filter] + ")"
In your data flow, then change your source to SQL Command from variable. No mapping required there.
Basic look and feel would be like
OLE Source query

Resources