Groovy Prepared Statement with Named Parameters

I used the below way to do named parameters with a JDBC PreparedStatement. Any suggestions to improve this?
import java.sql.*;

def sqlQuery = "select * from table where col1=:col1 and col2=:col2"

def namedParameters = [
    ['ColumnName':'col1', 'Value':'test', 'DataType':'int'],
    ['ColumnName':'col2', 'Value':'testsdfdf', 'DataType':'string'],
]

PreparedStatement stmt

namedParameters.eachWithIndex { k, v ->
    println "Index: " + v
    println "Name: " + k.ColumnName
    println "Value: " + k.Value

    // To replace named parameters with ?
    sqlQuery = sqlQuery.replace(":" + k.ColumnName, "?")
    println sqlQuery

    println "DataType: " + k.DataType
    switch (k.DataType.toLowerCase()) {
        case 'int':
            stmt.setInt(v + 1, k.Value)
            break
        case 'string':
            stmt.setString(v + 1, k.Value)
            break
        default:
            stmt.setObject(v + 1, k.Value)
    }
}
println "End"
I am doing a string replace to turn each named parameter into ?, and based on the provided map I identify the data type and set the value on the PreparedStatement accordingly.

You can use the groovy.sql.Sql class and its execute(Map params, String query, Closure processResults) method. Consider the following example script:
sql.groovy:
@Grab(group='com.h2database', module='h2', version='1.4.197')
import groovy.sql.Sql
// (1) Configure JDBC connection (use H2 in-memory DB in this example)
def config = [
    url: 'jdbc:h2:mem:test',
    user: 'sa',
    password: '',
    driver: 'org.h2.Driver'
]

// (2) Connect to the database
def sql = Sql.newInstance(config.url, config.user, config.password, config.driver)

// (3) Create table for testing
sql.execute '''
    create table test (
        id integer not null,
        name varchar(50)
    )
'''

// (4) Insert some test data
def query = 'insert into test (id, name) values (?,?)'
sql.withBatch(3, query) { stmt ->
    stmt.addBatch(1, 'test 1')
    stmt.addBatch(2, 'test 2')
    stmt.addBatch(3, 'test 3')
}

// (5) Execute SELECT query
sql.execute([id: 1, name: 'test 1'], 'select * from test where id >= :id and name != :name', { _, result ->
    result.each { row ->
        println "id: ${row.ID}, name: ${row.NAME}"
    }
})
The last part shows how you can use a prepared statement with named parameters. In this example we want to list all rows where id is greater than or equal to 1 and where name does not equal 'test 1'.
The first parameter of the sql.execute() method is a map that holds your named parameters (each key maps to :key in the SQL query). The second parameter is your SQL query, where the :key format marks named parameters. The third parameter is a closure that defines the result-processing logic: result holds a list of maps (e.g. [[ID:2, NAME:'test 2'], [ID:3, NAME:'test 3']] in this case) and you define how to process it.
Console output
id: 2, name: test 2
id: 3, name: test 3
sql.eachRow() alternative
Alternatively you can use sql.eachRow(String sql, Map params, Closure closure) instead:
sql.eachRow('select * from test where id > :id', [id: 1], { row ->
    println "id: ${row.ID}, name: ${row.NAME}"
})
It will produce the same output.

With Groovy SQL you can even use a GString as the SQL query.
Example
// Define bind variables
def keyX = 1
def keyY = 'row1'
Query
groovyCon.eachRow("select x,y from mytab where x = ${keyX} and y = ${keyY}") {println it}
This sends the following query to the DB:
select x,y from mytab where x = :1 and y = :2
Groovy SQL is very handy and useful, except for some special cases (e.g. when you need to reuse a PreparedStatement in a loop) where you must fall back to plain JDBC.

Related

how can I implement UNNSET(SELECT NAME FROM NAMES); in spanner

Select * Id IN UNNSET(@IDS)
And
UNNSET(SELECT NAME FROM NAMES);
In this query UNNEST(@IDS) is working, as I'm passing IDS as a List<String>. But UNNSET(SELECT NAME FROM NAMES) is not working in Spanner. How can I implement this in Spanner?
A small comment on your question is that you have misspelled UNNEST (as UNNSET). I will assume that this was a mistake when asking the question, so I will disregard it.
Given the following schema:
CREATE TABLE Names (
Id INT64 NOT NULL,
Names ARRAY<STRING(MAX)> NOT NULL,
) PRIMARY KEY(Id);
CREATE TABLE SingleNames (
Id INT64 NOT NULL,
Name STRING(MAX),
) PRIMARY KEY(Id)
We can perform an IN query like so:
SELECT *
FROM SingleNames
WHERE Name IN UNNEST((SELECT n.Names FROM Names n WHERE n.Id = 1))
Note the double parentheses within the UNNEST call; they are required so that the subquery is interpreted as an expression (which is what UNNEST expects as its argument).
We can query that using the Java client like so:
try (ResultSet rs = databaseClient
    .singleUse()
    .executeQuery(Statement
        .newBuilder("SELECT * FROM SingleNames WHERE Name IN UNNEST((SELECT n.Names FROM Names n WHERE n.Id = @id))")
        .bind("id")
        .to(1L)
        .build())
) {
    while (rs.next()) {
        System.out.println(rs.getLong("Id") + ", " + rs.getString("Name"));
    }
}

Psycopg2 Throwing Syntax error over dynamically generated SQL

So I have the following snippet where I am trying to generate a dynamic SQL INSERT statement. The following is the payload that is passed in:
data = {"id": "123", "name": "dev", "description": "This is the dev Env","created_by":"me","updated_by": "me","table_name": table_name}
I am getting the following error for the above payload:
LINE 1: ...updated_by, table_name) VALUES (123, dev, This is the dev En...
My Implementation:
class DMLRelationalDB:
    def __init__(self):
        pass

    def insert_sql(self, params):
        """
        :param params:
        :return:
        """
        converted_dict = self.__convert_params_to_columns_and_placeholders(params)
        print(converted_dict)
        column_names = ", ".join(converted_dict['columns'])
        placeholders = ", ".join(converted_dict['values'])
        table_name = params["table_name"]
        statement = f"""INSERT INTO {table_name} ({column_names}) VALUES ({placeholders})"""
        print(statement)
        return statement

    def __convert_params_to_columns_and_placeholders(self, items_dict):
        """
        :param items_dict:
        :return:
        """
        columns = []
        values = []
        for key, value in items_dict.items():
            columns.append(key)
            values.append(value)
        return {"columns": columns, "values": values}
The problem is that you are trying to pass string values to your postgres DB without quoting them first. Personally I would quote all the data that will enter the database, just to make sure that it is handled correctly.
What you can do is the following:
placeholders = ", ".join([f"'{val}'" for val in converted_dict['values']])
If you have other types of data, datetimes for example, their string representation will be put inside the f-string, so they would be handled as well.
If you have strings that contain single quotation marks, you could use "dollar quoting" to be on the safe side:
placeholders = ", ".join([f'$${val}$$' for val in converted_dict['values']])
If you think there is a possibility that some string of yours contains two dollar signs in a row, then put some string between the two dollar signs to make it absolutely safe:
placeholders = ", ".join([f'$str${val}$str$' for val in converted_dict['values']])
The downside is that you increase the amount of data that is transferred, and if you have a lot of information it will decrease performance.
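For illustration, here is a minimal, self-contained sketch of what the dollar-quoted variant would produce for the payload from the question (the table name envs is hypothetical, not from the original post):
# Hedged sketch: build the INSERT text with $str$ dollar quoting as suggested above.
# "envs" is a made-up table name for this illustration only.
data = {"id": "123", "name": "dev", "description": "This is the dev Env",
        "created_by": "me", "updated_by": "me", "table_name": "envs"}
columns = ", ".join(data.keys())
placeholders = ", ".join([f'$str${val}$str$' for val in data.values()])
statement = f"INSERT INTO envs ({columns}) VALUES ({placeholders})"
print(statement)
# Prints roughly: INSERT INTO envs (id, name, description, created_by, updated_by, table_name)
#                 VALUES ($str$123$str$, $str$dev$str$, $str$This is the dev Env$str$, ...)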

Postgres query with '%%' parameter not returning results via psycopg2

When I execute the query below in a query editor like DBeaver, it returns a result, but if I execute the same query via Python and psycopg2 it does not return a result. '%%' should match any title/location, so it should always return something. I'm just testing this for a category without keywords, but it will also take an array of keywords if they exist for the category. So the array could be ['%%'] or ['%boston%', '%cambridge%'] and both should work.
select title, link
from internal.jobs
where (title ilike any(array['%%'])
or location ilike any(array['%%']))
order by "publishDate" desc
limit 1;
I've tried adding the E flag at the beginning of the string. E.g. E'%%'
Python:
import psycopg2
FILTERS = {
    'AllJobs': [],
    'BostonJobs': ['boston', 'cambridge'],
    'MachineLearningJobs': ['ml', 'machine learning']
}

conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()

sql = """
select title, link
from internal.jobs
where (title ilike any(array[%s])
or location ilike any(array[%s]))
order by "publishDate" desc
limit 1;
"""

for title, tags in FILTERS.items():
    if not tags:
        formatted_filters = "'%%'"  # Will match any record
    else:
        formatted_filters = ','.join([f"'%{keyword}%'" for keyword in tags])
    cur.execute(sql, (formatted_filters))
    results = cur.fetchone()
    print(results)
You can use cur.mogrify() to look at the SQL that is finally generated, check in psql whether it works, and see how you need to tweak it.
Most likely you have to double every %.
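As a side note, here is a minimal sketch of that doubling rule (reusing cur from above; the table t and the values are placeholders, not from the original query): whenever a parameter sequence is passed to execute(), psycopg2 treats % in the query text as the start of a placeholder, so literal percent signs must be written as %%.
# '%' inside a bound Python value needs no escaping:
cur.execute("select title from t where title ilike %s", ('%boston%',))
# a literal '%' written directly in the query text must be doubled when parameters are passed:
cur.execute("select title from t where title ilike '%%boston%%' and id = %s", (1,))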
Thanks to Piro for the very useful cur.mogrify() clue. That helped me further debug the query to figure out what was going wrong.
I ended up removing the extra set of quotes and used a named parameter, and now it works as expected.
Updated code:
import psycopg2
FILTERS = {
    'AllJobs': [],
    'BostonJobs': ['boston', 'cambridge'],
    'MachineLearningJobs': ['ml', 'machine learning']
}

conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()

sql = """
select title, link
from internal.jobs
where (title ilike any(array[%(filter)s])
or location ilike any(array[%(filter)s]))
order by "publishDate" desc
limit 1;
"""

for title, tags in FILTERS.items():
    if not tags:
        formatted_filters = '%%'  # Will match any record
    else:
        formatted_filters = [f'%{keyword}%' for keyword in tags]
    print(cur.mogrify(sql, {'filter': formatted_filters}))
    cur.execute(sql, {'filter': formatted_filters})
    results = cur.fetchone()
    print(results)
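A hedged note on why passing the raw Python list works: psycopg2 adapts a Python list to a PostgreSQL ARRAY, so the whole list can be bound as a single parameter instead of being spliced into the SQL as quoted text. A minimal sketch (reusing cur from above; the jobs table and title column are placeholders):
# the list is bound as one parameter and adapted to an ARRAY by psycopg2,
# so no manual quoting of the '%...%' patterns is needed
patterns = ['%boston%', '%cambridge%']
cur.execute("select title from jobs where title ilike any(%(filter)s)", {'filter': patterns})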

update statement using loop over tuple of query and data fails in psycopg2

I have created a mini functional pipeline which creates an update statement with a regex and then passes the statement and the data to psycopg2 to execute.
If I copy paste the statement outside of the loop it works, if I try to loop over all statements I get an error.
from functools import partial

# Function to create statement
def psycopg2_regex_replace_chars(table, col, regex_chars_old, char_new):
    query = "UPDATE {} SET {} = regexp_replace({}, %s , %s, 'g');".format(table, col, col)
    data = (regex_chars_old, char_new)
    return (query, data)

# Create functions with intelligible names
replace_separators_with_space = partial(psycopg2_regex_replace_chars, regex_chars_old='[.,/[-]]', char_new=' ')
replace_amper_with_and = partial(psycopg2_regex_replace_chars, regex_chars_old='&', char_new='and')

# create funcs_list
funcs_edit = [replace_separators_with_space,
              replace_amper_with_and]
So far, so good.
This works
stmt = "UPDATE persons SET name = regexp_replace(name, %s , %s, 'g');"
data = ('[^a-zA-z0-9]', ' ')
cur.execute(stmt, data)
conn.commit()
This fails
tables = ["persons"]
cols = ["name", "dob"]

for table in tables:
    for col in cols:
        for func in funcs_edit:
            query, data = func(table=table, col=col)
            cur.execute(query, data)
            conn.commit()
error
<ipython-input-92-c8ba5d469f88> in <module>
6 for func in funcs_edit:
7 query, data = func(table=table, col=col)
----> 8 cur.execute(query, data)
9 conn.commit()
ProgrammingError: function regexp_replace(date, unknown, unknown, unknown) does not exist
LINE 1: UPDATE persons SET dob = regexp_replace(dob, '[.,/[-]]' , ' ...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
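The error message explains what is happening: the statement works for name (a text column) but fails for dob because it is a date column, and regexp_replace only has text signatures. A minimal, hedged sketch of one way around it, reusing funcs_edit, cur and conn from above (not the original poster's final solution): run the cleanup loop over text columns only.
# hedged sketch: keep non-text columns such as "dob" out of the regex cleanup loop
tables = ["persons"]
text_cols = ["name"]  # "dob" is a date column, so it is excluded here

for table in tables:
    for col in text_cols:
        for func in funcs_edit:
            query, data = func(table=table, col=col)
            cur.execute(query, data)
            conn.commit()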

opposite of spark dataframe `withColumn` method?

I'd like to be able to chain a transformation on my DataFrame that drops a column, rather than assigning the DataFrame to a variable (i.e. df.drop()). If I wanted to add a column, I could simply call df.withColumn(). What is the way to drop a column in an in-line chain of transformations?
For the entire example use this as baseline:
val testVariable = 10
var finalDF = spark.sql("select 'test' as test_column")
val iDF = spark.sql("select 'John Smith' as Name, cast('10' as integer) as Age, 'Illinois' as State")
val iDF2 = spark.sql("select 'Jane Doe' as Name, cast('40' as integer) as Age, 'Iowa' as State")
val iDF3 = spark.sql("select 'Blobby' as Name, cast('150' as integer) as Age, 'Non-US' as State")
val nameDF = iDF.unionAll(iDF2).unionAll(iDF3)
1 Conditional Drop
If you want to only drop on certain outputs and these are known outputs, you can build out conditional loops to check if the iterator needs to be dropped or not. In this case if the test variable exceeds 4 it will drop the name column, else it adds a new column.
import org.apache.spark.sql.functions.lit

finalDF = if (testVariable >= 5) {
  nameDF.drop("Name")
} else {
  nameDF.withColumn("Cooler_Name", lit("Cool_Name"))
}
finalDF.printSchema
2 Programmatically build the select statement
The select expression takes independent strings and builds them into commands that Spark can read. In the case below we know we have a test for dropping, but we do not know in advance which columns might be dropped. If a column gets a test value that does not equal 1, we do not include it in our command array. When we run the command array against the select expression on the table, those columns are dropped.
val columnNames = nameDF.columns
val arrayTestOutput = Array(1, 0, 1)
var iteratorArray = 1
var commandArray = Array.empty[String]

while (iteratorArray <= columnNames.length) {
  if (arrayTestOutput(iteratorArray - 1) == 1) {
    // keep this column by appending its name to the select command array
    commandArray = commandArray :+ columnNames(iteratorArray - 1)
  }
  iteratorArray = iteratorArray + 1
}

finalDF = nameDF.selectExpr(commandArray: _*)
finalDF.printSchema
