Cannot execute a Spanner DDL script using 'gcloud spanner databases ddl update' command - google-cloud-spanner

A Google Spanner DDL script runs successfully when submitted in the Spanner Console, but when executed via the "gcloud spanner databases ddl update" command with the "--ddl-file" argument it consistently fails with the error:
(gcloud.spanner.databases.ddl.update) INVALID_ARGUMENT: Error parsing Spanner DDL statement: \n : Syntax error on line 1, column 1: Encountered 'EOF' while parsing: ddl_statement
'@type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
  Error parsing Spanner DDL statement:
  : Syntax error on line 1, column 1: Encountered 'EOF' while parsing: ddl_statement
Example of the command:
gcloud spanner databases ddl update test-db \
  --instance=test-instance \
  --ddl-file=table.ddl
cat table.ddl
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id);
There is only one other reference to this exact situation on the internet. Has anyone gotten the "--ddl-file" argument to work successfully?

The problem is (most probably) caused by the last semicolon in your DDL script. The --ddl-file option accepts scripts containing multiple DDL statements separated by semicolons (;), but the last statement must not be terminated by a semicolon. If it is, gcloud tries to parse another DDL statement after the last one, finds none, and throws the Encountered 'EOF' error.
So TL;DR: remove the last semicolon from your script and it should work.
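For example, the table.ddl file from the question would become:
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id)
A multi-statement script keeps the semicolons between statements and drops only the final one; a minimal sketch, where the second table (countries) is hypothetical and exists only to show the separator rule:
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id);
CREATE TABLE countries
(
country_id STRING(2) NOT NULL,
region_id STRING(2) NOT NULL,
) PRIMARY KEY (country_id)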

Related

Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead

I am trying to merge into a SQL database using the following code in Databricks with PySpark:
query = """
MERGE INTO deltadf t
USING df s
ON s.SLAId_Id = t.SLAId_Id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
con = driver_manager.getConnection(url)
stmt = con.createStatement()
stmt.executeUpdate(query)
stmt.close()
But I'm getting the following error:
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
Any thoughts on where I might be going wrong?
I don't know why you're getting this exact error. However, I believe there are a number of issues with what you are trying to do.
Running the query via JDBC makes it run in SQL Server only. Constructs like WHEN MATCHED THEN UPDATE SET * / WHEN NOT MATCHED THEN INSERT * will not work there. Databricks accepts them, but for SQL Server you need to explicitly list the columns to update and the values to insert (reference).
Also, do you actually have tables named deltadf and df in SQL Server? I suppose you have a DataFrame or temporary view named df... that will not work. As said, this query executes in SQL Server only. If you want to upload data from a DataFrame, use df.write.format("jdbc").save (reference).
See this Fiddle - if deltadf and df are tables, running this query in SQL Server (any version) only complains about Incorrect syntax near '*'.
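For illustration, a minimal sketch of the explicit T-SQL form that SQL Server would accept. Only SLAId_Id comes from the question; stage_df (a real staging table in SQL Server) and SomeValue are hypothetical names used to fill out the example:
MERGE INTO deltadf AS t
USING stage_df AS s
    ON s.SLAId_Id = t.SLAId_Id
WHEN MATCHED THEN
    UPDATE SET t.SomeValue = s.SomeValue   -- every updated column listed explicitly
WHEN NOT MATCHED THEN
    INSERT (SLAId_Id, SomeValue)           -- columns and values listed explicitly
    VALUES (s.SLAId_Id, s.SomeValue);      -- SQL Server requires the terminating semicolon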
SQLException: Malformed SQL Statement: Expected token 'USING' but found Identifier with value 't' instead at position 25.
You will get this error if you missed updating a specific field or made a mistake in the syntax.
I performed the merge operation and it worked fine for me without errors. Please follow the references below.
References:
https://www.youtube.com/watch?v=i5oM2bUyH0o
https://docs.databricks.com/delta/delta-update.html#upsert-into-a-table-using-merge
https://www.sqlshack.com/sql-server-merge-statement-overview-and-examples/

Incomplete statement at end of file

After running the below command:
sh cqlsh --request-timeout=3600 -f test.cql
I am getting the below error:
Incomplete statement at end of file
This happens even though my first line is use sample; followed by 50 insert queries.
What could be the reasons for this error?
That error is returned if the statement at the end of the file either (a) has invalid syntax, or (b) is not terminated correctly.
Sometimes the issue can occur several lines up from the last statement in the input file.
Check that the CQL statements have valid syntax. It might be necessary to do a process of elimination and split the file into chunks of only 10 statements each so you can identify the offending statement. Cheers!
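For instance, a well-formed input file terminates every statement, including the last one, with a semicolon. A sketch, where the users table and its columns are hypothetical stand-ins for the 50 inserts described in the question:
USE sample;
-- each statement ends with a semicolon,
-- including the very last statement in the file
INSERT INTO users (id, name) VALUES (1, 'alice');
INSERT INTO users (id, name) VALUES (2, 'bob');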

Table name with spaces in JDBC connection gives error

I'm trying to establish a connection in AWS Glue, using a pyspark script.
The JDBC connection is pointing to a Microsoft SQL Server in Azure Cloud.
When I enter the connection string, everything works until it gets to the table that should be read. That fails mainly because of the whitespace inside the table name. Do you have any hint on how to write the syntax here?
source_df = (
    sparksession.read.format("jdbc")
    .option("url", "jdbc:sqlserver://00.000.00.00:1433;databaseName=Sample")
    .option("dbtable", "dbo.122 SampleCompany DE$Contract Header")
    .option("user", "sampleuser")
    .option("password", "sampL3p4ssw0rd")
    .load()
)
When you execute this, it always throws the error:
py4j.protocol.Py4JJavaError: An error occurred while calling o69.load. : com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '.122'
Do you have any idea how to solve this?
Given the presence of spaces (and probably the dollar sign, and the fact that the identifier starts with digits), you need to quote the object name. Quoting object names in SQL Server is done by enclosing them in brackets (or, though this may depend on the session configuration, double quotes).
Keep in mind that dbo is the schema, while 122 SampleCompany DE$Contract Header is the table name. Schema and table name need to be quoted separately, not as a unit.
So, try to pass "dbo.[122 SampleCompany DE$Contract Header]"
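To see the same quoting rule outside of Spark, this is roughly how the table would be addressed directly in T-SQL (a sketch; the brackets around dbo are optional but harmless):
-- schema and table name quoted separately, never as one unit
SELECT TOP (10) *
FROM [dbo].[122 SampleCompany DE$Contract Header];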

How to execute multiple statements in Presto?

I need to run multiple statements in Presto together. Here is an example:
drop table if exists table_a
drop table if exists table_b
The above gives me this error:
SQL Error [1]: Query failed (#20190820_190638_03672_kzuv6): line 2:1: mismatched input 'drop'. Expecting: '.', <EOF>
I already tried adding ";", but no luck.
Is it possible to stack multiple statements, or do I need to execute them one at a time? My actual example involves many other commands, such as CREATE TABLE, etc.
You can use the Presto command-line client's --file option to submit a SQL file that may contain many SQL commands.
/presto/executable/path/presto --file $filename
Example:
/usr/lib/presto/bin/presto --file /my/presto/sql/file.sql
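For example, the two statements from the question can be placed in one file, each terminated with a semicolon (the path is the illustrative one from above):
-- /my/presto/sql/file.sql
DROP TABLE IF EXISTS table_a;
DROP TABLE IF EXISTS table_b;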

BigQuery Command Line - How to use parameters in the query string?

I am writing a shell script which involves BigQuery commands to query an existing table and save the results to a destination table.
However, since my script will be run periodically, I have a parameter for the date for which the query should run.
For example, my script looks like this:
DATE_FORMATTED=$(date +%Y%m%d)
bq query --destination_table=Desttables.abc_$DATE_FORMATTED "select hits_eventInfo_eventLabel from TABLE_DATE_RANGE([mydata.table_],TIMESTAMP($DATE_FORMATTED),TIMESTAMP($DATE_FORMATTED)) where customDimensions_index = 4"
I get the following error:
Error in query string: Error processing job 'pro-cn:bqjob_r5437894379_1': FROM clause with table wildcards matches no table
How else can I pass the variable $DATE_FORMATTED to the TABLE_DATE_RANGE function in BigQuery so that my query executes?
Use double quotes around the query string plus single quotes around the variable. For example, in your case:
TIMESTAMP("'$DATE_FORMATTED'")
OR
select "'$variable'" as dummy from your_table
You are probably missing the single quotes around the $DATE_FORMATTED value inside the TIMESTAMP functions. Without the quotes it is going to default to the epoch time.
Try with:
TIMESTAMP('$DATE_FORMATTED'),TIMESTAMP('$DATE_FORMATTED')
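To make the fix concrete, after shell expansion the (legacy SQL) query BigQuery receives should read roughly as follows, using 20160101 as an illustrative value of $DATE_FORMATTED:
select hits_eventInfo_eventLabel
from TABLE_DATE_RANGE([mydata.table_],
                      TIMESTAMP('20160101'),  -- quoted: parsed as a date string
                      TIMESTAMP('20160101'))  -- unquoted, the number is interpreted relative to the epoch
where customDimensions_index = 4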
