How to execute multiple statements in Presto?

I need to run multiple lines of code in Presto together. Here is an example:
drop table if exists table_a
drop table if exists table_b
The above gives me this error:
SQL Error [1]: Query failed (#20190820_190638_03672_kzuv6): line 2:1: mismatched input 'drop'. Expecting: '.', <EOF>
I already tried adding ";", but no luck.
Is it possible to stack multiple statements, or do I need to execute them line by line? My actual example involves many other commands, such as CREATE TABLE.

You can use the Presto command line to submit a SQL file, which may contain many SQL statements:
/presto/executable/path/presto client --file $filename
Example:
/usr/lib/presto/bin/presto client --file /my/presto/sql/file.sql
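For illustration, a minimal sketch of this approach (the server, catalog, and schema values below are placeholders, not taken from the question): put the statements into one file, each terminated by a semicolon, and submit the whole file in one go.
-- statements.sql: each statement ends with a semicolon
drop table if exists table_a;
drop table if exists table_b;
# submit the file with the Presto CLI (flag values shown are placeholders)
/usr/lib/presto/bin/presto --server localhost:8080 --catalog hive --schema default --file statements.sql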

Related

Cannot execute a Spanner DDL script using 'gcloud spanner databases ddl update' command

A Google Spanner DDL script runs successfully when submitted in the Spanner Console, but when executed via the gcloud spanner databases ddl update command with the --ddl-file argument it consistently fails with the error:
(gcloud.spanner.databases.ddl.update) INVALID_ARGUMENT: Error parsing Spanner DDL
statement: \n : Syntax error on line 1, column 1: Encountered 'EOF' while parsing:
ddl_statement
'#type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
Error parsing Spanner DDL statement:
: Syntax error on line 1, column 1: Encountered 'EOF' while parsing: ddl_statement
Example of the command:
gcloud spanner databases ddl update test-db \
  --instance=test-instance \
  --ddl-file=table.ddl
cat table.ddl
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id);
There is only one other reference to this identical situation on the internet. Has anyone got the "ddl-file" argument to successfully work?
The problem is (most probably) caused by the last semicolon in your DDL script. It seems that the --ddl-file option accepts scripts with multiple DDL statements separated by semicolons (;), but the last statement must not be terminated by a semicolon. If it is, gcloud tries to parse another DDL statement after the last one, finds none, and throws an unexpected end-of-file error.
So TL;DR: remove the last semicolon in your script and it should work.
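For illustration, a sketch of a DDL file that the --ddl-file option should accept: the statements are separated by a semicolon, but the final statement carries no trailing semicolon (the second table is hypothetical, added only to show the separator).
CREATE TABLE regions
(
region_id STRING(2) NOT NULL,
name STRING(13) NOT NULL,
) PRIMARY KEY (region_id);
CREATE TABLE countries
(
country_id STRING(2) NOT NULL,
region_id STRING(2) NOT NULL,
) PRIMARY KEY (country_id)
gcloud spanner databases ddl update test-db \
  --instance=test-instance \
  --ddl-file=table.ddl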

Can we run complex multi-line SQL queries using Blue Prism?

I am new to the SQL stuff in Blue Prism. I am able to configure the SQL object and execute simple queries, but I am having trouble running complex multi-line SQL queries.
When I try to execute the query below in Blue Prism, I get an error message saying "Incorrect syntax near Database2".
"select top 10 * from [Database1].[dbo].[Table1]
join [Database1].[dbo].[Table2] on [Database1].[dbo].[Table2].Fieldname1=[Database1].[dbo].[Table1].Fieldname2
join [Database2].[dbo].[Table1] on [Database2].[dbo].[Table1].Fieldname1=[Database1].[dbo].[Table2].Fieldname2"
Can somebody please help me understand what is wrong with the above query?
I found the answer myself: there should not be any additional whitespace characters in the query; the entire query should be on one continuous line. The beauty of Blue Prism is that it can execute complex queries of any level without constraints, but the syntax needs to be adjusted accordingly. Always specify database, table and field names in the following format: [databasename].[dbo].[tablename].[fieldname].
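For example, the query from the question rewritten onto a single continuous line:
select top 10 * from [Database1].[dbo].[Table1] join [Database1].[dbo].[Table2] on [Database1].[dbo].[Table2].Fieldname1=[Database1].[dbo].[Table1].Fieldname2 join [Database2].[dbo].[Table1] on [Database2].[dbo].[Table1].Fieldname1=[Database1].[dbo].[Table2].Fieldname2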

BigQuery Command Line - How to use parameters in the query string?

I am writing a shell script which involves BigQuery commands to query an existing table and save the results to a destination table.
However, since my script will be run periodically, I have a parameter for the date for which the query should run.
For example, my script looks like this:
DATE_FORMATTED=$(date +%Y%m%d)
bq query --destination_table=Desttables.abc_$DATE_FORMATTED "select hits_eventInfo_eventLabel from TABLE_DATE_RANGE([mydata.table_],TIMESTAMP($DATE_FORMATTED),TIMESTAMP($DATE_FORMATTED)) where customDimensions_index = 4"
I get the following error:
Error in query string: Error processing job 'pro-cn:bqjob_r5437894379_1': FROM clause with table wildcards matches no table
How else can I pass the variable $DATE_FORMATTED to the TABLE_DATE_RANGE function from BigQuery in order to help execute my query?
Use double quotes around the query plus single quotes around the shell variable. For example, in your case:
TIMESTAMP("'$DATE_FORMATTED'")
OR
select "'$variable'" as dummy from your_table
You are probably missing the single quotes around the $DATE_FORMATTED value inside the TIMESTAMP functions. Without the quotes it will default to the epoch time.
Try with:
TIMESTAMP('$DATE_FORMATTED'),TIMESTAMP('$DATE_FORMATTED')
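Applied to the original script, the corrected call would look roughly like this (table and field names are the ones from the question; the date string itself may still need to be in a format that TIMESTAMP accepts):
DATE_FORMATTED=$(date +%Y%m%d)
bq query --destination_table=Desttables.abc_$DATE_FORMATTED "select hits_eventInfo_eventLabel from TABLE_DATE_RANGE([mydata.table_],TIMESTAMP('$DATE_FORMATTED'),TIMESTAMP('$DATE_FORMATTED')) where customDimensions_index = 4"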

Error loading files into HBase on Azure with ImportTsv

I am trying to load a TSV file into HBase running in HDInsight on the Microsoft Azure cloud, using the recommended approach of connecting through Remote Desktop and running the following on the command line to load the t1.tsv file (with two tab-separated columns) from HDFS into the HBase table t1:
C:\apps\dist\hbase-0.98.0.2.1.5.0-2057-hadoop2\bin>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,num t1 t1.tsv
and get:
ERROR: One or more columns in addition to the row key and timestamp(optional) are required
Usage: importtsv -Dimporttsv.columns=a,b,c
Replacing the order of the specified columns with num,HBASE_ROW_KEY:
C:\apps\dist\hbase-0.98.0.2.1.5.0-2057-hadoop2\bin>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=num,HBASE_ROW_KEY t1 t1.tsv
I get:
ERROR: Must specify exactly one column as HBASE_ROW_KEY
Usage: importtsv -Dimporttsv.columns=a,b,c
This tells me that either the comma separator in the column list is not recognized or the column name is incorrect. I also tried using a column with a qualifier, as num:v and as 'num'; nothing helps.
Any ideas what could be wrong here? Thanks.
>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns="HBASE_ROW_KEY,d:c1,d:c2" testtable /example/inputfile.txt
This works for me. I think there are some differences between the Linux and Windows terminals, so on Windows you need to add quotation marks to make clear that this is a single value string; otherwise it might not be recognized correctly.
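Applied to the command from the question, that would look something like the following (the column family d is an assumption, since the question does not name one; use whatever family the t1 table actually has):
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns="HBASE_ROW_KEY,d:num" t1 t1.tsv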

Cassandra selective copy

I want to copy selected rows from a column family to a .csv file. The COPY command can only dump a column or the entire table to a file, without a WHERE clause. Is there a way to use a WHERE clause in the COPY command?
Another way I thought of was to do "insert into table2 () values (select * from table1 where <where_clause>);" and then dump table2 to .csv, which is also not possible.
Any help would be much appreciated.
There is no way to use a WHERE clause in COPY, but you can use this method:
echo "select c1,c2.... FROM keySpace.Table where ;" | bin/cqlsh > output.csv
It allows you to save your result in the output.csv file.
No, there is no built-in support for a "where" clause when exporting to a CSV file.
One alternative would be to write your own script using one of the drivers. In the script you would do the "select", then read the results and write out to a CSV file.
In addition to Amine CHERIFI's answer:
| sed -e 's/^\s+//; s_\s*\|\s*_,_g; /^-{3,}|^$|^\(.+\)$/d'
Removes spaces
Replaces | with ,
Removes header separator, empty and summary lines
Other ways to run the query with a filter and redirect the output to CSV:
1) Inside cqlsh, use the CAPTURE command and redirect the output to a file. You need to set tracing on before executing the command.
Example: CAPTURE 'output.txt' -- the output of the queries executed after this command is captured into the output.txt file
2) If you would like to redirect the query output to a file from outside of cqlsh:
./cqlsh -e'select * from keyspaceName.tableName' > fileName.txt -- hostname
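Putting the pieces above together, a rough end-to-end sketch that filters with a WHERE clause and cleans the output into a CSV file (keyspace, table, column names and the condition are placeholders; the sed expression is the one from the earlier answer, and filters on non-key columns may additionally need ALLOW FILTERING):
bin/cqlsh -e "select c1, c2 from keySpace.Table where c1 = 'x'" | sed -e 's/^\s+//; s_\s*\|\s*_,_g; /^-{3,}|^$|^\(.+\)$/d' > output.csv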
