How to match SQL server functions in Hive - python-3.x

I am trying to write the Hive equivalent of a SQL Server stored procedure. I managed to translate the first two statements:
DECLARE @ReloadMonths as INT=15
set reloadMonths=15
DECLARE @Anchor_DT as DATE =EOMONTH(Getdate(),-1);
set anchor_dt=select last_day(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'))
But I am having trouble translating the following two:
DECLARE @YearMonth as INT=C_II.Common.FN_COM_DATEToYearMonth(@Anchor_DT);
set yearMonth=(anchor_dt,'yyyy-MM')
DECLARE @StartYearMonth as INT =ISNULL(@StartYearMonth_Inp,C_II.Common.FN_COM_DATEToYearMonth(DATEADD(MM,-@ReloadMonths+1,@Anchor_DT)));
set startYearMonth=${hiveconf:${hiveconf:startYearMonth}};
Any ideas or suggestions?

Your requirements are not very clear from the question. Also, the function 'C_II.Common.FN_COM_DATEToYearMonth' appears to be specific to your project; it is not a standard SQL Server function.
Let's break it down step by step.
If we run the statements below in SQL Server:
DECLARE @Anchor_DT as DATE =EOMONTH(Getdate(),-1);
select @Anchor_DT;
they return the last day of the previous month, for example 2019-06-30,
whereas the Hive conversion you made for this is incorrect:
select last_day(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'))
It returns the last day of the current month, '2019-07-31'. The correct Hive equivalent of the SQL Server statement is:
select DATE_SUB(current_date(),DAY(current_date()));
This returns '2019-06-30'.
The last two statements in your question are not very clear, but it looks like you expect the conversion below:
select date_format('${hiveconf:anchor_dt}','yyyy-MM');
It returns 2019-06.
"DECLARE @StartYearMonth as INT =ISNULL(@StartYearMonth_Inp,C_II.Common.FN_COM_DATEToYearMonth(DATEADD(MM,-@ReloadMonths+1,@Anchor_DT)));"
I rewrote the statement above in SQL Server as shown below:
select format((DATEADD(MM,-@ReloadMonths+1,@Anchor_DT)),'yyyy-MM');
This returns 2018-04 in SQL Server.
Answer to your question:
Create a Hive script (testdatehive.hql) and save it to your HDFS location:
select date_format('${hiveconf:anchor_dt}','yyyy-MM');
select date_format(add_months('${hiveconf:anchor_dt}',-${hiveconf:reloadMonths}+1),'yyyy-MM');
Shell script:
#!/bin/bash
# Declare an integer variable
declare -i reloadMonths=15
echo "$reloadMonths"
echo "Executing the hive query - get anchor date and store it in shell variable"
# Last day of the previous month, computed by Hive
anchor_dt=$(hive -e "select DATE_SUB(current_date(),DAY(current_date()));")
echo "$anchor_dt"
echo "Pass anchor_date & reloadMonths to hive script"
hive --hiveconf anchor_dt="$anchor_dt" --hiveconf reloadMonths="$reloadMonths" -f hdfs://hostname/user/vikct001/dev/hadoop/hivescripts/testdatehive.hql
echo "Executing the hive query - ends"
Here is your shell output:
15
Executing the hive query - get anchor date and store it in shell variable
2019-06-30
Pass anchor_date & reloadMonths to hive script
2019-06
2018-04
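The three values in this output can be cross-checked without a Hive installation. The sketch below is only a sanity check under stated assumptions, not part of the solution itself: it assumes GNU date is available, and the helper names prev_month_end and shift_year_month are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: last day of the month before the given yyyy-MM-dd date.
# Assumes GNU date (the -d option); -u avoids local DST surprises.
prev_month_end() {
  local first_of_month
  first_of_month=$(date -u -d "$1" +%Y-%m-01)
  date -u -d "$first_of_month 1 day ago" +%Y-%m-%d
}

# Hypothetical helper: shift a yyyy-MM-dd date by N months and print yyyy-MM.
# Uses plain integer arithmetic on a running month count, so month lengths never matter.
shift_year_month() {
  local year=${1:0:4} month=${1:5:2} offset=$2
  # 10# forces base 10 so that months such as "08" are not read as octal.
  local total=$(( 10#$year * 12 + 10#$month - 1 + offset ))
  printf '%04d-%02d\n' $(( total / 12 )) $(( total % 12 + 1 ))
}

reloadMonths=15
anchor_dt=$(prev_month_end 2019-07-15)   # any day in July 2019
echo "$anchor_dt"
shift_year_month "$anchor_dt" 0
shift_year_month "$anchor_dt" $(( -reloadMonths + 1 ))
```

Run with a July 2019 date, this prints the anchor date, its year-month, and the year-month 14 months earlier, which can be compared against the Hive output.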
Let me know if this works.

Related

Linux - Store a sql select value in a variable bash

I want to store the result of a sqlite statement in a variable:
backup=$(sqlite3 "/home/miguel/Desktop/SO/ProjetoFinal/Backup_Principal.db" "SELECT periocidade_backup FROM STORAGE WHERE path='$path';")
But when I echo $backup, it returns the following:
sqlite3 "/home/miguel/Desktop/SO/ProjetoFinal/Backup_Principal.db" "SELECT periocidade_backup FROM STORAGE WHERE path='$path';"
What am I doing wrong?
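Single quotes make their contents literal, so a variable inside them is not expanded, but only when the single quotes are the outermost quotes. Inside a double-quoted string, as in your command, '$path' is still expanded and the single quotes reach SQLite as part of the query. If the query is ever wrapped in single quotes instead, close them and put double quotes around the variable, as in "'$path'".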
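A quick way to see how the shell treats '$path' is to build the query string and print it before handing it to sqlite3. This is a minimal sketch; the value assigned to path is an example chosen only for illustration.

```shell
#!/bin/bash
path="/home/miguel/Desktop"   # example value, chosen only for this sketch

# Outer double quotes: $path is expanded, the inner single quotes are kept as text.
expanded="SELECT periocidade_backup FROM STORAGE WHERE path='$path';"

# Outer single quotes: nothing is expanded, $path stays literal.
literal='SELECT periocidade_backup FROM STORAGE WHERE path=$path;'

echo "$expanded"
echo "$literal"
```

Only the first form carries the variable's value into the query.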

Quotes missing in Hive Query - HQL

I am calling HQL from a shell script.
I pass a variable to the HQL; its value comes from querying another table. The variable is set as follows:
$$A1=('123','124')
When I echo $$A1 in the shell script, it displays correctly as ('123','124'),
but when I use the variable in the query, the single quotes are missing: it is passed as (123,124).
I use $$A1 in the query as follows:
select * from table1 where cd in $$A1
The query is executed as select * from table where cd in (123,124)
Why are the single quotes missing when the variable is passed to the query?
I would appreciate any help with this.
Thanks,
Babu
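The calling script is not shown, so the cause can only be guessed. One common reason for vanishing single quotes is that the shell removes them while parsing an unquoted word. The sketch below illustrates that behaviour under this assumption; the variable name A1 mirrors the question and everything else is illustrative.

```shell
#!/bin/bash
# Quoted assignment: the whole list, single quotes included, is one string.
A1="('123','124')"

# Double quotes around the query keep the inner single quotes intact.
query="select * from table1 where cd in $A1"
echo "$query"

# By contrast, quotes in an unquoted word are removed when the shell parses it.
stripped=$(echo cd in '123','124')
echo "$stripped"
```

If the query text reaches hive through an unquoted expansion or an extra layer of shell evaluation, the second behaviour would explain the (123,124) seen in the question.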

I need to execute one SQL query against two DBs at the same time and export the data to CSV files [duplicate]

I have a file1.sh script that needs to execute one SQL query against two Oracle DBs at the same time and export the data to CSV files. Below is the sample shell script that executes the query against the two DBs.
....
#!bin/bash
set -X
sqlplus -S ${user1}@${DBCONNECTIONNAME_1}/${Pwd} Datesquery.sql & >> ${Targetdirectory}/csvfile1.csv
sqlplus -S ${user1}@${DBCONNECTIONNAME_2}/${Pwd} Datesquery.sql & >> ${Targetdirectory}/csvfile2.csv
sed 1d csvfile2.csv > file2noheader.csv
cat csvfile1.csv file2noheader.csv > ${Targetdirectory}/Expod.csv
....
But it does not connect to the DB or execute any query; it simply displays the sqlplus usage text explaining how to use the connection string. Please let me know how to run one query against two DBs in parallel and write the output to separate CSV files.
A given sqlplus session can only connect to one DB at a time, so your requirement 'at the same time' is essentially a non-starter. If 'at the same time' really means 'sequentially, in the same script', then you are back to fixing your connect string. And at that, you 'have more errors than an early Mets game' (with apologies to any NY Mets fans).
First, your script indicates that your sqlplus command is the very first actual command following specification of your shell processor and 'set -x'. Yet you make heavy use of environment variables as substitutions for username, password, and connection name - without ever setting those variables.
Second, your use of an '&' in the command line is totally confusing to both me and the parser.
Third, you need to precede your reference to the SQL script with '@'.
Fourth, your order of elements in the command line is all wrong.
Try this:
#!/bin/bash
orauser1=<supply user name here>
orapw1=<supply password here>
oradb_1=<supply connection name of first database>
#
orauser2=<supply user name here>
orapw2=<supply password here>
oradb_2=<supply connection name of second database>
#
Targetdirectory=<supply value here>
#
sqlplus -S ${orauser1}/${orapw1}@${oradb_1} @Datesquery.sql >> ${Targetdirectory}/csvfile1.csv
sqlplus -S ${orauser2}/${orapw2}@${oradb_2} @Datesquery.sql >> ${Targetdirectory}/csvfile2.csv
Or create a database link from one DB to the other and then run both queries in one DB, one of them over the DB link:
select * from tab1
union
select * from tab1@db_link
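If running the two exports concurrently is still the goal, two separate sqlplus processes can be started as background jobs and awaited with wait. The sketch below shows only the shell mechanics: run_query is a hypothetical stand-in that prints fake CSV rows so the example runs without Oracle, and export_both is likewise an illustrative name.

```shell
#!/bin/bash
# Stand-in for the real call so the sketch runs without Oracle. In a real script
# the body would be something like:
#   sqlplus -S "${user}/${pw}@${db}" @Datesquery.sql
run_query() {
  printf 'COL_A,COL_B\n%s,1\n' "$1"
}

# Run the query for both databases in parallel and merge the CSV output.
export_both() {
  local outdir=$1
  # The redirection comes before '&'; '&' then backgrounds the whole command.
  run_query db1 > "$outdir/csvfile1.csv" &
  run_query db2 > "$outdir/csvfile2.csv" &
  wait   # block until both background jobs have finished
  # Keep the header row from the first file only.
  { cat "$outdir/csvfile1.csv"; sed 1d "$outdir/csvfile2.csv"; } > "$outdir/Expod.csv"
}

outdir=$(mktemp -d)
export_both "$outdir"
cat "$outdir/Expod.csv"
```

Placing the redirection before the '&' is the key difference from the original script, where the '&' ended the command before the output file was attached.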

BigQuery Command Line - How to use parameters in the query string?

I am writing a shell script which involves BigQuery commands to query an existing table and save the results to a destination table.
However, since my script will be run periodically, I have a parameter for the date for which the query should run.
For example, my script looks like this:
DATE_FORMATTED=$(date +%Y%m%d)
bq query --destination_table=Desttables.abc_$DATE_FORMATTED "select hits_eventInfo_eventLabel from TABLE_DATE_RANGE([mydata.table_],TIMESTAMP($DATE_FORMATTED),TIMESTAMP($DATE_FORMATTED)) where customDimensions_index = 4"
I get the following error:
Error in query string: Error processing job 'pro-cn:bqjob_r5437894379_1': FROM clause with table wildcards matches no table
How else can I pass the variable $DATE_FORMATTED to the TABLE_DATE_RANGE function from BigQuery in order to help execute my query?
Use double quotes "" + single quote ''. For example, in your case:
TIMESTAMP("'$DATE_FORMATTED'")
OR
select "'$variable'" as dummy from your_table
You are probably missing the single quotes around the $DATE_FORMATTED value inside the TIMESTAMP functions. Without the quotes, it will default to the epoch time.
Try with:
TIMESTAMP('$DATE_FORMATTED'),TIMESTAMP('$DATE_FORMATTED')
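To confirm what the shell actually hands to bq, the query can be assembled in a variable and printed first. This is a small sketch that covers only the shell side: the date is fixed for reproducibility, and the bq call is left commented out because it needs a configured project.

```shell
#!/bin/bash
# Fixed here for a reproducible example; the question uses $(date +%Y%m%d).
DATE_FORMATTED=20190701

# Single quotes inside the double-quoted string survive, and the variable is expanded.
query="select hits_eventInfo_eventLabel from TABLE_DATE_RANGE([mydata.table_],TIMESTAMP('$DATE_FORMATTED'),TIMESTAMP('$DATE_FORMATTED')) where customDimensions_index = 4"

# Print the query to inspect it before running it.
echo "$query"
# bq query --destination_table="Desttables.abc_$DATE_FORMATTED" "$query"
```

The printed text shows the date wrapped in single quotes inside both TIMESTAMP calls, which is the form this answer recommends.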

Sybase, execute string as sql query

In Sybase SQL, I would like to execute a String containing SQL.
I would expect something like this to work
declare @exec_str char(100)
select @exec_str = "select 1"
execute @exec_str
go
From the documentation of the exec command:
execute | exec is used to execute a stored procedure or an extended stored procedure (ESP). This keyword is necessary if there are multiple statements in the batch.
execute is also used to execute a string containing Transact-SQL.
However, my example above gives an error. Am I doing something wrong?
You need parentheses around the variable:
execute ( @exec_str )
