Presto - can I do alter table if exists?

How can I alter table name only if exists?
Something like: alter table mydb.myname if exists rename to mydb.my_new_name

You can do something like:
ALTER TABLE users RENAME TO people;
or
ALTER TABLE mydb.myname RENAME TO mydb.my_new_name;
Please notice that the IF EXISTS syntax is not available here. You can find more information here: https://docs.starburstdata.com/latest/sql/alter-table.html The work to add it is tracked under: https://github.com/prestosql/presto/issues/2260
Currently you need to handle this at a different layer, for example in a Java program that runs SQL queries against Presto over JDBC.
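Until that lands, one workaround is to check information_schema.tables first and only issue the ALTER TABLE when the table is present. A minimal sketch using a generic DB-API cursor (the function name and cursor wiring are illustrative, not part of any Presto client API):

```python
def rename_table_if_exists(cursor, schema, old_name, new_name):
    """Rename schema.old_name to schema.new_name only if it exists.

    Returns True if the rename was issued, False if the table was absent.
    Note: this check-then-act is not atomic; a concurrent drop can still
    make the ALTER TABLE fail, so keep error handling around the call.
    Only use trusted identifiers here - they are interpolated into SQL.
    """
    cursor.execute(
        "SELECT count(*) FROM information_schema.tables "
        f"WHERE table_schema = '{schema}' AND table_name = '{old_name}'"
    )
    (count,) = cursor.fetchone()
    if count == 0:
        return False
    cursor.execute(f"ALTER TABLE {schema}.{old_name} RENAME TO {schema}.{new_name}")
    return True
```
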

Related

Write data frame to hive table in spark

could you please tell me if this command could create problems by overwriting all tables in the DB:
df.write.option("path", "path_to_the_db/hive/").mode("overwrite").saveAsTable("result_data")
result_data is a new table in the DB; it did not exist before.
After these commands, all tables disappeared.
I was using Spark3 and tried to solve an error:
Can not create the managed table('result_data').
The associated location('dbfs:/user/hive/warehouse/result_data') already exists.
I expected that a new table would be created without any issues if it doesn't exist.
If path_to_the_db/hive contains other tables and you overwrite into that folder, it seems possible that the whole directory would be emptied first, yes. Perhaps you should instead use path_to_the_db/hive/result_data
According to the error, though, your table does already exist.
You can use Spark to register a temporary view, then run an INSERT OVERWRITE query against the existing table.
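A sketch of that pattern (the view and table names are illustrative): register the DataFrame as a temporary view first, e.g. df.createOrReplaceTempView("tmp_result"), then overwrite only the target table rather than the whole warehouse path:

```sql
-- Overwrites the contents of result_data only, not sibling tables
INSERT OVERWRITE TABLE result_data
SELECT * FROM tmp_result;
```
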

What truly constitutes a Databricks managed table?

All over the place you read a managed table is a table that is created in the default location (/user/hive/warehouse/)
But Databricks' own examples in the documentation create a managed table in /user/blabla/bla
So what TRULY constitutes a managed table?
It certainly isn't simply anything created in the default database.
The remark, "A managed table is just something we create without the 'LOCATION' keyword" is ... not exactly correct.
Is it anything written in /user?
Is it any table that reads a source on /user?
.. If this is true, then when I read a file from /mnt/mymount/fil.csv to create tmpView, and then
create table myTable as select * from tmpView
to re-write it to the default managed location, why is myTable still an external table?
What REALLY defines a managed table?
There it is.
If I create a DATABASE with a LOCATION value,
then every table I create in this database without a LOCATION value is a managed table.
But the table will be stored in a subdirectory of the database's location, regardless of the cluster's default location for the user Hive warehouse.
Certainly a small but useful nuance.
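To illustrate the point (the database and path names here are made up):

```sql
-- Database with an explicit location
CREATE DATABASE mydb LOCATION '/some/custom/path/mydb.db';

-- No LOCATION on the table, so it is managed, and its data lands under
-- the database's location: /some/custom/path/mydb.db/my_table
CREATE TABLE mydb.my_table (id INT, name STRING);

-- Shows Type: MANAGED along with the resolved location
DESCRIBE EXTENDED mydb.my_table;
```
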

Print table name on which query is executed

Looking at the following lines of code:
query = "DROP TABLE IF EXISTS my_table"
cur.execute(query)
conn.commit()
# print(table_name)
I'm running the query against multiple tables with various queries, and I want to return the name of the table and the action executed each time. Is there a way to get some kind of metadata from cur.execute or conn.commit about the action being run?
In the example above I'd like to print the table name (my_table) and the action (DROP TABLE). However, I want this to be dynamic: if I'm creating a table, I want the name of the newly created table and the action (CREATE TABLE).
Thanks.
Quick and Dirty
tables = ['table_1', 'table_2', 'table_3']
action = 'DROP TABLE'
for table in tables:
    cur.execute(f'{action} IF EXISTS {table}')
    print(f'ACTION: {action}')
    print(f'TABLE: {table}')
conn.commit()
HOWEVER, please do not ever do something like this in anything other than a tiny app that will never leave your computer, and especially not with anything that will accept input from a user.
Bad things will happen.
Dynamically interfacing with databases using OOP is a solved problem, and it's not worth reinventing the wheel. Have you considered using an ORM like SQLAlchemy?
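If you only need the metadata for logging, another option is to parse it out of the SQL string itself before executing it. A best-effort sketch (the helper name is made up, and it handles only simple single-statement DDL, not arbitrary SQL):

```python
import re

def extract_action_table(query):
    """Return (action, table_name) from a simple DDL statement, or None.

    Handles statements like 'DROP TABLE IF EXISTS t' or
    'CREATE TABLE IF NOT EXISTS db.t (...)'. A real SQL parser
    (e.g. the sqlparse library) is more robust for anything fancier.
    """
    m = re.match(
        r"\s*(?P<action>DROP TABLE|CREATE TABLE|TRUNCATE TABLE|ALTER TABLE)"
        r"(?:\s+IF\s+(?:NOT\s+)?EXISTS)?\s+(?P<table>[\w.]+)",
        query,
        re.IGNORECASE,
    )
    if not m:
        return None
    return (m.group("action").upper(), m.group("table"))
```

With that in place, the loop can log `extract_action_table(query)` for each statement it runs, regardless of which action it is.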

How to read hive managed table data using spark?

I am able to read hive external table using spark-shell but, when I try to read data from hive managed table it only shows column names.
Please find queries here:
Could you please try using the database name along with the table name?
spark.sql("select * from db_name.test_managed").show()
If the result is still the same, please share the output of DESCRIBE FORMATTED for both tables.

Cassandra create and load data atomicity

I have a web service which looks for the last created table
[name_YYYYMMddHHmmss]
I have a persister job that creates and loads a table (insert or bulk).
Is there something that hides a table until it is fully loaded?
First, I created a technical table. It works, but I would need one per keyspace (using cassandraAuth), and I don't like this.
I was thinking about tags, but they don't seem to exist.
- create a table with a tag and modify or remove it when the table is loaded.
There is also the table comment option.
Any ideas?
The table comment is a good option. We use it for some service information about the table, e.g. for tracking table versions.
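A sketch of the comment-based approach (the keyspace, table, and column names are illustrative): the persister creates the table with a "loading" comment, flips it when the load finishes, and the web service filters on it before picking a table:

```sql
-- Persister: create the table marked as not ready yet
CREATE TABLE ks.name_20240101120000 (
    id uuid PRIMARY KEY,
    payload text
) WITH comment = 'loading';

-- Persister: flip the flag once the load completes
ALTER TABLE ks.name_20240101120000 WITH comment = 'ready';

-- Web service: read the comments before choosing the latest table
SELECT table_name, comment
FROM system_schema.tables
WHERE keyspace_name = 'ks';
```
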
