What truly constitutes a Databricks managed table?

Everywhere you read that a managed table is a table created in the default location (/user/hive/warehouse/).
But Databricks' own examples in the documentation create a managed table in /user/blabla/bla
So what TRULY constitutes a managed table?
It certainly isn't simply anything created in the default database.
The remark, "A managed table is just something we create without the 'LOCATION' keyword" is ... not exactly correct.
Is it anything written in /user?
Is it any table that reads a source on /user?
If that is true, then when I read a file from /mnt/mymount/fil.csv to create tmpView and then run
create table myTable as select * from tmpView
to rewrite it to the default managed location, why is myTable still an external table?
What REALLY defines a managed table?

There it is.
If I create a DATABASE with a LOCATION value,
then every table I create in this database without a LOCATION value is a managed table.
But the table will be stored in a subdirectory of the database's location, regardless of the cluster's default Hive warehouse location (/user/hive/warehouse/).
Certainly a small but useful nuance.
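
One way to check this on a given table is to look at the Type and Location rows that Spark SQL's DESCRIBE TABLE EXTENDED returns. Below is a minimal sketch, assuming the rows have already been collected as (col_name, data_type, comment) tuples. The helper names, the table name mydb.mytable, and the sample rows are hypothetical; the database location reuses the /user/blabla/bla placeholder from the question.

```python
def describe_summary(rows):
    """Pick the 'Type' and 'Location' entries out of DESCRIBE TABLE EXTENDED rows.

    rows: iterable of (col_name, data_type, comment) tuples.
    If a key appears more than once, the last occurrence wins.
    """
    wanted = {"Type", "Location"}
    return {name: value for name, value, _ in rows if name in wanted}


def is_under(location, db_location):
    """True if a table location is a subdirectory of the database location."""
    return location.rstrip("/").startswith(db_location.rstrip("/") + "/")


# Hypothetical rows, shaped like the result of:
#   spark.sql("DESCRIBE TABLE EXTENDED mydb.mytable").collect()
sample_rows = [
    ("id", "int", None),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
    ("Type", "MANAGED", ""),
    ("Location", "dbfs:/user/blabla/bla/mytable", ""),
]

summary = describe_summary(sample_rows)
print(summary["Type"], is_under(summary["Location"], "dbfs:/user/blabla/bla"))
```

If the answer above holds, a table created without LOCATION in such a database should report Type MANAGED and a Location below the database's own folder.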

Related

Write data frame to hive table in spark

Could you please tell me whether this command could cause all tables in the DB to be overwritten:
df.write.option("path", "path_to_the_db/hive/").mode("overwrite").saveAsTable("result_data")
result_data is a new table in the DB; it did not exist before.
After these commands, all tables disappeared.
I was using Spark3 and tried to solve an error:
Can not create the managed table('result_data').
The associated location('dbfs:/user/hive/warehouse/result_data') already exists.
I expected that a new table would be created without any issues if it doesn't exist.

If path_to_the_db/hive contains other tables and you overwrite into that folder, it seems possible that the whole directory would be emptied first, yes. Perhaps you should use path_to_the_db/hive/result_data instead.
According to the error, though, your table does already exist.
You can use Spark to register a temporary table in SQL code, then run an INSERT OVERWRITE query for existing tables.
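
The suggestion to point the path at a table-specific folder could be sketched as follows. The helper names table_path and save_table are hypothetical, and df stands for any Spark DataFrame; only the option/mode/saveAsTable chain comes from the question.

```python
import posixpath


def table_path(db_root, table_name):
    """Return a folder dedicated to one table, below the database folder."""
    return posixpath.join(db_root.rstrip("/"), table_name)


def save_table(df, db_root, table_name):
    """Overwrite a single table's folder, never the shared database folder."""
    path = table_path(db_root, table_name)
    if path.rstrip("/") == db_root.rstrip("/"):
        # Happens when table_name is empty: the path would be the DB folder itself.
        raise ValueError("refusing to overwrite the database folder itself")
    df.write.option("path", path).mode("overwrite").saveAsTable(table_name)
    return path


print(table_path("path_to_the_db/hive/", "result_data"))
```

With this layout, an overwrite only ever targets path_to_the_db/hive/result_data, so sibling table folders are left alone.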

Set Kentico primary key value when inserting TreeNode

With Kentico 13, I'm looking for a way to specify the primary key value when inserting a TreeNode via API. Something like:
var node = TreeNode.New("MyPageType");
node.SetValue("MyPageTypeID", 1234);
node.Insert(parentNode);
This needs to set the primary key in the MyPageType table, so it needs SQL identity insert turned on, and it also needs to set the DocumentForeignKeyValue in the CMS_Document table.
The only way I have thought of doing it is with some custom SQL after the node is created, but that feels like a hack. Is there a better way?
This is for a content migration task of thousands of documents. After the content migration the default SQL & primary key behavior will be used.

In case anyone finds this, the solution I came up with was to run the content migration script with the old primary key value in a temporary column. After migration I ran SQL to update Kentico references to the old primary key, remove the old primary key, and change the primary key to the temporary column. A bit nasty, but got the job done.

Presto - can I do alter table if exists?

How can I rename a table only if it exists?
Something like: alter table mydb.myname if exists rename to mydb.my_new_name

You can do something like:
ALTER TABLE users RENAME TO people;
or
ALTER TABLE mydb.myname RENAME TO mydb.my_new_name;
Please note that the IF EXISTS syntax is not available here. You can find more information here: https://docs.starburstdata.com/latest/sql/alter-table.html The work for that is tracked under: https://github.com/prestosql/presto/issues/2260
Currently you need to handle this in a different layer, such as a Java program that runs SQL queries against Presto over JDBC.
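
Handling the check in a client layer could look roughly like this. The answer mentions Java over JDBC; this sketch uses Python instead and assumes a DB-API style cursor connected to Presto. The function name rename_if_exists is hypothetical, and the existence check relies on the standard information_schema.tables view.

```python
import re

# Only plain identifiers are accepted, because they are placed into the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def rename_if_exists(cursor, schema, old_name, new_name):
    """Rename schema.old_name to schema.new_name only if the table exists.

    cursor: a DB-API style cursor connected to Presto (assumed).
    Returns True if the rename statement was issued, False otherwise.
    """
    names = [schema, old_name, new_name]
    for ident in names:
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    # information_schema reports unquoted names in lower case.
    schema, old_name, new_name = (n.lower() for n in names)

    cursor.execute(
        "SELECT table_name FROM information_schema.tables "
        f"WHERE table_schema = '{schema}' AND table_name = '{old_name}'"
    )
    if cursor.fetchone() is None:
        return False

    cursor.execute(
        f"ALTER TABLE {schema}.{old_name} RENAME TO {schema}.{new_name}"
    )
    return True
```

The check and the rename are two separate statements, so another session could still drop the table in between; the caller should be prepared for the ALTER TABLE to fail.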

Cassandra create and load data atomicity

I have a web service that looks for the last created table
[name_YYYYMMddHHmmss]
I have a persister job that creates and loads a table (insert or bulk).
Is there something that hides a table until it is fully loaded?
First, I created a technical table. It works, but I will need one per keyspace (using cassandraAuth). I don't like this.
I was thinking about tags, but they don't seem to exist:
- create a table with a tag and modify or remove it when the table is loaded.
There is also the table comment option.
Any ideas?

Table comment is a good option. We use it for some service information about the table, e.g. table versions tracking.
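
A rough sketch of how the table comment could serve as a "fully loaded" flag: the persister sets the comment once loading finishes, and the web service only considers tables that carry it. The marker value "loaded" and both function names are hypothetical; the (table_name, comment) pairs are assumed to have been read beforehand, for example from system_schema.tables for one keyspace.

```python
from datetime import datetime

LOADED = "loaded"  # hypothetical marker value stored in the table comment


def mark_loaded_cql(keyspace, table):
    """CQL statement that flags a table as fully loaded through its comment."""
    return f"ALTER TABLE {keyspace}.{table} WITH comment = '{LOADED}'"


def latest_loaded(tables, prefix):
    """Return the newest table named <prefix>_YYYYMMddHHmmss marked as LOADED.

    tables: iterable of (table_name, comment) pairs.
    Returns None when no table qualifies.
    """
    best = None
    for name, comment in tables:
        if comment != LOADED or not name.startswith(prefix + "_"):
            continue
        try:
            stamp = datetime.strptime(name[len(prefix) + 1:], "%Y%m%d%H%M%S")
        except ValueError:
            continue  # suffix is not a valid timestamp
        if best is None or stamp > best[0]:
            best = (stamp, name)
    return best[1] if best else None


rows = [
    ("name_20240101120000", "loaded"),
    ("name_20240102120000", ""),  # newer, but still loading, so it is ignored
]
print(mark_loaded_cql("myks", "name_20240102120000"))
print(latest_loaded(rows, "name"))
```

The table is not hidden in any real sense; readers simply agree to skip tables whose comment does not carry the marker yet.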

How can I obtain the database schema from an existing ActiveRecord.cs file?

I have been given the source code for an existing project that uses SubSonic ORM. My (limited!) understanding is that SubSonic generates code by reverse-engineering the existing database. Unfortunately I don't have the database that was used for this project.
I do have the ActiveRecord.cs file from the last time it was compiled. How could I work out the database schema so I can reproduce the database?

This sounds like SubSonic 3. Here are a couple places to get you started based on me looking through my ActiveRecord.cs file. You might want to create a small database yourself, run SubSonic on it, and see what gets generated in ActiveRecord.cs.
Inside your ActiveRecord.cs file, you'll find one partial class per table. The partial class will inherit from IActiveRecord and will likely be the name of the table.
Inside the class, you'll find a function called "KeyName()" which will return your primary key column name for the table. SubSonic requires a primary key for tables it processes and generates code for.
Look for a region named " Foreign Keys ". If this table has foreign keys, you'll find a property corresponding to each foreign key, something like "public IQueryable OtherTableNames". So this table should have a column named something like "OtherTableNameID"; check the generated partial class for the foreign key table to be sure.
Immediately below the foreign key region, you'll find properties for the non-foreign key columns of this table. You can somewhat guess at the data types of the columns from the property data types (e.g. string might be a char(x) or a varchar(x)).
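
As a starting point, the table names and primary keys could be pulled out of the file with a small script. This is only a sketch: the regular expressions assume the generated code looks like the hypothetical sample below (a partial class inheriting from IActiveRecord, with KeyName() returning a string literal), so they may need adjusting to match your actual ActiveRecord.cs.

```python
import re

# Assumed shapes of the generated code; adjust to the real file if they differ.
CLASS_RE = re.compile(r"public\s+partial\s+class\s+(\w+)\s*:\s*IActiveRecord")
KEY_RE = re.compile(r'KeyName\s*\(\s*\)\s*\{\s*return\s+"([^"]+)"')


def tables_and_keys(source):
    """Map each table class name to the column name its KeyName() returns."""
    matches = list(CLASS_RE.finditer(source))
    result = {}
    for i, match in enumerate(matches):
        # Search only up to the start of the next class declaration.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(source)
        key = KEY_RE.search(source, match.end(), end)
        result[match.group(1)] = key.group(1) if key else None
    return result


# Hypothetical excerpt standing in for a real ActiveRecord.cs file.
sample = '''
public partial class Product: IActiveRecord
{
    public string KeyName()
    {
        return "ProductID";
    }
}
public partial class Category: IActiveRecord
{
    public string KeyName()
    {
        return "CategoryID";
    }
}
'''

print(tables_and_keys(sample))
```

Column data types and foreign keys would still have to be read from the properties by hand, as described above.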
