MemSQL support for MySQL-style user variables in the LOAD DATA command

Does MemSQL support user variables in the load data command, similar to MySQL (see MySQL load NULL values from CSV data for examples)? The MemSQL documentation (https://docs.memsql.com/docs/load-data) doesn't give a clue, and my attempts at using user variables have failed.

No, variables in LOAD DATA are not currently supported in general (as of MemSQL 5.5). This is a feature we are tracking for a future release.
We only support the following syntax to skip the contents of a column in the file using a dummy variable (briefly mentioned in the docs https://docs.memsql.com/docs/load-data):
load data infile 'foo.tsv' into table foo (bar, #, #, baz);
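Until then, one possible workaround (not from the MemSQL docs, just a sketch, and it assumes MemSQL follows MySQL's convention of reading an unquoted \N field as SQL NULL) is to preprocess the file before loading, for example turning empty fields into \N:
# Hypothetical preprocessing step (file names are placeholders): rewrite a TSV so
# empty fields become \N, which LOAD DATA, following the MySQL convention, reads
# as SQL NULL.
with open("foo.tsv", encoding="utf-8") as src, \
        open("foo_with_nulls.tsv", "w", encoding="utf-8") as dst:
    for line in src:
        fields = line.rstrip("\n").split("\t")
        # Replace empty fields with the literal \N marker before LOAD DATA sees them.
        dst.write("\t".join(f if f != "" else r"\N" for f in fields) + "\n")
You would then point the LOAD DATA statement above at foo_with_nulls.tsv instead of foo.tsv.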

Related

How to roll back changes to a VSAM file on CICS?

I'm using EXEC CICS SYNCPOINT and EXEC CICS SYNCPOINT ROLLBACK to commit/back out updates to VSAM files and DB2 tables when an abend happens. However, only the updates to the DB2 tables are backed out, not those to VSAM. Am I missing something? The CICS RLS parameter is set to RLS=NO.
It will depend on the type of files that you are using. If you are using RLS files, then you have to define the data sets correctly with IDCAMS, using the LOG parameter; see:
https://www.ibm.com/docs/en/zos/2.2.0?topic=cics-recoverable-nonrecoverable-data-sets
If you are using non-RLS files, then you need to set the attributes correctly on your FILE definition.
See the following page in the CICS documentation, which describes file recovery:
https://www.ibm.com/docs/en/cics-ts/5.6?topic=resources-recovery-files
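For illustration only (the data set name is a placeholder; the LOG parameter is the one mentioned above): with RLS, recoverability is an attribute of the data set in the ICF catalog, which can be set with an IDCAMS job step along these lines:
//* Hypothetical job step: mark an existing VSAM data set as recoverable
//* (LOG(UNDO) = backout only) so SYNCPOINT ROLLBACK can back out its updates.
//ALTERLOG EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  ALTER MY.VSAM.KSDS LOG(UNDO)
/*
For non-RLS files, the equivalent setting is the RECOVERY attribute (for example RECOVERY(BACKOUTONLY)) on the CICS FILE resource definition rather than on the catalog entry.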

SSIS - Power Query Source: setting connection at runtime

I'm trying to use the Power Query source component in a generic way from SSIS (VS2019).
The idea would be to use a for each loop to load and transform Excel files. At run time, I need to set the connection manager properties for each file as well as the PQY script to be executed on the file.
What I've done so far is create a JSON connection string inside a script component and assign it to the connection manager, but it keeps saying that the file requires credentials.
Has anyone already done this kind of development? All the files have the same structure so far; does the metadata need to be refreshed too?
[Edit]
In the control flow, I'm retrieving the PQY script I want to apply from a DB.
Before the transformations, the script starts like this:
let Source = Excel.Workbook(File.Contents("path_to_a_file.xlsx"),null,true),RawData_Sheet = Source{[Item="Table1",Kind="Table"]}[Data]..."
In the C# script task, I'm replacing the path to the Excel file with the current file variable. The M script is stored in a variable used by the PQY component.
The C# script then updates the PQY connection manager to target the appropriate file:
ConnectionManager _conn = Dts.Connections["Power Query Connection Manager"];
String _ConnectString = "[{kind:File,path:path_to_a_file.xlss,AuthenticationKind:Windows,Username:myusername,Password:mypassword}]";
_conn.ConnectionString = _ConnectString;
The PQY component is left as it is, connected to ["Power Query Connection Manager"] and getting its script from the variable I set.
[Screenshot: PQY configuration screen]
Thanks for any tip on this,
Olivier
I can't address the specifics of the Power Query component, but a generic anything in a Data Flow will not work.
The Data Flow task works because it makes a strict contract between the source(s) and the destination(s): these columns, with these data types, will be in play during the run. It's a design-time contract, because that is what allows the run-time engine to allocate resources based on how many buffers of data the system can support. Each row is X bytes, we have Y bytes of memory available, so we get Z buffers' worth of data, plus the parallelism stuff.
Wish I had a better story to tell you.

Azure Databricks - Can not create the managed table The associated location already exists

I have the following problem in Azure Databricks. Sometimes when I try to save a DataFrame as a managed table:
SomeData_df.write.mode('overwrite').saveAsTable("SomeData")
I get the following error:
"Can not create the managed table('SomeData'). The associated
location('dbfs:/user/hive/warehouse/somedata') already exists.;"
I used to fix this problem by running a %fs rm command to remove that location but now I'm using a cluster that is managed by a different user and I can no longer run rm on that location.
For now the only fix I can think of is using a different table name.
What makes things even more peculiar is the fact that the table does not exist. When I run:
%sql
SELECT * FROM SomeData
I get the error:
Error in SQL statement: AnalysisException: Table or view not found:
SomeData;
How can I fix it?
Seems there are a few others with the same issue.
A temporary workaround is to use
dbutils.fs.rm("dbfs:/user/hive/warehouse/SomeData/", true)
to remove the table before re-creating it.
This generally happens when a cluster is shut down while a table is being written. The recommended solution from the Databricks documentation is to set the flag below; it deletes the _STARTED directory and returns the process to the original state. For example, you can set it in the notebook:
%py
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")
All of the other recommended solutions here are either workarounds or do not work. The mode is specified as overwrite, meaning you should not need to delete or remove the db or use legacy options.
Instead, try specifying the fully qualified path in the options when writing the table:
df.write \
.option("path", "hdfs://cluster_name/path/to/my_db") \
.mode("overwrite") \
.saveAsTable("my_db.my_table")
For a more context-free answer, run this in your notebook:
dbutils.fs.rm("dbfs:/user/hive/warehouse/SomeData", recurse=True)
Per Databricks's documentation, this will work in a Python or Scala notebook, but you'll have to use the magic command %python at the beginning of the cell if you're using an R or SQL notebook.
I have the same issue; I am using
create table if not exists USING delta
If I first delete the files like suggested, it creates the table once, but the second time the problem repeats. It seems CREATE TABLE IF NOT EXISTS does not recognize the table and tries to create it anyway.
I don't want to delete the table every time; I'm actually trying to use MERGE and keep the table.
Well, this happens because you're trying to write data to the default location (without specifying the 'path' option) with the mode 'overwrite'.
As Mike said, you can set "spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation" to "true", but this option was removed in Spark 3.0.0.
If you try to set this option in Spark 3.0.0 you will get the following exception:
Caused by: org.apache.spark.sql.AnalysisException: The SQL config 'spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation' was removed in the version 3.0.0. It was removed to prevent loosing of users data for non-default value.;
To avoid this problem, explicitly specify the path you are going to save to when using 'overwrite' mode.
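Putting the suggestions in this thread together, here is a rough sketch (the DataFrame, table name, and warehouse location are the ones from the question, and dbutils is the Databricks notebook utility): write with an explicit overwrite, and if leftover files still block the managed-table creation, remove the orphaned directory and retry:
from pyspark.sql.utils import AnalysisException

table_name = "SomeData"
# Location taken from the error message in the question.
table_location = "dbfs:/user/hive/warehouse/somedata"

try:
    SomeData_df.write.mode("overwrite").saveAsTable(table_name)
except AnalysisException:
    # Files left behind by an interrupted write block managed-table creation,
    # so clean up the orphaned warehouse directory and try once more.
    dbutils.fs.rm(table_location, recurse=True)
    SomeData_df.write.mode("overwrite").saveAsTable(table_name)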

Cassandra database import issue for timeuuid

I have installed Cassandra 2.2.12 locally on my Windows machine. I exported the database from the live server into a '.sql' file using the 'razorsql' GUI tool. I don't have server access for live, only database access. When I try to import the '.sql' file into my local Cassandra setup using 'razorsql', it gives me an error (Invalid STRING constant '8ca25030-89ab-11e7-addb-70a0656e5127' for "id" of type timeuuid).
I also tried using the COPY FROM command, and it returns the same error. Please see the attached screenshot for more detail on the error.
Could anybody please help?
You should not put the value in quotes, because then it gets interpreted as a string instead of a UUID - hence the error message.
See also: Inserting a hard-coded UUID via CQLsh (Cassandra)
I think you have two solutions:
Edit your export file and remove the single quotes from the inserts (a rough sketch of this follows below).
Rerun the export, exporting the data as CSV, and run the COPY command in cqlsh; in that case the CSV file will not have quotes.
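For the first option, here is a small Python sketch (purely illustrative; the file names are placeholders, and it assumes the only quoted UUID-shaped literals in the export are the timeuuid values):
import re

# Matches a quoted UUID literal such as '8ca25030-89ab-11e7-addb-70a0656e5127'.
QUOTED_UUID = re.compile(
    r"'([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12})'"
)

with open("export.sql", encoding="utf-8") as src, \
        open("export_fixed.sql", "w", encoding="utf-8") as dst:
    for line in src:
        # Drop the quotes so the value is parsed as a timeuuid, not a string.
        dst.write(QUOTED_UUID.sub(r"\1", line))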

Customize Liquibase Control tables (DATABASECHANGELOG & DATABASECHANGELOGLOCK)

Following the inputs from the forum thread below, system properties were specified and customized names for the DATABASECHANGELOG & DATABASECHANGELOGLOCK tables were used and set up (on liquibase update execution).
http://forum.liquibase.org/topic/configurable-databasechangelog-table-name
Liquibase version: 3.5.1, Database: Oracle 12c, OS: Red Hat Linux
But on subsequent attempts to execute further liquibase updates (against the same database schema), the execution fails because it tries to recreate the customized DATABASECHANGELOG table again, failing with 'object name already in use'. This does not happen when using the standard liquibase control table names (i.e. DATABASECHANGELOG & DATABASECHANGELOGLOCK).
Is there an option to skip the recreation of the customized liquibase control tables, or another fix for this issue?
Why set these as system properties?
You can invoke liquibase like the following (or define these arguments in the properties file, as sketched below):
liquibase <regular arguments> --liquibaseSchemaName=YOUR_SCHEMA \
--databaseChangeLogTableName=YOUR_DBCHANGELOG \
--databaseChangeLogLockTableName=YOUR_DBCHANGELOGLOCK ....
It works fine for us (liquibase 3.5.1, Oracle 12c).
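If you go the properties-file route instead, the equivalent entries would look roughly like this (schema and table names are placeholders; per the resolution further down, keep the values upper case for Oracle):
liquibaseSchemaName=YOUR_SCHEMA
databaseChangeLogTableName=YOUR_DBCHANGELOG
databaseChangeLogLockTableName=YOUR_DBCHANGELOGLOCK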
Thanks
The issue was due to the case-sensitivity of the custom table names when executing against Oracle databases. Weird error, but it's resolved by specifying the values in upper case (via system properties/command line).
