Cassandra database import issue for timeuuid - node.js

I have installed Cassandra 2.2.12 locally on my Windows machine. I exported the database from the live server into a '.sql' file using the 'RazorSQL' GUI tool. I don't have server access for live, only database access. When I try to import the '.sql' file into my local Cassandra setup using 'RazorSQL', it gives me the error (Invalid STRING constant '8ca25030-89ab-11e7-addb-70a0656e5127' for "id" of type timeuuid).
I also tried using the COPY FROM command, and it returns the same error. Please find the attached screenshot for more detail on the error.
Could anybody please help?

You should not put any quotes around the value, because then it gets interpreted as a string instead of a UUID, hence the error message.
See also: Inserting a hard-coded UUID via CQLsh (Cassandra)

I think you have two solutions (see the sketch after this list):
Edit your export file and remove the single quotes from the INSERT statements.
Rerun the export and export the data as CSV, then run the COPY command in cqlsh. In this case, the CSV file will not have quotes.
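For illustration, here is a minimal sketch of the difference using the Python cassandra-driver (the keyspace, table, and column names are hypothetical; the question mentions node.js, but the CQL behaviour is the same from any client):

from cassandra.cluster import Cluster

session = Cluster(['127.0.0.1']).connect('my_keyspace')

# Fails: the quoted value is a STRING constant, so Cassandra rejects it for a timeuuid column
# session.execute("INSERT INTO my_table (id) VALUES ('8ca25030-89ab-11e7-addb-70a0656e5127')")

# Works: the bare, unquoted UUID is parsed as a timeuuid literal
session.execute("INSERT INTO my_table (id) VALUES (8ca25030-89ab-11e7-addb-70a0656e5127)")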

Related

Azure Databricks - Can not create the managed table The associated location already exists

I have the following problem in Azure Databricks. Sometimes when I try to save a DataFrame as a managed table:
SomeData_df.write.mode('overwrite').saveAsTable("SomeData")
I get the following error:
"Can not create the managed table('SomeData'). The associated
location('dbfs:/user/hive/warehouse/somedata') already exists.;"
I used to fix this problem by running a %fs rm command to remove that location, but now I'm using a cluster that is managed by a different user and I can no longer run rm on that location.
For now the only fix I can think of is using a different table name.
What makes things even more peculiar is the fact that the table does not exist. When I run:
%sql
SELECT * FROM SomeData
I get the error:
Error in SQL statement: AnalysisException: Table or view not found:
SomeData;
How can I fix it?
Seems there are a few others with the same issue.
A temporary workaround is to use
dbutils.fs.rm("dbfs:/user/hive/warehouse/SomeData/", true)
to remove the table before re-creating it.
This generally happens when a cluster is shut down while writing a table. The recommended solution from the Databricks documentation:
This flag deletes the _STARTED directory and returns the process to the original state. For example, you can set it in the notebook
%py
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation","true")
All of the other recommended solutions here are either workarounds or do not work. The mode is specified as overwrite, meaning you should not need to delete or remove the db or use legacy options.
Instead, try specifying the fully qualified path in the options when writing the table:
df.write \
.option("path", "hdfs://cluster_name/path/to/my_db") \
.mode("overwrite") \
.saveAsTable("my_db.my_table")
For a more context-free answer, run this in your notebook:
dbutils.fs.rm("dbfs:/user/hive/warehouse/SomeData", recurse=True)
Per Databricks's documentation, this will work in a Python or Scala notebook, but you'll have to use the magic command %python at the beginning of the cell if you're using an R or SQL notebook.
I have the same issue; I am using
create table if not exists USING delta
If I first delete the files like suggested, it gets created once, but the second time the problem repeats. It seems CREATE TABLE IF NOT EXISTS does not recognize the table and tries to create it anyway.
I don't want to delete the table every time; I'm actually trying to use MERGE and keep the table.
Well, this happens because you're trying to write data to the default location (without specifying the 'path' option) with the mode 'overwrite'.
As Mike said, you can set "spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation" to "true", but this option was removed in Spark 3.0.0.
If you try to set this option in Spark 3.0.0 you will get the following exception:
Caused by: org.apache.spark.sql.AnalysisException: The SQL config 'spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation' was removed in the version 3.0.0. It was removed to prevent loosing of users data for non-default value.;
To avoid this problem you can explicitly specify the path where you're going to save with the 'overwrite' mode.
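For example, a minimal sketch of that approach with the DataFrame from the question (the DBFS path here is hypothetical; pick a location you control):

SomeData_df.write \
    .mode("overwrite") \
    .option("path", "dbfs:/mnt/my_data/somedata") \
    .saveAsTable("SomeData")

With an explicit path the table is created as an external (unmanaged) table at that location, so the overwrite no longer collides with a leftover directory under the default Hive warehouse.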

SSIS package works from SSMS but not from agent job

I have an SSIS package that loads an Excel file from a network drive. It's designed to load the content and then move the file to an archive folder.
Everything works fine when the following SQL statement runs in an SSMS window.
However, when it's copied into a SQL Agent job and executed from there, the file is neither loaded nor moved, yet the agent log shows "successful".
The same thing also happens with an "SSIS job" step instead of a T-SQL step, even with a proxy using a Windows account (the same account as the SSMS login).
Declare @execution_id bigint
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'SG_Excel.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'ETL', @project_name=N'Report', @use32bitruntime=True, @reference_id=Null
Select @execution_id
DECLARE @var0 smallint = 1
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var0
EXEC [SSISDB].[catalog].[start_execution] @execution_id
GO
P.S. At first a relative path to the network drive was used, then I switched to an absolute path (\\server\folder). That did not solve the issue.
SSIS package jobs run under the context of the SQL Server Agent. What account is set up to run the SQL Server Agent on the SQL Server? It may need to run as a domain account that has access to the network share.
Or you can copy the Excel file to local folder on the SQL Server, so the Package can access the file there.
Personally I avoid the File System Task; I have found it unreliable. I would replace it with a Script Task and use .NET methods from the System.IO namespace, e.g. File.Move. These are far more reliable and have mature error handling.
Here's a starting point for the System.IO namespace:
https://msdn.microsoft.com/en-us/library/ms404278.aspx
Be sure to select the relevant .NET version using the Other Versions link.
When I have seen things like this in the past, it's been that my package wasn't accessing the path I thought it was at run time; it was looking somewhere else, finding an empty folder, and exiting with success.
SSIS can have a nasty habit of reverting to variable defaults. It may be looking at a different path you used in dev. Maybe hard-code all path values as a test, or put in breakpoints and double-check the run-time values of all variables and parameters.
Other long shots may be:
Name resolution: are you sure the network name is resolving correctly at runtime?
32/64-bit issues: dev tends to run 32-bit, live may be 64-bit, which may interfere with file paths. Maybe force 32-bit at run time?
The issue is that the SQL statements are missing the statement terminator (;), which is causing the problem.
Declare @execution_id bigint ;
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'SG_Excel.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'ETL', @project_name=N'Report', @use32bitruntime=True, @reference_id=Null ;
Select @execution_id ;
DECLARE @var0 smallint = 1 ;
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=@var0 ;
EXEC [SSISDB].[catalog].[start_execution] @execution_id ;
GO
I have faced a similar issue in Service Broker.

MemSQL support for MySQL style user variables in the load data command

Does MemSQL support user variables in the load data command, similar to MySQL (see MySQL load NULL values from CSV data for examples)? The MemSQL documentation (https://docs.memsql.com/docs/load-data) doesn't give a clue, and my attempts at using user variables have failed.
No, variables in LOAD DATA are not currently supported in general (as of MemSQL 5.5). This is a feature we are tracking for a future release.
We only support the following syntax to skip the contents of a column in the file using a dummy variable (briefly mentioned in the docs https://docs.memsql.com/docs/load-data):
load data infile 'foo.tsv' into table foo (bar, @, @, baz);
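If you are scripting the load, a minimal sketch of running that statement from Python over MemSQL's MySQL-compatible protocol (host, credentials, database, and file path are hypothetical, and pymysql is just one possible client):

import pymysql

# 'foo.tsv' must be readable by the MemSQL node, since LOAD DATA INFILE (without LOCAL) reads server-side
conn = pymysql.connect(host='memsql-host', user='root', password='', database='mydb')
with conn.cursor() as cur:
    # Each bare '@' discards the corresponding column of the file; no per-row variable logic is applied
    cur.execute("load data infile 'foo.tsv' into table foo (bar, @, @, baz)")
conn.commit()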

How to get Cassandra database dump with data

I need to get a dump (with data) from a remote Cassandra database. I was able to get the database schema via the following command. How can I get all the data in the keyspace?
I'm using Cassandra 1.1.9
echo -e "connect localhost/9260;\r\n use PWC_Keyspace;\r\n show schema;\n" | bin/cassandra-cli -h localhost -port 9260 > dilshan.cdl
With Cassandra 1.1.9, I don't believe you have access to cqlsh with the COPY TO command, so you'll be stuck with two options:
1) Export the data from the data files (sstables) on disk using sstable2json, or
2) Write a program to iterate over every row and copy/serialize it to a format you find easier to work with (see the sketch at the end of this answer).
You MAY be able to use a more recent cqlsh (say, from 2.0, which still used Thrift instead of the native interface), point it at your 1.1.9 server, and use 'COPY TO' to export each table to a CSV. However, the COPY command in cqlsh for 2.0 doesn't use paging, and Cassandra 1.1.9 doesn't support paging, so there's a very good chance it will simply time out and fail.
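For option 2, a rough sketch using the old Thrift-based pycassa client (contemporary with the 1.1.x line, now unmaintained) might look like this; the keyspace and port come from the question, while the column family name and output format are hypothetical:

import json
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

pool = ConnectionPool('PWC_Keyspace', ['localhost:9260'])
cf = ColumnFamily(pool, 'my_column_family')

# Stream every row and serialize it; get_range() iterates over the whole column family
with open('my_column_family_dump.json', 'w') as out:
    for key, columns in cf.get_range():
        # Assumes keys and values are text; binary data would need a different encoding
        out.write(json.dumps({'key': key, 'columns': columns}) + '\n')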

int object has no attribute replace when trying to run a CQL command in cassandra

I have a counter column family in Cassandra. When I try to view the data from CQL I get an error, even though there is data in the column family.
SELECT * from userstats;
Generates the following error:
'int' object has no attribute 'replace'
I can confirm that the data is in the column family and is working properly, since I can view the data with the DataStax OpsCenter data explorer.
It sounds like you're using an older version of cqlsh. Upgrading it (just copying the bin/cqlsh file from the Cassandra 1.1 branch head, along with everything under the pylib directory, into place) ought to solve this.
If it doesn't, running cqlsh with --debug would help a lot in diagnosing the problem.
