DataStax DevCenter: export CSV with "|" delimiter instead of "," - cassandra

I'm looking for a way to change the delimiter of CSV files exported by DataStax DevCenter from "," to "|". The reason is that several list columns in my Cassandra data already contain commas, so a "," delimiter makes the export hard to parse once I open the CSV file in another application. I'm using DevCenter v1.6.0.

The general syntax of the cqlsh COPY command is:
COPY table_name ( column , ... )
TO 'file_name'
WITH option = 'value'
You can use the DELIMITER option. Example:
COPY table_name ( column , ... )
TO 'file_name'
WITH DELIMITER = '|'
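As a concrete sketch, run from cqlsh (the keyspace, table, and column names here are hypothetical):
COPY mykeyspace.mytable (id, name, tags)
TO '/tmp/mytable_export.csv'
WITH DELIMITER = '|' AND HEADER = TRUE;
Since the list values no longer contain the delimiter character, the export should parse cleanly in other applications.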
For a list of the other options, please refer to the link below.
COPY Command in Cassandra

Related

How to COPY Cassandra data from yesterday to csv

I want to copy all of the data on one Cassandra node from yesterday (midnight to midnight) into a CSV file.
Is there a nice solution for this?
There are a couple of ways to copy or export data from Cassandra. If you want a CSV file as the result, you will have to use the CQL COPY command. COPY TO exports data from a table into a CSV file:
COPY table_name [( column_list )]
TO 'file_name'[, 'file2_name', ...] | STDOUT
[WITH option = 'value' [AND ...]]
You can read more about how to use the COPY TO command HERE.
You could also use sstabledump, but you end up with a JSON file. You can read about that option HERE.
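Note that COPY TO takes no WHERE clause, so one approach for the "yesterday" requirement is a sketch like the following: export the whole table, then filter afterwards. The keyspace, table, and timestamp column names here are hypothetical:
COPY mykeyspace.events (id, event_time, payload)
TO '/tmp/events_full.csv'
WITH HEADER = TRUE;
Yesterday's rows (midnight to midnight) can then be selected from events_full.csv by comparing the event_time column in whatever tool processes the file.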
Hope that helps,
Pat

Data parsing using hive query

I am building a pipeline with Azure Data Factory. The input dataset is a CSV file with a comma column delimiter, and the output dataset is also a comma-delimited CSV file. The pipeline is designed with an HDInsight activity that runs a Hive query from a file with the .hql extension. The Hive query is as follows:
set hive.exec.dynamic.partition.mode=nonstrict;
DROP TABLE IF EXISTS Table1;
CREATE EXTERNAL TABLE Table1 (
Number string,
Name string,
Address string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/your/folder/location';
SELECT * FROM Table1;
Below is the file format
Number,Name,Address
1,xyz,No 152,Chennai
2,abc,7th street,Chennai
3,wer,Chennai,Tamil Nadu
How do I stop the column header from being parsed as data in the output dataset?
As per my understanding, your question relates to the CSV file: you are putting a CSV file at the table location, and it contains a header row. If my understanding is correct, please try the below property in your table DDL. I hope this will help you.
tblproperties ("skip.header.line.count"="1");
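For clarity, here is a sketch of the full DDL from your question with the property appended (Hive expects TBLPROPERTIES after LOCATION):
CREATE EXTERNAL TABLE Table1 (
Number string,
Name string,
Address string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/your/folder/location'
tblproperties ("skip.header.line.count"="1");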
Thanks,
Manu

PSQL to CSV with column alias leads to corrupted file

I managed to use PSQL on Windows to export a SQL query directly into a CSV file, and everything works fine as long as I don't redefine column names with aliases (using AS).
But as soon as I use a column alias, e.g.:
\copy (SELECT project AS "ID" FROM mydb.mytable WHERE project > 15 ORDER BY project) TO 'C:/FILES/user/test_SQL2CSV.csv' DELIMITER ',' CSV HEADER
I have unexpected behaviors with the CSV file.
In Excel: the CSV is corrupted and is blank
In Notepad: the data is present, but with no delimiters or spaces
(continuous, e.g. ID27282930...)
In Notepad++: the data is well organized in a column
(e.g.
ID
27
28
29
30
...
)
Is there anything to do so that the exported file can be read directly within Excel (as it happens when I don't use aliases)?
After testing various other configurations of the query, I found the issue. Apparently Excel interprets a file starting with "ID" as the legacy SYLK format instead of CSV... Renaming the column alias to e.g. "MyID" fixed the issue.
Reference here: annalear.ca/2010/06/10/why-excel-thinks-your-csv-is-a-sylk
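For example, the working form of the command, with only the alias renamed:
\copy (SELECT project AS "MyID" FROM mydb.mytable WHERE project > 15 ORDER BY project) TO 'C:/FILES/user/test_SQL2CSV.csv' DELIMITER ',' CSV HEADER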

BCP SQL Command to .csv file has formatting issue

I have a simple BCP command to query my MSSQL database and copy the result to a .csv file like this:
bcp "select fullname from [database1].[dbo].employee" queryout "c:\test\table.csv" -c -t"," -r"\n" -S servername -T
The issue comes when the fullname column is a varchar containing a comma, like "Lee, Bruce". When the result is copied to the .csv file, the portion before the comma (Lee) is placed in the first column of the Excel spreadsheet and the portion after the comma (Bruce) is placed in the second column. I would like to keep everything in the first column, comma included (Lee, Bruce). Does anyone have any idea how to achieve this?
You should set the column separator to something other than a comma. In the bcp syntax above, -t"," sets the field (column) terminator and -r"\n" sets the row terminator.
Further, you should either change the default CSV separator in the regional settings or use Excel's import wizard so the data lands in the correct columns. By the way, there are plenty of similar questions on SO.
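As a sketch, the same bcp command with a pipe as the field terminator (server name and paths kept from the question):
bcp "select fullname from [database1].[dbo].employee" queryout "c:\test\table.csv" -c -t"|" -r"\n" -S servername -T
Excel will then no longer split "Lee, Bruce" across columns, though you may need the Text Import Wizard (or a changed list separator in the regional settings) for Excel to split on "|".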

How to import data from Excel file to Teradata table using BTEQ scripts?

I was able to fill tables with data from Excel files or text files using the GUI utility Teradata SQL Assistant. But now I have a requirement to import data into Teradata tables from an Excel file using a BTEQ script. I have been trying to do that using
.IMPORT REPORT, .IMPORT DATA and .IMPORT VARTEXT, and I have tried other things as well, but to no avail. I have referred to some answers on teradataforum and googled for the same, but my script is not working. Please help me with a script that will import data from an Excel file, or at least a text file, using a BTEQ script. My script is as follows...
.LOGON XXXX/XXXXXX,XXXX
.import data FILE = D:\XX\XXXX.xls ;
.QUIET ON
.REPEAT *
USING COL1 (CHAR(1))
,COL2 (CHAR(1))
,COL3 (VARCHAR(100))
INSERT INTO DATABASE.TABLE
( COL1
,COL2
,COL3)
VALUES ( :COL1
,:COL2
,:COL3);
.QUIT
EDIT:
This is how far I have come: I have successfully loaded data from a comma-separated text file using the following code. But how do I do it with an Excel file?
.LOGON xxxx/xxxx,xxxx
.IMPORT VARTEXT ',' FILE=xxxxx.TXT;
.QUIET ON
.REPEAT *
USING
( col1 VARCHAR(2)
,col2 VARCHAR(1)
,col3 VARCHAR(60)
)
INSERT INTO database.table
( col1
,col2
,col3)
VALUES ( :col1
,:col2
,:col3);
.QUIT
A sample comma-separated text file being:
1,B,status1
2,B,status2
3,B,status3
etc.
Please help me, if possible, to load the same from an Excel file.
This is not possible, as Excel is a binary format. You have to save the file as comma-separated values (.csv) from Excel. You might also be able to come up with some convoluted solution using an Access database that links to the Teradata table and the spreadsheet.
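For example, assuming the spreadsheet is saved from Excel as D:\XX\XXXX.csv (a hypothetical path based on the question), the working VARTEXT script from your edit applies unchanged apart from the import line:
.IMPORT VARTEXT ',' FILE=D:\XX\XXXX.csv;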
