Copying Data into Cassandra table - cassandra

Can we import/copy multiple files into a Cassandra table, when the files and the table have the same column names?
COPY table1 (timestamp, temp, total_load, designl) FROM 'file1', 'file2' WITH HEADER = 'true';
I tried using the above syntax, but it fails with:
Improper COPY command.
What I mean is: suppose we have hundreds of delimited files with the same columns, and I want to load all of the files into a single Cassandra table using a single CQL query. Is this possible?
When I ran a separate COPY command for each file into the table, the data was overwritten.
Please help me!

You can specify multiple files with the following syntax:
COPY table1("timestamp", temp, total_load, designl) FROM 'file1, file2' WITH HEADER = 'true';
Or you can use wildcards:
COPY table1("timestamp", temp, total_load, designl) FROM 'folder/*.csv' WITH HEADER = 'true';
Two remarks, however:
timestamp is a type name in Cassandra; if your column has this name, you need to quote it, as I did in the examples above.
If your data is overwritten when executing several COPY commands, then it will also be overwritten if you execute a single COPY command. If there are several rows for the same PRIMARY KEY, only the last one wins.
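For illustration, a minimal sketch of that last-write-wins behaviour, using a hypothetical table definition and made-up values:
CREATE TABLE table1 ("timestamp" timestamp PRIMARY KEY, temp double, total_load double, designl text);
INSERT INTO table1 ("timestamp", temp, total_load, designl) VALUES ('2016-01-01 00:00:00', 20.5, 100.0, 'A');
INSERT INTO table1 ("timestamp", temp, total_load, designl) VALUES ('2016-01-01 00:00:00', 21.0, 110.0, 'B');
SELECT * FROM table1 now returns a single row with designl = 'B': the second insert silently replaced the first, exactly as repeated COPY runs do for duplicate keys.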

Related

Ignore extra columns in CSV when using COPY FROM in Cassandra

I have a table with two columns, id and name, and I am ingesting data from a third party that may add data to each row without my knowledge. I want to ensure the file still loads until I am ready to change my data model, ignoring any extra columns that are added to the end of each row.
E.g. If the csv file changes to id, name, email at some point, I want my COPY FROM query to continue happily loading id, name.
I've tried to implement SKIPCOLS, but as far as I can see, this only seems to work when there is a match between fields in the first place.
i.e. the following is the correct usage of COPY FROM with SKIPCOLS, but it will not help my needs, as I can't seem to reference a column that only exists in the CSV.
COPY users (id, name) FROM 'users.csv' WITH SKIPCOLS = 'name';
Is there another way to do this or a different way to use SKIPCOLS that I am missing?

DataStax rename table

I have deployed a 9-node cluster on Google Cloud.
I created a table and loaded the data. Now I want to change the table name.
Is there any way to rename a table in Cassandra?
Thanks
You can't rename a table.
You have to drop the table and create it again.
You can use ALTER TABLE to manipulate the table metadata. Do this to change the datatype of a column, add new columns, drop existing columns, and change table properties. The command returns no results.
Start the command with the keywords ALTER TABLE, followed by the table name, followed by the instruction: ALTER, ADD, DROP, RENAME, or WITH. See the following sections for the information each instruction requires.
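For illustration, a few sketches against a hypothetical users table:
ALTER TABLE users ADD email text;
ALTER TABLE users DROP email;
ALTER TABLE users WITH comment = 'registered users';
Note that none of these rename the table itself; RENAME only applies to columns that are part of the primary key.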
If you need the data, you can back it up and restore it using the COPY command in cqlsh.
To back up the data:
COPY old_table_name TO 'data.csv';
To restore the data:
COPY new_table_name FROM 'data.csv';

How to use the imp command to overwrite existing data

I am using the imp command to import a database, but when we execute the imp command a second time, the data is inserted a second time. We want to remove the old data and insert fresh data.
This is what I tried...
Please help me and suggest a specific parameter that solves this type of problem.
Thanks, and sorry for my English.
IMPDP has the parameter: TABLE_EXISTS_ACTION = {SKIP | APPEND | TRUNCATE | REPLACE}
table_exists_action=skip: This says to ignore the data in the import file and leave the existing table untouched. This is the default and it is not a valid argument if you set content=data_only.
table_exists_action=append: This says to append the export data onto the existing table, leaving the existing rows and adding the new rows from the dmp file. Of course, the number and types of the data columns must match to use the append option. Just like the append hint, Oracle will not re-use any space on the freelists, and the high-water mark for the table will be raised to accommodate the incoming rows.
table_exists_action=truncate: This says to truncate the existing table rows, leaving the table definition and replacing the rows from the expdp dmp file being imported. To use this option you must not have any referential integrity (constraints) on the target table. You use the table_exists_action=truncate when the existing table columns match the import table columns. The truncate option cannot be used over a db link or with a cluster table.
table_exists_action=replace: This says to delete the whole table and replace both the table definition and rows from the import dmp file. To use this option you must not have any referential integrity (constraints) on the target table. You use the table_exists_action=replace when the existing table columns do not match the import table columns.
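For example, a minimal sketch of a Data Pump import that truncates and reloads existing tables (the credentials, directory object, dump file, and schema name are placeholders; adjust them to your environment):
impdp scott/tiger directory=DATA_PUMP_DIR dumpfile=mydata.dmp logfile=import.log schemas=SCOTT table_exists_action=truncate
Use table_exists_action=replace instead if the column layout in the dump no longer matches the existing tables.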

Compare the data between two variables and generate a report into a text file

My requirement is to transfer data from a source database to a target database.
Job 1:
Source database: Oracle. Target files:
table1  target1.lst
table2  table2.lst
table3  table3.lst
This part I have done successfully.
Job 2:
Now I want to count the number of records in the source database and the target database.
This part is also done successfully.
Job 3 (this is the part I am stuck on):
I have kept the record counts for source and target in variables as well as in text files.
Now, how do I compare these values, whether held in a variable or in a text file (the values are obtained with select count(*) from table and wc -l $filename), so that I can tell whether the loading process completed successfully? I also want to maintain a log file.
Please show me how to compare the values in a text file or a variable, so that I can maintain a log file and generate a report in a text file.
It's not very clear where these text files came from and why you need to compare against them. Why don't you store the counts in the database in the first place (instead of, or in addition to, writing them to a file)?
text file or a variable
What variable? In Oracle PL/SQL you can compare variables using =, !=, is null, is not null, etc. In any other programming language there are comparison operators too.
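If the counts end up in shell variables or files, a minimal shell sketch along these lines could do the comparison and write the log (the file names are placeholders: source_count.txt is assumed to hold the spooled result of select count(*), and target1.lst is the extracted data file counted with wc -l):
source_count=$(cat source_count.txt)
target_count=$(wc -l < target1.lst)
if [ "$source_count" -eq "$target_count" ]; then
    echo "$(date): table1 load OK ($source_count rows)" >> load_report.log
else
    echo "$(date): table1 load FAILED (source=$source_count target=$target_count)" >> load_report.log
fi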

Query database columns using Excel/csv data

I have a case where I need to read an Excel/csv/text file containing two columns (say colA and colB) of values (around 1000 rows). I need to query the database using values in colA. The query will return an XMLType into which the respective colB value needs to be inserted. I have the XML query and the insert working but I am stuck on what approach I should take to read the data, query and update it on the fly.
I have tried using external tables but realized that I don't have access to the server filesystem to host the data file. I have also considered creating a temporary table to load the data into using SQL*Loader or something similar and running the query/update against it (see the sketch after the example below), but that would need some formal overhead to go through. I would appreciate suggestions on the approach. Examples would be greatly helpful.
e.g.
text or Excel file:
ColA,ColB
abc,123
def,456
ghi,789
XMLTypeVal e.g.
<node1><node2><node3><colA></colA><colB></colB></node3></node2></node1>
UPDATE TableA
SET XMLTypeVal = INSERTCHILDXML(XMLTypeVal,
    '/node1/node2/node3', 'colBval',
    XMLType('<colBval>123</colBval>'))
WHERE EXTRACTVALUE(TableA.XMLTypeVal, '/node1/node2/node3/ColA') = 'colAval';
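A minimal sketch of the staging-table approach mentioned above (table, file, and credential names are placeholders; SQL*Loader runs on the client, so no access to the server filesystem is needed):
Staging table:
CREATE TABLE stage_vals (colA VARCHAR2(100), colB VARCHAR2(100));
SQL*Loader control file (load_stage.ctl):
LOAD DATA
INFILE 'data.csv'
INTO TABLE stage_vals
FIELDS TERMINATED BY ','
(colA, colB)
Client-side load, skipping the header row:
sqlldr userid=scott/tiger control=load_stage.ctl skip=1
The UPDATE above can then take each colB value from stage_vals by matching on the colA extracted from the XML, instead of hard-coding '123' and 'colAval'.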
