ResponseError : Expected 4 or 0 byte int - node.js

I am trying the Cassandra Node.js driver and am stuck on a problem while inserting a record; it looks like the driver is not able to insert float values.
Problem: when passing an int value for insertion into the database, the API gives the following error:
Debug: hapi, internal, implementation, error
ResponseError: Expected 4 or 0 byte int (8)
at FrameReader.readError (/home/gaurav/Gaurav-Drive/code/nodejsWorkspace/cassandraTest/node_modules/cassandra-driver/lib/readers.js:291:13)
at Parser.parseError (/home/gaurav/Gaurav-Drive/code/nodejsWorkspace/cassandraTest/node_modules/cassandra-driver/lib/streams.js:185:45)
at Parser.parseBody (/home/gaurav/Gaurav-Drive/code/nodejsWorkspace/cassandraTest/node_modules/cassandra-driver/lib/streams.js:167:19)
at Parser._transform (/home/gaurav/Gaurav-Drive/code/nodejsWorkspace/cassandraTest/node_modules/cassandra-driver/lib/streams.js:101:10)
at Parser.Transform._read (_stream_transform.js:179:10)
at Parser.Transform._write (_stream_transform.js:167:12)
at doWrite (_stream_writable.js:225:10)
at writeOrBuffer (_stream_writable.js:215:5)
at Parser.Writable.write (_stream_writable.js:182:11)
at write (_stream_readable.js:601:24)
I am trying to execute the following query from code:
INSERT INTO ragchews.user
(uid ,iid ,jid ,jpass ,rateCount ,numOfratedUser ,hndl ,interests ,locX ,locY ,city )
VALUES
('uid_1',{'iid1'},'jid_1','pass_1',25, 10, {'NEX1231'}, {'MUSIC'}, 21.321, 43.235, 'delhi');
The parameter passed to execute() is:
var params = [uid, iid, jid, jpass, rateCount, numOfratedUser, hndl, interest, locx, locy, city];
where
var locx = 32.09;
var locy = 54.90;
and the call to execute looks like:
var addUserQuery = 'INSERT INTO ragchews.user (uid ,iid ,jid ,jpass ,rateCount ,numOfratedUser ,hndl ,interests ,locX ,locY ,city) VALUES (?,?,?,?,?,?,?,?,?,?,?);';
var addUser = function(user, cb){
  console.log(user);
  client.execute(addUserQuery, user, function(err, result){
    if(err){
      throw err;
    }
    cb(result);
  });
};
CREATE TABLE ragchews.user(
uid varchar,
iid set<varchar>,
jid varchar,
jpass varchar,
rateCount int,
numOfratedUser int,
hndl set<varchar>,
interests set<varchar>,
locX float,
locY float,
city varchar,
favorite map<varchar, varchar>,
PRIMARY KEY(uid)
);
P.S.
Some observations while trying to understand the issue:
Since the problem seemed to be with float, I changed the type of locX and locY from float to int and re-ran the code. The same error persisted, so the problem is not associated specifically with the float CQL type.
Next, I removed all numeric values from the INSERT query and attempted to insert only the non-numeric ones. This attempt successfully wrote the row to the database, so it now looks like the problem is associated with numeric types.

The following is taken as-is from the Cassandra Node.js driver data type documentation:
When encoding data, on a normal execute with parameters, the driver tries to guess the target type based on the input type. Values of type Number will be encoded as double (as Number is double / IEEE 754 value).
Consider the following example:
var key = 1000;
client.execute('SELECT * FROM table1 where key = ?', [key], callback);
If the key column is of type int, the execution fails. There are two possible ways to avoid this type of problem:
Prepare the data (recommended) - prepare the query before execution
client.execute('SELECT * FROM table1 where key = ?', [key], { prepare : true }, callback);
Hinting the target types - hint that the first parameter is an integer:
client.execute('SELECT * FROM table1 where key = ?', [key], { hints : ['int'] }, callback);
If you are dealing with batch updates, this issue may be of interest to you.
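Applied to the addUser call above, that would look roughly like the sketch below. This is illustrative only: the prepare variant is the one the documentation recommends, and the hint strings follow the documented type-name format but are assumptions here.
// Option 1 (recommended): prepare the statement so the driver learns the exact
// column types (int for rateCount/numOfratedUser, float for locX/locY).
client.execute(addUserQuery, params, { prepare: true }, function(err, result){
  if(err){
    throw err;
  }
  cb(result);
});

// Option 2: hint the target type of every parameter explicitly.
var hints = ['text', 'set<text>', 'text', 'text', 'int', 'int',
             'set<text>', 'set<text>', 'float', 'float', 'text'];
client.execute(addUserQuery, params, { hints: hints }, function(err, result){
  if(err){
    throw err;
  }
  cb(result);
});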

Related

Why does this happen when inserting thousands of rows into an Oracle table with Node.js?

I am new to Oracle and I am looking for the best way to insert thousands (maybe millions) of records into a table.
I have seen other questions and answers about this situation, but in that answer the PL/SQL code uses TWO associative arrays of scalar types (PLS_INTEGER) that work as table columns; I need the same but with ONE nested table of a record/complex type, to insert into the table as rows.
First of all, I have this code in Node.js (TypeScript) using the oracledb package (v 5.1.0):
let data: Array<DataModel>;
// the data variable is populated with data and 'DataModel' is an interface,
// data is an array with the exact table's structure:
// [
//   { C_ONE: 'mike', C_TWO: 'hugman', C_THREE: '34', ... with other 12 columns },
//   { C_ONE: 'robert', C_TWO: 'zuck', C_THREE: '34', ... with other 12 columns },
//   { C_ONE: 'john', C_TWO: 'gates', C_THREE: '34', ... with other 12 columns }
// ]
let context;
try {
  context = await oracledb.getConnection({
    user: 'admin',
    password: 'admin',
    connectString: 'blabla'
  });
  const result = await context.execute(
    // My SP
    'BEGIN PACKAGE_TEST.SP_TEST_STRESS(:p_data, :p_status); END;',
    {
      // My JSON array
      p_data: {
        type: 'PACKAGE_TEST.T_STRESS',
        val: data
      },
      // Variable to check whether everything succeeded or failed... this doesn't matter :)
      p_status: {
        type: oracledb.NUMBER,
        val: 1,
        dir: oracledb.BIND_OUT
      }
    },
    { autoCommit: true }
  );
  console.log(result);
  if ((result.outBinds as { p_status: number }).p_status === 0) {
    // Correct
  } else {
    // Failed
  }
} catch (error) {
  // bla bla for errors
} finally {
  if (context) {
    try {
      await context.close();
    } catch (error) {
      // bla bla for errors
    }
  }
}
And the PL/SQL code for my stored procedure:
CREATE OR REPLACE PACKAGE PACKAGE_TEST
IS
TYPE R_STRESS IS RECORD
(
C_ONE VARCHAR(50),
C_TWO VARCHAR(500),
C_THREE VARCHAR(10),
C_FOUR VARCHAR(100),
C_FIVE VARCHAR(10),
C_SIX VARCHAR(100),
C_SEVEN VARCHAR(50),
C_EIGHT VARCHAR(50),
C_NINE VARCHAR(50),
C_TEN VARCHAR(50),
C_ELEVEN VARCHAR(50),
C_TWELVE VARCHAR(50),
C_THIRTEEN VARCHAR(300),
C_FOURTEEN VARCHAR(100),
C_FIVETEEN VARCHAR(300),
C_SIXTEEN VARCHAR(50)
);
TYPE T_STRESS IS VARRAY(213627) OF R_STRESS;
PROCEDURE SP_TEST_STRESS
(
P_DATA_FOR_PROCESS T_STRESS,
P_STATUS OUT NUMBER
);
END;
/
CREATE OR REPLACE PACKAGE BODY PACKAGE_TEST
IS
PROCEDURE SP_TEST_STRESS
(
P_DATA_FOR_PROCESS T_STRESS,
P_STATUS OUT NUMBER
)
IS
BEGIN
DBMS_OUTPUT.put_line('started');
BEGIN
FORALL i IN 1 .. P_DATA_FOR_PROCESS.COUNT
INSERT INTO TEST_STRESS
(
C_ONE,
C_TWO,
C_THREE,
C_FOUR,
C_FIVE,
C_SIX,
C_SEVEN,
C_EIGHT,
C_NINE,
C_TEN,
C_ELEVEN,
C_TWELVE,
C_THIRTEEN,
C_FOURTEEN,
C_FIVETEEN,
C_SIXTEEN
)
VALUES
(
P_DATA_FOR_PROCESS(i).C_ONE,
P_DATA_FOR_PROCESS(i).C_TWO,
P_DATA_FOR_PROCESS(i).C_THREE,
P_DATA_FOR_PROCESS(i).C_FOUR,
P_DATA_FOR_PROCESS(i).C_FIVE,
P_DATA_FOR_PROCESS(i).C_SIX,
P_DATA_FOR_PROCESS(i).C_SEVEN,
P_DATA_FOR_PROCESS(i).C_EIGHT,
P_DATA_FOR_PROCESS(i).C_NINE,
P_DATA_FOR_PROCESS(i).C_TEN,
P_DATA_FOR_PROCESS(i).C_ELEVEN,
P_DATA_FOR_PROCESS(i).C_TWELVE,
P_DATA_FOR_PROCESS(i).C_THIRTEEN,
P_DATA_FOR_PROCESS(i).C_FOURTEEN,
P_DATA_FOR_PROCESS(i).C_FIVETEEN,
P_DATA_FOR_PROCESS(i).C_SIXTEEN
);
EXCEPTION
WHEN OTHERS THEN
p_status := 1;
END;
P_STATUS := 0;
END;
END;
And my target table:
CREATE TABLE TEST_STRESS
(
C_ONE VARCHAR(50),
C_TWO VARCHAR(500),
C_THREE VARCHAR(10),
C_FOUR VARCHAR(100),
C_FIVE VARCHAR(10),
C_SIX VARCHAR(100),
C_SEVEN VARCHAR(50),
C_EIGHT VARCHAR(50),
C_NINE VARCHAR(50),
C_TEN VARCHAR(50),
C_ELEVEN VARCHAR(50),
C_TWELVE VARCHAR(50),
C_THIRTEEN VARCHAR(300),
C_FOURTEEN VARCHAR(100),
C_FIVETEEN VARCHAR(300),
C_SIXTEEN VARCHAR(50)
);
An interesting behavior happens with this scenario:
If I send my JSON array with 200 rows, this works perfectly; I don't know the exact time it takes to complete, but I can tell it's milliseconds.
If I send my JSON array with 200,000 rows, it takes three or four minutes before the promise resolves, and then it throws an exception: ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
This happens when passing the JSON array to the procedure parameter; it seems that processing it costs too much.
Why does this happen in the second scenario?
Is there a limitation on the number of rows in nested table types, or is there some (default) configuration in Node.js?
Oracle suggests increasing pga_aggregate_limit, but checking it in SQL Developer with "show parameter pga;" shows 3G. Does that mean the information I am sending exceeds 3 GB? Is that normal?
Is there a more viable solution that does not affect the performance of the database?
Appreciate your help.
Each server process gets its own PGA, so I'm guessing this is causing the total aggregate PGA, over all the processes currently running, to go over 3 GB.
I assume this is happening because of what's going on inside your package, but you only show the specification, so there's no way to tell what's happening there.
You're not using a nested table type. You're using a varray. A varray has a maximum length of 2,147,483,647.
It sounds like you're doing something inside your procedure to use too much memory. Maybe you need to process the 200,000 rows in chunks? With no more information about what you're trying to do, can you use some other process to load your data, like sqlldr?
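For what it's worth, one way to chunk the load from the Node.js side is to bypass the PL/SQL collection bind entirely and insert directly in batches with oracledb's executeMany(). The sketch below is a minimal illustration under those assumptions: only three of the sixteen columns are shown, the batch size is arbitrary, and insertInChunks is a hypothetical helper, not part of the original code.
const oracledb = require('oracledb');

// Hypothetical helper: insert `rows` (array of objects shaped like TEST_STRESS)
// in fixed-size batches so no single call has to bind 200,000 records at once.
async function insertInChunks(connection, rows, chunkSize = 10000) {
  const sql = 'INSERT INTO TEST_STRESS (C_ONE, C_TWO, C_THREE) VALUES (:C_ONE, :C_TWO, :C_THREE)';
  const options = {
    autoCommit: false,
    // bindDefs declares the type and size of each bind up front.
    bindDefs: {
      C_ONE:   { type: oracledb.STRING, maxSize: 50 },
      C_TWO:   { type: oracledb.STRING, maxSize: 500 },
      C_THREE: { type: oracledb.STRING, maxSize: 10 }
    }
  };
  for (let i = 0; i < rows.length; i += chunkSize) {
    const batch = rows.slice(i, i + chunkSize);
    await connection.executeMany(sql, batch, options);
  }
  await connection.commit();
}
This keeps the memory held per call proportional to the chunk size rather than to the full 200,000-row payload.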

SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t not working

In a Cosmos DB stored procedure, I'm using an inline SQL query to try to retrieve the count of distinct user ids.
I'm using the SQL API for my account. I've run the below query in Query Explorer in my Cosmos DB account and I know that I should get a count of 10 (There are 10 unique user ids in my collection):
SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t
However when I run this in the Stored Procedure portal, I either get 0 records back or 18 records back (total number of documents). The code for my Stored Procedure is as follows:
function GetDistinctCount() {
  var collection = getContext().getCollection();
  var isAccepted = collection.queryDocuments(
    collection.getSelfLink(),
    'SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t',
    function(err, feed, options) {
      if (err) throw err;
      if (!feed || !feed.length) {
        var response = getContext().getResponse();
        var body = {code: 404, body: "no docs found"};
        response.setBody(JSON.stringify(body));
      } else {
        var response = getContext().getResponse();
        var body = {code: 200, body: feed[0]};
        response.setBody(JSON.stringify(body));
      }
    }
  );
}
After looking at various feedback forums and documentation, I don't think there's an elegant solution for me to do this as simply as it would be in normal SQL.
The UserId is my partition key, which I'm passing through in my C# code and when I test in the portal, so there are no additional parameters that I need to set when calling the stored procedure. I'm calling this stored procedure via C#, and adding any further parameters will affect my tests for that code, so I'm keen not to introduce any parameters if I can.
Your problem is caused by the fact that you are not setting a partition key for your stored procedure execution.
Please see the statements in the official documentation.
So, when you execute a stored procedure against a partitioned collection, you need to pass the partition key parameter. It's necessary! (This case also explains it: Documentdb stored proc cross partition query.)
Back to your question: you never pass any partition key, which is equivalent to passing a null or "" value for the partition key, so it outputs no data because you don't have any UserId equal to null or "".
My advice:
You could use the normal query SDK to execute your SQL and set enableCrossPartitionQuery: true, which allows you to scan the entire collection without setting a partition key. Please refer to this small sample: Can't get simple CosmosDB query to work via Node.js - but works fine via Azure's Query Explorer.
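For illustration, a rough sketch of that approach with the Node.js documentdb SDK follows; the endpoint, key and collection link are placeholders, and the query string is the same one used above.
var DocumentClient = require('documentdb').DocumentClient;

// Placeholders: substitute your own account endpoint, key and collection link.
var client = new DocumentClient('https://<account>.documents.azure.com:443/', {
  masterKey: '<key>'
});
var collectionLink = 'dbs/<db>/colls/<collection>';

client.queryDocuments(
  collectionLink,
  'SELECT VALUE COUNT(1) FROM (SELECT DISTINCT c.UserId FROM root c) AS t',
  { enableCrossPartitionQuery: true }
).toArray(function (err, results) {
  if (err) throw err;
  console.log('distinct user count:', results[0]);
});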
So I found a solution that returns the result I need. My stored procedure now looks like this:
function GetPaymentCount() {
  var collection = getContext().getCollection();
  var isAccepted = collection.queryDocuments(
    collection.getSelfLink(),
    'SELECT DISTINCT VALUE(doc.UserId) from root doc',
    { pageSize: -1 },
    function(err, feed, options) {
      if (err) throw err;
      if (!feed || !feed.length) {
        var response = getContext().getResponse();
        var body = {code: 404, body: "no docs found"};
        response.setBody(JSON.stringify(body));
      } else {
        var response = getContext().getResponse();
        var body = {code: 200, body: JSON.stringify(feed.length)};
        response.setBody(JSON.stringify(body));
      }
    }
  );
}
Essentially, I changed the pageSize parameter to -1 which returned all the documents I knew would be returned in the result. I have a feeling that this will be more expensive in terms of RU/s cost, but it solves my case for now.
If anyone has more efficient alternatives, please comment and let me know.

PostgreSQL COPY TO STDOUT takes a lot of time for a large amount of data

I have 4 tables with almost 770K rows in each table, and I am using the COPY ... TO STDOUT command for data backup. I am using the pg module for the database connection. The following code shows the database client connection:
var client = new pg.Client({user: 'dbuser', database: 'dbname'});
client.connect();
The following is the code for the backup:
var stream1 = client.copyTo("COPY table_name TO STDOUT WITH CSV");
stream1.on('data', function (chunk) {
  rows = Buffer.concat([rows, chunk]);
});
stream1.on('end', function () {
  myStream._write(rows, 'hex', function(){
    rows = new Buffer(0);
    return cb(null, 1);
  });
});
stream1.on('error', function (error) {
  debug('Error: ' + error);
  return cb(error, null);
});
myStream is an object of a class that inherits from stream.Writable. myStream._write() concatenates the chunks in a buffer, and at the end the data in the buffer is stored in a file.
It works fine for a small amount of data, but it takes a lot of time for large data.
I am using PostgreSQL 9.3.4 and NodeJS v0.10.33
The create table statement is:
CREATE TABLE table_name
(
id serial NOT NULL,
date_time timestamp without time zone,
dev_id integer,
type integer,
value1 double precision,
value2 double precision,
value3 double precision,
value4 double precision,
message character varying(255),
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL
)
Here is the execution plan:
dbname=# explain (analyze, verbose, buffers) select * from table_name;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.table_name (cost=0.00..18022.19 rows=769819 width=105) (actual time=0.047..324.202 rows=769819 loops=1)
Output: id, date_time, dev_id, type, value1, value2, value3, value4, message, created_at, updated_at
Buffers: shared hit=10324
Total runtime: 364.909 ms
(4 rows)
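The plan shows the sequential scan itself completes in roughly 365 ms, so much of the elapsed time is likely spent on the Node side: Buffer.concat on every 'data' event re-copies the entire accumulated buffer, which degrades badly as the output grows. Below is a minimal sketch of streaming the COPY output straight to a file instead of accumulating it in memory, assuming stream1 behaves as a standard readable stream (as the event handlers above suggest) and that the target filename is a placeholder.
var fs = require('fs');

// Pipe the COPY stream directly to disk; backpressure is handled by pipe()
// and no full copy of the table is ever held in memory.
var stream1 = client.copyTo("COPY table_name TO STDOUT WITH CSV");
var out = fs.createWriteStream('table_name.csv');

stream1.pipe(out);

out.on('finish', function () {
  return cb(null, 1);
});
stream1.on('error', function (error) {
  debug('Error: ' + error);
  return cb(error, null);
});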

Inserting timestamp into Cassandra

I have a table created as follows:
CREATE TABLE my_table (
date text,
id text,
time timestamp,
value text,
PRIMARY KEY (id));
CREATE INDEX recordings_date_ci ON recordings (date);
I try to simply add a new row to the table using the following Node.js code:
const cassandra = require('cassandra-driver');
const client = new cassandra.Client({ contactPoints: ['localhost'], keyspace: 'my_keyspace'});
const query = 'INSERT INTO my_table (date, id, time, url) VALUES (?, ?, ?, ?)';
client.execute(query, ['20160901', '0000000000', '2016-09-01 00:00:00+0000', 'random url'], function(err, result) {
  if (err){
    console.log(err);
  }
  console.log('Insert row ended:' + result);
});
However, I get the following error:
'Error: Expected 8 or 0 byte long for date (24)
When I change the timestamp to epoch time:
client.execute(query, ['20160901', '0000000000', 1472688000, 'random url']
I get:
OverflowError: normalized days too large to fit in a C int
I'm able to insert new rows via cqlsh so I'm probably missing something with the node.js driver.
Any idea?
Thanks
Where you have a string 2016-09-01 00:00:00+0000, instead use new Date('2016-09-01 00:00:00+0000').
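Applied to the insert above, a corrected call would look roughly like this sketch; the query string is the same as before, and prepare: true is an extra assumption that also lets the driver pick the right CQL types for the other columns.
client.execute(
  query,
  ['20160901', '0000000000', new Date('2016-09-01 00:00:00+0000'), 'random url'],
  { prepare: true },
  function(err, result) {
    if (err){
      return console.log(err);
    }
    console.log('Insert row ended:' + result);
  }
);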

Cassandra - NodeJS - Issue while retrieving map type values

I am using helenus in my Node.js project to get/set values in Cassandra. I have a map-type field in my table, but when I retrieve the value from the table, I get an empty key-value set.
Below is the schema for my table
CREATE TABLE datapoints (
id uuid PRIMARY KEY,
created_at timestamp,
properties map<text,text>
);
I inserted the values from CQL using the query below:
INSERT INTO datapoints (id, properties) VALUES (24053e20-63e9-11e3-8d81-0002a5d5c51b, { 'fruit' : 'apple', 'band' : 'Beatles' });
Below is my Node.js code:
var helenus = require('/usr/local/lib/node_modules/helenus');
var pool = new helenus.ConnectionPool({
  hosts    : ['localhost:9160'],
  keyspace : 'mykeyspace',
  timeout  : 3000
});

pool.connect(function(err, keyspace){
  if(err){
    console.log("connect me error");
    throw(err);
  } else {
    pool.cql("SELECT * FROM datapoints", [], function(err, results){
      console.log("results", results);
      results.forEach(function(row){
        var props = row.get('properties').value;
        var id = row.get('id').value;
        console.log("properties", props);
        console.log("id", id);
      });
    });
  }
});
The line console.log("properties", props); prints a function, and when I call that function, I get an empty key-value set. Please help.
It seems there was an issue with the deserialization of the collection types. The pull request that was made in the past broke the deserialization. I just pushed a fix to version 0.6.8 that should take care of this.
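With that fix in place (helenus >= 0.6.8), the same accessor should return the map as a plain key/value object rather than a function. Roughly what one would expect for the row inserted above (an illustration only, not verified output):
var props = row.get('properties').value;
console.log("properties", props);
// expected to log something like: { fruit: 'apple', band: 'Beatles' }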
