SQL INSERT into TiDB INT field with '' got error 'Data Truncated' - tidb

I created a table in TiDB with an INT field.
When inserting the value '' into this field, I get the error 'Data Truncated'.
My code looks like this:
CREATE TABLE test(
i1 INT(11),
s1 VARCHAR(16)
);
INSERT INTO test(i1,s1) VALUES ('11','aa');   -- ok
INSERT INTO test(i1,s1) VALUES ('','aa');     -- Error 'Data Truncated'
INSERT INTO test(i1,s1) VALUES (NULL,'aa');   -- ok
While in MySQL 5.7, the following SQL returns OK:
INSERT INTO test(i1,s1) VALUES ('','aa');
My TiDB version is:
Release Version: v1.0.6-1-g17c1319
Git Commit Hash: 17c13192136c1f0bf26db6dec994b9f1b43c90f0
Git Branch: release-1.0
UTC Build Time: 2018-01-09 09:07:08
https://github.com/pingcap/tidb/issues/6317

In the case you present, TiDB behaves the same as MySQL: the error is caused by the strict SQL mode (MySQL 5.7 also rejects this insert when strict mode is enabled). As a workaround, you can relax the SQL mode for the session:
SET @@sql_mode='';
INSERT INTO test(i1,s1) VALUES ('','aa');
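For completeness, a session-level sketch (the exact mode string to restore afterwards depends on your server's default configuration):

SELECT @@sql_mode;                           -- inspect the current mode
SET @@sql_mode = '';                         -- relax strict mode for this session only
INSERT INTO test(i1,s1) VALUES ('','aa');    -- should now insert 0 with a truncation warning
SET @@sql_mode = 'STRICT_TRANS_TABLES';      -- re-enable strict mode afterwards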

Related

Inserting Timestamp Into Snowflake Using Python 3.8

I have an empty table defined in Snowflake as:
CREATE OR REPLACE TABLE db1.schema1.table(
ACCOUNT_ID NUMBER NOT NULL PRIMARY KEY,
PREDICTED_PROBABILITY FLOAT,
TIME_PREDICTED TIMESTAMP
);
And it creates the correct table, which has been verified using the DESC command in SQL. Then, using the Snowflake Python connector, we are trying to execute the following query:
insert_query = f'INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) VALUES ({accountId}, {risk_score},{ct});'
ctx.cursor().execute(insert_query)
Just before this query the variables are defined. The main challenge is getting the current timestamp written into Snowflake. The value of ct is defined as:
import datetime
ct = datetime.datetime.now()
print(ct)
2021-04-30 21:54:41.676406
But when we try to execute this INSERT query we get the following error message:
ProgrammingError: 001003 (42000): SQL compilation error:
syntax error line 1 at position 157 unexpected '21'.
Can I kindly get some help on how to format the datetime value here? Help is appreciated.
In addition to the answer @Lukasz provided, you could also think about defining current_timestamp() as the default for the TIME_PREDICTED column:
CREATE OR REPLACE TABLE db1.schema1.table(
ACCOUNT_ID NUMBER NOT NULL PRIMARY KEY,
PREDICTED_PROBABILITY FLOAT,
TIME_PREDICTED TIMESTAMP DEFAULT current_timestamp
);
And then just insert ACCOUNT_ID and PREDICTED_PROBABILITY:
insert_query = f'INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY) VALUES ({accountId}, {risk_score});'
ctx.cursor().execute(insert_query)
It will automatically assign the insert time to TIME_PREDICTED
Educated guess. When performing insert with:
insert_query = f'INSERT INTO ...(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED)
VALUES ({accountId}, {risk_score},{ct});'
It is string interpolation: ct is pasted into the SQL text as the unquoted string 2021-04-30 21:54:41.676406, so the parser stops at the space before '21', which matches the reported error. The value never reaches Snowflake as a timestamp.
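To see this concretely (accountId and risk_score below are made-up values, used only to show the interpolated text):

import datetime

accountId, risk_score = 1234, 0.85           # hypothetical values for illustration
ct = datetime.datetime.now()
print(f"VALUES ({accountId}, {risk_score},{ct})")
# VALUES (1234, 0.85,2021-04-30 21:54:41.676406)   <- the timestamp is not quoted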
I would suggest using proper variable binding instead:
ctx.cursor().execute("INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES "
"(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) "
"VALUES(:1, :2, :3)",
(accountId,
risk_score,
("TIMESTAMP_LTZ", ct)
)
);
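A note that is not part of the original answer: the :1/:2/:3 placeholders are the connector's numeric binding style, and they are only accepted if the paramstyle is switched away from the default pyformat before the connection is created, roughly:

import snowflake.connector

# numeric binding (:1, :2, ...) must be selected before connect();
# 'qmark' (?, ?, ...) works the same way
snowflake.connector.paramstyle = 'numeric'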
Avoid SQL Injection Attacks
Avoid binding data using Python’s formatting function because you risk SQL injection. For example:
# Binding data (UNSAFE EXAMPLE)
con.cursor().execute(
"INSERT INTO testtable(col1, col2) "
"VALUES({col1}, '{col2}')".format(
col1=789,
col2='test string3')
)
Instead, store the values in variables, check those values (for example, by looking for suspicious semicolons inside strings), and then bind the parameters using qmark or numeric binding style.
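For example, a safe rewrite of the snippet above using qmark binding (assuming con is the same connection object used above and that it was created after setting the paramstyle):

import snowflake.connector

# must be set before the connection is created
snowflake.connector.paramstyle = 'qmark'

# values travel as bind parameters, not as interpolated SQL text
con.cursor().execute(
    "INSERT INTO testtable(col1, col2) VALUES (?, ?)",
    (789, 'test string3')
)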
You forgot to place quotes before and after {ct}. The code should be:
insert_query = "INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) VALUES ({accountId}, {risk_score},'{ct}');".format(accountId=accountId,risk_score=risk_score,ct=ct)
ctx.cursor().execute(insert_query)

Azure HDInsight cluster with HBase + Phoenix using Local Index

We have an HDInsight cluster running HBase (Ambari).
We have created a table using Phoenix:
CREATE TABLE IF NOT EXISTS Results (
  Col1 VARCHAR(255) NOT NULL,
  Col2 INTEGER NOT NULL,
  Col3 INTEGER NOT NULL,
  Destination VARCHAR(255) NOT NULL
  CONSTRAINT pk PRIMARY KEY (Col1, Col2, Col3)
) IMMUTABLE_ROWS=true
We have filled some data into this table (using some Java code).
Later, we decided we want to create a local index on the Destination column, as follows:
CREATE LOCAL INDEX DESTINATION_IDX ON RESULTS (destination) ASYNC
We have run the index tool to fill the index as follows:
hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table RESULTS --index-table DESTINATION_IDX --output-path DESTINATION_IDX_HFILES
When we run queries and filter using the Destination column, everything is OK. For example:
select /*+ NO_CACHE, SKIP_SCAN */ COL1,COL2,COL3,DESTINATION from Results where COL1='data' AND DESTINATION='some value';
But, if we do not use the DESTINATION in the where query, then we will get a NullPointerException in BaseResultIterators.class
(from phoenix-core-4.7.0-HBase-1.1.jar)
This exception is thrown only when we use the new local index. If we query ignoring the index, like this:
select /*+ NO_CACHE, SKIP_SCAN, NO_INDEX */ COL1,COL2,COL3,DESTINATION from Results where COL1='data' AND DESTINATION='some value';
we do not get the exception.
Some relevant code from the area where we get the exception:
...
catch (StaleRegionBoundaryCacheException e2) {
// Catch only to try to recover from region boundary cache being out of date
if (!clearedCache) { // Clear cache once so that we rejigger job based on new boundaries
services.clearTableRegionCache(physicalTableName);
context.getOverallQueryMetrics().cacheRefreshedDueToSplits();
}
// Resubmit just this portion of work again
Scan oldScan = scanPair.getFirst();
byte[] startKey = oldScan.getAttribute(SCAN_ACTUAL_START_ROW);
byte[] endKey = oldScan.getStopRow();
// ==================== Note: isLocalIndex is true here ====================
if (isLocalIndex) {
endKey = oldScan.getAttribute(EXPECTED_UPPER_REGION_KEY);
//endKey is null for some reason in this point and the next function
//will fail inside it with NPE
}
List<List<Scan>> newNestedScans = this.getParallelScans(startKey, endKey);
We must use this version of the jar since we run inside Azure HDInsight and cannot select a newer jar version.
Any ideas how to solve this?
What does "recover from region boundary cache being out of date" mean? It seems to be related to the problem.
It appears that the version of phoenix-core that Azure HDInsight ships (phoenix-core-4.7.0.2.6.5.3004-13.jar) has the bug, but with a slightly newer version (phoenix-core-4.7.0.2.6.5.8-2.jar, from http://nexus-private.hortonworks.com:8081/nexus/content/repositories/hwxreleases/org/apache/phoenix/phoenix-core/4.7.0.2.6.5.8-2/) we no longer see the bug.
Note that it is not possible to take a much newer version such as 4.8.0, since in that case the server will throw a version mismatch error.

Getting ValidationFailureSemanticException on 'INSERT OVERWRITE'

I am creating a DataFrame and registering it as a temp view using df.createOrReplaceTempView('mytable'). After that I try to write the content of 'mytable' into a partitioned Hive table using the following query:
insert overwrite table
myhivedb.myhivetable
partition(testdate) -- (1): Note here: I have a partition named 'testdate'
select
Field1,
Field2,
...
TestDate -- (2): Note here: I have a field named 'TestDate'; both (1) & (2) have the same name
from
mytable
When I execute this query, I get the following error:
Exception in thread "main" org.apache.hadoop.hive.ql.metadata.Table$ValidationFailureSemanticException: Partition spec
{testdate=, TestDate=2013-01-01}
It looks like I am getting this error because of the identical field names, i.e. testdate (the partition in Hive) and TestDate (the field in the temp view 'mytable').
Whereas if my partition name is different from the field name TestDate, the query executes successfully. Example:
insert overwrite table
myhivedb.myhivetable
partition(my_partition) -- Note here: the partition name is not 'testdate'
select
Field1,
Field2,
...
TestDate
from
mytable
My guess is that this is a bug in Spark, but I would like a second opinion. Am I missing something here?
@DuduMarkovitz @dhee; apologies for the late response. I was finally able to resolve the issue. Earlier I was creating the table using camelCase field names (in the CREATE statement), which seems to be the reason for the exception. I have now created the table using a DDL where the field names are in lower case, and this has resolved my issue.
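For illustration only (the real table has more columns, and the types here are hypothetical), the shape of the lower-case DDL would be something like:

CREATE TABLE myhivedb.myhivetable (
  field1 STRING,
  field2 STRING
)
PARTITIONED BY (testdate STRING);

With a dynamic-partition insert, the last column of the SELECT is matched by position to the testdate partition, so keeping both names in lower case avoids the mismatched {testdate=, TestDate=...} partition spec from the error above.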

COPY FROM CSV with static fields on Postgres

I'd like to switch an existing system that imports data from CSV files into a PostgreSQL 9.5 database over to something more efficient.
I'd like to use the COPY statement because of its good performance. The problem is that I need to have one field populated that is not in the CSV file.
Is there a way to have the COPY statement add a static field to all the rows inserted?
The perfect solution would have looked like this:
COPY data(field1, field2, field3='Account-005')
FROM '/tmp/Account-005.csv'
WITH DELIMITER ',' CSV HEADER;
Do you know a way to have that field populated in every row ?
My server is running node.js, so I'm open to any cost-efficient solution that completes the files using Node before COPYing them.
Use a temp table to import into. This allows you to:
add/remove/update columns
add extra literal data
delete or ignore records (such as duplicates)
all before inserting the new records into the actual table.
-- target table
CREATE TABLE data
( id SERIAL PRIMARY KEY
, batch_name varchar NOT NULL
, remote_key varchar NOT NULL
, payload varchar
, UNIQUE (batch_name, remote_key)
-- or::
-- , UNIQUE (remote_key)
);
-- temp table
CREATE TEMP TABLE temp_data
( remote_key varchar -- PRIMARY KEY
, payload varchar
);
COPY temp_data(remote_key,payload)
FROM '/tmp/Account-005'
;
-- The actual insert
-- (you could also filter out or handle duplicates here)
INSERT INTO data(batch_name, remote_key, payload)
SELECT 'Account-005', t.remote_key, t.payload
FROM temp_data t
;
BTW, it is possible to automate the above: put it into a function (or maybe a prepared statement), using the filename/literal as an argument.
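A minimal plpgsql sketch of that idea (the function name and parameters are illustrative, and COPY ... FROM a server-side path needs the appropriate file-read privilege):

CREATE OR REPLACE FUNCTION import_batch(p_batch_name varchar, p_file_path varchar)
RETURNS void AS $$
BEGIN
  -- staging table lives only for this session
  CREATE TEMP TABLE IF NOT EXISTS temp_data (remote_key varchar, payload varchar);
  TRUNCATE temp_data;

  -- load the CSV into the staging table
  EXECUTE format('COPY temp_data(remote_key, payload) FROM %L WITH (FORMAT csv, HEADER)', p_file_path);

  -- add the static batch name while inserting into the real table
  INSERT INTO data(batch_name, remote_key, payload)
  SELECT p_batch_name, t.remote_key, t.payload
  FROM temp_data t;
END;
$$ LANGUAGE plpgsql;

-- usage:
-- SELECT import_batch('Account-005', '/tmp/Account-005.csv');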
Set a default for the column:
alter table data
alter column field3 set default 'Account-005'
Do not mention it in the COPY command:
COPY data(field1, field2) FROM...
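Put together, and assuming the data table and file from the question, the whole cycle would look roughly like this (dropping the default afterwards is optional, but it keeps later inserts from silently inheriting it):

ALTER TABLE data ALTER COLUMN field3 SET DEFAULT 'Account-005';

COPY data(field1, field2)
FROM '/tmp/Account-005.csv'
WITH DELIMITER ',' CSV HEADER;

ALTER TABLE data ALTER COLUMN field3 DROP DEFAULT;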

SQLite foreign key, on delete cascade, and SQLITE_CONSTRAINT

Overview:
I have a parent/child table relationship where the child may contain 2:n records with FKs back to the parent. When attempting to delete from the parent, I get a SQLITE_CONSTRAINT error. This is unexpected, as I have FKs enabled, have the child registered with ON DELETE CASCADE, and have a new enough SQLite version.
However: my child table originally did not have ON DELETE CASCADE. I added it (and enabled FKs) after data had been added to parent/child. To do so, I renamed the original child, created a new table with the constraint, and finally moved the data to the new table.
Table layout as follows:
CREATE TABLE IF NOT EXISTS message (
message_id INTEGER PRIMARY KEY,
area_tag VARCHAR NOT NULL,
message_uuid VARCHAR(36) NOT NULL,
reply_to_message_id INTEGER,
to_user_name VARCHAR NOT NULL,
from_user_name VARCHAR NOT NULL,
subject, /* FTS # message_fts */
message, /* FTS # message_fts */
modified_timestamp DATETIME NOT NULL,
view_count INTEGER NOT NULL DEFAULT 0,
UNIQUE(message_uuid)
);
CREATE INDEX IF NOT EXISTS message_by_area_tag_index
ON message (area_tag);
CREATE VIRTUAL TABLE IF NOT EXISTS message_fts USING fts4 (
content="message",
subject,
message
);
CREATE TRIGGER IF NOT EXISTS message_before_update BEFORE UPDATE ON message BEGIN
DELETE FROM message_fts WHERE docid=old.rowid;
END;
CREATE TRIGGER IF NOT EXISTS message_before_delete BEFORE DELETE ON message BEGIN
DELETE FROM message_fts WHERE docid=old.rowid;
END;
CREATE TRIGGER IF NOT EXISTS message_after_update AFTER UPDATE ON message BEGIN
INSERT INTO message_fts(docid, subject, message) VALUES(new.rowid, new.subject, new.message);
END;
CREATE TRIGGER IF NOT EXISTS message_after_insert AFTER INSERT ON message BEGIN
INSERT INTO message_fts(docid, subject, message) VALUES(new.rowid, new.subject, new.message);
END;
CREATE TABLE IF NOT EXISTS message_meta (
message_id INTEGER NOT NULL,
meta_category INTEGER NOT NULL,
meta_name VARCHAR NOT NULL,
meta_value VARCHAR NOT NULL,
UNIQUE(message_id, meta_category, meta_name, meta_value),
FOREIGN KEY(message_id) REFERENCES message(message_id) ON DELETE CASCADE
);
At startup, directly after attaching to the DBs, I ensure FKs are enabled:
PRAGMA foreign_keys = ON;
Other details:
SQLite version: 3.7.17
Access: node-sqlite3
Exact error: Error: SQLITE_CONSTRAINT: FOREIGN KEY constraint failed
Is this caused by the fact that I later added the constraint? (See Update 1)
How do I fix this without losing data?
Update 1:
I can confirm that only certain messages (I believe, messages that were in message before ON DELETE CASCADE was added to message_meta) cause the constraint error. Others delete just fine and properly remove the associated message_meta records.
Answering my own question -- after some hours of trying various things I was able to find the issue(s):
1. When I originally added the ON DELETE CASCADE clause, I did so by renaming the original message_meta table to message_meta_backup, creating a new table with the clause, then moving the data into it: INSERT INTO message_meta SELECT * FROM message_meta_backup;. What I did not do was drop the backup table.
2. Due to #1 or something related, something internal to my database became corrupted or confused.
What I tried (that did not work):
REINDEX;
Simply dropping the backup table: DROP TABLE message_meta_backup;
...and various other things I forget :)
What DID work:
What finally ended up working was a combination of dropping the backup table and completely rebuilding the database using the sqlite3 shell's .dump command:
> sqlite3 db/message.sqlite3
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> drop table message_meta_backup;
sqlite> .quit
> sqlite3 db/message.sqlite3 ".dump" >> message_dump.sql
> rm db/message.sqlite3
> cat message_dump.sql | sqlite3 db/message.sqlite3
I'm now able to DELETE FROM message ... and have it properly cascade the delete to message_meta without the nasty error:
sqlite> DELETE FROM message WHERE message_id IN(SELECT message_id FROM message WHERE area_tag='some_area' ORDER BY message_id desc limit -1 offset 200);
sqlite>
(no error given!)
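As a side note that is not part of the original answer, a leftover table like message_meta_backup can usually be spotted beforehand by checking which tables still declare a foreign key against message, and with which ON DELETE action (a sketch assuming the schema above; foreign_key_check needs SQLite 3.7.16+):

sqlite> PRAGMA foreign_key_list(message_meta);         -- shows the FK with ON DELETE CASCADE
sqlite> PRAGMA foreign_key_list(message_meta_backup);  -- would show the old FK without CASCADE, which blocks the delete
sqlite> PRAGMA foreign_key_check;                      -- lists any rows that already violate a foreign key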
