Bad distributed join plan: result table shard keys do not match - singlestore

We are very new to memsql/mysql and we are trying to play around with a memsql installation.
It is installed on a CentOS7 virtual machine and we are running version 5.1.0 of MemSQL.
We are receiving the error from one of the queries we are attempting:
ERROR 1889 (HY000): Bad distributed join plan: result table shard keys do not match. Please contact MemSQL support at support#memsql.com.
On one of our queries
We have two tables:
CREATE TABLE `MyObjects` (
`Id` INT NOT NULL AUTO_INCREMENT,
`Name` VARCHAR(128) NOT NULL,
`Description` VARCHAR(256) NULL,
`Boolean` BIT NOT NULL,
`Int8` TINYINT NOT NULL,
`Int16` SMALLINT NOT NULL,
`Int32` MEDIUMINT NOT NULL,
`Int64` INT NOT NULL,
`Float` DOUBLE NOT NULL,
`DateCreated` TIMESTAMP NOT NULL,
SHARD KEY (`Id`),
PRIMARY KEY (`Id`)
);
CREATE TABLE `MyObjectDetails` (
`MyObjectId` INT,
`Int32` MEDIUMINT NOT NULL,
SHARD KEY (`MyObjectId`),
INDEX (`MyObjectId`)
);
And here is the SQL we are executing and getting the error.
memsql> SELECT mo.`Id`,mo.`Name`,mo.`Description`,mo.`Boolean`,mo.`Int8`,mo.`Int16`,
mo.`Int32`,mo.`Int64`,mo.`Float`,mo.`DateCreated`,mods.`MyObjectId`,
mods.`Int32` FROM
( SELECT
mo.`Id`,mo.`Name`,mo.`Description`,mo.`Boolean`,mo.`Int8`,
mo.`Int16`,mo.`Int32`,mo.`Int64`,mo.`Float`,mo.`DateCreated`
FROM `MyObjects` mo LIMIT 10 ) AS mo
LEFT JOIN `MyObjectDetails` mods ON mo.`Id` = mods.`MyObjectId` ORDER BY `Name` DESC;
ERROR 1889 (HY000): Bad distributed join plan: result table shard keys do not match. Please contact MemSQL support at support#memsql.com.
Does anyone know why we are receiving this error, and if there is a possible change we can make to help alleviate this issue?
The one thing we do know is it has something to do with the inner select as if I pull it out and do the join it works, however we only get 10 total rows from the join. What we are attempting is getting the top 10 from the main table and include all of the details from the right.
We also tried changing the MyObjectDetails table to have an empty SHARD KEY, but that resulted in the same error.
SHARD KEY()
We also added an auto-incrementing Id column to the details table and put the shard on that column, and yet still received the same error.
Thanks in advance for any help.
UPDATE:
I contacted MemSQL through email (huge props to their customer service by the way -- very fast response time, less than a couple hours)
But from what Mike stated I changed the table to be a REFERENCE table and removed the SHARD KEY part of the create table statement. Once I did this, I was able to run the queries. I am not 100% sure on what ramifications this will have but it fixed my issue at hand. Thanks
CREATE REFERENCE TABLE `MyObjects` (
`Id` INT NOT NULL AUTO_INCREMENT,
`Name` VARCHAR(128) NOT NULL,
`Description` VARCHAR(256) NULL,
`Boolean` BIT NOT NULL,
`Int8` TINYINT NOT NULL,
`Int16` SMALLINT NOT NULL,
`Int32` MEDIUMINT NOT NULL,
`Int64` INT NOT NULL,
`Float` DOUBLE NOT NULL,
`DateCreated` TIMESTAMP NOT NULL,
PRIMARY KEY (`Id`)
);

Thanks to Mike Gallegos for looking into this, adding a summary of his answer here:
The error message here is bad, but the reason for the error is that MemSQL does not currently support a distributed left join where the left side (the Limit subquery in this case) has a LIMIT operator. If you cannot rewrite the query to do the limit after the join, then you could change the MyObjects table to a reference table to work around the issue.

Related

Integrity error while running migrate to add new column in Django

I have a MYSQL table with few million entries. While trying to add a new column to this table using Django migration after some time (possibly due to huge data) is failing with below error:
Error django.db.utils.IntegrityError: (1062, "Duplicate entry '123456-softwareengineer' for key 'api_experience_entity_id_123a45b6c789def0_uniq'")
I checked manually but the entity_id(123456) didn't had any duplicate entries. I was able to see that this entry was being updated when the migration was running.
What could be the possible solution to perform migration without affecting the data and near to no downtime for system?
Below are the details of my table:
SHOW CREATE TABLE api_experience;`
`CREATE TABLE `api_experience` ( `id` int(11) NOT NULL AUTO_INCREMENT, `type` varchar(100) NOT NULL, `duration` varchar(200) NOT NULL, `created_at` datetime NOT NULL, `updated_at` datetime NOT NULL, `entity_id` int(11) NOT NULL, `designation` varchar(100), PRIMARY KEY (`id`), UNIQUE KEY `api_experience_entity_id_123a45b6c789def0_uniq` (`entity_id`,`type`), KEY `api_experience_599dcce2` (`type`), KEY `api_experience_5527459a` (`designation`), KEY `api_experience_created_at_447d412d906baea1_uniq` (`created_at`), CONSTRAINT `api_entity_id_3fbe8eb2deb42063_fk_api_entity_id` FOREIGN KEY (`entity_id`) REFERENCES `api_entity` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=773570676 DEFAULT CHARSET=utf8`
Have tried with running migration in low traffic at midnight but didn't help.

In sqlite-jdbc driver(3.23 or 3.30),fail update when string is too long, without any exception. But in command line and navicat, it's OK

SQL
replace into articles(id,grade,no,title,content,newchars,update_time)
values(1394823098212,'1','8','title','**LongUtf8String**',1614001996557)
If the length of LongUtf8String exceeds 2689 bytes, the operation will fail, without any exception.
In sqlite command line and navicat, it can work.
Table definition:
create table if not exists articles (
id long not null primary key,
grade tinyint not null,
no smallint not null,
title varchar(255) not null,
content text not null,
newchars text not null
);
I can't find any configuration items in SQLiteConfig.
Are there any ways to overcome the limit?
Thanks!!!
It is my fault:(((
I limited the length when check parameters, and discard it without prompt.

How do I insert records in a node-pg-migrate migration?

I'm trying to use node-pg-migrate to handle migrations for an ExpressJS app. I can translate most of the SQL dump into pgm.func() type calls, but I can't see any method for handling actual INSERT statements for initial data in my solution's lookup tables.
It is possible using the pgm.sql catch all:
pgm.sql(`INSERT INTO users (username, password, created, forname, surname, department, reviewer, approver, active) VALUES
('rd#example.com', 'salty', '2019-12-31 11:00:00', 'Richard', 'Dyce', 'MDM', 'No', 'No', 'Yes');`)
Note the use of backtick (`) to allow breaking the SQL statement across multiple lines.
You can use raw sql if you needed.
Create a migration file with the extension .sql and write usual requests.
This article has a great example.
My example:
-- Up Migration
CREATE TABLE users
(
id BIGSERIAL NOT NULL PRIMARY KEY,
username VARCHAR(50) NOT NULL,
email VARCHAR(50) NOT NULL,
password VARCHAR(50) NOT NULL,
class_id INTEGER NOT NULL,
created_at DATE NOT NULL,
updated_at DATE NOT NULL
);
CREATE TABLE classes
(
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(50) NOT NULL,
health INTEGER NOT NULL,
damage INTEGER NOT NULL,
attack_type VARCHAR(50) NOT NULL,
ability VARCHAR(50) NOT NULL,
created_at DATE NOT NULL,
updated_at DATE NOT NULL
);
INSERT INTO classes (id,
name,
health,
damage,
attack_type,
ability,
created_at,
updated_at)
VALUES (0,
'Thief',
100,
25,
'Archery Shot',
'Run Away',
NOW(),
NOW());
-- Down Migration
DROP TABLE users;
DROP TABLE classes;

Azure SQL db query external Azure SQL DB - PPDwManagedToNativeInteropException

I have one Azure SQL server where I have several databases. I need to be able to query across these databases, and have at the moment solves this through external tables. A challange with this solution is that external tables does not support all the same data types as ordinary tables.
According to the following article the solution to incompatible data types are to use other similiar ones in the external table.
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-data-types#unsupported-data-types.
DDL for table in DB1
CREATE TABLE [dbo].[ActivityList](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Registered] [datetime] NULL,
[RegisteredBy] [varchar](50) NULL,
[Name] [varchar](100) NULL,
[ak_beskrivelse] [ntext] NULL,
[ak_aktiv] [bit] NULL,
[ak_epost] [bit] NULL,
[Template] [text] NULL
CONSTRAINT [PK_ActivityList] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
DDL for external table in DB2
CREATE EXTERNAL TABLE [dbo].[NEMDBreplicaActivityList]
(
[ID] [int] NOT NULL,
[Registered] [datetime] NULL,
[RegisteredBy] [varchar](50) NULL,
[Name] [varchar](100) NULL,
[ak_beskrivelse] [nvarchar](4000) NULL,
[ak_aktiv] [bit] NULL,
[ak_epost] [bit] NULL,
[Template] [varchar](900) NULL
)
WITH (DATA_SOURCE = [DS],SCHEMA_NAME = N'dbo',OBJECT_NAME = N'ActivityList')
Querying the external table NEMDBreplicaActivityList produces the following error
Error retrieving data from
server.database.windows.net.db1. The
underlying error message received was:
'PdwManagedToNativeInteropException ErrorNumber: 46723, MajorCode:
467, MinorCode: 23, Severity: 16, State: 1, ErrorInfo: ak_beskrivelse,
Exception of type
'Microsoft.SqlServer.DataWarehouse.Tds.PdwManagedToNativeInteropException'
was thrown.'.
I have tried defining the ak_beskrivelse column as other external table legal datatypes, such as varchar, with the same result.
Sadly I'm not allowed to edit the data type of columns in the db1 table.
I assume that the error is related to the data type. Any ideas how to fix it?
I solved a similar problem to this by creating a view over the source table which cast the text value as varchar(max), then pointed the external table to the view.
So:
CREATE VIEW tmpView
AS
SELECT CAST([Value] AS VARCHAR(MAX))
FROM [Sourcetable].
Then:
CREATE EXTERNAL TABLE [dbo].[tmpView]
(
[Value] VARCHAR(MAX) NULL
)
WITH (DATA_SOURCE = [myDS],SCHEMA_NAME = N'dbo',OBJECT_NAME = N'tmpView')
Creating the view and casting the text value worked perfect for me 😊
Thank you!
Created view vw_TestReport:
SELECT CAST([Report Date] AS VARCHAR(MAX)) AS [Report Date]
FROM dbo.TestReport
And created external table from view:
CREATE EXTERNAL TABLE [dbo].[TestReport](
[Report Date] [varchar](max) NULL
)
WITH (DATA_SOURCE = [REFToDB],SCHEMA_NAME = N'dbo',OBJECT_NAME = N'vw_TestReport')

MemSQL Weird Insert/Update Behaviour

We are using single node MemSQL and everything was working fine but when we are trying to move our MemSQL setup to use multi node the insert/update statements are behaving very weirdly
My table structures are like below , have removed many columns , to keep it short
CREATE /*!90618 REFERENCE*/ TABLE `fact_orderitem_hourly_release_update`
(
`order_id` int(11) NOT NULL DEFAULT '0',
`customer_login` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`warehouse_id` int(11) DEFAULT NULL,
`city` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`store_id` int(11) DEFAULT NULL,
PRIMARY KEY (`order_id`)
);
CREATE TABLE `fact_orderitem_hourly_scale` (
`order_id` int(11) NOT NULL DEFAULT '0',
`order_group_id` int(11) NOT NULL DEFAULT '0',
`item_id` int(11) NOT NULL,
`sku_id` int(11) NOT NULL DEFAULT '0',
`sku_code` varchar(45) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`po_type` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`store_order_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`bi_last_modified_on` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
PRIMARY KEY (`item_id`,`sku_id`),
/*!90618 SHARD */ KEY `sku_id` (`sku_id`),
KEY `idx_fact_orderitem_hourly_lmd` (`bi_last_modified_on`),
KEY `idx_fact_orderitem_hourly_ord` (`order_id`),
KEY `idx_order_group_id` (`order_group_id`),
KEY `idx_store_order_id` (`store_order_id`)
);
My Load Script :
mysql -h$LiveMemSQL_DB -u$LiveMemSQL_USER --password=$LiveMemSQL_PASS -P$LiveMemSQL_PORT --verbose reports_and_summary < /home/titan/brand_catalog/upsert_memsql_orl_update.sql
Contents of .SQL File :
--start of .sql file
TRUNCATE TABLE reports_and_summary.fact_orderitem_hourly_release_update;
#Load data into staging
LOAD DATA LOCAL INFILE '/myntra/redshift/delta_files/live_scale_order_release_upd.txt' INTO TABLE reports_and_summary.fact_orderitem_hourly_release_update LINES TERMINATED BY '\n';
#Insert/Update statement
INSERT INTO reports_and_summary.fact_orderitem_hourly_scale
(
item_id,
sku_id,
customer_login,
order_status,
is_realised,
is_shipped,
shipping_charge,
gift_charge,
warehouse_id,
city,
store_id
)
select
fo.item_id,
fo.sku_id,
fr.customer_login,
fr.order_status,
fr.is_realised,
fr.is_shipped,
fr.shipping_charge,
fr.gift_charge,
fr.warehouse_id,
fr.city,
fr.store_id
from fact_orderitem_hourly_release_update fr
join fact_orderitem_hourly_scale fo
on fr.order_id=fo.order_id
ON duplicate key update
customer_login=values(customer_login),
order_status=values(order_status),
is_realised=values(is_realised),
is_shipped=values(is_shipped),
shipping_charge=values(shipping_charge),
gift_charge=values(gift_charge),
warehouse_id=values(warehouse_id),
city=values(city),
store_id=values(store_id);
--End .sql file
When I trigger the above .sql through mysql command line client , it works sometimes and it doesn't many of times , and some times if I execute the same .sql file continuously 5-10 times , the updates will get effected in one of those runs , and sometimes say for example if there are 3 records with order_id 101 and status SHIPPED and we got an update in merge table say the order status has been changed to DELIVERED , ideally status of all 3 orders should be changed to DELIVERED , but only one or 2 of the rows associated with an order are getting updated but if I execute the same .sql file content through MySQLWorkbench it works perfectly fine , I may sound stupid , but this is what is happening and I am struggling from last 2 days with this weird behavior
Please find the below screen cast , where I captured this behaviour https://www.youtube.com/watch?v=v2HN-n4V0MI&feature=youtu.be
Your staging table is a reference table, and writes to reference tables are replicated asynchronously to the cluster. This is why sometimes your updates work as expected and sometimes they don't.
You can
wait for a bit after writing into the reference table
make the staging table non-reference

Resources