Can TiDB Syncer run normally when the MySQL table character set is latin1?

I see the PingCAP FAQ says:
"The character sets of TiDB use UTF-8 by default and currently only support UTF-8"
If the MySQL table character set is latin1, can Syncer run correctly, and will I get the same result on the TiDB side as on the MySQL side?

In fact, TiDB uses utf8mb4 as its default character set. If the upstream MySQL character set is latin1, Syncer runs normally and the results on both sides are consistent.
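If you want to verify this yourself, one low-tech check is to read the same rows from both sides after replication and compare them. Here is a minimal sketch using PyMySQL; the hosts, credentials, and the table name t are placeholders, not anything prescribed by Syncer:

import pymysql

# Hypothetical endpoints: the latin1 MySQL upstream and the TiDB downstream.
mysql_conn = pymysql.connect(host="mysql-host", port=3306, user="root",
                             password="secret", database="test", charset="utf8mb4")
tidb_conn = pymysql.connect(host="tidb-host", port=4000, user="root",
                            password="secret", database="test", charset="utf8mb4")

def fetch_rows(conn):
    # Read every row in a stable order so the two result sets are comparable.
    with conn.cursor() as cur:
        cur.execute("SELECT id, name FROM t ORDER BY id")
        return cur.fetchall()

# latin1 data is converted on the wire, so the decoded rows should match.
assert fetch_rows(mysql_conn) == fetch_rows(tidb_conn)

Connecting with charset="utf8mb4" makes each server convert the stored data for you, so the comparison happens on decoded strings rather than raw bytes.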

Related

cx_Oracle returns question marks for Hebrew characters

I have a table with Hebrew characters.
I run a SELECT and get ??? instead of the Hebrew result.
I connect using Python 3 on Linux (Red Hat 8) and cx_Oracle with encoding='UTF-8'.
The string in the table is Hebrew; I can see it correctly from PL/SQL on Windows.
How can I fix this?
Thank you,
Tsvi
You probably have a discrepancy between the character set the database thinks the data is in and the actual character set of the data. This is covered in the documentation.
Check the database character set using this SQL:
SELECT value AS db_charset
FROM nls_database_parameters
WHERE parameter = 'NLS_CHARACTERSET'
Check the client character set using this SQL:
SELECT DISTINCT client_charset
FROM v$session_connect_info
WHERE sid = sys_context('userenv', 'SID')
Check the client character set both where it is working correctly (PL/SQL on Windows) and not working correctly (Python on Linux). If this doesn't help you figure it out, post the results in your question and I'll adjust my answer accordingly.
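For reference, here is a minimal sketch of running both checks from the failing Python client; the credentials and DSN below are placeholders:

import cx_Oracle

# Hypothetical connection details -- substitute your own.
conn = cx_Oracle.connect("user", "password", "dbhost/service", encoding="UTF-8")
cur = conn.cursor()

# Database character set: what the server thinks the data is stored in.
cur.execute("SELECT value FROM nls_database_parameters "
            "WHERE parameter = 'NLS_CHARACTERSET'")
print("database charset:", cur.fetchone()[0])

# Client character set for the current session.
cur.execute("SELECT DISTINCT client_charset FROM v$session_connect_info "
            "WHERE sid = sys_context('userenv', 'SID')")
print("client charset:", [row[0] for row in cur.fetchall()])

Comparing this output against the same queries run from the working Windows client should show where the configurations diverge.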

Error with redis protocol (RESP) during bulk load when data contains UTF-8 characters

I have to do a simple structured bulk load into my Redis database. However, some of the data contains multibyte UTF-8 characters, and when I try to load it I get ERR Protocol error: expected '$', got ' '. Loading data without those characters works just fine.
Example of data with a UTF-8 character that causes the error:
*4\r\n$4\r\nHSET\r\n$6\r\nGrad_Ž\r\n$6\r\nalmada\r\n$1\r\n1\r\n
If I replace Ž with a plain ASCII character such as S, it loads and causes no errors.
I have tried different commands to run it, and I have tried changing the bash locale.
The command I am using to run it:
echo -e "$(cat test.txt)" | redis-cli --pipe
Thanks in advance.
In your case:
$6\r\nGrad_Ž\r\n
The 6 declares a 6-byte value, but Grad_Ž is longer than 6 bytes: Ž is not an ASCII character and takes two bytes in UTF-8, so the string is 7 bytes in total.
You need to set the correct string length, in bytes, for each value.
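One robust way to avoid counting by hand is to generate the protocol from code, taking lengths from the encoded bytes rather than from the character count. A minimal sketch in Python, reusing the values from your example:

def resp_command(*args):
    # Bulk-string lengths in RESP are byte counts, so measure after encoding.
    out = b"*%d\r\n" % len(args)
    for arg in args:
        data = arg.encode("utf-8")   # "Ž" encodes to 2 bytes here
        out += b"$%d\r\n%s\r\n" % (len(data), data)
    return out

# len("Grad_Ž") is 6 characters, but its UTF-8 encoding is 7 bytes,
# so this emits $7 where the hand-written file had $6.
with open("test.txt", "wb") as f:
    f.write(resp_command("HSET", "Grad_Ž", "almada", "1"))

Then cat test.txt | redis-cli --pipe loads it directly; skipping echo -e also avoids any escape processing by the shell.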

How do I handle an encoding error in Excel Power Query?

I receive the following error when connecting to a Postgres database.
DataSource.Error: ODBC: ERROR [22P05] ERROR: character with byte sequence 0xc2 0x96 in encoding
"UTF8" has no equivalent in encoding "WIN1252"; Error while executing the query
Details:
DataSourceKind=Odbc
DataSourcePath=database=XXXXXXXX;dsn=PostgreSQL30;encoding='utf-8';port=XXXX;server=XXXXXXXXX
OdbcErrors=Table
It only happens with this table, so the connection works in general. I would prefer to deal with this at the Excel level and not make changes to the database. I tried including encoding='utf-8' in the connection string, but I see that the problem isn't that Excel doesn't recognize the encoding scheme; it's that it has no way to handle 0xc2 0x96 in WIN1252.
Is there a way to change Excel's default encoding, or a way to specify it in the connection string or query settings, to handle this?
Solved by installing the Unicode Postgres ODBC driver and pointing the DSN at it.
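For what it's worth, the failure is easy to reproduce outside Excel, which shows why a driver-side fix was needed: the byte sequence 0xc2 0x96 is valid UTF-8 for U+0096, a C1 control character that Windows-1252 cannot represent. A quick Python demonstration:

# 0xc2 0x96 is valid UTF-8 for U+0096, a C1 control character.
ch = b"\xc2\x96".decode("utf-8")
print(hex(ord(ch)))   # 0x96

# Windows-1252 reuses the 0x80-0x9f range for printable characters
# (byte 0x96 itself is the en dash), so U+0096 has no slot there
# and the conversion fails, just as it did in the ODBC driver.
ch.encode("cp1252")   # raises UnicodeEncodeError

The Unicode driver sidesteps the problem by never converting to WIN1252 in the first place.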

GitLab issues do not work with Chinese

When I create an issue and enter the Title and Details in Chinese, it does not work.
[screenshots: the Chinese form input and the garbled result]
The documentation "Setup Database" does mention
# Create the GitLab production database
mysql> CREATE DATABASE IF NOT EXISTS `gitlabhq_production` DEFAULT CHARACTER SET `utf8` COLLATE `utf8_unicode_ci`;
So it is possible this charset setting is missing in your database.
Johannes Schleifenbaum mentions in your issue 4620:
Are your database and tables (in this case issues) created with utf8 character-set/collation? I had the exact same issue.
$: mysql -ugitlab -p gitlabhq_production
mysql> SHOW FULL COLUMNS FROM issues;
mysql> SHOW VARIABLES LIKE "character_set_database";
mysql> SHOW VARIABLES LIKE "collation_database";
The blog post "Converting Character sets in MySQL to UTF8" proposes different options, including:
mysql> ALTER DATABASE gitlabhq_production DEFAULT CHARACTER SET utf8 COLLATE=utf8_unicode_ci;
mysql> ALTER TABLE issues CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
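If you would rather script the check than paste statements into the mysql shell, here is a minimal sketch with PyMySQL (the credentials are placeholders) that prints the database charset and flags any non-utf8 column on the issues table:

import pymysql

# Hypothetical credentials -- adjust for your GitLab database.
conn = pymysql.connect(host="localhost", user="gitlab",
                       password="secret", database="gitlabhq_production")
with conn.cursor() as cur:
    cur.execute("SHOW VARIABLES LIKE 'character_set_database'")
    print(cur.fetchone())
    cur.execute("SHOW FULL COLUMNS FROM issues")
    for col in cur.fetchall():
        field, collation = col[0], col[2]   # Field and Collation columns
        if collation is not None and not collation.startswith("utf8"):
            print("non-utf8 column:", field, collation)

Any column it reports is a candidate for the CONVERT TO CHARACTER SET statement above.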

CouchDB Find One not working

I am a CouchDB newbie and am doing the examples in the O'Reilly CouchDB guide.
I have a problem using a view to retrieve a document by key:
curl http://127.0.0.1:5984/basic/_design/example/_view/by_date?key="2009/01/15 15:52:20"
gives the reply:
curl: (52) Empty reply from server
but just retrieving all rows:
curl http://127.0.0.1:5984/basic/_design/example/_view/by_date
gives me 3 rows including the specific row I am looking for:
{"id":"hello-world","key":"2009/01/15 15:52:20","value":"Hello World"}
Why doesn't the key query work?
I am using CouchDB version 0.10.0 on Ubuntu 9.10
CouchDB expects the key parameter to be a valid JSON value, such as "a string" or 12345 or ["an", "array", "with", 5.0, "elements"]. If you check your CouchDB logs you will probably see a 400 (bad client request) entry because your key is either invalid UTF-8 or invalid JSON.
You probably have two problems:
The shell is interpreting your double quotes, which must actually be sent to CouchDB. Wrap the double-quoted string in single quotes.
You probably also need to encode your key so that it is a valid URL. Specifically, replace the space with %20.
Putting this all together, the following works for me on CouchDB 0.11 on Ubuntu 9.10.
$ curl http://127.0.0.1:5984/blog/_design/docs/_view/by_date?key='"2009/01/30%2018:04:11"'
{"total_rows":1,"offset":0,"rows":[
{"id":"biking","key":"2009/01/30 18:04:11","value":"Biking"}
]}
It worked: I single-quoted the key string and encoded the space character, so the request became:
/by_date?key='"2009/01/30%2015:52:20"'
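If the request is made from code rather than the shell, letting an HTTP library handle the quoting avoids both pitfalls. A small sketch in Python with requests, using the database and view names from the question:

import json
import requests

url = "http://127.0.0.1:5984/basic/_design/example/_view/by_date"
# json.dumps adds the double quotes CouchDB requires around a string key,
# and requests percent-encodes the space and quotes for the URL.
resp = requests.get(url, params={"key": json.dumps("2009/01/15 15:52:20")})
print(resp.json())

json.dumps produces the "2009/01/15 15:52:20" literal, and requests takes care of the %20 and %22 escaping that had to be done by hand in curl.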
