How to fix a SQL query in Databricks when a column name has brackets in it - Databricks

I have a file with data like this, which I have converted into a Databricks table.
Select * from myTable
Output:
Product[key] Product[name]
123 Mobile
345 television
456 laptop
I want to query my table for the laptop data, so I am using the query below:
Select * from myTable where Product[name]='laptop'
I am getting the error below in Databricks:
AnalysisException: cannot resolve 'Product' given input columns:
[spark_catalog.my_db.myTable.Product[key],[spark_catalog.my_db.myTable.Product[name]

When certain characters appear in the column names of a SQL table, you get a parse exception. These characters include brackets, dots (.), hyphens (-), and so on. When such characters appear in a column name, you need an escape mechanism so they are parsed simply as part of the name.
In Databricks SQL, that mechanism is the backtick (`). Enclosing a column name in backticks ensures it is parsed as-is, even when it includes characters like '[]' (as in this case).
Because you converted a file into a Databricks table, the underlying problem, parsing the column name, was not obvious; you can reproduce it by manually creating a table with this schema in Databricks.
Once you use backticks as follows, referencing the column is no longer a problem:
create table mytable(`Product[key]` integer, `Product[name]` varchar(20))
select * from mytable where `Product[name]`='laptop'
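The same backtick fix can be sketched locally without a Databricks cluster: SQLite also accepts MySQL-style backtick-quoted identifiers, so a minimal Python reproduction of the table above looks like this (the table name and data mirror the question; SQLite is standing in for Databricks SQL here):

```python
import sqlite3

# In-memory database; SQLite accepts backtick-quoted identifiers,
# which mirrors the Databricks SQL fix.
conn = sqlite3.connect(":memory:")
conn.execute("create table myTable (`Product[key]` integer, `Product[name]` text)")
conn.executemany("insert into myTable values (?, ?)",
                 [(123, "Mobile"), (345, "television"), (456, "laptop")])

# Without backticks, Product[name] fails to parse as a column reference;
# with backticks the brackets are just part of the name.
rows = conn.execute(
    "select * from myTable where `Product[name]` = 'laptop'"
).fetchall()
print(rows)  # [(456, 'laptop')]
```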

Related

Couldn't convert to Number when expanding a table in Power Query

I have a very annoying problem when I try to merge two tables in Power Query in Excel. I use one column to match records from both tables, and when I try to expand the second table it pops up the following message:
DataFormat.Error: We couldn't convert to Number.
Λεπτομέρειες (Details):
ECS
I have no idea how to fix this. The matched columns contain text, not numbers. There are no errors when I import the data. Can anyone help?
Try the following:
Delete the step #"Changed Type" in both queries
Make sure that the two merged columns have the same type (text ABC, in your case)
When you create a query from a table, Power Query tries to guess the type of each column based on the first 200 rows. The value "Λεπτομέρειες: ECS" is probably landing in a different column (not one of the two being merged) that was typed as Number 123, making it effectively a mixed column (due to the data source itself or a delimiter issue).
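As a sketch, the suggested fix looks like this in M. The Source step and the column name "MatchKey" are placeholders for whatever your query actually uses:

```m
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Instead of the auto-generated #"Changed Type" step (deleted),
    // set the merge key column to text explicitly so both queries agree:
    KeyAsText = Table.TransformColumnTypes(Source, {{"MatchKey", type text}})
in
    KeyAsText
```

Do the same in the second query before merging, so both key columns carry the same text type.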

Reading from DB2 tables with columns with names having special characters into Spark

I need to read data from a DB2 table into a Spark DataFrame.
However, the DB2 table, named 'TAB#15', has two columns whose names contain special characters: MYCRED# and MYCRED$.
My PySpark code looks like this:
query = '''(select count(1) as cnt from {table} as T) as q'''.format(table=table)
my_val = spark.read.jdbc(url=url, table=query, properties=properties).collect()
My spark-submit, however, throws an error that looks like this:
"ERROR: u"\nextraneous input '#' expecting... "
My questions are:
Is it possible to read data into a Spark DataFrame from a DB2 table whose table name and column names contain special characters like '#' and '$'?
If there are any code samples or similar questions that illustrate reading DB2 data from columns with special characters in their names, please point me to them.
Try something like
table = '"MYDB2Specifier.TAB#15"'
Identifiers are double-quoted. If you leave the quotes out, everything is folded to uppercase. If the string has special characters like a $, you might need to escape the character.
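Double-quoted delimited identifiers are standard SQL, so the idea can be sketched locally with Python's sqlite3 (the table and column names mirror the question; DB2's case-folding of unquoted names is not reproduced by SQLite):

```python
import sqlite3

# Standard SQL delimited identifiers (double quotes) make '#' and '$'
# legal in table and column names.
conn = sqlite3.connect(":memory:")
conn.execute('create table "TAB#15" ("MYCRED#" integer, "MYCRED$" integer)')
conn.execute('insert into "TAB#15" values (1, 2)')

# The count query from the question, with the table name quoted:
cnt = conn.execute('select count(1) as cnt from "TAB#15" as T').fetchone()[0]
print(cnt)  # 1
```

In the PySpark code, the quoted name would be what gets substituted into the {table} placeholder before the query is handed to the JDBC reader.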

Snowflake ODBC to Excel, errors on field names containing spaces

We have a highly complex set of tables, with nested views that eventually feed a series of dashboards on a Tableau server. The base view uses "as" clauses on some data fields to create fields with spaces in the field name (e.g. somefieldname as "Some Field Name"). Later views use the * wildcard to retrieve all values. Tableau is able to handle it.
The problem is now users want to access those final views in Excel.
We set up an ODBC connection on their workstations, and they pull the data from one of the final views. However, the fields with blanks in their names show as errors and come through empty in the resulting worksheet. I'm trying to build a view on that final view and use "as" clauses to remove the spaces from the field names, but I haven't been able to find the proper SQL syntax for the source field. I've tried brackets, but that didn't work.
Would we be better off trying Power BI? Our data management people are just getting started with it; I haven't seen it yet but will be tomorrow.
Thanks in advance for any tips you can provide!
Lou
Creating a view on top of your final view with renamed columns is probably your easiest solution. To select from a column that was created with spaces in its name (more generally, any column created with " around its name), put the column name in double quotes (") when you select from it. Here is an example:
-- Create a sample table. The first column contains spaces and capitals
create or replace table test_table
(
"Column with Spaces" varchar, -- A column created with quotes around it means that you can put anything into the field name. Including spaces & funky characters
col_without_spaces varchar
);
-- Insert some sample data
insert overwrite into test_table
values ('row1 col1', 'row1 col2'),
('row2 col1', 'row2 col2');
-- Create a view that renames columns
create or replace view test_view as
(
select
"Column with Spaces" as col_1, -- But now you have to select it like this since the spaces and capital letters have been treated literally in the create table statement
col_without_spaces as col_2
from test_table
);
-- Select from the view
select * from test_view;
Produces:
+---------+---------+
|COL_1    |COL_2    |
+---------+---------+
|row1 col1|row1 col2|
|row2 col1|row2 col2|
+---------+---------+

When and why are Google Cloud Spanner table and column names case-sensitive?

Spanner documentation says:
Table and column names:
Can be between 1-128 characters long. Must start with an uppercase or lowercase letter.
Can contain uppercase and lowercase letters, numbers, and underscores, but not hyphens.
Are case-insensitive. For example, you cannot create tables named mytable and MyTable in the same database or column names mycolumn and MyColumn in the same table.
https://cloud.google.com/spanner/docs/data-definition-language#table_statements
Given that, I have no idea what this means:
Table names are usually case insensitive, but may be case sensitive
when querying a database that uses case sensitive table names.
https://cloud.google.com/spanner/docs/lexical#case-sensitivity
In fact it seems that table names are case-sensitive, for example:
Queries fail if we don't match the case shown in the UI.
This seems to be an error in the documentation. Table names are case-insensitive in Cloud Spanner. I'll follow up with the docs team.
Edit: Updated docs https://cloud.google.com/spanner/docs/data-definition-language#naming_conventions
I'll add a couple of examples so we can see the difference.
In Example 1 it does not matter, since there is only one table:
Example 1:
SELECT *
FROM Roster
WHERE LastName = @myparam
returns all rows where LastName is equal to the value of query parameter myparam.
But in Example 2 we compare two tables:
SELECT id, name
FROM Table1 except select id, name
FROM Table2
It will give you everything that is in Table1 but not in Table2.
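The case-insensitive resolution of table names can be sketched with Python's sqlite3, which also matches (ASCII) table names case-insensitively; the Roster table and its data are made up for illustration:

```python
import sqlite3

# SQLite, like Spanner's DDL rules describe, resolves table names
# case-insensitively: a table created as Roster is found as ROSTER.
conn = sqlite3.connect(":memory:")
conn.execute("create table Roster (LastName text)")
conn.execute("insert into Roster values ('Smith'), ('Jones')")

rows = conn.execute("select * from ROSTER where LastName = 'Smith'").fetchall()
print(rows)  # [('Smith',)]
```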

Cassandra valid column names

I'm creating an API which will work on either Mongo or Cassandra; for that reason I'm using '_id' as a column name.
This should be a valid name according to the docs:
Keyspace, column, and table names created using CQL can only contain alphanumeric and underscore characters. User-defined data type names and field names, user-defined function names, and user-defined aggregate names created using CQL can only contain alphanumeric and underscore characters. If you enter names for these objects using anything other than alphanumeric characters or underscores, Cassandra will issue an invalid syntax message and fail to create the object.
However, when I run this statement:
CREATE TABLE users(_id: bigint, entires: map<timestamp, text>, PRIMARY KEY(_id));
I return the following error:
Invalid syntax at line 1, char 20
Is it possible to use underscores in column names?
Underscores in column names? Yes. Column names starting with underscores? No.
From the CREATE TABLE documentation:
Valid table names are strings of alphanumeric characters and underscores, which begin with a letter.
You can create a column name starting with an underscore. Use quotes:
CREATE TABLE users("_id" bigint, entires map<timestamp, text>, PRIMARY KEY("_id"));
The column name will be _id.
Although you can, that does not mean you should have such a column: you will need to keep using quotes in every query, which makes it cumbersome:
SELECT "_id" FROM users;
