sqlite3.g4 grammar does not handle left join correctly - antlr4

The SQLite grammar on GitHub (github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4) has an issue with LEFT JOIN.
For this SQL:
select * from t1 left join t2 on t1.owner = t2.email
the word 'left' is parsed as a table_alias. Things go rapidly downhill from there.
I think I can fix it by somehow saying that table_alias is any_name except K_LEFT, K_RIGHT, and K_INNER, but I do not know how to express that in a grammar.
Or maybe there is a better way to fix this.
UPDATE: just to clarify SQLite's behavior (not talking about ANTLR, but about what SQLite itself understands).
This
select username from user left join device on device.owner = user.id limit 2
works.
This
select username from user alias join device on device.owner = user.id limit 2
fails, saying that the column user.id doesn't exist.
Clearly the word 'left' is being recognized as a keyword, not a table alias.
select username from user alias join device on device.owner = alias.id limit 2
Works, and does an inner join.
select username from user alias left join device on device.owner = user.id limit 2
works, and does a left join.
Further update for @Mike Lischke:
select username from user left where device.owner = left.id limit 2
Fails. "Query has failed: near "where": syntax error".

The behavior is correct. LEFT is included in the keyword rule, which in turn is used in any_name, which is valid input for table_alias (this is known as the keywords-as-identifiers problem). Your query is wrong according to the grammar, period. Any word following the table name is a table alias, regardless of whether it is usually a keyword or not.
A workaround could be to add a validating predicate, which checks the upcoming token and fails the table_alias rule if that token is something that should keep its original meaning. You can get a similar result by removing the LEFT keyword from the keyword rule; however, that might have side effects in places where you want LEFT to be valid input for an identifier.
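For illustration, a rough sketch of the predicate variant, assuming the Python target and the rule/token names used in SQLite.g4 (predicate bodies are target-language specific, so this needs adapting for Java, C#, etc.):

table_alias
 : {self._input.LT(1).type not in (SQLiteParser.K_LEFT,
                                   SQLiteParser.K_RIGHT,
                                   SQLiteParser.K_INNER)}? any_name
 ;

With that predicate in place the parser refuses to consume 'left' as an alias, leaving the join-clause alternative free to match it.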

Related

Flexible Search for fetching Email

I am very new to SAP Commerce and have been experimenting with Flexible Search. Some queries work and some don't.
The ones below work fine:
1) SELECT * FROM {User}
2) SELECT {user.pk} FROM {User AS user}
3) SELECT {user.pk}, {ItemType} FROM {User AS user}
However, when I am fetching email or originaluid, it is just not working:
1) SELECT {email} FROM {User AS user}
2) SELECT {originaluid} FROM {User AS user}
I have tried with the p_ prefix as well, which didn't work either:
1) SELECT {p_email} FROM {User AS user}
2) SELECT {p_originaluid} FROM {User AS user}
It's giving the same error for all the non-working ones (with the respective column names, e.g. p_originaluid). Exception message:
cannot search unknown field 'TableField(name='p_email',langPK='null',type=User)' within type User unless you disable checking, infoMap=TypeInfoMap for type = 8796093939794 code = User superType = 8796093841490 itemTable = users UPTable = usersup LTableName = userslp PropsTable = userprops core fields = owner = [owner,OwnerPkString,class de.hybris.platform.util.ItemPropertyValue] modifiedtime = [modifiedtime,modifiedTS,class java.util.Date] itemtype = [itemtype,TypePkString,class de.hybris.platform.util.ItemPropertyValue] creationtime = [creationtime,createdTS,class java.util.Date] pk = [pk,PK,class de.hybris.platform.core.PK] unlocalized fields = consentreference = [consentReference,p_consentreference, class java.lang.String] description = [description,p_description, class java.lang.String]
A bit of help would be great on the questions below:
1) Considering the above scenario, why am I able to fetch all the columns?
2) While doing a flexible search I found the query "SELECT TypePkString FROM {User AS user}". Since TypePkString is a database column, why is it used with SELECT without curly braces?
To answer your 1st question:
The way the type system works in SAP Commerce is this: you have base item types and then you have derived item types. If you check core-items.xml, every type is derived from the generic item type, directly or indirectly.
When you have a sub-item type derived from a parent type, e.g. Employee or B2BCustomer, and you do not define a separate table (a deployment tag) for these types, SAP Commerce uses the base table and adds the additional columns specific to those sub-item types (Employee or B2BCustomer).
This is why you may see many columns even for a specific type, e.g. B2BCustomer.
The best way to find the attributes added to a specific subtype is by looking into the backoffice type system.
In your case, if you look into the type system by going to https://localhost:9002/backoffice/ -> Types -> User -> XML Representation and search for email, you will not find it there.
Now use the same steps and search for email under the type B2BCustomer. You should be able to find it in the XML. But when you look for email in the core-items.xml file, you might not find it there; that's because in the backoffice type system (for B2BCustomer) you will see it is extended from b2bcommerce (Extended tab). Now look in b2bcommerce-items.xml: you should definitely see the item type for B2BCustomer, and it should also have an attribute called email.
The reason for walking you through the above is so that you know how and where to look.
When you use {} in your flexible search, the type system is used for the lookup, and that is exactly why it cannot resolve email: email is a part of B2BCustomer, not User. In your case:
select {email} from {B2BCustomer} should work.
For the 2nd question:
As said before, whenever you write {} in a flexible search, SAP Commerce looks into the type system and tries to resolve the attribute. However, when you do not put curly braces, it looks up the DB columns of the given table directly, as in the queries below:
select p_email from {User}
select p_originaluid FROM {User}
Though p_email is not a part of the User type, as noted before, you can still query it if you fetch from the column directly; the same applies to the second query.
Hope this answers your question.
The User item type hasn't got an email property. It also hasn't got originaluid. Check the existing fields in the backoffice type system, or use hAC to run queries without field names to get all existing fields.

Getting SyntaxException programmatically creating a table with the Cassandra Python driver

Error:
cassandra.protocol.SyntaxException: \
<Error from server: code=2000 [Syntax error in CQL query] \
message="line 1:36 no viable alternative at input '(' \
(CREATE TABLE master_table(dict_keys[(]...)">
Code:
cluster = Cluster(cloud=cloud_config, auth_provider=auth_provider)
session = cluster.connect('firstkey')
ColName = {"qty_dot_url": "int",
           "qty_hyphen_url": "int",
           "qty_underline_url": "int",
           "qty_slash_url": "int"}
columns = ColName.keys()
values = ColName.values()
session.execute('CREATE TABLE master_table({ColName} {dataType}),PRIMARY KEY(qty_dot_url)'.format(ColName=columns, dataType=values))
How do I resolve the above error?
So I replaced the session.execute with a print, and it produced this:
CREATE TABLE master_table(dict_keys(['qty_dot_url', 'qty_hyphen_url', 'qty_underline_url', 'qty_slash_url']) dict_values(['int', 'int', 'int', 'int'])),PRIMARY KEY(qty_dot_url)
That is not valid CQL. It needs to look like this:
CREATE TABLE master_table(qty_dot_url int, qty_hyphen_url int,
qty_underline_url int, qty_slash_url int, PRIMARY KEY(qty_dot_url))
I was able to create that by making these adjustments to your code:
createTableCQL = "CREATE TABLE master_table("
for key, value in ColName.items():
    createTableCQL += key + " " + value + ", "
createTableCQL += "PRIMARY KEY(qty_dot_url))"
You could then follow that with a session.execute(createTableCQL).
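Equivalently, the same string can be built with a join (a sketch, not from the answer above, reusing the ColName dict from the question):

# produces: CREATE TABLE master_table(qty_dot_url int, ..., PRIMARY KEY(qty_dot_url))
createTableCQL = ("CREATE TABLE master_table("
                  + ", ".join(k + " " + v for k, v in ColName.items())
                  + ", PRIMARY KEY(qty_dot_url))")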
Notes:
The PRIMARY KEY definition must be inside the paren list.
Creating schema from inside application code is often problematic, and can create a schema disagreement in the cluster. It's almost always better to create tables outside of code.
The syntax exception is a result of your Python code generating invalid CQL, as Aaron pointed out in his response.
To add to his answer, you need to take additional steps whenever you are programmatically making schema changes. In particular, you need to make sure that you check for schema agreement (i.e. that the schema change has been propagated to all nodes) before moving on to the next bit in your code.
You will need to modify your code to save the result from the schema change, for example:
resultset = session.execute(SimpleStatement("CREATE TABLE ..."))
then call this in your code:
resultset.response_future.is_schema_agreed
You'll need to loop through this check until True is returned. Depending on how long you want to wait (default max_schema_agreement_wait is 10 seconds), you'll need to implement some logic to do [something] when schema agreement is not achieved (because a node is down for example) -- this requires manual intervention from an operator to investigate the cluster.
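A minimal sketch of that flow, reusing the session from the question and the createTableCQL string from the first answer (the RuntimeError is just a placeholder for whatever your failure handling should be):

from cassandra.query import SimpleStatement

# save the result so the schema-agreement status can be inspected
resultset = session.execute(SimpleStatement(createTableCQL))

# the driver has already waited up to max_schema_agreement_wait (10s by default);
# if the nodes still disagree at this point, stop and involve an operator
if not resultset.response_future.is_schema_agreed:
    raise RuntimeError("schema agreement not reached; investigate the cluster")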
As Aaron already said, performing schema changes programmatically is very problematic, and we discourage doing it unless you fully understand the pitfalls and know how to handle failures. Cheers!

Not getting results from IHP DataSync despite setting Row Level Policy

I want to use DataSync on my current application, using IHP 0.16. I believe I have followed all the installation steps in FrontController and Routes.
I have a characters table with a user_id column connected to the users table. I have set the policy on the characters table resulting in this generated SQL:
CREATE POLICY "Users can manage their characters" ON characters USING (user_id = ihp_user_id()) WITH CHECK (user_id = ihp_user_id());
ALTER TABLE characters ENABLE ROW LEVEL SECURITY;
Trying to run this in the JavaScript console
await query("characters").fetch()
I get an error in the JavaScript console, and this error in the IHP output:
Query (2.119753ms): "SELECT relrowsecurity FROM pg_class WHERE oid = ?::regclass" ["characters"]
Query (0.111442ms): "SET LOCAL ROLE ?" [Identifier {fromIdentifier = "ihp_authenticated"}]
Query (0.130888ms): "SET LOCAL rls.ihp_user_id = ?" Only {fromOnly = Just 0d7b46b1-bcb4-46a2-bf77-ad27dace8416}
FormatError {fmtMessage = "1 single '?' characters, but 3 parameters", fmtQuery = "SELECT ? FROM ??", fmtParams = ["*","characters",""]}
This seems to be a different error from the row level security error in the DataSync tutorial in the IHP docs. Any idea what causes it?
This is a known bug in IHP v0.16.0. It's already fixed in master.
It's best to use IHP DataSync with the version mentioned in the introduction text at https://ihp.digitallyinduced.com/Guide/realtime-spas.html :)
By the way, there's a workaround for the bug if you don't want to upgrade: you always need to specify an orderBy, like await query("characters").orderBy('createdAt').fetch()

SQLAlchemy query with conditional filters and results

I'm building a FastAPI app and I have a complicated query that I'm trying to avoid splitting into multiple individual queries whose results I then concatenate.
I have the following tables that all have foreign keys:
CHANGE_LOG: change_id | original (FK ROSTER.shift_id) | new (FK ROSTER.shift_id) | change_type (FK CONFIG_CHANGE_TYPES)
ROSTER: shift_id | shift_type (FK CONFIG_SHIFT_TYPES) | shift_start | shift_end | user_id (FK USERS)
CONFIG_CHANGE_TYPES: change_type_id | change_type_name
CONFIG_SHIFT_TYPES: shift_type_id | shift_type_name
USERS: user_id | user_name
FK= Foreign Key
I need to return the following information:
user_name, change_type_name, plus shift_start, shift_end, and shift_type_name for the shifts whose shift_id matches the original or new in the CHANGE_LOG row.
The catch is that a CHANGE_LOG row might have both original and new, only an original but no new, or only a new but no original. And since the user can select a few options from drop-down boxes before submitting the request, I also need to be able to include a filter to single out:
just one user, or all users
any change_type, or a group of change_types
The issue is that I can't find a way to get the user_name guaranteed for each row without inspecting it afterwards, because I don't know whether new or original exists or is set to null.
Is there a way in SQLAlchemy to have an optional filter in the query where I can say: if original exists, use that to get the user_id, but if not, use new?
Also, if I have a query that definitely finds rows with both original and new shifts, it will never find those with only one of them, as the criteria will never match.
I've also read this and similar ones, and while they'll resolve the issue of conditionally setting some of the filters, they don't get around the issue of part nulls returning nothing at all rather than half the data.
This one seems to solve that problem, but I have no idea how to implement it.
I know it's complicated, so let me know if I've done a poor job of explaining the question.
Sorted. The solution was to use the outerjoin option.
I'm sure the syntax can be more elegant than my solution if I properly engage in adding relationships when defining each class, but what I end up with is explicit and I think it makes it easier to read... at least for me.
Since I'm using a few tables more than once in the same query for different information, it was important to alias those, otherwise I ended up with a conflict (which 'user_id' did you want - it's not clear). For those playing at home, here's my general solution:
# imports used by the snippet below
import pandas as pd
from sqlalchemy import or_
from sqlalchemy.orm import aliased

# ROSTER and CONFIG_SHIFT_TYPES each appear twice, so alias them
new = aliased(ROSTER)
original = aliased(ROSTER)
o_name = aliased(CONFIG_SHIFT_TYPES)
n_name = aliased(CONFIG_SHIFT_TYPES)

pd.read_sql(
    db.query(
        CHANGE_LOG.change_id,
        CHANGE_LOG.created,
        CONFIG_CHANGE_TYPES.change_type_name,
        o_name.shift_type_name.label('original_type'),
        n_name.shift_type_name.label('new_type'),
        USERS.user_name,
    )
    # outer joins keep the rows where original or new is null
    .outerjoin(original, original.shift_id == CHANGE_LOG.original)
    .outerjoin(new, new.shift_id == CHANGE_LOG.new)
    .outerjoin(CONFIG_CHANGE_TYPES,
               CONFIG_CHANGE_TYPES.change_type_id == CHANGE_LOG.change_type)
    .outerjoin(o_name, o_name.shift_type_id == original.shift_type)
    .outerjoin(n_name, n_name.shift_type_id == new.shift_type)
    # the user can be attached to either the original or the new shift
    .outerjoin(USERS, or_(USERS.user_id == original.user_id,
                          USERS.user_id == new.user_id))
    .statement, engine)
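To cover the drop-down filters from the question, the conditions can be layered on before .statement is taken. A hedged sketch, where base_query stands for the query chain above and selected_user_id / selected_change_types are hypothetical request inputs (None or empty meaning 'all'):

# base_query is the db.query(...).outerjoin(...) chain from above,
# kept as a Query object, i.e. before .statement is taken
query = base_query
if selected_user_id is not None:
    # the user can hang off either side of the change
    query = query.filter(or_(original.user_id == selected_user_id,
                             new.user_id == selected_user_id))
if selected_change_types:
    query = query.filter(CHANGE_LOG.change_type.in_(selected_change_types))

df = pd.read_sql(query.statement, engine)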

How to do a like filter using an external table in Spark SQL

I'm trying to write a Spark SQL query to take a set of values from a reference table and use them to filter rows from a main table.
Current SQL looks like:
select time, code, user, message from TABLE where
(code = 'x' and (user like '%USER1%' or user like '%USER2%') and not (message like '%PREFIX1:HOST1%' or message like '%PREFIX1:HOST2%')) or
(code = 'y' and (user like '%USER1%' or user like '%USER2%') and not (message like '%PREFIX2:HOST1%' or message like '%PREFIX2:HOST2%'))
I should be able to change the above to something like:
select time, code, user, message from TABLE where
(code = 'x' and user rlike '(USER1|USER2)' and not message rlike 'PREFIX1:(HOST1|HOST2)') or
(code = 'y' and user rlike '(USER1|USER2)' and not message rlike 'PREFIX2:(HOST1|HOST2)')
But ideally I was hoping to use an external table to avoid having to load the users/hosts into the SQL multiple times and possibly hitting maximum length limits on the SQL (the user set is smallish; the host set consists of several hundred hosts and can be updated), something like:
Table USERS { username: String }
Table HOSTS { hostname: String }
select time, code, user, message from TABLE where
(code = 'x' and user rlike '<ANY username FROM USERS>' and not message rlike 'PREFIX1:<ANY hostname FROM HOSTS>') or
(code = 'y' and user rlike '<ANY username FROM USERS>' and not message rlike 'PREFIX2:<ANY hostname FROM HOSTS>')
I think an exact match for the above would be simple to implement, but because it is a like match it seems a bit harder.
I suspect I could get something working using data frames; however, I would likely have similar limitations, just with different syntax. E.g. the filter would look something like:
df.filter($"message".rlike("PREFIX1:(HOST1|HOST2|HOST3)"))
The other option is to run better parsing logic before I hit this point so I can do an exact match, which could work out easier but also makes it more difficult to represent in a DSL that a customer could use. Ideally, I would like to be able to represent matching one of many values in a field.
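For what it's worth, the like match against a reference table can be expressed as semi/anti joins instead of a single regex. A hedged sketch of the code = 'x' branch in PySpark (assuming an existing SparkSession named spark; table and column names are taken from the question, and the same pattern applies to the 'y' branch):

from pyspark.sql import functions as F

logs = spark.table("TABLE")
users = spark.table("USERS")
hosts = spark.table("HOSTS")

x_rows = logs.filter(logs["code"] == "x")

# keep rows whose user contains any reference username (left semi join)
x_rows = x_rows.join(users, x_rows["user"].contains(users["username"]),
                     "left_semi")

# drop rows whose message contains PREFIX1:<any hostname> (left anti join)
x_rows = x_rows.join(
    hosts,
    x_rows["message"].contains(F.concat(F.lit("PREFIX1:"), hosts["hostname"])),
    "left_anti")

Because the join condition is a contains match rather than an equality, Spark falls back to a (broadcast) nested loop join, which should be fine for reference tables of this size.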
