CQL: comparing two column values - cassandra

I have a TABLE like this:
 id  | expected | current
-----+----------+---------
 123 |       25 |      15
 234 |       26 |      26
 345 |       37 |      37
Now I want to select all ids where current is equal to expected. In SQL I would do something like this:
SELECT id FROM myTable WHERE current = expected;
But in CQL it seems to be invalid. My cqlsh returns this:
no viable alternative at input 'current'
Is there a valid CQL query to achieve this?
Edited
According to the CQL docs it should work, but it doesn't. This is what the doc says:
<selectWhereClause> ::= <relation> ( "AND" <relation> )*
                      | <term> "IN" "(" <term> ( "," <term> )* ")"
<relation>          ::= <term> <relationOperator> <term>
<relationOperator>  ::= "=" | "<" | ">" | "<=" | ">="
<term>              ::= "KEY"
                      | <identifier>
                      | <stringLiteral>
                      | <integer>
                      | <float>
                      | <uuid>
                      ;

I used the wrong docs. I'm using CQL3 and read the docs for CQL2.
The correct doc says:
<where-clause> ::= <relation> ( AND <relation> )*
<relation> ::= <identifier> <op> <term>
             | '(' <identifier> (',' <identifier>)* ')' <op> '(' <term> (',' <term>)* ')'
             | <identifier> IN '(' ( <term> ( ',' <term>)* )? ')'
             | TOKEN '(' <identifier> ( ',' <identifier>)* ')' <op> <term>
So a relation always compares an identifier (a column) with a term (a literal value), never with another column, which means my query is not valid CQL3.
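Since CQL3 cannot express a column-to-column comparison, the usual workaround is to fetch both columns and filter on the client side. Below is a minimal sketch using the DataStax Python driver; the contact point and keyspace name are assumptions for illustration, not part of the original question.

from cassandra.cluster import Cluster

# Hypothetical connection details; adjust to your cluster and keyspace.
cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')

# CQL cannot compare two columns in a WHERE clause, so fetch both and filter in code.
rows = session.execute('SELECT id, expected, current FROM myTable')
matching_ids = [row.id for row in rows if row.current == row.expected]
print(matching_ids)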

Related

Find and Extract value after specific String from a file using bash shell script?

I have a file which contains the details below:
file.txt
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `dv.par_kst`( |
| `col1` string, |
| `col2` string, |
| `col3` int, |
| `col4` int, |
| `col5` string, |
| `col6` float, |
| `col7` int, |
| `col8` string, |
| `col9` string, |
| `col10` int, |
| `col11` int, |
| `col12` string, |
| `col13` float, |
| `col14` string, |
| `col15` string) |
| PARTITIONED BY ( |
| `part_col1` int, |
| `part_col2` int) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' |
| LOCATION |
| 'hdfs://nameservicets1/dv/hdfsdata/par_kst' |
| TBLPROPERTIES ( |
| 'spark.sql.create.version'='2.2 or prior', |
| 'spark.sql.sources.schema.numPartCols'='2', |
| 'spark.sql.sources.schema.numParts'='1', |
| 'spark.sql.sources.schema.part.0'='{"type":"struct","fields":[{"name":"col1","type":"string","nullable":true,"metadata":{}},{"name":"col2","type":"string","nullable":true,"metadata":{}},{"name":"col3","type":"integer","nullable":true,"metadata":{}},{"name":"col4","type":"integer","nullable":true,"metadata":{}},{"name":"col5","type":"string","nullable":true,"metadata":{}},{"name":"col6","type":"float","nullable":true,"metadata":{}},{"name":"col7","type":"integer","nullable":true,"metadata":{}},{"name":"col8","type":"string","nullable":true,"metadata":{}},{"name":"col9","type":"string","nullable":true,"metadata":{}},{"name":"col10","type":"integer","nullable":true,"metadata":{}},{"name":"col11","type":"integer","nullable":true,"metadata":{}},{"name":"col12","type":"string","nullable":true,"metadata":{}},{"name":"col13","type":"float","nullable":true,"metadata":{}},{"name":"col14","type":"string","nullable":true,"metadata":{}},{"name":"col15","type":"string","nullable":true,"metadata":{}},{"name":"part_col1","type":"integer","nullable":true,"metadata":{}},{"name":"part_col2","type":"integer","nullable":true,"metadata":{}}]}', |
| 'spark.sql.sources.schema.partCol.0'='part_col1', |
| 'spark.sql.sources.schema.partCol.1'='part_col2', |
| 'transient_lastDdlTime'='1587487456') |
+----------------------------------------------------+
From the above file I want to extract the PARTITIONED BY details.
Desired output :
part_col1 , part_col2
The PARTITIONED BY columns are not fixed; for some other file there might be 3 or more, so I want to extract all of them.
That is, all the values between PARTITIONED BY and ROW FORMAT SERDE, with the spaces, backticks ("`") and data types removed.
Could you please help me with this?
sed -nr '/PARTITIONED BY/,/ROW FORMAT SERDE/p' a.txt|sed -nr '/`/p'|cut -d '`' -f 2|xargs -n 1 echo -n " "
my $text = do { local $/; <DATA> };
my @partitioned = ();
$text=~s#PARTITIONED BY\s*\(([^\(\)]*)\)# my $fulcontent=$1;
push (@partitioned, $1) while($fulcontent=~m/\`([^\`]+)\`/g);
($fulcontent);
#egs;
print join ", ", @partitioned;
Output:
part_col1, part_col2
When the layout of your result doesn't matter, you can ask sed to consider lines between a start and an end tag, and only print such a line when a field can be found between 2 backquotes.
sed -rn '/PARTITIONED BY/,/ROW FORMAT/s/.*`(.*)`.*/\1/p' file.txt
Combining the results in a line as desired can be done with
printf "%s , " $(sed -rn '/PARTITIONED BY/,/ROW FORMAT/s/.*`(.*)`.*/\1 /p' file.txt) |
sed 's/ , $/\n/'
Small perl script
read whole file into $data variable
select all between PARTITIONED BY (....)
select into array only elements between `
print result joined with ,
use strict;
use warnings;
use feature 'say';
my $data = do { local $/; <> };
my $re = 'PARTITIONED BY \((.*?)\)';
$data =~ /$re/sg;
my @part = $1 =~ /`(.*?)`/sg;
say join ', ', @part;
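Assuming the script above is saved as, say, extract_part_cols.pl (a hypothetical name), it reads the file via the <> diamond operator, so it can be run as:

perl extract_part_cols.pl file.txt
part_col1, part_col2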

Horizontal vs Vertical array delimiters - International

Following up on an earlier question I had about horizontal vs vertical arrays, I have a question about its respective delimiters.
Problem definition:
Here is an example of an incorrect way of comparing two arrays:
{=SUMPRODUCT(--({"Apple","Pear"}={"Apple","Lemon","Pear"}))}
The correct way, in the case of an English application country code, would be:
{=SUMPRODUCT(--({"Apple","Pear"}={"Apple";"Lemon";"Pear"}))}
Within an English version of Excel (and most likely more than just English) these delimiters would respectively be a comma (,) for horizontal arrays and a semicolon (;) for vertical ones. There is plenty of online information to be found on this.
Working on a machine with a Dutch country code on its application, however, is a completely different story. It is frustrating that both my delimiters are different, respectively ; and \. While the semicolon is rather simple to retrieve, it has proven tricky to find any documentation on these delimiters for international versions.
Workaround:
Not knowing these delimiters up front makes it tricky for anyone on a variety of international versions of the application to work with these types of formulas. A rather easy workaround would be to use TRANSPOSE():
{=SUMPRODUCT(--({"Apple";"Pear"}=TRANSPOSE({"Apple";"Lemon";"Pear"})))}
Going through the built-in formula evaluation we can then retrieve the backslash as the column separator. Another way would be to use the Application.International property and its xlColumnSeparator and xlRowSeparator indices, as sketched below.
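For reference, a minimal VBA sketch of that second approach; it is read-only and simply prints the two Application.International values to the Immediate window:

Sub ShowArrayConstantSeparators()
    ' Inspect the delimiters Excel expects in array constants on this machine.
    Debug.Print "Row separator:    " & Application.International(xlRowSeparator)
    Debug.Print "Column separator: " & Application.International(xlColumnSeparator)
End Sub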
Question
We can both find and even override the xlDecimalSeparator and xlThousandsSeparator through Excel (File > Options > Advanced) or VBA (Application.DecimalSeparator = "-"), but where can we find:
A place to actually see which xlRowSeparator and xlColumnSeparator are used within your own application, other than the workarounds I described. I am looking for an interface similar to the thousands and decimal separator settings and/or official MS documentation.
Furthermore (not specifically looking for this), is there:
A place to override them, just like the decimal and thousands separators
If not through Excel interfaces, can we brute-force this somehow through VBA?
I'm very curious if official documentation is present, and/or if the above can be done.
Not claiming this is the right answer, but with the help of comments from other users, maybe the below can clarify things a bit:
With no sign of any official documentation on this matter, and seemingly random row and column delimiters, @GSerg showed a trick to retrieve information for any LCID using the unique IDs on MS Office support under "Create one-dimensional and two-dimensional constants". While this is MS Office support information, the delimiters you see there are FALSE. They might come up as a period, a comma, a semicolon, a colon, a backslash, or even a pipe. You get these results by changing the LCID in the URL to an LCID of interest, e.g. fr-fr.
Although there are about 600 different LCIDs, they all get redirected to a default LCID. With the help of @FlorentB. we discovered that not only is the MS Office support documentation wrong, it also seems that these delimiters are not that random after all. Countries using a decimal point use the , as a column delimiter (a horizontal array) and a ; as a row delimiter (a vertical array). Countries using a decimal comma, however, use a \ as a column delimiter and likewise a ; for rows.
Changing the system country settings, checking all default LCID's in Excel, we ended up with the matrix below showing all row and column delimiters per default LCID:
| LCID | Row | Column |
|-------|-----|--------|
| ar-sa | ; | , |
| bg-bg | ; | \ |
| cs-cz | ; | \ |
| da-dk | ; | \ |
| de-de | ; | \ |
| el-gr | ; | \ |
| en-gb | ; | , |
| en-ie | ; | , |
| en-us | ; | , |
| es-es | ; | \ |
| et-ee | ; | \ |
| fi-fi | ; | \ |
| fr-fr | ; | \ |
| he-il | ; | , |
| hr-hr | ; | \ |
| hu-hu | ; | \ |
| id-id | ; | \ |
| it-it | ; | \ |
| ja-jp | ; | , |
| ko-kr | ; | , |
| lt-lt | ; | \ |
| lv-lv | ; | \ |
| nb-no | ; | \ |
| nl-nl | ; | \ |
| pl-pl | ; | \ |
| pt-br | ; | \ |
| pt-pt | ; | \ |
| ro-ro | ; | \ |
| ru-ru | ; | \ |
| sk-sk | ; | \ |
| sl-si | ; | \ |
| sv-se | ; | \ |
| th-th | ; | , |
| tr-tr | ; | \ |
| uk-ua | ; | \ |
| vi-vn | ; | \ |
| zh-cn | ; | , |
| zh-hk | ; | , |
| zh-tw | ; | , |
The apparent conclusion is that all countries use a semicolon as a row (vertical) delimiter, and that, depending on the decimal separator, countries use either a backslash or a comma as a column (horizontal) delimiter within array formulas.
So even without proper MS documentation, or a place within the Excel interface (like the thousands and decimal delimiters do have), it is apparent that knowing your country's decimal separator automatically tells you whether you use a \ or a , as a column delimiter.
| Dec_Seperator | Row | Column |
|---------------|-----|--------|
| . | ; | , |
| , | ; | \ |
I would happily receive more information about the above and/or the presence of any correct MS Office documentation to add to this.
It is possible to do this through native Excel (without VBA or add-ins) by querying Excel's C API, but I don't know of anywhere this is documented.
Go into Excel's Name Manager and click 'New...'. Enter a name such as GetColumnSeparator.
In the 'RefersTo:' box, enter the following to get the column separator:
=INDEX(GET.WORKSPACE(37), 14)
In an Excel cell, you can now enter this:
=GetColumnSeparator
and the comma (in English - or whatever symbol is in use on your machine) will be shown.
For the row separator you need to change the index number to 15:
=INDEX(GET.WORKSPACE(37), 15)
On an English machine, this will be the semicolon by default.
On machines where Excel's 'display language' is not English (meaning Excel's function names are translated), you will need a translated version of the above formula. Again I don't know of any documentation on this, so my best suggestion would be to install the English language pack, enter the formula in English, save the workbook, then revert to your original Excel language and re-open the workbook; Excel will translate the formula automatically.
Note that you will need to save the workbook as macro-enabled (e.g. .xlsm rather than .xlsx).
Sorry this is nearly three years late but I hope it helps.

Using division operator in app insight analytics

I have the following App Insights query:
let someResult=
customEvents | where name in ('SomeAction')
| parse customDimensions.someId with someId
| parse customDimensions.sometaskId with someTaskId
| parse user_AuthenticatedId with user
| summarize max(timestamp) by user, someId , someTaskId
| join (
customEvents | where name in ('someAction')
| parse customDimensions.action with someAction
| parse customDimensions.someId with someId
| project someAction,someId
) on someId
| join (
customEvents
| where name in ('someResult')
| parse customDimensions.someId with someId
| parse customDimensions.someIdsWithSomething with sometaskIds
| parse array_length(split(customDimensions.someIdsWithSomething ,',')) with someTaskCount
| distinct someId , sometaskIds,someTaskCount
| where sometaskIds<> ''
) on someId
| summarize sumif(todouble(someTaskCount),someAction=="accept")/sum(todouble(someTaskCount));
How can I divide someResult by something here? For example, I want the final result to be someResult/10. Thank you for the help.
Try this:
let someResult=
customEvents | where name in ('SomeAction')
| parse customDimensions.someId with someId
| parse customDimensions.sometaskId with someTaskId
| parse user_AuthenticatedId with user
| summarize max(timestamp) by user, someId , someTaskId
| join (
customEvents | where name in ('someAction')
| parse customDimensions.action with someAction
| parse customDimensions.someId with someId
| project someAction,someId
) on someId
| join (
customEvents
| where name in ('someResult')
| parse customDimensions.someId with someId
| parse customDimensions.someIdsWithSomething with sometaskIds
| parse array_length(split(customDimensions.someIdsWithSomething ,',')) with someTaskCount
| distinct someId , sometaskIds,someTaskCount
| where sometaskIds<> ''
) on someId
| summarize summarized = sumif(todouble(someTaskCount),someAction=="accept")/sum(todouble(someTaskCount));
someResult
| project summarized / 10
I could not test it since I do not have those custom dimensions, but it is based on this working/tested example:
let someResult = requests
| summarize summarized = count();
someResult
| project summarized / 10
I hope this helps someone, as this SO question shows up as the top result but doesn't answer the question. I determined something like this from this answer:
let authRequests = requests | where operation_Name contains "Auth";
let countNon500s = toscalar(authRequests | where resultCode !startswith "5" | count);
let countAll = toscalar(authRequests | count);
let percent = 100*todouble(todouble(countNon500s) / todouble(countAll));
print todecimal(percent)
The three important parts are:
you need to convert the values to scalar values (otherwise they are tables).
you need to convert the counts to doubles before dividing (otherwise the integer division yields 1 or 0)
you need to use 'print' to output to a tabular form (or extend a column)
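Applied back to the original question, a minimal (untested) sketch that divides a let-bound result by 10 using the same toscalar/todouble/print pattern would be:

let someResult = toscalar(requests | count);
print divided = todouble(someResult) / 10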

antlr4: Grammar ambiguity, left-recursion, both?

My grammar, shown below, does not compile. The returned error (from the antlr4 maven plugin) is:
[INFO] --- antlr4-maven-plugin:4.3:antlr4 (default-cli) @ beebell ---
[INFO] ANTLR 4: Processing source directory /Users/kodecharlie/workspace/beebell/src/main/antlr4
[INFO] Processing grammar: DateRange.g4
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
[ERROR] error(20): internal error: Rule HOUR undefined
[ERROR] error(20): internal error: Rule MINUTE undefined
[ERROR] error(20): internal error: Rule SECOND undefined
[ERROR] error(20): internal error: Rule HOUR undefined
[ERROR] error(20): internal error: Rule MINUTE undefined
I can see how the grammar might be ambiguous -- e.g., whether two digits are a MINUTE, SECOND, or HOUR (or maybe the start of a year). But a few articles suggest this error results from left-recursion.
Can you tell what's going on?
Thanks. Here's the grammar:
grammar DateRange;
range : startDate (THRU endDate)? | 'Every' LONG_DAY 'from' startDate THRU endDate ;
startDate : dateTime ;
endDate : dateTime ;
dateTime : GMTOFF | SHRT_MDY | YYYYMMDD | (WEEK_DAY)? LONG_MDY ;
// Dates.
GMTOFF : YYYYMMDD 'T' HOUR ':' MINUTE ':' SECOND ('-'|'+') HOUR ':' MINUTE ;
YYYYMMDD : YEAR '-' MOY '-' DOM ;
SHRT_MDY : MOY ('/' | '-') DOM ('/' | '-') YEAR ;
LONG_MDY : (SHRT_MNTH '.'? | LONG_MNTH) WS DOM ','? (WS YEAR (','? WS TIMESPAN)? | WS startTime)? ;
YEAR : DIGIT DIGIT DIGIT DIGIT ; // year
MOY : (DIGIT | DIGIT DIGIT) ; // month of year.
DOM : (DIGIT | DIGIT DIGIT) ; // day of month.
TIMESPAN : startTime (WS THRU WS endTime)? ;
// Time-of-day.
startTime : TOD ;
endTime : TOD ;
TOD : NOON | HOUR2 (':' MINUTE)? WS? MERIDIAN ;
NOON : 'noon' ;
HOUR2 : (DIGIT | DIGIT DIGIT) ;
MERIDIAN : 'AM' | 'am' | 'PM' | 'pm' ;
// 24-hour clock. Sanity-check range in listener.
HOUR : DIGIT DIGIT ;
MINUTE : DIGIT DIGIT ;
SECOND : DIGIT DIGIT ;
// Range verb.
THRU : WS ('-'|'to') WS -> skip ;
// Weekdays.
WEEK_DAY : (SHRT_DAY | LONG_DAY) ','? WS ;
SHRT_DAY : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' -> skip ;
LONG_DAY : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' -> skip ;
// Months.
SHRT_MNTH : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
LONG_MNTH : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;
DIGIT : [0-9] ;
WS : [ \t\r\n]+ -> skip ;
I resolved this issue by setting up a unique production rule for each sequence of digits (of length 1, 2, 3, or 4). As well, I simplified several rules -- in effect, trying to make the production rule alternatives more straightforward. Anyway, here is the final result, which does compile:
grammar DateRange;
range : 'Every' WS longDay WS 'from' WS startDate THRU endDate
      | startDate THRU endDate
      | startDate
      ;
startDate : dateTime ;
endDate : dateTime ;
dateTime : utc
         | shrtMdy
         | yyyymmdd
         | longMdy
         | weekDay ','? WS longMdy
         ;
// Dates.
utc : yyyymmdd 'T' hour ':' minute ':' second ('-'|'+') hour ':' minute ;
yyyymmdd : year '-' moy '-' dom ;
shrtMdy : moy ('/' | '-') dom ('/' | '-') year ;
longMdy : longMonth WS dom ','? optYearAndOrTime?
        | shrtMonth '.'? WS dom ','? optYearAndOrTime?
        ;
optYearAndOrTime : WS year ','? WS timespan
                 | WS year
                 | WS timespan
                 ;
fragment DIGIT : [0-9] ;
ONE_DIGIT : DIGIT ;
TWO_DIGITS : DIGIT ONE_DIGIT ;
THREE_DIGITS : DIGIT TWO_DIGITS ;
FOUR_DIGITS : DIGIT THREE_DIGITS ;
year : FOUR_DIGITS ; // year
moy : ONE_DIGIT | TWO_DIGITS ; // month of year.
dom : ONE_DIGIT | TWO_DIGITS ; // day of month.
timespan : (tod THRU tod) | tod ;
// Time-of-day.
tod : noon | (hour2 (':' minute)? WS? meridian?) ;
noon : 'noon' ;
hour2 : ONE_DIGIT | TWO_DIGITS ;
meridian : ('AM' | 'am' | 'PM' | 'pm' | 'a.m.' | 'p.m.') ;
// 24-hour clock. Sanity-check range in listener.
hour : TWO_DIGITS ;
minute : TWO_DIGITS ;
second : TWO_DIGITS ; // we do not use seconds.
// Range verb.
THRU : WS? ('-'|'–'|'to') WS? ;
// Weekdays.
weekDay : shrtDay | longDay ;
shrtDay : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' ;
longDay : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' ;
// Months.
shrtMonth : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
longMonth : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;
WS : ~[a-zA-Z0-9,.:]+ ;
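To sanity-check the fixed grammar outside Maven, one option is the standard ANTLR 4 TestRig. The antlr4/grun commands below are the aliases from the ANTLR getting-started docs, and the sample input is only an illustration:

antlr4 DateRange.g4          # alias for: java -jar antlr-4.x-complete.jar
javac DateRange*.java
echo "January 5, 2015 to February 9, 2015" | grun DateRange range -tree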

Can I escape the pipe in specflow (or gherkin)

I've got a SpecFlow step table in which I want the | (pipe) character as part of the content.
Example:
Then the data should be
| Field | Value |
| SomeField | a|b|c |
But this doesn't work. How can I escape the pipe character?
Bah. I can't believe I didn't find this earlier. You CAN escape a pipe with the backslash, but the specflow syntax highlighter gets confused by it.
Then the data should be
| Field | Value |
| SomeField | a\|b\|c |
