I'm new to ANTLR and I'm trying to write a grammar to selectively tokenize a string. I'd really appreciate any help/pointers on where to look and what approach to take.
For example, the string "disabled" appears in the output of a device in various places:
section1 {
    property1 disabled
}
section2 {
    disabled
}
section3 {
    property2 disabled
}
The grammar:
section2
    : 'section2' '{'
      'disabled' a_disabled=NL
      '}'
    ;
Because 'disabled' is defined as a literal token, every occurrence of the string gets tokenized, so "" ends up assigned to property1 and property2, whereas the intent is to tokenize "disabled" only inside section2 and assign it to a_disabled.
The expected json output would be:
{
    "section1": {
        "property1": "disabled"
    },
    "section2": {
        "disabled": "true"
    },
    "section3": {
        "property2": "disabled"
    }
}
I have the code written to correctly assign section2's disabled to "true", but the property1 and property2 values get assigned "" because of this:
{
    "section1": {
        "property1": ""
    },
    "section2": {
        "disabled": "true"
    },
    "section3": {
        "property2": ""
    }
}
ANTLR debug output shows that all occurrences of "disabled" are being tokenized.
What would be the best way to accomplish this? Having gone through the documentation, it appears that lexer modes or semantic predicates could work. We are using ANTLR 4.7 and Go.
I'm not quite sure what you are trying to achieve from the description, and it's also not clear how you want to 'selectively tokenize' a string, but how about this grammar:
section : ID '{' ID? 'disabled' '}' ;
WS : [ \n\u000D] -> skip ;
ID : [a-zA-Z] [a-zA-Z0-9]* ;
And then doing the rest as operations on the parse trees? If you provide more information, I will update the answer.
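One way to make that concrete (a sketch, not a tested grammar; rule names like file and line are made up here): keep 'disabled' out of the lexer entirely, so it lexes as a plain ID, and decide what it means per section while walking the parse tree:

```antlr
file    : section+ EOF ;
section : name=ID '{' line* '}' ;
// "property1 disabled" and a bare "disabled" both match this rule,
// so the lexer never needs a dedicated 'disabled' token.
line    : key=ID value=ID? ;
```

A listener can then check: when name is section2 and key is disabled, emit "disabled": "true"; otherwise treat key/value as an ordinary property pair.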
I have an issue where I want to pass a list of vpc_ids to aws_route53_zone, getting the ids from a couple of module calls and iterating over them from the state file.
The output format I am using is:
output "development_vpc_id" {
  value       = [for vpc in values(module.layout)[*] : vpc.id if vpc.environment == "development"]
  description = "VPC id for development env"
}
where I get the output like:
"development_vpc_id": {
  "value": [
    "xxxx"
  ],
  "type": [
    "tuple",
    [
      "string"
    ]
  ]
},
Instead, I want to achieve this:
"development_vpc_id": {
  "value": "xxx",
  "type": "string"
},
Can someone please help me with this?
There isn't any automatic way to "convert" a sequence of strings into a single string, because you need to decide how you want to represent the multiple separate strings once you've reduced it into only a single string.
One solution would be to apply JSON encoding so that your output value is a string containing JSON array syntax:
output "development_vpc_id" {
  value = jsonencode([
    for vpc in values(module.layout)[*] : vpc.id
    if vpc.environment == "development"
  ])
}
Another possibility is to concatenate all of the strings together with a particular character as a marker to separate each one, such as a comma:
output "development_vpc_id" {
  value = join(",", [
    for vpc in values(module.layout)[*] : vpc.id
    if vpc.environment == "development"
  ])
}
If you expect that this list will always contain exactly one item -- that is, if each of your objects has a unique environment value -- then you could also tell Terraform about that assumption using the one function:
output "development_vpc_id" {
  value = one([
    for vpc in values(module.layout)[*] : vpc.id
    if vpc.environment == "development"
  ])
}
In this case, Terraform will either return the one element of this sequence or will raise an error saying there are too many items in the sequence. The one function therefore acts as an assertion to help you detect if there's a bug which causes there to be more than one item in this list, rather than just silently discarding some of the items.
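With the one(...) version, the value recorded for the output then has exactly the shape you asked for (using the placeholder value from the question):

```json
"development_vpc_id": {
  "value": "xxxx",
  "type": "string"
},
```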
When writing a Spring Cloud Contract in Groovy,
I want to specify an explicit JSON path expression.
The expression:
"$.['variants'][*][?(#.['name'] == 'product_0004' && #.['selected'] == true)]"
shall appear in the generated json, like so:
{
  "request" : {
    "bodyPatterns": [ {
      "matchesJsonPath": "$.['variants'][*][?(#.['name'] == 'product_0004' && #.['selected'] == true)]"
    } ]
  }
}
in order to match e.g.:
{ "variants": [
{ "name": "product_0003", "selected": false },
{ "name": "product_0004", "selected": true },
{ "name": "product_0005", "selected": false } ]
}
and to not match e.g.:
{ "variants": [
{ "name": "product_0003", "selected": false },
{ "name": "product_0004", "selected": false },
{ "name": "product_0005", "selected": true } ]
}
Is this possible using consumers, bodyMatchers, or some other facility of the Groovy DSL?
There are some possibilities for matching on JSON path, but you wouldn't typically use it to match explicit values; rather, you'd use regexes to make a flexible stub for the consumer.
The body section is your static request body with hardcoded values, while the bodyMatchers section lets you make the stub matching on the consumer side more flexible.
Contract.make {
    request {
        method 'POST'
        url '/some-url'
        body([
            id: 'id',
            items: [
                [
                    foo: 'foo',
                    bar: 'bar'
                ],
                [
                    foo: 'foo',
                    bar: 'bar'
                ]
            ]
        ])
        bodyMatchers {
            jsonPath('$.id', byEquality()) //1
            jsonPath('$.items[*].foo', byRegex('(?:^|\\W)foo(?:$|\\W)')) //2
            jsonPath('$.items[*].bar', byRegex(nonBlank())) //3
        }
        headers {
            contentType(applicationJson())
        }
    }
    response {
        status 200
    }
}
I referenced some lines:
1: byEquality() in the bodyMatchers section means: the input from the consumer must be equal to the value provided in the body for this contract/stub to match; in other words, it must be "id".
2: I'm not sure how nicely the //1 approach works when the property is in a list and you want the stub to be flexible about the number of items provided. Therefore I also included this byRegex, which basically means: for any item in the list, the property foo must have exactly the value "foo". However, I don't really know why you would want to do this.
3: This is where bodyMatchers are actually most useful. This line means: match this contract if every property bar in the list of items is a non-blank string. This allows you to have a dynamic stub with a flexible size of lists/arrays.
All the conditions in bodyMatchers need to be met for the stub to match.
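To make the three matcher lines concrete, here is a plain JavaScript sketch (my own code, not Spring Cloud Contract) that applies the same three checks to a candidate request body by hand:

```javascript
// Hand-rolled versions of the three bodyMatchers checks above.
function matchesContract(body) {
  var fooRe = /(?:^|\W)foo(?:$|\W)/;            // 2: byRegex('(?:^|\\W)foo(?:$|\\W)')
  return body.id === "id"                        // 1: byEquality() against the body value
    && body.items.every(function (it) { return fooRe.test(it.foo); })
    && body.items.every(function (it) {          // 3: nonBlank()
         return typeof it.bar === "string" && it.bar.trim().length > 0;
       });
}

console.log(matchesContract({ id: "id", items: [{ foo: "foo", bar: "x" }] })); // true
console.log(matchesContract({ id: "id", items: [{ foo: "foo", bar: "  " }] })); // false
```

Note how check 3 rejects the second body: a whitespace-only bar is not "non-blank", even though it is present.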
In my logic app, I have a JSON object (parsed from an API response) and it contains an object array.
How can I find a specific element based on attribute values? Example below, where I want to find the (first) active one:
{
  "MyList" : [
    {
      "Descrip" : "This is the first item",
      "IsActive" : "N"
    },
    {
      "Descrip" : "This is the second item",
      "IsActive" : "N"
    },
    {
      "Descrip" : "This is the third item",
      "IsActive" : "Y"
    }
  ]
}
Well... the answer is in plain sight: there's a Filter array action, which works on a JSON object (from the Parse JSON action). Couple this with a first() expression and you get the desired outcome.
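In code terms, the Filter array condition plus first() does the equivalent of this plain JavaScript sketch (the sketch is mine, not a Logic Apps artifact; in the designer the filter condition would be along the lines of @equals(item()?['IsActive'], 'Y') and the result @first(body('Filter_array')), where Filter_array is whatever your action is named):

```javascript
// Equivalent of the Filter array action plus a first() expression:
// keep only active items, then take the first match.
function firstActive(parsed) {
  var matches = parsed.MyList.filter(function (item) {
    return item.IsActive === "Y";
  });
  return matches[0]; // undefined when nothing is active
}
```

With the sample payload from the question, this returns the third item.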
You can use the Parse JSON action to parse your JSON and a Condition to filter on the IsActive attribute:
Use the following Schema to parse the JSON:
{
  "type": "object",
  "properties": {
    "MyList": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "Descrip": {
            "type": "string"
          },
          "IsActive": {
            "type": "string"
          }
        },
        "required": [
          "Descrip",
          "IsActive"
        ]
      }
    }
  }
}
Here is how it looks (I included the sample data you provided to test it):
Then you can add the Condition:
And perform whatever action you want within the If true section.
I'm trying to build a predictive drop-down search. How can I make the search always match from left to right?
For example, with the documents "I_kimchy park" and "park": if I search for "par", I want to get only "park" back, but here I am getting both documents. How do I treat whitespace as a character?
POST /test1
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "standard", "lowercase", "stop", "kstem", "edgeNgram", "whitespace" ]
        }
      },
      "filter": {
        "ngram": {
          "type": "edgeNgram",
          "min_gram": 2,
          "max_gram": 15,
          "token_chars": [ "letter", "digit" ]
        }
      }
    }
  }
}
PUT /test1/tweet/_mapping
{
  "tweet": {
    "properties": {
      "user": {
        "type": "string",
        "index_analyzer": "autocomplete",
        "search_analyzer": "autocomplete"
      }
    }
  }
}
POST /test1/tweet/1
{"user" : "I_kimchy park"}
POST /test1/tweet/3
{ "user" : "park"}
GET /test1/tweet/_search
{
  "query": {
    "match_phrase_prefix": {
      "user": "park"
    }
  }
}
That happens because the standard tokenizer splits your user field on whitespace. You can use the keyword tokenizer to treat the whole string as a single value (a single token).
Please keep in mind that this change may affect other functionality that uses this field. You may need to add a dedicated, untokenized user field for this purpose.
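As a sketch of what that could look like (same index and analyzer names as in the question; note that here the custom edge-ngram filter is actually referenced by the analyzer, which the original settings never did):

```json
POST /test1
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "lowercase", "autocomplete_edge" ]
        }
      },
      "filter": {
        "autocomplete_edge": {
          "type": "edgeNgram",
          "min_gram": 2,
          "max_gram": 15
        }
      }
    }
  }
}
```

With the keyword tokenizer, "I_kimchy park" stays a single token, so edge n-grams are generated only from the start of the whole string and "par" can no longer match the "park" in the middle of it.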
Suppose that I want to customize the indentation rules of the Futon document editor; where and how can I do that?
I'll elaborate.
The Futon editor lays out documents like this:
(which to my taste is completely annoying)
{
   "_id": "1326017821636",
   "_rev": "2-51ab614953437181a24f1c073fbc6201",
   "doc_type": 0,
   "step": 2,
   "data": {
      "map1": {
         "attr1": 73031,
         "attr2": "strval"
      },
      "map2": {
         "att1": 52001,
         "att2": "strval"
      },
      "mapmap": {
         "map": {
            "id11": {
               "id": "id11",
               "attr": "attr",
               "attr2": 2222
            },
            "id1211": {
               "id": "id1211",
               "attr": "attr",
               "attr2": 2222
            }
         }
      }
   }
}
And what would I want to change, you may ask? It seems pretty standard.
Well, I'm not a standard person. From my observations, many standards evolved arbitrarily and suffer from a lack of thought. Besides, if I were a standard-follower I would not be asking about customization ;)
Shortly:
- 3-space tab indent. Why 3? Not 2 and not 4. Just 3? LOL
- block formation - opening a block breaks to a new line in the wrong place
- commas are on the wrong side
So I want it to be like this:
(and I even have the JS code that does it, I just need help with where to put it)
{ "_id" : "1326017821636"
, "_rev" : "2-51ab614953437181a24f1c073fbc6201"
, "doc_type" : 0
, "step" : 2
, "data" :
  { "map1" :
    { "attr1" : 73031
    , "attr2" : "strval"
    }
  , "map2" :
    { "att1" : 52001
    , "att2" : "strval"
    }
  , "mapmap" :
    { "map" :
      { "id11" :
        { "id" : "id11"
        , "attr" : "attr"
        , "attr2" : 2222
        }
      , "id1211" :
        { "id" : "id1211"
        , "attr" : "attr"
        , "attr2" : 2222
        }
      }
    }
  }
}
Why do I do it this way?
- It looks more tabular. All syntax scaffolding of the same object/array sits in the same column.
(who put the comma on the wrong side of the statement anyway?)
- No redundant wasted empty lines.
- Only the start of a block is an edge case (as opposed to the other way, where you have a case for beginning a block, a case for ending a block, and a case for every line).
It would have been fine if I could do my own indentation and Futon would not ruin it every time it validates the document. But, since it does, I need to get into this mechanism and replace its indenter with one of my own.
Any directions?
P.S:
If you know the answer here - you might know the answer to this question:
couchdb futon document editor - can I customize the document validation part?
Again, after a quick browse, this is where you might want to look:
https://github.com/apache/couchdb/blob/master/share/www/script/futon.browse.js#L899
You will have a corresponding /share/www/script folder on your local couchdb instance if you want to play around editing it live.
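For reference, a formatter producing the requested leading-comma style could look like this minimal sketch (my own code, not Futon's; it uses two-space steps, adjust to taste). The idea would be to call something like this from the spot in futon.browse.js linked above, in place of the stock pretty-printer:

```javascript
// Pretty-print a JSON value in the leading-comma style: the opening
// delimiter and every comma of one object/array share a column, and a
// nested non-empty value moves to its own line one step deeper.
function formatLeadingComma(value, pad) {
  pad = pad || "";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  var isArr = Array.isArray(value);
  var keys = isArr ? value.map(function (_, i) { return i; }) : Object.keys(value);
  if (keys.length === 0) return isArr ? "[]" : "{}";
  var lines = keys.map(function (k, i) {
    var lead = pad + (i === 0 ? (isArr ? "[ " : "{ ") : ", ");
    var label = isArr ? "" : JSON.stringify(String(k)) + " :";
    var v = value[k];
    var nested = v !== null && typeof v === "object" &&
        (Array.isArray(v) ? v.length > 0 : Object.keys(v).length > 0);
    if (nested) {
      // Non-empty object/array: label on its own line, value one step deeper.
      return (lead + label).replace(/\s+$/, "") + "\n" + formatLeadingComma(v, pad + "  ");
    }
    return lead + label + (isArr ? "" : " ") + JSON.stringify(v);
  });
  return lines.join("\n") + "\n" + pad + (isArr ? "]" : "}");
}
```

The output stays valid JSON (JSON.parse reads it back unchanged), with every comma leading its line rather than trailing it.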