In Superset, how do I exclude results where my field contains {} values? - presto

In Superset I have a NullType field, unappliedcodes, where some values look like {"BLAH":"Some info"} and others look like {}.
My goal is to exclude the {} entries. I have tried WHERE NOT unappliedcodes = '{}' and it does not work.
How do I exclude {} from my results?
Note: our data source is PrestoDB.

You need to use the clause:
WHERE NOT cardinality(unappliedcodes) = 0
per the PrestoDB docs.
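As a minimal sketch (events is a hypothetical table name; this assumes the column is typed as a Presto MAP):
-- hypothetical table; unappliedcodes assumed to be MAP-typed
SELECT *
FROM events
WHERE NOT cardinality(unappliedcodes) = 0

-- if the column is actually stored as a JSON string, parse it first:
SELECT *
FROM events
WHERE cardinality(CAST(json_parse(unappliedcodes) AS MAP(VARCHAR, VARCHAR))) > 0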

Related

Filter the list generated by a Gremlin traversal and Groovy

I'm doing the following traversal:
g.V().has('Transfer','eventName','Airdrop').as('t1').
outE('sent_to').
inV().dedup().as('a2').
inE('sent_from').
outV().as('t2').
where('t1',eq('t2')).by('address').
outE('sent_to').
inV().as('a3').
select('a3','a2').
by('accountId').toList().groupBy { it.a3 }.collectEntries { [(it.key): [a2 : it.value.a2]]};
So as you can see I'm basically doing a traversal, and at the end I'm using Groovy with collectEntries to aggregate the results the way I need them, aggregated by a3 in this case. The results look like this:
==>0xfe43502662ce2adf86d9d49f25a27d65c70a709d={a2=[0x99feb505a8ed9976cf19e757a9536117e6cdc5ba, 0x22019ad32ea3adabae68003bdefd099d7e5e3886]}
(This is GOOD, because the number of values in a2 is at least 2)
==>0x129e0131ea3cc16fe5252d7280bd1258f629f20f={a2=[0xf7958fad496d15cf9fd9e54c0012504f4fdb96ff]}
(This is NOT GOOD, I want to return in my list only those combinations where there are at least 2 values for a2)
I have tried using filters and an additional where step in the traversal itself, but I haven't been able to make it work. I'm not sure if this is something I could do without the Groovy in my last line. Any help or orientation would be very much appreciated.
I don't think you need to drop into Groovy to get the answer you want. It would be preferable to do this all in Gremlin, especially since you intend to filter results, which could yield some performance benefit. Gremlin has its own group() step as well as methods for filtering the resulting Map:
g.V().has('Transfer','eventName','Airdrop').as('t1').
out('sent_to').
dedup().as('a2').
in('sent_from').as('t2').
where('t1',eq('t2')).by('address').
out('sent_to').as('a3').
select('a3','a2').
by('accountId').
group().
by('a3').
by('a2').
unfold().
where(select(values).limit(local,2).count(local).is(gte(2)))
The idea is to build your Map with group(), then deconstruct it to entries with unfold(). You then filter each entry with where() by selecting the values of the entry, which is a List of "a2" values, and counting the items locally in that List. I use limit(local,2) to avoid unnecessary iteration beyond 2, since the filter is gte(2).
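As a hedged illustration, the same group()/unfold()/where() pattern on TinkerPop's toy "modern" graph (labels and names come from that sample graph, not from your data):
// group vertex names by label, then keep only entries with at least 3 values
g.V().group().
  by(label).
  by('name').
  unfold().
  where(select(values).count(local).is(gte(3)))
// keeps the "person" entry (four names) and drops "software" (two)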
The easiest way to do this is with findAll { }.
.groupBy { it.a3 }
.findAll { it.value.a2.size() > 1 }
.collectEntries { [(it.key): [a2: it.value.a2]] }
If some a2 values are null, value.a2 also evaluates to null and those entries are filtered out without the need for explicit null checks.
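For reference, a self-contained Groovy sketch of that filtering, with made-up rows shaped like the traversal output:
def rows = [
    [a3: '0xAAA', a2: '0x111'],
    [a3: '0xAAA', a2: '0x222'],
    [a3: '0xBBB', a2: '0x333']
]
def result = rows.groupBy { it.a3 }
                 .findAll { it.value.a2.size() > 1 }   // keep groups with at least 2 a2 values
                 .collectEntries { [(it.key): [a2: it.value.a2]] }
assert result == ['0xAAA': [a2: ['0x111', '0x222']]]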

Splunk: Extracting values for table

I have events in my logs that look like
{
linesPerSec: 1694.67
message: Status:
rowCount: 35600000
severity: info
}
when i make a search like:
index="apps" app="my-api" message="*Status:*" | table _time, linesPerSec, rowCount
This is what my table ends up looking like: the number values stay attached to their keys (e.g. linesPerSec: 1694.67) instead of appearing as plain numbers.
How do I get the number value away from the key for both linesPerSec and rowCount? I want to see all instances. I tried using values(linesPerSec) but that seemed to aggregate only unique values.
Thanks,
Nate
Answer with explanation can be found here: https://answers.splunk.com/answers/756524/extracting-values-for-table.html
Is that your complete query? You mention using values(), but there's no stats command in your search. BTW, values() displays unique values; use list() to see all of them.
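For illustration, a sketch of the stats variant with the fields from your search (assuming they are already extracted); list() keeps every occurrence (up to 100 values per field), while values() dedupes and sorts:
index="apps" app="my-api" message="*Status:*"
| stats list(linesPerSec) AS linesPerSec list(rowCount) AS rowCount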
You may be able to use extract to get the numbers, but I think rex will work better. Try this search:
index="apps" app="my-api" message="*Status:*"
| rex "linesPerSec:\s+(?<linesPerSec>\d+\.\d+)"
| rex "rowCount:\s+(?<rowCount>\d+)"
| table _time, linesPerSec, rowCount

How to search for a specific dynamic pattern of a field in MongoDB?

I need to search a MongoDB collection for a field matching a specific pattern. I tried using {$exists:true} for my field; however, this gives results only if you provide the exact field name, not a pattern.
{
  "field1": "value1",
  "field2": "value2",
  "field3": {
    "/arjun1/pat1": 1,
    "/arjun2/pat2": 3,
    "/arjun3/pat3": 5
  },
  "field4": "value4"
}
From another field I get the keys pat3 and field3. From these I need to find out whether the key /arjun3/pat3 exists in the document.
If I use {"field3./arjun3/pat3":{$exists:true}}, this gives me results. But the problem is that I only get field3 and pat3, and I need some pattern matching like field3.*.pat3, then use $expr or $exists, which I'm not exactly sure how to do. Please help.
You could try something along these lines:
db.arjun.find({
    "field3": {
        "$elemMatch": {
            "$and": [
                { "arjun3.pat3": { "$exists": true } },
                { "arjun3.pat3": 5 }
            ]
        }
    }
});
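If the inner key names are dynamic, a hedged alternative (a sketch, assuming MongoDB 3.4+ for $objectToArray) is to flatten field3 into key/value pairs and pattern-match the keys:
db.arjun.aggregate([
    // turn field3 into an array of {k: <key>, v: <value>} pairs
    { $addFields: { f3kv: { $objectToArray: "$field3" } } },
    // keep documents where some key ends in "pat3"
    { $match: { "f3kv.k": { $regex: /pat3$/ } } },
    // drop the helper field from the output
    { $project: { f3kv: 0 } }
]);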
You can either go for regex (the re module) for SQL-like pattern matching and compile your own custom wildcard, or, if you don't want that, you can simply use the fnmatch module. It is a built-in Python library that allows wildcard matching for multiple characters (via *) or a single character (via ?).
import fnmatch
a = "hello"
print(fnmatch.fnmatch(a, "h*"))
OUTPUT:-
True

knex.raw with existing columns array using .first statement

Here is existing code:
knex("products")
.first("id", "name", "ingredients")
...
So currently it just uses an array of column names.
Now I want to add a calculated column here. It would consist of a "constant" + product.id.
For product with id 1 it would be "api/v1/img/1".
For product with id 222 it would be "api/v1/img/222".
Alias of it should be "image".
I have to use knex.raw somehow, but I don't understand how, or what the correct syntax is to use it with .first().
I'm sorry, I'm unable to understand the question fully. What kind of result are you trying to achieve? Maybe something like this?
knex("products")
.select('*', knex.raw(`'api/v1/img/' || ?? as computed`, ['products.id']))
.first()
Like this: https://runkit.com/embed/9okme0czge8z
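Applied to the original column list with the "image" alias from the question (a sketch, assuming a database where || concatenates strings, e.g. PostgreSQL):
knex("products")
  .first(
    "id",
    "name",
    "ingredients",
    // ?? is knex's identifier binding; the alias matches the question
    knex.raw(`'api/v1/img/' || ?? AS image`, ["products.id"])
  )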

Elasticsearch Completion Suggester field contains comma separated values

I have a field containing comma-separated values that I want to perform suggestions on.
{
"description" : "Breakfast,Sandwich,Maker"
}
Is it possible to get only the applicable token while performing suggest-as-you-type?
For ex:
When I type break, how can I get only Breakfast and not Breakfast,Sandwich,Maker?
I have tried using the comma tokenizer but it does not seem to help.
As said in the documentation, you can provide multiple possible inputs by indexing like this:
curl -X PUT 'localhost:9200/music/song/1?refresh=true' -d '{
"description" : "Breakfast,Sandwich,Maker",
"suggest" : {
"input": [ "Breakfast", "Sandwitch", "Maker" ],
"output": "Breakfast,Sandwich,Maker"
}
}'
This way, any word in the list can be used as input for the suggestion.
Obtaining only the corresponding word as the suggestion from Elasticsearch is not possible, but as a workaround you could use a tokenizer outside Elasticsearch to split the suggested string and choose only the token that has the input as a prefix.
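A quick Python sketch of that client-side workaround (pick_token is a hypothetical helper, not part of any Elasticsearch client):
# hypothetical helper: return the token of a comma-separated suggestion
# that starts with what the user typed
def pick_token(suggestion, user_input):
    for token in suggestion.split(","):
        if token.lower().startswith(user_input.lower()):
            return token
    return None

print(pick_token("Breakfast,Sandwich,Maker", "break"))  # -> Breakfast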
EDIT: a better solution would be to use an array instead of comma-separated values, but that doesn't meet your specs (see: Elasticsearch autocomplete search on array field).
