How to write a LIKE query in Azure CosmosDB? - azure

I want to retrieve data from Cosmos DB with the following query:
SELECT * FROM c WHERE c.pi like '09%001'
(This is a SQL query, which I can use in MySQL)
Here, pi is a string value, which can be 09001001 or 09025001.
Is there a way to use a LIKE command in Cosmos DB?
I know that cosmos DB uses CONTAINS, but this cannot be used when you want to match specifically the beginning or end of the string.

UPDATE :
You can now use the LIKE keyword to do text searches in Azure Cosmos DB SQL (core) API!
EXAMPLE:
SELECT *
FROM c
WHERE c.description LIKE "%cereal%"
OLD Answer:
This can be achieved in 2 ways
(i) Currently Azure Cosmosdb supports the CONTAINS, STARTSWITH, and ENDSWITH built-in functions which are equivalent to LIKE.
The keyword for LIKE in Cosmosdb is Contains .
SELECT * FROM c WHERE CONTAINS(c.pi, '09')
So, in your case if you want to match the pattern 09%001, you need to use:
SELECT * FROM c WHERE STARTSWITH(c.pi, '09') AND ENDSWITH(c.pi, '001')
(ii) As 404 mentioned, Use SQL API User Defined Functions which supports regex :
function executeRegex(str, pattern) {
let regex=RegExp(pattern);
return regex.test(str);
}
SELECT udf.EXECUTE_REGEX("foobar", ".*bar")

Another possibility is creating your own User Defined Function. As example here's a regex check:
function matchRegex(str, pattern) {
let regex=RegExp(pattern);
return regex.test(str);
}
Created under the name MATCH_REGEX it can then be used like:
SELECT udf.MATCH_REGEX("09001001", "^09.*001$")
As note: it'll kill any index optimization that for instance STARTSWITH would have. Although allows for much more complex patterns. It can therefor be beneficial to use additional filters that are capable of using the index to narrow down the search. E.g. using StartsWith(c.property1, '09') as addition to the above example in a WHERE clause.
UPDATE:
Cosmos now has a RegexMatch function that can do the same. While the documentation for the MongoApi mentions the $regex can use the index to optimize your query if it adheres to certain rules this does not seem to be the case for the SqlApi (at this moment).

Related

How to work with result set of meta-functions in Vertica

I want to use result set of meta-function get_node_dependencies as a subquery. Is there some way to do it?
Something like this:
select v_txtindex.StringTokenizerDelim (dep, chr(10)) over () as words
from (
select get_node_dependencies() as dep
) t;
This query thows an error Meta-function ("get_node_dependencies") can be used only in the Select clause.
I know that there is a view vs_node_dependencies that returns the same data in more readable way, but the question is generic, not related to any specific meta-function.
Most Vertica meta functions returning a report are for informational purposes on the fly, and can only be used on the outmost part of a query - so you can't apply another function on their output.
But - as you are already prepared to go through development work to split that output into tokens, you might often be even better off by querying vs_node_dependencies directly. You'll also be more flexible - is my take on this.

Does jOOQ support putting names on SQL statements?

Is there a way to "tag" or put names on SQL statements in jOOQ so when I look at the Performance Insights of AWS RDS, I can see something more meaningful than the first 500 chars of the statement?
For example, Performance Insights shows that this query is taking a toll in my DB:
select "my_schema"."custs"."id", "my_schema"."custs"."other_id", "my_schema"."custs"."cid_id", "my_schema"."custs"."valid_since", "my_schema"."custs"."valid_until", "my_schema"."custs"."address", "my_schema"."custs"."address_id_1", "my_schema"."pets"."id", "my_schema"."pets"."cst_id", "my_schema"."pets"."tag", "my_schema"."pets"."name", "my_schema"."pets"."description", "my_schema"."pets"."owner", "my_schema"."pets"."created_on", "my_schema"."pets"."created_by", "my_schema"."pets"."modified_on",
But as it comes chopped, it's not straight-forward to know which jOOQ code generated this.
I would prefer to see something like this:
Customer - Pet Lookup
or:
(Customer - Pet Lookup) select "my_schema"."custs"."id", "my_schema"."custs"."other_id", "my_schema"."custs"."cid_id", "my_schema"."custs"."valid_since", "my_schema"."custs"."valid_until", "my_schema"."custs"."address", "my_schema"."custs"."address_id_1", "my_schema"."pets"."id", "my_schema"."pets"."cst_id", "my_schema"."pets"."tag", "my_schema"."pets"."name", "my_schema"."pets"."description", "my_schema"."pets"."owner", "my_schema"."pets"."created_on", "my_schema"."pets"."created_by", "my_schema"."pets"."modified_on",
There are at least two out of the box approaches to what you want to achieve, both completely vendor agnostic:
1. Use "hints"
jOOQ supports Oracle style hints using the hint() method, at least for SELECT statements. Write something like:
ctx.select(T.A, T.B)
.hint("/* my tag */")
.from(T)
.where(...)
The limitation here is the location of the hint, which is going to be right after the SELECT keyword. Not sure if this will work for your RDBMS.
2. Use an ExecuteListener
You can supply your Configuration with an ExecuteListener, which patches your generated SQL strings with whatever you need to be added:
class MyListener extends DefaultExecuteListener {
// renderEnd() is called after the SQL string is generated, but
// before the prepared statement is created, let alone executed
#Override
public void renderEnd​(ExecuteContext ctx) {
if (mechanismToDetermineIfTaggingIsNeeded())
ctx.sql("/* My tag */ " + ctx.sql());
}
}
Using regular expressions, you can place that tag at any specific location within your SQL string.

Cant get identifier and Max Value CosmostDb

I would like to do some reporting on my CosmosDb
my Query is
Select Max(c.results.score) from c
That works but i want the id of the highest score then i get an exception
Select c.id, Max(c.results.score) from c
'c.id' is invalid in the select list because it is not contained in an
aggregate function
you can execute following query to archive what you're asking (thought it can be not very efficient in RU/execution time terms):
Select TOP 1 c.id, c.results.score from c ORDER BY c.results.score DESC
Group by isn't supported natively in Cosmos DB so there is no out of the box way to execute this query.
To implement this using the out of the box functionality you would need to create a new document type that contains the output of your aggregation e.g.
{
"id" : 1,
"highestScore" : 1000
}
You'd then need a process within your application to keep this up-to-date.
There is also documentdb-lumenize that would allow you to do this using stored procedures. I haven't used it myself but it may be worth looking into as an alternative to the above solution.
Link is:
https://github.com/lmaccherone/documentdb-lumenize

Case insensitive search in arrays for CosmosDB / DocumentDB

Lets say I have these documents in my CosmosDB. (DocumentDB API, .NET SDK)
{
// partition key of the collection
"userId" : "0000-0000-0000-0000",
"emailAddresses": [
"someaddress#somedomain.com", "Another.Address#someotherdomain.com"
]
// some more fields
}
I now need to find out if I have a document for a given email address. However, I need the query to be case insensitive.
There are ways to search case insensitive on a field (they do a full scan however):
How to do a Case Insensitive search on Azure DocumentDb?
select * from json j where LOWER(j.name) = 'timbaktu'
e => e.Id.ToLower() == key.ToLower()
These do not work for arrays. Is there an alternative way? A user defined function looks like it could help.
I am mainly looking for a temporary low-effort solution to support the scenario (I have multiple collections like this). I probably need to switch to a data structure like this at some point:
{
"userId" : "0000-0000-0000-0000",
// Option A
"emailAddresses": [
{
"displayName": "someaddress#somedomain.com",
"normalizedName" : "someaddress#somedomain.com"
},
{
"displayName": "Another.Address#someotherdomain.com",
"normalizedName" : "another.address#someotherdomain.com"
}
],
// Option B
"emailAddressesNormalized": {
"someaddress#somedomain.com", "another.address#someotherdomain.com"
}
}
Unfortunately, my production database already contains documents that would need to be updated to support the new structure.
My production collections contain only 100s of these items, so I am even tempted to just get all items and do the comparison in memory on the client.
If performance matters then you should consider one of the normalization solution you have proposed yourself in question. Then you could index the normalized field and get results without doing a full scan.
If for some reason you really don't want to retouch the documents then perhaps the feature you are missing is simple join?
Example query which will do case-insensitive search from within array with a scan:
SELECT c FROM c
join email in c.emailAddresses
where lower(email) = lower('ANOTHER.ADDRESS#someotherdomain.com')
You can find more examples about joining from Getting started with SQL commands in Cosmos DB.
Note that where-criteria in given example cannot use an index, so consider using it only along another more selective (indexed) criteria.

What is the azure table storage query equivalent of T-sql's LIKE command?

I'm querying Azure table storage using the Azure Storage Explorer. I want to find all messages that contain the given text, like this in T-SQL:
message like '%SysFn%'
Executing the T-SQL gives "An error occurred while processing this request"
What is the equivalent of this query in Azure?
There's no direct equivalent, as there is no wildcard searching. All supported operations are listed here. You'll see eq, gt, ge, lt, le, etc. You could make use of these, perhaps, to look for specific ranges.
Depending on your partitioning scheme, you may be able to select a subset of entities based on specific partition key, and then scan through each entity, examining message to find the specific ones you need (basically a partial partition scan).
While an advanced wildcard search isn't strictly possible in Azure Table Storage, you can use a combination of the "ge" and "lt" operators to achieve a "prefix" search. This process is explained in a blog post by Scott Helme here.
Essentially this method uses ASCII incrementing to query Azure Table Storage for any rows whose property begins with a certain string of text. I've written a small Powershell function that generates the custom filter needed to do a prefix search.
Function Get-AzTableWildcardFilter {
param (
[Parameter(Mandatory=$true)]
[string]$FilterProperty,
[Parameter(Mandatory=$true)]
[string]$FilterText
)
Begin {}
Process {
$SearchArray = ([char[]]$FilterText)
$SearchArray[-1] = [char](([int]$SearchArray[-1]) + 1)
$SearchString = ($SearchArray -join '')
}
End {
Write-Output "($($FilterProperty) ge '$($FilterText)') and ($($FilterProperty) lt '$($SearchString)')"
}
}
You could then use this function with Get-AzTableRow like this (where $CloudTable is your Microsoft.Azure.Cosmos.Table.CloudTable object):
Get-AzTableRow -Table $CloudTable -CustomFilter (Get-AzTableWildcardFilter -FilterProperty 'RowKey' -FilterText 'foo')
Another option would be export the logs from Azure Table storage to csv. Once you have the csv you can open this in excel or any other app and search for the text.
You can export table storage data using TableXplorer (http://clumsyleaf.com/products/tablexplorer). In this there is an option to export the filtered data to csv.

Resources