Comparing element values from the same fragment using range index - search

In my documents there are two elements (<a> and <b>) on which range indexes (of the same type) exist. I want all documents in which the values of <a> and <b> are the same. I understand that using cts:element-value-co-occurrences() I can fetch the pairs of values of <a> and <b> from each fragment and compare them. But how do I refer back to the fragment where a match is found? Or is there a simpler way to do this? All I want is for the range indexes to be used.

The co-occurrences functions return a list of all existing (within-fragment) value combinations of those two elements. If you simply want all documents in which the value of element a equals the value of element b, you could do something like:
for $v in cts:element-values(xs:QName("a"))
return
cts:search(
collection(),
cts:and-query((
cts:element-value-query(xs:QName("a"), $v),
cts:element-value-query(xs:QName("b"), $v)
))
)
Or you could use cts:uris instead of cts:search to find the database URIs of those docs.
ADDED:
What @mblakele in the comment below means is this:
let $query :=
cts:or-query(
for $v in cts:element-values(xs:QName("a"))
return
cts:and-query((
cts:element-value-query(xs:QName("a"), $v),
cts:element-value-query(xs:QName("b"), $v)
))
)
return
cts:search(
collection(),
$query
)
That saves you from running cts:search for each value separately, and is likely to perform better.
HTH!

Related

groovy iterate through list of key and value

I have this list:
service_name_status=[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]
And I need to iterate through this list so that the first element becomes the value of a parameter called "SERVICE_NAME" and the second element becomes the value of a parameter called "HELM_COMMAND".
After assigning those values to the parameters I will run my command that uses them, and then continue with the next items on the list, which should replace the parameter values with items 3 and 4, and so on.
So what I am looking for is something like that:
def service_name_status=[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]
def SERVICE_NAME
def HELM_COMMAND
for(x in service_name_status){
SERVICE_NAME=x(0,2,4,6,8...)
HELM_COMMAND=x(1,3,5,7,9...)
println SERVICE_NAME=$SERVICE_NAME
println HELM_COMMAND=$HELM_COMMAND
}
the output should be:
SERVICE_NAME=a-service
HELM_COMMAND=INSTALL
SERVICE_NAME=b-service
HELM_COMMAND=UPGRADE
SERVICE_NAME=c-service
HELM_COMMAND=UPGRADE
SERVICE_NAME=d-service
HELM_COMMAND=INSTALL
and so on...
I couldn't find anything that takes any other element in groovy, any help will be appreciated.
The collection you want is a Map, not a List.
Take note of the quotes in the map: the keys and values are strings, so you need the quotes or it won't work. You may have to change that at the source where your data comes from.
I kept your all-caps variable names so you will feel at home, but they are not the convention.
Note the map iteration with .each { key, value -> ... }
This will work:
Map service_name_status = ['a-service':'INSTALL', 'b-service':'UPGRADE', 'C-service':'UPGRADE', 'D-service':'INSTALL']
service_name_status.each {SERVICE_NAME, HELM_COMMAND ->
println "SERVICE_NAME=${SERVICE_NAME}"
println "HELM_COMMAND=${HELM_COMMAND}"
}
EDIT:
The following can be used to convert that to a map. Be careful, the replaceAll part is fragile and depends on the data to always look the same.
//assuming you can have it in a string like this
String st = "[a-service=INSTALL, b-service=UPGRADE, C-service=UPGRADE, D-service=INSTALL]"
//this part is dependent on format
String mpStr = st.replaceAll(/\[/, "['")
.replaceAll(/=/, "':'")
.replaceAll(/]/, "']")
.replaceAll(/, /, "', '")
println mpStr
//convert the properly formatted string to a map
Map mp = evaluate(mpStr)
assert mp instanceof java.util.LinkedHashMap

ArangoDB AQL: Find Gaps In Sequential Data

I've been given data to build an application that has sequential data in the form of part numbers of products: "000000", "000001", "000002", "000010", "000011" .... The previous application was an old MS Access database that didn't have any gap filling features in the part number generator, hence the gap between "000002" and "000010" (Yes, they are also strings, but I can work with that...).
We could continue to increment based on the last value and ignore the gaps, however, in an attempt to use all numbers available to us with our naming scheme, we'd like to be able to fill the gaps. Our naming scheme describes the "product family" with the first two digits such that: [00]0000 would be a different family from [02]0000.
I can find the starting and ending values using something like:
let query = `
LET first = (
MIN(
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, @family)
RETURN part.PartNumber
)
)
LET last = (
MAX(
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, @family)
RETURN part.PartNumber
)
)
RETURN { first, last }
`
The above example returns: {first: "000000", last: "000915"}
Using ArangoDB and AQL, how could I go about finding these gaps? I've found some SQL examples but I feel the features of AQL are a bit more limiting.
Thanks in advance!
To start with, I think your best bet for getting min/max values is using aggregates:
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, @family)
COLLECT x = 1
AGGREGATE first = MIN(part.PartNumber), last = MAX(part.PartNumber)
RETURN {
first: first,
last: last
}
But that won't really help when trying to find gaps. And you're right - SQL has several logical constructs that could help (like using variables and cursor iteration), but even that would be a pattern I would discourage.
The better path might be to do a "brute force" approach - compare a table containing your existing numbers with a table of all numbers, using a native method like JOIN to find the difference. Here's how you might do that in AQL:
LET allNumbers = 0..9999
LET existingParts = (
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, @family)
LET childId = RIGHT(part.PartNumber, 4)
RETURN TO_NUMBER(childId)
)
RETURN MINUS(allNumbers, existingParts)
The x..y construct creates a sequence (an array of numbers), which we use as the full set of possible numbers. Then, we want to return only the "non-family" part of the ID (I'm calling it "child"), which needs to be numeric to compare with the previous set. Then, we use MINUS to remove elements of existingParts from the allNumbers list.
One thing to note, that query would return only the "child" portion of the part number, so you would have to join it back to the family number later. Alternatively, you could also skip string-splitting, and get "fancy" with your list creation:
LET allNumbers = TO_NUMBER(CONCAT(@family, '0000'))..TO_NUMBER(CONCAT(@family, '9999'))
LET existingParts = (
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, @family)
RETURN TO_NUMBER(part.PartNumber)
)
RETURN MINUS(allNumbers, existingParts)
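The first query above returns only the numeric "child" portion, so each gap still has to be zero-padded and prefixed with the family again before it is a usable part number. Here is a minimal client-side sketch of that step in JavaScript, assuming the existing part numbers have already been fetched into an array. The `findGaps` helper and its parameters are hypothetical names, and the 2-digit family plus 4-digit child layout is taken from the examples in the question:

```javascript
// Hypothetical helper: list the unused part numbers for one family.
// Assumes part numbers are 6-character strings: 2-digit family + 4-digit child.
function findGaps(existingPartNumbers, family, maxChild = 9999) {
  // Collect the numeric child ids already in use for this family.
  const used = new Set(
    existingPartNumbers
      .filter((pn) => pn.startsWith(family))
      .map((pn) => Number(pn.slice(-4)))
  );

  const gaps = [];
  for (let child = 0; child <= maxChild; child++) {
    if (!used.has(child)) {
      // Re-attach the family prefix and restore the zero padding.
      gaps.push(family + String(child).padStart(4, "0"));
    }
  }
  return gaps;
}

// Sample data from the question; the upper bound is lowered to keep the output short.
const parts = ["000000", "000001", "000002", "000010", "000011"];
console.log(findGaps(parts, "00", 11));
// prints the seven part numbers "000003" through "000009"
```

This is the same set difference as MINUS, just computed in the application, so it only makes sense when the list of existing numbers is small enough to fetch.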

How to match values in arrays and match result against key/val pair

I'm using Groovy.
I've got two sets of data. The first is an array of site codes and the second is a key/val map of some JSON data.
I need to loop through the list of site codes and match them to a key in the map. Once it finds a match it needs to return the corresponding map val.
The list looks like this:
list = ["WSM-3572", "WSM-0301","WSM-10153"]
A keypair looks like this:
{id=3dd9794a-d148-4f74-a297-cefe22d05cfd, name=Nedbank Mall of Africa(WSM-3572)},{id=8fb57fda-8bdf-4aef-8d50-f3bf8d2235e1, name=Caffe Rossini (WSM-3432)},
{id=bd12b3ef-b72f-4211-8987-2e0c6f1f688d, name=Steers Welkom (WSM-4502)},
So in the above case we should run through the list and when it gets to WSM-3572 it should find it and match the site code in the name: Nedbank Mall of Africa(WSM-3572) and then return id=3dd9794a-d148-4f74-a297-cefe22d05cfd.
I hope this all makes sense and thanks in advance
Assuming you've loaded your JSON with JsonSlurper into json (a list of maps), something like
list.each { code ->
// look for the entry whose name contains the site code in parentheses
println "$code = " + json.find {
it.name.contains("($code)")
}?.id
}
should do it.
Not at a computer, but that should be close.

How can I do an if last item in jade

As you can see from below I have an array, I would like to remove the svg from the last item in that array when it runs. How might I do this with a condition? Something like if last:item else add svg
-navlinks = {"Home":"/Home", "About":"/About", "Store Directory":"/Store-Directory", "Store Page":"/Store-Page", "Events":"/Events",}
ul.navbar-menu
  for val, key in navlinks
    li
      a(href='#{val}') #{key}
      svg.icon.icon-dots
        use(xlink:href="#icon-dots")
Well the thing is, contrary to what you said, navlinks is not an Array, but rather an Object. Since Object elements do not have a numeric index, the notion of last does not have much meaning.
However, you could iterate over Object.keys(navlinks), which is a proper Array with a numeric index. So you could do something like:
-navlinks = {"Home":"/Home", "About":"/About", "Store Directory":"/Store-Directory", "Store Page":"/Store-Page", "Events":"/Events",}
ul.navbar-menu
  each key, index in Object.keys(navlinks)
    li
      a(href=navlinks[key])= key
      if index < Object.keys(navlinks).length - 1
        svg.icon.icon-dots
          use(xlink:href="#icon-dots")
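The "is this the last item" test is plain JavaScript, so it can be checked outside the template. Here is a small sketch, with `renderNav` as a hypothetical helper that builds roughly the same markup as strings:

```javascript
// Hypothetical helper mirroring the template: one <li> per link,
// with the separator icon on every item except the last.
function renderNav(navlinks) {
  const keys = Object.keys(navlinks);
  return keys.map((key, index) => {
    const isLast = index === keys.length - 1;
    const icon = isLast
      ? ""
      : '<svg class="icon icon-dots"><use xlink:href="#icon-dots"></use></svg>';
    return `<li><a href="${navlinks[key]}">${key}</a>${icon}</li>`;
  });
}

const items = renderNav({ Home: "/Home", About: "/About", Events: "/Events" });
console.log(items.join("\n"));
```

Only the first two items carry the svg; the last one ends right after the link.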

How do I get all hits from a cts:search() in Marklogic

I have a collection containing lots of documents.
When I search the collection, I need to get a list of matches independent of documents. So if I search for the word "pie", I get back a list of documents, properly sorted by relevance. However, some of these documents contain the word "pie" in more than one place. I would like to get back a list of all matches, unrelated to the document where the match was found. Also, this list of all hits would need to be sorted by relevance (weight), again totally independent of the document (not grouped by the document).
Following code searches and returns matches grouped by the document...
let $searchfor := "pie"
let $query := cts:and-query((
cts:element-word-query(xs:QName("title"), ($searchfor), (), 16),
cts:element-word-query(xs:QName("para"), ($searchfor), (), 10)
))
let $resultset := cts:search(fn:collection("docs"), $query)[0 to 100]
for $n in $resultset
return cts:score($n)
What I need is $n to be the "match-node", not a "document-node"...
Thanks!
Document relevance is determined by TF-IDF. Matches contribute to a document's score but don't have scores relative to each other. cts:search already returns results ordered by document relevance, so you could do this to get match nodes ordered by their ancestor document's score:
let $searchfor := "pie"
let $query := cts:and-query((
cts:element-word-query(xs:QName("title"), ($searchfor), (), 16),
cts:element-word-query(xs:QName("para"), ($searchfor), (), 10)
))
return
cts:search(//(title|para),$query)[0 to 100]/cts:highlight(.,$query,element match {$cts:node})//match/*
You need to split the document (fragment it) into smaller documents. Every text node could be a document, with a stored original XPath so that the context is not lost.
I recommend that you look at the Search API (http://community.marklogic.com/pubs/5.0/books/search-dev-guide.pdf and http://community.marklogic.com/pubs/5.0/apidocs/SearchAPI.html). This API will give you what you want, providing match nodes as well as the URIs of the actual documents. You should also find it easier to use for the general cases, although there will be edge cases where you will need to fall back to cts:search.
search:search is the specific function you will want to use. It will give you back responses similar to this:
<search:response total="1" start="1" page-length="10" xmlns=""
xmlns:search="http://marklogic.com/appservices/search">
<search:result index="1" uri="/foo.xml"
path="fn:doc(&quot;/foo.xml&quot;)" score="328"
confidence="0.807121" fitness="0.901397">
<search:snippet>
<search:match path="fn:doc(&quot;/foo.xml&quot;)/foo">
<search:highlight>hello</search:highlight></search:match>
</search:snippet>
</search:result>
<search:qtext>hello sample-property-constraint:boo</search:qtext>
<search:report id="SEARCH-FLWOR">(cts:search(fn:collection(),
cts:and-query((cts:word-query("hello", ("lang=en"), 1),
cts:properties-query(cts:word-query("boo", ("lang=en"), 1))),
()), ("score-logtfidf"), 1))[1 to 10]
</search:report>
<search:metrics>
<search:query-resolution-time>PT0.647S</search:query-resolution-time>
<search:facet-resolution-time>PT0S</search:facet-resolution-time>
<search:snippet-resolution-time>PT0.002S</search:snippet-resolution-time>
<search:total-time>PT0.651S</search:total-time>
</search:metrics>
</search:response>
Here you can see that every result has one or possibly more match elements defined.
How would you determine the relevance of a word independent of the document? Relevance is a measure of document relevance, not word relevance. I don't know how one would measure word relevance.
You could potentially return all words ordered by document relevance, then words for each document in "document order" which means the order in which they appear in the document. That would be relatively easy to do with search:search where you iterate over all results and extract each matching word. What would you present with each match? Its surrounding snippet?
Keep in mind that what you're asking for would potentially take a long time to execute.

Resources