MarkLogic faceted search and collations

I'm setting up a faceted search in MarkLogic. I have the following range indexes configured:
That is, I have two indexes. The first is on namespace http://www.corbas.co.uk/ns/presentations and local name keyword. The second has the local name level. The collation URI for both is http://marklogic.com/collation/en/S1.
When I try to search using the following I see errors related to collations:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
search:search("levels:Intermediate",
<options xmlns="http://marklogic.com/appservices/search">
<return-results>true</return-results>
<return-facets>true</return-facets>
<constraint name="keywords" facet="true">
<range type="xs:string" collation="http://marklogic.com/collation/en/S1">
<element ns="http://www.corbas.co.uk/ns/presentations" name="keyword"/>
</range>
</constraint>
<constraint name="levels" facet="true">
<range type="xs:string" collation="http://marklogic.com/collation/en/S1">
<element ns="http://www.corbas.co.uk/ns/presentations" name="level"/>
</range>
</constraint>
</options>)
I get the following error:
XDMP-ELEMRIDXNOTFOUND: cts:search(fn:collection(),
cts:element-range-query(fn:QName("http://www.corbas.co.uk/ns/presentations","level"),
"=", "Intermediate", ("collation=http://marklogic.com/collation/en/S1"), 1),
("score-logtfidf", "faceted", cts:score-order("descending")),
xs:double("1"), ()) -- No string element range index for
{http://www.corbas.co.uk/ns/presentations}level
collation=http://marklogic.com/collation/en/S1
What am I doing wrong?

Strange message. If it even got that far, it looks like your database's default collation has been changed. That does not answer the question; it is just strange.
First off, I would always add the collation to the constraint:
<search:range type="xs:string" facet="true"
collation="http://marklogic.com/collation/en/S1">
Second, I always troubleshoot range index issues from the query console:
use cts:values() to verify that your indexes are in place, with the namespace and collation you expect. This removes the other layers and checks the index directly.
One more item: MarkLogic range indexes do not exist until content is indexed. Are you sure you have not turned off auto-index on the database, so that the content has not been indexed? That would give you an error.

To be honest, I would have expected a different error message. I would have expected MarkLogic to complain it couldn't find an index for root collation, because you have not added collation attributes on the range elements in the search options.
Maybe adding those will help.
HTH!

It looks to me like your configuration is correct, which suggests to me that the problem is timing. Once you specify what indexes you want, MarkLogic gets to work creating them. If you run a query that requires those indexes before MarkLogic finishes creating them, you get this error. Depending on the amount of content you have, the creation process can be very quick or take hours.
To check the status, point your browser to the Admin UI (http://localhost:8001) and navigate to the configuration page for your database. Click on the Status tab and look for "Reindexing/Refragmenting State"—if MarkLogic is still reindexing, it will tell you so here and you'll get updates on its progress. (You can also get this information through the Management API.)

Related

poor search performance for certain wildcard queries

I am having performance issues when using wildcard searches for certain letter combinations, and I am not sure what else I need to do to improve it. All of my documents follow an envelope pattern that looks something like the following.
<pdbe:person-envelope>
<person xmlns="http://schemas.abbvienet.com/people-db/model">
<account>
<domain/>
<username/>
</account>
<upi/>
<title/>
<firstName>
<preferred/>
<given/>
</firstName>
<middleName/>
<lastName>
<preferred/>
<given/>
</lastName>
</person>
<pdbe:raw/>
</pdbe:person-envelope>
I have a field defined called name, which includes the firstName and lastName paths:
{
"field-name": "name",
"field-path": [
{
"path": "/pdbe:person-envelope/pdbm:person/pdbm:firstName",
"weight": 1
},
{
"path": "/pdbe:person-envelope/pdbm:person/pdbm:lastName",
"weight": 1
}
],
"trailing-wildcard-searches": true,
"trailing-wildcard-word-positions": true,
"three-character-searches": true
}
When I do some queries using search:search, some come back fast, whereas others come back slow. This is with the filtered queries.
search:search("name:ha*",
<options xmlns="http://marklogic.com/appservices/search">
<constraint name="name">
<word>
<field name="name"/>
</word>
</constraint>
<return-plan>true</return-plan>
</options>
)
I can see from the query plan that it is going to filter over all 136547 fragments in the db. But this query works fast.
<search:query-resolution-time>PT0.013205S</search:query-resolution-time>
<search:snippet-resolution-time>PT0.008933S</search:snippet-resolution-time>
<search:total-time>PT0.036542S</search:total-time>
However a search for name:tj* takes a long time, and also filters over all of the 136547 fragments.
<search:query-resolution-time>PT6.168373S</search:query-resolution-time>
<search:snippet-resolution-time>PT0.004935S</search:snippet-resolution-time>
<search:total-time>PT12.327275S</search:total-time>
I have the same indexes on both. Are there any other indexes I should be enabling when I am specifically just doing a search via the field constraint? I have these other indexes enabled on the database itself, in general.
"collection-lexicon": true,
"triple-index": true,
"word-searches": true,
"word-positions": true
I tried doing an unfiltered query, but that did not help, as I got a bunch of matches on the whole document and not on the fields I wanted. I even tried to set the fragment root to just my person element, but that did not seem to help.
"fragment-root": [
{
"namespace-uri": "http://schemas.abbvienet.com/people-db/model",
"localname": "person"
}
]
Thanks for any ideas.
Fragment roots are helpful if you want to use a searchable expression for that person element, and mostly if it occurs multiple times in one document. It won't make your current search constrain on that element.
In your case you enabled a number of relevant options, but the wildcard option only works for 4 characters or more. If you want to search on wildcards with fewer characters, you need to enable the three, two and one character search options.
The search phrases mentioned above both contain two characters followed by a wildcard. Since you only enabled the three character option, MarkLogic had to rely on filtering. The fact that some run fast and some run slow is probably due to caching: if you repeat the same query, MarkLogic will return the result from cache.
For performance testing you would either have to restart MarkLogic regularly to flush caches, or search on (semi-)random strings so that MarkLogic cannot serve results from cache. Or maybe both.
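As a rough illustration of the advice above, the field definition from the question could be extended with the shorter-prefix wildcard options. This is only a sketch: the property names `two-character-searches` and `one-character-searches` are assumed to follow the same naming pattern as the `three-character-searches` setting already shown in the question, so check them against your MarkLogic version before applying. It is written as plain JavaScript so the resulting JSON can be inspected before it is sent anywhere.

```javascript
// Field definition copied from the question.
const nameField = {
  "field-name": "name",
  "field-path": [
    { "path": "/pdbe:person-envelope/pdbm:person/pdbm:firstName", "weight": 1 },
    { "path": "/pdbe:person-envelope/pdbm:person/pdbm:lastName", "weight": 1 }
  ],
  "trailing-wildcard-searches": true,
  "trailing-wildcard-word-positions": true,
  "three-character-searches": true
};

// Assumed option names (not in the original question); they mirror the
// "three-character-searches" key and should be verified before use.
const shortWildcardOptions = {
  "two-character-searches": true,
  "one-character-searches": true
};

// Merge into a new object so the original definition stays untouched.
const updatedField = { ...nameField, ...shortWildcardOptions };

// Print the JSON that would go into the database configuration.
console.log(JSON.stringify(updatedField, null, 2));
```

Additional character indexes generally make the database larger and reindexing slower, so it is probably worth enabling only the ones the application really needs and re-running the `name:tj*` query after reindexing completes.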
HTH!

marklogic, howto create range on document properties

<?xml version="1.0" encoding="UTF-8"?>
<prop:properties xmlns:prop="http://marklogic.com/xdmp/property">
<publicationDate type="string" xmlns="http://marklogic.com/xdmp/json/basic">2015-03-30</publicationDate>
<identifier type="string" xmlns="http://marklogic.com/xdmp/json/basic">2629</identifier>
<posix type="string" xmlns="http://marklogic.com/xdmp/json/basic">nobs</posix>
</prop:properties>
I have a document with these properties above.
I want to filter by "PublicationDate" ...
I tried "Fields", "Field Range Indexes" and "Element Range Indexes", but I cannot find the syntax (XML or JSON) to designate this property.
Does anyone know this syntax?
Kind regards
In addition to the answers that give examples, please keep in mind that the element publicationDate is NOT in the namespace http://marklogic.com/xdmp/property in your example. Your index configuration should therefore use the json/basic namespace declared on each element, and any xs:QName reference to it should not use the "prop:" prefix.
Trying to figure out if your index is correct? You can always try cts:values() from the query console and verify that your index is exactly where you expect it before using it in code.
After many trials, this is what seems to work fine (MarkLogic 8.0-3) :
Without "Field" (where wm is http://marklogic.com/xdmp/json/basic ):
qb.propertiesFragment(qb.value(qb.element(wm,'publicationDate'),'2015-03-30'))
is ok, but the following produces the same error (No element range index ...)
qb.propertiesFragment(qb.range(qb.element(wm,'publicationDate'), '>=' ,'2015-03-01'))
With "Field"
(wm:publicationDate, with wm in Path namespaces, WITHOUT /vm:properties/ before ...) the following seem to work fine :-)))
qb.propertiesFragment(qb.value(qb.field("properties_publicationDate"),'2015-03-30'))
qb.propertiesFragment(qb.range(qb.field("properties_publicationDate"), '>=' ,'2015-03-01'))
I think you are looking for cts:properties-query:
cts:properties-query(
cts:element-range-query(
xs:QName("my:publicationDate"),">",
current-dateTime() - xs:dayTimeDuration("P1D")))
This example assumes a range index on prop:publicationDate, and also note that this assumes MarkLogic 7 or earlier. In MarkLogic 8, the name of this query appears to have changed to cts:properties-fragment-query.
In node.js, using the query builder, you could achieve something similar:
db.documents.query(
qb.where(
qb.fragmentScope('properties'),
qb.propertiesFragment(
qb.range('publicationDate', '>', ... )
)
)
)

Operator not found on SearchBooleanField

I'm new to the SuiteTalk API, but from what I can tell, this query SHOULD work (using the netsuite gem for ruby):
memorized_invoices = NetSuite::Records::Transaction.search({ criteria: { basic: [{ field: 'type', operator: 'anyOf', type: 'SearchEnumMultiSelectField', value: ["_invoice"]}, {field: 'memorized', value: true}]}})
But all I receive is a missing operator on SearchBooleanField:
D, [2013-10-15T16:23:10.607161 #5131] DEBUG -- : HTTPI POST request to webservices.netsuite.com (curb)
Savon::SOAPFault: (soapenv:Server.userException) org.xml.sax.SAXException: operator not found on {urn:core_2011_2.platform.webservices.netsuite.com}SearchBooleanField
Any suggestions as to why it's missing an operator on the SearchBooleanField? From your example on the main markdown page (search for: # no operator for booleans), a boolean field shouldn't need an operator.
I've done a basic google search, and haven't found much, except a few PHP examples that show that they use the operator 'is', which doesn't work either.
Any ideas?
I've also opened this question as an issue on GitHub. Thanks!
You are missing the operator property from your filter for the memorized field:
{field: 'memorized', operator: 'is', value: true}
Just took a look at the Schema Browser.
You are correct - no operator appears to be needed.
<complexType name="SearchBooleanField">
<sequence>
<element name="searchValue" type="xsd:boolean" minOccurs="0"/>
</sequence>
</complexType>
Why not create a saved search in the UI that represents the records you want to process? Just execute the search and loop through the results.
I don't know about the Ruby Gem, but executing a saved search via web services is fairly simple.
// create search object
TransactionSearchAdvanced tsa = new TransactionSearchAdvanced();
//set saved search id
tsa.savedSearchId="100";
// perform the search
NetSuiteService nss = new NetSuiteService();
SearchResult result = nss.search(tsa);
It turns out this is a bug found in the gem implementation of the netsuite library. If anyone else ever runs into this problem, see how we're solving it on the issue on the repository.
Thanks everyone for your help and suggestions.

Retrieve a specific node from a $(xData.responseXML) object

I'm completely stuck in retrieving a specific node from a responseXML object that I got back from the GetUserProfileByName (SharePoint / SPServices). I need a specific PropertyData node (in the example "FirstName") and then retrieve the value of the "FirstName". Retrieving the value is not a problem, retrieving the specific node is...
Below a part from the returned XML (for the sake of the example I stripped some properties):
...
<PropertyData>
<Name>UserProfile_GUID</Name>
<Values>
<ValueData>
<Value xmlns:q1="...">206b47c7-cfdc-...</Value>
</ValueData>
</Values>
</PropertyData>
<PropertyData>
<Name>FirstName</Name>
<Values>
<ValueData>
<Value xsi:type="xsd:string">Maarten</Value>
</ValueData>
</Values>
</PropertyData>
...
Since I know that I need the property FirstName, I do not want to iterate through the entire set of PropertyData nodes until I find the correct one (slow). In XPath I can select FirstName just by saying:
//PropertyData[Name='FirstName']/Values/ValueData/Value
However, I cannot do that in the xData.responseXML object. I tried the following filter, finds and other things (in all kinds of variations):
$(xData.responseXML).SPFilterNode("PropertyData").find("[Name*=FirstName]")
$(xData.responseXML).SPFilterNode("PropertyData").find("[Name*='FirstName']")
$(xData.responseXML).SPFilterNode("PropertyData").filter("[Name*=FirstName]")
$(xData.responseXML).SPFilterNode("PropertyData[Name='FirstName']")
I did many searches, but was not able to find an answer. There were many partial answers, all of which I tried, but none of them worked. Does anyone have a clue?
Thanks in advance!
Maarten
@Maarten
I'm not at my computer right now to test, but try this:
$(xData.responseXML).find("Name:contains('FirstName')").closest("PropertyData")
REVISION 1:
Given your feedback that an additional element is returned (the phonetic field), here is a revised selector to only return the one containing the FirstName element:
$(xData.responseXML)
.find("Name:contains('FirstName')")
.not(":contains('SPS-PhoneticFirstName')")
.closest("PropertyData");
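The extra `.not()` is needed because `:contains` matches substrings, so `FirstName` also matches `SPS-PhoneticFirstName`. The sketch below shows the difference between substring and exact matching on plain data. The `properties` array and the `findProperty` helper are hypothetical stand-ins for the parsed `PropertyData` nodes and are not part of SPServices. In jQuery, the same exact comparison can be made by passing a function to `.filter()` and comparing each `Name` element's text with `FirstName`.

```javascript
// Hypothetical, simplified stand-in for the PropertyData nodes in the response.
// The phonetic entry and its empty value are assumed for illustration.
const properties = [
  { name: "UserProfile_GUID", value: "206b47c7-cfdc-..." },
  { name: "FirstName", value: "Maarten" },
  { name: "SPS-PhoneticFirstName", value: "" }
];

// Substring match, analogous to :contains('FirstName'): it over-matches.
const bySubstring = properties.filter(p => p.name.includes("FirstName"));

// Exact match: returns the single wanted entry, or null when it is absent.
function findProperty(list, wanted) {
  return list.find(p => p.name === wanted) || null;
}

const firstName = findProperty(properties, "FirstName");

console.log(bySubstring.length); // 2
console.log(firstName.value);    // Maarten
```

An exact comparison avoids having to list every other property whose name happens to contain the wanted one.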
Paul

WSO2 - Using get-property() function in Property/Xquery Mediators

Our current service has 7 operations. When writing an outbound XQuery "local entry" in WSO2, we're trying to retrieve the name of the operation currently being executed (how can this be so difficult?).
After reading what I could find in WSO2's documentation, it appears that we need to set up both a Property and an XQuery mediator. Supposedly the Property mediator would pull the value with something like get-property('OperationName'), and this would then be referenced and passed through the XQuery mediator.
The other idea was that we needed to define it as a variable in the "Local Registry entry definitions", and then it would be available in all parts of the sequence.
I've tried for 2 days but haven't quite got it.
Please tell me what I'm missing...
Did you try the following xquery sample[1]? I modified the query mediator to get the operation name as follows.
<variable xmlns:ax21="http://services.samples/xsd" xmlns:m0="http://services.samples" name="code" expression="get-property('OperationName')" type="STRING" />
This worked fine: I could see getQuote in the response message.
[1] http://wso2.org/project/esb/java/4.0.2/docs/samples/advanced_mediation_samples.html#Sample390

Resources