Locale support on MongoDB - locale

Can someone point me to an example or documentation on how to use a non-default locale on MongoDB? I have documents with key-value pairs in a German character set, and I need to display the results in German collation order.
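For reference, recent MongoDB versions (3.4 and later) support per-query and per-collection collation, which covers German sort order. A minimal sketch in the mongo shell, assuming a collection named items and a sort key named name (both placeholders, not from the question):
// Sort one query using German collation rules (MongoDB 3.4+).
db.items.find({}).collation({ locale: "de" }).sort({ name: 1 })
// Or bake the collation into the collection when it is created.
db.createCollection("items", { collation: { locale: "de" } })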

Related

How to get documents containing a specific field?

I want to get only the documents in a collection that contain a specific field.
For example, I tried this, but it does not work:
const exampleCollection = admin.firestore().collection('Collection').doc('Doc').collection('Subcollection');
const exampleDoc = await exampleCollection.where('field', '>', '').get();
const field = await exampleDoc.data().field;
How can I do this?

Thanks!
You can actually use .orderBy('field') to return only those documents with that field. Firestore will automatically drop documents without that field.
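A minimal sketch of that approach with the Admin SDK, reusing the placeholder names from the question:
// orderBy('field') implicitly filters out documents that do not have 'field'.
const exampleCollection = admin.firestore().collection('Collection').doc('Doc').collection('Subcollection');
const snapshot = await exampleCollection.orderBy('field').get();
snapshot.forEach(doc => console.log(doc.id, doc.get('field')));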
You can perform simple and compound queries in Firebase.
documentation: https://firebase.google.com/docs/firestore/query-data/queries#simple_queries
example: .where('field', '>', '10');
Cloud Firestore does not offer any type of query that simply checks whether a property exists. You have to check if it contains a specific value (==), a range (greater than, less than), or whether its array contains some value. In all cases, you need a specific value to compare against. There is also no type of query that looks for data that's not present, as that type of query does not scale in the way Firestore requires.
This probably means you will need to change your document model to suit the results you need to get.

Querying for documents which don't have a given field, or where it is an empty string, in Solr

I am doing a query with Solr where I need to find documents without a given field, say 'name', and I am trying the following:
q=+status:active -name:["" TO *]
But it returns all the documents, both with and without that field.
Can anyone help me figure this out?
The field 'name' is a normal String type and is indexed.
I am using Node.js. Can anyone help me with this?
According to docs:
-field:[* TO *] finds all documents without a value for field
Update
I tried it, but it returns even the documents where the field is non-empty.
Then my wild guess is that you are using the search query q instead of the filter query fq. Since you are using multiple statements in the query, I assume that q does some extra magic to get the most relevant documents for you, which can lead to returning some unwanted results.
If you want the strict set of results you should use the filter query fq instead; see the docs.
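As a rough sketch, and assuming the standard /select handler, the request could look like this (the collection name is just a placeholder):
http://localhost:8983/solr/mycollection/select?q=*:*&fq=status:active&fq=-name:[* TO *]
Each fq is applied as a strict filter, so only active documents that have no value for name are returned.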

Lucene wildcard applied to indexed field

I have a set of indexed fields such as these:
submitted_form_2200FA17-AF7A-4E44-9749-79D3A391A1AF:true
submitted_form_2398389-2-32-43242423:true
submitted_form_54543-32SDf-3242340-32422:true
And I understand that it's possible to use wildcards in queries such as
submitted_form_2398389-2-32-43242423:t*e
What I'm trying to do is get "any" submitted form via something like:
submitted_form_*:true
Is this possible? Or will I have to do a stream of "OR"s on the known forms (which seems quite heavy)?
That's not the intended use of fields, I think. Field names aren't supposed to be the searchable values, field values are. Field names are supposed to be known a priori.
My suggestion is (if possible) to store the second part of the name as the field value, for instance: submitted_form:2398389-2-32-43242423. submitted_form would be the field, known a priori, and the value could eventually be searched with a PrefixQuery.
Anyway, you could access the collection of field names using IndexReader.getFieldNames() in Lucene 3.x and the equivalent API in Lucene 4.x. I wouldn't expect good search performance from that, though.
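To illustrate the suggested restructuring in query syntax (reusing the identifiers from the question): once the identifier is the field value rather than part of the field name, "any submitted form" becomes a query on a single known field,
submitted_form:[* TO *]
and a particular family of forms can be matched with a prefix:
submitted_form:2398389*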

CouchBase range search

I'm considering using Couchbase for a very read-heavy and write-heavy application. I'll also need to support searching based on different attributes of the documents, as well as range queries.
Couchbase has views that allow searching beyond key-value lookups, but it seems like this is mainly for getting documents within a certain range, e.g. all documents indexed between two specified keys, rather than "give me all documents whose genre attribute is 'adventure'" or "give me all documents with a creation date between 1/1/1 and 2/1/1".
Is there a way to achieve what I want without an external index?
You can definitely do both of what you describe there. You'd do both with views in Couchbase Server 2.0.
For example, a common technique when needing to search a date range is to emit a JSON array from your map function in the view. This would give you something like:
[2012, 5, 11, 16, 27, 41]
Since a JSON array is a valid value for the start key and end key when you query a view, you can specify that range.
Similarly, for searching on the other attributes, you'd emit each one of them from your map function along with the doc _id. Then, using one of the Couchbase SDKs, you can set the include-docs option when querying and the document will be fetched automatically.
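A rough sketch of such a map function (Couchbase views are written in JavaScript; the attribute names genre and created are just examples, not a fixed schema):
function (doc, meta) {
  // Emit the genre followed by the creation date split into an array,
  // so a query can pin the genre and bound the date range in one pass.
  if (doc.genre && doc.created) {
    var d = new Date(doc.created);
    emit([doc.genre, d.getFullYear(), d.getMonth() + 1, d.getDate()], null);
  }
}
Querying with startkey=["adventure", 2012, 1, 1] and endkey=["adventure", 2012, 2, 1] would then return the adventure documents created in that window, and the SDK's include-docs option fetches the documents themselves.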

Does Solr store the original contents of the document after indexing?

If I mark a field as "don't store," does Solr retain the original contents of that field anywhere, or does it only retain the "bag of words" that it culls for the index itself?
I'm asking from the standpoint of document security. If someone cracked into the machine running our Solr index, could they get the original text passed into Solr for this "don't store" field, or not?
No, the Solr index does not store the original value in any retrievable or viewable way for fields that are set to stored="false". The Common Field Options page on the Solr wiki states the following behavior for the stored option:
True if the value of the field should be retrievable during a search
If someone cracked into the machine running the Solr index and ran Solr queries based on the above they would not be able to see the contents of the field as Solr would not return that field. However if they had access to the disk and the actual index folder and segment files as written by Lucene, they could see the terms that Solr stored for each document in that field using Luke - Lucene Index Toolbox to examine the index folder.
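For example, with a hypothetical field secret_text defined as indexed="true" stored="false", a query can still match on it but gets nothing back for it:
http://localhost:8983/solr/mycollection/select?q=secret_text:confidential&fl=id,secret_text
The response lists the matching ids, but contains no secret_text values, because Solr has nothing stored to return for that field.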
When a field is indexed with Field.Store.NO, only enough information is stored for Lucene to perform the search.
However, if you specify WITH_POSITIONS_OFFSETS term vectors when constructing each field, there is usually enough information to reconstruct roughly:
lowercase(exact string as indexed) - (analyzer delimiters) - (stopwords)
For example, if you indexed:
Jerry&Mary's Live Bait and Yellow Cab
with an analyzer that treats "&" and "'" as delimiters, does not index single letters, and treats 'and' as a stopword, you would see something like this in the index:
jerry mary live bait [null word] yellow cab
(You can verify this with Luke, as mentioned above.)
