How long does Azure Search take to index data? - azure

How much time does Azure Search take to index data?
Suppose I am pushing a single record at a time into Azure Search:
POST https://[service name].search.windows.net/datasources?api-version=2015-02-28-Preview
Content-Type: application/json
api-key: [admin key]
{
  "name" : "blob-datasource",
  "type" : "azureblob",
  "credentials" : { "connectionString" : "<my storage connection string>" },
  "container" : { "name" : "my-container", "query" : "<optional-virtual-directory-name>" }
}
So how long will it take before I can read this data back from the REST API?

Your example shows creating a data source, not indexing documents. But assuming you use the indexing API (POST https://[service name].search.windows.net/indexes/[index name]/docs/index), the delay before just-indexed documents show up in search results ranges from near-instant to a few seconds, depending on the service topology and load.
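For illustration, here is a minimal Node.js sketch (Node 18+ for the global fetch) that uploads one document and then polls a key lookup until it is readable over REST. The index name my-index, the id key field, and the api-version are assumptions, not details from the original question.

// Minimal sketch: upload a single document, then poll a key lookup until it
// becomes visible. Index name "my-index", key field "id" and api-version are
// assumed values; adjust them to your service.
const service = "https://[service name].search.windows.net";
const apiVersion = "2020-06-30";
const headers = { "Content-Type": "application/json", "api-key": "[admin key]" };

async function indexAndReadBack() {
  // Documents index API: "@search.action": "upload" adds or replaces the document.
  await fetch(`${service}/indexes/my-index/docs/index?api-version=${apiVersion}`, {
    method: "POST",
    headers,
    body: JSON.stringify({ value: [{ "@search.action": "upload", id: "1", name: "test" }] })
  });

  // Key lookup; in practice this usually succeeds within seconds of the upload.
  for (let attempt = 0; attempt < 10; attempt++) {
    const res = await fetch(`${service}/indexes/my-index/docs/1?api-version=${apiVersion}`, { headers });
    if (res.ok) { console.log(await res.json()); return; }
    await new Promise(resolve => setTimeout(resolve, 1000)); // wait 1s and retry
  }
  console.log("Document not visible yet");
}

indexAndReadBack();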

Related

Azure Cognitive Search: Field Mappings

I would like to set up a field mapping in Cognitive Search. I am using the web-based UI and the Import data wizard to create an index and indexer.
Once I have created the index and indexer using the wizard, I have tried adding a component to the indexer's JSON, such as
"fieldMappings": [
{
"sourceFieldName": "metadata_storage_name",
"targetFieldName": "new_storage_name"
}
]
I then run the indexer; it succeeds, but in Search explorer the field "new_storage_name" is null for all results.
What I really want is to add a new field that holds the unencoded "metadata_storage_path", and I hit some problems there too, but since I am also stuck on this very basic step, I thought I would try to sort it out first.
Is there something in the workflow I am getting wrong? I did not find the MS docs very helpful.
You need to use the base64Decode function.
"fieldMappings" : [
{
"sourceFieldName": "metadata_storage_name",
"targetFieldName": "new_storage_name"
"mappingFunction" : {
"name" : "base64Decode",
"parameters" : { "useHttpServerUtilityUrlTokenDecode" : false }
}
}]
Read more:
https://learn.microsoft.com/en-us/azure/search/search-indexer-field-mappings?WT.mc_id=Portal-Microsoft_Azure_Search&tabs=rest#mappingFunctions
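For context, this is roughly where that snippet lives in a full indexer definition; below is a hedged Node.js sketch (Node 18+) that PUTs an updated indexer over REST rather than editing it in the portal. The indexer, data source and index names and the api-version are assumptions, not values from the question.

// Hedged sketch: PUT an updated indexer definition containing the corrected
// field mapping. The names below are placeholders; adjust them to your service.
const service = "https://[service name].search.windows.net";
const indexerDefinition = {
  name: "my-indexer",
  dataSourceName: "blob-datasource",
  targetIndexName: "my-index",
  fieldMappings: [
    {
      sourceFieldName: "metadata_storage_name",
      targetFieldName: "new_storage_name",
      mappingFunction: { name: "base64Decode" } // parameters are optional
    }
  ]
};

fetch(`${service}/indexers/my-indexer?api-version=2020-06-30`, {
  method: "PUT",
  headers: { "Content-Type": "application/json", "api-key": "[admin key]" },
  body: JSON.stringify(indexerDefinition)
}).then(res => console.log(res.status));

// The target field must already exist in the index schema and be retrievable;
// reset and rerun the indexer so documents indexed earlier pick up the mapping.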

CosmosDb Mongo - collection with shardkey, slow query by shardkey?

I have a CosmosDB collection using the MongoDB API.
This is a customer database, and the ShardKey is actually CustomerId.
My collection has 200,000 records and a combined index on both Email and CustomerId.
An example of a customer:
{
  "CustomerId" : "6a0f4360-d722-4926-9751-9c7fe6a97cb3",
  "FirstName" : "This is my company first name",
  "LastName" : "This is my company last name",
  "Email" : "6a0f4360-d722-4926-9751-9c7fe6a97cb3#somemail.com",
  "Addresses" : [
    {
      "AddressId" : "54e34da9-55fb-4d60-8411-107985c7382e",
      "Door" : "11111",
      "Floor" : "99",
      "Side" : "B",
      "ZipCode" : "8888"
    }
  ]
}
What I find strange is that if I query by Email it costs 7000 RUs (which is too much, at least according to Data Explorer...), but if I query by CustomerId it costs more or less the same.
My questions are:
Shouldn't both operations cost fewer RUs than this, especially the one by CustomerId?
An example of a query by E-mail:
{ "Email" : { $eq: "3f7da6c3-81bd-4b1d-bfa9-d325388079ab#somemail.com" } }
An example of a query by CustomerId:
{ "CustomerId" : { $eq: "3f7da6c3-81bd-4b1d-bfa9-d325388079ab" } }
Another question: my index contains both Email and CustomerId. Is there a way to query by Email and return only CustomerId, for example?
Shouldn't both operations cost fewer RUs than this, especially the one by CustomerId?
CustomerId is your shard key (aka partition key), which groups documents with the same CustomerId value into the same logical partition. This grouping is used for pinpoint GET/SET calls to Cosmos, but not during querying, so you need an explicit index on CustomerId.
Furthermore, since the index you have is a composite index on CustomerId and Email, querying on only one of these fields at a time leads to a scan to produce the result, hence the high and roughly equal RU charge for each of these queries.
Another question: my index contains both Email and CustomerId. Is there a way to query by Email and return only CustomerId, for example?
Firstly, to query optimally on Email, you need to create a separate index on Email. Then you can use Mongo's projection feature to include only certain fields in the response.
Something like this:
find({ "Email" : { $eq: "3f7da6c3-81bd-4b1d-bfa9-d325388079ab#somemail.com" } }, { "CustomerId":1 })

Marklogic|NodeJS API - Query on a specific categorie "properties"

I have a JSON document in my DB that looks like this:
{
  "uri" : "/me/myself/and/bd1e0f91656bfc713eb6560eeaad7ad1.json",
  "category" : "content",
  "format" : "json",
  "versionId" : "14697362595356370",
  "contentType" : "application/json",
  "contentLength" : "1938",
  "collections" : ["http://me.myself.com/collectionA"],
  "properties" : {
    "relatives" : ["/me/myself/and/B.json", "/me/myself/and/A.json"]
  },
  "content" : {}
}
I'm trying to get all documents that have a specific relative in the properties:
qb.where(
  qb.scope(
    qb.property('relatives'),
    qb.word("/me/myself/and/B.json"),
    qb.fragmentScope('properties')
  )
)
But I keep getting a large set of documents that don't fit the query.
Any idea how to do this using the Marklogic NodeJS API?
I see two things that look like they might be problems. The first is qb.fragmentScope('properties'). This tells MarkLogic to look in the document's properties fragment rather than the document's content, which doesn't look like what you meant, given your sample JSON document.
The second problem is the word query: "/me/myself/and/B.json" is likely being broken up into its constituent words (me, myself, and, B, json), which then match in other documents. You want to match exactly what's there, so try a value query:
qb.where(
  qb.scope(
    qb.properties('properties'),
    qb.value('relatives', '/me/myself/and/B.json')
  )
)
Note that qb.scope and qb.properties restrict the search so it only matches the value when it appears in relatives under a properties JSON property. This is different from the document-properties-versus-content point made above.
qb.where(
  qb.propertiesFragment(
    qb.term('/me/myself/and/B.json')
  )
)
This worked for me.
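For completeness, here is a minimal sketch of running that query with the marklogic npm package; the host, port and credentials are placeholders, not values from the question.

// Hedged sketch: full client setup around the properties-fragment query that
// the asker reported working. Connection details below are placeholders.
const marklogic = require("marklogic");
const qb = marklogic.queryBuilder;

const db = marklogic.createDatabaseClient({
  host: "localhost",
  port: 8000,
  user: "admin",
  password: "admin",
  authType: "DIGEST"
});

db.documents.query(
  qb.where(
    qb.propertiesFragment(
      qb.term("/me/myself/and/B.json")
    )
  )
).result(
  documents => documents.forEach(doc => console.log(doc.uri)), // matching URIs
  error => console.error(error)
);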

Looking for a REST API with template based fake data (NodeJS + key-value store)

I want to be able to write a config like this...
{
  "collection" : "payments",
  "rows" : 100,
  "template" : {
    "id" : "1...100000",
    "status" : ["None", "SentToPayer", "Overdue", "Completed"],
    "amount" : ["100", "200", "500"]
  }
}
...and it would create a key-value pair collection in a key-value store (maybe MongoDB) with 100 rows like this:
{
  "id": "80494",
  "status": "None",
  "amount": "200"
}
And the data would be fully accessible and editable through a REST API.
GET http://server/payments/80494 would likely get me the node above.
I am pretty sure I've seen something like this before, but I'm not able to find it right now. Does anyone know something that gives me what I want?
I ended up creating my own thingy.
This one puts data into a MongoDB database.
https://github.com/janjarfalk/datafixture.js
This one exposes it.
https://github.com/tdegrunt/mongodb-rest
You can try json-server to fake a REST API on the server side, or FakeRest to fake a REST API on the client side.
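As a rough sketch of the json-server route, the script below generates rows from a template like the one in the question and writes them to db.json; the pick helper and the field choices are illustrative, not part of json-server itself.

// Hedged sketch: generate fake rows from a template similar to the one in the
// question and serve them with json-server.
const fs = require("fs");

const pick = values => values[Math.floor(Math.random() * values.length)];

const rows = Array.from({ length: 100 }, () => ({
  id: String(Math.floor(Math.random() * 100000) + 1), // "1...100000"
  status: pick(["None", "SentToPayer", "Overdue", "Completed"]),
  amount: pick(["100", "200", "500"])
}));

fs.writeFileSync("db.json", JSON.stringify({ payments: rows }, null, 2));

// Then run:  npx json-server --watch db.json
// and GET http://localhost:3000/payments/<id> returns the matching row,
// while POST/PUT/DELETE on /payments edit the data.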

Search engine by distance

I am looking to add an option to the search engine on my site so that users can search for items within a set distance, e.g. items within 10 miles, 20 miles, etc. I was wondering how this could be done.
The user would enter their postcode, and I also have the postcode of the item's location. Once they hit search, there needs to be a way to work out the distance between the two locations in miles and then display the results ordered by distance, so the closest item is the first result. Is there a Google API for this, something like using the maps 'get directions' option to work out the distance in miles? Or something I can add to my database?
The Google Geocoding API supports zip code lookup and can return the country, city, and lat/lon given just a zip code as the address. Once you have the lat/lon, you can easily calculate the distance and sort the results.
http://code.google.com/apis/maps/documentation/geocoding/#GeocodingRequests
Note: you can only use the Geocoding API in conjunction with a Google map; geocoding results without displaying them on a map is prohibited. For complete details on allowed usage, consult the link.
So if, for example, you request a lookup for zip code 94043, you call the following URL:
http://maps.googleapis.com/maps/api/geocode/json?address=94043&sensor=false
which returns JSON such as the following:
{
"results" : [
{
"address_components" : [
{
"long_name" : "94043",
"short_name" : "94043",
"types" : [ "postal_code" ]
},
...
"location" : {
"lat" : 37.4284340,
"lng" : -122.07238160
},
"location_type" : "APPROXIMATE",
...
"status" : "OK"
}
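Once you have the lat/lng from a response like the one above, the distance step mentioned earlier can be a plain haversine calculation. A minimal sketch in JavaScript (the example item data and 10-mile radius are placeholders):

// Haversine sketch: great-circle distance in miles between two lat/lng points,
// e.g. the user's postcode location and an item's location.
function distanceMiles(lat1, lng1, lat2, lng2) {
  const toRad = deg => deg * Math.PI / 180;
  const R = 3959; // mean Earth radius in miles
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a = Math.sin(dLat / 2) ** 2 +
            Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Filter items to the chosen radius and sort closest-first.
const items = [{ name: "Item A", lat: 37.39, lng: -122.08 }]; // example data
const results = items
  .map(item => ({ ...item, miles: distanceMiles(37.428434, -122.0723816, item.lat, item.lng) }))
  .filter(item => item.miles <= 10)
  .sort((a, b) => a.miles - b.miles);
console.log(results);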
If you cannot use the Google API for some reason, here is a list of non-Google geocoder APIs and services:
http://code.google.com/p/gmaps-api-issues/wiki/NonGoogleGeocoders
http://webgis.usc.edu/Services/Geocode/About/GeocoderList.aspx
