I am trying to filter rows on a string-type column. Basically, I want to filter by part of a string, very much like the LIKE operator in MySQL.
I have gone through this document https://learn.microsoft.com/en-us/rest/api/storageservices/querying-tables-and-entities
However, I couldn't find relevant information for my requirement. Any suggestions would be helpful.
Azure Tables have limited querying support and unfortunately LIKE is unsupported. What you would need to do is fetch all the entities on the client side and then apply the filter there.
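A minimal sketch of that client-side step, assuming the entities have already been fetched and behave like dictionaries. The function name `filter_contains` and the sample rows are hypothetical; no Azure SDK call is shown here.

```python
from typing import Iterable, Iterator, Mapping


def filter_contains(
    entities: Iterable[Mapping[str, object]], column: str, fragment: str
) -> Iterator[Mapping[str, object]]:
    """Yield entities whose string `column` contains `fragment`, ignoring case."""
    needle = fragment.lower()
    for entity in entities:
        value = entity.get(column)
        # Skip entities where the column is missing or not a string.
        if isinstance(value, str) and needle in value.lower():
            yield entity


# Hypothetical rows standing in for entities already pulled from the table.
rows = [
    {"PartitionKey": "p1", "RowKey": "1", "Title": "Azure Table basics"},
    {"PartitionKey": "p1", "RowKey": "2", "Title": "Blob storage notes"},
    {"PartitionKey": "p2", "RowKey": "3", "Title": "Querying tables"},
]
matches = list(filter_contains(rows, "Title", "table"))
```

Because the filter runs over a generator, it can be applied while pages of results are still arriving, but every entity still has to be transferred from storage.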
I need to query JSON data stored in Azure blob storage, with filtering (on text, date and int data types) and paging (i.e. functionality similar to skip and take).
The problem with my JSON is that it has no fixed format: the key/value pairs are dynamic, so the key/value pairs of one JSON result can differ from those of another.
Can Azure Search help in building indexes on such dynamic JSON data so that it can be queried, or is there another preferred way?
Take a look at https://learn.microsoft.com/en-us/azure/search/search-howto-index-json-blobs; it may help.
Another option might be to export the JSON from blob storage into Azure SQL Database or DocumentDB (not necessarily everything; you can export just the part of the data that you need) and query it there.
If you only need filtering like exact matches and numerical comparisons, then a document database such as DocumentDB may be a better choice than Azure Search.
Azure Search excels in linguistically aware full text search (including things like dealing with inflected word forms, misspellings, fuzzy matching, etc.)
As Jovan pointed out, the options are not mutually exclusive - you can use DocumentDB as the primary store and Azure Search for full text search scenarios (getting data from DocumentDB using DocumentDB indexer if necessary).
I have two fairly general questions about full text search in a database. I was looking into Elasticsearch and Solr, and it seems to me that one needs to produce separate documents made up of table entries, which then get searched. So the result of such a search is not actually a database entry? Or did I misunderstand something?
I also looked into Whoosh, which does index table columns, and the results from Whoosh are actual table rows.
When using Solr or Elasticsearch, should I put the row ID into the document that gets searched and, once I have my result, use that ID to retrieve the relevant rows from the table? Or is there a better solution?
My other question: if I have an ID like abc/123.64664, stored as a string, is there any advantage in searching such a column with FTS? It seems to me there is not much to be gained by indexing it. Or am I wrong?
thanks
Elasticsearch can store the indexed document, and you can retrieve it as part of the query result. Usually people still keep the original data in a regular DB, which gives you more reliability and flexibility when reindexing. Mind that ES indexes non-relational data. You can keep your data stored in a relational manner and compose denormalized documents for indexing.
As for "abc/123.64664", you can index it as a tokenized string, or you can tune the index for prefix search, etc. It's up to you.
(TL;DR) Don't think about how your data is structured in your RDBMS. Think about what you are searching for.
Content storage for good full text search is quite different from relational database standard storage. So, your data going into Search Engine can end up looking quite differently from the way you stored it.
This is all driven by your expected search results. You may increase granularity of the data or - opposite - denormalize it so the parent/related record content shows up in the records you actually want returned as part of search. Text processing (copyField, tokenization, pre-processing, etc) is also where a lot of content modifications happen to make a record findable.
Sometimes, relational databases support full-text search. PostgreSQL is getting better and better at that. But most of the time, relational databases just do not provide enough flexibility to support good relevancy-driven search.
Finally, if the original schema is quite complex, it may make sense to use the search engine only to get the right (relevant) IDs out and then merge them in the client code with the details from the original database records.
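As a rough sketch of that last approach, assume the search engine has already returned a relevance-ordered list of row IDs. The table name `articles`, its columns, and the helper `fetch_in_rank_order` are made up for illustration, and SQLite stands in for the original database.

```python
import sqlite3


def fetch_in_rank_order(conn, ranked_ids):
    """Load rows for the given IDs and return them in the search engine's order."""
    if not ranked_ids:
        return []
    placeholders = ",".join("?" for _ in ranked_ids)
    rows = conn.execute(
        f"SELECT id, title FROM articles WHERE id IN ({placeholders})",
        ranked_ids,
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # The database returns rows in arbitrary order, so restore the relevance
    # ranking and drop IDs the index still knows but the database no longer has.
    return [by_id[i] for i in ranked_ids if i in by_id]


# Hypothetical in-memory table standing in for the original database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "Intro"), (2, "Indexing"), (3, "Ranking")],
)
ranked_ids = [3, 1, 99]  # pretend search result; 99 was deleted after indexing
results = fetch_in_rank_order(conn, ranked_ids)
```

Tolerating stale IDs matters in practice, because the search index usually lags behind the database.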
I am new to the NoSQL concept, so when I started to learn PouchDB, I found this conversion chart. My confusion is how PouchDB handles the case where I have multiple tables: does it mean that I need to create multiple databases? From my understanding, a PouchDB database can store a lot of documents, but does a document mean a row in SQL, or have I misunderstood?
The answer to this question seems to be surprisingly under-documented. While @llabball clearly gave a decent answer, I don't think that views are always the way to go.
As you can read here in the section When not to use map/reduce, Nolan explains that for simpler applications, the key is to abuse _ids, and leverage the power of allDocs().
In other words, if you had two separate types (say artists, and albums), then you could prefix the id of each type to obtain an easily searchable data set. For example _id: 'artist_name' & _id: 'album_title', would allow you to easily retrieve artists in name order.
Laying out the data this way will result in better performance due to not requiring extra indexes, and less code. Clearly however, if your data requirements are more complex, then views are the way to go.
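To illustrate why prefixed ids are enough, here is a small Python emulation of a key-range lookup over documents sorted by _id. This is not PouchDB code: `docs_with_prefix` and the sample documents are hypothetical, and the "\ufff0" upper bound mirrors the high-character trick commonly used as an end key. Python compares strings by code point, which may differ from the database's own collation.

```python
from bisect import bisect_left, bisect_right

HIGH = "\ufff0"  # a very high code point, used as the end of the key range


def docs_with_prefix(docs, prefix):
    """Return the docs whose _id starts with `prefix`, in _id order."""
    ordered = sorted(docs, key=lambda d: d["_id"])
    ids = [d["_id"] for d in ordered]
    lo = bisect_left(ids, prefix)          # first id >= prefix
    hi = bisect_right(ids, prefix + HIGH)  # just past the last id <= prefix + HIGH
    return ordered[lo:hi]


docs = [
    {"_id": "album_Bossanova", "artist": "Pixies"},
    {"_id": "artist_Pixies"},
    {"_id": "artist_ABBA"},
    {"_id": "album_Arrival", "artist": "ABBA"},
]
artists = docs_with_prefix(docs, "artist_")
albums = docs_with_prefix(docs, "album_")
```

Since the ids are already sorted, each "table" is just a contiguous slice of the key space and needs no secondary index.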
... does it mean that i need to create multiple databases?
No.
... a document mean a row in sql or am i misunderstood?
That's right. An SQL table defines column headers (name and type); those correspond to the JSON property names of the doc.
So, all docs (rows) with the same properties (a so-called "schema") are the equivalent of your SQL table. You can have as many different schemas in one database as you want (visit json-schema.org for some inspiration).
How to request them separately? Create CouchDB views! You can get all/some "rows" of your tabular data (docs with the same schema) with one request as you know it from SQL.
To make such views easy to write, it is very common for CouchDB docs to carry a type property. The name you would have given the SQL table can be your type, like doc.type: "animal".
Your view names will be maybe animalByName or animalByWeight. Depends on your needs.
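Conceptually, a view like animalByName selects the docs of one type and sorts them by a key. The Python emulation below only illustrates that idea and is not CouchDB's view API; the function `animal_by_name` and the sample docs are invented for the example.

```python
def animal_by_name(docs):
    """Emulate a view: emit (name, doc) for every doc of type 'animal', sorted by name."""
    rows = [
        (doc["name"], doc)
        for doc in docs
        if doc.get("type") == "animal" and "name" in doc
    ]
    return sorted(rows, key=lambda row: row[0])


# Hypothetical docs of two different "schemas" living in one database.
docs = [
    {"_id": "1", "type": "animal", "name": "zebra", "weight": 300},
    {"_id": "2", "type": "plant", "name": "fern"},
    {"_id": "3", "type": "animal", "name": "ant", "weight": 0.001},
]
names = [key for key, _doc in animal_by_name(docs)]
```

Docs of other types are simply skipped, which is how several "tables" can share one database.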
Sometimes a multiple-database plan is a good option, like a database per user or even a database per user feature. Take a look at this conversation on the CouchDB mailing list.
I have gone through the Google API for Freebase, but I am still confused.
Is there a simple way to dump relations from Freebase?
I want to dump all entity-name pairs with a specific relation (e.g. marry_with, ...), and I also want the Chinese entity names.
Should I
write MQL to query all entities satisfying the condition? (But the MQL service is going to be retired soon.)
or dump all of Freebase and parse it?
or is there another API capable of doing this?
or is another KB (YAGO, DBpedia, Wikidata) easier to use for this?
Which way is the easiest to work out?
Please point me in the right direction. Thanks.
Freebase was retired and Wikidata is the recommended alternative.
You can use the Wikidata Query API to get entities with a specific property.
For instance, the query http://wdq.wmflabs.org/api?q=CLAIM[26] retrieves the IDs of all items having the property spouse (P26).
You can combine this with the Wikidata API, for instance to get labels and aliases in English for the first three items returned by the previous query:
http://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q23|Q24|Q42&languages=en&props=labels|aliases
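If you build such URLs in code, something like the following sketch may help. It only constructs the request URL and does not call the API. The helper name `wbgetentities_url` is hypothetical, and adding "zh" to the languages (to get the Chinese names the question asks for) and format=json are my additions to the URL above.

```python
from urllib.parse import urlencode


def wbgetentities_url(ids, languages=("en",), props=("labels", "aliases")):
    """Build a wbgetentities request URL for the given item IDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),              # multiple values are pipe-separated
        "languages": "|".join(languages),
        "props": "|".join(props),
        "format": "json",
    }
    # urlencode percent-encodes the pipe character as %7C.
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


url = wbgetentities_url(["Q23", "Q24", "Q42"], languages=("en", "zh"))
```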
Sorry if this sounds like a rather dumb question but I would like to do a "select" on data from a Windows Azure table. I tried the following and it worked:
from status in _statusTable.GetAll()
where status.RowKey.StartsWith(name)
I then tried
from status in _statusTable.GetAll()
where status.Description.StartsWith(name)
This one gave me nothing. Can anyone explain to me if or how I can query on properties that are not part of the RowKey or PartitionKey?
You can query on any property, but the types of query supported are limited, e.g. StartsWith isn't supported. Also, if you aren't querying on PartitionKey and RowKey, there are some very important performance issues to understand, and you always need to be aware of continuation tokens (ContinuationToken): almost any query result can contain them.
You can see the sorts of queries supported by looking at the REST API: http://msdn.microsoft.com/en-us/library/dd894031.aspx - it's pretty limited (but quick as a result):
Equal
GreaterThan
GreaterThanOrEqual
LessThan
LessThanOrEqual
NotEqual
If you need to do more, then:
you can mimic things like StartsWith("Fred") by combining GreaterThanOrEqual("Fred") and LessThan("Free")
or client side filtering will work - but that means pulling back all the rows from the storage - which could be a lot of data and which could be computationally and transactionally expensive!
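The range trick from the first option can be sketched as follows. The helper names `prefix_range` and `starts_with_filter` are hypothetical; the output is a filter string using the ge and lt comparison operators from the table service's REST query syntax. The sketch assumes the last character of the prefix is not already the highest possible code point.

```python
def prefix_range(prefix):
    """Return (lower, upper) such that lower <= s < upper matches s.startswith(prefix)."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Increment the last character: "Fred" -> "Free".
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def starts_with_filter(prop, prefix):
    """Build a filter string that emulates StartsWith on `prop`."""
    lower, upper = prefix_range(prefix)

    def quote(value):
        # Single quotes inside a string literal are escaped by doubling them.
        return "'" + value.replace("'", "''") + "'"

    return f"{prop} ge {quote(lower)} and {prop} lt {quote(upper)}"


bounds = prefix_range("Fred")
query = starts_with_filter("RowKey", "Fred")
```

Here `query` is the string RowKey ge 'Fred' and RowKey lt 'Free'. The comparison is ordinal and case-sensitive, so "fred" would not match.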
What does GetAll() do? StartsWith isn't supported by WA tables, so I'm assuming GetAll pulls all the data locally, and so your query is done over objects in memory. If so, this has nothing to do with Windows Azure, so I'd take a look at whether your data looks like you expect it to.