Does anyone know how data is retrieved from table storage?
var result = ctx.CreateQuery<Contact>("Contacts")
    .Where(x => x.PartitionKey == "key")
    .Take(50)
    .AsTableServiceQuery<Contact>()
    .Execute();

foreach (var item in result)
{
    Console.WriteLine(item.FirstName);
}
Does it get all the items from storage and then loop through them, or does it fetch each item separately?
Take a look at the following links.
This one talks about the basics of table storage -
http://msdn.microsoft.com/en-us/magazine/ff796231.aspx
This one covers more than you are asking about, but there are some How To code examples that might be useful for querying table storage - http://www.windowsazure.com/en-us/develop/net/how-to-guides/table-services/
I also recommend this video from the PDC. It's a deep dive into tables and queues in Azure. - http://www.microsoftpdc.com/2009/svc09
You could have checked this using Fiddler. The Table service is a REST service: CreateQuery() builds a REST query, and executing it makes an HTTP call and parses the response, which is XML containing all the entities that match the query (limited to 1000 per response, with continuation tokens if there are more than 1000 results). All the items are in the response XML, so there is no need to query each item separately.
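The paging behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: fetch_page is a hypothetical stand-in for one HTTP call to the table service, and the in-memory list and the integer token format are invented for the example. It shows that a single request returns a whole page of entities, and that a client follows continuation tokens instead of requesting items one by one.

```python
from typing import Iterator, List, Optional, Tuple

PAGE_LIMIT = 1000  # per-response cap mentioned in the answer above

# Hypothetical in-memory stand-in for the entities in one partition.
FAKE_TABLE = [{"PartitionKey": "key", "FirstName": f"Name{i}"} for i in range(2500)]


def fetch_page(token: Optional[int], top: Optional[int] = None) -> Tuple[List[dict], Optional[int]]:
    """Simulate ONE REST call: return a page of entities and a continuation token."""
    start = token or 0
    size = PAGE_LIMIT if top is None else min(top, PAGE_LIMIT)
    page = FAKE_TABLE[start:start + size]
    next_start = start + len(page)
    # A token is only returned while more entities remain.
    next_token = next_start if next_start < len(FAKE_TABLE) else None
    return page, next_token


def query_all() -> Iterator[dict]:
    """Follow continuation tokens until no further page is reported."""
    token = None
    while True:
        page, token = fetch_page(token)
        yield from page
        if token is None:
            break


first_50, _ = fetch_page(None, top=50)  # like Take(50): one call, 50 entities
everything = list(query_all())          # 2500 entities need three simulated calls
```

In this sketch the loop over the result never triggers a call per item; calls happen only at page boundaries.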
I have an ADF pipeline that iterates over a set of files and performs various operations, and I have an Azure CosmosDB (SQL API) instance where I would like to insert the file name and a timestamp. The main goal is to keep track of which files have already been processed, but in the future I might want to add other data related to each file.
What I have is my CosmosDB
Currently I am trying to use the Copy Data activity for the insert part.
One problem is that this activity expects a source, while at this point I only have the file name. In theory I could use the Blob Storage from which I read the file at the beginning, but since that Blob Storage is set to store binary files, I get the following error if I try to use it as the source
Because of that I created a dummy CosmosDB linked service, but I have several issues with this approach:
Generally, the idea of a dummy source is not very appealing to me
I haven't found much information on the topic, but it seems that if I want to use something in the Sink I need to SELECT it from the source
Even though I have selected a value for the id, the item is not saved with the value selected in the Source query. As you can see from the first screenshot, I get a GUID, and only the name is as I want it.
So I have two questions. I am just learning ADF, but this approach doesn't look like the proper way to insert an item into CosmosDB from an activity, so a better or more common approach would be appreciated. If there is no better proposal, how can I at least apply my own value for the id column? If I create the item in the CosmosDB GUI and save it from there, as you can see, I am able to use the file name as the id, which for now seems like a good idea to me. However, I wasn't able to set a custom value (string or int) through the activity, so how can I achieve this?
This is what my Sink looks like
I'm trying to add entries to my Cosmos DB using Azure Data Factory. However, I am not able to choose the right collection, as Azure Data Factory can only see the top level of the database.
Is there any special syntax for choosing which collection to pick from the Cosmos DB SQL API? I've tried entities[0] and entities['tasks'], but neither seems to work.
The new entries are inserted as shown in the red box. How do I get them into the entries collection?
Update:
Original Answer:
If the requirement you mentioned in the comments is what you need, then it is possible. For example, to put JSON data into an existing ‘tasks’ item, you only need to use the upsert method, and the source json data has the same id as the ‘tasks’ item.
This is the official doc:
https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db#azure-cosmos-db-sql-api-as-sink
The random letters and numbers in your red box appear because you did not specify the document id.
Have a look at this:
By the way, if the tasks have a partition key, then you also need to specify it.
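The effect of the document id on an upsert can be illustrated with a small Python simulation. It is a sketch only: the dict stands in for a container, the upsert helper is hypothetical and is not the Cosmos DB SDK or the ADF sink, and partition keys are ignored. It mirrors the two behaviours described above: a document without an id receives a generated one, while a document whose id matches an existing item replaces that item.

```python
import uuid


def upsert(container: dict, doc: dict) -> str:
    """Insert doc, or replace the stored document that has the same id."""
    doc = dict(doc)  # copy so the caller's object is not modified
    # No id supplied: generate one, which is where the random-looking ids come from.
    doc_id = doc.setdefault("id", str(uuid.uuid4()))
    container[doc_id] = doc
    return doc_id


# Hypothetical container that already holds the 'tasks' item.
container = {"tasks": {"id": "tasks", "entries": []}}

# Source row without an id: ends up as a separate item under a generated id.
generated_id = upsert(container, {"name": "file1.csv"})

# Source row with id 'tasks': replaces the existing 'tasks' item.
upsert(container, {"id": "tasks", "entries": [{"name": "file1.csv"}]})
```

Note that in this sketch, as with a real upsert, the matching item is replaced as a whole rather than merged, so the source document has to carry every field that should survive.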
How can I get an activity by its id (unique uuid) or by foreign_id + time?
I could not find this in the documentation. Everything there describes how to get a feed in pages, not a single activity.
If you save the id you get from adding an activity, then you can opt to fetch activities based on the id, using "id_gte" or "id_lte" and only fetch with offset 0 and limit 1. Such as:
$feed->getActivities(0, 1, ['id_gte' => $id]);
This code is PHP, but their SDKs for other languages should have equivalent functions if you need them.
The general goal should be to use Stream as a secondary data store. Your proprietary data from your customers should always be accessible in your own primary data store, most likely an RDBMS like PostgreSQL. When new follow relationships are created or new activities are added, you should store them locally and replicate the data to GetStream. Then access feeds when a user wants to see a timeline or notification feed, and complement the feed data with data from your own DB (for example: comments, likes, author information, ...)
For this reason, there is no getActivity(uuid) method available.
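The "primary store first, then replicate" pattern described above might look roughly like this in Python. It is a sketch under assumptions: sqlite3 stands in for the primary RDBMS, the table layout is invented, and replicate is a hypothetical callable representing whatever sends the activity to the feed service. It does not use the Stream SDK.

```python
import json
import sqlite3
from typing import Callable, Optional


def add_activity(db: sqlite3.Connection, activity: dict,
                 replicate: Callable[[dict], None]) -> None:
    """Store the activity locally first, then hand it to the secondary store."""
    db.execute(
        "INSERT INTO activities (foreign_id, time, body) VALUES (?, ?, ?)",
        (activity["foreign_id"], activity["time"], json.dumps(activity)),
    )
    db.commit()
    replicate(activity)  # hypothetical hook, e.g. a call into a feed client


def get_activity(db: sqlite3.Connection, foreign_id: str, time: str) -> Optional[dict]:
    """Look up a single activity in the primary store, not in the feed."""
    row = db.execute(
        "SELECT body FROM activities WHERE foreign_id = ? AND time = ?",
        (foreign_id, time),
    ).fetchone()
    return json.loads(row[0]) if row else None


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE activities (foreign_id TEXT, time TEXT, body TEXT, "
    "PRIMARY KEY (foreign_id, time))"
)

replicated = []  # stands in for the remote feed in this example
add_activity(
    db,
    {"foreign_id": "photo:1", "time": "2020-01-01T00:00:00", "verb": "add"},
    replicated.append,
)
found = get_activity(db, "photo:1", "2020-01-01T00:00:00")
```

With this arrangement a single activity is always read from the local table, and the feed is only queried for timelines.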
Using the python client you can get an activity by its ID
import stream
client = stream.connect('YOUR_API_KEY', 'API_KEY_SECRET')
client.get_activities(ids=[activity_id])
Or by its foreign_id + time
client.get_activities(foreign_id_times=[
(foreign_id, activity_time),
])
Taken from https://github.com/GetStream/stream-python/blob/main/README.md
Hi, I'm new to Cloudant (and to Couch, and to asking questions on Stack Overflow, so I hope I manage to be reasonably clear about what I'm asking). I'm trying to do what is probably the second most basic geo task, but I'm hitting a dead end.
I've got a database of docs that are GeoJSON objects, and I've created an index so I can query for intersections etc. It seems the only options I have in the URL are format=legacy (which gives me the ids), format=geojson, and the include_docs parameter. What I'd like to do is return a particular view of the result set: I'm not interested in the geometry of the object (which is a big lump of data), and the document is likely to contain a number of other properties that I'd rather filter out.
Is there a correct way to do this in a single API call, or do I need to fetch the doc ids (legacy format) and then issue a second query to bring back my chosen 'view' for each document id in the format=legacy response?
Thanks
I am writing an application that stores external data in ArangoDB for further processing inside the application. Let's assume I am talking about Photos in Photosets here.
Due to the nature of the APIs used, I need to fetch Photosets before I can load Photos. The Photosets API reply contains a list of Photo IDs that I later use to fetch the Photos. So I created an edge collection called photosInSets and store the edges between Photosets and Photos, although the Photos are not there yet.
Later on, I need to get a list of all needed Photos to load them via the API. All IDs are numeric. At the moment, I use the following AQL query to fetch the IDs of all required Photos:
FOR edge IN photosInSets
  RETURN DISTINCT TO_NUMBER(
    SUBSTITUTE(edge._from, "photos/", "")
  )
However... this does not look like a nice solution. I'd like to (at least) get rid of the string operation to remove the collection name. What's the nice way to do that?
One way you can find this is with a join on the photosInSets edge collection back to the photos collection.
Try a query that looks like this:
FOR e IN photosInSets
  LET item = (FOR v IN photos FILTER e._from == v._id RETURN v._key)
  RETURN item
This joins the _from reference in photosInSets with the _id in the photos collection, then pulls the _key from photos, which doesn't include the collection name.
Have a look at a photo item and you'll see there is _id, _key and _rev as system attributes. It's fine to use the _key value if you want a string, it's not necessary to implement your own unique id unless there is a burning reason why you can't expose _key.
With a little manipulation, you could even return an array of objects stating which photo._key is a member of which photoSet, you'll just have to have two LET commands and return both results. One looking at the Photo, one looking at the photoSet.
I'm not official ArangoDB support, but I'm interested if they have another way of doing this.
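The relationship between _id and _key mentioned above can be shown outside the database with a short Python sketch. It assumes that an _id has the form "collection/key", that edges point from a photo (_from) to a photoset (_to) as in the question's query, and that the set collection is called photosets; the collection name and the sample edges are invented for illustration.

```python
def key_of(document_id: str) -> str:
    """Return the _key part of an _id such as 'photos/123'."""
    _collection, _, key = document_id.partition("/")
    return key


# Hypothetical edge documents as they might be stored in photosInSets.
edges = [
    {"_from": "photos/101", "_to": "photosets/1"},
    {"_from": "photos/102", "_to": "photosets/1"},
    {"_from": "photos/101", "_to": "photosets/2"},
]

# Which photo key is a member of which photoset key.
memberships = [
    {"photo": key_of(e["_from"]), "photoSet": key_of(e["_to"])} for e in edges
]

# Distinct numeric photo IDs that still need to be loaded via the API.
needed_photo_ids = sorted({int(m["photo"]) for m in memberships})
```

This does on the client what the AQL in the question does with SUBSTITUTE, and it works even when the photo documents do not exist yet, because it only reads the edge documents.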