How to fetch subset of attachments - couchdb

I have a CouchDB with documents, which look like this:
{
"_id": "000040cc-e3b4-47cc-b051-a5508efb8996",
"_rev": "1-882d7f88cc2e1e767b55d0c82fb638d2",
"state": "uploaded",
"state_since": "2020-02-17T11:20:55.1450252Z"
// more metadata ...
"_attachments": {
"large.jpg": {
"content_type": "image/jpeg",
"revpos": 1,
"digest": "md5-NK7ejYjrErhMAs7tZ4+R8w==",
"length": 87846,
"stub": true
},
"medium.jpg": {
...
},
"small.jpg": {
...
}
}
}
Let's assume I want to query a set of images like this:
{
"selector": {
"state": "uploaded"
},
"sort": ["state_since"],
"limit": 100
}
If I want to display the thumbnails of those 100 images, I have to iterate through the result list and download the corresponding attachment for each document. That is 101 requests in total.
I could also do it in one request by specifying that I want to fetch the documents with their attachments, but this would return all (potentially very large) attachments.
I know that I can set the fields property in my query to return only the fields I need. Can I apply this to attachments, too? And if so, how?

No, there's no way to do what you're requesting. The only ways to fetch a subset of attachments are to fetch them one at a time, or to use the atts_since parameter when fetching a single document, which is intended for use in replication.
Consider redesigning your documents: for example, you could store the thumbnails in a separate document that contains only thumbnails.
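As a rough sketch of that redesign, assume each image document gets a companion document that holds only its thumbnail attachment. The "thumb:" ID prefix, the database name and both helper functions below are hypothetical and not part of CouchDB. The sketch only builds the request description; it relies on _all_docs accepting a keys list in a POST body and, to my knowledge, on include_docs=true&attachments=true inlining attachment bodies as Base64.

```javascript
// Hypothetical convention: the thumbnail for document <id> lives in "thumb:<id>".
function thumbnailDocId(imageDocId) {
  return "thumb:" + imageDocId;
}

// Describe one bulk request that loads every thumbnail document for the
// image documents returned by the Mango query. Nothing is sent here.
function buildThumbnailRequest(dbName, imageDocs) {
  return {
    method: "POST",
    path: "/" + encodeURIComponent(dbName) +
      "/_all_docs?include_docs=true&attachments=true",
    body: {
      keys: imageDocs.map(function (doc) { return thumbnailDocId(doc._id); })
    }
  };
}

const request = buildThumbnailRequest("images", [
  { _id: "000040cc-e3b4-47cc-b051-a5508efb8996" }
]);
console.log(JSON.stringify(request, null, 2));
```

With this layout the list page would need two requests (the query plus one bulk fetch) instead of 101, at the cost of keeping the two documents in sync.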

Related

Forge-Get Item Path along with custom attributes in BIM360 document

Two requirements are needed:
Get the item path of a document in BIM360 Document Management.
Get all custom attributes for that item.
For requirement 1, an API exists to fetch the path, and another API exists for getting custom attributes; data can be retrieved from both.
Is there a way to satisfy both requirements in a single API call instead of two?
With a large number of records, the API that retrieves the item path takes more than an hour to fetch 19000+ records, and the token expires even though a refresh token is used, while the custom attribute API processes data in batches of 50 and completes in only 5 minutes.
Please suggest.
Batch-Get Custom Attributes returns the additional attributes that are specific to Document Management, while the path in project is general information from Data Management.
The Data Management API provides some endpoints in the form of commands, which ask the backend to process data for a batch of items.
https://forge.autodesk.com/en/docs/data/v2/reference/http/ListItems/
This command retrieves metadata for up to 50 specified items at a time. It also supports the flag includePathInProject, but the usage is tricky and the API documentation does not mention it. The response contains the pathInProject of these items. This may save more time than iterating item by item.
{
"jsonapi": {
"version": "1.0"
},
"data": {
"type": "commands",
"attributes": {
"extension": {
"type": "commands:autodesk.core:ListItems",
"version": "1.0.0",
"data":{
"includePathInProject":true
}
}
},
"relationships": {
"resources": {
"data": [
{
"type": "items",
"id": "urn:adsk.wipprod:dm.lineage:vkLfPabPTealtEYoXU6m7w"
},
{
"type": "items",
"id": "urn:adsk.wipprod:dm.lineage:bcg7gqZ6RfG4BoipBe3VEQ"
}
]
}
}
}
}
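Since the command accepts at most 50 items per call, a large item list has to be split into batches. The following is a minimal sketch that only builds the request bodies, reusing the payload shape shown above; the function name and the placeholder item IDs are assumptions, and sending the bodies (plus token handling) is left out.

```javascript
// Build one ListItems command body per batch of item IDs.
// batchSize defaults to 50, the per-call maximum mentioned above.
function buildListItemsCommands(itemIds, batchSize) {
  const size = batchSize || 50;
  const commands = [];
  for (let i = 0; i < itemIds.length; i += size) {
    const batch = itemIds.slice(i, i + size);
    commands.push({
      jsonapi: { version: "1.0" },
      data: {
        type: "commands",
        attributes: {
          extension: {
            type: "commands:autodesk.core:ListItems",
            version: "1.0.0",
            data: { includePathInProject: true }
          }
        },
        relationships: {
          resources: {
            data: batch.map(function (id) {
              return { type: "items", id: id };
            })
          }
        }
      }
    });
  }
  return commands;
}

// Placeholder IDs for illustration only; real IDs are item lineage URNs.
const ids = Array.from({ length: 120 }, function (_, i) {
  return "urn:example:item-" + i;
});
const commands = buildListItemsCommands(ids);
console.log(commands.length); // 3 batches: 50 + 50 + 20
```

For 19000 items this would come to 380 command calls rather than one call per item.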
Get item path of the document in a BIM360 document management.
Is this question about getting the hierarchy of the item, e.g. rootfolder>>subfolder>>item? With the endpoint below, specifying the query parameter includePathInProject=true returns the relative path of the item (pathInProject) in the folder structure.
https://forge.autodesk.com/en/docs/data/v2/reference/http/projects-project_id-items-item_id-GET/
"data": {
"type": "items",
"id": "urn:adsk.wipprod:dm.lineage:xxx",
"attributes": {
"displayName": "my-issue-att.png",
"createTime": "2021-03-12T04:51:01.0000000Z",
"createUserId": "xxx",
"createUserName": "Xiaodong Liang",
"lastModifiedTime": "2021-03-12T04:51:02.0000000Z",
"lastModifiedUserId": "200902260532621",
"lastModifiedUserName": "Xiaodong Liang",
"hidden": false,
"reserved": false,
"extension": {
"type": "items:autodesk.bim360:File",
"version": "1.0",
"schema": {
"href": "https://developer.api.autodesk.com/schema/v1/versions/items:autodesk.bim360:File-1.0"
},
"data": {
"sourceFileName": "my-issue-att.png"
}
},
"pathInProject": "/Project Files"
}
Alternatively, you can walk up the folder hierarchy using the parent data:
"parent": {
"data": {
"type": "folders",
"id": "urn:adsk.wipprod:fs.folder:co.sdfedf8wef"
},
"links": {
"related": {
"href": "https://developer.api.autodesk.com/data/v1/projects/b.project.id.xyz/items/urn:adsk.wipprod:dm.lineage:hC6k4hndRWaeIVhIjvHu8w/parent"
}
}
},
Get all custom attributes for that item. For Req. 1, an API exists to fetch the path, and another API exists for getting custom attributes; data can be retrieved from both. Is there a way to satisfy both requirements in a single API call instead of two? With a large number of records, the API that retrieves the item path takes more than an hour to fetch 19000+ records, and the token expires even though a refresh token is used, while the custom attribute API processes data in batches of 50 and completes in only 5 minutes. Please suggest.
Let me try to understand the question better. First, there are two things: Custom Attribute Definitions and Custom Attribute Values (attached to the documents). Could you clarify which of them the 19000+ records refer to?
If you mean Custom Attribute Definitions, the API to fetch them is
https://forge.autodesk.com/en/docs/bim360/v1/reference/http/document-management-custom-attribute-definitions-GET/
It supports a limit per call; the maximum limit for one call is 200, which means you can fetch 19000+ records in about 95 calls, and each call should be quick (in my experience, under 10 seconds). That is around 15 minutes in total instead of more than 1 hour.
Or does each call of 200 records take much longer on your side?
If you mean Custom Attribute Values, the API to fetch them is
https://forge.autodesk.com/en/docs/bim360/v1/reference/http/document-management-versionsbatch-get-POST/
As you know, it handles 50 records per call. And it seems to be pretty quick on your side, taking only 5 minutes to fetch the values of 19000+ records?

How to include fields in API server and remove them before returning results to client in GraphQL

I have a Node.js GraphQL server. From the client, I am trying to get all the user entries using a query like this:
{
user {
name
entries {
title
body
}
}
}
In the Node.js GraphQL server, however, I want to return only the user entries that are currently valid, based on publishDate and expiryDate in the entries object.
For example:
{
"user": "john",
"entries": [
{
"title": "entry1",
"body": "body1",
"publishDate": "2019-02-12",
"expiryDate": "2019-02-13"
},
{
"title": "entry2",
"body": "body2",
"publishDate": "2019-02-13",
"expiryDate": "2019-03-01"
},
{
"title": "entry3",
"body": "body3",
"publishDate": "2020-01-01",
"expiryDate": "2020-01-31"
}
]
}
should return this:
{
"user": "john",
"entries": [
{
"title": "entry2",
"body": "body2",
"publishDate": "2019-02-13",
"expiryDate": "2019-03-01"
}
]
}
The entries are fetched via a delegateToSchema call (https://www.apollographql.com/docs/graphql-tools/schema-delegation.html#delegateToSchema) and I don't have an option to pass publishDate and expiryDate as query parameters. Essentially, I need to get the results and then filter them in memory.
The issue I face is that the original query doesn't have publishDate and expiryDate in it to support this. Is there a way to add these fields to delegateToSchema call and then remove them while sending them back to the client?
You are looking for transformResult.
Implementation details:
In the delegateToSchema call, define a transforms array.
In the Transform, define a transformResult function that filters the results.
If you are able to send arguments to the remote GraphQL server, then you should use transformRequest.
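The filtering half of such a transform could look like the sketch below. It is a plain object with a transformResult function, written without importing graphql-tools so it runs on its own. The factory name, the `today` parameter and the assumption that the delegated result carries the list under data.entries are all hypothetical; adding publishDate and expiryDate to the outgoing request is not shown here.

```javascript
// Hypothetical factory; `today` is an ISO date string such as "2019-02-20".
function makeValidEntriesTransform(today) {
  return {
    transformResult: function (result) {
      // Assumption: the delegated result holds the list under data.entries.
      if (!result || !result.data || !Array.isArray(result.data.entries)) {
        return result;
      }
      const entries = result.data.entries
        .filter(function (entry) {
          // ISO "YYYY-MM-DD" strings compare correctly as plain strings.
          return entry.publishDate <= today && today <= entry.expiryDate;
        })
        .map(function (entry) {
          // Drop the helper fields the client did not ask for.
          const copy = Object.assign({}, entry);
          delete copy.publishDate;
          delete copy.expiryDate;
          return copy;
        });
      return Object.assign({}, result, {
        data: Object.assign({}, result.data, { entries: entries })
      });
    }
  };
}

const transform = makeValidEntriesTransform("2019-02-20");
const filtered = transform.transformResult({
  data: {
    entries: [
      { title: "entry1", body: "body1", publishDate: "2019-02-12", expiryDate: "2019-02-13" },
      { title: "entry2", body: "body2", publishDate: "2019-02-13", expiryDate: "2019-03-01" },
      { title: "entry3", body: "body3", publishDate: "2020-01-01", expiryDate: "2020-01-31" }
    ]
  }
});
console.log(JSON.stringify(filtered.data.entries));
// [{"title":"entry2","body":"body2"}]
```

An object like `transform` would then go into the transforms array of the delegateToSchema call, as described in the answer.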

Filter couchdb document based on value from nested child document

I would like to create a map/reduce function that filters documents based on a nested value from a child document but retrieves the parent document.
I have the following documents:
{
"_id": "1",
"_rev": "1-991baf1d86435a73a3460335cc19063c",
"configuration_id": "225f9d47-841c-43c2-90c2-e65bb49083d3",
"name": "test",
"image": "",
"type": "A",
"created": "",
"updated": 1,
"destroyed": ""
}
{
"_id": "225f9d47-841c-43c2-90c2-e65bb49083d3",
"_rev": "1-3e3a1c357c86cbd1cd42b5980b9655a4",
"configuration_packages_id": "cd19b0ba-157d-4dd4-adac-56fd470bfed4",
"configuration_distribution_id": "5b538411-ca99-46c7-ac3c-1f382e4577a9",
"type": "CONFIGURATION",
"configuration": {
"hostname": "example123",
"images": [
"image1",
"image2"
]
}
}
Now I would like to retrieve all documents of type A with hostname example123.
At the moment I retrieve all the documents of type A like this:
function (doc) {
if (doc.type === "A") {
emit([doc.updated], doc);
}
}
But now I would like to filter on the hostname as well.
I'm not sure how to achieve this with CouchDB.
TL;DR
You cannot do this.
Details
Your "nested" document is only accessible through a join, but you can't filter on it in the view.
The correct way to do that kind of query natively would have been to have a real nested document inside the parent document. Separating those documents has a cost.
Join example
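function (doc) {
  if (doc.type === "A") {
    // Row 0: the parent document itself.
    emit([doc.updated, 0]);
    // Row 1: a value of {"_id": ...} makes include_docs=true load the linked document.
    emit([doc.updated, 1], {"_id": doc.configuration_id});
  }
}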
If you query the view with include_docs=true, you get the linked configuration document as well as the parent document itself. You can then query the view to get the updated docs, merge each nested row (1) with its parent row (0), and filter them on the client.
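The client-side merge and filter step might look like this sketch. It assumes the rows come from the join view above queried with include_docs=true, so both rows of a pair share the parent's ID in row.id, and row.doc is the parent for key [updated, 0] and the configuration document for key [updated, 1]. The function name and the sample rows are made up for illustration.

```javascript
// Pair each parent row (key[1] === 0) with its configuration row
// (key[1] === 1) via row.id, then keep parents whose configuration
// has the requested hostname.
function filterParentsByHostname(rows, hostname) {
  const byParent = {};
  rows.forEach(function (row) {
    const slot = byParent[row.id] || (byParent[row.id] = {});
    if (row.key[1] === 0) {
      slot.parent = row.doc;
    } else {
      slot.config = row.doc;
    }
  });
  return Object.keys(byParent)
    .map(function (id) { return byParent[id]; })
    .filter(function (pair) {
      return pair.parent && pair.config && pair.config.configuration &&
        pair.config.configuration.hostname === hostname;
    })
    .map(function (pair) { return pair.parent; });
}

// Made-up rows in the shape of a view response with include_docs=true.
const rows = [
  { id: "1", key: [1, 0], doc: { _id: "1", type: "A", configuration_id: "c1" } },
  { id: "2", key: [1, 0], doc: { _id: "2", type: "A", configuration_id: "c2" } },
  { id: "1", key: [1, 1], value: { _id: "c1" },
    doc: { _id: "c1", configuration: { hostname: "example123" } } },
  { id: "2", key: [1, 1], value: { _id: "c2" },
    doc: { _id: "c2", configuration: { hostname: "other-host" } } }
];
const matches = filterParentsByHostname(rows, "example123");
console.log(matches.map(function (doc) { return doc._id; }));
```

Grouping by row.id rather than by adjacent rows keeps the pairing correct when several parents share the same updated value.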

Pagination with per-row access rights

Hi, I am using CouchDB. Assume I have article documents with a users field containing an array of the user IDs that are allowed to view the article.
Example scenario: a paginated table view shows 10 articles per page. My controller retrieves the first 10 articles from CouchDB and then performs the access rights check on each returned article. But the current user may only have view access to, say, 8 of them, so the table shows only 8 articles instead of 10.
What is the best practice for handling this situation, other than implementing the access rights logic in the CouchDB layer?
To accomplish this, I would simply use a view keyed on the users field:
function (doc) {
doc.users.forEach(function (user) {
emit([ user ]);
});
}
I emitted an array with just one item in this case. I presume you'd also emit something like doc.created in order to have your articles sorted; you would simply add it after user in that array.
The view results would look something like:
{
"rows": [
{ "id": "<article-1>", "key": [ "<user-1>", "<created>" ] },
{ "id": "<article-2>", "key": [ "<user-1>", "<created>" ] },
{ "id": "<article-3>", "key": [ "<user-1>", "<created>" ] },
{ "id": "<article-1>", "key": [ "<user-2>", "<created>" ] },
{ "id": "<article-1>", "key": [ "<user-3>", "<created>" ] }
]
}
You can paginate as you normally would with CouchDB: use start_key=["<user-1>"]&end_key=["<user-1>","\ufff0"] in addition to the usual paging parameters, limit=10&skip=0 for page 1, limit=10&skip=10 for page 2, etc.
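As an illustration, the query string for a given user and page can be assembled as in this sketch. The function name and the 1-based page numbering are assumptions; the keys are JSON-encoded and then URL-encoded, which CouchDB view parameters require.

```javascript
// Build the view query string for one page of a user's articles.
// page is 1-based; pageSize is the number of rows per page.
function pageQuery(userId, page, pageSize) {
  const params = new URLSearchParams({
    start_key: JSON.stringify([userId]),
    end_key: JSON.stringify([userId, "\ufff0"]),
    limit: String(pageSize),
    skip: String((page - 1) * pageSize)
  });
  return params.toString();
}

const query = pageQuery("user-1", 2, 10);
console.log(query);
```

Because every row in that key range already belongs to the user, each page comes back with the full 10 rows and no per-row check is needed afterwards.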

How do I do the SQL equivalent of "DISTINCT" in CouchDB?

I have a bunch of MP3 metadata in CouchDB. I want to return every album that is in the MP3 metadata, but no duplicates.
A typical document looks like this:
{
"_id": "005e16a055ba78589695c583fbcdf7e26064df98",
"_rev": "2-87aa12c52ee0a406084b09eca6116804",
"name": "Fifty-Fifty Clown",
"number": 15,
"artist": "Cocteau Twins",
"bitrate": 320,
"album": "Stars and Topsoil: A Collection (1982-1990)",
"path": "Cocteau Twins/Stars and Topsoil: A Collection (1982-1990)/15 - Fifty-Fifty Clown.mp3",
"year": 0,
"genre": "Shoegaze"
}
I believe your map/reduce would look something like:
function map(doc) {
emit(doc.album, null);
}
function reduce(key, values) {
return null;
}
Remember to query with the extra parameter group=true.
Have a look at the "Get Unique Values" section of the View Cookbook for SQL Jockeys.
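To make that concrete, here is a sketch of how the view could be stored in a design document and how the grouped response turns into a list of albums. The design document ID, the view name and the sample response are assumptions for illustration. With group=true, CouchDB returns one row per distinct key, so the keys are the unique album names.

```javascript
// Hypothetical design document; view functions are stored as strings.
const designDoc = {
  _id: "_design/albums",
  views: {
    distinct: {
      map: "function (doc) { if (doc.album) { emit(doc.album, null); } }",
      reduce: "function (keys, values) { return null; }"
    }
  }
};

// Extract the distinct albums from a response to
// GET /<db>/_design/albums/_view/distinct?group=true
function albumsFromGroupedRows(response) {
  return response.rows.map(function (row) { return row.key; });
}

// Made-up response in the shape CouchDB returns for a grouped view.
const sampleResponse = {
  rows: [
    { key: "Another Album (placeholder)", value: null },
    { key: "Stars and Topsoil: A Collection (1982-1990)", value: null }
  ]
};
const albums = albumsFromGroupedRows(sampleResponse);
console.log(albums);
```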
