How to set attachment size limit for an individual database - couchdb

Is there a way for restricting attachment size limits on an individual database in CouchDB?
I have a database, videosdb, in my social networking site, and I want to restrict the attachment size on it so that users can't upload videos larger than a fixed limit.
Moreover, if there is such functionality, then does it work in a multipart request, say during replication, like this:
PUT /target/SpaghettiWithMeatballs?new_edits=false HTTP/1.1
Accept: application/json
Content-Length: 1030
Content-Type: multipart/related; boundary="864d690aeb91f25d469dec6851fb57f2"
Host: localhost:5984
User-Agent: CouchDB

As for replication, you can write a validate_doc_update function which sums the .length props of all attachments and rejects the doc if the limit is exceeded. Replicated docs always carry that prop for each attachment.
This, however, generally doesn't hold when saving docs directly, since the JSON being saved may not have .length props for new attachments (inline attachments carry base64 data instead).
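For example, a minimal sketch of such a validation function might look like this (the 50 MB limit and the base64 fallback are just illustrative assumptions):
function (newDoc, oldDoc, userCtx, secObj) {
  var LIMIT = 50 * 1024 * 1024; // 50 MB, pick your own limit
  var total = 0;
  for (var name in (newDoc._attachments || {})) {
    var att = newDoc._attachments[name];
    if (typeof att.length === "number") {
      // replicated/stub attachments carry a length property
      total += att.length;
    } else if (att.data) {
      // freshly inlined attachments only have base64 data; estimate the size
      total += Math.floor(att.data.length * 3 / 4);
    }
  }
  if (total > LIMIT) {
    throw({ forbidden: "Attachments exceed the size limit for this database" });
  }
}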

There is no out-of-the-box way to do this on a per-database level. However, you can likely accomplish your goal by writing a validate_doc_update function for that database which rejects documents whose attachments are too large.
But note that it is not advisable to use CouchDB to store large attachments anyway, as CouchDB was not designed for this use case, and is not particularly efficient for that type of load. It is generally considered best practice to store large attachments externally such as in an S3 bucket or similar, and have your CouchDB documents reference them by ID or URL.
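For illustration only (the field names and URL below are invented), a doc in videosdb might then hold just a small pointer to the externally stored video:
{
  "_id": "video:12345",
  "type": "video",
  "title": "My holiday clip",
  "video_url": "https://my-bucket.s3.amazonaws.com/videos/12345.mp4"
}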

Related

How to upload video to mongodb using expressjs?

I want to make a web app where users are able to upload videos.
So how can I make the API where the video will be stored in MongoDB?
Storing an entire file directly is not really an option; you can store it as binary data, but that is not the best approach.
As Kris said, there are better options like AWS S3. On the other hand, remember the limit of 16 MB per document, which is the maximum amount of data a MongoDB document can store, so if you decide to store your videos as binary data you need to keep this limit in mind.
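As a rough sketch of that approach (multer for multipart parsing, the aws-sdk v2 S3 client, and invented bucket/collection names are all assumptions here), the binary goes to S3 and MongoDB only keeps a small reference document:
const express = require("express");
const multer = require("multer");
const AWS = require("aws-sdk");
const { MongoClient } = require("mongodb");

const app = express();
const upload = multer({ storage: multer.memoryStorage() });
const s3 = new AWS.S3();

let videos; // MongoDB collection, initialised below

app.post("/videos", upload.single("video"), async (req, res) => {
  // push the binary to S3 instead of MongoDB
  const key = `videos/${Date.now()}-${req.file.originalname}`;
  await s3.upload({
    Bucket: "my-video-bucket",
    Key: key,
    Body: req.file.buffer,
    ContentType: req.file.mimetype,
  }).promise();

  // store only a small reference document (well under the 16 MB limit)
  const doc = { title: req.body.title, s3Key: key, uploadedAt: new Date() };
  const { insertedId } = await videos.insertOne(doc);
  res.status(201).json({ id: insertedId, s3Key: key });
});

MongoClient.connect("mongodb://localhost:27017").then((client) => {
  videos = client.db("app").collection("videos");
  app.listen(3000);
});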

How do we handle concurrent writes to Azure Search documents?

Use Case and Question
Our use case involves:
(1) making one network call to fetch a document from search into memory,
(2) then changing that document in memory, and
(3) finally making a second network call to update that document in search.
How can we guarantee that step (3) fails if the document has changed since step (1)?
Using an ETag in the request header seems ideal. We found docs on doing that for non-document resources (e.g., index fields), but we found scant documentation on doing that with document resources.
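To illustrate the flow in question, here is a sketch using the JavaScript SDK (@azure/search-documents); the endpoint, index name, key, and "status" field are placeholders. Note that neither document call takes an ETag / If-Match option:
const { SearchClient, AzureKeyCredential } = require("@azure/search-documents");

const client = new SearchClient(
  "https://my-service.search.windows.net",
  "my-index",
  new AzureKeyCredential(process.env.SEARCH_API_KEY)
);

async function readModifyWrite(key) {
  // (1) fetch the document into memory
  const doc = await client.getDocument(key);

  // (2) change it in memory
  doc.status = "processed";

  // (3) write it back -- nothing here fails if the document changed since (1)
  await client.mergeOrUploadDocuments([doc]);
}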
Our Research
The .NET v11 SearchClient provides .NET methods for the upload and merge operations, but does not explicitly mention concurrency in the XML comments of those methods.
In the source code for the v11 SearchClient, both the upload and merge operations call into IndexDocumentsInternal. That method has no explicit ETag usage (and it ends up calling into the REST API).
The REST API headers documentation says of ETags that it supports indexers, indexes, and data sources, but not documents.
The REST API response documentation includes a 409; however, the docs do not give a lot of information on when this response would be returned. The 409 means that a version conflict was detected when attempting to index a document. This can happen when you're trying to index the same document more than once concurrently.
This related SO item does not answer our question: Does Azure Search Provides Etags for managing concurrency for Add, Update or Delete Documents?

How to upload large files for signature using SDK

I need to upload a large file in order to request signatures from multiple signers. I'm assuming that I use the chunked upload but, I'm not sure how to tie the upload to an envelope. Does anybody have an example of how to use a chunked upload and tie to an envelope?
TIA!
If your document is about 18 MB or smaller, you can use the regular API / SDK calls and BASE64 encode the document so it's included in the request JSON. At this time, the SDKs always BASE64 encode documents.
For documents larger than 18 MB and smaller than 25MB, you can use the regular API calls if you include the binary version of the document using multipart encoding. See the docs. Also see working examples: Node.js example, Java example.
For documents larger than 25MB and smaller than about 50MB, you use the ChunkedUploads API methods.
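As a minimal Node.js sketch of the first ("18 MB or smaller") path, assuming the docusign-esign SDK, the demo base path, and placeholder signer and tab details:
const fs = require("fs");
const docusign = require("docusign-esign");

async function sendForSignature(accessToken, accountId, filePath, signerEmail, signerName) {
  const apiClient = new docusign.ApiClient();
  apiClient.setBasePath("https://demo.docusign.net/restapi"); // demo environment
  apiClient.addDefaultHeader("Authorization", "Bearer " + accessToken);

  const envelopesApi = new docusign.EnvelopesApi(apiClient);

  const envelopeDefinition = {
    emailSubject: "Please sign this document",
    documents: [{
      // the file is BASE64 encoded and placed directly in the request JSON
      documentBase64: fs.readFileSync(filePath).toString("base64"),
      name: "Agreement",
      fileExtension: "pdf",
      documentId: "1",
    }],
    recipients: {
      signers: [{
        email: signerEmail,
        name: signerName,
        recipientId: "1",
        tabs: { signHereTabs: [{ anchorString: "/sn1/" }] },
      }],
    },
    status: "sent",
  };

  return envelopesApi.createEnvelope(accountId, { envelopeDefinition });
}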

How to get images to the client?

I'm building a website on express.js and I'm wondering where to store images. Certain 'static' website info like the team pages will be backed by a database. If new team members come onboard, we push new data to CouchDB and a new team page shows up on the site.
A team page will include a profile picture, which will be stored in CouchDB along with other data.
Should I be sending the image through the webserver or just sending the reference to where the image is and having the client grab the image from the database, since CouchDB is an HTTP server itself?
I am not an expert on CouchDB, but here are my 2 cents. In general, hitting the DB for every image is going to increase the load. If the website is going to be accessed by many people, that will be a lot.
The ideal way is to serve images through a CDN, and have the CDN point to your resource server/webserver.
You can store the profile pics (and any other file) as attachments to the docs. The load is the same as for any other web server.
Documentation:
attachment endpoint /db/doc/attachment
document endpoint /db/doc
CouchDB manages the ETags for attachments as well as for docs or views. Clients which have already cached the pics will get a light-weight 304 response for every identical request. You can try it out with my CouchDB-based blog lbl.io. Open your favorite browser's developer tools and observe the image requests during multiple refreshes.
Hint 1: If you have the choice between inline attachment upload (Base64-encoded in the doc, 1 request to create a doc with an attachment) and standalone attachment upload (the attachment sent with its original content type, 2 requests to create a doc plus attachment, or 1 request to add an attachment when the doc already exists), then choose the second. It's handled more efficiently by CouchDB.
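A rough sketch of that two-request flow, using Node's built-in fetch; the database and file names (teamdb, profile.jpg) are made up:
const COUCH = "http://localhost:5984/teamdb";

async function addProfilePic(docId, imageBuffer) {
  // request 1: create the doc (or GET it first if it already exists) to obtain a _rev
  const docRes = await fetch(`${COUCH}/${docId}`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: "team-member" }),
  });
  const { rev } = await docRes.json();

  // request 2: PUT the attachment standalone, with its own content type
  const attRes = await fetch(`${COUCH}/${docId}/profile.jpg?rev=${rev}`, {
    method: "PUT",
    headers: { "Content-Type": "image/jpeg" },
    body: imageBuffer,
  });
  return attRes.json();
}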
Hint 2: You can configure CouchDB to handle gzip compression by the content-type of attachments. It reduces the load a lot.
I just dump avatars in /web/images/avatars, store the filename only in couchdb, and serve the folder with express.static()
You certainly can use a CouchDB attachment.
You can also create an Amazon S3 bucket and save the absolute https path on your user objects.

Do any cloud object stores support object metadata indices?

I have a very large document store - about 50 million JSON docs, with 50m more added per year. Each is about 10K. I would like to store them in cloud storage and retrieve them via a couple of structured metadata indices that I would update as I add documents to the store.
It looks like AWS S3, Google Cloud Storage and Azure allow custom metadata to be returned with an object, but not used as part of a GET request to filter a collection of objects.
Is there a good solution "out-of-the-box" for this? I can't find any, but it seems like my use case shouldn't be really unusual. I don't need to query by document attributes or to return partial documents, I just need to GET a collection of documents by filtering on a handful of metadata fields.
The AWS SimpleDB page mentions "Indexing Amazon S3 Object Metadata" as a use case, and links to a library that hasn't been updated since 2009.
They are simply saying that you can store and query the metadata in Amazon SimpleDB, which is a NoSQL database provided by Amazon. Depending on the kind of metadata you have, you could also store it in an RDBMS. A few hundred million rows isn't too much if you create the proper indices, and you can store URLs or file names to access the files stored on S3, Azure, … afterwards.
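A rough sketch of that pattern (bucket, domain, and field names invented; aws-sdk v2 and an existing SimpleDB domain assumed): the document body goes to S3, the queryable metadata goes to SimpleDB (an RDBMS table with indexed columns would work the same way), and a query on the metadata returns the S3 keys to fetch.
const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const sdb = new AWS.SimpleDB();

async function storeDocument(id, doc, metadata) {
  // 1. the document itself goes to object storage
  await s3.putObject({
    Bucket: "my-doc-store",
    Key: `docs/${id}.json`,
    Body: JSON.stringify(doc),
    ContentType: "application/json",
  }).promise();

  // 2. the metadata goes to the index store, keyed by the same id
  await sdb.putAttributes({
    DomainName: "doc-index",
    ItemName: id,
    Attributes: Object.entries(metadata).map(([k, v]) => ({
      Name: k, Value: String(v), Replace: true,
    })),
  }).promise();
}

async function findDocumentKeys(customerId) {
  // query by a metadata field, get back the S3 keys to fetch
  const res = await sdb.select({
    SelectExpression:
      `select itemName() from \`doc-index\` where customerId = '${customerId}'`,
  }).promise();
  return (res.Items || []).map((i) => `docs/${i.Name}.json`);
}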
