I am working on a project using CouchDB, and the partition dedicated to the CouchDB files has reached its maximum capacity, so the site fails to connect to CouchDB and produces connection errors. I know that CouchDB is storage hungry, but I never expected it this soon. I have tried compaction methods such as:
curl -H "Content-Type: application/json" -X POST http://localhost:5984/dbname/_compact
curl -H "Content-Type: application/json" -X POST http://localhost:5984/dbname/_view_cleanup
and,
localhost:5984/dbname/_compact/all_view_documents
The above commands only released 2GB of storage. When I looked into which files in the partition consume most of the storage, I found in the folder /usr/local/var/lib/couchdb/.dbname/mrview a view file that is 144GB in size and is still there even though I compacted all view documents.
Note: the compacted document/database file is only 1.6GB, and the total partition size is 150GB.
I encountered this problem too.
Apparently these are files that grow when the design views are modified, because the indexes are recomputed.
My test was to delete the files and then functionally test my application; some of these files were re-created, but with a much smaller size.
So we can assume there is no problem in deleting them, and that this growth in size should not worry us unless we change the design views.
http://www.staticshin.com/programming/does-updating-a-design-document-in-couchdb-cause-rebuilding-of-views/
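For reference, a rough sketch of that cleanup, reusing the path and database name from the question (this deletes only the on-disk view indexes; they are rebuilt the next time the views are queried, so expect the first query afterwards to be slow):
# remove the stale view index files (back them up first if unsure)
rm /usr/local/var/lib/couchdb/.dbname/mrview/*.view
# optionally drop indexes belonging to design docs that no longer exist
curl -H "Content-Type: application/json" -X POST http://localhost:5984/dbname/_view_cleanup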
I need to save (and overwrite) a file via cron (hourly) to my Dropbox account. The file needs to be stored in a predefined location (which is shared with some other users).
I have seen the possibility of creating a Dropbox App, but that creates its own Dropbox folder.
I also looked at Dropbox Saver, but that seems to be for browsers.
I was hoping for something super lightweight, along the lines of curl, so I don't need to install libraries. Just a simple sh script would be awesome. I only need to PUT the file (overwrite it); no need to read (GET) it back.
I was going through the Dropbox developer API documentation, but kind of got lost.
Anybody have a good hint?
First, since you need to access an existing shared folder, register a "Dropbox API" app with "Full Dropbox" access:
https://www.dropbox.com/developers/apps/create
Then, get an access token for your account for your app. The easiest way is to use the "Generate" button on your app's page, where you'll be sent after you create the app. It's also accessible via the App Console.
Then, you can upload to a specified path via curl as shown in this example:
This uploads a file from the local path matrices.txt in the current folder to /Homework/math/Matrices.txt in the Dropbox account, and returns the metadata for the uploaded file:
echo "some content here" > matrices.txt
curl -X POST https://content.dropboxapi.com/2/files/upload \
--header "Authorization: Bearer <ACCESS_TOKEN>" \
--header "Dropbox-API-Arg: {\"path\": \"/Homework/math/Matrices.txt\"}" \
--header "Content-Type: application/octet-stream" \
--data-binary @matrices.txt
<ACCESS_TOKEN> should be replaced with the OAuth 2 access token.
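Since the question mentions overwriting the file each hour: by default /2/files/upload adds the file (auto-renaming on conflict); if I remember the API correctly, passing "mode": "overwrite" in the Dropbox-API-Arg header replaces the existing file instead:
curl -X POST https://content.dropboxapi.com/2/files/upload \
--header "Authorization: Bearer <ACCESS_TOKEN>" \
--header "Dropbox-API-Arg: {\"path\": \"/Homework/math/Matrices.txt\", \"mode\": \"overwrite\"}" \
--header "Content-Type: application/octet-stream" \
--data-binary @matrices.txt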
Greg's answer also works, but it seems like a long chore.
I used Dropbox's official command-line interface here: https://github.com/dropbox/dbxcli.
As of the date of posting, it works fine and provides lots of helpful commands for downloading and uploading.
Another solution I just tried is a bash utility called Dropbox-Uploader.
After configuration through the same steps as above (app creation and token generation), you can just do: ./dropbox_uploader.sh upload mylocal_file my_remote_file, which I find pretty convenient.
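For the hourly cron requirement, something along these lines should then be enough once the uploader is configured (both paths are placeholders):
# run at minute 0 of every hour and upload the file
0 * * * * /path/to/dropbox_uploader.sh upload /path/to/local_file /remote_file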
I've subscribed to realtime updates for a specific tag via the Instagram API using this code (as provided in the API docs):
curl -F 'client_id=CLIENT_ID' \
-F 'client_secret=CLIENT_SECRET' \
-F 'object=tag' \
-F 'aspect=media' \
-F 'verify_token=MY_SECURITY_TOKEN' \
-F 'object_id=TAG_NAME' \
-F 'callback_url=MY_CALLBACK_URL' \
https://api.instagram.com/v1/subscriptions/
This is working well: Instagram calls MY_CALLBACK_URL whenever there is a new post with the tag TAG_NAME.
My callback script fetches and stores all the data from Instagram in my local database, so I don't have to fetch everything each time somebody visits my site. The problem is that I don't get a notification when a post is edited or deleted, so the data in my local DB will often be outdated.
To solve that I suppose I could ...
... set up a real time subscription for every single post I get (which doesn't sound like a good idea for obvious reasons)
... not keep a local copy of the data and instead fetch everything from Instagram every time somebody visits my site (which would probably push the API limits pretty quick)
What's the best practice here?
I think it's a bit of a grey area regarding storing data. I set up an identical setup with the real-time API that stored image URLs in a MySQL database.
Then, client-side, I use the jQuery imagesLoaded library before showing images on the page to determine whether they still exist. It's a bit crude, but it works.
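A rough server-side variant of the same idea (just a sketch, not part of the original setup): periodically check the stored media URLs and prune the rows whose media has disappeared; stored_urls.txt and the delete step are placeholders:
# stored_urls.txt: media URLs exported from the MySQL table, one per line
while read -r url; do
  status=$(curl -s -o /dev/null -I -w '%{http_code}' "$url")
  if [ "$status" = "404" ]; then
    echo "gone: $url"   # delete the matching row from the database here
  fi
done < stored_urls.txt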
We have an XPages application, and we serialize all pages to disk for this specific application. We already use the gzip option, but it seems the serialized files are removed from disk only when the http task is stopped or restarted.
As this application is used by many different customers from different places around the globe, we try to avoid restarting the server or the http task as much as possible, but the drawback is that the serialized files are never deleted, and so sooner or later we face a disk space problem, even if the gzipped serialized files are not that big.
A secondary issue is that the http task takes quite a long time to stop because it has to remove all the serialized files.
Is there any way to have the Domino server "clean" old/unused serialized files without restarting the http task?
Currently we have implemented an OS script which cleans serialized files older than two days, which is fine, but I would prefer a solution within Domino.
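For reference, a minimal sketch of that kind of OS-level cleanup, assuming the serialized page state lives under /temp/xspstate (an example location, not the Domino default):
# delete serialized state files not modified in the last two days
find /temp/xspstate -type f -mtime +2 -delete
# remove any directories left empty afterwards
find /temp/xspstate -mindepth 1 -type d -empty -delete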
Thanks in advance for your answers/suggestions!
Renaud
I believe the httpSessionId is used to store the file(s) on disk. You could try the following:
Alter xsp.persistence.dir.xspstate to point to a friendlier location (e.g. /temp/xspstate)
Register a SessionListener with your XPage application
Inside the SessionListener's sessionDestroyed method, recursively search through the folders to find the file or folder that matches the sessionId and delete it
When the sessionDestroyed method is called in the listener, any file locks should have been removed. Also note that, as of right now, the sessionDestroyed method is not called right after a user logs out (see my question here: SessionListener sessionDestroyed not called)
hope this helps...
I have written a Cygwin app that uploads (using the REST API PUT operation) Block Blobs to my Azure storage account, and it works well for blobs of different sizes when using HTTP. However, use of SSL (i.e. PUT using HTTPS) fails for blobs greater than 5.5MB. Blobs smaller than 5.5MB upload correctly. For anything greater, I find that the TCP session (as seen by Wireshark) reports a dwindling window size that goes to 0 once the aforementioned number of bytes has been transferred. The failure is repeatable and consistent. As a point of reference, PUT operations against my Google/AWS/HP cloud storage accounts work fine when using HTTPS for various object sizes, which suggests the problem is not my client but is specific to the HTTPS implementation on the MS Azure storage servers.
If I upload the 5.5MB blob as two separate uploads of 4MB and 1.5MB followed by a PUT Block List, the operation succeeds as long as the two uploads used separate HTTPS sessions. Notice the emphasis on separate. That same operation fails if I attempt to maintain an HTTPS session across both uploads.
Any ideas on why I might be seeing this odd behavior with MS Azure? Same PUT operation with HTTPS works ok with AWS/Google/HP cloud storage servers.
Thank you for reporting this and we apologize for the inconvenience. We have managed to recreate the issue and have filed a bug. Unfortunately we cannot share a timeline for the fix at this time, but we will respond to this forum when the fix has been deployed. In the meantime, a plausible workaround (and a recommended best practice) is to break large uploads into smaller chunks (using the Put Block and Put Block List APIs), thus enabling the client to parallelize the upload.
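For completeness, a rough curl sketch of that chunked workaround against the Blob service REST API, assuming a SAS token with write access (the account, container, blob and local file names below are placeholders):
BLOB_URL="https://myaccount.blob.core.windows.net/mycontainer/myblob"
SAS="?sv=...&sig=..."   # shared access signature with write permission

# split the local file into 4MB chunks
split -b 4M bigfile.bin chunk_

BLOCK_LIST=""
i=0
for chunk in chunk_*; do
  # block IDs must be base64-encoded and all of the same length
  BLOCK_ID=$(printf 'block-%06d' "$i" | base64)
  curl -X PUT "${BLOB_URL}${SAS}&comp=block&blockid=${BLOCK_ID}" \
       --data-binary "@${chunk}"
  BLOCK_LIST="${BLOCK_LIST}<Latest>${BLOCK_ID}</Latest>"
  i=$((i+1))
done

# commit the uploaded blocks with Put Block List
curl -X PUT "${BLOB_URL}${SAS}&comp=blocklist" \
     --data-binary "<?xml version=\"1.0\" encoding=\"utf-8\"?><BlockList>${BLOCK_LIST}</BlockList>"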
This bug has now been fixed and the operation should now complete as expected.
Inside a PHP application I'm trying to replicate 4 DBs in and out; this is only happening with one of those replications: the database's name is "people". To avoid any PHP-library-specific issue, I'm testing from bash by running curl:
curl -H 'Content-Type: application/json' -X POST LOCAL_PATH/_replicate -d '{"source":"REMOTE_PATH/people","target":"LOCAL_PATH/people", "continuous":false}'
With this output:
{"error":"checkpoint_commit_failure","reason":"Error updating the source checkpoint document: conflict"}
I've checked this post, but it doesn't seem to be that, as we're using full paths for replication (both local and remote).
This happens most of the time, but not always. Any ideas?
CouchDB stores checkpoints on the source database server for the last sequence ID it was able to replicate. Therefore, the credentials you're using to replicate from the source server also need write permission on the source database to write these checkpoints.
However, this is not strictly necessary, because checkpoints are an optimization. Your docs will replicate just fine without these checkpoints.
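A quick illustration, reusing the _replicate call from the question but embedding credentials that have write access to the source database in the source URL (user and password are placeholders; REMOTE_HOST stands for the host part of REMOTE_PATH):
curl -H 'Content-Type: application/json' -X POST LOCAL_PATH/_replicate \
  -d '{"source":"https://user:password@REMOTE_HOST/people","target":"LOCAL_PATH/people","continuous":false}'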