CouchDB gives no_majority when trying to update _security - couchdb

After a CouchDB upgrade, it was no longer possible to create new databases or update the _security object of existing databases.

I've just run into this with CouchDB 2.1.1. My problem was that the security object I was attempting to pass in was malformed.
From https://issues.apache.org/jira/browse/COUCHDB-2326,
Attempts to write security objects where the "admins" or "members" values are malformed will result in an HTTP 500 response with the following body:
{"error":"error","reason":"no_majority"}
This should really be an HTTP 400 response with a "bad_request" error value and a different error reason.
To reproduce:
$ curl -X PUT http://localhost:15984/test
{"ok":true}
$ curl -X PUT http://localhost:15984/test/_security -d '{"admins":[]}'
{"error":"error","reason":"no_majority"}
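For comparison, a well-formed security object gives "admins" and "members" as objects that each hold "names" and "roles" lists, rather than a bare list as in the reproduction above. Below is a minimal Python sketch of a check to run before the PUT. The helper name security_problems and the exact rules it enforces are my own assumptions for illustration, not CouchDB's actual validation logic:

```python
import json


def security_problems(sec):
    """Return a list of problems found in a CouchDB-style security object."""
    problems = []
    for section in ("admins", "members"):
        value = sec.get(section)
        if value is None:
            continue  # assumption: an absent section is treated as empty
        if not isinstance(value, dict):
            problems.append(f"{section} must be an object, got {type(value).__name__}")
            continue
        for field in ("names", "roles"):
            entries = value.get(field, [])
            if not isinstance(entries, list) or not all(isinstance(e, str) for e in entries):
                problems.append(f"{section}.{field} must be a list of strings")
    return problems


# Shape that the _security endpoint expects.
well_formed = {
    "admins": {"names": [], "roles": []},
    "members": {"names": [], "roles": []},
}

print(security_problems({"admins": []}))  # the malformed body from the reproduction
print(json.dumps(well_formed))            # a body that can be passed to curl -d
```

If the check returns an empty list and the PUT still fails with no_majority, one of the other causes in this thread is more likely.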

Yet another reason could be that old nodes were lingering in _membership configuration.
I.e., _membership showed:
{
  "all_nodes": [
    "couchdb@localhost"
  ],
  "cluster_nodes": [
    "couchdb@127.0.0.1",
    "couchdb@localhost"
  ]
}
when it should show
{
  "all_nodes": [
    "couchdb@localhost"
  ],
  "cluster_nodes": [
    "couchdb@localhost"
  ]
}
Deleting the stale cluster node, as described in the docs, helped.
Note that _nodes might not be available on port 5984, only on 5986.
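To spot such leftovers quickly, the two lists from _membership can be compared. This is a small sketch; the function name is hypothetical and the sample data mirrors the output shown above. A node that is in cluster_nodes but not in all_nodes may also simply be down, so treat the result as a list of candidates rather than a verdict:

```python
def stale_cluster_nodes(membership):
    """Nodes listed in cluster_nodes that do not appear in all_nodes."""
    connected = set(membership.get("all_nodes", []))
    return [n for n in membership.get("cluster_nodes", []) if n not in connected]


# Sample _membership response, as in the broken state described above.
membership = {
    "all_nodes": ["couchdb@localhost"],
    "cluster_nodes": ["couchdb@127.0.0.1", "couchdb@localhost"],
}

print(stale_cluster_nodes(membership))
```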

For a PUT to /db/_security, if the user is not a db or server admin, the response is HTTP status 500 with {"error":"error","reason":"no_majority"}, but the server logs are more informative, including: {forbidden,<<"You are not a db or server admin.">>}

One reason could be that the CouchDB process reached its maximum number of open files, which led to read errors and (wrongly) no_majority errors.

Another reason could be that the server switched from a single-node configuration to a multi-node configuration (for example, during an upgrade).
Changing the number of nodes back to 1 helped.

Related

ECONNREFUSED when attempting to POST to emulator from within local Docker container

TLDR:
Can't post to the local Cosmos Emulator. I can post to Azure Cosmos, but not with @azure/cosmos-sign, only with @azure/cosmos (which seems utterly bizarre, as the latter is supposedly built upon the former). This is not ideal (as the message-signing portion alone is very lightweight when using the REST API directly). Bug, or user error? Why do the instructions for enabling networking/https not seem to work?
Details:
I have a Node.js based app, and am using the Azure/cosmos-sign package to generate the correct headers via the generateHeaders method to save a JSON object in the local Cosmos Emulator.
Upon trying to post from the Node app to the URI provided in the Emulator Quickstart (https://localhost:8081), the error returned is...
Error: connect ECONNREFUSED 127.0.0.1:8081 : https://localhost:8081
As per these instructions...
Enable access to emulator on a local network
If you have multiple machines using a single network, and if you set
up the emulator on one machine and want to access it from other
machine. In such case, you need to enable access to the emulator on a
local network.
You can run the emulator on a local network. To enable network access,
specify the /AllowNetworkAccess option at the command-line, which
also requires that you specify /Key=key_string or
/KeyFile=file_name. You can use /GenKeyFile=file_name to generate
a file with a random key upfront. Then you can pass that to
/KeyFile=file_name or /Key=contents_of_file.
To enable network access for the first time, the user should shut down
the emulator and delete the emulator's data directory
%LOCALAPPDATA%\CosmosDBEmulator.
-https://learn.microsoft.com/en-us/azure/cosmos-db/local-emulator?tabs=cli%2Cssl-netstd21#enable-access-to-emulator-on-a-local-network
...I thought perhaps I needed to enable the networking functionality. It is all on the same (Windows) host (with the Node.js application running in Docker on the same host where the Emulator is installed). But this caused more problems with no benefit. With the generated key, I can load the included UI for managing the local emulator instance, but I then can't create Databases or Containers (without resetting the emulator and starting it again normally, i.e., without the AllowNetworkAccess and related settings).
Attempting to use the included Explorer to create a Database returns...
Error while creating database SampleDb:
{
"code": 401,
"body": {
"code": "Unauthorized",
"message": "The input authorization token can't serve the request. Please check that the expected payload is built as per the protocol, and check the key being used. Server used the following payload to sign: 'post\ndbs\n\nmon, 29 mar 2021 23:33:45 gmt\n\n'\r\nActivityId: 29e4e700-d1b7-4d59-bdea-5931e4d6622d, Microsoft.Azure.Documents.Common/2.11.0"
},
"headers": {
"access-control-allow-credentials": "true",
"access-control-allow-origin": "https://localhost:8081",
"access-control-expose-headers": "Access-Control-Allow-Origin,Access-Control-Allow-Credentials,Content-Type,x-ms-activity-id,x-ms-gatewayversion",
"content-type": "application/json",
"date": "Mon, 29 Mar 2021 23:33:45 GMT",
"server": "Microsoft-HTTPAPI/2.0",
"x-firefox-spdy": "h2",
"x-ms-activity-id": "29e4e700-d1b7-4d59-bdea-5931e4d6622d",
"x-ms-gatewayversion": "version=2.11.0",
"x-ms-throttle-retry-count": 0,
"x-ms-throttle-retry-wait-time-ms": 0
},
"activityId": "29e4e700-d1b7-4d59-bdea-5931e4d6622d"
}
I did see this somewhat similar SO question, but it was abandoned.
This one, however, seems to imply they simply reverted the KeyFile steps mentioned in the MS Docs. It seems odd that I am getting the same error from the Node.js POST regardless of whether I use the AllowNetworkAccess switch or not.
Using the /NoFirewall switch as recommended here didn't resolve the POSTs but did allow the Explorer UI to still work properly. The upvoted answer for that question is what I have already tried (/AllowNetworkAccess /KeyFile=....), and it is not working, as explained above.
The docs here indicate that TLS (https) is in fact required...
"The Azure Cosmos DB Emulator supports only secure communication via TLS"
However, here they seem to indicate that, in the Node SDK (which relies on the same cosmos-sign library I am using)...
"TLS verification is disabled. By default the Node.js SDK(version 1.10.1 or higher) for the SQL API will not try to use the TLS/SSL certificate when connecting to the local emulator."
I tried adjusting the start script for my Node Docker image as suggested here...
If connecting to the Cosmos DB Emulator, disable TLS verification
for your node process:
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
const client = new CosmosClient({ endpoint, key });
...and changed the start script in my package.json from...
"start": "node $NODE_OPTIONS node_modules...."
...to...
"start": "NODE_TLS_REJECT_UNAUTHORIZED=0 node $NODE_OPTIONS node_modules...."
...and rebuilt my images, but still receive the same ECONNREFUSED error from the Node client/app.
As I was reading the documentation for the REST API I was reminded that, as opposed to using the CosmosClient (which just needs the base URL), to do a post to the API the url needs to be fully formed as indicated here...
Method: POST
Request URI: https://{databaseaccount}.documents.azure.com/dbs/{db-id}/colls/{coll-id}/docs
Description: The {databaseaccount} is the name of the Azure Cosmos DB account created under your subscription. The {db-id} value is the
user generated name/ID of the database, not the system generated ID
(rid). The {coll-id} value is the name of the collection that contains
the document.
After appending /dbs/SampleDB/colls/SampleCollection/docs (yes, my entities are CamelCase) to the base url offered by the Emulator UI's Quickstart URI (https://localhost:8081)... I am still getting the ECONNREFUSED error to http posts.
Hmm... retargeted the Node app to point to a collection in my Azure Cosmos DB, and I am still having no luck.
400: Invalid API version. Ensure a valid x-ms-version header value is
passed. Please update to the latest version of Azure Cosmos DB
SDK.ActivityId: bfdeb339-8fef-4ba9-a03d-444a8664c02b,
Microsoft.Azure.Documents.Common/2.11.0
Added x-ms-version and set it to 2018-12-31 (latest, as per here).
Now I am getting (after trying both my secondary, and primary keys... just in case)...
401: The input authorization token can't serve the request. Please
check that the expected payload is built as per the protocol, and
check the key being used. Server used the following payload to sign:
'postdocsdbs/TopHand/colls/SampleTbltue, 30 mar 2021 02:54:25
gmt'ActivityId: bb258bb4-f5a8-4495-b0b5-b54fa8b7c46f,
Microsoft.Azure.Documents.Common/2.11.0
I verified that the required headers are all present. What can possibly be left?!
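For anyone comparing their own signature against the payload quoted in the 401, here is a minimal Python sketch of the master-key scheme as I understand it from the REST API documentation: the verb, resource type, and date are lowercased, the resource link keeps its case, and the result is HMAC-SHA256 signed with the base64-decoded key. The function names and the placeholder key are my own; this is not the cosmos-sign implementation.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def string_to_sign(verb, resource_type, resource_link, date):
    """Payload to sign: verb, type, and date lowercased; resource link kept as-is."""
    return f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n{date.lower()}\n\n"


def master_key_token(verb, resource_type, resource_link, date, key_b64):
    """Build the URL-encoded authorization header value for a master key."""
    key = base64.b64decode(key_b64)
    payload = string_to_sign(verb, resource_type, resource_link, date).encode("utf-8")
    digest = hmac.new(key, payload, hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return quote(f"type=master&ver=1.0&sig={signature}", safe="")


# Placeholder key for illustration only; substitute the account or emulator key.
demo_key = base64.b64encode(b"not-a-real-key").decode("ascii")
date = "Tue, 30 Mar 2021 02:54:25 GMT"

print(repr(string_to_sign("POST", "docs", "dbs/TopHand/colls/SampleTbl", date)))
print(master_key_token("POST", "docs", "dbs/TopHand/colls/SampleTbl", date, demo_key))
```

Two things are easy to get wrong here. The date that is signed must be the same string sent in x-ms-date. When creating a document, the signed resource link is the parent collection (dbs/.../colls/...), without /docs, even though the request URI itself ends in /docs.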
Base URI for Azure Cosmos had a trailing /, which ended up duplicated when the rest of the path was appended. Fixing the url string, still getting the 401.
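A tiny sketch of how the duplicated slash can be avoided when building the request URL. The helper name is hypothetical, and the base URL and path are the ones used in this question:

```python
def join_url(base, path):
    """Join a base URL and a resource path with exactly one slash between them."""
    return base.rstrip("/") + "/" + path.lstrip("/")


print(join_url("https://localhost:8081/", "/dbs/SampleDb/colls/SampleTbl/docs"))
```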
A github issue pointed me to what may have been an error in the URL/REST path I was posting to. Rather than posting to (what I had previously)...
dbs/SampleDb/colls/SampleTbl/docs
...I changed it to...
dbs/SampleDb/colls/SampleTbl
...and am now getting error 405, MethodNotAllowed, RequestHandler.Post. 405 isn't listed as a code returned by the Cosmos REST service.
This example in the MS docs definitely uses the /docs string at the end of the url/REST path.
Example
POST https://querydemo.documents.azure.com/dbs/1KtjAA==/colls/1KtjAImkcgw=/docs HTTP/1.1
x-ms-documentdb-partitionkey: ["Andersen"]
x-ms-date: Tue, 29 Mar 2016 02:28:29 GMT
authorization: type%3dmaster%26ver%3d1.0%26sig%3d92WMAkQv0Zu35zpKZD%2bcGSH%2b2SXd8HGxHIvJgxhO6%2fs%3d
Cache-Control: no-cache
User-Agent: Microsoft.Azure.Documents.Client/1.6.0.0
x-ms-version: 2015-12-16
Accept: application/json
Host: querydemo.documents.azure.com
Cookie: x-ms-session-token#0=602; x-ms-session-token=602
Content-Length: 344
Expect: 100-continue
{
"id": "AndersenFamily",
"LastName": "Andersen",
}
I contacted MS support and was given some info that unblocked me (but doesn't entirely address the issues noted above).
For my own use-case, simply setting a key and allowing network access to the emulator was sufficient.
Note: This doesn't address the issues of the Emulator's Data Explorer becoming nonfunctional.
The feedback I received from the support personnel in regard to using the command line switches disabling the UI was...
By changing the key to something other than default one, you also
protect your emulator data from being seen via the Data Explorer.
Apparently the key alone isn't enough to protect the data, and disabling the UI is a "feature".
Solution: Simply executing...
.\Microsoft.Azure.Cosmos.Emulator.exe /AllowNetworkAccess /Key={insert your base64 encoded 64+ character string}
...allowed network access to systems on the same host as the emulator. This avoided all the certificate/key generation/importing/etc headache.
To read from or write to the emulator, you must connect to the non-loopback IP of the host it is running on.

ElasticSearch 6.0 timeout on cluster

I have 3 different servers, each running one instance of ES 6.0, plus another server running Node.js that queries them.
On each ES server I only changed:
discovery.zen.ping.unicast.hosts : [ LIST_ES_IP ]
discovery.zen.minimum_master_nodes: 2
My problem is that after some (undefined) amount of time, I get timeout errors on the Node.js server. But if I call
curl -XGET 'IP:9200/_cluster/health?pretty'
on this same server, I can see that ES works fine.
If I remove one server from the cluster (by commenting out the 2 config lines above) and query only that server, everything works and I never get a timeout.
Do I need to change another setting to make this cluster work?
Do you have any idea why I get timeouts only in cluster mode?
Thanks in advance.
Apparently the cause is in the elasticsearch-js client: I re-enabled the cluster but set host to
"IP:9200"
and it has been working for 3 hours now.
Before, I had
[ "IP1:9200", "IP2:9200", "IP3:9200" ]
I also tried
[ {host: "IP", port: 9200}, {...} ]
but that timed out too.
So is there no way to fall back to another server if one fails?

MongoDB GET request returning nothing

I am running a MongoDB database using NodeJS + Forever on an Amazon EC2 Instance. (MongoDB and NodeJS code can be found here https://github.com/WyattMufson/MongoDB-AWS-EC2). I installed Mongo on the EC2 instance following this tutorial: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/.
When I run:
curl -H "Content-Type: application/json" -X POST -d '{"test":"field"}' http://localhost:3000/data
The POST returns:
{
"test": "field",
"created_at": "2017-11-20T04:52:12.292Z",
"_id": "5a125f7cead7a00d5a2593ec"
}
But this GET:
curl -X GET http://localhost:3000/data
returns:
[]
My dbpath is set to /data/db/ and I have no permissions issues when running it.
Why is the POST request working, but not the GET?
There are a number of issues and potential issues with that Node app that you're using. I suppose that you could start down the path of fixing/updating those issues or find another sample MEAN application, depending on what you're trying to ultimately accomplish.
One of the glaring issues is that collectionDriver.js is not properly passing errors back to the callback, it's passing null back rather than the error. This appears in 6 different places, but in particular (based on your sample POST) here on lines 46 and 47:
the_collection.insert(obj, function() { //C
callback(null, obj);
Should be (with some extra console logging for good measure):
the_collection.insert(obj, function(err, doc) { //C
  if (err) console.error("insert error: %s", err.message); // err is null on success
  callback(err, doc);
If you make those changes you will almost certainly see that your POST is actually returning a MongoError. And then you can move on to finding and fixing the next set of problems.
One of the errors/issues that you might find is that the project is using a really old version of the MongoDB Node.js driver, and you might find this error uncovered when you fix the error handling:
driver is incompatible with this server version
Fixing that will take some additional work, since there are some API changes in the more recent 2.x driver which would be required to support more current versions of MongoDB (e.g. 3.2 or 3.4). See Node.js Driver Compatibility

Service Fabric Application PackageDeployment Operation Time Out exception

I have a Service Fabric cluster with 3 nodes, each created on a separate system, and the systems are interconnected. I am able to connect to each of the nodes. The nodes run on Windows Server VMs, which are on-premises.
I am trying to deploy my package manually to the cluster (one of the nodes), and I get an Operation Timeout exception. I used the commands below for the deployment.
Service Fabric PowerShell commands:
Copy-ServiceFabricApplicationPackage -ApplicationPackagePath 'c:\sample\etc' -ApplicationPackagePathInImageStore 'abc.app.portaltype'
After I execute the above command, it runs for 2-3 minutes and throws an Operation Timeout exception. My package is almost 250 MB and contains approximately 15,000 files. After I explicitly passed the extra parameter -TimeOutSec set to 600 (10 minutes) to the above command, it executed successfully and the package was copied to the Service Fabric image store.
Register-ServiceFabricApplicationType -ApplicationPathInImageStore 'abc.app.portaltype'
After Copy-ServiceFabricApplicationPackage completed, I executed the above Register-ServiceFabricApplicationType command to register my application type in the cluster, but it also throws an Operation Timeout exception. I then explicitly passed -TimeOutSec set to 600 (10 minutes) to that command as well, but no luck: it throws the same exception.
To check whether the timeout is caused by the number of files in the package, I created a simple, empty Service Fabric ASP.NET Core app, packaged it, and deployed it to the same server using the above commands. It deployed within a fraction of a second and works smoothly.
Does anybody have any idea how to overcome the Service Fabric operation timeout issue?
How should the operation timeout be handled if the package contains a large number of files?
Any help or suggestions would be much appreciated.
Thanks.
If this is taking longer than the 10-minute default maximum, it's probably due to one of the following:
Large application packages (>100s of MB)
Slow network connections
A large number of files within the application package (>1000s).
The following workarounds should help you.
Add the following settings to your cluster config:
"fabricSettings": [
  {
    "name": "NamingService",
    "parameters": [
      {
        "name": "MaxOperationTimeout",
        "value": "3600"
      }
    ]
  }
]
Also add:
"fabricSettings": [
  {
    "name": "EseStore",
    "parameters": [
      {
        "name": "MaxCursors",
        "value": "32768"
      }
    ]
  }
]
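Since a cluster config has a single fabricSettings array, the two sections above go into the same array instead of being added as two separate "fabricSettings" keys. A rough Python sketch of merging them into whatever sections a config already has. The helper name is hypothetical, and `existing` is a stand-in for your real settings:

```python
import json


def merge_fabric_settings(existing, additions):
    """Merge sections by name; parameters in `additions` override same-named ones."""
    merged = {
        s["name"]: {p["name"]: p["value"] for p in s.get("parameters", [])}
        for s in existing
    }
    for section in additions:
        params = merged.setdefault(section["name"], {})
        for p in section.get("parameters", []):
            params[p["name"]] = p["value"]
    return [
        {"name": name, "parameters": [{"name": k, "value": v} for k, v in params.items()]}
        for name, params in merged.items()
    ]


# The two sections suggested in this answer.
additions = [
    {"name": "NamingService", "parameters": [{"name": "MaxOperationTimeout", "value": "3600"}]},
    {"name": "EseStore", "parameters": [{"name": "MaxCursors", "value": "32768"}]},
]

# Stand-in for the fabricSettings already present in the cluster config.
existing = []

print(json.dumps({"fabricSettings": merge_fabric_settings(existing, additions)}, indent=2))
```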
There are a couple of additional features that are currently rolling out. For these to be present and functional, you need to be sure that the client is at least 2.4.28 and the runtime of your cluster is at least 5.4.157. If you're staying up to date, these should already be present in your environment.
For register you can specify the -Async flag which will handle the upload asynchronously, reducing the need for the timeout to just the time necessary to send the command, not the application package. You can also query the status of the registration with Get-ServiceFabricApplicationType. 5.5 fixes some issues with these commands, so if they aren't working for you you'll have to wait for that release to hit your environment.

Measure resource usage of Docker container on exit

I create containers that compile/interpret users' code and pass the result back to the browser (just like JSFiddle). Now I need to know how much CPU and memory were used to execute that code. I don't need this in real time, only on the container's exit, so that I can pass these two parameters back to the client along with the others.
I tried using pseudo-files like here, but there is no such location on my server (Ubuntu 14.04). How can I measure these parameters?
Docker has a stats API:
https://docs.docker.com/engine/reference/api/docker_remote_api_v1.24/#/get-container-stats-based-on-resource-usage
"cpu_usage" : {
"percpu_usage" : [
8646879,
24472255,
36438778,
30657443
],
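A rough Python sketch of pulling the two numbers out of one stats payload. It assumes the payload was already fetched (for example, a single sample requested with stream=false shortly before the container exits) and decoded into a dict. The function name is hypothetical, the memory values in the sample are made up, and the CPU values are the per-CPU figures shown above. As far as I know, the CPU counters are cumulative nanoseconds and the memory figures are bytes.

```python
def summarize_stats(stats):
    """Extract cumulative CPU time and memory figures from a decoded stats payload."""
    cpu = stats.get("cpu_stats", {}).get("cpu_usage", {})
    mem = stats.get("memory_stats", {})
    percpu = cpu.get("percpu_usage") or []
    return {
        # Fall back to summing the per-CPU counters if total_usage is absent.
        "cpu_total_ns": cpu.get("total_usage", sum(percpu)),
        "memory_usage_bytes": mem.get("usage"),
        "memory_max_bytes": mem.get("max_usage"),
    }


# Sample payload: per-CPU values from the fragment above, memory values invented.
sample = {
    "cpu_stats": {"cpu_usage": {"percpu_usage": [8646879, 24472255, 36438778, 30657443]}},
    "memory_stats": {"usage": 6537216, "max_usage": 6651904},
}

print(summarize_stats(sample))
```

Because the counters are cumulative, the last sample taken before exit gives the total CPU time for the run, and max_usage gives the memory peak.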

Resources