I have noticed numerous entries in Tomcat's local_access_log for various resources coming from IP address 127.0.0.1. These are clearly attempts to hack in. For example, here is a request to get access to the "manager" app:
127.0.0.1 - - [30/Apr/2015:13:35:13 +0000] "GET /manager/html HTTP/1.1" 401 2474
here is another one:
127.0.0.1 - - [30/Apr/2015:21:23:37 +0000] "POST /cgi-bin/php?%2D%64+%61%6C%6C%6F%77%5F%75%72%6C%5F%69%6E%63%6C%75%64%65%3D%6F%6E+%2D%64+%73%61%66%65%5F%6D%6F%64%65%3D%6F%66%66+%2D%64+%73%75%68%6F%73%69%6E%2E%73%69%6D%75%6C%61%74%69%6F%6E%3D%6F%6E+%2D%64+%64%69%73%61%62%6C%65%5F%66%75%6E%63%74%69%6F%6E%73%3D%22%22+%2D%64+%6F%70%65%6E%5F%62%61%73%65%64%69%72%3D%6E%6F%6E%65+%2D%64+%61%75%74%6F%5F%70%72%65%70%65%6E%64%5F%66%69%6C%65%3D%70%68%70%3A%2F%2F%69%6E%70%75%74+%2D%64+%63%67%69%2E%66%6F%72%63%65%5F%72%65%64%69%72%65%63%74%3D%30+%2D%64+%63%67%69%2E%72%65%64%69%72%65%63%74%5F%73%74%61%74%75%73%5F%65%6E%76%3D%22%79%65%73%22+%2D%64+%63%67%69%2E%66%69%78%5F%70%61%74%68%69%6E%66%6F%3D%31+%2D%64+%61%75%74%6F%5F%70%72%65%70%65%6E%64%5F%66%69%6C%65%3D%70%68%70%3A%2F%2F%69%6E%70%75%74+%2D%6E HTTP/1.1" 404 1016
When decoded, the URL is this:
127.0.0.1 - - [30/Apr/2015:21:23:37 0000] "POST /cgi-bin/php?-d allow_url_include=on -d safe_mode=off -d suhosin.simulation=on -d disable_functions="" -d open_basedir=none -d auto_prepend_file=php://input -d cgi.force_redirect=0 -d cgi.redirect_status_env="yes" -d cgi.fix_pathinfo=1 -d auto_prepend_file=php://input -n HTTP/1.1" 404 1016
There are lots of such entries, all from IP address 127.0.0.1. Obviously, since this is the address of localhost, I can't block it. More over, I am not sure if there is something that I can do about it. Is there possibly an exploit that should be patched up? For instance, is there a version of Tomcat that has a related vulnerability? I am running Tomcat 8.
Much thanks for any advice!
UPDATE: thanks for the suggestion about a proxy. Turned out that httpd was indeed installed and not surprisingly, there are suspicious request. For example:
[Sat Mar 30 17:26:49 2013] [error] [client 5.34.247.59] Invalid URI in request GET /_mem_bin/../../../../winnt/system32/cmd.exe?/c+dir HTTP/1.0
[Sat Mar 30 17:26:49 2013] [error] [client 5.34.247.59] Invalid URI in request GET /_mem_bin/../../../../winnt/system32/cmd.exe?/c+dir%20c:\\ HTTP/1.0
[Sat Mar 30 17:26:49 2013] [error] [client 5.34.247.59] Invalid URI in request GET /_mem_bin/../../../../winnt/system32/cmd.exe?/c+dir%20c:\\ HTTP/1.0
This is not a windows system so cmd.exe has not place for it...
If you have a proxy server running on your computer, that will often receive requests and then call the primary server using the localhost (127.0.0.1) interface.
This could explain why you're logging these requests.
Is it possible to write in the couchdb server log (the one defined by default.ini or local.ini in [log]) from a couchapp? (But from somewhere else than a view)
If that's not possible, maybe there's a workaround which would allow to log successful or unsuccessful authentication attemps in the couchdb server log? I'd like to process this server side and would like to avoid logging all httpd activity and grepping for user logging patterns, which doesn't seem to be easy or pretty...
Cheers,
Jun
A year later I find that it was in fact possible to log from views (or lists or any Javascript Design Doc functions) using the log() function: http://docs.couchdb.org/en/1.6.1/query-server/javascript.html#log
log(message)
Log a message to the CouchDB log (at the INFO level).
Arguments:
message – Message to be logged
function(doc){
log('Procesing doc ' + doc['_id']);
emit(doc['_id'], null);
}
After the map function has run, the following line can be found in CouchDB logs (e.g. at /var/log/couchdb/couch.log):
[Sat, 03 Nov 2012 17:38:02 GMT] [info] [<0.7543.0>] OS Process #Port<0.3289> Log :: Processing doc 8d300b86622d67953d102165dbe99467
Who would have guessed :)
I'm pretty sure you can't write to couch.log from a view, it's a sandboxed system.
Getting a record of connections to the server is possible though. Here's a dump from my couch.log, with an HTTP error in there:
/
[Sat, 13 Sep 2014 08:18:57 GMT] [info] [<0.160.0>] Opening index for db: test idx: _design/ivet sig: "f6b64ef8593e23cac644c13b895b7607"
[Sat, 13 Sep 2014 08:18:57 GMT] [info] [<0.121.0>] 127.0.0.1 - - GET /test/_design/ivet/_view/medicationWHP/foobar?include_docs=true 200
[Sat, 13 Sep 2014 08:18:57 GMT] [info] [<0.121.0>] 127.0.0.1 - - GET /test/_design/ivet/_view/medicationWHP/foobar?include_docs=true 500
[Sat, 13 Sep 2014 08:18:57 GMT] [error] [<0.121.0>] httpd 500 error response:
{"error":"json_encode","reason":"{bad_term,{key,null}}"}
[Sat, 13 Sep 2014 08:19:05 GMT] [info] [<0.36.0>] Apache CouchDB has started on http://127.0.0.1:5984/
You can see it has the VERB PATH CODE format for each line, so you can filter that for whatever you need. (Unauthorized is 401) You can also access the log through /_log. Details on that are here:
http://docs.couchdb.org/en/latest/api/server/common.html#log
To get all that information, you'll need to have the log level set to info. You can do this at the config screen in futon.
To do it server-side, you'd probably need to use node.js or something like that. Just have it consume the /_log endpoint, and filter each line by the HTTP response code.
Update
As #AkshatJiwanSharma suggested I have tried a few things while locally replicating. Very instructive! I have renamed the question since the problem is not that the design document gets replicated, in fact it isn't replicated, but it is fetched via an HTTP GET as part of the initial replication "negotiation" phase.
I've moved the original question to the bottom to make the new question clearer. The new question is:
It seems inefficient (particularly in the case of CouchApps) to fetch the entire design document - i.e. the entire remote app - when initiating a replication with a remote source. Can this be avoided?
It is particularly problematic in our case, on high latency links (less than 7.2Kbps), with relatively large design documents (3MB).
Remote Target
I have first tried by using a "remote" target by setting the replication target to http://127.0.0.1:5984/emr_replica.
[Fri, 08 Aug 2014 08:36:20 GMT] [info] [<0.18947.7>] Document `88fa1b1a1315d27ded663466c6003578` triggered replication `e8e66a554d198b88b6263a572a072fd3+continuous`
[Fri, 08 Aug 2014 08:36:20 GMT] [info] [<0.18946.7>] starting new replication `e8e66a554d198b88b6263a572a072fd3+continuous` at <0.18947.7> (`emr_demo` -> `http://127.0.0.1:5984/emr_replica/`)
[Fri, 08 Aug 2014 08:36:20 GMT] [info] [<0.18928.7>] 127.0.0.1 - - POST /emr_replica/_revs_diff 200
[Fri, 08 Aug 2014 08:36:20 GMT] [info] [<0.18915.7>] y.y.y.y - - GET /_utils/_sidebar.html 200
[Fri, 08 Aug 2014 08:36:20 GMT] [info] [<0.18916.7>] y.y.y.y - - GET /_replicator/88fa1b1a1315d27ded663466c6003578?revs_info=true 200
In that case the design document doesn't seem to be fetched.
Remote Source
Then setting the source as "remote" like this
{
"_id": "88fa1b1a1315d27ded663466c6003a4a",
"_rev": "3-b6408e98acafe729da0153c35d9df113",
"source": "http://127.0.0.1:5984/emr_demo",
"target": "emr_replica",
"continuous": true,
"filter": "emr/user_data",
"owner": "jun"
}
Then the server fetches the remote design document before starting the replication (GET /emr_demo/_design/emr 200).
[Fri, 08 Aug 2014 08:42:17 GMT] [info] [<0.19687.7>] Document `88fa1b1a1315d27ded663466c6003a4a` triggered replication `bd8f6288970bca974dba36dbc6e5353b+continuous`
[Fri, 08 Aug 2014 08:42:17 GMT] [info] [<0.19686.7>] starting new replication `bd8f6288970bca974dba36dbc6e5353b+continuous` at <0.19687.7> (`http://127.0.0.1:5984/emr_demo/` -> `emr_replica`)
[Fri, 08 Aug 2014 08:42:17 GMT] [info] [<0.19648.7>] 127.0.0.1 - - HEAD /emr_demo/ 200
[Fri, 08 Aug 2014 08:42:17 GMT] [info] [<0.19648.7>] 127.0.0.1 - - GET /emr_demo/_design/emr 200
[Fri, 08 Aug 2014 08:42:18 GMT] [info] [<0.19656.7>] 127.0.0.1 - - GET /emr_demo/5cc2db69a32a84091b96c244273fda0e?revs=true&open_revs=%5B%221-ef8967557f2e99eb137f963daccddb3f%22%5D&latest=true 200
Further testing shows that this fetching of the design document is only done once. Further replications (including after restarting the server) only fetch the changes with the appropriate filter:
[Fri, 08 Aug 2014 09:06:36 GMT] [info] [<0.520.0>] Document `88fa1b1a1315d27ded663466c6003a4a` triggered replication `bd8f6288970bca974dba36dbc6e5353b+continuous`
[Fri, 08 Aug 2014 09:06:36 GMT] [info] [<0.519.0>] starting new replication `bd8f6288970bca974dba36dbc6e5353b+continuous` at <0.520.0> (`http://127.0.0.1:5984/emr_demo/` -> `emr_replica`)
[Fri, 08 Aug 2014 09:06:36 GMT] [info] [<0.335.0>] 127.0.0.1 - - GET /emr_demo/_changes?filter=emr%2Fuser_data&feed=continuous&style=all_docs&since=1607&heartbeat=1666 200
[Fri, 08 Aug 2014 09:06:36 GMT] [info] [<0.334.0>] 127.0.0.1 - - GET /emr_demo/5cc2db69a32a84091b96c24427560310?atts_since=%5B%2218-b613d3160bd09c45ac07a5485c9c7bce%22%5D&revs=true&open_revs=%5B%2219-d50438143337a3a0af5ed8ceb75b42f5%22%5D&latest=true 200
Former question
We're trying to use the couchdb replication over a very high latency link (slow, frequent disconnections,...). We want to avoid to replicate the design document which is heavy. We have a filter in place and when using the following curl command, the design document doesn't appear, as expected:
curl http://x.x.x.x:5984/emr/_changes?filter=emr/user_data
Our replication document is:
{
"_id": "e0e38be8cc0b11356dfb03bc8400074d",
"_rev": "1-d77117f03d63099e1e505b9f9de3371d",
"source": "http://x.x.x.x:5984/emr",
"target": "emr",
"continuous": true,
"filter": "emr/user_data",
"create_target": true,
"owner": "jun"
}
We have deactivated authentication while we're debugging. When using an existing database and removing create_target, the same problem occurs.
The source server outputs the following:
[Mon, 10 Mar 2014 21:22:03 GMT] [info] [<0.135.0>] Retrying HEAD request to http://x.x.x.x:5984/emr/ in 0.25 seconds due to error {conn_failed,{error,etimedout}}
[Mon, 10 Mar 2014 21:23:47 GMT] [info] [<0.135.0>] Retrying GET request to http://x.x.x.x:5984/emr/_design/emr in 0.25 seconds due to error req_timedout
[Mon, 10 Mar 2014 21:24:14 GMT] [error] [<0.135.0>] Replicator, request GET to "http://x.x.x.x:5984/emr/_design/emr" failed due to error {error,req_timedout}
[Mon, 10 Mar 2014 21:24:14 GMT] [error] [<0.135.0>] Replication manager, error processing document `e0e38be8cc0b11356dfb03bc8400074d`: Couldn't open document `_design/emr` from source database `http://x.x.x.x:5984/emr/`: {'EXIT',{http_request_failed,"GET","http://x.x.x.x:5984/emr/_design/emr",
{error,{error,req_timedout}}}}
When using tcpdump, it's clear that the replication fails because the replication manager attempts to download the heavy design document (http://x.x.x.x:5984/emr/_design/emr).
FYI the replicator's configuration is:
replicator connection_timeout 5000
db _replicator
http_connections 1
max_replication_retry_count 3
retries_per_request 1
socket_options [{keepalive, true}, {nodelay, true}]
ssl_certificate_max_depth 3
verify_ssl_certificates false
worker_batch_size 1
worker_processes 1
EDIT: The user_data function (which correctly hides the design document when ran through curl as above) is :
exports.user_data = function(doc, req) {
if (doc.collection == "visits" || doc.collection == "patients" || doc.collection == "reports") {
return true;
}
return false;
}
Hope someone can help!
Suggestion
Try defining a filter function in another, small, dedicated design document and see if that fixes your problem.
// replicator document:
{
"_id": "e0e38be8cc0b11356dfb03bc8400074d",
"_rev": "1-d77117f03d63099e1e505b9f9de3371d",
"source": "http://x.x.x.x:5984/emr",
"target": "emr",
"continuous": true,
"filter": "small-design-doc/user_data",
"create_target": true,
"owner": "jun"
}
// _design/small-design-doc
// -- will be replicated, but is quite small:
{
"_id": "_design/small-design-doc",
"_rev": "1-...",
"filters": {
"user_data": "function(doc, req) { ... }"
}
}
Explanation
According to a current snapshot of the source code, it seems the replicator is trying to fetch the design document (_design/emr) from the source database, simply because the filter function is defined there (emr/user_data).
If you specify a filter function in another design document, the replicator should try to download that very document before executing replication. So you cannot quite circumvent downloading any design document, but you are able to select which one.
Great question by the way. And very thoroughly investigated!
I'm chasing this problem where my REST client (curl) program is sending a very long URI(5000 chars) to my Apache httpd 2.2.15 (RHEL6) which is being denied. From the Apache docs, i read that default max. URI length supported is 8190 (via LimitRequestLine here), but when I'm giving a URI of 5000 chars (like somehost/dir1/dir2/dir3/.../dir700/ ), i'm getting this error in ssl_error_log file:
[Tue Apr 02 17:29:16 2013] [error] [client 10.0.100.1] (36)File name too long: Cannot map GET <<long URI>> HTTP/1.1 to file
From the apache code, this looks to be hitting the PATH_MAX(4096) limit of Linux. If this is the case, then how can I make sure that URI are honoured up to 8190 chars? OR else is there any other limit which restricts the path to 4096 chars?
I don't want to see a log for every request the server receives when I'm testing (it makes reading the results much harder). Is there a simple way to start up Node so that it doesn't do that?
I'm referring the the lines that look like this just to be perfectly clear:
127.0.0.1 - - [Mon, 07 Jan 2013 15:59:52 GMT] "GET / HTTP/1.1" 200 1039 "-" "Mozilla/5.0 Chrome/10.0.613.0 Safari/534.15 Zombie.js/1.4.1"
NodeJS does not do this automatically.
Assuming you are using express, you need to remove the logger middleware. Remove this line:
app.use(express.logger());