Why can header.copy not be used in vcl_deliver? - varnish

I was wondering about an unexpected behaviour of varnish and the header vmod.
The following vcl will not compile, because of the use of header.copy in vcl_deliver.
Why is the use of header.copy not allowed here? Is there any documentation for this behaviour? Neither the Varnish documentation nor the header vmod documentation says anything about it.
I don't need a workaround. I already have that.
vcl 4.1;
import header;
backend default {
.host = "127.0.0.1";
.port = "8888";
}
sub vcl_backend_response {
header.copy(beresp.http.Set-Cookie, beresp.http.X-Set-Cookie);
unset beresp.http.Set-Cookie;
}
sub vcl_deliver {
header.copy(beresp.http.X-Set-Cookie, beresp.http.Set-Cookie);
unset beresp.http.X-Set-Cookie;
}
will result in the following error, when compiled
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: Error:
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: Message from VCC-compiler:
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: ('/etc/varnish/default.vcl' Line 15 Pos 15) -- (Pos 20)
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: header.copy(beresp.http.X-Set-Cookie, beresp.http.Set-Cookie);
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: --------------######--------------------------------------------
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: Not available in subroutine 'vcl_deliver'.
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: Running VCC-compiler failed, exited with 2
Feb 09 06:40:11 epcentos7.dev varnishd[12938]: VCL compilation failed
Feb 09 06:40:11 epcentos7.dev systemd[1]: varnish.service: control process exited, code=exited status=255
Feb 09 06:40:11 epcentos7.dev systemd[1]: Failed to start Varnish Cache, a high-performance HTTP accelerator.

The beresp object is not available in vcl_deliver as it is part of a different flow. The equivalent would be resp, which would result in the following header.copy() line:
sub vcl_deliver {
header.copy(resp.http.X-Set-Cookie, resp.http.Set-Cookie);
unset resp.http.X-Set-Cookie;
}
Transaction scope
Varnish has 2 kinds of transactions:
Client transactions: requests being received from the client, and responses being served to the client
Backend transactions: backend requests being sent to the backend, and backend responses being received from the backend
When a request results in a cache hit, Varnish only needs a client transaction to process it. No connection to the backend is made, so no backend transaction is used.
When a request results in a cache miss, Varnish will use both a client transaction to interact with the client, and a backend transaction to get the non-cached data from the backend.
Subroutine scope
With the transaction scope in mind, we can now map VCL subroutines to it.
Here's an overview of client-side subroutines:
vcl_recv
vcl_hash
vcl_miss
vcl_hit
vcl_pass
vcl_deliver
vcl_synth
vcl_purge
These subroutines have access to the req object for the request. The resp object for the response is available in vcl_deliver and vcl_synth.
There are also backend subroutines such as:
vcl_backend_fetch
vcl_backend_response
vcl_backend_error
These subroutines have access to the bereq and beresp objects.
Object flow
When a request is received, the request information is stored in the req object. When the request results in a cache miss, the req object information is copied into the bereq object.
When the backend responds, the beresp object contains the (potentially) cacheable information. The beresp object data is copied into the obj object, which represents what is stored in cache, and also into the resp object that is used to serve the response to the client that requested it.

Related

AWS API Gateway "Unsupported method \"undefined\"" as response

I am setting up an AWS Lambda function to connect to my DynamoDB. To access it I'm also setting up an API Gateway.
The lambda seems to work when I test it. Because of this I believe the issue to be in the API gateway setup.
For the Lambda I have configured a test event which looks like this:
{
"httpMethod": "GET"
}
This test event gives me the following response:
Response:
{
"statusCode": "200",
"body": "{\"Items\":[{\"id\":1,\"brand\":\"Test brand\",\"title\":\"Test product\"}],\"Count\":1,\"ScannedCount\":1}",
"headers": {
"Content-Type": "application/json"
}
}
For the API Gateway I have tried with the following test:
I have also tried automatically creating the API Gateway from the Lambda management console, and recreating both the Lambda and the API Gateway.
Lambda function:
console.log('Loading function');
const doc = require('dynamodb-doc');
const dynamo = new doc.DynamoDB();
/**
* Demonstrates a simple HTTP endpoint using API Gateway. You have full
* access to the request and response payload, including headers and
* status code.
*
* To scan a DynamoDB table, make a GET request with the TableName as a
* query string parameter. To put, update, or delete an item, make a POST,
* PUT, or DELETE request respectively, passing in the payload to the
* DynamoDB API as a JSON body.
*/
exports.handler = (event, context, callback) => {
//console.log('Received event:', JSON.stringify(event, null, 2));
const done = (err, res) => callback(null, {
statusCode: err ? '400' : '200',
body: err ? err.message : JSON.stringify(res),
headers: {
'Content-Type': 'application/json',
},
});
switch (event.httpMethod) {
case 'DELETE':
dynamo.deleteItem(JSON.parse(event.body), done);
break;
case 'GET':
dynamo.scan({ "TableName": "productdb" }, done);
//dynamo.scan({"TableName":"productdb"})
break;
case 'POST':
dynamo.putItem(JSON.parse(event.body), done);
break;
case 'PUT':
dynamo.updateItem(JSON.parse(event.body), done);
break;
default:
done(new Error(`Unsupported method "${event.httpMethod}"`));
}
};
API Gateway logs from the test:
Execution log for request 885e5505-2212-11e9-aee0-7f024016f574
Sun Jan 27 09:04:20 UTC 2019 : Starting execution for request: 885e5505-2212-11e9-aee0-7f024016f574
Sun Jan 27 09:04:20 UTC 2019 : HTTP Method: GET, Resource Path: /
Sun Jan 27 09:04:20 UTC 2019 : Method request path: {}
Sun Jan 27 09:04:20 UTC 2019 : Method request query string: {}
Sun Jan 27 09:04:20 UTC 2019 : Method request headers: {}
Sun Jan 27 09:04:20 UTC 2019 : Method request body before transformations:
Sun Jan 27 09:04:20 UTC 2019 : Endpoint request URI: https://lambda.eu-central-1.amazonaws.com/2015-03-31/functions/arn:aws:lambda:eu-central-1:304886708348:function:dynamoDBService/invocations
Sun Jan 27 09:04:20 UTC 2019 : Endpoint request headers: {x-amzn-lambda-integration-tag=885e5505-2212-11e9-aee0-7f024016f574, Authorization=*****************************************************************************************************************************************************************************************************************************************************************************************500617, X-Amz-Date=20190127T090420Z, x-amzn-apigateway-api-id=lqhm3agxxf, X-Amz-Source-Arn=arn:aws:execute-api:eu-central-1:304886708348:lqhm3agxxf/test-invoke-stage/GET/, Accept=application/json, User-Agent=AmazonAPIGateway_lqhm3agxxf, X-Amz-Security-Token=FQoGZXIvYXdzEOH//////////wEaDFvawdYGjH/+gSI14yK9AzQFZtlDghAr2NUHIhLGWmeJkKL8sUP3L6fu0h5PtFPN7wA7hgfWMtUNHCWyGykG0g5Zs81zKx5bUGMLCMK2zuVwD4WMgBRmkx40bZYehHdeS8czOxRTbQIqwP1lfZ0d74l4MqG4g8XpigkcLACLEn6buaq37rO4WYOo+J8ecFeSpti+u+V8OON4idxxXEHiYGJEc23OwjVvf3GTr1EUscB+Lsp/nw58oCWQArUA6LLSwcnGYXYcmnPav2Xs8mJgvqnVowxxYre0N8Gca8D9XBN2Y93/qnVTsOI5nWHSUQOnwaoXSZzgBAXKrUV1S5X+UH3zQI9p [TRUNCATED]
Sun Jan 27 09:04:20 UTC 2019 : Endpoint request body after transformations:
Sun Jan 27 09:04:20 UTC 2019 : Sending request to https://lambda.eu-central-1.amazonaws.com/2015-03-31/functions/arn:aws:lambda:eu-central-1:304886708348:function:dynamoDBService/invocations
Sun Jan 27 09:04:20 UTC 2019 : Received response. Integration latency: 17 ms
Sun Jan 27 09:04:20 UTC 2019 : Endpoint response body before transformations: {"statusCode":"400","body":"Unsupported method \"undefined\"","headers":{"Content-Type":"application/json"}}
Sun Jan 27 09:04:20 UTC 2019 : Endpoint response headers: {Date=Sun, 27 Jan 2019 09:04:20 GMT, Content-Type=application/json, Content-Length=108, Connection=keep-alive, x-amzn-RequestId=6c00229e-caa1-4d37-aeaa-7c1cbd0ddd71, x-amzn-Remapped-Content-Length=0, X-Amz-Executed-Version=$LATEST, X-Amzn-Trace-Id=root=1-5c4d7414-e52b0fba267596b50fdbb102;sampled=0}
Sun Jan 27 09:04:20 UTC 2019 : Method response body after transformations: {"statusCode":"400","body":"Unsupported method \"undefined\"","headers":{"Content-Type":"application/json"}}
Sun Jan 27 09:04:20 UTC 2019 : Method response headers: {X-Amzn-Trace-Id=Root=1-5c4d7414-e52b0fba267596b50fdbb102;Sampled=0, Access-Control-Allow-Origin=*, Content-Type=application/json}
Sun Jan 27 09:04:20 UTC 2019 : Successfully completed execution
Sun Jan 27 09:04:20 UTC 2019 : Method completed with status: 200
I expect the result to be the same as what the test event in the Lambda returns.
When you create an API method, you need to select the option "Use Lambda Proxy integration" in order for the httpMethod field, along with other information from the API Gateway, to be accessible in the event object in your Lambda function.
From the docs:
You can set up a Lambda proxy integration for any API method. But a
Lambda proxy integration is more potent when it is configured for an
API method involving a generic proxy resource. The generic proxy
resource can be denoted by a special templated path variable of
{proxy+}, the catch-all ANY method placeholder, or both. The client
can pass the input to the backend Lambda function in the incoming
request as request parameters or applicable payload. The request
parameters include headers, URL path variables, query string
parameters, and the applicable payload. The integrated Lambda function
verifies all of the input sources before processing the request and
responding to the client with meaningful error messages if any of the
required input is missing.
You can find the "Use Lambda Proxy integration" option here, on the "Create Method" screen in your API Gateway instance:
Edit: For reference, you can tell that the API Gateway method is not using the Lambda proxy integration because under "Integration Request" the type is "LAMBDA", but when using the Lambda proxy integration the type is "LAMBDA_PROXY".
For anyone who couldn't get the above solutions to work, I had to change the switch statement for the http method that the Lambda function was originally built with
from: switch (event.httpMethod) {
to: switch (event.requestContext.http.method) {
I think payload format V1.0 uses event.httpMethod, but I wasn't able to change the version in the API Gateway (message: "API Gateway managed resources are view only"). So I had to change the switch statement to match the event object.
I fixed the issue for HTTP API, by changing the payload format to V1.0 (under the integration details for the route).
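To tie the answers above together, here is a minimal sketch of a helper that reads the method from either event shape mentioned in this thread: event.httpMethod (payload format 1.0 / Lambda proxy integration) or event.requestContext.http.method (payload format 2.0). The helper name getHttpMethod is made up for illustration; it is not part of any AWS SDK.

```javascript
// Hypothetical helper: returns the HTTP method from an API Gateway event,
// or undefined when the event carries none (for example a non-proxy
// integration that passes an empty body through to the function).
function getHttpMethod(event) {
  if (!event) return undefined;
  // Payload format 1.0: the method is a top-level field.
  if (typeof event.httpMethod === 'string') return event.httpMethod;
  // Payload format 2.0: the method is nested under requestContext.http.
  const http = event.requestContext && event.requestContext.http;
  return http ? http.method : undefined;
}

console.log(getHttpMethod({ httpMethod: 'GET' }));
console.log(getHttpMethod({ requestContext: { http: { method: 'POST' } } }));
console.log(getHttpMethod({}));
```

In the handler from the question this would replace switch (event.httpMethod) with switch (getHttpMethod(event)). An undefined result then points at the integration setup rather than at the function code.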

determine when cookie will expire with node Request module

I'm writing an app that builds an API through scraping an external domain. In order to scrape the domain, my server must be authorized ( with a session cookie ).
I'm using the request module with a cookie jar to maintain the cookies across requests.
I want to set up some Node router middleware so that, if/when the session expires, I can re-run my authentication method. Think something like this:
export function validate(req, res, next) {
const cookie = cookieJar.getCookie('**target domain**');
const COOKIE_IS_EXPIRED = // ???
if ( COOKIE_IS_EXPIRED ) {
authenticate().then(next);
} else {
next();
}
}
When I log out the contents of cookieJar.getCookies() my result is something like the following:
[ Cookie="lw_opac_1=1483019929431275955; Expires=Wed, 29 Mar 2017 13:58:49 GMT; Max-Age=7776000; Path=/opac; hostOnly=true; aAge=0ms; cAge=6345ms",
Cookie="lw_opac=ta4tscsejs6c94ngikt7hlbcn0; Path=/; hostOnly=true; aAge=0ms; cAge=6347ms" ]
How can I check whether both cookies are close to expiring or have already expired, and then re-run my auth function?
Thanks!
You can use the cookie module from npm to parse the cookies.
So, for example when you have a string like you posted:
var c = "lw_opac_1=1483019929431275955; Expires=Wed, 29 Mar 2017 13:58:49 GMT; Max-Age=7776000; Path=/opac; hostOnly=true; aAge=0ms; cAge=6345ms";
then running:
console.log(cookie.parse(c));
will result in printing:
{ lw_opac_1: '1483019929431275955',
Expires: 'Wed, 29 Mar 2017 13:58:49 GMT',
'Max-Age': '7776000',
Path: '/opac',
hostOnly: 'true',
aAge: '0ms',
cAge: '6345ms' }
You can use the Max-Age and Expires fields and use moment to parse them and compare them to the current time.
With moment you can compare the given date with today and get the number of hours or days of the difference. For example this:
moment().isAfter(someDate);
will tell you if that date has already passed (so if it's the expiration date then it will tell you if the cookie has expired).
See:
https://www.npmjs.com/package/cookie
https://www.npmjs.com/package/moment
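If you would rather not pull in extra packages, the same check can be sketched with the built-in Date object. This is only an illustration that assumes the cookie is available as a string in the format shown in the question; the function names getCookieExpiry and isExpiredOrExpiring are made up here.

```javascript
// Return the Expires attribute of a serialized cookie as a Date,
// or null when the cookie has no parseable Expires attribute.
function getCookieExpiry(cookieString) {
  // Skip the first segment: it is the cookie's own name=value pair.
  const attributes = String(cookieString).split(';').slice(1);
  for (const attribute of attributes) {
    const idx = attribute.indexOf('=');
    if (idx === -1) continue;
    const name = attribute.slice(0, idx).trim().toLowerCase();
    if (name === 'expires') {
      const date = new Date(attribute.slice(idx + 1).trim());
      return isNaN(date.getTime()) ? null : date;
    }
  }
  return null;
}

// True when the cookie has expired, or will expire within marginMs of now.
// Assumption: a cookie without Expires is a session cookie and is reported
// as not expired, since its lifetime cannot be read from the string.
function isExpiredOrExpiring(cookieString, now = new Date(), marginMs = 0) {
  const expiry = getCookieExpiry(cookieString);
  if (expiry === null) return false;
  return expiry.getTime() - now.getTime() <= marginMs;
}

const c = 'lw_opac_1=1483019929431275955; Expires=Wed, 29 Mar 2017 13:58:49 GMT; Max-Age=7776000; Path=/opac';
console.log(isExpiredOrExpiring(c, new Date(), 5 * 60 * 1000));
```

The marginMs argument covers the "close to expiring" case from the question: pass a few minutes so that authentication is re-run slightly before the session actually lapses.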
@rsp's answer is correct. But if you are going to maintain a security token for access, you should move to a CSRF token, which automatically maintains the client-side cookie and generates a token for each new session. For more detail, see http://www.senchalabs.org/connect/csrf.html

How do GET requests respond differently to a Server VS a Curl requests?

I am making a Node.js/Express backend for a mobile app that makes requests to an API using the RESTful approach, where I can use the JSON data returned by this API for users to use in my mobile app.
My confusion lies in the differences in what goes on 'under the hood', and in what gets handled automatically when using specific headers that are intended for browsers and a server, versus when making curl requests through the terminal, which you run more than once, sending and receiving different headers.
Headers/The Rules That Apply When I Make Requests
I must use the conditional GETs convention where:
1) All responses return an HTTP Cache-Control header. Its content indicates how long a cached response can be used, to reduce unnecessary API requests.
2) In addition to that, each response returns an HTTP ETag header. Its content is to be used in subsequent requests to the same resource in an HTTP If-None-Match header. The API will then return a status code 304 Not Modified if the cached information is still valid.
Clients accessing this API MUST use this technique, also known as conditional GET.
Using Curl Requests
Making Initial Request:
A) Returns the data I requested. B) Returns an Etag to use for future requests C) Returns a Cache-Control header to know how long the data is cache-able, like so:
First Request:
$ curl -v "https://api.example.com/data/3" -X GET \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-u token:secret
Response:
< HTTP/1.1 200 OK
< Date: Wed, 01 Sep 2017 22:22:22 GMT
< ETag: "xxxxxxxxxxxxxxxxxxxxxxx" <-----Got Etag
< Last-Modified: Wed, 09 Sep 2015 11:11:11 GMT
< Content-Type: application/json; charset=utf-8
< Cache-Control: max-age=3600, private <-----and Cache-Control is set
{"data":{"id":1, "...": "..."}} to 3600 seconds (1 hour).
Here I understand the Cache-Control header specifies that the data I just got is good for 3,600 seconds, i.e. 1 hour. So I can use the returned data for the next hour without having to request it from the API again. After that time period, another request can be made to the API. This time I include the xxx..... ETag value I got one hour ago (because that's how long the Cache-Control was set for) in the If-None-Match header, like so:
Making Another Curl Request:
Future Request:
$ curl -v "https://api.example.com/data/3" -X GET \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "If-None-Match: xxxxxxxxxxxxxxxxxxxxxxx" <--Added here
-u token:secret
Future Response:
< HTTP/1.1 304 Not Modified <--Got 304 because the
< Date: Wed, 23 Sep 2015 20:24:20 GMT
< ETag: "xxxxxxxxxxxxxxxxxxxxxxx" <-- `Etag` returned is the same
< Last-Modified: Wed, 09 Sep 2015 11:45:39 GMT (so I know it hasn't changed)
< Cache-Control: max-age=3600, private
< Connection: close
So this leads to me having a few questions:
Making Non-Curl Requests (via a Node.js server):
I am using the pretty straightforward Node.js request module to make my request like so:
var request = require('request');
var options = {
  url: 'https://api.example.com/data/3',
  headers: {                              // Setting up headers
    'Content-Type': 'application/json',   // Note: no "If-None-Match"
    'Accept': 'application/json',
    'Authorization': 'Basic QQWERQWERWQER='
  }
};
function callback(error, response, body) {
  if (!error && response.statusCode == 200) {
    var data = body; // Got 200 response and loaded the returned data into the data variable
  }
}
request(options, callback);
So how is my Express server communicating with this API when making requests?
So after I make my first request on my Express server, it (my server) can then determine if it needs to fetch that content from the network or from cache based on the Cache-Control header and ETag. So how is this being done?
Do I need to write code to do that every hour? Would I have to grab the ETag like this
if (!error && response.statusCode == 200) {
  var data = body;
  var etag = response.headers.etag; // Grab this
}
and create a second set of headers like this
headers: {
  'Content-Type': 'application/json',
  'Accept': 'application/json',
  'Authorization': 'Basic QQWERQWERWQER=',
  'If-None-Match': etag // Adding it here
}
like I had to do for my second curl request? (I tried this and every request hangs without returning any status code. If this is how it's done, could it just be that I did something wrong on my end?)
Or does the initial response, with Cache-Control set to max-age=3600, private and the ETag (set to xxxxxxxx in my example), automatically set up the cache on my Express server and tell it to check the resource again in an hour, without my having to do anything?
Right now, whenever the mobile app requests a resource that needs something from this API (let's say the resource in my example, https://api.example.com/data/3), it triggers my Express server to fetch it on the mobile app user's behalf and return it. However, when I request the same resource more than once, the API returns a 200 response every time, whereas it should be responding with a 304 for every request after the first. So what's going on here? What am I doing or understanding wrong? Why would I not get a 304 response?
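For illustration, the conditional GET rules quoted in the question can be sketched as a few pure helper functions. This is a sketch under assumptions, not a statement about what any HTTP client does for you: it assumes the client does no caching of its own, that response header names arrive lower-cased (as Node's http module delivers them), and that a cache entry is a plain object. All function names are made up for this example.

```javascript
// Parse max-age (in seconds) from a Cache-Control value; 0 when absent.
function parseMaxAge(cacheControl) {
  const match = /max-age=(\d+)/i.exec(cacheControl || '');
  return match ? parseInt(match[1], 10) : 0;
}

// A cache entry is { etag, body, freshUntil }. While it is fresh,
// the cached body can be used without contacting the API at all.
function isFresh(entry, nowMs) {
  return Boolean(entry) && nowMs < entry.freshUntil;
}

// Headers for the next request: add If-None-Match once an ETag is known.
function buildHeaders(entry) {
  const headers = { Accept: 'application/json' };
  if (entry && entry.etag) headers['If-None-Match'] = entry.etag;
  return headers;
}

// Fold a response ({ statusCode, headers, body }) into the cache entry.
function updateEntry(entry, response, nowMs) {
  const freshUntil = nowMs + parseMaxAge(response.headers['cache-control']) * 1000;
  if (response.statusCode === 304 && entry) {
    // Not modified: keep the cached body and start a new freshness window.
    return { etag: entry.etag, body: entry.body, freshUntil };
  }
  if (response.statusCode === 200) {
    // New representation: store its ETag, including the surrounding quotes.
    return { etag: response.headers.etag, body: response.body, freshUntil };
  }
  return entry; // any other status: leave the cache as it was
}
```

With these helpers, a request callback would store updateEntry(entry, response, Date.now()) per URL, skip the request entirely while isFresh(entry, Date.now()) is true, and otherwise send buildHeaders(entry). A 304 then arrives only on requests that actually carry the If-None-Match header.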

NodeJS application on openshift gets flooded with empty requests

I am not entirely sure if this is a bad thing or not, but I was monitoring (logging) the incoming HTTP requests I get with this piece of code (note that I am running a Node.js application on the scalable OpenShift platform):
function onRequest(request, response)
{
var date = new Date();
console.log(date.toUTCString() + " A request was made with url: " + request.url + " and header: " + JSON.stringify(request.headers));
// continue handling the request
}
The results I get are the following (every 2 seconds):
Fri, 07 Mar 2014 09:43:59 GMT A request was made with url: / and header: {}
Fri, 07 Mar 2014 09:44:01 GMT A request was made with url: / and header: {}
Fri, 07 Mar 2014 09:44:03 GMT A request was made with url: / and header: {}
So I am wondering if this is normal behaviour for a scalable Node.js app (with a MongoDB database gear attached) on OpenShift, or is this something that could cause problems?
Sincerely,
Hylke Bron
If you are running a scaled application, then that is haproxy making sure that your application is up so that it can forward requests to it. You can change haproxy settings in your haproxy/haproxy.cfg file on your main gear.
I am using this method to inform OPENSHIFT that the application is alive
app.head("/", function(req, res, next){
res.status(200).end();
});
Since I didn't want to mess with haproxy.cfg, I did this and it seems to work. In your main app, add a middleware function to abort on the ping
function ignoreHeartbeat(except) {
except = except || 0;
var count = except;
return function(req, res, next) {
if (req.headers["x-forwarded-for"])
return next();
if (except > 0) {
if (--count <= 0) {
count = except;
return next();
}
}
res.end();
}
}
app.use(ignoreHeartbeat(1000));
Place that code before the call to set up the logger (Express 3 example shown below)
app.use(express.logger(...))
This logs out every 1000th ping. Set except to 0 or -1 to ignore all the pings.

Inconsistent browser retry behaviour for timed out POST requests

I am experiencing occasional retries for a POST request, when there is no response from server due to timeout. All modern browsers have retry logic for idempotent requests (GET, HEAD, etc) but I am unable to reason out why it happens for a POST request.
I am testing this case using a simple node.js server with 3 routes and the Chrome browser.
/ : gives an HTML page with jQuery and code snippets to fire ajax requests
/hi : gives a text response 'hello'
/sleep : request will timeout without any response
By default, node.js http server times out a request after 2 minutes.
retry.js
var http = require('http');
var server = http.createServer();
server.on('request', function(req, res) {
console.log(new Date() + ' ' + req.method + ' request on ' + req.url);
if (req.url === '/sleep') {
console.log('!!! sleeping');
} else if (req.url === '/') {
var html = "$.post('/hi', {'for':'server'}, function() { console.log(arguments) } ).error(function() { console.log(arguments) })";
html += "<br><br>";
html += "$.post('/sleep', {'for':'infinite'}, function() { console.log(arguments) } ).error(function() { console.log(arguments) })";
html += '<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>';
res.writeHead(200, {'Content-Type': 'text/html'});
res.end(html);
} else {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end('hello');
}
});
server.listen(2020);
console.log('server listening on port 2020');
run it
$ node retry.js
server listening on port 2020
1
Load this page in browser http://localhost:2020
Fri Mar 01 2013 12:21:59 GMT+0530 (IST) GET request on /
Fri Mar 01 2013 12:21:59 GMT+0530 (IST) GET request on /favicon.ico
2
From Dev console, fire an ajax POST request to /hi using jquery
$.post('/hi', {'for':'server'}, function() { console.log(arguments) } ).error(function() { console.log(arguments) })
Fri Mar 01 2013 12:22:05 GMT+0530 (IST) POST request on /hi
3
Fire a POST request to /sleep, results in a retry after 2 mins and errors out after 4 mins.
$.post('/sleep', {'for':'infinite'}, function() { console.log(arguments) } ).error(function() { console.log(arguments) })
Server logs show 2 requests
Fri Mar 01 2013 12:22:21 GMT+0530 (IST) POST request on /sleep
!!! sleeping
Fri Mar 01 2013 12:24:21 GMT+0530 (IST) POST request on /sleep
!!! sleeping
Firing it again, errors out in 2 mins without any retry.
Fri Mar 01 2013 12:30:01 GMT+0530 (IST) POST request on /sleep
!!! sleeping ?
It's not getting retried until we fire a request to /hi (or any other url) that results in a response. And retry happens for just one subsequent request to /sleep.
In browser, the network tab shows the pattern like
/hi - success
/sleep - cancelled - 4 mins (retry happens)
/sleep - cancelled - 2 mins (no retry)
/sleep - cancelled - 2 mins (no retry)
/hi - success
/sleep - cancelled - 4 mins (retry happens)
/sleep - cancelled - 2 mins (no retry)
/sleep - cancelled - 2 mins (no retry)
Question
Though we need to design our web app to tolerate these extra requests (either by browsers or any other intermediaries), these inconsistent browser retries look weird. I have observed this behaviour in Chrome (v16 & v24) and Firefox.
Can someone help me to understand the browser retry logic behind timed out non-idempotent requests ?
Other relevant stackoverflow questions
What happens when no response is received for a request? I'm seeing retries
Browsers retry requests (including POSTs) when the connection is closed before receiving a response from the server. This is defined in the HTTP/1.1 specification (RFC 2616), Section 8.2.4.
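Since the retry is issued by the browser itself, one way to tolerate it on the server is to de-duplicate POSTs by a client-supplied key. The following is only a sketch of that idea: it assumes the client sends a unique key with each logical request (for example in a custom header, which is not something the browser adds for you), and createDeduper is a made-up name.

```javascript
// Wrap a handler so that a repeated key replays the stored result
// instead of running the handler a second time.
// Note: the Map grows without bound; a real service would expire entries.
function createDeduper(handler) {
  const seen = new Map(); // key -> result of the first execution
  return function handle(key, payload) {
    if (key !== undefined && seen.has(key)) {
      return { replayed: true, result: seen.get(key) };
    }
    const result = handler(payload);
    if (key !== undefined) seen.set(key, result);
    return { replayed: false, result: result };
  };
}

let orders = 0;
const placeOrder = createDeduper(function () {
  orders += 1;
  return 'order-' + orders;
});
console.log(placeOrder('key-1', { item: 'book' })); // first attempt runs the handler
console.log(placeOrder('key-1', { item: 'book' })); // browser retry: replayed, no second order
```

Requests without a key are always executed, so the behaviour for clients that do not send one is unchanged.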
