How do I control request headers set by RequireJS?

Specifically, I want to add a client-set request ID to troubleshoot some network problems. I see that RequireJS allows control over query string parameters, e.g. for cache busting, but putting a unique request ID in the URL will bust the cache (when I don't want it to) and make debugging hard (breakpoints will end up set on e.g. /resource?request_id=288832uc8vasd8).
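For reference, the query-string control I mean is RequireJS's urlArgs config option; this is just a sketch of how it's typically used for cache busting (the bust value and module names are placeholders):

// urlArgs appends a query string to every module URL RequireJS requests.
// Fine for cache busting, but a per-request ID here would defeat caching.
requirejs.config({
    baseUrl: '/js',
    urlArgs: 'bust=v1.2.3' // example value; often a build number or timestamp
});

requirejs(['app/main'], function (main) {
    main.init(); // 'app/main' and init() are placeholders
});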

RequireJS pulls scripts in by adding them as <script/> tags and then waiting for the callback to fire. As far as I know, you can't add headers to the request that a <script/> tag makes, so I don't think you can add headers for RequireJS calls.
From there, consider the answer to "Is it possible to set custom headers on JS requests?":
Short answer: no. By default a script tag will just retrieve the resource specified in the src attribute.
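If the request ID really has to travel as a header, the only workaround I can think of is to bypass the script tag entirely: fetch the script text with XMLHttpRequest (where you can set headers) and inject it yourself. A rough sketch under that assumption (RequireJS doesn't support this out of the box, and the X-Request-Id header name is just an example):

// Load a script via XHR so a custom header can be attached, then inline
// the fetched source. Note this bypasses RequireJS's dependency handling.
function loadWithHeader(url, requestId, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.setRequestHeader('X-Request-Id', requestId); // example header name
    xhr.onload = function () {
        if (xhr.status === 200) {
            var script = document.createElement('script');
            script.text = xhr.responseText;
            document.head.appendChild(script);
            callback(null);
        } else {
            callback(new Error('Failed to load ' + url + ': ' + xhr.status));
        }
    };
    xhr.onerror = function () {
        callback(new Error('Network error loading ' + url));
    };
    xhr.send();
}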

Related

How to set query variables on server response

I am running an Express app, and in one section I need to pass some data to the page I'm serving. I am sending the file with the res.sendFile() function. I would prefer the data to be in the form of query parameters, so that the page being sent can read them easily.
I can't use a templating tool or set cookies, since the files are part of a CDN uploaded by users; the information also has to be self-contained so that it is not easily read by other files served from my server.
Query parameters can only be sent by doing a redirect, where your server returns a 3xx status (probably 302) that redirects the browser to a different URL with the query parameters set. This is not particularly efficient because it forces an extra request from the browser. See res.redirect() for more info.
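A minimal sketch of the redirect approach, assuming an Express route (the paths and data values are placeholders):

const express = require('express');
const app = express();

// Redirect to the static file with the data encoded as query parameters.
// The browser then issues a second request for /viewer.html?user=...&plan=...
app.get('/page', (req, res) => {
    const params = new URLSearchParams({ user: 'abc', plan: 'pro' }); // example data
    res.redirect(302, '/viewer.html?' + params.toString());
});

app.listen(3000);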
A more common way to give data to a browser is to set a few Javascript variables in the web page and the client Javascript can then just read those variables directly. You would have to switch from res.sendFile() to something that can modify specific parts of the web page before sending it - probably one of the many template engines available for Express (jade, handlebars, etc...).
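A minimal sketch of that idea without pulling in a full template engine: read the file, splice the data in as an inline script variable, and send the result (window.PAGE_DATA, the paths, and the payload are example names only):

const express = require('express');
const fs = require('fs');
const path = require('path');
const app = express();

// Replace res.sendFile() with a handler that splices the data into the page.
app.get('/viewer', (req, res) => {
    const file = path.join(__dirname, 'public', 'viewer.html'); // example path
    fs.readFile(file, 'utf8', (err, html) => {
        if (err) return res.sendStatus(500);
        const data = { user: 'abc', plan: 'pro' }; // example payload
        const tag = '<script>window.PAGE_DATA = ' + JSON.stringify(data) + ';</script>';
        res.send(html.replace('</head>', tag + '</head>'));
    });
});

app.listen(3000);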
You could also send data by returning a cookie with the response, though a cookie is not really the ideal mechanism for variables just for one particular instance of one particular page.

Is it possible for Varnish to examine the content of a request (not just headers) in vcl_fetch and react?

I know that the default Varnish vcl_fetch looks at beresp.ttl and beresp.http.* to reference the HTTP headers returned from the backend, but is it possible to examine the content of the response also? Our backend sometimes fails with junk HTML but with a status of 200 OK. We'd like to be able to run a regex on the result and retry if possible.
I understand that versions of Varnish <= 3.0 don't stream anyway and download the entire object before passing to the client, but I can't find the appropriate field in beresp in the documentation - I'm looking for something like beresp.http.content
Yes and no. It's accessible, but only through inline C, not VCL configuration (to the best of my knowledge). However, it's not easy to do and not really recommended due to the additional overhead of parsing body text. That said, you can see an attempt at something like what you're looking for here: rewrite vmod for varnish 3
If your junk HTML responses are of a specific length, you can retry the request based on the response's Content-Length header. Alternatively, you might consider adding client-side JS to evaluate the HTML and make an AJAX request to a URL that clears the cache of any junk pages. Lastly, if you know that only a specific subset of your site returns invalid results, you can try proxying those URLs through something like OpenResty with LuaJIT or nginx with the subs module enabled, and do the body parsing there.
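A rough sketch of the client-side idea, assuming you can recognize the junk markup with a pattern and expose some purge endpoint (the /purge URL and the regex are both hypothetical):

// If the served page looks like the known junk output, ask a purge
// endpoint to evict it from the cache and then reload a fresh copy.
document.addEventListener('DOMContentLoaded', function () {
    var junkPattern = /Fatal error|<title>\s*<\/title>/i; // example heuristic
    if (junkPattern.test(document.documentElement.innerHTML)) {
        var xhr = new XMLHttpRequest();
        xhr.open('POST', '/purge?url=' + encodeURIComponent(location.pathname), true);
        xhr.onload = function () {
            location.reload(); // fetch a fresh copy once the cache is cleared
        };
        xhr.send();
    }
});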

Post Form to a page in CQ5

I have a custom search component which searches for some parameter(s) selected from a dropdown [myParam] and displays the search results on another page. I currently use the default (GET) form:
<form id="searchForm" action="/content/myWeb/searchResult.html" method="get" target="_blank">
In the result page, a component picks up the request params and processes the search.
I need to make it a POST submission so that the search parameters are NOT visible in the URL. But if I make it a method="Post" in the form above, I get this error:
Status: 500
Message: javax.jcr.nodetype.ConstraintViolationException: no matching property definition found for {}myParam
Location: /content/myWeb/searchResult
Parent Location: /content/myWeb
Path: /path/to/search/page
That exception is the incidental way that Sling tells you that the servlet to which you are attempting to POST cannot be found. What happens, in this case, is that Sling defaults to the SlingDefaultPostServlet, which attempts to POST properties (represented by your form values) to the node /content/myWeb/searchResult. There's no way for Sling to say "I can't find a servlet that's registered to your request", so it just falls back to its default behavior.
I'm assuming /content/myWeb/searchResult is a cq:Page node type. That node type is very restrictive, which is why it tells you that you cannot add properties that correspond to your form values.
This worked before, because your GET request to /content/myWeb/searchResult.html was able to resolve and execute. All GET requests to a page node can be served up by the system, inherently.
Now, since you are trying to do a POST, you need to create and register a new servlet that can handle this POST request. To do this, you'll need to create a SlingPostServlet and register it to your specific path (not recommended) or a specific selector/extension combination (recommended). That servlet should process the request parameters and respond with an HTML document.
A caveat...
What I just described will help you technically build what you are asking for. That said, I don't agree with the premise that you should "make it a POST to hide the request parameters." The reason this is so much extra work is that you are circumventing the principles of REST, which Sling is theoretically built to support. Your URL (via request path and parameters) should communicate "I want the page at /content/myWeb/searchResult, given the criteria param1=x, param2=y, and so on". A GET with request params is an appropriately RESTful request.
I suggest you rethink what you're trying to do. Building a more complex solution just to work around RESTful principles is not good practice.
Just as a side note, you can always check whether a given URL is bound to a servlet via the Sling Servlet Resolver, reachable via the OSGi console at:
http://localhost:4502/system/console/servletresolver
This can at least help you confirm whether a servlet is registered for the given URL.
You can create a POST.jsp for your page, which could handle the POST request.
It is not RESTful to make a GET-like request with POST, but sometimes it can be useful. Also, with POST, the dispatcher won't cache your request.

Can I capture JSON data already being sent with a userscript/Chrome extension?

I'm trying to write a userscript/Chrome extension to capture JSON data being sent while using a web service, so that I can reformat it and display a selected portion on the page. Currently the JSON is sent as the application loads (as I've observed by watching traffic with Fiddler 2). Is my only option to request the JSON again, or is capture possible? Since I'm not providing a code example, even some guidance on what method/topic to research, or whether I'm barking up the wrong tree, would be a welcome answer.
No easy way.
If it is for a specific site, you might look into intercepting and overwriting the part of the code which sends the request. For example, if it is sent on a button click, you can replace the existing click handler with your own implementation.
You can also try to make a proxy for XMLHttpRequest. I'm not sure if this is even possible, and I've never seen a working example. You can look at some attempts here.
For all these tasks you would probably need to run your JavaScript code outside of the sandboxed content script to be able to access the parent page's variables, so you would need to inject a <script> tag with your code right into the page from the content script:
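Something along these lines; the XMLHttpRequest patching is only a sketch of the proxy idea above, and the /api/data filter and capturedJson event name are made-up examples:

// content script: inject page-level code so it runs outside the sandbox
var script = document.createElement('script');
script.textContent = '(' + function () {
    // Page context: wrap XMLHttpRequest so responses can be inspected.
    var origOpen = XMLHttpRequest.prototype.open;
    XMLHttpRequest.prototype.open = function (method, url) {
        this.addEventListener('load', function () {
            if (url.indexOf('/api/data') !== -1) { // example endpoint filter
                try {
                    var json = JSON.parse(this.responseText);
                    // Hand the captured data back, e.g. via a DOM event
                    document.dispatchEvent(new CustomEvent('capturedJson', { detail: json }));
                } catch (e) { /* response was not JSON; ignore */ }
            }
        });
        return origOpen.apply(this, arguments);
    };
} + ')();';
(document.head || document.documentElement).appendChild(script);
script.remove();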

How does the browser know a web page has changed?

This is a deceptively simple thing I feel I should know more about, but I don't, and I can't find much about it.
The question is: How exactly does a browser know a web page has changed?
Intuitively I would say that F5 refreshes the cache for a given page, that the cache is used for history navigation only, and that it has an expiration date, which leads me to think the browser never knows if a web page has changed and just reloads the page once the cache is gone. But I am sure this is not always the case.
Any pointers appreciated!
Browsers will usually get this information through HTTP headers sent with the page.
For example, the Last-Modified header tells the browser how old the page is. A browser can send a simple HEAD request to the page to get the last-modified value. If it's newer than what the browser has in cache, then the browser can reload it.
There are a bunch of other headers related to caching as well (like Cache-Control). Check out: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
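As a rough illustration of the HEAD-plus-Last-Modified check (written against Node 18+'s fetch; the URL and cached date are placeholders, and real browsers do this internally):

// Compare a cached Last-Modified value against a fresh HEAD request.
const cachedLastModified = 'Tue, 01 Aug 2023 10:00:00 GMT'; // example cached value

fetch('https://example.com/page.html', { method: 'HEAD' })
    .then((res) => {
        const lastModified = res.headers.get('Last-Modified');
        if (lastModified && new Date(lastModified) > new Date(cachedLastModified)) {
            console.log('Page changed since it was cached; re-fetch it');
        } else {
            console.log('Cached copy is still current');
        }
    });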
Don't guess; read the docs. Here's a friendly, but authoritative introduction to the subject.
Web browsers send HTTP requests and receive HTTP responses. They then display the contents of those responses. Typically the HTTP response will contain HTML, and many HTML elements may require further requests to fetch the various parts of the page; for example, each image is typically another HTTP request.
There are HTTP headers that indicate whether a page is new or not, for example the last-modified date. Web browsers typically use a conditional GET (a conditional header field) or a HEAD request to detect changes. A HEAD request receives only the headers, not the actual resource that is requested.
A conditional GET HTTP request will return a status of 304 Not Modified if there are no changes.
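A sketch of the conditional GET itself (again with Node 18+ fetch and placeholder values; when nothing has changed, the server answers 304 and sends no body):

// A conditional GET returns 304 Not Modified when the resource has not
// changed since the supplied date, so the cached copy can be reused.
fetch('https://example.com/page.html', {
    headers: { 'If-Modified-Since': 'Tue, 01 Aug 2023 10:00:00 GMT' } // example date
}).then((res) => {
    if (res.status === 304) {
        console.log('Not modified: use the cached copy');
    } else {
        return res.text().then((body) => {
            console.log('Changed: received a fresh copy,', body.length, 'characters');
        });
    }
});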
The page can then later change based on:
User input
After user input, changes can happen through JavaScript code running without a postback.
After user input, a new request can be sent to the server to get a whole new (possibly identical) page.
JavaScript code can run once a page is already loaded and change things at any time. For example, you may have a timer that changes something on the page.
Some pages also contain HTML elements that scroll, blink, or have other built-in behavior.
You're on the right track, and as Jonathan mentioned, nothing is better than reading the docs. However, if you just want a bit more information:
There are HTTP response headers that let the server set the cacheability of a page, which corresponds to your expiration-date idea. Another important construct is the HTTP HEAD request, which retrieves just the response headers, such as the MIME type and Content-Length (if available), for a given page. Browsers can use a HEAD request to validate what is in their caches...
There is definitely more info on the subject though, so I would suggest reading the docs...
