Get absolute-form request target of HTTP request using WAI - haskell

The Request type provides accessors for the request method and the request version but not for the bit in between.
So if I have the following request:
GET http://www.example.org/index.html HTTP/1.1
I want the http://www.example.org/index.html in between
RFC 7230 Section 5.3.2 allows this form when making a request to a proxy. Section 5.4 says that a proxy receiving a request in absolute-form must ignore the received Host header and replace it with the host from the request target. That seems good enough for me, though I don't know whether WAI would handle it correctly if a misbehaving client sent a Host header that differs from the absolute-form URI.
Alternatively, if this is not possible: is there a lower-level HTTP library than WAI available in Haskell?

The rawPathInfo accessor gives you the path part of the request target. See https://hackage.haskell.org/package/wai-3.2.2.1/docs/Network-Wai.html#v:rawPathInfo for details.
If you want the query string too, it's available via the rawQueryString accessor.
As for the host, HTTP requests don't normally look like your example (HTTP/1.1 clients only send a request line like that when they're connecting to a proxy rather than to the destination web server). Instead, they usually look like this:
GET /index.html HTTP/1.1
Host: www.example.org
If you want http://www.example.org too, then you'll have to rebuild it yourself from the host and protocol information.
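For what it's worth, the rebuilding step is small. Below is a minimal sketch of the logic, written in JavaScript rather than Haskell, so it does not use any WAI API. The function name effectiveRequestUrl and its scheme parameter are made up for illustration, and the caller is assumed to know whether the connection was http or https.

```javascript
// Sketch only: rebuild an absolute URL from the pieces a server sees.
// `scheme` is an assumption supplied by the caller ("http" or "https"),
// because an origin-form request line and the Host header do not carry it.
function effectiveRequestUrl(requestTarget, hostHeader, scheme) {
  // absolute-form: the target already starts with "<scheme>://"
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(requestTarget)) {
    return requestTarget;
  }
  // origin-form: path (and optional query) combined with the Host header
  return `${scheme}://${hostHeader}${requestTarget}`;
}

// Both request styles from this thread yield the same absolute URL.
const fromProxyStyle = effectiveRequestUrl('http://www.example.org/index.html', 'www.example.org', 'http');
const fromOriginForm = effectiveRequestUrl('/index.html', 'www.example.org', 'http');
```

In WAI terms the origin-form branch would correspond to gluing together the Host header, rawPathInfo and rawQueryString.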

Related

Allow access to local host from specific URL only on Linux

I have a REST API listening on the localhost:8000 and I want it to accept requests from localhost:5000 only. Is there a way to achieve this on Linux without modifying the API code?
You can use iptables, but I think it will be easier to use socat like this:
socat TCP4:localhost:8000 TCP4:localhost:5000
For more information, you can look at this:
https://unix.stackexchange.com/questions/10428/simple-way-to-create-a-tunnel-from-one-local-port-to-another
Your REST API probably has its own mechanism for preventing cross-origin requests, and that is the reason why you struggle with connecting those two locations. This problem can't be solved at the Linux level.
First of all, let's explain a few things.
Request's origin is defined by the following features:
scheme, which is simply the protocol that your API uses (HTTP or HTTPS)
hostname, which is domain or IP address (in your case it is localhost)
port, which is self-explanatory.
So, you want to perform a cross-origin request. In the case of a simple HTTP request (a GET, HEAD or POST request), you have to set the Access-Control-Allow-Origin header on the side of your REST API (localhost:8000). For that, check how to set up that header in your specific technology.
Cross-origin requests in your case will be possible if you set this header to the following value:
Access-Control-Allow-Origin: *
You want your localhost to be accessible from a specific URL only. In the case of localhost, it is only accessible by locally running applications anyway. If you deploy your application somewhere on the web, and you want only specific URLs to be able to connect to the REST API, you have to use the following setting of the Access-Control-Allow-Origin header:
Access-Control-Allow-Origin: https://foo.example
In your case on localhost, that would be:
Access-Control-Allow-Origin: http://localhost:5000
(I assumed that you use the HTTP protocol.)
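As a rough illustration of the restricted setting, here is a minimal sketch in JavaScript of how a server could decide which CORS headers to send. The names ALLOWED_ORIGINS and corsHeadersFor are invented for this example, and your framework most likely has its own way of doing this.

```javascript
// Hypothetical allow-list; adjust to the origins you actually trust.
const ALLOWED_ORIGINS = new Set(['http://localhost:5000']);

// Returns the CORS headers to add to a response for a given Origin header.
function corsHeadersFor(originHeader) {
  if (originHeader && ALLOWED_ORIGINS.has(originHeader)) {
    return {
      'Access-Control-Allow-Origin': originHeader,
      // Tell caches that the response differs per Origin.
      'Vary': 'Origin',
    };
  }
  // No CORS headers: browsers will block cross-origin reads of the response.
  return {};
}
```

Keep in mind that these headers only influence browsers. A non-browser client such as curl ignores them, so this is not an access control mechanism in the firewall sense.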
In my opinion, it doesn't make much sense to restrict localhost connections this way - '*' is good. The only reason I can think of is protection against SSRF attacks, is that the case? (It is only justified if your server is exposed to the web.)
Further resources:
Simple cross-origin request documentation
Enabling CORS for REST API

Can I register a custom URL Scheme/Protocol with a Node HTTP Server?

I would like to be able to handle a custom URL scheme with the Node HTTP API. I would like to write links inside web pages like this: app://foo/bar. I would like to have the Node HTTP Request Handler receive this kind of URL.
When I try this kind of custom protocol in my URL, it looks like Chrome is not sending out the request because it is malformed. So nothing gets to my HTTP server in Node.
Is it possible to bind your HTTP server to a custom URL Scheme or Protocol like app://foo/bar?
Only certain protocols such as http:// and https:// will be sent to your nodejs http server. That's the issue. Your node.js server is an http server. The chrome browser will only send it URLs with the http protocol that it knows belong to an http server.
A custom protocol has to be first handled in the browser with a browser add-on that can then decide what to do with it.
Perhaps what you want to do is a custom HTTP URL such as:
http://yourserver.com/foo/bar
Then, your node.js http server will get the /foo/bar part of the request and you can write custom handlers for that.
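A minimal sketch of such a handler, assuming a plain Node http server: the handler name, the placeholder base URL and the port in the usage comment are arbitrary choices for illustration.

```javascript
// Plain request handler; `req.url` holds the path and query, e.g. "/foo/bar".
function handler(req, res) {
  // The base URL is only there so that URL can parse a relative target.
  const { pathname } = new URL(req.url, 'http://placeholder.invalid');
  if (pathname === '/foo/bar') {
    res.statusCode = 200;
    res.end('handled /foo/bar');
  } else {
    res.statusCode = 404;
    res.end('not found');
  }
}

// Usage (port 3000 is an arbitrary choice):
// require('node:http').createServer(handler).listen(3000);
```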
To recap, the first part of the URL, the part before the ://, is the protocol. That tells the browser what protocol this URL is supposed to be used with. Without a browser add-on, a browser only comes with support for some built-in protocols such as http, https, ws, wss, mailto and some others.
An http server will only be able to respond to the http protocol so it will only work with URLs that expect to use the http protocol and that the browser knows use the http protocol. Thus your own protocol that the browser does not know about is not something the browser knows what to do with. It would take a browser add-on to tell the browser what to do for a custom URL.
When I try this kind of url, it almost looks like Chrome is batting it down before it can get to my HTTP server in Node.
Yes, it's not a recognizable protocol built into the browser so the browser doesn't know what to do with it or how to speak that protocol.
Is it possible to bind your HTTP server to a custom URL Scheme like this?
Only with a browser add-on that registers and implements support for the custom URL protocol.
I have made an npm module for this purpose.
Link: https://www.npmjs.com/package/protocol-registry
To do this in Node.js, first install it:
npm i protocol-registry
Then use the code below to register your entry file.
const path = require('path');
const ProtocolRegistry = require('protocol-registry');

console.log('Registering...');
// Registers the protocol
ProtocolRegistry.register({
  protocol: 'testproto', // sets the protocol for your command, testproto://**
  command: `node ${path.join(__dirname, './index.js')} $_URL_`, // $_URL_ will be replaced by the URL used to initiate it
  override: true, // Use this with caution as it will destroy all previous registrations on this protocol
  terminal: true, // Use this to run your command inside a terminal
  script: false
}).then(async () => {
  console.log('Successfully registered');
});
Then, suppose someone opens testproto://test. A new terminal will be launched executing:
node yourapp/index.js testproto://test
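What index.js does with that argument is up to you. As a minimal sketch, assuming the URL arrives as the first command-line argument as in the command above, it could be parsed like this (parseLaunchUrl is a made-up helper, not part of protocol-registry):

```javascript
// Sketch of a possible index.js helper. Assumes the URL is passed as the
// first command-line argument, e.g. "node yourapp/index.js testproto://test".
function parseLaunchUrl(argv) {
  const raw = argv[2]; // argv[0] is node, argv[1] is the script path
  if (!raw) return null;
  const url = new URL(raw);
  return {
    protocol: url.protocol, // includes the trailing colon
    host: url.hostname,
    path: url.pathname,
    query: Object.fromEntries(url.searchParams),
  };
}

// Usage inside index.js:
// const launch = parseLaunchUrl(process.argv);
// if (launch) console.log('Launched with', launch);
```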
Based on the comment thread on the other answer, I think I understand what you're trying to do.
I hear that you want to serve some files off localhost but not at all pollute the namespace of an existing webserver.
I have several weird alternative solutions:
Just pick a namespace that's unlikely to be used by a user. You can start it with an underscore or a dollar sign, or you can just use a very random number.
You can serve your files, but only if a URI parameter exists with a very random string. PHP does this to serve the PHP logo for example.
You can't really change the scheme without creating browser add-ons, but you do have control over the TCP port. You can start a second webserver on a second port.
You can use a second domain. Just register a domain and point an A record to 127.0.0.1. Now your webserver running on localhost can check out the Host: header and serve your files if it matches your special hostname.
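The last option could look roughly like the sketch below. SPECIAL_HOST is a hypothetical hostname standing in for whatever domain you register and point at 127.0.0.1, and isSpecialHost is an invented helper name.

```javascript
// Hypothetical hostname whose A record points at 127.0.0.1.
const SPECIAL_HOST = 'files.my-local-app.example';

// True when the Host header names the special hostname (on any port).
function isSpecialHost(hostHeader) {
  if (!hostHeader) return false;
  // Drop an optional ":port" suffix and compare case-insensitively.
  const hostname = hostHeader.replace(/:\d+$/, '').toLowerCase();
  return hostname === SPECIAL_HOST;
}

// Usage inside a Node request handler:
// if (isSpecialHost(req.headers.host)) { /* serve your own files */ }
```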

Cors and web resource

I'm trying to figure out what CORS is.
MDN describes it as:
A resource makes a cross-origin HTTP request when it requests a resource from a different domain than the one which the first resource itself serves.
I'm not sure I know what a web resource is.
In addition, I understand that CORS allows me to use a web resource from another domain in my domain by putting the domain in a header, but is it just a convention or something more than that?
Let me try to give a short explanation.
Web resource
A web resource is anything you request on the web. That could be an image, a json payload, a pdf, an html-page etc. There's not more to it than that.
CORS
When you want to make an AJAX request in a browser (typically from JavaScript), you are by default limited to requesting resources (URLs) on the same domain. E.g. www.x.com can only request resources from www.x.com. Let's imagine you have a web page on www.x.com that wants to get a resource from api.x.com. This will not be possible unless the server (api.x.com) has CORS enabled.
So how does it work? Well, the flow is like this (simplified a lot).
When you make an AJAX request that is not a "simple" one, for instance a request with a JSON Content-Type or a custom header, the browser first issues an OPTIONS (preflight) request to the server in which it states who it is (Origin: http://www.x.com). The server is then supposed to answer with a header saying that it is OK for that origin to make the request: Access-Control-Allow-Origin: http://www.x.com. If the allowed origin matches the origin in the request, the browser issues the actual request and the JSON payload is returned by the server. If the allowed origin does not match, the browser refuses to make the request and shows an error in the console. For simple requests such as a plain GET, the browser skips the preflight, sends the request directly with the Origin header, and only hands the response to your script if the response carries a matching Access-Control-Allow-Origin header.
If you are writing the client (www.x.com) and are using, let's say, jQuery, you don't have to do anything. Everything happens automatically.
If you are writing the server (api.x.com), you have to enable CORS. How this is done varies a lot, but http://enable-cors.org/server.html has a nice guide on how to do it on different server types. They also have some more in-depth guides on how it works. Specifically, you might want to take a look here: https://www.nczonline.net/blog/2010/05/25/cross-domain-ajax-with-cross-origin-resource-sharing/
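To make the server side a bit more concrete, here is a minimal sketch of a Node-style handler for api.x.com that answers both the OPTIONS request and the real request. PAGE_ORIGIN, apiHandler and the allowed methods and headers are assumptions made for this example; real servers usually rely on a CORS middleware instead.

```javascript
// Assumed origin of the page that is allowed to call the API.
const PAGE_ORIGIN = 'http://www.x.com';

function apiHandler(req, res) {
  const allowed = req.headers.origin === PAGE_ORIGIN;
  if (allowed) {
    res.setHeader('Access-Control-Allow-Origin', PAGE_ORIGIN);
    res.setHeader('Vary', 'Origin');
  }
  if (req.method === 'OPTIONS') {
    // Preflight: say which methods and headers the real request may use.
    if (allowed) {
      res.setHeader('Access-Control-Allow-Methods', 'GET, POST');
      res.setHeader('Access-Control-Allow-Headers', 'Content-Type');
    }
    res.statusCode = 204;
    res.end();
    return;
  }
  // The actual request: return the JSON payload.
  res.statusCode = 200;
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify({ ok: true }));
}

// Usage (port 8080 is arbitrary):
// require('node:http').createServer(apiHandler).listen(8080);
```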
I hope this helps you out a bit

Is it possible to distinguish a requestURL as one typed in the address bar to log in a node proxy?

I just could not get the http-proxy module to work properly as a forward proxy. It works great as a reverse proxy. Therefore, I have implemented a node-based forward proxy using node's http and net modules. It works fine, both with http and https. I will deal with websockets later. Among other things, I want to log the URLs visited or requested through a browser. In the request object, I do get the URL, but as expected, when a page loads, a zillion other requests are triggered, including AJAX, third-party ads, etc. I do not want to log these.
I know that I can distinguish an AJAX request from the x-requested-with header. I can distinguish requests coming from a browser by examining the user-agent header (though these can be spoofed thru cURL). I want to minimize the log entries.
How do commercial proxies log such info? Or do they just log every request? One way would be to not log any requests within a certain time after the main request presuming that they are all associated with the main request. That would not be technically accurate.
I have researched in this area but did not find any solution. I am not looking for any specific code, just some direction...
No one can know that with precision, but you can look for clues such as the HTTP Referer header or x-requested-with, or add your own custom headers to each AJAX request (the Squid proxy, for example, adds an X-Forwarded-For header by default, which reveals that a proxy is involved). However, anybody can figure out which headers you send with your requests, or copy all the headers that a common browser sends by default, and you will believe it is a person using a browser when it could be a cURL command sent by a bot.
So, really, you can't know for certain whether, for example, a request is an AJAX request, because these headers aren't mandatory. Your browser or framework adds x-requested-with or other useful information by default only to help you "guess" who is making the request.
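If you still want to cut down the log, a guess based on those clues could look like the sketch below. It is only a heuristic: classifyRequest and its labels are invented here, the Accept check is an additional assumption (browsers typically ask for text/html first when loading a page), and every header involved can be missing or forged.

```javascript
// Heuristic only: every header checked here is optional and can be forged.
// Expects header names in lower case, as Node's req.headers provides them.
function classifyRequest(headers) {
  const accept = (headers['accept'] || '').toLowerCase();
  const requestedWith = (headers['x-requested-with'] || '').toLowerCase();

  // Typical AJAX marker added by many frameworks.
  if (requestedWith === 'xmlhttprequest') return 'subresource';
  // Assumption: page loads ask for HTML first; images, scripts etc. do not.
  if (!accept.startsWith('text/html')) return 'subresource';
  // A typed or bookmarked URL usually arrives without a Referer,
  // while a clicked link or an embedded frame usually has one.
  return headers['referer'] ? 'navigation-with-referer' : 'navigation-no-referer';
}
```

Logging only the 'navigation-no-referer' class would get closest to "typed in the address bar", at the cost of missing some visits and including some bots.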

How do I add the Varnish-Cache signature to an HTTP request in VCL?

I would like to add the Varnish-Cache version/signature to my incoming HTTP requests so I can log the Varnish version with requests on my webserver. I understand this information is available in obj.http.Server, but this doesn't work inside vcl_recv or vcl_miss:
set req.http.X-VARNISH-VERSION = obj.http.Server;
Apparently those vcl subs only have access to req and not obj. Is there any other way to get the version number into an HTTP request header?
I am using Varnish 3.0.2.
[Edit]
I am using a Varnish module as an integral component in my system, and as part of my automated testing I am running functional tests through the load balancer. I want my web servers (HHVM in this case) to know what version of Varnish is proxying the requests. Currently I am using a hardcoded string for this purpose, but I would like to automate this so I can distribute a non-hardcoded configuration to my Varnish servers.
Varnish only sets the Server header when performing a synthetic response (like in vcl_error), and the header in that case doesn't contain Varnish's version.
Please extend your question; I can't envision what you want to achieve with that (and why a fixed string header substitution won't fit your needs).

Resources