How to generate a list of PASS URLs in varnishlog?

I'm trying to generate a simple list of URLs that get a 'PASS' from Varnish. varnishlog is a great utility, but it appears that it can't do this, as it primarily logs hits and has no tag for PASS.
Any idea whether there is a way to log this? Perhaps in the vcl_pass subroutine?

Never mind... I got it.
varnishlog -b -k 5000 -i TxURL

Related

How do I substitute my values in a post request?

I have links to .php pages. How do I substitute my values into all the parameters of a POST request using curl?
I do not know what the parameters of these PHP pages are, so curl would have to determine by itself which parameters the POST request takes and substitute my values.
If I know the parameter, I can send it to the links like this:
while IFS= read -r p; do
    # Quote the URL so that characters such as ? and & are not expanded by the shell.
    curl "$p" -X POST --connect-timeout 18 --cookie "" --user-agent "" -d "parametr=helloworld" -w "%{url}:%{time_total}s\n"
done < domain.txt > output.txt
And if I do not know the parameters, what should I do? How can I make curl automatically substitute values into the parameters? For example, the value "hello world", given that I do not know the parameter name.
It's simply not possible. curl is a client program and has no way of knowing or finding out which request parameters are supported by a server or which are not.
Unless of course, the API is properly documented and available as an OpenAPI/Swagger specification for example. If it isn't, you're out of luck.

Varnish 5.0: varnishstat shows an empty response with the -f parameter

I have a problem with varnishstat, which returns an empty result when used with the -f parameter.
I plan to use varnishstat to monitor Varnish, like this:
varnishstat -f MAIN.uptime
On previous versions of Varnish, such as 4.0.3, there is no problem: I receive the value in the console. But on Varnish 5.0, as soon as I use the -f parameter, the output is empty.
If I use it like this:
varnishstat -f MAIN.*
it works perfectly, but if I target a specific value, I get an empty response.
Is there a way to get the usual output from varnishstat?
Thanks a lot.
I'm replying to my own question :)
It seems that in Varnish 5 the dot needs to be doubled.
That is, previously it was possible to use:
varnishstat -f MAIN.uptime
Now, in Varnish 5, it has to be written like this:
varnishstat -f MAIN..uptime

How to download a batch of data with the Linux command line?

For example I want to download data from:
http://nimbus.cos.uidaho.edu/DATA/OBS/
with the link:
http://nimbus.cos.uidaho.edu/DATA/OBS/pr_1979.nc
to
http://nimbus.cos.uidaho.edu/DATA/OBS/pr_2015.nc
How can I write a script to download all of them? With wget? And how do I loop over the links from 1979 to 2015?
wget can take as input a file that contains one URL per line.
wget -ci url_file
-i : input file
-c : resume functionality
So all you need to do is put the URLs in a file and use that file with wget.
A simple loop like Jeff Puckett II's answer will be sufficient for your particular case, but if you happen to deal with more complex situations (random urls), this method may come in handy.
Probably something like a for loop iterating over a predefined series.
Untested code:
for i in {1979..2015}; do
wget http://nimbus.cos.uidaho.edu/DATA/OBS/pr_$i.nc
done
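The two answers above can also be combined: generate the list of URLs with a loop, write it to a file, and hand that file to wget -ci. The following is only a sketch; gen_urls is a hypothetical helper name, and the file name url_file is taken from the earlier answer.

```shell
# Print one URL per line for pr_1979.nc .. pr_2015.nc.
# gen_urls is a hypothetical helper name, not part of wget.
gen_urls() {
    base="http://nimbus.cos.uidaho.edu/DATA/OBS"
    for year in $(seq 1979 2015); do
        printf '%s/pr_%s.nc\n' "$base" "$year"
    done
}

# Usage (this starts the downloads, so it is left commented out):
#   gen_urls > url_file
#   wget -ci url_file
```

Keeping the list in a file means an interrupted run can be restarted with the same wget -ci command, and -c resumes partially downloaded files.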

Is it possible to read only first N bytes from the HTTP server using Linux command?

Here is the question.
Given the URL http://www.example.com, can we read the first N bytes of the page?
Using wget, we can download the whole page.
Using curl, there is -r; 0-499 specifies the first 500 bytes. That seems to solve the problem.
You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.
Using urllib in Python. There is a similar question here, but according to Konstantin's comment, is that really true?
Last time I tried this technique it failed because it was actually impossible to read from the HTTP server only specified amount of data, i.e. you implicitly read all HTTP response and only then read first N bytes out of it. So at the end you ended up downloading the whole 1Gb malicious response.
So the question is: how can we read only the first N bytes from the HTTP server in practice?
Regards & Thanks
You can do it natively by the following curl command (no need to download the whole document). According to the curl man page:
RANGES
HTTP 1.1 introduced byte-ranges. Using this, a client can request to get only one or more subparts of a specified document. curl supports this with the -r flag.
Get the first 100 bytes of a document:
curl -r 0-99 http://www.get.this/
Get the last 500 bytes of a document:
curl -r -500 http://www.get.this/
`curl` also supports simple ranges for FTP files as well.
Then you can only specify start and stop position.
Get the first 100 bytes of a document using FTP:
curl -r 0-99 ftp://www.get.this/README
It works for me even with a Java web app deployed to GigaSpaces.
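Since a server may ignore the range and send the whole document, the range request can be combined with a hard cut on the client side. A minimal sketch, assuming curl and a head that supports -c are installed; range_spec and fetch_first_n are hypothetical helper names.

```shell
# range_spec N prints the byte range that covers the first N bytes,
# for example 0-499 for N=500 (ranges are zero-based and inclusive).
range_spec() {
    printf '0-%s\n' "$(( $1 - 1 ))"
}

# fetch_first_n URL N asks the server for the first N bytes only, and
# cuts the stream at N bytes in case the server ignores the range.
fetch_first_n() {
    curl -s -r "$(range_spec "$2")" "$1" | head -c "$2"
}

# Usage (needs network access, so it is left commented out):
#   fetch_first_n http://www.example.com 500 > first500.bin
```

When head exits after N bytes, curl can no longer write to the pipe and aborts the transfer, so the rest of a large response is normally not downloaded, although some data beyond N bytes may already have been received.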
curl <url> | head -c 499
or
curl <url> | dd bs=1 count=499
should do
Also, there are simpler utilities with perhaps broader availability, like
netcat host 80 <<"HERE" | dd bs=1 count=499 of=output.fragment
GET /urlpath/query?string=more&bloddy=stuff
HERE
Or
GET /urlpath/query?string=more&bloddy=stuff
You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.
You will have to request the whole page anyway, so you can fetch it with curl and pipe it to head, for example.
head
-c, --bytes=[-]N
print the first N bytes of each file; with the leading '-', print all but the last N bytes of each file
I came here looking for a way to time the server's processing time, which I thought I could measure by telling curl to stop downloading after 1 byte or something.
For me, the better solution turned out to be to do a HEAD request, since this usually lets the server process the request as normal but does not return any response body:
time curl --head <URL>
Make a socket connection. Read the bytes you want. Close, and you're done.

Compare two websites and see if they are "equal?"

We are migrating web servers, and it would be nice to have an automated way to check some of the basic site structure to see if the rendered pages are the same on the new server as the old server. I was just wondering if anyone knew of anything to assist in this task?
Get the formatted output of both sites (here we use w3m, but lynx can also work):
w3m -dump http://google.com 2>/dev/null > /tmp/1.html
w3m -dump http://google.de 2>/dev/null > /tmp/2.html
Then use wdiff, it can give you a percentage of how similar the two texts are.
wdiff -nis /tmp/1.html /tmp/2.html
It can also be easier to see the differences using colordiff.
wdiff -nis /tmp/1.html /tmp/2.html | colordiff
Excerpt of output:
Web Images Vidéos Maps [-Actualités-] Livres {+Traduction+} Gmail plus »
[-iGoogle |-]
Paramètres | Connexion
Google [hp1] [hp2]
[hp3] [-Français-] {+Deutschland+}
[ ] Recherche
avancéeOutils
[Recherche Google][J'ai de la chance] linguistiques
/tmp/1.html: 43 words 39 90% common 3 6% deleted 1 2% changed
/tmp/2.html: 49 words 39 79% common 9 18% inserted 1 2% changed
(he actually put google.com into French... funny)
The common % values are how similar both texts are. Plus you can easily see the differences by word (instead of by line which can be a clutter).
The catch is how to check the 'rendered' pages. If the pages don't have any dynamic content, the easiest way is to generate hashes of the files using an md5 or sha1 command and check them against the new server.
If the pages have dynamic content, you will have to download the site using a tool like wget
wget --mirror http://thewebsite/thepages
and then use diff as suggested by Warner or do the hash thing again. I think diff may be the best way to go since even a change of 1 character will mess up the hash.
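The "hash thing" can be done per file, so that a single changed character only flags the file it is in. A minimal sketch, assuming both sites were already mirrored into two local directories and that sha1sum (GNU coreutils) is available; manifest and same_tree are hypothetical helper names.

```shell
# manifest DIR prints "checksum  ./relative/path" for every file under
# DIR, sorted by path so that two manifests line up for comparison.
manifest() {
    ( cd "$1" && find . -type f -exec sha1sum {} + | sort -k 2 )
}

# same_tree DIR1 DIR2 succeeds only when both trees contain the same
# relative paths with the same checksums; it returns 2 if a directory
# cannot be read.
same_tree() {
    manifest_a=$(manifest "$1") || return 2
    manifest_b=$(manifest "$2") || return 2
    [ "$manifest_a" = "$manifest_b" ]
}

# Usage (old_mirror and new_mirror are placeholder directory names):
#   manifest old_mirror > old.sha1
#   manifest new_mirror > new.sha1
#   diff old.sha1 new.sha1
```

Paths are taken relative to each directory, so the two mirrors can live under different host-named folders and still be compared.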
I've created the following PHP code that does what Weboide suggests here. Thanks, Weboide!
The paste is here:
http://pastebin.com/0V7sVNEq
Using the open source tool recheck-web (https://github.com/retest/recheck-web), there are two possibilities:
Create a Selenium test that checks all of your URLs on the old server, creating Golden Masters. Then run that test on the new server and see how they differ.
Use the free and open source Chrome extension (https://github.com/retest/recheck-web-chrome-extension), which internally uses recheck-web to do the same: https://chrome.google.com/webstore/detail/recheck-web-demo/ifbcdobnjihilgldbjeomakdaejhplii
For both solutions you currently need to manually list all relevant URLs. In most situations, this shouldn't be a big problem. recheck-web will compare the rendered website and show you exactly where they differ (i.e. different font, different meta tags, even different link URLs). And it gives you powerful filters to let you focus on what is relevant to you.
Disclaimer: I have helped create recheck-web.
Copy the files to the same server in /tmp/directory1 and /tmp/directory2 and run the following command:
diff -r /tmp/directory1 /tmp/directory2
For all intents and purposes, you can put them in your preferred location with your preferred naming convention.
Edit 1
You could potentially use lynx -dump or a wget and run a diff on the results.
Short of rendering each page, taking screen captures, and comparing those screenshots, I don't think it's possible to compare the rendered pages.
However, it is certainly possible to compare the downloaded website after downloading recursively with wget.
wget [option]... [URL]...
-m
--mirror
Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing.
The next step would then be to do the recursive diff that Warner recommended.
