Varnish script for warming cache

I am using this script:
#!/bin/bash
wget --output-document=/dev/null --header='Cache-Control: no-cache' --tries=1 --quiet --base=http://example.com --input-file=/path/to/urls.txt
which calls the URLs I want to warm in Varnish from a text file.
For this to work I added the following to my Varnish VCL:
acl warmuper_ip {
    "10.20.30.40";
}

sub vcl_recv {
    # the script varnish-cache-warmup.sh must always refresh the cache
    if (client.ip ~ warmuper_ip && req.http.Cache-Control ~ "no-cache") {
        set req.hash_always_miss = true;
    }
}
My question is: although this is working, does it clear the cache every time and then add my URLs? I am a newbie and I don't want to clear Varnish every time I run this script; I just want to cache my links without affecting previously cached links.

With req.http.Cache-Control ~ "no-cache" I don't think the content you retrieve from the backend gets cached.
The content is simply fetched and delivered, but it won't be stored in the cache.
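For what it's worth, hash_always_miss only forces a miss for the requests it is set on; it does not purge anything else. A quick way to check that the rest of the cache survives a warm-up run (a sketch; /some-cached-page is a placeholder for any URL you know is already cached, and varnish-cache-warmup.sh is the script above):
# request an already cached page, run the warm-up, then request the page again;
# if the Age header keeps growing, that object was not evicted by the warm-up
curl -sI http://example.com/some-cached-page | grep -i '^Age:'
./varnish-cache-warmup.sh
curl -sI http://example.com/some-cached-page | grep -i '^Age:'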

Related

Updating a file for a quick-pull using github cli

Currently in the github UI, a user can edit a file and create a new branch in a single action. This can also be done through the github api using something like this:
curl 'https://github.com/<my_org>/<my_repo>/tree-save/master/<path_to_file>' \
-H 'content-type: application/x-www-form-urlencoded' \
--data-raw 'authenticity_token=<generated_token>&filename=<filename>&new_filename=<filename>&content_changed=true&value=<new_contents_of_file>&message=Updated+file+in+my+repo&placeholder_message=Update+<filename>&description=&commit-choice=quick-pull&target_branch=<new_branch_name>&quick_pull=master&guidance_task=&commit=<target_commit_checksum>&same_repo=1&pr='
What I would like to be able to do is perform the same action using the GitHub CLI* (gh). I have tried the following commands:
gh api <my_org>/<my_repo>/tree-save/master/<path_to_file> -F "filename=<filename>" -F ...
and
gh api repos/<my_org>/<my_repo>/contents/<path_to_file> -F "filename=<filename>" -F ...
For both cases (and many variations on these options), I'm getting a 404** back. Any ideas what I'm doing wrong? Does the github cli even allow the functionality allowed in the above curl?
* For those curious: I want to use the CLI because of how it handles auth and because of its statelessness. I can't generate a token to use like in the curl above. And, due to multiple issues, I also can't clone the repo locally.
** I'm able to retrieve the file just fine using the simple GET command (the second command above without the '-F' flags)
After reading documentation, and then verifying by altering credentials, it appears to be a permissions issue. Evidently, for security reasons, if a token is used that does not meet the required permissions, a 404 is returned instead of a 403.
Interestingly, I can still use the curl above through the browser. So now I need to figure out why the gh CLI token does not have the same permissions as my user.
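For reference, the documented REST way to get the same "edit a file on a new branch" effect is to create the branch ref and then PUT the new contents to the contents endpoint. A rough sketch with gh, using the same placeholders as above and assuming the token has write access to the repo (new_contents.txt is a hypothetical local file holding the new file body, and base64 -w0 assumes GNU coreutils):
# create the new branch from master
gh api repos/<my_org>/<my_repo>/git/refs \
  -f ref='refs/heads/<new_branch_name>' \
  -f sha="$(gh api repos/<my_org>/<my_repo>/git/ref/heads/master --jq '.object.sha')"
# commit the updated file to that branch (content must be base64-encoded)
gh api --method PUT repos/<my_org>/<my_repo>/contents/<path_to_file> \
  -f message='Updated file in my repo' \
  -f branch='<new_branch_name>' \
  -f sha="$(gh api repos/<my_org>/<my_repo>/contents/<path_to_file> --jq '.sha')" \
  -f content="$(base64 -w0 new_contents.txt)"
# optionally open the pull request as well
gh api repos/<my_org>/<my_repo>/pulls \
  -f title='Updated file in my repo' \
  -f head='<new_branch_name>' \
  -f base='master'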

Why does calling curl from execSync in Node.js fail, while directly running the exact same command works?

I have run into a problem where execSync in Node.js does not behave the same way as typing the command directly in the shell.
Here is my issue:
I use curl to request some data from a server, and I need to do that with a cookie because there is a login requirement.
It's easy to handle the login process and get the cookie, but oddly, using the cookie with curl in Node.js causes an "internal error" on the server. Since I don't have permission to change the server code, I'm looking for help understanding the difference between calling curl from Node.js and using curl directly.
Here is the code:
var command = 'curl --cookie cookie.txt ' + getURL();
console.log(command);
// output: curl --cookie cookie.txt http://example.com/getdata
var result = child_process.execSync(command).toString();
// will cause an internal error and the "result" is an error-reporting page.
Directly calling this in the shell:
curl --cookie cookie.txt http://example.com/getdata
Everything works and I get the data I need.
I tried to find some clues, for instance by changing the code to:
var command = 'curl --cookie cookie-bad.txt ' + getURL();
When I put a wrong cookie in cookie-bad.txt, I get a "you are not logged in" result.
So there must be something wrong with sending a cookie to the server with curl when it runs inside a Node.js script via execSync.
Is there any way I can improve the code, or something else I should try?
What is your Node.js version? I don't have any problem with 10.16.0.
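One way to narrow this down is to capture the request headers curl actually sends in both cases and diff them. A diagnostic sketch (the URL and cookie.txt are taken from the question; the log file names are made up):
curl -v --cookie cookie.txt http://example.com/getdata -o /dev/null 2> shell.log
node -e 'require("child_process").execSync("curl -v --cookie cookie.txt http://example.com/getdata -o /dev/null 2> node.log")'
grep '^>' shell.log > shell-request.txt
grep '^>' node.log > node-request.txt
diff shell-request.txt node-request.txt
If the Cookie header is missing or different in the Node capture, one thing to check is the working directory: execSync runs the command from process.cwd(), so curl may be reading a different cookie.txt (or none at all) than when you type the command by hand.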

How do I clear output cache in Pimcore programmatically?

Note that this is not about disabling the output cache, which only disables it for certain requests, as described in the documentation.
Pimcore provides a CLI script to clear the cache:
php pimcore/cli/console.php cache:clear
You can run the same command programmatically from any PHP file with the exec() function.
I found this in the Pimcore 4 source code, in pimcore/models/Object/ClassDefinition.php:
// empty output cache
try {
    Cache::clearTag("output");
} catch (\Exception $e) {
}
Although I don't really agree with how this works (they did not document that "output" is a reserved keyword for cache tags).
Please use these commands from your project root:
php bin/console cache:clear
php bin/console pimcore:cache:clear

How to configure wget to retry more than 20 times?

I used d4x to resume downloads, but it has been removed from Ubuntu, so now I use FlashGot with wget to resume downloads. But wget stops after 20 tries and I have to restart it manually. Is there a config file where I can set the number of retries to more than 20?
The wget command line is generated automatically, so please don't tell me I could just pass options on the wget command line.
Use the --tries option:
wget --tries=42 http://example.org/
Specify --tries=0 or --tries=inf for infinite retrying (default is 20 retries).
The default value can also be changed via the config file, if that is your thing: open /etc/wgetrc and look for:
# You can lower (or raise) the default number of retries when
# downloading a file (default is 20).
#tries = 20
Uncomment the tries = 20 line and change the value to whatever you want.
The default is to retry 20 times, with the exception of fatal errors like "connection refused" or "not found" (404), which are not retried.
If the default retry value cannot meet your needs, you are probably downloading from an unstable source.
The following options may also help a lot (a combined command follows this list):
--retry-connrefused --read-timeout=20 --timeout=15 --tries=0 --continue
--retry-connrefused
Force wget to retry even when the server refuses connections; otherwise wget stops retrying.
--waitretry=1
If you decide to retry many times, it's better to wait a short period between retries.
--timeout=15
An unstable link often causes the data flow to stall, and the default 900-second timeout is too long. There are --dns-timeout, --connect-timeout, and --read-timeout; by specifying --timeout you set them all at once.
--tries=0
Make wget retry infinitely, except in fatal situations such as a 404.
--continue
Resume a partially downloaded file.
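Putting those options together into a single command (the URL is just a placeholder):
wget --retry-connrefused --waitretry=1 --read-timeout=20 --timeout=15 --tries=0 --continue http://example.org/some-large-file.iso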

wget and curl somehow modifying bencode file when downloading

Okay, so I have a bit of a weird problem going on that I'm not entirely sure how to explain... Basically, I am trying to decode a bencoded file (a .torrent file). I have tried 4 or 5 different scripts found via Google and S.O. with no luck (I get results like "this is not a dictionary", or output errors from the same).
Now I am downloading the .torrent file like so
wget http://link_to.torrent file
//and have also tried with curl like so
curl -C - -O http://link_to.torrent
and am concluding that something is happening to the file when I download it this way.
The reason I think this is that I found a site which will decode a .torrent file you upload and display the info contained in the file. However, when I download a .torrent file not by clicking the link in a browser but by using one of the methods described above, it does not work either.
So, has anyone experienced a similar problem using one of these methods and found a solution, or can you explain why this is happening?
I can't find much about it online, nor do I know of a workaround I can use on my server.
Update:
Okay, as suggested by coder543, I compared the file size of the browser download vs. wget. They are not the same size: downloading with wget results in a smaller file, so clearly the problem is with wget & curl and not something else... ideas?
Update 2:
Okay, so I have tried this a few times now and I am narrowing the problem down a little bit: it only seems to occur with torcache and torrage links. Links from other sites seem to work as expected... so here are some links and my results from the three different methods:
*** different sizes***
http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
wget -> 7345 , curl -> 7345 , browser download -> 7376
*** same size***
http://isohunt.com/torrent_details/224634397/south+park?tab=summary
wget -> 7491 , curl -> 7491 , browser download -> 7491
*** different sizes***
http://torcache.net/torrent/B00BA420568DA54A90456AEE90CAE7A28535FACE.torrent?title=[kickass.to]the.simpsons.s24e12.hdtv.x264.lol.eztv
wget -> 4890 , curl-> 4890 , browser download -> 4985
*** same size***
http://h33t.com/download.php?id=cc1ad62bbe7b68401fe6ca0fbaa76c4ed022b221&f=Game%20of%20Thrones%20S03E10%20576p%20HDTV%20x264-DGN%20%7B1337x%7D.torrent
wget-> 30632 , curl -> 30632 , browser download -> 30632
*** same size***
http://dl7.torrentreactor.net/download.php?id=9499345&name=ubuntu-13.04-desktop-i386.iso
wget-> 32324, curl -> 32324, browser download -> 32324
*** different sizes***
http://torrage.com/torrent/D7497C2215C9448D9EB421A969453537621E0962.torrent
wget -> 7856 , curl -> 7556 , browser download -> 7888
So it seems to work well on some sites, but not on sites that rely on torcache.net and torrage.com to supply the files. It would be nice if I could just use other sites that don't rely directly on the caches, but I am working with the bitsnoop API (which pulls all its data from torrage.com, so that's not really an option). Anyway, if anyone has any idea how to solve this problem, or steps to take towards finding a solution, it would be greatly appreciated!
Even being able to reproduce the results would be appreciated!
... My server is 12.04 LTS on 64-bit architecture, and the laptop I tried the actual download comparison on is the same.
For the file retrieved using the command line tools I get:
$ file 6760F0232086AFE6880C974645DE8105FF032706.torrent
6760F0232086AFE6880C974645DE8105FF032706.torrent: gzip compressed data, from Unix
And sure enough, decompressing using gunzip will produce the correct output.
Looking at what the server sends gives an interesting clue:
$ wget -S http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
--2013-06-14 00:53:37-- http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
Resolving torrage.com... 192.121.86.94
Connecting to torrage.com|192.121.86.94|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.0 200 OK
Connection: keep-alive
Content-Encoding: gzip
So the server does report that it is sending gzip-compressed data, but wget and curl ignore this.
curl has a --compressed switch which will correctly decompress the data for you. It should be safe to use even for uncompressed responses: it just tells the HTTP server that the client supports compression, and curl looks at the received headers to decide whether the response actually needs decompression.
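For example, with one of the torrage URLs from above (the .gz name in the wget variant is just a local filename choice):
# curl can decompress transparently
curl --compressed -O http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
# wget (at least older versions) has no such switch, but the saved file can be decompressed afterwards
wget -O 6760F0232086AFE6880C974645DE8105FF032706.torrent.gz http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
gunzip 6760F0232086AFE6880C974645DE8105FF032706.torrent.gz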
