Varnish 5.2 started reporting "500 Internal Server Error" - varnish

I have been running Varnish for some time, and about 6 months ago I added a Varnish 5.2 server that has been running perfectly.
A couple of weeks ago we started to see occasional "500 Internal Server Error" responses. Older reports suggest the cause is the server running out of internal memory.
There was a suggestion to tune the parameters, which I have tried, but I am still getting the errors. Any suggestions on where to look?
Alan
PS The suggested tuning I saw was:
-p workspace_client=160k \
-p workspace_backend=160k \
These raise the workspace sizes from the default 64k. I tried 128k and then 160k, but there was no change in the occasional errors being reported.

You can control the "maximum number of HTTP header lines" Varnish allows via the http_max_hdr parameter. The default is 64 and, in my case, setting it to 128 or 256 solved my problem. Note that, for some reason, the value seemed to need to be a power of two: with it set to 100 or 150, Varnish would not restart for me.
https://varnish-cache.org/docs/4.1/reference/varnishd.html#http-max-hdr

After much experimenting and looking at the varnishlog output:
sudo /usr/local/bin/varnishlog -n -q 'RespStatus eq 500'
I saw the error:
- RespHeader X-1-SM-None: None
- LostHeader X-1-ServerTXT: Live One
- Error out of workspace (req)
- LostHeader X-1-Cache: MISS
- Error out of workspace (Req)
- VCL_return deliver
- Timestamp Process: 1513078776.040695 0.419343 0.000086
- Error out of workspace (Req)
- LostHeader Accept-Ranges: bytes
- LostHeader Connection: keep-alive
- Error out of workspace (Req)
- Error workspace_client overflow
- RespProtocol HTTP/1.1
- RespStatus 500
- RespReason Internal Server Error
I realised that too many resp.http headers were being set in vcl_deliver. After removing or commenting out some of them, which I had been using for debugging, the problem went away.
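To spot this pattern faster, the varnishlog output can be summarised with a short script. The following is only a sketch: it assumes the log has been saved as text in the same "- Tag value" layout as the excerpt above, and the function name summarize_workspace_errors is made up for this example.

```python
from collections import Counter

# Sample lines copied from the varnishlog excerpt above.
SAMPLE_LOG = """\
- RespHeader X-1-SM-None: None
- LostHeader X-1-ServerTXT: Live One
- Error out of workspace (req)
- LostHeader X-1-Cache: MISS
- Error out of workspace (Req)
- LostHeader Accept-Ranges: bytes
- LostHeader Connection: keep-alive
- Error workspace_client overflow
- RespStatus 500
"""


def summarize_workspace_errors(log_text):
    """Count workspace-related Error records and list the headers that were lost."""
    counts = Counter()
    lost_headers = []
    for raw in log_text.splitlines():
        # Drop the leading "- " that varnishlog prints before each record.
        line = raw.lstrip("- ").strip()
        if not line:
            continue
        tag, _, value = line.partition(" ")
        if tag == "LostHeader":
            # Keep only the header name, not its value.
            lost_headers.append(value.split(":", 1)[0])
        elif tag == "Error" and "workspace" in value:
            # Lower-case so "(req)" and "(Req)" are counted together.
            counts[value.lower()] += 1
    return counts, lost_headers


if __name__ == "__main__":
    error_counts, lost = summarize_workspace_errors(SAMPLE_LOG)
    print(dict(error_counts))
    print(lost)
```

A long list of lost headers next to "out of workspace" errors points at the response headers added in VCL, rather than at the backend.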

Related

GitLab Health Check without token

I've got GitLab 10.5.6. I'd like to use Health Check information in my monitoring system. I can configure it by using the Health Check endpoints with the health check access token, but as this solution is deprecated, I want to use the IP whitelist instead, and I have some problems with it.
According to this article https://docs.gitlab.com/ee/administration/monitoring/ip_whitelist.html I edited /etc/gitlab/gitlab.rb and added this line (as this GitLab was installed around version 7 or even older I think):
gitlab_rails['monitoring_whitelist'] = ['127.0.0.0/8', '192.168.0.1', 'X.X.X.X', 'Y.Y.Y.Y']
where X.X.X.X is the IP of my computer and Y.Y.Y.Y is the IP of the server running GitLab. After that I ran the reconfiguration (gitlab-ctl reconfigure) and started testing. The logs below are from the production.log file.
Running curl http://127.0.0.1:8888/-/readiness on server Y.Y.Y.Y returns the proper JSON with the expected data:
Started GET "/-/readiness" for 127.0.0.1 at 2018-03-24 20:01:31 +0100
Processing by HealthController#readiness as /
Completed 200 OK in 27ms (Views: 0.6ms | ActiveRecord: 0.5ms)
Running curl http://Y.Y.Y.Y:8888/-/readiness on server Y.Y.Y.Y returns an error:
Started GET "/-/readiness" for Y.Y.Y.Y at 2018-03-24 21:20:04 +0100
Processing by HealthController#readiness as /
Filter chain halted as :validate_ip_whitelisted_or_valid_token! rendered or redirected
Completed 404 Not Found in 2ms (Views: 1.0ms | ActiveRecord: 0.0ms)
Accessing http://Y.Y.Y.Y:8888/-/readiness through the Firefox browser on computer X.X.X.X returns an error:
Started GET "/-/readiness" for X.X.X.X at 2018-03-24 20:03:04 +0100
Processing by HealthController#readiness as HTML
Filter chain halted as :validate_ip_whitelisted_or_valid_token! rendered or redirected
Completed 404 Not Found in 2ms (Views: 0.8ms | ActiveRecord: 0.0ms)
Accessing http://Y.Y.Y.Y:8888/-/readiness?token=ZZZZZZZZZZZZZ through the Firefox browser on computer X.X.X.X returns the proper JSON with the expected data.
I have no idea what else I can check. Maybe some further configuration is missing from /etc/gitlab/gitlab.rb, as it's quite an old GitLab instance.
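As a sanity check on the whitelist entries themselves, the matching can be reproduced outside GitLab. The sketch below is not GitLab's implementation; it only assumes each entry is read as a single address or a CIDR network. The 203.0.113.10 and 198.51.100.7 addresses are documentation placeholders standing in for the masked X.X.X.X and Y.Y.Y.Y.

```python
import ipaddress


def is_whitelisted(client_ip, whitelist):
    """Return True if client_ip falls inside any whitelist entry.

    Each entry may be a single address ("192.168.0.1") or a CIDR
    network ("127.0.0.0/8"); a bare address is treated as a /32.
    """
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


if __name__ == "__main__":
    # Placeholder values: substitute the real entries from gitlab.rb.
    whitelist = ["127.0.0.0/8", "192.168.0.1", "203.0.113.10"]
    for ip in ("127.0.0.1", "192.168.0.1", "203.0.113.10", "198.51.100.7"):
        print(ip, is_whitelisted(ip, whitelist))
```

If the address printed after "for" in production.log passes this check but the request is still rejected, the list GitLab is actually using probably differs from the one in gitlab.rb.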

wget: server returned error: HTTP/1.1 202 Accepted

While running wget from BusyBox v1.23.1, I get an error:
wget: server returned error: HTTP/1.1 202 Accepted
The wget call:
wget http://182.72.194.130:7777/device_mgr/device-mgmt/app/cnc/sno/SCNC12J001/updates?cur_fw_ver=1.1(0)7&cur_config_ver=1.0
But when I tried it on Ubuntu, it worked. How can this be resolved?
HTTP Code 202
The HyperText Transfer Protocol (HTTP) 202 Accepted response status
code indicates that the request has been received but not yet acted
upon.
In practice this can mean "got your request okay, but the resource is not yet ready",
e.g. a tape archive needs to be mounted. It is best to try again a while later. When you repeated your request on Ubuntu, the resource was probably mounted by then.
Wget has some retry parameters you can play with to delay a follow-up request: see here https://superuser.com/questions/493640/how-to-retry-connections-with-wget/689340#answer-689340
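If the client's own retry options do not cover this case, the "try again a while later" approach can be scripted. This is a minimal sketch under stated assumptions: the names fetch_until_ready and http_fetch, the attempt count and the delay are all arbitrary choices for the example, and it simply treats 202 as "not ready yet".

```python
import time
import urllib.request


def fetch_until_ready(fetch, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it returns a status other than 202.

    fetch is any callable returning a (status_code, body) tuple.
    Returns the last (status_code, body) seen, which is still 202
    if the resource never became ready within the given attempts.
    """
    status, body = None, None
    for attempt in range(attempts):
        status, body = fetch()
        if status != 202:
            break
        # Wait before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return status, body


def http_fetch(url):
    """Build a fetch callable for fetch_until_ready using urllib."""
    def fetch():
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    return fetch
```

Usage would look like fetch_until_ready(http_fetch(url)), with url set to the address from the question.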

WebDAV server using IIS: HTTP error 412

I'm setting up a WebDAV server on my Windows 7 desktop, using IIS 7.5. The aim is for a WebDAV client app on my iPhone to be able to access and sync a series of files in a given folder on the PC. It's all on my own home network, and I'll be the only one syncing files.
Anyway, I've set everything up, but here is something really odd:
If I create a file on my iPhone, sync it to the Windows desktop, and then delete it on my iPhone, the file is deleted from the Windows desktop the next time I sync, as it should be.
But if I create a file on my PC, sync it to my iPhone, and then delete it on my iPhone, the next sync won't delete the file on the PC. It returns the HTTP error code 412 (Precondition Failed).
I have enabled tracing in IIS, and looking at the request headers, I don't see the WebDAV client sending any precondition. Here is what it says:
Data about the HTTP request:
SiteId 1
AppPoolId DefaultAppPool
ConnId 1610612964
RawConnId 0
RequestURL http://192.168.1.111:80/Notebooks/filename.txt
RequestVerb DELETE
Request headers:
Headers Connection: keep-alive
Content-Length: 0
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: es-es
Host: 192.168.1.111
If-Modified-Since: Wed, 07 Oct 2015 19:37:37 GMT
If-None-Match: "1ab8c9f371d11:0"
User-Agent: Notebooks 8.1.2 ( iPhone; iOS 8.4.1; es_ES)
And here is the only thing that the module says in its warning message:
ModuleName WebDAVModule
Notification 128
HttpStatus 412
HttpReason Precondition Failed
HttpSubStatus 0
ErrorCode 0
ConfigExceptionInfo
Notification EXECUTE_REQUEST_HANDLER
ErrorCode The operation was correctly completed. (0x0)
The above were the errors shown in IIS tracing. As for the regular Web server log:
192.168.1.230 - (my PC name) [07/Oct/2015:23:13:59 +0200] "PROPFIND /Notebooks/filename.txt HTTP/1.1" 207 509
192.168.1.230 - (my PC name) [07/Oct/2015:23:13:59 +0200] "DELETE /Notebooks/filename.txt HTTP/1.1" 412 1738
I have been in touch with the developer of the iPhone WebDAV client, and he says that, in his experience, this tends to suggest permission problems. But I have checked the permissions of the files in Windows, and they are fine. By this I mean that I compared the files that I can delete over WebDAV with the ones that give me the 412 error, and both have identical permissions.
What else could it be?
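One thing that may be worth checking: the traced DELETE carries If-Modified-Since and If-None-Match headers, and both are conditional-request headers. RFC 7232 (section 3.2) says that when If-None-Match matches the current entity tag, the server responds 304 for GET and HEAD but 412 for any other method. The sketch below illustrates that rule only. It is not how IIS is implemented, the function names are made up, and it assumes the file's ETag on the server is still the one the client cached.

```python
def strip_weak(etag):
    """Drop the W/ prefix so tags can be compared with the weak comparison."""
    return etag[2:] if etag.startswith("W/") else etag


def if_none_match_status(method, if_none_match, current_etag):
    """Evaluate If-None-Match for an existing resource.

    Returns 304 or 412 when the precondition fails, or None when the
    request may proceed. Simplification: entity tags are split on
    commas, so tags that themselves contain commas are not handled.
    """
    if if_none_match is None:
        return None
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    matched = "*" in candidates or any(
        strip_weak(tag) == strip_weak(current_etag) for tag in candidates
    )
    if not matched:
        return None
    # A match means "the client already has this version".
    return 304 if method in ("GET", "HEAD") else 412


if __name__ == "__main__":
    etag = '"1ab8c9f371d11:0"'  # value taken from the traced request
    print(if_none_match_status("DELETE", etag, etag))
    print(if_none_match_status("DELETE", etag, '"some-other-tag"'))
```

If that is what is happening here, a file created on the iPhone might delete cleanly simply because the client holds no matching ETag for it, but that is a guess that would need confirming from a trace of a successful DELETE.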

Network printer doesn't accept job from Debian Linux, no errors in error_log

There is a shared printer at my workplace. We send jobs, then go to the printer and authenticate, so the printer prints your documents only when you are present at it. Periodically we change our domain passwords, so I also have to change mine in /etc/cups/printers.conf (Windows users just change their domain password). So, that's how it works.
But suddenly it stopped receiving my jobs. When I send a job I get no errors, and I see this:
sudo tail /var/log/cups/access_log
localhost - - [14/Apr/2015:12:15:14 +0300] "POST /printers/Generic-PCL-6-PCL-XL HTTP/1.1" 200 499 Create-Job successful-ok
localhost - - [14/Apr/2015:12:15:14 +0300] "POST /printers/Generic-PCL-6-PCL-XL HTTP/1.1" 200 1273674 Send-Document successful-ok
localhost - - [14/Apr/2015:12:17:59 +0300] "POST / HTTP/1.1" 200 183 Renew-Subscription successful-ok
On the CUPS page in the browser, the job state is shown as "Pending since (date/time)".
It seems the job was sent successfully, but when I got to the printer there was nothing, and no job in my queue. Our IT support fixes problems only for Windows users; those on Linux are on their own. So I don't know what to do or which logs I should inspect. Please help.
Probably some update broke it. But I have found another solution: I added the printer not via Samba but via LPD (an lpd:// DeviceURI), and it doesn't ask for a username/password:
cat /etc/cups/printers.conf
# Printer configuration file for CUPS v1.5.3
# Written by cupsd
# DO NOT EDIT THIS FILE WHEN CUPSD IS RUNNING
<DefaultPrinter KonicaMinolta>
UUID urn:uuid:0f60c08a-ecfb-326a-421c-86aa3519147b
Info MyCompany Office printer
Location WestCorridor
MakeModel Generic PostScript Printer Foomatic/Postscript (recommended)
DeviceURI lpd://Company_printer_server_address/lp
State Idle
StateTime 1429265417
Type 8433692
Accepting Yes
Shared Yes
JobSheets none none
QuotaPeriod 0
PageLimit 0
KLimit 0
OpPolicy default
ErrorPolicy stop-printer
</Printer>
If somebody can provide another solution, or some explanation of why this is so, I will be glad to see it.
As for debugging, you can get more detail in your CUPS logs by editing your /etc/cups/cupsd.conf file: find the LogLevel directive and change "info" to "debug".
Then you should restart CUPS with:
/etc/init.d/cups restart
Then your log will be in
/var/log/cups/error_log
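The cupsd.conf change can also be scripted. The following is a sketch only: set_log_level is a made-up helper that works on the file's text, and reading or writing /etc/cups/cupsd.conf (as root, with a backup) is left to the caller. It replaces an existing LogLevel line, or appends one if none is present.

```python
import re

# Matches an uncommented "LogLevel <value>" line, whatever its indentation.
LOGLEVEL_LINE = re.compile(
    r"^([ \t]*)LogLevel[ \t]+\S+[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def set_log_level(conf_text, level="debug"):
    """Return conf_text with its LogLevel directive set to the given level."""
    new_text, replaced = LOGLEVEL_LINE.subn(
        lambda match: f"{match.group(1)}LogLevel {level}", conf_text
    )
    if replaced == 0:
        # No directive found: append one at the end of the file.
        new_text = conf_text.rstrip("\n") + f"\nLogLevel {level}\n"
    return new_text


if __name__ == "__main__":
    sample = "LogLevel info\nListen localhost:631\n"
    print(set_log_level(sample))
```

Commented-out lines such as "# LogLevel info" are left untouched, since they do not start with the directive name.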

HTTP request using telnet not getting any response

We are using telnet to send an HTTP request to a server and read the response.
We noticed something strange when using telnet to send the HTTP GET request.
The first method works in most of our environments, but it does not work in one of them. The second method (using the complete URL instead of the relative path) works fine in that environment.
Method 1:
(printf "GET /test.jsp HTTP/1.0\nAccept: */*\nUser-Agent: WatchDog\n\n"; sleep 9) | telnet xx.xx.xx.xx 8093
Trying xx.xxx.xx.xx...
Connected to xx.xx.xx.xx.
Escape character is '^]'.
Connection closed by foreign host.
Method 2:
(printf "GET http://xx.xx.xx.xx:8093/test.jsp HTTP/1.0\nAccept: */*\nUser-Agent: WatchDog\n\n"; sleep 9) | telnet xx.xx.xx.xx 8093
Trying xx.xx.xx.xx...
Connected to xx.xx.xx.xx.
Escape character is '^]'.
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=91643475E80038EA8770CE6803EE320C; Path=/
Content-Type: text/html;charset=UTF-8
Content-Language: zh-US
Content-Length: 42
Date: Mon, 03 Dec 2012 04:25:09 GMT
Connection: close
The Server is Running
Connection closed by foreign host.
Why does method 1 fail in only this one environment? Is there something we need to check in that environment?
Please give your suggestions.
Thanks,
Sekhar
HTTP/1.0 (RFC 1945) specifies the line ending to be CR LF. Some servers may apply this rule overly strictly. Try sending the request with \r\n as line endings. Sending absolute URIs is also reserved for use by proxies (section 5.1.2 of RFC 1945).
If varying the line endings and URI style doesn't help, you'll have to look at the server's configuration/implementation, as I cannot see anything wrong with method 1.
Apart from the line endings, which must be \r\n, and your Accept header, which should be */* instead of /, your first request doesn't have a host name.
An HTTP 1.1 server may deny HTTP requests that do not have a host set, either in the absolute request-URI or in a Host-header.
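Putting the suggestions from both answers together, a request with CR LF line endings and a Host header can be built as in the sketch below. The function names are made up for this example, and the WatchDog user agent and the 9-second timeout are carried over from the question. Whether the Host header or the line endings is what this particular server objects to is not known.

```python
import socket


def build_get_request(path, host, port, user_agent="WatchDog"):
    """Build an HTTP/1.0 GET request using CR LF line endings and a Host header."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}:{port}",
        "Accept: */*",
        f"User-Agent: {user_agent}",
        "",  # the two empty items produce the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def send_request(host, port, path, timeout=9.0):
    """Send the request and return the raw response bytes."""
    request = build_get_request(path, host, port)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling send_request with the server address, 8093 and "/test.jsp" corresponds to method 1, but with the two changes applied.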

Resources