Get HTTP code when uploading a file using curl and printing curl output - Linux

I'm writing a bash script where I want to upload a file using curl. I want to write curl's stdout and stderr to a log file (to see whether the upload completes). This is what works so far:
curl -T /home/user/mydir/filename <my url>/.../filename >> /home/user/mydir/script.log 2>&1
Now I also want to get the returned HTTP code so I can use it elsewhere in my script. The following works as well:
CODE=$(curl -w '%{http_code}' -T /home/user/mydir/filename <my url>/.../filename)
How can I combine these two commands? I want to print stdout and stderr to the log file (append, to be precise) and also store the returned HTTP code in a variable for later use.
Thank you.

Can you do something like this? (With the verbose option, the first < line contains the return code.) The -o target can be - (stdout) or any other file you want:
# curl -v -o /dev/null http://www.google.com
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 142.250.76.100...
* TCP_NODELAY set
* Connected to www.google.com (142.250.76.100) port 80 (#0)
> GET / HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Wed, 10 Mar 2021 02:37:46 GMT
< Expires: -1
< Cache-Control: private, max-age=0
< Content-Type: text/html; charset=ISO-8859-1
< P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
< Server: gws
< X-XSS-Protection: 0
< X-Frame-Options: SAMEORIGIN
< Set-Cookie: 1P_JAR=2021-03-10-02; expires=Fri, 09-Apr-2021 02:37:46 GMT; path=/; domain=.google.com; Secure
< Set-Cookie: NID=211=l3bDHkrWaIiQN03V4QI2bXrwHu0ZTMxoQgB1DDhIE2EfomQO6zrAPbu-5h6L6Ru60kQh0vtAog3iykbLvvtv28r8aVYiyMapXQWTexMArSIqRVQrcwDL4PnaoivdpE1_aL0rC6gohDpxFT-yQmk3jkE7EKtdSL_Dh7Z7nsnoqBM; expires=Thu, 09-Sep-2021 02:37:46 GMT; path=/; domain=.google.com; HttpOnly
< Accept-Ranges: none
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
<
{ [686 bytes data]
100 13716 0 13716 0 0 37889 0 --:--:-- --:--:-- --:--:-- 37889
* Connection #0 to host www.google.com left intact
* Closing connection 0
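Alternatively, to combine the two commands from the question directly, here is a sketch (reusing the asker's paths, which are assumptions): send the response body to /dev/null, append curl's progress and errors (stderr) to the log, and capture only the -w output from stdout:

# stderr (progress/errors) goes to the log; stdout carries only the
# status code emitted by -w, which is captured in CODE
CODE=$(curl -w '%{http_code}' -o /dev/null -T /home/user/mydir/filename "<my url>/.../filename" 2>> /home/user/mydir/script.log)
echo "Upload returned HTTP status $CODE" >> /home/user/mydir/script.log

If the response body itself should also land in the log, point -o at a temporary file instead of /dev/null and append that file to the log afterwards.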

Related

Invalid numeric literal when running jq from script via crontab

I have a shell script that runs fine from the command line but throws an error when run from a cron job. What could be causing this error?
The following includes the crontab, the script, and the error I'm getting in /var/spool/mail.
[jira-svc ~]$ cat jira_trigger_updater.sh
#!/usr/bin/sh
tmp_file=/tmp/merge-issues/$(date --iso-8601=minutes).txt
mkdir -p /tmp/merge-issues
/usr/bin/curl -s -X GET -H "Content-Type: application/json" "https://services-gateway.g054.usdcag.aws.ray.com/project-management/rest/api/2/search?jql=filter%3D14219&fields=key,status,fixVersions" -u jira-svc:${UPDATE_TRIGGER_PASSWORD} > ${tmp_file}
/usr/bin/jq -r '.issues[] | [.key , .fields.status.name , .fields.fixVersions[].name] | join(",")' ${tmp_file} > /rational/triggers/inputs/jira_merge.csv
/usr/bin/chmod 644 /rational/triggers/inputs/jira_merge.csv
#rm -rf /tmp/merge-issues
[jira-svc ~]$ crontab -l
#*/1 * * * * /usr/bin/sh /home/jira-svc/jira_trigger_updater.sh
[jira-svc ~]$ tail -25 /var/spool/mail/jira-svc
From jira-svc@cc01-217-136.localdomain Tue Feb 8 20:10:02 2022
Return-Path: <jira-svc@cc01-217-136.localdomain>
X-Original-To: jira-svc
Delivered-To: jira-svc@cc01-217-136.localdomain
Received: by cc01-217-136.localdomain (Postfix, from userid 1001)
id 9C40168152B5; Tue, 8 Feb 2022 20:10:02 +0000 (UTC)
From: "(Cron Daemon)" <jira-svc@cc01-217-136.localdomain>
To: jira-svc@cc01-217-136.localdomain
Subject: Cron <jira-svc@cc01-217-136> /usr/bin/sh /home/jira-svc/jira_trigger_updater.sh
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Precedence: bulk
X-Cron-Env: <XDG_SESSION_ID=2172>
X-Cron-Env: <XDG_RUNTIME_DIR=/run/user/1001>
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/home/jira-svc>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=jira-svc>
X-Cron-Env: <USER=jira-svc>
Message-Id: <20220208201002.9C40168152B5@cc01-217-136.localdomain>
Date: Tue, 8 Feb 2022 20:10:02 +0000 (UTC)
parse error: Invalid numeric literal at line 13, column 0
[jira-svc ~]$
Cron runs jobs from a non-interactive, non-login shell and doesn't load environment variables from files like ~/.bashrc, ~/.bash_profile, /etc/profile, and others. You must source these files yourself if you want the environment variables defined in them.
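For example, assuming UPDATE_TRIGGER_PASSWORD is exported from ~/.bash_profile (an assumption; source whichever file actually defines it), the crontab entry could be:

# load the environment first, then run the script
*/1 * * * * . "$HOME/.bash_profile" && /usr/bin/sh /home/jira-svc/jira_trigger_updater.sh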

Upload file using curl on Linux bash not working

I have been trying for the last 15-20 days without success. Please check below whether anything needs to change on the SharePoint side (security settings) or in my command.
I am trying to upload a file from Linux to SharePoint with my SharePoint login credentials, using the cURL utility, but it is not succeeding.
Downloading a file from SharePoint works easily with the following command:
curl -k --ntlm -u domain/username:password -O https://sharepointserver.com/sites/mysite/myfile.txt
The upload command used is:
curl --ntlm -u domain/username:password --upload-file myfile.txt -k "https://sharepointserver.com/sites/mysite"
Output using verbose on:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to proxy 'proxy server' port 80 (#0)
* Trying 'Ip address'...
* Connected to proxy server (proxy Ip) port 80 (#0)
* Establish HTTP proxy tunnel to test.sharepoint.com:443
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* Server auth using NTLM with user 'domain/user'
> CONNECT test.sharepoint.com:443 HTTP/1.1
> Host: test.sharepoint.com:443
> User-Agent: curl/7.29.0
> Proxy-Connection: Keep-Alive
>
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0< HTTP/1.0 200 Connection established
< Set-Cookie: BIGipServerSquid_Dev_Pool=rd2319o0000000000000000000033441018o80; path=/; Httponly
<
* Proxy replied OK to CONNECT request
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* subject: CN=*.sharepoint.com,OU=Microsoft Corporation,O=Microsoft Corporation,L=Redmond,ST=WA,C=US
* start date: Mar 07 21:35:03 2018 GMT
* expire date: Mar 06 21:35:03 2020 GMT
* common name: *.sharepoint.com
* issuer: CN=Microsoft IT TLS CA 1,OU=Microsoft IT,O=Microsoft Corporation,L=Redmond,ST=Washington,C=US
* Server auth using NTLM with user 'domain/user'
> GET /:x:/r/sites/TheHub/ECP/Project%20Glide/Project%20Glide%20Collaboration%20site/Shared%20Documents/14.%20Document%20and%20Records%20Management/TWS%20Reports/Development HTTP/1.1
> Authorization: NTLM TlRMTVNTUAABAAAABoIIAAAAAAAAAAAAAAAAAAAAAAA=
> User-Agent: curl/7.29.0
> Host: test.sharepoint.com
> Accept: */*
>
< HTTP/1.1 401 Unauthorized
< P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
< SPRequestGuid: de7b2f9f-a041-2000-ccc6-0abdd166adb5
< request-id: de7b2f9f-a041-2000-ccc6-0abdd166adb5
< MS-CV: ny973kGgACDMxgq90WattQ.0
< Strict-Transport-Security: max-age=31536000
< X-FRAME-OPTIONS: SAMEORIGIN
< SPRequestDuration: 5
< SPIisLatency: 5
< X-Powered-By: ASP.NET
< MicrosoftSharePointTeamServices: 16.0.0.19708
< X-Content-Type-Options: nosniff
< X-MS-InvokeApp: 1; RequireReadOnly
< X-MSEdge-Ref: Ref A: 20DC3927B781464DAF54406EC7F8A4AB Ref B: LON21EDGE0516 Ref C: 2020-01-27T05:54:46Z
< Date: Mon, 27 Jan 2020 05:54:45 GMT
< Content-Length: 0
<
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host "proxyserver address" left intact
If I add -o (lowercase o), using basic authentication, the command completes successfully, but the file does not get uploaded. Below is the output for the above command with -o:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to proxy 'proxy server' port 80 (#0)
* Trying Ip address...
* Connected to 'proxy server' (IP address) port 80 (#0)
* Establish HTTP proxy tunnel to test.sharepoint.com:443
* Server auth using Basic with user 'domain/user'
> CONNECT test.sharepoint.com:443 HTTP/1.1
> Host: test.sharepoint.com:443
> User-Agent: curl/7.29.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.0 200 Connection established
< Set-Cookie: BIGipServerSquid_Dev_Pool=rd2319o00000000000000000000ffff0add1018o80; path=/; Httponly
<
* Proxy replied OK to CONNECT request
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* subject: CN=*.sharepoint.com,OU=Microsoft Corporation,O=Microsoft Corporation,L=Redmond,ST=WA,C=US
* start date: Mar 07 21:35:03 2018 GMT
* expire date: Mar 06 21:35:03 2020 GMT
* common name: *.sharepoint.com
* issuer: CN=Microsoft IT TLS CA 1,OU=Microsoft IT,O=Microsoft Corporation,L=Redmond,ST=Washington,C=US
* Server auth using Basic with user 'domain/user'
> PUT /:x:/r/sites/TheHub/ECP/Project%20Glide/Project%20Glide%20Collaboration%20site/Shared%20Documents/14.%20Document%20and%20Records%20Management/TWS%20Reports/Development HTTP/1.1
> Authorization: Basic Z2JidXBhZ3JvdXAvc2F3YW50bjozbWF0b231ml2QVNAQEBA
> User-Agent: curl/7.29.0
> Host: test.sharepoint.com
> Accept: */*
> Content-Length: 311
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
} [data not shown]
* We are completely uploaded and fine
< HTTP/1.1 301 Moved Permanently
< Content-Type: text/plain
< Location: https://test.sharepoint.com/sites/TheHub/ECP/Project%20Glide/Project%20Glide%20Collaboration%20site/Shared%20Documents/14.%20Document%20and%20Records%20Management/TWS%20Reports/Development?cid=50561376-1bb4-4f92-be7c-0a710fb78cc3
< P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
< SPRequestGuid: 857c2f9f-9013-2000-6cd2-3c84489a0ad4
< request-id: 857c2f9f-9013-2000-6cd2-3c84489a0ad4
< MS-CV: ny98hROQACBs0jyESJoK1A.0
< Strict-Transport-Security: max-age=31536000
< SPRequestDuration: 5
< SPIisLatency: 5
< X-Powered-By: ASP.NET
< MicrosoftSharePointTeamServices: 16.0.0.19708
< X-Content-Type-Options: nosniff
< X-MS-InvokeApp: 1; RequireReadOnly
< X-MSEdge-Ref: Ref A: 49395CFB685F40A184BEA4E5406D6D64 Ref B: LON21EDGE1013 Ref C: 2020-01-27T06:06:09Z
< Date: Mon, 27 Jan 2020 06:06:08 GMT
< Content-Length: 0
<
100 311 0 0 100 311 0 1323 --:--:-- --:--:-- --:--:-- 1329
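One detail worth checking (an assumption based on documented curl behaviour, not something visible in the output above): with --upload-file, curl appends the local filename to the URL only if the URL ends with a slash; otherwise the URL itself is taken as the destination path. A sketch of both variants:

# trailing slash: curl appends myfile.txt to the folder URL
curl --ntlm -u domain/username:password -k --upload-file myfile.txt "https://sharepointserver.com/sites/mysite/"
# or spell out the destination filename explicitly
curl --ntlm -u domain/username:password -k --upload-file myfile.txt "https://sharepointserver.com/sites/mysite/myfile.txt"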

CURL multipart form post - HTTP error before end of send, stop sending

I am sending a multipart form POST via curl from a Linux shell script. The request works fine in Postman, but from curl it fails with:
Internal Server Error
then
HTTP error before end of send, stop sending
I even copied the cURL for Linux Shell code directly from Postman and pasted it into the shell script, so it should be exactly the same request that Postman is making.
Here is the command:
curl --request POST \
--no-alpn \
--url https://XXXXXXXXXXX/api/v1.0/XXXXX/XXXXXX/XXXXX \
--header 'accept: text/plain' \
--header 'cache-control: no-cache' \
--header 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \
--header 'sessionid: $session_id' \
--form filename=XXXXXX.zip \
--form XXXXXX=XXXXXX \
--form file=@$file_path \
--trace-ascii /dev/stdout || exit $?
And here is the log from --trace-ascii:
https://XXXXXXXXXXXXXXXXX/api/v1.0/XXXXXX/XXXXX/XXXXXXXXX
Note: Unnecessary use of -X or --request, POST is already inferred.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0== Info: Trying XXX.XXX.XXX.XXXX...
== Info: Connected to XXXXXXXXXXXXXXXX.com (XX.XX.XX.XX) port 443 (#0)
== Info: found 148 certificates in /etc/ssl/certs/ca-certificates.crt
== Info: found 592 certificates in /etc/ssl/certs
== Info: SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
== Info: server certificate verification OK
== Info: server certificate status verification SKIPPED
== Info: common name: *.XXXXXX.com (matched)
== Info: server certificate expiration date OK
== Info: server certificate activation date OK
== Info: certificate public key: RSA
== Info: certificate version: #3
== Info: subject: OU=Domain Control Validated,CN=*.XXXXXX.com
== Info: start date: Mon, 15 Aug 2016 08:23:38 GMT
== Info: expire date: Thu, 15 Aug 2019 08:23:38 GMT
== Info: issuer: C=US,ST=Arizona,L=Scottsdale,O=GoDaddy.com\, Inc.,OU=http://certs.godaddy.com/repository/,CN=Go Daddy Secure Certificate Authority - G2
== Info: compression: NULL
=> Send header, 363 bytes (0x16b)
0000: POST /XXXXX/api/v1.0/XXXXXX/upload/XXXXXX HTTP/1.1
003b: Host: XXXXXX.XXXXXXX.com
0059: User-Agent: curl/7.47.0
0072: accept: text/plain
0086: cache-control: no-cache
009f: sessionid: $session_id
00b7: Content-Length: 1639
00cd: Expect: 100-continue
00e3: content-type: multipart/form-data; boundary=----WebKitFormBounda
0123: ry7MA4YWxkTrZu0gW; boundary=------------------------b059847fb557
0163: a899
0169:
<= Recv header, 23 bytes (0x17)
0000: HTTP/1.1 100 Continue
=> Send data, 387 bytes (0x183)
0000: --------------------------b059847fb557a899
002c: Content-Disposition: form-data; name="filename"
005d:
005f: xxxxxxxxxx.zip
006b: --------------------------b059847fb557a899
0097: Content-Disposition: form-data; name="XXXXXXXXXXX"
00cc:
00ce: XXXXXXXXXXXXXXXXXXXX
00ea: --------------------------b059847fb557a899
0116: Content-Disposition: form-data; name="file"; filename="XXXXXX.zip
0156: "
0159: Content-Type: application/octet-stream
0181:
=> Send data, 1204 bytes (0x4b4)
0000: PK........r~.K..D!....p.......output/XXXXXXX.XXXXX.....7Z..7Zux...
0040: ............{LSW....#.!`.9. F..Eh+.......JA..W.2.V...A.%>... #Q1
0080: T.....{Nb.]&..1.3M|.........w..z.]8..I.I>.....n?...\hM/.h..?oy^.
00c0: ..... ..:.>J..Q...N...*A...l`...."..N...#.P'........d..._.....L
0100: .].......z....N6.B......Y5t...Zd.V...}..l...........EC..$..e...W
0140: .V`.lV...p..d._.....S...............d`.l..}.....f[...{....`....M
0180: .....kN..[.4.2w.9.bN....q.8.'.K.......'..~........sI.....K...s.
01c0: ...U.'..d,.......>......T.5....|.$,)o'bIy{...pN.....K.o..[..cWp.
0200: c.#..B.S........d.I..P./.F..0....=4.......d..#{K$..#.^=.......
0240: *....Bi...i....8j!T......|.Ld...x....>......A...|.I.}>.....Yt=..
0280: ..Tp.q...O&.. .....Ac..V....a......f.G...!x.f.i.gu}.2i.4....NK..
02c0: .G;..k~......=*....g..c#..c.M.oW........-...vW.~#u...#....cz.bu=
0300: .."Bs.js\.z.1.....&|.MV..<a"4...IqRO.kKC.v.Gz.....].G.\.|...:om
0340: .C.v5G..X].kw..\....R/.........C.X].5<.B.\'....z.O|#.v.P\......
0380: ^...f~........9....YG~fum}....^,K.......F.vmIl....hI."h.FM.....f
03c0: ....Z...`um.}E...1;......_....yF.xV...BDh...U..z...*.o.`O..V.W.6
0400: ..kf.n...*.{..].].c~.w~K......4I.k.Y.....r.wV.................F
0440: .v..O..OPK..........r~.K..D!....p.....................output/xxxxxxx
.mldUT.....7Zux.............PK..........V...H.....
=> Send data, 48 bytes (0x30)
0000:
0002: --------------------------b059847fb557a899--
<= Recv header, 36 bytes (0x24)
0000: HTTP/1.1 500 Internal Server Error
<= Recv header, 15 bytes (0xf)
0000: Server: nginx
<= Recv header, 37 bytes (0x25)
0000: Date: Mon, 18 Dec 2017 15:15:56 GMT
<= Recv header, 26 bytes (0x1a)
0000: Content-Type: text/plain
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 24 bytes (0x18)
0000: Connection: keep-alive
100 1639 0 0 100 1639 0 2269 --:--:-- --:--:-- --:--:-- 2266<= Recv header, 29 bytes (0x1d)
0000: X-FRAME-OPTIONS: SAMEORIGIN
<= Recv header, 83 bytes (0x53)
0000: Set-Cookie: JSESSIONID=XXXXXXXXXXXXXXXXXXXXXXXX; Path=/;
0040: Secure; HttpOnly
== Info: HTTP error before end of send, stop sending
<= Recv header, 2 bytes (0x2)
0000:
<= Recv data, 106 bytes (0x6a)
0000: 64
0004: <ErrorResponse><key/><localizedMessage/><httpError>Internal Serv
0044: er Error</httpError></ErrorResponse>
<= Recv data, 5 bytes (0x5)
0000: 0
0003:
I should add that the curl command is being run from a Docker container.
There's a problem in your curl command: $session_id has single quotes around it, so the variable will never be expanded.
"Double quote" every literal that contains spaces/metacharacters and every expansion: "$var", "$(command "$var")", "${array[#]}", "a & b". Use 'single quotes' for code or literal $'s: 'Costs $5 US', ssh host 'echo "$HOSTNAME"'. See
http://mywiki.wooledge.org/Quotes
http://mywiki.wooledge.org/Arguments
http://wiki.bash-hackers.org/syntax/words
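A corrected sketch of the relevant parts of the command (double quotes so the shell expands $session_id and $file_path; @ is curl's file-upload prefix for --form):

curl --url "https://XXXXXXXXXXX/api/v1.0/XXXXX/XXXXXX/XXXXX" \
  --header 'accept: text/plain' \
  --header "sessionid: $session_id" \
  --form filename=XXXXXX.zip \
  --form file=@"$file_path"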
In my particular case the problem was that I was sending session ID surrounded in double quotes, so the server failed to parse as a number, threw an exception and rejected the request. Had to get hold of the server logs to figure that out.
Reason session ID was in double quotes was because earlier on in the code, I was setting the session ID using:
someJson | jq '.sessionId'
If you do this, jq will return the result in double quotes. To get the value without the double quotes, use:
someJson | jq -r '.sessionId'
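A quick illustration with a hypothetical payload:

echo '{"sessionId": "abc123"}' | jq '.sessionId'     # prints "abc123" (quoted)
echo '{"sessionId": "abc123"}' | jq -r '.sessionId'  # prints abc123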

Adding another column to awk output

I have a HAProxy log file with content similar to this:
Feb 28 11:16:10 localhost haproxy[20072]: 88.88.88.88:6152 [28/Feb/2017:11:16:01.220] frontend backend_srvs/srv1 9063/0/0/39/9102 200 694 - - --VN 9984/5492/191/44/0 0/0 {Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|http://subdomain.domain.com/location1} "GET /location1 HTTP/1.1"
Feb 28 11:16:10 localhost haproxy[20072]: 88.88.88.88:6152 [28/Feb/2017:11:16:10.322] frontend backend_srvs/srv1 513/0/0/124/637 200 14381 - - --VN 9970/5491/223/55/0 0/0 {Mozilla/5.0 AppleWebKit/537.36 Chrome/56.0.2924.87 Safari/537.36|http://subdomain.domain.com/location2} "GET /location2 HTTP/1.1"
Feb 28 11:16:13 localhost haproxy[20072]: 88.88.88.88:6152 [28/Feb/2017:11:16:10.960] frontend backend_srvs/srv1 2245/0/0/3/2248 200 7448 - - --VN 9998/5522/263/54/0 0/0 {another user agent with fewer columns|http://subdomain.domain.com/location3} "GET /location3 HTTP/1.1"
Feb 28 11:16:13 localhost haproxy[20072]: 88.88.88.88:6152 [28/Feb/2017:11:16:10.960] frontend backend_srvs/srv1 2245/0/0/3/2248 200 7448 - - --VN 9998/5522/263/54/0 0/0 {Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|} "GET /another_location HTTP/1.1"
I want to extract some of the fields in order to have the following output:
Field 1: Date/time
Field 2: HTTP status code
Field 3: HTTP method
Field 4: Request
Field 5: HTTP version
Field 6: Referer URL
Basically, in this particular case the output should be:
Feb 28 11:16:10 200 GET /location1 HTTP/1.1 http://subdomain.domain.com/location1
Feb 28 11:16:10 200 GET /location2 HTTP/1.1 http://subdomain.domain.com/location2
Feb 28 11:16:13 200 GET /location3 HTTP/1.1 http://subdomain.domain.com/location3
Feb 28 11:16:13 200 GET /another_location HTTP/1.1
The only problem here is extracting the Referer URL which is between curly brackets together with the user agent and they're separated by a pipe. Also, the user agent has a variable number of fields.
The only solution I could think of was extracting the referer url separately and then pasting the columns together:
requests_temp=`grep -F " 88.88.88.88:" /root/file.log | tr -d '"'`
requests=`echo "${requests_temp}" | awk '{print $1" "$2" "$3" "$11, $(NF-2), $(NF-1), $NF}' > /tmp/requests_tmp`
referer_url=`echo "${requests_temp}" | awk 'NR > 1 {print $1}' RS='{' FS='}' | awk -F'|' '{ print $2 }' > /tmp/referer_url_tmp`
paste /tmp/requests_tmp /tmp/referer_url_tmp
But I don't really like this method. Is there any other way in which I can do it using only one awk line? Maybe assign the referer url column to a variable inside awk and then using it to create the same output?
Try the solution below (here the input file is named a):
awk '/88.88.88.88/ {gsub(/"/,"",$0);split($(NF-3),a,"|"); {print $1,$2,$3,$11, $(NF-2), $(NF-1), $NF, substr(a[2],1,(length(a[2])-1))}}' a
Feb 28 11:16:10 200 GET /location1 HTTP/1.1 http://subdomain.domain.com/location1
Feb 28 11:16:10 200 GET /location2 HTTP/1.1 http://subdomain.domain.com/location2
Feb 28 11:16:13 200 GET /location3 HTTP/1.1 http://subdomain.domain.com/location3
Feb 28 11:16:13 200 GET /another_location HTTP/1.1
You can do it all at once using awk:
awk '$6 ~ /88\.88\.88\.88:[0-9]+/{
split($0,a,/[{}]/)
$0=a[1] OFS a[3]
split(a[2],b,"|")
print $1,$2,$3,$11,substr($18,2),$19,substr($20,1,length($20)-1),b[2]
}' file.log
The first split splits the variable part of the line (enclosed between { and }) into the array a.
The line is then rebuilt to have a fixed number of fields: $0=a[1] OFS a[3]
The second split extracts the URL from that variable part, splitting on the | character.
Finally, the print outputs all the needed fields. Note that the substr calls are there to remove the " characters.

Varnish Breaking Social Sharing

Facebook uses curl with the Range option to retrieve the HTML of a page for sharing. Varnish is only returning the page header info and not the HTML. This is the result roughly 75% to 80% of the time; every once in a while it returns the correct result.
Does anyone have an idea how to fix this?
Example
#curl -v -H Range:bytes=0-524288 http://americanactionnews.com/articles/huge-protest-calls-for-death-to-usa-demands-isis-control
* Trying 52.45.101.42...
* TCP_NODELAY set
* Connected to americanactionnews.com (52.45.101.42) port 80 (#0)
> GET /articles/huge-protest-calls-for-death-to-usa-demands-isis-control HTTP/1.1
> Host: americanactionnews.com
> User-Agent: curl/7.51.0
> Accept: */*
> Range:bytes=0-524288
>
< HTTP/1.1 206 Partial Content
< Content-Type: text/html; charset=utf-8
< Status: 200 OK
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Date: Fri, 23 Dec 2016 16:39:46 GMT
< Cache-Control: max-age=300, public
< X-Request-Id: 8007fb76-3878-430d-845c-06a8710ae1ae
< X-Runtime: 0.247333
< X-Powered-By: Phusion Passenger 4.0.55
< Server: nginx/1.6.2 + Phusion Passenger 4.0.55
< X-Varnish-TTL: 300.000
< X-Varnish: 65578
< Age: 0
< Via: 1.1 varnish-v4
< X-Cache: MISS
< Transfer-Encoding: chunked
< Connection: keep-alive
< Accept-Ranges: bytes
< Content-Range: bytes 0-15/16
<
* Curl_http_done: called premature == 0
* Connection #0 to host americanactionnews.com left intact
We found the answer and made some modifications to our Varnish configuration based on the following link, and it seems to be working:
https://info.varnish-software.com/blog/caching-partial-objects-varnish
It looks like it is actually your backend server (Nginx) that is the problem, especially considering the hit-rate-like success rate you mention :) Also, the failing example is a MISS (i.e. delivered from the backend server).
Make sure you don't have anything in your Nginx configuration that prevents range requests; one popular thing that breaks them is:
ssi on;
ssi_types *;
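To confirm which layer is at fault, it can help to repeat Facebook's range request directly against the backend, bypassing Varnish (the backend host and port here are assumptions):

# if this also returns a truncated Content-Range, Nginx is the culprit
curl -v -H "Range: bytes=0-524288" http://backend-host:8080/articles/huge-protest-calls-for-death-to-usa-demands-isis-control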
