Curl function cannot parse proxy coming from a variable in bash - linux

I have a proxy txt file with the format:
102.129.249.120:3128
102.129.249.120:8080
101.4.136.34:8080
103.228.117.244:8080
etc
and I am trying to create a bash script that will do (for example): curl -x "$IP" google.com.
Unfortunately, curl is giving me an unsupported proxy syntax for all the proxies.
Any ideas?
BTW, I really doubt this question has been repeated as I have tried everything else to no avail.
My script:
Number=$(wc -l < ProxyList.txt)
for ((i=1;i<=$Number;++i)) do
ip=$(head -n ${i} ProxyList.txt | tail -n +${i})
curl -p -x "$ip" 'webpage' -H 'user-agent' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Cookie: wpml_referer_url=referer; _icl_current_language=es; PHPSESSID=tpikve1vl4ued06i082vprqdo1' -H 'If-Modified-Since: Mon, 16 May 2016 07:27:13 GMT' -H 'If-None-Match: "3d6-532f08d9d7640-gzip"' -H 'Cache-Control: max-age=0' -m 6
done
A small sample of my proxy list:
102.129.249.120:3128
102.129.249.120:8080
101.4.136.34:8080
103.228.117.244:8080
103.253.27.108:80
104.45.188.43:3128
104.250.34.179:80
105.27.238.161:80
104.154.143.77:3128
110.243.20.2:9999
111.68.26.237:8080
106.104.151.142:58198
113.252.95.19:8197
115.231.31.130:80
118.69.50.154:80
118.69.50.154:443
119.81.189.194:80
119.81.189.194:8123
119.81.199.81:8123
119.81.199.83:8123
119.81.199.80:8123
12.139.101.100:80
12.139.101.101:80
119.81.199.85:31288
119.81.199.86:8123
119.81.199.87:8123
12.139.101.102:80
124.156.98.172:443
13.228.91.252:3128
138.197.157.32:3128
138.197.157.32:8080
138.68.240.218:8080
138.68.240.218:3128
138.68.60.8:8080
138.68.60.8:3128

Your input file has carriage return characters at the end of each line.
Each line in your input file ends with \r\n instead of just \n.
You can check with od:
$ head -1 ProxyList.txt | od -c
0000000 1 0 2 . 1 2 9 . 2 4 9 . 1 2 0 :
0000020 3 1 2 8 \r \n
0000026
So in your script, $ip has in fact a value of 102.129.249.120:3128\r.
You can remove the \r characters with tr for example:
while read proxy; do
curl -p -x $proxy $webpage
done < <( tr -d '\r' < ProxyList.txt )

try this:
for ip in $(cat ProxyList.txt)
do
curl -p -x "$ip" 'webpage' -H 'user-agent' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Cookie: wpml_referer_url=referer; _icl_current_language=es; PHPSESSID=tpikve1vl4ued06i082vprqdo1' -H 'If-Modified-Since: Mon, 16 May 2016 07:27:13 GMT' -H 'If-None-Match: "3d6-532f08d9d7640-gzip"' -H 'Cache-Control: max-age=0' -m 6
done
but the problem with curl might be, that should set the environment variables http_proxy and https_proxy like this:
export http_proxy=http://1.2.3.4:3128/
export https_proxy=http://1.2.3.4:3128/

As per the curl man page, the -x (or --proxy) switch can be prefixed with the protocol in front of the argument (if omitted, I assume that it defaults to http://):
-x, --proxy [protocol://]host[:port]
A simple bash script with xargs would look like:
#!/bin/bash
webpage=${1:-http://google.com}
cat ProxyList.txt \
| xargs -n1 -I{} curl -p -x http://{} "$webpage" -H 'user-agent' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Cookie: wpml_referer_url=referer; _icl_current_language=es; PHPSESSID=tpikve1vl4ued06i082vprqdo1' -H 'If-Modified-Since: Mon, 16 May 2016 07:27:13 GMT' -H 'If-None-Match: "3d6-532f08d9d7640-gzip"' -H 'Cache-Control: max-age=0' -m 6

Related

supress cat output in following cmd in shell script

curl=`-c cat <<EOS
curl -s https://api.openai.com/v1/completions
-H 'Content-Type: application/json'
-H "Authorization: Bearer foo"
-d '{
"model": "text-davinci-003",
"prompt": "$1",
"max_tokens": 4000,
"temperature": $2
}'
--insecure | jq '.choices[]'
EOS`
eval ${curl} | tee $newFileName
There is no need for command substitution and eval. They only make the things more complicated than they need to be.
I suspect that you used command substitution to get the expanded the values of $1 and $2 in the JSON but this is not the right way to do it. If $1 contains " the generated JSON becomes invalid. The same if $2 is not a number.
The code is even vulnerable to data injection. If $1 contains the value blah", "model": "foo and the value of $2 is 0, the argument passed to -d becomes:
{ "model": "text-davinci-003", "prompt": "blah", "model": "foo", "max_tokens": 4000, "temperature": 0 }
It is not valid JSON (it contains the key model twice) but most JSON parsers accept it and they usually use the last value assigned to the key (probably because they do not check for key uniqueness).
This makes the API handle this input object:
{ prompt": "blah", "model": "foo", "max_tokens": 4000, "temperature": 0 }
It is not easy to remove the vulnerability using what the shell script offers. You can attempt to replace " with \" but this leads to more problems (and there must be escaped many characters, anyway).
The easiest solution for this is to use jq or other JSON manipulating tool.
Using jq, the code would be like this:
# Generate the data object using `jq`...
# ...and store it into a separate variable for readability
DATA=$(jq -n -c --arg prompt "$1" --arg temp "$2" '{
model: "text-davinci-003",
prompt: $prompt,
max_tokens: 4000,
temperature: $temp
}')
curl -s \
https://api.openai.com/v1/completions \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer foo" \
-d "$DATA" \
--insecure |
jq '.choices[]' |
tee "$newFileName"

Bash shell script to find Robots meta tag value

I've found this bash script to check status of URLs from text file and print the destination URL when having redirections :
#!/bin/bash
while read url
do
dt=$(date '+%H:%M:%S');
urlstatus=$(curl -kH 'Cache-Control: no-cache' -o /dev/null --silent --head --write-out '%{http_code} %{redirect_url}' "$url" )
echo "$url $urlstatus $dt" >> urlstatus.txt
done < $1
I'm not that good in bash : I'd like to add - for each url - the value of its Robots meta tag (if is exists)
Actually I'd really suggest a DOM parser (e.g. Nokogiri, hxselect, etc.),
but you can do this for instance (Handles lines starting with <meta and "extracts" the value of the robots' attribute content):
curl -s "$url" | sed -n '/\<meta/s/\<meta[[:space:]][[:space:]]*name="*robots"*[[:space:]][[:space:]]*content="*\([^"]*\)"*\>/\1/p'
This will print the value of the attribute or the empty string if not available.
Do you need a pure Bash solution? Or do you have sed?
You can add a line to extract the meta header for robots from the source code of the page and modify the line with echo to show its value:
#!/bin/bash
while read url
do
dt=$(date '+%H:%M:%S');
urlstatus=$(curl -kH 'Cache-Control: no-cache' -o /dev/null --silent --head --write-out '%{http_code} %{redirect_url}' "$url" )
metarobotsheader=$(curl -kH 'Cache-Control: no-cache' --silent "$url" | grep -P -i "<meta.+robots" )
echo "$url $urlstatus $dt $metarobotsheader" >> urlstatus.txt
done < $1
This example records the original line with the meta header for robots.
If you want to put a mark "-" when the page has no meta header for robots, you can change the metarobotsheader line, and put this one:
metarobotsheader=$(curl -kH 'Cache-Control: no-cache' --silent "$url" | grep -P -i "<meta.+robots" || echo "-")
If you want to extract the exact value of the attribute, you can change that line:
metarobotsheader="$(curl -kH 'Cache-Control: no-cache' --silent "$url" | grep -P -i "<meta.+robots" | perl -e '$line = <STDIN>; if ( $line =~ m#content=[\x27"]?(\w+)[\x27"]?#i) { print "$1"; } else {print "no_meta_robots";}')"
When the URL doesn't contain any meta header for robots, it will show no_meta_robots.

Curl bash script truncated/ merging

My aim is to have bash scripts get up to get a token from a WEB API and parse that token into a post command which will then create an new account (based on user input) on that WEB API. It succesffuly gets the token but when I reference the variable in my POST command, it either cuts it off or merges it into the tab. Has anyone seen this happen before?
I have read a similar issue where the issue was the content header, I have removed that but it hasnt fixed the issue.
A warning...I am a learner so my code isn't the best!
My bash script:
#!/bin/bash
curl --insecure -X POST \
https://192.168.XX.XXX:8443/application/token \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Cache-Control: no-cache' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/xml' \
-H 'Host: 192.168.XX.XX:8443' \
-H 'cache-control: no-cache' \
-H 'client-key: 123' \
-H 'client-name: 1234' \
-H 'client-secret: 123456' \
-H 'grant_type: OAuth' \
-d '<request>
<username>johndoe</username>
<password>secret</password>
</request>' > tokenNewAcc.xml
grep 'access_token' tokenNewAcc.xml | sed 's/.*://' | sed s/,// | tr -d '"' > /home/oracle/tokenNewAcc.txt
rm tokenNewAcc.xml
value=`cat /home/oracle/tokenNewAcc.txt`
MYVAR=`cat /home/oracle/testaccounts.txt`
ID=`cut -d' ' -f1 /home/oracle/testaccounts.txt`
FNAME=`cut -d' ' -f2 /home/oracle/testaccounts.txt`
LNAME=`cut -d' ' -f3 /home/oracle/testaccounts.txt`
EMAIL=`cut -d' ' -f4 /home/oracle/testaccounts.txt`L
\
curl -k -X POST \
https://192.168.XX.XXX:8443/application/accounts \
-H 'Accept: */*' \
-H 'Content-Type: application/xml' \
-H 'Host: 192.168.XX.XXX:8443' \
-H "auth: "$value"" \
-H 'cache-control: no-cache' \
-d '<request>
<userid'$ID'</userid>
<firstname>'$FNAME'</firstname>
<lastname>'$LNAME'</lastname>
<email>'$EMAIL'</email>
</request>'
OUTPUT of POST where it has trouble:
curl -k -X POST https://192.168.36.XXX:8443/cloudminder/accounts -H 'Accept: */*' -H 'Content-Type: application/xml' -H 'Host: 192.168.36.1' -H 'cache-control: no-cache' -d '<request>ad9f23295b6
<useridconnectortest</userid>
<firstname>conn</firstname>
<lastname>connect</lastname>
<email>connectorL</email>
</request>'
As you can see, the host url is cut off, the beginning of the token (ad9f23295b6) merges into the request.

How to save the response body from cURL in a file when executing the command in a loop in a bash script?

I've created a cURL bash script in which I want to save the response body into a file called output.log, but when I open the file output.log it looks like this:
Here is my bash script:
#!/bin/bash
SECRET_KEY='helloWorld'
FILE_NAME='sma.txt'
function save_log()
{
printf '%s\n' \
"Header Code : $1" \
"Executed at : $(date)" \
"Response Body : $2" \
'==========================================================\n' > output.log
}
while IFS= read -r line;
do
HTTP_RESPONSE=$(curl -I -L -s -w "HTTPSTATUS:%{http_code}\\n" -H "X-Gitlab-Event: Push Hook" -H 'X-Gitlab-Token: '$SECRET_KEY --insecure $line 2>&1)
HTTP_STATUS=$(echo $HTTP_RESPONSE | tr -d '\n' | sed -e 's/.*HTTPSTATUS://')
save_log $HTTP_STATUS $HTTP_RESPONSE
done < $FILE_NAME
Can anyone help me get my desired output in my output.log?
From the Curl documentation:
-I, --head Show document info only
Removing the -I or replace it with -i should solve your problem

How can I use an argument in a string in a bash function

I try:
ctests() {
curl -X POST \
http://route.to.host/cucumber/execute-tests \
-H 'Authorization: Basic xxxxxxxxxxxxxxxxxxxxx' \
-H 'Content-Type: application/json' \
-H 'cache-control: no-cache' \
-d '{ "text": "cucumber! alltests products=$1" }'
}
And want to call this like
> ctests someproduct
But $1 wont resolve. I tried ${1}, but its the same.
Is there a nice solution for this?
the $1 doesn't resolve because you are using single-ticks ' which prohibit variable resolution.
use double-ticks (") instead (you'll have to escape the double-quotes inside the double-quotes; or use single-quotes within the double-quotes; depending on your context)
ctests() {
curl -X POST \
http://route.to.host/cucumber/execute-tests \
-H 'Authorization: Basic xxxxxxxxxxxxxxxxxxxxx' \
-H 'Content-Type: application/json' \
-H 'cache-control: no-cache' \
-d "{ \"text\": \"cucumber! alltests products=$1\" }"
}
quoting bash(1):
QUOTING
[...]
Enclosing characters in single quotes preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $ [...]

Resources