Unable to access some websites on Amazon service - linux

Description
I'm crawling the website bjx.com, and all of the code runs fine on my local machine. When I put the code on an Amazon server and ran it there, it failed.
What I have done
I suspect the website is blocking the server, so I tried the following:
1) curl http://guangfu.bjx.com.cn/xtgc/List.aspx?classid=583
2) wget http://guangfu.bjx.com.cn/xtgc/List.aspx?classid=583
The error message is as follows:
Resolving news.bjx.com.cn (news.bjx.com.cn)... 114.113.145.103
Connecting to news.bjx.com.cn (news.bjx.com.cn)|114.113.145.103|:80... failed: Connection timed out.
Retrying.
--2019-04-23 05:45:00-- (try: 2) http://news.bjx.com.cn/list
Connecting to news.bjx.com.cn (news.bjx.com.cn)|114.113.145.103|:80...
some reference:
https://serverfault.com/questions/124952/testing-a-website-from-linux-command-line
My question:
How can I confirm whether the website has blocked me and, if it has, what can I do to solve the issue and crawl the website? Thanks.

How about making the program fail fast with an explicit timeout?
For example, to make curl fail if it doesn't get a response within 10 seconds:
curl -m 10 <url>
And to get around the block itself, you can try running the spiders through a proxy or a VPN.
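To narrow down what kind of failure the server is seeing, a small probe can separate "timed out" from "refused" or "DNS failed". This is only a sketch: the function name probe_tcp and the 10-second default are my own choices, and the host in the usage comment is the one from the wget output above.

```python
import socket

def probe_tcp(host, port=80, timeout=10.0):
    """Classify a TCP connection attempt to host:port.

    Returns "open", "timeout", "refused", "dns_error" or "error".
    """
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # The host name could not be resolved at all.
        return "dns_error"
    except socket.timeout:
        # No answer within the timeout (packets silently dropped somewhere).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on that port.
        return "refused"
    except OSError:
        # Anything else, e.g. network unreachable.
        return "error"

# Usage (run on both the local machine and the Amazon server and compare):
# print(probe_tcp("news.bjx.com.cn", 80))
```

If this returns "open" locally but "timeout" from the server, that is consistent with the server's IP range being filtered, although it does not prove it.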

Related

Pytorch Install

I get an error when installing PyTorch.
Please help me.
CondaHTTPError: HTTP 000 CONNECTION FAILED for url https://conda.anaconda.org/pytorch/win-64/current_repodata.json
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
'https://conda.anaconda.org/pytorch/win-64'
An HTTP error occurred when trying to retrieve this URL. HTTP errors are often intermittent, and a simple retry will get you on your way
The likely cause of the HTTP error is an unstable network connection or a corporate firewall.
If it is an unstable network connection then, as the error message says, retry the installation steps that failed.
If you are behind a corporate firewall, you might need to add your proxy server to the .condarc file on your machine.
Since you are on Windows, you can open the Anaconda Prompt and run conda info to find out where the .condarc file is located.
Find the proxy by running echo %http_proxy% in the Anaconda Prompt (or echo "$http_proxy" in a bash-style shell). Copy the proxy.
Open the .condarc file and paste the proxy under the proxy_servers section.
For more details see: Anaconda Docs: Configure conda for use behind a proxy server (proxy_servers)
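As a rough sketch of the last two steps, the snippet below reads the proxy from the environment and builds a proxy_servers section in the layout the conda docs describe. The helper name condarc_proxy_section is made up for illustration, and the proxy address in the usage comment is a placeholder, not a real server.

```python
import os

def condarc_proxy_section(env=None):
    """Build a proxy_servers block for .condarc from proxy environment variables.

    Returns an empty string when no proxy variable is set.
    """
    env = os.environ if env is None else env
    lines = ["proxy_servers:"]
    for scheme in ("http", "https"):
        # Accept both the lower-case and upper-case spelling of the variable.
        value = env.get(f"{scheme}_proxy") or env.get(f"{scheme.upper()}_PROXY")
        if value:
            lines.append(f"  {scheme}: {value}")
    return "\n".join(lines) if len(lines) > 1 else ""

# Usage with a placeholder proxy address:
# print(condarc_proxy_section({"http_proxy": "http://proxy.example.com:8080"}))
```

The printed text can then be pasted into the .condarc file that conda info points to.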

Linux Lua: Permission denied on https request

I've been trying to send an HTTPS request using the ssl.https library in Lua. However, no matter what URL I give, I always get permission denied and no other values such as headers. The Linux distribution I am using is CentOS Linux 7.
Here is the example code:
local httpsocket = require("socket.http")
local httpssocket = require ("ssl.https")
local ltn12 = require("ltn12")
local res, code, response_headers, status = httpssocket.request("https://www.google.com")
module:log("info","%s %s",code.."",response_headers);
The code itself is part of a prosody plugin and the last line in this example prints this out:
permission denied <nil>
My question is how do I fix this issue so that I can access the page?
EDIT: It seems the problem might be the user the service runs as: it apparently needs root privileges, otherwise it throws an EACCES error for ports lower than 1024. Does anyone know what to do in this case?
So... after another attempt at fixing this issue, I finally found the solution. If your services are unable to send HTTP/HTTPS requests on CentOS, there is a single command that has to be run to fix this issue:
setsebool -P nis_enabled 1
For those whose issue is similar but not quite the same as mine, look in /var/log/audit/audit.log for anything related to your program, process, service, etc., then use this command:
grep <pattern_to_match_specific_log> /var/log/audit/audit.log | audit2why
This will tell you why the request was denied and how to fix it.
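If you first want a quick overview of which denials are in the log before piping them to audit2why, something like the following can summarise them. It is only a sketch and not a replacement for audit2why: it assumes AVC lines in the usual avc: denied { ... } for ... comm="..." shape, it matches the pattern as a plain substring rather than a regex, and the function name is my own.

```python
import re

# Captures the denied permissions inside { ... } and the command name in comm="...".
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?comm="(?P<comm>[^"]+)"'
)

def denied_by_command(lines, pattern):
    """Map each command name to the set of permissions SELinux denied it.

    Only lines containing `pattern` (plain substring match) are considered.
    """
    result = {}
    for line in lines:
        if pattern not in line:
            continue
        match = AVC_RE.search(line)
        if match:
            perms = match.group("perms").split()
            result.setdefault(match.group("comm"), set()).update(perms)
    return result

# Usage (reading the audit log normally requires root):
# with open("/var/log/audit/audit.log") as log:
#     print(denied_by_command(log, "prosody"))
```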

Artifactory : Error : Ping failed since server state is blocked

I have installed artifactory-HA on two nodes, and the load balancer is configured to probe Artifactory's health at frequent intervals. The ping fails. The system is built on Azure. Both the artifactory and nginx services are up and running.
Contents of $ARTIFACTORY_HOME/logs/artifactory.log
2018-10-26 18:59:22,633 [http-nio-8081-exec-5] [ERROR] (o.a.r.r.s.PingResource:76) - Ping failed since the server state is blocked
Contents of $ARTIFACTORY_HOME/logs/request.log
20181026185727|1|REQUEST|<ip>|anonymous|GET|/api/system/ping|HTTP/1.1|503|0
20181026185732|2|REQUEST|<ip>|anonymous|GET|/api/system/ping|HTTP/1.1|503|0
The requests are coming in, but I can't work out why this error is happening. Kindly help.
I had the exact same issue, and it turned out to be a license issue. As mentioned by @DarthFennec in the comments above, removing the expired trial license and adding the new license allowed the curl ping command to succeed.
/usr/bin/curl -f --insecure -u admin:password -X GET -H 'Content-Type:application/json' http://artent-01.mydomain:8081/artifactory/api/system/ping
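For checking the same endpoint from a script, the sketch below returns the bare HTTP status so that a 503 (as in the request.log above) can be told apart from a node that does not answer at all. The function name is hypothetical, the probe is anonymous like the load balancer's requests, and it deliberately ignores any proxy environment variables, which is my assumption about what you want for a node-local health check.

```python
import urllib.error
import urllib.request

# An opener with an empty proxy map ignores http_proxy/https_proxy,
# so the probe talks to the node directly.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def ping_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if no HTTP reply arrived."""
    try:
        with _DIRECT.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies (e.g. 503) are raised as HTTPError; keep the code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return None

# Usage with the host from the curl command above:
# print(ping_status("http://artent-01.mydomain:8081/artifactory/api/system/ping"))
```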

GMAIL IMAP read fails due to authentication when used in bitbucket pipeline

This call (imap_open()) has been failing consistently when I run the script inside a Docker container run by a Bitbucket pipeline.
PHP Warning: imap_open(): Couldn't open stream {imap.gmail.com:993/imap/ssl/novalidate-cert}INBOX in /opt/atlassian/pipelines/agent/build/test/tools/plib/confirm.php on line 24
PHP Fatal error: Uncaught exception 'Exception' with message 'signup: confirm failed' in /opt/atlassian/pipelines/agent/build/test/tools/plib/signup.php:24
Stack trace:
#0 /opt/atlassian/pipelines/agent/build/test/tools/test.php(51): signup(Array)
#1 {main}
thrown in /opt/atlassian/pipelines/agent/build/test/tools/plib/signup.php on line 24
PHP Notice: Unknown: Retrying PLAIN authentication after [ALERT] Please log in via your web browser: https://support.google.com/mail/acco (errflg=1) in Unknown on line 0
PHP Notice: Unknown: Retrying PLAIN authentication after [ALERT] Please log in via your web browser: https://support.google.com/mail/acco (errflg=1) in Unknown on line 0
PHP Notice: Unknown: Can not authenticate to IMAP server: [ALERT] Please log in via your web browser: https://support.google.com/mail/acco (errflg=2) in Unknown on line 0
Cannot connect to Gmail: Can not authenticate to IMAP server: [ALERT] Please log in via your web browser: https://support.google.com/mail/acco
I have followed all these instructions:
Enabled less secure apps
Enabled from https://accounts.google.com/b/0/DisplayUnlockCaptcha
Tried with and without /novalidate-cert flag
The same script works fine when run locally on macOS or even on an AWS EC2 instance, but fails when run by Bitbucket or Heroku. There is no way I can run a browser on these instances, so I cannot try the web interface, and apparently once access is enabled it should work everywhere.
Too bad, the link in the error message isn't even complete.
Any idea how to overcome this? All I want to do is to simply simulate a click on the verify link in a signup email programmatically.
As mentioned, Google does raise the "less secure apps" alert, and authentication sometimes keeps failing even after you allow it.
The best solution here is to move to the OAuth2 authentication method. Otherwise, even if you find a way around the "less secure apps" alert, it may come back on and off in the future.
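To give an idea of what the OAuth2 route looks like on the IMAP side, here is a minimal sketch in Python rather than PHP (the question's script would need the PHP equivalent). It assumes you already have a valid OAuth2 access token for the mailbox; obtaining and refreshing that token is not shown, and the function names are my own.

```python
import imaplib

def xoauth2_bytes(user, access_token):
    """Build the SASL XOAUTH2 initial response for a user and an OAuth2 access token."""
    # Fields are separated by Ctrl-A (0x01) and the string ends with two of them.
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode("utf-8")

def open_gmail_inbox(user, access_token):
    """Log in to Gmail over IMAP with XOAUTH2 instead of a password."""
    conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    # imaplib base64-encodes whatever the callback returns before sending it.
    conn.authenticate("XOAUTH2", lambda _challenge: xoauth2_bytes(user, access_token))
    conn.select("INBOX")
    return conn

# Usage, assuming `token` was obtained through an OAuth2 flow elsewhere:
# inbox = open_gmail_inbox("someone@example.com", token)
```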

Jenkins Error 128 / Git Error 403: Jenkins can't connect to my Bitbucket repository

OS: Ubuntu 16.04
Hypervisor: VirtualBox
Network configuration: NAT Network with port forwarding to access the VMs through the host IP. I can also ping a VM from another VM.
I am trying to connect my Jenkins app, hosted on a VM, to my Bitbucket server, also on a VM. I followed a tutorial on the internet, but when I enter the address of my Git repository I get this:
Failed to connect to repository : Command "usr/bin/git ls-remote -h http://admin@192.168.6.102:8005/scm/tes/repository-test.git HEAD" returned status code 128:
stdout:
stderr: fatal: unable to access 'http://admin@192.168.6.102:8005/scm/tes/repository-test.git/': The requested URL returned error: 403
So, to be sure, I tried to execute the command in the terminal... and in the terminal it seems to work. I can also push, clone, pull, etc.
In this image you can see that it's true.
Do you have an explanation?
EDIT:
I tried some other things, such as running with and without sudo, to see whether the permission problem came from that, and it seems that it does not.
But I see that there is no result when we use the "HEAD" argument.
Do you think that, because "HEAD" gives no result, Git in Jenkins interprets it as no answer and returns the damn error 403?
EDIT 2:
I found this on the web: http://jenkins-ci.361315.n4.nabble.com/Jenkins-GIT-ls-remote-error-td4646903.html
The guy has the same problem but in a different way. I will try to allocate more RAM to see if it does the trick.
There could be many possible problems, but you are getting 403 (Forbidden), which indicates a problem with permissions. I would suggest checking the common mistakes first:
a) try https instead of http - my scm only uses https,
b) check that admin is the correct user - scm by default uses scmadmin.
Here I ran the exact same command twice.
The first time I used the proxy configuration which I need to access the internet, and the second time I set the proxy server to "none".
So there is a problem with the damn proxy.
I thought the proxy was not used for a NAT connection with VirtualBox...
I found the solution.
I had to reinstall Jenkins to get a user named "jenkins" with its own home directory.
I don't know if it is related or not, but I configured my Bitbucket server to use only HTTPS with a self-signed certificate (I work on a LAN).
My problem was linked to my proxy settings.
I disabled all my proxy settings in Linux, and I was then able to run in the terminal the command that didn't work in Jenkins.
When I logged in with sudo su jenkins, the commands also worked.
I found out that in the home directory of the jenkins user there was a "proxy.xml" file. I opened it and saw my old proxy settings.
I deleted all the content with vim, saved, and restarted, and the error was gone.
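The "same command, with and without proxy" comparison can also be scripted, roughly as below. This is a sketch under assumptions: it only strips the usual proxy environment variables, so it will not catch a proxy configured in git config or in Jenkins' own proxy.xml, and the helper names are mine.

```python
import os
import subprocess

# Environment variables that commonly carry proxy settings (compared case-insensitively).
PROXY_VARS = ("http_proxy", "https_proxy", "ftp_proxy", "all_proxy", "no_proxy")

def without_proxy_vars(env=None):
    """Return a copy of the environment with all proxy variables removed."""
    env = dict(os.environ if env is None else env)
    return {key: value for key, value in env.items() if key.lower() not in PROXY_VARS}

def ls_remote(repo_url, env):
    """Run git ls-remote -h under the given environment; return (exit code, stderr)."""
    proc = subprocess.run(
        ["git", "ls-remote", "-h", repo_url],
        env=env, capture_output=True, text=True,
    )
    return proc.returncode, proc.stderr

# Usage: run once with the current environment and once without proxies, then compare.
# url = "http://admin@192.168.6.102:8005/scm/tes/repository-test.git"
# print(ls_remote(url, dict(os.environ)))
# print(ls_remote(url, without_proxy_vars()))
```

Running it as the jenkins user (for example after sudo su jenkins) shows whether that account sees a different result than your own.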
There could be a Git version mismatch.
I would suggest updating Git; that may resolve your issue.

Resources