Hackers constantly pull my login.aspx - security

I have a website with protected content, and I've recently started getting a lot of email alerts about unsuccessful logins. It's getting really annoying; there's roughly one attempt every minute or two on average.
Because I host the website on a dedicated Windows box and I log the IPs, I realized that these requests are not coming from regular visitors to my site (StatCounter doesn't record these IPs at all), but from automated scripts running on many different IP addresses (mostly from Ukraine, though for most of them reverse DNS cannot resolve a hostname).
I created a blocking rule in Windows Firewall and started adding all the addresses I found in the log file, but there are a lot of them. I've already added about 50 (five batches of 10 IPs), and each batch only stops them for a few hours before new IPs show up.
I'm actually a software developer, and managing a real server is not my strong suit. Are there any tools I can use to prevent these attacks?

You should implement a rate limiter in your code.
If you get more than (for example) 4 failed login requests from the same IP in 5 minutes, require a CAPTCHA for the next login.
Google Accounts login pages do exactly this.
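If it helps to see the logic spelled out, here is a minimal sketch in Python (the site in the question is ASP.NET, so treat this as pseudocode for the equivalent C# handler; the 4-failures-in-5-minutes threshold is just the example above):

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60
MAX_FAILURES = 4

_failures = defaultdict(deque)  # IP -> timestamps of recent failed logins

def record_failed_login(ip):
    _failures[ip].append(time.time())

def captcha_required(ip):
    # Drop failures older than the window, then compare against the threshold.
    cutoff = time.time() - WINDOW_SECONDS
    recent = _failures[ip]
    while recent and recent[0] < cutoff:
        recent.popleft()
    return len(recent) >= MAX_FAILURES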

It's not a tool, but one practice that might be useful is to not include the word "Login" on your page and to not name your aspx page "Login". This can keep crawlers that look for common login-related keywords from finding your login page.
Doing something like replacing a text link that contains "Login" with an image that looks just like the text, and changing the name of your login form, might go a long way toward preventing crawlers from even finding your login page.
Example:
<img src="li.png" />

How to bypass being rate limited (Error 1015) using Python

So I have created an automation bot to do some stuff for me on the internet, using Selenium with Python. After long and grueling coding sessions, days and nights of working on this project, I have finally completed it, only to be randomly greeted with an Error 1015: "You are being rate limited".
I understand this is meant to prevent DDoS attacks, but it is a major blow.
I have contacted the website to resolve the matter, but to no avail. However, the third-party security provider they use says that the website can grant my IP an exclusion from rate limiting.
So I was wondering, is there any other way to bypass this, maybe from a coding perspective?
I don't think things like clearing cookies will resolve anything, or will they, given that it is my specific IP address they are blocking?
Note:
The terms of use of the website I am running my bot on don't say you can use automation software on it, but they don't say you can't either.
I don't mind coding some more to prevent these random access denials, which I think last for 24 hours. That can be detrimental, because the final stage of this build is to have my program run daily for long periods of time.
Do you think I could contact the third-party security provider and ask them to ask the website to grant me access? I have already tried resolving the matter with the website. All they said was that: A. on their side it says I am fine;
B. the problem is most likely on my side ("maybe some malicious software is trying to access our website"), which... malicious, no, but a bot, yes. That's what made me think maybe it would be better if I resolved the matter myself.
Do you think I may have to implement wait times between processes or something? I'm stuck.
Thanks for any help. And it's a single bot!
Being randomly greeted with Error 1015 ("You are being rate limited") implies that the site owner has implemented rate limiting that affects your visitor traffic.
Rate-limiting reason
Cloudflare can rate-limit visitor traffic to counter a possible dictionary attack.
Rate-limit thresholds
In generic cases, Cloudflare rate-limits a visitor when their traffic crosses the rate-limit threshold, which is calculated by dividing the past 24 hours of uncached website requests by the number of unique visitors in the same 24 hours, then dividing by the estimated average length of a visit in minutes, and finally multiplying by 4 (or more) to establish an estimated per-minute threshold for the website. A value higher than 4 is fine, since most attacks are an order of magnitude above typical traffic rates.
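To make that arithmetic concrete, here is the same calculation with made-up numbers (all of the inputs below are assumptions, not figures from the question):

# All inputs are invented, purely to illustrate the formula.
uncached_requests_24h = 100_000   # uncached requests in the last 24 hours
unique_visitors_24h = 5_000       # unique visitors in the same 24 hours
avg_visit_minutes = 5             # estimated average visit length in minutes
multiplier = 4                    # suggested safety factor

requests_per_visitor = uncached_requests_24h / unique_visitors_24h   # 20.0
requests_per_minute = requests_per_visitor / avg_visit_minutes       # 4.0
threshold = requests_per_minute * multiplier                         # 16.0 requests/minute
print(threshold)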
Solution
In these cases, a potential solution would be to use undetected-chromedriver to initialize the Chrome browsing context.
undetected-chromedriver is an optimized Selenium Chromedriver patch which does not trigger anti-bot services like Distill Network / Imperva / DataDome / Botprotect.io. It automatically downloads the driver binary and patches it.
Code Block:
import undetected_chromedriver as uc
from selenium import webdriver

# Regular Selenium Chrome options still apply.
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")

# uc.Chrome() downloads a matching chromedriver binary and patches it so the usual
# automation fingerprints are not exposed to anti-bot services.
driver = uc.Chrome(options=options)
driver.get('https://bet365.com')
References
You can find a couple of relevant detailed discussions in:
Selenium app redirect to Cloudflare page when hosted on Heroku
Linkedin API throttle limit
I see some possibilities for you here:
Introduce wait time between requests to the site (see the sketch after this list)
Reduce the number of requests you make
Extend your bot to detect when it hits the limit and change your IP address (e.g. by restarting your router)
The last one is the least preferable, I would assume, and also the most time-consuming.
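For the first option, a minimal sketch of pausing between Selenium requests with some jitter might look like this (the delay values are assumptions to tune against whatever threshold the site enforces):

import random
import time

def polite_get(driver, url, base_delay=5.0, jitter=3.0):
    # Load the page, then sleep a randomized interval so requests are not evenly spaced.
    driver.get(url)
    time.sleep(base_delay + random.uniform(0, jitter))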
First: read the Terms of Use of the website and look at its robots.txt, which usually sits at the root of the site, e.g. www.google.com/robots.txt . Note that going against the website owner's explicit terms may be illegal depending on jurisdiction and may result in the owner blocking your tool and/or IP.
https://www.robotstxt.org/robotstxt.html
This will let you know what the website owner explicitly allows for automation and scraping.
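For a quick programmatic check, Python's standard library can parse robots.txt for you; the URL and user-agent string below are placeholders:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://www.example.com/robots.txt")
rp.read()

if rp.can_fetch("MyBot/1.0", "https://www.example.com/some/page"):
    print("robots.txt allows fetching this page")
else:
    print("robots.txt disallows fetching this page")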
Once you've reviewed the website's terms and understand what they allow, they still do not respond to you, and you've determined you are not breaking the website's terms of use, the only real remaining option would be to utilize proxies and/or VPSs that give the system running the scripts different IPs.

Node.js brute force prevention

I have a MERN stack project running on Heroku, and today someone started to flood my server with login requests (a brute-force attack). Every request comes from a different IP address, so I cannot block by IP. This has caused a website outage.
How can I block it, then? How can I allow logins only from my own website?
A typical solution you will see used by many login pages is one of several techniques that require human-like interaction and are hard for scripts to duplicate.
You have, for sure, seen the captcha systems that ask the user to interpret some image that is not easy or practical for computers to analyze.
There is also a no-captcha system that asks the user to click a particular spot on the screen with the mouse and it analyzes the movement to see if it appears human-like. These are often shown as a click on "I'm not a robot".
Many sites (like some U.S. airlines and a number of financial sites) now require the user to set up "challenge" questions (like "Where were you born?" or "What's your favorite ice cream flavor?"), and if a login request arrives without a previously placed signed cookie for that user (or other familiar-browser detection metrics), the challenge question must be answered before you can even attempt a login.
A more draconian approach (that could have more of an impact on the end user) is to keep track of failed login attempts per account; after a certain number, you start slowing down the responses (which slows down the attacker's systems), and after some higher number of failed attempts, you immediately fail every request and require the end user to confirm their login via an email message sent to their registered email address. This is an inconvenience for the end user, but it prevents more than N guesses on any individual account without end-user confirmation. After some period of time, you can clear the prior failed-attempt count for any given account, freeing it up to work normally again.
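As a rough sketch of that per-account policy (the project in the question is Node/Express, so translate accordingly; the thresholds here are assumptions):

import time
from collections import defaultdict

SLOWDOWN_AFTER = 3   # start delaying responses after this many consecutive failures
LOCK_AFTER = 10      # require email confirmation after this many failures

failed_attempts = defaultdict(int)   # account name -> consecutive failed logins

def handle_failed_login(account):
    failed_attempts[account] += 1
    count = failed_attempts[account]
    if count >= LOCK_AFTER:
        return "require_email_confirmation"
    if count >= SLOWDOWN_AFTER:
        time.sleep(min(count - SLOWDOWN_AFTER + 1, 10))   # progressively slower responses
    return "reject"

def handle_successful_login(account):
    failed_attempts.pop(account, None)   # clear the counter on success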

How can I keep spambots from getting past multiple web security measures?

I am trying to stop spam accounts from being created on my website. I run a website that has approximately 50-80k pageviews per month. It's a social media website. Users sign up and communicate with one another for free. We've been battling with spam as of late even though we have implemented multiple security measures to counteract bots. I'd like to get any further suggestions of tips and tricks that I can try and also some help to see if I can identify if these are people coming from clickfarms, etc. (i.e. real people or computers)
Problem:
Signup form being completed and users posting spam in their profile information. A spammer signs up for the website by completing the signup form, activates their account via an email account, logs into their account, and then completes their profile, putting spam in the description box with a link/URL to the website they are advertising (everything from ##$%S enlargement to random blogs to web developer websites, etc.). If there was one link they were posting we could detect it and ban them, but they are not: they are coming from multiple IPs, posting various links, using multiple email providers for activating the accounts, registering with information from multiple countries, and creating about 10-30 accounts per day. Before implementing many security measures we were getting more like 100-200 fake accounts per day, but now we're down to 10-30, so we've seen some improvement, but the issue is still annoying me. I'm now half thinking that the security measures are helping quite a bit, but that this is possibly humans still targeting our website, perhaps getting paid per signup or something similar. Even if so, is there any way I could confirm they are humans versus bots?
Security measures:
I won't get into all of the details here (for security reasons), but I'll just indicate what we've done to counteract the spambots:
Created honeypots at various areas of our website which automatically ban based on IP
IP banning - based on known botter/spammer ip addresses
Duration detection from signup-form page load to form submission -- if the form is completed in less than 5 seconds, we treat the submitter as a bot and prevent the signup (see the sketch after this list)
Hidden checkbox in the signup form -- there is a checkbox in the signup form that is invisible to regular users; if a bot checks it, we automatically detect that and prevent the signup
Google reCAPTCHA - We've enabled Google reCAPTCHA in our signup form as well
Email activation link - We send our users an activation email with a link that they have to click on to signup -- they are not able to sign into our website until they've activated their account.
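As mentioned above, here is a server-side sketch of the timing check and the hidden (honeypot) checkbox; the field name and the way the render timestamp is stored are assumptions:

import time

MIN_FILL_SECONDS = 5

def looks_like_bot(form, rendered_at):
    # form: dict of submitted fields; rendered_at: server timestamp saved when the form was generated.
    if form.get("contact_me_by_fax"):                   # hidden checkbox no human should tick
        return True
    if time.time() - rendered_at < MIN_FILL_SECONDS:    # submitted implausibly fast
        return True
    return False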
Future actions include:
Detecting what users are posting in their descriptions in their profiles and banning based on that -- string detection for banned words, etc.
Any other suggestions or tips or tricks? In all honesty, if spam bots are getting through all of those security measures above --
do you think they are just that intelligent?
do you think we're being targeted?
Also, any way I can determine if they are bots or real humans? Suggestions?
This is a perennial problem; over the years I've found that as I add more anti-spam measures, the spammers continually get better at circumventing my measures.
I recommend doing an analysis of your spam to figure out how you can detect it. The spam itself contains the key to outsmarting it. Look at the patterns and the structure, and decide what information is most useful and what the easiest way to filter it out is. Your spam detection doesn't need to be perfect, but generally you want to catch as much as possible while getting as few false positives as possible.
Also, to answer one of your questions: even if you make your bot detection perfect, there will always be humans submitting spam. Humans are tough to outsmart, and you may always need some manual attention to deal with them.
You are already implementing a lot of measures. Here are some more I would suggest:
When a signup form is generated, put a hidden field with a unique hash generated from the user's browser info, including the user's IP, HTTP user agent, and the date. Then, when the form is submitted, check the hash. This one method eliminated a surprising amount of spam.
If you want to take the previous method even farther, use a custom, time-sensitive hash in the URL of your contact form, and have the link to this form be dynamically generated. This way, if a spammer stores the form's URL, it won't work, but the link will work for every legitimate user of the site.
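A minimal sketch of the hidden-hash-field idea, assuming an HMAC over the visitor's IP, user agent and date (the secret key and field handling are placeholders):

import hashlib
import hmac
from datetime import date

SECRET_KEY = b"replace-with-a-real-secret"   # placeholder

def form_token(ip, user_agent):
    # Embed this value in a hidden field when the form is rendered.
    msg = f"{ip}|{user_agent}|{date.today().isoformat()}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()

def token_is_valid(submitted_token, ip, user_agent):
    # Recompute on submission and compare in constant time.
    return hmac.compare_digest(submitted_token, form_token(ip, user_agent))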
Make it so newly created, non-trusted users cannot display any public profile information, such as URLs or even free text. With a site as small as yours you could require manual approval of each user, and if your user base got bigger, you could use an automated reputation system, a lot like Stack Overflow and the other Stack Exchange sites use. This removes the incentive for spam. Also, I found that an overwhelming majority of spammers only ever logged onto the site once. If you wait to do the manual approval of users until they have logged on twice, or have even returned to the site on another day (using a persistent cookie), you will filter out the vast majority of spammers and only have to do a small amount of manual approval work. Then have the system delete the unvalidated/inactive accounts after a certain amount of time.
Check for certain keywords or structures in the submitted info. I found that an overwhelming majority of my spammers would use certain words or phrases that were never used by my legitimate users. Another one was entering a phone number in their profile, a common pattern among spammers that no legitimate user ever followed. Also look for signs of foul play like XSS attacks. A huge portion of spammers will, at some point, submit something that has a ton of HTML tags in it; you can either use the tags themselves to filter them out, or you can do something like stripping the HTML tags and then comparing string lengths, banning them if the difference is more than a small amount (i.e. allow someone to do something simple like a few <em></em> or <strong></strong> tags). Usually, if there are HTML tags in the entry, there are a ton of them. Also look for material with weird encodings or characters that don't make sense; this is often an attempt at a sophisticated SQL injection attack, XSS, or another type of hacking attempt.
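The tag-stripping check can be as small as this sketch; the allowed markup budget is an assumption:

import re

MAX_MARKUP_CHARS = 40   # rough budget: a few simple <em>/<strong> tags

def too_much_html(text):
    stripped = re.sub(r"<[^>]+>", "", text)
    return len(text) - len(stripped) > MAX_MARKUP_CHARS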
Use external IP blacklists. AbuseIPDB is one example; it has an API that you can use to check new IPs before storing them in your temporary database. Their free plan allows checking up to 1,000 IPs a day and you can pay for more than that. It won't catch all the manual spam, but I find they catch a ton of the automated spam.
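A sketch of an AbuseIPDB lookup at signup time might look like the following; verify the endpoint, parameters and response fields against their current API docs before relying on it:

import requests

ABUSEIPDB_KEY = "your-api-key-here"   # placeholder

def ip_is_flagged(ip, threshold=50):
    resp = requests.get(
        "https://api.abuseipdb.com/api/v2/check",
        headers={"Key": ABUSEIPDB_KEY, "Accept": "application/json"},
        params={"ipAddress": ip, "maxAgeInDays": 90},
        timeout=5,
    )
    resp.raise_for_status()
    score = resp.json()["data"]["abuseConfidenceScore"]
    return score >= threshold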
Are they targeting you? Yes. They are targeting everyone. But any site with 50k+ pageviews a month has enough volume to be an attractive target, and the more traffic you get, the more attractive a target you will be. Even some of my tiny sites have been hit with surprisingly sophisticated attacks these days. Everyone needs to be on guard.
Good luck. I wish this weren't so much of a problem, but it is.

How to find out my site is being scraped?

Here are some indicators I've come up with so far:
Network bandwidth occupation, causing throughput problems (matches if proxy used).
When querying search engines for your keywords, new references appear pointing to other, similar resources with the same content (matches if proxy used).
Multiple requests from the same IP.
High request rate from a single IP (by the way: what is a normal rate?).
Headless or weird user agent (matches if proxy used).
Requests with predictable (equal) intervals from the same IP.
Certain support files are never requested, e.g. favicon.ico or various CSS and JavaScript files (matches if proxy used).
The client's request sequence, e.g. the client accesses pages that are not directly reachable through links (matches if proxy used).
Would you add more to this list?
What points might fit/match if a scraper uses proxying?
As a first note: consider whether it's worthwhile to provide an API for bots in the future. If you are being crawled by another company and it is information you want to provide to them anyway, an API makes your website valuable to them, reduces your server load substantially, and gives you 100% clarity on who is crawling you.
Second, coming from personal experience (I built web crawlers for quite a while): generally you can tell immediately by tracking the user agent that accessed your website. If they are using one of the automated ones, or one built from a development language, it will be noticeably different from your average user's. Not to mention tracking the log file and updating your .htaccess to ban them (if that's what you are looking to do).
Other than that, it's usually fairly easy to spot: repeated, very consistent opening of pages.
Check out this other post for more information on how you might want to deal with them, also for some thoughts on how to identify them.
How to block bad unidentified bots crawling my website?
I would also add analysis of when the requests by the same people are made. For example, if the same IP address requests the same data at the same time every day, it's likely the process is on an automated schedule and hence likely to be scraping.
Possibly add analysis of how many pages each user session has touched. For example, if a particular user on a particular day has browsed every page on your site and you deem this unusual, then perhaps that's another indicator.
It feels like you need a range of indicators, a score for each one, and a combined score to show who is most likely scraping.
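A sketch of that scoring idea, where the indicators, weights and thresholds are all assumptions to be replaced by your own log analysis:

def scraper_score(stats):
    # stats: per-IP aggregates gathered from your access log.
    score = 0
    if stats["requests_per_minute"] > 60:          # unusually high request rate
        score += 3
    if stats["interval_stddev_seconds"] < 0.5:     # suspiciously regular timing
        score += 2
    if not stats["fetched_static_assets"]:         # never requested CSS/JS/favicon
        score += 2
    if stats["pages_covered_fraction"] > 0.9:      # walked almost the whole site
        score += 2
    if stats["same_time_every_day"]:               # scheduled, automated pattern
        score += 1
    return score   # e.g. treat a score of 5 or more as "likely scraper"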

Best practice against password-list attacks on web applications

I'd like to prevent bots from hacking accounts protected by weak passwords (e.g. this happened to eBay and other big sites).
So I'll set a (mem)cached value with the IP, the number of tries, and the timestamp of the last try (expiring via memcache).
But what about bots trying to open any account with just one password? For example, the bot tries all 500,000 user accounts with the password "password123". Maybe 10 will open.
My approach was to just cache the IP with its try count and set max-tries to ~50, then delete the entry after a successful login. But then the bot could simply log in with a valid account every 49 tries to reset the lock.
Is there any way to do this right?
What do big platforms do about this?
What can I do to prevent idiots from blocking all users behind a proxy by retrying 50 times?
If there is no best practice, does this mean any platform is brute-forceable, at least given a hint about when the counters are reset?
I think you can mix your solution with captchas:
Count the number of tries per IP
In case there are too many tries from a given IP address within a given time, add a captcha to your login form.
Some sites give you maybe two or three tries before they start making you enter a captcha along with your username/password. The captcha goes away once you successfully log in.
There was a relatively good article on Coding Horror a few days ago.
While the code is focused on Django, there is some really good discussion of best-practice methods on Simon Willison's blog. He uses memcached to track IPs and login failures.
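A small sketch of that memcached approach, using the pymemcache client (the client choice, key format and thresholds are assumptions; the key's expiry gives you the counter reset for free):

from pymemcache.client.base import Client

mc = Client(("localhost", 11211))
WINDOW_SECONDS = 600   # counter lifetime; memcached drops the key after this
MAX_FAILURES = 5

def record_failure(ip):
    key = "login-failures-" + ip
    mc.add(key, "0", expire=WINDOW_SECONDS)   # no-op if the counter already exists
    mc.incr(key, 1)

def needs_captcha(ip):
    value = mc.get("login-failures-" + ip)
    return value is not None and int(value) >= MAX_FAILURES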
You could use a password strength checker when a user sets their password to make sure they're not using an easily brute-forced password.
EDIT: Just to be clear, this shouldn't be seen as a complete solution to the problem you're trying to solve, but it should be considered in conjunction with some of the other answers.
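A very small sketch of such a check at password-set time; the rules and the tiny blocklist are assumptions, and a real deployment would use a much larger list or a dedicated strength-estimation library:

import re

COMMON_PASSWORDS = {"password123", "123456", "qwerty", "letmein"}   # tiny illustrative list

def password_is_weak(password):
    if len(password) < 10:
        return True
    if password.lower() in COMMON_PASSWORDS:
        return True
    # Require at least three of: lowercase, uppercase, digits, symbols.
    classes = sum(bool(re.search(p, password)) for p in (r"[a-z]", r"[A-Z]", r"\d", r"[^\w\s]"))
    return classes < 3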
You're never going to be able to prevent a group of bots from trying this from lots of different IP addresses.
From the same IP address: I would say if you see an example of "suspicious" behavior (invalid username, or several valid accounts with incorrect login attempts), just block the login for a few seconds. If it's a legitimate user, they won't mind waiting a few seconds. If it's a bot this will slow them down to the point of being impractical. If you continue to see the behavior from the IP address, just block them -- but leave an out-of-band door for legitimate users (call phone #x, or email this address).
PLEASE NOTE: IP addresses can be shared among THOUSANDS or even MILLIONS of users!!! For example, most/all AOL users appear as a very small set of IP addresses due to AOL's network architecture. Most ISPs map their large user bases to a small set of public IP addresses.
You cannot assume that an IP address belongs to only a single user.
You cannot assume that a single user will be using only a single IP address.
Check the following question discussing best practices against distributed brute-force and dictionary attacks:
What is the best Distributed Brute Force countermeasure?
