How would you implement a "who is online" feature?

I have been meaning to implement a "who is online" feature on my site, and I was wondering: how would you decide whether a user is online or not?
Some options are:
1. Last seen less than N minutes ago (and what is N?)
2. A Comet server with long polling
3. Something else

If you are using session variables, then the user is online if last_activity + session_expiry > current_time; otherwise the session has already expired and they are not online.
Beyond that, it depends on what people will be able to do with this "who is online" feature. You might prefer a more conservative measure to have higher confidence that the user is actually active.
But given the nature of the web, there is no surefire way to ensure the user is really online and active on your site, short of requiring user interaction every once in a while, and that would be annoying.

I would go with option 1 and allow N to be set from a configuration file. Presumably user activity is already being logged with a timestamp in some datastore, so calculating whether a user counts as online (seen less than N minutes ago) should be pretty straightforward. You may also consider a periodic AJAX request that updates the collection of online users at regular intervals.
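A minimal sketch of that check in Python, assuming activity timestamps are already being recorded; the dict and the ONLINE_WINDOW_MINUTES constant are illustrative stand-ins for your datastore and config file:

    from datetime import datetime, timedelta

    ONLINE_WINDOW_MINUTES = 5  # the configurable N; a made-up default

    def online_users(last_activity):
        """last_activity: dict of user_id -> datetime of last request."""
        cutoff = datetime.utcnow() - timedelta(minutes=ONLINE_WINDOW_MINUTES)
        return [uid for uid, seen in last_activity.items() if seen > cutoff]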

You could also use a ping method: send a lightweight AJAX request from the client to the server roughly every 30-60 seconds. Keep the request and response as small as possible to reduce bandwidth, and this should perform almost as well as the Comet method.
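A sketch of the server half of such a ping, using Flask purely for illustration; the /heartbeat route and the in-memory store are assumptions, not anything from the original answer:

    from datetime import datetime
    from flask import Flask, session

    app = Flask(__name__)
    app.secret_key = "change-me"  # required for Flask's session support

    last_activity = {}  # user_id -> datetime; a real app would share this store

    @app.route("/heartbeat", methods=["POST"])
    def heartbeat():
        user_id = session.get("user_id")
        if user_id is not None:
            last_activity[user_id] = datetime.utcnow()
        return "", 204  # empty response keeps the payload minimal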

Related

Detecting people sharing login / account information for a website

I have a website that contains a secure area accessible by logging in with account credentials. Within the secure area I host some expensive intellectual property, and I have been finding that people share their passwords with other people. Are there any existing technologies / solutions / methods that I can implement to detect these fraud patterns?
Thanks in advance for the help.
Check the geographical region: if, within some timeframe, logins arrive from regions too far apart to travel between, you know those credentials have been shared. For example: Friday morning a login from NY, Friday evening a login from China.
Check bandwidth consumption: if your site offers lots of content and a user goes over some high limit, their credentials have probably been shared. At a maximum bandwidth of 5 MB/s, 60*60*24*5 MB = 432,000 MB (about 432 GB) is the physical upper limit per user per day.
Keep a counter of live sessions so you can see how many people are logged in at the same time (see the sketch after this list). This is imprecise, because the same person can log in through multiple browsers from the same IP and have a session on each one. Still, something like 100 sessions in a day (roughly 4 logins an hour) is more than one person can plausibly produce, unless your site expects this behaviour.
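A minimal sketch of that live-session counter, with an in-memory store and a made-up threshold:

    from collections import defaultdict

    MAX_CONCURRENT_SESSIONS = 3  # hypothetical threshold
    active_sessions = defaultdict(set)  # login_id -> ids of live sessions

    def register_session(login_id, session_id):
        """Record a session; return False if the account looks shared."""
        active_sessions[login_id].add(session_id)
        return len(active_sessions[login_id]) <= MAX_CONCURRENT_SESSIONS

    def end_session(login_id, session_id):
        active_sessions[login_id].discard(session_id)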
There are several ways to approach this. But it's really going to boil down to the type of content and how often a given user really is grabbing new content. For adult websites, obviously the primary purpose of the logins is to download new content. I'm not sure about your site.
One way, and perhaps the easiest, is to simply limit the number of simultaneous downloads and/or rate limit each download.
If the files are large enough, you can impose a rate limit on how fast the data transfer takes place. Pick something that's a little slow, but not slow enough to make people mad. I would guess taking 30 seconds to download a file isn't too bad.
Then, only allow them to download 1 or 2 documents at a time per login id. People will be a bit less likely to share their password if they know that they may not be able to download something because someone else is.
Another approach would be to capture the IP address when the user signs in. Yes, I know IP addresses change, but they give you a starting point. If multiple users are active with the same login id but different IPs, you can send them an alert stating that their account has been "hacked" ;) and that you are changing the password. Change it, kick everyone out, and send the new password to the email address you have on file.
Bear in mind that you don't want to stop a user from accessing the site from work and then going home and accessing it there. So you have to make sure they are essentially online at the same time, which means seeing requests from different IPs within a minute or two of each other.
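A sketch of that simultaneity check; the two-minute window comes from the paragraph above, the rest is illustrative:

    from datetime import datetime, timedelta

    CONCURRENCY_WINDOW = timedelta(minutes=2)
    recent_requests = {}  # login_id -> list of (ip, datetime) pairs

    def looks_shared(login_id, ip):
        """True if a different IP used this login within the window."""
        now = datetime.utcnow()
        history = recent_requests.setdefault(login_id, [])
        # keep only entries inside the window, then look for another IP
        history[:] = [(i, t) for i, t in history if now - t <= CONCURRENCY_WINDOW]
        shared = any(i != ip for i, _ in history)
        history.append((ip, now))
        return shared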
A twist on this would be to detect if multiple session ids are associated with the same login. For example, when they log in, save the current session id to a table. After they log out or a timeout is reached, clear that session id.
Don't let them log in again while another session id is active. Inform them they have to wait xx minutes until the session is cleared OR that another user is currently logged in with their account.
Ask them if they want to reset the session. This allows for situations where someone accidentally closes the browser and goes back to your site. If they pick yes, then stop the currently active session, change the password and send it to the email address on file.
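A minimal sketch of that single-active-session rule, with simplified storage and the password-change step left out:

    active_session = {}  # login_id -> the session id currently allowed

    def try_login(login_id, new_session_id, force_reset=False):
        current = active_session.get(login_id)
        if current is None or force_reset:
            active_session[login_id] = new_session_id
            return "ok"
        return "already-logged-in"  # caller should offer the reset option

    def logout(login_id):
        active_session.pop(login_id, None)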
I guarantee this last one will make people stop sharing their passwords. After all, if I can't log in because someone I gave my password to is currently online, that's a pain point I'll want to stop. And if I'm the one who borrowed the password and just locked myself out because it changed, I'll either get my own account or go elsewhere: both of which are usually acceptable outcomes.
https://softwareengineering.stackexchange.com/a/442073/422609 has some detailed suggestions on this topic.
Signals that may be useful:
IP
Device identifier (via fingerprinting or other means depending on platform)
Location
User behaviour
You can also look at other means, such as using links to get people to log in, or multi-factor auth, both of which add some friction to sharing.
I would think more about what you intend to do once you detect sharing. Is the desired outcome to get them to pay per user, or per organization?
It is quite a tricky issue:
If your users change location several times a day, their IP will change, but it's still the same person.
If your user has the same location throughout the day, but connects several times, it could very well be different users, say, in an internet café.
You will have to use a combination of those: if the user changes IP frequently, look up the geographic location of each IP and check whether it is possible to travel that distance in the time between the two connections. If it is not, it's fraud.
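A sketch of that plausibility check, assuming both IPs have already been geolocated to coordinates; the 900 km/h ceiling is a rough airliner cruise speed, my assumption rather than anything from the answer:

    from math import asin, cos, radians, sin, sqrt

    MAX_PLAUSIBLE_KMH = 900.0  # roughly airliner cruise speed

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two points, in kilometres."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * asin(sqrt(a))

    def travel_is_plausible(coord_a, coord_b, hours_between):
        """coord_a, coord_b: (lat, lon) tuples for the two logins."""
        distance = haversine_km(*coord_a, *coord_b)
        return hours_between > 0 and distance / hours_between <= MAX_PLAUSIBLE_KMH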

Is WebLog Expert reliable?

My boss asked me whether WebLog Expert (http://www.weblogexpert.com/lite.htm) is reliable in calculating the average visit time of incoming visitors to a web site. Since HTTP is a stateless protocol, I think the average time might be something left to personal interpretation. Does anyone use WebLog Expert? Is the visitor's average time reliable? Does anyone understand its criteria for processing Apache logs to derive the average time?
From the WebLog Expert Lite help, the following definition:
Visitor - The program determines number of visitors by the IP addresses. If a request from an IP address came after 30 minutes since the last request from this IP, it is considered to belong to a different visitor. Requests from spiders aren't used to determine visitors.
That's a fairly useful heuristic for delimiting a visitor's visit, if all you have to go on is a timestamp and a requesting IP address. (I'm not sure how WebLog Expert determines that a visitor is a spider, but that was irrelevant to my purpose.)
However, on closer inspection I found the average visit time to be highly variable for our web app; some users request only a page or two, others are on for hours. So a single "average visit duration" metric might not give you a complete understanding of your site's traffic.
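For reference, the quoted heuristic is easy to reproduce yourself; a minimal sketch over (ip, timestamp) log entries in chronological order, with spider filtering omitted:

    from datetime import timedelta

    VISIT_GAP = timedelta(minutes=30)

    def count_visitors(entries):
        """entries: (ip, datetime) pairs in chronological order."""
        last_seen = {}  # ip -> datetime of that IP's previous request
        visits = 0
        for ip, ts in entries:
            prev = last_seen.get(ip)
            if prev is None or ts - prev > VISIT_GAP:
                visits += 1  # a gap over 30 minutes starts a new "visitor"
            last_seen[ip] = ts
        return visits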
I can't comment on that site in particular, but average time is usually calculated using some very clever bits of javascript.
You can set events on various parts of the page in javascript which fire off requests to servers. For example, when the user navigates away from a page or clicks on a link or closes the window the browser can send off a javascript request to their servers letting them know that the user has left. While this isn't 100% reliable, I think it provides a reasonable estimate for how long people spend there.
I get entirely different results if I change "Visitor session timeout".
Our internal network people (the majority of our visitors) all reach our website (an external host) from the same IP (through our ISP), so the only way to distinguish a new visitor is this timeout. Choosing 1, 5 or 10 minutes produces very different results. HIGHLY UNRELIABLE. The only thing to do is be consistent and use the same parameters for comparative results, i.e., increased/decreased traffic. By the way, the update to WebLog Expert (version 7 -> 8) threw all that out the window with entirely different counting mechanisms.

How to defend against excessive login requests?

Our team has built a web application using Ruby on Rails. It currently doesn't restrict users from making excessive login requests. We want to ignore a user's login requests for a while after several failed attempts, mainly to defend against automated robots.
Here are my questions:
1. How do I write a program or script that makes excessive requests to our website? I need it to test our web application.
2. How do I restrict a user who made several unsuccessful login attempts within a period? Does Ruby on Rails have built-in solutions for identifying a requester and tracking whether she made any recent requests? If not, is there a general way (not specific to Ruby on Rails) to identify a requester and keep track of their activity? Can I identify a user by IP address, cookies, or some other information I can gather from her machine? We also hope to distinguish normal users (who make infrequent requests) from automated robots (which make requests frequently).
Thanks!
One trick I've seen is to include form fields on the login form that CSS hacks make invisible to the user.
Automated systems/bots will still see these fields and may attempt to fill them with data. If you see any data in such a field, you immediately know it's not a legitimate user and can ignore the request.
This is not a complete security solution, but it is one trick you can add to the arsenal.
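A minimal sketch of that honeypot check; the "website" field name is a hypothetical decoy rendered invisible with CSS:

    def is_probably_bot(form_data):
        """True if the CSS-hidden decoy field was filled in."""
        # a human never sees the field, so any value means a bot
        return bool(form_data.get("website", "").strip())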
In regards to #1, there are many automation tools out there that can simulate large-volume posting to a given URL. Depending on your platform, something as simple as wget might suffice, or something as complex (relatively speaking) as a script that drives a user agent to post a given request multiple times in succession.
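For #1, a throwaway script along those lines, using Python's third-party requests library; the URL and form fields are placeholders for your login endpoint:

    import requests

    LOGIN_URL = "http://localhost:3000/login"  # hypothetical dev server

    for i in range(50):  # 50 stands in for "excessive"
        resp = requests.post(LOGIN_URL,
                             data={"username": "test", "password": f"wrong-{i}"})
        print(i, resp.status_code)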
In regards to #2, consider first the lesser issue of someone just firing multiple attempts manually. Such instances usually share a session (the actual webserver session); you should be able to track failed logins based on these session IDs and force an early failure once the volume of failed attempts breaks some threshold. I don't know of any plugins or gems that do this specifically, but even if there is none, it should be simple enough to create a solution.
If session ID does not work, then a combination of IP and UserAgent is also a pretty safe means, although individuals who use a proxy may find themselves blocked unfairly by such a practice (whether that is an issue or not depends largely on your business needs).
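A minimal sketch of that threshold tracking, where the key can be the session ID or a combined IP + User-Agent string; the limit and one-hour window are made-up values:

    from datetime import datetime, timedelta

    FAILURE_LIMIT = 5
    FAILURE_WINDOW = timedelta(hours=1)
    failures = {}  # key -> datetimes of recent failed logins

    def record_failure(key):
        failures.setdefault(key, []).append(datetime.utcnow())

    def is_blocked(key):
        cutoff = datetime.utcnow() - FAILURE_WINDOW
        recent = [t for t in failures.get(key, []) if t > cutoff]
        failures[key] = recent  # prune expired entries as a side effect
        return len(recent) >= FAILURE_LIMIT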
If the attacker is malicious, you may need to look at using firewall rules to block their access, as they are likely going to: a) use a proxy (so IP rotation occurs), b) not use cookies during probing, and c) not play nice with UserAgent strings.
RoR provides means for testing your applications, as described in A Guide to Testing Rails Applications. A simple solution is to write a test containing a loop that sends 10 (or whatever value you define as excessive) login requests. The framework provides means for sending HTTP requests or faking them.
Not many people will abuse your login system, so just remembering the IP addresses of failed logins (for an hour, or whatever period you think is sufficient) would be enough, and not too much data to store. Unless some hacker has access to a great many IP addresses... but in such a situation you'd need more serious security measures anyway, I guess.

cheat prevention for browser based xmlhttp/js/perl/php game

Let's say that in a browser-based game, completing some action increases the player's score; for simplicity, say clicking a link with a URL like increase_score.pl?amount=100 increases their score by 100. What kind of prevention is there against someone simply sending requests to the web server to execute this command:
1. over and over again, without actually doing the task of clicking the link, and
2. as a false request where amount is set to something ridiculous like 100000?
I am aware of checking HTTP_REFERER, however I know people can get around that (not sure how exactly), and other than some bounds checking for the second option I'm kind of stumped. Has anyone experienced similar problems? Solutions?
Nothing can stop them from doing this if you implement your game how you propose.
You need to implement game logic on the server and assign points only once the server validates the action.
For example: on SO, when someone votes your question up, this isn't sent as a command to increase your reputation. The web app just tells the server that user X voted question Y up. The server then validates the data and assigns the points if everything checks out. (Not to say SO is a game, but the logic required is similar.)
Short version: you can't. Every piece of data you get from the client (browser) can be manually spoofed by somebody who knows what they're doing.
You need to fundamentally re-think how the application is structured. You need to code the server side of the app in such a way that it treats every piece of data coming from the client as a pack of filthy filthy lies until it can prove to itself that the data is, in fact, plausible. You need to avoid giving the server a mindset of "If the client tells me to do this, clearly it was allowed to tell me to do this."
WRONG WAY:
Client: Player Steve says to give Player Steve one gazillion points.
Server: Okay!
RIGHT WAY:
Client: Player Steve says to give Player Steve one gazillion points.
Server: Well, let me first check to see if Player Steve is, at this moment in time, allowed to give himself one gazillion points ... ah. He isn't. Please display this "Go Fsck Yourself, Cheater" message to Player Steve.
As for telling who's logged-in, that's a simple matter of handing the client a cookie with a damn-near-impossible-to-guess value that you keep track of on the server -- but I'll assume you know how to deal with session management. :-) (And if you don't, Google awaits.)
The logic of the game (application) should be based on the rule to not trust anything that comes from the user.
HTTP_REFERER can be spoofed with any web client.
Token with cookie/session.
You could make the link dynamic by appending a hash that changes over time, and verify that the hash is correct for that time period.
This would vary in complexity depending on how often you allow clicks.
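A sketch of that rotating hash, done here as an HMAC over the user id and a time bucket; the secret and the 60-second bucket size are illustrative choices:

    import hashlib
    import hmac
    import time

    SECRET = b"server-side-secret"  # never sent to the client
    BUCKET_SECONDS = 60             # how often the link hash changes

    def link_token(user_id, offset=0):
        bucket = int(time.time()) // BUCKET_SECONDS + offset
        msg = f"{user_id}:{bucket}".encode()
        return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

    def token_is_valid(user_id, token):
        # accept the current and previous bucket to tolerate clock edges
        return any(hmac.compare_digest(token, link_token(user_id, o))
                   for o in (0, -1))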
A few things to note here.
First, server requests for something like this should be POST, not GET. GET requests are supposed to be idempotent and free of side effects, and using them to change state is actually a violation of the HTTP specification.
Secondly, what you're looking at here is the classic Client Trust Problem. You have to trust the client to send scores or other game-interval information to the server, but you don't want the client to send illegitimate data. Preventing disallowed actions is easy - but preventing foul-play data in an allowed action is much more problematic.
Ben S makes a great point about how you design the communication protocols between a client and a server like this. Allowing point values to be sent as trusted data is generally a bad idea. It's preferable to indicate that an action took place and let the server figure out how many points should be assigned, if any. But sometimes you can't get around that. Consider the scenario of a racing game: the client has to send the user's time, and that can't be abstracted away into some other call like "completedLevelFour". So what do you do now?
The token approach that Ahmet and Dean suggest is sound - but it's not perfect. Firstly, the token still has to be transmitted to the client, which means it's discoverable by the potential attacker and could be used maliciously. Also, what if your game API needs to be stateless? That means session-based token authentication is out. And now you get into the deep, dark bowels of the Client Trust Problem.
There's very little you can do to make it 100% foolproof. But you can make it very inconvenient to cheat. Consider Facebook's security model (every API request is signed). This is pretty good and requires the attacker to actually dig into your client-side code before they can figure out how to spoof a request.
Another approach is server replay. Like for a racing game, instead of just having a "time" value sent to the server, have checkpoints that also record time and send them all. Establish realistic minimums for each interval and verify on the server that all this data is within the established bounds.
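A sketch of that checkpoint validation for the racing example; the per-segment minimums are made-up numbers you would establish empirically:

    MIN_SEGMENT_SECONDS = [10.0, 12.0, 9.0, 15.0]  # hypothetical minimums

    def run_is_plausible(checkpoint_times):
        """checkpoint_times: cumulative seconds at each checkpoint."""
        if len(checkpoint_times) != len(MIN_SEGMENT_SECONDS):
            return False  # wrong number of checkpoints reported
        previous = 0.0
        for elapsed, minimum in zip(checkpoint_times, MIN_SEGMENT_SECONDS):
            if elapsed - previous < minimum:
                return False  # segment faster than physically possible
            previous = elapsed
        return True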
Good luck!
It sounds like one component of your game would need request throttling. Basically, you keep track of how fast a particular client is accessing your site and you start to slow down your responses to that client when their rate exceeds what you think is reasonable. There are various levels of that, starting at the low-level IP filters up to something you handle in the web server. For instance, Stackoverflow has a bit in the web application that catches what it thinks are too many edits too close together. It redirects you to a captcha that you need to respond to if you want to continue.
As for the other bits, you should validate all input not just for its form (e.g. it's a number) but also that the value is reasonable (e.g. less than 100, or whatever). If you catch a client doing something funny, remember that. If you catch the same client doing something funny often, you can ban that client.
Expanding on Ahmet's response, every time they load a page, generate a random key. Store the key in the user session. Add the random key to every link, so that the new link to get those 100 points is:
increase_score.pl?amount=100&token=AF32Z90
When every link is clicked, check to make sure the token matches the one in the session, and then make a new key and store it in the session. One new random key for every time they make a request.
If they give you the wrong key, they're trying to reload a page.
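A minimal sketch of that one-time token scheme, with a plain dict standing in for the server-side session:

    import secrets

    def issue_token(session):
        token = secrets.token_hex(16)  # hard-to-guess random value
        session["link_token"] = token  # embed this in every generated link
        return token

    def consume_token(session, submitted):
        valid = secrets.compare_digest(session.get("link_token", ""), submitted)
        issue_token(session)  # rotate the key on every request
        return valid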
I would suggest making a URL specific to each action. Something along the lines of:
/score/link_88_clicked/
/score/link_69_clicked/
/score/link_42_clicked/
Each of these links can do two things:
Mark in the session that the link has been clicked, so that it won't track that link again.
Add to their score.
If you want the game to only run on your server, you can also detect where the request is sent from in your receiving script, and ignore anything not coming from your domain. It becomes a real pain to tamper with your code if scores can only be submitted from your dedicated domain.
This also blocks out most of CheatEngine's tricks.

Best way to limit (and record) login attempts

Obviously some sort of mechanism for limiting login attempts is a security requisite. While I like the concept of an exponentially increasing time between attempts, what I'm not sure about is where to store the information. I'm also interested in alternative solutions, preferably not involving captchas.
I'm guessing a cookie wouldn't work, due to cookies being blocked or cleared automatically, but would sessions work? Or does it have to be stored in a database? Being unaware of what methods can be or are being used, I simply don't know what's practical.
Use two columns in your users table, 'failed_login_attempts' and 'failed_login_time'. The first increments on each failed login and resets on a successful login; the second lets you compare the current time with the time of the last failure.
Your code can use this data in the db to determine how long to lock users out, the time between allowed logins, etc.
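A sketch of the lockout check built on those two columns, adding the exponentially increasing delay the question asks about; the base delay is a made-up value:

    from datetime import datetime, timedelta

    BASE_DELAY_SECONDS = 2  # doubles with every consecutive failure

    def login_allowed(failed_login_attempts, failed_login_time):
        if failed_login_attempts == 0 or failed_login_time is None:
            return True
        wait = timedelta(
            seconds=BASE_DELAY_SECONDS * 2 ** (failed_login_attempts - 1))
        return datetime.utcnow() - failed_login_time >= wait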
Assuming Google has done the necessary usability testing (not an unfair assumption) and decided to use captchas, I'd suggest going along with them.
Increasing timeouts are frustrating when I'm a genuine user who has forgotten my password (with so many websites and their associated passwords, that happens a lot, especially to me).
Storing attempts in the database is the best solution IMHO, since it gives you an audit record of the attempted security breaches. Depending on your application, this may or may not be a legal requirement.
By recording all bad attempts you can also gather higher-level information, such as whether the requests are coming from one IP address (i.e., someone or something is attempting a brute-force attack), so you can block that IP address. This can be VERY useful information.
Once a threshold is reached, why not force them to request an email to be sent to their address (similar to 'I have forgotten my password'), or go for the CAPTCHA approach?
Answers in this post prioritize database-centered solutions because they provide a structure of records that makes auditing and lockout logic convenient.
While the answers here address guessing attacks on individual users, a major concern with this approach is that it leaves the system open to denial-of-service attacks. Not every request from anywhere in the world should trigger database work.
An alternative (or additional) layer of security should be implemented earlier in the request/response cycle to protect the application and database from performing lockout operations that can be expensive and are unnecessary.
Express-Brute is an excellent example that utilizes Redis caching to filter out malicious requests while allowing honest ones.
You know which userid is being hit: keep a counter, and when it reaches a threshold value simply stop accepting anything for that user. But that means storing an extra data value for every user.
I like the concept of an exponentially increasing time between attempts, [...]
Instead of using exponentially increasing time, you could actually have a randomized lag between successive attempts.
Maybe if you explain what technology you are using people here will be able to help with more specific examples.
A lockout policy is all well and good, but there is a balance.
One consideration is to think about the construction of usernames: are they guessable? Can they be enumerated at all?
I was on an external app pen test for a dotcom with an employee portal that served Outlook Web Access, intranet services, and certain apps. It was easy to enumerate users (the exec/management team on the web site itself, and through the likes of Google, Facebook, LinkedIn etc). Once you got the format of the logon username (first name then surname entered as a single string), I had the capability to shut hundreds of users out thanks to their three-strikes-and-out policy.
Store the information server-side. This would allow you to also defend against distributed attacks (coming from multiple machines).
You might, for example, block the login for some time, say 10 minutes after 3 failed attempts. Exponentially increasing the time sounds good to me. And yes, store the information on the server side, in the session or a database; a database is better. No cookie business, as cookies are easy for the user to manipulate.
You may also want to map such attempts against the client IP address, since it is quite possible that a valid user gets a blocked message while someone else is trying to guess that user's password with failed attempts.
