What do unrecognized GET requests mean? - node.js

It's a new Amazon EC2 instance, it's live maybe a couple of hours a day, I've just installed NodeJS and still playing with it. And then I get this in my Putty SSH window:
I don't know what those two last requests mean. I don't have robots.txt and I definitely don't have any HTML file whatsoever (it's all in Jade).
Do I havet to be concerned?

Its probably a bot. Generally you don't need to worry about it, unless they are generating to many requests to the point that it causes performance issues with your application.

Related

Is running flask in the default server safe (security-wise)?

I'm thinking about running a very simple Flask server locally (using the default development server) and then opening it up to the web via http://localhost.run/. I intend for this to be a personal webhook server and nothing else.
I've seen related questions before (for example, Is the server bundled with Flask safe to use in production?, etc), but:
I don't care that it won't scale well
I don't expect to get more than one request at a time
I will not be using the debugging mode
My question is this: How safe will my computer and local network be if I do this? I will probably limit requests to POST requests and check to see that they have a special key, or something like that, and the only thing I'm going to be doing with the webhooks is displaying a notification.
Short of someone from the Pallets Project speaking up, the official word on the recommendation is
https://flask.palletsprojects.com/en/1.1.x/tutorial/deploy/#run-with-a-production-server
If you have enough access to a server to permit running something that'll listen to a socket, the step of adding a WSGI server isn't a big one. The link above recommends waitress (and provides instructions), but gunicorn and uwsgi will work, too.
Adding my opinion:
Parsing HTTP and dealing with edge-cases is hard, so why should Flask/Werkzeug spend effort dealing with edge cases when there are WSGI front-ends that already take on the responsibility? In their position (which I'm not), I'd punt scaling and security to WSGI servers, and focus on making an excellent framework.

Determining Website Crash Time on Linux Server

2.5 months ago, I was running a website on a Linux server to do a user study on 3 variations of a tool. All 3 variations ran on the same website. While I was conducting my user study, the website (i.e., process hosting the website) crashed. In my sleep-deprived state, I unfortunately did not record when the crash happened. However, I now need to know a) when the crash happened, and b) for how long the website was down until I brought it back up. I only have a rough timeframe for when the crash happened and for long it was down, but I need to pinpoint this information as precisely as possible to do some time-on-task analyses with my user study data.
The server runs Linux 16.04.4 LTS (GNU/Linux 4.4.0-165-generic x86_64) and has been minimally set up to run our website. As such, it is unlikely that any utilities aside from those that came with the OS have been installed. Similarly, no additional setup has likely been done. For example, I tried looking at a history of commands used in hopes that HISTTIMEFORMAT was previously set so that I could see timestamps. This ended up not being the case; while I can now see timestamps for commands, setting HISTTIMEFORMAT is not retroactive, meaning I can't get accurate timestamps for the commands I ran 2.5 months ago. That all being said, if you have an idea that you think might work, I'm willing to try (as long as it doesn't break our server)!
It is also worth mentioning that I currently do not know if it's possible to see a remote desktop or something of the like; I've been just ssh'ing in and use the terminal to interact with the server.
I've been bouncing ideas off with friends and colleagues, and we all feel that there must be SOMETHING we could use to pinpoint when the server went down (e.g., network activity logs showing spikes around the time that the user study began as well as when the website was revived, a log of previous/no longer running processes, etc.). Unfortunately, none of us know about Linux logs or commands to really dig deep into this very specific issue.
In summary:
I need a timestamp for either when the website crashed or when it was revived. It would be nice to have both (or otherwise determine for how long the website was down for), but this is not completely necessary
I'm guessing only a "native" Linux command will be useful since nothing new/special has been installed on our server. Otherwise, any additional command/tool/utility will have to be retroactive.
It may or may not be possible to get a remote desktop working with the server (e.g., to use some tool that has a GUI you interact with to help get some information)
Myself and my colleagues have that sense of "there must be SOMETHING we could use" between various logs or system information, such at network activity, process start times, etc., but none of us know enough about Linux to do deep digging without some help
Any ideas for what I can try to help figure out at least when the website crashed (if not also for how long it was down)?
A friend of mine pointed me to the journalctl command, which apparently maintains timestamps of past commands separately from HISTTIMEFORMAT and keeps logs that for me went as far back as October 7. It contained enough information for me to determine both when I revived my Node js server as well as when my Node js server initially went down

Measure speed between two exact sites

I would really like to measure connection speed between two exact sites. Naturally one of the sites is our site. Somehow I need to prove that not our internet connection is flaky, but that a site at the other end, is overcrowded.
At our end I have windows and linux machines available for this.
I imagine I would run a script at certain times of day which - for example - tries to download an image from that site and try to measure download time. Then put the download time into a database then create a graph from the records in the database. (I know that this is really simple and not sophisticated enough, but hence my question)
I need help on the time measurement.
The felt speed differences are big, sometimes the application works flawlessly, but sometimes we get timed out errors.
Now I use speedtest to check if our internet connection is OK, but this does not show that the site that is not working is slow, and now I can't provide hard numbers to assist my situation.
Maybe it is worth mentioning that the application we try to use at the other end is java based.
Here's how I would do it in Linux:
Use wget to download whatever URL you think represents your sites best. Parse the output into a file (sed, awk), use crontab to trigger the download multiple times.
wget www.google.com
...
2014-02-24 22:03:09 (1.26 MB/s) - 'index.html' saved [11251]

Linux: What should I use to run terminal programs based on a calendar system?

Sorry about the really ambiguous question, I really have no idea how to word it though hopefully I can give you more detail here.
I am developing a project where a user can log into a website and book a server to run a game for a specific amount of time. When the time is up the server stops running and the players on the server are kicked off. The website part is not a problem, I am doing this in PHP and everything works. It has a calendar system to book a server and can generate config files based on what the user wants.
My question is what should I use to run the specific game server on the linux box with those config files at the correct time? I have got this working with bash scripts and cron, but it seems very un-elegant. It literally uses FTP to connect to the website so it can download all the necessary config files and put them in a folder for that game and time. I was wondering if there was a better way of doing this. Perhaps writing a program in C, but I am not sure how to go about doing this.
(I am not asking for someone to hold my hand and tell me "write this code here", just some ideas of a better way of approaching this problem)
Thanks so much guys!
Edit: The webserver is a totaly different machine. I would theoreticaly like to have more than one game server where each of them "connects" (at the moment FTP) to the webserver, gets a file saying what it has to do at a specific time and downloads any associated files then disconnects.
I think at is better suited for running one time jobs than cron.
For a better approach for the downloading files etc, you should give more details on your setup (like, the website and the game server, are they on the same machine? Or the same network? etc etc.
You need a distributed task scheduler. With that, you can:
Schedule command "X" to be run at a certain time.
Specify the machine (or ask it to pick a machine from a pool of available machines)
Webserver would send request to this scheduler via command line or via web service when user selects a game server and a time.
You can have a look at : http://www.acelet.com/super/SuperWatchdog/index.html
EDIT :
One more option :http://jobscheduler.sourceforge.net/

Call Visitors web stat program from PHP

I've been looking into different web statistics programs for my site, and one promising one is Visitors. Unfortunately, it's a C program and I don't know how to call it from the web server. I've tried using PHP's shell_exec, but my web host (NFSN) has PHP's safe mode on and it's giving me an error message.
Is there a way to execute the program within safe mode? If not, can it work with CGI? If so, how? (I've never used CGI before)
Visitors looks like a log analyzer and report generator. Its probably best setup as a chron job to create static HTML pages once a day or so.
If you don't have shell access to your hosting account, or some sort of control panel that lets you setup up chron jobs, you'll be out of luck.
Is there any reason not to just use Google Analytics? It's free, and you don't have to write it yourself. I use it, and it gives you a lot of information.
Sorry, I know it's not a "programming" answer ;)
I second the answer of Jonathan: this is a log analyzer, meaning that you must feed it as input the logfile of the webserver and it generates a summarization of it. Given that you are on a shared host, it is improbable that you can access to that file, and even if you would access it, it is probable that it contains then entries for all the websites hosted on the given machine (setting up separate logging for each VirtualHost is certainly possible with Apache, but I don't know if it is a common practice).
One possible workaround would be for you to write out a logfile from your pages. However this is rather difficult and can have a severe performance impact (you have to serialize the writes to the logfile for one, if you don't want to get garbage from time to time). All in all, I would suggest going with an online analytics service, like Google Analytics.
As fortune would have it I do have access to the log file for my site. I've been able to generate the HTML page on the server manually - I've just been looking for a way to get it to happen automatically. All I need is to execute a shell command and get the output to display as the page.
Sounds like a good job for an intern.
=)
Call your host and see if you can work out a deal for doing a shell execute.
I managed to solve this problem on my own. I put the following lines in a file named visitors.cgi:
#!/bin/sh
printf "Content-type: text/html\n\n"
exec visitors -A /home/logs/access_log

Resources