If logs are given in a specified format and you are supposed to investigate them to identify malicious activity, where can a beginner start? Is there software that can identify malicious activity? In my case, however, I am supposed to do it with Pandas, NumPy, etc.
Please point me to a path where I can start my research.
Install a logging tool like the Elastic Stack; it will make viewing and searching the logged events much easier. There is also ElastAlert, which sits on top of it and can send alerts for things like frequently repeated events in the logs.
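Since the question explicitly asks for Pandas/NumPy rather than a full logging stack, here is a minimal sketch of a starting point, assuming the common "combined" access log format and a hypothetical access.log file; adjust the regex to whatever format your logs actually use:

import re
import pandas as pd

# Assumes the common "combined" access log format:
#   IP - - [timestamp] "METHOD path HTTP/x.x" status size ...
# Adjust the pattern to your actual log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_log(path):
    rows = []
    with open(path) as fh:
        for line in fh:
            m = LINE_RE.match(line)
            if m:
                rows.append(m.groupdict())
    df = pd.DataFrame(rows)
    df["status"] = df["status"].astype(int)
    return df

df = parse_log("access.log")  # hypothetical file name

# Two crude first-pass signals:
# 1) clients sending far more requests than is typical,
hits = df["ip"].value_counts()
print(hits[hits > hits.mean() + 3 * hits.std()])
# 2) the requests that fail most often (4xx/5xx responses).
print(df[df["status"] >= 400]["request"].value_counts().head(10))

From there you can iterate: look at requests per IP per minute, error ratios, or unusual URL patterns.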
The idea is that I want a program that can edit a file yet I, the programmer, cannot edit or forge the file. Encrypting the file is an obvious choice, but even then, I'll still have to keep the encryption key secret from myself somehow.
Obscuring the secret doesn't seem to work, because I could just use the de-obscuring part of the code that I would need for the program.
I'm asking because I'm trying to make a program that will keep me productive by monitoring my activities and telling my friends/boss/family just how terrible a procrastinator I am if I don't live up to the goals I set the previous day (in other words: present me can force future me not to procrastinate).
It seems the content of the program doesn't matter that much; what you want to ensure is that the timestamp and content of the log can't be forged. I suggest writing the log to some external site where you can add data but not delete it.
Writing false values to the log can only be prevented by having a log that progresses over time. For example, if you hide expenses from your bank account, you'll run into problems because future balances will be lower than expected.
For short pieces of information like your account balance, just write it to some public site like Twitter. As far as I know, it's not possible to backdate a tweet so that it looks like it was sent some time before.
For more complex data, like the progress of a software development project, push your changes with a version control system like git to a remote repository where you can't delete or overwrite history.
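If the key property is a log that "progresses by time", one way to make even the local copy tamper-evident (not tamper-proof) is a hash chain, where each entry commits to everything before it. A minimal sketch of the idea in Python:

import hashlib
import json
import time

def append_entry(log, data):
    """Append an entry whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry = {"time": time.time(), "data": data, "prev": prev_hash}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)
    return entry

def verify(log):
    """Recompute the chain; any edited or removed entry breaks it."""
    prev = "0" * 64
    for entry in log:
        body = {k: entry[k] for k in ("time", "data", "prev")}
        if entry["prev"] != prev or entry["hash"] != hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, "worked 2 hours on project X")

Publish the newest hash somewhere you can't edit (a tweet, a pushed commit) and any later rewrite of old entries will disagree with the published value.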
Update: As you explained in the comments, you want to log distinct data on your computer that could be forged into anything. IMHO it's virtually impossible to write a program yourself that runs on a machine where you have root but that you cannot control. The only kind of software that is somewhat similar to your request is DRM software that calls home to prevent software "piracy". You would need a binary written by somebody else, or with the source code deleted, using some kind of encrypted and obfuscated network communication that you can't understand.
I think there is not much hope for this approach. Better to learn to control yourself and not answer random questions from strangers on Stack Overflow, ehem.
I'm working on forensic analysis of web logs. I have generated a DoS attack dataset, and I also have an unlabeled attack dataset of log files taken from Dr. Anton Chuvakin. I need to look at access logs and error log files that record various attacks such as XSS, XSRF, SQLi, etc. I want to know which field is most useful for finding those attacks, and which data mining / machine learning technique is suitable for identifying attacks in log files. Please suggest some ideas and materials; I'm struggling a lot to identify a suitable algorithm.
Your question is quite broad. You need to specify which attacks you are monitoring, because they get logged into several logs, ranging from the system logs you mentioned to Apache logs, application server logs, etc.
A good way to start would be to make a list of every application/service your server is running, as well as open access points such as FTP or SSH, and then monitor each log. If you are able to simulate attacks, do it in a separate environment and look at how the system logs these events. You can then build on that.
Another option is to install an intrusion detection system (IDS). This should be selected according to your needs and the size of the monitored system. Google "intrusion detection systems" and choose what you need.
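To make "which field" concrete for the question above: in access logs, XSS/SQLi payloads usually appear in the request URL and query string (sometimes in the referer or user-agent). A rough sketch with naive signature patterns, assuming the log is already parsed into a pandas DataFrame df with a "request" column; these patterns miss encoded payloads, so treat them as a starting point rather than a detector:

import pandas as pd

# df is assumed to hold the parsed access log, with a "request"
# column containing method + URL + query string.
signatures = {
    "sqli": r"(?i)(union\s+select|or\s+1\s*=\s*1|information_schema|'\s*--)",
    "xss": r"(?i)(<script|javascript:|onerror\s*=|alert\s*\()",
    "traversal": r"(\.\./){2,}",
}

for label, pattern in signatures.items():
    hits = df[df["request"].str.contains(pattern, regex=True, na=False)]
    print(label, len(hits))

Rows flagged this way can then be labeled and fed to whatever classifier you settle on.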
Links that may interest you:
Detecting Web Application attacks from log files
How to tell if a Linux server is under DDOS attack
Checking SSH logs to prevent bruteforcing
Detecting attacks from Apache log files
Detection of XSS and SQL injection attacks
I am new to Redis and would like to store web analytics for a site, both globally and per user activity.
Below is what I have so far.
// to get all unique ips
client.sadd('visitors',ip);
// to records hits per ip
client.hincrby('hits',ip,1);
The above works fine so far: I get the number of distinct IPs and a hit counter per IP.
The problem comes when storing the activities made by each IP, i.e. the links they clicked and the searches they ran, with a datetime.
Can someone please throw some light on how best to manage this?
Thanks
"the problem comes when storing the activities made by each IP"
You will need a separate structure for storing these.
The simplest sensible structure is a "list of actions per session". Take a look at the sorted set commands, which provide a basic framework for creating a time-ordered list of actions within a session.
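For example, a per-IP sorted set scored by timestamp can record each activity. This sketch uses the Python redis client for illustration; the Node.js calls are analogous:

import json
import time
import redis

r = redis.Redis()  # assumes a local Redis instance

def record_activity(ip, action, detail):
    """Store one activity in a per-IP sorted set, scored by timestamp.
    The timestamp inside the member keeps identical actions distinct,
    since sorted-set members must be unique."""
    ts = time.time()
    event = json.dumps({"ts": ts, "action": action, "detail": detail})
    r.zadd("activity:" + ip, {event: ts})

record_activity("203.0.113.7", "click", "/products/42")
record_activity("203.0.113.7", "search", "red shoes")

# All activity for an IP, in chronological order, with timestamps:
for event, ts in r.zrange("activity:203.0.113.7", 0, -1, withscores=True):
    print(ts, json.loads(event))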
This will get you something working quickly. However, it is probably not what you really want; in fact, Redis may not be the right tool for this at all.
If you want to re-trace an entire site visit, you really want to connect to a true analytics framework. There are dozens of website tracking tools that provide this type of functionality, so it's not clear that building your own is a good use of time.
Note: there are a few similar questions already asked here, but they are from 2009. Maybe something has changed since then.
I'm responsible for a bunch of websites hosted on different servers. I don't do any log analysis right now, but I would like to change that. First question: what is the best tool for spotting issues with a website based on its IIS logs (i.e. 404 and 500 responses, long page processing times, etc.), ideally with grouping/sorting options? I don't want to spend a lot of time on this; I just want to check periodically that all is well with the website.
Second question (and I know I'm most likely asking for too much): is there any way to expose the processed logs on the web, so I can review the things mentioned above without RDPing into the server?
Ideally I'm looking for a free/open-source solution, but I'm ready to pay for good software as well (though not a lot of $$).
Thank you.
You can take a look at our log monitoring solution, EventSentry, which can monitor text-based logs like IIS logs. We have standard templates set up for IIS, and we can consolidate the logs in a database with web access, so that you can review the logs without using RDP.
It's a pretty flexible solution that allows you to pick the fields you are interested in, and ignore the ones you are not - and thus save space in your database.
You can also set up real-time alerts, so that you get an email when a critical error, such as a 500, is encountered in a log file.
http://www.eventsentry.com/features/log-file-monitoring
Finally, you can also plug in command-line tools that can verify that a given web page is accessible, or alert you when it changes: http://www.eventsentry.com/features/application-monitoring.
I'm biased of course, but I would say that our solution is pretty affordable. Since it also offers additional functionality, such as service monitoring (to monitor your IIS services) and event log monitoring (IIS does log critical messages to the event log), you can set up comprehensive monitoring with a single product.
I'd look into #LuckyLuke's solution (or similar): the classic "build vs. buy" decision. Based on your post, this isn't going to be your full-time job, so IMHO it's best to leave it to those for whom it is...
I don't know what "legacy" answers you are referring to, but if you want to tinker, you can use Microsoft's own Log Parser and, depending on how far you want to take it, use it (a COM DLL) to write your own "admin web pages" in .NET/ASP.NET and host them on each of your servers...
If you're very specific about which errors you want to be alerted about, another "hacky" way would be to provide your own custom error pages (either overriding the default IIS error pages, or configuring your ASP.NET apps to use specific error pages).
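If you do decide to tinker before buying anything, the grouping/sorting part of the question is only a few lines of pandas over a W3C-format IIS log. The field names below are typical but depend on your IIS logging settings, and the file name is made up:

import pandas as pd

# W3C extended format: the '#Fields:' header names the columns.
# These names are common defaults; adjust to your configuration.
cols = ["date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
        "s-port", "cs-username", "c-ip", "cs(User-Agent)", "sc-status",
        "sc-substatus", "sc-win32-status", "time-taken"]

df = pd.read_csv("u_ex231001.log", sep=" ", comment="#", names=cols)

# Error responses grouped by status code and URL:
errors = df[df["sc-status"] >= 400]
print(errors.groupby(["sc-status", "cs-uri-stem"]).size()
            .sort_values(ascending=False).head(20))

# Slowest pages (time-taken is in milliseconds):
print(df.groupby("cs-uri-stem")["time-taken"].mean()
        .sort_values(ascending=False).head(10))

Publishing the output to the web (the second question) would still need a small site or a scheduled report on top of this.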
I have a large set of log files that I want to characterize, or possibly feed into some kind of decision tree or other analytics, but I don't know exactly what. What kinds of analysis have you done with log files, especially lots of log files?
For example, so far I am collecting how many requests are made to a particular page for a given log file.
Servlet = 60 requests
Servlet2 = 70 requests, etc.
I guess right there I could filter to only the most popular requests. I might also compute rates, e.g. 60 requests over a two-hour window, i.e. 60 requests / 120 minutes = 0.5 requests per minute.
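Concretely, that counting and rate calculation is only a few lines of pandas once the log is parsed; the column names here are assumptions:

import pandas as pd

# Assumes the log is already parsed into a DataFrame df with a
# "page" column and a datetime "time" column (both names made up).
counts = df["page"].value_counts()  # e.g. Servlet -> 60, Servlet2 -> 70
minutes = (df["time"].max() - df["time"].min()).total_seconds() / 60
print((counts / minutes).sort_values(ascending=False).head(10))  # req/min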
Deciding what analysis to do depends on what decisions you're trying to make based on that analysis. For example, I currently monitor logs for exceptions reported by our application (all exceptions in the client application are logged with the server) to decide which client bugs should be high priority to investigate. I also use log-searching software to monitor for any exceptions reported by our server software, which may need more immediate investigation.
On top of the logs everything generates anyway, I also use monitoring software to track usage of our web server and database server, recording usage stats etc. in a database. The final aim of this is to predict future usage levels and purchase more hardware as appropriate to keep up with demand.
Two (free) tools I've been using are:
Hyperic for monitoring. It's pretty easy to set up and can start logging a lot of data you may be interested in, e.g. requests per second on a web server.
Splunk for searching log files. It's very easy to get set up and work with, and it gives you excellent search capabilities over your log files. If you're working with log files right now and haven't tried Splunk, I definitely recommend it. One word of warning: I noticed a couple of moments of 100% CPU while using it on our main production server, so I recently stopped running it on that machine.
I'm not sure what your aim is with this analysis; mine has been very much about looking for errors I should know about and planning for future capacity needs. If you're interested in the latter, I'd also recommend The Art of Capacity Planning.