Find hosted directories Jetty/Apache - linux

Let's say I have a directory that is being hosted by Jetty or Apache (I'd like an answer for both). I know the URL including the port, and I can log into the server.
How can I find the directory that is being served on a certain port?
I'd also like to go the other way: I have a folder on the server which I know is being hosted, but I don't know the port, so I can't find it in a web browser.
How can I find a list of directories that are being hosted?
This has been bugging me for ages, but I've never bothered to ask before!
Thanks.

This is how to find it out for Apache. Let's say you have the URL http://myserver.de:8081/somepath/index.html.
Step 1: Find the process that has the given port open
You can do this by using lsof in a shell on the server, which lists open files (and ports) as well as the processes associated with them:
myserver:~ # lsof -i -P | grep LISTEN | grep :80
apache2 17479 root 4u IPv6 6271473 TCP *:80 (LISTEN)
We now know there is a process called "apache2" with process ID 17479
Step 2: Find out more about the process
We can now look at the environment of the process, where more information should be available:
myserver:~ # (cat /proc/17479/environ; echo) | tr "\000" "\n"
PATH=/usr/local/bin:/usr/bin:/bin
PWD=/
LANG=C
SHLVL=1
_=/usr/sbin/apache2
Okay, the process executable is /usr/sbin/apache2. Now let's look at the command line.
myserver:~ # (cat /proc/17479/cmdline; echo) | tr "\000" " "
/usr/sbin/apache2 -k start
Step 3: Finding the config of the process
Our previous examination has shown that no special configuration file has been given on the command line with the -f option, so we have to find the default location for that process. This depends on how the default command line is compiled into the apache2 executable. This could be extracted from it somehow, but obviously it's the default location for Apache 2 on my machine (Debian Etch), namely /etc/apache2/apache2.conf.
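If you don't want to guess, one way (a quick sketch; the exact command and paths depend on the distribution) is to ask the binary for its compile-time settings, which include the default configuration file:
myserver:~ # apache2ctl -V | grep -i 'HTTPD_ROOT\|SERVER_CONFIG_FILE'
 -D HTTPD_ROOT="/etc/apache2"
 -D SERVER_CONFIG_FILE="apache2.conf"
The default config file is then HTTPD_ROOT/SERVER_CONFIG_FILE.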
Step 4: Examining the Apache config file
This again needs some knowledge about apache configurations. The config file can include other files, so we need to find those first:
myserver:~# cat /etc/apache2/apache2.conf | grep -i ^Include
Include /etc/apache2/mods-enabled/*.load
Include /etc/apache2/mods-enabled/*.conf
Include /etc/apache2/httpd.conf
Include /etc/apache2/ports.conf
Include /etc/apache2/conf.d/
Include /etc/apache2/sites-enabled/
A nice list. These configs tell everything about your configuration, and there are many options that might map files to URLs. In particular, Apache can serve different directories for different domains, if those domains are all mapped to the same IP. So let's say you host a whole bunch of domains on your server; then "myserver.de" is either mapped by the default configuration or by a configuration that serves this domain specifically.
The most important directives are DocumentRoot, Alias and Redirect. On my system the following gives a quick overview (comments omitted):
myserver:~# ( cat /etc/apache2/apache2.conf; cat /etc/apache2/sites-enabled/* ) | grep 'DocumentRoot\|Alias\|Redirect'
Alias /icons/ "/usr/share/apache2/icons/"
DocumentRoot /var/www/
RedirectMatch ^/$ /apache2-default/
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
Alias /doc/ "/usr/share/doc/"
DocumentRoot /var/www/
RedirectMatch ^/$ /apache2-default/
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
Alias /doc/ "/usr/share/doc/"
Since the "mypath" part of the URL has no direct match, I can savely assume it lies below the DocumentRoot /var/www/, so the result of my search is that
http://myserver.de:8081/somepath/index.html --> /var/www/mypath/index.html
You can do a lookup in a similar way for Jetty.
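For example (a rough sketch only, assuming a Jetty 9+ style jetty.home/jetty.base layout; your paths and property names may differ, and <pid>/<jetty.base> are placeholders), you could locate the Java process on the port, pull the jetty.base directory from its command line or working directory, and then look at the deployed webapps and the connector configuration:
myserver:~ # lsof -i -P | grep LISTEN | grep :8081                       # find the java process holding the port
myserver:~ # (cat /proc/<pid>/cmdline; echo) | tr "\000" " "             # look for start.jar, -Djetty.base=..., -Djetty.home=...
myserver:~ # ls -l /proc/<pid>/cwd                                       # the working directory is usually the jetty.base
myserver:~ # ls <jetty.base>/webapps                                     # WARs and context XML files are deployed from here
myserver:~ # grep -r jetty.http.port <jetty.base>/start.d <jetty.base>/start.ini 2>/dev/null   # the connector port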

As a convention you could maintain a document detailing all the filesystem directories and corresponding URLs that are created. This will answer questions of file->URL and URL->file mappings, and is also useful for planning, resource management and security reviews.
What follows is food for thought rather than any serious proposal. My "proper" answer is to use good documentation. However...
An automated approach might be possible. Thinking freely (not necessarily practically!), you could find/create an Apache module/Jetty extension to add a small virtual file to each web directory as it is served from the filesystem. The contents of that virtual file would contain the location of the directory the files are served from, as well as maybe the internal server name, IP or other details, to help bridge the gap between what you see on the web side and where it is on your intranet.
Mapping files to URLs is tricky. You might be able to automate it by scanning the HTTP access logs, configured with a custom log format that records an entry whenever a file is served. By scanning the URL accessed and the corresponding file served, you can map files back to URLs. (You also get the URL->file mapping, in case you don't want to manually browse the URL as I outlined in the paragraph above.) A rough sketch of such a log format follows.
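For Apache this could look roughly like the following (the format nickname and log path are made up; %U is the requested URL path and %f is the filename served, per mod_log_config):
LogFormat "%U -> %f" urlmap
CustomLog /var/log/apache2/urlmap.log urlmap
After some traffic has come in, sort -u /var/log/apache2/urlmap.log gives a deduplicated URL-to-file table.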

apache2ctl -S
or (on older versions):
apachectl -S
You'll see the vhost files and the lines where these sites are defined. There you can look for the directories.
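On a typical Debian-style install the output looks roughly like this (host names and paths here are only examples):
VirtualHost configuration:
*:80     myserver.de (/etc/apache2/sites-enabled/000-default.conf:1)
*:8081   myserver.de (/etc/apache2/sites-enabled/myserver.conf:1)
Each entry points at the vhost's configuration file and line number, which is where you can read off its DocumentRoot.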

Related

Bash script or existing program to monitor "etc/hosts" file domain blacklist access?

This is sort of a complex question, I hope I can explain it clearly:
I have a long blacklist of domains added to my /etc/hosts file and it works perfectly. Now I want to be able to "monitor" with a "simple" Bash script every time a domain/entry is blocked, e.g.:
Let's say I'm running this hypothetical script on my Terminal and I try to access Facebook in my browser (which is blocked in my hosts file), I'd like to see in my Terminal something like:
0.0.0.0 facebook.com
Then, I try to access LinkedIn (also blocked), and now I want to see in my Terminal:
0.0.0.0 facebook.com
0.0.0.0 linkedin.com
Then, I try to access Instagram (blacklisted as well) and I see:
0.0.0.0 facebook.com
0.0.0.0 linkedin.com
0.0.0.0 instagram.com
And so on...
Is that possible? I've spent days looking for an existing program that does this, but no luck.
It's possible; whether you find it simple is a different question.
Caveat: if you edit the hosts file you'll need to restart the process.
Save the following as e.g. hosts.awk:
BEGIN {
    while (getline < "/etc/hosts") {   # read the hosts file line by line
        split($0, fields, " *")        # break up each line on "any number of spaces" and assign to the array "fields"
        hosts[fields[2]"."]++          # create an array "hosts" keyed on e.g. "www.google.com." (note the trailing dot)
    }
    close("/etc/hosts")                # close the hosts file, be tidy
}
{
    for (a = 7; a <= NF; a++) {        # for each line of input (these now come from tcpdump), iterate over the 7th to last (NF) fields
        if ($a in hosts) {             # if the field is present in the hosts array (matches an index)
            print $a                   # print it
        }
    }
}
You can then run:
sudo tcpdump -l port 53 | awk -f hosts.awk
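For reference (a rough illustration only; the exact layout varies with tcpdump version and options), a DNS query line from tcpdump looks something like
13:05:12.345678 IP laptop.lan.51234 > 192.168.1.1.53: 23456+ A? facebook.com. (31)
so the queried name, complete with its trailing dot, shows up somewhere from field 7 onwards, which is why the script scans those fields and appends a "." to the hosts-array keys.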

Performing a masscan on an input file containing domain names

I cannot find any options in order to perform a masscan on an input file containing domain names in the following format:
domain-name1.com
domain-name2.org
domain-name3.net
Is there a way I could use masscan with an input file containing those domain names? If masscan cannot do that, do you know of any Linux programs that could, and that would be as fast as masscan?
You can use GNU parallel in combination with dig or host to do fast mass DNS resolution. A combination of dig and parallel would be the following:
parallel -j100 --retries 3 dig @$nameserver +short :::: hosts.txt | grep -E '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' > ip_list.txt
The number after the -j switch is the number of parallel DNS queries you want to execute;
+short tells dig to output only the DNS resolved addresses;
grep keeps only IP addresses, so we don't get MX entries or any other unresolved output;
hosts.txt is the input file, containing hostnames;
ip_list.txt is the output file, which will be used to feed masscan.
Then you can feed masscan with your generated ip_list.txt as demonstrated:
masscan -iL ip_list.txt -p 80
Edit: I found another command that can achieve the same parallel effect, and it's even easier: xargs. An example of how to use it is the following:
cat hosts.txt | xargs -n1 -P100 dig +short +retry=3 | grep -E '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' > ip_list.txt
-P100 specifies 100 parallel processes;
-n1 passes only one hostname from hosts.txt per dig invocation (changing it to a higher value shouldn't have any effect, as long as the input file only has hostnames separated by newlines);
The downside of this approach is that xargs doesn't retry if dig fails, but dig's +retry option covers that.
masscan does not provide any option to scan domain names; you have to provide it with a list of IP addresses. So if you have a list of domain names in a file, you can use the following script to get the IP addresses of all the domains.
$ cat hosts
github.com
google.com
pentestmonkey.com
script:
import socket

read_file = open('hosts', 'r')
for host in read_file:
    print(socket.gethostbyname(host.rstrip("\n")))  # rstrip removes the trailing newline character
read_file.close()
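You can then feed the script's output straight into masscan, for example (assuming the script was saved as resolve.py, a name chosen here just for illustration):
python resolve.py > ip_list.txt
masscan -iL ip_list.txt -p 80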
From the Comparison with Nmap section of the masscan README.md:
You can think of masscan as having the following settings permanently
enabled:
...
-n: no DNS resolution happens
So that tool will not facilitate conversion of the host names to IP addresses. You'll need to do that using a tool like host or dig and then feed the results into masscan.
masscan can read scan ranges (i.e. IP addresses and CIDR blocks) from a file, using the -iL <filename> flag.
Also worth mentioning, here's another excerpt from README.md:
Scanning the entire Internet is bad. For one thing, parts of the
Internet react badly to being scanned. For another thing, some sites
track scans and add you to a ban list, which will get you firewalled
from useful parts of the Internet.
If the "millions of domain names" aren't under your control, the warning above is applicable.
Use a high-speed (parallel or async) mass DNS resolver (massdns, fernmelder, etc.) to get the IP addresses for each DNS name efficiently. Use the IP addresses as input for masscan. If you'd like, you can plug the DNS names back into the masscan JSON output with a dozen lines of Python.
Unless you're working on a very small scale, do NOT use Python or nmap to do your DNS resolution. You need native code to do this well, hence the recommendation for massdns or fernmelder.
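A minimal sketch of that pipeline (the flags are from memory of recent massdns versions, so check massdns --help; resolvers.txt, domains.txt and the other file names are placeholders):
massdns -r resolvers.txt -t A -o S -w resolved.txt domains.txt    # simple text output: "name. A address"
awk '$2 == "A" {print $3}' resolved.txt | sort -u > ip_list.txt   # keep just the A-record addresses
masscan -iL ip_list.txt -p 80
The simple output keeps each name next to its address, which is also what you would use afterwards to map masscan results back to host names.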

How to reuse code in CMake

I've got the following piece of code which gets the current system IP and stores it in the SERVER_IP variable:
EXECUTE_PROCESS(
    COMMAND ip route get 8.8.8.8
    COMMAND awk "NR==1 {print $NF}"
    OUTPUT_VARIABLE SERVER_IP
    OUTPUT_STRIP_TRAILING_WHITESPACE
)
I need to use this IP in several places in my CMakeLists.txt file hierarchy. What's the best approach to reuse this code? My first thought is to make it a function like function(GetIP), but I am not sure where to put this function to make it visible to all CMakeLists.txt files.
If you make the CMake function available in the top-level CMakeLists.txt file, it will also be available in the CMakeLists.txt files of the subdirectories you added with ADD_SUBDIRECTORY.
Either define the function directly in the top-level CMakeLists.txt file or use something like INCLUDE(GetIP.cmake) there.
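For example (a rough sketch; the function name get_server_ip and the file name GetIP.cmake are just illustrative), GetIP.cmake could contain:
function(get_server_ip OUT_VAR)
    EXECUTE_PROCESS(
        COMMAND ip route get 8.8.8.8
        COMMAND awk "NR==1 {print $NF}"
        OUTPUT_VARIABLE ip
        OUTPUT_STRIP_TRAILING_WHITESPACE
    )
    set(${OUT_VAR} "${ip}" PARENT_SCOPE)   # functions have their own scope, so export the result to the caller
endfunction()
and the top-level CMakeLists.txt would then do:
INCLUDE(GetIP.cmake)
get_server_ip(SERVER_IP)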
If it's just about the IP itself, just put it in a variable.
Variables set in a directory are inherited by all subdirectories, but not by parent directories. You can extend the scope of a local variable by one level with the PARENT_SCOPE parameter of the set command.
Alternatively, put the variable in the cache to make it accessible globally. Unless marked as internal, this will also make the variable configurable via the CMake GUI.
set(MY_SERVER_IP 8.8.8.8 CACHE STRING "IP address of the server responsible for X")
[...]
EXECUTE_PROCESS(
    COMMAND ip route get ${MY_SERVER_IP}
    COMMAND awk "NR==1 {print $NF}"
    OUTPUT_VARIABLE SERVER_IP
    OUTPUT_STRIP_TRAILING_WHITESPACE
)

Add lines to the apache2 config

I need to add this to my apache2.conf in my VPS:
Include /etc/phpmyadmin/apache.conf
extension=mysql.so
extension=memcache.so
extension=mbstring.so
extension=gd.so
extension=mcrypt
After I add this, save apache2.conf, and try to restart Apache, I get an error: Failed.
Why?
You get that error because you put PHP directives in the Apache configuration.
The extension= lines have to go into the php.ini file instead; only the Include line belongs in apache2.conf.
You can find the php.ini file location using:
#php -i | grep php.ini
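In other words (a sketch; the php.ini path depends on your distribution and PHP version), apache2.conf keeps only
Include /etc/phpmyadmin/apache.conf
while the extension=... lines from the question go into php.ini. You can then verify the Apache side with apache2ctl configtest before restarting.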

How to include the server name in the filename of an apache log?

I want to configure Apache so that the access and error logs it generates are named as follows:
<server-name>_access_<timestamp>
<server-name>_error_<timestamp>
I have the timestamp part figured out using rotatelogs:
CustomLog logs/access_log combined
CustomLog "|bin/rotatelogs -l /var/logs/access_%Y-%m-%d 86400" common
The part that I cannot figure out is how to include the server name in the filename. I am configuring Apache on a Linux box.
Regards,
Mohan
It appears that, even as recently as Apache HTTPD 2.4, no feature provides this capability.
http://httpd.apache.org/docs/current/mod/mod_log_config.html#customlog
If you wanted to do something tricky, you could use a pipe similar to how you're solving the timestamp problem and have a script try to determine the VirtualHost. Not sure how you'd accomplish that, though.
This is my solution. In httpd.conf use:
CustomLog "|$/usr/local/apache2/conf/my_log.pl" common
Create a file at /usr/local/apache2/conf/my_log.pl with:
#!/usr/bin/perl
use strict;
use warnings;

my $path     = "/usr/local/apache2/logs";
my $access   = "_access.log";
my $hostname = `hostname`;
chomp($hostname);
my $filename = "${path}/${hostname}${access}";

$| = 1;    # use unbuffered output
open(STDOUT, ">> $filename") or die $!;
while (<STDIN>) {
    print STDOUT $_;
}
Then add the execute permission:
chmod a+x /usr/local/apache2/conf/my_log.pl
Check out the mod_log_config documentation; it looks like you want either %v or %V:
%...v The canonical ServerName of the server serving the request.
%...V The server name according to the UseCanonicalName setting.
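If you want per-server-name log files rather than just the name in each entry (a sketch that goes beyond the original answers), you can log %v as the first field and post-process the log with the split-logfile utility that ships with Apache, which writes one file per virtual host:
LogFormat "%v %h %l %u %t \"%r\" %>s %b" vcommon
CustomLog logs/access_log vcommon
split-logfile < logs/access_log
You would still need rotatelogs (or a cron job) on top of this if you also want the timestamp in the resulting file names.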

Resources