How can I know which files were modified by a specific process in linux machines? - linux

I need to get a list of all modified files on my Linux machines (AIX, Solaris, Red Hat, CentOS, HP-UX) in a specific time range (similar to Procmon or forfiles on Windows).
I tried the find command, but since it cannot search by PID, I got too many results.
I wanted to narrow down the results to files that were modified by a specific process. I used the lsof command for a specific PID, but I got a list of files that were accessed, which wasn't helpful, because I could not tell whether the process changed them.
I tried the strace command for a specific PID, but the output was too hard to work with (too much irrelevant info, and I need it for a 24-hour time range).
I have hit a dead end. Any ideas?
(In short: I want a list of all files modified by a specific process in a specific time range.)

By default, Linux does not maintain a log or record, of any kind, of which files were modified by which process.
The only logged information is each file's last modification timestamp. And even that can be arbitrarily adjusted by any process that has appropriate privileges, to be ten years in the future, for example.
The short answer is that the information you're looking for does not exist.
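To illustrate why the modification timestamp alone proves little, here is a minimal Python sketch that pushes a file's mtime roughly ten years ahead. The function name and the TEN_YEARS constant are made up for this example; os.utime is the standard call, and it works for any process that owns the file or has sufficient privileges.

```python
import os
import time

TEN_YEARS = 10 * 365 * 24 * 60 * 60  # rough figure: ignores leap days


def push_mtime_into_future(path, seconds=TEN_YEARS):
    """Set the file's mtime `seconds` ahead of now and return the new mtime."""
    st = os.stat(path)
    new_mtime = time.time() + seconds
    # os.utime takes (atime, mtime); the access time is left as it was.
    os.utime(path, (st.st_atime, new_mtime))
    return os.stat(path).st_mtime
```

After this call, a find by modification time would place the file far outside the real time range, with no trace of which process did it.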

The closest thing I know of for your use case is SELinux. This will only work if SELinux is enabled on your operating system.
SELinux is capable of logging a bunch of information, including the uid, gid, and PID (exactly what you need), for different operations. These records end up in the audit log, which the page below describes.
For more details, look at:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sec-Understanding_Audit_Log_Files.html
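If you do end up with audit log records, filtering them by PID is straightforward, since each record is a line of key=value fields. Below is a rough Python sketch. The SAMPLE line is a shortened, illustrative record and not real output from any machine, and the helper names are invented for this example. It also assumes the values contain no unbalanced quotes, which shlex would reject.

```python
import shlex

# Shortened, illustrative record in the audit.log key=value style (not real output).
SAMPLE = ('type=SYSCALL msg=audit(1364481363.243:24287): syscall=2 success=no '
          'pid=3538 uid=500 gid=500 comm="cat" exe="/bin/cat"')


def parse_audit_record(line):
    """Split one audit record into a dict of its key=value fields."""
    fields = {}
    # shlex.split keeps quoted values together and strips the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def records_for_pid(lines, pid):
    """Yield the parsed records whose pid field equals `pid`."""
    for line in lines:
        fields = parse_audit_record(line)
        if fields.get("pid") == str(pid):
            yield fields
```

Something like records_for_pid(open(log_path), 3538) would then give only the records for that process, where log_path is wherever your system keeps its audit log.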

Related

Count how many processes hold a file on a linux system

I am interested to know how many processes, or any other entities whatsoever, hold a specific file on the system.
I tried to find a way using lsof (for efficiency reasons I don't want to aggregate all the holders), but couldn't find anything in the man page.
Please note that I don't mean the inode link count, which counts the hard links to this specific file on the filesystem.
Edit: I now know it is possible to use fuser (mentioned in one of the answers below) to get this information, but fuser uses procfs and is therefore not very efficient. Does anyone know of any other tool that doesn't iterate over procfs?
Thanks.
Try using the following command:
fuser filename
You can also try:
lslocks
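For reference, the procfs scan that the question wants to avoid looks roughly like this in Python. This is only a sketch of the idea, not how fuser is actually implemented: it checks open file descriptors only (not memory maps or working directories), it is Linux-specific, and the function name is invented here.

```python
import os


def pids_holding(path, proc_root="/proc"):
    """Return the sorted PIDs that have `path` open, by scanning <proc_root>/<pid>/fd."""
    target = os.path.realpath(path)
    holders = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                # Each fd entry is a symlink to the file the descriptor refers to.
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.add(int(entry))
                    break
            except OSError:
                continue  # descriptor was closed while we were looking
    return sorted(holders)
```

len(pids_holding("filename")) gives the count. Without root you will only see processes whose fd directories you are allowed to read.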

Daemon for file watching / reporting in the whole UNIX OS

I have to write a Unix/Linux daemon that watches for a particular set of files (e.g. *.log) in any directory, across various locations, and reports them to me. Then I have to read all the newly modified files, process them, and push the grepped data into Elasticsearch.
Any suggestion on how this can be achieved?
I tried various Perl modules (e.g. File::ChangeNotify, File::Monitor) but for these I need to specify the directories, which I don't want: I need the list of files to be dynamically generated and I also need the content.
Is there any way to hook the OS system calls for file creation and then read the newly created or modified file?
Not as easy as it sounds unfortunately. You have hooks to inotify (on some platforms) that let you trigger an event on a particular inode changing.
But for wider scope changing, you're really talking about audit and accounting tracking - this isn't a small topic though - not a lot of people do auditing, and there's a reason for that. It's complicated and very platform specific (even different versions of Linux do it differently). Your favourite search engine should be able to help you find answers relevant to your platform.
It may be simpler to run a scheduled task in cron - but not too frequently, because spinning a filesystem like that is dirty - along with File::Find or similar to just run a search occasionally.
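The scheduled-scan approach could look something like the following. It is sketched in Python rather than Perl's File::Find, with os.walk playing the same role. The function name and parameters are invented for this example, and remembering the time of the last run between cron invocations is left to the caller.

```python
import fnmatch
import os


def modified_since(roots, pattern, since):
    """Yield paths under `roots` matching `pattern` whose mtime is newer than `since`."""
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, pattern):
                path = os.path.join(dirpath, name)
                try:
                    if os.stat(path).st_mtime > since:
                        yield path
                except OSError:
                    continue  # file vanished between listing and stat
```

For example, modified_since(["/var/log"], "*.log", last_run_timestamp) would list the log files changed since the previous run, ready to be read and pushed onward.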

Unix and Linux /proc PID system

For my intro to operating systems class we were introduced to the /proc directory and many of the features that can be used to access data stored under the process IDs that are available in /proc.
When I was trying out some commands learned (and a few I looked up) on the UNIX server hosted by my school I noticed that some sub directories that were present in a process, that I created, were a file type called "TeX font metric data" or a .tfm file. I figured that was the file type that was used when my professor showed us how to get data from the directories like status and map.
When I entered the command cat /proc/(PID)/status to look into the status file I got a random assortment of characters and white space. When I tried the same command on a process I created on my school's Linux server I was shown the information I expected to see in the status and map files.
My question is:
Why did the Unix server produce the random characters from my process's /proc/(PID)/status file while the Linux server gave me the data I would expect from the same command? Also, is there a way to access the Unix /proc data by accessing the /proc directory?
The Linux procfs you are familiar with, aka /proc/, is not a POSIX thing. It's OS-specific, and multiple OSes just happen to implement similar things also called /proc.
Because no formal standard covers it, it is allowed to differ, and will differ, on any *nix-like system that implements it.
My guess with /proc/(PID)/status is that your UNIX is dumping the process status in a binary form instead of easy to read plain text.
See also:
Knowing the process status using procf/<pid>/status
If you can determine WHAT Unix you're on (odds are, Solaris since there's a free variant) you should be able to find a more specific answer.
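On the Linux side, the status file is plain "Key: value" text, so reading it programmatically is simple. Here is a small Python sketch. The function names are invented for this example, and it will not work on a Unix whose /proc files are binary.

```python
def parse_status(text):
    """Turn the "Key:<tab>value" lines of a Linux status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_status(pid, proc_root="/proc"):
    """Read and parse <proc_root>/<pid>/status (Linux text format only)."""
    with open(f"{proc_root}/{pid}/status", encoding="utf-8", errors="replace") as fh:
        return parse_status(fh.read())
```

On Linux, read_status(os.getpid())["State"] would return the state line of the current process.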

Change or hide process name in htop

It seems that htop shows all running processes to every user, and process names in htop contain all the file names that I include in the command line. Since I usually use very long file names that contain a lot of detailed information about my project, I do not want such information to be visible to everyone (but I am OK with other users seeing which software I am running).
How can I hide the details in the process name?
Since kernel 3.3, you can mount procfs with the hidepid option set to 1 or 2.
The kernel documentation file proc.txt describes this option:
The following mount options are supported:
hidepid= Set proc access mode.
hidepid=0 means classic mode - everybody may access all /proc/<pid>/ directories (default).
hidepid=1 means users may not access any /proc/<pid>/ directories but their own. Sensitive files like cmdline, sched*, status are now protected against other users. This makes it impossible to learn whether any user runs specific program (given the program doesn't reveal itself by its behaviour). As an additional bonus, as /proc/<pid>/cmdline is unaccessible for other users, poorly written programs passing sensitive information via program arguments are now protected against local eavesdroppers.
hidepid=2 means hidepid=1 plus all /proc/<pid>/ will be fully invisible to other users. It doesn't mean that it hides a fact whether a process with a specific pid value exists (it can be learned by other means, e.g. by "kill -0 $PID"), but it hides process' uid and gid, which may be learned by stat()'ing /proc/<pid>/ otherwise. It greatly complicates an intruder's task of gathering information about running processes, whether some daemon runs with elevated privileges, whether other user runs some sensitive program, whether other users run any program at all, etc.
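To check which hidepid mode is currently in effect, you can look at the options of the proc entry in /proc/mounts. A minimal Python sketch follows. The function name is invented here, and it assumes the usual whitespace-separated mounts format (device, mount point, type, options, ...).

```python
def hidepid_setting(mounts_text, mount_point="/proc"):
    """Return the hidepid value of the procfs at `mount_point`, or None if not listed."""
    for line in mounts_text.splitlines():
        parts = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(parts) >= 4 and parts[1] == mount_point and parts[2] == "proc":
            for opt in parts[3].split(","):
                if opt.startswith("hidepid="):
                    return opt.split("=", 1)[1]
            return None  # option absent, i.e. the default mode
    return None
```

Typical use would be hidepid_setting(open("/proc/mounts").read()). A result of None means the option is not listed, which corresponds to the default (hidepid=0) described above.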

Symbolic link to latest file in a folder

I have a program which requires the path to various files. The files live in different folders and are constantly updated, at irregular intervals.
When the files are updated, they change name. For instance, in the folder dir1 I have fv01 and fv02. Later in the day someone adds fv02_v1; the day after, someone adds fv03, and so on. In other words, I always have an updated file, but with a different name.
I want to create a symbolic link in my "run" folder to these files, such that said link always points to the latest file created.
I can do this in Python or Bash, but I was wondering what is out there, as this is hardly an uncommon problem.
How would you go about it?
Thank you.
Juan
PS. My operating system is Linux. I currently have a simple daemon (Python) that looks every once in a while (refreshes every minute) for the latest file. Seems kind of an overkill to me.
Unless there is some compelling reason that you have left unstated (e.g. thousands of files in the directory) just do it the way you suggest with a script sorting the files by modification time. There is no secret method that I am aware of.
You could write a daemon using inotify to monitor your directories and immediately set your links but that seems like overkill.
Edit: I just saw your edit. Since you have the daemon already, inotify might not be such a bad idea. It would be somewhat more efficient than constantly querying since the OS will tell you when something in your directories has changed.
I don't know Python well enough to point you to anything specific, but there must be a wrapper for inotify.
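For the plain script approach, sorting by modification time and repointing the link takes only a few lines. The following Python sketch uses invented function names and treats the newest mtime as "latest", which is an assumption, since Linux filesystems do not reliably expose creation time. The link is swapped via a temporary name so that readers never see it missing.

```python
import os


def newest_file(directory):
    """Return the most recently modified regular file in `directory`, or None."""
    with os.scandir(directory) as entries:
        # Skip symlinks so the link itself is never picked if it lives here too.
        candidates = [e.path for e in entries if e.is_file(follow_symlinks=False)]
    return max(candidates, key=os.path.getmtime, default=None)


def point_link_at_newest(directory, link_path):
    """Make `link_path` a symlink to the newest file in `directory`."""
    target = newest_file(directory)
    if target is None:
        return None
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)  # leftover from an interrupted run
    os.symlink(os.path.abspath(target), tmp)
    os.replace(tmp, link_path)  # rename over the old link in one step
    return target
```

Calling point_link_at_newest("dir1", "run/latest") from your existing daemon, or from an inotify callback if you go that way, keeps the link current.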
