Grep recursively for files in a specific group of sub-folders - Linux

I've searched both Google and Stack Overflow, but I haven't been able to find an answer to this question. It might be because I'm searching for the wrong terms, but hopefully someone can help!
I have a file structure similar to this:
/logs/ServiceA/Prod/2017/10/01/11/logFileForHour
/logs/ServiceA/Prod/2017/10/01/12/logFileForHour
/logs/ServiceA/Prod/2017/10/01/13/logFileForHour
/logs/ServiceB/SubService1/Prod/2017/10/01/12/logFileForHour
/logs/ServiceC/SubService1/Prod/Mirror/2017/10/01/12/logFileForHour
/logs/ServiceC/SubService1/Beta/2017/10/01/12/logFileForHour
Each hour's folder contains the aggregate of the logs from all hosts running that service. Those hourly folders are aggregated into daily folders, which are aggregated into monthly folders and so on. Then the logs are aggregated by Stage (Prod/Demo/Dev) and then further by Service/SubService.
I need a way to grep for a common identifier across all Prod services and sub-services, and would like to do this with a single grep if at all possible. I know the hour the request was placed in.
Ideally I'd be able to use a file path of /logs/*/Prod/*/2017/10/01/12/* if I wanted all Prod logs from all services and sub-services during the 12:00 hour of October 1st, 2017, but this only works if there is exactly one folder where each asterisk is, when in reality that could be one or more folders for the first asterisk and zero or more folders for the second.
Any help you all could provide would be greatly appreciated!

You could try something like the following:
find logs -type d -name Prod | xargs grep -r SEARCHSTRING
but beware I couldn't test it...
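If any of the paths may contain spaces, a null-delimited variant of the same idea should be safer (still a sketch; SEARCHSTRING stands in for the real identifier):
find logs -type d -name Prod -print0 | xargs -0 grep -r SEARCHSTRING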
EDIT after clarification:
find logs -type f -path '*/Prod/*' -a -path '*/2017/10/01/12/*'
which finds all files that are descendants of a path matching Prod and also matching the date/hour path you are looking for (then add your grep).
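To do the search in one go, something along these lines should work (a sketch; REQUEST_ID stands in for whatever identifier you are actually grepping for):
find logs -type f -path '*/Prod/*' -path '*/2017/10/01/12/*' -exec grep -H 'REQUEST_ID' {} +
If your shell is bash 4+ with globstar enabled, a glob-only alternative close to the path the question hoped for is also possible, since **/ matches zero or more directory levels:
shopt -s globstar
grep -H 'REQUEST_ID' /logs/**/Prod/**/2017/10/01/12/*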

Related

Move non-sequential files to new directory

I have no previous programming experience. I know this question has been asked before or the answer is out there but I, for the life of me, cannot find it. I have searched google for hours trying to figure this out. I am working on a Red Hat Linux computer and it is in bash.
I have a directory of files 0-500 in /directory/.
They are named as such,
/directory/filename_001, /directory/filename_002, and so forth.
After running my analysis for my research, I have a listofnumbers.txt (txt file, with each row being a new number) of the numbers that I am interested in. For example,
015
124
187
345
412
A) How do I run our command on the files whose numbers appear in the list? The command looks like this:
g09slurm filename_001.com filename_001.log
Is there a way to write something like:
find value (row1 of listofnumbers.txt) then g09slurm filename_row1value.com filename_row1value.log
find value (row2 of listofnumbers.txt) then g09slurm filename_row2value.com filename_row2value.log
find value (row3 of listofnumbers.txt) then g09slurm filename_row3value.com filename_row3value.log
etc etc
B) How do I move the selected files from the list to a new directory, so I can rename them sequentially and then run the command over the sequential numbers?
Thanks.
First, read the list of files into an array:
readarray -t myarray < /path/to/listofnumbers.txt   # -t strips the trailing newline from each entry
Next, we'll get all the filenames based on those numbers, and move them
cd /path/to/directory
mv -t /path/to/new_directory "${myarray[@]/#/filename_}"
After this... honestly, I got bored. Stack Overflow is about helping people who make a good start at a problem, and you've done zero work toward figuring this out (other than writing "I promise I tried google").
I don't even understand what
Run a command from the list of files the files from the list of numbers
means.
To rename them sequentially (once you've moved them), you'll want to do something based on this code:
for i in *; do
    # your stuff here, using "$i"
done
You should be able to research and figure the rest out. You might have to work through some bash tutorials; here's a reasonable starting place.
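For part A, a minimal sketch along these lines should work once you know where the files live (it assumes listofnumbers.txt contains one zero-padded number per line and that g09slurm is on your PATH):
while IFS= read -r num; do
    # e.g. num=015 runs: g09slurm filename_015.com filename_015.log
    g09slurm "filename_${num}.com" "filename_${num}.log"
done < listofnumbers.txt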

Analyzing a hacked website log: is there any link between ./etc and ./tmp when uploading files?

So one of my friends got hacked and asked me for help.
This guy had a single shared host with about 20 websites uploaded onto it; one of them got hacked, and now all of them are full of shells.
This happened today, so I asked for a list of the files that have been created or changed on the server since then, and I got something like the list below.
The shell's name is accesson.php. It first appears in
./etc/assets/images/accesson.php
and then in
./tmp/assets/images/accesson.php
log :
./etc
./etc/newsss.com
./etc/shsh.com
./etc/nck.com
./etc/pvf.com
./etc/pvf.com/#pwcache/info
./etc/lck.com
./etc/assets
./etc/assets/images
./etc/assets/images/accesson.php
./tmp
./tmp/assets
./tmp/assets/images
./tmp/assets/images/accesson.php
./mail
./mail/new
./mail/tmp
./mail/assets
./mail/assets/images
./mail/assets/images/accesson.php
./public_html
./public_html/images/2017/01/162210999.jpg.CROP_.cq5dam_web_1280_1280_jpeg-280x200.jpg
./public_html/wp-content/plugins
./public_html/.ftpquota
./public_html/backlinks/error_log
./public_html/app/images
./public_html/app/images/css_sprites.png
./public_html/app/index.php
./public_html/assets
./public_html/assets/images
./public_html/assets/images/accesson.php
./public_ftp
./public_ftp/assets
./public_ftp/assets/images
./public_ftp/assets/images/accesson.php
./irso.com
./irso.com/.ftpquota
./irso.com/wp-content/plugins
./irso.com/wp-content/themes
./irso.com/assets
./irso.com/assets/images
./irso.com/assets/images/accesson.php
./ncl2.com
./ncl2.com/assets
./ncl2.com/assets/images
./ncl2.com/assets/images/accesson.php
./cache
./cache/assets
./cache/assets/images
./cache/assets/images/accesson.php
./ssl
./ssl/assets
./ssl/assets/images
./ssl/assets/images/accesson.php
./efr.com
./efr.com/.ftpquota
./efr.com/assets
./efr.com/assets/images
./efr.com/assets/images/accesson.php
It seems accesson.php has spread through all the directories; the interesting thing is that it was created along with its /assets/images/ directory, even in ./tmp.
So I thought that since it appears in ./tmp it must have been uploaded through some bug, because all uploaded files first go to ./tmp. But what about that ./etc? It comes even before ./tmp, so I thought maybe there is a link between the two?
I'd appreciate any suggestions.
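One quick check that might help (a sketch, assuming GNU stat and that you run it from the account's home directory): list every copy of the dropped file together with its status-change time, which should show which directory it landed in first.
find . -name accesson.php -exec stat -c '%z  %n' {} + | sort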

Better way to avoid the use of ls inside a variable

Not sure how to title this correctly; please change it if you prefer.
Given that my code actually works, I'd like a peer review to improve its quality.
I have a folder full of .zip files. These files are daily offloads of data streams (identifiable by their stream name). There can be more than one file per stream per day, so I need to grab the latest one. I can't rely on the POSIX timestamp for this, so the files carry a timestamp in their name.
Filename example:
XX_XXYYZZ_XYZ_05_AB00C901_T001_20170808210052_20170808210631.zip
The last two fields are timestamps, and I'm interested in the second-to-last one; the other fields don't matter (for now).
I've previously stored the stream name (in this case XYZ_05_AB00C901_T001) in the variable $stream.
I have this line of code:
match=$(ls "$streamPath"/*.zip|grep "$stream"|rev|cut -d'_' -f2|rev|sort|tail -1)
What it does is search the given path for files matching the stream, cut out the timestamp and sort. Now that I know the latest timestamp for this stream, I can ls again, this time grepping for $stream and $match together, and I'm done:
streamFile=$(ls "$streamPath"/*.zip|grep "$stream.*$match\|$match.*$stream")
Question time:
Is there a better way to achieve my goal? Probably more than one; I'd prefer a one-liner solution, though.
ShellCheck advises me that it would be better to use a for or while loop instead of ls, to be able to handle unusual filenames (which I'm not facing at the moment, but who knows), but I'm not so sure about it (it seems more complicated to me).
Thanks.
O.
Thanks to the page suggested by Cyrus I chose to go with this solution:
echo "$file"|grep "$stream"|rev|cut -d'_' -f2|rev|sort|tail -1
done < <(find "$streamPath" -maxdepth 1 -type f -name '*.zip' -print0)
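An alternative sketch that avoids parsing ls altogether, under the same assumption that the stream name appears before the second-to-last timestamp field in each filename:
shopt -s nullglob
# newest second-to-last timestamp among this stream's files
match=$(printf '%s\n' "$streamPath"/*"$stream"*.zip | rev | cut -d'_' -f2 | rev | sort | tail -1)
# the file carrying that timestamp
streamFile=$(printf '%s\n' "$streamPath"/*"$stream"*"$match"*.zip)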

Smart shell script to search for text in file name with error handling

I am trying to build a shell script with some error handling, but I have limited knowledge. I am searching for any files with a given string, say tests, in the name; if this string is found I would like to report an error. I can use find . -name "*tests*" -print to list the files that contain the string, but how do I list the files and then report an error in the output if there are results, and pass if there are none?
Thanks
The manual page has this to say:
find exits with status 0 if all files are processed successfully, greater than 0 if errors occur. This is deliberately a very broad description, but if the return value is non-zero, you should not rely on the correctness of the results of find.
This is a bit of a conundrum, but if you can be reasonably sure that there will be no unrelated errors, you could do something like
find . -name '*tests*' -print -exec false \;
If you want the list of found files on standard error, add a redirection >&2
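If relying on find's exit status feels too fragile, a plainer sketch is to capture the output and test whether it is empty (the tests pattern is hard-coded here, as in the question):
matches=$(find . -name '*tests*')
if [ -n "$matches" ]; then
    # matches found: report them and fail
    printf 'ERROR: files matching *tests* were found:\n%s\n' "$matches" >&2
    exit 1
fi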

Using Log Parser to parse lots of logs in different folders

I recently started to use Log Parser with visual interface.
The logs that I want to parse come from IIS, and they are related to SharePoint. For example, I want to know how many people were visiting particular web pages, etc.
And it seems that IIS creates logs in different folders (I don't know why), and every day there is a new log file in a different folder.
So my question is: is it possible to reach all those files across the different folders?
I know you can list different folders in the FROM clause, but that gets unwieldy, especially if new folders are added in the future. The goal is to create one script which would just be executed.
So, for example, in a folder named LogFiles I have folders folder1, folder2, folder3, folder4, etc., and in each folder there are log files log1, log2, log3, ..., logN.
So my query should be something like SELECT * FROM path/LogFiles/*/*.log, but Log Parser doesn't accept it, so how can I achieve this?
You can use the -recurse option when calling logparser.
For example:
logparser file:"query.sql" -i:IISW3C -o:CSV -recurse
where query.sql contains:
select *
from .\Logs\*.log
and in my current directory, there is a directory called "Logs" that contains multiple sub-directories, each containing log files. Such as:
\Logs\server1\W3SVC1
\Logs\server1\W3SVC2
\Logs\server2\W3SVC1
\Logs\server2\W3SVC2
etc.
You can merge the logs and then query the merged log.
What I had to do is:
LogParser.exe -i:w3c "select * into E:\logs\merged.log from E:\logs\WEB35\*, E:\logs\WEB36\*, E:\logs\WEB37\*" -o:w3c
I prefer PowerShell, like this:
Select-String C:\Logs\diag\*.log -pattern "/sites/Very" | ?{$_.Line -match "Important"}
LogParser's help does not list the -recurse option, so I'm not sure if it's still supported. However, this is what I did to get around it:
Let's say you use the following command to execute logparser -
logparser "SELECT * INTO test.csv FROM 'D:\samplelog\test.log'" -i:COM -iProgID:Sample.LogParser.Scriptlet -o:CSV
Then simply create a batch script to "recurse" through the folder structure and parse all files in it. The batch script that does this looks like this -
echo off
for /r %%a in (*) do (
    for %%F in ("%%a") do (
        logparser "SELECT * INTO '%%~nxF'.csv FROM '%%a'" -i:COM -iProgID:Sample.LogParser.Scriptlet
        REM echo %%~nxF
    )
)
Execute it from the path where the log files that need to be parsed are located.
This can be customized further to send all parsed output to one file using the append (>>) operator.
Hope this helps.
Check out https://stackoverflow.com/a/31024196/4502867, which uses PowerShell to recursively get file items in sub-directories and parse them.
