list files from directory between given hours using linux command

There are a few errors we need to mitigate, which can be done only once I am able to find the list of files that got created between two given hours.
My file naming pattern is
App_ErrorFile1401_01_11_YYYYMMDDHHMMSS_1234_123456.csv.gz
I need to find the files created between 9AM and 11AM yesterday. Having found the list of files, I will then FTP those files to a given server IP.
The FTP part is something we can do easily, but I am not able to find any pattern with which I can select only those two hours' worth of files for FTP. I don't have much idea of grep with regex patterns, and after going through the net for more than an hour I am yet to figure out how to build my statement.
I can use find/grep. I am on RHEL.
Help would be greatly appreciated.

You can try the following code snippet if you can use bash:
#!/bin/bash
# define the time window: yesterday 09:00:00 to 11:00:00
yesterday=$(date --date="yesterday" +"%Y%m%d")
start="${yesterday}090000"
stop="${yesterday}110000"
find . -name "App_ErrorFile*.csv.gz" | \
while read -r file; do
    # split the name on underscores; field 4 is the YYYYMMDDHHMMSS timestamp
    IFS=_ read -r -a arr <<< "$file"
    timestamp="${arr[4]}"
    if [[ $timestamp -ge $start && $timestamp -le $stop ]]; then
        echo "$file"
    fi
done
You have to start it from the directory in which the files are located.
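Alternatively, if the files' modification times happen to match the timestamps embedded in their names, GNU find on RHEL can select the window directly by mtime; a minimal sketch, assuming that correspondence holds:
# mtime-based selection: only valid if mtime matches the timestamp in the name
find . -name 'App_ErrorFile*.csv.gz' \
    -newermt 'yesterday 09:00' ! -newermt 'yesterday 11:00'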

Related

Change folder structure and make files available by using links

(Please see also the minimal example at bottom)
I have the following folder structure:
https://drive.google.com/open?id=1an6x1IRtNjOG6d9D5FlYwsUxmgaEegUY
Each folder path ends with the year, month and day:
e.g.
/HRIT/EPI/2004/01/14
For each day, each of these sub-folders contains a lot of files (about 10 TB in total).
All available folders are:
/HRIT/EPI/YYYY/MM/DD/
/HRIT/HRV/YYYY/MM/DD/
/HRIT/VIS006/YYYY/MM/DD/
/HRIT/VIS008/YYYY/MM/DD/
/HRIT/WV_062/YYYY/MM/DD/
/HRIT/WV_073/YYYY/MM/DD/
/HRIT/IR_016/YYYY/MM/DD/
/HRIT/IR_039/YYYY/MM/DD/
/HRIT/IR_087/YYYY/MM/DD/
/HRIT/IR_097/YYYY/MM/DD/
/HRIT/IR_108/YYYY/MM/DD/
/HRIT/IR_120/YYYY/MM/DD/
/HRIT/IR_134/YYYY/MM/DD/
/HRIT/PRO/YYYY/MM/DD/
I would like to change the folder structure to:
/HRIT/YYYY/MM/DD/
All files from EPI, HRV, VIS006, VIS008, WV_062, WV_073, IR_016, IR_039, IR_087, IR_097, IR_108, IR_120, IR_134 and PRO should not be copied physically but should be accessible through links in my home folder /home
-> /home/HRIT/YYYY/MM/DD/
Is something like that possible and how?
Here is a minimal example of what I have in principle and what I wish as the result:
I have:
/HRIT/EPI/2019/01/01/epi_1.txt
/HRIT/EPI/2019/01/01/epi_2.txt
/HRIT/HRV/2019/01/01/hrv_1.txt
/HRIT/HRV/2019/01/01/hrv_2.txt
/HRIT/VIS006/2019/01/01/vis006_1.txt
/HRIT/VIS006/2019/01/01/vis006_2.txt
As result I wish:
/home/HRIT/2019/01/01/epi_1.txt
/home/HRIT/2019/01/01/epi_2.txt
/home/HRIT/2019/01/01/hrv_1.txt
/home/HRIT/2019/01/01/hrv_2.txt
/home/HRIT/2019/01/01/vis006_1.txt
/home/HRIT/2019/01/01/vis006_2.txt
As mentioned above, the files should not be copied to this new folder structure, but instead made accessible by links (because in reality I have too many files and not enough space to copy them).
!!! This is extremely simplified, since I have different years, months and days (please see the link above).
Here's a small script to do this. I've made it just echo by default; you'd need to run it with --real to make it actually do the linking.
This assumes you're running the script from within the /HRIT/ dir, and that the date and then the filename are the last parts of the hierarchy. If there are further dirs below, this might not work for you.
#!/bin/bash
DRYRUN=true
LINK="ln -s"   # You can make this "ln" if you prefer hardlinks

if [[ $1 == '--real' ]]; then
    DRYRUN=false
fi

# NUL-delimited read so filenames with whitespace survive
find "$PWD" -type f -print0 | while read -d '' -r file; do
    # rebuild the destination from the last four path components: year/month/day/filename
    dest=$(echo "$file" | awk -F/ '{ print "/home/HRIT/" $(NF-3) "/" $(NF-2) "/" $(NF-1) "/" $NF }')
    if [[ -e "$dest" ]]; then
        echo "Skipping '$file' as destination link already exists at '$dest'"
    else
        if $DRYRUN; then
            echo mkdir -p "$(dirname "$dest")"
            echo $LINK "$file" "$dest"
        else
            mkdir -p "$(dirname "$dest")"
            $LINK "$file" "$dest"
        fi
    fi
done
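Usage would look something like this (the script name relink.sh is just an assumption here):
cd /HRIT
./relink.sh          # dry run: prints the mkdir and ln commands
./relink.sh --real   # actually creates the directories and links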
The simplest while read loop:
generate_list_of_files_for_example_with_find |
while IFS= read -r line; do
    IFS='/' read -r _ hrit _ rest <<<"$line"
    echo mkdir -p /home/"$hrit"/"$(dirname "$rest")"
    # link target first, link name second
    echo ln -s "$line" /home/"$hrit/$rest"
done
Remove the echos to actually create the links and directories.
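As a concrete stand-in for the generator, assuming the files sit exactly five levels below the root, as in /HRIT/EPI/YYYY/MM/DD/file:
find /HRIT -mindepth 5 -maxdepth 5 -type f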

extracting files that don't have a dir with the same name

Sorry for the odd title; I didn't know how to word it the right way.
I'm trying to write a script to split my wiki files into those that have directories with the same name and those that don't. I'll elaborate further.
Here is my file system:
What I need to do is print a list of those files which have directories matching their name, and another list of those without.
So my ultimate goal is getting:
with dirs:
Docs
Eng
Python
RHEL
To_do_list
articals
without dirs:
orphan.txt
orphan2.txt
orphan3.txt
I managed to get the files with dirs. Here is my code:
getname () {
    file=$( basename "$1" )
    file2=${file%%.*}
    echo "$file2"
}

for d in Mywiki/* ; do
    if [[ -f $d ]]; then
        file=$(getname "$d")
        for x in Mywiki/* ; do
            dir=$(getname "$x")
            if [[ -d $x && $dir == "$file" ]]; then
                echo "$dir"
            fi
        done
    fi
done
But I'm stuck on getting the ones without. If this is the wrong way of doing it, please point out the right one.
Any help appreciated. Thanks.
Here's a quick attempt.
for file in Mywiki/*.txt; do
    nodir=${file##*/}
    test -d "${file%.txt}" && printf "%s\n" "$nodir" >&3 || printf "%s\n" "$nodir"
done >without 3>with
This shamelessly uses standard output for the orphans. Maybe more robustly, open another separate file descriptor for that.
Also notice how everything needs to be quoted, unless you specifically require the shell to do whitespace tokenization and wildcard expansion on the value of a token. Here's the scoop on that.
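That more robust variant could look like this sketch, dedicating fds 3 and 4 so standard output stays free for diagnostics:
exec 3>with 4>without
for file in Mywiki/*.txt; do
    nodir=${file##*/}
    if test -d "${file%.txt}"; then
        printf '%s\n' "$nodir" >&3    # has a matching directory
    else
        printf '%s\n' "$nodir" >&4    # orphan
    fi
done
exec 3>&- 4>&-    # close the descriptors again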
It may not be the most efficient way of doing it, but you could take all the files, remove the extension, and then check whether there is no directory with that name.
Like this (untested code, reusing the getname function from the question):
for file in Mywiki/* ; do
    if [ -f "$file" ]; then
        dirname=$(getname "$file")
        if [ ! -d "Mywiki/$dirname" ]; then
            echo "$file"
        fi
    fi
done
To list all the files in the current dir:
list1=$(ls -p | grep -v /)
To list all the files in the current dir without extension:
list2=$(ls -p | grep -v / | sed 's/\.[^.]*$//')
To list all the directories in the current dir:
list3=$(ls -d */ | sed -e "s/\///g")
Now you can get the desired listing from the intersection of list2 and list3; see Intersection of two lists in Bash.
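The intersection itself can then be done with comm, assuming one name per line and no newlines inside names; a sketch:
# names present in both lists = files that do have a matching directory
comm -12 <(echo "$list2" | sort) <(echo "$list3" | sort)
# names only in list2 = files without a matching directory
comm -23 <(echo "$list2" | sort) <(echo "$list3" | sort)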

How do I search for a file based on what is output by a command running on that file

I am working on a project for one of my professors and he asked me to sort a couple hundred .fits images based on their headers (specifically, what star they are images of). I think that grep would be the best way to do this; however, I can't seem to figure out how to use grep based on the header.
I am entering:
ls | imhead *.fits | grep -E -r "PG\ 1104+243" *
to just list them out for now; once they are listed, I know how to copy them into a directory.
I am new to using grep, so I am unsure as to where my error lies. Any help would be greatly appreciated! Thanks!
Assuming that imhead will extract the headers of the .fits files as text, you can use a simple shell script to do it:
script.sh
#!/bin/bash
grep "$1" "$2" > /dev/null 2>&1 && echo "$2"
Note that the + is a special character if you use extended regular expressions, i.e. if you pass -E as in the question. A plain grep without any options should do the trick here.
Use find to exec the script on every *.fits file in the current folder:
find . -maxdepth 1 -name '*.fits' -exec ./script.sh 'PG 1104+243' {} \;
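If you do want to keep -E, escape the +. In fact, since FITS headers are plain ASCII inside the file, grep -l alone can produce the list in one go; a sketch:
find . -maxdepth 1 -name '*.fits' -exec grep -lE 'PG 1104\+243' {} +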
If you are going to copy/move/alter or do something with the files you find, you might be better off, in terms of complexity and ease of quoting, using a loop like this:
#!/bin/bash
find . -name \*.fits -print0 | while read -d '' -r file; do
    echo "Checking file: $file"
    imhead "$file" | grep -q 'PG 1104+243'
    if [ $? -eq 0 ]; then
        echo "Object matches: $file"
    fi
done
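Since copying the matches is said to be the easy part, the loop above extends naturally; a sketch, where the destination directory name is just an assumption:
#!/bin/bash
dest="PG_1104+243"    # hypothetical per-object directory
mkdir -p "$dest"
find . -name \*.fits -print0 | while read -d '' -r file; do
    if imhead "$file" | grep -q 'PG 1104+243'; then
        cp "$file" "$dest/"
    fi
done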

bash scripts list files in a directory

I'm writing a script that takes a directory as an argument.
I want to build a list/array of all the files in that directory that have a certain extension, and cut their extensions off.
For example, if I have a directory containing:
aaa.xx
bbb.yy
ccc.xx
and I'm searching for *.xx,
my list/array would be: aaa ccc.
I'm trying to use the code from the accepted answer in this thread.
set tests_list=[]
for f in $1/*.bpt
do
echo $f
if [[ ! -f "$f" ]]
then
continue
fi
set tmp=echo $f | cut -d"." -f1
#echo $tmp
tests_list+=$tmp
done
echo ${tests_list[@]}
If I run this script, the loop only executes once, with $f set to tests_list=[]/*.bpt, which is weird since $f should be a file name in that directory, and the echo prints an empty string.
I have validated that I'm in the correct directory and that the argument directory has files with the .bpt extension.
This should work for you:
for file in *.xx ; do echo "${file%.*}" ; done
To expand this to a script that takes an argument as a directory:
#!/bin/bash
dir="$1"
ext='xx'
for file in "$dir"/*."$ext"
do
echo "${file%.*}"
done
edit: switched ls with for - thanks @tripleee for the correction.
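To actually collect the names into a bash array, as the question asks, rather than just echo them, a sketch along the same lines:
#!/bin/bash
dir="$1"
ext='xx'
tests_list=()
for file in "$dir"/*."$ext"; do
    [[ -f $file ]] || continue      # skip if the glob matched nothing
    base=${file##*/}                # strip the directory part
    tests_list+=( "${base%.*}" )    # strip the extension
done
echo "${tests_list[@]}"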
filear=($(find path/ -name "*.xx"))
filears=()
for f in "${filear[@]}"; do filears[${#filears[@]}]=${f%.*}; done

preventing scripts from stopping on errors

My issue is likely a rather simple one; I just haven't found a satisfactory answer anywhere I've looked so far.
I have a script which runs a program/script; when it encounters an error, it just hangs and will not continue.
What is the best method to fix this?
To clarify: usually there is no output when it hangs, but sometimes there is an error message, depending on which script I am using to work on my data set.
Example script:
#!/bin/bash
# iterate over a NUL-delimited stream of directory names
while IFS='' read -r -d '' dirname; do
    # ...then list files in each directory:
    for file in "$dirname"/*; do
        # ignore directory contents that are not files
        [[ -f $file ]] || continue
        # run analysis tool
        if [[ $file == *.dmp ]]; then
            echo "$dirname"
            # with packet summary line, packet details expanded, packet bytes
            tshark -PVx -r "$file" >> "$dirname/TEXT_out.txt"
            #ls "$dirname"
            echo "complete"
            continue
        fi
    done
done < <(find . -type d -print0)
In a nutshell, you are asking how to handle a child process that hangs, from the parent's perspective. If that is the basic issue you are facing, then you can check this link.
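One concrete technique is coreutils timeout(1), which kills the child if it runs too long; a sketch applied to the tshark call from the script above (the 300-second limit is an arbitrary assumption):
# kill tshark if a single capture file takes more than 5 minutes
if ! timeout 300 tshark -PVx -r "$file" >> "$dirname/TEXT_out.txt"; then
    echo "tshark failed or timed out on: $file" >&2
fi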
HTH!
