Shell script: Merge files from within a date range [duplicate]

This question already has answers here:
Extract data from log file in specified range of time [duplicate]
(5 answers)
Closed 6 years ago.
I'd like to merge several log files within a given date range. For example, I have 5 days of log files in a directory:
server.log.2016-04-14-00
server.log.2016-04-14-01
. . .
server.log.2016-04-18-23
server.log.2016-04-19-00
server.log.2016-04-19-01
I know I can use cat to merge the files, but how can I code in a shell script so that only the files between, for example, 2016-04-17-22 and 2016-04-18-01 are selected?

The following script takes, as its first argument, a file that lists the log file names, one per line.
Two variables, from_date and to_date, define the range. They are hard-coded in the script, so you may want to turn them into arguments to make the script more flexible. Note that both bounds are exclusive.
#!/bin/bash
# Server's log file.
server_log_file=$1
# The date from which the relevant part of the log file should be printed.
from_date='2016/04/14 00:00'
# The date until which the relevant part of the log file should be printed.
to_date='2016/04/19 01:00'
# Uses 'date' to convert a date to seconds since epoch.
# Arguments: $1 - A date acceptable by the 'date' command. e.g. 2016/04/14 23:00
date_to_epoch_sec() {
local d=$1
printf '%s' "$(date --date="$d" '+%s')"
}
# Convert 'from' and 'to' dates to seconds since epoch.
from_date_sec=$(date_to_epoch_sec "$from_date")
to_date_sec=$(date_to_epoch_sec "$to_date" )
# Iterate over log file entries.
while IFS=. read -r s l date; do
# Read and parse the date part.
IFS=- read -r y m d h <<< "$date"
# Convert the date part to seconds since epoch.
date_sec=$(date_to_epoch_sec "$y/$m/$d $h:00")
# If the current date is within range, print the entire line as it was originally read.
if (( date_sec > from_date_sec && date_sec < to_date_sec )); then
printf '%s.%s.%s\n' "$s" "$l" "$date"
fi
done < "$server_log_file"
In order to test it I created the following file, named logfile:
server.log.2016-04-14-00
server.log.2016-04-14-01
server.log.2016-04-18-23
server.log.2016-04-19-00
server.log.2016-04-19-01
Usage example ( script name is sof ):
$ # Should print logs from 2016/04/14 00:00 to 2016/04/19 01:00
$ ./sof logfile
server.log.2016-04-14-01
server.log.2016-04-18-23
server.log.2016-04-19-00
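Because the timestamps in the file names are zero-padded and fixed-width, they sort lexicographically in chronological order, so the range check can also be done with plain string comparison on the names themselves, without calling date. A minimal sketch, assuming the server.log.YYYY-MM-DD-HH naming from the question; select_logs is a hypothetical helper name, and unlike the script above its bounds are inclusive:

```shell
# select_logs DIR FROM TO  (hypothetical helper, not part of the answer above)
# Prints the server.log.* files in DIR whose YYYY-MM-DD-HH suffix lies
# between FROM and TO, both inclusive.
select_logs() {
    local dir=$1 from=$2 to=$3 f stamp
    local LC_ALL=C                      # plain byte-wise string comparison
    for f in "$dir"/server.log.*; do
        [[ -e $f ]] || continue         # the glob matched nothing
        stamp=${f##*/server.log.}       # e.g. 2016-04-17-22
        # keep the file unless it sorts before FROM or after TO
        if [[ ! $stamp < $from && ! $stamp > $to ]]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Assuming the file names contain no whitespace, the selected files could then be merged with something like: select_logs . 2016-04-17-22 2016-04-18-01 | xargs cat > merged.log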

Related

Setting value of command prompt ( PS1) based on present directory string length

I know I can do this to reflect just last 2 directories in the PS1 value.
PS1=${PWD#"${PWD%/*/*}/"}#
But let's say we have a directory name that is really long and eats into my working space, like
T-Mob/2021-07-23--07-48-49_xperia-build-20191119010027#
OR
2021-07-23--07-48-49_nokia-build-20191119010027/T-Mob#
Those are the last 2 directories before the prompt.
I want to set a condition: if the name of either of the last 2 directories is longer than a threshold, e.g. 10 chars, shorten that name to its first 3 and last 3 chars.
e.g.
2021-07-23--07-48-49_xperia-build-20191119010027 &
2021-07-23--07-48-49_nokia-build-20191119010027
Both are longer than 10 chars and will be shortened to 202*027, and PS1 will be, respectively,
T-Mob/202*027/# for T-Mob/2021-07-23--07-48-49_xperia-build-20191119010027# and
202*027/T-Mob# for 2021-07-23--07-48-49_nokia-build-20191119010027/T-Mob#
Is there a quick one-liner to get this done?
I can't post this in the comments, so I'm updating here. Referring to Joaquin's answer (thanks, J):
PS1=''`echo ${PWD#"${PWD%/*/*}/"} | awk -v RS='/' 'length() <=10{printf $0"/"}; length()>10{printf "%s*%s/", substr($0,1,3), substr($0,length()-2,3)};'| tr -d "\n"; echo "#"`''
See the output below:
/root/my-applications/bin # it shortened as expected
my-*ons/bin/#cd - # going back to prev.
/root
my-*ons/bin/# #value of prompt is the same but I am in /root

A one-liner is basically always the wrong choice. Write code to be robust, readable and maintainable (and, for something that's called frequently or in a tight loop, to be efficient) -- not to be terse.
Assuming availability of bash 4.3 or newer:
# Given a string, a separator, and a max length, shorten any segments that are
# longer than the max length.
shortenLongSegments() {
local -n destVar=$1; shift # arg1: where do we write our result?
local maxLength=$1; shift # arg2: what's the maximum length?
local IFS=$1; shift # arg3: what character do we split into segments on?
read -r -a allSegments <<<"$1"; shift # arg4: break into an array
for segmentIdx in "${!allSegments[@]}"; do # iterate over array indices
segment=${allSegments[$segmentIdx]} # look up value for index
if (( ${#segment} > maxLength )); then # value over maxLength chars?
segment="${segment:0:3}*${segment:${#segment}-3:3}" # build a short version
allSegments[$segmentIdx]=$segment # store shortened version in array
fi
done
printf -v destVar '%s' "${allSegments[*]}" # build result string from array (no trailing newline)
}
# function to call from PROMPT_COMMAND to actually build a new PS1
buildNewPs1() {
# declare our locals to avoid polluting global namespace
local shorterPath
# but to cache where we last ran, we need a global; be explicit.
declare -g buildNewPs1_lastDir
# do nothing if the directory hasn't changed
[[ $PWD = "$buildNewPs1_lastDir" ]] && return 0
shortenLongSegments shorterPath 10 / "$PWD"
PS1="${shorterPath}\$"
# update the global tracking where we last ran this code
buildNewPs1_lastDir=$PWD
}
PROMPT_COMMAND=buildNewPs1 # call buildNewPs1 before rendering the prompt
Note that printf -v destVar %s "valueToStore" is used to write to variables in-place, to avoid the performance overhead of var=$(someFunction). Similarly, we're using the bash 4.3 feature namerefs -- accessible with local -n or declare -n -- to allow destination variable names to be parameterized without the security risk of eval.
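To see the two mechanisms from the note above in isolation, here is a minimal sketch, assuming bash 4.3 or newer. set_greeting and result are hypothetical names used only for illustration:

```shell
# set_greeting DESTVAR NAME  (hypothetical example function)
# Stores "hello, NAME" in the caller's variable DESTVAR without a subshell.
set_greeting() {
    local -n dest=$1                    # dest is now an alias for the caller's variable
    printf -v dest 'hello, %s' "$2"     # write through the alias; nothing is printed
}

set_greeting result world
echo "$result"
```

One caveat with this pattern: the caller's variable should not have the same name as the nameref itself (dest here), or bash reports a circular name reference.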
If you really want to make this logic only apply to the last two directory names (though I don't see why that would be better than applying it to all of them), you can do that easily enough:
buildNewPs1() {
local pathPrefix pathSuffix pathSuffixShortened
pathPrefix=${PWD%/*/*} # everything but the last 2 segments
pathSuffix=${PWD#"$pathPrefix"} # only the last 2 segments
# shorten the last 2 segments, store in a separate variable
shortenLongSegments pathSuffixShortened 10 / "$pathSuffix"
# combine the unshortened prefix with the shortened suffix
PS1="${pathPrefix}${pathSuffixShortened}\$"
}
...adding the performance optimization that only rebuilds PS1 when the directory changed to this version is left as an exercise to the reader.
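For checking the shortening rule against the examples in the question, a stand-alone variant that prints its result instead of writing to a named variable may be handy. This is only a sketch: shorten_path is a hypothetical name, it shortens every segment rather than just the last two, and it does not special-case the root directory:

```shell
# shorten_path PATH [MAX]  (hypothetical helper)
# Prints PATH with every segment longer than MAX chars (default 10)
# replaced by its first 3 chars, a '*', and its last 3 chars.
shorten_path() {
    local path=$1 max=${2:-10} seg out=
    local -a segs
    IFS=/ read -r -a segs <<< "$path"   # split on '/' for this read only
    for seg in "${segs[@]}"; do
        if (( ${#seg} > max )); then
            seg="${seg:0:3}*${seg: -3}" # first 3 + '*' + last 3
        fi
        out+="$seg/"
    done
    printf '%s\n' "${out%/}"            # drop the trailing '/'
}

shorten_path "T-Mob/2021-07-23--07-48-49_xperia-build-20191119010027"
# prints T-Mob/202*027
```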

Probably not the best solution, but a quick solution using awk:
PS1=`echo ${PWD#"${PWD%/*/*}/"} | awk -v RS='/' 'length()<=10{printf $0"/"}; length()>10{printf "%s*%s/", substr($0,1,3), substr($0,length()-2,3)};'| tr -d "\n"; echo "#"`
I got these results with your examples:
T-Mob/202*027/#
202*027/T-Mob/#

BASH: trying to read in file but it only gets the first line

I'm trying to read a file in bash and store its values in variables for later use. The format of the files is as follows:
name abbreviation
price quantity maxQuantity
itemDescription
But when I actually read the file, it seems to store only the first line in every variable, and I was wondering where it is going wrong:
if [ -r data/$fileName.file ]; then
read name abbreviation < data/$fileName.file
read price quantity maxQuantity < data/$fileName.file
read itemDescription < data/$fileName.file
fi
When I try to echo the price or quantity, it echoes the name and the abbreviation.

read reads the first line. When you do another redirection, it's not tied to the previous one, so it again reads the first line. You need to use one redirection for all the reads:
if [ -r data/$fileName.file ]; then
{
read name abbreviation
read price quantity maxQuantity
read itemDescription
} < data/$fileName.file
fi
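The same idea, wrapped in a function and with a few defensive touches, might look like the sketch below. load_item is a hypothetical name; the variables are deliberately left global so the caller can use them afterwards. read -r stops backslashes from being interpreted, and IFS= on the last read keeps the description's leading and trailing whitespace:

```shell
# load_item FILE  (hypothetical helper)
# Reads the three-line item format from FILE into global variables.
load_item() {
    local file=$1
    [[ -r $file ]] || return 1          # bail out if the file is unreadable
    {
        read -r name abbreviation
        read -r price quantity maxQuantity
        IFS= read -r itemDescription
    } < "$file"                         # one redirection shared by all three reads
}
```

It has to be called in the current shell (not in a pipeline or command substitution), otherwise the variables are set in a subshell and lost.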

How to subtract 2 timestamps using shell scripts

I need to subtract the below 2 times using a shell script
var1=2019-11-14-03.00.02.000000
var2=2019-11-14-03.00.50.000000
The output should be 00-00-00-00.00.48.00000

First convert var1 and var2 to date in seconds (since epoch) with:
sec1=$(date --date $var1 +%s)
...
Use bash math operators to calculate the difference
delta=$((sec1 - sec2))
Finally convert it back to a readable format
date -u --date @"$delta" +%T # prints HH:MM:SS; only meaningful for differences under 24 hours

As mentioned in my comment, var1 and var2 are not in a format that GNU date accepts with the -d option. Before you can convert the times to seconds since epoch, you must:
remove the '-' between the date and time portions of each variable,
isolate the time alone,
remove the milliseconds
replace all '.' in the time with ':'
restore the milliseconds separated from the time with '.'
pass the reformatted string for each to date -d"$dt $tm" +%s with the reformatted date and time space-separated.
Bash provides parameter substitutions to handle each very easily.
After computing the time since epoch for both and taking the difference, you have to compute the difference in years, months, days, hours, minutes, seconds and milliseconds yourself. To output it, use printf with the integer conversion specifier, together with the appropriate field-width and leading-zero modifiers.
Putting it all together (and using a 365-day/year and 30-day/mo approximation), you could do:
#!/bin/bash
var1=2019-11-14-03.00.02.000000
var2=2019-11-14-03.00.50.000000
dt1=${var1%-*} ## isolate date portion of variables
dt2=${var2%-*}
tm1=${var1##*-} ## isolate time portion of variables
tm2=${var2##*-}
ms1=${tm1##*.} ## isolate milliseconds portion of variables
ms2=${tm2##*.}
tm1=${tm1%.*} ## remove milliseconds from time
tm2=${tm2%.*}
tm1=${tm1//./:} ## substitute all . with : in times w/o milliseconds
tm2=${tm2//./:}
tm1=${tm1}.$ms1 ## restore milliseconds
tm2=${tm2}.$ms2
epoch1=$(date -d"$dt1 $tm1" +%s) ## get time since epoch for both
epoch2=$(date -d"$dt2 $tm2" +%s)
epochdiff=$((epoch2-epoch1)) ## get difference in epoch times
## Approximates w/o month or leap year considerations
y=$((epochdiff/(3600*24*365))) ## years difference
rem=$((epochdiff%(3600*24*365))) ## remaining seconds
m=$((rem/(3600*24*30))) ## months difference (based on 30 day mo)
rem=$((rem%(3600*24*30))) ## remaining seconds
d=$((rem/(3600*24))) ## days difference
rem=$((rem%(3600*24))) ## remaining seconds
H=$((rem/3600)) ## hours difference
rem=$((rem%3600)) ## remaining seconds
M=$((rem/60)) ## minutes difference
S=$((rem%60)) ## seconds difference
ms=$((10#$ms2-10#$ms1)) ## millisecond difference (10# avoids octal parsing of leading zeros)
## format output
printf "%04d-%02d-%02d-%02d:%02d:%02d.%04d\n" $y $m $d $H $M $S $ms
(note: you can further fine-tune the month and year/leap-year calculations -- that is left to you)
Example Use/Output
$ bash ~/scr/dtm/fulltimediff.sh
0000-00-00-00:00:48.0000
Look things over and let me know if you have further questions.
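For differences that only need days, hours, minutes and seconds, the conversion and the formatting can be split into two small functions. This is a sketch under several assumptions: GNU date is available, the timestamps are treated as UTC (the -u flag), and the fractional part is simply dropped. ts_to_epoch and fmt_delta are hypothetical names, and the output layout differs from the one requested in the question:

```shell
# ts_to_epoch TIMESTAMP  (hypothetical helper; needs GNU date)
# Converts YYYY-MM-DD-HH.MM.SS.ffffff to seconds since epoch, ignoring the fraction.
ts_to_epoch() {
    local ts=$1 d t
    d=${ts%-*}                          # date part, e.g. 2019-11-14
    t=${ts##*-}                         # time part, e.g. 03.00.02.000000
    t=${t%.*}                           # drop the fractional seconds
    t=${t//./:}                         # 03.00.02 -> 03:00:02
    date -u -d "$d $t" +%s
}

# fmt_delta SECONDS  (hypothetical helper)
# Formats a non-negative number of seconds as DD-HH:MM:SS.
fmt_delta() {
    local s=$1
    printf '%02d-%02d:%02d:%02d\n' $((s/86400)) $((s%86400/3600)) $((s%3600/60)) $((s%60))
}

a=$(ts_to_epoch 2019-11-14-03.00.02.000000)
b=$(ts_to_epoch 2019-11-14-03.00.50.000000)
fmt_delta $((b - a))
# prints 00-00:00:48
```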

How can you calculate the time span between two time entries in a file using a shell script?

In a Linux script: I have a file that has two time entries for each message within the file, a 'received time' and a 'source time'. There are hundreds of messages within the file.
I want to calculate the elapsed time between the two times.
2014-07-16T18:40:48Z (received time)
2014-07-16T18:38:27Z (source time)
The source time is 3 lines after the received time, not that it matters.
Info on the input data:
The input has lines as follows:
TimeStamp: 2014-07-16T18:40:48Z
Two lines later comes a single line holding a bunch of messages, and within that line there are multiple time entries of the form:
sourceTimeStamp="2014-07-16T18:38:27Z"

If you have GNU's date (not busybox's), you can get the difference in seconds with:
#!/bin/bash
A=$(date -d '2014-07-16T18:40:48Z' '+%s')
B=$(date -d '2014-07-16T18:38:27Z' '+%s')
echo "$(( A - B )) seconds"
For busybox's date and ash (probably any modern version, e.g. BusyBox v1.21.0):
#!/bin/ash
A=$(busybox date -d '2014-07-16 18:40:48' '+%s')
B=$(busybox date -d '2014-07-16 18:38:27' '+%s')
echo "$(( A - B )) seconds"

You should be able to use date like this, for example:
date +%s --date="2014-07-16T18:40:48Z"
to convert both timestamps into a unix timestamp. Getting the time difference between them is then reduced to a simple subtraction.
Does this help?

I would use awk. The following script searches for the lines of interest, converts the time value into a UNIX timestamp and saves them in the start, end variables. At the end of the script the difference will get calculated and printed:
timediff.awk:
/received time/ {
"date -d "$1" +%s" | getline end
}
/source time/ {
"date -d "$1" +%s" | getline start
exit
}
END {
printf "%s seconds in between\n", end - start
}
Execute it like this:
awk -f timediff.awk log.file
Output:
141 seconds in between
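Since the file holds hundreds of messages, a loop that pairs each received time with the next source time may be closer to what is needed. The sketch below works on the "TimeStamp: ..." and sourceTimeStamp="..." layout described in the question. It assumes GNU date and uses only the first sourceTimeStamp on a line; elapsed_per_message is a hypothetical name:

```shell
# elapsed_per_message FILE  (hypothetical helper; needs GNU date)
# For each "TimeStamp: <received>" line followed by a line containing
# sourceTimeStamp="<source>", prints received - source in seconds.
elapsed_per_message() {
    local line recv='' src
    local recv_re='^[[:space:]]*TimeStamp:[[:space:]]*([0-9TZ:-]+)'
    local src_re='sourceTimeStamp="([0-9TZ:-]+)"'
    while IFS= read -r line; do
        if [[ $line =~ $recv_re ]]; then
            recv=${BASH_REMATCH[1]}     # remember the received time
        elif [[ -n $recv && $line =~ $src_re ]]; then
            src=${BASH_REMATCH[1]}      # first source time on this line
            echo "$(( $(date -d "$recv" +%s) - $(date -d "$src" +%s) ))"
            recv=''                     # wait for the next received time
        fi
    done < "$1"
}
```

Calling date twice per message is slow for very large files; for those, an awk-based approach like the one above is likely a better fit.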

Bash merge columned files into one file with rows

I have many data files in this format:
-1597.5421
-1909.6982
-1991.8743
-2033.5744
But I would like to merge them all into one data file with each original data file taking up one row with spaces in between so I can import it in excel.
-1597.5421 -1909.6982 -1991.8743 -2033.5744
-1789.3324 -1234.5678 -9876.5433 -9999.4321
And so on. Each file is named ALL.ene and every directory in my working directory contains it. Can someone give me a quick fix? Thanks!
Edit: Each file has 11 entries. Those were just examples.

for i in */ALL.ene
do
echo $(<"$i")
done > result.txt

Assumptions:
I assume all your data files are of this format:
<something1><newline>
<something2><newline>
<something3><newline>
So for example, if the last newline is missing, the following script will miss the field corresponding to <something3>.
Usage: ./merge.bash -o <output file> <input file list or glob>
The script appends to any existing output file from previous runs. It also makes no assumptions about how many fields of data each input file has. It blindly puts every line of an input file into one line of the output file, separated by spaces.
#!/bin/bash
# set -o xtrace # uncomment to debug
declare output
[[ $1 =~ -o$ ]] && output="$2" && shift 2 || { \
echo "The first argument should always be -o <output>";
exit -1; }
declare -a files=("${@}") row
for file in "${files[@]}";
do
while read data; do
row+=("$data")
done < "$file"
echo "${row[@]}" >> "$output"
row=()
done
Example:
$ cat data1
-1597.5421
-1909.6982
-1991.8743
-2033.5744
$ cat data2
-1789.3324
-1234.5678
-9876.5433
-9999.4321
$ ./merge.bash -o test data{1,2}
$ cat test
-1597.5421 -1909.6982 -1991.8743 -2033.5744
-1789.3324 -1234.5678 -9876.5433 -9999.4321

This is what coreutils paste is good at. With -s, each input file becomes one row (tab-separated by default); try:
paste -s data_files*
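Applied to the layout from the question (one ALL.ene per directory) and with spaces instead of tabs, this could be sketched as follows. merge_columns is a hypothetical wrapper name; -d ' ' sets the delimiter to a single space:

```shell
# merge_columns FILE...  (hypothetical wrapper around paste)
# Prints one space-separated row per input file.
merge_columns() {
    paste -s -d ' ' "$@"
}
```

Usage from the working directory would then be: merge_columns */ALL.ene > result.txt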
