How can I simplify this script? - linux

Can you help me simplify this script?
This works, but I think there is an easier way to do it that I can't find.
The file:
Car Brand:Mercedes | Country:Germany | Car Model:300 SL | Year:04-1960
Car Brand:Lamborghini | Country:Italy | Car Model:Miura | Year:10-1970
Car Brand:Aston Martin | Country:UK | Car Model:DBS | Year:12-1965
Car Brand:Ford | Country:United States of America | Car Model:GT40 | Year:09-1966
Output:
1:Mercedes:Germany:300 SL:61:xxx
2:Lamborghini:Italy:Miura:51:xxx
3:Aston Martin:UK:DBS:56:xxx
4:Ford:United States of America:GT40:55:xxx
1, 2, 3, 4 are the line numbers; 61, 51, 56, 55 are the ages (current year - year, ignoring the month); xxx is the insurance company (always the same; this part stopped working).
Script:
line=$(awk '{print NR}' file.txt)
brand=$(sed 's/.*Brand:\(.*\) | Country.*/\1/' file.txt)
country=$(sed 's/.*Country:\(.*\) | Year.*/\1/' file.txt)
sed 's/.*Year:\(.*\) | Car.*/\1/; s/^...//' file.txt > cars.txt
age=$(awk -v age="$(date +%Y)" '{print age - $1}' cars.txt)
model=$(sed 's/.*Model:\(.*\)*/\1/' file.txt)
echo "$(paste <(echo "$line") <(echo "$brand") <(echo "$country") <(echo "$age") <(echo "$model") -d ':')" > cars.txt
# sed -i 's/$/:xxx/' cars.txt
cat cars.txt
Thank you

Assuming there is no dash - elsewhere apart from last item, you can do :
awk -v year="$(date +%Y)" -F '(-|:| \\| )' '{print NR":"$2":"$4":"$6":"(year-$9)":xxx"}' file.txt
-F takes a regular expression with three alternative field separators: -, : and " | " (a space, followed by |, followed by another space).
Inside that regex, a bare pipe | (a single character) separates the three alternatives. To distinguish the literal pipe in your data file from the pipe used for regex alternation, the literal one has to be escaped with \\.
-F fs
--field-separator fs
Use fs for the input field separator (the value of the FS predefined variable).
For more information: https://www.gnu.org/software/gawk/manual/gawk.html#Regexp-Field-Splitting
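To see what that separator regex actually produces, here is a small sketch that prints every field of one sample line with its index. The helper name show_fields is made up for illustration.

```shell
# Hypothetical helper: print each field of stdin with its index,
# using the same -F regex as the one-liner above.
show_fields() {
    awk -F '(-|:| \\| )' '{ for (i = 1; i <= NF; i++) print i "=" $i }'
}

printf '%s\n' 'Car Brand:Mercedes | Country:Germany | Car Model:300 SL | Year:04-1960' |
    show_fields
```

Fields 2, 4, 6 and 9 come out as the brand, country, model and year, which is why the one-liner prints exactly those.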

How about this:
sed 's/ *|[^:]*: */:/g' file.txt |
awk -F: -v OFS=: -v year="$(date +%Y)" '{$1=NR; sub("^.*-","",$NF); $NF=year-$NF; print $0, "xxx"}'
Explanation: the sed command replaces all of the "| Fieldlabel:" bits with just ":", giving lines like this:
Car Brand:Mercedes:Germany:300 SL:04-1960
The awk command then splits it into colon-delimited fields, replaces the first one with the line number, removes the month from the last one (the date) and subtracts it from the current year, and finally it's printed with an extra fixed field added at the end.
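Because the result depends on the current year, it can help to test with a fixed one. The sketch below wraps the same sed and awk steps in a function that takes the reference year as an argument; the function name cars_summary and the year parameter are my additions, not part of the original pipeline.

```shell
# Hypothetical wrapper: $1 is the reference year, the car list comes on stdin.
cars_summary() {
    sed 's/ *|[^:]*: */:/g' |
    awk -F: -v OFS=: -v year="$1" '{
        $1 = NR                 # replace "Car Brand" with the line number
        sub("^.*-", "", $NF)    # drop the month: 04-1960 -> 1960
        $NF = year - $NF        # age in years
        print $0, "xxx"         # append the fixed insurance field
    }'
}

cars_summary 2021 <<'EOF'
Car Brand:Mercedes | Country:Germany | Car Model:300 SL | Year:04-1960
Car Brand:Lamborghini | Country:Italy | Car Model:Miura | Year:10-1970
EOF
# prints:
# 1:Mercedes:Germany:300 SL:61:xxx
# 2:Lamborghini:Italy:Miura:51:xxx
```

In real use you would call it as cars_summary "$(date +%Y)" < file.txt.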

This might work for you (GNU sed):
sed -E 's/^/ | /;s/ \| [^:]*//g;s/(.*:)..(.*)/\1$(($(date +%Y)\2)):xxx/;=' file |
sed 'N;s/\n//;s/.*/echo "&"/e'
Prepend a pipe delimiter in readiness for the line number to be prepended later.
Globally remove text between the pipe delimiter and the next occurrence of the : character.
Replace the last field (date) with a bash expression that calculates the years difference from the current year and also append a dummy field xxx.
Prepend the current line number to the output.
Pass the contents of the result to a second sed invocation that combines the line number with the contents of that line and evaluates the bash expression by means of the prepended echo command.

Related

Bash issue with floating point numbers in specific format

(Needed in bash on Linux) I have a file with numbers like this:
1.415949602
91.09582241
91.12042924
91.40270349
91.45625033
91.70150341
91.70174342
91.70660043
91.70966213
91.72597066
91.7287678315
91.7398645966
91.7542977976
91.7678146465
91.77196659
91.77299733
abcdefghij
91.7827827
91.78288651
91.7838959
91.7855
91.79080605
91.80103075
91.8050505
sed 's/^91\.//' file (working)
Any way possible I can do these 3 steps?
First I tried this:
cat input | tr -d 91. > 1.txt (didn't work)
cat input | tr -d "91." > 1.txt (didn't work)
cat input | tr -d '91.' > 1.txt (didn't work)
then
grep -x '.\{10\}' (working)
then
grep "^[6-9]" (working)
Final 1 line solution
cat input.txt | sed 's/\91.//g' | grep -x '.\{10\}' | grep "^[6-9]" > output.txt
Your "final" solution:
cat input.txt |
sed 's/\91.//g' |
grep -x '.\{10\}' |
grep "^[6-9]" > output.txt
should avoid the useless cat and move the backslash in the sed script to the correct place (I also added a ^ anchor and removed the g flag, since you don't expect more than one match per line anyway):
sed 's/^91\.//' input.txt |
grep -x '.\{10\}' |
grep "^[6-9]" > output.txt
You might also be able to get rid of at least one useless grep but at this point, I would switch to Awk:
awk '{ sub(/^91\./, "") } /^[6-9].{9}$/' input.txt >output.txt
The sub() does what your sed replacement did; the final condition says to print lines which match the regex.
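Not every awk build handles interval expressions such as {9} (older mawk releases are one example I am aware of). If that is a concern, a variant that checks the length explicitly might look like this; the function name is only illustrative.

```shell
# Illustrative helper: strip a leading "91." and keep lines that are
# exactly 10 characters long and start with 6-9. Reads stdin.
filter_numbers() {
    awk '{ sub(/^91\./, "") } /^[6-9]/ && length($0) == 10'
}

printf '%s\n' 91.7287678315 91.77196659 abcdefghij 1.415949602 | filter_numbers
# prints 7287678315
```

Only the first value survives: after stripping the prefix it is ten characters long and starts with 7. The others are too short or start with the wrong character.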
The same can conveniently, but less readably, be written in sed:
sed -n 's/^91\.\([6-9][0-9]\{9\}\)$/\1/p' input.txt >output.txt
assuming your sed dialect supports BRE regex with repetitions like [0-9]\{9\}.
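As for why the tr attempts in the question failed: tr -d takes a set of characters, not a string, so tr -d '91.' deletes every 9, every 1 and every dot anywhere on the line. A quick comparison on one made-up value:

```shell
value='91.7191'

# tr deletes each listed character wherever it occurs
printf '%s\n' "$value" | tr -d '91.'        # prints 7

# sed removes only the literal prefix "91." at the start of the line
printf '%s\n' "$value" | sed 's/^91\.//'    # prints 7191
```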

Using sed to fetch date

I have a file which contains two values for abc... keyword. I want to grab the latest date for matching abc... string. After getting the date I also need to format the date by replacing / with -
---other data
2018/01/15 01:56:14.944+0000 INFO newagent.bridge BridgeTLSAssetector::setupACBContext() - abc...
2018/02/14 01:56:14.944+0000 INFO newagent.bridge BridgeTLSAssetector::setupACBContext() - abc...
---other data
In the above example, my output should be 2018-02-14. Here, I am fetching the line which contains abc... value and only getting the line with latest date value. Then, I need to strip out the remaining string and fetch only the date value.
I am using the following sed but it is not working
grep -iF "abc..." file.txt | tail -n 1 | sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g'
With awk:
$ awk '/abc\.\.\./{d=$1} END{gsub("/", "-", d); print d}' file.txt
2018-02-14
Something with sed:
tac file.txt | grep -Fi 'abc...' | sed 's/ .*//;s~/~-~g;q'
This does what you want:
grep -iF "abc..." file.txt | tail -n 1 | awk '{print $1}' | sed 's#/#-#g'
Outputs this:
2018-02-14
Since you asked for sed -
$: sed -nE ' / abc[.]{3}/x; $ { x; s! .*!!; s!/([0-9])/!/0\1/!g; s!/([0-9])$!/0\1!g; s!/!-!g; p; }' in
2018-02-14
arguments
-n says don't print by default
-E says use extended regexes
the script
/ abc[.]{3}/x; says: on each line containing abc..., swap the line with the hold buffer
$ { x; s! .*!!; s!/([0-9])/!/0\1/!g; s!/([0-9])$!/0\1!g; s!/!-!g; p; } says on the LAST line($) do the set of commands inside the {}.
x swaps the buffer to get the last saved record back.
s! .*!!; deletes everything from the first space (after the date)
s!/([0-9])/!/0\1/!g; adds a zero to the month if needed
s!/([0-9])$!/0\1!g; adds a zero to the day if needed
s!/!-!g; converts the /'s to dashes
p prints the resulting record.
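The same hold-space idea can be sketched with h (copy the line to the hold space) instead of x, which keeps each step a little easier to follow. This is a simplified variant rather than the script above: it assumes month and day are already zero-padded as in the sample, and the function name is invented for the example.

```shell
# Hypothetical helper: print the date of the last line containing "abc...".
# Reads stdin.
last_abc_date() {
    sed -n \
        -e '/abc\.\.\./h' \
        -e '${x;s/ .*//;s#/#-#g;p;}'
}

last_abc_date <<'EOF'
---other data
2018/01/15 01:56:14.944+0000 INFO newagent.bridge BridgeTLSAssetector::setupACBContext() - abc...
2018/02/14 01:56:14.944+0000 INFO newagent.bridge BridgeTLSAssetector::setupACBContext() - abc...
---other data
EOF
# prints 2018-02-14
```

If no line matches, the hold space is empty and an empty line is printed.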
When you use sed to match part of the date, you can have it match the year, month, day and abc... in one command.
sed -rn 's#([0-9]{4})/([0-9]{2})/([0-9]{2}).*abc[.]{3}.*#\1-\2-\3#p' file.txt | tail -1
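A simpler approach to try:
grep 'abc' filename.txt | awk '{print $1}'
Since the pattern abc is fixed in the given logs, this is an easier way to get at the date. Note that it prints the first field of every matching line with the slashes unchanged, so it still needs tail -n 1 and a / to - substitution to produce 2018-02-14.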

Fetch latest matching string value

I have a file which contains two values for initial... keyword. I want to grab the latest date for matching initial... string. After getting the date I also need to format the date by replacing / with -
---other data
INFO | abc 1 | 2018/01/04 20:04:35 | initial...
INFO | abc 1 | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...
---other data
In the above example, my output should be 2018-02-05. Here, I am fetching the line which contains initial... value and only getting the line with latest date value. Then, I need to strip out the remaining string and fetch only the date value.
I am using the following grep but it is not yet as per the requirement.
grep -q -iF "initial..." /tmp/file.log
Using the knowledge that later dates appear later in the file, it's only necessary to print the date from the last line containing initial....
First step (drop the -q from grep, since you don't want it to be quiet):
grep -iF 'initial...' /tmp/file.log |
tail -n 1 |
sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g'
The (first) s/// command matches a series of non-pipes followed by a pipe, another series of non-pipes followed by a pipe, a blank, then captures a series of non-blanks, and finally matches a blank and anything; it replaces all that with just the captured string, which is the date field after the second pipe on the input line. The (second) s%%% command replaces slashes with dashes, using % to avoid the confusion that the equivalent s/\//-/g might engender, thereby reformatting the date in ISO 8601-style format.
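Running the two substitutions separately on the longer sample line shows what each one contributes (the variable names are just for the demonstration):

```shell
line='INFO | abc 1 | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...'

# Stage 1: keep only the first word after the second pipe
stage1=$(printf '%s\n' "$line" | sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/')
echo "$stage1"    # prints 2018/02/05

# Stage 2: turn the slashes into dashes
printf '%s\n' "$stage1" | sed -e 's%/%-%g'    # prints 2018-02-05
```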
But we can lose the tail with:
grep -iF 'initial...' /tmp/file.log |
sed -n -e '$ { s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }'
The -n suppresses normal output; the $ matches only the last line; the p after the second s/// operation prints the result.
The case-insensitive fixed-pattern search is more conveniently written in grep than in sed. Although it could be done in a single sed command, you have to work fairly hard, saving matching rows in the hold space, then swapping the hold and pattern space at the end, and doing the substitution and printing:
sed -n \
-e '/[Ii][Nn][Ii][Tt][Ii][Aa][Ll]\.\.\./h' \
-e '$ { x; s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }' /tmp/file.log
Each of these produces the output 2018-02-05 on the sample data. If fed an input with no initial... in it, they output nothing.
Grep for only (-o) the string you want, sort it, and cut for the first word:
grep -o '2[0-9]\{3\}/[0-9][0-9]/[0-9][0-9] [0-2][0-9]:[0-5][0-9]:[0-9][0-9] .* | initial' file.txt | sort | cut -d' ' -f1 | tail -1
Something like this:
$ awk -F'|' '$NF~/initial\.\.\./ {if(max<$3) max=$3}
END {gsub("/","-",max);
split(max,dt," "); print dt[1]}' file
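Unlike the tail-based answers, this awk version does not rely on file order: it compares the third |-separated field as a string, which works because zero-padded yyyy/mm/dd hh:mm:ss timestamps sort chronologically. A small check with the two sample lines deliberately reversed (the function name is illustrative, and it reads stdin instead of a file):

```shell
# Illustrative wrapper around the awk script above.
latest_initial_date() {
    awk -F'|' '$NF ~ /initial\.\.\./ { if (max < $3) max = $3 }
        END { gsub("/", "-", max); split(max, dt, " "); print dt[1] }'
}

latest_initial_date <<'EOF'
INFO | abc 1 | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...
INFO | abc 1 | 2018/01/04 20:04:35 | initial...
EOF
# prints 2018-02-05
```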

search a line that contain a special character using sed or awk

I wonder if there is a command in Linux that can help me to find a line that begins with "*" and contains the special character "|"
for example
* Date | Auteurs
Simply use:
grep -ne '^\*.*|' "${filename}"
Or if you want to use sed:
sed -n '/^\*.*|/{=;p}' "${filename}" | sed '{N;s/\n/:/}'
Or the (GNU) awk equivalent (the pipe has to be backslash-escaped):
awk '/^\*.*\|/' "${filename}"
Where:
^ : start of the line
\*: a literal *
.*: zero or more generic char (not newline)
| : a literal pipe
NB: "${filename}": I've assumed you're using the command in a script, with the target file passed in a double-quoted variable as "${filename}". In an interactive shell, simply use the actual name of the file (or the path to it).
UPDATE (line numbers)
The commands above can be modified to also print the line number of each matched line. With grep it is as simple as adding the -n switch:
grep -ne '^\*.*|' "${filename}"
We obtain an output like this:
81806:* Date | Auteurs
To obtain exactly the same output from sed and awk we have to complicate the commands a little bit:
awk '/^\*.*\|/{print NR ":" $0}' "${filename}"
# the = print the line number, p the actual match but it's on two different lines so the second sed call
sed -n '/^\*.*|/{=;p}' "${filename}" | sed '{N;s/\n/:/}'
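A quick way to convince yourself that the three variants agree is to run them on a few made-up lines (the sample text and variable names are mine):

```shell
sample=$(printf '%s\n' 'Date | Auteurs' '* no pipe here' '* Date | Auteurs')

with_grep=$(printf '%s\n' "$sample" | grep -n '^\*.*|')
with_awk=$(printf '%s\n' "$sample" | awk '/^\*.*\|/{print NR ":" $0}')
# sed prints the line number and the line separately; the second sed joins them
with_sed=$(printf '%s\n' "$sample" | sed -n '/^\*.*|/{=;p;}' | sed 'N;s/\n/:/')

printf '%s\n' "$with_grep" "$with_awk" "$with_sed"
# each of the three lines printed is 3:* Date | Auteurs
```

Only the third sample line both starts with * and contains a pipe, so each command reports line 3.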

return all lines that match String1 in a file after the last matching String2 in the same file

I figured out how to get the line number of the last matching word in the file :
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1
It gave me the value 1787. So I passed it manually to the sed command to search for the lines that contain the sentence "blades are down" after that line number, and it returned all the lines successfully:
sed -n '1787,$s/blades are down/&/p' myfile.txt
Is there a way to pass the line number from the first command to the second one through a variable or a file, so I can put them in a script to be executed automatically?
Thank you.
You can do this by just connecting your two commands with xargs. 'xargs -I %' allows you to take the stdin from a previous command and place it whenever you want in the next command. The '%' is where your '1787' will be written:
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1 | xargs -I % sed -n %',$s/blades are down/&/p' myfile.txt
You can use:
command substitution to capture the result of the first command in a variable.
simple string concatenation to use the variable in your sed command
startLine=$(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)
sed -n ${startLine}',$s/blades are down/&/p' myfile.txt
You don't strictly need the intermediate variable - you could simply use:
sed -n $(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)',$s/blades are down/&/p' myfile.txt
but it may make sense to do error checking on the result of the command substitution first.
Note that I've streamlined the first command by using grep's -n option, which puts the line number separated with : before each match.
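The error checking mentioned above could be sketched as follows. The function name and argument order are hypothetical. The marker is treated as a grep regex and the phrase as a fixed string, and a plain grep -F replaces the s/…/&/p trick for the final filtering.

```shell
# Hypothetical helper: $1 = marker, $2 = phrase, $3 = file.
# Prints lines containing the phrase, starting from the last marker line.
lines_after_last() {
    start=$(grep -n -- "$1" "$3" | tail -n 1 | cut -d ':' -f1)
    if [ -z "$start" ]; then
        echo "marker not found in $3" >&2
        return 1
    fi
    sed -n "${start},\$p" "$3" | grep -F -- "$2"
}

demo=$(mktemp)
printf '%s\n' 'a b c' 'blades are down (early)' 'x b y' 'blades are down (late)' > "$demo"
lines_after_last ' b ' 'blades are down' "$demo"    # prints blades are down (late)
rm -f "$demo"
```

Without the check, an empty start would hand sed the script ,$p, which it rejects with a syntax error.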
First get the part of the file that follows the last match of string2, then use grep to match string1 within it:
tac your_file | awk '{ if (match($0, "string2")) { exit ; } else {print;} }' | \
grep "string1"
The matching lines come out in reverse order. If the order matters, pipe the result through another tac at the end.
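If reversing the file is not appealing, a single-pass awk alternative (a different technique from the tac approach, sketched here with made-up names) is to collect matches and throw them away whenever the marker shows up again:

```shell
# Hypothetical helper: $1 = marker (string2), $2 = phrase (string1), input on stdin.
# Both are matched as plain substrings via index(), not as regexes.
after_last_marker() {
    awk -v marker="$1" -v phrase="$2" '
        index($0, marker) { n = 0; next }     # new marker line: discard earlier hits
        index($0, phrase) { hit[++n] = $0 }   # remember hits since the last marker
        END { for (i = 1; i <= n; i++) print hit[i] }'
}

printf '%s\n' 'a b c' 'blades are down 1' 'x b y' 'blades are down 2' 'other' 'blades are down 3' |
    after_last_marker ' b ' 'blades are down'
# prints:
# blades are down 2
# blades are down 3
```

The lines come out in their original order. As with the tac pipeline, if the marker never occurs, every matching line is printed.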
This might work for you (GNU sed):
sed -n '/\n/ba;/ b /h;//!H;$!d;x;//!d;s/$/\n/;:a;/\`.*blades are down.*$/MP;D' file
This reads through the file storing all lines following the last match of the first string (" b ") in the hold space.
At the end of file, it swaps to the hold space, checks that it does indeed have at least one match, then prints out those lines that match the second string ("blades are down").
N.B. it makes the end case (/\n/) possible by adding a new line to the end of the hold space, which will eventually be thrown away. This also caters for the last line edge condition.
