sed script to remove file name duplicates

I hope the task below will be easy for sed lovers. I am no sed guru, but I need to express it in sed, as sed is more popular on Linux systems.
The input text stream is produced by "make depends" and looks like the following:
pgm2asc.o: pgm2asc.c ../include/config.h amiga.h list.h pgm2asc.h pnm.h \
output.h gocr.h unicode.h ocr1.h ocr0.h otsu.h barcode.h progress.h
box.o: box.c gocr.h pnm.h ../include/config.h unicode.h list.h pgm2asc.h \
output.h
database.o: database.c gocr.h pnm.h ../include/config.h unicode.h list.h \
pgm2asc.h output.h
detect.o: detect.c pgm2asc.h pnm.h ../include/config.h output.h gocr.h \
unicode.h list.h
I need to pick out only the header files (i.e. names ending in .h), make the list unique, and print it as a space-separated list with src/ prepended as a path prefix. This is achieved by the following perl script:
make libs-depends | perl -e 'while (<>) { while (/ ([\w\.\/]+?\.h)/g) { $a{$1} = 1; } } print join " ", map { "src/$_" } keys %a;'
The output is:
src/unicode.h src/pnm.h src/progress.h src/amiga.h src/ocr0.h src/ocr1.h src/otsu.h src/barcode.h src/gocr.h src/../include/config.h src/list.h src/pgm2asc.h src/output.h
Please help me express this in sed.

Not sed but hope this helps you:
make libs-depends | grep -io --perl-regexp "[\w./]+\.h\b" | sort -u | sed -e 's:^:src/:'

If you really want to do this in pure sed:
make libs-depends | sed 's/ /\n/g' | sed '/\.h$/!d;s/^/src\//' | sed 'G;/^\(.*\)\n.*\1/!h;$!d;${x;s/\n/ /g}'
The first sed command breaks the output up into separate lines, the second filters out everything but *.h and prepends 'src/', the third gloms the lines together without repetition.

Sed probably isn't the best tool here as it's stream-oriented. You could possibly use it to convert the spaces to newlines though, pipe that through sort and uniq, then use sed again to convert the newlines back to spaces.
Typing this on my phone, though, so can't give exact commands :(
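The pipeline described above could look roughly like the following sketch. It is an illustration under assumptions rather than the answerer's exact commands: the sample input is an excerpt of the question's data held in a hypothetical variable `deps`, `tr` does the space-to-newline step, and `paste` (instead of a second sed pass) joins the lines back together.

```shell
# Excerpt of the "make depends" output from the question (hypothetical variable name).
deps='box.o: box.c gocr.h pnm.h ../include/config.h unicode.h list.h pgm2asc.h \
output.h
database.o: database.c gocr.h pnm.h ../include/config.h unicode.h list.h \
pgm2asc.h output.h'

headers=$(printf '%s\n' "$deps" |
  tr -s ' ' '\n' |                 # one word per line
  sed -n 's|^.*\.h$|src/&|p' |     # keep only names ending in .h, prefix src/
  LC_ALL=C sort -u |               # sort and drop duplicates
  paste -sd ' ' -)                 # join the lines with single spaces

printf '%s\n' "$headers"
```

Unlike the perl version, the result comes out sorted, because `sort -u` is what removes the duplicates here.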

Related

how to loop through string for patterns from linux shell?

I have a script that looks through files in a directory for strings like :tagName: which works fine for single :tag: but not for multiple :tagOne:tagTwo:tagThree: tags.
My current script does:
grep -rh -e '^:\S*:$' ~/Documents/wiki/*.mkd ~/Documents/wiki/diary/*.mkd | \
sed -r 's|.*(:[Aa-Zz]*:)|\1|g' | \
sort -u
printf '\nNote: this fails to display combined :tagOne:tagTwo:etcTag:\n'
The first line is generating an output like this:
:politics:violence:
:positivity:
:positivity:somewhat:
:psychology:
:socialServices:family:
:strategy:
:tech:
:therapy:babylon:
:trauma:
:triggered:
:truama:leadership:business:toxicity:
:unfurling:
:tagOne:tagTwo:etcTag:
And the objective is to get that into a list of single :tag:'s.
Again, the problem is that if a line has multiple tags, the line does not appear in the output at all (as opposed to the problem merely being that only the first tag of the line gets displayed). Obviously the | sed... | there is problematic.
I want :tagOne:tagTwo:etcTag: to be turned into:
:tagOne:
:tagTwo:
:etcTag:
and so forth with :politics:violence: etc.
Colons aren't necessary; tagOne is just as good as :tagOne: (maybe better, but this is trivial).
So I should replace the sed with something better...
I've tried:
A smarter sed:
grep -rh -e '^:\S*:$' ~/Documents/wiki/*.mkd ~/Documents/wiki/diary/*.mkd | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sort -u
...which works (for a limited number of tags) except that it produces weird results like:
:toxicity:p:
:somewhat:y:
:people:n:
...placing weird random letters at the end of some tags in which :p: is the final character of the :leadership: tag and "leadership" no longer appears in the list. Same for :y: and :n:.
I've also tried using loops in a couple ways...
grep -rh -e '^:\S*:$' ~/Documents/wiki/*.mkd ~/Documents/wiki/diary/*.mkd | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sed -r 's|(:[Aa-Zz]*:)([Aa-Zz]*:)|\1\r:\2|g' | \
sort -u | grep lead
...which has the same problem of :leadership: tags being lost etc.
And like...
for m in $(grep -rh -e '^:\S*:$' ~/Documents/wiki/*.mkd ~/Documents/wiki/diary/*.mkd); do
for t in $(echo $m | grep -e ':[Aa-Zz]*:'); do
printf "$t\n";
done
done | sort -u
...which doesn't separate the tags at all, just prints stuff like:
:truama:leadership:business:toxicity
Should I be taking some other approach? Using a different utility (perhaps cut inside a loop)? Maybe doing this in python (I have a few python scripts but don't know the language well, but maybe this would be easy to do that way)? Every time I see awk I think "EEK!" so I'd prefer a non-awk solution please, preferring to stick to paradigms I've used in order to learn them better.

Using PCRE in grep (where available) and positive lookbehind:
$ echo :tagOne:tagTwo:tagThree: | grep -Po "(?<=:)[^:]+:"
tagOne:
tagTwo:
tagThree:
You will lose the leading : but get the tags nevertheless.
Edit: Did someone mention awk?:
$ awk '{
while(match($0,/:[^:]+:/)) {
a[substr($0,RSTART,RLENGTH)]
$0=substr($0,RSTART+1)
}
}
END {
for(i in a)
print i
}' file

Another idea using awk ...
Sample data generated by OP's initial grep:
$ cat tags.raw
:politics:violence:
:positivity:
:positivity:somewhat:
:psychology:
:socialServices:family:
:strategy:
:tech:
:therapy:babylon:
:trauma:
:triggered:
:truama:leadership:business:toxicity:
:unfurling:
:tagOne:tagTwo:etcTag:
One awk idea:
awk '
{ split($0,tmp,":") # split input on colon;
# NOTE: fields #1 and #NF are the empty string - see END block
for ( x in tmp ) # loop through tmp[] indices
{ arr[tmp[x]] } # store tmp[] values as arr[] indices; this eliminates duplicates
}
END { delete arr[""] # remove the empty string from arr[]
for ( i in arr ) # loop through arr[] indices
{ printf ":%s:\n", i } # print each tag on a separate line with leading/trailing colons
}
' tags.raw | sort # sort final output
NOTE: I'm not up to speed on awk's ability to internally sort arrays (thus eliminating the external sort call) so open to suggestions (or someone can copy this answer to a new one and update with said ability?)
The above also generates:
:babylon:
:business:
:etcTag:
:family:
:leadership:
:politics:
:positivity:
:psychology:
:socialServices:
:somewhat:
:strategy:
:tagOne:
:tagTwo:
:tech:
:therapy:
:toxicity:
:trauma:
:triggered:
:truama:
:unfurling:
:violence:

A pipe through tr can split those strings out to separate lines:
grep -hx -- ':[:[:alnum:]]*:' ~/Documents/wiki{,/diary}/*.mkd | tr -s ':' '\n'
This will also remove the colons and an empty line will be present in the output (easy to repair, note the empty line will always be the first one due to the leading :). Add sort -u to sort and remove duplicates, or awk '!seen[$0]++' to remove duplicates without sorting.
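As a minimal sketch of the repair mentioned above, the following takes three of the sample tag lines from the question (fed in with `printf` instead of the original `grep`, which is an assumption made so the example is self-contained), deletes the empty line, and de-duplicates with `sort -u`. The variable name `tags` is hypothetical.

```shell
tags=$(printf '%s\n' ':politics:violence:' ':positivity:' ':positivity:somewhat:' |
  tr -s ':' '\n' |     # every colon becomes a line break; repeated breaks are squeezed
  sed '/^$/d' |        # drop the empty first line caused by the leading colon
  LC_ALL=C sort -u)    # sort and remove duplicates

printf '%s\n' "$tags"
```

With this input the pipeline prints politics, positivity, somewhat and violence, one per line.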
An approach with sed:
sed '/^:/!d;s///;/:$/!d;s///;y/:/\n/' ~/Documents/wiki{,/diary}/*.mkd
This also removes colons, but avoids adding empty lines (by removing the leading/trailing : with s before using y to transliterate remaining : to <newline>). sed could be combined with tr:
sed '/:$/!d;/^:/!d;s///' ~/Documents/wiki{,/diary}/*.mkd | tr -s ':' '\n'
Using awk to work with the : separated fields, removing duplicates:
awk -F: '/^:/ && /:$/ {for (i=2; i<NF; ++i) if (!seen[$i]++) print $i}' \
~/Documents/wiki{,/diary}/*.mkd

Sample data generated by OP's initial grep:
$ cat tags.raw
:politics:violence:
:positivity:
:positivity:somewhat:
:psychology:
:socialServices:family:
:strategy:
:tech:
:therapy:babylon:
:trauma:
:triggered:
:truama:leadership:business:toxicity:
:unfurling:
:tagOne:tagTwo:etcTag:
One while/for/printf idea based on associative arrays:
unset arr
typeset -A arr # declare array named 'arr' as associative
while read -r line # for each line from tags.raw ...
do
for word in ${line//:/ } # replace ":" with space and process each 'word' separately
do
arr[${word}]=1 # create/overwrite arr[$word] with value 1;
# objective is to make sure we have a single entry in arr[] for $word;
# this eliminates duplicates
done
done < tags.raw
printf ":%s:\n" "${!arr[@]}" | sort # pass array indices (ie, our unique list of words) to printf;
# per OPs desired output we'll bracket each word with a pair of ':';
# then sort
Per OP's comment/question about removing the array, here is a twist on the above that eliminates the array in favor of printing from the inner loop and then piping everything to sort -u:
while read -r line # for each line from tags.raw ...
do
for word in ${line//:/ } # replace ":" with space and process each 'word' separately
do
printf ":%s:\n" "${word}" # print ${word} to stdout
done
done < tags.raw | sort -u # pipe all output (ie, the list of ${word}s) to sort for sorting and removing dups
Both of the above generate:
:babylon:
:business:
:etcTag:
:family:
:leadership:
:politics:
:positivity:
:psychology:
:socialServices:
:somewhat:
:strategy:
:tagOne:
:tagTwo:
:tech:
:therapy:
:toxicity:
:trauma:
:triggered:
:truama:
:unfurling:
:violence:

How to replace values in a file using sed

I need to replace a few values and remove the "#" symbol from these lines in the file below, using sed in my shell script.
Below are the lines to be replaced:
#RateLimitInterval = 30s
#RateLimitBurst = 1000
Replacement value:
RateLimitInterval = 60s
RateLimitBurst = 10000
File is as below
#Storage=auto
#Compress=yes
#Seal=yes
#SplitMode=uid
#SyncIntervalSec=5m
#RateLimitInterval=30s
#RateLimitBurst=1000
#SystemMaxUse=
#SystemKeepFree=
#SystemMaxFileSize=
#RuntimeMaxUse=
#RuntimeKeepFree=
#RuntimeMaxFileSize=
#MaxRetentionSec=
#MaxFileSec=1month
#ForwardToSyslog=yes
#ForwardToKMsg=no
#ForwardToConsole=no
#ForwardToWall=yes
#TTYPath=/dev/console
#MaxLevelStore=debug
#MaxLevelSyslog=debug
#MaxLevelKMsg=notice
#MaxLevelConsole=info
#MaxLevelWall=emerg
#LineMax=48K
The code which I wrote:
sed -i.bk '/#RateLimit/ s/#//; /RateLimitInterval/ s/30s/60s/; /RateLimitBurst/ s/1000/10000/' check_jounal.conf
Can someone suggest how to optimize this code? It feels too long.

I would just spell out the replacements in their entirety:
sed \
-e 's/^#RateLimitInterval=30s$/RateLimitInterval=60s/' \
-e 's/^#RateLimitBurst=1000$/RateLimitBurst=10000/'
By default, sed accepts its input on standard input and writes to standard output. You can add a filename to read the original from a file; add --in-place (-i) to also write the replacement to the same file.

sed -e '/RateLimitInterval/{s/30/60/; s/^#//;}' \
-e '/RateLimitBurst/{s/1000/10000/; s/^#//;}'
I strongly recommend against using -i. It also seems a bit fragile to match the current right-hand side on each line, so you might prefer s/=.*/=60s/ and s/=.*/=10000/ instead.
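A minimal sketch of that idea, assuming the key names from the question and reading from a shell variable rather than editing a file in place (the variable names `conf` and `result` are hypothetical):

```shell
# A few lines from the question's config file.
conf='#SyncIntervalSec=5m
#RateLimitInterval=30s
#RateLimitBurst=1000
#SystemMaxUse='

result=$(printf '%s\n' "$conf" | sed \
  -e 's/^#\(RateLimitInterval\)=.*/\1=60s/' \
  -e 's/^#\(RateLimitBurst\)=.*/\1=10000/')   # uncomment and set, whatever the old value was

printf '%s\n' "$result"
```

Anchoring on `^#` plus the full key name means only the two commented-out settings change; all other lines pass through untouched.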

parsing data from log using awk

I want to extract machineId userId origReqUri,filename,mime,size,checksum as comma-separated from this log pattern. Any awk command to do it?
test1.1/test.log.2020-07-14-20:2020-07-14 20:47:44,239 [http--1594759553405 sessionId:4567 nodeId:node-1 machineId:31656 userId:2540397 origReqUri:/test1/batch] INFO com.test.company - [RETURN INFO - RETURN] - TRACK_PREPROCESSED_DATA_POPULATION: Populated test_doc_version entry for doc version [1130783_1_0] with data from test_doc_metadata. File name: [09014b3080135f44.doc]. Mime type: [application/msword]. Content size: [100352]. MD5 checksum: [7ef30e834107990c95c7e53f7b6f6ee6]. [source:]
I tried
grep machineId:31656 test.1/test.log.2020-07-14-* |grep "Populated test_doc_version entry" | awk machineId |awk origReqUri

I didn't use AWK, but I would resolve your problem using mostly SED and GREP, like this:
sed s/': '/':'/g input | sed s/' '/\\n/g | grep 'machineId\|userId\|origReqUri\|name\|type\|size\|checksum' | sed 's/\[\|\]\|\.//g' | tr '\n' ',' | sed 's/name/filename/g' | sed 's/type/mime/g' | sed 's/.$//'
ps.: "input" is the name of the file where I wrote the input.
The result for the provided input is:
machineId:31656,userId:2540397,origReqUri:/test1/batch,filename:09014b3080135f44doc,mime:application/msword,size:100352,checksum:7ef30e834107990c95c7e53f7b6f6ee6
It is probably not the best solution and we can certainly make it smaller and more beautiful, but I hope it helps you.
There's another solution, simpler and far more readable. You could do it like this:
tr -s ' :[]' ' ' < input | cut -d ' ' -f 12,14,16,39,43,47,51
Here the output is not comma-separated. I guess it's better not to use commas since they are in the list of special symbols.
The result for this one is:
31656 2540397 /test1/batch 09014b3080135f44.doc application/msword 100352 7ef30e834107990c95c7e53f7b6f6ee6
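If comma-separated output is wanted after all, one possible alternative is a single `sed -E` substitution that captures the seven values by their labels instead of by field position. This is only a sketch: it assumes every matching line has exactly the layout of the sample in the question, and the variable names `line` and `csv` are hypothetical.

```shell
# The sample log line from the question.
line='test1.1/test.log.2020-07-14-20:2020-07-14 20:47:44,239 [http--1594759553405 sessionId:4567 nodeId:node-1 machineId:31656 userId:2540397 origReqUri:/test1/batch] INFO com.test.company - [RETURN INFO - RETURN] - TRACK_PREPROCESSED_DATA_POPULATION: Populated test_doc_version entry for doc version [1130783_1_0] with data from test_doc_metadata. File name: [09014b3080135f44.doc]. Mime type: [application/msword]. Content size: [100352]. MD5 checksum: [7ef30e834107990c95c7e53f7b6f6ee6]. [source:]'

# Capture each value by the label in front of it; print only lines that match.
csv=$(printf '%s\n' "$line" | sed -nE 's/.*machineId:([^ ]+) userId:([^ ]+) origReqUri:([^] ]+)\].*File name: \[([^]]*)\]\. Mime type: \[([^]]*)\]\. Content size: \[([^]]*)\]\. MD5 checksum: \[([^]]*)\].*/\1,\2,\3,\4,\5,\6,\7/p')

printf '%s\n' "$csv"
```

Matching on labels is less sensitive to extra words appearing earlier in the line than counting fields with `cut`, at the cost of a longer expression.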

Count total number of pattern between two pattern (using sed if possible) in Linux

I have to count all '=' characters between two patterns, i.e. '{' and '}'.
Sample:
{
100="1";
101="2";
102="3";
};
{
104="1,2,3";
};
{
105="1,2,3";
};
Expected Output:
3
1
1

A very cryptic perl answer:
perl -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ge'
The tr function returns the number of characters transliterated.
With the new requirements, we can make a couple of small changes:
perl -0777 -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ges'
-0777 reads the entire file/stream into a single string
the s flag to the s/// function allows . to handle newlines like a plain character.

Perl to the rescue:
perl -lne '$c = 0; $c += ("$1" =~ tr/=//) while /\{(.*?)\}/g; print $c' < input
-n reads the input line by line
-l adds a newline to each print
/\{(.*?)\}/g is a regular expression. The ? makes the asterisk frugal, i.e. matching the shortest possible string.
The (...) parentheses create a capture group, referred to as $1.
tr is normally used to transliterate (i.e. replace one character by another), but here it just counts the number of equal signs.
+= adds the number to $c.

Awk is here too
grep -o '{[^}]\+}'|awk -v FS='=' '{print NF-1}'
example
echo '{100="1";101="2";102="3";};
{104="1,2,3";};
{105="1,2,3";};'|grep -o '{[^}]\+}'|awk -v FS='=' '{print NF-1}'
output
3
1
1
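The grep/awk pipeline above works on blocks that sit on a single line, as in its example. For the multi-line layout shown in the question, a small awk state machine is one possible alternative. This is a sketch, not part of the original answer; it assumes braces are not nested and that no '=' appears outside the braces on a line that contains a brace. The variable names `sample` and `counts` are hypothetical.

```shell
# The sample input from the question.
sample='{
100="1";
101="2";
102="3";
};
{
104="1,2,3";
};
{
105="1,2,3";
};'

counts=$(printf '%s\n' "$sample" | awk '
  index($0, "{") { inside = 1; n = 0 }                  # opening brace: start a new count
  inside         { n += gsub(/=/, "=") }                # gsub returns the number of "=" on this line
  index($0, "}") { if (inside) print n; inside = 0 }    # closing brace: report the count
')

printf '%s\n' "$counts"
```

For the question's sample this prints 3, 1 and 1, one per line.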

First some test input (a line with a = outside the curly brackets and inside the content, one without brackets and one with only 2 brackets)
echo '== {100="1";101="2";102="3=3=3=3";} =;
a=b
{c=d}
{}'
Handle line without brackets (put a dummy char so you will not end up with an empty string)
sed -e 's/^[^{]*$/x/'
Handle line without equal sign (put a dummy char so you will not end up with an empty string)
sed -e 's/{[^=]*}/x/'
Remove stuff outside the brackets
sed -e 's/.*{\(.*\)}/\1/'
Remove stuff inside the double quotes (do not count fields there)
sed -e 's/"[^"]*"//g'
Use @repzero's method to count equal signs
awk -F "=" '{print NF-1}'
Combine stuff
echo -e '{100="1";101="2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$/x/' -e 's/{[^=]*}/x/' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{print NF-1}'
The ugly temp fields x and replacing {} can be solved inside awk:
echo -e '= {100="1";101="2=2=2=2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$//' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{if (NF>0) c=NF-1; else c=0; print c}'
or shorter
echo -e '= {100="1";101="2=2=2=2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$//' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{print (NF>0) ? NF-1 : 0; }'

No harder sed than done ... in.
Restricting this answer to the environment as tagged, namely:
linux shell unix sed wc
will actually not require the use of wc (or awk, perl, or any other app.).
Though echo is used, a file source can easily exclude its use.
As for bash, it is the shell.
The actual environment used is documented at the end.
NB. Exploitation of GNU specific extensions has been used for brevity
but appropriately annotated to make a more generic implementation.
Also brace bracketed { text } will not include braces in the text.
It is implicit that such braces should be present as {} pairs but
the text src. dangling brace does not directly violate this tenet.
This is a foray into the world of `sed`'ng to gain some fluency in its use for other purposes.
The ideas expounded upon here are used to cross pollinate another SO problem solution in order
to acquire more familiarity with vetting vagaries of vernacular version variances. Consequently
this pedantic exercise hopefully helps with the pedagogy of others beyond personal edification.
To test easily, at least in the environment noted below, judiciously highlight the appropriate
code section, carefully excluding a dangling pipe |, and then, to a CLI command line interface
drag & drop, copy & paste or use middle click to enter the code.
The other SO problem. linux - Is it possible to do simple arithmetic in sed addresses?
# _______________________________ always needed ________________________________
echo -e '\n
\n = = = {\n } = = = each = is outside the braces
\na\nb\n { } so therefore are not counted
\nc\n { = = = = = = = } while the ones here do count
{\n100="1";\n101="2";\n102="3";\n};
\n {\n104="1,2,3";\n};
a\nb\nc\n {\n105="1,2,3";\n};
{ dangling brace ignored junk = = = \n' |
# _____________ preparatory conditioning needed for final solutions _____________
sed ' s/{/\n{\n/g;
s/}/\n}\n/g; ' | # guarantee but one brace to a line
sed -n '/{/ h; # so sed addressing can "work" here
/{/,/}/ H; # use hHold buffer for only { ... }
/}/ { x; s/[^=]*//g; p } ' | # then make each {} set a line of =
# ____ stop code hi-lite selection in ^--^ here include quote not pipe ____
# ____ outputs the following exclusive of the shell " # " comment quotes _____
#
#
# =======
# ===
# =
# =
# _________________________________________________________________________
# ____________________________ "simple" GNU solution ____________________________
sed -e '/^$/ { s//0/;b }; # handle null data as 0 case: next!
s/=/\n/g; # to easily count an = make it a nl
s/\n$//g; # echo adds an extra nl - delete it
s/.*/echo "&" | sed -n $=/; # sed = command w/ $ counts last nl
e ' # who knew only GNU say you ah phoo
# 0
# 0
# 7
# 3
# 1
# 1
# _________________________________________________________________________
# ________________________ generic incomplete "solution" ________________________
sed -e '/^$/ { s//echo 0/;b }; # handle null data as 0 case: next!
s/=$//g; # echo adds an extra nl - delete it
s/=/\\\\n/g; # to easily count an = make it a nl
s/.*/echo -e & | sed -n $=/; '
# _______________________________________________________________________________
The paradigm used for the algorithm is instigated by the prolegomena study below.
The idea is to isolate groups of = signs between { } braces for counting.
These are found and each group is put on a separate line with ALL other adorning characters removed.
It is noted that sed can easily "count", actually enumerate, nl or \n line ends via =.
The first "solution" uses these sed commands:
print
branch w/o label starts a new cycle
h/Hold for filling this sed buffer
exchange to swap the hold and pattern buffers
= to enumerate the current sed input line
substitute s/.../.../; with global flag s/.../.../g;
and most particularly the GNU specific
evaluate (execute can not remember the actual mnemonic but irrelevantly synonymous)
The GNU specific execute command is avoided in the generic code. It does not print the answer but
instead produces code that will print the answer. Run it to observe. To fully automate this, many
mechanisms can be used not the least of which is the sed write command to put these lines in a
shell file to be executed or even embed the output in bash evaluation parentheses $( ) etc.
Note also that various sed example scripts can "count" and these too can be used efficaciously.
The interested reader can entertain these other pursuits.
prolegomena:
concept from counting # of lines between braces
sed -n '/{/=;/}/=;'
to
sed -n '/}/=;/{/=;' |
sed -n 'h;n;G;s/\n/ - /;
2s/^/ Between sets of {} \n the nl # count is\n /;
2!s/^/ /;
p'
testing "done in":
linuxuser@ubuntu:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic
linuxuser@ubuntu:~$ sed --version -----> sed (GNU sed) 4.4

And for giggles an awk-only alternative:
echo '{
> 100="1";
> 101="2";
> 102="3";
> };
> {
> 104="1,2,3";
> };
> {
> 105="1,2,3";
> };' | awk 'BEGIN{RS="\n};";FS="\n"}{c=gsub(/=/,""); if(NF>2){print c}}'
3
1
1

Sed: Replacing Date Format

I have a text file where there is a date string of "2014-06-01T03:11:00Z " in every line. I would like to replace that with "2014-06-01 03:11Z " using sed.
I've been trying to use this code but, it's failing me:
sed -i 's/[0-9]-[0-9]-[0-9]T[0-9]:[0-9]:[0-9]Z/[0-9]-[0-9]-[0-9] [0-9]:[0-9]Z/g' \
/home/aaron/grads/data/metars/${YMD}/latest.metars

Your digit sub-expressions only match a single digit, but the date contains 2 or 4 digits. A simple version that would match dates is:
sed -i 's/\([0-9]*-[0-9]*-[0-9]*\)T\([0-9]*:[0-9]*\):[0-9]*Z/\1 \2Z/g' \
/home/aaron/grads/data/metars/${YMD}/latest.metars
However, this matches zero or more digits at each position where digits are expected. You really want to insist on the correct number of digits in each segment. A more refined version is:
sed -i 's/\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)T\([0-9]\{2\}:[0-9]\{2\}\):[0-9]\{2\}Z/\1 \2Z/g' \
/home/aaron/grads/data/metars/${YMD}/latest.metars
And since your sed supports -i without specifying a back-up suffix (so it is probably GNU sed), you can probably abbreviate that to:
sed -r -i 's/([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9]{2}:[0-9]{2}):[0-9]{2}Z/\1 \2Z/g' \
/home/aaron/grads/data/metars/${YMD}/latest.metars
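To try the extended-regex version without touching a file, it can be run on a single line first. The sketch below uses one of the sample lines that appears elsewhere in this thread and spells the option as `-E`, which GNU sed accepts as a synonym for `-r`; the variable name `out` is hypothetical.

```shell
out=$(printf '%s\n' 'jgklj 2014-06-01T03:11:00Z jhgkjhvk' |
  sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9]{2}:[0-9]{2}):[0-9]{2}Z/\1 \2Z/g')   # keep date and HH:MM, drop T and seconds

printf '%s\n' "$out"
```

This prints `jgklj 2014-06-01 03:11Z jhgkjhvk`; text before and after the timestamp is left alone because the pattern is not anchored to the line.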

Try this GNU sed command, which replaces every line that contains the date string with just the reformatted date:
sed -ri 's/^.*([0-9]{4})-([0-9]{2})-([0-9]{2})\w*([0-9]{2}):([0-9]{2}):([0-9]{2})(.)(.*)$/\1-\2-\3 \4:\5\7/g' file
Example:
$ cat aa
jgklj 2014-06-01T03:11:00Z jhgkjhvk
blaf 2015-12-08T03:15:02Z bvcjghj
$ sed -r 's/^.*([0-9]{4})-([0-9]{2})-([0-9]{2})\w*([0-9]{2}):([0-9]{2}):([0-9]{2})(.)(.*)$/\1-\2-\3 \4:\5\7/g' aa
2014-06-01 03:11Z
2015-12-08 03:15Z
To replace only the date and print all the other text as it is, run the command below.
sed -ri 's/^(.*)([0-9]{4})-([0-9]{2})-([0-9]{2})\w*([0-9]{2}):([0-9]{2}):([0-9]{2})(.)(.*)$/\1\2-\3-\4 \5:\6\8\9/g' file
Example:
$ cat aa
jgklj 2014-06-01T03:11:00Z jhgkjhvk
blaf 2015-12-08T03:15:02Z bvcjghj
$ sed -r 's/^(.*)([0-9]{4})-([0-9]{2})-([0-9]{2})\w*([0-9]{2}):([0-9]{2}):([0-9]{2})(.)(.*)$/\1\2-\3-\4 \5:\6\8\9/g' aa
jgklj 2014-06-01 03:11Z jhgkjhvk
blaf 2015-12-08 03:15Z bvcjghj

You can use this method also
$ sed -r 's/^([^T]+).((.*):){1,2}.([^Z])/\1 \3/g'
