split a string variable through shell script

I have a string containing a date and time: timestamp= 12-12-2012 16:45:00
I need to reformat it into timestamp= 16:45:00 12-12-2012
How can I achieve this in a shell script?
Please note: the variable's value is 12-12-2012 16:45:00, and timestamp is the name of the variable.
#!/usr/bin/expect
set timestamp "16:45:00 12-12-2012"
Now what should I do so that the value of timestamp becomes 12-12-2012 16:45:00?
The script extension is .tcl, for example test.tcl.

You could use pattern removal in parameter expansion. ## removes the longest match of the pattern from the start of the value; %% removes the longest match from the end:
tm=${timestamp##* }
dt=${timestamp%% *}
result="$tm $dt"
or you could use cut to do the same, giving it a field delimiter:
tm=$(echo "$timestamp" | cut -f2 -d' ')
dt=$(echo "$timestamp" | cut -f1 -d' ')
result="$tm $dt"
or you could use sed to swap them with a regex (see the other answer).
or if you are pulling the date from the date command, you could ask it to format the output for you:
result=$(date +'%r %F')
Here %r prints the 12-hour clock time and %F prints YYYY-MM-DD; a format such as '%T %d-%m-%Y' gives the 16:45:00 12-12-2012 layout from the question.
For that matter, you might have a version of date (GNU date, for example) that will parse your date and then let you express it however you want:
result=$(date -d '12/12/2012 4:45 pm' +'%r %F')
Admittedly, this last one is picky about date input; see "info date" for information on accepted inputs.
If you want to use a regex, I like Perl's; they are cleaner to write:
echo "$timestamp" | perl -p -e 's/^(\S+)\s+(\S+)/$2 $1/'
where \S matches a non-space character, + means "one or more", and \s matches whitespace. The parentheses capture the matched parts.
EDIT:
Sorry, I didn't realize that the "timestamp=" was part of the actual data. All of the above examples work if you first strip that bit out:
var='timestamp=2012-12-12 16:45:11'
timestamp=${var#timestamp=}
... then as above ...
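Putting the two steps together, a minimal sketch might look like this. It assumes the data has the "timestamp= " prefix followed by a single space, as in the question; the variable names value, dt and tm are only illustrative.

```bash
#!/usr/bin/env bash
# Assumed input layout: literal prefix, one space, date, one space, time.
var='timestamp= 12-12-2012 16:45:00'

value=${var#timestamp=}   # strip the literal prefix
value=${value# }          # drop one leading space, if present

dt=${value%% *}           # everything before the first space (the date)
tm=${value##* }           # everything after the last space (the time)

result="timestamp= $tm $dt"
echo "$result"            # prints: timestamp= 16:45:00 12-12-2012
```

If the prefix has no space after it, as in the 2012-12-12 example above, the second expansion simply leaves the value unchanged.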

Using sed:
sed 's/\([0-9]*-[0-9]*-[0-9]*\)\([ \t]*\)\(.*\)/\3\2\1/' input
This command works on lines containing the pattern number-number-number, whitespace, anything. It simply swaps the number-number-number part \([0-9]*-[0-9]*-[0-9]*\) with the anything part \(.*\), keeping the original whitespace \([ \t]*\) between them. So the replacement part of the sed command is \3\2\1, which means the third part, the whitespace, and the first part.
Same logic with tcl:
set timestamp "12-12-2012 16:45:00"
set s [regsub {([0-9]*-[0-9]*-[0-9]*)([ \t]*)(.*)} $timestamp \\3\\2\\1]
puts $s

awk solution here:
string="timestamp= 12-12-2012 16:45:00"
awk '{print $1, $3, $2}' <<< "$string"

In bash (and similar shells):
$ timestamp="12-12-2012 16:45:00"
$ read -a tsarr <<< "$timestamp"
$ echo "${tsarr[1]} ${tsarr[0]}"
16:45:00 12-12-2012

Related

Cutting certain string of variable

I'd like to cut off some special strings of a variable.
The variable contains the following, including a lot of blank space before <div... and a class attribute:
<div data-href="/www.somewebspace.com" class="class1 class2">
I would like to extract the contents of the data-href attribute i.e have this output /www.somewebspace.com
I tried out the following code, the output starts with the contents of the data-href attribute and the class attribute.
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
How can I get rid of the class attribute?
Kind regards and grateful for every reply,
X3nion
P.S. Another question arose. I've got these strings that I'd like to extract from a text file:
<div class="aditem-addon">
Today, 23:23</div>
What would be the correct command to extract only the "Today, 23:23", without any spaces before or after the term?
Maybe I would have to delete the blank spaces first?

Your regex is almost correct; you only need to adjust the greediness of the * quantifier:
* is a greedy quantifier: it matches as much as possible while still allowing a match
*? is a reluctant quantifier: it matches as few characters as possible while still allowing a match
# Correct
Test='<div data-href="/www.somewebspace.com" class="fdgks"></div>'
echo $Test | grep -oP '(?<=<div data-href=").*?(?=")'
#> /www.somewebspace.com
# the desired output
# WRONG
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks
# didn't stop until it matched the last quote `"`
echo $Test$Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks"></div><div data-href="/www.somewebspace.com" class="fdgks
# same as the last one
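For a more detailed explanation, look up the difference between greedy, reluctant and possessive quantifiers.
EDIT
For the second question (here Test holds the aditem-addon snippet, including its newline, so the expansion must be quoted):
echo "$Test$Test" | grep -Poz '(?<=<div class="aditem-addon">\n ).*?(?=<\/div>)'
#> Today, 23:23
#> Today, 23:23
The \n followed by a space in the lookbehind matches the newline and the leading space.
If the string you're looking for contains the newline character \n, you need to add the z option to grep, i.e. the call becomes grep -ozP. With -z, grep also terminates each match with a NUL byte instead of a newline.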
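An alternative to the reluctant quantifier is a negated character class, which cannot run past the closing quote. The sketch below shows the idea with plain sed rather than grep -P; the variable name href is only illustrative.

```bash
#!/usr/bin/env bash
Test='    <div data-href="/www.somewebspace.com" class="class1 class2">'

# [^"]* matches any run of characters that are not a double quote,
# so the capture stops at the first closing quote by construction.
href=$(printf '%s\n' "$Test" | sed -n 's/.*data-href="\([^"]*\)".*/\1/p')

echo "$href"   # prints: /www.somewebspace.com
```

This form also works with tools that have no reluctant quantifiers, such as POSIX sed and grep -E.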

Unless the input is very simple, consider using xmllint or another HTML parsing tool. For the very simple cases, you can use shell parameter expansion:
#! /bin/sh
s=' <div data-href="/www.somewebspace.com" class="class1 class2"> '
s1=${s##*data-href=\"}
s1=${s1%%\"*}
echo "$s1"
Which will print
/www.somewebspace.com
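The same kind of expansion can also strip the surrounding whitespace asked about in the P.S. This is only a sketch of a common idiom; the sample value of s is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical extracted text with leading and trailing blanks.
s='    Today, 23:23   '

# ${s%%[![:space:]]*} is the leading whitespace run; remove it as a prefix.
s=${s#"${s%%[![:space:]]*}"}
# ${s##*[![:space:]]} is the trailing whitespace run; remove it as a suffix.
s=${s%"${s##*[![:space:]]}"}

echo "[$s]"   # prints: [Today, 23:23]
```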

Can't input date variable in bash

I have a directory /user/reports under which many files are there, one of them is :
report.active_user.30092018.77325.csv
I need output as number after date i.e. 77325 from above file name.
I created below command to find a value from file name:
ls /user/reports | awk -F. '/report.active_user.30092018/ {print $(NF-1)}'
Now, I want current date to be passed in above command as variable and get result:
ls /user/reports | awk -F. '/report.active_user.$(date +'%d%m%Y')/ {print $(NF-1)}'
But not getting required output.
Tried bash script:
#!/usr/bin/env bash
_date=`date +%d%m%Y`
active=$(ls /user/reports | awk -F. '/report.active_user.${_date}/ {print $(NF-1)}')
echo $active
But still output is blank.
Please help with proper syntax.

As @cyrus said, you must use double quotes in your variable assignment, because single quotes keep the text literal and do not expand variables.
Bad use case
number=10
string='I m sentence with or without var $number'
echo $string
Correct use case
number=10
string_with_number="I m sentence with var $number"
echo $string_with_number
You can use single quotes, as long as they do not enclose the variable:
number=10
string_with_number='I m sentence with var '$number
echo $string_with_number
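In the question, the whole awk program sits inside single quotes, which is why neither $(date ...) nor ${_date} is expanded. One way around that is to pass the value in with awk's -v option. The sketch below uses a scratch directory in place of /user/reports so that it is self-contained; the file it creates and the variable names are only for illustration.

```bash
#!/usr/bin/env bash
# Scratch directory standing in for /user/reports (assumption for the demo).
dir=$(mktemp -d)
today=$(date +%d%m%Y)
touch "$dir/report.active_user.$today.77325.csv"

# -v copies the shell value into the awk variable d. index() does a literal
# substring search, so the dots are not treated as regex wildcards.
active=$(printf '%s\n' "$dir"/* |
    awk -F. -v d="$today" 'index($0, "report.active_user." d ".") {print $(NF-1)}')

echo "$active"   # prints: 77325
rm -r "$dir"
```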

Don't parse the output of ls.
You don't need awk for this: you can manage with the shell's own capabilities
for file in /user/reports/report.active_user."$(date "+%d%m%Y")".*; do
    tmp=${file%.*} # remove the extension
    number=${tmp##*.} # remove the prefix up to and including the last dot
    echo "$number"
done
See https://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion

Trim a string up to 4th delimiter from right side

I have strings like following which should be parsed with only unix command (bash)
49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed
I want to trim the strings like above upto 4th underscore from end/right side. So output should be
49_sftp_mac_myfile_simul_test
Number of underscores can vary in overall string. For example, The string could be
49_sftp_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed
Output should be (after trimming up to 4th occurrence of underscore from right.
49_sftp_simul_test

Easily done using awk by reducing NF (the number of fields) by 4 after setting the input and output field separators to an underscore:
s='49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed'
awk 'BEGIN{FS=OFS="_"} {NF -= 4; $1=$1} 1' <<< "$s"
49_sftp_mac_myfile_simul_test

You can use bash's parameter expansion for that:
string="..."
echo "${string%_*_*_*_*}"
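As a worked example, here is that expansion applied to the two strings from the question. The % operator removes the shortest suffix that matches _*_*_*_*, which is the text from the 4th underscore from the right onward; the variable names are only illustrative.

```bash
#!/usr/bin/env bash
s1='49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed'
s2='49_sftp_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed'

# Shortest suffix containing four underscores is removed in each case.
trimmed1=${s1%_*_*_*_*}
trimmed2=${s2%_*_*_*_*}

echo "$trimmed1"   # prints: 49_sftp_mac_myfile_simul_test
echo "$trimmed2"   # prints: 49_sftp_simul_test
```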

With GNU sed:
$ sed -E 's/(_[^_]*){4}$//' <<< "49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed"
49_sftp_mac_myfile_simul_test
From the end of line, removes 4 occurrences of _ followed by non _ characters.

Perl one-liner
echo "$your_string" | perl -lne '$n++ while /_/g; print join "_",((split/_/)[-$n-1..-5])'
input
49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed
the output
49_sftp_mac_myfile_simul_test
input
49_sftp_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed
the output
49_sftp_simul_test

Not the fastest, but maybe the easiest to remember and the funniest:
echo "49_sftp_mac_myfile_simul_test_9999_4000000000000001_2017-02-06_15-15-26.49.csv.failed"|
rev | cut -d"_" -f5- | rev

Extracting part of a string to a variable in bash

noob here, sorry if a repost. I am extracting a string from a file, and end up with a line, something like:
abcdefg:12345:67890:abcde:12345:abcde
Let's say it's in a variable named testString
the length of the values between the colons is not constant, but I want to save the number, as a string is fine, to a variable, between the 2nd and 3rd colons. so in this case I'd end up with my new variable, let's call it extractedNum, being 67890 . I assume I have to use sed but have never used it and trying to get my head around it...
Can anyone help? Cheers
On a side-note, I am using find to extract the entire line from a string, by searching for the 1st string of characters, in this case the abcdefg part.

Pure Bash using an array:
testString="abcdefg:12345:67890:abcde:12345:abcde"
# IFS=':' applies only to this read, so the shell's global IFS is untouched
IFS=':' read -r -a array <<< "$testString"
echo "value = ${array[2]}"
The output:
value = 67890

Here's another pure bash way. Works fine when your input is reasonably consistent and you don't need much flexibility in which section you pick out.
extractedNum="${testString#*:}" # Remove through first :
extractedNum="${extractedNum#*:}" # Remove through second :
extractedNum="${extractedNum%%:*}" # Remove from next : to end of string
You could also filter the file while reading it, in a while loop for example:
while IFS=' ' read -r col line ; do
# col has the column you wanted, line has the whole line
# # #
done < <(sed -e 's/\([^:]*:\)\{2\}\([^:]*\).*/\2 &/' "yourfile")
The sed command is picking out the 2nd column and delimiting that value from the entire line with a space. If you don't need the entire line, just remove the space+& from the replacement and drop the line variable from the read. You can pick any column by changing the number in the \{2\} bit. (Put the command in double quotes if you want to use a variable there.)
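For the last point, a small sketch of the double-quoted form with the column count held in a shell variable. The name n is an assumption made for this example; it is the number of leading columns to skip.

```bash
#!/usr/bin/env bash
testString='abcdefg:12345:67890:abcde:12345:abcde'
n=2   # skip this many colon-terminated columns

# Inside double quotes the shell expands $n but leaves \( \) \{ \} and \2
# untouched, so sed sees \{2\} here.
extractedNum=$(printf '%s\n' "$testString" |
    sed -e "s/\([^:]*:\)\{$n\}\([^:]*\).*/\2/")

echo "$extractedNum"   # prints: 67890
```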

You can use cut for this kind of stuff. Here you go:
VAR=$(echo abcdefg:12345:67890:abcde:12345:abcde |cut -d":" -f3); echo $VAR
For the fun of it, this is how I would (not) do this with sed, but I'm sure there's easier ways. I guess that'd be a question of my own to future readers ;)
echo abcdefg:12345:67890:abcde:12345:abcde |sed -e "s/[^:]*:[^:]*:\([^:]*\):.*/\1/"

this should work for you: the key part is awk -F: '$0=$3'
NewVar=$(getTheLineSomehow...|awk -F: '$0=$3')
example:
kent$ newVar=$(echo "abcdefg:12345:67890:abcde:12345:abcde"|awk -F: '$0=$3')
kent$ echo $newVar
67890
if your text was stored in var testString, you could:
kent$ echo $testString
abcdefg:12345:67890:abcde:12345:abcde
kent$ newVar=$(awk -F: '$0=$3' <<<"$testString")
kent$ echo $newVar
67890
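One caveat with '$0=$3': the assignment is used as the pattern, so awk prints the line only when the assigned value is true. If the third field is empty or is the number 0, nothing is printed. A sketch with a made-up input line shows the difference from an explicit print:

```bash
#!/usr/bin/env bash
# Hypothetical line whose third field is 0.
line='abc:12345:0:xyz'

# Pattern form: the assigned value 0 is false, so the line is not printed.
with_pattern=$(printf '%s\n' "$line" | awk -F: '$0=$3')
# Explicit print: always prints the third field, whatever its value.
with_print=$(printf '%s\n' "$line" | awk -F: '{print $3}')

echo "pattern form: [$with_pattern]"
echo "print form:   [$with_print]"
```

When the field can legitimately be 0 or empty, '{print $3}' is the safer choice.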

Is it possible to use the matched string in sed as an input to shell date command?

I have a file with records having timestamp fields that include GMT offset. I want to use the sed command to replace the value on the record to a regular timestamp (without GMT offset).
For example:
$ date -d '2012/11/01 00:50:22 -0800' '+%Y-%m-%d %H:%M:%S'
returns this value, which is the kind of result I am looking for (the output is in the machine's local time zone):
2012-11-01 01:50:22
Except I want to perform that operation on every line of this file and apply the date command to the timestamp value. Here is a sample record:
"SB","6GV96644X48128125","","","","T0006",2012/10/03 13:08:43 -0700,"NJ"
Here is my code:
head -1 myfile | sed 's/,[0-9: /\-]\{25\},/,'"`date -d \1 '+%Y-%m-%d %H:%M:%S'`"',/
which doesn't work: it just ignores \1 and replaces the matched pattern with today's date:
"SB","6GV96644X48128125","","","","T0006",2012-11-14 01:00:00,"NJ"
I hoped that \1 would result in the matched pattern being passed to the date command, which would return a regular timestamp value (as in the example above, where date applies the GMT offset and returns a regular timestamp string) and replace the old value on the record.

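I would use awk instead. For example:
awk '{cmd="date -d \""$7"\" \"+%Y-%m-%d %H:%M:%S\"";
cmd | getline k; close(cmd); $7=k; print}' FS=, OFS=, myFile
This will replace the 7th field with the result of running the date command on the original contents of the 7th field. The close(cmd) call releases the pipe after each record, so the number of open pipes does not grow with the file and a repeated timestamp runs date again.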
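For a single record, the same conversion can be sketched in plain shell with cut. This assumes GNU date (for -d) and that no field contains an embedded comma; TZ=UTC is set here only to make the output reproducible, and the variable names are illustrative.

```bash
#!/usr/bin/env bash
# Sample record from the question; field 7 holds the timestamp with its offset.
line='"SB","6GV96644X48128125","","","","T0006",2012/10/03 13:08:43 -0700,"NJ"'

ts=$(printf '%s\n' "$line" | cut -d, -f7)           # original timestamp
head_part=$(printf '%s\n' "$line" | cut -d, -f1-6)  # fields before it
tail_part=$(printf '%s\n' "$line" | cut -d, -f8-)   # fields after it

# GNU date applies the offset and prints the time in the zone given by TZ.
new=$(TZ=UTC date -d "$ts" '+%Y-%m-%d %H:%M:%S')

converted="$head_part,$new,$tail_part"
echo "$converted"
```

Drop the TZ=UTC prefix to get the result in the machine's local zone, as in the question.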

In sed (the e flag is a GNU extension that executes the result of the substitution as a shell command):
head -1 datefile |
sed '
# handle % in input correctly
s/%/%%/g
# execute date(1) command
s/\(.*,\)\([0-9: /\-]\{25\}\)\(,.*\)/'"date -d '\2' '+\1%Y-%m-%d %H:%M:%S\3'"'/e
'

This might work for you (GNU sed):
sed -r 's/^(([^,]*,){6})([^,]*)(.*)/printf "%s%s%s" '\''\1'\'' $(date -d '\''\3'\'' '\''+%Y-%m-%d %H:%M:%S'\'') '\''\4'\''/e;q' file

use
head -1 datefile | sed -e 's?\(..\)/\(..\)/\(.... ..:..:..\)?'"date -d '\2/\1/\3' '+%s'"'?e'
