Extract part of a path - linux

I have a variable that holds the path of a Windows folder.
I would like to process it with sed.
Example:
Input:
\\computer1\folder$
Output:
computer1
I always want the host name that sits between the leading \\ and the next \
Could someone point me in the right direction?

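You can do this with shell parameter expansion. Note that the ${var/pattern/} substitution on the second line is supported by bash, ksh and zsh but is not part of POSIX: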
% folder='\\computer1\folder$'
% folder="${folder/\\\\/}" # Remove leading '\\'
% printf "%s\n" "${folder%%\\*}"
computer1
Alternative with Bashism:
% folder='\\computer1\folder$'
% [[ "$folder" =~ '\\'([^\\]*) ]]
% printf "%s\n" "${BASH_REMATCH[1]}"
computer1

With sed :
$ sed 's/\\\\\([^\]*\)\\.*/\1/' <<< '\\computer1\folder$'
computer1
The basic syntax of the sed substitution command is s/oldtext/newtext/.
s selects the substitution command
in the pattern, every literal \ must be escaped, so it is written \\
\([^\]*\) captures every non-\ character up to the next \
the captured string is output with the backreference \1
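If the path lives in a shell variable, as in the question, the same substitution can be wrapped in a small function. This is only a minimal sketch: the name unc_host is made up for illustration, and it assumes the input always starts with \\ and has a further \ after the host name. printf is used instead of echo because some echo implementations interpret backslashes.

```shell
# Hypothetical helper: print the host part of a UNC path such as \\host\share.
# Assumes the path starts with two backslashes and has another one after the host.
unc_host() {
    printf '%s\n' "$1" | sed 's/^\\\\\([^\]*\)\\.*/\1/'
}

folder='\\computer1\folder$'
unc_host "$folder"    # prints: computer1
```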

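For this scenario, awk is a better option. Use awk -F'\\' '{print $3}': the separator is a single backslash, and because the path starts with \\ the first two fields are empty, so the host name is field 3.
Example
$> printf '%s\n' '\\computer1\folder$' | awk -F'\\' '{print $3}'
Output
computer1
Or you can try putting it in a variable. Use single quotes so the shell keeps both leading backslashes:
$> val='\\computer1\folder$'
$> printf '%s\n' "$val" | awk -F'\\' '{print $3}'
computer1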

With awk. s stands for "separator" and a stands for "any".
echo '\\computer1\folder$' | \
awk '{s="\\\\"; a=".*"; sub(a s s, ""); sub(s a, ""); print}'

Related

How to add single quotes in a shell script using sed

I need help writing a sed script that finds and replaces user input and adds single quotes around it. Input file admins.py:
Script:
read adminsid
while [[ $adminsid == "" ]];
do
echo "You did not enter anything. Please re-enter AdminID"
read adminsid
done
## Please enter Admin's ID
9999999999,8888888888,1111111111
## Script To Replace ADMIN_IDS = [] to ADMIN_IDS = ['9999999999,8888888888,1111111111'] in file
sed -i "s|ADMIN_IDS = \[.*\]|ADMIN_IDS = ['$adminsid']|g" $file
## Current results:
ADMIN_IDS = ['9999999999,8888888888,1111111111']
## Expected results:
ADMIN_IDS = ['9999999999','8888888888','1111111111']
Assign the variable to the data
adminsid=9999999999,8888888888,1111111111
Then use sed with several -e (script) options to add the quotes and the square brackets.
echo "$adminsid" | sed -e "s/,/\',\'/g" -e "s/^/[\'/" -e "s/$/\']/"
or to apply the same changes in place to every line of a file (filename in $file):
sed -i "$file" -e "s/,/\',\'/g" -e "s/^/[\'/" -e "s/$/\']/"
You can do this with awk too:
Suppose you have assigned the variable as:
adminsid=9999999999,8888888888,1111111111
Then the solution:
echo "$adminsid"| awk -F"," -v quote="'" -v OFS="','" '$1=$1 {print "["quote $0 quote"]"}'
-F"," -v OFS="','" :: replaces the separator (,) with (',')
print "["quote $0 quote"]" :: adds a single quote (') and the brackets ([) and (]) at the beginning and end of the line
This might work for you (GNU sed & bash):
<<<"$adminsid" sed 's/[^,]\+/'\''&'\''/g;s/.*/[&]/'
Surround each run of non-comma characters with single quotes and then surround the entire string with square brackets.
Replace each , with ',' in the variable and add the brackets and outer quotes at the beginning and at the end.
sed "s/.*/['&']/" <<< "${adminsid//,/','}"
echo "['${adminsid//,/','}']"
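The answers above build the quoted list, while the question also needs it written into the ADMIN_IDS line. The two steps can be combined roughly as follows. This is a sketch under assumptions: the helper name quote_ids is hypothetical, and the IDs are assumed to contain only digits, so nothing in them is special to sed (no |, & or \). A sample line is piped in here instead of editing a real file.

```shell
# Hypothetical helper: turn "a,b,c" into "'a','b','c'".
quote_ids() {
    printf '%s\n' "$1" | sed "s/,/','/g; s/^/'/; s/\$/'/"
}

adminsid=9999999999,8888888888,1111111111
quoted=$(quote_ids "$adminsid")

# Replace whatever is between the brackets on the ADMIN_IDS line.
printf '%s\n' 'ADMIN_IDS = []' |
    sed "s|ADMIN_IDS = \[.*]|ADMIN_IDS = [$quoted]|"
# prints: ADMIN_IDS = ['9999999999','8888888888','1111111111']
```

To change the file itself, the same sed expression could be run with -i on "$file", as in the question.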

Extract last digits from each word in a string with multiple words using bash

Given a string with multiple words like below, all in one line:
first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540
I am trying to get the maximum number from the string; in this case the answer should be 201805241548
I have tried using awk and grep, but I am only getting the answer as last word in the string.
I am interested in how to get this accomplished.
echo 'first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540' |\
grep -o '[0-9]\+' | sort -n | tail -n 1
The relevant part is grep -o '[0-9]\+' | sort -n | tail -n 1.
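The -o option and the \+ operator used above are not part of POSIX grep. Where that matters, a similar pipeline can be sketched with tr, which turns every run of non-digits into a single newline. The function name max_number is hypothetical.

```shell
# Hypothetical helper: print the largest run of digits found in the argument.
max_number() {
    # tr -cs replaces each run of non-digit characters with one newline,
    # sort -n orders the numbers, tail keeps the largest.
    printf '%s\n' "$1" | tr -cs '0-9' '\n' | sort -n | tail -n 1
}

max_number 'first-second-third-201805241346 first-second-third-201805241548 first-second-third-201705241540'
# prints: 201805241548
```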
Using single gnu awk command:
s='first-second-third-201805241346 first-second-third-201805241348 first-second-third-201805241548 first-second-third-201705241540'
awk -F- -v RS='[[:blank:]]+' '$NF>max{max=$NF} END{print max}' <<< "$s"
201805241548
Or using grep + awk (if gnu awk is not available):
grep -Eo '[0-9]+' <<< "$s" | awk '$1>max{max=$1} END{print max}'
Another awk
echo 'first-...-201705241540' | awk -v RS='[^0-9]+' '$0>max{max=$0} END{print max}'
Gnarly pure bash:
n='first-second-third-201805241346 \
first-second-third-201805241348 \
first-second-third-201805241548 \
first-second-third-201705241540'
shopt -s extglob  # the +( ) pattern below needs extglob
z="${n//+([a-z-])/;p=}"
p=0 m=0 eval echo -n "${z//\;/\;m=\$((m>p?m:p))\;};m=\$((m>p?m:p))"
echo $m
Output:
201805241548
How it works: This code constructs code, then runs it.
z="${n//+([a-z-])/;p=}" substitutes non-numbers with some pre-code
-- setting $p to the value of each number, (useless on its own). At this point echo $z would output:
;p=201805241346 \ ;p=201805241348 \ ;p=201805241548 \ ;p=201705241540
The next line replaces each added ; with code that sets $m to the greatest value of $p seen so far.
That generated code needs eval to run; what the whole eval line effectively executes looks like this:
p=0 m=0
m=$((m>p?m:p));p=201805241346
m=$((m>p?m:p));p=201805241348
m=$((m>p?m:p));p=201805241548
m=$((m>p?m:p));p=201705241540
m=$((m>p?m:p))
Print $m.

Remove lines containing space in unix

Below is my comma-separated input.txt file. I want to read the columns and write to output.txt only the lines in which no column contains a space.
Content of input.txt:
1,Hello,world
2,worl d,hell o
3,h e l l o, world
4,Hello_Hello,World#c#
5,Hello,W orld
Content of output.txt:
1,Hello,world
4,Hello_Hello,World#c#
Is it possible to achieve this using awk? Please help!
A simple way to filter out lines with spaces is using inverted matching with grep:
grep -v ' ' input.txt
If you must use awk:
awk '!/ /' input.txt
Or perl:
perl -ne '/ / || print' input.txt
Or pure bash:
while IFS= read -r line; do [[ $line == *' '* ]] || printf '%s\n' "$line"; done < input.txt
# or
while IFS= read -r line; do [[ $line =~ ' ' ]] || printf '%s\n' "$line"; done < input.txt
UPDATE
To check if let's say field 2 contains space, you could use awk like this:
awk -F, '$2 !~ / /' input.txt
To check if let's say field 2 OR field 3 contains space:
awk -F, '!($2 ~ / / || $3 ~ / /)' input.txt
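The per-field check can also be parameterised so the column number is not hard-coded. The following is a small sketch, not part of the original answer; the function name keep_if_no_space_in_col is hypothetical, and the column number is passed to awk with -v.

```shell
# Hypothetical helper: keep only the lines whose given comma-separated column
# contains no space. $1 is the column number; the data is read from stdin.
keep_if_no_space_in_col() {
    awk -F, -v col="$1" 'index($col, " ") == 0'
}

printf '%s\n' '1,Hello,world' '2,worl d,hell o' '5,Hello,W orld' |
    keep_if_no_space_in_col 2
# prints:
# 1,Hello,world
# 5,Hello,W orld
```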
For your follow-up question in comments
To do the same using sed, I only know these awkward solutions:
# remove lines if 2nd field contains space
sed -e '/^[^,]*,[^,]* /d' input.txt
# remove lines if 2nd or 3rd field contains space
sed -e '/^[^,]*,[^,]* /d' -e '/^[^,]*,[^,]*,[^,]* /d' input.txt
For your 2nd follow-up question in comments
To disregard leading spaces in the 2nd or 3rd fields:
awk -F', *' '!($2 ~ / / || $3 ~ / /)' input.txt
# or perhaps what you really want is this:
awk -F', *' -v OFS=, '!($2 ~ / / || $3 ~ / /) { print $1, $2, $3 }' input.txt
This can also be done easily with sed
sed '/ /d' input.txt
try this one-liner
awk 'NF==1' file
As @jwpat7 pointed out, this gives incorrect output when a line has only a leading space. The following regex-based line handles that case, although it has already been posted in janos's answer.
awk '!/ /' file
or
awk -F' *' 'NF==1' file
Pure bash for the fun of it...
#!/bin/bash
while IFS= read -r line
do
    if [[ ! $line =~ " " ]]
    then
        printf '%s\n' "$line"
    fi
done < input.txt
columnWithSpace=2
ColumnBef=$(( columnWithSpace - 1 ))
sed "/^\([^,]*,\)\{${ColumnBef}\}[^ ,]* /d" input.txt
If you know the column in advance (for example, column 3):
sed '/^\([^,]*,\)\{2\}[^ ,]* /d' input.txt
If you can trust the input to always have no more than three fields, simply finding a space somewhere after a comma is sufficient to identify the affected lines; -v then prints only the lines without one.
grep -v ',.* ' input.txt
If there can be (or usually are) more fields, you can pull that off with grep -E and a suitable ERE, but you are fast approaching the point at which the equivalent Awk solution will be more readable and maintainable.

How to extract a part of string?

I have a string that contains a path
string="toto.titi.1.tata.2.abc.def"
I want to extract the substring that comes after toto.titi.1.tata.2., but 1 and 2 here are examples and could be other numbers.
In general: I want to extract the substring that comes after toto.titi.[i].tata.[j].
where [i] and [j] are numbers
How to do it?
Pure bash solution:
[[ $string =~ toto\.titi\.[0-9]+\.tata\.[0-9]+\.(.*$) ]] && result="${BASH_REMATCH[1]}"
echo "$result"
An alternate bash solution that uses parameter expansion instead of a regular expression:
echo "${string#toto.titi.[0-9].tata.[0-9].}"
If the numbers can be multi-digit values (i.e., greater than 9), you would need to use an extended pattern:
shopt -s extglob
echo "${string#toto.titi.+([0-9]).tata.+([0-9]).}"
You can use cut
echo "$string" | cut -f6- -d'.'
This does it:
echo "${string}" | sed -re 's/^toto\.titi\.[[:digit:]]+\.tata\.[[:digit:]]+\.//'
Maybe like this:
echo "$string" | cut -d '.' -f 6-
You can use sed. Like this:
string="toto.titi.1.tata.2.abc.def"
string=$(sed 's/toto\.titi\.[0-9][0-9]*\.tata\.[0-9][0-9]*\.//' <<< "$string")
echo "$string"
Output:
abc.def
try this awk line:
awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}' file
with your example:
kent$ echo "toto.titi.1.tata.2.abc.def"|awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}'
abc.def
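Most of the approaches above return the input unchanged, or an arbitrary piece of it, when the string does not start with the expected prefix. If that case needs to be detected, sed -n with the p flag prints only when the substitution actually happened. This is a sketch; the function name suffix_after_prefix is hypothetical, and an empty remainder is treated as "no match".

```shell
# Hypothetical helper: print what follows toto.titi.<n>.tata.<m>. and return 0,
# or print nothing and return non-zero when the prefix is absent.
suffix_after_prefix() {
    rest=$(printf '%s\n' "$1" |
        sed -n 's/^toto\.titi\.[0-9][0-9]*\.tata\.[0-9][0-9]*\.//p')
    [ -n "$rest" ] && printf '%s\n' "$rest"
}

suffix_after_prefix 'toto.titi.12.tata.345.abc.def'        # prints: abc.def
suffix_after_prefix 'something.else' || echo 'no match'    # prints: no match
```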

Need to grab data in between tilde characters

Can anyone advise how to search on Linux for data between tilde characters? I need to get the IP address, but the data is formatted as shown below.
Details:
20110906000418~118.221.246.17~DATA~DATA~DATA
One more:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | sed -r 's/[^~]*~([^~]+)~.*/\1/'
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d'~' -f2
This uses the cut command with the delimiter set to ~. The -f2 switch then outputs just the 2nd field.
If the text you give is in a file (called filename), try:
grep "[0-9]*~" filename | cut -d'~' -f2
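When the data is in a file, awk can do the filtering and the field extraction in one step. The sketch below assumes each record looks like the sample line; the function name second_field is hypothetical. Lines without any ~ are skipped, and file names, if given, are passed straight to awk.

```shell
# Hypothetical helper: print the second ~-separated field of each input line,
# skipping lines that do not contain a ~ at all.
second_field() {
    awk -F'~' 'NF >= 2 { print $2 }' "$@"
}

printf '%s\n' '20110906000418~118.221.246.17~DATA~DATA~DATA' 'no tilde here' |
    second_field
# prints: 118.221.246.17
```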
With cut:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d~ -f2
With awk:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" |
awk -F~ '{ print $2 }'
In awk:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | awk -F~ '{print $2}'
Just use bash
$ string="20110906000418~118.221.246.17~DATA~DATA~DATA"
$ echo ${string#*~}
118.221.246.17~DATA~DATA~DATA
$ string=${string#*~}
$ echo ${string%%~*}
118.221.246.17
one more, using perl:
$ perl -F~ -lane 'print $F[1]' <<< '20110906000418~118.221.246.17~DATA~DATA~DATA'
118.221.246.17
bash:
#!/bin/bash
# Split each line of the file "ip" on ~ and print the second field
while IFS='~' read -r -a array
do
    echo "${array[1]}"
done < ip
If the string has a fixed layout, the following parameter expansion extracts the substring by offset and length:
$ a=20110906000418~118.221.246.17~DATA~DATA~DATA
$ echo ${a:15:14}
118.221.246.17
or using a regular expression with expr:
$ expr "$a" : '[^~]*~\([^~]*\)~.*'
118.221.246.17
last one, again using pure bash methods:
$ tmp=${a#*~}
$ echo $tmp
118.221.246.17~DATA~DATA~DATA
$ echo ${tmp%%~*}
118.221.246.17

Resources