How to pass quoted arguments containing blank spaces in Linux

I have a file with these arguments and their values, laid out this way:
# parameters.txt
VAR1 001
VAR2 aaa
VAR3 'Hello World'
and another file to configure like this
# example.conf
VAR1 = 020
VAR2 = kab
VAR3 = ''
when I want to get the values in a function I use this command
while read p; do
VALUE=$(echo $p | awk '{print $2}')
done < parameters.txt
The first arguments yield the right values, but the last one only yields 'Hello because of the blank space. My question is: how do I get the entire 'Hello World' value?

If you can use bash, there is no need to use awk: read and shell parameter expansion can be combined to solve your problem:
while read -r name rest; do
# Drop the '= ' part, if present.
[[ $rest == '= '* ]] && value=${rest:2} || value=$rest
# $value now contains the line's value,
# but *including* any enclosing ' chars, if any.
# Assuming that there are no *embedded* ' chars., you can remove them
# as follows:
value=${value//\'/}
done < parameters.txt
read by default also breaks a line into fields by whitespace, like awk, but unlike awk it can assign the remainder of the line to a variable, namely the last one, if fewer variables are specified than fields are found;
read's -r option is generally worth specifying to avoid unexpected interpretation of \ chars. in the input.
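As a minimal sketch of that behaviour (the sample line is taken from parameters.txt; the here-string assumes bash), read with two variable names puts the first field in name and everything after it in rest:

```bash
# One line from parameters.txt, used here as sample input.
line="VAR3 'Hello World'"

# Two names, three whitespace-separated words: the last name receives the remainder.
read -r name rest <<< "$line"

printf 'name=%s\n' "$name"
printf 'rest=%s\n' "$rest"
```

The single quotes are still part of rest at this point; read does not interpret them.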
As for your solution attempt:
awk doesn't know about quoting in input - by default it breaks input into fields by whitespace, irrespective of quotation marks.
Thus, a string such as 'Hello World' is simply broken into fields 'Hello and World'.
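A quick way to see this for yourself is to ask awk how many fields it found. This is only a sketch using the sample line from parameters.txt; the variable names are made up for the illustration:

```bash
line="VAR3 'Hello World'"

# NF is the number of fields awk found on the line.
field_count=$(printf '%s\n' "$line" | awk '{ print NF }')

# $2 is only the first word of the quoted value, opening quote included.
second_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'fields: %s, second field: %s\n' "$field_count" "$second_field"
```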
However, in your case you can split each input line into its key and value using a carefully crafted FS value (FS is the input field separator, which can also be set via option -F; the command again assumes bash, this time for use of <(...), a so-called process substitution, and $'...', an ANSI C-quoted string):
while IFS= read -r value; do
# Work with $value...
done < <(awk -F$'^[[:alnum:]]+ (= )?\'?|\'' '{ print $2 }' parameters.txt)
Again the assumption is that values contain no embedded ' instances.
Field separator regex $'^[[:alnum:]]+ (= )?\'?|\'' splits each line so that $2, the 2nd field, contains the value, stripped of enclosing ' chars., if any.
xargs is the rare exception among the standard utilities in that it does understand single- and double-quoted strings (yet also without support for embedded quotes).
Thus, you could take advantage of xargs' ability to implicitly strip enclosing quotes when it passes arguments to the specified command, which defaults to echo (again assumes bash):
while read -r name rest; do
# Drop the '= ' part, if present.
[[ $rest == '= '* ]] && value=${rest:2} || value=$rest
# $value now contains the line's value, stripped of any enclosing
# single quotes by `xargs`.
done < <(xargs -L1 < parameters.txt)
xargs -L1 processes one (1) line (-L) at a time and implicitly invokes echo with all tokens found on each line, with any enclosing quotes removed from the individual tokens.
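To check what xargs hands back before wiring it into the loop, something like the following can be used. It is a sketch that assumes an xargs supporting -L (GNU and BSD versions do) and input without embedded quotes; the variable names are illustrative only:

```bash
# Two sample lines in the parameters.txt format.
input=$'VAR1 001\nVAR3 \'Hello World\''

# xargs -L1 runs echo once per input line; the enclosing single quotes
# are consumed by xargs, so only the bare value is echoed back.
unquoted=$(printf '%s\n' "$input" | xargs -L1)

printf '%s\n' "$unquoted"
```

Note that an unbalanced quote in the input (for example an apostrophe inside a value) would make xargs fail, in line with the limitation mentioned above.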

The default field separator in awk is whitespace, so the quoted value is split into several fields and $2 holds only its first word.
You can specify the field separator on the command line with -F[field separator]
Example, setting the field separator to a comma:
$ echo "Hello World" | awk -F, '{print $1}'
Hello World

Related

Way to replace one variable with another in a string

I need to replace one variable with another variable in multiple strings.
For example:
string1="One,two"
string2="three.four"
string3="five:six"
y=";"
for str in string1 string2 string3; do
x="$(echo "$str" | sed 's/[a-zA-Z]//g')" # extracting a character between letters
sed 's/$x/$y/'$str # I tried this, but it does not work at all.
echo "$str"
done
Expecting output:
One;two
three;four
five;six
In my output, nothing changes:
One,two
three.four
five:six
You can use bash's substitution operator instead of sed. And simply replace anything that isn't a letter with $y.
#!/bin/bash
string1="One,two"
string2="three.four"
string3="five:six"
y=";"
for str in "$string1" "$string2" "$string3"; do
x=${str//[^a-zA-Z]/$y}
echo "$x"
done
Output is:
One;two
three;four
five;six
Note that your general approach wouldn't work if the input string has multiple delimiters, e.g. One,two,three. When you remove all the letters you get ,, (two commas), but that sequence doesn't appear anywhere in the string.
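A small sketch of that failure case, using the hypothetical input One,two,three: the extracted delimiter string is not a substring of the input, while the pattern substitution still replaces every delimiter.

```bash
str="One,two,three"
y=";"

# The question's approach: strip the letters to find "the" delimiter.
x=$(printf '%s\n' "$str" | sed 's/[a-zA-Z]//g')

# Check whether that delimiter string occurs in the input at all.
case $str in
  *"$x"*) found=yes ;;
  *)      found=no ;;
esac

# Pattern substitution replaces each non-letter on its own.
fixed=${str//[^a-zA-Z]/$y}

printf 'x=%s found=%s fixed=%s\n' "$x" "$found" "$fixed"
```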
Addressing issues with OP's current code:
referencing variables requires a leading $, preferably a pair of {}, and (usually) double quotes (e.g., to ensure embedded spaces are considered part of the variable's value)
sed can take as input a) a stream of text on stdin, b) a file, c) process substitution or d) a here-document/here-string
when building a sed script that includes variable references, the sed script must be wrapped in double quotes (not single quotes)
Pulling all of this into OP's current code we get:
string1="One,two"
string2="three.four"
string3="five:six"
y=";"
for str in "${string1}" "${string2}" "${string3}"; do # proper references of the 3x "stringX" variables
x="$(echo "$str" | sed 's/[a-zA-Z]//g')"
sed "s/$x/$y/" <<< "${str}" # feeding "str" as here-string to sed; allowing variables "x/y" to be expanded in the sed script
echo "$str"
done
This generates:
One;two # generated by the 2nd sed call
One,two # generated by the echo
;hree.four # generated by the 2nd sed call
three.four # generated by the echo
five;six # generated by the 2nd sed call
five:six # generated by the echo
OK, so we're now getting some output but there are obviously some issues:
the results of the 2nd sed call are being sent to stdout/terminal as opposed to being captured in a variable (presumably the str variable - per the follow-on echo ???)
for string2 we find that x=. which when plugged into the 2nd sed call becomes sed "s/./;/"; from here the . matches the first character it finds which in this case is the 1st t in string2, so the output becomes ;hree.four (and the . is not replaced)
dynamically building sed scripts without knowing what's in x (and y) becomes tricky without some additional coding; instead it's typically easier to use parameter substitution to perform the replacements for us
in this particular case we can replace both sed calls with a single parameter substitution (which also eliminates the expensive overhead of two subprocesses for the $(echo ... | sed ...) call)
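The string2 problem can be reproduced in isolation. The sketch below assumes bash (for the here-string and the ${var/pattern/replacement} expansion); quoting "$x" inside the expansion makes it match literally, which is one way to sidestep the regex metacharacter issue:

```bash
str="three.four"
x="."
y=";"

# In a sed script the dot is a regex metacharacter: it matches the first character.
via_sed=$(sed "s/$x/$y/" <<< "$str")

# In a parameter expansion, a quoted pattern is taken literally.
via_expansion=${str/"$x"/$y}

printf 'sed: %s\nexpansion: %s\n' "$via_sed" "$via_expansion"
```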
Making a few changes to OP's current code we can try:
string1="One,two"
string2="three.four"
string3="five:six"
y=";"
for str in "${string1}" "${string2}" "${string3}"; do
x="${str//[^a-zA-Z]/${y}}" # parameter substitution; replace everything *but* a letter with the contents of variable "y"
echo "${str} => ${x}" # display old and new strings
done
This generates:
One,two => One;two
three.four => three;four
five:six => five;six

How to extract a substring from a string stored in a variable, based on a start / stop character

In the first line I'm after the value 64 and F2DD65
I want to get the first variable by reading from a string stored in a variable, from the beginning of the line until the : character, and the other variable by reading the 6 characters that follow the # character.
Is this possible?
This is the string:
var="64: (242,221,101) #F2DD65 srgb(242,221,101)"
my end result would be stored in variables:
var1="64"
var2="F2DD65"
var1=${var%%:*}
var2=${var##*#}
var2=${var2%% *}
Reference: Shell Parameter Expansion.
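For readers unfamiliar with these expansions, here is a commented sketch of the same idea. It is a variant rather than the exact code above: it takes "6 characters after the #" literally via a substring expansion (which assumes bash), and rest is a helper name introduced only for this example.

```bash
var="64: (242,221,101) #F2DD65 srgb(242,221,101)"

# %%:* removes the longest suffix that starts with ':', leaving the text before the first colon.
var1=${var%%:*}

# #*# removes the shortest prefix that ends with '#', leaving the text after the first '#'.
rest=${var#*#}

# Take exactly six characters from the start of what remains.
var2=${rest:0:6}

printf 'var1=%s var2=%s\n' "$var1" "$var2"
```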
sed -rn 's/(^.*)(\:.*#)(.*)([[:space:]].*$)/\1 - \3/p' <<< "64: (242,221,101) #F2DD65 srgb(242,221,101)"
With sed, split the line into sections using extended regular expressions (-r). Substitute the line with the relevant sections (the first and the third, separated by a -).
awk -F '[:# ]' '{ print $1" - "$5 }' <<< "64: (242,221,101) #F2DD65 srgb(242,221,101)"
With awk, split the line based on a :, a # and a space as delimiters. Print the 1st and 5th delimited fields with a - in between.
With bash regular expressions:
var="64: (242,221,101) #F2DD65 srgb(242,221,101)"
re="^([^:]+): .* #([[:xdigit:]]+)"
if [[ $var =~ $re ]]; then
var1="${BASH_REMATCH[1]}"
var2="${BASH_REMATCH[2]}"
else
# String isn't the right format
echo Fail
fi

How can I truncate a line of text longer than a given length?

How would you go about removing everything after x number of characters? For example, cut everything after 15 characters and add ... to it.
This is an example sentence should turn into This is an exam...
GNU head can cut by bytes rather than lines:
head -c 15 <<<'This is an example sentence'
Keep in mind, though, that head -c only deals with bytes, so it can cut a multi-byte character in half, e.g. a UTF-8 encoded umlaut such as ü.
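The byte-versus-character issue can be made visible with a short sketch. It assumes bash (for the $'...' string) plus the standard wc, head, od and tr utilities, and spells out the UTF-8 bytes of ü explicitly so the result does not depend on the terminal's encoding:

```bash
# "für" in UTF-8: f (1 byte), ü (2 bytes: c3 bc), r (1 byte).
s=$'f\xc3\xbcr'

# Three characters, but four bytes.
total_bytes=$(( $(printf '%s' "$s" | wc -c) ))

# Cutting after 2 bytes keeps "f" plus only the first half of "ü".
cut_bytes=$(printf '%s' "$s" | head -c 2 | od -An -tx1 | tr -d ' \n')

printf 'total bytes: %s, first two bytes (hex): %s\n' "$total_bytes" "$cut_bytes"
```

The dangling c3 byte is not valid UTF-8 on its own, which is why truncated output can show up as a replacement character.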
Bash built-in string indexing works:
str='This is an example sentence'
echo "${str:0:15}"
Output:
This is an exam
And finally something that works with ksh, dash, zsh…:
printf '%.15s\n' 'This is an example sentence'
Even programmatically:
n=15
printf '%.*s\n' $n 'This is an example sentence'
If you are using Bash, you can directly assign the output of printf to a variable and save a sub-shell call with:
trim_length=15
full_string='This is an example sentence'
printf -v trimmed_string '%.*s' $trim_length "$full_string"
Use sed:
echo 'some long string value' | sed 's/\(.\{15\}\).*/\1.../'
Output:
some long strin...
This solution has the advantage that short strings do not get the ... tail added:
echo 'short string' | sed 's/\(.\{15\}\).*/\1.../'
Output:
short string
So it's one solution for all sized outputs.
Use cut:
echo "This is an example sentence" | cut -c1-15
This is an exam
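The -c option selects characters rather than bytes (intended to handle multi-byte characters), here characters 1-15. Be aware that some implementations, GNU cut among them, have historically treated -c the same as -b, so test with multi-byte input before relying on it. Cf. cut(1):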
-b, --bytes=LIST
select only these bytes
-c, --characters=LIST
select only these characters
Awk can also accomplish this:
$ echo 'some long string value' | awk '{print substr($0, 1, 15) "..."}'
some long strin...
In awk, $0 is the current line. substr($0, 1, 15) extracts characters 1 through 15 from $0. The trailing "..." appends three dots.
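Note that the awk command above appends the dots to every line, short ones included. If that is not wanted, a length check can be added. The following is a sketch only; truncate_line is a made-up helper name and the limit is passed as its first argument:

```bash
# Hypothetical helper: truncate each stdin line to n characters and
# append "..." only when something was actually cut off.
truncate_line() {
  awk -v n="$1" '{ print (length($0) > n ? substr($0, 1, n) "..." : $0) }'
}

long_result=$(printf '%s\n' 'some long string value' | truncate_line 15)
short_result=$(printf '%s\n' 'short string' | truncate_line 15)

printf '%s\n%s\n' "$long_result" "$short_result"
```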
Todd actually has a good answer however I chose to change it up a little to make the function better and remove unnecessary parts :p
trim() {
if (( "${#1}" > "$2" )); then
echo "${1:0:$2}$3"
else
echo "$1"
fi
}
In this version the first argument is the text itself, the second argument is the maximum length, and the third argument is the text appended to strings that get truncated.
No need for variables :)
Using Bash Shell Expansions (No External Commands)
If you don't care about shell portability, you can do this entirely within Bash using a number of different shell expansions in the printf builtin. This avoids shelling out to external commands. For example:
trim () {
local str ellipsis_utf8
local -i maxlen
# use explaining variables; avoid magic numbers
str="$*"
maxlen="15"
ellipsis_utf8=$'\u2026'
# only truncate $str when longer than $maxlen
if (( "${#str}" > "$maxlen" )); then
printf "%s%s\n" "${str:0:$maxlen}" "${ellipsis_utf8}"
else
printf "%s\n" "$str"
fi
}
trim "This is an example sentence." # This is an exam…
trim "Short sentence." # Short sentence.
trim "-n Flag-like strings." # -n Flag-like st…
trim "With interstitial -E flag." # With interstiti…
You can also loop through an entire file this way. Given a file containing the same sentences above (one per line), you can use the read builtin's default REPLY variable as follows:
while read -r; do
trim "$REPLY"
done < example.txt
Whether or not this approach is faster or easier to read is debatable, but it's 100% Bash and executes without forks or subshells.

Bash remove substring in file from string

I've one string like this:
myString='value1|value57|value31|value21'
and I've a file, called values_to_remove.txt containing a list of values, one per line, in this way
values_to_remove.txt
value1
value31
In bash, how can I remove the values contained in "values_to_remove.txt" from the string, taking into account that the values are separated by pipes and that, if I remove a value, I also have to remove the preceding or following pipe, if any?
I've achieved this in python and called the python script from bash, but I need to do this directly in bash with one line command, rather than small script, otherwise I can already use my little python script.
That's the python code
myString = 'value1|value2|value3|value4'
arrString = myString.split("|")
with open("myfile.txt", encoding="utf-8") as file:
    for l in file:
        l = l.rstrip("\n")  # drop the trailing newline before comparing
        if l in arrString:
            arrString.remove(l)
myNewString = "|".join(arrString)
Note that the values separated by pipes can be any string.
Thank you
You may use this awk:
awk -v str="$myString" 'BEGIN {
n = split(str, a, /\|/)
}
{
val[$1]
}
END {
for (i=1; i<=n; i++)
if (!(a[i] in val))
s = (s == "" ? "" : s "|") a[i]
print s
}' values_to_remove.txt
value57|value21
This awk first uses a split function to split input string on |
It stores all values to be removed in another array val
In the end block it loops through split array and builds a string if value is not found in to-be-removed array.
Here is a bash solution (the if statement is a runtime optimization to skip the replacement in case of no match, thanks @Inian):
for val in value1 value31; do
if [[ "$mystring" =~ \|$val|$val\| ]]; then
mystring=${mystring/$BASH_REMATCH/}
fi
done
This looks, in pure bash, for the first match of a regular expression that matches either |value or value| and removes it. Note that you can't match both pipes at the same time, because then you would delete too many separators. If there is a chance that there are no separators at all, you need to add ? after each pipe (maybe just the second one is enough).
You can also avoid regular expressions and just attempt to delete both a prior and a posterior pipe:
for val in value1 value31; do
mystring=${mystring/|$val/};
mystring=${mystring/$val|/};
done
All of these can be written on one line if you really need to:
for val in value1 value31; do [[ "$mystring" =~ \|$val|$val\| ]] && mystring=${mystring/$BASH_REMATCH/}; done
A pure bash solution:
#!/usr/bin/env bash
# Define the location of the values-to-be-removed file
: ${PATH_TO_FILE:=${1:-"./values_to_remove.txt"}}
# Define the string we will be working with
: ${MY_STRING:=${2:-"value1|value57|value31|value21"}}
# Process all entries in PATH_TO_FILE, one by one
while read -r substring || [[ -n "$substring" ]]; do
# Remove "substring|" from the beginning of MY_STRING
MY_STRING=${MY_STRING#${substring}|}
# Remove "|substring" from the rest of MY_STRING
MY_STRING=${MY_STRING//|${substring}}
done < "${PATH_TO_FILE}"
# Return the results
echo "${MY_STRING}"
Why do we...
Use ${VAR_NAME:=${1:-"DEFAULT_VALUE"}} notation? To allow the user to customise the script's inputs either via environment variables or script arguments. Basically, this notation says:
If the VAR_NAME environment variable exists (and is non-empty), then use it;
If VAR_NAME doesn't exist, then set VAR_NAME to the value of the first argument to the script;
If the first argument doesn't exist either, then set VAR_NAME to the DEFAULT_VALUE.
Use read -r substring || [[ -n "$substring" ]] to read the file? Because read allows us to read the content of the ./values_to_remove.txt file line by line. The [[ -n "$substring" ]] bit is there to catch the last line in the file if it doesn't end with a newline.
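The three-step fallback of the ${VAR_NAME:=${1:-"DEFAULT_VALUE"}} notation can be tried out in isolation. This is a sketch with made-up names (pick, SETTING); each call runs in a command substitution so the assignments do not leak between the cases:

```bash
# Hypothetical helper: resolve SETTING from the environment, then from the
# first argument, then from a built-in default.
pick() {
  : "${SETTING:=${1:-default}}"
  printf '%s\n' "$SETTING"
}

unset SETTING
from_default=$(pick)                       # nothing set, no argument
from_arg=$(pick from-arg)                  # nothing set, argument given
from_env=$(SETTING=from-env; pick ignored) # variable already set: argument is ignored

printf '%s\n' "$from_default" "$from_arg" "$from_env"
```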
References:
Assign a default value in bash
Return default value in bash
Bash substring removal
Bash search and replace
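One caveat with the substring-based removals above is that a value can also match part of a longer value (removing value3 would also hit value31). A possible alternative, sketched here under the assumptions of bash 4 or later (for associative arrays) and non-empty values, compares whole fields instead. The temporary file merely stands in for values_to_remove.txt, and all variable names are illustrative:

```bash
myString='value1|value57|value31|value21'

# Stand-in for values_to_remove.txt so the sketch is self-contained.
list_file=$(mktemp)
printf '%s\n' value1 value31 > "$list_file"

# Load the values to remove as keys of an associative array.
declare -A drop=()
while IFS= read -r v || [[ -n $v ]]; do
  [[ -n $v ]] && drop[$v]=1
done < "$list_file"

# Split the string on '|' and keep only fields that are not in the drop set.
IFS='|' read -r -a parts <<< "$myString"
kept=()
for p in "${parts[@]}"; do
  [[ -n $p && -n ${drop[$p]+set} ]] || kept+=("$p")
done

# Join the remaining fields with '|' again.
result=$(IFS='|'; printf '%s' "${kept[*]}")
rm -f "$list_file"

printf '%s\n' "$result"
```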

rearranging column based on condition

I have a *.csv file with values as below
"ASDP02","8801942183589"
"ASDP06","8801939151023"
"CSDP04","8801963981740"
"ASDP09","8801946305047"
"ASDP12","8801941195677"
"ASDP05","8801922826186"
"CSDP08","8801983008938"
"ASDP04","8801944346555"
"CSDP11","8801910831518"
or sometimes the value is as below
"8801989353984","KSDP05"
"8801957608165","ASDP11"
"8801991455848","CSDP10"
"8801981363116","CSDP07"
"8801921247870","KSDP07"
"8801965386240","CSDP06"
"8801956293036","KSDP10"
"8801984383904","KSDP11"
"8801944211742","ASDP09"
I just want to put the numeric value (e.g. 8801989353984) always in 1st column. Is it possible using BASH script?
Sed is also your friend here
Input
cat 41189347
"ASDP02","8801942183589"
"ASDP06","8801939151023"
"CSDP04","8801963981740"
"ASDP09","8801946305047"
"ASDP12","8801941195677"
"ASDP05","8801922826186"
"CSDP08","8801983008938"
"ASDP04","8801944346555"
"CSDP11","8801910831518"
Script
sed -E 's/^("[[:alpha:]]+.*"),("[[:digit:]]+")$/\2,\1/' 41189347
Output
"8801942183589","ASDP02"
"8801939151023","ASDP06"
"8801963981740","CSDP04"
"8801946305047","ASDP09"
"8801941195677","ASDP12"
"8801922826186","ASDP05"
"8801983008938","CSDP08"
"8801944346555","ASDP04"
"8801910831518","CSDP11"
awk to the rescue!
$ awk -F, -v OFS=, '$1~/[A-Z]/{t=$2;$2=$1;$1=t}1' file
if first field has alpha chars, swap first and second columns and print.
Bash can do the work, but awk might be a better choice for rearranging your file:
sample.csv:
"ASDP02","8801942183589"
"8801944211742","ASDP09"
command:
awk -F, 'BEGIN{OFS=","}{$1=$1;if(substr($1, 2, length($1) - 2) + 0 == substr($1, 2, length($1) - 2)){print $1,$2}else{print $2,$1}}' sample.csv
substr($1, 2, length($1) - 2) + 0 == substr($1, 2, length($1) - 2) checks whether the first column (without its enclosing quotes) is numeric. If it is, the original line is printed; otherwise column 1 and column 2 are switched.
Output:
"8801942183589","ASDP02"
"8801944211742","ASDP09"
You can create a pure bash script to generate other file which has the structure you need:
#!/bin/bash
csv_file="/path/to/your/csvfile"
output_file="/path/to/output_file"
#Optional
rm -rf "${output_file}"
readarray -t LINES < <(cat < "${csv_file}" 2> /dev/null)
for item in "${LINES[@]}"; do
if [[ $item =~ ^\"([0-9A-Z]+)\"\,\"([0-9]+)\" ]]; then
echo "\"${BASH_REMATCH[2]}\",\"${BASH_REMATCH[1]}\"" >> "${output_file}"
else
echo "$item" >> "${output_file}"
fi
done
This works even if your file is "mixed", i.e. with some lines in the right format and other lines in the wrong one.
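The same idea can also be written as a filter that reads stdin and writes stdout, without loading the whole file into an array. This is a sketch, not the script above: normalize_csv and the two regex variables are names invented here, and it assumes bash and exactly two quoted cells per line with no embedded commas or quotes.

```bash
# Line already starts with a quoted numeric cell: leave it alone.
re_numeric_first='^"[0-9]+",'
# Two quoted cells where the second one is numeric: candidates for swapping.
re_two_cells='^("[^"]*"),("[0-9]+")$'

normalize_csv() {
  local line
  while IFS= read -r line || [[ -n $line ]]; do
    if [[ ! $line =~ $re_numeric_first && $line =~ $re_two_cells ]]; then
      # Put the numeric cell first.
      printf '%s,%s\n' "${BASH_REMATCH[2]}" "${BASH_REMATCH[1]}"
    else
      printf '%s\n' "$line"
    fi
  done
}

out=$(printf '%s\n' '"ASDP02","8801942183589"' '"8801944211742","ASDP09"' | normalize_csv)
printf '%s\n' "$out"
```

It could then be used as normalize_csv < input.csv > output.csv, with both file names being placeholders.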
The following commands assume that the cells in the CSV files do not contain newlines and commas. Otherwise, you should write a more complicated script in Perl, PHP, or other programming language capable of parsing CSV files properly. But Bash, definitely, is not appropriate for this task.
Perl
perl -F, -anle '@F = reverse @F if $F[0] !~ /^"\d+"$/;
print join(",", @F)' file
Beware: if the cells contain newlines or commas, use Perl's Text::CSV module, for instance. Although it is a simple task in Perl, it goes beyond the scope of the current question.
The command splits the input lines by commas (-F,) and stores the result into the @F array, for each line. The items in the array are reversed if the first field $F[0] does not match the regular expression, i.e. if it is not already numeric. You can also swap the items this way: ($F[0], $F[1]) = ($F[1], $F[0]).
Finally, the command joins the array items with commas and prints the result to the standard output.
If you want to edit the file in-place, use -i option: perl -i.backup -F, ....
AWK
awk -F, -vOFS=, '/^"[0-9]+",/ {print; next}
{ t = $1; $1 = $2; $2 = t; print }' file
The input and output field separators are set to , with -F, and -vOFS=,.
If the line matches the pattern /^"[0-9]+",/ (the line begins with a "numeric" CSV column), the script prints the record and advances to the next record. Otherwise the next block is executed.
In the next block, it swaps the first two columns and prints the result to the standard output.
If you want to edit the file in-place, see answers to this question.
