How do I replace backspace characters (\b) using sed?

I want to delete a fixed number of backspace character occurrences (\b) from stdin. So far I have tried this:
echo -e "1234\b\b\b56" | sed 's/\b{3}//'
But it doesn't work. How can I achieve this using sed or some other unix shell tool?

You can use the hexadecimal value for backspace:
echo -e "1234\b\b\b56" | sed 's/\x08\{3\}//'
You also need to escape the braces.
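\x08 is a GNU sed escape, so a somewhat more portable variant is to hand sed the literal backspace byte. A minimal sketch, assuming bash (for the $'\b' quoting); the function name delete_backspace_run is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: on each input line, delete the first run of N
# consecutive backspace bytes. N is the first argument (default 3).
delete_backspace_run() {
    local n=${1:-3}
    local bs=$'\b'              # a literal backspace byte (0x08)
    sed "s/${bs}\{${n}\}//"     # \{N\} is the basic-regex repetition count
}

# Pipe through od -c to see the bytes instead of letting the terminal act on them.
printf '1234\b\b\b56\n' | delete_backspace_run 3 | od -c
```

Dropping the od -c stage prints 123456.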

You can use tr:
echo -e "1234\b\b\b56" | tr -d '\b'
123456
If you want to delete three consecutive backspaces, you can use Perl:
echo -e "1234\b\b\b56" | perl -pe 's/(\010){3}//'

sed interprets \b as a word boundary. I got this to work in perl like so:
echo -e "1234\b\b\b56" | perl -pe '$b="\b";s/$b//g'

With sed:
echo "123\b\b\b5" | sed 's/[\b]\{3\}//g'
You have to escape the { and } in the {3}, and also treat the \b special by using a character class.
[birryree#lilun ~]$ echo "123\b\b\b5" | sed 's/[\b]\{3\}//g'
1235

Note if you want to remove the characters being deleted also, have a look at ansi2html.sh which contains processing like:
printf "12..\b\b34\n" | sed ':s; s#[^\x08]\x08##g; t s'
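The loop above reads as follows: :s defines a label, the substitution removes a non-backspace character that is immediately followed by a backspace, and t s jumps back to the label for as long as a substitution was made. A sketch of the same idea that passes a literal backspace byte instead of the GNU-specific \x08, assuming bash; apply_backspaces is an illustrative name, not part of ansi2html.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper: "apply" backspaces by deleting each backspace
# together with the character it would erase.
apply_backspaces() {
    local bs=$'\b'                       # literal backspace byte (0x08)
    # Separate -e expressions keep the label portable to BSD sed.
    sed -e ':s' -e "s/[^${bs}]${bs}//g" -e 't s'
}

printf '12..\b\b34\n' | apply_backspaces   # prints 1234
```

A backspace with no character before it (for example at the start of a line) is left in place by this sketch.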

No need for Perl here!
# version 1
echo -e "1234\b\b\b56" | sed $'s/\b\{3\}//' | od -c
# version 2
bvar="$(printf '%b' '\b')"
echo -e "1234\b\b\b56" | sed 's/'${bvar}'\{3\}//' | od -c

Related

Linux Command to make columns separated by multiple delimiters

I want to convert the following pattern
ab
cd
de
fg
into
'ab','cd','de','fg'
using a Unix/Linux command.
Clarification: the actual input looks like this:
QRTC1065173134
QRTC3988977812
QRTC0889556882
QUTR1641276912
ABCD1763495154
QRTC3991601819
and this is the required output:
'QRTC1065173134','QRTC3988977812','QRTC0889556882','QUTR1641276912','ABCD1763495154','QRTC3991601819'

I agree with the comments that the question is a bit unclear, but for fun's sake:
A="ab cd ef gh "
echo $A | sed -e "s/^\s*/'/; s/ \{1,\}/','/g; s/\s*$/'/g"
Since the question wasn't precise, I only handled spaces and string boundaries. It works with any number of characters separated by one or more spaces, and the result is trimmed at both ends. HTH.

From the clarification, it seems that this is what you want.
$ echo "QRTC1065173134 QRTC3988977812 QRTC0889556882 QUTR1641276912 ABCD1763495154 QRTC3991601819" | sed -E "s/ /', '/g" | sed -E "s/$/'/" | sed -E "s/^/'/"
'QRTC1065173134', 'QRTC3988977812', 'QRTC0889556882', 'QUTR1641276912', 'ABCD1763495154', 'QRTC3991601819'
Here "E" is for extended regex so that we do not need to escape the regex metacharacters.
COMMENT 1: Removing an extra whitespace is left as an exercise for you.

If I understand correctly, you have groups (e.g. ab bc de ...) separated by spaces, and you want to put a ' at the beginning and end of the line and replace each space with ','. sed can handle this with relative ease. Below, ab cd ... can be any strings of characters, such as QRTC1065173134. There are several ways to piece together a matching regular expression, but the following is fairly simple:
sed -e "s/\s/','/g" -e "s/^/'/" -e "s/$/'/"
example
$ echo "ab cd de fg hi" | sed -e "s/\s/','/g" -e "s/^/'/" -e "s/$/'/"
'ab','cd','de','fg','hi'
or
$ echo "QRTC1065173134 QRTC3988977812 QRTC0889556882" | sed -e "s/\s/','/g" -e "s/^/'/" -e "s/$/'/"
'QRTC1065173134','QRTC3988977812','QRTC0889556882'
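The clarified input in the question has one value per line rather than space-separated values on a single line. A minimal sketch for that layout, assuming every input line is exactly one value (an empty line would come out as ''); quote_csv is a made-up name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap each input line in single quotes,
# then join all lines with commas.
quote_csv() {
    sed "s/.*/'&'/" | paste -sd, -
}

printf '%s\n' QRTC1065173134 QRTC3988977812 QRTC0889556882 | quote_csv
# prints 'QRTC1065173134','QRTC3988977812','QRTC0889556882'
```

With a file, the same thing would be quote_csv < file.txt.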

Best way to swap first 4 chars with last 4 chars of string?

What's the way to swap first 4 chars with last 4 chars of string?
e.g. I have the string 20140613, I'd like to convert that to 06132014.

$ f=20140613
$ g=${f#????}${f%????}
$ echo $g
06132014
For dealing with longer strings something like the following is needed. (With inspiration from konsolebox's answer.)
echo ${f:(-4)}${f:4:${#f} - 8}${f:0:4}
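The substring expansion above can be wrapped in a function that also leaves too-short strings alone. A sketch, assuming bash; swap_ends and its optional width argument are illustrative names, not something from the answers:

```shell
#!/usr/bin/env bash
# Hypothetical helper: swap the first N and last N characters of a string
# (N defaults to 4). Strings shorter than 2*N are printed unchanged.
swap_ends() {
    local s=$1 n=${2:-4}
    if (( ${#s} < 2 * n )); then
        printf '%s\n' "$s"
        return
    fi
    # last N chars + middle part + first N chars
    printf '%s\n' "${s: -n}${s:n:${#s}-2*n}${s:0:n}"
}

swap_ends 20140613        # prints 06132014
swap_ends 1111xxxx2222    # prints 2222xxxx1111
swap_ends 111222          # prints 111222 (too short, unchanged)
```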

Using pure BASH regex:
s='20140613'
[[ "$s" =~ ^(.*)([[:digit:]]{4})$ ]] && echo "${BASH_REMATCH[2]}${BASH_REMATCH[1]}"
06132014

Simply use substring expansion:
$ STRING=20140613
$ echo "${STRING:(-4)}${STRING:0:4}"
06132014
See Parameter Expansion.

Using GNU date, which is built for this kind of conversion:
$ str="20140613"
$ date +"%m%d%Y" -d "$str"
06132014
When you have to convert dates, no need to look so far ;)

Using sed:
STRING="20140613"
STRING=$(echo $STRING | sed 's/\(....\)\(.*\)/\2\1/')

Or using awk:
echo 20140613 | awk '{print substr($0,5,7) substr($0,1,4)}'
Test:
~$ echo 20140613 | awk '{print substr($0,5,7) substr($0,1,4)}'
>> 06132014

Through sed,
$ echo 20140613 | sed 's/^\(.\{4\}\)\(.\{4\}\)$/\2\1/g'
06132014
Through perl,
$ echo 20140613 | perl -pe 's/^(.{4})(.{4})$/$2$1/'
06132014

With GNU Coreutils:
input=20140613
output=$(echo $input | fold -w4 | tac | tr -d \\n)
If you also need the last line feed, you can replace tr -d \\n with printf %s%s\\n or just append && echo to the command.

With perl
for str in 11112222 1111xxxx2222 111222
do
echo -n "$str -> "
echo "$str" | perl -ple 's/^(.{4})(.*)(.{4})$/$3$2$1/'
done
produces:
11112222 -> 22221111
1111xxxx2222 -> 2222xxxx1111
111222 -> 111222

Bash sort by regexp

I have something about 100 files with the following syntax
ahfsdjfhdfhj_EPI_34_fdsafasdf
asdfasdf_EPI_2_fdsf
hfdjh_EPI_8_dhfffffffffff
ffffffffffasdfsdf_EPI_1_fyyy44
...
There is always EPI_NUMBER. How can I sort it by this number?

From your example it appears that delimiter is _ and text EPI_nnn comes at the same position after delimiter _. If that is always the case then you can use following command to sort the file:
sort -n -t "_" -k 3 file.txt
UPDATE:
If position of EPI_ text is not fixed then use following shell command:
sed 's/^\(.*EPI_\)\(.*\)$/\2##\1/' file.txt | sort -n -t "_" -k1 | sed 's/^\(.*\)##\(.*\)$/\2\1/'
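The pipeline above is a decorate, sort, undecorate pattern: put the sort key at the front of each line, sort on it, then strip it again. A compact sketch of the same pattern, assuming every line contains EPI_ followed by digits and that names contain no spaces before that point; sort_by_epi is a made-up name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: sort lines numerically by the number after "EPI_".
sort_by_epi() {
    sed 's/^.*EPI_\([0-9][0-9]*\)/\1 &/' |   # decorate: "NUMBER original-line"
        sort -n -k1,1 |                      # sort on the numeric key only
        cut -d' ' -f2-                       # undecorate: drop the key again
}

printf '%s\n' \
    ahfsdjfhdfhj_EPI_34_fdsafasdf \
    asdfasdf_EPI_2_fdsf \
    hfdjh_EPI_8_dhfffffffffff \
    ffffffffffasdfsdf_EPI_1_fyyy44 | sort_by_epi
```

This prints the four names in the order EPI_1, EPI_2, EPI_8, EPI_34.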

If Perl is okay you can:
print sort foo <>;
sub foo {
($x = $a) =~s/.*EPI_(\d+).*/$1/;
($y = $b) =~s/.*EPI_(\d+).*/$1/;
return $x <=> $y;
}
and use it as:
perl prg.pl inputfile

sed -e 's/EPI_/EPI /' file1 file2 ...|sort -n -k 2 -t ' '
Pipe that to sed -e 's/ /_/' to get back the original form.

This might work for you:
ls | sed 's/.*EPI_\([0-9]*\)/\1 &/' | sort -n | sed 's/\S* //'

linux bash, camel case string to separate by dash

Is there a way to convert something like this:
MyDirectoryFileLine
to
my-directory-file-line
I found some ways to convert all letters to uppercase or lowercase, but not in that way; any ideas?

You can use s/\([A-Z]\)/-\L\1/g to find an uppercase letter and replace it with a dash followed by its lowercase form. However, this gives you a dash at the beginning of the line, so you need another sed expression to handle that.
This should work:
sed --expression 's/\([A-Z]\)/-\L\1/g' \
--expression 's/^-//' \
<<< "MyDirectoryFileLine"

I propose to use sed to do that:
NEW=$(echo MyDirectoryFileLine \
| sed 's/\(.\)\([A-Z]\)/\1-\2/g' \
| tr '[:upper:]' '[:lower:]')
UPD: I forgot to convert to lowercase; the code above is updated.

echo MyDirectoryFileLine | perl -ne 'print lc(join("-", split(/(?=[A-Z])/)))'
prints my-directory-file-line

Slight variation on @bilalq's answer that covers some more possible edge cases:
echo "MyDirectoryMVPFileLine" \
| sed 's/\([^A-Z]\)\([A-Z0-9]\)/\1-\2/g' \
| sed 's/\([A-Z0-9]\)\([A-Z0-9]\)\([^A-Z]\)/\1-\2\3/g' \
| tr '[:upper:]' '[:lower:]'
output is still:
my-directory-mvp-file-line
but also:
WhatADeal -> what-a-deal
TheMVP -> the-mvp
DoSomeABTesting -> do-some-ab-testing
The3rdThing -> the-3rd-thing
The3Things -> the-3-things
ThingNumber3 -> thing-number-3

None of the solutions posted here worked for me. Most didn't support multiple platforms well. The one from @4ndrew was close, but it failed on edge cases that had multiple capitalized characters next to each other (example: FooMVPClient turns into foo-mv-pclient instead of foo-mvp-client).
This worked for me:
echo "MyDirectoryMVPFileLine" \
| sed 's/\([a-z]\)\([A-Z]\)/\1-\2/g' \
| sed 's/\([A-Z]\{2,\}\)\([A-Z]\)/\1-\2/g' \
| tr '[:upper:]' '[:lower:]'
output:
my-directory-mvp-file-line

My modest contribution, which also works with "/" (possibly useful for directory names or GitHub repo names). It's not as clean as it could be, but it does the job. I used @Peter's contribution as a base, then tweaked it a bit.
function kebab_case() {
echo -n "$1" |\
sed 's/\([^A-Z+]\)\([A-Z0-9]\)/\1-\2/g' |\
sed 's/\([0-9]\)\([A-Z]\)/\1-\2/g' |\
sed 's/\([A-Z]\)\([0-9]\)/\1-\2/g' |\
sed 's/--/-/g' |\
sed 's/\([\/]\)-/\1/g' |\
tr '[:upper:]' '[:lower:]'
}
function assert_kebab_equal() {
local Actual
local Expected
Expected="$1"
Actual="$(kebab_case "$2")"
if [ "${Expected}" != "${Actual}" ]; then
echo Error:
echo " Actual: ${Actual}"
echo "Expected: ${Expected}"
else
echo "$2" "$1" | awk '{ printf "%-30s -> %-40s\n", $1, $2}'
fi
}
assert_kebab_equal "abc-def" "AbcDef"
assert_kebab_equal "/abc-def-ghi/def" "/AbcDef-Ghi/Def"
assert_kebab_equal "/ab-cd-ef" "/AbCdEf"
assert_kebab_equal "repo-owner/repo-name" "RepoOwner/RepoName"
assert_kebab_equal "repo-12-owner/repo-12-name" "Repo12Owner/Repo12Name"
assert_kebab_equal "repo-12-3-owner/repo-12-name" "Repo12-3Owner/Repo12Name"
assert_kebab_equal "repo-owner/repo-name" "REPO-OWNER/REPO-NAME"
assert_kebab_equal "repo-owner-2/repo-name" "REPO-OWNER2/REPO-NAME"
assert_kebab_equal "repo-1-owner" "REPO1-OWNER"
assert_kebab_equal "repo-1-owner-1/22-repo-2-name" "REPO1-OWNER1/22REPO-2NAME"
# Outputs:
AbcDef -> abc-def
/AbcDef-Ghi/Def -> /abc-def-ghi/def
/AbCdEf -> /ab-cd-ef
RepoOwner/RepoName -> repo-owner/repo-name
Repo12Owner/Repo12Name -> repo-12-owner/repo-12-name
Repo12-3Owner/Repo12Name -> repo-12-3-owner/repo-12-name
REPO-OWNER/REPO-NAME -> repo-owner/repo-name
REPO-OWNER2/REPO-NAME -> repo-owner-2/repo-name
REPO1-OWNER -> repo-1-owner
REPO1-OWNER1/22REPO-2NAME -> repo-1-owner-1/22-repo-2-name

This might work for you:
<<<"MyDirectoryFileLine" sed 's/[A-Z]/-\l&/g;s/.//'
my-directory-file-line

With GNU sed:
echo "MyDirectoryFileLine"|sed -e 's/\([A-Z]\)/-\L\1/g'
You just need to strip the first dash if that bothers you:
echo "MyDirectoryFileLine"|sed -e 's/\([A-Z]\)/-\L\1/g' -e 's/^-//'
With BSD sed it's a bit longer:
echo "MyDirectoryFileLine"|sed -e 's/\([A-Z]\)/-\1/g' -e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' -e 's/^-//'
Update: the BSD version also works with GNU sed, so I recommend using it.

echo "SomeACRONYMInCamelCaseString" \
| sed -e 's/\([a-z]\)\([A-Z]\)/\1-\L\2/g' \
| sed -e 's/\(.*\)/\L\1/'
sed -e 's/\([a-z]\)\([A-Z]\)/\1-\L\2/g' replaces every uppercase letter that is preceded by a lowercase letter with a hyphen and the lowercase form of that letter.
sed -e 's/\(.*\)/\L\1/' converts the whole string to lowercase.

linux shell title case

I am writing a shell script and have a variable like this: something-that-is-hyphenated.
I need to use it in various points in the script as:
something-that-is-hyphenated, somethingthatishyphenated, SomethingThatIsHyphenated
I have managed to change it to somethingthatishyphenated by stripping out - using sed "s/-//g".
I am sure there is a simpler way, and I also need to know how to get the camel-cased version.
Edit: Working function derived from @Michał's answer
function hyphenToCamel {
tr '-' '\n' | awk '{printf "%s%s", toupper(substr($0,1,1)), substr($0,2)}'
}
CAMEL=$(echo something-that-is-hyphenated | hyphenToCamel)
echo $CAMEL
Edit: Finally, a sed one-liner thanks to @glenn
echo a-hyphenated-string | sed -E "s/(^|-)([a-z])/\u\2/g"

a GNU sed one-liner
echo something-that-is-hyphenated |
sed -e 's/-\([a-z]\)/\u\1/g' -e 's/^[a-z]/\u&/'
\u in the replacement string is documented in the sed manual.

Pure bashism:
var0=something-that-is-hyphenated
var1=(${var0//-/ })
var2=${var1[*]^}
var3=${var2// /}
echo $var3
SomethingThatIsHyphenated
Line 1 is trivial.
Line 2 is the bashism for replaceAll or 's/-/ /g', wrapped in parens, to build an array.
Line 3 uses ${foo^}, which uppercases the first character (${foo,} would lowercase it; note how ^ points up while , points down). To operate on the first letter of every word, we address the whole array with ${foo[*]} (or ${foo[@]}, if you prefer).
Line 4 is again a replace-all: blank with nothing.
Line 5 is trivial again.

You can define a function:
hypenToCamel() {
tr '-' '\n' | awk '{printf "%s%s", toupper(substr($0,1,1)), substr($0,2)}'
}
CAMEL=$(echo something-that-is-hyphenated | hypenToCamel)
echo $CAMEL

In the shell you are stuck with being messy:
aa="aaa-aaa-bbb-bbb"
echo " $aa" | sed -e 's/--*/ /g' -e 's/ a/A/g' -e 's/ b/B/g' ... -e 's/ *//g'
Note the carefully placed space in the echo and the double space in the last -e.
I leave it as an exercise to complete the code.
In perl it is a bit easier as a one-line shell command:
perl -e 'print map{ $a = ucfirst; $a =~ s/ +//g; $a} split( /-+/, $ARGV[0] ), "\n"' $aa

For the record, here's a pure Bash safe method (that is not subject to pathname expansion), using Bash≥4:
var0=something-that-is-hyphenated
IFS=- read -r -d '' -a var1 < <(printf '%s\0' "${var0,,}")
printf '%s' "${var1[@]^}"
This (safely) splits the lowercase expansion of var0 at the hyphens, with each split part in array var1. Then we use the ^ parameter expansion to uppercase the first character of the fields of this array, and concatenate them.
If your variable may also contain spaces and you want to act on them too, change IFS=- into IFS='- '.
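The same idea can be packaged as a reusable function. A minimal sketch, assuming Bash 4 or later for the ${word^} expansion; hyphen_to_camel is an illustrative name, and unlike the snippet above it does not lowercase the input first:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a hyphenated string into CamelCase.
hyphen_to_camel() {
    local IFS=-            # split on hyphens only, and only inside this function
    local -a parts
    local word out=''
    read -r -a parts <<< "$1"
    for word in "${parts[@]}"; do
        out+=${word^}      # ${word^} uppercases the first character
    done
    printf '%s\n' "$out"
}

var0=something-that-is-hyphenated
hyphen_to_camel "$var0"        # prints SomethingThatIsHyphenated
printf '%s\n' "${var0//-/}"    # prints somethingthatishyphenated (no sed needed)
```

Because the input is read with read -a rather than expanded unquoted, it is not subject to pathname expansion either.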
