I have a shell script that needs to trim the newline from its input. I am trying to trim the newline like so:
param=$1
trimmed_param=$(echo $param | tr -d "\n")
# is the new line in my trimmed_param? yes
echo $trimmed_param| od -xc
# if i just run the tr -d on the data, it's trimmed.
# why is it not trimmed in the dynamic execution of echo in line 2
echo $param| tr -d "\n" |od -xc
I run it from command line as follows:
sh test.sh someword
And I get this output:
0000000 6f73 656d 6f77 6472 000a
s o m e w o r d \n
0000011
0000000 6f73 656d 6f77 6472
s o m e w o r d
0000010
The last command in the script echoes what I think trimmed_param would be if the tr -d "\n" had worked in line 2. What am I missing?
I realize I can use sed etc but ... I would love to understand why this method is failing.
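There has never been a newline in the param. Command substitution strips trailing newlines, so trimmed_param is already clean; it is the echo in your od check that appends the newline. Try: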
# script.sh
param=$1
printf "%s" "${param}" | od -xc
Then
bash script.sh foo
gives you
0000000 6f66 006f
f o o
0000003
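To see where the newline comes from, here is a minimal sketch (the variable names bytes_printf and bytes_echo are mine, not from the question). It counts the bytes written by printf '%s' and by echo for the same stored value. If the explanation above is right, the value itself is 8 bytes and only echo adds a ninth.

```shell
#!/bin/sh
param="someword"

# Command substitution strips trailing newlines, so the stored value has none.
trimmed_param=$(echo "$param" | tr -d '\n')

# printf '%s' writes the value exactly as stored.
bytes_printf=$(printf '%s' "$trimmed_param" | wc -c | tr -d ' ')

# echo writes the value plus one newline.
bytes_echo=$(echo "$trimmed_param" | wc -c | tr -d ' ')

echo "printf: $bytes_printf bytes, echo: $bytes_echo bytes"
# prints: printf: 8 bytes, echo: 9 bytes
```

The tr -d ' ' after wc -c is only there because some wc implementations pad the count with spaces.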
without using sed or awk
I tried this command to solve the problem:
tr Ac Ze
but it doesn't work.
Can anyone help, please?
You can use the sed command:
g: applies the replacement to all matches of the regexp, not just the first.
s: stands for substitute.
$ -> echo Aca | sed 's/A/Z/g; s/c/e/g'
Zea
Or just use the tr command, as @James said:
$ -> echo Aca | tr Ac Ze
Zea
Another example:
#!/bin/bash
read -p "Insert word: " word
echo $word | tr Ac Ze
Result:
Insert word: Aca
Zea
Or:
#!/bin/bash
read -p "Insert word: " word
echo $word | sed 's/A/Z/g; s/c/e/g'
Additional info:
tr
$ -> whatis tr
tr (1) - translate or delete characters
sed
$ -> whatis sed
sed (1) - stream editor for filtering and transforming text
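As a small illustration of why tr Ac Ze gives Zea: tr maps characters by position, so the first character of the first set becomes the first character of the second set, and so on. The sketch below wraps that in a hypothetical helper, swap_chars, and uses printf with a quoted argument rather than an unquoted echo.

```shell
#!/bin/sh
# swap_chars WORD: replace every A with Z and every c with e.
# tr pairs the sets by position: A->Z, c->e. Other characters pass through.
swap_chars() {
    printf '%s\n' "$1" | tr 'Ac' 'Ze'
}

swap_chars "Aca"      # prints Zea
swap_chars "Acacia"   # prints Zeaeia
```

Note that tr reads standard input only, so running tr Ac Ze on its own just waits for input from the terminal.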
How do I convert a word into separate characters using a shell script?
For example, user to "u s e r".
$ echo user | sed 's/./& /g'
u s e r
Something like this works fine in bash 4:
$ readarray -t arr < <(grep -o '.' <<<"user")
$ declare -p arr #Just prints the array
declare -a arr=([0]="u" [1]="s" [2]="e" [3]="r")
$ echo "${arr[2]}"
e
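If bash 4 is not available, or the trailing space left by sed 's/./& /g' is a problem, one possible alternative is fold plus paste. This is only a sketch; split_chars is a hypothetical name, and it assumes single-byte characters, since fold counts columns or bytes depending on the implementation.

```shell
#!/bin/sh
# split_chars WORD: print the characters of WORD separated by single spaces.
split_chars() {
    # fold -w1 puts each character on its own line;
    # paste -s joins those lines back together with a space between them.
    printf '%s\n' "$1" | fold -w1 | paste -s -d ' ' -
}

split_chars "user"   # prints u s e r
```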
I would like to get a portion of a string using cut. Here a dummy example:
$ echo "foobar" | cut -c1-3 | hexdump -C
00000000 66 6f 6f 0a |foo.|
00000004
Notice the \n char added at the end.
In that case there is no point in using cut to remove the last char, as follows:
echo "foobar" | cut -c1-3 | rev | cut -c 1- | rev
I still get this extra, unwanted char, and I would like to avoid using an extra command such as:
shasum file | cut -c1-16 | perl -pe chomp
The \n is added by echo. Instead, use printf:
$ echo "foobar" | od -c
0000000 f o o b a r \n
0000007
$ printf "foobar" | od -c
0000000 f o o b a r
0000006
It is funny that cut itself also adds a new line:
$ printf "foobar" | cut -b1-3 | od -c
0000000 f o o \n
0000004
So the solution seems to be passing cut's output through printf (quoting the substitution so the result is not word-split):
$ printf "%s" "$(cut -b1-3 <<< "foobar")" | od -c
0000000 f o o
0000003
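The same idea can be written without the bash-only <<< here-string. A minimal sketch, with first_chars as a hypothetical helper name: the command substitution drops the newline that cut appends, and printf '%s' does not add one back.

```shell
#!/bin/sh
# first_chars N STRING: print the first N characters of STRING,
# with no trailing newline.
first_chars() {
    # $(...) strips the newline that cut appends to its output.
    printf '%s' "$(printf '%s\n' "$2" | cut -c1-"$1")"
}

first_chars 3 foobar | od -c
```

For the shasum case in the question, something like printf '%s' "$(shasum file | cut -c1-16)" should behave the same way, though I have not run it against a real file here.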
On Linux, this runs as expected:
$ echo -e "line1\r\nline2"|awk -v RS="\r\n" '/^line/ {print "awk: "$0}'
awk: line1
awk: line2
But under Windows the \r is dropped, so RS never matches and awk treats the input as a single record:
Windows:
$ echo -e "line1\r\nline2"|awk -v RS="\r\n" '/^line/ {print "awk: "$0}'
awk: line1
line2
Windows GNU Awk 4.0.1
Linux GNU Awk 3.1.8
EDIT from @EdMorton (sorry if this is an unwanted addition, but I think it may help demonstrate the issue):
Consider this RS setting and input (on cygwin):
$ awk 'BEGIN{printf "\"%s\"\n", RS}' | cat -v
"
"
$ echo -e "line1\r\nline2" | cat -v
line1^M
line2
This is Solaris with gawk:
$ echo -e "line1\r\nline2" | awk '1' | cat -v
line1^M
line2
and this is cygwin with gawk:
$ echo -e "line1\r\nline2" | awk '1' | cat -v
line1
line2
RS was just its default newline, so where did the control-M go in cygwin?
I just checked with Arnold Robbins (the provider of gawk) and the answer is that it's something done by the C libraries and to stop it happening you should set the awk BINMODE variable to 3:
$ echo -e "line1\r\nline2" | awk '1' | cat -v
line1
line2
$ echo -e "line1\r\nline2" | awk -v BINMODE=3 '1' | cat -v
line1^M
line2
See the man page for more info if interested.
It seems like the issue is awk-specific under Cygwin.
I tried a few different things, and it seems that awk is silently replacing \r\n with \n in the input data.
If we simply ask awk to repeat the text unmodified, it will "sanitize" the carriage returns without asking:
$ echo -e "line1\r\nline2" | od -a
0000000 l i n e 1 cr nl l i n e 2 nl
0000015
$ echo -e "line1\r\nline2" | awk '{ print $0; }' | od -a
0000000 l i n e 1 nl l i n e 2 nl
0000014
It will, however, leave other carriage returns intact:
$ echo -e "Test\rTesting\r\nTester\rTested" | awk '{ print $0; }' | od -a
0000000 T e s t cr T e s t i n g nl T e s
0000020 t e r cr T e s t e d nl
0000033
Using a custom record separator of _ ended up leaving the carriage returns intact:
$ echo -e "Testing\r_Tested" | awk -v RS="_" '{ print $0; }' | od -a
0000000 T e s t i n g cr nl T e s t e d nl
0000020 nl
0000021
The most telling example involves having \r\n in the data, but not as a record separator:
$ echo -e "Testing\r\nTested_Hello_World" | awk -v RS="_" '{ print $0; }' | od -a
0000000 T e s t i n g nl T e s t e d nl H
0000020 e l l o nl W o r l d nl nl
0000034
awk is blindly converting \r\n to \n in the input data even though we didn't ask it to.
This substitution seems to be happening before applying record separation, which explains why RS="\r\n" never matches anything. By the time awk is looking for \r\n, it's already substituted it with \n in the input data.
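If the script has to behave the same on Linux and Cygwin, one possible workaround is to leave RS at its default and strip a trailing carriage return inside awk. This is a sketch rather than something taken from the answers above; strip_cr is a hypothetical helper name. Where the \r has already been removed before awk sees the record, the sub() call simply does nothing.

```shell
#!/bin/sh
# strip_cr: read lines on stdin, drop one trailing \r per record if present,
# and print each record with the same "awk: " prefix used in the question.
strip_cr() {
    awk '{ sub(/\r$/, ""); print "awk: " $0 }'
}

printf 'line1\r\nline2\n' | strip_cr
# prints:
# awk: line1
# awk: line2
```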
How do I remove ^H and ^M characters from a file using Linux shell scripting?
^[[0^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H rcv-packets: 0
^[[0^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H^H ^H rcv-errs: 0
rcv-drop: 0
rcv-fifo: 0
rcv-frame: 0
What you're seeing there are control characters; you can simply delete them with tr:
cat your_file |
tr -d '\b\r'
This is better, since it avoids the extra cat:
tr -d '\b\r' < your_file
Two methods come to mind immediately (type control+v followed by control+h to enter a literal ^H):
tr -d ^H
sed 's/^H//g'
Here are both in action:
$ od -c test
0000000 \b h e l l o \b t h e r e \b \n
0000016
$ sed 's/^H//g' < test | od -c
0000000 h e l l o t h e r e \n
0000013
$ tr -d ^H < test | od -c
0000000 h e l l o t h e r e \n
0000013
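If typing control+v sequences is awkward (in a script file, for instance), the control characters can be generated with printf instead. A minimal sketch, assuming the variable names bs and cr and the helper strip_ctrl, none of which come from the answers above:

```shell
#!/bin/sh
# Build the control characters with printf instead of typing them literally.
bs=$(printf '\b')   # backspace, displayed as ^H
cr=$(printf '\r')   # carriage return, displayed as ^M

# strip_ctrl: delete every backspace and carriage return from stdin.
strip_ctrl() {
    sed "s/[$bs$cr]//g"
}

printf '\bhello\bthere\b\r\n' | strip_ctrl | od -c
```

The double quotes around the sed expression are what let the shell expand $bs and $cr before sed sees the pattern.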
For removing ^M characters appearing at the end of every line, I usually do this in the vi editor:
:%s/.$//g
It removes the last character of every line, whatever that character is, so use it only when every line really ends in ^M.
This solved my problem.
Use the sed utility.
See the examples below:
sed 's/%//' file > newfile
echo "82%%%" | sed 's/%*$//'
echo "68%" | sed "s/%$//" # assumes % is always at the end
You can remove all control characters by using tr. Note that tr reads only standard input, so the file must be redirected in, e.g.
tr -d "[:cntrl:]" < file.txt
To exclude some of them (like line endings), check: Removing control characters from a file.
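Because [:cntrl:] also matches newline and tab, deleting the whole class joins every line together. One way to keep those two is to list the remaining control characters as octal ranges. This is a sketch under the assumption of ASCII input; strip_cntrl_keep_ws is a hypothetical helper name, and NUL bytes are deliberately left out of the ranges.

```shell
#!/bin/sh
# strip_cntrl_keep_ws: delete ASCII control characters from stdin,
# but keep tab (\011) and newline (\012).
#   \001-\010  SOH through backspace
#   \013-\037  vertical tab through unit separator (includes \r and ESC)
#   \177       DEL
strip_cntrl_keep_ws() {
    tr -d '\001-\010\013-\037\177'
}

printf 'a\bb\rc\td\n' | strip_cntrl_keep_ws | od -c
```

Usage on a file would follow the same pattern as above: strip_cntrl_keep_ws < file.txt > cleaned.txt.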
If you want to change the original file, do this (BSD/macOS sed syntax; GNU sed expects -i.bak with no space):
sed -i '.bak' 's/^M//g ; s/^H//g' test.md
(^M is control+v control+m)
(^H is control+v control+h)
For many files, you can do this:
find source -name '*.md' | xargs sed -i '.bak' 's/^M//g ; s/^H//g'
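Since the -i option differs between sed implementations, a version that avoids it may be easier to carry between systems. The following is a minimal sketch, not the method from the answer above; strip_in_place is a hypothetical helper that keeps a .bak copy and then rewrites the file with tr.

```shell
#!/bin/sh
# strip_in_place FILE: back FILE up to FILE.bak, then rewrite FILE
# with every backspace (^H) and carriage return (^M) removed.
strip_in_place() {
    # Read from the backup and write to the original, so the same file
    # is never opened for reading and writing at once.
    cp "$1" "$1.bak" && tr -d '\b\r' < "$1.bak" > "$1"
}

# Example usage (test.md is the file name used above):
#   strip_in_place test.md
```

For many files, something like find source -name '*.md' combined with a loop that calls strip_in_place on each path should work, with the usual caveat about file names containing newlines.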