Concatenating string read from file with string literals creates jumbled output - string

My problem is that the result is jumbled. Consider this script:
#!/bin/bash
INPUT="filelist.txt"
i=0;
while read label
do
i=$[$i+1]
echo "HELLO${label}WORLD"
done <<< $'1\n2\n3\n4'
i=0;
while read label
do
i=$[$i+1]
echo "HELLO${label}WORLD"
done < "$INPUT"
filelist.txt
5
8
15
67
...
The first loop, which takes its input directly (through what I believe is called a here-string, the <<< operator), gives the expected output:
HELLO1WORLD
HELLO2WORLD
HELLO3WORLD
HELLO4WORLD
The second loop, which reads from the file, gives the following jumbled output:
WORLD5
WORLD8
WORLD15
WORLD67
I've tried echo $label: This works as expected in both cases, but the concatenation fails in the second case as described. Further, the exact same code works on my Win 7, git-bash environment. This issue is on OSX 10.7 Lion.
How to concatenate strings in bash |
Bash variables concatenation |
concat string in a shell script

Well, just as I was about to hit post, the solution hit me. Sharing here so someone else can find it - it took me 3 hours to debug this (despite being on SO for almost all that time) so I see value in addressing this specific (common) use case.
The problem is that filelist.txt was created in Windows. This means it has CRLF line endings, while OSX (like other Unix-like environments) expects LF only line endings. (See more here: Difference between CR LF, LF and CR line break types?)
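To confirm that a stray carriage return is what jumbles the output, it helps to look at the bytes rather than at the terminal. The following is a minimal sketch, assuming a hypothetical one-line sample file written with a CRLF ending; the variable names are made up for illustration.

```shell
# Hypothetical sample file with one CRLF-terminated line, as Windows would write it.
sample=$(mktemp)
printf '5\r\n' > "$sample"

# read consumes the LF, but the CR stays at the end of the value.
IFS= read -r label < "$sample"

# od -c prints each byte, so the \r between "5" and "WORLD" becomes visible
# instead of moving the cursor back to the start of the line.
printf 'HELLO%sWORLD\n' "$label" | od -c

# The value is two characters long: "5" plus the carriage return.
label_length=${#label}
echo "length of label: $label_length"

rm -f "$sample"
```

On a terminal the carriage return sends the cursor back to column one, so WORLD is printed over HELLO, which matches the jumbled lines shown in the question.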
I used the answer here to convert the file before consumption. Using sed I managed to replace only the final line's carriage return, so I stuck to known guns and went for the perl approach. Final script is below:
#!/bin/bash
INPUTFILE="filelist.txt"
INPUT=$(perl -pe 's/\r\n|\n|\r/\n/g' "$INPUTFILE")
i=0;
while read label
do
i=$[$i+1]
echo "HELLO${label}WORLD"
done <<< "$INPUT"
Question has been asked in a different form at Bash: Concatenating strings fails when read from certain files

Related

How do you append a string built with interpolation of vars and STDIN to a file?

Can someone fix this for me.
It should copy a version log file to backup after moving to a repo directory
Then it automatically appends line given as input to the log file with some formatting.
That's it.
Assume existence of log file and test directory.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG |
VHENTRY="- **${LOGDATE}** | ${VHMSG}"
cat ${VHENTRY} >> versionlog.MD
shell output
virufac#box:~/Git/test$ ~/.logvh.sh
MSG > testing script
EOF
EOL]
EOL
e
E
CTRL + C to get out of stuck in reading lines of input
virufac#box:~/Git/test$ cat versionlog.MD
directly outputs the markdown
# Version Log
## version 0.0.1 established 01-22-2020
*Working Towards Working Mission 1 Demo in 0.1 *
- **01-22-2020** | discovered faker.Faker and deprecated old namelessgen
EOF
EOL]
EOL
e
E
I finally got it to save the damned input lines to the file instead of just echoing the command I wanted to enter on the screen and not executing it. But... why isn't it adding the lines built from the VHENTRY variable... and why doesn't it stop reading after one line sometimes and this time not. You could see I was trying to do something to tell it to stop reading the input.
After some realizing a thing I had done in the script was by accident... I tried to fix it and saw that the | at the end of the read command was seemingly the only reason the script did any of what it did save to the file in the first place.
I would have done this in python3 if I had known this script wouldn't be the simplest thing I had ever done. Now I just have to know how you do it after all the time spent on it so that I can remember never to think a shell script will save time again.
Use printf to write a string to a file. cat tries to read from a file named in the argument list. And when the argument is - it means to read from standard input until EOF. So your script is hanging because it's waiting for you to type all the input.
Don't put quotes around the path when it starts with ~, as the quotes make it a literal instead of expanding to the home directory.
Get rid of | at the end of the read line. read doesn't write anything to stdout, so there's nothing to pipe to the following command.
There isn't really any need for the VHENTRY variable, you can do that formatting in the printf argument.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG
printf -- '- **%s** | %s\n' "${LOGDATE}" "$VHMSG" >> versionlog.MD
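The difference between cat and printf described above can be seen in isolation. This is only a sketch: the entry text and the temporary log file are stand-ins, and it assumes no file with the entry's text as its name exists in the current directory.

```shell
log=$(mktemp)
entry="- **01-22-2020** | testing script"   # hypothetical entry text

# cat treats its argument as a file name, so this fails: no such file exists.
if cat -- "$entry" >> "$log" 2>/dev/null; then
    cat_worked=yes
else
    cat_worked=no
fi

# printf writes the string itself to the log.
printf '%s\n' "$entry" >> "$log"
last_line=$(tail -n 1 "$log")

echo "cat worked: $cat_worked"
echo "last line:  $last_line"

rm -f "$log"
```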

Understanding sed

I am trying to understand how
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
worked and what the pieces mean. The man page I read just confused me more, and I tried the info page it mentioned but had no idea how to work it! I'm pretty new to Linux. Debian is my first distro, but it seemed like a rather logical place to start, as it is the root of many others and has been around a while, so it is probably doing things well and is fairly standardized. I am running Wheezy 64-bit, FYI, if needed.
The sed command is a stream editor, reading its file (or STDIN) for input, applying commands to the input, and presenting the results (if any) to the output (STDOUT).
The general syntax for sed is
sed [OPTIONS] COMMAND FILE
In the shell command you gave:
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
the sed command is s/\^\[/\o33/g;s/\[1G\[/\[27G\[/ and /var/log/boot is the file.
The given sed command is actually two separate commands:
1. s/\^\[/\o33/g
2. s/\[1G\[/\[27G\[/
The intent of #1, the s (substitute) command, is to replace all occurrences of the two-character string '^[' with the character whose octal value is 033 (the ESC character). The \o33 in the replacement is GNU sed's octal escape, written \oNNN. It is a GNU extension, so the command works as written with the sed shipped on Debian, but other sed implementations may not accept it.
Notice the trailing g after the replacement string? It means to perform a global replacement; without it, only the first occurrence on each line would be changed.
The purpose of #2 is to replace the string [1G[ with [27G[ (the backslashes escape the brackets so that they are matched literally). As written, without a trailing g, it changes only the first occurrence on each line. If every occurrence should be replaced, the second command needs to be written like this:
s/\[1G\[/\[27G\[/g
Finally, putting all this together, the two sed commands are applied to each line of the /var/log/boot file, and the output has the occurrences of ^[ converted into the ESC character (octal 033) and the string [1G[ converted to [27G[.
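The effect of the trailing g can be checked on a small input. This is a minimal sketch with a made-up input line, not text taken from /var/log/boot, and it writes the replacement with plain brackets since nothing needs escaping there.

```shell
line='x[1G[y[1G[z'   # hypothetical input line containing two matches

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/\[1G\[/[27G[/')

# With g: every match on the line is replaced.
every_one=$(printf '%s\n' "$line" | sed 's/\[1G\[/[27G[/g')

echo "$first_only"   # x[27G[y[1G[z
echo "$every_one"    # x[27G[y[27G[z
```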

Read content from text file formed in Windows in Linux bash [duplicate]

I am trying to download files from a database using wget and url. E.g.
wget "http://www.rcsb.org/pdb/files/1BXS.pdb"
So the format of the URL is: http://www.rcsb.org/pdb/files/($idnumber).pdb
But I have many files to download; so I wrote a bash script that reads id_numbers from a text file, forms url string and downloads by wget.
#!/bin/bash
while read line
do
url="http://www.rcsb.org/pdb/files/$line.pdb"
echo -e $url
wget $url
done < id_numbers.txt
However, url string is formed as
.pdb://www.rcsb.org/pdb/files/4H80
So, http is replaced with .pdb. I cannot figure out why. Does anyone have an idea?
How can I format it so url is
"http://www.rcsb.org/pdb/files/($idnumber).pdb"
?
Thanks a lot.
Note. This question was marked as a duplicate of 'How to concatenate strings in bash?' but I was actually asking for something else. I read that question before asking this one, and it turns out my problem was with preparing the txt file in Windows, not really with string concatenation. I edited the question title. I hope it is clearer now.
It sounds like your id_numbers.txt file has DOS/Windows-style line endings (carriage return followed by linefeed characters) instead of plain unix line endings (just linefeed). The result is that read thinks the line ends with a carriage return, $line actually has a carriage return at the end, and that gets embedded in the url, causing various confusion.
There are several ways to solve this. You could have bash trim the carriage return from the variable when you use it:
url="http://www.rcsb.org/pdb/files/${line%$'\r'}.pdb"
Or you could have read trim it by telling it that carriage return counts as whitespace (read will trim leading and trailing whitespace from what it reads):
while IFS=$'\r' read line
Or you could use a command like dos2unix (or whatever the equivalent is on your OS) to convert the id_numbers.txt file.
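Putting the first option into a complete loop might look like the sketch below. The sample IDs, the temporary file and the urls variable are assumptions for illustration, the carriage return is built with printf so the loop also runs in a plain POSIX shell, and the wget call is commented out so nothing is downloaded.

```shell
# Hypothetical input: two IDs with CRLF line endings.
ids=$(mktemp)
printf '4H80\r\n1BXS\r\n' > "$ids"

cr=$(printf '\r')   # a literal carriage return
urls=""
while IFS= read -r line || [ -n "$line" ]; do
    line=${line%"$cr"}            # drop one trailing CR, if present
    [ -n "$line" ] || continue    # skip blank lines
    url="http://www.rcsb.org/pdb/files/${line}.pdb"
    urls="${urls}${url}
"
    # wget "$url"                 # uncomment to actually download
done < "$ids"

printf '%s' "$urls"
rm -f "$ids"
```

The extra test after read keeps the last line from being lost when the file does not end with a newline.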
The -e option to echo enables interpretation of backslash escapes (the option that suppresses the trailing newline is -n); you do not need it here.
Also I suspect your file containing the ids to be malformed, on which OS did you create it?
Anyway, you can simplify your script this way:
#!/bin/bash
while read line
do
wget "http://www.rcsb.org/pdb/files/$line.pdb"
done < id_numbers.txt
I was able to successfully test it with an id_numbers.txt file generated like so:
for i in $(seq 0 9) ; do echo "$i" >> id_numbers.txt ; done
Try this:
url="http://www.rcsb.org/pdb/files/"$line
url=$url".pdb"
For more info, check How to concatenate string variables in Bash?

Can't read to var in Bash

I wrote a little Bash script and I'm having a problem while reading from the command line. I think it's because I wrote the script on Windows. Here is the code:
read NEW_MODX_PROJECT
and the output of the debug mode
+ read $'NEW_MODX_PROJECT\r'
Finally here the error I get
': Ist kein gültiger Bezeichner.DX_PROJECT
I think in English it should mean "': is not a valid identifier.DX_PROJECT"
While writing it on Windows, it worked fine. I used console2 to test it which is using the sh.exe.
Your assertion is correct -- Windows uses CRLF line separators but Linux just uses a LF.
The reason for your strange error message is that while printing the name of your variable, it includes the carriage return as part of its name -- the terminal then jumps back to the first column to print the rest of the error message (which overwrites the beginning of the message with the end of it).
There are a set of utilities known as dos2unix and unix2dos which you can use to easily convert between formats, e.g.:
dos2unix myscript.sh
If you don't happen to have them, you can achieve the same using tr:
tr -d '\r' < myscript.sh > myscript-new.sh
Either will strip all the carriage returns and should un-confuse things.
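To check whether a file needs converting, and to confirm that the conversion worked, the lines containing a carriage return can be counted before and after. This is a small sketch using a hypothetical two-line script written to a temporary file.

```shell
script=$(mktemp)
fixed=$(mktemp)

# Hypothetical script saved with CRLF line endings.
printf 'read NEW_MODX_PROJECT\r\necho done\r\n' > "$script"

cr=$(printf '\r')   # a literal carriage return to search for

# grep -c counts matching lines; it exits non-zero when the count is 0.
before=$(grep -c "$cr" "$script" || true)

tr -d '\r' < "$script" > "$fixed"

after=$(grep -c "$cr" "$fixed" || true)

echo "lines with CR before: $before, after: $after"
rm -f "$script" "$fixed"
```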

How can I replace a specific line by line number in a text file?

I have a 2GB text file on my linux box that I'm trying to import into my database.
The problem I'm having is that the script that is processing this rdf file is choking on one line:
mismatched tag at line 25462599, column 2, byte 1455502679:
<link r:resource="http://www.epuron.de/"/>
<link r:resource="http://www.oekoworld.com/"/>
</Topic>
=^
I want to replace the </Topic> with </Line>. I can't do a search/replace on all lines, but I do have the line number, so I'm hoping there's some easy way to just replace that one line with the new text.
Any ideas/suggestions?
sed -i yourfile.xml -e '25462599s!</Topic>!</Line>!'
sed -i '25462599 s|</Topic>|</Line>|' nameoffile.txt
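Where in-place editing is not available, the same line-addressed substitution can write to a temporary file that then replaces the original. The sketch below uses a made-up three-line file and line number 2 in place of the real 2GB file and line 25462599.

```shell
file=$(mktemp)
tmp=$(mktemp)

# Hypothetical small file; only line 2 should change.
printf '<a/>\n</Topic>\n</Topic>\n' > "$file"
n=2

# The line number in front of s restricts the substitution to that line.
sed "${n}s|</Topic>|</Line>|" "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
echo "$result"
rm -f "$file"
```

For a file as large as the one in the question this needs enough free disk space for a second copy.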
The tool for editing text files in Unix, is called ed (as opposed to sed, which as the name implies is a stream editor).
ed was once intended as an interactive editor, but it can also easily be scripted. The way ed works is that all commands take an address parameter. The way to address a specific line is just the line number, and the way to change the addressed line(s) is the s command, which takes the same regexp that sed would. So, to change the 42nd line, you would write something like 42s/old/new/.
Here's the entire command:
FILENAME=/path/to/whereever
LINENUMBER=25462599
ed -- "${FILENAME}" <<-HERE
${LINENUMBER}s!</Topic>!</Line>!
w
q
HERE
The advantage of this is that ed is standardized, while the -i flag to sed is a non-standard extension: GNU sed and BSD sed implement it with different syntax, and some other systems do not have it at all.
Use "head" to get the first 25462598 lines and use "tail" to get the remaining lines (starting at 25462600), then put the replacement line between them. Though... for a 2GB file this will likely take a while.
Also are you sure the problem is just with that line and not somewhere previous (ie. the error looks like an XML parse error which might mean the actual problem is someplace else).
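The head and tail approach can be sketched as follows, again with a made-up four-line file and line number 3 standing in for the real file and line 25462599. The replacement text is written between the two halves.

```shell
src=$(mktemp)
out=$(mktemp)

# Hypothetical small file; line 3 is the one to replace.
printf 'one\ntwo\n</Topic>\nfour\n' > "$src"
n=3

{
    head -n "$((n - 1))" "$src"     # lines 1 .. n-1
    printf '%s\n' '</Line>'         # the replacement for line n
    tail -n "+$((n + 1))" "$src"    # lines n+1 .. end
} > "$out"

result=$(cat "$out")
echo "$result"
rm -f "$src" "$out"
```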
My shell script:
#!/bin/bash
# Quote the arguments so that text and file names containing spaces survive.
awk -v line="$1" -v new_content="$2" '{
if (NR == line) {
print new_content;
} else {
print $0;
}
}' "$3"
Arguments:
first: line number you want change
second: text you want instead original line contents
third: file name
This script prints output to stdout then you need to redirect. Example:
./script.sh 5 "New fifth line text!" file.txt
You can improve it, for example, by checking that all your arguments have the expected values.

Resources