Bash: opening a file whose name is listed inside another file - linux

I have a file that contains a list of file names to be opened later.
After I load the lines (file names) into variables, for a reason unknown to me I cannot open them as files later.
Here is a simplified example of what I'm trying to do:
Main file's contents:
first_file.txt
second_file.txt
Bash commands:
read line < $main_file # file with the list, received as an argument
echo $line # to check that correct filename has been read
cat $line # attempt to dump "first_file.txt" contents <- FAILS
cat first_file.txt # read "first_file.txt" contents manually
Execution result:
first_file.txt
: No such file or directory
*** this is 1st file's contents ***
*** ....
So, cat first_file.txt works, $line contains "first_file.txt", but cat $line fails...
I obviously misunderstand something here, suggestions are welcomed!
Edit:
As requested, here is cat -v $main_file's output:
first_file.txt^M
second_file.txt^M
third_file.txt^M
^M
^M

The ^M characters are carriage returns (a.k.a. \r) and are often part of a Windows line ending. They don't show up when you echo them, but they are part of the value in $line, so cat looks for a file whose name ends in a carriage return, and no such file exists.
The best solution is to remove them from your "main file." You could use the dos2unix tool if you have it, or you could use GNU sed like sed -i -e 's/\s\+$//' "$main_file" to edit it in place and remove the trailing white space (which includes ^M) from the end of each line.
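If editing the list file is not an option, the carriage return can also be stripped from the variable after reading. The following is a minimal sketch, assuming a POSIX-style shell; the temporary directory and the demo file contents are made up for illustration.

```shell
# Hypothetical demo setup: a list file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf '*** this is 1st file contents ***\n' > "$workdir/first_file.txt"
printf 'first_file.txt\r\nsecond_file.txt\r\n' > "$workdir/main_file.txt"

cr=$(printf '\r')                      # a literal carriage return

# Read the first name, then strip one trailing \r if present.
IFS= read -r line < "$workdir/main_file.txt"
line=${line%"$cr"}

# With the \r gone, the name matches the real file again.
contents=$(cat -- "$workdir/$line")
printf '%s\n' "$contents"

rm -rf "$workdir"
```

The ${line%"$cr"} expansion removes the carriage return only when it is the last character, so names read from a file with Unix line endings pass through unchanged.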


Hidden line in file?

I have a UTF-8/no BOM file (converted from ISO-8859-1) that has 31214 lines. I have already run dos2unix on the file. When I open it in notepad++, I see a blank line underneath. When I remove this blank line, the line count reduces by one. I save it under a different name and when I tail the file, the prompt displays on the same line. From bash, how do I delete the blank line in the 1st file to produce the result displayed below in the 2nd file?
The goal is to do this from bash w/o manually deleting the line in notepad++
1st file:
[user#server]$ cat file1.txt | wc -l
31214
[user#server]$ tail file1.txt
T 31212 Data 20170517
[user#server]$
2nd file (edited with notepad++)
[user#server]$ cat file2.txt | wc -l
31213
[user#server]$ tail file2.txt
T 31212 Data 20170517[user#server]$
That's the trailing newline of the last line. Some editors allow you to go to the nonexisting "empty" line at the end, some don't show it. Again, some programs may allow you to remove the final newline, but note that e.g. POSIX in effect requires it to be there, and some standard utilities act oddly if it isn't present.
E.g. wc -l counts the number of newlines in the input file (printf "foo\nbar" | wc -l shows 1) so removing the final newline does decrease the line count.
Also, Bash prints the prompt wherever it was that the cursor was left on the screen, so if you print something that doesn't have the trailing newline, the prompt will be placed where the final incomplete line ended, as you saw.
There's no need to remove that final newline, just leave it there.
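The effect of the final newline on the line count can be checked directly. A small sketch; the tr call is only there to strip the padding that some wc implementations add around the number.

```shell
# wc -l counts newline characters, not "lines of text".
with_nl=$(printf 'foo\nbar\n' | wc -l | tr -d ' ')
without_nl=$(printf 'foo\nbar' | wc -l | tr -d ' ')

echo "with trailing newline:    $with_nl"
echo "without trailing newline: $without_nl"
```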
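To remove the last character of the last line, it is possible to use
sed -i '$ s/.$//' your.file
which will substitute nothing for the last character in the last line of the file (if you want to delete something else from the end of the file you can replace the regex .$ with something-else$). -i means 'edit in place' (on FreeBSD/macOS you need to add an empty string as an argument: sed -i "" '$ s/.$//' your.file). Note that sed matches against each line without its terminating newline, so this removes the last visible character, not the newline itself; with GNU coreutils, truncate -s -1 your.file removes the final byte of the file, which is the newline if the file ends with one.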
The file2.txt is missing a trailing newline.
Yes, a text file should end on a newline character.
Given that you do know that a trailing newline is missing, this command should be enough to correct the problem:
$ echo >> file2.txt
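When it is not known in advance whether the newline is missing, the append can be made conditional. This is a sketch under that assumption; ensure_trailing_newline is a made-up helper name and the demo file is a temporary stand-in for file2.txt.

```shell
ensure_trailing_newline() {
    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the last byte of the file is something else.
    if [ -n "$(tail -c 1 "$1")" ]; then
        echo >> "$1"
    fi
}

demo=$(mktemp)                              # hypothetical stand-in for file2.txt
printf 'T 31212 Data 20170517' > "$demo"    # no trailing newline

ensure_trailing_newline "$demo"
ensure_trailing_newline "$demo"             # second call changes nothing

lines=$(wc -l < "$demo" | tr -d ' ')
echo "$lines"
rm -f "$demo"
```

Running the helper twice shows that it is safe to apply to files that already end with a newline.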

Reading from STDIN, performing commands, then Outputting to STDOUT in Bash

I need to:
Accept STDIN in my script from a pipe
save it to a temp file so that I don't modify the original source
perform operations on the temp file to generate some output
output to STDOUT
Here is my script:
#!/bin/bash
temp=$(cat)
sed 's/the/THE/g' <temp
echo "$temp"
Right now, I am just trying to get it to be able to replace all occurrences of "the" with "THE".
Here is the sample text:
the quick brown fox jumped over the lazy
brown dog the quick
brown fox jumped
over
Here is my command line:
cat test.txt | ./hwscript >hwscriptout
"test.txt" contains the sample text, "hwscript" is the script, "hwscriptout" is the output
However, when I look at the output file, nothing has changed (all occurrences of "the" remain uncapitalized). When I do the sed command on the command line instead of the script, it works though. I also tried to use $(sed) instead of sed but when I did that, the command returned an error:
"./hwscript: line 5: s/the/THE/g: no such file or directory"
I have tried to search for a solution but could not find one.
Help is appreciated, thank you.
save it to a temp file so that I don't modify the original source
Anything received via stdin is just a stream of data, disconnected from wherever it originated from: whatever you do with that stream has no effect whatsoever on its origin.
Thus, there is no need to involve a temporary file - simply modify stdin input as needed.
#!/bin/bash
sed 's/the/THE/g' # without a filename operand or pipe input, this will read from stdin
# Without an output redirection, the output will go to stdout.
As you can tell, in this simple case you may as well use the sed command directly, without creating a script.
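To illustrate that a stdin filter leaves its source alone, here is a small sketch. The hwscript function stands in for the script body, and the temporary file is a hypothetical stand-in for test.txt.

```shell
# hwscript stands in for the script: a pure stdin-to-stdout filter.
hwscript() {
    sed 's/the/THE/g'
}

src=$(mktemp)                          # hypothetical stand-in for test.txt
printf 'the quick brown fox jumped over the lazy\nbrown dog the quick\n' > "$src"

result=$(hwscript < "$src")            # same data flow as: cat "$src" | hwscript
original=$(cat "$src")                 # the source file itself is untouched

printf '%s\n' "$result"
rm -f "$src"
```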
Use this:
temp=$(sed 's/the/THE/g' <<<"$temp")
or
temp=$(printf '%s\n' "$temp" | sed 's/the/THE/g')
You were telling sed to process a file named temp, not the contents of the variable $temp. You also weren't saving the result anywhere, so echo "$temp" simply prints the old value.
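Here is a way to do it as you described it:
#!/bin/sh
# Read the input line by line and append it to a tmp file
# (IFS= and -r keep leading whitespace and backslashes intact)
while IFS= read -r LINE; do
    printf '%s\n' "$LINE" >> yourtmpfile
done
# Edit the file in place (BSD/macOS syntax; with GNU sed, drop the '')
sed -i '' 's/the/THE/g' yourtmpfile
# Output the result
cat yourtmpfile
rm yourtmpfile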
And here is a simpler way without a tmp file:
#!/bin/sh
# Read the input and output each line after passing it through sed
while IFS= read -r LINE; do
    printf '%s\n' "$LINE" | sed 's/the/THE/g'
done
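If a temporary file really is wanted, mktemp avoids a fixed file name, and letting sed write to stdout avoids the -i flag, whose syntax differs between GNU and BSD sed. A sketch, with filter_via_tempfile as a made-up name:

```shell
filter_via_tempfile() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                  # save everything from stdin
    sed 's/the/THE/g' "$tmp"      # write the edited text to stdout
    rm -f "$tmp"                  # clean up the temporary copy
}

out=$(printf 'brown fox jumped\nover the fence\n' | filter_via_tempfile)
printf '%s\n' "$out"
```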

Creating a file by merging two files

I would like to merge two files and create a new file using Linux command.
I have the two files named as a1b.txt and a1c.txt
Content of a1b.txt
Hi,Hi,Hi
How,are,you
Content of a1c.txt
Hadoop|are|world
Data|Big|God
And I need a new file called merged.txt with the below content(expected output)
Hi,Hi,Hi
How,are,you
Hadoop|are|world
Data|Big|God
To achieve that, I am running the command below in a terminal:
cat /home/cloudera/inputfiles/a1* > merged.txt
but it gives me the following output:
Hi,Hi,Hi
How,are,youHadoop|are|world
Data|Big|God
Could somebody help me get the expected output?
Probably your files do not end with a newline character. Here is how to append one to each file that lacks it (GNU sed):
$ sed -i -e '$a\' /home/cloudera/inputfiles/a1*
$ cat /home/cloudera/inputfiles/a1* > merged.txt
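If the input files must stay unmodified, the missing newline can be supplied while concatenating instead. The following is a sketch only; merge_files is a hypothetical helper and the demo directory is a temporary stand-in for /home/cloudera/inputfiles.

```shell
merge_files() {
    for f in "$@"; do
        cat -- "$f"
        # A non-empty result means the last byte is not a newline.
        if [ -n "$(tail -c 1 "$f")" ]; then
            echo
        fi
    done
}

dir=$(mktemp -d)                                    # hypothetical demo directory
printf 'Hi,Hi,Hi\nHow,are,you' > "$dir/a1b.txt"     # no trailing newline
printf 'Hadoop|are|world\nData|Big|God\n' > "$dir/a1c.txt"

merge_files "$dir"/a1* > "$dir/merged.txt"
merged=$(cat "$dir/merged.txt")

printf '%s\n' "$merged"
rm -rf "$dir"
```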
If you are allowed to be destructive (that is, you do not have to keep the original two files unmodified), then:
robert#debian:/tmp$ cat fileB.txt >> fileA.txt
robert#debian:/tmp$ cat fileA.txt
this is file A
This is file B.

Getting extra line when output to file?

I'm using a diff command and it's printing out to a file. The file keeps getting an extra line in the end that I don't need to appear. How can I prevent it from being there?
The command is as follows:
diff -b <(grep -B 2 -A 1 'bedrock.local' /Applications/MAMP/conf/apache/httpd.conf) /Applications/MAMP/conf/apache/httpd.conf > test.txt
The file being used is here (though I don't think it matters): http://yaharga.com/httpd.txt
Perhaps at least I'd like to know how to check the last line of the file and delete it only if it's blank.
To delete an empty last line you can use sed; this deletes the line only if it is blank (\s is a GNU extension):
sed '${/^\s*$/d;}' file
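A quick way to check this behaviour is to feed sed input with and without a blank final line. This sketch swaps in the POSIX class [[:space:]] for \s so that it does not depend on GNU sed; strip_blank_last_line is a made-up name.

```shell
strip_blank_last_line() {
    # On the last line only ($), delete the line if it is empty or whitespace.
    sed '${/^[[:space:]]*$/d;}'
}

# Input ending in a blank line: the blank line is removed.
blank_end=$(printf 'a\nb\n\n' | strip_blank_last_line | wc -l | tr -d ' ')
# Input ending in real text: nothing is removed.
text_end=$(printf 'a\nb\n' | strip_blank_last_line | wc -l | tr -d ' ')

echo "$blank_end $text_end"
```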
OK, I did some testing with your file on macOS.
I created a file new.conf with touch new.conf and then copied the data from your file into it.
By the way, I checked the file and it didn't have an extra empty line at the bottom.
I wrote a script, script.sh, containing the following:
diff -b <(grep -B 2 -A 1 'bedrock.local' new.conf) new.conf > test.txt
sed -i.bak '1d;s/^>//' test.txt
It diffed what was needed, then deleted the useless first row and all the > markers, saving the result to test.txt.
I checked again and no extra empty line was present.
Additionally, I would suggest trying to delete the extra line you have like this: sed -i.bak '$d' test.txt
and checking the number of lines before and after with sed = test.txt
Probably your text editor somehow added this extra line to your file. Try a different one, nano or vi for example.

How can I remove the last character of a file in unix?

Say I have some arbitrary multi-line text file:
sometext
moretext
lastline
How can I remove only the last character (the e, not the newline or null) of the file without making the text file invalid?
A simpler approach (outputs to stdout, doesn't update the input file):
sed '$ s/.$//' somefile
$ is a Sed address that matches the last input line only, thus causing the following function call (s/.$//) to be executed on the last line only.
s/.$// replaces the last character on the (in this case last) line with an empty string; i.e., effectively removes the last char. (before the newline) on the line.
. matches any character on the line, and following it with $ anchors the match to the end of the line; note how the use of $ in this regular expression is conceptually related, but technically distinct from the previous use of $ as a Sed address.
Example with stdin input (assumes Bash, Ksh, or Zsh):
$ sed '$ s/.$//' <<< $'line one\nline two'
line one
line tw
To update the input file too (do not use if the input file is a symlink):
sed -i '$ s/.$//' somefile
Note:
On macOS, you'd have to use -i '' instead of just -i; for an overview of the pitfalls associated with -i, see the bottom half of this answer.
If you need to process very large input files and/or performance / disk usage are a concern and you're using GNU utilities (Linux), see ImHere's helpful answer.
truncate
truncate -s-1 file
Removes one (-1) byte from the end of the file in place, just as >> appends to a file in place.
The problem with this approach is that it doesn't retain a trailing newline if it existed.
The solution is:
if [ -n "$(tail -c1 file)" ] # if the file does not end with a newline
then
    truncate -s-1 file # remove one char, as the question requests
else
    truncate -s-2 file # remove the last two characters
    echo "" >> file # add the trailing newline back
fi
This works because tail takes the last byte (not char).
It takes almost no time even with big files.
Why not sed
The problem with a sed solution like sed '$ s/.$//' file is that it reads the whole file first (taking a long time with large files), and then you need a temporary file (of the same size as the original), which has to be moved over the original:
sed '$ s/.$//' file > tempfile
rm file; mv tempfile file
Here's another using ex, which I find not as cryptic as the sed solution:
printf '%s\n' '$' 's/.$//' wq | ex somefile
The $ goes to the last line, the s deletes the last character, and wq is the well known (to vi users) write+quit.
After a whole bunch of playing around with different strategies (and avoiding sed -i or perl), the best way I found to do this was with:
sed '$! { P; D; }; s/.$//' somefile
If the goal is to remove the last character in the last line, this awk should do:
awk '{a[NR]=$0} END {for (i=1;i<NR;i++) print a[i];sub(/.$/,"",a[NR]);print a[NR]}' file
sometext
moretext
lastlin
It stores all the data in an array, then prints it out, changing the last line.
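For large inputs, the same idea works while holding only one line in memory: print each line one step late, and edit the held line once the end of input is reached. This is a sketch of that variant, not the answer's own code; drop_last_char is a made-up name.

```shell
drop_last_char() {
    # prev always holds the most recent line; it is printed one line late,
    # so at END it contains the last line and can be edited before printing.
    awk 'NR > 1 { print prev }
         { prev = $0 }
         END { if (NR) { sub(/.$/, "", prev); print prev } }'
}

trimmed=$(printf 'sometext\nmoretext\nlastline\n' | drop_last_char)
printf '%s\n' "$trimmed"
```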
Just a remark: sed will temporarily remove the file.
So if you are tailing the file, you'll get a "No such file or directory" warning until you reissue the tail command.
EDITED ANSWER
I put your text in a test file on my Desktop, saved as "old_file.txt":
sometext
moretext
lastline
Afterwards I wrote a small script to take the old file and eliminate the last character in the last line:
#!/bin/bash
src='/root/Desktop/old_file.txt'
dst='/root/Desktop/my_new_file'
# Copy every line except the last one unchanged
sed '$d' "$src" > "$dst"
# Append the last line with its final character removed
sed -n '$p' "$src" | sed 's/.$//' >> "$dst"
Opening my_new_file showed the following output:
sometext
moretext
lastlin
I apologize for my previous answer (wasn't reading carefully)
sed '$ s/.$//' filename | tee newFilename
This should do the job.
A couple perl solutions, for comparison/reference:
(echo 1a; echo 2b) | perl -e '$_=join("",<>); s/.$//; print'
(echo 1a; echo 2b) | perl -e 'while(<>){ if(eof) {s/.$//}; print }'
I find the first, read-whole-file-into-memory approach can be generally quite useful (less so for this particular problem). It lets you use regexes which span multiple lines, for example to combine every 3 lines of a certain format into 1 summary line.
For this problem, truncate would be faster and the sed version is shorter to type. Note that truncate requires a file to operate on, not a stream. Normally I find sed to lack the power of perl and I much prefer the extended-regex / perl-regex syntax. But this problem has a nice sed solution.
