How do I remove newlines from a text file? - linux

I have the following data, and I need to put it all into one line.
I have this:
22791
;
14336
;
22821
;
34653
;
21491
;
25522
;
33238
;
I need this:
22791;14336;22821;34653;21491;25522;33238;
EDIT
None of these commands is working perfectly.
Most of them let the data look like this:
22791
;14336
;22821
;34653
;21491
;25522

tr --delete '\n' < yourfile.txt
tr -d '\n' < yourfile.txt
Edit:
If none of the commands posted here are working, then you have something other than a newline separating your fields. Possibly you have DOS/Windows line endings in the file (although I would expect the Perl solutions to work even in that case)?
Try:
tr -d "\n\r" < yourfile.txt
If that doesn't work then you're going to have to inspect your file more closely (e.g. in a hex editor) to find out what characters are actually in there that you want to remove.
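As a minimal sketch of that inspection step, assuming a POSIX shell with od and tr available: the snippet below builds a small sample with DOS line endings, shows the hidden characters, and strips both \r and \n. The temporary file and the variable names sample and joined are made up for illustration.

```shell
# Create a throwaway sample file with CRLF line endings (hypothetical data).
sample=$(mktemp)
printf '22791\r\n;\r\n14336\r\n;\r\n' > "$sample"

# od -c prints every byte; a "\r" in front of each "\n" means DOS line endings.
od -c "$sample" | head -n 3

# Delete carriage returns and line feeds, then print with one final newline.
joined=$(tr -d '\r\n' < "$sample")
printf '%s\n' "$joined"

rm -f "$sample"
```

If od shows some other byte between the fields, add that byte to the tr set instead.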

tr -d '\n' < file.txt
Or
awk '{ printf "%s", $0 }' file.txt
Or
sed ':a;N;$!ba;s/\n//g' file.txt
This page here has a bunch of other methods to remove newlines.
edited to remove feline abuse :)

perl -p -i -e 's/\R//g;' filename
This should do the job: \R matches any line ending (LF, CRLF, or CR), and -i edits the file in place.

paste -sd "" file.txt

Expanding on a previous answer, this removes all newlines and saves the result to a new file (thanks to @tripleee):
tr -d '\n' < yourfile.txt > yourfile2.txt
Which is better than a "useless cat" (see comments):
cat file.txt | tr -d '\n' > file2.txt
Also useful for getting rid of new lines at the end of the file, e.g. created by using echo blah > file.txt.
Note that the destination filename must be different from the source. This is important: otherwise the shell truncates the file before tr reads it, and you'll wipe out the original content!
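If you do want the result to end up under the original name, one common pattern is to write to a temporary file and rename it afterwards. A rough sketch, assuming mktemp is available; the function name strip_newlines_inplace is hypothetical:

```shell
# Hypothetical helper: strip newlines from a file via a temporary file,
# so the original is only replaced after tr has finished successfully.
strip_newlines_inplace() {
    file=$1
    tmp=$(mktemp) || return 1
    if tr -d '\n' < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Usage would be strip_newlines_inplace yourfile.txt. Be aware that the replaced file ends up with the temporary file's permissions.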

You can edit the file in vim:
$ vim inputfile
:%s/\n//g

use
head -n 1 filename | od -c
to figure WHAT is the offending character.
then use
tr -d '\n' <filename
for LF
tr -d '\r\n' <filename
for CRLF

Use sed with POSIX classes
This will remove all lines containing only whitespace (spaces & tabs)
sed '/^[[:space:]]*$/d'
Just take whatever you are working with and pipe it to that
Example
cat filename | sed '/^[[:space:]]*$/d'

Using man 1 ed:
# cf. http://wiki.bash-hackers.org/doku.php?id=howto:edit-ed
ed -s file <<< $'1,$j\n,p' # print to stdout
ed -s file <<< $'1,$j\nwq' # in-place edit

xargs consumes newlines as well (but adds a final trailing newline):
xargs < file.txt | tr -d ' '

Nerd fact: use ASCII instead.
tr -d '\012' < filename.extension
(Edited cause i didn't see the friggin' answer that had same solution, only difference was that mine had ASCII)

Using the gedit text editor (3.18.3)
Click Search
Click Find and Replace...
Enter \n\s into Find field
Leave Replace with blank (nothing)
Check Regular expression box
Click the Find button
Note: this doesn't exactly address the OP's original, 7 year old problem but should help some noob linux users (like me) who find their way here from the SE's with similar "how do I get my text all on one line" questions.

I had the same case today. It's super easy in vim or nvim: you can use gJ to join lines. For your use case, just do
99gJ
This will join 99 lines, starting at the cursor line. Adjust the number 99 as needed, according to how many lines you want to join. To join just the current line with the next one, gJ alone is enough.

$ perl -0777 -pe 's/\n+//g' input >output
$ perl -0777 -pe 'tr/\n//d' input >output

If the data is in file.txt, then:
echo $(<file.txt) | tr -d ' '
The '$(<file.txt)' reads the file and gives the contents as a series of words which 'echo' then echoes with a space between them. The 'tr' command then deletes any spaces:
22791;14336;22821;34653;21491;25522;33238;

Assuming you only want to keep the digits and the semicolons, the following should do the trick assuming there are no major encoding issues, though it will also remove the very last "newline":
$ tr -cd ";0-9"
You can easily modify the above to include other characters, e.g. if you want to retain decimal points, commas, etc.

I usually hit this use case when I'm copying a code snippet from a file and want to paste it into a console without adding unnecessary newlines. I ended up making a bash alias
(I called it oneline, if you are curious):
xsel -b -o | tr -d '\n' | tr -s ' ' | xsel -b -i
xsel -b -o reads my clipboard
tr -d '\n' removes new lines
tr -s ' ' removes recurring spaces
xsel -b -i pushes this back to my clipboard
After that, I paste the new contents of the clipboard as one line into a console or wherever.

I would do it with awk, e.g.
awk '/[0-9]+/ { a = a $0 ";" } END { print a }' file.txt
(a disadvantage is that a is "accumulated" in memory).
EDIT
Forgot about printf! So also
awk '/[0-9]+/ { printf "%s;", $0 }' file.txt
or, likely better, the awk command that was already given in the other answer.

You are missing the most obvious and fast answer especially when you need to do this in GUI in order to fix some weird word-wrap.
Open gedit
Then press Ctrl + H, put \n in the Find textbox and an empty space in Replace with, tick the Regular expression checkbox, and voilà.

To also remove the trailing newline at the end of the file
python -c "s=open('filename','r').read();open('filename', 'w').write(s.replace('\n',''))"

fastest way I found:
open vim by doing this in your commandline
vim inputfile
press ":" and input the following command to remove all newlines
:%s/\n//g
To also remove spaces, in case some of the characters were spaces, enter
:%s/ //g
make sure to save by writing to the file with
:w
The same format can be used to remove any other characters, you can use a website like this
https://apps.timwhitlock.info/unicode/inspect
to figure out what character you're missing
You can also use this to identify other characters you can't see; the site has a tool for learning about other invisible characters as well.

Related

sed script replacing string "\n" in string with newline character

I recently wrote a script that parses a whole bunch of files and increments the version number throughout. The script works fine for all files except one. It uses the following sed command (which was pieced together from various Google searches and very limited sed knowledge) to find a line in a .tex file and increment the version number.
sed -i -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' fileName.tex
The issue with the above (which I am unsure how to fix) is that the line it finds to change appears as
\newcommand{\VERSION}{0.123},
and the sed command replaces the "\n" in the line above with the newline character, and thus outputting
ewcommand{\VERSION}{0.124} (with a newline before it).
The desired output would be:
\newcommand{\VERSION}{0.124}
How can I fix this?
Alright, so I was not able to get the answer from Cyrus to work, because it matched about 50 other lines in my tex files that it wanted to modify, and I wasn't quite sure how to fix the awk statement to find just the specific line I wanted. However, I got it working with the original sed method by making a simple change.
My sed command became two: the first creates a temporary string %TMPSTR%, and the second immediately replaces that temp string to get the desired output and avoid any newline characters appearing.
sed -i -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\\%TMPSTR%{\\\2$((\3+1))\4"/ge' fileName.tex
sed -i -r 's/%TMPSTR%/newcommand/g' fileName.tex
So the line in the file goes from
\newcommand{\VERSION}{0.123} --> \%TMPSTR%{\VERSION}{0.124} --> \newcommand{\VERSION}{0.124}
and ends at the desired outcome. A bit ugly I suppose but it does what I need!
Use awk, that won't get confused by data with special characters.
Your problem could be solved by temporarily replacing the backslashes, but I hope this answer will lead you to awk.
For one line:
echo '\newcommand{\VERSION}{0.123},' | tr '\\' '\r' |
sed -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' | tr '\r' '\\'
For a file
tr '\\' '\r' < fileName.tex |
sed -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' |
tr '\r' '\\' > fileName.tex.tmp && mv fileName.tex.tmp fileName.tex
When \n is the only problem, you can try
sed -i -r 's/\\n/\r/g;s/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge;s/\r/\\n/' fileName.tex
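The answer above recommends awk without showing it, so here is a hedged sketch of what an awk version could look like. It is not the original poster's script: it assumes the version always has the form 0.<digits>, and the variable names are illustrative. Because awk never hands the line to a shell or to echo, the backslashes in the data are left alone.

```shell
line='\newcommand{\VERSION}{0.123}'

bumped=$(printf '%s\n' "$line" | awk '
{
    # Bracket expressions keep the braces and the dot literal in any awk.
    if (match($0, /VERSION[}][{]0[.][0-9]+/)) {
        head = substr($0, 1, RSTART - 1)
        hit  = substr($0, RSTART, RLENGTH)
        rest = substr($0, RSTART + RLENGTH)
        n = hit
        sub(/^.*[.]/, "", n)          # keep only the digits after the dot
        sub(/[0-9]+$/, n + 1, hit)    # write the incremented number back
        $0 = head hit rest
    }
    print
}')

printf '%s\n' "$bumped"
```

For a real file you would run the same awk program on fileName.tex, write to a temporary file, and move it over the original. Only lines containing VERSION}{0.<digits> are changed; everything else is printed as is.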

Replace multiple commas with a single one - linux command

This is an output from my google csv contacts (which contains more than 1000 contacts):
A-Tech Computers Hardware,A-Tech Computers,,Hardware,,,,,,,,,,,,,,,,,,,,Low,,,* My Contacts,,,,,,,,,Home,+38733236313,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
I need a linux cli command to replace the duplicate commas, with single commas, so i get this:
A-Tech Computers Hardware,A-Tech Computers,Hardware,Low,* My Contacts,Home,+38733236313,
What I usually do in notepad++ is Replace ",," with "," six times.
I tried with:
cat googlecontacts.txt | sed -e 's/,,/,/g' -e 's/,,/,/g' -e 's/,,/,/g' -e 's/,,/,/g' -e 's/,,/,/g' -e 's/,,/,/g' > google.txt
But it doesn't work...
However, when I try it on smaller files (two lines) it works... :(
Help please!
Assuming your lines are still valid after the modification (which is not the concern of the question):
sed 's/,\{2,\}/,/g' googlecontacts.txt > google.txt
It replaces any run of two or more commas with a single comma, anywhere on the line.
Any space between commas is considered a valid field, so it is not modified.
In your command, you need to repeat the substitution until nothing changes, rather than re-execute the same one a fixed number of times (a longer run is always possible), like this:
cat googlecontacts.txt | sed ':a
# make your change
s/,,/,/g
# if a change occurred, retry by branching back to label :a
t a' > google.txt
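To see why a single pass of s/,,/,/g is not enough, here is a small sketch on a made-up line with five commas in a row. The variable names are illustrative only.

```shell
line='a,,,,,b'

# One pass: each non-overlapping pair ",," becomes ",", so a run of 5 shrinks to 3.
one_pass=$(printf '%s\n' "$line" | sed 's/,,/,/g')

# Loop: repeat the substitution until it no longer changes anything.
looped=$(printf '%s\n' "$line" | sed -e ':a' -e 's/,,/,/g' -e 'ta')

# Interval: replace any run of two or more commas in a single substitution.
squeezed=$(printf '%s\n' "$line" | sed 's/,\{2,\}/,/g')

printf '%s\n' "$one_pass" "$looped" "$squeezed"
```

Each pass roughly halves a run of commas, which is why a fixed number of passes only works up to a certain run length.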
You need the squeeze option of tr:
tr -s ',' < yourFile
You can see it in action like this:
echo hello,,there,,,,I,have,,too,many,,,commas | tr -s ,
hello,there,I,have,too,many,commas
This might work for you (GNU sed):
sed 's/,,*/,/g' file
or
sed 's/,\+/,/g' file
Thanks @potong, your solution worked for one of my requirements. I had to remove the | symbol from the first line of my file and used this solution with a small change.
sed -i "1s/|'*//g" "${filename}"
I was unable to add comments so thought of posting it as an answer. Please excuse

How can I remove the last character of a file in unix?

Say I have some arbitrary multi-line text file:
sometext
moretext
lastline
How can I remove only the last character (the e, not the newline or null) of the file without making the text file invalid?
A simpler approach (outputs to stdout, doesn't update the input file):
sed '$ s/.$//' somefile
$ is a Sed address that matches the last input line only, thus causing the following function call (s/.$//) to be executed on the last line only.
s/.$// replaces the last character on the (in this case last) line with an empty string; i.e., effectively removes the last char. (before the newline) on the line.
. matches any character on the line, and following it with $ anchors the match to the end of the line; note how the use of $ in this regular expression is conceptually related, but technically distinct from the previous use of $ as a Sed address.
Example with stdin input (assumes Bash, Ksh, or Zsh):
$ sed '$ s/.$//' <<< $'line one\nline two'
line one
line tw
To update the input file too (do not use if the input file is a symlink):
sed -i '$ s/.$//' somefile
Note:
On macOS, you'd have to use -i '' instead of just -i; for an overview of the pitfalls associated with -i, see the bottom half of this answer.
If you need to process very large input files and/or performance / disk usage are a concern and you're using GNU utilities (Linux), see ImHere's helpful answer.
truncate
truncate -s-1 file
Removes one (-1) character from the end of the same file. Exactly as a >> will append to the same file.
The problem with this approach is that it doesn't retain a trailing newline if it existed.
The solution is:
if [ -n "$(tail -c1 file)" ]    # true if the file does not end with a newline
then
    truncate -s-1 file          # remove one char, as the question requests
else
    truncate -s-2 file          # remove the last two characters
    echo "" >> file             # add the trailing newline back
fi
This works because tail takes the last byte (not char).
It takes almost no time even with big files.
Why not sed
The problem with a sed solution like sed '$ s/.$//' file is that it has to read the whole file first (taking a long time with large files), and it needs a temporary file (of the same size as the original), which is then moved over the original:
sed '$ s/.$//' file > tempfile
rm file; mv tempfile file
Here's another using ex, which I find not as cryptic as the sed solution:
printf '%s\n' '$' 's/.$//' wq | ex somefile
The $ goes to the last line, the s deletes the last character, and wq is the well known (to vi users) write+quit.
After a whole bunch of playing around with different strategies (and avoiding sed -i or perl), the best way I found to do this was with:
sed '$! { P; D; }; s/.$//' somefile
If the goal is to remove the last character in the last line, this awk should do:
awk '{a[NR]=$0} END {for (i=1;i<NR;i++) print a[i];sub(/.$/,"",a[NR]);print a[NR]}' file
sometext
moretext
lastlin
It stores all the data in an array, then prints it out, removing the last character of the last line.
Just a remark: sed -i will temporarily remove the file (it writes a temporary copy and renames it over the original).
So if you are tailing the file, you'll get a "No such file or directory" warning until you reissue the tail command.
EDITED ANSWER
I created a test file on my Desktop and put your text inside. This test file is saved as "old_file.txt":
sometext
moretext
lastline
Afterwards I wrote a small script to take the old file and eliminate the last character in the last line
#!/bin/bash
# count the newline characters (this script assumes the last line has no trailing newline)
no_of_new_line_characters=$(wc -l < '/root/Desktop/old_file.txt')
let "no_of_lines=no_of_new_line_characters+1"
sed -n 1,"$no_of_new_line_characters"p '/root/Desktop/old_file.txt' > '/root/Desktop/my_new_file'
sed -n "$no_of_lines","$no_of_lines"p '/root/Desktop/old_file.txt'|sed 's/.$//g' >> '/root/Desktop/my_new_file'
opening the new_file I created, showed the output as follows:
sometext
moretext
lastlin
I apologize for my previous answer (wasn't reading carefully)
sed '$ s/.$//' filename | tee newFilename
This should do your job.
A couple perl solutions, for comparison/reference:
(echo 1a; echo 2b) | perl -e '$_=join("",<>); s/.$//; print'
(echo 1a; echo 2b) | perl -e 'while(<>){ if(eof) {s/.$//}; print }'
I find the first read-whole-file-into-memory approach can be generally quite useful (less so for this particular problem). You can now do regex's which span multiple lines, for example to combine every 3 lines of a certain format into 1 summary line.
For this problem, truncate would be faster and the sed version is shorter to type. Note that truncate requires a file to operate on, not a stream. Normally I find sed to lack the power of perl and I much prefer the extended-regex / perl-regex syntax. But this problem has a nice sed solution.

Pad all lines with spaces to a fixed width in Vim or using sed, awk, etc

How can I pad each line of a file to a certain width (say, 63 characters wide), padding with spaces if need be?
For now, let’s assume all lines are guaranteed to be less than 63 characters.
I use Vim and would prefer a way to do it there, where I can select the lines I wish to apply the padding to, and run some sort of a printf %63s current_line command.
However, I’m certainly open to using sed, awk, or some sort of linux tool to do the job too.
Vim
:%s/.*/\=printf('%-63s', submatch(0))
$ awk '{printf "%-63s\n", $0}' testfile > newfile
In Vim, I would use the following command:
:%s/$/\=repeat(' ',64-virtcol('$'))
(The use of the virtcol() function, as opposed to the col() one,
is guided by the necessity to properly handle tab characters as well
as multibyte non-ASCII characters that might occur in the text.)
Just for fun, a Perl version:
$ perl -lpe '$_ .= " " x (63 - length $_)'
This might work for you:
$ sed -i ':a;/.\{63\}/!{s/$/ /;ba}' file
or perhaps more efficient but less elegant:
$ sed -i '1{x;:a;/.\{63\}/!{s/^/ /;ba};x};/\(.\{63\}\).*/b;G;s//\1/;y/\n/ /' file
It looks like you are comfortable using vim, but here is a pure Bash/simple-sed solution in case you need to do it from the command line (note the 63 spaces in the sed substitution):
$ sed 's/$/                                                               /' yourFile.txt |cut -c 1-63
With sed, without a loop:
$ sed -i '/.\{63\}/!{s/$/                                                               /;s/^\(.\{63\}\).*/\1/}' file
Be sure to have enough spaces in the 1st substitution to match the number of spaces you want to add.
Another Perl solution:
$ perl -lne 'printf "%-63s\n", $_' file
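Whichever variant you pick, it is easy to check that every line really ended up 63 characters wide. A small sketch using the awk approach from above on made-up input; the variable names padded and widths are illustrative:

```shell
# Pad two sample lines to 63 columns with the awk one-liner.
padded=$(printf 'short\nlonger line\n' | awk '{ printf "%-63s\n", $0 }')

# Collect the distinct line lengths; a single value means all lines match.
widths=$(printf '%s\n' "$padded" | awk '{ print length($0) }' | sort -u)

printf 'distinct line widths: %s\n' "$widths"
```

The same length check can be pointed at the output file of any of the other commands.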

Replace whitespace with a comma in a text file in Linux

I need to edit a few text files (an output from sar) and convert them into CSV files.
I need to change every whitespace (maybe it's a tab between the numbers in the output) using sed or awk functions (an easy shell script in Linux).
Can anyone help me? Every command I used didn't change the file at all; I tried gsub.
tr ' ' ',' <input >output
This substitutes each space with a comma. If you need to, you can first make a pass with the -s flag (squeeze repeats), which replaces each sequence of a repeated character listed in SET1 (here, the blank space) with a single occurrence of that character.
For example, squeezing repeats before substituting tabs:
tr -s '\t' <input | tr '\t' ',' >output
Try something like:
sed -E 's/[[:space:]]+/,/g' orig.txt > modified.txt
The character class [[:space:]] will match all whitespace (spaces, tabs, etc.). If you just want to replace a single character, e.g. just space, use that only.
EDIT: Actually [[:space:]] includes carriage return, so this may not do what you want. The following will replace tabs and spaces.
sed -E 's/[[:blank:]]+/,/g' orig.txt > modified.txt
as will (with GNU sed, which understands \t)
sed -E 's/[\t ]+/,/g' orig.txt > modified.txt
In all of this, you need to be careful that the items in your file that are separated by whitespace don't contain their own whitespace that you want to keep, eg. two words.
without looking at your input file, only a guess
awk '{$1=$1}1' OFS=","
redirect to another file and rename as needed
What about something like this :
cat texte.txt | sed -e 's/\s/,/g' > texte-new.txt
(Yes, with some useless catting and piping ; could also use < to read from the file directly, I suppose -- used cat first to output the content of the file, and only after, I added sed to my command-line)
EDIT: as @ghostdog74 pointed out in a comment, there's definitely no need for the cat/pipe; you can give the name of the file to sed:
sed -e 's/\s/,/g' texte.txt > texte-new.txt
If "texte.txt" is this way :
$ cat texte.txt
this is a text
in which I want to replace
spaces by commas
You'll get a "texte-new.txt" that'll look like this :
$ cat texte-new.txt
this,is,a,text
in,which,I,want,to,replace
spaces,by,commas
I wouldn't just replace the old file with the new one (that could be done with sed -i, if I remember correctly; and as @ghostdog74 said, sed -i can create the backup on the fly): keeping the original might be wise as a safety measure (even if it means having to rename it to something like "texte-backup.txt").
This command should work:
sed "s/\s/,/g" < infile.txt > outfile.txt
Note that you have to redirect the output to a new file. The input file is not changed in place.
sed can do this:
sed 's/[\t ]/,/g' input.file
That will send to the console,
sed -i 's/[\t ]/,/g' input.file
will edit the file in-place
Here's a Perl script which will edit the files in-place:
perl -i.bak -lpe 's/\s+/,/g' files*
Consecutive whitespace is converted to a single comma.
Each input file is moved to .bak
These command-line options are used:
-i.bak edit in-place and make .bak copies
-p loop around every line of the input file, automatically print the line
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code
If you want to replace an arbitrary sequence of blank characters (tab, space) with one comma, use the following:
sed -E 's/[\t ]+/,/g' input_file > output_file
or
sed -r 's/[[:blank:]]+/,/g' input_file > output_file
If some of your input lines include leading space characters which are redundant and don't need to be converted to commas, then first you need to get rid of them, and then convert the remaining blank characters to commas. For such case, use the following:
sed -E 's/^ +//' input_file | sed -E 's/[\t ]+/,/g' > output_file
This worked for me.
sed -e 's/\s\+/,/g' input.txt >> output.csv
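To make the difference between the approaches above concrete, here is a small sketch on one made-up line shaped loosely like sar output (leading spaces, a tab, and a run of spaces). The sample data and variable names are assumptions for illustration.

```shell
# One sample line: two leading spaces, a tab, then three spaces.
sample=$(printf '  12:00:01\tall   0.50\n')

# Per-character replacement: every space or tab becomes its own comma.
per_char=$(printf '%s\n' "$sample" | tr ' \t' ',,')

# Strip leading blanks first, then collapse each run of blanks to one comma.
collapsed=$(printf '%s\n' "$sample" | sed -E 's/^[[:blank:]]+//; s/[[:blank:]]+/,/g')

printf '%s\n' "$per_char" "$collapsed"
```

For CSV output the collapsed form is usually what you want, since the per-character form produces empty fields wherever the columns were aligned with several spaces.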

Resources