I recently wrote a script that parses a whole bunch of files and increments the version number throughout. The script works fine for all files except one. It uses the following sed command (which was pieced together from various Google searches and very limited sed knowledge) to find a line in a .tex file and increment the version number.
sed -i -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' fileName.tex
The issue with the above (which I am unsure how to fix) is that the line it finds to change appears as
\newcommand{\VERSION}{0.123},
and the sed command replaces the "\n" at the start of "\newcommand" with a newline character, thus outputting
ewcommand{\VERSION}{0.124} (with a newline before it).
The desired output would be:
\newcommand{\VERSION}{0.124}
How can I fix this?
Alright, so I was not able to get the answer from Cyrus to work because it matched about 50 other lines in my tex files, and I wasn't quite sure how to fix the awk statement to find just the specific line I wanted. However, I got it working with the original sed method by making a simple change.
My sed command became two: the first writes a temporary string %TMPSTR% in place of "newcommand", and the second immediately replaces that temp string to get the desired output without any newline characters appearing.
sed -i -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\\%TMPSTR%{\\\2$((\3+1))\4"/ge' fileName.tex
sed -i -r 's/%TMPSTR%/newcommand/g' fileName.tex
So the line in the file goes from
\newcommand{\VERSION}{0.123} --> \%TMPSTR%{\VERSION}{0.124} --> \newcommand{\VERSION}{0.124}
and ends at the desired outcome. A bit ugly I suppose but it does what I need!
Use awk, that won't get confused by data with special characters.
Your problem could be solved by temporarily replacing the backslashes, but I hope this answer will lead you to awk.
For one line:
echo '\newcommand{\VERSION}{0.123},' | tr '\' '\r' |
sed -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' | tr '\r' '\'
For a file
tr '\' '\r' < fileName.tex |
sed -r 's/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' |
tr '\r' '\' > fileName.tex.tmp && mv fileName.tex.tmp fileName.tex
When \n is the only problem, you can try
sed -i -r 's/\\n/\r/g;s/(.*)(VERSION\}\{0.)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge;s/\r/\\n/' fileName.tex
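Since this answer points towards awk without showing it, here is a minimal sketch of what that could look like. It assumes the definition sits at the start of its line and that the major version stays 0; the function name bump_version is made up for illustration. Anchoring the pattern on \newcommand keeps it from touching other lines that merely mention VERSION.

```shell
# bump_version: hypothetical helper; reads the named file(s) or stdin, writes to stdout.
bump_version() {
  awk '
    # Only the line that defines \VERSION is rewritten; all others pass through.
    /^\\newcommand[{]\\VERSION[}][{]0[.][0-9]+[}]/ {
      match($0, /[{]0[.][0-9]+[}]/)                 # locate "{0.123}"
      minor = substr($0, RSTART + 3, RLENGTH - 4)   # digits after "0."
      $0 = substr($0, 1, RSTART + 2) (minor + 1) substr($0, RSTART + RLENGTH - 1)
    }
    { print }
  ' "$@"
}

# awk treats the backslashes as ordinary data, so the \n in \newcommand survives.
printf '%s\n' '\newcommand{\VERSION}{0.123}' | bump_version
```

To update the file itself, write to a temporary file and move it into place, as in the tr example above: bump_version fileName.tex > fileName.tex.tmp && mv fileName.tex.tmp fileName.tex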
I have a file with \n hidden at the end of each line:
input:
s3741206\n
s2561284\n
s4411364\n
s2516482\n
s2071534\n
s2074633\n
s7856856\n
s11957134\n
s682333\n
s9378200\n
s1862626\n
I want to remove the \n at the end of each line.
desired output:
s3741206
s2561284
s4411364
s2516482
s2071534
s2074633
s7856856
s11957134
s682333
s9378200
s1862626
However, I tried this:
tr -d '\n' < file1 > file2
but the result comes out as below, without any spaces or newlines:
s3741206s2561284s4411364s2516482s2071534s2074633s7856856s11957134s682333s9378200s1862626
I also tried sed $'s/\n//g' -i file1, and it doesn't work on macOS.
Thank you.
This is a possible solution using sed:
sed 's/\\n/ /g'
with awk
awk '{sub(/\\n/,"")} 1' < file1 > file2
What you are describing so far in your question+comments doesn't make sense. How can you have a multi-line file with a hidden newline character at the end of each line? What you show as your input file:
s3741206\n
s2561284\n
s4411364\n
etc.
where each "\n" above according to your comment is a single newline character "\n" is impossible. If those "\n"s were newline characters then your file would simply look like:
s3741206
s2561284
s4411364
etc.
There's really only 2 possibilities I can think of:
You are wrongly interpreting what you are seeing in your input file
and/or using the wrong terminology and you actually DO have \r\n
at the end of every line. Run cat -v file to see the \rs as
^Ms and run dos2unix or similar (e.g. sed 's/\r$//' file) to
remove the \rs - you do not want to remove the \ns or you will
no longer have a POSIX text file and so POSIX tools will exhibit
undefined behavior when run on it. If that doesn't work for you then
copy/paste the output of cat -v file into your question so we can
see for sure what is in your file.
Or:
It's also entirely possible that your file is a perfectly fine POSIX
text file as-is and you are incorrectly assuming you will have a
problem for some reason so also include in your question a
description of the actual problem you are having, include an example
of the command you are executing on that input file and the output
you are getting and the output you expected to get.
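The possibilities above can be told apart mechanically. The following is only a sketch: the helper name line_ending_kind and its three labels are invented here, and it assumes a grep that supports -q.

```shell
# line_ending_kind: hypothetical helper; prints which case the file falls into.
line_ending_kind() {
  cr=$(printf '\r')                 # a real carriage return character
  if grep -q "${cr}\$" "$1"; then
    echo "crlf"                     # lines end in \r\n: strip the \r only
  elif grep -q '\\n$' "$1"; then
    echo "literal"                  # lines end in the two characters \ and n
  else
    echo "plain"                    # ordinary POSIX text file
  fi
}

# Demo on a throwaway file whose lines end in a literal backslash-n.
tmp=$(mktemp)
printf 's3741206\\n\ns2561284\\n\n' > "$tmp"
line_ending_kind "$tmp"
rm -f "$tmp"
```

For "crlf", dos2unix or sed 's/\r$//' applies; for "literal", the sed and awk answers above apply; for "plain", there is nothing to remove.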
You could use bash-native string substitution
$ cat /tmp/newline
s3741206\n
s2561284\n
s4411364\n
s2516482\n
s2071534\n
s2074633\n
s7856856\n
s11957134\n
s682333\n
s9378200\n
s1862626\n
$ while IFS= read -r LINE; do echo "${LINE%\\n}"; done < /tmp/newline
s3741206
s2561284
s4411364
s2516482
s2071534
s2074633
s7856856
s11957134
s682333
s9378200
s1862626
I have a file "test.txt" with the lines below, plus a lot of extra stuff after the "version":
soainfra_metrics{metric_group="sca_composite",partition="test",is_active="true",state="on",is_default="true",composite="test123"} map:stats version:1.0
soainfra_metrics{metric_group="sca_composite",partition="gello",is_active="true",state="on",is_default="true",composite="test234"} map:stats version:1.8
soainfra_metrics{metric_group="sca_composite",partition="bolo",is_active="true",state="on",is_default="true",composite="3415"} map:stats version:3.1
soainfra_metrics{metric_group="sca_composite",partition="solo",is_active="true",state="on",is_default="true",composite="hji"} map:stats version:1.1
I tried:
egrep -r 'partition|is_active|state|is_default|composite' test.txt
It displays every line in full, but I need only the specific fields shown below, ignoring the rest of the data on each line.
In a nutshell, I want to display only these fields from each line, not the rest:
partition="test",is_active="true",state="on",is_default="true",composite="test123"
partition="gello",is_active="true",state="on",is_default="true",composite="test234"
partition="bolo",is_active="true",state="on",is_default="true",composite="3415"
partition="solo",is_active="true",state="on",is_default="true",composite="hji"
If your version of grep supports Perl-style regular expressions, then I'd use this:
grep -oP '.*?,\K[^}]+' file
It matches lazily up to the first comma, uses \K to discard what has been matched so far, and then prints everything up to the }.
Alternatively, using awk:
awk -F'}' '{ sub(/[^,]+,/, ""); print $1 }' file
This sets the field separator to } so the part you're interested in is the first field. It then uses sub to remove the part up to the first comma.
For completeness, you could also use sed:
sed 's/[^,]*,\([^}]*\).*/\1/' file
This captures the part after the first , up to the } and replaces the content of the line with it.
After the grep to pick out the lines you want, use sed to edit the lines:
sed 's/.*\(partition[^}]*\)} map.*/\1/'
This means: whenever you see anything (.*), followed by partition and
any number of non-} characters, then } map and anything else, grab the
part from partition up to but not including the brace, \(...\), as group 1.
The replacement text is just group 1, \1.
Use a pipe | to connect the output of egrep to the input of sed:
egrep ... | sed ...
As far as I understood, your file might have more lines you don't want to see, so I would use:
sed -n 's/.*\(partition.*\)}.*/\1/p' file
We use -n together with the p flag to print only the lines where a substitution was made. The substitution replaces the whole line with the captured part you need.
This might work for you (GNU sed):
sed -r 's/(partition|is_active|state|is_default|composite)="[^"]*"/\n&\n/g;s/[^\n]*\n([^\n]*)\n[^\n]*/\1,/g;s/,$//' file
Treat the problem as if it were a "decomposed club sandwich". Identify the fillings, remove the bread and tidy up.
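Where GNU sed is not available, the same "keep only the fillings" idea can be sketched in awk. This is an illustration, not a drop-in replacement: the function name extract_fields is made up, and it assumes values never contain a double quote.

```shell
# extract_fields: hypothetical helper; prints the wanted key="value" pairs, comma-joined.
extract_fields() {
  awk '
    {
      out = ""
      line = $0
      # Repeatedly pick out the next wanted pair and continue after it.
      while (match(line, /(partition|is_active|state|is_default|composite)="[^"]*"/)) {
        out = out (out == "" ? "" : ",") substr(line, RSTART, RLENGTH)
        line = substr(line, RSTART + RLENGTH)
      }
      if (out != "") print out      # lines without any wanted pair are skipped
    }
  ' "$@"
}

printf '%s\n' 'soainfra_metrics{metric_group="sca_composite",partition="test",is_active="true",state="on",is_default="true",composite="test123"} map:stats version:1.0' | extract_fields
```

Because the pairs are selected by key name rather than by position, this keeps working if the fields appear in a different order. A key that merely ends in one of the names (for example a hypothetical xstate="...") would also match, so tighten the pattern if that matters.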
I'm trying to view the content of a file including its delimiters in terminal.
For example:
hello\t\tworld\n
hello\t\tworld\t\tagain\n
instead of just:
hello world
hello world again
I did this once awhile ago using either "sed" or "awk"....I think...but, I can't seem to remember any of it now.
Thanks for the help.
vi can show you this if you open the file in it and type :set list.
e.g.
$ cat test.txt
hello world
hello world again
In vi, ^I marks a tab and $ marks the end of a line (line feed).
Also like the comment states - cat -A will get you the same output:
$ cat -A test.txt
hello^I^Iworld$
hello^I^Iworld^I^Iagain$
You can use the od command:
od -t a input_file | awk '{$1=""}1' |
awk 'BEGIN{RS="[ \t\n]+";ORS="";
d["sp"]=" "; d["nl"]="\\n\n"; d["ht"]="\\t"; d["cr"] = "\\r";
}length($0)>1{$0=d[$0]}1'
with input_file
hello world
hello world again
you get,
hello\t\tworld\n
hello\t\tworld again\n
This might work for you (GNU sed):
sed -n l0 file
This shows tabs as \t and marks the end of each line with $.
If you wish to see newlines as \n then slurp the whole file:
sed -n '1h;1!H;$!d;x;l0' file
This sed command should work; you can modify it to match and display whatever you want to see (e.g. whitespace). The tricky part is matching newlines: sed uses newlines to delimit its input, so the newline has already been stripped by the time your script sees the line. Something along the lines of sed 's/\n/\\n/' will therefore never match. Instead, match the end of the line with $ and tack on the \\n there.
sed 's/$/\\n/;s/\t/\\t/g;s/ \{4\}/\\t/g' file
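Some sed implementations do not understand \t in a pattern, so here is a sketch of the same idea in awk, which does. The function name show_delims is made up for illustration; it only handles tabs and line ends, not the four-spaces-as-tab guess in the command above.

```shell
# show_delims: hypothetical helper; makes tabs and line ends visible as \t and \n.
show_delims() {
  awk '{ gsub(/\t/, "\\t"); print $0 "\\n" }' "$@"
}

printf 'hello\t\tworld\nhello\t\tworld\t\tagain\n' | show_delims
```

As with sed, awk has already removed the newline from each record, so the visible \n is appended rather than substituted.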
I need to edit a few text files (an output from sar) and convert them into CSV files.
I need to change every whitespace (maybe it's a tab between the numbers in the output) using sed or awk functions (an easy shell script in Linux).
Can anyone help me? Every command I used didn't change the file at all; I tried gsub.
tr ' ' ',' <input >output
This substitutes each space with a comma. If needed, you can add the -s flag (squeeze repeats), which replaces each input sequence of a repeated character that is listed in SET1 (here the blank space) with a single occurrence of that character.
Squeezing repeated tabs first and then substituting them with commas:
tr -s '\t' <input | tr '\t' ',' >output
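Translating and squeezing can also happen in a single tr call. This is a sketch under the assumption that the data itself contains no commas and no leading blanks; the function name blanks_to_commas is made up.

```shell
# blanks_to_commas: hypothetical helper; stdin to stdout.
# Spaces and tabs are both translated to commas, then runs of commas are squeezed to one.
blanks_to_commas() {
  tr -s ' \t' ',,'
}

printf 'a  b\t\tc\n' | blanks_to_commas
```

With two sets, -s squeezes the characters of the second set after translation, which is why a mixed run of spaces and tabs collapses into a single comma. Leading blanks on a line would still produce a leading comma.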
Try something like:
sed -E 's/[[:space:]]+/,/g' orig.txt > modified.txt
The character class [[:space:]] matches all whitespace (spaces, tabs, etc.). Note that the class name only works inside a bracket expression, and that + needs -E (extended regular expressions) to mean "one or more". If you just want to replace a single character, e.g. just a space, use that only.
EDIT: Actually [[:space:]] includes carriage return, so this may not do what you want. The following will replace tabs and spaces.
sed -E 's/[[:blank:]]+/,/g' orig.txt > modified.txt
as will (with GNU sed, which understands \t)
sed -E 's/[\t ]+/,/g' orig.txt > modified.txt
In all of this, you need to be careful that the items in your file that are separated by whitespace don't contain their own whitespace that you want to keep, eg. two words.
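Two details decide whether commands like these match anything: a POSIX class such as [:blank:] is only recognized inside a bracket expression, and + is only a repetition operator in extended mode. A small sketch, with a made-up function name and made-up sample input:

```shell
# blanks_to_csv: hypothetical helper; reads the named file(s) or stdin.
# -E selects extended regular expressions; [[:blank:]] is "space or tab".
blanks_to_csv() {
  sed -E 's/[[:blank:]]+/,/g' "$@"
}

printf 'a  \tb c\n' | blanks_to_csv
```

Each run of spaces and tabs, however long, becomes exactly one comma.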
without looking at your input file, only a guess
awk '{$1=$1}1' OFS=","
redirect to another file and rename as needed
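A short sketch of why that one-liner works: assigning $1 to itself forces awk to rebuild the record from its fields, joined with OFS, and the trailing 1 is an always-true pattern that prints the result. The function name ws_to_csv and the sar-like sample line are made up for illustration.

```shell
# ws_to_csv: hypothetical helper; reads the named file(s) or stdin.
ws_to_csv() {
  awk -v OFS=',' '{ $1 = $1 } 1' "$@"
}

printf '  12:00:01   all\t0.25   99.10\n' | ws_to_csv
```

With the default field separator, leading blanks are ignored and any run of spaces or tabs counts as one separator, so no empty fields or leading commas appear.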
What about something like this :
cat texte.txt | sed -e 's/\s/,/g' > texte-new.txt
(Yes, with some useless catting and piping ; could also use < to read from the file directly, I suppose -- used cat first to output the content of the file, and only after, I added sed to my command-line)
EDIT : as #ghostdog74 pointed out in a comment, there's definitely no need for that cat/pipe ; you can give the name of the file to sed :
sed -e 's/\s/,/g' texte.txt > texte-new.txt
If "texte.txt" is this way :
$ cat texte.txt
this is a text
in which I want to replace
spaces by commas
You'll get a "texte-new.txt" that'll look like this :
$ cat texte-new.txt
this,is,a,text
in,which,I,want,to,replace
spaces,by,commas
I wouldn't go just replacing the old file by the new one (could be done with sed -i, if I remember correctly ; and as #ghostdog74 said, this one would accept creating the backup on the fly) : keeping might be wise, as a security measure (even if it means having to rename it to something like "texte-backup.txt")
This command should work:
sed "s/\s/,/g" < infile.txt > outfile.txt
Note that you have to redirect the output to a new file. The input file is not changed in place.
sed can do this:
sed 's/[\t ]/,/g' input.file
That will send to the console,
sed -i 's/[\t ]/,/g' input.file
will edit the file in-place
Here's a Perl script which will edit the files in-place:
perl -i.bak -lpe 's/\s+/,/g' files*
Consecutive whitespace is converted to a single comma.
Each input file is moved to .bak
These command-line options are used:
-i.bak edit in-place and make .bak copies
-p loop around every line of the input file, automatically print the line
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code
If you want to replace an arbitrary sequence of blank characters (tab, space) with one comma, use the following:
sed -r 's/[\t ]+/,/g' input_file > output_file
or
sed -r 's/[[:blank:]]+/,/g' input_file > output_file
If some of your input lines include leading space characters which are redundant and don't need to be converted to commas, then first you need to get rid of them, and then convert the remaining blank characters to commas. For such case, use the following:
sed -r 's/^ +//' input_file | sed -r 's/[\t ]+/,/g' > output_file
This worked for me.
sed -e 's/\s\+/,/g' input.txt >> output.csv