sed regex with variables to replace numbers in a file - linux

I'm trying to replace numbers in my text file by adding one to them, i.e.
sed 's/3/4/g' path.txt
sed 's/2/3/g' path.txt
sed 's/1/2/g' path.txt
Instead of this, can I automate it, i.e. find a \d and add one to it in the replacement?
Something like
sed 's/\([0-8]\)/\1+1/g' path.txt
I also want to capture more than one digit, i.e. ([0-9])\t([0-9]), and change each one while keeping the tab in between.
Thanks
edited #2
Using the perl example,
I also would like it to work with more digits i.e.
perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1)\.($2+1)\.($3+1)\.($4+1) /ge' output.txt
Any tips on making the above work?

There is no support for arithmetic in sed, but you can easily do this in Perl.
perl -pe 's/(\d+)/ $1+1 /ge'
With the /e option, the replacement expression needs to be valid Perl code. So to handle your final updated example, you need
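perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1) . "." . ($2+1) . "." . ($3+1) . "." . ($4+1) /ge' output.txt
where strings are properly quoted and adjacent strings are concatenated together with the . Perl string concatenation operator. The parentheses make sure each addition is evaluated before the concatenation; Perl has traditionally given + and . the same precedence, so without them a later +1 could be applied to the string built so far. (The numbers are coerced into strings when they are concatenated with a string.)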
... Though of course, the first script already does that more elegantly, since with the /g flag it already increments every sequence of digits by one, anywhere in the string.

Triplee's perl solution is the more generic answer, but Michal's sed solution works well for this particular case. However, Michal's sed solution is more easily written:
sed y/12345678/23456789/ path.txt
and is better implemented as
tr 12345678 23456789 < path.txt
This utterly fails to handle 2 digit numbers (as in the edited question).

You can do it with sed but it's not easy, see this thread.
And it's hard with awk too, see this.
I'd rather use perl for this (something like this can be seen in action @ ideone):
perl -pe 's/([0-8])/$1+1/ge'
(The ideone.com example needs an explicit loop, as ideone does not set -pe by default.)
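For comparison, the awk route mentioned above could look roughly like the sketch below. It is not taken from the linked threads; the function name increment_numbers_awk is hypothetical, and it assumes plain non-negative integers that awk can represent exactly.

```shell
# Hypothetical helper: increment every run of digits using only POSIX awk.
increment_numbers_awk() {
    awk '{
        out = ""
        # Repeatedly find the leftmost run of digits in what is left of the line.
        while (match($0, /[0-9]+/)) {
            # Copy the text before the match, then the matched number plus one.
            out = out substr($0, 1, RSTART - 1) (substr($0, RSTART, RLENGTH) + 1)
            # Continue with the remainder after the match.
            $0 = substr($0, RSTART + RLENGTH)
        }
        print out $0
    }'
}

printf 'a1 b22 c9\n' | increment_numbers_awk    # prints a2 b23 c10
```

The loop is needed because awk's sub() and gsub() cannot evaluate an expression in the replacement, which is what makes this more verbose than the Perl version.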

You can't do addition directly in sed - you could do it in awk by matching numbers using a regex in each line and increasing the value, but it's quite complicated. If you do not need to handle arbitrary numbers but only a limited set, like single-digit numbers from 0 to 8, you can just put several replacement commands on a single sed command line by separating them with semicolons:
sed 's/8/9/g ; s/7/8/g; s/6/7/g; s/5/6/g; s/4/5/g; s/3/4/g; s/2/3/g; s/1/2/g; s/0/1/g' path.txt
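The order of the substitutions in that command matters: it starts with the highest digit so that a freshly written digit is never picked up by a later substitution. A small sketch with made-up function names and a shortened digit range shows the difference:

```shell
# Ascending order: each result is matched again by the next substitution.
bump_ascending() {
    sed 's/0/1/g; s/1/2/g; s/2/3/g; s/3/4/g'
}

# Descending order, as in the answer: every digit is bumped exactly once.
bump_descending() {
    sed 's/3/4/g; s/2/3/g; s/1/2/g; s/0/1/g'
}

echo '0 1 2 3' | bump_ascending     # prints 4 4 4 4
echo '0 1 2 3' | bump_descending    # prints 1 2 3 4
```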

This might work for you (GNU sed & Bash):
sed 's/[0-9]/$((&+1))/g;s/.*/echo "&"/e' file
The command above adds one to every individual digit. To increment whole numbers instead:
sed 's/[0-9]\+/$((&+1))/g;s/.*/echo "&"/e' file
N.B. This method is fraught with problems and may cause unexpected results.

Related

prevent sed replacements from overwriting each other

I want to replace A with T and T with A
sed -e 's/T/A/g;s/A/T/g'
For example, the line above changes A:T to T:T.
I am hoping to get T:A.
How do I do this?
If you want to change single characters, it is simply:
sed 'y/TA/AT/'
If you want to change longer (non-overlapping) strings, you need a temporary value that you know is never used. Conveniently, newline can never appear. So:
sed '
s/T/\n/g
s/A/T/g
s/\n/A/g
'
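A minimal sketch of the same placeholder idea applied to longer strings. The words cat and dog and the function name swap_words are invented for the example, and it assumes GNU sed, where \n in the replacement inserts a newline.

```shell
# Hypothetical helper: swap "cat" and "dog" on each line, using a newline
# as the temporary value (a single input line never contains one).
swap_words() {
    sed 's/cat/\n/g; s/dog/cat/g; s/\n/dog/g'
}

echo 'cat chases dog' | swap_words    # prints dog chases cat
```

For single characters the y command does the swap in one step, for example echo 'A:T' | sed 'y/TA/AT/' prints T:A.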
I'm not a SED expert - so not sure if that can be done as a single command. Just wondering if you've thought about doing that swap like you would in a programming language that would need a temporary variable to do the switch?
Maybe like change the A to a value you know you don't have in the string like Y for example. Then change the T to A and then change Y to T. Would something like that work?
Edit: I did a quick search just out of curiosity. Found this: https://unix.stackexchange.com/questions/528994/swapping-words-with-sed
In case that helps, but with regex stuff, the result is highly dependent on how structured and unique your inputs are. Not sure how to just swap two arbitrary sub-strings or characters throughout an entire string if there's no particular structure that tells you when you're about to get that sub-string or character like the answer above looking for the parenthesis.
Use this Perl one-liner for case-sensitive replacement:
echo 'TATAtata' | perl -pe 'tr{AT}{TA}'
ATATtata
Or this one-liner for case-insensitive replacement:
echo 'TATAtata' | perl -pe 'tr{ATat}{TAta}'
ATATatat
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc -f tr
perldoc perlop: tr/SEARCHLIST/REPLACEMENTLIST/cdsr

how to transpose values two by two using shell?

I have my data in a file store by lines like this :
3.172704445659,50.011996744997,3.1821975358417,50.012335988197,3.2174797791605,50.023182479597
And I would like 2 columns :
3.172704445659 50.011996744997
3.1821975358417 50.012335988197
3.2174797791605 50.023182479597
I know the sed command to delete ',' (sed "s/,/ /"), but I don't know how to start a new line after every two numbers.
Do you have any ideas?
One in awk:
$ awk -F, '{for(i=1;i<=NF;i++)printf "%s%s",$i,(i%2&&i!=NF?OFS:ORS)}' file
Output:
3.172704445659 50.011996744997
3.1821975358417 50.012335988197
3.2174797791605 50.023182479597
Solution viable for those without knowledge of awk command - simple for loop over an array of numbers.
IFS=',' read -ra NUMBERS < file
NUMBERS_ON_LINE=2
INDEX=0
for NUMBER in "${NUMBERS[@]}"; do
    if ((INDEX == NUMBERS_ON_LINE - 1)); then
        # Last number of the pair: print it and end the line.
        INDEX=0
        echo "$NUMBER"
    else
        ((INDEX++))
        echo -n "$NUMBER "
    fi
done
Since you already tried sed, here is a solution using sed:
sed -r "s/(([^,]*,){2})/\1\n/g; s/,\n/\n/g" YOURFILE
-r uses sed extended regexp
there are two substitutions used:
the first substitution, with the (([^,]*,){2}) part, captures two comma separated numbers at once and store them into \1 for reuse: \1 holds in your example at the first match: 3.172704445659,50.011996744997,. Notice: both commas are present.
(([^,]*,){2}) means capture a sequence consisting of NOT comma - that is the [^,]* part followed by a ,
we want two such sequences - that is the (...){2} part
and we want to capture it for reuse in \1 - that is the outer pair of parentheses
then substitute with \1\n - that just inserts the newline after the match, in other words a newline after each second comma
as we have now a comma before the newline that we need to get rid of, we do a second substitution to achieve that:
s/,\n/\n/g
a comma followed by a newline is replaced with only the newline - in other words the comma is deleted
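Note that the two substitutions above leave the comma inside each pair in place. To get the space-separated columns asked for in the question, a third substitution can turn the remaining commas into spaces. The sketch below assumes GNU sed (for -E and for \n in the replacement); the function name pairs_per_line and the short sample values are made up.

```shell
# Hypothetical helper: two comma-separated values per output line,
# separated by a space.
pairs_per_line() {
    # 1) newline after every second comma
    # 2) drop the comma that now sits before each newline
    # 3) turn the remaining commas (inside each pair) into spaces
    sed -E 's/(([^,]*,){2})/\1\n/g; s/,\n/\n/g; s/,/ /g'
}

printf '3.17,50.01,3.18,50.02,3.21,50.03\n' | pairs_per_line
```

With the sample line this should print three lines: 3.17 50.01, then 3.18 50.02, then 3.21 50.03.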
awk and sed are powerful tools, and in fact constitute programming languages in their own right. So, they can, of course, handle this task with ease.
But so can bash, which will have the benefits of being more portable (no outside dependencies), and executing faster (as it uses only built-in functions):
IFS=$', \n'
values=($(</path/to/file))
printf '%s %s\n' "${values[@]}"

Replace string between words multiple times in a file

I am trying to replace string between two strings in a file with the command below. There could be any number of such patterns in the file. This is just an example.
sed 's/word1.*word2/word1/' 1.txt
There are two instances where 'word1' followed by 'word2' occurs in the sample source file I'm testing. Content of the 1.txt file
word1---sjdkkdkjdk---word2 I want this text----word1---jhfnkfnsjkdnf----word2 I need this also
Result is as below.
word1 I need this also
Expected Output :
word1 I want this text----word1 I need this also
Can anybody help me with this please?
I looked at other Stack Overflow questions, but they only discuss replacing one instance of the pattern.
Regular expressions are greedy - they match the longest possible string, so everything from the first 'word1' to the last 'word2'. Not sure if any version of sed supports non-greedy regexps... you could just use perl, though, which does:
perl -pe 's/word1.*?word2/word1/g' 1.txt
should do the trick. That ? changes the meaning of the prior * from 'match as many times as possible as long as the rest of the pattern matches' to 'match as few times as possible as long as the rest of the pattern matches'.
$ sed 's/#/#A/g; s/{/#B/g; s/}/#C/g; s/word1/{/g; s/word2/}/g; s/{[^{}]*}/word1/g; s/}/word2/g; s/{/word1/g; s/#C/}/g; s/#B/{/g; s/#A/#/g' file
word1 I want this text----word1 I need this also
It's lengthy and looks complicated but it's a technique that is used fairly often and is really just a series of simple steps to robustly convert word1 to { and word2 to } so you're dealing with characters instead of strings in the actual substitution s/{[^{}]*}/word1/g and so can use a negated bracket expression to avoid the greedy regexp taking up too much of the line.
See https://stackoverflow.com/a/35708616/1745001 for more info on the general approach used here to be able to turn strings into characters that cannot be present in the input by the time the real work takes place and then restore them again afterwards.
If you only have two instances of the word1-word2 pattern on a line, this should work:
sed 's/\(word1\).*word2\(.*\)\(word1\).*word2\(.*\)/\1\2\3\4/' 1.txt
I grab the parts we want to keep inside escaped brackets \( and \) then I can refer to those parts as \1 \2 and so on.

I am trying to replace a text for example

Example:
"word" -nothing
To
word" - nothing
in gvim.
I tried
:%s/^.*\"/
But what I get is: -nothing
Well I am new to scripting so I would like to know if it can be done in any other way like using gvim or awk or sed.
In vim... Check for \(word + quote + space + hyphen\) as first reference, followed directly by another \(word\) as second reference... replace by first reference + space + second reference... Make sure the find/replace can happen multiple times on a line with g suffix.
:%s/\(\w" -\)\(\w\)/\1 \2/g
Note that I left out the leading quote... I suppose it is possible you might have spaces in the quoted text - and I think this form might be better for you. Now in sed, that is the really cool thing about the relationship between *nix tools - they all use similar (or the same) regular expressions pattern language. So, the same exact pattern as above can be done in sed (using : as delimiter for clarity).
sed 's:\(\w" -\)\(\w\):\1 \2:g'
Awk doesn't do back references; so, not to say it can't be done, but it is not so convenient.
Could you please try following and let me know if this helps you.
awk '{sub(/^"/,"");sub(/-/,"- ")} 1' Input_file
Solution 2nd: With sed.
sed 's/^"//;s/-/- /' Input_file
Since you also tagged grep: GNU grep has the -P switch for PCRE (Perl compatible reg ex) which has \K: Keep the stuff left of the \K, don't include it in $&, so:
$ echo \"word\" | grep -oP "\"\Kword\""
word"
If I understand your question correctly you want to replace first " in each line with empty string. So in sed it is just:
sed 's/"//'
Without g flag it will replace only first occurrence in each line.
EDIT:
The same way it will work in Vim (unless you have 'gdefault' option set), so in Vim you can:
:%s/"//
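A short sketch of the effect of the g flag on the example line from the question. The function names are made up for illustration.

```shell
# Without g: only the first double quote on each line is removed.
strip_first_quote() {
    sed 's/"//'
}

# With g: every double quote on the line is removed.
strip_all_quotes() {
    sed 's/"//g'
}

echo '"word" -nothing' | strip_first_quote    # prints word" -nothing
echo '"word" -nothing' | strip_all_quotes     # prints word -nothing
```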
try this :
:%s/"\(.*\)"/\1"/gc

sed: Find pattern over two lines, not replace after that pattern

Wow, this one has really got me. Gonna need some tricky sed skill here I think. Here is the output value of command text I'm trying to replace:
...
fast
n : abstaining from food
The value I'd like to replace it with, is:
...
Noun
: abstaining from food
This turns out to be trickier than I thought, because 'fast' is listed a number of times and also appears in other places at the beginning of a line. So I came up with this to define the range:
sed '/fast/,/^ n : / s/fast/Noun/'
Which I thought would do, but... Unfortunately, this doesn't end the replacement and the rest of the output following this match are replaced with Noun. How to get sed to stop replacement after the match? Even better, can I find a two line pattern match and replace it?
Try this:
sed "h; :b; \$b ; N; /^${1}\n n/ {h;x;s//Noun\n/; bb}; \$b ; P; D"
Unfortunately, Paul's answer reads the whole file in which makes any additional processing you might want to do difficult. This version reads the lines in pairs.
By enclosing the sed script in double quotes instead of single quotes, you can include shell variables such as positional parameters. I would recommend surrounding them with curly braces so they are set apart from the adjacent characters. When using double quotes, you'll have to be careful of the shell wanting to do its various expansions. In this example, I've escaped the dollar signs that signify the last line of the input file for the branch commands. Otherwise the shell will try to substitute the value of a variable $b which is likely to be null thus making sed unhappy.
Another technique would be to use single quotes and close and open them each time you have a shell variable:
sed 'h; :b; $b ; N; /^'${1}'\n n/ {h;x;s//Noun\n/; bb}; $b ; P; D'
# ↑open close↑ ↑open close↑
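The two quoting styles can be compared on a much simpler substitution. This is only a sketch: the function names are hypothetical, and it assumes the word passed in contains no slashes or regex metacharacters.

```shell
# Double quotes: the shell expands ${1}; \$ reaches sed as a literal $
# (the end-of-line anchor) instead of starting a shell expansion.
replace_exact_dq() {
    sed "s/^${1}\$/Noun/"
}

# Single quotes, closed and reopened around the shell parameter;
# the $ inside the single-quoted part needs no escaping.
replace_exact_sq() {
    sed 's/^'"${1}"'$/Noun/'
}

printf 'fast\nfasting\n' | replace_exact_dq fast    # prints Noun, then fasting
printf 'fast\nfasting\n' | replace_exact_sq fast    # prints Noun, then fasting
```

In the second form the parameter is itself wrapped in double quotes so that a value containing spaces is not split by the shell.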
I'm assuming that the "[/code]" in your expected result is a typo. Let me know if it's not.
This seems to do what you want:
sed -e ':a;N;$!ba;s/fast\n n/Noun\n/'
I essentially stole the answer from here.
This might work for you:
sed '$!N;s/^fast\n\s*n :/Noun\n :/;P;D' file
...
Noun
: abstaining from food
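The last command relies on the N/P/D idiom, which keeps a sliding window of two lines in the pattern space. A sketch of how it behaves, assuming GNU sed (for \s and for \n in the replacement) and an invented three-line sample; the function name replace_pair is hypothetical:

```shell
# Hypothetical wrapper around the command above.
#   $!N  append the next line unless this is the last one
#   s    rewrite the pair only when "fast" is directly followed by an "n :" line
#   P    print the first line of the window
#   D    delete that first line and restart with what is left
replace_pair() {
    sed '$!N;s/^fast\n\s*n :/Noun\n :/;P;D'
}

printf 'fast\n n : abstaining from food\nfast food\n' | replace_pair
```

With this input the first two lines should become Noun and " : abstaining from food", while the later "fast food" line is left alone because it is not followed by an "n :" line.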
