Linux: extract the string between one pair of tags and paste it between another pair of tags

I have files containing XML text like:
<tag1>unknown string1</tag1>blablabla....<tag2></tag2>
I want to use sed (or another command) to extract the string between the tag1 tags and paste it between the tag2 tags, so that the result is:
<tag1>unknown string1</tag1>blablabla....<tag2>unknown string1</tag2>
Thanks.
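I found a solution:
sed 's/\(.*<tag1>\)\(.*\)\(<\/tag1>.*<tag2>\)\(.*\)\(<\/tag2>.*\)/\1\2\3\2\5/' file
This splits each line into five capture groups and then reassembles them in the required order, reusing group 2 (the tag1 content) in place of group 4 (the old tag2 content).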
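As a quick check of the capture-group approach, here is a minimal sketch that runs the substitution on the sample line from the question. The variable names `line` and `result` are only for illustration.

```shell
# Sample line taken from the question above.
line='<tag1>unknown string1</tag1>blablabla....<tag2></tag2>'

# Groups: \1 = up to <tag1>, \2 = tag1 content, \3 = </tag1>...<tag2>,
# \4 = old tag2 content (dropped), \5 = </tag2> and the rest of the line.
result=$(printf '%s\n' "$line" |
  sed 's/\(.*<tag1>\)\(.*\)\(<\/tag1>.*<tag2>\)\(.*\)\(<\/tag2>.*\)/\1\2\3\2\5/')

printf '%s\n' "$result"
```

Because sed works line by line, this only helps when both tag pairs sit on the same line.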

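Try this sed command, which assumes the <tag2> line comes immediately after the <tag1> line (N joins the two lines before the substitution runs):
Command:
sed 'N;s/\(<tag1>\(.*\)<\/tag1>\n<tag2>\).*\(<\/tag2>\)/\1\2\3/' FileName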
Output:
<tag1>unknown string1</tag1>
<tag2>unknown string1</tag2>

This might work for you (GNU sed):
sed -r '/<tag1>/h;/<tag2>/{G;s/>.*(<.*)\n.*>(.*)<.*/>\2\1/}' file
This makes a copy of tag1 in the hold space (HS) and on encountering tag2 appends the HS to the current line and uses pattern matching to produce the required string.
N.B. this assumes one tag per line.
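The hold-space approach can be tried on a hypothetical two-line input as sketched below. This assumes GNU sed; `-E` is used here as the equivalent spelling of `-r`, and the variable names are illustrative only.

```shell
# Hypothetical two-line input, one tag per line.
input='<tag1>unknown string1</tag1>
<tag2></tag2>'

# h copies the tag1 line into the hold space. On the tag2 line, G appends the
# hold space after a newline, and s rebuilds the line from both halves:
# \1 = the closing </tag2>, \2 = the text between the tag1 tags.
result=$(printf '%s\n' "$input" |
  sed -E '/<tag1>/h;/<tag2>/{G;s/>.*(<.*)\n.*>(.*)<.*/>\2\1/}')

printf '%s\n' "$result"
```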

Related

OSX Terminal: how to add same item to each line of a csv file?

I have got a csv file like this:
120,256,300
36,255,12
etc...
I want to add a fixed string like 'USA' to all lines in order to obtain:
120,256,300,USA
36,255,12,USA
etc...
How can I do that?
Thanks
From a text-processing point of view, the CSV file is just plain text here: you only want to append ,USA to each line.
The easiest (and operationally least expensive) way to do so is probably:
sed -i '' 's/$/,USA/' file
This instructs sed to look for the end of each line ($) and "replace" it with ,USA. Because sed is line-based, the newline at the end of each line is left in place.
-i '' instructs sed to make the changes in place without creating a backup file (this is the BSD/macOS form of the option).
If you want a backup, put the desired extension in place of '', e.g. -i .bak.
You can just use sed: sed 's/\(.*\)/\1,USA/' <input-file>.
Here s is the substitute command, which uses the character that follows it as the separator between the regular expression and the replacement. In the regular expression, the escaped parentheses create a capture group, and the regex .* captures the entire line. In the replacement, \1 inserts the first capture group, and then the ,USA text is appended.
You can perform the replacement in place, keeping a backup with the .bak extension, using: sed -i .bak 's/\(.*\)/\1,USA/' <input-file>
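To preview the result without touching any file, the end-of-line substitution can be run on the sample rows from the question. This is a small sketch with illustrative variable names:

```shell
# Sample rows from the question.
csv='120,256,300
36,255,12'

# $ matches the empty position at the end of each line,
# so the substitution appends the text there.
result=$(printf '%s\n' "$csv" | sed 's/$/,USA/')

printf '%s\n' "$result"
```

Once the output looks right, the same expression can be used with -i on the real file.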

sed to replace same patterns that have slightly different ending to the string

I am using grep on an entire directory and sed to replace a string. There are some conflicts in the replacement because two of the strings are very similar and follow the same pattern. The only big difference is the file extension at the end.
String1
xargs sed -i 's,//website.net/resources/special.js,//newsite.net/location/newspecial.js,g'
String2
xargs sed -i 's,//website.net/resources/file.swf,//newsite.net/location/player.swf,g'
How do I specify that .js receives the correct replacement and .swf receives the correct replacement?
For the first, you can restrict the match easily. For the second, you need a mapping from the old file name to the new one; otherwise the script has no way of knowing that "file.swf" should become "player.swf".
$ echo '//website.net/resources/special.js' |
sed -r 's,(.*/)(.*\.js)$,\1new\2,'
//website.net/resources/newspecial.js
The first match group captures every character up to and including the last /, and the second matches the file name ending in .js. You may need a different anchor if there are multiple elements on the same line. Note that when there is only one element per line, the g flag is unnecessary.
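One way to express the old-name-to-new-name mapping is simply one -e expression per file, each spelling out the full file name so that .js and .swf cannot be confused. The sketch below uses the two URLs from the question as sample input; the variable names are illustrative.

```shell
# Sample input: one line per URL from the question.
input='//website.net/resources/special.js
//website.net/resources/file.swf'

# Each expression matches one complete file name; the dots are escaped so
# they match a literal "." rather than any character.
result=$(printf '%s\n' "$input" | sed \
  -e 's,//website\.net/resources/special\.js,//newsite.net/location/newspecial.js,g' \
  -e 's,//website\.net/resources/file\.swf,//newsite.net/location/player.swf,g')

printf '%s\n' "$result"
```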

Filter out only matched values from a text file in each line

I have a file "test.txt" with the lines below, plus a lot of extra content after the "version" field:
soainfra_metrics{metric_group="sca_composite",partition="test",is_active="true",state="on",is_default="true",composite="test123"} map:stats version:1.0
soainfra_metrics{metric_group="sca_composite",partition="gello",is_active="true",state="on",is_default="true",composite="test234"} map:stats version:1.8
soainfra_metrics{metric_group="sca_composite",partition="bolo",is_active="true",state="on",is_default="true",composite="3415"} map:stats version:3.1
soainfra_metrics{metric_group="sca_composite",partition="solo",is_active="true",state="on",is_default="true",composite="hji"} map:stats version:1.1
I tried:
egrep -r 'partition|is_active|state|is_default|composite' test.txt
It displays every line in full, but I need only the specific fields shown below, ignoring the rest of the data on each line.
In a nutshell, I want to display only these fields from each line and nothing else:
partition="test",is_active="true",state="on",is_default="true",composite="test123"
partition="gello",is_active="true",state="on",is_default="true",composite="test234"
partition="bolo",is_active="true",state="on",is_default="true",composite="3415"
partition="solo",is_active="true",state="on",is_default="true",composite="hji"
If your version of grep supports Perl-style regular expressions, then I'd use this:
grep -oP '.*?,\K[^}]+' file
It removes everything up to and including the first comma (\K discards whatever has been matched so far) and prints everything from there up to the }.
Alternatively, using awk:
awk -F'}' '{ sub(/[^,]+,/, ""); print $1 }' file
This sets the field separator to } so the part you're interested in is the first field. It then uses sub to remove the part up to the first comma.
For completeness, you could also use sed:
sed 's/[^,]*,\([^}]*\).*/\1/' file
This captures the part after the first , up to the } and replaces the content of the line with it.
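The sed and awk variants above can be compared on the first sample line from the question. This is a minimal sketch; `from_sed` and `from_awk` are illustrative names.

```shell
# First sample line from the question.
line='soainfra_metrics{metric_group="sca_composite",partition="test",is_active="true",state="on",is_default="true",composite="test123"} map:stats version:1.0'

# sed: drop everything through the first comma, keep up to (not including) }.
from_sed=$(printf '%s\n' "$line" | sed 's/[^,]*,\([^}]*\).*/\1/')

# awk: strip through the first comma, then print the field before }.
from_awk=$(printf '%s\n' "$line" | awk -F'}' '{ sub(/[^,]+,/, ""); print $1 }')

printf '%s\n' "$from_sed" "$from_awk"
```

Both rely on metric_group being the only field before partition; if more fields could precede it, anchoring on the word partition is safer.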
After using grep to pick out the lines you want, use sed to edit them:
sed 's/.*\(partition[^}]*\)} map.*/\1/'
This means: "whenever you see anything (.*), followed by partition and any number of non-} characters, then } map and anything else, capture the part from partition up to but not including the brace as group 1 (the \(...\) part)." The replacement text is just group 1, \1.
Use a pipe | to connect the output of egrep to the input of sed:
egrep ... | sed ...
As far as I understand, your file might contain other lines that you don't want to see, so I would use:
sed -n 's/.*\(partition.*\)}.*/\1/p' file
The -n option together with the p flag prints only the lines where a substitution was made. The substitution itself replaces the whole line with the captured part that you need.
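To see the filtering effect of -n with p, here is a small sketch that mixes a hypothetical unrelated line with one of the sample lines from the question:

```shell
# One unrelated line (hypothetical) plus one sample line from the question.
input='some unrelated line
soainfra_metrics{metric_group="sca_composite",partition="solo",is_active="true",state="on",is_default="true",composite="hji"} map:stats version:1.1'

# -n suppresses automatic printing; the p flag prints only lines on which
# the substitution succeeded, so the unrelated line is dropped.
result=$(printf '%s\n' "$input" | sed -n 's/.*\(partition.*\)}.*/\1/p')

printf '%s\n' "$result"
```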
This might work for you (GNU sed):
sed -r 's/(partition|is_active|state|is_default|composite)="[^"]*"/\n&\n/g;s/[^\n]*\n([^\n]*)\n[^\n]*/\1,/g;s/,$//' file
Treat the problem as if it were a "decomposed club sandwich". Identify the fillings, remove the bread and tidy up.

Bash sed replace text with file content

I would like to replace a string with the content of file.txt.
mtn="John"
fs=`cat file.txt`
lgtxt=`cat large_text.txt`
stxt1=`echo $lgtxt | sed "s/zzzz/$mtn/g"`
stxt2=`echo $stxt1 | sed "s/pppp/$fs/g"`
It replaces 'zzzz' with the value of 'mtn', but it doesn't replace 'pppp'.
The file file.txt contains a list of names, e.g.:
Tom jones
Ted Baker
Linda Evans
on separate lines.
I want to place them in large_text.txt on separate lines, as they are in the original file, and separated by commas.
You don't want to use a variable for the substitution (because it may well contain newlines, for example). I'm assuming that it's GNU sed given it's linux. In which case, see whether GNU sed's r command could help you:
`r FILENAME'
As a GNU extension, this command accepts two addresses.
Queue the contents of FILENAME to be read and inserted into the
output stream at the end of the current cycle, or when the next
input line is read. Note that if FILENAME cannot be read, it is
treated as if it were an empty file, without any error indication.
As a GNU `sed' extension, the special value `/dev/stdin' is
supported for the file name, which reads the contents of the
standard input.
If pppp is on a line of its own, you could go with something like
/pppp/{
r file.txt
d
}
or, alternatively, the s command with e modifier:
`e'
This command allows one to pipe input from a shell command into
pattern space. If a substitution was made, the command that is
found in pattern space is executed and pattern space is replaced
with its output. A trailing newline is suppressed; results are
undefined if the command to be executed contains a NUL character.
This is a GNU `sed' extension.
This would look something like
s/pppp/cat file.txt/e
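Be aware that the e flag executes the entire pattern space as a command once the substitution has been made, so as written this works only when pppp is the whole line; if pppp is mid-line, the surrounding text becomes part of the command. If you need to do further processing on file.txt, you could replace cat with whatever you need (though you need to be careful about quoting / and \).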
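The r-plus-d script above can be tried end to end with throwaway files. The sketch below assumes pppp sits on a line of its own; the temporary directory and the contents of large_text.txt are made up for illustration.

```shell
# Build throwaway copies of the two files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'Tom jones' 'Ted Baker' 'Linda Evans' > "$tmp/file.txt"
printf '%s\n' 'Hello zzzz' 'pppp' 'Goodbye' > "$tmp/large_text.txt"

# The first expression handles the simple zzzz replacement. The second
# queues file.txt for output on the pppp line and then deletes that line.
result=$(sed -e 's/zzzz/John/g' -e "/pppp/{
r $tmp/file.txt
d
}" "$tmp/large_text.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The file name after r runs to the end of the line, which is why r and d are on separate lines.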
A final option is to consider Perl, which will accept something very similar to your shell commands.

Execute a cat command within a sed command in Linux

I have a file.txt that has some content. I want to search for a string in file1.txt, if that string is matched I want to replace that string with the content of the file.txt. How can I achieve this?
I have tried using sed:
sed -e 's/%d/cat file.txt/g' file1.txt
This is searching for the matching string in the file1.txt and replacing that with the string cat file.txt, but I want contents of file.txt instead.
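How about saving the content of the file in a variable before inserting it into the sed expression?
content=$(cat file.txt); sed "s/%d/${content}/g" file1.txt
Note that this breaks if the content contains a /, an & or a newline, because sed interprets those characters in the replacement text.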
Given original.txt with content:
this is some text that needs replacement
and replacement.txt with content:
no changes
I'm not sure what your replacement criterion is, but let's say you want to replace all occurrences of the word replacement in the original file. You can replace them with the content of the other file using:
> sed -e 's/replacement/'"`cat replacement.txt`"'/g' original.txt
this is some text that needs no changes
You can read a file with sed using the r command. However, that is a line-based operation, which may not be what you're after: the file is inserted after each matching line, and the matching line itself is kept.
sed '/%d/r file.txt' file1.txt
The read occurs at the end of the 'per-line' cycle.
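A short sketch of what the r command produces, using made-up file contents in a temporary directory. The file content is appended after the line containing %d rather than substituted for the %d text:

```shell
# Throwaway files: file.txt holds the text to insert, file1.txt is searched.
tmp=$(mktemp -d)
printf '%s\n' 'inserted text' > "$tmp/file.txt"
printf '%s\n' 'before' 'marker %d here' 'after' > "$tmp/file1.txt"

# r queues file.txt; it is written out after the matching line is printed.
result=$(sed "/%d/r $tmp/file.txt" "$tmp/file1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```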
