Convert this Linux statement into one supported by the Windows command prompt

This is my statement supported by unix environment
"cat document.xml | grep \'<w:t\' | sed \'s/<[^<]*>//g\' | grep -v \'^[[:space:]]*$\'"
But I want to execute that statement in the Windows command prompt.
How do I do that, and what are the commands similar to cat, grep, and sed?
Please tell me the exact Windows code equivalent to the command above.

The double quotes around the pipeline in your question are a syntax error, and the backslashes before the single quotes apparently should not be there, but I assume these are just artefacts of a slightly imprecise presentation.
Here's what the code does.
cat document.xml |
This is a useless use of cat but its purpose is to feed the contents of this file into the pipeline.
grep '<w:t' |
This looks for lines containing the literal string <w:t (probably the start of a tag in the XML format in the file). The single quotes quote the string so that it is not interpreted by the shell (otherwise the < would be interpreted as a redirection operator); they are consumed by the shell, and not passed through to grep.
sed 's/<[^<]*>//g' |
This replaces every pair of open/close brokets with an empty string. The regular expression [^<]* matches zero or more occurrences of a character which can be anything except <. If the XML is well-formed, these should always occur in pairs, and so we effectively remove all XML tags.
grep -v '^[[:space:]]*$'
This removes any line which is empty or consists entirely of whitespace.
Because sed is a superset of grep, the program could easily be rephrased as a single sed script. Perhaps the easiest solution for your immediate problem would be to obtain a copy of sed for your platform.
sed -e '/<w:t/!d' -e 's/<[^<]*>//g' -e '/[^[:space:]]/!d' document.xml
I understand quoting rules on Windows may be different; try with double quotes instead of single, or put the script in a file and use sed -f file document.xml where file contains the script itself, like this:
/<w:t/!d
s/<[^<]*>//g
/[^[:space:]]/!d
This is a rather crude way to extract the CDATA from an XML document, anyway; perhaps some XML processor would be the proper way forward. E.g. xmlstarlet appears to be available for Windows. It works even if the XML input doesn't have the beginning and ending <w:t> tags on the same line, with nothing else on it. (In fact, parsing XML with line-oriented tools is a massive antipattern.)
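As a quick check of the single-sed version, the sketch below runs it against a tiny made-up sample and compares it with the original grep/sed/grep pipeline. The sample contents and the variable names are assumptions for illustration, not taken from a real document.xml.

```shell
# Hypothetical sample standing in for document.xml (contents invented for this sketch).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<w:p><w:r><w:t>Hello</w:t></w:r></w:p>
<w:p><w:pPr/></w:p>
<w:p><w:r><w:t> </w:t></w:r></w:p>
<w:p><w:r><w:t>World</w:t></w:r></w:p>
EOF

# Original pipeline from the question, without the stray quoting and the extra cat.
pipeline_out=$(grep '<w:t' "$sample" | sed 's/<[^<]*>//g' | grep -v '^[[:space:]]*$')

# Single sed script: keep lines containing <w:t, strip tags, drop blank lines.
sed_out=$(sed -e '/<w:t/!d' -e 's/<[^<]*>//g' -e '/[^[:space:]]/!d' "$sample")

printf '%s\n' "$sed_out"
rm -f "$sample"
```

Both variables should hold the same two lines of text. As noted above, this only holds while each <w:t> element sits on a single line.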

You could try PowerShell.
I think it has been included since Windows 8;
it is certainly included in Windows 10.
I've just tested a "cat" command and it works.
"grep" doesn't, but it can be adapted along these lines:
PowerShell equivalent to grep -f
and
https://communary.wordpress.com/2014/11/10/grep-the-powershell-way/

The equivalent of grep on windows would be findstr and the equivalent of cat would be type.

Related

how to escape file path in bash script variable

I would like to escape a file path that is stored in a variable in a bash script.
I read several threads about escaping backticks and the like, but nothing seems to work as it should.
I have this variable, whose value is entered as a user parameter when the bash script runs:
CONFIG="/home/teams/blabla/blabla.yaml"
I would need to change this to: \/home\/teams\/blabla\/blabla.yaml
How can I do that within the script via sed or similar (not manually)?
With GNU bash and its Parameter Expansion:
echo "${CONFIG//\//\\/}"
Output:
\/home\/teams\/blabla\/blabla.yaml
Using the solution from this question, in your case it will look like this:
CONFIG=$(echo "/home/teams/blabla/blabla.yaml" | sed -e 's/[]\/$*.^[]/\\&/g')
echo "/home/teams/blabla/blabla.yaml" | sed 's/\//\\\//g'
\/home\/teams\/blabla\/blabla.yaml
Explanation:
A backslash either gives the following character a special meaning in the regular expression or, as here, removes it so that the character is taken literally. A double backslash is used when you need a literal backslash.
Why does that need escaping? Is this an XY Problem?
If the issue is that you are trying to use that variable in a substitution regex, then the examples given should work, but you might benefit by removing some of the "leaning toothpick syndrome", which many tools can do just by using a different match delimiter. sed, for example:
$: sed "s,SOME_PLACEHOLDER_VALUE,$CONFIG," <<< SOME_PLACEHOLDER_VALUE
/home/teams/blabla/blabla.yaml
Be very careful about this, though. Commas are perfectly valid characters in a filename, as is almost anything except NUL. Know your data.
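A minimal sketch combining the answers above: escape the characters that are special in a sed replacement (the / delimiter, & and backslash), then substitute the escaped value. The helper name escape_for_sed and the PLACEHOLDER text are hypothetical, and the sketch assumes the value contains no newlines.

```shell
# Hypothetical helper: backslash-escape /, & and \ so the value can be used
# as the replacement part of s/.../.../ (assumes no newlines in the value).
escape_for_sed() {
    printf '%s\n' "$1" | sed 's|[/&\\]|\\&|g'
}

CONFIG="/home/teams/blabla/blabla.yaml"
escaped=$(escape_for_sed "$CONFIG")
printf '%s\n' "$escaped"

# The escaped value can now be dropped into a substitution that uses / as delimiter.
result=$(printf 'config: PLACEHOLDER\n' | sed "s/PLACEHOLDER/$escaped/")
printf '%s\n' "$result"
```

Escaping the value, rather than picking a delimiter and hoping it never appears in the data, avoids the comma problem mentioned above.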

regex replace in linux for multiple files

I have 200k files in a single folder on a Linux server, and I need to transform these files using a regex.
Here is the regex to find in each text file:
([^\s]+?.*)=((.*(?=,$))+|.*).*
Now I need to replace it with the substitution value below:
"$1":"$2",
The above regex works fine when I use it in Python. The server I am working on does not
support Python, so I need to use bash commands. I have tried the bash command below, but it is not working.
Command:
sed -r 's/([^\s]+?.*)=((.*(?=,$))+|.*).*/"$1":"$2",/g' *20200502*
Fixed your regex:
sed -E 's/([^[:space:]]+?.*)=((.*(=?,$))+|.*).*/"\1":"\2",/g' *20200502*
Replaced inappropriate [^\s] by its POSIX ERE syntax equivalent [^[:space:]].
Replaced the Perl-style lookahead (?=,$), which POSIX ERE does not support, with =?,$ instead.
Fixed the invalid capture group reference syntax "$1":"$2" with "\1":"\2".
Kinda difficult to test but just analyzing your approach this might work:
sed -i -E "s/([^\s]+?.*)=((.*(?=,$))+|.*).*/\1$1\2$2/" *20200502*
\1 and \2 in the second part are references to the groups captured in the first part
-E for extended regular expressions (+ and grouping)
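For reference, (?=,$) in the Python pattern is a lookahead, and the POSIX ERE syntax used by sed -E has no lookahead. The sketch below is one possible way to get what the pattern appears to intend (quote key and value, dropping one optional trailing comma from the value). The sample input is invented, and the expression is a simplification rather than an exact translation of the original regex.

```shell
# Invented sample lines: one with a trailing comma, one without.
input=$(printf 'name=alice,\nage=30\n')

# \1 = key (from the first non-space character up to the last =),
# \2 = value without an optional trailing comma.
result=$(printf '%s\n' "$input" |
    sed -E 's/([^[:space:]].*)=(.*[^,])?,?$/"\1":"\2",/')

printf '%s\n' "$result"
```

To apply this to the files themselves, the same expression could be passed to sed -i -E with the *20200502* glob, ideally after testing on a copy.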

Find line starts with and replace in linux using sed [duplicate]

This question already has answers here:
Replace whole line when match found with sed
How do I find line starts with and replace complete line?
File output:
xyz
abc
/dev/linux-test1/
Code:
output=/dev/sda/windows
sed 's/^/dev/linux*/$output/g' file.txt
I am getting below Error:
sed: -e expression #1, char 9: unknown option to `s'
File Output expected after replacement:
xyz
abc
/dev/sda/windows
Let's take this in small steps.
First we try changing "dev" to "other":
sed 's/dev/other/' file.txt
/other/linux-test1/
(Omitting the other lines.) So far, so good. Now "/dev/" => "/other/":
sed 's//dev///other//' file.txt
sed: 1: "s//dev///other//": bad flag in substitute command: '/'
Ah, it's confused, we're using '/' as both a command delimiter and literal text. So we use a different delimiter, like '|':
sed 's|/dev/|/other/|' file.txt
/other/linux-test1/
Good. Now we try to replace the whole line:
sed 's|^/dev/linux*|/other/|' file.txt
/other/-test1/
It didn't replace the whole line... Ah, in sed, '*' means the previous character repeated any number of times. So we precede it with '.', which means any character:
sed 's|^/dev/linux.*|/other/|' file.txt
/other/
Now to introduce the variable:
sed 's|^/dev/linux.*|$output|' file.txt
$output
The shell didn't expand the variable, because of the single quotes. We change to double quotes:
sed "s|^/dev/linux.*|$output|" file.txt
/dev/sda/windows
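The final command from this walkthrough can be checked end to end with a throwaway copy of the file. The temporary file and the variable names below are assumptions of this sketch.

```shell
# Recreate the sample file from the question in a temporary location.
tmp=$(mktemp)
printf 'xyz\nabc\n/dev/linux-test1/\n' > "$tmp"

output=/dev/sda/windows

# Double quotes let the shell expand $output; | is the delimiter, so the
# slashes in the pattern and in the replacement need no escaping.
result=$(sed "s|^/dev/linux.*|$output|" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

This works because $output contains neither | nor &; a value containing either of those would need escaping first.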
This might work for you (GNU sed):
output="/dev/sda/windows"; sed -i '\#/dev/linux.*/#c'"$output" file
Set the shell variable and change the line addressed by /dev/linux.*/ to it.
N.B. The shell variable needs to be interpolated, hence the ; (i.e. the variable may instead be set on a line of its own). Also, the delimiter for the sed address must be changed so as not to interfere with the address, hence \#...#, and finally the shell variable should be enclosed in double quotes to allow full interpolation.
I'd recommend not doing it this way. Here's why.
Sed is not a programming language. It's a stream editor with some constructs that look and behave like a language, but it offers very little in the way of arbitrary string manipulation, format control, etc.
Sed only takes data from a file or stdin (also a file). Embedding strings within your sed script is asking for errors -- constructs like s/re/$output/ are destined to fail at some point, almost regardless of what workarounds you build into your sed script. The best solution for making sed commands like this work is to do your input sanitization OUTSIDE of sed.
Which brings me to ... this may be the wrong tool for this job, or might be only one component of the toolset for the job.
The error you're getting is obviously because the sed command you're using is horribly busted. The substitute command is:
s/pattern/replacement/flags
but the command you're running is:
s/^/dev/linux*/$output/g
The pattern you're searching for is ^, the null at the beginning of the line. Your replacement pattern is dev, then you have a bunch of text that might be interpreted as flags. This plainly doesn't work, when your search string contains the same character that you're using as a delimiter to the options for the substitute command.
In regular expressions and in sed, you can escape things. While you might get some traction with s/^\/dev\/linux.*/$output/, you'd still run into difficulty if $output contained slashes. If you're feeding this script to sed from bash, you could use ${output//\//\\\/}, but you can't handle those escapes within sed itself. Sed has no variables.
In a proper programming language, you'd have better separation of variable content and the commands used for the substitution.
output="/dev/sda/windows"
awk -v output="$output" '$1~/\/dev\/linux/ { $0=output } 1' file.txt
Note that I've used $1 here because in your question, your input lines (and output) appear to have a space at the beginning of each line. Awk automatically trims leading and trailing space when assigning field (positional) variables.
Or you could even do this in pure bash, using no external tools:
output="/dev/sda/windows"
while read -r line; do
  [[ "$line" =~ ^/dev/linux ]] && line="$output"
  printf '%s\n' "$line"
done < file.txt
This one isn't resilient in the face of leading whitespace. Salt to taste.
So .. yes, you can do this with sed. But the way commands get put together in sed makes something like this risky, and despite the available workarounds like switching your substitution command delimiter to another character, you'd almost certainly be better off using other tools.
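As a sketch of the "other tools" route, here is a variant of the shell loop that tolerates leading whitespace and never interprets the replacement text. It uses only POSIX shell features. The function name replace_line is hypothetical.

```shell
# Hypothetical helper: replace every line that starts with /dev/linux
# (ignoring leading whitespace) with the text given as $1.
# Input is read from stdin; the replacement is never parsed as a pattern.
replace_line() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip leading whitespace for the comparison only.
        trimmed=${line#"${line%%[![:space:]]*}"}
        case $trimmed in
            /dev/linux*) line=$1 ;;
        esac
        printf '%s\n' "$line"
    done
}

output='/dev/sda/windows'
result=$(printf 'xyz\nabc\n /dev/linux-test1/\n' | replace_line "$output")
printf '%s\n' "$result"
```

Because the replacement is only ever assigned and printed, characters such as /, | or & in it need no escaping.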

Replace a phrase in a file with a string which contains special Characters

I am using sed -e "s/foo/$bar/" -e "s/some/$text/" file.whatever to replace a phrase in a certain file. The problem is that the $bar string contains multiple special characters like /. So when I try to replace something in a text file using the following code...
#!/bin/bash
bar="http://stackoverflow.com/"
sed -e "s/foo/$bar/" -e "s/some/$text/" file.whatever
...then I get an error saying: sed: unknown option to s. Is there anything I can do about it?
You can use any delimiter. s#some#SOME# for example. Another good delimiter is vertical-bar. Other chars can work but have special significance for some contexts such as regular expressions.
You can get this difficulty in sed regardless of what delimiters you use, especially if you don't know what the string contains. I'd pick a different method for passing the shell variables into the helper interpreter.
awk -v rep1="$bar" -v rep2="$text" '{sub(/foo/, rep1); sub(/some/, rep2); print}'
or
perl -spe 's/foo/$rep1/; s/some/$rep2/' -- -rep1="$bar" -rep2="$text"
Correctness trumps brevity in this case.
(reference for Perl example)
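A small sketch of the awk variant with values that would break the sed one-liner whichever of /, | or # were chosen as delimiter. The sample input line and the value of text are invented for illustration.

```shell
bar="http://stackoverflow.com/"
text="a/b|c#d"   # invented value containing several would-be delimiters

# The values travel as awk variables, so no sed delimiter is involved.
# Caveat: & in a sub() replacement and backslashes in -v values are still
# special to awk, so those two characters would need extra care.
result=$(printf 'foo and some\n' |
    awk -v rep1="$bar" -v rep2="$text" '{ sub(/foo/, rep1); sub(/some/, rep2); print }')

printf '%s\n' "$result"
```

The replacement text never becomes part of the program source, which is the main point of passing it with -v.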

Linux command to replace string in LARGE file with another string

I have a huge SQL file that gets executed on the server. The dump is from my machine and in it there are a few settings relating to my machine. So basically, I want every occurrence of "c://temp" to be replaced by "//home//some//blah".
How can this be done from the command line?
sed is a good choice for large files.
sed -i.bak -e 's%C://temp%//home//some//blah%' large_file.sql
It is a good choice because it doesn't read the whole file at once to change it. Quoting the manual:
A stream editor is used to perform
basic text transformations on an input
stream (a file or input from a
pipeline). While in some ways similar
to an editor which permits scripted
edits (such as ed), sed works by
making only one pass over the
input(s), and is consequently more
efficient. But it is sed's ability to
filter text in a pipeline which
particularly distinguishes it from
other types of editors.
The relevant manual section is here. A small explanation follows
-i.bak enables in place editing leaving a backup copy with .bak extension
s%foo%bar% uses s, the substitution command, which
substitutes matches of first string
in between the % sign, 'foo', for the second
string, 'bar'. It's usually written as s//
but because your strings have plenty
of slashes, it's more convenient to
change them for something else so you
avoid having to escape them.
Example
vinko#mithril:~$ sed -i.bak -e 's%C://temp%//home//some//blah%' a.txt
vinko#mithril:~$ more a.txt
//home//some//blah
D://temp
//home//some//blah
D://temp
vinko#mithril:~$ more a.txt.bak
C://temp
D://temp
C://temp
D://temp
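One detail worth checking for the "every occurrence" requirement: without the g flag, s replaces only the first match on each line. The sketch below shows the difference on an invented sample with two matches on one line. The temporary file and the variable names are assumptions.

```shell
# Invented sample: the first line contains the search text twice.
tmp=$(mktemp)
printf 'a C://temp b C://temp\nD://temp\n' > "$tmp"

# Without g: only the first match on each line is replaced.
first_only=$(sed -e 's%C://temp%//home//some//blah%' "$tmp")

# With g: every match on each line is replaced.
all=$(sed -e 's%C://temp%//home//some//blah%g' "$tmp")

printf '%s\n' "$first_only"
printf '%s\n' "$all"
rm -f "$tmp"
```

The match is also case-sensitive, so the pattern has to use the same case as the text in the dump.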
Just for completeness. In place replacement using perl.
perl -i -p -e 's{c://temp}{//home//some//blah}g' mysql.dmp
No backslash escapes required either. ;)
Try sed? Something like:
sed 's/c:\/\/temp/\/\/home\/\/some\/\/blah/' mydump.sql > fixeddump.sql
Escaping all those slashes makes this look horrible though, here's a simpler example which changes foo to bar.
sed 's/foo/bar/' mydump.sql > fixeddump.sql
As others have noted, you can choose your own delimiter, which would prevent the leaning toothpick syndrome in this case:
sed 's|c://temp|//home//some//blah|' mydump.sql > fixeddump.sql
The clever thing about sed is that it operates on a stream rather than on a whole file at once, so you can process huge files using only a modest amount of memory.
There's also a non-standard UNIX utility, rpl, which does the exact same thing that the sed examples do; however, I'm not sure whether rpl operates streamwise, so sed may be the better option here.
The sed command can do that.
Rather than escaping the slashes, you can choose a different delimiter (_ in this case):
sed -e 's_c://temp/_/home//some//blah/_' file1.txt > file2.txt
perl -pi -e 's#c://temp#//home//some//blah#g' yourfilename
The -p will treat this script as a loop, it will read the specified file line by line running the regex search and replace.
-i This flag should be used in conjunction with the -p flag. This commands Perl to edit the file in place.
-e Just means execute this perl code.
Good luck
gawk
awk '{gsub("c://temp","//home//some//blah")}1' file

Resources