I am trying to clean up filenames by removing special characters and whitespace.
For some reason my sed regex doesn't work when I exclude both the dash and the slash from removal.
Example:
echo "/path/to/file 20-456 (1).jpg" | sed -e 's/ /_/g' -e 's/[^0-9a-zA-Z\.\_\-\/]//g'
Output:
/path/to/file_20456_1.jpg
So the dash isn't in.
When I try this command:
echo "/path/to/file 20-456 (1).jpg" | sed -e 's/ /_/g' -e 's/[^0-9a-zA-Z\.\_\-]//g'
Output:
pathtofile_20-456_1.jpg
The dash is there, but without the directory slashes I can't move the files.
I wonder why the dash is no longer preserved once I add \/ to the regex pattern.
Any suggestions?
With your shown samples and attempts, please try following awk code.
echo "/path/to/file 20-456 (1).jpg" |
awk 'BEGIN{FS=OFS="/"} {gsub(/ /,"_",$NF);gsub(/-|\(|\)/,"",$NF)} 1'
Explanation: echo feeds the string /path/to/file 20-456 (1).jpg to awk on standard input. The BEGIN block sets both FS and OFS to /, so the file name is the last field ($NF) and the directory part is left alone. The first gsub replaces every space in $NF with _; the second removes every -, ( and ) from $NF. The trailing 1 is an always-true pattern, so awk prints the modified line. Note that this also removes the dash; drop the -| alternative from the second pattern if you want to keep it.
You may get the result using string manipulation in Bash:
#!/bin/bash
path="/path/to/file 20-456 (1).jpg"
fldr="${path%/*}" # Get the folder
file="${path##*/}" # Get the file name
file="${file// /_}" # Replace spaces with underscores in filename
echo "$fldr/${file//[^[:alnum:]._-]/}" # Get the result
This yields /path/to/file_20-456_1.jpg.
Quick notes:
${path%/*} - removes the shortest match of /* from the end of path (the last / and the file name after it)
${path##*/} - removes the longest match of */ from the start of path (everything up to and including the last /)
${file// /_} - replaces all spaces with _ in file
${file//[^[:alnum:]._-]/} - removes all characters other than alphanumerics, ., _ and - from file.
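As for why the original sed attempt loses the dash: inside a bracket expression the backslash does not act as an escape character, so once another character follows \- the dash sits in the middle of the list and is read as a range operator instead of a literal dash. A minimal sketch of a sed-only fix, assuming it is acceptable to use | as the s delimiter so the slash needs no escaping (sanitize_path is a hypothetical helper name, not something from the question):

```shell
# Hypothetical helper: sanitize a path the way the question intends.
sanitize_path() {
    # 1st expression: turn spaces into underscores.
    # 2nd expression: delete everything except alphanumerics, '.', '_', '/' and '-'.
    # The dash is placed last in the bracket expression so it is a literal dash,
    # and '|' is used as the delimiter so '/' can appear unescaped.
    printf '%s\n' "$1" | sed -e 's/ /_/g' -e 's|[^0-9a-zA-Z._/-]||g'
}

sanitize_path '/path/to/file 20-456 (1).jpg'
# prints /path/to/file_20-456_1.jpg
```

Putting the dash first (right after the ^) would work as well; the point is only that it must not sit between two other characters.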
For the below code:
sed "s/ //g" filename
Since this is used to remove spaces, why are there two forward slashes in front of 'g'? What is the reason? It works fine, though.
I suggest you read a sed tutorial first.
Long story short, take this example: sed "s/search_pattern/replace_string/g" filename
s means search and replace (substitute)
search_pattern is the pattern to search for
replace_string is the string that replaces each match
g means apply the action globally, i.e. replace every match on a line instead of only the first one
Thus, sed "s/ //g" filename means: find every space in the file and replace it with the empty string
Each slash is a delimiter; there's just nothing between the second and third ones. For example, if you wanted to replace spaces with underscores, you would put an underscore between the second and third slashes:
sed "s/ /_/g" filename
Example run:
$ echo "foo bar" | sed "s/ /_/g"
foo_bar
$ echo "foo bar" | sed "s/ //g"
foobar
I need to replace inverted exclamation and inverted question marks in subtitle files so they display correctly on my TV. The files work correctly in ISO-8859, but I can't remove the marks.
The first solution was to use the command 'sed':
sed s/\¿|¡//g "$FILE"
This works for files in UTF-8, but what would be the right solution for files in ISO-8859?
sed 's/\xBF//g', for example, doesn't work.
In this command, your \ is removed by bash before the argument is passed to sed:
sed s/\¿//g "$FILE"
That doesn't matter, because ¿ is not a bash metacharacter and it does not require quoting. However, if you write this:
sed s/\xBF//g "$FILE"
it won't do what you expect; bash will replace \x with x leaving sed with the command s/xBF//g, which is probably not what you wanted to do.
You must either write:
sed 's/\xBF//g'
or
sed s/\\xBF//g
The command posted will not work, though:
sed s/\¿|¡//g "$FILE"
| is a bash metacharacter, and it must therefore be quoted or escaped. Also, sed uses Basic Regular Expressions (BREs) by default, which means that you must write \| to express alternation. That means that you would have to type:
sed 's/¿\|¡//g' "$FILE"
or
sed s/¿\\\|¡//g "$FILE"
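For the ISO-8859 files specifically, one possible reason sed 's/\xBF//g' appears not to work is that sed runs in a UTF-8 locale, where a lone 0xBF byte is not a valid character. A minimal sketch, assuming GNU sed (the \xHH escape is a GNU extension) and ISO-8859-1 input, where ¿ is byte 0xBF and ¡ is byte 0xA1; strip_inverted_marks is a hypothetical helper name:

```shell
# Hypothetical helper: delete the ISO-8859-1 bytes for ¿ (0xBF) and ¡ (0xA1).
# LC_ALL=C makes sed treat its input as single bytes rather than UTF-8 text.
strip_inverted_marks() {
    LC_ALL=C sed -e 's/\xBF//g' -e 's/\xA1//g'
}

# Build a Latin-1 test line with octal escapes (\277 = 0xBF, \241 = 0xA1).
printf '\277Que tal? \241Hola!\n' | strip_inverted_marks
# prints: Que tal? Hola!
```

To apply it to a file, redirect the file into the function, for example strip_inverted_marks < "$FILE".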
I have a list of file names in a directory (/path/to/local). I would like to remove a certain number of characters from all of those filenames.
Example filenames:
iso1111_plane001_00321.moc1
iso1111_plane002_00321.moc1
iso2222_plane001_00123.moc1
In every filename I wish to remove the last 5 characters before the file extension.
For example:
iso1111_plane001_.moc1
iso1111_plane002_.moc1
iso2222_plane001_.moc1
I believe this can be done using sed, but I cannot determine the exact coding. Something like...
for filename in /path/to/local/*.moc1; do
mv $filname $(echo $filename | sed -e 's/.....^//');
done
...but that does not work. Sorry if I butchered the sed options, I do not have much experience with it.
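mv "$filename" "$(echo "$filename" | sed -e 's/.....\.moc1$/.moc1/')"
or
echo "${filename%%?????.moc1}.moc1"
%% is a shell parameter-expansion operator: it strips the longest suffix that matches the pattern, here five characters followed by .moc1.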
This sed command will work for all the examples you gave.
sed -e 's/\(.*\)_.*\.moc1/\1_.moc1/'
However, if you just want to specifically "remove 5 characters before the last extension in a filename" this command is what you want:
sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/'
You can implement this in your script like so:
for filename in /path/to/local/*.moc1; do
mv "$filename" "$(echo "$filename" | sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/')"
done
First Command Explanation
The first sed command captures everything up to the last underscore (.* is greedy, so it takes as much as it can): \(.*\)_
Then it matches the remaining characters up to and including .moc1: .*\.moc1
Then it replaces the text that it found with everything it captured inside the parentheses, followed by an underscore: /\1_
And finally it puts the .moc1 extension back on the end and closes the command: .moc1/
Second Command Explanation
The second sed command works by grabbing all characters at first: \(.*\)
And then it is forced to stop grabbing characters so it can discard five characters, or more specifically, five characters that lie in the ranges 0-9, a-z, and A-Z: [0-9a-zA-Z]\{5\}
Then comes the dot '.' character to mark the last extension : \.
And then it looks for all non-dot characters. This ensures that we are grabbing the last extension: \([^.]*\)
Finally, it replaces all that text with the first and second capture groups, separated by the . character, and ends the regex: /\1.\2/
This might work for you (GNU sed):
sed -r 's/(.*).{5}\./\1./' file
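The same rename can also be sketched without sed, using only the suffix-removal expansion mentioned above. This is an illustrative dry run, assuming every file has at least five characters before .moc1; strip_last5 is a hypothetical helper name:

```shell
# Hypothetical helper: drop the 5 characters before the .moc1 extension.
strip_last5() {
    # ${1%?????.moc1} removes a suffix of five characters plus ".moc1";
    # the extension is then appended again.
    printf '%s\n' "${1%?????.moc1}.moc1"
}

for filename in /path/to/local/*.moc1; do
    [ -e "$filename" ] || continue    # skip when the glob matches nothing
    # Dry run: prints the mv commands; remove "echo" to actually rename.
    echo mv -- "$filename" "$(strip_last5 "$filename")"
done
```

Printing the commands first makes it easy to check for name collisions before any file is moved.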
If I run these commands from a script:
#my.sh
PWD=bla
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
xxx
bla
it is fine.
But, if I run:
#my.sh
sed 's/xxx/'$PWD'/'
...
$ ./my.sh
$ sed: -e expression #1, char 8: Unknown option to `s'
I read in tutorials that to substitute environment variables from shell you need to stop, and 'out quote' the $varname part so that it is not substituted directly, which is what I did, and which works only if the variable is defined immediately before.
How can I get sed to recognize a $var as an environment variable as it is defined in the shell?
Your two examples look identical, which makes problems hard to diagnose. Potential problems:
You may need double quotes, as in sed 's/xxx/'"$PWD"'/'
$PWD may contain a slash, in which case you need to find a character not contained in $PWD to use as a delimiter.
To nail both issues at once, perhaps
sed 's#xxx#'"$PWD"'#'
In addition to Norman Ramsey's answer, I'd like to add that you can double-quote the entire string (which may make the statement more readable and less error prone).
So if you want to search for 'foo' and replace it with the content of $BAR, you can enclose the sed command in double-quotes.
sed 's/foo/$BAR/g'
sed "s/foo/$BAR/g"
In the first, $BAR is not expanded because single quotes suppress variable expansion; in the second, the shell expands $BAR before sed sees the command.
Another easy alternative:
Since $PWD will usually contain a slash /, use | instead of / for the sed statement:
sed -e "s|xxx|$PWD|"
You can use other characters besides "/" in substitution:
sed "s#$1#$2#g" -i FILE
1. The bad way: change the delimiter
sed 's/xxx/'"$PWD"'/'
sed 's:xxx:'"$PWD"':'
sed 's#xxx#'"$PWD"'#'
These may not be the final answer:
you cannot know which characters will occur in $PWD (/, : or #).
If the delimiter character occurs in $PWD, it breaks the expression.
The better way is to escape the special characters in $PWD.
2. The good way: escape the delimiter
For example:
replace URL with the value of $url (which contains : and /)
x.com:80/aa/bb/aa.js
in the string $tmp
URL
A. Use / as the delimiter
Escape / as \/ in the variable (before using it in the sed expression).
## step 1: try escape
echo ${url//\//\\/}
x.com:80\/aa\/bb\/aa.js #escape fine
echo ${url//\//\/}
x.com:80/aa/bb/aa.js #escape not success
echo "${url//\//\/}"
x.com:80\/aa\/bb\/aa.js #escape fine, notice `"`
## step 2: do sed
echo $tmp | sed "s/URL/${url//\//\\/}/"
x.com:80/aa/bb/aa.js
echo $tmp | sed "s/URL/${url//\//\/}/"
x.com:80/aa/bb/aa.js
OR
B. use : as delimiter (more readable than /)
escape : as \: in var (before use in sed expression)
## step 1: try escape
echo ${url//:/\:}
x.com:80/aa/bb/aa.js #escape not success
echo "${url//:/\:}"
x.com\:80/aa/bb/aa.js #escape fine, notice `"`
## step 2: do sed
echo $tmp | sed "s:URL:${url//:/\:}:g"
x.com:80/aa/bb/aa.js
With your question edit, I see your problem. Let's say the current directory is /home/yourname ... in this case, your command below:
sed 's/xxx/'$PWD'/'
will be expanded to
sed s/xxx//home/yourname/
which is not valid, because the slashes in the path are taken as delimiters. You need to put a \ character in front of each / in your $PWD if you want to do this.
Actually, the simplest thing is to use a different separator for the sed substitution (s) command. Instead of s/pattern/'$mypath'/ being expanded to s/pattern//my/path/, which will of course confuse the s command, use s!pattern!'$mypath'!, which will be expanded to s!pattern!/my/path!. I've used the bang (!) character here, but any character that does not appear in the pattern or the replacement will do; the forward slash is only the conventional choice.
Dealing with VARIABLES within sed
[root#gislab00207 ldom]# echo domainname: None > /tmp/1.txt
[root#gislab00207 ldom]# cat /tmp/1.txt
domainname: None
[root#gislab00207 ldom]# echo ${DOMAIN_NAME}
dcsw-79-98vm.us.oracle.com
[root#gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: ${DOMAIN_NAME}/g'
--- Below is the result: the variable is not expanded inside single quotes.
domainname: ${DOMAIN_NAME}
--- You need to close the single quotes around the variable so the shell can expand it, like this ...
[root#gislab00207 ldom]# cat /tmp/1.txt | sed -e 's/domainname: None/domainname: '${DOMAIN_NAME}'/g'
--- The right result is below
domainname: dcsw-79-98vm.us.oracle.com
VAR=8675309
echo 'abcde:jhdfj$jhbsfiy/.hghi$jh:12345:dgve::' |
sed 's/:[0-9]*:/:'"$VAR"':/1'
where VAR contains the value you want to put in the field. The echo string is single-quoted so the shell does not expand $jhbsfiy and $jh.
I had a similar problem: I had a list and had to build a SQL script based on a template (which contained #INPUT# as the element to replace):
for i in LIST
do
awk "sub(/\#INPUT\#/,\"${i}\");" template.sql >> output
done
If your replacement string may contain other sed control characters, then a two-step substitution (first escaping the replacement string) may be what you want:
PWD='/a\1&b$_' # these are problematic for sed
PWD_ESC=$(printf '%s\n' "$PWD" | sed -e 's/[\/&]/\\&/g')
echo 'xxx' | sed "s/xxx/$PWD_ESC/" # now this works as expected
For me, replacing some text in a file with the value of an environment variable works only with this quoting:
sed -i 's/original_value/'"$MY_ENVIRONMENT_VARIABLE"'/g' myfile.txt
But when the value of MY_ENVIRONMENT_VARIABLE contains a URL (e.g. https://andreas.gr), the command above does not work, because the slashes in the value clash with the delimiter.
In that case, use a different delimiter:
sed -i "s|original_value|$MY_ENVIRONMENT_VARIABLE|g" myfile.txt
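Changing the delimiter still breaks if the value happens to contain the new delimiter or an &. A minimal sketch that combines both ideas from this thread by escaping the value first and then substituting; replace_literal is a hypothetical helper name, and it assumes the search text contains no slashes or regex metacharacters and the value contains no newline:

```shell
# Hypothetical helper: replace a fixed search string with a literal value.
#   $1 = search text (assumed free of '/' and regex metacharacters)
#   $2 = replacement value, used literally
replace_literal() {
    # Put a backslash in front of every backslash, slash and ampersand,
    # the characters that are special in the replacement part of s/.../.../.
    # '|' is the delimiter here so that '/' can sit unescaped in the bracket.
    esc=$(printf '%s\n' "$2" | sed -e 's|[\\/&]|\\&|g')
    sed "s/$1/$esc/g"
}

printf 'site=original_value\n' | replace_literal original_value 'https://andreas.gr/a&b|c'
# prints site=https://andreas.gr/a&b|c
```

The function reads standard input, so it can be used on a file with replace_literal original_value "$MY_ENVIRONMENT_VARIABLE" < myfile.txt.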