Add spaces around brackets in multiple files in Sublime Text 3 - search

I'm wondering if it's possible to do a search and replace in Sublime Text 3 to add spaces around brackets in multiple files.
For example, I need to convert this in multiple files:
$var = function($par1, $par2);
to:
$var = function( $par1, $par2 );
Any ideas? If so, how can I do it?
Thanks in advance

After some fiddling around, I managed to come up with a regex to add spaces around brackets:
Find: \((.*?)\)
Replace: ( $1 )
UPDATE: The following regex is more precise because it matches ($var) but not ( $var ) nor ():
Find: \((?!\s)([^()]+)(?<!\s)\)
Replace: ( $1 )
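To check what the more precise pattern does before running it across many files, here is a minimal sketch in Perl. It assumes Perl's regex engine treats these lookarounds the same way Sublime Text's does; the helper name pad_parens is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the contents of unpadded, non-nested parentheses.
sub pad_parens {
    my ($text) = @_;
    # (?!\s)   the first character after "(" must not be whitespace
    # [^()]+   one or more characters that are not parentheses
    # (?<!\s)  the last character before ")" must not be whitespace
    $text =~ s/\((?!\s)([^()]+)(?<!\s)\)/( $1 )/g;
    return $text;
}

print pad_parens('$var = function($par1, $par2);'), "\n";    # gets padded
print pad_parens('$var = function( $par1, $par2 );'), "\n";  # left alone
print pad_parens('$var = function();'), "\n";                # left alone
```

Because [^()]+ excludes parentheses, only the innermost pair of a nested call is padded in one pass.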

Related

Perl: Replacing newline characters inside fish-brackets, while leaving other newline characters untouched?

I finally know how to use regular expressions to replace one substring with another every place where it occurs within a string. But what I need to do now is a bit more complicated than that.
A string I must transform will have many instances of the newline character ('\n'). If a newline character is enclosed within fish-tags (between '<' and '>'), I need to replace it with a simple whitespace character (' ').
However, if a newline character occurs anywhere else in the string, I need to leave that newline character alone.
There will be several places in the string that are enclosed in fish-tags, and several places that aren't.
Is there a way to do this in PERL?
I honestly don't recommend doing this with regular expressions. Besides the fact that you should never parse html with a regular expression, it's also a pain to do negative matches with regular expressions and anyone reading the code will honestly have no idea what you just did. Doing it manually on the other hand is really easy to understand.
This code assumes well formed html that doesn't have tags starting inside the definition of other tags (otherwise you would have to track all the instances and increment/decrement a count appropriately) and it does not handle < or > inside quoted strings which isn't the most common thing. And if you're doing all that I really recommend you use a real html parser, there are many of them.
Obviously, if you're not reading this from a filehandle, the loop would instead go over an array of lines. If you get those lines by splitting the whole text, the split removes the newlines, so you would append ' ' or "\n" depending on the $inside variable.
use strict;
use warnings;
# Default to being outside a tag
my $inside = 0;
while (my $line = <DATA>) {
    # Find the last < and > in the string
    my ($open, $close) = map { rindex($line, $_) } qw(< >);
    # Update our state accordingly.
    if ($open > $close) {
        $inside = 1;
    } elsif ($open < $close) {
        $inside = 0;
    }
    # If we're inside a tag change the newline (last character in the line) with a space. If you instead want to remove it you can use the built-in chomp.
    if ($inside) {
        # chomp($line);
        substr($line, -1) = ' ';
    }
    print $line;
}
__DATA__
This is some text
and some more
<enclosed><a
b
c
> <d
e
f
>
<g h i
>
Given:
$ echo "$txt"
Line 1
Line 2
< fish tag line 1
and line 2 >
< line 3 >
< fish tag line 4
and line 5 >
You can do:
$ echo "$txt" | perl -0777 -lpe 's/(<[^\n>]*)\n+([^>]*>)/$1 $2/g'
Line 1
Line 2
< fish tag line 1 and line 2 >
< line 3 >
< fish tag line 4 and line 5 >
I will echo that this only works in limited cases. Please do not get in the general habit of using a regex for HTML.
This solution uses zdim's data (thanks, zdim)
I prefer to use an executable replacement together with the non-destructive option of the tr/// operator
This solution finds all occurrences of strings enclosed in angle brackets <...> and alters all newlines within each one to single spaces
Note that it would be simple to allow for quoted substrings containing any characters by writing this instead
$data =~ s{ ( < (?: "[^"]+" | [^>] )+ > ) }{ $1 =~ tr/\n/ /r }gex;
use strict;
use warnings 'all';
use v5.14; # For /r option
my $data = do {
local $/;
<DATA>;
};
$data =~ s{ ( < [^<>]+ > ) }{ $1 =~ tr/\n/ /r }gex;
print $data;
__DATA__
start < inside tags> no new line
again <inside, with one nl
> out
more <inside, with two NLs
and more text
>
output
start < inside tags> no new line
again <inside, with one nl > out
more <inside, with two NLs and more text >
(X)HTML/XML shouldn't be parsed with regex. But since no description of the problem is given, here is one way to go about it. Hopefully it demonstrates how tricky and involved this can get.
You can match the newline itself, together with the details of how newlines may appear in the text.
use warnings;
use strict;
my $text = do { # read all text into one string
local $/;
<DATA>;
};
1 while $text =~ s/< ([^>]*) \n ([^>]*) >/<$1 $2>/gx;
print $text;
__DATA__
start < inside tags> no new line
again <inside, with one nl
> out
more <inside, with two NLs
and more text
>
This prints
start < inside tags> no new line
again <inside, with one nl > out
more <inside, with two NLs and more text >
The negated character class [^>] matches anything other than >, optionally and any number of times with *, up to an \n. Then another such pattern follows \n, up to the closing >. The /x modifier allows spaces inside, for readability. We also need to consider two particular cases.
There may be multiple \n inside <...>, for which the while loop is a clean solution.
There may be multiple <...> with \n, which is what /g is for.
The 1 while ... idiom is another way to write while (...) { }, where the body of the loop is empty so everything happens in the condition, which is repeatedly evaluated until false. In our case the substitution keeps being done in the condition until there is no match, when the loop exits.
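To see why a single pass with /g is not enough when one tag holds several newlines, here is a small sketch comparing the two. The sub names and the sample string are made up for illustration; the pattern is the one from the answer.

```perl
use strict;
use warnings;

# Single pass: /g moves on past each match, so every tag is handled once
# and only one newline per tag is replaced.
sub join_once {
    my ($text) = @_;
    $text =~ s/< ([^>]*) \n ([^>]*) >/<$1 $2>/gx;
    return $text;
}

# Repeated passes: rerun the substitution until nothing matches any more.
sub join_all {
    my ($text) = @_;
    1 while $text =~ s/< ([^>]*) \n ([^>]*) >/<$1 $2>/gx;
    return $text;
}

my $sample = "more <two\nnewlines\n> out\n";   # hypothetical input
print join_once($sample);   # one newline is still inside the tag
print join_all($sample);    # all newlines inside the tag are gone
```

The first [^>]* is greedy, so each pass replaces the last newline in a tag, and the loop works backwards through the rest.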
Thanks to ysth for bringing up these points and for the 1 while ... solution.
All of this necessary care for various details and edge cases (of which there may be more) hopefully convinces you that it is better to reach for an HTML parsing module suitable for the particular task. For this we'd need to know more about the problem.

Changing the names of multiple files using 'rename'

I've about 100 files, I'd like to change the letter P to the letter B in all file names. But when using the rename command:
rename 's/\P/\Bi/' *.txt
I get the following message
Empty \P{} in regex; marked by <-- HERE in m/\P <-- HERE / at (eval 1) line 1.
Help Please
Thanks
Two mistakes. First you don't need to escape P and B since they are not special characters. Second if that i is meant for case insensitivity then that is a flag and needs to be at the end.
rename 's/P/B/i' *.txt
This will change the first occurrence of P or p to B. If you want to change all occurrences then use g flag which means global like this:
rename 's/P/B/ig' *.txt
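The difference between the two flag sets can be tried on plain strings before touching any files. This is only a sketch: the file name is invented, and it assumes the Perl-based rename, which applies the expression to each file name in turn.

```perl
use strict;
use warnings;

# Replace only the first P or p, as s/P/B/i does.
sub rename_first {
    my ($name) = @_;
    $name =~ s/P/B/i;
    return $name;
}

# Replace every P or p, as s/P/B/ig does.
sub rename_all {
    my ($name) = @_;
    $name =~ s/P/B/ig;
    return $name;
}

my $name = 'PaPer_p1.txt';          # hypothetical file name
print rename_first($name), "\n";    # BaPer_p1.txt
print rename_all($name), "\n";      # BaBer_B1.txt
```

The Perl-based rename usually accepts -n to print what it would do without renaming anything, which is worth checking on your system first.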
Update based on new requirement:
For anything complex, you are better off writing your own Perl script. Here is a quick example using the File::Copy core module.
use strict;
use warnings;
use File::Copy;
my $dir = '/path/to/your/files/'; # path to your files (keep the trailing slash)
opendir my ($dh), $dir or die "Cannot open $dir: $!";
my @files = grep { /\.txt$/ } readdir $dh; # create a list of all .txt files
closedir $dh;
for (@files) {
    my $cnt;
    my $from = $_;
    chomp $from;
    # Replace only the second and third P/p; keep every other match as is
    (my $to = $from) =~ s/(P)/++$cnt>=2 && $cnt<=3 ? "B" : $1/gie;
    #print "from:$from to:$to\n";
    move("$dir$from", "$dir$to") if ($to ne $from);
}
We create a temporary variable $cnt and use the following condition to check whether the matched character is the second or third occurrence. If it is, we replace it with B; otherwise we keep it as is.
++$cnt>=2 && $cnt<=3 ? "B" : $1
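To watch the counter condition at work without moving any files, it can be wrapped in a small sub and tried on strings. The sub name and the sample names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the 2nd and 3rd P/p of a name with B.
sub replace_second_and_third {
    my ($name) = @_;
    my $cnt = 0;
    # /e evaluates the replacement as code, so $cnt counts the matches;
    # $1 puts the original letter back for every other occurrence.
    (my $to = $name) =~ s/(P)/++$cnt >= 2 && $cnt <= 3 ? "B" : $1/gie;
    return $to;
}

print replace_second_and_third('PPPP.txt'), "\n";    # PBBP.txt
print replace_second_and_third('apple.txt'), "\n";   # apBle.txt
```

Note that the replacement is always an uppercase B, while the matches that are kept retain their original case through $1.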
Don't escape the letters, and put regexp modifiers after the closing slash:
rename 's/P/B/ig' *.txt
For more information about regexp see perlre

Best way to extract string between underscores using Perl

I have something like:
$string = '/mfsi_rpt/files/mfsi/reports/bval/bval_parlcont_pck_m_20130430.pdf';
I would like to extract the parlcont from the string (the word between the 2nd and 3rd underscore).
What is the best way to achieve this using Perl?
You can match this with a regular expression by combining greedy and non-greedy matches, and using capturing parentheses to extract the part you're interested in:
if( $string =~ m:.+/.*?_(.+?)_:) {
print "$1\n";
}
The ".+/" is a greedy match, which will gobble up everything up to the last / to get past the directory components.
Then the ".*?_" is non-greedy, so it will take everything up to the first _
Then "(.+?)_" is another non-greedy to match and capture everything up to the next _
A cleaner approach is to first extract the filename from the path using File::Basename; then you can use split to pick out the desired name.
use strict;
use warnings;
use File::Basename;
my $string = "/mfsi_rpt/files/mfsi/reports/bval/bval_parlcont_pck_m_20130430.pdf";
my $data = ( split( /_/, basename($string) ))[1];
print "$data\n";
Output:
parlcont

How to split the string and retrieve last two columns in perl

I want to retrieve the last two components of a path string.
For example:
$path = C:\Documents and Settings\ac62599\AC62599_SBI_Release_2012.12.1_int\vob\SBI_src
$path = C:\views\ac62599\AC62599_view\vob\aims
output should be
\vob\SBI_src
\vob\aims
Thanks in advance.
Use split to split the paths into directories. You can use a slice to get the last two, then use join to concatenate them back:
for my $path ('C:\Documents and Settings\ac62599\AC62599_SBI_Release_2012.12.1_int\vob\SBI_src',
'C:\views\ac62599\AC62599_view\vob\aims') {
print '\\', join('\\', (split/\\/, $path)[-2, -1]), "\n";
}
A regex seems to be the simplest solution
my ($dir) = $path =~ /((?:\\[^\\]+){2})$/;
Which is to say, look for backslash, followed by one or more non-backslash characters, and look for this sequence twice at the end of the string and capture it.
Note the use of parentheses around the variable is required to give the regex list context.
Output for the sample paths:
\vob\SBI_src
\vob\aims
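The point about list context is easy to miss, so here is a small sketch that runs the same match both ways. The helper name last_two_dirs is hypothetical; the pattern and the sample path come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last two backslash-separated components,
# or undef when the path does not end in two such components.
sub last_two_dirs {
    my ($path) = @_;
    # The parentheses around $dir force list context, so the match
    # returns the captured text instead of a true/false value.
    my ($dir) = $path =~ /((?:\\[^\\]+){2})$/;
    return $dir;
}

my $path = 'C:\views\ac62599\AC62599_view\vob\aims';
print last_two_dirs($path), "\n";      # \vob\aims

# Without the parentheses the match is in scalar context and only
# reports whether it succeeded.
my $matched = $path =~ /((?:\\[^\\]+){2})$/;
print "matched: $matched\n";           # matched: 1
```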
$string=~m/.*(\\[^\\]*\\[^\\]*)/g;print $1

Conditional Splitting in Perl

I have the following sentences:
my $sent = 'D. discoideum and D. purpureum developmental programs revealed';
Is there a way I can split the lines so that two consecutive words that have '.' (dot) in between will be treated as one word?
Hence we hope to get this after splitting:
$VAR = ['D. discoideum',
'and',
'D. purpureum',
'developmental',
'programs',
'revealed'];
The standard split /\s+/ will split everything based on space.
Try splitting on:
/(?<!\.)\s+/
This expression matches any space character that does not follow a period, without matching the period itself.
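Applied to the sentence from the question, the lookbehind split can be sketched like this. The sub name is made up for illustration, and the sketch assumes words are separated by single spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace unless it directly follows a period.
sub split_keeping_abbreviations {
    my ($sent) = @_;
    # (?<!\.) is a negative lookbehind: the character just before the
    # whitespace must not be a literal dot.
    return split /(?<!\.)\s+/, $sent;
}

my $sent  = 'D. discoideum and D. purpureum developmental programs revealed';
my @words = split_keeping_abbreviations($sent);
print "$_\n" for @words;   # six items, with "D. discoideum" kept together
```

One limitation to keep in mind: a period that ends a sentence also blocks the split, so "revealed. Next" would stay together as one item.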
Without a split using a regex:
my @words = $sent =~ /(\S+\.\s+\S+|\S+)/g;
