Does anyone know how to write a macro in Excel that converts strings?
I want to convert every string in one column according to this transformation:
initial       result of macro
first_name    $firstName
last_name     $lastName
email         $email
email2        $email2
I would like to uppercase the character after each underscore, if it is a letter.
Thanks a lot !
A Notepad++ solution:
Ctrl+H
Find what: (?:^([^\W_]+)|\G(?!^))(?:_([^\W_]*))?
Replace with: (?1\$$1)\u$2
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(?: # non capture group
^ # beginning of line
( # group 1
[^\W_]+ # 1 or more characters that are neither non-word characters nor underscores (i.e. letters or digits)
) # end group 1
| # OR
\G(?!^) # restart from last match position, not at the beginning of line
) # end group
(?: # non capture group
_ # underscore
( # group 2
[^\W_]* # 0 or more characters that are neither non-word characters nor underscores
) # end group 2
)? # end group, optional
Replacement:
(?1 # if group 1 exists
\$$1 # add a dollar before the group 1
) # endif
\u$2 # uppercase the first letter of group 2
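For anyone who would rather script this than run it in Notepad++, the same transformation can be sketched in Python. This is only a minimal sketch, not the method from the answer: the function name to_dollar_camel is hypothetical, and it assumes each cell holds a single identifier such as first_name.

```python
import re


def to_dollar_camel(name: str) -> str:
    """Turn snake_case into $camelCase (hypothetical helper, for illustration)."""
    # Drop each underscore and uppercase the letter or digit that follows it,
    # if there is one. [^\W_] is the same class used in the Notepad++ pattern:
    # a word character that is not an underscore.
    camel = re.sub(r"_([^\W_]?)", lambda m: m.group(1).upper(), name)
    # Prefix the result with a literal dollar sign.
    return "$" + camel


for original in ["first_name", "last_name", "email", "email2"]:
    print(original, "->", to_dollar_camel(original))
```

Uppercasing a digit leaves it unchanged, so names like email2 pass through untouched apart from the dollar prefix.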
Is there a way to remove lines that contain three specific characters?
For example the characters should be U S E
So for these lines, it should remove just USER and UDES:
USER
USAD
UDES
Thanks
Ctrl+H
Find what: ^(?=.*U)(?=.*S)(?=.*E).+\R?
Replace with: LEAVE EMPTY
TICK Match case
TICK Wrap around
SELECT Regular expression
UNTICK . matches newline
Replace all
Explanation:
^ # beginning of line
(?= # positive lookahead, make sure we have after:
.* # 0 or more any character but newline
U # letter uppercase U
) # end lookahead
(?=.*S) # same as above for letter S
(?=.*E) # same as above for letter E
.+ # 1 or more any character but newline
\R? # any kind of linebreak, optional
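The same filter can be tried outside Notepad++. Below is a minimal Python sketch under two assumptions: the text uses \n line endings (Python's re module has no \R, so \n? stands in for it), and the helper name drop_lines_with_use is made up for illustration.

```python
import re

# Same lookaheads as in the Notepad++ pattern. re.MULTILINE lets ^ match at
# the start of every line. Assumption: lines end with \n, since Python's re
# module does not support \R.
USE_LINE = re.compile(r"^(?=.*U)(?=.*S)(?=.*E).+\n?", re.MULTILINE)


def drop_lines_with_use(text: str) -> str:
    """Remove every line containing U, S and E (case-sensitive, any order)."""
    return USE_LINE.sub("", text)


print(drop_lines_with_use("USER\nUSAD\nUDES\n"))
```

Because . does not match a newline by default, each lookahead only inspects the current line, which mirrors leaving ". matches newline" unticked.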
line A
foo bar  bar foo  bar foo
line B
foo bar bar  foo
In line A, there are multiple occurrences of a double space.
I only want to match lines like line B, which contains exactly one double space.
I tried
^.*\s{2}.*$
but it will match both.
How may I have the desired output? Thank you.
If you wish to match strings that contain no more than one run of two or more spaces between words, you could use the following regular expression.
r'^(?!(?:.*(?<! ) {2,}(?! )){2})'
Note that this expression matches
abc    de fgh
where there are four spaces between 'c' and 'd'.
Python's regex engine performs the following operations.
^ : beginning of string
(?! : begin negative lookahead
(?: : begin non-capture group
.* : match 0+ characters other than line terminators
(?<! ) : negative lookbehind asserts match is not preceded by a space
[ ]{2,} : match 2+ spaces
(?! ) : negative lookahead asserts match is not followed by a space
) : end non-capture group
{2} : execute non-capture group twice
) : end negative lookahead
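To see the pattern in action, here is a small Python sketch. The pattern is the one given above; the sample lines are made up for illustration, and their runs of spaces are built explicitly so they stay visible.

```python
import re

# Pattern from the answer: reject a line only if it contains two separate
# runs of 2+ spaces, each bounded by non-space characters.
AT_MOST_ONE_GAP = re.compile(r"^(?!(?:.*(?<! ) {2,}(?! )){2})")

samples = [
    "foo bar" + " " * 2 + "bar foo" + " " * 2 + "bar foo",  # two runs
    "foo bar bar" + " " * 2 + "foo",                        # one run of 2
    "abc" + " " * 4 + "de fgh",                             # one run of 4
    "foo bar",                                              # no run at all
]

kept = [line for line in samples if AT_MOST_ONE_GAP.search(line)]
for line in kept:
    print(repr(line))
```

Only the first sample is rejected. A line with no double space is kept too, since the pattern asks for "no more than one" run rather than "exactly one".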
You can do:
^(?!.*[ \t]{2,}.*[ \t]{2,})
# Negative look ahead assertion that states 'only start the match
# on this line IF there are NOT 2 (or potentially more) breaks with
# two (or potentially more) of tabs or spaces'.
If you want to require ONE double space in the line but not more:
^(?=.*[ \t]{2,})(?!.*[ \t]{2,}.*[ \t]{2,})
# Positive look ahead that states 'only start this match if there is
# at least one break with two tabs or spaces'
# BUT
# Negative look ahead assertion that states 'only start the match
# on this line IF there are NOT 2 (or potentially more) breaks with
# two (or potentially more) of tabs or spaces'.
If you want to limit to only two spaces (not tabs and not more than 2 spaces):
^(?=.*[ ]{2})(?!.*[ ]{2}.*[ ]{2})
# Same as above but remove the tabs as part of the assertion
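It may be worth checking how these lookahead variants treat a single long run of spaces. The Python sketch below uses made-up sample lines; the dictionary labels and variable names are only for illustration.

```python
import re

# Patterns copied from the variants above.
EXACTLY_ONE = re.compile(r"^(?=.*[ \t]{2,})(?!.*[ \t]{2,}.*[ \t]{2,})")
ONLY_TWO = re.compile(r"^(?=.*[ ]{2})(?!.*[ ]{2}.*[ ]{2})")

# Made-up sample lines; runs of spaces are built explicitly so they stay visible.
samples = {
    "one run of 2": "foo bar bar" + " " * 2 + "foo",
    "two runs of 2": "foo bar" + " " * 2 + "bar foo" + " " * 2 + "bar foo",
    "one run of 3": "foo" + " " * 3 + "bar",
    "one run of 4": "abc" + " " * 4 + "de fgh",
}

# For each sample: (matched by EXACTLY_ONE, matched by ONLY_TWO)
results = {
    label: (bool(EXACTLY_ONE.search(line)), bool(ONLY_TWO.search(line)))
    for label, line in samples.items()
}
for label, flags in results.items():
    print(label, flags)
```

With these samples, a single run of three spaces is accepted by both patterns, while a single run of four is rejected by both, because four spaces can be read as two adjacent pairs. If that distinction matters, guarding the run with (?<! ) and (?! ), as in the earlier pattern, makes the engine treat each run as one unit.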
Note: In your regex you use \s as the class for a space. That class also matches [\r\n\t\f\v ], so it covers both horizontal and vertical whitespace characters.
Note 2:
You can do this without a regex as well (assuming you only want lines that have 1 and only 1 double space in them):
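>>> txt = '''\
... line A
... foo bar  bar foo  bar foo
... line B
... foo bar bar  foo'''
>>> # Splitting on a double space yields exactly 2 pieces when it occurs once.
>>> [line for line in txt.splitlines() if len(line.split('  ')) == 2]
['foo bar bar  foo']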
You can get the match without lookarounds by starting the match with 1+ non whitespace chars.
Then optionally repeat a single whitespace char followed by non whitespace chars before and after matching a double whitespace char.
The negated character class [^\S\r\n] will match any whitespace chars except a newline or carriage return. If you want to allow matching newlines as well, you could use \s
^\S+(?:[^\S\r\n]\S+)*[^\S\r\n]{2}(?:\S+[^\S\r\n])*\S+$
Explanation
^ Start of string
\S+ Match 1+ non whitespace chars
(?: Non capture group
[^\S\r\n]\S+ Match a whitespace char without a newline, followed by 1+ non whitespace chars
)* Close group and repeat 0+ times
[^\S\r\n]{2} Match the 2 whitespace chars without a newline
(?: Non capture group
\S+[^\S\r\n] Match 1+ non whitespace chars followed by a whitespace char without a newline
)* Close group and repeat 0+ times
\S+ Match 1+ non whitespace chars
$ End of string
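A quick way to try this pattern is a short Python sketch like the one below. The sample lines are made up, and GAP is just a hypothetical name for a double space so that it stays visible in the source.

```python
import re

# Pattern from the answer: words separated by single whitespace chars, with
# exactly one place where two whitespace chars appear in a row.
ONE_DOUBLE_SPACE = re.compile(
    r"^\S+(?:[^\S\r\n]\S+)*[^\S\r\n]{2}(?:\S+[^\S\r\n])*\S+$"
)

GAP = " " * 2  # a double space, spelled out explicitly
lines = [
    "foo bar" + GAP + "bar foo" + GAP + "bar foo",  # two double spaces
    "foo bar bar" + GAP + "foo",                    # exactly one double space
    "foo bar bar foo",                              # no double space
]

matched = [line for line in lines if ONE_DOUBLE_SPACE.search(line)]
print(matched)
```

Only the second line matches here. Since the pattern requires exactly two whitespace characters at the gap, a run of three or more would not match either.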
I need to remove both duplicates like:
admin
user
admin
result:
user
I have tried, but nothing works in Notepad++.
You have to sort your file before applying this (for example, using the TextFX plugin).
Ctrl+H
Find what: ^(.+)(?:\R\1)+
Replace with: NOTHING
check Wrap around
check Regular expression
DO NOT CHECK . matches newline
Replace all
Explanation:
^ : Beginning of line
(.+) : group 1, 1 or more any character but newline
(?: : start non capture group
\R : any kind of linebreak
\1 : content of group 1
)+ : end group, must appear 1 or more times
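If the file can be processed outside Notepad++, a counting approach avoids the need to sort first. This is a different technique from the regex above, shown only as a minimal Python sketch; the function name keep_unique_lines is hypothetical.

```python
from collections import Counter


def keep_unique_lines(lines: list[str]) -> list[str]:
    """Keep only lines that occur exactly once, preserving the original order."""
    # Count how many times each line appears in the input.
    counts = Counter(lines)
    # Drop every copy of any line seen more than once.
    return [line for line in lines if counts[line] == 1]


print(keep_unique_lines(["admin", "user", "admin"]))
```

Unlike the sorted-file regex, this keeps the surviving lines in their original order.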
I have a text file which contains lines like this:
137.147.138.224|write|write|Australia
137.154.4.3|United States
And I want to find
137.154.4.3|United States
Anything may appear in place of 137.154.4.3|United States, such as 155.186.7.9|India or 185.173.4.7|Japan. I have a long list of entries like that, and I just want to find the ones that contain only one vertical bar |.
To find the lines which have an IP, a | and then a country, you can use this regex pattern:
\d+\.\d+\.\d+\.\d+\|[^|]+$
\d+\.\d+\.\d+\.\d+ # digits (1 or more) and dots
\| # a literal '|'
[^|]+ # 1 or more characters that are not '|'
$ # end of line
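As a quick check of the pattern, here is a minimal Python sketch using the sample lines from the question. The variable names are made up, and the plain count("|") filter is an alternative that ignores the IP format entirely.

```python
import re

# Pattern from the answer: an IP-like prefix, one literal |, then no further |.
IP_COUNTRY = re.compile(r"\d+\.\d+\.\d+\.\d+\|[^|]+$")

lines = [
    "137.147.138.224|write|write|Australia",
    "137.154.4.3|United States",
    "155.186.7.9|India",
]

# Lines matched by the regex.
by_regex = [line for line in lines if IP_COUNTRY.search(line)]
# Alternative without a regex: keep lines with exactly one vertical bar.
by_count = [line for line in lines if line.count("|") == 1]

print(by_regex)
print(by_count)
```

For these samples both filters keep the same two lines; they would differ on a line that has one | but no IP in front of it.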
E.g. I've got a text file like this:
abc,#32432,he#llo
xyz,#989,wor#ld
I wish to change it to be:
abc,32432,he#llo
xyz,989,wor#ld
Just remove the first "#" from each row. How can I do that?
Thanks.
To change only the first # in a string, you could use a regular expression like this:
"abc,#32432,he#llo" -replace '(.*?)#(.*$)', '${1}${2}'
Explanation from RegexBuddy:
# (.*?)#(.*$)
#
# Options: Case insensitive; Exact spacing; Dot doesn't match line breaks; ^$ match at line breaks; Parentheses capture
#
# Match the regex below and capture its match into backreference number 1 «(.*?)»
# Match any single character that is NOT a line break character (line feed) «.*?»
# Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match the character “#” literally «#»
# Match the regex below and capture its match into backreference number 2 «(.*$)»
# Match any single character that is NOT a line break character (line feed) «.*»
# Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Assert position at the end of a line (at the end of the string or before a line break character) (line feed) «$»
You can use a simple regex that replaces a hash following a run of non-hash characters at the start of the line with nothing, something like this:
Get-Content input_file.txt | % {$_ -replace '(?<=^[^#]+)#'} | Set-Content output_file.txt
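For comparison, the same "remove only the first #" idea can be sketched in Python without a regex. This is not a PowerShell answer, just an illustration; the function name remove_first_hash is hypothetical.

```python
def remove_first_hash(line: str) -> str:
    """Remove only the first '#' in the line, leaving later ones intact."""
    # The third argument limits str.replace to a single replacement.
    return line.replace("#", "", 1)


rows = ["abc,#32432,he#llo", "xyz,#989,wor#ld"]
for row in rows:
    print(remove_first_hash(row))
```

A line without any # is returned unchanged, so the helper can be applied to every row of the file.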