I have the following string:
mystr = '(string_to_delete_20221012_11-36) keep this (string_to_delete_20221016_22-22) keep this (string_to_delete_20221017_20-55) keep this'
I wish to delete all the entries (string_to_deletexxxxxxxxxxxxxxx) (including the trailing space)
I sort of need pseudo code as follows:
If you find a string (string_to_delete then replace that string and the timestamp, closing parenthesis and trailing space with null e.g. delete the string (string_to_delete_20221012_11-36)
I would use a list comprehension but given that not all strings are contained inside parenthesis I cannot see what I could use to create the list via a string.split().
Is this something that needs regular expressions?
It seemed like a good place to use a regex:
import re
pattern = r'\(string_to_delete_.*?\)\s*'
mystr = '(string_to_delete_20221012_11-36) keep this (string_to_delete_20221016_22-22) keep this (string_to_delete_20221017_20-55) keep this'
for match in re.findall(pattern, mystr):
    mystr = mystr.replace(match, '', 1)  # replace the first occurrence of the matched string with an empty string
print(mystr)
results with:
>> keep this keep this keep this
brief regex breakdown: \(string_to_delete_.*?\)\s*
\( look for left parenthesis - escape needed
match string string_to_delete_
.*? match zero or more characters, as few as possible (non-greedy)
\) match closing parenthesis
\s* include zero or more whitespaces after that
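The loop above can probably be collapsed into a single re.sub call, which replaces every match in one pass. A minimal sketch, assuming the same pattern and sample string as above (the variable name cleaned is only for illustration):

```python
import re

pattern = r'\(string_to_delete_.*?\)\s*'
mystr = '(string_to_delete_20221012_11-36) keep this (string_to_delete_20221016_22-22) keep this (string_to_delete_20221017_20-55) keep this'

# re.sub replaces every non-overlapping match of the pattern with ''
cleaned = re.sub(pattern, '', mystr)
print(cleaned)
```

This avoids re-scanning the string once per match, which is what the findall/replace loop does.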
I'm trying to clean and format some set of data obtained from an accounting system and I have been able to create VBA code to use TRIM or CLEAN functions in the specific column ranges.
The thing is that I need to keep the blank spaces within the strings (there can be 2, 3 or more blanks) but still remove the leading/trailing spaces, and the functions mentioned reduce the inner spaces to 1. This does not work for me, as the data is used as a key to match other information in further steps of the process. Bear in mind that leading/trailing blanks can come from the space bar, from any other character that appears as a blank, or even from line breaks, so again, I want all of these removed except the inner blanks. Strings can be made of alphanumeric characters.
I'm using this in a Private Sub (the code is executed by clicking a button placed in the worksheet).
Dim rng1a As Range
Dim Area1a As Range
Set rng1a = Range("F2:F35001")
For Each Area1a In rng1a.Areas
Area1a.NumberFormat = "#"
Area1a.Value = Evaluate("IF(ROW(" & Area1a.Address & "),CLEAN(TRIM(" & Area1a.Address & ")))")
Next Area1a
Example (in range F2:F35001):
Original: Sample Text for Review. *(there are blanks after the string)
Result:Sample Text for Review.
Desired:Sample Text for Review.
I did some research for a couple of weeks, trying as much as possible to avoid posting a duplicate question in the forum, and haven't been able to find a solution that keeps the inner blanks "as is". Thanks in advance for the help.
You can do this with regular expressions:
Option Explicit
Function trimWhiteSpace(s As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.MultiLine = True
.Pattern = "^\s*(\S.*\S)\s*"
trimWhiteSpace = .Replace(s, "$1")
End With
End Function
Explanation of the Regex
Trim leading and trailing white space
^\s*(\S.*\S)\s*
Options: Case sensitive; ^$ match at line breaks
Assert position at the beginning of a line (at beginning of the string or after a line break character) (line feed, line feed, line separator, paragraph separator) ^
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) \s*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the regex below and capture its match into backreference number 1 (\S.*\S)
Match a single character that is NOT a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) \S
Match any single character that is NOT a line break character (line feed) .*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match a single character that is NOT a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) \S
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) \s*
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
$1
Insert the text that was last matched by capturing group number 1 $1
Created with RegexBuddy
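To see what the pattern does on sample input, it can be tried outside VBA. Below is a minimal sketch in Python, assuming that Python's re module treats this particular pattern the same way as VBScript's RegExp with Global and MultiLine switched on. The function name trim_white_space and the sample strings are made up for illustration. Note that \S.*\S needs at least two non-whitespace characters, so a value with a single visible character is left untouched.

```python
import re

# Same pattern as in the VBA function; re.MULTILINE mirrors .MultiLine = True
PATTERN = re.compile(r'^\s*(\S.*\S)\s*', re.MULTILINE)

def trim_white_space(s):
    # sub() replaces every match, like .Global = True
    return PATTERN.sub(r'\1', s)

sample = '  \tSample   Text  for Review. \n'
print(repr(trim_white_space(sample)))

# Only one non-whitespace character: the pattern does not match
print(repr(trim_white_space('  a  ')))
```

The inner runs of spaces survive because they sit inside the captured group, which is put back unchanged.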
On the other hand, if you want to avoid regular expressions, and if your only leading/trailing "white-space" characters are space, tab and linefeed, AND if the only "internal" white space characters are the space, you could use:
Function trimWhiteSpace(s As String) As String
    trimWhiteSpace = Trim(Replace(Replace(s, vbLf, ""), vbTab, ""))
End Function
Note that the VBA Trim function (unlike the worksheet function) only removes leading and trailing spaces, and leaves internal spaces unchanged. But this won't work if you have tabs within the string that need to be preserved.
Either of the above can be incorporated into your macro.
Have you tried using the LTRIM function to remove leading spaces then RTRIM to remove the trailing ones which will leave the internal ones intact?
From your description you don't expect TAB characters or Carriage Returns in the middle of your strings so you could just do a replace for them:
strSource = Replace(strSource, vbTab, "")
strSource = Replace(strSource, vbCrLf, " ")
I have a list Eg. a = ["dgbbgfbjhffbjjddvj/n//n//n' "]
How do I remove the trailing new lines i.e. all /n with extra single inverted comma at the end?
Expected result = ["dfgjhgjjhgfjjfgg"] (I typed it randomly)
You can use the string rstrip() method.
Usage:
str.rstrip([c])
where c is the set of characters to trim from the end; whitespace is the default when no argument is provided.
example:
a = ['Return a copy of the string\n', 'with trailing characters removed\n\n']
[i.rstrip('\n') for i in a]
result:
['Return a copy of the string', 'with trailing characters removed']
More about rstrip():
https://www.tutorialspoint.com/python3/string_rstrip.htm
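For the list in the question, here is a sketch along the same lines, assuming the trailing characters really are literal slashes, the letter n, a single quote and a space (as typed in the question) rather than actual newline characters. rstrip treats its argument as a set of characters, so this would also strip a legitimate trailing n from the data.

```python
a = ["dgbbgfbjhffbjjddvj/n//n//n' "]

# Strip any trailing run of '/', 'n', "'" and ' ' from each string
cleaned = [s.rstrip("/n' ") for s in a]
print(cleaned)
```

If the strings actually end in real newlines, rstrip('\n') as in the example above should be enough.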
Hi I'm new to Marie assembly language.
I'm trying to trim the white spaces at the end of a string. I have a print subroutine that stops printing once it reaches a 0 character so to trim the string at the ends I iterate to the end of the string, get the address of the last character and iterate backwards replacing any white spaces.
My problem is HOW to replace the white spaces: if I store the replacement in my trim string address, I can't iterate backwards correctly, I think because the next load then reads the value from address 0 instead. Any help will be appreciated.
StartRemoveSpace, LoadI TrimStringAddr //get last char that's not zero
Subt Space
Skipcond 400 //if its a space skip next line
JumpI TrimString //terminate trimming
Load CharacterReplace //replace with 0
//Replace where??
//Store TrimStringAddr
Load TrimStringAddr
Subt One // iterate backwards
Store TrimStringAddr
Jump StartRemoveSpace
The string I am given is as follows:
scrap1 =
a le h
ke fd
zyq b
ner i
You'll notice there are 2 blanks in each row, each representing a space (ASCII 32). I need to find the mean ASCII value in each column without taking into account the spaces (32). So first I would convert to numbers with double(scrap1), but then how do I find the mean without taking the spaces into account?
If it's only the ASCII 32 you want to omit:
d = double(scrap1);
result = mean(d(d~=32)); %// logical indexing to remove unwanted value, then mean
You can remove the intermediate spaces in the string with scrap1(scrap1 == ' ') = ''; This replaces any space in the input with an empty string. Then you can do the conversion to double and average the result. See here for other methods.
Probably, you can use regex to find the space and ignore it. "\s"
findSpace = regexp(scrap1, '\s', 'ignore')
% I am not sure about the 'ignore' option; this is just what comes to mind, but you can read more about regexp by typing doc regexp.
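Since the question asks for the mean of each column, here is a rough sketch of that idea in plain Python rather than MATLAB. The rows below are made-up, equal-length stand-ins for scrap1, and column_means_ignoring_spaces is a hypothetical helper name. A column made up only of spaces yields None.

```python
def column_means_ignoring_spaces(rows):
    means = []
    # zip(*rows) walks the character matrix column by column
    for column in zip(*rows):
        codes = [ord(ch) for ch in column if ch != ' ']
        means.append(sum(codes) / len(codes) if codes else None)
    return means

rows = ["a c", " bd", "e f"]  # hypothetical data, not the original scrap1
print(column_means_ignoring_spaces(rows))
```

The filtering has to happen per column, because dropping the spaces from the whole matrix at once loses track of which column each value came from.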