How can I include a list inside a string - python-3.x

I want to include a list variable inside a string. Like this:
aStr = """
blah blah blah
blah = ["one","two","three"]
blah blah
"""
I tried these:
l = ["one","two","three"]
aStr = """
blah blah blah
blah = %s
blah blah
"""%str(l)
OR
l = ["one","two","three"]
aStr = """
blah blah blah
blah = """+str(l)+"""
blah blah
"""
They didn't work unfortunately

Both of your snippets almost work. The difference is that str(l) renders the list as ['one', 'two', 'three'], with single quotes and a space after each comma. The code below gives exactly the output you wanted.
l = ["one","two","three"]
aStr = """
blah blah blah
blah = %s
blah blah
"""% ('["'+'","'.join(l)+'"]')
So what does that do?
The most interesting part is '","'.join(l). The .join() method is called on a separator string (the string before the dot) and concatenates the elements of the list, placing that separator between consecutive elements.
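As a rough illustration of the join step described above, here is a minimal sketch. The names items, joined and via_json are made up for this example, and the json.dumps variant is an alternative swapped in here, not something from the answer; it only happens to produce the same text for a list of plain strings.

```python
import json

items = ["one", "two", "three"]

# Manual approach from the answer: the separator '","' goes between the items,
# and the literal '["' and '"]' add the brackets and the outer quotes.
joined = '["' + '","'.join(items) + '"]'

# Assumed alternative: compact JSON encoding, which uses double quotes
# and, with these separators, no space after the commas.
via_json = json.dumps(items, separators=(",", ":"))

text = f"""
blah blah blah
blah = {joined}
blah blah
"""
print(text)
```

The join version only works when every element is already a string; the json.dumps version also handles numbers and nested lists, at the cost of following JSON quoting rules.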
Other ways of doing this:
l = [1,2,3]
"a "+str(l)+" b"             # 'a [1, 2, 3] b'
"a {} b".format(l)           # same result
"a {name} b".format(name=l)  # same result
f"a {l} b"                   # same result (Python 3.6+)

If you just want a string that includes the list, you could do:
s1 = "blah blah blah"
l = [1,2,3]
s2 = "blah blah blah"
s3 = s1 + str(l) + s2

How about using an f-string?
l = ["one","two","three"]
strQ = f"""
blah blah blah
blah = {l}
blah blah"""
This gives:
"\nblah blah blah\nblah = ['one', 'two', 'three']\nblah blah"
f-strings require Python 3.6 or later.

Related

Grammatically correct indefinite article in Python

I am creating a random sentence generator that can apply the
correct indefinite article (a, an) to the sentence.
But I am getting results such as these:
I eat a apple. I ride an bike. You eat an apple.
"a" should come before the consonant, and "an" should come before the vowel: an apple; a bike.
What am I doing wrong?
import random
def main():
    pronoun = ["I ", "You "]
    verb = ["kick ", "ride", "eat "]
    noun = [" ball.", " bike.", " apple.", " elephant."]
    ind_art = "an" if random.choice(noun[0]).lower() in "aeiou" else "a"
    a = random.choice(pronoun)
    b = random.choice(verb)
    c = ind_art
    d = random.choice(noun)
    print(a+b+c+d)
main()
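random.choice(noun[0]) does not look at the noun that ends up in the sentence: noun[0] is always " ball.", so the call picks a random character of that string. Even if you passed the whole list, each call to random.choice returns a new value, so the word used for ind_art would be a different one from the word assigned to d.
You need to reorder your code so that d is chosen first and then used to determine the article. The nouns must not start with a space, otherwise d[0] is that space.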
d = random.choice(noun)
ind_art = "an" if d[0].lower() in "aeiou" else "a"
I fixed some lines in your code. Now, it works fine.
import random
def main():
    pronoun = ["I ", "You "]
    verb = ["kick ", "ride ", "eat "]
    noun = ["ball.", "bike.", "apple.", "elephant."]
    ch = random.choice(noun).lower() # <===== fixed this line
    # print(ch[0]) # test
    ind_art = "an " if ch[0] in "aeiou" else "a " # <===== fixed this line
    a = random.choice(pronoun)
    b = random.choice(verb)
    c = ind_art
    # d = random.choice(noun) # <===== removed this line
    print(a+b+c+ch) # <===== replaced d with ch
main()
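The fix discussed above (pick the noun once, then derive the article from that same noun) could be sketched as follows. The helper names indefinite_article and random_sentence are hypothetical, and the rule is the simple spelling-based one from the question, so it does not cover words such as "hour" or "university".

```python
import random

def indefinite_article(word):
    """Return "an" if the word starts with a vowel letter, otherwise "a"."""
    first = word.strip()[:1].lower()
    # Guard against an empty string: "" in "aeiou" is True in Python.
    return "an" if first and first in "aeiou" else "a"

def random_sentence(rng=random):
    pronoun = rng.choice(["I", "You"])
    verb = rng.choice(["kick", "ride", "eat"])
    noun = rng.choice(["ball", "bike", "apple", "elephant"])
    # The article is computed from the noun that is actually used.
    return f"{pronoun} {verb} {indefinite_article(noun)} {noun}."

print(random_sentence())
```

Keeping the spaces and the final period in the format string, instead of inside the word lists, avoids the leading-space problem with d[0].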

Regex for getting multiple words after a delimiter

I have been trying to get the separate groups from the below string using regex in PCRE:
drop = blah blah blah something keep = bar foo nlah aaaa rename = (a=b d=e) obs=4 where = (foo > 45 and bar == 35)
The groups I am trying to make are:
1. drop = blah blah blah something
2. keep = bar foo nlah aaaa
3. rename = (a=b d=e)
4. obs=4
5. where = (foo > 45 and bar == 35)
I have written a regex using recursion, but the recursion only partially works: for drop it selects just the first 3 words (blah blah blah) and not the 4th one. I have looked through various Stack Overflow questions and have also tried a positive lookahead, but this is the closest I could get, and I cannot work out what I am doing wrong.
The regex that I have written: (?i)(drop|keep|where|rename|obs)\s*=\s*((\w+|\d+)(\s+\w+)(?4)|(\((.*?)\)))
Same can be seen here: RegEx Demo.
Any help on this or understanding what I am doing wrong is appreciated.
You could use the newer regex module with DEFINE:
(?(DEFINE)
(?<key>\w+)
(?<sep>\s*=\s*)
(?<value>(?:(?!(?&key)(?&sep))[^()=])+)
(?<par>\((?:[^()]+|(?&par))+\))
)
(?P<k>(?&key))(?&sep)(?P<v>(?:(?&value)|(?&par)))
See a demo on regex101.com.
In Python this could be:
import regex as re
data = """
drop = blah blah blah something keep = bar foo nlah aaaa rename = (a=b d=e) obs=4 where = (foo > 45 and bar == 35)
"""
rx = re.compile(r'''
(?(DEFINE)
(?<key>\w+)
(?<sep>\s*=\s*)
(?<value>(?:(?!(?&key)(?&sep))[^()=])+)
(?<par>\((?:[^()]+|(?&par))+\))
)
(?P<k>(?&key))(?&sep)(?P<v>(?:(?&value)|(?&par)))''', re.X)
result = {m.group('k').strip(): m.group('v').strip()
          for m in rx.finditer(data)}
print(result)
And yields
{'drop': 'blah blah blah something', 'keep': 'bar foo nlah aaaa', 'rename': '(a=b d=e)', 'obs': '4', 'where': '(foo > 45 and bar == 35)'}
You can use a branch reset group solution:
(?i)\b(drop|keep|where|rename|obs)\s*=\s*(?|(\w+(?:\s+\w+)*)(?=\s+\w+\s+=|$)|\((.*?)\))
See the PCRE regex demo
Details
(?i) - case insensitive mode on
\b - a word boundary
(drop|keep|where|rename|obs) - Group 1: any of the words in the group
\s*=\s* - a = char enclosed with 0+ whitespace chars
(?| - start of a branch reset group:
(\w+(?:\s+\w+)*) - Group 2: one or more word chars followed with zero or more repetitions of one or more whitespaces and one or more word chars
(?=\s+\w+\s+=|$) - up to one or more whitespaces, one or more word chars, one or more whitespaces, and =, or end of string
| - or
\((.*?)\) - (, then Group 2 capturing any zero or more chars other than line break chars, as few as possible and then )
) - end of the branch reset group.
See Python demo:
import regex
pattern = r"(?i)\b(drop|keep|where|rename|obs)\s*=\s*(?|(\w+(?:\s+\w+)*)(?=\s+\w+\s+=|$)|\((.*?)\))"
text = "drop = blah blah blah something keep = bar foo nlah aaaa rename = (a=b d=e) obs=4 where = (foo > 45 and bar == 35)"
print( [x.group() for x in regex.finditer(pattern, text)] )
# => ['drop = blah blah blah something', 'keep = bar foo nlah aaaa', 'rename = (a=b d=e)', 'obs=4', 'where = (foo > 45 and bar == 35)']
print( regex.findall(pattern, text) )
# => [('drop', 'blah blah blah something'), ('keep', 'bar foo nlah aaaa'), ('rename', 'a=b d=e'), ('obs', '4'), ('where', 'foo > 45 and bar == 35')]
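The standard library re module has no branch reset group, so if installing regex is not an option, one possible workaround is to use two ordinary groups and pick whichever one matched. This is only a sketch under that assumption: OPTION_RE and parse_options are hypothetical names, and the pattern handles only non-nested parentheses.

```python
import re

OPTION_RE = re.compile(
    r"""
    \b(drop|keep|where|rename|obs)   # group 1: the option name
    \s*=\s*
    (?:
        \(([^()]*)\)                 # group 2: value inside plain parentheses
      |
        (\w+(?:\s+\w+)*?)            # group 3: bare words, as few as possible
        (?=\s+\w+\s*=|\s*$)          # stop before the next name and = sign, or at the end
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)

def parse_options(text):
    """Map each option name to its value, taken from group 2 or group 3."""
    return {
        m.group(1).lower(): m.group(2) if m.group(2) is not None else m.group(3)
        for m in OPTION_RE.finditer(text)
    }

text = ("drop = blah blah blah something keep = bar foo nlah aaaa "
        "rename = (a=b d=e) obs=4 where = (foo > 45 and bar == 35)")
print(parse_options(text))
```

The lookahead here allows zero whitespace before the = sign, so a bare value followed directly by something like obs=4 would also be split correctly.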

Search for multiple keywords and corresponding index

I have strings like:
a = "currency is like gbp"
a= "currency blah blah euro"
a= "currency is equivalent to usd" .....
I want to substring or slice the above string from wherever I find any of "gbp", "euro" or "usd".
Not Working:
i = a.find("gbp") or a.find("euro") or a.find("usd")
a = a[i:]
Can do:
x = a.find('gbp')
y = a.find('euro')
z = a.find('usd')
But then I need to check which of them is greater than -1 and use that variable to slice the string which will be too much code.
Also, in my original example I have 10+ currencies so want a scalable solution.
Summary:
Want to slice/substring the main sentence from any of the words found till the end
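You could try something like this (it relies on find returning -1 for the currencies that are absent, so it assumes exactly one of them occurs in the string):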
currency_array = ['gbp', 'euro', 'usd']
index = max(a.find(currency) for currency in currency_array)
print(a[index:])
Use regex for such purposes:
import re
a = "currency is like gbp currency"
print(re.findall(r'((?:gbp|euro|usd).*)', a))
# ['gbp currency']
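Both answers above can be made a little more defensive. The sketch below slices from the earliest keyword found and returns an empty string when none is present; the function names and the empty-string fallback are choices made for this example, not part of the answers.

```python
import re

def slice_from_first_keyword(text, keywords):
    """Slice text from the earliest occurrence of any keyword ('' if none)."""
    positions = [pos for pos in (text.find(k) for k in keywords) if pos != -1]
    return text[min(positions):] if positions else ""

def slice_with_regex(text, keywords):
    """Same idea using re.search; re.escape keeps keywords literal."""
    pattern = "|".join(re.escape(k) for k in keywords)
    match = re.search(pattern, text)
    return text[match.start():] if match else ""

currencies = ["gbp", "euro", "usd"]
print(slice_from_first_keyword("currency is like gbp", currencies))
print(slice_with_regex("currency blah blah euro or usd", currencies))
```

Because the keyword list is just a parameter, adding more currencies needs no change to the code.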

Removing special characters from string in Lua

I'm trying to clean up a column of data containing postal codes before processing the values. The data contains all kinds of crazy formatting or input like the following and is a CHAR datatype:
12345
12.345
1234-5678
12345 6789
123456789
12345-6789
.
[blank]
I would like to remove all of the special characters and have tried the following code, but after many iterations of the logic the script still fails. For example, when sOriginalZip = '.', the value gets past my empty-string check and nil check as if it were not empty, even though I have removed all punctuation, control and space characters. So my output looks like this:
" 2 sZip5 = "
code:
nNull = nil
sZip5 = string.gsub(sOriginalZip,"%p","")
sZip5 = string.gsub(sZip5,"%c","")
sZip5 = string.gsub(sZip5,"%s","")
print("sZip5 = " .. sZip5)
if sZip5 ~= sBlank or tonumber(sZip5) ~= nNull then
    print(" 2 sZip5 = " .. sZip5)
else
    print("3 sZip5 = " .. sZip5)
end
I think there are different ways to go, following should work:
sZip5 = string.gsub(sOriginalZip, '.', function(d) return tonumber(d) and d or '' end)
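It returns a string containing only the digits, which is empty when the input has none. Note that string.gsub also returns the number of substitutions as a second value.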
Thanks! I ended up using a combination of csarr and Egor's suggestions to get this:
sZip5 = string.gsub(sOriginalZip,"%W",function(d)return tonumber(d) and d or "" end)
Looks like it is evaluating correctly. Thanks again!

Regular Expression VBA for multiple rows/table?

I have a text (already stored in a String variable, if you want).
The text is structured as follows:
( 124314 ) GSK67SJ/11 ADS SDK
blah blah blah blah blah
blah blah blah
blah blah blah
( 298 ) 2KEER/98 EOR PRT
blah blah blah
blah blah blah blah blah
etc.
The number of empty spaces between the words is variable;
The value in brackets is variable, as is the length of the alphanumeric group (this one always ends with "/" and then two digits);
The "blah blah" text at the end can be split across an unknown number of lines, each one with a variable number of characters;
The last two groups of letters are always 3 letters each, and a "\n" follows them immediately, without spaces;
The list contains 0 to N elements.
For each of them I have to store the number, the first 3-letters, the second 3-letters, and the "blah blah" in 4 columns of an Excel file.
Let's say that the columns are A, B, C, D. The result should be as follows (from A1):
124314 | ADS | SDK | blah blah blah.....
298 | EOR | PRT | blah blah.....
.........
Any help would be greatly appreciated
I managed to solve it
Dim RegX As VBScript_RegExp_55.RegExp 'Remember to reference it...
Dim Mats As Object
Dim TextFiltered As String
Dim counter As Integer
Set RegX = New VBScript_RegExp_55.RegExp
With RegX
    .Global = True
    .MultiLine = True
    .Pattern = "[\s]{2,}(?!\(\s+(\d+)\s+\))" 'This removes the annoying splitting of the "blah blah" text into different lines, APART from the breaks before "( number )"
    TextFiltered = .Replace(TextFiltered, " ") 'You could also write [\r\n] instead of [\s] but in that way you eliminate all the spaces in one hit
End With
With RegX 'This is the pattern you're looking for, the brackets mean different elements you could retrieve from the array of the results
    .Pattern = "\(\s+(\d+)\s+\)(\s+\w+/[0-9]{2}\s+)([A-Z]{3})\s+([A-Z]{3})\s+(.+)" 'I think you can remove the "+" from the "\s"
    Set Mats = .Execute(TextFiltered)
End With
For counter = 0 To Mats.Count - 1 'SubMatches is what will give you the various elements one by one (124314, ADS, SDK, etc)
    MsgBox Mats(counter).SubMatches(0) & " " & Mats(counter).SubMatches(2) & " " & Mats(counter).SubMatches(3) & " " & Mats(counter).SubMatches(4)
Next
