How to change a string into a variable - string

I want to write some data to a file. I saved the filename in a variable and want to use the % operator to substitute the variable into the path, but it gives an error:
IndentationError: unindent does not match any outer indentation level
writeafile = open('N:\myfile\%s.txt' , "a") % (variable)

Assuming we are talking about Python here, the % operator has to be applied to the 'N:\\myfile\\%s.txt' string itself, not to the value returned by open(). Move variable next to the string, like so:
writeafile = open("N:\\myfile\\%s.txt" % variable, "a")
However, the Python documentation advises against this style of formatting:
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals, the str.format() interface, or template strings may help avoid these errors. Each of these alternatives provides their own trade-offs and benefits of simplicity, flexibility, and/or extensibility.
Source
So I'd suggest using f-strings, which have been available since Python 3.6. The doubled \\ is intentional here: a single backslash starts an escape sequence, which can give undesired results.
writeafile = open(f"N:\\myfile\\{variable}.txt", "a")
Alternatively, you could also use str.format():
writeafile = open("N:\\myfile\\{name}.txt".format(name=variable), "a")
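To check that the spellings above all build the same path, here is a minimal sketch. The value "report" for variable is made up for illustration, the raw-string and pathlib variants are extra options not mentioned in the answer, and nothing is opened on disk.

```python
from pathlib import PureWindowsPath

variable = "report"  # hypothetical filename stem, for illustration only

# The three formatting styles discussed above; each one formats the string
# before it would be handed to open().
percent_style = "N:\\myfile\\%s.txt" % variable
fstring_style = f"N:\\myfile\\{variable}.txt"
format_style = "N:\\myfile\\{name}.txt".format(name=variable)

# A raw string avoids doubling the backslashes.
raw_style = r"N:\myfile\%s.txt" % variable

# pathlib joins path components, so no manual separators are needed.
pathlib_style = str(PureWindowsPath("N:/myfile") / f"{variable}.txt")

print(percent_style)
```

Any of these strings can then be passed as the first argument to open(..., "a").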

Related

Python regular expressions with Foreign characters in python PyQT5

This problem might be very simple, but I find it a bit confusing, which is why I need help.
Following on from a question I posted earlier that got solved, I have just noticed a new issue.
Source code:
from PyQt5 import QtCore,QtWidgets
app=QtWidgets.QApplication([])
def scroll():
    #QtCore.QRegularExpression(r'\b'+'cat'+'\b')
    item = listWidget.findItems(r'\bcat\b', QtCore.Qt.MatchRegularExpression)
    for d in item:
        print(d.text())
window = QtWidgets.QDialog()
window.setLayout(QtWidgets.QVBoxLayout())
listWidget = QtWidgets.QListWidget()
window.layout().addWidget(listWidget)
cats = ["love my cat","catirization","cat in the clouds","catść"]
for i,cat in enumerate(cats):
    QtWidgets.QListWidgetItem(f"{i} {cat}", listWidget)
btn = QtWidgets.QPushButton('Scroll')
btn.clicked.connect(scroll)
window.layout().addWidget(btn)
window.show()
app.exec_()
Output GUI:
As you can see, I am just trying to print the text of the items that match the regex r"\bcat\b" when I press the "Scroll" button, and it works fine!
Output:
0 love my cat
2 cat in the clouds
3 catść
However, as you can see, item #3 should not be printed because it obviously does not match the regular expression r"\bcat\b". It is printed anyway, and I think the special foreign characters ść are what make it a match (which they shouldn't, right?).
I'm expecting an output like:
0 love my cat
2 cat in the clouds
Research I have tried
I found this question, which mentions \p{L}. According to the answer, it means:
If all you want to match is letters (including "international"
letters) you can use \p{L}.
To be honest, I'm not sure how to apply that with PyQt5. I still made some attempts, such as changing the regex to r'\b'+r'\p{cat}'+r'\b', but I got this error:
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
QString::contains: invalid QRegularExpression object
Obviously the error says it's not a valid regex. Can someone educate me on how to solve this issue? Thank you!
In general, when you need to make your shorthand character classes and word boundaries Unicode-aware, you need to pass the QRegularExpression.UseUnicodePropertiesOption option to the regex compiler. See the QRegularExpression.UseUnicodePropertiesOption reference:
The meaning of the \w, \d, etc., character classes, as well as the meaning of their counterparts (\W, \D, etc.), is changed from matching ASCII characters only to matching any character with the corresponding Unicode property. For instance, \d is changed to match any character with the Unicode Nd (decimal digit) property; \w to match any character with either the Unicode L (letter) or N (digit) property, plus underscore, and so on. This option corresponds to the /u modifier in Perl regular expressions.
In Python, you could declare it as
rx = QtCore.QRegularExpression(r'\bcat\b', QtCore.QRegularExpression.UseUnicodePropertiesOption)
However, since QListWidget.findItems does not accept a QRegularExpression as an argument and only takes the regex as a string, you can only use the (*UCP) PCRE verb as an alternative:
r'(*UCP)\bcat\b'
Make sure you place it at the very beginning of the regex.
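The same ASCII-versus-Unicode distinction can be reproduced without Qt. The sketch below uses Python's standard re module purely as an analogy (it is not the PCRE engine behind QRegularExpression): with re.ASCII, ś is not a word character, so \b matches right after cat, while the default Unicode behaviour for str patterns treats ś as a letter.

```python
import re

items = ["0 love my cat", "1 catirization", "2 cat in the clouds", "3 catść"]

# ASCII-only \w and \b: "ś" counts as a non-word character, so a word
# boundary is found between "t" and "ś" and item 3 matches.
ascii_rx = re.compile(r"\bcat\b", re.ASCII)
ascii_hits = [s for s in items if ascii_rx.search(s)]

# Unicode-aware \w and \b (the default for str patterns in Python 3):
# "ś" is a letter, so there is no boundary after "cat" in item 3.
unicode_rx = re.compile(r"\bcat\b")
unicode_hits = [s for s in items if unicode_rx.search(s)]

print(ascii_hits)
print(unicode_hits)
```

The ASCII-only list corresponds to the output in the question, and the Unicode-aware list to the expected output.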

Extracting variables from expression in Lua

I have expressions in Lua that contain the standard metatable operations __add, __sub, __mul (+, -, *).
For example, a+b*xyz-cde. I am trying to extract all free variables into a table. For this expression, the table will contain {a,b,xyz,cde}. Right now I am doing it with string operations such as splitting and substituting. This seems to work, but it feels ugly, and it gets a little complicated because expressions may involve nesting and brackets. For example, the expression (a+b)-c*xyz*(a+(b+c)) should return the table {a,b,c,xyz}. Is there a simple way to extract the free variables of an expression? A simple approach using the string library would also be fine.
If you want to do string processing, it's easy:
local V={}
local s="((a+b)-c*xyz*(a+(b+c)))"
for k in s:gmatch("%a+") do
    V[k]=k
end
for k in pairs(V) do print(k) end
For fun, you can let Lua do the hard work:
local V={}
do
    local _ENV=setmetatable({},{__index=function (t,k) V[k]=k return 0 end})
    local _=((a+b)-c*xyz*(a+(b+c)))
end
for k in pairs(V) do print(k) end
This code evaluates the expression in an empty environment where every variable has the value zero, saving the names of the variables in the expression in a table.
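For comparison, the same "let the interpreter do the work" idea can be sketched in Python. This is an analogy rather than a translation of the Lua code: the names NameRecorder and free_variables are made up for illustration, and eval should only ever be given expressions you trust. As in the Lua version, every unknown name evaluates to zero, so an expression that divides by a variable would raise an error.

```python
class NameRecorder(dict):
    """Mapping that records every name that is looked up but not found."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def __missing__(self, key):
        # Called by dict lookup for absent keys: remember the name once.
        if key not in self.seen:
            self.seen.append(key)
        return 0  # every unknown variable evaluates to zero


def free_variables(expression):
    """Evaluate expression in an empty namespace and return the names it used."""
    recorder = NameRecorder()
    eval(expression, {"__builtins__": {}}, recorder)
    return recorder.seen


print(free_variables("(a+b)-c*xyz*(a+(b+c))"))
```

The names come back in the order in which they are first evaluated.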

String formatting in python 3 without print function

I'm trying to understand how "%s%s" %(a,a) works in the code below. I have only seen it inside the print function so far. Could anyone please explain how it works inside int()?
a=input()
b=int("%s%s" %(a,a))
The "%s" format is borrowed from C's printf, but it is more interesting because it isn't tied to print. Note that print (or any other function, for that matter) receives just one argument here, the already formatted string:
print("%s%s" % (a,a))
and not (as in C) a variable number of arguments passed to a function that accepts and understands them:
printf("%s%s", a, a);
It's a standalone way of creating a string from a string template & its arguments (which for instance solves the tedious issue of: "I want a logger with formatting capabilities" which can be achieved with great effort in C or C++, using variable arguments + vsprintf or C++11 variadic recursive templates).
Note that this formatting style is now considered legacy. It is better to use str.format, where the placeholders are wrapped in {}.
One of the direct advantages here is that since the argument is repeated you just have to do:
int("{0}{0}".format(a))
(it references twice the sole argument in position 0)
Both legacy and format syntaxes are detailed with examples on https://pyformat.info/
Or, since Python 3.6, you can use f-strings:
>>> a = 12
>>> int(f"{a}{a}")
1212
% is in a way just syntactic sugar for a function that accepts a string and a *args (a format and the parameters for formatting) and returns a string which is the format string with the embedded parameters. So, you can use it any place that a string is acceptable.
BTW, % is a bit obsolete, and "{}{}".format(a,a) is the more 'modern' approach here, and is more obviously a string method that returns another string.
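A minimal sketch of the point made in both answers: the formatting expression is evaluated on its own and yields an ordinary str, which int() then receives. The literal "12" stands in for whatever input() would return.

```python
a = "12"  # stands in for the text returned by input()

# The % operator is evaluated first and yields an ordinary str ...
joined = "%s%s" % (a, a)

# ... which can then be passed to any function that accepts a string.
b = int(joined)

# Equivalent spellings with str.format() and an f-string.
same_with_format = int("{0}{0}".format(a))
same_with_fstring = int(f"{a}{a}")

print(type(joined).__name__, joined, b)
```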

Substitute string by index without using regular expressions

It should be very easy, but I am looking for an efficient way to perform it.
I know that I could split the string into two parts and insert the new value, but I have tried to substitute each line between the indexes 22-26 as follows:
line.replace(line[22:26],new_value)
The Problem
However, that function substitutes everything in the line that is similar to the pattern in line[22:26].
In the example below, I want to replace the marked number 1 with number 17:
Here are the results. Note the replacement of 1 with 17 in several places:
Thus I don't understand the behavior of the replace command. Is there a simple explanation of what I'm doing wrong?
Why I don't want RE
The values between index 22-26 are not unified in form.
Note: I am using python 3.5 on Unix/Linux machines.
str.replace replaces 1 sub-string pattern with another everywhere in the string.
e.g.
'ab cd ab ab'.replace('ab', 'xy')
# produces output 'xy cd xy xy'
Similarly,
mystr = 'ab cd ab ab'
mystr.replace(mystr[0:2], 'xy')
# also produces output 'xy cd xy xy'
What you could do instead, to replace just the characters in positions 22-26, is:
line = line[0:22] + new_value + line[26:]
Also, looking at your data, it seems to me to be a fixed-width text file. While my suggestion will work, a more robust way to process this data would be to read it & separate the different fields in the record first, before processing the data.
If you have access to the pandas library, it provides a useful function just for reading fixed-width files, pandas.read_fwf.
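The difference between the two approaches can be seen on a made-up fixed-width line (the original data is not shown, so the record below and the helper name replace_span are assumptions for illustration). The new value is right-justified to four characters so that the columns after it keep their positions.

```python
def replace_span(line, start, end, new_value):
    """Return line with the characters line[start:end] replaced by new_value."""
    return line[:start] + new_value + line[end:]


# Hypothetical record: the field at columns 22-26 holds "   1", and the same
# four characters also occur earlier in the line.
line = "ID   1 QTY   1".ljust(22) + "   1" + " END"
new_value = "17".rjust(4)

# str.replace substitutes every occurrence of the substring "   1".
wrong = line.replace(line[22:26], new_value)

# Slicing touches only the characters at positions 22-26.
fixed = replace_span(line, 22, 26, new_value)

print(repr(wrong))
print(repr(fixed))
```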

What's the point of nesting brackets in Lua?

I'm currently teaching myself Lua for iOS game development, since I've heard lots of very good things about it. I'm really impressed by the level of documentation there is for the language, which makes learning it that much easier.
My problem is that I've found a Lua concept that nobody seems to have a "beginner's" explanation for: nested brackets for quotes. For example, I was taught that long strings with escaped single and double quotes like the following:
string_1 = "This is an \"escaped\" word and \"here\'s\" another."
could also be written without the overall surrounding quotes. Instead one would simply replace them with double brackets, like the following:
string_2 = [[This is an "escaped" word and "here's" another.]]
Those both make complete sense to me. But I can also write the string_2 line with "nested brackets," which include equal signs between both sets of the double brackets, as follows:
string_3 = [===[This is an "escaped" word and "here's" another.]===]
My question is simple. What is the point of the syntax used in string_3? It gives the same result as string_1 and string_2 when given as an input for print(), so I don't understand why nested brackets even exist. Can somebody please help a noob (me) gain some perspective?
It would be used if your string contains a substring that is equal to the delimiter. For example, the following would be invalid:
string_2 = [[This is an "escaped" word, the characters ]].]]
Therefore, in order for it to work as expected, you would need to use a different string delimiter, like in the following:
string_3 = [===[This is an "escaped" word, the characters ]].]===]
I think it's safe to say that not a lot of string literals contain the substring ]], in which case there may never be a reason to use the above syntax.
It helps to, well, nest them:
print [==[malucart[[bbbb]]]bbbb]==]
Will print:
malucart[[bbbb]]]bbbb
But if that's not useful enough, you can use them to put whole programs in a string (loadstring is the Lua 5.1 name; Lua 5.2 and later use load):
loadstring([===[print "o m g"]===])()
Will print:
o m g
I personally use them for my static/dynamic library implementation. In the case you don't know if the program has a closing bracket with the same amount of =s, you should determine it with something like this:
local c = 0
while prog:find("]" .. string.rep("=", c) .. "]", 1, true) do
    c = c + 1
end
end
-- do stuff
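The same search for a safe bracket level can be sketched in Python, for example when generating Lua source from another tool. The function name lua_long_string is made up for illustration. Unlike the loop above, this sketch also rejects a level when the text ends in a way that would run into the closing bracket (such as text ending in "]"). It does not handle text that starts with a newline, which Lua skips right after the opening bracket.

```python
def lua_long_string(text):
    """Wrap text in the lowest-level Lua long brackets that can hold it."""
    level = 0
    while True:
        closer = "]" + "=" * level + "]"
        # The first occurrence of the closer in text + closer must be the
        # one appended here; otherwise the literal would end too early.
        if (text + closer).find(closer) == len(text):
            return "[" + "=" * level + "[" + text + closer
        level += 1


print(lua_long_string('print "o m g"'))
print(lua_long_string("malucart[[bbbb]]]bbbb"))
```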
