This is an another thing I faced in python on Kaggle.
Question: Given a string, it should return whether or not that string represents a valid zip code. For our purposes, a valid zip code is any string consisting of exactly 5 digits.
And the answer I wrote is -
def is_valid_zip(zip_code):
"""Returns whether the input string is a valid (5 digit) zip code
"""
return len(zip_code)==5 and str.isdigit(zip_code)
pass
# Check your answer
q1.check()
This gave me the correct answer but the solution has used zip_code.isdigit()
My doubt is
are both the conventions correct?
how the first when used with str. will work but not alone i.e. isdigit(zip_code)
Related
This is the part of the code I have copied to see the output,
def check(string,sub_str):
if(string.find(sub_str)==-1):
print('no')
else:
print('yes)
# driver code for testing the above function
string='geeks for geeks'
sub_str='geeks'
I specifically wanted to understand how this expression works :
if(string.find(sub_str)==-1): . Also this code is for finding substrings in a given strings can some one tell if this is the optimal way, I know it is tutorial code but I have an easier way to find the substrings. Just wanted to know if that would make passing test cases easier hence the above code. Anyways thanks y'all for your answers.
The method find() returns the index of the string you are looking for. The string in front of find() is the one in which you are looking for the second string.
SentenceThatIsCompletelySearched.find(ForThisPartHere)
If the string you are looking for is present it will return the index (a number on which position of the sentence the string has been found).
If the string is not inside the sentence then find() will return -1 (a number).
So in your case you are checking if sub_str is inside string and if it is not present (return of -1) you will print "no". If it is you will print "yes".
This question already has an answer here:
How to match a sentence in Lua
(1 answer)
Closed 1 year ago.
Been stuck on this for over a day.
I'm trying to use gsub to extract a portion of an input string. The exact pattern of the input varies in different cases, so I'm trying to use a variable to represent that pattern, so that the same routine - which is otherwise identical - can be used in all cases, rather than separately coding each.
So, I have something along the lines of:
newstring , n = oldstring:gsub(matchstring[i],"%1");
where matchstring[] is an indexed table of the different possible pattern matches, set up so that "%1" will match the target sequence in each matchstring[].
For instance, matchstring[1] might be
"\[User\] <code:%w*>([^<]*)<\\code>.*" -- extract user name from within the <code>...<\code>
while matchstring[2] could be
"\[World\] (%w)* .*" -- extract user name as first word after prefix '[World] '
and matchstring[3] could be
"<code:%w*>([^<]*)<\\code>.*" -- extract username from within <code>...<\code> at start
This does not work.
Yet when, debugging one of the cases, I replace matchstring[i] with the exact same string -- only now passed as a string literal rather than saved in a variable -- it works.
So.. I'm guessing there must be some 'processing' of the string - stripping out special characters or something - when it's sent as a variable rather than a string literal ... but for the life of me I can't figure out how to adjust the matchstring[] entries to compensate!
Help much appreciated...
FACEPALM
Thankyou, Piglet, you got me on the right track.
Given how this particular platform processes & passes strings, anything within <...> needed the escape character \ for downstream use, but of course - duh - for the lua gsub's processing itself it needed the standard %
much obliged
Trying to understand how "%s%s" %(a,a) is working in below code I have only seen it inside print function thus far.Could anyone please explain how it is working inside int()?
a=input()
b=int("%s%s" %(a,a))
this "%s" format has been borrowed from C printf format, but is much more interesting because it doesn't belong to print statement. Note that it involves just one argument passed to print (or to any function BTW):
print("%s%s" % (a,a))
and not (like C) a variable number of arguments passed to some functions that accept & understand them:
printf("%s%s,a,a);
It's a standalone way of creating a string from a string template & its arguments (which for instance solves the tedious issue of: "I want a logger with formatting capabilities" which can be achieved with great effort in C or C++, using variable arguments + vsprintf or C++11 variadic recursive templates).
Note that this format style is now considered legacy. Now you'd better use format, where the placeholders are wrapped in {}.
One of the direct advantages here is that since the argument is repeated you just have to do:
int("{0}{0}".format(a))
(it references twice the sole argument in position 0)
Both legacy and format syntaxes are detailed with examples on https://pyformat.info/
or since python 3.6 you can use fstrings:
>>> a = 12
>>> int(f"{a}{a}")
1212
% is in a way just syntactic sugar for a function that accepts a string and a *args (a format and the parameters for formatting) and returns a string which is the format string with the embedded parameters. So, you can use it any place that a string is acceptable.
BTW, % is a bit obsolete, and "{}{}".format(a,a) is the more 'modern' approach here, and is more obviously a string method that returns another string.
I am parsing a .fasta-File containing one big sequence into python by using:
for rec in SeqIO.parse(faFile, "fasta"):
identifier=(rec.id)
sequence=(rec.seq)
Then, I am building a dictionary:
d={identifier:sequence}
When printing sequence only, I get the following result:
CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT
Note: All letters are printed, I made dots to shorten this
When printing the dictionary, I get:
{'NC_003047.1': Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet())}
Where does the "Seq" and the SingleLetter alphabet come from?
Desired result would be:
{'NC_003047.1':'CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT'}
Update1:
following the link in the comments, I tried
input_file=open(faFile)
d=SeqIO.to_dict(SeqIO.parse(faFile,"fasta"))
resulting in:
{'NC_003047.1': SeqRecord(seq=Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet()), id='NC_003047.1', name='NC_003047.1', description='NC_003047.1 Sinorhizobium meliloti 1021 chromosome, complete genome', dbxrefs=[])}
So, sadly, this does not help :(
Thanks in advance for your time and effort :)
SeqIO doesn't return a string, it returns an object. When you print it, you print the object's string representation, which in this case is not just the data contained in (some attribute of) the object.
(Some objects are designed so that printing the object will print just the data inside it. This depends on how the library is put together and how the programmer designed its __str__() method. This is probably not useful for you at this point, but might help you understand other related resources you find if you pursue this further.)
I'm not familiar with SeqIO but quick googling suggests you probably want
d={identifier: sequence.seq}
to put just the SeqIO object's seq attribute as the value for this identifier.
I've contered a Python 2 code to Python 3.
In doing so, I've changed
print 'String: ' + somestring
into
print(b'String: '+somestring)
because I was getting the following error:
Can't convert 'bytes' object to str implicitly
But then now I can't implement string attributes such as strip(), because they are no longer treated as strings...
global name 'strip' is not defined
for
if strip(somestring)=="":
How should I solve this dilemma between switching string to bytes and being able to use string attributes? Is there a workaround?
Please help me out and thank you in advance..
There are two issues here, one of which is the actual issue, the other is confusing you, but not an actual issue. Firstly:
Your string is a bytes object, ie a string of 8-bit bytes. Python 3 handles this differently from text, which is Unicode. Where do you get the string from? Since you want to treat it as text, you should probably convert it to a str-object, which is used to handle text. This is typically done with the .decode() function, ie:
somestring.decode('UTF-8')
Although calling str() also works:
str(somestring, 'UTF8')
(Note that your decoding might be something else than UTF8)
However, this is not your actual question. Your actual question is how to strip a bytes string. And the asnwer is that you do that the same way as you string a text-string:
somestring.strip()
There is no strip() builtin in either Python 2 or Python 3. There is a strip-function in the string module in Python 2:
from string import strip
But it hasn't been good practice to use that since strings got a strip() method, which is like ten years or so now. So in Python 3 it is gone.
>>> b'foo '.strip()
b'foo'
Works just fine.
If what you're dealing with is text, though, you probably should just have an actual str object, not a bytes object.
I believe you can use the "str" function to cast it to a string
print str(somestring).strip()
or maybe
print str(somestring, "utf-8").strip()
However, if the object is already a string, you don't get a new one. So if you're not sure whether an object is a string and need it to be and call str(obj), you won't create another if it's already a string.
x='123'
id(x)
2075707536496
y=str(x)
id(y)
2075707536496