String comparison in Python3 - python-3.x

I have a REST API that returns True, False, or "". I receive this in requests' response.content, whose type is bytes. I convert it to a string and then try to compare, but the final else block always executes, skipping the first and second branches.
import requests
headers = {'Accept': '*/*'}
response = requests.get('http://{IP}/status', headers=headers)
status = response.content
status = str(status)
print(status)
# status returns "True", "False", ""
if (status == "True"):
    print ('Admin approved this request')
elif (status == "False"):
    print ('Admin disapproved this request')
else:
    print ('No response from admin')
I get 'No response from admin' in all cases.

Double check the format of your response. If it's in something like JSON, you'll likely need to access the actual response ("True", "False", "") as a key/value pair.
Also, you can simply use response.text to get a string (Requests decodes the body using the encoding it infers from the response headers), instead of converting response.content to a string.
https://realpython.com/python-requests/#content

response.content is an object of type bytes.
Try calling decode() on response.content instead of casting to a str type.
For example if the content of the response is encoded in utf-8 then decode using utf-8:
status = response.content.decode('utf-8')
When casting a bytes object to a str type, the resulting string will be prefixed with "b'".
This is why the last else block in the code you've supplied always executes. The variable status will always be prefixed with "b'" (ie. "b'True'", "b'False'" or "b''") and the equality comparisons will always evaluate to False.
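A minimal sketch of the difference, using a literal bytes value as a stand-in for response.content (no request is made, and the describe helper is a hypothetical name used only for illustration):

```python
# Stand-in for response.content; a real API call would return bytes like this.
content = b"True"

as_str = str(content)               # repr of the bytes object, not its text
as_text = content.decode("utf-8")   # the actual text

print(as_str)    # b'True'
print(as_text)   # True


def describe(status: str) -> str:
    """Map the decoded status text to a message (hypothetical helper)."""
    status = status.strip()
    if status == "True":
        return "Admin approved this request"
    if status == "False":
        return "Admin disapproved this request"
    return "No response from admin"


print(describe(as_str))   # No response from admin
print(describe(as_text))  # Admin approved this request
```

The strip() call is an extra precaution in case the API appends a trailing newline; it is not something the original code needed to hit the bug.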

Related

How can I get the body of a Gmail email with an attachment using the Gmail Python API

This is my code to get the body of the email:
body = []
body.append(msg['payload']['parts'])
if 'data' in body[0][0]['body']:
    print("goes path 1")
    body = base64.urlsafe_b64decode(
        body[0][0]['body']['data'])
elif 'data' in body[0][1]['body']:
    print("goes path 2")
    body = base64.urlsafe_b64decode(
        body[0][1]['body']['data'])
else:
    # What Do I do Here?
    pass
The reason I have the if/elif statements is that the body is sometimes in different places, so I have to try both of them. When I ran an email with an attachment through this, it resulted in a KeyError because 'data' did not exist, meaning the body is probably in a different place. The JSON object of body is in the image linked below because it is too big to paste here. How do I get the body of the email?
https://i.stack.imgur.com/Ufh5E.png
Edit:
The answers given by @fullfine aren't working; they output another JSON object whose body cannot be decoded for some reason:
binascii.Error: Invalid base64-encoded string: number of data characters (1185) cannot be 1 more than a multiple of 4
and:
binascii.Error: Incorrect padding
An example of a JSON object that I got from their answer is:
{'size': 370, 'data': 'PGRpdiBkaXI9Imx0ciI-WW91IGFyZSBpbnZpdGVkIHRvIGEgWm9vbSBtZWV0aW5nIG5vdy4gPGJyPjxicj5QbGVhc2UgcmVnaXN0ZXIgdGhlIG1lZXRpbmc6IDxicj48YSBocmVmPSJodHRwczovL3pvb20udXMvbWVldGluZy9yZWdpc3Rlci90Sll1Y3VpcnJEd3NHOVh3VUZJOGVEdkQ2NEJvXzhjYUp1bUkiPmh0dHBzOi8vem9vbS51cy9tZWV0aW5nL3JlZ2lzdGVyL3RKWXVjdWlyckR3c0c5WHdVRkk4ZUR2RDY0Qm9fOGNhSnVtSTwvYT48YnI-PGJyPkFmdGVyIHJlZ2lzdGVyaW5nLCB5b3Ugd2lsbCByZWNlaXZlIGEgY29uZmlybWF0aW9uIGVtYWlsIGNvbnRhaW5pbmcgaW5mb3JtYXRpb24gYWJvdXQgam9pbmluZyB0aGUgbWVldGluZy48L2Rpdj4NCg=='}
I figured out that i had to use base64.urlsafe_b64decode to decode the body which got me b'<div dir="ltr">You are invited to a Zoom meeting now. <br><br>Please register the meeting: <br>https://zoom.us/meeting/register/tJYucuirrDwsG9XwUFI8eDvD64Bo_8caJumI<br><br>After registering, you will receive a confirmation email containing information about joining the meeting.</div>\r\n'
How can I remove all the extra html tags while keeping the raw text?
Answer
The structure of the response body changes depending on the message itself. You can run some tests to see what it looks like from the documentation page of the method users.messages.get.
How to manage it
Initial scenario:
Get the message with the id and define the parts.
msg = service.users().messages().get(userId='me', id=message_id['id']).execute()
payload = msg['payload']
parts = payload.get('parts')
Simple solution
You can find a plain version of the message body in the snippet; as the documentation says, it contains a short part of the message text. It's a simple solution that returns the message without formatting or line breaks, and you don't have to decode the result. If it does not fit your requirements, check the next solutions.
raw_message = msg['snippet']
Solution 1:
Add a conditional statement to check whether any part of the message has a mimeType equal to multipart/alternative. If so, the message has an attachment and the body is inside that part, so you have to get the list of subparts inside it. Here is the code:
for part in parts:
    body = part.get("body")
    data = body.get("data")
    mimeType = part.get("mimeType")
    # with attachment
    if mimeType == 'multipart/alternative':
        subparts = part.get('parts')
        for p in subparts:
            body = p.get("body")
            data = body.get("data")
            mimeType = p.get("mimeType")
            if mimeType == 'text/plain':
                body_message = base64.urlsafe_b64decode(data)
            elif mimeType == 'text/html':
                body_html = base64.urlsafe_b64decode(data)
    # without attachment
    elif mimeType == 'text/plain':
        body_message = base64.urlsafe_b64decode(data)
    elif mimeType == 'text/html':
        body_html = base64.urlsafe_b64decode(data)
final_result = str(body_message, 'utf-8')
Solution 2:
Use a recursive function to process the parts:
def processParts(parts):
    for part in parts:
        body = part.get("body")
        data = body.get("data")
        mimeType = part.get("mimeType")
        if mimeType == 'multipart/alternative':
            subparts = part.get('parts')
            [body_message, body_html] = processParts(subparts)
        elif mimeType == 'text/plain':
            body_message = base64.urlsafe_b64decode(data)
        elif mimeType == 'text/html':
            body_html = base64.urlsafe_b64decode(data)
    return [body_message, body_html]
[body_message, body_html] = processParts(parts)
final_result = str(body_message, 'utf-8')
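A more defensive variant of the recursive idea might look like the sketch below. It assumes, beyond what the answer states, that any multipart/* container can hold nested parts and that either body may be missing. The function name find_bodies, the b64 helper, and the sample payload are made up for illustration; the payload is hand-built, not real API output.

```python
import base64


def find_bodies(parts):
    """Return (plain_text, html) found anywhere in a nested list of parts."""
    plain, html = None, None
    for part in parts or []:
        mime_type = part.get("mimeType", "")
        data = part.get("body", {}).get("data")
        if mime_type.startswith("multipart/"):
            # Containers carry no data themselves; look inside them.
            sub_plain, sub_html = find_bodies(part.get("parts"))
            plain = plain or sub_plain
            html = html or sub_html
        elif data and mime_type == "text/plain" and plain is None:
            plain = base64.urlsafe_b64decode(data).decode("utf-8")
        elif data and mime_type == "text/html" and html is None:
            html = base64.urlsafe_b64decode(data).decode("utf-8")
    return plain, html


def b64(text):
    """Encode sample text as base64url, like the API's body data (demo helper)."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


# Hand-built payload imitating a message with an attachment.
parts = [
    {"mimeType": "multipart/alternative", "body": {"size": 0}, "parts": [
        {"mimeType": "text/plain", "body": {"data": b64("Hello")}},
        {"mimeType": "text/html", "body": {"data": b64("<p>Hello</p>")}},
    ]},
    {"mimeType": "application/pdf", "filename": "a.pdf",
     "body": {"attachmentId": "hypothetical-id"}},
]
print(find_bodies(parts))
```

Returning None for a missing body leaves it to the caller to decide what to do, instead of failing on an unassigned variable when a message has no text/plain or text/html part.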
Extra comments
If you need to get more data from your message, I recommend using the documentation to see what the response body looks like.
You can also check the method in the Python API library for a detailed description of each element.
Do not post images of data in this way, as DalmTo has said.
edit
I had tried the code with Python 2; that was my mistake. With Python 3, as you said, you have to use base64.urlsafe_b64decode(data) instead of base64.b64decode(data). I've already updated the code.
I added a simple solution that may fit your needs. It takes the message from the snippet key, which is a simplified version of the message body that does not need decoding.
I also don't know how you obtained the text/html part with my code, which did not handle it. If you want to get it, you have to add a second if statement; I updated the code so you can see it.
Finally, what you obtain from base64.urlsafe_b64decode is a bytes object; to get a string you have to convert it using str(body_message, 'utf-8'). That is now in the code.
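As a rough sketch of the decoding step, and of one possible way to drop the HTML tags asked about in the question's edit, the standard library alone may be enough. The names decode_body, TextOnly and strip_tags are hypothetical, the sample string is made up, and restoring '=' padding is an assumption about why "Incorrect padding" can appear:

```python
import base64
from html.parser import HTMLParser


def decode_body(data: str) -> str:
    """Decode a base64url string to text, restoring any stripped '=' padding."""
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


class TextOnly(HTMLParser):
    """Collect the text between tags and drop the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, text):
        self.chunks.append(text)


def strip_tags(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Made-up sample standing in for part['body']['data'].
sample = base64.urlsafe_b64encode(
    b'<div dir="ltr">Hello <br>world</div>\r\n').decode("ascii")
html = decode_body(sample.rstrip("="))
print(strip_tags(html))  # Hello world
```

For messy real-world HTML a dedicated parser library may give cleaner text, but that is outside what the answer covers.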

Compare response 200 and I require 'true' as answer

print(response)
if(response=="<Response [200]>"):
    print("true")
else:
    print("false")
print(response) shows '<Response [200]>', but the comparison prints false. I need true as the output.
The builtin function print will automatically apply str() to any object that is not already str. So what gets printed is actually str(response), which is a string representation of response, kinda like a summary. Comparing the human-readable str summary of any object to the object itself will only return true if that object was already a str. That is not the case here, as you're dealing with requests.Response object.
For your purposes, use .status_code to check:
Example:
import requests
response = requests.get('https://stackoverflow.com/questions/60733985/compare-response-200-and-i-require-true-as-answer')
print(response)
if(response.status_code==200):
    print("true")
else:
    print("false")
Output:
<Response [200]>
true
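The same point can be illustrated without a network call. FakeResponse below is a hypothetical stand-in that only imitates how a requests.Response prints; it is not part of the requests library:

```python
class FakeResponse:
    """Hypothetical stand-in that mimics how requests.Response prints."""

    def __init__(self, status_code):
        self.status_code = status_code

    def __repr__(self):
        return f"<Response [{self.status_code}]>"


response = FakeResponse(200)

print(response)                              # <Response [200]>
print(response == "<Response [200]>")        # False: object compared to str
print(str(response) == "<Response [200]>")   # True, but fragile
print(response.status_code == 200)           # True: compare the integer
```

Comparing str(response) would technically work, but it ties the check to a display format, so comparing status_code is the sturdier choice.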

Extracting plain/text and html body from MBOX file to a list

I'm trying to extract the body of email messages from an mbox file (previously converted from PST format).
I took the base function from another Stack Overflow question (Extracting the body of an email from mbox file, decoding it to plain text regardless of Charset and Content Transfer Encoding). It works well for extracting the 'text/plain' body content, but I also want to extract the 'html' content.
In the last part of the code, which calls the function to extract the body, I've tried modifying it to store the text and html strings in separate lists.
import mailbox
def getcharsets(msg):
    charsets = set({})
    for c in msg.get_charsets():
        if c is not None:
            charsets.update([c])
    return charsets
def handleerror(errmsg, emailmsg, cs):
    print()
    print(errmsg)
    print("This error occurred while decoding with ",cs," charset.")
    print("These charsets were found in the one email.",getcharsets(emailmsg))
    print("This is the subject:",emailmsg['subject'])
    print("This is the sender:",emailmsg['From'])
def getbodyfromemail(msg):
    body = 'no_text'
    body_html = 'no_html'
    #Walk through the parts of the email to find the text body.
    if msg.is_multipart():
        for part in msg.walk():
            # If part is multipart, walk through the subparts.
            if part.is_multipart():
                for subpart in part.walk():
                    if subpart.get_content_type() == 'text/plain':
                        # Get the subpart payload (i.e the message body)
                        body = subpart.get_payload(decode=True)
                        #charset = subpart.get_charset()
                    elif subpart.get_content_type() == 'html':
                        body_html = subpart.get_payload(decode=True)
                        #body_html = subpart.get_payload(decode=True)
            # Part isn't multipart so get the email body
            elif part.get_content_type() == 'text/plain':
                body = part.get_payload(decode=True)
                #charset = part.get_charset()
    # If this isn't a multi-part message then get the payload (i.e the message body)
    elif msg.get_content_type() == 'text/plain':
        body = msg.get_payload(decode=True)
    # No checking done to match the charset with the correct part.
    for charset in getcharsets(msg):
        try:
            body = body.decode(charset)
        except UnicodeDecodeError:
            handleerror("UnicodeDecodeError: encountered.",msg,charset)
        except AttributeError:
            handleerror("AttributeError: encountered" ,msg,charset)
    return body, body_html
mboxfile = 'Bandeja de entrada'
body = []
body_html = []
for thisemail in mailbox.mbox(mboxfile):
    body = body.append(getbodyfromemail(thisemail)[0])
    body_html = body_html.append(getbodyfromemail(thisemail)[1])
print(body_html)
But right now it gives me an error:
AttributeError: 'NoneType' object has no attribute 'append'
I expected the output of:
body = [string, string, string]
body_html = [html, html, html]
Your code works for me, except you should replace the list appends with the following:
for thisemail in mailbox.mbox(mboxfile):
    body.append(getbodyfromemail(thisemail)[0])
    body_html.append(getbodyfromemail(thisemail)[1])
print(body_html)
Python list append works in place, so it returns None. You could also replace the list appends with e.g.:
body = body + [getbodyfromemail(thisemail)[0]]
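A tiny illustration of why the original loop fails, using throwaway list names that are not from the question:

```python
items = []
result = items.append("first")   # append mutates items and returns None

print(result)   # None
print(items)    # ['first']

# Rebinding the name to the return value loses the list:
broken = []
broken = broken.append("x")
print(broken)   # None, so the next broken.append(...) raises AttributeError
```

That matches the reported error: after the first iteration the name refers to None, and None has no append method.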

HTTP response not being manipulated

I made a HTTP request with the following line.
h.request('POST', uri, body,headers={})
I am collecting the response in a variable.
res = h.getresponse()
after that I am trying to print the response by
print(res.msg)
print(res.status)
print(res.read())
After the three print statements, I try to store the res.read() output in a different variable, converted to a string, for further processing.
text=res.read().decode("utf-8")
But the decoded response is not stored in the variable. If I print text after print(res.read()), it gives me nothing:
print(res.read())
text=res.read().decode("utf-8")
print(text)
The code above prints only the output of the first print statement. If I remove the first statement and do the following:
text=res.read().decode("utf-8")
print(text)
it gives me the required output. But I want both of them to work. Is there a way to do that?
Calling res.read() consumes the body of the response, so the read inside the first print statement leaves nothing for a second call. The response is a stream, not a seekable buffer, so a second read returns an empty bytes object unless you re-do the request.
Store the first .read() in a variable, then print it.
Or you can try the following approach with the requests library:
from requests import request
res = request('POST', uri, data=body, headers={})
print(res.text)
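The "read once, reuse the variable" advice can be sketched as follows. An io.BytesIO object stands in for the response here, on the assumption that it behaves like the response body in the one respect that matters: once read to the end, it returns an empty bytes object.

```python
import io

# io.BytesIO stands in for the response object; like the response body,
# it is a stream that returns nothing once it has been read to the end.
res = io.BytesIO(b'{"ok": true}')

raw = res.read()            # read the stream exactly once
print(raw)                  # b'{"ok": true}'

text = raw.decode("utf-8")  # reuse the stored bytes
print(text)                 # {"ok": true}

print(res.read())           # b'' (nothing left to read)
```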

How to get the body text of email with imaplib?

I am using Python 3.4.
import imaplib
import email
user="XXXX"
password="YYYY"
con=imaplib.IMAP4_SSL('imap.gmail.com')
con.login(user,password)
con.list()
con.select("INBOX")
result,data=con.fetch(b'1', '(RFC822)')
raw=email.message_from_bytes(data[0][1])
>>> raw["From"]
'xxxx'
>>> raw["To"]
'python-list@python.org'
>>> raw["Subject"]
'Re:get the min date from a list'
When I run print(raw), it prints many lines of the email body,
but I can't get it with raw[TEXT], raw['TEXT'], or raw['BODY'].
How can I get the body text of the email?
You're asking it for a header named TEXT or BODY, and obviously there is no such thing. I think you're mixing up IMAP4 part names (the things you pass in con.fetch) and RFC2822 header names (the things you use in an email.message.Message).
As the email.message documentation explains, a Message consists of headers and a payload. The payload is either a string (for non-multipart messages) or a list of sub-Messages (for multipart). Either way, what you want here is raw.get_payload().
If you want to handle both, you can either first check raw.is_multipart(), or you can check the type returned from get_payload(). Of course, you have to decide what you want to do in the case of a multipart message; what counts as "the body" when there are three parts? Do you want the first? The first text/plain? The first text/*? The first text/plain if there is one, the first text/* if not, and the first of anything if even that doesn't exist? Or all of them concatenated together?
Let's assume you just want the first one. To do that:
def get_text(msg):
    if msg.is_multipart():
        return get_text(msg.get_payload(0))
    else:
        return msg.get_payload(None, True)
If you want something different, hopefully you can figure out how to do it yourself. (See the get_content_type and/or get_content_maintype methods on Message.)
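For instance, one of the alternatives listed above, "the first text/plain", might be sketched like this. The function name get_plain_text and the sample message are illustrative only, and falling back to UTF-8 when a part declares no charset is an assumption:

```python
from email.message import EmailMessage
from email import message_from_bytes


def get_plain_text(msg):
    """Return the first text/plain part as str, or None if there is none."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


# Build a small multipart message to try it on (made-up content).
sample = EmailMessage()
sample["Subject"] = "demo"
sample.set_content("plain body")
sample.add_alternative("<p>html body</p>", subtype="html")

parsed = message_from_bytes(sample.as_bytes())
print(get_plain_text(parsed))
```

Because walk() visits the message itself as well as every nested part, the same function also covers a non-multipart message whose only content is text/plain.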
A follow-up using Python 3.8: this parses all the parts that have an associated charset and joins them into a single HTML page.
import imaplib
import email
import tempfile
import webbrowser
def email_to_html(parsed):
    all_parts = []
    # walk() already visits nested subparts, so no extra recursion is needed
    for part in parsed.walk():
        if part.is_multipart():
            continue
        if encoding := part.get_content_charset():
            all_parts.append(part.get_payload(decode=True).decode(encoding))
    return ''.join(all_parts)
# Login
imap = imaplib.IMAP4_SSL("imap.gmail.com")
result = imap.login("username", "password")
# Select the inbox, grab only the unseen emails
status, resp = imap.select('INBOX')
status, response = imap.search(None, '(UNSEEN)')
unread_msg_nums = response[0].split()
email_bodies = []
for idx in unread_msg_nums:
    _, msg = imap.fetch(str(int(idx)), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            raw_email = response[1]
            parsed = email.message_from_bytes(raw_email)
            email_bodies.append(email_to_html(parsed))
# If you want to view/check the emails in your browser
def display(html):
    with tempfile.NamedTemporaryFile('w', delete=False, suffix='.html') as f:
        url = 'file://' + f.name
        f.write(html)
    webbrowser.open(url)
for body in email_bodies:
    display(body)

Resources