Python 3: UnicodeEncodeError: 'ascii' codec can't encode character '\xe4'

I am trying to send some emails with pandas from an Excel file.
I have had this error for over a week now, and even after hours of searching through SO, Google, forums and so on, I just can't come up with an answer that fixes the problem.
Here is the code:
import pandas as pd
import smtplib
your_name = "myname"
your_email = "mymail"
your_password = "mypw"
server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
server.ehlo()
server.login(your_email, your_password)
# Read the file
email_list = pd.read_excel("myfile.xlsx")
# Get all the Names, Email Addresses, Subjects and Messages
all_emails = email_list['Email']
all_messages = email_list['Text']
# Loop through the emails
for idx in range(len(all_emails)):
    # Get each record's name, email, subject and message
    email = all_emails[idx]
    message = all_messages[idx]
    # Create the email to send
    full_email = ("From: {0} <{1}>\n"
                  "To: <{2}>\n"
                  "Subject: My_Subject_Title\n\n"
                  "{3}"
                  .format(your_name, your_email, email, message))
    # In the email field, you can add multiple other emails if you want
    # all of them to receive the same text
    try:
        server.sendmail(your_email, [email], full_email)
        print('Email to {} successfully sent!\n\n'.format(email))
    except Exception as e:
        print('Email to {} could not be sent :( because {}\n\n'.format(email, str(e)))
server.close()
I am getting the error:
'ascii' codec can't encode character '\xe4'
So the error is evidently caused by some European letters (such as ä) inside my Excel file.
What I tried (among several other approaches) was to encode the file:
email_list = pd.read_excel("myfile.xlsx", encoding=("utf-8"))
>>> TypeError: read_excel() got an unexpected keyword argument 'encoding'
or:
email_list = pd.read_excel("myfile.xlsx")
email_list.encode("utf-8")
>>> AttributeError: 'DataFrame' object has no attribute 'encode'
None of it seems to work.
I'd be happy if someone could help me figure out what I'm doing wrong.
I'm very new to Python, and these are my first real attempts to solve some actual work-related problems.
Thanks a lot in advance!
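One possible way around this error, sketched under assumptions: when `sendmail` is given a `str`, smtplib encodes it as ASCII, which fails on characters such as 'ä'. Building the message with the standard library's `email.message.EmailMessage` lets the body be sent as UTF-8 instead. The helper name `build_email`, the addresses and the sample text below are placeholders for illustration, not part of the original script.

```python
from email.message import EmailMessage


def build_email(sender_name, sender_addr, recipient, subject, body):
    """Build a message whose body may contain non-ASCII characters."""
    msg = EmailMessage()
    msg["From"] = f"{sender_name} <{sender_addr}>"
    msg["To"] = recipient
    msg["Subject"] = subject
    # set_content() declares charset="utf-8" when the text is not pure ASCII
    msg.set_content(body)
    return msg


# Placeholder values; in the loop these would come from the Excel columns
msg = build_email("myname", "me@example.com", "you@example.com",
                  "My_Subject_Title", "Schöne Grüße, Jürgen")
raw = msg.as_bytes()  # bytes, so no implicit ASCII encoding of a str

# With a logged-in smtplib.SMTP_SSL instance named `server`, sending would be:
# server.send_message(msg)
```

Inside the original loop, `server.send_message(build_email(...))` would take the place of the hand-formatted string and the `server.sendmail(...)` call.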

Related

Python Facebookchat AttributeError 'str' object has no attribute '_to_send_data'

I was trying to send a message to my friend using Python.
But I am getting this error:
sent = client.send(friend.uid, msg)
File "/home/can/anaconda3/lib/python3.7/site-packages/fbchat/_client.py", line 1059, in send
data.update(message._to_send_data())
AttributeError: 'str' object has no attribute '_to_send_data'
I can log in to my account, and I can enter the number of friends and my friend's name.
Then, when I type my message, for example "hello", and press Enter, it gives me the error.
Program code:
import fbchat
from getpass import getpass
username = str(input("Username: "))
client = fbchat.Client(username, getpass())
no_of_friends = int(input("Number of friends: "))
for i in range(no_of_friends):
    name = str(input("Name: "))
    friends = client.searchForUsers(name)  # return a list of names
    friend = friends[0]
    msg = str(input("Message: "))
    sent = client.send(friend.uid, msg)
    if sent:
        print("Message sent successfully!")
How can I get to see the "Message sent successfully!" message? The modules are installed successfully.
I am working on Ubuntu 19.10.
I fixed the issue myself. I'm putting the answer here in case someone else hits the same error.
My mistake was here:
sent = client.send(friend.uid, msg)
The traceback shows that client.send() calls message._to_send_data(), so it expects a message object rather than a plain string. I changed the line to this:
sent = client.sendMessage(msg, thread_id=friend.uid)
and it worked!

imaplib with Python 3.7.4 occasionally returns an attachment that fails to be decoded

Some background:
imaplib with Python 3.7.4 occasionally returns a photo attachment (jpg) that fails to be decoded after being downloaded from the server. I've confirmed across multiple emails that the photos are base64-encoded when sent. Most photos work; however, certain ones don't, for whatever reason. At this time, I don't know which email client is used to send the particular email that causes the crash, or the source of the photo (phone, camera, PC, etc.). I've tested every file type supported by python-pillow without any issues, though. It's just this one photo/email. Lastly, if I remove the attachment there are no issues, so it has something to do with the photo. All Python packages are the current versions.
The commented lines in the code below show what I've tried, namely the following encodings:
utf-8 (which fails to decode it at all)
Traceback (most recent call last):
File "(file path)", line 514, in DoEmail
raw_email_string = raw_email.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 10922: invalid start byte
cp1252 (Which returns a NoneType object when trying to save the file.)
Traceback (most recent call last):
part.get_payload(decode=True))
TypeError: a bytes-like object is required, not 'NoneType'
I've looked at the documentation for email.parser Source and email.parser Docs and imaplib Docs. Also a good example by MattH and attachment example by John Paul Hayes.
My Question:
Why do certain photos, even though they seem to be encoded correctly, cause it to crash? And how do I fix it? Is there a better method to get and save the attachments?
Relevant Code:
# Site is the email server address
# Port is the email server port, usually 993.
mail = imaplib.IMAP4_SSL(host=Site, port=Port) # imaplib module implements connection based on IMAPv4 protocol
mail.login(Email, password)
mail.select('inbox', readonly=False) # Connected to inbox.
# SearchPhrase is the Phrase used when finding unique emails.
result, data = mail.uid('SEARCH', None, f'Subject "{SearchPhrase}"') # search and return uids instead
if result == 'OK':
    EmailIdList = data[0].split() # EmailIdList is a space separated byte string of the ids
    count = len(EmailIdList)
    for x in range(count):
        if GUI: GUI.resultStatus = resx.currentProgress(x+1, count)
        latest_email_uid = EmailIdList[x] # unique ids wrt label selected
        EmailID = latest_email_uid.decode('utf-8')
        result, email_data = mail.uid('fetch', latest_email_uid, '(RFC822)')
        if result == 'OK':
            raw_email = email_data[0][1]
            # try:
            #     raw_email_string = raw_email.decode('utf-8')
            # except:
            #     raw_email_string = raw_email.decode('cp1252')
            # email_message = email.message_from_string(raw_email)
            email_message = email.message_from_bytes(raw_email)
            print(email_message)
            dt = parse(email_message['Date']) #dateutil.parser.parse()
            day = str(dt.strftime("%B %d, %Y")) #date())
            msg.get_content_charset(), 'ignore').encode('utf8', 'replace')
            # this will loop through all the available multiparts in email
            for part in email_message.walk():
                charset = part.get_content_charset()
                if part.get_content_maintype() != 'multipart' and part.get('Content-Disposition') is not None:
                    fileName = part.get_filename().replace('\n','').replace('\r','')
                    if fileName != '' and fileName is not None:
                        print(fileName)
                        with open(fileName, 'wb') as f:
                            ######## ---- HERE ---- ##########
                            f.write(part.get_payload(decode=True))
                elif part.get_content_type() == "text/plain": # get only text/plain
                    body = str(part.get_payload(decode=True), str(charset), "ignore").replace('\r','')
                    print(body)
                elif part.get_content_type() == "text/html": # get only html
                    html = str(part.get_payload(decode=True), str(charset), "ignore").replace('\n', '').replace('\r', ' ')
                    print(html)
                else:
                    continue
Edit:
I believe these are the MIME Headers for the image in question.
------=_NextPart_000_14A6_01D55B4C.3FE8C840
Content-Type: image/jpeg;
name="8~a~0ff68d6a-12aa-49bf-9908-0b28ecd7ec83~634676194557918023.jpg"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="8~a~0ff68d6a-12aa-49bf-9908-0b28ecd7ec83~634676194557918023.jpg"
Edit: The location of the crash (where it decodes the base64 data to save the file) is marked by: ######## ---- HERE ---- ##########
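A defensive way to collect attachments, offered as a sketch rather than a confirmed fix for this particular photo: `get_payload(decode=True)` returns `None` for multipart containers (which includes an attached `message/rfc822` email), and `get_filename()` can also return `None`, so both are checked before anything is written. The function name `extract_attachments` and the demo message are made up for illustration.

```python
import email
from email.message import EmailMessage


def extract_attachments(raw_email: bytes) -> dict:
    """Return {filename: payload bytes} for every decodable attachment."""
    message = email.message_from_bytes(raw_email)
    attachments = {}
    for part in message.walk():
        if part.get_content_disposition() != "attachment":
            continue
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        # Skip parts without a filename or without a decodable payload
        # instead of passing None to f.write()
        if filename is None or payload is None:
            continue
        attachments[filename.replace("\r", "").replace("\n", "")] = payload
    return attachments


# Demo with a locally built message, so no IMAP server is needed
demo = EmailMessage()
demo["Subject"] = "photo"
demo.set_content("see attachment")
demo.add_attachment(b"\xff\xd8\xff\xe0fake-jpeg", maintype="image",
                    subtype="jpeg", filename="photo.jpg")
found = extract_attachments(demo.as_bytes())
```

Each entry in the returned dictionary can then be written with `open(name, 'wb')`. Logging the content type of any part that gets skipped would show which part of the problem email yields `None`.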

Python IMAP TypeError: 'Nonetype' Object Is Not Subscriptable

I am using IMAP to check for unread emails that match a specific subject. When I receive an email from my test address, it works fine, but when it comes from the automated system whose emails I actually need to check, I get an error stating that 'NoneType' object is not subscriptable. The following is my code:
import imaplib, time, email, mailbox, datetime
server = "imap.gmail.com"
port = 993
user = "Redacted"
password = "Redacted"
def main():
    while True:
        conn = imaplib.IMAP4_SSL(server, port)
        conn.login(user, password)
        conn.list()
        conn.select('inbox', readonly=True)
        result, data = conn.search(None, '(UNSEEN SUBJECT "Alert: Storage Almost At Max Capacity")')
        i = len(data[0].split())
        for x in range(i):
            latest_email_uid = data[0].split()[x]
            result, email_data = conn.uid('fetch', latest_email_uid, '(RFC822)')
            raw_email = email_data[0][1] #This is where it throws the error
            raw_email_string = raw_email.decode('utf-8')
            email_message = email.message_from_string(raw_email_string)
            date_tuple = email.utils.parsedate_tz(email_message['Date'])
            local_date = datetime.datetime.fromtimestamp(email.utils.mktime_tz(date_tuple))
            local_message_date = "%s" %(str(local_date.strftime("%a, %d %b %Y %H:%M:%S")))
            for part in email_message.walk():
                if part.get_content_type() == "text/plain":
                    body = part.get_payload(decode=True)
                    body = body.decode('utf-8')
                    body = body.split()
                    #Do some stuff
        conn.close()
if __name__ == "__main__":
    main()
And the following is the traceback:
Traceback (most recent call last):
File "TestEmail.py", line 200, in <module>
main()
File "TestEmail.py", line 168, in main
raw_email = email_data[0][1]
TypeError: 'NoneType' object is not subscriptable.
I don't understand why this would work in an email sent from a person's email, yet not work when my system emails me an alert. Is there any obvious fix to this?
EDIT: I've tried printing the result and email variables. The following was their output:
Result: OK
Email: [None]
Whereas if I test the script against an email with the same subject, but sent from my test email, result is still "OK", but an email is contained.
EDIT#2: I've noticed that the format of the emails is a little different. The one that is received fine is both text/plain and text/html, whereas the one that isn't accepted is text/plain with Content-Transfer-Encoding: 7-bit. How might I remedy this? If I forward the email through a filter and check the email received from the filter, my code works just fine. However, I would prefer not to use multiple email accounts for this.
You are searching for message sequence numbers, and fetching by uid.
If you are going to use conn.uid('fetch'), you must also use conn.uid('search'), otherwise you are searching for apples and fetching oranges.
Thus, since not all MSNs are UIDs, you are occasionally fetching non-existent messages, which is not an error, but it just won't return you anything.
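A minimal sketch of the consistent version, assuming `conn` is an already logged-in `imaplib.IMAP4_SSL` object with a mailbox selected: both the search and the fetch go through `conn.uid(...)`, and an empty fetch result is skipped instead of being subscripted. The function name `fetch_messages_by_uid` is made up for illustration.

```python
import email


def fetch_messages_by_uid(conn, criteria):
    """Search and fetch by UID so both commands refer to the same messages."""
    status, data = conn.uid('search', None, criteria)
    if status != 'OK':
        return []
    messages = []
    for uid in data[0].split():
        status, email_data = conn.uid('fetch', uid, '(RFC822)')
        # A fetch for a non-existent message still reports OK but yields
        # [None], so guard before indexing into the result
        if status != 'OK' or not email_data or email_data[0] is None:
            continue
        messages.append(email.message_from_bytes(email_data[0][1]))
    return messages


# Usage with a real connection would look like:
# msgs = fetch_messages_by_uid(
#     conn, '(UNSEEN SUBJECT "Alert: Storage Almost At Max Capacity")')
```

Using `message_from_bytes` also avoids having to guess the encoding of the raw message before parsing it.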

How to get the body text of email with imaplib?

I am using Python 3.4.
import imaplib
import email
user="XXXX"
password="YYYY"
con=imaplib.IMAP4_SSL('imap.gmail.com')
con.login(user,password)
con.list()
con.select("INBOX")
result,data=con.fetch(b'1', '(RFC822)')
raw=email.message_from_bytes(data[0][1])
>>> raw["From"]
'xxxx'
>>> raw["To"]
'python-list#python.org'
>>> raw["Subject"]
'Re:get the min date from a list'
When I run 'print(raw)' there are many lines containing the body of the email,
but I can't get it with raw[TEXT], raw['TEXT'] or raw['BODY'].
How can I get the body text of the email?
You're asking it for a header named TEXT or BODY, and obviously there is no such thing. I think you're mixing up IMAP4 part names (the things you pass in con.fetch) and RFC2822 header names (the things you use in an email.message.Message).
As the email.message documentation explains, a Message consists of headers and a payload. The payload is either a string (for non-multipart messages) or a list of sub-Messages (for multipart). Either way, what you want here is raw.get_payload().
If you want to handle both, you can either first check raw.is_multipart(), or you can check the type returned from get_payload(). Of course, you have to decide what you want to do in the case of a multipart message; what counts as "the body" when there are three parts? Do you want the first? The first text/plain? The first text/*? The first text/plain if there is one, the first text/* if not, and the first of anything if even that doesn't exist? Or all of them concatenated together?
Let's assume you just want the first one. To do that:
def get_text(msg):
    if msg.is_multipart():
        return get_text(msg.get_payload(0))
    else:
        return msg.get_payload(None, True)
If you want something different, hopefully you can figure out how to do it yourself. (See the get_content_type and/or get_content_maintype methods on Message.)
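As one illustration of the "first text/plain" option mentioned above, here is a hedged sketch that walks the message and returns the first text/plain part as a `str`. The helper name `get_plain_text` and the fallback to UTF-8 when no charset is declared are assumptions, not something the `email` package prescribes.

```python
import email
from email.message import EmailMessage


def get_plain_text(msg):
    """Return the first text/plain part decoded to str, or None."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes
            # Assumption: fall back to UTF-8 if the part declares no charset
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


# Demo with a locally built multipart/alternative message
demo = EmailMessage()
demo.set_content("Grüße aus Köln")
demo.add_alternative("<p>Grüße aus Köln</p>", subtype="html")
parsed = email.message_from_bytes(demo.as_bytes())
text = get_plain_text(parsed)
```

With the object from the question, the call would be `get_plain_text(raw)`.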
Following up using Python 3.8: this parses all the parts that have an associated charset and turns them into a single HTML page.
import imaplib
import email
import tempfile
import webbrowser
def email_to_html(parsed):
    all_parts = []
    # walk() already visits every nested subpart, so no manual recursion is needed
    for part in parsed.walk():
        if part.is_multipart():
            continue
        if encoding := part.get_content_charset():
            all_parts.append(part.get_payload(decode=True).decode(encoding))
    return ''.join(all_parts)
# Login
imap = imaplib.IMAP4_SSL("imap.gmail.com")
result = imap.login("username", "password")
# Select the inbox, grab only the unseen emails
status, resp = imap.select('INBOX')
status, response = imap.search(None, '(UNSEEN)')
unread_msg_nums = response[0].split()
email_bodies = []
for idx in unread_msg_nums:
    _, msg = imap.fetch(str(int(idx)), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            raw_email = response[1]
            parsed = email.message_from_bytes(raw_email)
            email_bodies.append(email_to_html(parsed))
# If you want to view/check the emails in your browser
def display(html):
    with tempfile.NamedTemporaryFile('w', delete=False, suffix='.html') as f:
        url = 'file://' + f.name
        f.write(html)
    webbrowser.open(url)
for body in email_bodies:
    display(body)

Tweepy Search API Writing to File Error

Noob Python user:
I've created a file that extracts 10 tweets using api.search (not the streaming API). I get the results on screen, but cannot figure out how to parse the output to save it to CSV. My error is TypeError: expected a character buffer object.
I have tried using .join(str(x)) and get other errors.
My code is:
import tweepy
import time
from tweepy import OAuthHandler
from tweepy import Cursor
#Consumer keys and access tokens, used for Twitter OAuth
consumer_key = ''
consumer_secret = ''
atoken = ''
asecret = ''
# The OAuth process that uses keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(atoken, asecret)
# Creates instance to execute requests to Twitter API
api = tweepy.API(auth)
MarSec = tweepy.Cursor(api.search, q='maritime security').items(10)
for tweet in MarSec:
    print " "
    print tweet.created_at, tweet.text, tweet.lang
    saveFile = open('MarSec.csv', 'a')
    saveFile.write(tweet)
    saveFile.write('\n')
    saveFile.close()
Any help would be appreciated. I've gotten my Streaming API to work, but am having difficulty with this one.
Thanks.
tweet is not a string or a character buffer. It's an object. Replace your line with saveFile.write(tweet.text) and you'll be good to go.
saveFile = open('MarSec.csv', 'a')
for tweet in MarSec:
    print " "
    print tweet.created_at, tweet.text, tweet.lang
    saveFile.write("%s %s %s\n"%(tweet.created_at, tweet.lang, tweet.text))
saveFile.close()
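Since the target is a CSV file, a variant using the standard `csv` module may be worth considering, because it quotes commas and line breaks inside the tweet text. This is a sketch in Python 3 syntax; the function name `write_tweets_csv` and the `FakeTweet` stand-in (used so the example runs without Twitter credentials) are made up, while the three attributes are the ones printed in the question.

```python
import csv
import io
from collections import namedtuple


def write_tweets_csv(tweets, out_file):
    """Write one CSV row per tweet, quoting commas and newlines in the text."""
    writer = csv.writer(out_file)
    writer.writerow(["created_at", "lang", "text"])
    for tweet in tweets:
        writer.writerow([tweet.created_at, tweet.lang, tweet.text])


# Stand-in for tweepy status objects, for illustration only
FakeTweet = namedtuple("FakeTweet", "created_at lang text")
buffer = io.StringIO()
write_tweets_csv(
    [FakeTweet("2015-01-01 12:00:00", "en", "Ports, ships and\nsecurity")],
    buffer,
)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

With real results, the call would be wrapped in `with open('MarSec.csv', 'a', newline='', encoding='utf-8') as f:` and passed `MarSec` as the iterable.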
I just thought I'd put up another version for those who might want to save all the attributes of a tweepy.models.Status object, if you're not yet sure which attributes of each tweet you want to save to file.
import json
search_results = []
for status in tweepy.Cursor(api.search, q=search_text).items(5000):
    search_results.append(status._json)
with open('search_results.json', 'w') as f:
    json.dump(search_results, f)
The first block stores the search results in a list of dictionaries, and the second block writes all the tweets to a JSON file.
Beware that this can use a lot of memory if your search returns a very large number of results.
This is Twitter's typical error code when something goes wrong while uploading a bad image.
Find the images you are trying to upload and check their format.
The only thing I did was delete the images that my Windows media player couldn't read, and that was all: the script ran perfectly.

Resources