beautifulsoup 4.4 on python 3.5 - python-3.x

I am having issues on a new PC and cannot find an answer anywhere. I am trying to get beautifulsoup 4.4 to work on Python 3.5. I am using PyCharm. I have read that they may not be compatible, but I have the same setup on my laptop and it works perfectly; the only difference is that the PC I am trying to get BS working on runs Windows 7, whereas my laptop runs Windows 8.
When I go into settings and look at Project Interpreter I do see BS 4.4.1 but when I try and run something I am getting this error:
Traceback (most recent call last):
File "C:/Users/PP/PycharmProjects/Shark/NOCO.py", line 3, in <module>
from bs4 import BeautifulSoup
File "C:\Users\PP\AppData\Local\Programs\Python\Python35-32\lib\site-packages\bs4\__init__.py", line 29, in <module>
from .builder import builder_registry
File "C:\Users\PP\AppData\Local\Programs\Python\Python35-32\lib\site-packages\bs4\builder\__init__.py", line 294, in <module>
from . import _htmlparser
File "C:\Users\PP\AppData\Local\Programs\Python\Python35-32\lib\site-packages\bs4\builder\_htmlparser.py", line 7, in <module>
from html.parser import (
ImportError: cannot import name 'HTMLParseError'

You are not running BeautifulSoup 4.4.1; your traceback shows you have an older version.
In 4.4.1, that section looks like this:
try:
    from html.parser import HTMLParseError
except ImportError as e:
    # HTMLParseError is removed in Python 3.5. Since it can never be
    # thrown in 3.5, we can just define our own class as a placeholder.
    class HTMLParseError(Exception):
        pass
Line 7 is try:. This change was made in 4.4.0, so you have 4.3.2 or older installed instead.
Upgrade your installed package. PyCharm can do this, or you can use pip:
python3 -m pip install -U beautifulsoup4
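To confirm which copy of a package a given interpreter would actually pick up, something like the following may help. It is only a sketch: describe_install is a hypothetical helper name, and importlib.metadata needs Python 3.8 or newer, so on an older interpreter such as the 3.5 in the question only the location part of the idea applies. Run it with the same interpreter PyCharm is configured to use.

```python
import importlib.util
import sys
from importlib import metadata  # assumption: Python 3.8+


def describe_install(module_name, dist_name=None):
    """Report where a module would be imported from and which version pip sees.

    describe_install is a hypothetical helper, not part of any library.
    """
    # find_spec locates a top-level module without importing it,
    # so it works even when the import itself would fail.
    spec = importlib.util.find_spec(module_name)
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
        "version": version,
    }


# The import name is bs4, but the distribution is called beautifulsoup4.
print(describe_install("bs4", "beautifulsoup4"))
```

If the reported version is below 4.4.0, or the location is not the site-packages directory you expected, that would be consistent with the traceback above.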

Related

Cannot import installed python package (MacOS)

I cannot import any Python package when running Python from Visual Studio Code or from my terminal, although imports still work in a Jupyter notebook. When I try any other environment that doesn't use the notebook server, I get a ModuleNotFoundError like this:
Traceback (most recent call last):
File "/Users/truongminh/Desktop/DataScience/Datasets/CityU Text/test_PreprocessText.py", line 1, in <module>
import PreprocessText
File "/Users/truongminh/Desktop/DataScience/Datasets/CityU Text/PreprocessText.py", line 1, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Most of my packages were downloaded via Anaconda. I don't know if that might be the cause.

Having problems installing python-docx

>>> import docx
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\users\kevin\mu_code\docx\__init__.py", line 3, in <module>
from docx.api import Document # noqa
File "c:\users\kevin\mu_code\docx\api.py", line 14, in <module>
from docx.package import Package
File "c:\users\kevin\mu_code\docx\package.py", line 9, in <module>
from docx.opc.package import OpcPackage
File "c:\users\kevin\mu_code\docx\opc\package.py", line 9, in <module>
from docx.opc.part import PartFactory
File "c:\users\kevin\mu_code\docx\opc\part.py", line 12, in <module>
from .oxml import serialize_part_xml
File "c:\users\kevin\mu_code\docx\opc\oxml.py", line 12, in <module>
from lxml import etree
ImportError: cannot import name 'etree'
I have python-docx 0.8.10 and lxml 4.5.0, windows 10. I tried googling already but I'm not sure if I followed the suggestions correctly or if it's applicable in my case (lxml problems). I haven't had any problems installing other modules using "pip install" so I'm stuck and don't know how to proceed from here.
Check this,
Use pip install to install the docx library. If you have already installed it successfully, then look into its dependencies; I think the error comes from an incompatibility with one of them.
pip install python-docx
Dependencies
Python 2.6, 2.7, 3.3, or 3.4
lxml >= 2.3.2
I don't know whether this is an appropriate solution for you, but this is what I generally do. Just install Anaconda on your system and create an environment according to your needs. In your case, create an environment for Python 3.4 using the following command:
conda create --name py34 python=3.4
You then install libraries according to your needs in the respective environment. Now you can work in each environment without interfering with the libraries of the other environments. To use Anaconda, kindly follow the Anaconda cheatsheet.
Kindly refer to the link. Hope this helps you.
This is almost certainly a problem with the lxml install. python-docx works with all versions of Python >= 2.6.
Instead of import docx, try from lxml import etree. If this produces the same error message, you know you've narrowed it down.
lxml depends on a couple of C libraries, libxml2 and libxslt if I remember correctly. These are sometimes tricky to install. In any case, you'll find solutions to those problems by searching on "lxml install windows" or similar.
Once from lxml import etree works without error I think you'll find import docx does too.
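The narrowing-down step above can be sketched as a small script. try_import is a hypothetical helper, not something from python-docx or lxml; it just reports which import in the chain is the first to fail.

```python
import importlib


def try_import(dotted_name):
    """Return (True, module_file) if the import works, else (False, error text).

    try_import is a hypothetical helper used only for this illustration.
    """
    try:
        module = importlib.import_module(dotted_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it lands here too.
        return False, "{}: {}".format(type(exc).__name__, exc)
    return True, getattr(module, "__file__", None)


# Check the dependency first, then the package that needs it.
for name in ("lxml", "lxml.etree", "docx"):
    print(name, try_import(name))
```

If lxml imports but lxml.etree does not, the problem is in the lxml install rather than in python-docx. The printed file path also shows which copy of each package is being picked up.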

How can I import bs4 from Beautiful Soup in Python3 on OSX 10.12.5?

I'm trying to fix what seems like a common problem with importing a module in Python 3. I'm running OS X 10.12.5 and have Python 3 installed on my MacBook Air and am using Sublime Text to edit and run my code.
When I try this import:
from bs4 import BeautifulSoup
...I get this error:
Traceback (most recent call last):
File "/Users/<myname>/Python/code-python3/Pgm#001", line 5, in <module>
from bs4 import BeautifulSoup
ImportError: No module named 'bs4'
I successfully installed it with pip, and on every re-install I see this:
$ pip install beautifulsoup4
Requirement already satisfied: beautifulsoup4 in ./anaconda/lib/python3.5/site-packages
I've tried qualifying the location with things like:
from ./anaconda/lib/python3.5/site-packages/bs4 import BeautifulSoup
...but invariably get a variety of syntax errors; usually on the first '/'.
I am not using a virtual environment but do plan to read up on that approach, as these kinds of configuration and setup errors are big time wasters.
When I try to run this right in python3 I get slightly different errors:
>>> from bs4 import BeautifulSoup
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/<myname>/anaconda/lib/python3.5/site-packages/bs4/__init__.py", line 30, in <module>
from .builder import builder_registry, ParserRejectedMarkup
File "/Users/<myname>/anaconda/lib/python3.5/site-packages/bs4/builder/__init__.py", line 314, in <module>
from . import _html5lib
File "/Users/<myname>/anaconda/lib/python3.5/site-packages/bs4/builder/_html5lib.py", line 70, in <module>
class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: module 'html5lib.treebuilders' has no attribute '_base'
Any tips about where the obvious answer exists that I am still missing would be greatly appreciated. I've seen a lot of things about PATH and PYTHONPATH in similar questions but had no success with any of these solutions either.
Yes, I'm not a fan of Anaconda, but it looks like the path is pointing to:
"/Users/<myname>/anaconda/lib/python3.5/site-packages"
... and for the sake of keeping things as clean as possible (this is a good example of why you'll see people say to use a virtual env for your Python projects), let's google here!
Here's how to remove Anaconda:
conda install anaconda-clean #first the configs
anaconda-clean --yes
rm -rf ~/anaconda* #then conda itself; the "*" is a shell glob (not a regex) that covers any version suffix
Then check your bash_profile and/or bashrc; this is where you can set your Python path. Make sure it is not pointing to anaconda.
If you run
ls -lh `which python`  # or python3
you'll get the path of the interpreter that is actually being used, which is the one to set as your Python path.
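A quick way to see the same thing from inside Python is to print the interpreter and any search-path entries that mention anaconda. This is only a sketch; entries_containing is a hypothetical helper name. Run it once from Sublime Text and once from the terminal to compare.

```python
import sys


def entries_containing(fragment, paths=None):
    """Return the import search-path entries that contain the given text.

    entries_containing is a hypothetical helper; it defaults to sys.path.
    """
    paths = sys.path if paths is None else paths
    return [p for p in paths if fragment in p]


print("interpreter:", sys.executable)
print("anaconda entries:", entries_containing("anaconda"))
```

If the two runs print different interpreters, pip may have installed the package for one of them while the editor is running the other.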

import matplotlib.pyplot failing: _tkagg.pyd not found but IS in the directory of calling module

Python 3.4
Windows 8.1
Installed modules:
matplotlib 1.3.1 for py 3.4
numpy-MKL 1.9.0b1 for py 3.4
dateutil 2.2 for py 3.4
six 1.7.3 for py 3.4
tcl
tkinter
Also msvcp71.dll is in C:\Windows\System32 (installation docs said it needed to be)
Upon running:
import matplotlib.pyplot as plt
I get the following error message:
Traceback (most recent call last):
File "<pyshell#284>", line 1, in <module>
import matplotlib.pyplot as plt
File "D:\Downloaded Programs\Python\lib\site-packages\matplotlib\pyplot.py", line 98, in <module>
_backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
File "D:\Downloaded Programs\Python\lib\site-packages\matplotlib\backends\__init__.py", line 28, in pylab_setup
globals(),locals(),[backend_name],0)
File "D:\Downloaded Programs\Python\lib\site-packages\matplotlib\backends\backend_tkagg.py", line 11, in <module>
import matplotlib.backends.tkagg as tkagg
File "D:\Downloaded Programs\Python\lib\site-packages\matplotlib\backends\tkagg.py", line 2, in <module>
from matplotlib.backends import _tkagg
ImportError: DLL load failed: The specified module could not be found.
Point being: Python\lib\site-packages\matplotlib\backends\tkagg.py is trying to
execute
from matplotlib.backends import _tkagg
but failing to do so. However _tkagg.pyd file does exist in the directory
Python\lib\site-packages\matplotlib\backends
Why is this not working then?
I know this is an old thread, but I just ran into the same problem and I found a solution, so I decided to answer it.
The matplotlib installation documentation says:
For Python 3.5 the Visual C++ Redistributable for Visual Studio 2015 needs to be installed.
I installed its 64-bit version from the Microsoft website (as I use 64-bit Python 3.6.3), and now the import works fine.
I hope it helps anyone that may face the same issue in the future.
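Since the redistributable has to match the interpreter (64-bit in my case), it can help to check what you are actually running before downloading anything. A minimal sketch, with python_bitness being a hypothetical helper name:

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


print(sys.version)
print("{}-bit Python".format(python_bitness()))
```

This only reports the interpreter's architecture; it does not check whether the redistributable itself is installed.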

from pandas import * -- Python issue

I am trying to run the following code in python 3.3
from pandas import *
and I receive the following error:
Traceback (most recent call last):
File "C:\Users\Tom\Desktop\ProgrammingStuff\Python\FXCointegrationBacktesting.py", line 9, in <module>
cannot import name text_type
from pandas import *
File "C:\Python33\lib\site-packages\pandas\__init__.py", line 6, in <module>
from . import hashtable, tslib, lib
File "tslib.pyx", line 31, in init pandas.tslib (pandas\tslib.c:48782)
File "C:\Python33\lib\site-packages\dateutil\parser.py", line 24, in <module>
from six import text_type, binary_type, integer_types
ImportError: cannot import name text_type
Not sure what the problem is, I am fairly new to python and I cannot currently find any solutions to this problem on stack overflow.
Thanks!
You're using a version of dateutil that depends on six.
In dateutil <= 1.5 you don't need six, but those versions are not compatible with Python >= 3.0. So, the solution is to install six. However you do that is up to you.
You could do
pip install six
If you choose not to use pip, how you install it will depend on your system's package manager.
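To check afterwards whether six provides the names dateutil is asking for, something like this may be useful. It is a sketch only, and missing_names is a hypothetical helper rather than anything from six or dateutil.

```python
import importlib


def missing_names(module_name, names):
    """Return the names the module does not define, or None if it cannot be imported.

    missing_names is a hypothetical helper used only for this illustration.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [name for name in names if not hasattr(module, name)]


# These are the three names from the failing line in dateutil/parser.py.
print(missing_names("six", ["text_type", "binary_type", "integer_types"]))
```

None would mean six is not importable at all, while an empty list would mean all three names are available. A non-empty list might point to a different file named six being imported instead of the one pip installed.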
