Developing a module and using it in Spyder - python-3.x

I'm trying to develop a Python module, which I then want to use in Spyder.
Here is how my files are organized in my module:
testing_the_module.py
myModule
-> __init__.py
-> sql_querying.py  # contains a function called sql()
testing_the_module.py contains:
import myModule
print(myModule.sql_querying.sql(query = "show tables")) # how this function works is not relevant
__init__.py contains:
import myModule.sql_querying
When I use the command line, it works:
> python3 .\testing_the_module.py
[{
'query': 'show tables',
'result': ['table1', 'table2']
}]
It also works if I use the Python console:
> python3
Python 3.6.1 |Anaconda 4.4.0 (64-bit)| (default, May 11 2017, 13:25:24) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import myModule
>>> print(myModule.sql_querying.sql(query = "show tables"))
[{
'query': 'show tables',
'result': ['table1', 'table2']
}]
However, when using Spyder, I can't get it to work. Here is what I get when I run (with F9) each of those lines:
import myModule
# no error message
print(myModule.sql_querying.sql(query = "show tables"))
AttributeError: module 'myModule' has no attribute 'sql_querying'
Any idea why, and how to make it work in Spyder?
Edit to answer a comment:
In [665]: sys.path
Out[665]:
['',
'C:\\ProgramData\\Anaconda3\\python36.zip',
'C:\\ProgramData\\Anaconda3\\DLLs',
'C:\\ProgramData\\Anaconda3\\lib',
'C:\\ProgramData\\Anaconda3',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\Sphinx-1.5.6-py3.6.egg',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\win32',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\win32\\lib',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\Pythonwin',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\setuptools-27.2.0-py3.6.egg',
'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
'C:\\Users\\fmalaussena\\.ipython']
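One thing worth checking (a guess; the thread itself doesn't confirm the cause): the Spyder console may be importing a different or stale copy of myModule than the command line does, for example because its working directory differs or because an earlier, empty version of the package is cached in the session. A minimal diagnostic sketch:

import os
print(os.getcwd())        # is this the folder that contains myModule?

import myModule
print(myModule.__file__)  # is this the __init__.py you expect?

# if a stale copy is cached in this session, force a fresh import
import importlib
importlib.reload(myModule)

Independently of Spyder, writing __init__.py with an explicit relative import (from . import sql_querying) is a more robust way to expose the submodule than import myModule.sql_querying, since it does not depend on how myModule itself was found.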

Related

sqlite3.OperationalError('near "(": syntax error') in Google Colab

Observing some odd behavior with SQLite 2.6, where ROW_NUMBER() throws an error only in Google Colab (Python 3.6.9), whereas the code works fine in my local Python 3.6.9 and Python 3.9.1 instances. Can you help me debug this further?
Code
import sqlite3, sys

try:
    print('Py.version : ' + (sys.version))
    print('sqlite3.version : ' + (sqlite3.version))
    print('sqlite3.sqlite_version : ' + (sqlite3.sqlite_version) + '\n')
    conn = sqlite3.connect(':memory:')
    conn.execute('''CREATE TABLE team_data(team text, total_goals integer);''')
    conn.commit()
    conn.execute("INSERT INTO team_data VALUES('Real Madrid', 53);")
    conn.execute("INSERT INTO team_data VALUES('Barcelona', 47);")
    conn.commit()
    sql = '''
    SELECT
        team,
        ROW_NUMBER() OVER (
            ORDER BY total_goals
        ) RowNum
    FROM
        team_data
    '''
    print('### DB Output ###')
    cursor = conn.execute(sql)
    for row in cursor:
        print(row)
except Exception as e:
    print('ERROR : ' + str(e))
finally:
    conn.close()
Output
Google Colab (ROW_NUMBER() causes SQL to fail):
Py.version : 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
sqlite3.version : 2.6.0
sqlite3.sqlite_version : 3.22.0
### DB Output ###
ERROR : near "(": syntax error
Local Python 3.6.9 (Succeeds):
Py.version : 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 14:00:49) [MSC v.1915 64 bit (AMD64)]
sqlite3.version : 2.6.0
sqlite3.sqlite_version : 3.33.0
### DB Output ###
('Barcelona', 1)
('Real Madrid', 2)
Local Python 3.9.1 (Succeeds):
Py.version : 3.9.1 (default, Dec 11 2020, 09:29:25) [MSC v.1916 64 bit (AMD64)]
sqlite3.version : 2.6.0
sqlite3.sqlite_version : 3.33.0
### DB Output ###
('Barcelona', 1)
('Real Madrid', 2)
Note: the above SQL and code are simplified for error-reproduction purposes only.
The query in question uses a window function, and support for those was added in SQLite 3.25. You can check the library version (as opposed to the package version) with sqlite3.sqlite_version or, as @forpas shared, with the query select sqlite_version().
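For instance, a quick sketch that prints both the package version and the underlying library version:

import sqlite3
print(sqlite3.version)         # version of the sqlite3 DB-API module (package)
print(sqlite3.sqlite_version)  # version of the bundled SQLite C library
conn = sqlite3.connect(':memory:')
print(conn.execute('select sqlite_version()').fetchone()[0])  # same, via SQL
conn.close()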
You can upgrade the SQLite version in Colab with this code:
!add-apt-repository -y ppa:sergey-dryabzhinsky/packages
!apt update
!apt install sqlite3
# MENU: Runtime > Restart runtime
import sqlite3
sqlite3.sqlite_version # '3.33.0'

split username & password from URL in 3.8+ (splituser is deprecated, no alternative)

I'm trying to filter out the user-password from a URL.
(I could've split it manually by the last '@' sign, but I'd rather use a parser.)
Python gives a deprecation warning, but urlparse() doesn't seem to handle the user/password.
Should I just trust the last-'@'-sign, or is there a new version of splituser?
Python 3.8.2 (default, Jul 16 2020, 14:00:26)
[GCC 9.3.0] on linux
>>> url="http://usr:pswd@www.site.com/path&var=val"
>>> import urllib.parse
>>> urllib.parse.splituser(url)
<stdin>:1: DeprecationWarning: urllib.parse.splituser() is deprecated as of 3.8, use urllib.parse.urlparse() instead
('http://usr:pswd', 'www.site.com/path&var=val')
>>> urllib.parse.urlparse(url)
ParseResult(scheme='http', netloc='usr:pswd@www.site.com', path='/path&var=val', params='', query='', fragment='')
>>> # neither with allow_fragments:
>>> urllib.parse.urlparse(url, allow_fragments=True)
ParseResult(scheme='http', netloc='us:passw@ktovet.com', path='/all', params='', query='var=val', fragment='')
(Edit: the repr() output is partial & misleading; see my answer.)
It's all there, clear and accessible.
What went wrong: the repr() here is misleading, showing only a few properties/values (why? that's another question).
The result is available via explicit property access:
>>> url = 'http://usr:pswd@www.sharat.uk:8082/nativ/page?vari=valu'
>>> p = urllib.parse.urlparse(url)
>>> p.port
8082
>>> p.hostname
'www.sharat.uk'
>>> p.password
'pswd'
>>> p.username
'usr'
>>> p.path
'/nativ/page'
>>> p.query
'vari=valu'
>>> p.scheme
'http'
Or as a one-liner (I just needed the domain):
>>> urllib.parse.urlparse('http://usr:pswd@www.sharat.uk:8082/nativ/page?vari=valu').hostname
'www.sharat.uk'
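And since the original goal was stripping the credentials, here is a short sketch (my addition, not from the thread) that rebuilds the URL without them from those same properties:

from urllib.parse import urlparse, urlunparse

url = 'http://usr:pswd@www.sharat.uk:8082/nativ/page?vari=valu'
p = urlparse(url)
# rebuild netloc from hostname plus optional port, dropping user:password
netloc = p.hostname + (':%d' % p.port if p.port else '')
print(urlunparse((p.scheme, netloc, p.path, p.params, p.query, p.fragment)))
# http://www.sharat.uk:8082/nativ/page?vari=valu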
Looking at the source code for splituser, it looks like they simply use str.rpartition:
def splituser(host):
    warnings.warn("urllib.parse.splituser() is deprecated as of 3.8, "
                  "use urllib.parse.urlparse() instead",
                  DeprecationWarning, stacklevel=2)
    return _splituser(host)

def _splituser(host):
    """splituser('user[:passwd]@host[:port]') --> 'user[:passwd]', 'host[:port]'."""
    user, delim, host = host.rpartition('@')
    return (user if delim else None), host
which, yes, relies on the last occurrence of '@'.
EDIT: urlparse still has all these fields, see Berry's answer

.strftime doesn't apply zero padding on '%Y' in python:3.7-slim Docker image

I found a strange quirk in the slim version of the python Docker image with regards to date formatting. If you pass it a date with a three-digit year, %Y-%m-%d formatting doesn't yield a zero-padded year part:
$ docker run -ti python:3.7-slim /bin/bash
root@71f21d562837:/# python
Python 3.7.5 (default, Nov 23 2019, 06:10:46)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import date
>>> d = date(197,1,1)
>>> d.strftime('%Y-%m-%d')
'197-01-01'
But running this on the same Python version locally on my MacBook does yield four digits for the year:
$ python
Python 3.7.5 (default, Nov 1 2019, 02:16:32)
[Clang 11.0.0 (clang-1100.0.33.8)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import date
>>> d = date(197,1,1)
>>> d.strftime('%Y-%m-%d')
'0197-01-01'
The Python docs show %Y as zero-padded in their examples (0001, 0002, …, 2013), but they also note that Python calls the platform C library's strftime(); glibc (used in the Docker image) does not zero-pad %Y for years below 1000, while macOS's libc does, which matches the outputs above.
The same quirk exists in version 3.6-slim.
The problem with this is that some systems (like BigQuery) require the zero padding.
What would be the most elegant/least hacky workaround for this? I'm building a custom image derived from python:3.7-slim. I'm open to using a different image with a small footprint, or making an elegant code change.
You can always use a manual workaround to get identical formatting on all platforms:
from datetime import date

d = date(197, 1, 1)
dstr = d.strftime('%Y-%m-%d')
# pad the year part by hand if the platform produced only three digits
dstr = ('0' + dstr if len(dstr.split('-')[0]) == 3 else dstr)
print(dstr)
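An alternative sketch (my suggestion, not from the thread) that sidesteps strftime entirely, so the padding is the same on every platform:

from datetime import date

d = date(197, 1, 1)
# isoformat() is implemented by Python itself, independent of the C
# library's strftime, so the year is always four digits
print(d.isoformat())                              # 0197-01-01
# the same result with explicit field formatting
print(f'{d.year:04d}-{d.month:02d}-{d.day:02d}')  # 0197-01-01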

How to use the dir() function to see inside the scrapy module

From the documentation:
Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.
So I try to see inside the scrapy module:
import scrapy — it's a module, right, or am I wrong?
>>> dir(scrapy)
NameError: name 'scrapy' is not defined
I'm a complete newb in Python and am just trying to understand how it works.
How can I see inside modules the way the documentation examples do, e.g.:
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__loader__', '__name__',
'__package__', '__stderr__', '__stdin__', '__stdout__',
'_clear_type_cache', '_current_frames', '_debugmallocstats', '_getframe',
'_home', '_mercurial', '_xoptions', 'abiflags', 'api_version', 'argv',
'base_exec_prefix', 'base_prefix', 'builtin_module_names', 'byteorder',
'call_tracing', 'callstats', 'copyright', 'displayhook',
'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix',
'executable', 'exit', 'flags', 'float_info', 'float_repr_style',
'getcheckinterval', 'getdefaultencoding', 'getdlopenflags',
'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit',
'getrefcount', 'getsizeof', 'getswitchinterval', 'gettotalrefcount',
'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info',
'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1',
'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit',
'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout',
'thread_info', 'version', 'version_info', 'warnoptions']
Try this from your Python interpreter:
In [1]: import scrapy
In [2]: dir(scrapy)
Out[2]:
['Field',
'FormRequest',
'Item',
'Request',
'Selector',
'Spider',
'__all__',
'__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__version__',
'_txv',
'exceptions',
'http',
'item',
'link',
'selector',
'signals',
'spiders',
'twisted_version',
'utils',
'version_info']
This worked for me in both Python 2 and 3. I have also confirmed that it works in both IPython and the standard interpreter. If it does not work for you even with the import, your environment may have gotten messed up in some way, and we can troubleshoot further.
import scrapy — it's a module, right, or am I wrong?
In this case scrapy is a module, and import scrapy is the syntax for making that module available in whatever context you are invoking the import from. This section of the Python tutorial has information on modules and importing them.
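Beyond dir(), a couple of related standard-library introspection helpers (a sketch, not specific to scrapy):

import inspect
import scrapy

print(inspect.getfile(scrapy))  # path to the package's __init__.py on disk
print([name for name in dir(scrapy) if not name.startswith('_')])  # public names only
help(scrapy)                    # rendered docstring plus package contents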

unicode and str are the same in python3? (jupyter notebook)

According to the official documentation, all str in Python 3 are Unicode, and there is actually no 'unicode' type in Python 3.
But something strange happens when I run this in a Jupyter notebook:
time = re.findall(r'(\d+/\d+/\d+)', rating_bar.find('span', class_='rating-qualifier').text)[0].split('/')
di['date'] = '/'.join([str.zfill(t,2) for t in time[:2]] + time[2:] )
where rating_bar is a Beautiful Soup node, and the Jupyter notebook gives this error:
<ipython-input-9-a5ac4904b840> in parse_page(html)
25 class_='rating-qualifier').text)[
26 0].split('/')
---> 27 di['date'] = '/'.join([str.zfill(t,2) for t in time[:2]] + time[2:] )
28 di['rating'] = float(
29 rating_bar.find('div', class_='i-stars')['title'].split()[0])
TypeError: descriptor 'zfill' requires a 'str' object but received a 'unicode'
It's weird because there is no 'unicode' type in Python 3. And this code actually runs correctly in my terminal.
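A likely explanation (my guess; the thread doesn't confirm it): the notebook kernel is running Python 2, where str and unicode are distinct types, while the terminal runs Python 3. A quick sketch to check, plus a spelling of zfill that works on both:

import sys
print(sys.version)  # confirm which interpreter the notebook kernel runs

# calling zfill on the instance instead of through the str type works for
# both Python 2 unicode strings and Python 3 str
time = [u'1', u'2', u'2017']
print('/'.join([t.zfill(2) for t in time[:2]] + time[2:]))  # 01/02/2017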
