module 'seaborn' has no attribute 'distplot' - python-3.x

I have some code like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_csv('StudentsPerformance.csv')
# print(data.isnull().sum())  # check whether there are any missing values
# print(data.dtypes)  # check the datatypes of the dataset
# Analysis of the columns' values
"""print(data['gender'].value_counts())
print(data['parental level of education'].value_counts())
print(data['race/ethnicity'].value_counts())
print(data['lunch'].value_counts())
print(data['test preparation course'].value_counts())"""
# Adding column total and average to the dataset
data['total'] = data['math score'] + data['reading score'] + data['writing score']
data['average'] = data['total'] / 3
sns.distplot(data['average'])
I would like to see a distplot of the average for visualization, but when I run the program I get this error:
Traceback (most recent call last):
  File "C:/Users/usersample/PycharmProjects/untitled1/sample.py", line 22, in <module>
    sns.distplot(data['average'])
AttributeError: module 'seaborn' has no attribute 'distplot'
I've tried reinstalling seaborn and upgrading it to 0.9.0, but that doesn't help.
Head of my data:
female,"group B","bachelor's degree","standard","none","72","72","74"
female,"group C","some college","standard","completed","69","90","88"
female,"group B","master's degree","standard","none","90","95","93"
male,"group A","associate's degree","free/reduced","none","47","57","44"

This might be because paths were removed from your environment variables. Try adding your IDE's scripts folder and your Python folder to them. I am using the PyCharm IDE, did the same, and it is working fine.
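One way to narrow this down is to check which file Python actually imported as seaborn and which version it reports. A stray local file named seaborn.py, or a second interpreter with a different install, could produce the same AttributeError. The sketch below is a hedged diagnostic, not a confirmed fix: the helper name describe_module is hypothetical, and it is demonstrated on the standard-library json module so that it runs anywhere.

```python
import importlib


def describe_module(name, attr):
    """Return where a module was loaded from and whether it exposes `attr`."""
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),        # path of the imported file
        "version": getattr(module, "__version__", None),  # None if the module defines no version
        "has_attr": hasattr(module, attr),
    }


# Demonstrated on a standard-library module so the sketch runs without seaborn.
info = describe_module("json", "dumps")
print(info)

# For the question above, the equivalent call would be:
# describe_module("seaborn", "distplot")
```

If the reported file is not inside site-packages, renaming that file is worth trying before reinstalling anything.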

Related

AttributeError : module 'word2number' has no attribute 'word_to_num'

The dataset I'm working with contains numbers written out as words, so I want to convert them to numeric values to feed them into a multivariate model.
!pip install word2number
import pandas as pd
import math
from sklearn import linear_model
import word2number as w2n
print("successfully imported all the libraries")
df = pd.read_csv('hiring.csv')
df
print(w2n.word_to_num('one'))
This is my code and the error I'm getting is
AttributeError Traceback (most recent call last)
c:\Users\tanus\Desktop\Machine Learning\Regression\Multivariate Regression\Multivariate_Regression.ipynb Cell 2 in <cell line: 4>()
1 df = pd.read_csv('hiring.csv')
2 df
----> 4 print(w2n.word_to_num('one'))
AttributeError: module 'word2number' has no attribute 'word_to_num'
You have to import the w2n module from word2number:
from word2number import w2n
print(w2n.word_to_num('two point three'))
I assume you are calling word_to_num directly on the package.
Please check the import statement.
The error occurs if you use the import below:
import word2number as w2n
Hope this helps.
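The difference between the two imports can be reproduced without installing anything. The sketch below builds a throwaway package with the same shape the answers describe: a package whose function lives in a submodule named w2n. The package name demo_pkg and the tiny lookup table are hypothetical, and the assumption that word2number's top-level package does not re-export word_to_num is inferred from the error message, not checked against its source.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: an empty __init__.py plus a submodule holding the function.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"            # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "w2n.py").write_text(
    "def word_to_num(text):\n"
    "    return {'one': 1, 'two': 2}[text]\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the package alone does not expose functions defined in its submodule.
import demo_pkg as pkg
package_has_function = hasattr(pkg, "word_to_num")

# Importing the submodule gives access to the function.
from demo_pkg import w2n
converted = w2n.word_to_num("one")
print(package_has_function, converted)
```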

ModuleNotFoundError: No module named 'pandas.lib'

from ggplot import mtcars
While importing the mtcars dataset from ggplot in a Jupyter notebook, I got this error.
My system is Windows 10, and I've already reinstalled and upgraded pandas (also using --user in the installation command), but that didn't work either. Is there any other way to get rid of this error?
\Anaconda3\lib\site-packages\ggplot\stats\smoothers.py in
2 unicode_literals)
3 import numpy as np
----> 4 from pandas.lib import Timestamp
5 import pandas as pd
6 import statsmodels.api as sm
ModuleNotFoundError: No module named 'pandas.lib'
I just tried a fix, and I hope it works for others as well. I changed this
from pandas.lib import Timestamp
to this
from pandas._libs import Timestamp
because the module under C:\Users\HP\Anaconda3\Lib\site-packages\pandas-libs is named _libs.
I also changed this
date_types = (
    pd.tslib.Timestamp,
    pd.DatetimeIndex,
    pd.Period,
    pd.PeriodIndex,
    datetime.datetime,
    datetime.time
)
to this
date_types = (
    pd._tslib.Timestamp,
    pd.DatetimeIndex,
    pd.Period,
    pd.PeriodIndex,
    datetime.datetime,
    datetime.time
)
Before that, I opened "C:\Users\HP\Anaconda3\Lib\site-packages\ggplot\util.py" and made the same kind of change to date_types in util.py. This got rid of the error I mentioned in my question.
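Edits made inside site-packages are lost whenever the package is reinstalled. For code you control, the same tuple can be written with public names only, since pandas exposes the Timestamp class as pd.Timestamp. This is a sketch assuming pandas is installed; it is not what ggplot itself ships.

```python
import datetime

import pandas as pd

# Same tuple as above, but using the public pd.Timestamp name
# instead of a private module path.
date_types = (
    pd.Timestamp,
    pd.DatetimeIndex,
    pd.Period,
    pd.PeriodIndex,
    datetime.datetime,
    datetime.time,
)

is_date = isinstance(pd.Timestamp("2020-05-22"), date_types)
print(is_date)
```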

Programming a simple Stock prediction service with Alpha Vantage in Python. I get this error

This is the program, which should simply print the stock prediction...
from alpha_vantage.timeseries import TimeSeries
# Your key here
key = 'yourkeyhere'
ts = TimeSeries(key)
aapl, meta = ts.get_daily(symbol='AAPL')
print(aapl['2020-22-5'])
I get this error...
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/AlphaVantageTest/AlphaVantageTest.py", line 7, in <module>
    print(aapl['2020-22-5'])
KeyError: '2020-22-5'
Since that didn't work, I tried getting a little more technical with it...
from alpha_vantage.timeseries import TimeSeries
from alpha_vantage.techindicators import TechIndicators
from matplotlib.pyplot import figure
import matplotlib.pyplot as plt
# Your key here
key = 'yourkeyhere'
# Choose your output format, or default to JSON (python dict)
ts = TimeSeries(key, output_format='pandas')
ti = TechIndicators(key)
# Get the data, returns a tuple
# aapl_data is a pandas dataframe, aapl_meta_data is a dict
aapl_data, aapl_meta_data = ts.get_daily(symbol='AAPL')
# aapl_sma is a dict, aapl_meta_sma also a dict
aapl_sma, aapl_meta_sma = ti.get_sma(symbol='AAPL')
# Visualization
figure(num=None, figsize=(15, 6), dpi=80, facecolor='w', edgecolor='k')
aapl_data['4. close'].plot()
plt.tight_layout()
plt.grid()
plt.show()
I get these errors...
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/AlphaVantageTest/AlphaVantageTest.py", line 9, in <module>
    ts = TimeSeries(key, output_format='pandas')
  File "C:\Users\PycharmProjects\AlphaVantageTest\venv\lib\site-packages\alpha_vantage\alphavantage.py", line 66, in __init__
    raise ValueError("The pandas library was not found, therefore can "
ValueError: The pandas library was not found, therefore can not be used as an output format, please install manually
How can I improve my program so that I don't receive these errors? Neither program has any syntax errors. Thank you to anyone who can help.
You need to install pandas. If you're just using pip, you can run pip install pandas; if you are using conda to manage your environments, you can use conda install pandas.
Glad it worked. Following the Meta Stack Overflow post "What if I answer a question in a comment?",
I am posting my comment as an answer so you can mark the question as answered.
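The KeyError in the first program is a separate issue from the missing pandas install: '2020-22-5' is not a valid year-month-day key. Assuming the daily series in the default JSON format is a dict keyed by ISO dates such as '2020-05-22' (an assumption about the response, not confirmed in this thread), the key can be built with datetime so the format is always right. The dict below is a stand-in with placeholder values, not real quotes.

```python
from datetime import date

# Stand-in for the dict returned by ts.get_daily(); values are placeholders, not real prices.
aapl = {
    "2020-05-22": {"4. close": "placeholder"},
    "2020-05-21": {"4. close": "placeholder"},
}

day_key = date(2020, 5, 22).isoformat()   # always zero-padded YYYY-MM-DD

# .get() returns None instead of raising KeyError when a day is missing
# (for example a weekend, a holiday, or a malformed key).
row = aapl.get(day_key)
missing = aapl.get("2020-22-5")
print(day_key, row, missing)
```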

import python file into jupyter notebook

I have a Python file, bucket.py. I'm trying to import it into a Jupyter notebook using the code below, and then to use one of its functions, "exp1", to explore a dataframe. I'm getting the error below. Can someone please tell me how to import a file from a directory so that I can use its functions in my Jupyter notebook?
code:
import importlib.util
spec = importlib.util.spec_from_file_location("module.name", '/Users/stuff/bucket/bucket.py')
foo = importlib.util.module_from_spec(spec)
foo.exp1(df)
error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-9-e1cc80f06e24> in <module>
----> 1 foo.exp1(harborsideoakland_df)
AttributeError: module 'module.name' has no attribute 'exp1'
bucket.py file:
# import libraries
import numpy as np
import pandas as pd
from time import time
import scipy.stats as stats
from IPython.display import display # Allows the use of display() for DataFrames
# # Pretty display for notebooks
# %matplotlib inline
###########################################
# Suppress matplotlib user warnings
# Necessary for newer version of matplotlib
import warnings
warnings.filterwarnings("ignore", category = UserWarning, module = "matplotlib")
#
# Display inline matplotlib plots with IPython
from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')
###########################################
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import warnings
warnings.filterwarnings('ignore')
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
### HELPER FUNCTIONS:
# Initial Exploration
def exp1(df):
    with pd.option_context('display.max_rows', None, 'display.max_columns', None):
        # shape of data
        print('rows and columns: {}'.format(df.shape))
        # head data
        # display(df.head())
        print('')
        # data types and columns in data
        print('data types and columns in data:')
        print('')
        #display(df.info())
        print(df.info())
        print('')
        # unique values in each column
        print('unique values in each column:')
        #display(df.nunique())
        print(df.nunique())
        print('')
        # percentage duplicates
        print('percentage duplicates : {}'.format(1-(float(df.drop_duplicates().shape[0]))/df.shape[0]))
        print('')
        ## Percentage of column with missing values
        print('Percentage of column with missing values:')
        print('')
        missingdf=df.apply(lambda x: float(sum(x.isnull()))/len(x))
        #display(missingdf.head(n=missingdf.shape[0]))
        print(missingdf.head(n=missingdf.shape[0]))
        print('')
        print('Data snapshot:')
        print('')
        print(df[:5])
This worked:
import sys
sys.path.append(r'/Users/stuff/bucket/bucket')
import bucket as Lb
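The importlib attempt in the question stops one step short: module_from_spec only creates an empty module object, and the file's code does not run until spec.loader.exec_module is called. A minimal sketch follows, using a temporary file with a trivial exp1 as a stand-in for bucket.py (the file contents and the module name bucket_demo are hypothetical):

```python
import importlib.util
import tempfile
from pathlib import Path

# Stand-in for bucket.py: a temporary file defining a trivial exp1.
path = Path(tempfile.mkdtemp()) / "bucket.py"
path.write_text("def exp1(rows):\n    return len(rows)\n")

spec = importlib.util.spec_from_file_location("bucket_demo", str(path))
module = importlib.util.module_from_spec(spec)   # empty module, nothing executed yet
defined_before = hasattr(module, "exp1")

spec.loader.exec_module(module)                  # runs the file, which defines exp1
result = module.exp1([1, 2, 3])
print(defined_before, result)
```

Compared with appending to sys.path, this loads exactly one file by path and leaves the import search path untouched.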

Multiple plots against Date and Time using MatPlotLib and Sql3

This may have been asked before but I have been unable to find it.
I am a newbie to programming and working on a project to actively monitor and record voltage levels of four devices. These details are stored in a sqlite3 db with the date and time they were taken.
I am now trying to create a plot in Matplotlib with all four traces on the same graph, using the sqlite3 data. I have got one trace working, although it is a little messy.
Here is my code so far:
import sqlite3
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from dateutil import parser
import time
import datetime
conn = sqlite3.connect('Test.db')
c = conn.cursor()
def graph_data():
    c.execute('SELECT DTG, Battery_Level1, Battery_Level2, Battery_Leve3, Battery_Level4 FROM Voltages')
    data = c.fetchall()
    dates = []
    values = []
    plt.xlabel('Time')
    plt.ylabel('Voltage')
    plt.title('PiPower Manager')
    for row in data:
        dates.append(parser.parse(row[0]))
        values.append(row[1])
    plt.plot_date(dates, values, '-')
    plt.show()
graph_data()
c.close()
conn.close()
However, when I try to add the other lines, it appears to cause an error. Any help would be appreciated.
Update 1
I have mostly tried playing with this section:
for row in data:
    dates.append(parser.parse(row[0]))
    values.append(row[1])
I have added a new, similar section as below:
for row in data:
    dates.append(parser.parse(row[0]))
    values.append(row[2])
This does graph but looks odd. If I comment out the original, it plots OK.
I have then tried adding row[3] and row[4], but I get an error:
Traceback (most recent call last):
  File "/home/pi/plot.py", line 41, in <module>
    graph_data()
  File "/home/pi/plot.py", line 28, in graph_data
    values.append(row[3])
IndexError: tuple index out of range
I am not really sure how to go about this.
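The odd-looking plot in Update 1 is probably because the second loop appends to the same dates and values lists, so two series end up mixed in one line. The IndexError suggests that the query actually being run returns fewer columns than expected. Both are guesses from the snippets shown. Below is a sketch of keeping one list per column, using an in-memory table with two placeholder columns as a stand-in for Test.db; plot_series is a hypothetical helper that is only defined, not called, and the same pattern extends to four columns.

```python
import sqlite3
from datetime import datetime

# In-memory stand-in for Test.db; the rows are placeholder readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Voltages (DTG TEXT, Battery_Level1 REAL, Battery_Level2 REAL)")
conn.executemany(
    "INSERT INTO Voltages VALUES (?, ?, ?)",
    [("2020-01-01 10:00:00", 1.0, 2.0), ("2020-01-01 10:05:00", 1.5, 2.5)],
)

cursor = conn.execute("SELECT * FROM Voltages ORDER BY DTG")
names = [col[0] for col in cursor.description]   # column names as returned by the query
rows = cursor.fetchall()
conn.close()

# Transpose rows into columns: one shared list of dates, one list per voltage column.
columns = list(zip(*rows))
dates = [datetime.fromisoformat(text) for text in columns[0]]
series = {name: list(values) for name, values in zip(names[1:], columns[1:])}


def plot_series(dates, series):
    """Draw every series against the shared dates (needs matplotlib; not called here)."""
    import matplotlib.pyplot as plt

    for name, values in series.items():
        plt.plot(dates, values, "-", label=name)
    plt.legend()
    plt.show()
```

Printing len(rows[0]) before indexing is a quick way to confirm how many columns the query really returns.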
