keyerror:0 though index starts from 0 - python-3.x

this is my dataset.
I basically want the number of words in each list from the words column. this is my code
for ind,word in enumerate(brand6['words']):
brand6['len']=len(brand6[ind][word])
However, i am getting an error like this:
Traceback (most recent call last):
File "C:\Users\darshini.nagaraj\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1618, in
pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1626, in
pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\darshini.nagaraj\Desktop\example.py", line 28, in <module>
print((brand6[ind][word]))
File "C:\Users\darshini.nagaraj\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2800, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\darshini.nagaraj\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1618, in
pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1626, in
pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
I have been working for 3 days on this. any help will be much appreciated!!
Thanks well in advance!!

try this
word_count=[]
for i in range(brand6.shape[0]):
word_count.append(len(brand6['words'][i]))
brand6['word_count']=word_count
the last line will create a new column as 'word_count' in your dataset.

Related

Write Shapefile to AWS S3 with geopandas in Glue Python Shell

I have read shapefile in a zip format from my S3 bucket successfully through geopandas, but I get error when trying to output the same geodataframe as a shapefile to the same S3 bucket.
The code below is how I read the zip file, and it works nicely:
## session for connecting to S3
session = boto3.session.Session(aws_access_key_id='MY-KEY-ID',
aws_secret_access_key='MY-KEY')
s3 = session.resource('s3')
bucket = s3.Bucket('my_bucket')
## read shapefile
TPG = bucket.Object(key='/shapefiles/grid.zip')
TPGrid = geopandas.read_file(TPG.get()['Body'])
But when I tried to output the same geodataframe like this:
TPGrid.to_file(filename='s3://my_bucket/output/TPGrid.zip', driver='ESRI Shapefile')
I will get error code:
ERROR:fiona._env:Only read-only mode is supported for /vsicurl
ERROR:fiona._env:Only read-only mode is supported for /vsicurl
ERROR:fiona._env:Only read-only mode is supported for /vsicurl
ERROR:fiona._env:Unable to open /vsis3/my_bucket/output/TPGrid.zip/TPGrid.shp or /vsis3/my_bucket/output/TPGrid.zip/TPGrid.SHP.
Traceback (most recent call last):
File "fiona/ogrext.pyx", line 1133, in fiona.ogrext.WritingSession.start
File "fiona/_err.pyx", line 291, in fiona._err.exc_wrap_pointer
fiona._err.CPLE_AppDefinedError: Unable to open /vsis3/my_bucket/output/TPGrid.zip/TPGrid.shp or /vsis3/my_bucket/output/TPGrid.zip/TPGrid.SHP.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tmp/runscript.py", line 211, in <module>
runpy.run_path(temp_file_path, run_name='__main__')
File "/usr/local/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/usr/local/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/tmp/glue-python-scripts-c8krhm5u/test_to_file_geo.py", line 40, in <module>
File "/glue/lib/installation/geopandas/geodataframe.py", line 1086, in to_file
_to_file(self, filename, driver, schema, index, **kwargs)
File "/glue/lib/installation/geopandas/io/file.py", line 328, in _to_file
filename, mode=mode, driver=driver, crs_wkt=crs_wkt, schema=schema, **kwargs
File "/glue/lib/installation/fiona/env.py", line 408, in wrapper
return f(*args, **kwargs)
File "/glue/lib/installation/fiona/__init__.py", line 274, in open
**kwargs)
File "/glue/lib/installation/fiona/collection.py", line 165, in __init__
self.session.start(self, **kwargs)
File "fiona/ogrext.pyx", line 1141, in fiona.ogrext.WritingSession.start
fiona.errors.DriverIOError: Unable to open /vsis3/my_bucket/output/TPGrid.zip/TPGrid.shp or /vsis3/my_bucket/output/TPGrid.zip/TPGrid.SHP.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tmp/runscript.py", line 230, in <module>
raise e_type(e_value).with_traceback(new_stack)
File "/tmp/glue-python-scripts-c8krhm5u/test_to_file_geo.py", line 40, in <module>
File "/glue/lib/installation/geopandas/geodataframe.py", line 1086, in to_file
_to_file(self, filename, driver, schema, index, **kwargs)
File "/glue/lib/installation/geopandas/io/file.py", line 328, in _to_file
filename, mode=mode, driver=driver, crs_wkt=crs_wkt, schema=schema, **kwargs
File "/glue/lib/installation/fiona/env.py", line 408, in wrapper
return f(*args, **kwargs)
File "/glue/lib/installation/fiona/__init__.py", line 274, in open
**kwargs)
File "/glue/lib/installation/fiona/collection.py", line 165, in __init__
self.session.start(self, **kwargs)
File "fiona/ogrext.pyx", line 1141, in fiona.ogrext.WritingSession.start
fiona.errors.DriverIOError: Unable to open /vsis3/my_bucket/output/TPGrid.zip/TPGrid.shp or /vsis3/my_bucket/output/TPGrid.zip/TPGrid.SHP.
I have tried several ways, such as using '.csv' or '.shp', but not any one worked.
I am using python 3.6 and packages below, hope these information will help:
geopandas-0.9.0
shapely-1.7.1
fiona-1.8.20
GDAL-3.2.3
I kept fighting with this problem all day....
Any help will be highly appreciated.

Error in running Python code for PLSR modelling

I am trying to develop model using PLSR (Partial Least Squares Regression) in Python3 using code provided https://github.com/pgbrodrick/ensemblePLSR. Sample data is also provided.
When I try to run code, it gives me error
>>> python3 ensemble_plsr.py example_settings.txt
I am using Python (3.7.3), python modules scikit-learn (0.20.2) and pandas (0.23.3).
/usr/lib/python3/dist-packages/sklearn/externals/joblib.py:1: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
n bad bands 57
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 3078, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: -1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "ensemble_plsr.py", line 173, in <module>
df.pop(col)
File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 760, in pop
result = self[item]
File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "/usr/lib/python3/dist-packages/pandas/core/generic.py", line 2491, in _get_item_cache
values = self._data.get(item)
File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: -1
In short, you're attempting to remove columns for which there is no reference at row 173 of ensemble_plsr.py, hence the KeyError during execution. What's happening under the hood is that when Python attempts to execute the pop method on the DataFrame for the unspecified/non-existent column, it raises that error. There are different ways to resolve this but this solution will resolve the error you are seeing:
Replace rows 172 & 173 in ensemble_plsr.py with the following:
for col in sf.get_setting('ignore columns'):
if col != 'nothing_here':
df.pop(col)
Replace row 16 in example_settings.txt with the following:
ignore columns(any other columns to remove) = nothing_here
Good news, you're done with this issue. Bad news, you're going to hit the next error down the line but you're on your way!

why KeyError: 0 in flask application 127.0.0.1 - - [31/Oct/2019 00:01:07] "POST /results HTTP/1.1" 500 -?

i am making a flask application in which i had taken input from user through .html page and later i converted the values of different variables to a dictionary , which then converted into data-frame of pandas and stored inside a variable whose value at 0 index i want to display , but it is giving an error , how to solve it?
i am unable to recognize error and its cause.
the code used to predict value and show it includes:
dic = {'MRP':MRP,'Fat':Fat,'Visibility':Visibility,'Weight':Weight,'Years':Years}
fdata_input = pd.DataFrame(dic,index=[0],columns=['MRP','Fat','Visibility','Weight','Years'])
price = rf.predict(fdata_input[0])
//i want to display the value stored in fdata_input ' s index 0
this is error message that i am getting:
Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [31/Oct/2019 00:00:58] "GET / HTTP/1.1" 200 -
[2019-10-31 00:01:07,482] ERROR in app: Exception on /results [POST]
Traceback (most recent call last):
File "C:\Users\HP\Anaconda3\lib\site-packages\pandas\core\indexes\base.py",
line 2657, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 108, in
pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in
pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in
pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in
pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\app.py", line 2292, in
wsgi_app
response = self.full_dispatch_request()
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\app.py", line 1815, in
full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\app.py", line 1718, in
handle_user_exception
reraise(exc_type, exc_value, tb)
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\_compat.py", line 35, in
reraise
raise value
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\app.py", line 1813, in
full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\HP\Anaconda3\lib\site-packages\flask\app.py", line 1799, in
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "<ipython-input-4-303728ed6a31>", line 51, in results
predicted_stock_price = rf.predict(fdata_input[0])
File "C:\Users\HP\Anaconda3\lib\site-packages\pandas\core\frame.py", line
2927, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\HP\Anaconda3\lib\site-packages\pandas\core\indexes\base.py",
line 2659, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 108, in
pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in
pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in
pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in
pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
127.0.0.1 - - [31/Oct/2019 00:01:07] "POST /results HTTP/1.1" 500 -
i want value at index 0 to be displayed of fdata_input.

Python Tutorial Help NLP customer reviews

I'm fairly new to Python and am following a tutorial on creating a wordcloud based on a customer reviews file. The tutorial link is https://towardsdatascience.com/detecting-bad-customer-reviews-with-nlp-d8b36134dc7e
from wordcloud import WordCloud, STOPWORDS
import pandas as pd
# read data
reviews_df = pd.read_csv("Hotel_Reviews3.csv")
# append the positive and negative text reviews
reviews_df["review"] = reviews_df["Negative_Review"] + reviews_df["Positive_Review"]
# create the label
reviews_df["is_bad_review"] = reviews_df["Reviewer_Score"].apply(lambda x: 1 if x < 5 else 0)
# select only relevant columns
reviews_df = reviews_df[["review", "is_bad_review"]]
reviews_df.head()
Hotel_Reviews3.csv:
https://i.stack.imgur.com/8ZGxj.png
ERROR MESSAGE:
Traceback (most recent call last):
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexes\base.py", line 3078, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Positive_Review'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\stecd\Desktop\WorldCloud\wordCloud.py", line 6, in <module>
reviews_df["review"] = reviews_df["Negative_Review"] + reviews_df["Positive_Review"]
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "C:\Users\stecd\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Positive_Review'
>>>
From the error message i'd guess that Hotel_Reviews3.csv may not have a "Positive_Review" column. It could be that the corresponding table entry is truncated or has whitespaces so that it does not match "Positive_Review".

How do i fix errors in pandas lib while working in python?

I was trying out this ..sorting out the necessary data using pandas taking the data set from quandl. when this showed up on execution.
import pandas as pd
import quandl
df = quandl.get('WIKI/GOOGL')
print(df.head())
df = df[['Adj. Open','Adj. High','Adj. Low','Adj. Close','Adj. Volume']]
print(df.head())
df['HL_PCT']=(df['Adj. High']- df['Adj. Close'])/df['Adj. close'] *100.0
df['PCT_change'] = (df['Adj .Close']-df['Adj. open'])/df['Adj. Open']*100.0
df = df[['Adj. Close','HL_PCT','PCT_change','Adj. Volume']]
so i got that as an error message
update:
i updated my code with right capitalizing
now its got this as error
Traceback (most recent call last):
File "D:\Program Files\Python37\lib\site-packages\pandas\core\indexes\base.py", line 2656, in get_loc
return self._engine.get_loc(key)
File "pandas_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Adj .Close'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "sentdex2.py", line 8, in
df['PCT_change'] = (df['Adj .Close']-df['Adj. Open'])/df['Adj. Open'] *100.0
File "D:\Program Files\Python37\lib\site-packages\pandas\core\frame.py", line 2927, in getitem
indexer = self.columns.get_loc(key)
File "D:\Program Files\Python37\lib\site-packages\pandas\core\indexes\base.py", line 2658, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Adj .Close'
Replace Adj. close with Adj. Close (note the capital C)

Resources