Python raises KeyError: 'Open' when plotting stock data - python-3.x

I am trying to extract stock prices from the 'iex' source, and everything works fine until I try to plot my data. Can anyone take a look and see what I am doing wrong? This happens with the code for 'Open' as well as 'Volume'. Thank you!
msft['Open'].plot(label='MSFT', figsize=(16, 8), title='Open Title')
gm['Open'].plot(label='gm')
ford['Open'].plot(label='ford')
plt.legend()
I get the following error:
KeyError: 'Open'

pandas-datareader uses lowercase "open":
In [11]: from pandas_datareader import data as web
In [12]: msft = web.DataReader('MSFT', 'iex', "2019-01-01", "2019-01-31")
In [13]: msft.head()
Out[13]:
open high low close volume
date
2019-01-02 99.1266 101.3173 98.5192 100.6899 35329345
2019-01-03 99.6743 99.7589 96.7866 96.9858 42578410
2019-01-04 99.2959 102.0740 98.5093 101.4965 44060620
2019-01-07 101.2077 102.8289 100.5505 101.6259 35656136
2019-01-08 102.6018 103.5278 101.2808 102.3628 31514415
In [14]: msft["open"].head()
Out[14]:
date
2019-01-02 99.1266
2019-01-03 99.6743
2019-01-04 99.2959
2019-01-07 101.2077
2019-01-08 102.6018
Name: open, dtype: float64
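Once the lowercase key is used, the plotting code works; alternatively, the columns can be normalized once so the original 'Open' spelling works everywhere. A minimal sketch with a small stand-in frame (the real one would come from pandas_datareader's DataReader call):

```python
import pandas as pd

# Stand-in for the iex download shown above
msft = pd.DataFrame({'open': [99.1266, 99.6743],
                     'close': [100.6899, 96.9858]})

# Option 1: use the lowercase key the source actually provides
opens = msft['open']

# Option 2: title-case the column names once, so 'Open' works everywhere
msft.columns = [c.capitalize() for c in msft.columns]
print(msft['Open'].iloc[0])  # 99.1266
```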

How to fix AttributeError: type object 'list' has no attribute 'find'?

from cgitb import text
from bs4 import BeautifulSoup
import requests

website = 'https://www.marketplacehomes.com/rent-a-home/'
result = requests.get(website)
content = result.text
soup = BeautifulSoup(content, 'html.parser')
lists = soup.find_all('div', class_=('tt-rental-row'))
for list in lists:
    location = list.find('span', class_="renta;-adress")
    beds = list.find('span', class_="renta;-beds")
    baths = list.find('span', class_="renta;-beds")
    availability = list.find('span', class_="rental-date-available")
    info = [location, beds, baths, availability]
    print(info)
If I try to run the last line of code, I get:
"IndentationError: expected an indented block"
If I try to run each indented line separately I get:
">>> location = list.find('span', class_="renta;-adress")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'list' has no attribute 'find'"
I'm new to Python and I'm kinda stuck, can anyone please help me?
Note: Your code never enters the for loop because your selectors never match the elements in the HTML. They are generated dynamically from data loaded from another resource, and requests does not render websites the way a browser does; it only sees the static content of the response.
Also be aware of shadowing built-in names, which will cause errors. In your case list.find() raises one because the type object 'list' does not have an attribute called find. You can check these things with type():
type(soup)
-> it's a bs4.BeautifulSoup
type(soup.find_all('div', class_=('tt-rental-row')))
-> it's a bs4.element.ResultSet
type(list)
-> it's a type
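A minimal, scraping-free sketch of the shadowing problem (pure stdlib, no BeautifulSoup involved):

```python
# In the interactive session the loop body never ran, so the name
# 'list' still pointed at the built-in type, which has no .find
assert not hasattr(list, 'find')

# Even when a loop like 'for list in lists:' does run, it rebinds
# the built-in name to an ordinary Python list
rows = [['a'], ['b']]
for list in rows:
    pass

# After the loop, list is just the last row, and plain Python lists
# have no .find attribute either
assert list == ['b']
assert not hasattr(list, 'find')
```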
So how do you get what you want?
You could also use pandas to directly create a DataFrame and slice it to your needs:
import pandas as pd
pd.read_json('https://app.tenantturner.com/listings-json/2679')
Output:
id dateActivated latitude longitude address city state zip photo title ... baths dateAvailable rentAmount acceptPets applyUrl btnUrl btnText virtualTour propertyType enableWaitlist
0 83600 8/22/2022 35.750499 -86.393972 4481 Jack Faulk St Murfreesboro TN 37127 https://ttimages.blob.core.windows.net/propert... 4481 Jack Faulk St ... 2.0 Now 2195 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/4481-jack... Schedule Viewing None Single Family False
1 100422 8/31/2022 30.277607 -95.472842 213 Skybranch Court Conroe TX 77304 https://ttimages.blob.core.windows.net/propert... 213 Skybranch Court ... 2.5 Now 2100 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/213-skybr... Schedule Viewing None Condo Unit False
2 106976 7/27/2022 28.274720 -82.298077 8127 Olive Brook Dr Wesley Chapel FL 33545 https://ttimages.blob.core.windows.net/propert... 8127 Olive Brook Dr ... 2.0 Now 2650 no pets https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/8127-oliv... Schedule Viewing None Single Family False
3 116188 8/15/2022 42.624023 -83.144614 735 Grace Ave Rochester Hills MI 48307 https://ttimages.blob.core.windows.net/propert... 735 Grace Ave ... 2.0 Now 1600 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/735-grace... Schedule Viewing None Single Family False
4 126846 8/22/2022 32.046455 -81.071181 1810 E 41st St Savannah GA 31404 https://ttimages.blob.core.windows.net/propert... 1810 E 41st St ... 1.0 Now 1395 small dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/1810-e-41... Schedule Viewing None Single Family True
...
91 rows × 22 columns
Example:
To show only specific columns, simply pass a list of their names.
import pandas as pd
pd.read_json('https://app.tenantturner.com/listings-json/2679')[['address', 'city','state', 'zip', 'title', 'beds', 'baths','dateAvailable']]
Output
address beds baths dateAvailable
0 4481 Jack Faulk St 4 2.0 Now
1 213 Skybranch Court 3 2.5 Now
2 8127 Olive Brook Dr 3 2.0 Now
3 735 Grace Ave 3 2.0 Now
4 1810 E 41st St 3 1.0 Now
... ... ... ... ...
91 rows × 4 columns
Since list is the name of a built-in type in Python, you shouldn't use it as a variable name; pick another one:
for myList in lists:
    location = myList.find('span', class_="renta;-adress")
    beds = myList.find('span', class_="renta;-beds")
    baths = myList.find('span', class_="renta;-beds")
    availability = myList.find('span', class_="rental-date-available")
    info = [location, beds, baths, availability]
    print(info)

How can I read and process 100 bytes at a time from a large CSV file?

The below csv is only a snippet of my main data file.
customer.csv
customer_id,order_id,number_of_items
10,4736,9
5,3049,1
1,4689,3
6,4114,9
1,4524,15
2,3727,16
3,3507,7
7,3988,3
5,4993,16
6,1945,4
7,3081,7
3,3707,2
5,1739,12
9,4167,17
7,3242,12
2,3109,10
10,2197,20
10,3528,13
8,4917,2
5,1713,19
8,4224,4
7,2160,2
10,2044,19
10,2956,8
3,3906,2
5,2288,16
7,1854,20
7,4404,2
9,1622,2
7,3685,2
10,2755,10
3,3390,10
6,1424,6
3,2127,15
4,1221,15
9,2994,14
1,1413,13
7,2771,7
3,4579,13
10,2208,4
CURRENTLY ALL I HAVE
import os
os.path.getsize("customer.csv") # outputs, 424 bytes
HOW I THINK I NEED TO PROCEED
I think I need to do something with open csv and read bytes? Then look at each row bit wise?
Please note, I am not looking specifically for someone to just give me an answer on how to do this (although that would be appreciated). Therefore, if someone could just point me in the right direction or give me some topics to look into that would be great. Side note, I know I am supposed to use encoding and decoding somewhere for this task.
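If reading in fixed 100-byte chunks really is the requirement, the file object's read(n) method is the place to start; a sketch using an in-memory byte stream standing in for customer.csv:

```python
import io

# In-memory stand-in for open('customer.csv', 'rb')
raw = io.BytesIO(b"customer_id,order_id,number_of_items\n10,4736,9\n5,3049,1\n")

chunks = []
while True:
    chunk = raw.read(100)  # at most 100 bytes per call
    if not chunk:          # b'' signals end of file
        break
    chunks.append(chunk)

# Decode the reassembled bytes back into text rows
text = b"".join(chunks).decode("utf-8")
rows = text.splitlines()
print(rows[0])  # customer_id,order_id,number_of_items
```

Note that a chunk boundary can fall in the middle of a row, which is why the chunks are joined before splitting into lines.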
This script uses the csv module to load the data from customer.csv and computes the per-customer average with the built-in statistics module:
import csv
from statistics import mean

with open('customer.csv', newline='') as csvfile:
    data = csv.DictReader(csvfile)
    # group the item counts by customer_id
    customers = {}
    for order in data:
        customers.setdefault(order['customer_id'], []).append(int(order['number_of_items']))

# print the average per customer
print('{:<15} {}'.format('customer_id', 'average'))
for k, v in customers.items():
    print('{:<15} {:.2f}'.format(k, mean(v)))
Prints:
customer_id average
10 11.86
5 12.80
1 10.33
6 6.33
2 13.00
3 8.17
7 6.88
9 11.00
8 3.00
4 15.00

How to find the exponentially weighted moving average using dataframe.ewma?

Previously I used the following to calculate the ewma
dataset['26ema'] = pd.ewma(dataset['price'], span=26)
But, in the latest version of pandas pd.ewma has been removed. How to calculate using the new method dataframe.ewma?
dataset['26ema'] = dataset['price'].ewma(span=26)
This gives an error: AttributeError: 'Series' object has no attribute 'ewma'
Use Series.ewm:
dataset['price'].ewm(span=26)
See GH11603 for the relevant PR and mapping of the old API to new ones.
Minimal Code Example
s = pd.Series(range(5))
s.ewm(span=3).mean()
0 0.000000
1 0.666667
2 1.428571
3 2.266667
4 3.161290
dtype: float64
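With pandas' default adjust=True, ewm(span=s) uses alpha = 2/(s+1) and normalized, exponentially decaying weights, so the output above can be reproduced with the stdlib:

```python
span = 3
alpha = 2 / (span + 1)  # 0.5

x = [0, 1, 2, 3, 4]
ewm = []
for t in range(len(x)):
    # newest value gets weight (1-alpha)^0, oldest gets (1-alpha)^t
    weights = [(1 - alpha) ** i for i in range(t + 1)]
    num = sum(w * v for w, v in zip(weights, reversed(x[:t + 1])))
    ewm.append(num / sum(weights))

print([round(v, 6) for v in ewm])  # [0.0, 0.666667, 1.428571, 2.266667, 3.16129]
```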

Average time without changing format using pandas

Below is my avg_df:
Date Model INumber Type TimeDiff Device
326 20/07/18 TG I625 Devicetime 0:02:31 RD
328 20/07/18 TG I5271 Devicetime 0:00:32 RD
332 20/07/18 TG I660 Devicetime 0:00:31 RD
I want to get the average of "TimeDiff". I know that I can convert the time into seconds, take the average, and format it back, but I would be interested to know if I can get it without converting back and forth, something like below:
print(avg_df.loc[:,"TimeDiff"].mean())
Appreciate any help!
You can get the average if you convert it to timedelta first:
>>> pd.to_timedelta(df['TimeDiff']).mean()
Timedelta('0 days 00:01:11.333333')
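Under the hood this is just an average over total seconds; a stdlib-only check using the three TimeDiff values above:

```python
from datetime import timedelta

def parse_hms(s):
    # 'H:MM:SS' -> timedelta
    h, m, sec = map(int, s.split(':'))
    return timedelta(hours=h, minutes=m, seconds=sec)

diffs = ['0:02:31', '0:00:32', '0:00:31']
avg = sum((parse_hms(d) for d in diffs), timedelta()) / len(diffs)
print(avg)  # 0:01:11.333333
```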

How do I read an SEC txt file into a pandas dataframe?

I am trying to use SEC (U.S. Securities and Exchange Commission) data. The SEC provides useful data in a txt format. I am using the
Financial Statement Data Sets for the second quarter of 2017. You can find the data I use here.
I try to read the txt files into a pandas dataframe. I tried it the following ways:
sub = pd.read_fwf('sub.txt')
sub_1 = pd.read_csv('sub.txt')
I get no error with using Pandas' read_fwf function - but the output is utter rubbish. Here is the head of the dataframe:
adsh cik name sic countryba stprba cityba zipba bas1 bas2 baph countryma stprma cityma zipma mas1 mas2 countryinc stprinc ein former changed afs wksi fye form period fy fp filed accepted prevrpt detail instance nciks aciks Unnamed: 1
0 0000002178-17-000038\t2178\tADAMS RESOURCES & ... NaN
1 0000002488-17-000107\t2488\tADVANCED MICRO DEV... NaN
I do get an error when using read_csv: Error tokenizing data. C error: Expected 2 fields in line 7, saw 3
Any ideas on how to read the data into a pandas dataframe?
It looks like the files are tab separated - that's why you're seeing \t in the results. pandas read_csv defaults to comma separated values, so you have to change the separator. This is controlled by the sep parameter. In addition, you will need to provide the proper encoding (errors are thrown when trying to read the num, pre, and tag files). Generally ISO-8859-1 is a good choice.
#import pandas
import pandas as pd
#read in the .txt file and choose a separator and encoding standard
df = pd.read_csv('sub.txt', sep='\t', encoding='ISO-8859-1')
#output the results
print(df)
adsh cik name \
0 0000002178-17-000038 2178 ADAMS RESOURCES & ENERGY, INC.
1 0000002488-17-000107 2488 ADVANCED MICRO DEVICES INC
2 0000002969-17-000019 2969 AIR PRODUCTS & CHEMICALS INC /DE/
3 0000002969-17-000024 2969 AIR PRODUCTS & CHEMICALS INC /DE/
4 0000003499-17-000010 3499 ALEXANDERS INC
5 0000003545-17-000043 3545 ALICO INC
6 0000003570-17-000073 3570 CHENIERE ENERGY INC
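To confirm the delimiter before reaching for pandas, csv.Sniffer can be pointed at a small sample of the file (a fabricated two-line stand-in for sub.txt is used here):

```python
import csv

# Tiny tab-separated sample mimicking the first lines of sub.txt
sample = ("adsh\tcik\tname\n"
          "0000002178-17-000038\t2178\tADAMS RESOURCES\n")

# Restricting the candidate delimiters keeps the sniff reliable
dialect = csv.Sniffer().sniff(sample, delimiters='\t,')
print(repr(dialect.delimiter))  # '\t'
```

In a real script you would read the sample with open('sub.txt').read(1024) and then pass sep=dialect.delimiter to read_csv.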
