Python pandas-datareader fails on comma - python-3.x

I am trying to get the price of a stock from Google using pandas_datareader.data, but when I request Amazon (whose price is currently over 1,000) I get a ValueError. I assume the comma in the price is the cause. The library automatically tries to convert the value to a float, so I have no opportunity to call .replace on the string first.
ValueError: could not convert string to float: '1,001.30'
I cannot find a workaround for this issue, so any help would be much appreciated. Thanks.
import pandas_datareader.data as web
def money(stock):
    # df = web.DataReader(stock, "google", start=start, end=end)
    df2 = web.get_quote_google(stock)

I think there is currently a compatibility issue between pandas and pandas_datareader. However, using yahoo-finance might solve your problem:
Install the module with pip install yahoo-finance, and then run:
import yahoo_finance
import pandas as pd
symbol = yahoo_finance.Share("AMZN")
google_df = symbol.get_price()
This gives me no error on the price of Amazon
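If you do have access to the raw quote string before it is converted, the comma from the original ValueError can be handled by stripping the thousands separator first. This is a minimal sketch using only the standard library; parse_price is a hypothetical helper, not part of pandas-datareader, and it assumes US-style formatting (comma for thousands, dot for decimals).

```python
def parse_price(text):
    """Convert a US-formatted price string such as '1,001.30' to a float."""
    # Remove thousands separators first, because float() rejects them.
    return float(text.replace(",", ""))


price = parse_price("1,001.30")
print(price)
```

This only helps when you control the conversion step yourself; it does not change how pandas-datareader parses the response internally.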

Related

backtesting.py plotting function not working

I'm trying to learn backtesting.py. When I run the sample code below, I get these errors. Could anyone help? I tried uninstalling the Bokeh package and reinstalling an older version, but it doesn't work.
BokehDeprecationWarning: Passing lists of formats for DatetimeTickFormatter scales was deprecated in Bokeh 3.0. Configure a single string format for each scale
C:\Users\paul_\AppData\Local\Programs\Python\Python310\lib\site-packages\bokeh\models\formatters.py:399: UserWarning: DatetimeFormatter scales now only accept a single format. Using the first prodvided: '%d %b'
warnings.warn(f"DatetimeFormatter scales now only accept a single format. Using the first prodvided: {fmt[0]!r} ")
BokehDeprecationWarning: Passing lists of formats for DatetimeTickFormatter scales was deprecated in Bokeh 3.0. Configure a single string format for each scale
C:\Users\paul_\AppData\Local\Programs\Python\Python310\lib\site-packages\bokeh\models\formatters.py:399: UserWarning: DatetimeFormatter scales now only accept a single format. Using the first prodvided: '%m/%Y'
warnings.warn(f"DatetimeFormatter scales now only accept a single format. Using the first prodvided: {fmt[0]!r} ")
GridPlot(id='p11925', ...)
import bokeh
import datetime
import pandas_ta as ta
import pandas as pd
from backtesting import Backtest
from backtesting import Strategy
from backtesting.lib import crossover
from backtesting.test import GOOG
class RsiOscillator(Strategy):
    upper_bound = 70
    lower_bound = 30
    rsi_window = 14
    # Do as much initial computation as possible
    def init(self):
        self.rsi = self.I(ta.rsi, pd.Series(self.data.Close), self.rsi_window)
    # Step through bars one by one
    # Note that multiple buys are a thing here
    def next(self):
        if crossover(self.rsi, self.upper_bound):
            self.position.close()
        elif crossover(self.lower_bound, self.rsi):
            self.buy()
bt = Backtest(GOOG, RsiOscillator, cash=10_000, commission=.002)
stats = bt.run()
bt.plot()
An issue was opened for this in the GitHub repo:
https://github.com/kernc/backtesting.py/issues/803
A comment in the issue suggests downgrading bokeh to 2.4.3:
python3 -m pip install bokeh==2.4.3
This worked for me.
I had a similar issue using the Spyder IDE.
I found that I need to call the following for the plot to show in Spyder:
backtesting.set_bokeh_output(notebook=False)
I updated Python to version 3.11 and downgraded bokeh to 2.4.3.
This worked for me.
Downgrading Bokeh didn't work for me.
But, after importing backtesting in Jupyter, I needed to do:
backtesting.set_bokeh_output(notebook=False)
The expected plot was then generated in a new interactive browser tab.

Quantstats - TypeError: Invalid comparison between dtype=datetime64[ns, America/New_York] and datetime

I'm trying to use some of the Quantstats modules, specifically the quantstats.reports module, in Anaconda to get some metrics reports on a portfolio I've designed. I'm fairly new to Python/Quantstats and am really just trying to get a feel for the library.
I've written the following code to utilize the report module to spit out a complete html report and save it under the Output folder:
import quantstats as qs
qs.extend_pandas()
stock = qs.utils.download_returns('GLD')
qs.reports.html(stock, output='Output/GLD.html')
I then get the following TypeError:
TypeError: Invalid comparison between dtype=datetime64[ns, America/New_York] and datetime
I believe this may be a result of the datetime64 class being localized to my timezone and datetime remaining TZ naive. Frankly, digging through the Quantstats code has been a little beyond my current skillset.
If anybody has any recommendations for fixes, I would greatly appreciate it.
I came upon this while DDGing exactly the same issue.
Not sure which of your columns has the timezone localisation in it, but
df['date'] = df['date'].dt.tz_localize(None)
will get rid of localization for the column df['date']
Incidentally, the usual situation is that the index of a pandas time series contains np.datetime64 values, but when you assign it to a column via
df['date'] = df.index
the resulting column contains pandas Timestamps.
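The naive-versus-aware mismatch behind this error can be reproduced with the standard library alone. The sketch below is only an analogy to the pandas case: datetime.replace(tzinfo=None) plays the same role here as .dt.tz_localize(None) does for a pandas column, dropping the timezone while keeping the wall-clock time. The fixed UTC-5 offset is an assumption standing in for America/New_York.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offset standing in for America/New_York (ignores DST).
eastern = timezone(timedelta(hours=-5))

aware = datetime(2022, 1, 3, 9, 30, tzinfo=eastern)  # timezone-aware
naive = datetime(2022, 1, 1)                          # timezone-naive

try:
    aware > naive
    comparison_failed = False
except TypeError as exc:
    # Ordering an aware value against a naive one is rejected.
    comparison_failed = True
    print("comparison failed:", exc)

# Dropping the timezone keeps the wall-clock time and makes both values naive.
stripped = aware.replace(tzinfo=None)
print(stripped > naive)
```

Whether stripping the timezone is appropriate depends on your data; if the timestamps need to stay comparable across zones, converting both sides to the same timezone is the alternative.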
I resolved this issue by downgrading yfinance from the latest version (0.1.87) to 0.1.74.

Attribute Error : Function object has no attribute

I see there are many questions with this title or about this issue, but I still don't understand why it occurs.
I have imported Pandas and Numpy.
Then I read my file using pd.read_excel.
Then I viewed the head of my file using .head()
After I sliced my data, the .head method was still working fine. But now it suddenly throws an AttributeError. The error goes away once I re-import my file, but after some time I get the same error again. What am I doing wrong? I don't clearly understand this error.
import pandas as pd
import numpy as np
sales = pd.read_excel('SALESC.xlsx', header=0)
sales.isnull().sum()
sales["Date"] = pd.to_datetime(sales['Date of document'])
sales = sales[pd.notnull(sales['Quantity sold']) & pd.notnull(sales['Unit selling price including tax'])]
sales = sales.iloc[:,[3,6,8,9,10,11,19,35,39]]
sales.head(5)
Can someone explain the problem and how to resolve it? Thanks in advance.
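One common way to end up with "'function' object has no attribute ..." is that the variable name gets rebound to a function somewhere, for example in a later notebook cell, so the name no longer refers to the data. The sketch below only illustrates that mechanism and is not a diagnosis of the code above; Table is a hypothetical stand-in for a DataFrame so the example runs without pandas.

```python
class Table:
    """Hypothetical stand-in for a DataFrame; only head() is modelled."""

    def __init__(self, rows):
        self.rows = rows

    def head(self, n=5):
        return self.rows[:n]


sales = Table([1, 2, 3])
first_rows = sales.head(2)  # works: the name refers to a Table
print(first_rows)


def sales():  # rebinding the same name, e.g. in a later notebook cell
    return "report"


try:
    sales.head(2)
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Re-running the cell that loads the file rebinds the name to the data again, which would match the error disappearing after a re-import. Checking type(sales) right before the failing call shows what the name currently refers to.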

How to manually provide a benchmark in zipline

I want to create a reproducible example where the traded series and the benchmark are manually provided. This would make life much easier for people who are new to zipline. In fact, given the recent shutdown of the Yahoo! Finance API, even introductory examples with zipline no longer work, since an HTTP error is returned when zipline tries to import the ^GSPC benchmark from Yahoo behind the scenes. As a consequence, there is currently not a single code snippet from the official tutorial that works, AFAIK.
import pytz
import pandas as pd
from pandas_datareader import DataReader
from collections import OrderedDict
from zipline.algorithm import TradingAlgorithm
from zipline.api import order, record, symbol, set_benchmark
# Import data from Google Finance
data = OrderedDict()
start_date = '01/01/2014'
end_date = '01/01/2017'
data['AAPL'] = DataReader('AAPL',
                          data_source='google',
                          start=start_date,
                          end=end_date)
data['SPY'] = DataReader('SPY',
                         data_source='google',
                         start=start_date,
                         end=end_date)
# panel.minor_axis is ['Open', 'High', 'Low', 'Close', 'Volume'].
panel = pd.Panel(data)
panel.major_axis = panel.major_axis.tz_localize(pytz.utc)
def initialize(context):
    set_benchmark(data['SPY'])
def handle_data(context, data):
    order(data['AAPL'], 10)
    record(AAPL=data.current(data['AAPL'], 'Close'))
algo_obj = TradingAlgorithm(initialize=initialize,
                            handle_data=handle_data,
                            capital_base=100000)
perf_manual = algo_obj.run(panel)
Returns: HTTPError: HTTP Error 404: Not Found
Question: How can I make the strategy work using AAPL as the traded asset and SPY as the benchmark?
Constraint: AAPL and SPY must be manually provided as in the example.
Disclaimer: I'm a maintainer of Zipline.
You can use the csvdir bundle to ingest CSV files (tutorial here) and then call set_benchmark() in your initialize() function. I'm also working on a branch that allows zipline algorithms to run without a benchmark, so even if you can't get benchmark data, your algorithm shouldn't crash.
Replace zipline in your requirements.txt with this:
git+https://github.com/quantopian/zipline
Then run pip install -r requirements.txt

datetime Attribute error in python

I'm trying to run the following three lines of Python code on the command line using Python 3.5.0. It gives me an error: AttributeError: module 'datetime' has no attribute 'date'. I just want to print the current date. Please help.
import datetime
current = datetime.date.today()
print(current)
There is nothing wrong with your code. It could be reduced a bit though:
import datetime
datetime.date
which should also cause the error. If this really causes the error, I would say your installation is messed up or, unlikely, there's a bug in Python. Please also make sure you don't have a datetime.py in your working directory. Further, check the output of dir(datetime) after importing it and with a different version of Python.
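To check for a shadowing datetime.py as suggested above, you can look at where the module was actually loaded from. This is a minimal sketch; module_location is a hypothetical helper name, and comparing against the current working directory is an assumption about where a stray file would live.

```python
import datetime
import os


def module_location(module):
    """Return the file a module was loaded from, or None if it has no file."""
    return getattr(module, "__file__", None)


location = module_location(datetime)
print("datetime loaded from:", location)
print("has date attribute:", hasattr(datetime, "date"))

# A datetime.py in the working directory would shadow the standard library module.
if location and os.path.dirname(os.path.abspath(location)) == os.getcwd():
    print("Warning: datetime was imported from the current directory")
```

If the printed path points into your own project rather than the Python installation, renaming that file (and deleting any stale .pyc or __pycache__ entry for it) should bring the real module back.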
You shouldn't be getting any error running the above code, as there is nothing wrong with it. Also, rather than using the above code (which is fine syntax-wise but imports all the names accessible in the datetime module), you could use
from datetime import date
current = date.today()
print(current)
since all you want to import is the day's date.
When I run it on Python 2.7, the code prints the date with no errors.

Resources