catching the error when item can't be found in textfile - python-3.x

I have a list, btc.txt, with all of the dates from sometime in 2013 to today and the average price of BTC for that day. Here are the contents of the textfile:
Date,Price
...
"Jun 06, 2018",7639.970
"Jun 05, 2018",7567.330
"Jun 04, 2018",7618.500
"Jun 03, 2018",7676.170
"Jun 02, 2018",7590.080
"Jun 01, 2018",7521.070
...
I have a list of dates, and in my code, I am looking for the corresponding date in the btc.txt and finding the average price of BTC for that date:
df = pd.read_csv("btc.txt")
dates = [..., "May 23, 2017", "Jun 04, 2018", "Oct 4, 2018", ...]
initial_p = []
for item in dates:
    if(item != "N/A"):
        if(df[df["Date"] == item]["Price"].values[0]):
            price = df[df["Date"] == item]["Price"].values[0]
            print(price)
            initial_p.append(price)
        else:
            num = int(item[len(item)-1])
            num -= 1
            item[len(item)-1] = num
            price = (df[df["Date"] == item]["Price"].values[0])
            initial_p.append(price)
    else:
        initial_p.append(item)
As you can see, there are dates in my list that don't exist, like October 4, 2018. Because it doesn't exist in btc.txt, I get this following error when it gets to that date that doesn't exist:
Traceback (most recent call last):
  File "getICOdate.py", line 173, in <module>
    initial_price("btc.txt")
  File "getICOdate.py", line 159, in initial_price
    if(df[df["Date"] == item]["Price"].values[0]):
IndexError: index 0 is out of bounds for axis 0 with size 0
I tried to catch the error by putting an if statement on line 6 of my code, as you can see above, but it didn't work. What I want is this: if a date can't be found in btc.txt, it should be changed to one year earlier and searched again (so "Oct 04, 2018" becomes "Oct 04, 2017").
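The if statement cannot catch the problem because .values[0] is evaluated first: when no row matches, the filtered frame is empty and indexing it raises IndexError before the condition is ever tested. The else branch would also fail, since Python strings are immutable and cannot be assigned to by index. One possible way to get the "try one year back" behaviour is sketched below. It is only a sketch under assumptions: it swaps the pandas filter for a plain dict lookup, the helper names (load_prices, one_year_back, lookup) are hypothetical, every date string is assumed to end in a four-digit year, and the 2017 row in the sample data is a made-up placeholder, not a real price.

```python
import csv
import io

# Stand-in for btc.txt. The 2017 row is a made-up placeholder value.
SAMPLE = '''Date,Price
"Jun 04, 2018",7618.500
"Oct 04, 2017",1234.5
'''

def load_prices(handle):
    """Map each date string to its price."""
    return {row["Date"]: float(row["Price"]) for row in csv.DictReader(handle)}

def one_year_back(date_str):
    """Decrement the trailing four-digit year, e.g. 2018 -> 2017."""
    return date_str[:-4] + str(int(date_str[-4:]) - 1)

def lookup(prices, date_str, max_years_back=1):
    """Return the price for date_str, stepping back a year when it is missing."""
    for _ in range(max_years_back + 1):
        if date_str in prices:
            return prices[date_str]
        date_str = one_year_back(date_str)
    return None  # nothing found within the allowed number of years

prices = load_prices(io.StringIO(SAMPLE))
dates = ["Jun 04, 2018", "Oct 04, 2018", "N/A"]
initial_p = [d if d == "N/A" else lookup(prices, d) for d in dates]
print(initial_p)
```

With the real file, load_prices(open("btc.txt")) would replace the StringIO stand-in. Testing membership before indexing avoids the exception entirely, so no try/except is needed.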

Related

How to calculate date of birth, given age expressed in years, months, days, as reported on a given date?

I have a two-month-old dataset, from 15 October 2020, containing the ages of different persons. The age of each person is given in years, months, and days.
Example
Input: Reference date = 15 October 2020
One entry in the input data set could be:
Input: Age = 21 years, 0 months, 10 days
Corresponding output: Date of Birth = 5 October 1999
This takes into account that their age was reported on 15 October 2020.
I have to convert this data set to get the Date of Birth of each person. How can I accomplish that?
You can use relativedelta from the dateutil.relativedelta module to subtract years, months and days from a date.
I'll assume that you know the exact date on which the data was collected, let's say 15 October 2020:
from datetime import date
from dateutil.relativedelta import relativedelta

# Let's say the data was taken on 15 October 2020:
refdate = date(2020, 10, 15)

# The data itself: triplets of years, months, days as collected on 15 October 2020:
data = [
    [20, 2, 1],
    [53, 5, 10],
]

# Extend the data with the birthdate:
for person in data:
    person.append(refdate - relativedelta(years=person[0],
                                          months=person[1],
                                          days=person[2]))

# Print the data:
for person in data:
    print("{} years, {} months, {} days: born on {}".format(*person))
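If python-dateutil is not available, similar arithmetic can be done with the standard library alone. The following is a minimal sketch, not the approach from the answer above: birth_date is a hypothetical helper, and it assumes that years and months are stepped back first (clamping the day to the length of the target month) and that the days are subtracted last.

```python
import calendar
from datetime import date, timedelta

def birth_date(refdate, years, months, days):
    """Subtract an age given as years, months, days from refdate."""
    # Step back whole years and months by counting months since year 0.
    total_months = refdate.year * 12 + (refdate.month - 1) - (years * 12 + months)
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    # Clamp the day so that e.g. the 31st maps onto a shorter month's last day.
    day = min(refdate.day, calendar.monthrange(year, month)[1])
    # Subtract the remaining days last.
    return date(year, month, day) - timedelta(days=days)

refdate = date(2020, 10, 15)
for years, months, days in [(20, 2, 1), (53, 5, 10)]:
    print(years, months, days, birth_date(refdate, years, months, days))
```

Ages expressed in months and days are inherently ambiguous around month ends, so a different order of operations can shift the result by a day or two.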

Regex expression: Expression for Extracting Date is not working with Series object throws an error

I'm trying to extract dates from text data. The expression is valid and works fine when I test it on the regex101 website, but when applied to the data it throws "ValueError: pattern contains no capture groups". My sample text is ["Mar-20-2009", "Mar 20, 2009", "March 20, 2009", "Mar. 20, 2009"," Mar 20 2009"], passed in as a pandas Series object.
df2 = pd.Series(["Mar-20-2009", "Mar 20, 2009", "March 20, 2009", "Mar. 20, 2009"," Mar 20 2009"])
df2.str.extractall(r'(?:\d{2} )?(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]* (?:\d{2}, )?\d{4}')
It doesn't match any date; my expected output is ["March 20, 2009", "Mar 20 2009","Mar 20, 2009"].
All of your parenthesized expressions are non-capturing groups (?:...), so the error message is correct. If you want to capture an expression, don't use ?:, just put it in parentheses. As is, the pattern will match, but no groups will be captured.
You need to wrap the whole pattern passed to extractall in parentheses, like this:
df2 = pd.Series(["Mar-20-2009", "Mar 20, 2009", "March 20, 2009", "Mar. 20, 2009"," Mar 20 2009"])
df2.str.extractall(r'((?:\d{2} )?(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]* (?:\d{2}, )?\d{4})')
Output:
                      0
  match
1 0        Mar 20, 2009
2 0      March 20, 2009
Here you are creating a single capture group that wraps the whole expression, so each full match is returned.
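The difference between the two patterns can be checked with the standard re module alone. This is only an illustrative sketch: it assumes, as the error message suggests, that extractall rejects a pattern whose compiled form reports zero capture groups, and the variable names are made up for the example.

```python
import re

MONTHS = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'
# Pattern from the question: every group is non-capturing.
original = r'(?:\d{2} )?' + MONTHS + r'[a-z]* (?:\d{2}, )?\d{4}'
# Same pattern wrapped in one capturing group.
wrapped = '(' + original + ')'

print(re.compile(original).groups)  # number of capture groups in the pattern
print(re.compile(wrapped).groups)

samples = ["Mar-20-2009", "Mar 20, 2009", "March 20, 2009",
           "Mar. 20, 2009", " Mar 20 2009"]
# findall returns the captured group for every match in each string.
matches = {s: re.findall(wrapped, s) for s in samples}
for sample, found in matches.items():
    print(repr(sample), found)
```

Only two of the five samples match. "Mar. 20, 2009" fails because of the period after the month, and " Mar 20 2009" fails because the optional day part requires a comma, so the pattern itself would need loosening to get the expected three results.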

Python Date and Time

I am trying to create a function that returns values depending on the date and time. I run a small competition every month; it starts on the 15th of one month and ends on the 15th of the next. If the competition starts on Jan 15 and ends on Feb 15, the function should return the end date and time (Feb 15, 2020 24:00:00) and the competition name, "January 2020". How can I do this? Any suggestions?
For instance:
def CompetitionDetail():
    # doing something here
    return competition_end_date, competition_name
In response to your question, I came up with the following. It returns the start and end date/times of the competition that begins on the 15th of the current month and ends on the 15th of the following month. The function accepts a date but uses today's date if none is given.
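from datetime import datetime, date

def competition(day=None):
    # Resolve the default at call time rather than at definition time
    if day is None:
        day = date.today()
    fifteenth_this_month = datetime(day.year, day.month, 15, 23, 59, 59)
    # Roll over to January of the following year when called in December
    if day.month == 12:
        next_year, next_month = day.year + 1, 1
    else:
        next_year, next_month = day.year, day.month + 1
    fifteenth_next_month = datetime(next_year, next_month, 15, 23, 59, 59)
    competition_name = f'My awesome competition for the month of ' \
                       f'{fifteenth_this_month:%B %Y}.'
    competition_ending = f'This awesome competition ends on ' \
                         f'{fifteenth_next_month:%b %d, %Y} at 24:00:00.'
    print(competition_name, competition_ending, '', sep='\n')
    return fifteenth_this_month, fifteenth_next_month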
OUTPUT
My awesome competition for the month of August 2020.
This awesome competition ends on Sep 15, 2020 at 24:00:00.
(datetime.datetime(2020, 8, 15, 23, 59, 59), datetime.datetime(2020, 9, 15, 23, 59, 59))
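The question also asks for the end date and the name of whichever competition is running at a given moment. A possible variant is sketched below. It is an assumption-laden sketch rather than part of the answer above: competition_detail is a hypothetical name, a competition is assumed to start on the 15th (so days 1 to 14 still belong to the competition that began the previous month), and the end is represented as 23:59:59 because datetime cannot express 24:00:00.

```python
from datetime import datetime

def competition_detail(now):
    """Return (end datetime, name) of the competition running at `now`."""
    year, month = now.year, now.month
    # Before the 15th, the running competition started in the previous month.
    if now.day < 15:
        year, month = (year - 1, 12) if month == 1 else (year, month - 1)
    start = datetime(year, month, 15)
    # The end falls on the 15th of the month after the start month.
    end_year, end_month = (year + 1, 1) if month == 12 else (year, month + 1)
    end = datetime(end_year, end_month, 15, 23, 59, 59)
    return end, f"{start:%B %Y}"

end, name = competition_detail(datetime(2020, 1, 20, 9, 30))
print(end, name)
```

How the 15th itself is treated (last day of one competition or first day of the next) is a choice the organiser has to make; the sketch picks the latter.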

How do I know if today is a day due to change civil local time e.g. daylight saving time in standard python and pandas timestamps?

According to the rules of British Summer Time / daylight saving time (https://www.gov.uk/when-do-the-clocks-change) the clocks:
go forward 1 hour at 1am on the last Sunday in March,
go back 1 hour at 2am on the last Sunday in October.
In 2019 this civil local time change happens on March 31st and October 27th, but the dates shift slightly every year. Is there a clean way to know these dates for each input year?
I need to check these "changing time" dates automatically. Is there a way to avoid a for loop that checks the details of each date to see if it is a "changing time" date?
At the moment I am exploring these dates for 2019 to try to work out a reproducible/automatic procedure, and I found this:
import datetime
import pandas as pd

# using datetime from the standard library
march_utc_30 = datetime.datetime(2019, 3, 30, 0, 0, 0, 0, tzinfo=datetime.timezone.utc)
march_utc_31 = datetime.datetime(2019, 3, 31, 0, 0, 0, 0, tzinfo=datetime.timezone.utc)
april_utc_1 = datetime.datetime(2019, 4, 1, 0, 0, 0, 0, tzinfo=datetime.timezone.utc)
# using pandas timestamps
pd_march_utc_30 = pd.Timestamp(march_utc_30) #, tz='UTC')
pd_march_utc_31 = pd.Timestamp(march_utc_31) #, tz='UTC')
pd_april_utc_1 = pd.Timestamp(april_utc_1) #, tz='UTC')
# using pandas wrappers
pd_local_march_utc_30 = pd_march_utc_30.tz_convert('Europe/London')
pd_local_march_utc_31 = pd_march_utc_31.tz_convert('Europe/London')
pd_local_april_utc_1 = pd_april_utc_1.tz_convert('Europe/London')
# then printing all these dates
print("march_utc_30 {} pd_march_utc_30 {} pd_local_march_utc_30 {}".format(march_utc_30, pd_march_utc_30, pd_local_march_utc_30))
print("march_utc_31 {} pd_march_utc_31 {} pd_local_march_utc_31 {}".format(march_utc_31, pd_march_utc_31, pd_local_march_utc_31))
print("april_utc_1 {} pd_april_utc_1 {} pd_local_april_utc_1 {}".format(april_utc_1, pd_april_utc_1, pd_local_april_utc_1))
The output of those print statements is:
march_utc_30 2019-03-30 00:00:00+00:00 pd_march_utc_30 2019-03-30 00:00:00+00:00 pd_local_march_utc_30 2019-03-30 00:00:00+00:00
march_utc_31 2019-03-31 00:00:00+00:00 pd_march_utc_31 2019-03-31 00:00:00+00:00 pd_local_march_utc_31 2019-03-31 00:00:00+00:00
april_utc_1 2019-04-01 00:00:00+00:00 pd_april_utc_1 2019-04-01 00:00:00+00:00 pd_local_april_utc_1 2019-04-01 01:00:00+01:00
I could use a for loop to find out if the current date is the last Sunday of the month, or compare the "hour delta" between the current date and the date of the day after to see if there is a +1, but I am wondering if there is a cleaner way to do this?
Is there something tied to the year, e.g. knowing the input year is 2019, can we tell for sure that the "change date" in March will be the 31st?
Using dateutil.rrule can help (install with pip install python-dateutil).
Because rrule can generate dates at weekly intervals, no explicit loop is needed:
import datetime
from dateutil.rrule import rrule, WEEKLY
from dateutil.rrule import SU as Sunday

def get_last_sunday(year, month):
    first_day = datetime.datetime(year=year, month=month, day=1)
    # A month contains at most 5 Sundays
    days = rrule(freq=WEEKLY, dtstart=first_day, byweekday=Sunday, count=5)
    # Check whether the fifth Sunday is still in the same month;
    # if not, this year/month pair has only 4 Sundays
    if days[-1].month == month:
        return days[-1]
    else:
        return days[-2]

def get_march_switch(year):
    # Last Sunday of March, at 1am
    day = get_last_sunday(year, 3)
    return day.replace(hour=1, minute=0, second=0, microsecond=0)

def get_october_switch(year):
    # Last Sunday of October, at 2am
    day = get_last_sunday(year, 10)
    return day.replace(hour=2, minute=0, second=0, microsecond=0)

print('2019:')
print(' {}'.format(get_march_switch(2019)))
print(' {}'.format(get_october_switch(2019)))
print('2021:')
print(' {}'.format(get_march_switch(2021)))
print(' {}'.format(get_october_switch(2021)))
get_last_sunday() asks for the next 5 Sundays starting from the first day of the given month, because a month can have at most 5 Sundays.
It then checks whether the last of those Sundays still falls in the expected month; if not, the month has only 4 Sundays, so it returns the fourth one.
Finally, get_march_switch() and get_october_switch() set the hour and reset the minutes, seconds and microseconds.
Output:
2019:
2019-03-31 01:00:00
2019-10-27 02:00:00
2021:
2021-03-28 01:00:00
2021-10-31 02:00:00
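For comparison, the last Sunday of a month can also be found with the standard library only, by starting from the last day of the month and stepping back to the nearest Sunday. This is a minimal sketch, not the rrule approach above; last_sunday and is_clock_change_day are hypothetical names, and the rule encoded is only the UK one quoted in the question (last Sunday of March and of October).

```python
import calendar
from datetime import date, timedelta

def last_sunday(year, month):
    """Date of the last Sunday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # weekday(): Monday is 0 and Sunday is 6
    days_since_sunday = (last_day.weekday() + 1) % 7
    return last_day - timedelta(days=days_since_sunday)

def is_clock_change_day(day):
    """True if `day` is the last Sunday of March or October (UK rule)."""
    return day.month in (3, 10) and day == last_sunday(day.year, day.month)

print(last_sunday(2019, 3), last_sunday(2019, 10))
print(is_clock_change_day(date(2019, 3, 31)))
```

Because the check is a direct calculation on the date, it needs no loop over candidate days and works for any input year.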
I know the topic is quite old now. However, I had the same question today, and in the end I found a solution that seems quite simple to me, using only the standard datetime module.
I want to check whether my date refdate is the October DST day. I did it in the following way:
refdate is a standard datetime object.
If you have a pandas Timestamp, you can convert it to a native datetime using .to_pydatetime().
import datetime as dt
if refdate.month == 10 and refdate.weekday() == 6 and (refdate + dt.timedelta(weeks=1)).month == 11:
    oct_dst = 1

Excel table function: if then or

I am crunching a large dataset. One column is "date". I want to add an Invoice Date column: for any transaction that happened in January, the invoice date should be Jan-31-2017, and so on for the other months.
Jan 2, 2017 (invoiceDate Jan 31, 2017)
Jan 18, 2017 (invoiceDate Jan 31, 2017)
Feb 5, 2017 (invoiceDate Feb 28,2017)
......
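How do I write the IF function? Thanks.
Use the EOMONTH() function, which returns the last day of the month a given number of months away from a date (0 means the same month):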
=EOMONTH(A2,0)
