This is code a friend of mine helped me write to get files from different measurement systems, with different timestamps and layouts, into one .csv file.
You enter the time period (or, as in the case below, one day) and the code looks for these timestamps in different files and folders, adjusts the timestamps (different time zones etc.) and puts everything into one .csv file that is easy to plot. Now I need to rewrite it for different layouts. I managed to get everything working, but I don't want to enter every single day manually into the code :-( , because I'd need to enter it three times: to get the data for one day into one file, dateFrom and dateTo need to be the same, and in the writecsv section you have to enter the date again.
Here's the code:
from importer import cloudIndices, weatherall, writecsv,averagesolar
from datetime import datetime
from datetime import timedelta
dateFrom = datetime.strptime("2010-06-21", '%Y-%m-%d')
dateTo = datetime.strptime("2010-06-21", '%Y-%m-%d')
....
code
code
....
writecsv.writefile("data_20100621", header, ciData)
What can I change here so that I get an automatic loop over all dates between e.g. 2010-06-21 and 2011-06-21?
P.S. If I entered 2010-06-21 in dateFrom and 2011-06-21 in dateTo, I'd get one huge .csv file with all the data in it. I thought that would be a great idea, but it's not really good for plotting, so I ended up manually entering day after day. That isn't bad if you do it regularly for 2 or 3 days, but now many dates have shown up and I need to run the code over them :-(
Generally speaking, you should be using datetime.datetime and datetime.timedelta. Here is an example:
from datetime import datetime
from datetime import timedelta
# advance 5 days at a time
delta = timedelta(days=5)
start = datetime(year=1970, month=1, day=1)
end = datetime(year=1970, month=2, day=13)
print("Starting from: %s" % str(start))
while start < end:
    print("advanced to: %s" % str(start))
    start += delta
print("Finished at: %s" % str(start))
This little snippet creates a start time, an end time and a delta to advance by, using the tools Python provides. Note that the loop stops before reaching the end time. You can modify it to fit your needs or apply it in your own logic.
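Applied to the question, the same idea can step one day at a time and derive each output name from the current date. This is a minimal sketch, assuming the file name pattern data_YYYYMMDD from the question. daily_dates and filename_for are hypothetical helper names, and the call to writecsv.writefile is only shown as a comment because the importer module is not available here:

```python
from datetime import datetime, timedelta


def daily_dates(date_from, date_to):
    """Yield every day from date_from to date_to, both inclusive."""
    current = date_from
    while current <= date_to:
        yield current
        current += timedelta(days=1)


def filename_for(day):
    """Build the per-day file name, e.g. data_20100621 (pattern assumed)."""
    return "data_" + day.strftime("%Y%m%d")


date_from = datetime.strptime("2010-06-21", "%Y-%m-%d")
date_to = datetime.strptime("2010-06-24", "%Y-%m-%d")

names = []
for day in daily_dates(date_from, date_to):
    # In the real script: set dateFrom = dateTo = day, run the processing,
    # then call writecsv.writefile(filename_for(day), header, ciData).
    names.append(filename_for(day))

print(names)
```

With this shape the date is entered only twice (start and end of the whole range), and each day still ends up in its own file.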
I'm working on a small project and I need to display a date from an API. The API uses milliseconds and I can't find a way to get the date without the time.
So far I haven't found anything useful on the internet.
The code I was using for this is:
import datetime
ts = ...  # millisecond timestamp from the API (value not shown)
date = datetime.datetime.fromtimestamp(ts / 1000, tz=datetime.timezone.utc)
print(date)
But it prints something like 2010-10-10 10:10:10.100000+00:00
The only thing I want from this is the first part (2010-10-10).
How can I get just the date?
1. Naive Solution:
If you just want the date, you can convert the datetime object to a string and use the split method:
Code:
year_month_day = str(date).split(" ")[0]  # date is a datetime object, so convert it to str first
print(year_month_day)
Output:
2010-10-10
2. Using strftime():
# using strftime
from datetime import datetime, timezone
ts = 1588234567899 # Unix time in milliseconds
ts /= 1000 # Convert milliseconds to seconds
datetime_object = datetime.fromtimestamp(ts, tz=timezone.utc) # Create a UTC datetime object
date = datetime_object.strftime('%Y-%m-%d') # Format just the date part as a string
print(date)
Output:
2020-04-30
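A third option, sketched here, avoids string handling altogether: a datetime object has a date() method that returns only the calendar date. The helper name ms_to_date is made up for this example:

```python
from datetime import datetime, timezone


def ms_to_date(ms):
    """Convert Unix time in milliseconds to a UTC calendar date."""
    moment = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    return moment.date()


day = ms_to_date(1588234567899)
print(day)  # 2020-04-30
```

The result is a datetime.date object, so it can still be compared or formatted later instead of being a plain string.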
I have some code whose run time I record, and I want to measure the average time it took to run on multiple computers, but I just can't figure out how to use the datetime module in Python.
Here is what my procedure looks like:
1) The code simply writes the log into a text file, where the timestamp is produced like this:
t1=datetime.datetime.now()
...
t2=datetime.datetime.now()
stamp= t2-t1
The stamp variable is then written to, say, log.txt, so in the log file it looks like 0:07:23.160896, which seems to be %H:%M:%S.%f format.
2) Then I run a second Python script which reads log.txt and gets the 0:07:23.160896 value as a string.
The problem is that I don't know how to work with this value: if I parse it as a datetime, it also gets an imaginary year, month and day, which I don't want. I just want to work with hours, minutes, seconds and microseconds, to add them up or take an average.
For example, in LibreOffice I can add 0:07:23.160896 to 0:00:48.065130, which gives 0:08:11.226026, and then divide by 2, which gives 0:04:05.613013. I don't know how to do the same in Python.
I have tried everything, but neither datetime.datetime nor datetime.timedelta seems to allow multiplication and division like that. If I do y=datetime.datetime.strptime('0:07:23.160896','%H:%M:%S.%f') it gives 1900-01-01 00:07:23.160896, and I can't compute y*2 because datetime doesn't allow such arithmetic. If I convert it into a timedelta, it also multiplies the year, which is ridiculous. I just want to add, subtract and multiply times.
Please help me find a way to do this, not just for 2 values but ideally for the average of an entire list of timestamps, something like average(['0:07:23.160896' , '0:00:48.065130', '0:00:14.517086',...]).
I want to calculate the average of many timestamps and get the result in the same format, just as you can select a column in LibreOffice and apply the AVERAGE() function to get the average timestamp of that column.
As you have done, first read the string into a datetime object using strptime: t = datetime.datetime.strptime(single_time,'%H:%M:%S.%f')
After that, convert the time part of the datetime into a timedelta, so you can easily calculate with it: tdelta = datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second, microseconds=t.microsecond)
Now you can add timedelta objects and divide them by a number, and at the end of the calculation convert the result back into a string with str(taverage).
import datetime
times = ['0:07:23.160896', '0:00:48.065130', '0:12:22.324251']
# convert each time string into a timedelta and add it to the running sum
tsum = datetime.timedelta()
count = 0
for single_time in times:
    t = datetime.datetime.strptime(single_time,'%H:%M:%S.%f')
    tdelta = datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second, microseconds=t.microsecond)
    tsum = tsum + tdelta
    count = count + 1
# a timedelta can be divided by an integer
taverage = tsum / count
average_time = str(taverage)
print(average_time)
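The detour through strptime can also be avoided by building the timedelta straight from the string. This is a sketch under the assumption that every entry looks like H:MM:SS or H:MM:SS.ffffff (durations of a day or more, which print as "1 day, ...", are not handled). parse_duration and average_duration are hypothetical names:

```python
from datetime import timedelta


def parse_duration(text):
    """Turn 'H:MM:SS' or 'H:MM:SS.ffffff' into a timedelta."""
    hours, minutes, seconds = text.strip().split(":")
    whole, _, fraction = seconds.partition(".")
    # Pad or cut the fraction to exactly six digits (microseconds).
    micros = int(fraction.ljust(6, "0")[:6]) if fraction else 0
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=int(whole), microseconds=micros)


def average_duration(texts):
    """Average a non-empty list of duration strings."""
    durations = [parse_duration(t) for t in texts]
    return sum(durations, timedelta()) / len(durations)


print(average_duration(["0:07:23.160896", "0:00:48.065130"]))  # 0:04:05.613013
```

The printed value matches the LibreOffice result quoted in the question for the same two inputs.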
I am trying to figure out how to pass a date inputted at a prompt by the user to pandas to search by date. I have both the search and the input prompt working separately but not together. I will show you what I mean. And maybe someone can tell me how to properly pass the date to pandas for the search.
This is how I use pandas to extract rows from an Excel sheet if any cell in column emr_first_access_date is greater than or equal to '2019-09-08'.
It works with the following code:
import pandas as pd
HISorigFile = "C:\\folder\\inputfile1.xlsx"
#opens excel worksheet
df = pd.read_excel(HISorigFile, sheet_name='Non Live', skiprows=8)
#locates the columns I want to write to file including date column emr_first_access_date if greater than or equal to '2019-09-08'
data = df.loc[df['emr_first_access_date'] >= '2019-09-08', ['site_name','subs_num','emr_id', 'emr_first_access_date']]
#sorts the data
datasort = data.sort_values("emr_first_access_date",ascending=False)
#this creates the file (data already sorted) in panda with date and time.
datasort.to_excel(r'C:\folder\sitesTestedInLastWeek.xlsx', index=False, header=True)
However, the date above is hardcoded of course. So, I need the user running this script to input the date. I created a very basic working input prompt with the following:
import datetime
#prompts for input date
TestedDateBegin = input('Enter beginning date to search for sites tested in YYYY-MM-DD format')
year, month, day = map(int, TestedDateBegin.split('-'))
date1 = datetime.date(year, month, day)
Obviously I want to pass TestedDateBegin to pandas, changing the pertinent code line:
data = df.loc[df['emr_first_access_date'] >= '2019-09-08', ['site_name','subs_num','emr_id', 'emr_first_access_date']]
to something like:
data = df.loc[df['emr_first_access_date'] >= 'TestedDateBegin', ['site_name','subs_num','emr_id', 'emr_first_access_date']]
Obviously this doesn't work. But how do I proceed? I am very new to programming, so it's not always clear to me how to proceed. Does the date entered in TestedDateBegin need to be added to a return? Or should it be put in a single-item list? What is the right approach? Thanks!
This is resolved.
I had to remove the single quotes around TestedDateBegin because Python, of course, interpreted that as a string literal and not as a variable. Living and learning. :-)
data = df.loc[df['emr_first_access_date'] >= TestedDateBegin,['site_name','subs_num','emr_id', 'emr_first_access_date']]
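Since the value comes from a prompt, it may be worth validating it before it reaches the comparison. This is a minimal sketch using only the standard library. parse_search_date is a hypothetical helper, and it returns a normalized YYYY-MM-DD string so the result could be used in place of TestedDateBegin:

```python
from datetime import datetime


def parse_search_date(text):
    """Return text as a normalized YYYY-MM-DD string or raise ValueError."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date().isoformat()


print(parse_search_date(" 2019-09-08 "))

try:
    parse_search_date("08/09/2019")
except ValueError as error:
    # A wrong format is rejected before it can reach the pandas filter.
    print("Rejected:", error)
```

In the script, the prompt could be repeated in a loop until parse_search_date stops raising ValueError.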
Suppose I have a .txt file that looks like this:
0 day0 event_data0
1 day1 event_data1
2 day2 event_data2
3 day3 event_data3
4 day4 event_data4
........
n dayn event_datan
#where:
#n is the event index
#dayn is the day when the event happened. year-month-day format
#event_datan is what happened at the event.
From this file, I need to create a new one with all the events that happened between two specific dates, for example after September 7th, 2003 and before Christmas 2006.
Could someone help me with this problem? Much appreciated!
Looks like the datetime module is what you'll want. Iterate through the file line by line until the timedelta between the current line's date and your beginning threshold date (Sept 7, 2003 in your example) is positive; stop iterating when you breach Christmas 2006. Load the lines into either a pandas dataframe or numpy array.
Lucas, you can try this:
import re
from datetime import datetime as dt
date_start = dt.strptime('2003-09-07', "%Y-%m-%d").date()
date_end = dt.strptime('2006-12-25', "%Y-%m-%d").date()
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')
# opening events.txt with 'w' replaces any output from a previous run
with open('file.txt', 'r') as source, open('events.txt', 'w') as target:
    for line in source:  # iterate line by line
        match = date_pattern.search(line)
        if match is None:
            continue  # skip lines that contain no date
        date_converted = dt.strptime(match.group(0), '%Y-%m-%d').date()
        if date_start < date_converted < date_end:
            target.write(line)
Change date_start and date_end to your desired interval. The code searches each line for a date in yyyy-mm-dd format, checks whether it lies strictly between the start and end dates and, if so, writes the line to events.txt.
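The same filter can be written without a regular expression if the date is always the second whitespace-separated field, as in the sample layout from the question. This is a sketch under that assumption. events_between is a hypothetical name, the sample lines are made up, and the function works on any iterable of lines, including an open file:

```python
from datetime import date


def events_between(lines, start, end):
    """Yield lines whose date field lies strictly between start and end."""
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # not an event line
        try:
            day = date.fromisoformat(fields[1])
        except ValueError:
            continue  # second field is not a year-month-day date
        if start < day < end:
            yield line


# Made-up sample data in the "index day event_data" layout.
sample = [
    "0 2003-09-07 conference\n",
    "1 2004-01-15 workshop\n",
    "2 2006-12-24 party\n",
    "3 2006-12-25 christmas\n",
]
selected = list(events_between(sample, date(2003, 9, 7), date(2006, 12, 25)))
print("".join(selected), end="")
```

Because both comparisons are strict, events on the boundary dates themselves are left out, which matches "after" and "before" in the question.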
I assume your file is tab delimited, so you can use the pandas package to read it. Just add a first row with the column names (index, date, event), separated by tabs, to your .txt file and then read in the data.
import pandas
df = pandas.read_csv('txt_file.txt', sep='\t', index_col=0)
#index_col=0 just sets your first column as index
After you've done so, follow the steps from this link. That will essentially answer your question on how to select events between two dates by simply using this package. That way you can return a new data frame only with those events you need.
You have not said whether you need this specifically for "after September the 7th 2003 and before Christmas 2006", or whether the two dates can vary.
If it is specifically for "after September the 7th 2003 and before Christmas 2006", then in my opinion you can get the result with the regex module:
import re
c=r"([0-9]{1,2}\s+)(2003-09-07).+(2006-12-25)\s+\w+"
with open("event.txt","r") as f:
    file_data=f.readlines()
regex_search=re.search(c,str(file_data))
print(regex_search.group())
You can also use conditions with group(), or you can use the findall() method.
Is there a readily available command in Python's datetime module to parse a discrete time range given as HH:MM-HH:MM or HH:MM:ss-HH:MM:ss (e.g. 07:30-12:45)? Such a range would be entered like that in a single cell of a CSV file that the script would access.
Or, might specifying just the start time and then a timedelta value be a better idea?
You can just use split() to separate the two time values, then parse each as a datetime.datetime type and then calculate the timedelta.
Example:
from datetime import datetime
time_string = "07:30-12:45"
separate_times = time_string.split("-")
parsed_times = [datetime.strptime(t, "%H:%M") for t in separate_times]
difference = parsed_times[1] - parsed_times[0]
Calling difference.total_seconds() will return the total seconds between the two times and if you aren't interested in the direction of the difference between the times, you can use abs(difference.total_seconds()).
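To also cover the HH:MM:ss-HH:MM:ss form from the question, the format string can be picked from the number of colons. This is a sketch; parse_time_range is a hypothetical helper and assumes both times fall on the same day:

```python
from datetime import datetime, timedelta


def parse_time_range(text):
    """Split 'HH:MM-HH:MM' or 'HH:MM:SS-HH:MM:SS' into start, end and duration."""
    start_text, end_text = (part.strip() for part in text.split("-"))

    def parse(value):
        # Two colons mean seconds are present.
        fmt = "%H:%M:%S" if value.count(":") == 2 else "%H:%M"
        return datetime.strptime(value, fmt)

    start, end = parse(start_text), parse(end_text)
    return start.time(), end.time(), end - start


start, end, duration = parse_time_range("07:30-12:45")
print(start, end, duration)  # 07:30:00 12:45:00 5:15:00
```

Returning time objects rather than datetime objects avoids carrying the placeholder date that strptime fills in when no date is given.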