Import Excel data and keep datetime

Thanks in advance for your help. I'm importing data from Excel using openpyxl, and I'd like to convert the time strings into datetime objects. Below is the code I'm using:
import openpyxl, pprint, datetime

print('Opening workbook...')
wb = openpyxl.load_workbook('ACLogs_test_Conv2.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
print(sheet)
ACLogsData = {}
print('Reading rows...')
for row in range(2, sheet.max_row + 1):
    pangalan = sheet['B' + str(row)].value
    dates = sheet['D' + str(row)].value
    time = sheet['E' + str(row)].value
    ACLogsData.setdefault(pangalan, {})
    ACLogsData[pangalan].setdefault(dates, {})
    ACLogsData[pangalan][dates].setdefault(time)

Use datetime.strptime():
from datetime import datetime

FMT = '%H:%M'  # whatever format your times are in
for row in range(2, sheet.max_row + 1):
    pangalan = sheet['B' + str(row)].value
    dates = sheet['D' + str(row)].value
    time = datetime.strptime(sheet['E' + str(row)].value, FMT)
Note that since your code does import datetime (the module), you would otherwise have to write datetime.datetime.strptime.
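As a quick self-contained illustration of the conversion (the sample strings and dates here are made up):

```python
from datetime import datetime

FMT = '%H:%M'
t = datetime.strptime('08:30', FMT).time()   # -> datetime.time(8, 30)
print(t)                                     # 08:30:00

# the parsed time can then be attached to the date from column D
stamp = datetime.combine(datetime(2020, 3, 2).date(), t)
print(stamp)                                 # 2020-03-02 08:30:00
```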


Track database history

def startlog():
    id = enteruser.id
    x = time.localtime()
    sec = x.tm_sec
    min = x.tm_min
    hour = x.tm_hour + 1
    day = x.tm_mday
    date = f"{x.tm_mon}-{x.tm_mday}-{x.tm_year}"
    starttime = (day * 86400) + (hour * 3600) + (min * 60) + sec
    updatestart = "UPDATE log SET start = ?, date = ? WHERE ID = ?"
    c.execute(updatestart, (starttime, date, id))
    conn.commit()
I have this function startlog, and a clone of it, endlog.
My database table log consists of (name, starttime, endtime, date).
Is there any way to keep track of the changes?
Desired output:
Name / Time / Date
x / time1 / date1
x / time2 / date2
I tried creating a list so that every time I call the function it appends to the list, but the list disappears after the session ends.
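Since an in-memory list is lost when the program ends, one way to get the desired Name / Time / Date history is to append a new row per event to a table in the same database, instead of updating a single row. A minimal sqlite3 sketch (the history table and column names here are made up, not from the original schema):

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # use a file path so rows survive the session
c = conn.cursor()
c.execute("CREATE TABLE history (name TEXT, time TEXT, date TEXT)")

def logevent(name, time, date):
    # INSERT appends a row, so every call is kept instead of overwritten
    c.execute("INSERT INTO history VALUES (?, ?, ?)", (name, time, date))
    conn.commit()

logevent('x', 'time1', 'date1')
logevent('x', 'time2', 'date2')
rows = list(c.execute("SELECT * FROM history"))
print(rows)   # [('x', 'time1', 'date1'), ('x', 'time2', 'date2')]
```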
I used CSV for my case since it's just a personal project, with columns like ID / Time in / Time out / Total Time, and used the ID to determine which value to display. This is a snippet of my code (using tkinter for the GUI):
import csv

def csvwrite():
    with open('test.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        tup1 = (enteruser.id, log.start, log.end)
        writer.writerow(tup1)

def csvread():
    with open('test.csv', 'r') as csvfile:
        reader = csv.reader(csvfile)
        # filterer is defined elsewhere and matches rows by the current user's ID
        filtered = filter(filterer, reader)
        for i in filtered:
            print(i)
            historylbl = Label(historyWindow.historywndw, text=i)
            historylbl.pack()

Scraping Billboard Data with Python; code doesn't crawl to previous chart

First of all, I'm a relative newbie to coding. My goal is to scrape at least the last decade of Billboard Hot 100 charts using the Python code below with billboard.py. I have tried a few variants of while loop statements, but none of them manages to step back to the previous chart: either the code terminates prematurely or it raises AttributeError: 'ChartEntry' object has no attribute 'previousDate'.
Any advice on debugging this and/or corrective code is appreciated. Thank you.
import billboard
import csv

chart = billboard.ChartData('hot-100')
#chart = billboard.ChartData('hot-100', date=None, fetch=True, max_retries=5, timeout=25)
f = open('Hot100.csv', 'w')
headers = 'title, artist, peakPos, lastPos, weeks, rank, date\n'
f.write(headers)
while chart.previousDate:
    date = chart.date
    for chart in chart:
        title = chart.title
        artist = chart.artist
        peakPos = str(chart.peakPos)
        lastPos = str(chart.lastPos)
        weeks = str(chart.weeks)
        rank = str(chart.rank)
        f.write('\"' + title + '\",\"' + artist.replace('Featuring', 'Feat.') + '\",' + peakPos + ',' + lastPos + ',' + weeks + ',' + rank + ',' + date + '\n')
    chart = billboard.ChartData('hot-100', chart.previousDate)
f.close()
I figured it out: in the original for loop, the loop variable chart was shadowing the chart object itself, so a ChartEntry ended up where the ChartData was expected.
My revised code is below:
import billboard
import csv

chart = billboard.ChartData('hot-100')
#chart = billboard.ChartData('hot-100', date=None, fetch=True, max_retries=5, timeout=25)
f = open('hot-100.csv', 'w')
headers = 'title, artist, peakPos, lastPos, weeks, rank, date\n'
f.write(headers)
date = chart.date
while chart.previousDate:
    date = chart.date
    for song in chart:
        title = song.title
        artist = song.artist
        peakPos = str(song.peakPos)
        lastPos = str(song.lastPos)
        weeks = str(song.weeks)
        rank = str(song.rank)
        f.write('\"' + title + '\",\"' + artist.replace('Featuring', 'Feat.') + '\",' + peakPos + ',' + lastPos + ',' + weeks + ',' + rank + ',' + date + '\n')
    chart = billboard.ChartData('hot-100', chart.previousDate)
f.close()
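As a side note, the hand-built quoting in the f.write line breaks as soon as a title contains a quote or an extra comma; csv.writer (the csv module is already imported above) handles the escaping. A small self-contained sketch with made-up rows, writing to an in-memory buffer instead of the network-fed chart:

```python
import csv
import io

# made-up stand-ins for (title, artist, peakPos, lastPos, weeks, rank, date)
rows = [('Song "A"', 'Artist, The', 1, 2, 10, 1, '2020-03-02')]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['title', 'artist', 'peakPos', 'lastPos', 'weeks', 'rank', 'date'])
writer.writerows(rows)   # quoting and escaping handled automatically
print(buf.getvalue())
```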

Saving xlsx files that aren't corrupted via openpyxl

I am generating around 10000 xlsx files to run a Monte Carlo simulation with a program called AMPL.
To generate these files I am using the Python script below with openpyxl. The resulting xlsx file still needs to be opened, "save as"-ed, and replaced as the same xlsx before AMPL will recognize it.
I only know how to do this by hand, so I am looking for suggestions on:
1) What I can modify in the Python script to avoid the file corruption, so I don't have to save and replace each file by hand.
2) How to "save as" a batch of xlsx files back to the same file names.
Here is the code:
"""
Created on Mon Mar 2 14:59:43 2020
USES OPENPYXL to generate mc tables WITH NAMED RANGE
#author: rsuthar
"""
import openpyxl
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import scipy.stats as stats
#import seaborn as sns

for k in range(1, 2):
    wb = openpyxl.Workbook()
    sheet = wb.active
    #named range for COL table so that AMPL can read it
    new_range = openpyxl.workbook.defined_name.DefinedName('COL', attr_text='Sheet!$A$1:$D$64')
    wb.defined_names.append(new_range)
    #Probability
    #Storage temp as truncated normal
    #temperature as normal mean 55 with 5F variation
    storagetempfarenht = 55.4
    storagetempkelvin = (storagetempfarenht + 459.67) * (5.0/9.0)
    highesttemp = 60.8
    lowesttemp = 50
    sigma = ((highesttemp + 459.67) * (5.0/9.0)) - storagetempkelvin
    mu, sigma = storagetempkelvin, sigma
    lower, upper = mu - 2*sigma, mu + 2*sigma
    temp = stats.truncnorm.rvs((lower - mu) / sigma, (upper - mu) / sigma, loc=mu, scale=sigma, size=1)
    #Generate the color after each condition with temp uncertainty
    kterm = 0.0019*math.exp((170604/8.314)*((1/288.15)-(1/temp)))
    Hterm = '=16.949*EXP((-0.025)*(42 +((124-42)/(1+((EXP(%f*A2:A64*(124-42)))*(124-(16.949*EXP((-0.025)*C2:C64))/((16.949*EXP((-0.025)*C2:C64)-42))))))))' % kterm
    #First column
    sheet['A1'] = 'DAYD'
    number_of_repeats = 5
    days = range(1, 13)
    current_cell_num = 2
    for repeat in range(number_of_repeats):
        for day in days:
            cell_string = 'A%d' % current_cell_num
            sheet[cell_string] = day
            current_cell_num = current_cell_num + 1
        if repeat == number_of_repeats - 1:
            for day in range(13, 16, 1):
                cell_string = 'A%d' % current_cell_num
                sheet[cell_string] = day
                current_cell_num = current_cell_num + 1
    #Second Column
    sheet['B1'] = 'CROP'
    for i, rowOfCellObjects in enumerate(sheet['B2':'B64']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 'TA'
    #Third Column
    sheet['C1'] = 'QUAL'
    for i, rowOfCellObjects in enumerate(sheet['C2':'C13']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 2
    for i, rowOfCellObjects in enumerate(sheet['C14':'C25']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 3
    for i, rowOfCellObjects in enumerate(sheet['C26':'C37']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 4
    for i, rowOfCellObjects in enumerate(sheet['C38':'C49']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 5
    for i, rowOfCellObjects in enumerate(sheet['C50':'C64']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = 1
    #fourth Column
    sheet['D1'] = 'COL'
    for i, rowOfCellObjects in enumerate(sheet['D2':'D64']):
        for n, cellObj in enumerate(rowOfCellObjects):
            cellObj.value = Hterm
    #save the file every time
    wb.save(filename='COL' + str(k) + '.xlsx')
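A likely reason the files need a manual "save as" is that openpyxl writes formula strings without cached results, so a consumer that reads cached values sees empty cells until Excel recalculates and re-saves. If AMPL only needs the numbers, one workaround is to evaluate the expression in Python and write numeric values instead of the formula. A sketch of the column-D expression translated to Python for a single row (the inputs are hypothetical, and the translation should be double-checked against the original formula):

```python
import math

def hterm(kterm, a, c):
    # Python translation of the spreadsheet formula written into column D,
    # evaluated for one row: a is the day value (col A), c the quality (col C)
    inner = 16.949 * math.exp(-0.025 * c)
    denom = 1 + math.exp(kterm * a * (124 - 42)) * (124 - inner / (inner - 42))
    return 16.949 * math.exp(-0.025 * (42 + (124 - 42) / denom))

print(hterm(0.0019, 1, 2))   # a finite positive number instead of a formula string
```

Writing cellObj.value = hterm(kterm, day, qual) (with each row's actual day and quality values) would then produce a plain-numbers workbook that needs no recalculation.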

Pandas and adding column and data to a table

Any idea how to add the division (j) to each row? The program runs through each division (1 through 5), and I want to record which division each row came from. The headers at the top of the table are 'Name, Gender, State, Position, Grad, Club/HS, Rating, Commitment, Division', but right now the Division column is blank because I don't write it anywhere. Thanks for your help.
import pandas as pd

max_page_num = 10
with open('results.csv', 'a', newline='') as f:
    f.write('Name, Gender, State, Position, Grad, Club/HS, Rating, Commitment, Division\n')

def division():
    for j in range(1, 5):
        division = str(j)
        for i in range(max_page_num):
            print('page:', i)
            graduation = str(2020)
            area = "commitments"  # "commitments" or "clubplayer"
            gender = "m"
            page_num = str(i)
            source = "https://www.topdrawersoccer.com/search/?query=&divisionId=" + division + "&genderId=m&graduationYear=" + graduation + "&playerRating=&pageNo=" + page_num + "&area=" + area + ""
            all_tables = pd.read_html(source)
            df = all_tables[0]
            print('items:', len(df))
            df.to_csv('results.csv', header=False, index=False, mode='a')

division()
Simply adding the column 'division' should do it if I understand correctly.
import pandas as pd

max_page_num = 10
with open('results.csv', 'a', newline='') as f:
    f.write('Name, Gender, State, Position, Grad, Club/HS, Rating, Commitment, Division\n')

def division():
    for j in range(1, 6):  # 1 through 5; range(1, 5) would stop at 4
        division = str(j)
        for i in range(max_page_num):
            print('page:', i)
            graduation = str(2020)
            area = "commitments"  # "commitments" or "clubplayer"
            gender = "m"
            page_num = str(i)
            source = "https://www.topdrawersoccer.com/search/?query=&divisionId=" + division + "&genderId=m&graduationYear=" + graduation + "&playerRating=&pageNo=" + page_num + "&area=" + area + ""
            all_tables = pd.read_html(source)
            df = all_tables[0]
            df['division'] = division
            print('items:', len(df))
            df.to_csv('results.csv', header=False, index=False, mode='a')

division()
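The key line is df['division'] = division: assigning a scalar to a new column broadcasts the value to every row of the frame. A tiny self-contained illustration with made-up data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['a', 'b'], 'Rating': [90, 85]})
df['division'] = '1'   # scalar assignment fills the whole new column
print(df['division'].tolist())   # ['1', '1']
```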

Python issue with reading and calculating data from excel file

I have attached a screenshot of the Excel file I am working with.
I am trying to read this Excel file, which has all the states (column B), counties (column C), and populations (column D). I want to calculate the total population for each state.
I know there are ways to do this in fewer lines of easily understandable code, and I would appreciate those, but I would also like to know how to do it the way I am thinking: first find the unique state names, then loop through the sheet adding up the populations by state.
Here is my code:
x = wb.get_sheet_names()
sheet = wb.get_sheet_by_name('Population by Census Tract')
PopData = {}
StateData = []
i = 3
j = 0
k = ""
#First value entered
StateData.append(sheet['B' + str(2)].value)
#Unique State Values calculated
for row in range(i, sheet.max_row + 1):
    if any(sheet['B' + str(row)].value in s for s in StateData):
        i = i + 1
    else:
        StateData.append(sheet['B' + str(row)].value)
print(StateData)
#Each State's Population calculated
for s in StateData:
    for row in range(2, sheet.max_row + 1):
        if sheet['B' + str(row)].value == StateData[s]:
            j = j + sheet['D' + str(row)].value
    PopData[StateData[s]] = j
print(PopData)
I am getting this error:
if sheet['B' + str(row)].value == StateData[s]:
TypeError: list indices must be integers or slices, not str
In the following:
for s in StateData:
    for row in range(2, sheet.max_row + 1):
        if sheet['B' + str(row)].value == StateData[s]:
            j = j + sheet['D' + str(row)].value
    PopData[StateData[s]] = j
s is already an element of the StateData list, not an index into it. What you want to do is probably:
for s in StateData:
    j = 0  # reset the running total for each state
    for row in range(2, sheet.max_row + 1):
        if sheet['B' + str(row)].value == s:
            j = j + sheet['D' + str(row)].value
    PopData[s] = j
or
for i, s in enumerate(StateData):
    j = 0
    for row in range(2, sheet.max_row + 1):
        if sheet['B' + str(row)].value == StateData[i]:
            j = j + sheet['D' + str(row)].value
    PopData[StateData[i]] = j
but the first alternative is more elegant and (maybe) slightly faster. Note that j also has to be reset at the start of each outer iteration; otherwise the totals accumulate across states.
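For comparison, the whole aggregation can also be done in one pass with a dictionary, without precomputing the unique states. A sketch using plain (state, population) tuples in place of the worksheet rows (the sample numbers are made up):

```python
from collections import defaultdict

# stand-ins for the column B and column D values of each row
rows = [('Alabama', 100), ('Alaska', 50), ('Alabama', 200), ('Alaska', 25)]

PopData = defaultdict(int)
for state, population in rows:
    PopData[state] += population   # missing keys start at 0 automatically

print(dict(PopData))   # {'Alabama': 300, 'Alaska': 75}
```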
