How do I import a CSV file and use it as instance attributes for my class?
Here is the code I've written:
import random
import csv

class Cars:
    def __init__(self, driver, team):
        self.driver = driver
        self.team = team

class Race:
    def __init__(self, lap=55):
        self.lap = lap
        self._finished = []

    def race(self, list_of_cars):
        for c in list_of_cars:
            c.distance = 0
        while list_of_cars:
            for c in list_of_cars:
                c.distance += random.randint(100, 300)
                if c.distance >= self.lap:
                    self._finished.append(c)
                    list_of_cars.remove(c)

    def print_results(self):
        print("Tournament Result\n" + "_" * 18)
        for i, c in enumerate(self._finished):
            print(i + 1, c.driver, c.team)

cars_list = []
with open("driverandteam.csv", 'r') as file:
    csv_reader = csv.reader(file)
    for line in csv_reader:
        cars_list.append(Cars(line[0], line[1]))

r = Race(65)
cars = cars_list
r.race(cars)
r.print.results()
driverandteam.csv looks like this
Verstappen, Red Bull
Perez, Red Bull
Hamilton, Mercedes
Bottas, Mercedes
Leclerc, Ferrari
Sainz, Ferrari
Ricciardo, McLaren
Norris, McLaren
Ocon, Alphine
Alonso, Alphine
Tsunoda, AlphaTauri
Gasly, AlphaTauri
Vettel, Aston Martin
Stroll, Aston Martin
Latifi, Williams
Russell, Williams
Raikkonen, Alfa Romeo
Giovinazzi, Alfa Romeo
Mazepin, Haas
Schumacher, Haas
I keep getting a "list index out of range" error on line 36 of my code, but I'm not understanding why. How can I fix my code so it works?
I ran your code locally with the provided CSV (thanks for providing all the needed info!), and I had to make one small code change at the end:
r.print_results()
After that it all ran as expected and I got the following output:
Tournament Result
__________________
1 Verstappen Red Bull
2 Hamilton Mercedes
3 Leclerc Ferrari
4 Ricciardo McLaren
5 Ocon Alphine
6 Tsunoda AlphaTauri
7 Vettel Aston Martin
8 Latifi Williams
9 Raikkonen Alfa Romeo
10 Mazepin Haas
11 Perez Red Bull
12 Sainz Ferrari
13 Alonso Alphine
14 Stroll Aston Martin
15 Giovinazzi Alfa Romeo
16 Bottas Mercedes
17 Gasly AlphaTauri
18 Schumacher Haas
19 Norris McLaren
20 Russell Williams
So I'm unable to reproduce the error you're receiving, let us know if you're still encountering this error and can update any code/data to help reproduce.
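One side note: since each CSV row is written as "Verstappen, Red Bull", line[1] comes back with a leading space (" Red Bull"). A small variant of the loading loop, sketched here against the same driverandteam.csv and the Cars class from your code, strips that with csv's skipinitialspace option:
import csv

cars_list = []
with open("driverandteam.csv", newline='') as file:
    # skipinitialspace drops the blank after each comma, so line[1] is "Red Bull"
    csv_reader = csv.reader(file, skipinitialspace=True)
    for line in csv_reader:
        cars_list.append(Cars(line[0], line[1]))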
from cgitb import text
from bs4 import BeautifulSoup
import requests

website = 'https://www.marketplacehomes.com/rent-a-home/'
result = requests.get(website)
content = result.text
soup = BeautifulSoup(content, 'html.parser')
lists = soup.find_all('div', class_=('tt-rental-row'))

for list in lists:
    location = list.find('span', class_="renta;-adress")
    beds = list.find('span', class_="renta;-beds")
    baths = list.find('span', class_="renta;-beds")
    availability = list.find('span', class_="rental-date-available")
    info = [location, beds, baths, availability]
    print(info)
If I try to run the last line of code, I get:
"IndentationError: expected an indented block"
If I try to run each line separately, I get:
>>> location = list.find('span', class_="renta;-adress")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'list' has no attribute 'find'
I'm new to Python and I'm kinda stuck, can anyone please help me?
Note: Your code never runs the for-loop because your selection never matches any elements in the HTML. They are generated dynamically from data in another resource, and requests does not render websites like a browser; it only uses the static contents of the response.
Also be aware not to shadow built-in names, as that will cause errors. In your case list.find() raises one because the type object 'list' has no attribute called find. You can simply check these things using type():
type(soup)
-> it's a bs4.BeautifulSoup
type(soup.find_all('div', class_=('tt-rental-row')))
-> it's a bs4.element.ResultSet
type(list)
-> it's a type
So how do you reach your goal?
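One way is to request the JSON feed the page itself loads. A minimal sketch, assuming the endpoint below (the same one used with pandas next) returns a JSON array of listing objects with the field names shown in the DataFrame:
import requests

# fetch the JSON feed that the rental page loads dynamically
url = 'https://app.tenantturner.com/listings-json/2679'
listings = requests.get(url).json()

for item in listings:
    # each listing is a dict; pick whichever fields you need
    print(item['address'], item['city'], item['state'], item['dateAvailable'])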
You could also use pandas to directly create a DataFrame and slice it to your needs:
import pandas as pd
pd.read_json('https://app.tenantturner.com/listings-json/2679')
Output:
id dateActivated latitude longitude address city state zip photo title ... baths dateAvailable rentAmount acceptPets applyUrl btnUrl btnText virtualTour propertyType enableWaitlist
0 83600 8/22/2022 35.750499 -86.393972 4481 Jack Faulk St Murfreesboro TN 37127 https://ttimages.blob.core.windows.net/propert... 4481 Jack Faulk St ... 2.0 Now 2195 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/4481-jack... Schedule Viewing None Single Family False
1 100422 8/31/2022 30.277607 -95.472842 213 Skybranch Court Conroe TX 77304 https://ttimages.blob.core.windows.net/propert... 213 Skybranch Court ... 2.5 Now 2100 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/213-skybr... Schedule Viewing None Condo Unit False
2 106976 7/27/2022 28.274720 -82.298077 8127 Olive Brook Dr Wesley Chapel FL 33545 https://ttimages.blob.core.windows.net/propert... 8127 Olive Brook Dr ... 2.0 Now 2650 no pets https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/8127-oliv... Schedule Viewing None Single Family False
3 116188 8/15/2022 42.624023 -83.144614 735 Grace Ave Rochester Hills MI 48307 https://ttimages.blob.core.windows.net/propert... 735 Grace Ave ... 2.0 Now 1600 cats, small dogs, large dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/735-grace... Schedule Viewing None Single Family False
4 126846 8/22/2022 32.046455 -81.071181 1810 E 41st St Savannah GA 31404 https://ttimages.blob.core.windows.net/propert... 1810 E 41st St ... 1.0 Now 1395 small dogs https://app.propertyware.com/pw/application/#/... https://app.tenantturner.com/qualify/1810-e-41... Schedule Viewing None Single Family True
...
91 rows × 22 columns
Example:
To show only specific columns, simply pass a list of their names.
import pandas as pd
pd.read_json('https://app.tenantturner.com/listings-json/2679')[['address', 'beds', 'baths', 'dateAvailable']]
Output:
address beds baths dateAvailable
0 4481 Jack Faulk St 4 2.0 Now
1 213 Skybranch Court 3 2.5 Now
2 8127 Olive Brook Dr 3 2.0 Now
3 735 Grace Ave 3 2.0 Now
4 1810 E 41st St 3 1.0 Now
... ... ... ... ...
91 rows × 4 columns
Since list is a built-in name in Python, you shouldn't shadow it with a variable; try another name:
for myList in lists:
    location = myList.find('span', class_="renta;-adress")
    beds = myList.find('span', class_="renta;-beds")
    baths = myList.find('span', class_="renta;-beds")
    availability = myList.find('span', class_="rental-date-available")
    info = [location, beds, baths, availability]
    print(info)
Hello, I want to find the @account text in the title column and save it in a new CSV. Pandas can do it; I tried to make it work but it didn't.
This is my CSV: http://www.sharecsv.com/s/c1ed9790f481a8d452049be439f4e3d8/Newnormal.csv
This is my code:
import pandas as pd

data = pd.read_csv("Newnormal.csv")
data.dropna(inplace=True)
sub = '@'
data["Indexes"] = data["title"].str.find(sub)
print(data)
I want results like this:
From, to, title
Xavier5501, KudiiThaufeeq, RT @KudiiThaufeeq: Royal Rape, Royal Harassment, Royal Cocktail Party, Royal Pedo, Royal Bidding, Royal Maalee Bayaan, Royal Slavery..et
Thank you.
Reduce records to only those that have an "@" in the title.
Define a new "to" column as the text between "@" and ":".
You are left with some records where this leaves NaN in the "to" column; I've just filtered these out.
import pandas as pd

df = pd.read_csv("Newnormal.csv")
# keep only rows whose title contains an "@"
df = df[df["title"].str.contains("@") == True]
# extract the "@handle:" token into a new "to" column
df["to"] = df["title"].str.extract(r".*([@][A-Za-z0-9_]+[:])")
df = df[["from", "to", "title"]]
# drop rows where no match left NaN in "to", then save
df[~df["to"].isna()].to_csv("ToNewNormal.csv", index=False)
df[~df["to"].isna()]
Output:
from to title
1 Xavier5501 @KudiiThaufeeq: RT @KudiiThaufeeq: Royal Rape, Royal Harassmen...
2 Suzane24979006 @USAID_NISHTHA: RT @USAID_NISHTHA: Don't step outside your hou...
3 sandeep_sprabhu @USAID_NISHTHA: RT @USAID_NISHTHA: Don't step outside your hou...
4 oliLince @Timothy_Hughes: RT @Timothy_Hughes: How to Get a Salesforce Th...
7 rismadwip @danielepermana: RT @danielepermana: Pak kasus covid per hari s...
... ... ... ...
992 Reptoid_Hunter @sapiofoxy: RT @sapiofoxy: I literally can't believe we ha...
994 KPCResearch @sapiofoxy: RT @sapiofoxy: I literally can't believe we ha...
995 GreySparkUK @VoxSmartGlobal: RT @VoxSmartGlobal: The #newnormal will see mo...
997 Gabboa10 @HuShameem: RT @HuShameem: One of @PGO_MV admin staff test...
999 wanjirunjendu @ntvkenya: RT @ntvkenya: AAK's Mugure Njendu shares insig...
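If you want the bare handle without the leading "@" and trailing ":", a small variation of the extract call above (same assumptions about the data) captures just the name:
# capture only the characters between "@" and ":"
df["to"] = df["title"].str.extract(r"@([A-Za-z0-9_]+):")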
Given a dataframe as follows:
firstname lastname email_address \
0 Doug Watson douglas.watson@dignityhealth.org
1 Nick Holekamp nick.holekamp@rankenjordan.org
2 Rob Schreiner rob.schriener@wellstar.org
3 Austin Phillips austin.phillips@precmed.com
4 Elise Geiger egeiger@puracap.com
5 Paul Urick purick@diplomatpharmacy.com
6 Michael Obringer michael.obringer@lashgroup.com
7 Craig Heneghan cheneghan@west-ward.com
8 Kathy Hirst kathleen.hirst@sunovion.com
9 Stefan Bluemmers stefan.bluemmers@grunenthal.com
companyname
0 Dignity Health
1 Ranken Jordan Pediatric Bridge Hospital
2 WellStar Health System
3 Precision Medical Products, Inc.
4 puracap.com
5 Diplomat Specialty Pharmacy
6 Lash Group
7 West-Ward Pharmaceuticals
8 Sunovion Pharmaceuticals
9 Grünenthal Group
How could I create possible email addresses using common email patterns, such as firstlast@example.com, first.last@example.com, f.last@example.com, lastF@example.com, first_last@example.com, firstL@example.com, etc.?
df['email1'] = df.firstname.str.lower() + '.' + df.lastname.str.lower() + '@' + df.companyname.str.replace('\s+', '').str.lower() + '.com'
print(df['email1'])
Out:
0 doug.watson@dignityhealth.com
1 nick.holekamp@rankenjordanpediatricbridgehospi... --->problematic
2 rob.schreiner@wellstarhealthsystem.com
3 austin.phillips@precisionmedicalproducts,inc..com --->problematic
4 elise.geiger@puracap.com.com --->problematic
...
9995 terry.hanley@kempersportsmanagement.com
9996 christine.marks@geocomp.com
9997 darryl.rickner@doe.com
9998 lalit.sharma@lovelylifestyle.com
9999 parul.dutt@infibeam.com
Some of them seem quite problematic; could anyone help solve this issue? Thanks a lot.
EDITED:
print(df) after applying @Sajith Herath's solution:
Out:
firstname lastname companyname \
0 Nick Holekamp Ranken ...
email
0 nick. ...
You can use a method that creates permutations of the username with different separators, plus a max length that simplifies the domain derived from the company name, as follows:
import pandas as pd
import random
import re

data = {"firstname": ["Nick"], "lastname": ["Holekamp"],
        "companyname": ["Ranken Jordan Pediatric Bridge Hospital"]}
df = pd.DataFrame(data=data)
max_char = 5
emails = []

def simplify_domain(text):
    # long company names are reduced to their capital letters
    if len(text) > max_char:
        text = ''.join([c for c in text if c.isupper()])
        return text.lower()
    return re.sub(r"\s+", "", text).lower()

def username_permutations(first_name, last_name):
    # define separators
    separators = [".", "_", "-"]
    # lower case
    combinations = list(map(lambda x: f"{first_name.lower()}{x}{last_name.lower()}", separators))
    # append a random number to the tail of each combination
    n = random.randint(1, 100)
    combinations.extend(list(map(lambda x: f"{x}{n}", combinations)))
    return combinations

for index, row in df.iterrows():
    usernames = username_permutations(row["firstname"], row["lastname"])
    email_permutations = list(map(lambda x: f"{x}@{simplify_domain(row['companyname'])}.com", usernames))
    emails.append(','.join(email_permutations))

df["email"] = emails
The final result will be nick.holekamp@rjpbh.com, nick_holekamp@rjpbh.com, nick-holekamp@rjpbh.com, nick.holekamp66@rjpbh.com, nick_holekamp66@rjpbh.com, nick-holekamp66@rjpbh.com
You can modify the simplify_domain method to validate the given string, for example by removing "Inc" or ".com" values.
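For instance, a minimal sketch of such a cleanup (the suffix list here is purely illustrative):
import re

def clean_company(text):
    # drop a trailing ".com" and common legal suffixes before simplifying
    text = re.sub(r"\.com$", "", text.strip(), flags=re.IGNORECASE)
    text = re.sub(r",?\s*(inc|llc|ltd)\.?$", "", text, flags=re.IGNORECASE)
    return text

# "Precision Medical Products, Inc." -> "Precision Medical Products"
# "puracap.com" -> "puracap"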
I've just started to learn Python. I want to extract grass temperature data every three hours for academic purposes. The website is below:
https://www.weather.gov.hk/wxinfo/ts/display_graph_grass_e.htm?kp&
I tried to use BeautifulSoup to pull out the data.
Before I can find the wanted data, there is a <!--Content End--> comment after the JavaScript, and I can't scrape the script behind it. Why does that happen, and is there any solution?
The data is stored as a JavaScript array in the HTML page. We can use re and ast.literal_eval to retrieve it:
import re
import requests
from ast import literal_eval
url = 'https://www.weather.gov.hk/wxinfo/ts/display_graph_grass_e.htm?kp&'
html_text = requests.get(url).text
station_code = literal_eval(re.findall(r'StationCode\s*=.*?(\(.*?\))', html_text)[0])
station_name = literal_eval(re.findall(r'stnname\s*=.*?(\(.*?\))', html_text)[0])
station_height = literal_eval(re.findall(r'stn_height\s*=.*?(\(.*?\))', html_text)[0])
grass_temp = literal_eval(re.findall(r'grasstemp\s*=.*?(\(.*?\))', html_text)[0])
min_since_17 = literal_eval(re.findall(r'minSince17\s*=.*?(\(.*?\))', html_text)[0])
min_hour = literal_eval(re.findall(r'minHour\s*=.*?(\(.*?\))', html_text)[0])
min_minute = literal_eval(re.findall(r'minMinute\s*=.*?(\(.*?\))', html_text)[0])
rows = [*zip(station_code, station_name, station_height, grass_temp, min_since_17, min_hour, min_minute)]
headers = ['Station Code', 'Station Name', 'Station Height', 'Grass Temp', 'Min_since_17', 'Min Hour', 'Min Minute']
print(''.join('{: <20}'.format(d) for d in headers))
for row in rows:
    print(''.join('{: <20}'.format(d) for d in row))
Prints:
Station Code Station Name Station Height Grass Temp Min_since_17 Min Hour Min Minute
kp King's Park 65 25.4 25.3 23 35
tkl Ta Kwu Ling 15 25.4 24.8 17 00
tms Tai Mo Shan 955 21.3 21.3 07 19
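If you would rather work with a DataFrame than format the rows by hand, the same rows and headers from the snippet above can feed pandas directly (assuming pandas is installed):
import pandas as pd

# build a DataFrame from the tuples and header names produced above
df = pd.DataFrame(rows, columns=headers)
print(df)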
companies = pd.read_csv("http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv", index_col = 0)
companies.head()
I'm getting this error; please suggest what approaches should be tried:
"'utf-8' codec can't decode byte 0xb7 in position 7"
Try encoding as 'latin1' (this worked on macOS):
companies = pd.read_csv("http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv",
                        index_col=0,
                        encoding='latin1')
Downloading the file and opening it in Notepad++ shows it is ANSI-encoded. If you are on a Windows system, this should fix it:
import pandas as pd
url = "http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv"
companies = pd.read_csv(url, index_col = 0, encoding='ansi')
print(companies)
If you are not on Windows, you need to research how to convert ANSI-encoded text to something you can read.
See: https://docs.python.org/3/library/codecs.html#standard-encodings
Output:
Name Industry \
0 Walmart Retail
1 Sinopec Group Oil and gas
2 China National Petroleum Corporation Oil and gas
... ... ...
47 Hewlett Packard Enterprise Electronics
48 Tata Group Conglomerate
Revenue (USD billions) Employees
0 482 2200000
1 455 358571
2 428 1636532
... ... ...
47 111 302000
48 108 600000
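If you don't know the file's encoding in advance, a minimal sketch (pure standard library; the candidate list is just an illustration) that tries a few common encodings on the downloaded bytes:
from urllib.request import urlopen

url = "http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv"
raw = urlopen(url).read()

# try candidates in order; note latin1 decodes any byte sequence,
# so it works as a last resort even when it isn't the true encoding
for enc in ("utf-8", "cp1252", "latin1"):
    try:
        text = raw.decode(enc)
        print(f"decoded successfully with {enc}")
        break
    except UnicodeDecodeError:
        continue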