networkx output scale problem with matplotlib (re-post) - python-3.x

I'm re-posting this question since I didn't make a good example code in last question.
I'm trying to make a nodes to set in specific location.
But I found out that the output drawing is not... fixed. Let me show you the pic.
So this is the one I make with 10 nodes. worked perfectly as I intended.
Also it has plt.text on the bottom left.
And here's the other picture
As you can see, something is wrong. plt.text is gone, and USA's location is weird. Actually that location is where DEU is located in the first pic. Both pics use same code.
Now, let me show you some of my code.
for spec_df, please download from my gdrive:
https://drive.google.com/drive/folders/11X_i5-pRLGBfQ9vIwQ3hfDU5EWIfR3Uo?usp=sharing
auto_flag = 0
spec_df=pd.read_stata("C:\\"Your_file_loc"\\CombinedHS6_example.dta")
#top_10_list = ["USA","CHN","KOR"] (Try for three nodes)
#or
#auto_flag = 1 (Try for 10 nodes)
df_p = spec_df[['partneriso3','tradevalue']]
df_p = df_p.groupby('partneriso3').sum().reset_index()
df_r = spec_df[['reporteriso3','tradevalue']]
df_r = df_r.groupby('reporteriso3').sum().reset_index()
df_r = df_r.rename(columns={'reporteriso3': 'Nation'})
df_r = df_r.rename(columns={'tradevalue': 'tradevalue_r'})
df_p = df_p.rename(columns={'partneriso3': 'Nation'})
df_s = pd.merge(df_r, df_p, on='Nation', how='outer').fillna(0)
df_s["final"] = df_s['tradevalue'] + df_s['tradevalue_r']
if auto_flag == 1:
df_s = df_s.sort_values(by=['final'], ascending = False).reset_index()
cut = df_s[:10]
else:
cut = df_s[(df_s['Nation'].isin(top_10_list))]
cut['final'] = cut['final'].apply(lambda x: normalize(x, cut['final'].max()))
cut['font_size'] = cut['final'] * 13
cut['final'] = cut['final'] * 1500
top_10_list = list(cut["Nation"])
top10 = spec_df[(spec_df['reporteriso3'].isin(top_10_list))&(spec_df['partneriso3'].isin(top_10_list))]
top10['tradevalue'] = top10['tradevalue'].apply(lambda x: normalize(x, top10['tradevalue'].max()))
top10['tradevalue'] = top10['tradevalue']*10
plt.figure(figsize=(10,10), dpi = 100)
G = nx.from_pandas_edgelist(top10, 'reporteriso3', 'partneriso3', 'tradevalue', create_using= nx.DiGraph())
widths = nx.get_edge_attributes(G,'tradevalue')
pos = {}
pos_cord = [(-0.30779309, -0.26419882), (0.26767895, 0.19524759), (-0.38479095, 0.88179998), (0.33785317, 0.96090914), (0.94090464, 0.40707934), (0.9270665, -0.38403114), (0.41246223, -0.85684049), (-0.32083322, -1.0), (-0.99724456, -0.34947554), (-0.87530367, 0.40950993)]
for t in range(len(top_10_list)):
if top_10_list == "":
continue
else:
pos[top_10_list[t]] = pos_cord[t]
pos_nodes = nudge(pos, 0, 0.12)
nx.draw_networkx_edges(G,pos, width=list(widths.values()), edge_color = '#9ECAE4')
nx.draw_networkx_nodes(G, pos=pos, nodelist = cut['Nation'], node_size= cut['final'], node_color ='#AB89EF', edgecolors ='#000000')
nx.draw_networkx_labels(G,pos_nodes, font_size=15)
plt.text(-1.15,-1.15,s='hs : ')
plt.savefig(location,dpi=300)
Sorry for the crude code. But I want to ask that I'm using fixed coordinates. So nodes are not supposed to move there location. So I think the plt's size is kinda interacting with the contents...? But I don't know how it does that.
Could anyone enlighten me please? This drives me crazy...

Thanks to #Paul Brodersen's comment, I found a way to fix the location.
I just added these codes in my codes.
fig = plt.figure(figsize=(10,10), dpi = 100)
axes = fig.add_axes([0,0,1,1])
axes.set_xlim([-1.3,1.3])
axes.set_ylim([-1.3,1.3])
Thank you for the help again!

Related

ProjError: Error creating Transformer from CRS.:

I am having an issue with Geopandas and PyProj.
I am loading a premade shp file from GeoPandas
borough = gpd.read_file(gpd.datasets.get_path('nybb'))
When trying to do a crs transformation with the following code:
borough = borough.to_crs({'init': 'epsg:4326'})
I get the error:
ProjError: Error creating Transformer from CRS.: (Internal Proj Error: proj_create_operations: SQLite error on SELECT source_crs_auth_name, source_crs_code, target_crs_auth_name, target_crs_code, cov.auth_name, cov.code, cov.table_name, area.south_lat, area.west_lon, area.north_lat, area.east_lon, ss.replacement_auth_name, ss.replacement_code FROM coordinate_operation_view cov JOIN area ON cov.area_of_use_auth_name = area.auth_name AND cov.area_of_use_code = area.code LEFT JOIN supersession ss ON ss.superseded_table_name = cov.table_name AND ss.superseded_auth_name = cov.auth_name AND ss.superseded_code = cov.code AND ss.superseded_table_name = ss.replacement_table_name AND ss.same_source_target_crs = 1 WHERE ((source_crs_auth_name = ? AND source_crs_code = ? AND target_crs_auth_name = ? AND target_crs_code = ?) OR (source_crs_auth_name = ? AND source_crs_code = ? AND target_crs_auth_name = ? AND target_crs_code = ?)) AND cov.deprecated = 0 AND cov.auth_name = ? ORDER BY pseudo_area_from_swne(south_lat, west_lon, north_lat, east_lon) DESC, (CASE WHEN accuracy is NULL THEN 1 ELSE 0 END), accuracy: no such column: ss.same_source_target_crs)
I really am totally clueless about what to do here.
PS: This is my first stackoverflow post, so I apologize in advanced for the poor layout of this post.

How to calculate active hours of an employee using face_recognition for attendance tracking

I am working on face recognition system for my academic project. I want to set the first time an employee was recognized as his first active time and the next time he is being recognized should be recorded as his last active time and then calculate the total active hours based on first active and last active time.
I tried with the following code but I'm getting only the current system time as the start time. can someone help me on what I am doing wrong.
Code:
data = pickle.loads(open(args["encodings"], "rb").read())
vs = VideoStream(src=0).start()
writers = None
time.sleep(2.0)
while True:
frame = vs.read()
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
rgb = imutils.resize(frame, width=750)
r = frame.shape[1] / float(rgb.shape[1])
boxes = face_recognition.face_locations(rgb)
encodings = face_recognition.face_encodings(rgb, boxes)
names = []
face_names = []
for encoding in encodings:
matches = face_recognition.compare_faces(data["encodings"],
encoding)
name = "Unknown"
if True in matches:
matchedIdxs = [i for (i, b) in enumerate(matches) if b]
counts = {}
for i in matchedIdxs:
name = data["names"][i]
counts[name] = counts.get(name, 0) + 1
name = max(counts, key=counts.get)
names.append(name)
if names != []:
for i in names:
first_active_time = datetime.now().strftime('%H:%M')
last_active_time = datetime.now().strftime('%H:%M')
difference = datetime.strptime(first_active_time, '%H:%M') - datetime.strptime(last_active_time, '%H:%M')
difference = difference.total_seconds()
total_hours = time.strftime("%H:%M", time.gmtime(difference))
face_names.append([i, first_active_time, last_active_time, total_hours])

Value error in assigning to dataframe

I am assigning different data to one dataframe. And I had the following
ValueError: If using all scalar values, you must pass an index
I follow the question post by other Here
But it did not work out.
The following is my code. All you have to do is copy and paste the code to IDE.
import pandas as pd
import numpy as np
#Loading Team performance Data (ExpG (Home away)) For and against
epl_1718 = pd.read_csv("http://www.football-data.co.uk/mmz4281/1718/E0.csv")
epl_1718 = epl_1718[['HomeTeam','AwayTeam','FTHG','FTAG']]
epl_1718 = epl_1718.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})
Home_goal_avg = epl_1718['HomeGoals'].mean()
Away_goal_avg = epl_1718['AwayGoals'].mean()
Home_team_goals = epl_1718.groupby(['HomeTeam'])['HomeGoals'].sum()
Home_count = epl_1718.groupby(['HomeTeam'])['HomeTeam'].count()
Home_team_avg_goal = Home_team_goals/Home_count
Home_team_concede = epl_1718.groupby(['HomeTeam'])['AwayGoals'].sum()
EPL_Home_average_score = epl_1718['HomeGoals'].mean()
EPL_Home_average_conc = epl_1718['HomeGoals'].mean()
Home_team_avg_conc = Home_team_concede/Home_count
Away_team_goals = epl_1718.groupby(['AwayTeam'])['AwayGoals'].sum()
Away_count = epl_1718.groupby(['AwayTeam'])['AwayTeam'].count()
Away_team_avg_goal = Away_team_goals/Away_count
Away_team_concede = epl_1718.groupby(['AwayTeam'])['HomeGoals'].sum()
EPL_Away_average_score = epl_1718['AwayGoals'].mean()
EPL_Away_average_conc = epl_1718['HomeGoals'].mean()
Away_team_avg_conc = Away_team_concede/Away_count
Home_attk_sth = Home_team_avg_goal/EPL_Home_average_score
Home_attk_sth = Home_attk_sth.sort_index().reset_index()
Home_def_sth = Home_team_avg_conc/EPL_Home_average_conc
Home_def_sth = Home_def_sth .sort_index().reset_index()
Away_attk_sth = Away_team_avg_goal/EPL_Away_average_score
Away_attk_sth = Away_attk_sth .sort_index().reset_index()
Away_def_sth = Away_team_avg_conc/EPL_Away_average_conc
Away_def_sth = Away_def_sth.sort_index().reset_index()
Home_def_sth
HomeTeam = epl_1718['HomeTeam'].drop_duplicates().sort_index().reset_index().set_index('HomeTeam')
AwayTeam = epl_1718['AwayTeam'].drop_duplicates().sort_index().reset_index().sort_values(['AwayTeam']).set_index(['AwayTeam'])
#HomeTeam = HomeTeam.sort_index().reset_index()
Team = HomeTeam.append(AwayTeam).drop_duplicates()
Data = pd.DataFrame({"Team":Team,
"Home_attkacking":Home_attk_sth,
"Home_def": Home_def_sth,
"Away_Attacking":Away_attk_sth,
"Away_def":Away_def_sth,
"EPL_Home_avg_score":EPL_Home_average_score,
"EPL_Home_average_conc":EPL_Home_average_conc,
"EPL_Away_average_score":EPL_Away_average_score,
"EPL_Away_average_conc":EPL_Away_average_conc},
columns =['Team','Home_attacking','Home_def','Away_attacking','Away_def',
'EPL_Home_avg_score','EPL_Home_avg_conc','EPL_Away_avg_score','EPL_Away_average_conc'])
In this code, what I am trying to do is to get average goal score per team per game, average goals conceded per team per game.
And then I am calculating other performance factors such as attacking strength, defensive strenght etc.
I have to paste the code as if i use example, creating data frame would work.
Thanks for understanding.
Thanks in advance for the advice too.
The format (or the columns) of final data frame will look like as follow:
Team Home Attacking Home Defensive Away attacking away defensive
and so on as mentioned in the data frame.
It means, there will be only 20 teams under team columns
The shape of dataframe will be ( 20,9)
Regards,
Zep
Here main idea is remove reset_index for Series with index by teams, so variable Team is not necessary and is created as last step by reset_index. Also be carefull with columns names in DataFrame constructor, if there are changed like EPL_Home_average_conc in dictionary and then EPL_Home_avg_conc get NaNs columns:
Home_team_goals = epl_1718.groupby(['HomeTeam'])['HomeGoals'].sum()
Home_count = epl_1718.groupby(['HomeTeam'])['HomeTeam'].count()
Home_team_avg_goal = Home_team_goals/Home_count
Home_team_concede = epl_1718.groupby(['HomeTeam'])['AwayGoals'].sum()
EPL_Home_average_score = epl_1718['HomeGoals'].mean()
EPL_Home_average_conc = epl_1718['HomeGoals'].mean()
Home_team_avg_conc = Home_team_concede/Home_count
Away_team_goals = epl_1718.groupby(['AwayTeam'])['AwayGoals'].sum()
Away_count = epl_1718.groupby(['AwayTeam'])['AwayTeam'].count()
Away_team_avg_goal = Away_team_goals/Away_count
Away_team_concede = epl_1718.groupby(['AwayTeam'])['HomeGoals'].sum()
EPL_Away_average_score = epl_1718['AwayGoals'].mean()
EPL_Away_average_conc = epl_1718['HomeGoals'].mean()
Away_team_avg_conc = Away_team_concede/Away_count
#removed reset_index
Home_attk_sth = Home_team_avg_goal/EPL_Home_average_score
Home_attk_sth = Home_attk_sth.sort_index()
Home_def_sth = Home_team_avg_conc/EPL_Home_average_conc
Home_def_sth = Home_def_sth .sort_index()
Away_attk_sth = Away_team_avg_goal/EPL_Away_average_score
Away_attk_sth = Away_attk_sth .sort_index()
Away_def_sth = Away_team_avg_conc/EPL_Away_average_conc
Away_def_sth = Away_def_sth.sort_index()
Data = pd.DataFrame({"Home_attacking":Home_attk_sth,
"Home_def": Home_def_sth,
"Away_attacking":Away_attk_sth,
"Away_def":Away_def_sth,
"EPL_Home_average_score":EPL_Home_average_score,
"EPL_Home_average_conc":EPL_Home_average_conc,
"EPL_Away_average_score":EPL_Away_average_score,
"EPL_Away_average_conc":EPL_Away_average_conc},
columns =['Home_attacking','Home_def','Away_attacking','Away_def',
'EPL_Home_average_score','EPL_Home_average_conc',
'EPL_Away_average_score','EPL_Away_average_conc'])
#column from index
Data = Data.rename_axis('Team').reset_index()
print (Data)

python3 and libre office calc - setting column width

I am using pyoo to generate reports as open document spreadsheets. pyoo can do everything I require bar setting column widths. Some I want to set as a constant, others as optimal width. From the pyoo website (https://github.com/seznam/pyoo): "If some important feature missing then the UNO API is always available."
A few hours of Googling got me to the class com.sun.star.table.TableColumn which from this page appears to have the properties ("Width" and "OptimalWidth") that I require, but -
>>> x = uno.getClass('com.sun.star.table.TableColumn')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3/dist-packages/uno.py", line 114, in getClass
return pyuno.getClass(typeName)
uno.RuntimeException: pyuno.getClass: uno exception com.sun.star.table.TableColumn is unknown
I have no idea whatsoever how to get this to work. The documentation for UNO is overwelming to say the least...
Any clues would be enormously appreciated.
Example Python-UNO code:
def resize_spreadsheet_columns():
oSheet = XSCRIPTCONTEXT.getDocument().getSheets().getByIndex(0)
oColumns = oSheet.getColumns()
oColumn = oColumns.getByName("B")
oColumn.IsVisible = False
oColumn = oColumns.getByName("C")
oColumn.Width = 7000
oColumn = oColumns.getByName("D")
oColumn.OptimalWidth = True
Documentation:
https://wiki.openoffice.org/wiki/Documentation/BASIC_Guide/Rows_and_Columns
https://wiki.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Columns_and_Rows
EDIT:
From the comment, it sounds like you need to work through an introductory tutorial on Python-UNO. Try http://christopher5106.github.io/office/2015/12/06/openoffice-libreoffice-automate-your-office-tasks-with-python-macros.html.
Thank you Jim K, you pointed me in the right direction, I got it working. My python script generates a ten sheet spreadsheet, and manually adjusting the column widths was becoming a pain. I post below my final code if anyone wants to comment or needs a solution to this. It looks like a pigs breakfast to me combining raw UNO calls and pyoo, but it works I guess.
#! /usr/bin/python3.6
import os, pyoo, time, uno
s = '-'
while s != 'Y':
s = input("Have you remembered to start Calc? ").upper()
os.system("soffice --accept=\"socket,host=localhost,port=2002;urp;\" --norestore --nologo --nodefault")
time.sleep(2)
desktop = pyoo.Desktop('localhost', 2002)
doc = desktop.create_spreadsheet()
class ofic:
sheet_idx = 0
row_num = 0
sheet = None
o = ofic()
uno_localContext = uno.getComponentContext()
uno_resolver = uno_localContext.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", uno_localContext )
uno_ctx = uno_resolver.resolve( "uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext" )
uno_smgr = uno_ctx.ServiceManager
uno_desktop = uno_smgr.createInstanceWithContext( "com.sun.star.frame.Desktop", uno_ctx)
uno_model = uno_desktop.getCurrentComponent()
uno_controller = uno_model.getCurrentController()
uno_sheet_count = 0
for i in range(5):
doc.sheets.create("Page {}".format(i+1), index=o.sheet_idx)
o.sheet = doc.sheets[o.sheet_idx]
o.sheet[0, 0].value = "The quick brown fox jumps over the lazy dog"
o.sheet[1, 1].value = o.sheet_idx
uno_controller.setActiveSheet(uno_model.Sheets.getByIndex(uno_sheet_count))
uno_sheet_count += 1
uno_active_sheet = uno_model.CurrentController.ActiveSheet
uno_columns = uno_active_sheet.getColumns()
uno_column = uno_columns.getByName("A")
uno_column.OptimalWidth = True
uno_column = uno_columns.getByName("B")
uno_column.Width = 1350
o.sheet_idx += 1
doc.save("whatever.ods")
doc.close()

How to manipulate nodes with using networkx (Python3, twitter, networkx)

I'm a beginner who try to make followers network of twitter with Python3. I made this code and got result however always one node is separated as you can see like left bottom node.
I have no idea what is happening. In addition, I also want to see smaller world for instance, how can I pick up only 5 friends and elucidate their connection ?? This my result is too huge... It would be greatly appreciated if you could explain the details. Here is my code :
api = twitter.Api(consumer_key = my_consumer_key,
consumer_secret = my_consumer_secret,
access_token_key = my_access_token_key,
access_token_secret = my_access_token_secret,
input_encoding = "UTF-8",
sleep_on_rate_limit=True)
friends = api.GetFriends()
G = networkx.Graph()
for friend in friends:
G.add_edge(myname,friend.screen_name)
for friend in friends[-3:]:
for user in api.GetFriends(friend.id):
if user in friends:
G.add_edge(friend.screen_name,user.screen_name)
pos = spring_layout(G)
draw_networkx_nodes(G, pos, node_size = 100, node_color = 'w')
draw_networkx_edges(G, pos, width = 1)
draw_networkx_labels(G, pos, font_size = 12, font_family = 'sans-
serif', font_color = 'r')
xticks([])
yticks([])
savefig("egonetwork.png")
show()

Resources