How to slice dataframe smartly? - python-3.x

I am working with Python, specifically the pandas module. I am slicing like this: dfps7 = dfps5.iloc[:,[1,4,5,7,8,9,10]], and it works. But I would like a smarter way to write the continuous part, for example 4,5,7,8,9,10 as 4:10. When I tried dfps7 = dfps5.iloc[:,[1,4:10]], it did not work. I am looking for the neatest solution.
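One possible approach is NumPy's np.r_ helper, which concatenates single positions and slice expressions into one integer array that iloc accepts. The following is a minimal sketch, assuming a made-up stand-in frame with 11 columns (only the names dfps5 and dfps7 come from the question). Slice ends are exclusive, so 4:11 covers positions 4 through 10, and the list in the question skips column 6, so both variants are shown.

```python
import numpy as np
import pandas as pd

# Stand-in for dfps5: 3 rows, 11 columns named c0..c10 (made-up data)
dfps5 = pd.DataFrame(np.arange(33).reshape(3, 11),
                     columns=[f"c{i}" for i in range(11)])

# Exactly the original selection [1, 4, 5, 7, 8, 9, 10] (column 6 is skipped)
cols = np.r_[1, 4:6, 7:11]
dfps7 = dfps5.iloc[:, cols]

# If column 6 should be included as well, a single range is enough
dfps7_with_6 = dfps5.iloc[:, np.r_[1, 4:11]]
```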

Related

Dynamic variable naming in Python

So far I have seen that the majority of responses to this kind of question advise using a list.
All well and good in many circumstances, but not when I need to assign a specific variable name.
Please read and understand that I'm not asking the same question as has been asked before, before downvoting this. If the answer is truly the same (use a list/dict), please explain how it is implemented in my specific case.
In my situation I have a PyQt5 GUI and I want to change the text of the labels in certain circumstances:
for layer in layerlist:
    # reads all layers into a dictionary
    layers[layer] = gpd.read_file(gdb_path, driver="FileGDB", layer=layer)
    gdf = layers[layer]
    if gdf.empty:  # loop to find empty datasets
        self.lab_1.setText("Empty dataset")
Obviously using self.lab_1 as a static variable will be of little use.
More useful would be to have the label name be the same as the layer name so it can address the correct label in the GUI.
if gdf.empty:
    self. + layer + .setText("Empty dataset")
Using + works fine for adding variables to strings, but won't work in this case. I've done this before in languages like PHP. Is it possible in Python, and if so, how?
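Looking up an attribute by its name is usually done with the built-in getattr, or by keeping the widgets in a dict keyed by layer name. The sketch below assumes the label attributes are named exactly after the layers. FakeLabel, Window, mark_empty and the layer names are hypothetical stand-ins so the example runs without PyQt5.

```python
class FakeLabel:
    """Hypothetical stand-in for a QLabel; only setText is modelled."""

    def __init__(self):
        self.text = ""

    def setText(self, text):
        self.text = text


class Window:
    def __init__(self, layer_names):
        # One label attribute per layer, named after the layer
        for name in layer_names:
            setattr(self, name, FakeLabel())
        # Alternative: keep the same labels in a dict keyed by layer name
        self.labels = {name: getattr(self, name) for name in layer_names}

    def mark_empty(self, layer):
        # Equivalent to self.<layer>.setText(...) with the name held in a string
        getattr(self, layer).setText("Empty dataset")


win = Window(["roads", "rivers"])
win.mark_empty("roads")
```

With the dict variant the same call would read self.labels[layer].setText("Empty dataset"), which avoids depending on attribute names.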

I am unable to manipulate the data returned with yfinance

This might be a silly question, but I am unfamiliar with this. I am playing around with the yfinance library to build a small script to track certain tickers. Let's say, for example, that I want to see the value of the S&P 500 for today and yesterday so I can compare them later. This is as far as I got using the tutorial:
import yfinance as yf
SP500 = yf.Ticker('^GSPC')
SP500 = SP500.history(period="2d")
print(SP500['Close'])
So what I am looking for is the price of that particular ticker at close. But when I run this code, what I get is:
Date
2020-03-20 2304.92
2020-03-23 2237.40
Name: Close, dtype: float64
I am unfamiliar with data offered this way. I am used to receiving tuples, lists, something I can work on. I have tried extracting only the numbers from the lines the Pythonic way, but without success. In this case I am only interested in getting the numbers 2304.92 and 2237.40 to work on them further.
Does anyone have any idea how I can extract these numbers from this output?
Thanks in advance.
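The printed output has the shape of a pandas Series (dates as the index, closing prices as the values), so the numbers can be pulled out with the usual Series accessors. A minimal sketch, assuming SP500['Close'] is a Series; here it is rebuilt by hand from the values shown above instead of calling yfinance, and the variable names are illustrative.

```python
import pandas as pd

# Stand-in for SP500['Close'], using the values printed in the question
close = pd.Series([2304.92, 2237.40],
                  index=pd.to_datetime(["2020-03-20", "2020-03-23"]),
                  name="Close")

prices = close.tolist()   # plain Python list of floats
previous = close.iloc[0]  # first row by position
latest = close.iloc[-1]   # last row by position
change = latest - previous
```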

How to write a multi-threaded program for plotting interactively?

I am trying to read data from a website every second using the following code, then plot the result in real time. That is, I would like to add the new last_price to the existing plot every second and update it.
import time
import requests
for i in range(2600):
    time.sleep(1)
    with requests.Session() as s:
        data = {'ContractCode' : 'SAFTR98'}
        r = s.post('http://cdn.ime.co.ir/Services/Fut_Live_Loc_Service.asmx/GetContractInfo', json = data ).json()
    for key, value in r.items():
        print(r[key]['ContractCode'])
        last_prices = (r[key]['LastTradedPrice'])
I used animation.FuncAnimation, but it didn't work: it plots all the results only after the 2600 iterations are done or I stop the program. So I thought maybe I could do the work with multi-threading, but I don't know exactly how to use it. I searched for examples but couldn't understand them or how to map them to my problem.
This is not a duplicate question. In the mentioned link I tried to solve the problem using animations, but in this question I am trying to find a new way using multi-threading.
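A common pattern for this kind of problem is a worker thread that polls the data source and pushes each value into a queue, while the main thread drains the queue and redraws. The sketch below shows only that hand-off, using the standard library. fetch_last_price is a hypothetical stand-in for the requests call in the question, and the plotting step is left as a comment because matplotlib drawing is generally expected to stay on the main thread.

```python
import queue
import threading
import time


def fetch_last_price(i):
    """Hypothetical stand-in for the s.post(...).json() call in the question."""
    return 100.0 + i


def poll_prices(out_queue, n_polls, interval):
    """Worker thread: fetch a price every `interval` seconds and queue it."""
    for i in range(n_polls):
        out_queue.put(fetch_last_price(i))
        time.sleep(interval)
    out_queue.put(None)  # sentinel: no more data


def collect_prices(n_polls=5, interval=0.01):
    """Main thread: drain the queue; a real script would redraw the plot here."""
    prices = []
    q = queue.Queue()
    worker = threading.Thread(target=poll_prices,
                              args=(q, n_polls, interval), daemon=True)
    worker.start()
    while True:
        price = q.get()  # blocks until the worker produces a value
        if price is None:
            break
        prices.append(price)
        # Possible plot update, e.g. with matplotlib:
        # line.set_data(range(len(prices)), prices); plt.pause(0.01)
    worker.join()
    return prices
```

Calling collect_prices(2600, 1) would mirror the loop in the question, with each new price available to the main thread as soon as it arrives.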

similarity measurement among names?

I have a list of names, and I am trying to find the 5 names from the list that are most similar to any given query name.
I thought of applying word2vec or using Text.similar() from nltk,
but I am not sure whether these will work for names as well.
Any similarity measure would work for me.
Any suggestions?
This is not for any project; I just want to learn new things.
Since you added NLTK, I assume you are fine working in Python.
Check out the Jellyfish library which contains 10 different algorithms for comparing strings. Some of them will compare just the characters while others will try to guess how a string would be pronounced and help you identify other phrases that are very differently spelt but would sound similar.
The actual algorithms are all written in C and so this library is pretty efficient!
I think you will find the Jaro-Winkler distance to be most useful. Also check out this paper.
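To illustrate the "top 5 most similar names" step without assuming a particular Jellyfish function signature, here is a minimal sketch using the standard library's difflib instead. Note that this swaps in difflib's SequenceMatcher ratio, which is not the Jaro-Winkler distance; the names and the helper most_similar are made up for the example.

```python
import difflib

# Made-up list of names for illustration
names = ["Jonathon", "Johnathan", "Nathan", "Joan", "Maria", "Peter", "Jon"]


def most_similar(query, candidates, k=5):
    """Return up to k candidates, best match first, by SequenceMatcher ratio."""
    # cutoff=0.0 keeps every candidate eligible so that k results come back
    return difflib.get_close_matches(query, candidates, n=k, cutoff=0.0)


top5 = most_similar("Jonathan", names)
```

The same ranking structure would apply with a Jaro-Winkler score: compute the score of the query against every name, sort, and keep the best five.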

n columns of data frame discarded

I am using the spatstat package in R to read my road network shapefile, which also has some additional attributes.
When I read my shapefiles and convert them with as.psp (before making them a linnet object), I get the message "n columns of data frame discarded". I do not understand why. The columns being discarded are my covariates for a linear network, so I am not able to bring them into my analysis.
Could someone give me an idea of why this happens and how to correct it?
Why it happens:
I would guess that we (spatstat authors) need to spend a bit of time discussing with the maptools guys how to handle the additional info in the SpatialLinesDataFrame object, and we haven't done that yet.
How to correct it:
You have to write some code on your own at the moment. You can extract the data from the SpatialLinesDataFrame object by accessing the @data slot. If you need more help, please provide specific data and explain how you need to use the additional data (what format you need it in). You can find a few helpful commands here: https://cran.r-project.org/web/packages/spatstat/vignettes/shapefiles.pdf
