I have a column, 'EDU', in my dataframe, df, from which I created a dictionary, poe_dict, using value_counts(). It looks like this.
edu_m=df['EDU'].sort_values()
poe_dict = edu_m.value_counts(normalize=True).to_dict()
poe_dict
{4: 0.47974705779026877,
3: 0.24588090637625154,
2: 0.172352011241876,
1: 0.10202002459160373}
Now I'm trying to replace the keys 4, 3, 2, 1 with these strings, which I put in a list.
n_keys=["college","more than high school but not college","high school","less than high school"]
If I do each of them individually, this runs ok, giving me the expected result.
In:
poe_dict['college'] = poe_dict.pop(4)
poe_dict['more than high school but not college'] = poe_dict.pop(3)
poe_dict['high school'] = poe_dict.pop(2)
poe_dict['less than high school'] = poe_dict.pop(1)
Out:
{'college': 0.47974705779026877,
'more than high school but not college': 0.24588090637625154,
'high school': 0.172352011241876,
'less than high school': 0.10202002459160373}
However, if I try to do it in a loop, it produces this.
In:
for key, n_key in zip(poe_dict.keys(), n_keys):
    poe_dict[n_key] = poe_dict.pop(key)
poe_dict
Out:
{2: 0.172352011241876,
1: 0.10202002459160373,
'high school': 0.47974705779026877,
'less than high school': 0.24588090637625154}
So I don't understand why the loop does not work for keys 2 and 1.
I have tried to debug it as well to see what happens in the loop like this.
In:
for key, n_key in zip(poe_dict.keys(), n_keys):
    print (key,n_key)
    poe_dict[n_key] = poe_dict.pop(key)
Out:
4 college
3 more than high school but not college
college high school
more than high school but not college less than high school
You loop over the keys of poe_dict in the for loop. However, the keys of poe_dict are modified every time the statement poe_dict[n_key] = poe_dict.pop(key) runs, so the iteration no longer sees the original set of keys: as your debug output shows, it skips 2 and 1 and instead visits the keys that were just added ('college' and 'more than high school but not college'). The correct way is to store the keys of poe_dict in a list, list(poe_dict.keys()), and loop over this new list of keys.
poe_dict = {4: 0.47, 3:0.25, 2:0.17, 1:0.10}
n_keys = ['college', 'more than high school but not college','high school', 'less than high school' ]
keylist = list(poe_dict.keys())
for key, n_key in zip(keylist, n_keys):
    print (key,n_key)
    poe_dict[n_key] = poe_dict.pop(key)
print (poe_dict)
The results will be
{'college': 0.47, 'more than high school but not college': 0.25, 'high school': 0.17, 'less than high school': 0.1}
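An alternative that avoids changing the dictionary while looping is to build a new dictionary from an explicit mapping of old keys to labels. This is only a sketch: the name `labels` is introduced here for illustration, and it assumes every key of poe_dict has an entry in that mapping.

```python
poe_dict = {4: 0.47, 3: 0.25, 2: 0.17, 1: 0.10}

# Hypothetical mapping from the numeric codes to readable labels.
labels = {
    4: "college",
    3: "more than high school but not college",
    2: "high school",
    1: "less than high school",
}

# Build a fresh dict instead of popping from the one being iterated.
renamed = {labels[key]: value for key, value in poe_dict.items()}
print(renamed)
```

Because the lookup is by key rather than by position, this does not depend on the order in which value_counts() happened to return the categories.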
I'm looking for the process to happen on the fly since it can be generalized and also efficient.
This is the code I tried. It has the basic logic. However this does not work for all inputs.
result=[{4:5},{4:6},{4:7}]
new=[]
new=result.copy()
print("Original",result)
for i in range(0,len(result)-1):
    for key, value in result[i].items():
        for j in range(1,len(result)):
            for key_1, value_1 in result[j].items() :
                if i!=j:
                    if key==key_1:
                        print("when key=key\n",result[i],"=",result[j])
                        dict={value:value_1}
                        new[j]=dict
                        print(new)
                        break;
                    if key==value_1:
                        print("when key=value\n",result[i],"=",result[j])
                        dict={value:key_1}
                        new[j]=dict
                        print(new)
                    if value==value_1:
                        print("when value=value\n",result[i],"=",result[j])
                        dict={key:key_1}
                        new[j]=dict
                        print(new)
                else:
                    break;
print("\nFinal =",new)
print("\nCorrect =[{4:5},{5:6},{6:7}]" )
input: [{4:5},{4:6},{4:7}]
output: [{4:5},{5:6},{6:7}]
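One possible reading of the example input and output is that each dictionary after the first should take the previous dictionary's value as its key. The sketch below implements only that reading; the function name `chain_keys` is made up for illustration, and it assumes every dictionary holds exactly one key-value pair.

```python
def chain_keys(records):
    """Return a new list where each dict's key is the previous dict's value."""
    chained = []
    prev_value = None
    for record in records:
        # Assumes exactly one item per dict; unpacking fails otherwise.
        (key, value), = record.items()
        if prev_value is not None:
            key = prev_value
        chained.append({key: value})
        prev_value = value
    return chained


print(chain_keys([{4: 5}, {4: 6}, {4: 7}]))
```

This makes a single pass over the list and leaves the input untouched.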
I have tried this for converting a list of lists to JSON, but could not get it into the proper JSON format.
My data is
data= [['India',
'India runs mentorship driven incubation.',
'/7e9e130075d3bfcd9e0.png'],
['Capital',
'develops high growth and market-defining India-centric Software and Services Big Data and Analytics and Consumer Mobile or Internet markets.',
'/data/images/512bc5a2937.png']]
titles = ['Country','description','logo']
values = [e for g in grouper(3, data) for e in g]
keys = (titles[i%3] for i in xrange(len(values)))
objs = [dict(g) for g in grouper(3, list(izip(keys, values)))]
print(objs)
result:
[{'Country': ['India', 'India runs mentorship driven incubation.', '/7e9e130075d3bfcd9e0.png'], 'description': ['Capital', 'develops high growth and market-defining India-centric Software and Services Big Data and Analytics and Consumer Mobile or Internet markets.', '/data/images/512bc5a2937.png']}]
But expected result should be in this form.
[{'Country': 'India', 'description': 'India runs mentorship driven incubation.', 'logo': '/7e9e130075d3bfcd9e0.png'}]
What could be the reason?
You can use a one-line list comprehension. Iterate through data and, for each piece of data (entry), zip it with titles to create an iterable of tuples that can be converted into a dictionary:
data= [['India',
'India runs mentorship driven incubation.',
'/7e9e130075d3bfcd9e0.png'],
['Capital',
'develops high growth and market-defining India-centric Software and Services Big Data and Analytics and Consumer Mobile or Internet markets.',
'/data/images/512bc5a2937.png']]
titles = ['Country','description','logo']
result = [dict(zip(titles, entry)) for entry in data]
print(result)
Output:
[{'Country': 'India',
'description': 'India runs mentorship driven incubation.',
'logo': '/7e9e130075d3bfcd9e0.png'},
{'Country': 'Capital',
'description': 'develops high growth and market-defining India-centric Software and Services Big Data and Analytics and Consumer Mobile or Internet markets.',
'logo': '/data/images/512bc5a2937.png'}]
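Since the question asks for JSON, the list of dictionaries can then be serialized with the standard library's json module. A minimal sketch, using only the first record from the question to keep it short:

```python
import json

titles = ["Country", "description", "logo"]
data = [
    ["India",
     "India runs mentorship driven incubation.",
     "/7e9e130075d3bfcd9e0.png"],
]

# Pair each row with the titles, then serialize the list of dicts to a JSON string.
result = [dict(zip(titles, entry)) for entry in data]
json_text = json.dumps(result, indent=2)
print(json_text)
```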
I am learning how to program in Python 3 and I am working on a project that lets you buy a ticket to a movie. After that you can see your shopping cart with all the tickets that you have bought.
Now I want to number each printed line.
For example: 1. Movie1 , 2. Movie2 , etc..
Here is my code that I use to print the films:
if choice == 3:
    #try:
    # "If you want to see which films are available, type exit."
    print("Daca doresti sa vezi ce filme sunt valabile, scrie exit.")
    # "Which film do you want to watch?"
    bilet = str(input("Ce film doresti sa vizionezi?: ").title())
    pret = films[bilet]["price"]
    # "Do you want to add {}$ to the shopping cart (y/n)?"
    cumperi = input("Doresti sa adaugi in cosul de cumparaturi {}$ (y/n)?".format(bilet)).strip().lower()
    if cumperi == "y":
        bani[0] -= pret
        cos.append(bilet)
if choice == 4:
    print (*cos, sep="\n")
You can use an integer counter variable and increase its value whenever you perform a task.
For example, set count = 0 before printing and, each time you print a ticket, do count += 1.
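As a sketch of that counter idea, the built-in enumerate() can keep the count for you. The contents of `cos` below are placeholder titles taken from the question's example, not real data.

```python
# Placeholder shopping cart; in the question this list is filled by cos.append(bilet).
cos = ["Movie1", "Movie2"]

# enumerate(..., start=1) yields (1, "Movie1"), (2, "Movie2"), ...
lines = [f"{number}. {title}" for number, title in enumerate(cos, start=1)]
print("\n".join(lines))
```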
Only just started using python this week, so I'm a total beginner. Imagine I have a massive dataset with data like so:
close high low open time symbol
0.04951 0.04951 0.04951 0.04951 7/16/2010 BTC
0.08584 0.08585 0.05941 0.04951 7/17/2010 BTC
0.0808 0.09307 0.07723 0.08584 7/18/2010 ETH
How, using matplotlib, can I plot close with time, only if symbol = BTC? I was thinking something like
bitgroup = df.groupby('symbol')
if bitgroup == 'BTC':
    df(['close','time']).plot()
    plt.show()
Building on this, I'd then like to use these new groups to create new columns, such as returns, (calculated using (p1-p0)/p0) doing something like this:
def createnewcolumn()
    for i in bitgroup
        df[returns] = (bitgroup['close'].ix[i] - bitgroup['close'].ix[i-1]) / bitgroup['close'].ix[i-1]
createnewcolumn()
Any help would be greatly appreciated in turning this pseudocode into real code!
df.symbol == 'BTC'
returns a boolean Series with one entry per row (True where the symbol is 'BTC', False otherwise), and you can then use that as a mask on the original data:
df[df.symbol == 'BTC']
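Putting the mask together with the returns calculation from the question, one possible sketch looks like this. The small frame reuses three rows from the question (with the high and low columns left out for brevity), and `returns` is computed per symbol with pct_change(), which evaluates (p1-p0)/p0 within each group.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "close": [0.04951, 0.08584, 0.0808],
    "open": [0.04951, 0.04951, 0.08584],
    "time": pd.to_datetime(["7/16/2010", "7/17/2010", "7/18/2010"]),
    "symbol": ["BTC", "BTC", "ETH"],
})

# Keep only the BTC rows using the boolean mask.
btc = df[df["symbol"] == "BTC"]
# With matplotlib installed, btc.plot(x="time", y="close") would draw close against time.

# Per-symbol returns; the first row of each symbol has no previous price, so it is NaN.
df["returns"] = df.groupby("symbol")["close"].pct_change()
print(btc)
print(df)
```

Grouping before calling pct_change() keeps the first ETH price from being compared with the last BTC price.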
I'm trying to feed a record to knn.predict() to make a prediction by using the following code:
person_features = {
'cma': 462, # Metropolitan area
'agegrp': 9, # Age Group
'sex': 1,
'ageimm': 6, # Age group at immigration
'immstat': 1, # Immigrant status
'pob': 21, # Other Eastern Asia
'nol': 4, # First languages
'cip2011': 7, # Major field of study: Mathematics, computer and information sciences
'hdgree': 12, # Highest Education
}
prediction = knn.predict(person_features)
labels={True: '>50K', False: '<=50K'}
print(labels[prediction])
But it showed
TypeError: float() argument must be a string or a number, not 'dict'
I tried making it into list of tuples like:
person_features= [('cma',462), ('agegrp',9), ('sex',1), ('ageimm',6), ('immstat',1), ('pob',21), ('nol',4), ('cip2011',7), ('hdgree',12)]
But that didn't work either.
What should I do to solve this type error? I feel like the solution is easy, but somehow I just couldn't wrap my head around it.
I'm new to programming and started learning Python less than three months ago, so bear with me for my amateur questions and answers!
# I looked up the numbers from the coding book
cma = 462
agegrp = 9
sex = 1
ageimm = 6
immstat = 1
pob = 21
nol = 4
cip2011 =7
hdgree = 12
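MoreThan50K = 1 # what I am going to predict, 1 for >50K, 0 for <50K
# The values must match the columns the model was trained on, in the same order.
person_features = [cma, agegrp, sex, ageimm, immstat, pob, nol, cip2011, hdgree, MoreThan50K]
# predict() expects a 2D input (one row per sample), so wrap the single record in a list.
prediction = knn.predict([person_features])
So it was pretty straightforward after all.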
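If you would rather keep the labelled dictionary from the question, it can be turned into the numeric row that predict() needs. This is a sketch under one assumption: `feature_order` is a hypothetical list that must match the columns, and their order, used when the model was fitted. The fitted `knn` itself is not defined here, so the call is left as a comment.

```python
person_features = {
    "cma": 462, "agegrp": 9, "sex": 1, "ageimm": 6, "immstat": 1,
    "pob": 21, "nol": 4, "cip2011": 7, "hdgree": 12,
}

# Assumed column order of the training data (hypothetical; adjust to your model).
feature_order = ["cma", "agegrp", "sex", "ageimm", "immstat",
                 "pob", "nol", "cip2011", "hdgree"]

# One inner list per sample: a single record becomes a 1 x 9 table.
row = [person_features[name] for name in feature_order]
X_new = [row]
print(X_new)

# prediction = knn.predict(X_new)  # knn is the fitted model from the question
```

Looking values up by name makes the column order explicit instead of relying on how the dictionary happened to be written.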