I'm trying to concatenate two DataFrames, keeping only the rows whose values in two given columns appear in both DataFrames. For example:
tp1 = pd.DataFrame(
    {
        'A': [1, 2, 3, 4],
        'B': [5, 4, 2, 7],
        'C': [2, 4, 9, 1],
        'D': [1, 9, 7, 0]
    })
tp2 = pd.DataFrame(
    {
        'A': [8, 2, 3, 9],
        'B': [6, 4, 2, 4],
        'C': [2, 9, 9, 1],
        'D': [1, 9, 7, 0]
    })
tpOUT = pd.DataFrame(
    {
        'A': [2, 2, 3],
        'B': [4, 4, 2],
        'C': [4, 9, 9],
        'D': [9, 9, 7]
    })
If tp1 and tp2 are the two input DataFrames, then tpOUT is the corresponding output. I searched and found that this can be done with pd.merge and pd.concat, but I can't get it to work.
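One possible approach, as a rough sketch: it assumes the two matching columns are A and B (the question does not name them, but that choice reproduces tpOUT) and that fully identical rows should appear only once. The names keys, common and tp_out are made up for illustration. An inner merge on the key columns finds the (A, B) pairs present in both frames, and a second merge filters the concatenated rows down to those pairs.

```python
import pandas as pd

tp1 = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 4, 2, 7],
                    'C': [2, 4, 9, 1], 'D': [1, 9, 7, 0]})
tp2 = pd.DataFrame({'A': [8, 2, 3, 9], 'B': [6, 4, 2, 4],
                    'C': [2, 9, 9, 1], 'D': [1, 9, 7, 0]})

# Assumption: these are the two columns that must match in both frames.
keys = ['A', 'B']

# (A, B) pairs that occur in both frames.
common = tp1[keys].merge(tp2[keys], on=keys).drop_duplicates()

# Stack both frames, keep only rows whose (A, B) pair is common,
# drop exact duplicate rows, and sort for a stable row order.
stacked = pd.concat([tp1, tp2], ignore_index=True)
tp_out = (
    stacked.merge(common, on=keys)
    .drop_duplicates()
    .sort_values(['A', 'B', 'C', 'D'])
    .reset_index(drop=True)
)
print(tp_out)
```

If duplicate rows should be kept instead, dropping the drop_duplicates() call on the stacked frame would be the only change.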
For example, given two dictionaries of lists:
d1 = {'a':[1, 2, 3], 'b': [1, 2, 3]}
d2 = {'a':[4, 5, 6], 'b': [3, 4, 5]}
The output should be like this:
{'a':[1, 2, 3, 4, 5, 6], 'b': [1, 2, 3, 4, 5]}
If the value repeats itself, it should be recorded only once.
Assume both dictionaries have the same keys.
One way to achieve this:
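d1 = {'a': [1, 2, 3], 'b': [1, 2, 3]}
d2 = {'a': [4, 5, 6], 'b': [3, 4, 5]}
# make a list of both dictionaries
ds = [d1, d2]
# merged will be the resultant dictionary
merged = {}
for k in d1.keys():
    # gather the lists stored under key k in every dictionary
    lists = [d[k] for d in ds]
    # flatten, drop duplicates with set(), and sort because sets are unordered
    merged[k] = sorted(set(item for sublist in lists for item in sublist))
print(merged)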
Output
{'a': [1, 2, 3, 4, 5, 6], 'b': [1, 2, 3, 4, 5]}
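If the values should stay in the order they first appear rather than in sorted order, a variant along these lines might help. It is only a sketch: merge_unique is a hypothetical helper name, and it relies on dict.fromkeys keeping insertion order (guaranteed since Python 3.7). Like the question, it assumes every dictionary has the same keys.

```python
from itertools import chain


def merge_unique(*dicts):
    """Merge dicts of lists key by key, keeping each value once in first-seen order."""
    merged = {}
    for key in dicts[0]:
        # Chain the lists stored under this key in every dictionary.
        combined = chain.from_iterable(d[key] for d in dicts)
        # dict.fromkeys drops repeats while preserving insertion order.
        merged[key] = list(dict.fromkeys(combined))
    return merged


d1 = {'a': [3, 1, 2], 'b': [1, 2, 3]}
d2 = {'a': [4, 2, 6], 'b': [3, 4, 5]}
result = merge_unique(d1, d2)
print(result)  # {'a': [3, 1, 2, 4, 6], 'b': [1, 2, 3, 4, 5]}
```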
list_of_lists = [[1, 2, 3, 4], [1, 5, 6, 7], [1, 8, 9, 10]]
I would like to get to:
transposed_list = [[1, 2, 5, 8], [1, 3, 6, 9], [1, 4, 7, 10]]
In other words, only transpose from the 2nd element in the list, keeping the first element in place.
Try:
list_of_lists = [[1, 2, 3, 4], [1, 5, 6, 7], [1, 8, 9, 10]]
out = [
    [list_of_lists[i][0]] + list(col)
    for i, col in enumerate(zip(*(row[1:] for row in list_of_lists)))
]
print(out)
Prints:
[[1, 2, 5, 8], [1, 3, 6, 9], [1, 4, 7, 10]]
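The same idea can be spelled out step by step. This is a sketch under one extra assumption that the question only implies: every row starts with the same element, which is then reused as the first element of every output row. The helper name transpose_keep_first is made up for illustration. Under that assumption the input does not need to be square.

```python
def transpose_keep_first(rows):
    """Transpose everything after column 0 and prefix each new row with the shared head."""
    heads = {row[0] for row in rows}
    if len(heads) != 1:
        # Assumption of this sketch, not stated outright in the question.
        raise ValueError("all rows must start with the same element")
    head = rows[0][0]
    tails = [row[1:] for row in rows]   # [[2, 3, 4], [5, 6, 7], [8, 9, 10]]
    columns = zip(*tails)               # (2, 5, 8), (3, 6, 9), (4, 7, 10)
    return [[head, *col] for col in columns]


list_of_lists = [[1, 2, 3, 4], [1, 5, 6, 7], [1, 8, 9, 10]]
transposed = transpose_keep_first(list_of_lists)
print(transposed)  # [[1, 2, 5, 8], [1, 3, 6, 9], [1, 4, 7, 10]]
```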
I have a dictionary that looks something like this:
{'A': {'score': 0, 'throw1': [3, 2, 5, 6], 'throw2': [1, 5, 5, 1]},
 'B': {'score': 0, 'throw1': [2, 2, 3, 6], 'throw2': [6, 4, 2, 2]}}
A and B are players in this game, and throw1 and throw2 are their dice rolls. Each player has 4 attempts.
My question is: how do I extract both throw1 and throw2 from the dictionary and sum the corresponding attempts for each player? For instance, player A threw 3 and 1 on the first attempt of the two throws, so I want the first answer to be 4.
You can use zip, which returns an iterator of tuples:
data = {'A': {'score': 0, 'throw1': [3, 2, 5, 6], 'throw2': [1, 5, 5, 1]},
        'B': {'score': 0, 'throw1': [2, 2, 3, 6], 'throw2': [6, 4, 2, 2]}}
player_A_1_results = data['A']['throw1']
player_A_2_results = data['A']['throw2']
# pair up the attempts of both throws and add them
for f, s in zip(player_A_1_results, player_A_2_results):
    print(f + s)
You could iterate over each player's throw summary like this:
data = {'A': {'score': 0, 'throw1': [3, 2, 5, 6], 'throw2': [1, 5, 5, 1]},
        'B': {'score': 0, 'throw1': [2, 2, 3, 6], 'throw2': [6, 4, 2, 2]}}
for player, info in data.items():
    print("Throw summary of player " + player)
    for t1, t2 in zip(info["throw1"], info["throw2"]):
        print(t1 + t2)
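If the sums are needed later rather than just printed, they could be collected per player instead. A small sketch, where attempt_totals is a hypothetical helper name and the dictionary layout is the one from the question:

```python
data = {'A': {'score': 0, 'throw1': [3, 2, 5, 6], 'throw2': [1, 5, 5, 1]},
        'B': {'score': 0, 'throw1': [2, 2, 3, 6], 'throw2': [6, 4, 2, 2]}}


def attempt_totals(players):
    """Return {player: [throw1[i] + throw2[i] for each attempt i]}."""
    return {
        name: [a + b for a, b in zip(info['throw1'], info['throw2'])]
        for name, info in players.items()
    }


totals = attempt_totals(data)
print(totals)  # {'A': [4, 7, 10, 7], 'B': [8, 6, 5, 8]}
```

Note that zip stops at the shorter list, so an attempt missing from either throw would be silently skipped.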
I have a DataFrame with 12 different features, and I would like to plot a histogram for each of them in one go on a grid of 3 rows by 4 columns.
test = pd.DataFrame({
    'a': [10, 5, -2],
    'b': [2, 3, 1],
    'c': [10, 5, -2],
    'd': [-10, -5, 2],
    'aa': [10, 5, -2],
    'bb': [2, 3, 1],
    'cc': [10, 5, -2],
    'dd': [-10, -5, 2],
    'aaa': [10, 5, -2],
    'bbb': [2, 3, 1],
    'ccc': [10, 5, -2],
    'ddd': [-10, -5, 2]
})
I can do it by writing something like the code below:
import matplotlib.pyplot as plt
import seaborn as sns

# plot
f, axes = plt.subplots(3, 4, figsize=(20, 10), sharex=True)
sns.distplot(test["a"], color="skyblue", ax=axes[0, 0])
sns.distplot(test["b"], color="olive", ax=axes[0, 1])
sns.distplot(test["c"], color="teal", ax=axes[0, 2])
sns.distplot(test["d"], color="grey", ax=axes[0, 3])
...
How can I loop and iterate through features in an elegant way instead? I'd like to assign the same four colors for each row.
You can put everything in a for loop:
colors = ["skyblue", "olive", "teal", "grey"]
f, axes = plt.subplots(3, 4, figsize=(20, 10), sharex=True)
# axes.flatten() walks the grid row by row, so i % 4 is the column index
for i, ax in enumerate(axes.flatten()):
    sns.distplot(test.iloc[:, i], color=colors[i % 4], ax=ax)
Seaborn provides a FacetGrid for such purposes.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
test = pd.DataFrame({
    'a': [10, 5, -2],
    'b': [2, 3, 1],
    'c': [10, 5, -2],
    'd': [-10, -5, 2],
    'aa': [10, 5, -2],
    'bb': [2, 3, 1],
    'cc': [10, 5, -2],
    'dd': [-10, -5, 2],
    'aaa': [10, 5, -2],
    'bbb': [2, 3, 1],
    'ccc': [10, 5, -2],
    'ddd': [-10, -5, 2]
})
data = pd.melt(test)
data["hue"] = data["variable"].apply(lambda x: x[:1])
g = sns.FacetGrid(data, col="variable", col_wrap=4, hue="hue")
g.map(sns.distplot, "value")
plt.show()
I am looking for a Python function or library, if one exists, that creates a new matrix whose first row, starting from the bottom right, is built from the old matrix's first column, starting from the top left. The matrix can have any number of rows and columns, but the new matrix must have the same dimensions as the original. I want something like this:
In keeping with the brief style of the question:
In [467]: alist = [5,6,4,3,4,5,3,2,5,3,1,2,2,3,2,1,3,1,1,1]
In [468]: arr = np.array(alist).reshape(4,5)
In [469]: arr
Out[469]:
array([[5, 6, 4, 3, 4],
       [5, 3, 2, 5, 3],
       [1, 2, 2, 3, 2],
       [1, 3, 1, 1, 1]])
In [470]: arr.reshape(5,4)
Out[470]:
array([[5, 6, 4, 3],
       [4, 5, 3, 2],
       [5, 3, 1, 2],
       [2, 3, 2, 1],
       [3, 1, 1, 1]])
In [471]: arr.reshape(5,4,order='F')
Out[471]:
array([[5, 3, 2, 1],
       [5, 2, 1, 4],
       [1, 3, 3, 3],
       [1, 4, 5, 2],
       [6, 2, 3, 1]])
In [473]: np.rot90(_)
Out[473]:
array([[1, 4, 3, 2, 1],
       [2, 1, 3, 5, 3],
       [3, 2, 3, 4, 2],
       [5, 5, 1, 1, 6]])
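The session above can also be run as a plain script. This sketch only repeats the same two calls; the names refilled and result are introduced here for readability. The result has the shape of the original array, and its last row, read left to right, begins with the first column of arr. Whether this is exactly the layout the question intends is not clear from its wording.

```python
import numpy as np

alist = [5, 6, 4, 3, 4, 5, 3, 2, 5, 3, 1, 2, 2, 3, 2, 1, 3, 1, 1, 1]
arr = np.array(alist).reshape(4, 5)

# order='F' reads arr column by column and fills the 5x4 array column by column.
refilled = arr.reshape(5, 4, order='F')

# np.rot90 rotates 90 degrees counterclockwise, giving back a 4x5 array.
result = np.rot90(refilled)
print(result)
```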