Why do paired columns with the same name in a df get changed after being imported into Python using pandas? - python-3.x

I noticed something very weird today. I have a .csv file containing a df that is displayed as shown below when opened with Excel:
One would think that after executing the following code in Python 3.x:
import pandas as pd
metadata_file_path = r'C:\Users\ResetStoreX\Pictures\Metadata.csv'
df_metadata = pd.read_csv(metadata_file_path, index_col=0)
print(df_metadata)
the output would be the one below:
0 0 1 1 2 2 3 3 4 4 5 5
0 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes None
1 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Brown
2 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Green
3 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Purple
4 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Sand
However, it ends up being this one instead:
0 0.1 1 1.1 2 2.1 3 3.1 4 4.1 5 5.1
0 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes None
1 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Brown
2 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Green
3 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Purple
4 Background Ocean Body Crab Colour Dark green Eyes type Antennae Claws None Spikes Sand
As can be seen, the columns with the same name were modified by pandas (or Python) on import: a .1 suffix was appended to the second column of each pair that shares a name with the previous one.
I don't understand why this happens and, if possible, I would like to know a way of preventing this unexpected modification.

pandas read_* methods deduplicate column names on import, because duplicated names make column selection ambiguous.
With duplicated names, df[0] would select both columns, not one.
To restore the original column names, you can use:
df.columns = df.columns.str.split('.').str[0].astype(int)
Another idea is to group on the part of each name before the '.', without changing the column names:
row = 0
# group the row's labels by the part before the dot, so '0' and '0.1' fall into one group
d = {x.iat[0]: x.iat[1] for _, x in df.iloc[row].groupby(lambda x: x.split('.')[0])}
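The renaming and the first workaround can be reproduced with a small in-memory file. This is only a minimal sketch: the two-row CSV text below is made up to mimic the layout in the question, and the names csv_text, mangled, restored and pairs are illustrative.

```python
import io

import pandas as pd

# Hypothetical miniature of the CSV from the question: an unnamed index
# column followed by pairs of columns that share the same header.
csv_text = (
    ",0,0,1,1\n"
    "0,Background,Ocean,Body,Crab\n"
    "1,Background,Ocean,Body,Crab\n"
)

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# read_csv has renamed the second column of each pair with a ".1" suffix.
mangled = list(df.columns)

# Keep only the part before the dot and convert it back to an integer.
df.columns = df.columns.str.split(".").str[0].astype(int)
restored = list(df.columns)

# Positional access still works when labels are duplicated, so one row can
# be turned into a {trait: value} mapping by pairing even and odd positions.
row = df.iloc[0]
pairs = dict(zip(row.iloc[0::2], row.iloc[1::2]))

print(mangled, restored, pairs)
```

Once the duplicated labels are restored, label-based selection such as df[0] returns both columns of the pair, so positional access with iloc is the safer way to pick a single one.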

Related

Power Query to sum for each color and each size, return a value on the available size list

I have a list of shirt colors and their suggested sizes. I would like to add a new column to the query whose value depends on the full list of sizes that occur for each color (I have no idea how to explain it differently, feel free to correct me). The rules are:
If the current color has XS in the occurrence list then the value of the row needs to be "YES"
If the current color does not have XS but has XXL,XL or L then the value should be "XYES"
Otherwise the value needs to be "NO"
Jacket Color  Jacket Size
-------------------------
Black         XS
Black         XS
Black         S
Blue          XS
Blue          L
Blue          XL
Blue          XXL
Blue          XL
Blue          XXL
Green         XS
Green         S
Green         M
Red           XS
Red           XXL
Red           S
Red           XXL
White         S
White         M
The table should look like this:
Jacket Color  Jacket Size  New_col
----------------------------------
Black         XS           YES
Black         XS           YES
Black         S            YES
Blue          XS           XYES
Blue          L            XYES
Blue          XL           XYES
Blue          XXL          XYES
Blue          XL           XYES
Blue          XXL          XYES
Green         XS           YES
Green         S            YES
Green         M            YES
Red           XS           XYES
Red           XXL          XYES
Red           S            XYES
Red           XXL          XYES
White         S            NO
White         M            NO
I am not much of a tech person myself; if you can point me to what to google for the answer, that is good as well. Thank you in advance.
I have tried everything I could with the little knowledge I have about Power Query. If I could solve this myself, I would have a job right now.
Your result data doesn't match your sample data but this is the process.
Import data in PQ
Select Jacket Color and then Group By from the ribbon. Enter the following (the formula in the next step assumes the new column is named All and keeps all rows of each group):
Add a new custom column from the ribbon and enter the following:
if List.Contains([All][Jacket Size], "XS") then "YES" else if List.ContainsAny([All][Jacket Size], {"XXL","XL", "L"}) then "XYES" else "NO"
Expand the column to get all rows back.
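For comparison, the same group-then-label logic can be sketched in pandas. This is not part of the Power Query solution, only an illustration of the rules as stated; the sample rows and the helper name label_color are made up for the example.

```python
import pandas as pd


def label_color(sizes):
    """Label one colour from the full collection of sizes listed for it."""
    present = set(sizes)
    if "XS" in present:
        return "YES"
    if present & {"XXL", "XL", "L"}:
        return "XYES"
    return "NO"


# Illustrative data only (not the table from the question).
df = pd.DataFrame({
    "Jacket Color": ["Black", "Black", "Blue", "Blue", "White", "White"],
    "Jacket Size": ["XS", "S", "L", "XL", "S", "M"],
})

# One label per colour, then broadcast back onto every row of that colour,
# which mirrors the group / custom column / expand steps.
labels = df.groupby("Jacket Color")["Jacket Size"].apply(label_color)
df["New_col"] = df["Jacket Color"].map(labels)

print(df)
```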

Changing the color of agents in NetLogo according to a turtle-own variable

I am writing a simple food exchange model in NetLogo and I want the agents to change color as their food level changes. The amount of food is in the range [0,1], and I want the color to go from white to red (white = food level of 0, red = food level of 1) using the code below:
ask turtles [
set color scale-color red food 1 0 ]
But my turtles somehow turn black in the middle of the food exchange! The turtles-own food value can be any floating-point number in the range [0,1]. Does anyone know how I can keep the color within the light shades of red (red to white), with no black?
Scale-color and ranges
In the example above, the color and number are correct; the issue is the range provided. With the range set to [0,1], the gradient spans the whole scale, so food values run from 0 (white) to 1 (black).
As JenB mentioned, you might want to extend the range of the expected values. Changing the range from [0,1] to [0,2] for scale-color would help, since with scale-color the midpoint of the range is the color provided.
[ set color scale-color red food 2 0 ]
As long as food is within [0,1], this example should fluctuate between red and white.
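The arithmetic behind the two ranges can be checked with a rough model. The sketch below is a simplification of the behavior described above, not NetLogo's actual implementation. The function name shade_position is hypothetical; it returns 0.0 for black, 0.5 for the base color and 1.0 for white.

```python
def shade_position(value, range1, range2):
    """Simplified model of where a value lands on the dark-to-light scale.

    The value is mapped linearly so that range1 is the dark end (0.0) and
    range2 is the light end (1.0), then clamped to [0.0, 1.0]. Passing the
    range as "1 0" therefore makes smaller values lighter.
    """
    t = (value - range1) / (range2 - range1)
    return min(1.0, max(0.0, t))


# Range "1 0": food = 1 lands on black, food = 0.5 on pure red.
print(shade_position(1.0, 1, 0), shade_position(0.5, 1, 0))

# Range "2 0": food = 1 now lands on pure red, food = 0 on white.
print(shade_position(1.0, 2, 0), shade_position(0.0, 2, 0))
```

Under this model, any food value in [0,1] with the range "2 0" stays in the upper half of the scale, which is the red-to-white band.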

Excel - Overlapping Data - Pivottable

Is it possible to create a table for data with overlapping values within the same column?
I would prefer a pivot table where I could slice the data instead of a Venn diagram.
Data
1. Red / Material 1
2. Red / Material 2
3. Red / Material 3
4. Red / Material 4
5. Red / Material 5
6. Blue / Material 1
7. Blue / Material 6
8. Blue / Material 7
9. Blue / Material 8
10. Blue / Material 9
11. Blue / Material 10
12. Blue / Material 11
13. Blue / Material 12
14. Green / Material 1
15. Green / Material 2
16. Green / Material 6
17. Green / Material 7
18. Green / Material 8
19. Green / Material 13
20. Green / Material 14
First, create a table that has combinations of colors like this:
Color Color2
-------------
Red Red
Red Blue
Red Green
Blue Red
Blue Blue
Blue Green
Green Red
Green Blue
Green Green
One way to do this is to create a calculated Colors table like this:
Colors = CROSSJOIN(SELECTCOLUMNS(VALUES(Data[Color]), "Color", Data[Color]),
SELECTCOLUMNS(VALUES(Data[Color]), "Color2", Data[Color]))
Now we can create a calculated column on this table that counts the intersecting values:
Count =
VAR Materials1 = CALCULATETABLE(VALUES(Data[Material]),
Data[Color] = EARLIER(Colors[Color]))
VAR Materials2 = CALCULATETABLE(VALUES(Data[Material]),
Data[Color] = EARLIER(Colors[Color2]))
RETURN IF(Colors[Color] = Colors[Color2], BLANK(),
COUNTROWS(INTERSECT(Materials1, Materials2)))
Now you can set them up in a matrix visual with Color on the Rows and Color2 on the Columns and Count in the Values box.
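To sanity-check the numbers the matrix should show, the same pairwise intersection can be sketched in plain Python with sets. This is only an illustration alongside the DAX, using the sample data from the question; the variable names are made up.

```python
from itertools import product

# Materials per colour, taken from the sample data in the question.
materials = {
    "Red": {1, 2, 3, 4, 5},
    "Blue": {1, 6, 7, 8, 9, 10, 11, 12},
    "Green": {1, 2, 6, 7, 8, 13, 14},
}

# Every ordered pair of colours, like the CROSSJOIN table. The diagonal is
# left empty (None), mirroring the BLANK() returned when both colours match.
counts = {
    (a, b): None if a == b else len(materials[a] & materials[b])
    for a, b in product(materials, repeat=2)
}

for (a, b), n in counts.items():
    print(a, b, n)
```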

Excel sort by similar cell data

I have a list of data that I need to compare across 2 spreadsheets. I'm going to simplify it with a list like the one below (column A being a part number and column B being a quantity):
Spreadsheet 1:
Red 1
Blue 2
Green 1
Orange 6
Yellow 8
Spreadsheet 2:
Red 1
Green 1
Blue 2
Orange 6
Yellow 8
Silver 2
Brown 3
Now, what I would like my output to be:
Red 1
Blue 2
Green 1
Orange 6
Yellow 8
Silver 2
Brown 3
Notice that I'm sorting it so that list 2 aligns with list 1, and if list 2 contains things that are not on list 1, they go at the bottom (preferably this would also work the other way around). I'm not sure if this is even possible, but if it is, it will GREATLY decrease my workload, so any help is MUCH appreciated. Thanks for your time!
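The ordering described above can be sketched in Python, purely to make the rule concrete. It is not an Excel solution, and the function name align is hypothetical. Items that appear only in list 1 get a quantity of None here, which is an assumption about how the reverse case should look.

```python
def align(list1, list2):
    """Order list2's rows to follow list1, then append rows only in list2."""
    lookup = dict(list2)
    names_in_list1 = {name for name, _ in list1}

    # Rows in list1's order, with the quantity taken from list2 (None if absent).
    aligned = [(name, lookup.get(name)) for name, _ in list1]

    # Rows that exist only in list2 keep their relative order at the bottom.
    extras = [(name, qty) for name, qty in list2 if name not in names_in_list1]
    return aligned + extras


sheet1 = [("Red", 1), ("Blue", 2), ("Green", 1), ("Orange", 6), ("Yellow", 8)]
sheet2 = [("Red", 1), ("Green", 1), ("Blue", 2), ("Orange", 6),
          ("Yellow", 8), ("Silver", 2), ("Brown", 3)]

for name, qty in align(sheet1, sheet2):
    print(name, qty)
```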

How to extract a particular word from a cell in excel that matches a list of values

How can I extract a particular word from a cell in Excel by matching it against a list (array) of possible values, and return the matched word?
I have a list of products, but I just want to know the color. The color is embedded in the product description, so I need to extract it from the item description. Here is a brief example of the item descriptions:
CORNER CCP 26" BARN RED
CORNER CCP 28" KHAKI
CORNER CCP 28" SLATE GRAY
CORNER RS EZ ANTIQUE GRAY
CORNER RS EZ 26" ASHWOOD GRAY
CORNER,RSC EZ,AUTUMN CEDAR
CORNER RS EZ 26" BARN RED
CORNER RS EZ 26" CANARY YELLOW
CORNER RS EZ 26" COASTAL BROWN
CORNER,RS EZ 26" COASTAL CLAY
CORNER,RS EZ 26"COASTAL CEDAR
CORNER RS EZ 26" CYPRESS GREEN
CORNER RS EZ 26" CLASSIC WHIT
I then want to compare each item description with a list of colors I have and return just the color names.
Amaranth
Amber
Amethyst
Apricot
Aquamarine
Azure
Baby blue
Beige
Black
Blue
Blue-green
Blue-violet
Blush
Bronze
Brown
Burgundy
Byzantium
Carmine
Cerise
Cerulean
Champagne
Chartreuse green
Chocolate
Cobalt blue
Coffee
Copper
Coral
Crimson
Cyan
Desert sand
Electric blue
Emerald
Erin
Gold
Gray
Green
Harlequin
Indigo
Ivory
Jade
Jungle green
Lavender
Lemon
Lilac
Lime
Magenta
Magenta rose
Maroon
Mauve
Navy blue
Ocher
Olive
Orange
Orange-red
Orchid
Peach
Pear
Periwinkle
Persian blue
Pink
Plum
Prussian blue
Puce
Purple
Raspberry
Red
Red-violet
Rose
Ruby
Salmon
Sangria
Sapphire
Scarlet
Silver
Slate gray
Spring bud
Spring green
Tan
Taupe
Teal
Turquoise
Violet
Viridian
White
Yankees Blue
Yellow
Place the data in column A.
Place the list of colors in column B.
Place the following array formula in column C and fill down:
=IFERROR(INDEX($B$1:$B$86,MATCH(1,COUNTIF($A1,"*"&$B$1:$B$86&"*"),0)),"")
Note that this is an array formula. In order for it to operate correctly, first copy and paste from your browser window to Excel, then, with the same cell selected, click in the formula bar (or press F2) and press Control + Shift + Enter. There should now be braces around the formula.
If the string in column A contains more than one of the colors in column B, the first matching column B color will be listed.
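The first-match behavior noted above can be illustrated with a short Python sketch. It is not a replacement for the formula: the function name first_color is hypothetical, the short colour list is a subset chosen for the example, and the lowercase comparison assumes the wildcard match ignores case.

```python
def first_color(description, colors):
    """Return the first colour in list order found inside the description."""
    text = description.lower()
    for color in colors:
        if color.lower() in text:
            return color
    return ""  # same fallback as the IFERROR(..., "") wrapper


# A few colours in the same alphabetical order as the full list.
colors = ["Brown", "Gray", "Green", "Red", "Slate gray", "Yellow"]

for item in ['CORNER CCP 26" BARN RED',
             'CORNER CCP 28" KHAKI',
             'CORNER CCP 28" SLATE GRAY']:
    print(repr(item), "->", repr(first_color(item, colors)))
```

Because "Gray" comes before "Slate gray" in the list, the SLATE GRAY description returns "Gray". Putting longer, more specific names earlier in the list would change which match wins.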

Resources