I'm new to pandas and would like some help with an Excel file.
I have a sheet with a column, Column1:
Index Column1
1 PF7293
2 NodeB Name=SN5208, LogicRNCID=106
3 KL5083
4 Label=DL7765A3U-2, CellID=28643, LogicRNCID=201
I want to create another column, Column2, that holds the code extracted from Column1, like this:
Index Column2
1 PF7293
2 SN5208
3 KL5083
4 DL7765
In Excel we used MID. I would like to do the same using pandas. Thank you!
Question 2
The new sheet looks like this:
Column1 Column2
KL7110 BTS works
KS5007 BSS works
KL5066 Planned works
KL5147 Planned works
KL5066 Unplanned work
KL5077 Power work
KL5077 Power work
AN9045 MW work
I want to remove repeated Column2 values for the same Column1 value.
For example, KL5077 appears twice in Column1 with the same value in Column2, so I would like to delete one of those rows.
The second problem: KL5066 appears twice in Column1 with different values in Column2, and in that case I would like to join the Column2 values together, like "Planned works/Unplanned work". I hope I've explained it well.
You could try Series.str.extract:
df['Column2'] = df['Column1'].str.extract(r'([A-Z]{2}\d{4})')
Here the regex pattern can be thought of as "2 uppercase letters" followed by "4 digits".
[out]
Index Column1 Column2
0 1 PF7293 PF7293
1 2 NodeB Name=SN5208, LogicRNCID=106 SN5208
2 3 KL5083 KL5083
3 4 Label=DL7765A3U-2, CellID=28643, LogicRNCID=201 DL7765
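For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sheet has already been loaded into a DataFrame (built by hand here instead of reading the Excel file), and it passes expand=False so that the single capture group comes back as a Series. Rows with no match would get NaN.

```python
import pandas as pd

# Sample data copied from the question; in practice this would come from the Excel file.
df = pd.DataFrame({
    "Column1": [
        "PF7293",
        "NodeB Name=SN5208, LogicRNCID=106",
        "KL5083",
        "Label=DL7765A3U-2, CellID=28643, LogicRNCID=201",
    ]
})

# expand=False returns a Series (not a one-column DataFrame) for a single group.
df["Column2"] = df["Column1"].str.extract(r"([A-Z]{2}\d{4})", expand=False)

codes = df["Column2"].tolist()
print(codes)
```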
Update
For the 2nd problem:
1) To drop duplicate rows use:
df.drop_duplicates(subset=['Column1', 'Column2'], inplace=True)
2) To join multiple 'Column2' values use:
df_new = df.groupby('Column1')['Column2'].apply('/'.join).reset_index()
[out]
Column1 Column2
0 AN9045 MW work
1 KL5066 Planned works/Unplanned work
2 KL5077 Power work
3 KL5147 Planned works
4 KL7110 BTS works
5 KS5007 BSS works
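The two steps above can be chained on the sample data from the question. This is only a sketch: the DataFrame is typed in by hand, and it uses the non-inplace form of drop_duplicates so the original frame is left untouched.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": ["KL7110", "KS5007", "KL5066", "KL5147",
                "KL5066", "KL5077", "KL5077", "AN9045"],
    "Column2": ["BTS works", "BSS works", "Planned works", "Planned works",
                "Unplanned work", "Power work", "Power work", "MW work"],
})

# 1) Drop rows where both columns repeat (the second KL5077 / Power work row).
deduped = df.drop_duplicates(subset=["Column1", "Column2"])

# 2) Join the remaining Column2 values per Column1 key with "/".
df_new = deduped.groupby("Column1")["Column2"].apply("/".join).reset_index()

result = dict(zip(df_new["Column1"], df_new["Column2"]))
print(df_new)
```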
Related
I need some help with Python, please.
My problem is that I have a lot of Excel tables, and I need to update the values in one of them from another table, like this:
TABLE 1 :
CODE OLDNOUN COLUMN3 COLUMN4 .... COLUMNX
1 AZE
2 QSD
3 WXC
TABLE 2 :
CODE NEWNOUN ATTRIBUT
1 ABC A1
2 DEF B4
3 GHI C2
MY FINAL TABLE SHOULD be like this :
CODE OLDNOUN COLUMN3 COLUMN4 .... COLUMNX ATTRIBUT
1 ABC A1
2 DEF B4
3 GHI C2
In pseudocode:
IF TABLE_1.CODE == TABLE_2.CODE then TABLE_1.OLDNOUN = TABLE_2.NEWNOUN
and then add the new ATTRIBUT column.
I don't know how to do this in Python. Thanks for your help :)
You can work with Excel files in Python using the openpyxl library. The chapter linked below is a quick way to learn what you need for this kind of task.
Link: https://automatetheboringstuff.com/chapter12/
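As an alternative to openpyxl, the update described in the question can also be sketched with pandas. This is not what the answer above proposes. It assumes both tables are already loaded as DataFrames (typed in by hand here, with the extra COLUMN3...COLUMNX columns left out) and that CODE is unique in TABLE 2.

```python
import pandas as pd

# Hand-typed stand-ins for the two Excel tables in the question.
table1 = pd.DataFrame({"CODE": [1, 2, 3], "OLDNOUN": ["AZE", "QSD", "WXC"]})
table2 = pd.DataFrame({
    "CODE": [1, 2, 3],
    "NEWNOUN": ["ABC", "DEF", "GHI"],
    "ATTRIBUT": ["A1", "B4", "C2"],
})

# A left join keeps every row of table1 and brings in NEWNOUN and ATTRIBUT by CODE.
merged = table1.merge(table2, on="CODE", how="left")

# Overwrite OLDNOUN where a match exists; keep the old value where there is none.
merged["OLDNOUN"] = merged["NEWNOUN"].fillna(merged["OLDNOUN"])

final = merged.drop(columns="NEWNOUN")
print(final)
```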
I have one column that contains text such as,
column1
3
4
5
6
7
8
9.2
10
11
txt1
txt2
I want to create a new column2 that gives me the following output.
column1 column2
3 3-6
4 3-6
5 3-6
6 3-6
7 7-10
8 7-10
9.2 7-10
10 7-10
11 11
txt1 txt1
txt2 txt2
I have tried the following DAX function, but I can't get it to work: it only returns the "value if false". Column1 is formatted as text.
column2 = IF(CONTAINS(Table1;Table1[column1];"3";Table1[Column1];"4");"3-8";"9.5-10").........
I have tried the FIND function as well, without luck.
Does anyone have any tips? If someone knows how to do this in Excel, perhaps it could be figured out that way? :D
/D
I'm not sure exactly what your logic for bucketing values is, but you should be able to write something along these lines:
Column2 = SWITCH(TRUE(),
ISERROR(VALUE(Table1[Column1])), Table1[Column1],
VALUE(Table1[Column1]) >= 3 && VALUE(Table1[Column1]) <= 6, "3-6",
VALUE(Table1[Column1]) >= 7 && VALUE(Table1[Column1]) <= 10, "7-10",
Table1[Column1])
This SWITCH function returns the result paired with the first condition that evaluates to true; if none does, it returns the last argument. The first pair checks whether the value can be converted to a number and, if not, returns the original value. The next two pairs check whether the number falls in a given range and return the string for that range. A number outside both ranges, such as 11, falls through to the last argument and is returned unchanged.
Here's a link that explains the SWITCH(TRUE()...) construction in more detail:
https://powerpivotpro.com/2015/03/the-diabolical-genius-of-switch-true/
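For readers who find the first-match-wins logic easier to follow outside DAX, here is the same bucketing written as a small Python sketch. The function name bucket is made up for illustration, and the ranges are the ones used in the SWITCH above.

```python
def bucket(value: str) -> str:
    """Mirror the SWITCH(TRUE(), ...) logic: the first matching rule wins."""
    try:
        number = float(value)
    except ValueError:
        # Not numeric (e.g. "txt1"): return the text unchanged.
        return value
    if 3 <= number <= 6:
        return "3-6"
    if 7 <= number <= 10:
        return "7-10"
    # Numeric but outside both ranges: fall through to the original value.
    return value


column1 = ["3", "4", "5", "6", "7", "8", "9.2", "10", "11", "txt1", "txt2"]
column2 = [bucket(v) for v in column1]
print(column2)
```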
I would like to count the 3s in every block of 7 rows; the data are in one column. I am using this formula, but it is not working.
From B8 to B14329, for every 7 rows, count the cells equal to 3, so I know how many 3s there are in each block of 7 rows.
=COUNTIFS(B8:B14329, OFFSET($B$7,(ROW()-12)*7,0,7,1),B8:B14329,=3)
Thanks a lot!
I want something like this:
data count
3
2
3
1
3
3
1 4
1
2
2
3
3
1
1 2
.....
....
...
Simple and easy:
=SUMPRODUCT((B8:B14329=3)*(MOD(ROW(B8:B14329),7)=1))
Just change the =1 to suit your needs: it is the remainder of the row number divided by 7, so row 1 gives 1, row 2 gives 2, ... row 6 gives 6, and row 7 gives 0. To start counting at row 8, use =1.
EDIT: having seen your example now, you want something completely different... lol.
=IF(MOD(ROW(),7)=0,COUNTIF(A8:A14,3),"")
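Put this in row 14 and then drag down. Change the =0 as you need it, and point the range at the column that holds your data (B8:B14 for the data described in the question).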
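To check the per-block counts outside Excel, the same idea can be sketched in Python. The helper name count_per_block is made up for illustration, and the list holds the first 14 values from the example in the question.

```python
def count_per_block(values, target=3, block_size=7):
    """Count how many items equal `target` in each consecutive block of rows."""
    return [
        values[start:start + block_size].count(target)
        for start in range(0, len(values), block_size)
    ]


# First two blocks of seven rows from the question's example.
data = [3, 2, 3, 1, 3, 3, 1,
        1, 2, 2, 3, 3, 1, 1]
counts = count_per_block(data)
print(counts)
```

The two results match the 4 and 2 shown in the count column of the question.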
Here's what I would do
Add a new column with the row index (8 to 14329 in your case).
Add yet another column with a formula that tells whether the value in the column you just added is a multiple of 7, giving "TRUE" or "FALSE".
You can use the MOD function to check the remainder of the division.
= MOD ( Number , Divisor )
By now, you should have, aside from the columns you already have, something like:
8-----FALSE
9-----FALSE
10-----FALSE
11-----FALSE
12-----FALSE
13-----FALSE
14-----TRUE
15-----FALSE
Once you have that, just apply a filter on the "TRUE/FALSE" column, select the "TRUE" values and you will be able to count the number of "3"s on the actual value column, by also using a filter on it.
I hope it helps, and it's easier than a really messy formula.
I have a Summary sheet with data set up as follows:
Cat A Cat B Cat C Cat D
Name 1 0 0 0 0
Name 2 2 3 2 2
Name 3 2 2 2 2
Name 4 3 2 2 3
Name 5 2 3 2 3
I also then have separate tabs for each of Name1 through to Name 5.
The summary sheet contains the maximum values for each category from each tab. So the Cell at Cat A Name 1 should show the maximum value on Sheet(Name1) in the Cat A column.
So far so good. However, each tab may not contain the same categories, so I would like the summary sheet to find the maximum value in each column by searching for the Cat name.
So far I have this:
=MATCH(Overview!S$1,Name1!$C$1:$V$1,0)
Which returns the column number with the right Category, in this case 13. So I can find the right column. What I am struggling with is to now find the maximum value in the column.
Can anyone help?
Thanks
Assuming your search range goes to row 1000:
=MAX(INDEX(Name1!$C$2:$V$1000,0,MATCH(Overview!S$1,Name1!$C$1:$V$1,0)))
The 0 Row argument in Index means to select the entire column.
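The MATCH-then-INDEX-then-MAX chain may be easier to see as a small Python sketch: find the column position by header name, take that whole column, then take its maximum. The function name and the sample sheet below are invented for illustration and are not taken from the workbook in the question.

```python
def max_for_category(header, rows, category):
    """Return the largest value in the column whose header equals `category`."""
    col = header.index(category)          # like MATCH(..., 0): position of the header
    return max(row[col] for row in rows)  # like MAX(INDEX(range, 0, col))


# Hypothetical tab where the categories appear in a different order.
header = ["Cat B", "Cat A", "Cat D"]
rows = [
    [1, 2, 0],
    [3, 0, 2],
    [2, 1, 3],
]
cat_a_max = max_for_category(header, rows, "Cat A")
print(cat_a_max)
```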
The Offset function is your key here.
After you've got the value from the match, you can pass it to the offset to get the correct column.
So, for example, you probably want something like:
=Max(Name1!$C1:$C2000)
But you don't know in advance whether to use column C, column D, or another one. In this case the match was 13, which is the 13th column counting from C, i.e. column O (C is column 3, so 3 + 13 - 1 = 15 = O). So I think you want something like this:
=Max(Offset(Name1!$C$1:$C$2000, 0, [result of your match expression] - 1))
Here's an example of what I think you want in GoogleDocs:
https://docs.google.com/spreadsheet/ccc?key=0Ai45AJPc2AWMdGRlZXNIdlZBaHJxc01qVlJWa1N1WXc
I have a table with first column as primary key. Ex:
id value1 value2
1 10 5
2 2 3
3 12 5
..
I also have a second list of id's I want to select, which can have repeated ids. Ex:
selectId
1
2
2
2
5
10
..
How can I "merge" the two tables (something like INNER JOIN) to obtain:
id value1 value2
1 10 5
2 2 3
2 2 3
2 2 3
5 99 99
10 22 22
..
I tried using 'Microsoft Query' from Data > External Data to join the two tables. The problem is that it seems unable to handle tables with more than 256 columns.
Thanks
UPDATE:
Thanks, VLOOKUP works as intended.
However, one problem remains: if the row is found but the corresponding cell is blank, the function returns 0 (where I expected an empty cell). Since zero is a valid value, I have no way to tell blank and zero apart.
Any help is appreciated..
If this is Excel, as the title says, just use VLOOKUPs.
Not very relational, but that's the Excel way.
Using the VLOOKUP function would get you the data in the layout you require.
If you are using Tables in Excel 2007, the formula would look like this based on the example below.
in cell B8
=VLOOKUP([selectId],Table1,2,FALSE)
in cell C8
=VLOOKUP([selectId],Table1,3,FALSE)
Lookup screenshot http://img208.imageshack.us/img208/1/lookupz.png
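If the data can be taken outside Excel, the same lookup can be sketched in plain Python. This is an illustration rather than a replacement for the formulas above. The table contents are partly invented (ids 3 and 4 are hypothetical), a blank cell is represented as None so that it stays distinct from 0, and ids missing from the table are skipped, as an INNER JOIN would do.

```python
# id -> (value1, value2); None stands for a blank cell (hypothetical data).
table = {
    1: (10, 5),
    2: (2, 3),
    3: (12, None),
    4: (0, 7),
}

# The list of ids to select, with repeats and one id (9) that is not in the table.
select_ids = [1, 2, 2, 2, 3, 4, 9]

# One output row per selected id found in the table; repeated ids repeat the row.
rows = [(i, *table[i]) for i in select_ids if i in table]

for row in rows:
    print(row)
```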
It is not clear where you store your data, but it looks like you are hitting the problem described on the Microsoft site:
http://support.microsoft.com/kb/272729