Replace Pattern matching portion with some other value in python 3 - python-3.x

I want to replace the word ROAD with RD. in each address:
addr = ['100 NORTH MAIN ROAD',
'100 BROAD ROAD APT.',
'SAROJINI DEVI ROAD',
'BROAD AVENUE ROAD']
Desired output:
['100 NORTH MAIN RD.',
'100 BROAD RD. APT.',
'SAROJINI DEVI RD.',
'BROAD AVENUE RD.']
I tried the code below:
new_address=[word.replace("ROAD","RD.") for word in addr]
but I am not getting the desired output (the ROAD inside BROAD is also replaced by RD.):
['100 NORTH MAIN RD.', '100 BRD. RD. APT.', 'SAROJINI DEVI RD.', 'BRD. AVENUE RD.']

For this particular example, you can include the leading space in the match:
new_address=[word.replace(" ROAD"," RD.") for word in addr]
Or, more generally, use a regex with word boundaries:
import re
new_address = [re.sub(r'\bROAD\b', 'RD.', w) for w in addr]
# ['100 NORTH MAIN RD.', '100 BROAD RD. APT.', 'SAROJINI DEVI RD.',
# 'BROAD AVENUE RD.']
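To see why the word-boundary regex is the more general of the two, here is a minimal sketch that runs both approaches side by side. The helper name `abbreviate_road` and the last address (`'ROAD 7 WEST'`) are made up for illustration and are not part of the question's data.

```python
import re

# \b matches only at the edge of a word, so the ROAD inside BROAD is skipped.
ROAD_WORD = re.compile(r"\bROAD\b")

def abbreviate_road(addresses):
    """Replace the standalone word ROAD with RD. in every address."""
    return [ROAD_WORD.sub("RD.", address) for address in addresses]

addr = ['100 NORTH MAIN ROAD',
        '100 BROAD ROAD APT.',
        'SAROJINI DEVI ROAD',
        'BROAD AVENUE ROAD',
        'ROAD 7 WEST']  # hypothetical address that starts with ROAD

# The space-prefixed replace needs a space before ROAD, so it misses the last one.
with_space = [a.replace(" ROAD", " RD.") for a in addr]
with_regex = abbreviate_road(addr)

print(with_space)
print(with_regex)
```

The space-prefixed `replace` leaves `'ROAD 7 WEST'` untouched because nothing precedes the word, while the regex version handles it.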

Related

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/form/table"}

I'm trying to get the table with all ISIN codes from the following website, but I get this error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/form/table"}
Code
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
PATH = r"C:\Users\HP\Downloads\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get('http://stockcare.net/ISINNumber.asp')
table = driver.find_element_by_xpath('/html/body/form/table')
print(table)
driver.quit()
Once I get the table, I want to store it in a pandas DataFrame. Can someone help me out?
The table is inside a frame on this site, which is why Selenium throws "no such element".
You need to switch to that frame first, then read the table's innerHTML (the example below is Java):
driver.switchTo().frame(1);
String table = driver.findElement(By.xpath("/html/body/form/table")).getAttribute("innerHTML");
Thread.sleep(3000);
System.out.println("Table: "+ table);
Output: it will look like this, and you can then extract the values from it.
Table:
<tbody><tr bgcolor="#6699cc" style="color:white;">
<th width="8%" class="specialTDBig">Group</th>
<th width="10%" class="specialTDBig">BSE Code</th>
<th width="33%" class="specialTDBig">BSE Name</th>
<th width="16%" class="specialTDBig">ISINNO</th>
<th width="20%" class="specialTDBig">Old Name</th>
</tr>
<tr>
<td class="specialTDBig">A</td>
<td class="specialTDBig">523395</td>
<td class="specialTDBig">3M India Ltd </td>
<td class="specialTDBig">INE470A01017</td>
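Since the question is in Python and asks for a DataFrame, here is a minimal sketch of turning such an innerHTML string into rows using only the standard library. The names `TableRowParser` and `table_html_to_rows` are made up for illustration. The sample string is the snippet above, closed off by hand, with the "Old Name" cell taken from the console output shown further down.

```python
from html.parser import HTMLParser

class TableRowParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

def table_html_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows

# Snippet from the output above, completed by hand for illustration.
sample = """<tbody><tr bgcolor="#6699cc" style="color:white;">
<th width="8%" class="specialTDBig">Group</th>
<th width="10%" class="specialTDBig">BSE Code</th>
<th width="33%" class="specialTDBig">BSE Name</th>
<th width="16%" class="specialTDBig">ISINNO</th>
<th width="20%" class="specialTDBig">Old Name</th>
</tr>
<tr>
<td class="specialTDBig">A</td>
<td class="specialTDBig">523395</td>
<td class="specialTDBig">3M India Ltd </td>
<td class="specialTDBig">INE470A01017</td>
<td class="specialTDBig">Birla 3m</td>
</tr></tbody>"""

rows = table_html_to_rows(sample)
header, data = rows[0], rows[1:]
print(header)
print(data)
```

With pandas installed, `pd.DataFrame(data, columns=header)` should then give the DataFrame the question asks for, provided every data row has as many cells as the header.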
Because the elements are inside an <iframe>, to print the table data you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the desired element to be visible.
You can use the following locator strategy.
Using CSS_SELECTOR:
driver.get("http://stockcare.net/ISINNumber.asp")
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"form[name='frmScrip'] iframe")))
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "form[name='frmDetails']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
Group BSE Code BSE Name ISINNO Old Name
A 523395 3M India Ltd INE470A01017 Birla 3m
A 524348 Aarti Drugs INE767A01016
A 524208 Aarti Ind(5) INE769A01020
A 541988 Aavas Financiers INE216P01012
A 500002 Abb India Ltd.(2) INE117A01022 Abb Ltd.(2)
A 543187 ABB Power Prod(2) INE07Y701011
A 500488 Abbott (I) Ltd INE358A01014 Knoll Pharm.
A 500410 ACC INE012A01025
A 532762 Action Const(2) INE731H01025
A 512599 Adani Enter(1) INE423A01024
A 541450 Adani Green Energy INE364U01010
A 532921 Adani Ports(2) INE742F01042 Mundra Port
A 519183 ADF Foods Ltd INE982B01019 American Dry
A 535755 Adi.Bir.Fash INE647O01011 Pantaloons F
A 540691 Adit.Bir.Capital INE674K01013
A 540025 Advanced Enzym(2) INE837H01020
A 500003 Aegis Logist(1) INE208C01025
A 542752 Affle(India)(2) INE00WC01027
A 500215 Agro Tech Food INE209A01019 ITC Agro-tec
A 532811 Ahluwalia Co(2) INE758C01029
A 532683 Aia Engineer(2) INE212H01026
A 532331 Ajanta Pharma(2) INE031B01049
A 500710 Akzo Nobel India INE133A01011 ICI India
A 506235 Alembic Ltd(2) INE426A01027
A 533573 Alembic Pha(2) INE901L01018
A 539523 Alkem Laborat(2) INE540L01014
A 506767 Alkyl Amines(2) INE150B01039
A 532480 Allahabad Bk INE428A01015
A 532749 Allcargo Logis(2) INE418H01029 Allcargo Glo
A 500008 Amar Raja B(1) INE885A01032
A 540902 Amber Enterp.India INE371P01015
A 500425 Ambuja Cem(2) INE079A01024
A 532418 Andhra Bank INE434A01013
A 532259 Apar Ind. INE372A01015
A 523694 Apcotex Indus(2) INE116A01032
A 540692 Apex Frozen Foods INE346W01013
A 533758 APL Apollo Tub(2) INE702C01027
A 508869 Apollo Hosp.(5) INE437A01024
A 531761 Apollo Pipes Ltd INE126J01016
A 538566 Apollo Tricoat Tu(2) INE919P01029 Best Steel L
A 500877 Apollo Tyre(1) INE438A01022
A 532475 Aptech Ltd INE266F01018 Aptech Train
A 531179 Arman Fin.Serv. INE109C01017 Arman Leasin
A 542484 Arvind Fashion (4) INE955V01021
A 500101 Arvind Ltd INE034A01011 Arvind Mills
A 515030 Asahi Ind Gl(1) INE439A01020
A 500477 Ashok Leyland(1) INE208A01029
A 533271 Ashoka Buildcon(5) INE442H01029
A 532888 Asian Granito INE022I01019
A 500820 Asian Paint(1) INE021A01026
A 533138 Astec Lifesci INE563J01010
A 540975 Aster DM Healthcare INE914M01019
A 532493 Astra Micro(2) INE386C01029
A 532830 Astral Ltd(1) INE006I01046 Astral Poly(
A 506820 Astrazen.Ph(2) INE203A01020 Astra Idl
A 500027 Atul Ltd. INE100A01010
A 540611 AU Small Finan Bank INE949L01017
A 524804 Aurobindo Ph(1) INE406A01037
A 539289 Aurum Proptech Ltd(5 INE898S01029 Majesco Ltd(
A 505010 Auto Axles INE449A01011
A 512573 Avanti Feeds(1) INE871C01038
A 540376 Avenue Supermarts INE192R01011
A 532215 Axis Bank (2) INE238A01034
A 532977 Bajaj Auto Ltd INE917I01010
A 533229 Bajaj Consumer Ca(1) INE933K01021 Bajaj Corp(1
A 500031 Bajaj Elec(2) INE193E01025
A 500034 Bajaj Finance(2) INE296A01024
A 532978 Bajaj Finvest(5) INE918I01018
A 500032 Bajaj Hind Sugar(1) INE306A01021 Bajaj Hindus
A 500490 Bajaj Hold&Inves INE118A01012 Bajaj Auto
A 530999 Balaji Amine(2) INE050E01027
A 532382 Balaji Tele(2) INE794B01026
A 502355 Balkrish Ind(2) INE787D01026
A 523319 Balmer Lawri INE164A01016
A 500038 Balrampur Chi(1) INE119A01028
A 541153 Bandhan Bank Ltd INE545U01014
A 532525 Bank Maharashtra INE457A01014
A 532134 Bank of Baroda(2) INE028A01039
A 532149 Bank of India INE084A01016
A 500041 Bannari Aman INE459A01010
A 500042 Basf India INE373A01013
A 500043 Bata India(5) INE176A01028
A 506285 Bayer Corp INE462A01022 Bayer India
A 500048 BEML Ltd INE258A01016 Bh.Earth Mov
A 509480 Berger Paint(1) INE463A01038
A 533303 BF Investment Ltd INE878K01010
A 532430 BF Utilities(5) INE243D01012
A 500052 Bhansali Eng(1) INE922A01025
A 503960 Bharat Bijle(10) INE464A01028
A 541143 Bharat Dynamics INE171Z01018
A 500049 Bharat Elec(1) INE263A01024
A 533228 Bharat Financial INE180K01011 SKS Microfin
A 500493 Bharat Forge(2) INE465A01025
A 500103 Bharat heavy(2) INE257A01026
A 500547 Bharat Petro INE029A01011
A 532454 Bharti Artl(5) INE397D01024
A 532523 Biocon Ltd(5) INE376G01013
A 500335 Birla Corp INE340A01012
A 532400 BirlaSoft Ltd(2) INE836A01035 Kpit Technol
A 506197 Bliss Gvs Ph(1) INE416D01022 Bliss Chem
A 526612 Blue Dart Ex INE233B01017
A 500067 Blue Star(2) INE472A01039
A 524370 Bodal Chem(2) INE338D01028 Dintex Dye
A 501425 Bombay Burmah(2) INE050A01025
A 500020 Bombay Dyeing(2) INE032A01023
A 502219 Borosil Renewable INE666D01022 Borosil Glas
A 500530 BOSCH LTD INE323A01026 Motor ind
A 532929 Brigade Enterpr INE791I01019
A 500825 Britania Ind(1) INE216A01030
A 532321 Cadila Health(1) INE010B01027
A 532834 Camlin Fine Sc(1) INE052I01032
A 532483 Canara Bank INE476A01014
A 511196 Canfin Homes(2) INE477A01020
A 524742 Caplin Point(2) INE475E01026
A 531595 Capri Global Capi INE180C01018 Money Mat Fi
A 513375 Carborundum(1) INE120A01034
A 534804 CARE Ratings Ltd INE752H01013 Credit Analy
A 500870 Castrol Ind(2) INE172A01027
A 519600 CCL Products(2) INE421D01022
A 500878 Ceat Limited INE482A01020
A 532885 Central Bank INE483A01010
A 500040 Century INE055A01016
A 532548 Centuryply(1) INE348B01021
A 532443 Cera Sanitar(5) INE739E01017 Madhusudan O
A 500084 CESC Ltd.(1) INE486A01021
A 542399 Chalet Hotels Ltd INE427F01016
A 500085 Chambal Fert INE085A01013
A 500110 Chennai Pet. INE178A01016 Madras Ref.
A 511243 Chola.Inv&fin(2) INE121A01024
A 504973 Cholamandalam Fin(2) INE149A01025 TI Fin.Holdi
A 534758 Cigniti Techno INE675C01017 Chakkilam In
A 500087 Cipla Ltd(2) INE059A01026
A 532210 City Union Bnk(1) INE491A01021
A 506390 Clariant Chem INE492A01029 Colour Chem
A 533278 Coal India Ltd INE522F01014
A 540678 Cochin Shipyard INE704P01017
A 532541 Coforge Ltd INE591G01017 NIIT Tech
A 500830 Colgate Palm(1) INE259A01022 Colgate
A 531344 Contain.Corp(5) INE111A01025
A 506395 Coromand.Inter(1) INE169A01031 Coromand.Fer
A 532179 Corpn.Bank(2) INE112A01023
A 508814 Cosmo Films INE757A01017 Cosmo Fil(pr
A 541770 Credit Access Grameen INE741K01010
A 500092 CRISIL Rating(1) INE007A01025
A 539876 Crompt.Grev.Cons(2) INE299U01018
A 542867 CSB Bank Ltd INE679A01013
A 500480 Cummins Ind(2) INE298A01020 Kirlos.cumm
A 532175 Cyient Ltd(5) INE136B01020 Infotech Ent
A 500096 Dabur India(1) INE016A01026
A 500097 Dalmia Bhar.Sug(2) INE495A01022 Dalmia Cemen
A 542216 Dalmia Bharat Ltd INE00R701025 Odisha Cemen
A 533151 DB CORP LTD INE950I01011
A 532772 DCB Bank Ltd INE503A01015 Dev Cred ban
A 523367 Dcm Shriram Ltd(2) INE499A01024 Dcm Shriram
A 500645 Deepak Fert. INE501A01019
A 506401 Deepak Nitrit(2) INE288B01029
A 532848 Delta Corp (1) INE124G01033 Arrow Webtex
A 533137 Den Networks Ltd INE947J01015
A 532121 Dena Bank INE077A01010
A 519588 DFM Foods(2) INE456C01020
A 500119 Dhampur Suga INE041A01016
A 507717 Dhanuka Agrit(2) INE435G01025 Dhanuka Pest
A 540047 Dilip Buildcon INE917M01012
A 540701 Dishman Carbogen Amics INE385W01011
A 532488 Divi's Lab.(2) INE361B01024
A 540699 Dixon Techno(2) INE935N01020
A 532868 Dlf Ltd(2) INE271C01023
A 541403 Dollar Indus.(2) INE325C01035
A 539524 DR Lal Pathlabs INE600L01024
A 500124 DR Reddys Lab(5) INE089A01023
A 523618 Dredging Cor INE506A01018
A 532610 Dwarkes Sugar(1) INE366A01041
A 532927 Eclerx Serv.Ltd INE738I01010
A 532922 Edelweiss Fin(1) INE532F01054 Edelweiss Ca
A 505200 Eicher Motor(1) INE066A01021
A 500125 Eid Parry(1) INE126A01031
A 500840 Eih Ltd(2) INE230A01023
A 500123 Elantas Beck In INE280B01018 Schenec.Beck
A 522074 Elgi Equipment(1) INE285A01027
A 531162 Emami Ltd(1) INE548C01032 Himani Ltd.
A 540153 Endurance Tech. INE913H01037
A 532178 Engineers(i)(5) INE510A01028
A 500135 EPL Limited(2) INE255A01020 Essel Propp(
A 539844 Equitas Holdings INE988K01017
A 540596 Eris Lifescienc(1) INE406M01024
A 500133 Esab India INE284A01012
A 500495 Escorts Ltd INE042A01014
A 531508 Eveready Ind(5) INE128A01029
A 500650 Excel Ind.(5) INE369A01029
A 500086 Exide Ind(1) INE302A01020 Chloride Ltd
A 531599 Fdc Ltd(1) INE258B01022
A 505744 Fed Mog Goetze INE529A01010 Goetze India
A 500469 Federal Bank(2) INE171A01029
A 541557 Fine Organic Indus. INE686Y01026
A 500144 Finolex Cabl(2) INE235A01022
A 500940 Finolex Ind.(2) INE183A01024
A 532809 Firstsource Solut INE684F01012
A 500033 Force Motor INE451A01017
A 532843 Fortis Healthcare INE061F01013
A 533400 Future Consum.(6) INE220J01025 Future Consu
A 536507 Future Lifestyl(2) INE452O01016
A 540064 Future Retail Lt(2) INE752P01024
A 505714 Gabriel Ind(1) INE524A01029
A 532155 Gail (I) Ltd INE129A01019
A 540935 Galaxy surfactants INE600K01018
A 542011 Garden Reach ShipBld INE382Z01011
A 509557 Garware Technical INE276A01018 Garware Wall
A 532622 Gateway Dist INE852F01015
A 532345 Gati Ltd(2) INE152B01027 Gati Corpn.
A 532309 GE Power India Lt INE878A01011 Alstom India
A 500620 GE Shiping INE017A01032
A 522275 GE T&D India(2) INE200A01026 Alstom T&D (
A 540755 General Insur.Corp(5) INE481Y01014
A 500173 GFL Ltd(1) INE538A01037 Guj.Flouro(1
A 511676 GIC Housing INE289B01019
A 507815 Gillette Ind INE322A01010 Ind.Shaving
A 500660 Glaxo Ltd. INE159A01016
A 500676 Glaxosmithkl INE264A01014 Smithkl.Co
A 532296 Glenmark Phar(1) INE935A01035
A 505255 Gmm Pfaudler(2) INE541A01023 Guj.Machine
A 532754 Gmr Infrastr(1) INE776C01039
A 500163 Godfrey Phil(2) INE260B01028
A 540743 Godrej Agrovet INE850D01014
A 532424 Godrej Consum(1) INE102D01028
A 500164 Godrej Indus(1) INE233A01035 Godrej Soaps
A 533150 Godrej Propert(5) INE484J01027
A 500168 Goodyear (I) INE533A01012
A 532482 Granules(i)(1) INE101D01020
A 509488 Graphite Ind(2) INE371A01025 Carbon Everf
A 500300 Grasim Ind.(2) INE047A01021
A 501455 Greaves Cotton(2) INE224A01026 Greaves Ltd.
A 526797 Greenply Ind(1) INE461C01038
A 506076 Grindwell Nort(5) INE536A01023
A 511288 Gruh Finance(2) INE580B01029
A 530001 Guj.Alkali INE186A01019
A 524226 Guj.Amb.Exp(1) INE036B01030
A 500171 Guj.H.Chem INE539A01019
A 517300 Guj.Ind.Pow. INE162A01010
A 532181 Guj.Mineral(2) INE131A01031
A 500670 Guj.Narmada INE113A01013
A 532702 Guj.Petronet INE246F01010
A 500690 Guj.St.Fe.Ch(2) INE026A01025
A 542812 Gujarat Fluoroch(1) INE09N301011
A 539336 Gujarat Gas Ltd(2 INE844O01030
A 533248 Gujarat Pipavav INE517F01014
A 538567 Gulf Oil Lubri(2) INE635Q01029
A 541019 H.G.Infra Eng. INE926X01010
A 533162 Hathway Cable(2) INE982F01036
A 531531 Hatsun Agro(1) INE473B01035 Hatsun Milk
A 517354 Havells Ind(1) INE176B01034
A 508486 Hawkins Cook INE979B01015
A 532281 HCL Techno(2) INE860A01027
A 541729 HDFC Asset Manag(5) INE127D01025
A 500180 HDFC Bank(2) INE040A01026
A 540777 HDFC Insurance Co INE795G01014 HDFC Standar
A 500010 HDFC(2) INE001A01036
A 539787 Healthcare Global Ent INE075I01017
A 509631 HEG Ltd. INE545A01016
A 500292 Heideberg Cement INE578A01017 Mys Cement
A 519552 Heritage Food(5) INE978A01027
A 500182 Hero Motocorp(2) INE158A01026 Hero Honda(2
A 524669 Hester Biosciences INE782E01017 Hester Pharm
A 532129 Hexaware Tec(2) INE093A01033 Aptech Ltd.
A 524735 Hikal Ltd(2) INE475B01022
A 509675 HIL Ltd INE557A01011 Hyd.Indus
A 500183 Him.Fut.Comm(1) INE548A01028
A 500184 Himadri Spec(1) INE019C01026 Himadri Chem
A 514043 Himatsing Seid(5) INE049A01027
A 500185 Hind.Const(1) INE549A01026
A 513599 Hind.Copper (5) INE531E01026
A 500186 Hind.Oil Exp INE345A01011
A 500104 Hind.Petrol INE094A01015
A 500696 Hind.Unilever(1) INE030A01027 Hind.Lever(1
A 500188 Hind.Zinc(2) INE267A01025
A 500440 Hindalco(1) INE038A01020
A 541154 Hindustan Aeronautics INE066F01012
A 522064 Honda India Power Prod INE634A01018 Honda Siel
A 517174 Honeywell INE671A01010 Tata Honeywl
A 540530 Housing & Urban Dev INE031A01017
A 509820 Huntamaki India L INE275B01026 Huntamaki PP
A 532174 ICICI Bankin(2) INE090A01021
A 540716 ICICI Lombard INE765G01017
A 540133 ICICI Prudent Life INE726G01019
A 541179 ICICI Securit(5) INE763G01038
A 532835 Icra Ltd INE725G01011
A 500116 IDBI Ltd. INE008A01015 Industrial D
A 539437 IDFC FIRST BANK L INE092T01019 IDFC BANK LT
A 532659 Idfc Ltd INE043D01016
A 505726 IFB Ind.Ltd. INE559A01017
A 500106 IFCI Ltd. INE039A01010
A 517380 Igarshi Mot INE188B01013 CG Igarshi M
A 532636 IIFL Holding (2) INE530B01024 India Infoli
A 542773 IIFL Securities(2) INE489L01022
A 542772 IIFL Wealth Manag(2) INE466L01020
A 500201 Ind.Glycols INE560A01015
A 504741 Ind.Hume Pip(2) INE323C01030
A 530005 India Cement INE383A01012
A 532189 India Touris INE353K01014
A 535789 Indiabul Hous.F(2) INE148I01020
A 532832 Indiabul Real(2) INE069I01010
A 542726 IndiaMart Intermesh INE933S01016
A 532814 Indian Bank INE562A01011
A 540750 Indian Energy Exc(1) INE022Q01020
A 500850 Indian Hotel(1) INE053A01029
A 530965 Indian Oil Corp INE242A01010
A 532388 Indian Over. INE565A01014
A 521016 Indo Count(2) INE483B01026
A 532612 Indoco Rem(2) INE873D01024
A 541336 Indostar Capital Fin INE896L01010
A 532514 Indra Gas(2) INE203G01027
A 534816 Indus Towers INE121J01017 Bharti Infra
A 532187 Indusind Bnk INE095A01012
A 539807 Infibeam Avenues(1) INE483S01020 Infibeam Inc
A 532777 Info Edge INE663F01024
A 500209 Infosys Ltd(5) INE009A01021
A 500210 Ingersoll INE177A01018
A 532706 Inox Leisure INE312H01016
A 532851 Insecticides INEO7OI01018
A 538835 Intellect Design(5) INE306R01017
A 539448 InterGlobe Aviation INE646L01027
A 524164 IOL Chemical & Pharma INE485C01011 IndsOrganic
A 500214 ION Exchange INE570A01014
A 524494 Ipca Lab.Ltd (2) INE571A01020
A 532947 IRB Infrastruct INE821I01014
A 541956 Ircon Inter.Ltd(2 INE962Y01021
A 542830 IRCTC Railway Cateri INE335Y01020
A 533033 ISGEC Heavy Eng.( INE858B01029
A 500875 Itc Ltd(1) INE154A01025
A 509496 Itd Cement(1) INE686A01026
A 523610 ITI Ltd. INE248A01017
A 532209 J&K Bank(1) INE168A01041
A 532940 J.Kumar Infrapro(5) INE576I01022
A 532705 Jagran Praka(2) INE199G01027
A 512237 Jai Corp Ltd(1) INE070D01027
A 500219 Jain Irri(2) INE175A01038
A 532532 Jaipra.Associt(2) INE455F01025
A 520051 Jamna Auto(1) INE039C01032
A 506943 Jb Chemical(2) INE572A01028
A 500227 Jind.Polys INE197D01010
A 500378 Jind.Saw(2) INE324A01024 Saw Pipes
A 539597 Jind.Stain(His)(1) INE455T01018
A 532508 Jind.Stainl(2) INE220G01021 Jsl Stainles
A 532286 Jind.Steel Pow(1) INE749A01030
A 532644 Jk Cement INE823G01014
A 500380 JK Laksh.Cem(5) INE786A01032
A 532162 JK Paper INE789E01012 Central Pulp
A 530007 JK Tyre & Ind(2) INE573A01042
A 523405 JM Financial(1) INE780C01023
A 522263 JMC Projects(2) INE890A01024
A 523398 Johnson Controls- INE782A01015 Hitachi Home
A 532642 JSW Holdings Ltd INE824G01012 Jindal South
A 500228 JSW Steel(1) INE019A01038
A 520057 JTEKT India Ltd(1 INE643A01035 Sona Koyo St
A 533155 Jubilant Foodwork INE797F01012
A 530019 Jubilant Pharmova INE700A01033 Jubilant Lif
A 535648 Just Dial Ltd INE599M01018
A 532926 Jyothy Lab.(1) INE668F01031 Jyothy Labor
A 500233 Kajaria Cer(1) INE217B01028
A 532268 Kale Consul. INE793A01012
A 522287 Kalpa.Power(2) INE220B01022
A 500235 Kalyani Stel(5) INE907A01026
A 532468 Kama Holdings Ltd INE411F01010 Srf Polymers
A 500165 Kansai Nerolec(1) INE531A01024
A 532652 Karnatka Bank INE614B01018 590002
A 532899 Kaveri Seed (2) INE455I01029
A 532714 Kec Intern(2) INE389H01022
A 517569 Kei Indust(2) INE878B01027
A 505890 Kenna Metal W INE717A01029
A 532967 Kiri Industries INE415I01015 Kiridyes & C
A 533293 Kirl.OilEng(2) INE146L01010
A 521248 Kitex Garment(1) INE602G01020
A 532942 KNR Constr.(2) INE634I01029
A 532924 Kolte-Patil Dep INE094I01018
A 500247 Kotak Bank(5) INE237A01028
A 542651 Kpit Techno.Ltd INE04I401011
A 532889 KPR.Mill ltd(1 INE930H01031
A 530813 KRBL Ltd.(1) INE001B01026 Khushi Ram B
A 500249 KSB Ltd INE999A01015 KSB Pumps
A 533519 L&T Fin.Holding INE498L01015
A 540115 L&T Techno.Service(2) INE010V01017
A 534690 Laksh.Vilas Bank INE694C01018
A 500252 Lakshmi Mach(10) INE269B01029
A 526947 La-opala Rg(2) INE059D01020
A 540005 Larsen & Toubro Inf(1) INE214T01019
A 500510 Larsen Tubro(2) INE018A01030
A 540222 Laurus Labs Ltd(2 INE947Q01028
A 541233 Lemon Tree Hotels INE970X01018
A 500250 LG Balkrishnan INE337A01034 Lg Balabros
A 500253 LIC Hous.Fin(2) INE115A01026
A 531633 Lincoln Pharm INE405C01035
A 523457 Linde india Ltd. INE473A01011 BOC (i) Ltd.
A 532783 Lt Foods Ltd(1) INE818H01020
A 500257 Lupin Ltd.(2) INE326A01037 Lupin Chem
A 539542 Lux Indus.Ltd(2) INE150G01020
A 532720 M&M Finance(2) INE774D01024
A 500266 Mah.Scooter INE288A01013
A 500265 Maha.Seamless(5) INE271B01025
A 539957 Mahanagar Gas Ltd INE002S01010
A 532756 Mahind CIE Automo INE536H01010 Mahind Forgi
A 532313 Mahind Lifes Dev INE813A01018 Mahindra Ges
A 500520 Mahindra & Mah(5) INE101A01026
A 533088 Mahindra Holidays INE998I01010
A 540768 Mahindra Logistics INE766P01016
A 531213 Manappuram Fin(2) INE522D01027 Manappuram G
A 502157 Mangalam Cem INE347A01017
A 531642 Marico (1) INE196A01026
A 524404 Marksans(1) INE750C01026 Tasc Pharma.
A 532500 Maruti Suzuki(5) INE585B01010 Maruti Udyog
A 540749 MAS Financial Serv INE348L01012
A 523704 Mastek Ltd(5) INE759A01021
A 500271 Max Fin.Serv.(2) INE180A01020 Max India(2)
A 539981 Max India Ltd(2) INE153U01017
A 522249 Mayur Uniquo(5) INE040D01038
A 532865 Meghmani Organ(1) INE974H01013
A 542650 Metropolis Health(2) INE112L01020
A 538962 Minda Corpo.Lt(2) INE842C01021
A 532539 Minda Ind(2) INE405E01023
A 532819 Mindtree LTD INE018I01017
A 513377 Mineral&Metl(1) INE123F01029 MMTC Ltd
A 541195 Mishra Dhatu Nigam INE099Z01011
A 533286 MOIL LTD INE490G01020
A 533080 Mold-Tek Packaging INE893J01011 MoldTek Plas
A 524084 Monsanto (I) INE274B01011 Monsanto Che
A 500288 Morepen Lab(2) INE083A01026
A 517334 Motherson Sumi(1) INE775A01035
A 532892 Motilal OswalF(1) INE338I01027
A 526299 Mphasis Bfl INE356A01018 BFL Software
A 500290 MRF Ltd. INE883A01011
A 500109 MRPL INE103A01014
A 542597 MSTC Limited INE255X01014
A 534091 Multi Commodty.Exch INE745G01035
A 533398 Muthoot Finance INE414G01012
A 539551 Narayana Hrudayalaya INE410P01011
A 523630 Nat.Fertiliser INE870D01012
A 526371 Nat.Mineral(1) INE584A01023
A 500298 Nat.Peroxide INE585A01020
A 524816 Natco Pharma INE987B01018
A 537291 Nath Bio-Genes(I) INE448G01010
A 532234 National Alum(5) INE139A01034
A 513023 Nava Bharat V(2) INE725A01022
A 532504 Navin Fluori(2) INE048G01026
A 508989 Navneet Education(2) INE060A01024 Navneet Publ
A 534309 NBCC(India) Lt(1) INE095N01031
A 500294 NCC Ltd(2) INE868B01028 Nagar.Constr
A 502168 NCL Ind. INE732C01016
A 542665 Neogen Chemicals INE136S01016
A 505355 NESCO LTD INE317F01027 New Std.Eng.
A 500790 Nestle (I)Ltd. INE239A01016
A 532798 Network18 Med(5) INE870H01013 Network18 Fi
A 524558 Neuland Lab. INE794A01010
A 540900 Newgen Softw.Tech INE619B01017
A 533098 NHPC LIMITED INE848E01016
A 500304 NIIT Ltd.(2) INE161A01038
A 523385 Nilkamal Pls INE310A01015
A 540767 Nippon Life India Asset INE298J01013 Reliance Nip
A 500307 Nirlon INE910A01012
A 513683 NLC India Ltd INE589A01014 Neyveli Lign
A 500730 NOCIL INE163A01018
A 500672 Novartis India(5) INE234A01025 Hind.ciba
A 530367 Nrb Bearing(2) INE349A01021
A 532555 NTPC Ltd INE733E01010
A 531209 Nucle.Soft E INE096B01018
A 533273 Oberoi Realty INE093I01010
A 533106 OIL INDIA LTD INE274J01014
A 532880 Omaxe Ltd INE800H01010
A 500312 ONGC Corp.(5) INE213A01029
A 532466 Oracle Financ(5) INE881D01027 Iflex Solu(5
A 535754 Orient Cement(1) INE876N01018
A 541301 Orient Electric INE142Z01019
A 506579 Orient.Carb. INE321D01016
A 500315 Oriental Bank INE141A01014
A 532827 Page Industr INE761H01022
A 532900 Paisalo Digital INE420C01042 SE Investmen
A 531349 Panacea Biot(1) INE922B01023
A 539889 Parag Milk Foods INE883N01014
A 531120 Patel Engg.(1) INE244B01030
A 534809 PC Jeweller Ltd INE785M01013
A 533179 Persistent System INE262H01013
A 532522 Petronet Lng INE347G01014
A 500680 Pfizer Ltd. INE182A01018
A 506590 Phil.Carbon(2) INE602A01023
A 503100 Phoenix Mill(2) INE211B01039
A 523642 PI Indus.Ltd(1) INE603J01030
A 500331 Pidilite Ind(1) INE318A01026
A 539883 Pilani Inv.&Ind INE417C01014
A 500302 Piramal Enter(2) INE140A01024 Piramal Heal
A 540173 PNB Housing Fin INE572E01012
A 539150 PNC Infratech (2) INE195J01029
A 531768 Poly Medicur(5) INE205C01021
A 542652 Polycab India INE455K01017
A 524051 Polyplex INE633B01018
A 524000 Poonawalla Fincor INE511C01022 Magma Fincor
A 532810 Power Finan INE134E01011
A 532898 Power Grid Corp. INE752E01010
A 539302 Power Mech Proj. INE211R01019
A 506022 Prakash Ind. INE603A01013
A 540724 Prataap Snacks (5) INE393P01035
A 533274 Prestige EstatePr INE811K01011
A 540293 Pricol Limited(1) INE726V01018
A 542907 Prince Pipes&Fittings INE689W01016
A 500338 Prism Johnson INE010A01011 Prism Cement
A 530117 Privi Speciality INE959A01019 Fairchem Spe
A 500126 Procter & Gamble INE199A01012 Merck Ltd
A 500459 Procter&gamb INE179A01014
A 540544 PSP Projects Ltd INE488V01015
A 532524 PTC India INE877F01012
A 533344 PTC India Financ INE560K01014
A 532461 Punj.Nat.Bank(2) INE160A01022
A 532689 Pvr Ltd INE191H01014
A 539978 Quess Corp Ltd INE615P01015
.
.
.
A 532648 Yes Bank ltd(2) INE528G01027
A 505537 Zee Enter(1) INE256A01028 Zee Telefilm
A 504067 Zensar Tech.(2) INE520A01027
A 531335 Zydus Wellness INE768C01010 Carnation He
Reference
You can find a couple of relevant discussions in:
Ways to deal with #document under iframe
Switch to an iframe through Selenium and python
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element while trying to click Next button with selenium
selenium in python : NoSuchElementException: Message: no such element: Unable to locate element

How to resolve pandas length error for rows/columns

I raised an earlier SO question here and was lucky to get an answer from @Scott Boston.
However, I am asking another question about the error ValueError: Columns must be same length as key. I am reading a text file in which the rows do not all have the same number of fields. I tried googling but did not find an answer, and I don't want those rows to be skipped.
Error
b'Skipping line 2: expected 13 fields, saw 14\nSkipping line 5: expected 13 fields, saw 14\nSkipping line 6: expected 13 fields, saw 16\nSkipping line 7: expected 13 fields, saw 14\nSkipping line 8: expected 13 fields, saw 15\nSkipping line 9: expected 13 fields, saw 14\nSkipping line 20: expected 13 fields, saw 19\nSkipping line 21: expected 13 fields, saw 16\nSkipping line 23: expected 13 fields, saw 14\nSkipping line 24: expected 13 fields, saw 16\nSkipping line 27: expected 13 fields, saw 14\n'
My pandas dataframe generator
#!/usr/bin/python3
import pandas as pd
#
cvc_file = pd.read_csv('kids_cvc',header=None,error_bad_lines=False)
cvc_file[['cols', 0]] = cvc_file[0].str.split(':', expand=True) #Split first column on ':'
df = cvc_file.set_index('cols').transpose() #set_index and transpose
print(df)
Result
$ ./read_cvc.py
b'Skipping line 2: expected 13 fields, saw 14\nSkipping line 5: expected 13 fields, saw 14\nSkipping line 6: expected 13 fields, saw 16\nSkipping line 7: expected 13 fields, saw 14\nSkipping line 8: expected 13 fields, saw 15\nSkipping line 9: expected 13 fields, saw 14\nSkipping line 20: expected 13 fields, saw 19\nSkipping line 21: expected 13 fields, saw 16\nSkipping line 23: expected 13 fields, saw 14\nSkipping line 24: expected 13 fields, saw 16\nSkipping line 27: expected 13 fields, saw 14\n'
cols ab ad an ed eg et en eck ell it id ig im ish ob og ock ut ub ug um un ud uck ush
0 cab bad ban bed beg bet den beck bell bit bid big dim fish cob bog dock but cub bug bum bun bud buck gush
1 dab dad can fed keg get hen deck cell fit did dig him dish gob cog lock cut hub dug gum fun cud duck hush
2 gab had fan led leg jet men neck dell hit hid fig rim wish job dog rock gut nub hug hum gun dud luck lush
3 jab lad man red peg let pen peck jell kit kid gig brim swish lob fog sock hut rub jug mum nun mud muck mush
4 lab mad pan wed NaN met ten check sell lit lid jig grim NaN mob hog tock jut sub lug sum pun spud puck rush
5 nab pad ran bled NaN net then fleck tell pit rid pig skim NaN rob jog block nut tub mug chum run stud suck blush
File contents
$ cat kids_cvc
ab: cab, dab, gab, jab, lab, nab, tab, blab, crab, grab, scab, stab, slab
at: bat, cat, fat, hat, mat, pat, rat, sat, vat, brat, chat, flat, gnat, spat
ad: bad, dad, had, lad, mad, pad, sad, tad, glad
an: ban, can, fan, man, pan, ran, tan, van, clan, plan, scan, than
ag: bag, gag, hag, lag, nag, rag, sag, tag, wag, brag, drag, flag, snag, stag
ap: cap, gap, lap, map, nap, rap, sap, tap, yap, zap, chap, clap, flap, slap, snap, trap
am: bam, dam, ham, jam, ram, yam, clam, cram, scam, slam, spam, swam, tram, wham
ack: back, hack, jack, lack, pack, rack, sack, tack, black, crack, shack, snack, stack, quack, track
ash: bash, cash, dash, gash, hash, lash, mash, rash, sash, clash, crash, flash, slash, smash
ed: bed, fed, led, red, wed, bled, bred, fled, pled, sled, shed
eg: beg, keg, leg, peg
et: bet, get, jet, let, met, net, pet, set, vet, wet, yet, fret
en: den, hen, men, pen, ten, then, when
eck: beck, deck, neck, peck, check, fleck, speck, wreck
ell: bell, cell, dell, jell, sell, tell, well, yell, dwell, shell, smell, spell, swell
it: bit, fit, hit, kit, lit, pit, sit, wit, knit, quit, slit, spit
id: bid, did, hid, kid, lid, rid, skid, slid
ig: big, dig, fig, gig, jig, pig, rig, wig, zig, twig
im: dim, him, rim, brim, grim, skim, slim, swim, trim, whim
ip: dip, hip, lip, nip, rip, sip, tip, zip, chip, clip, drip, flip, grip, ship, skip, slip, snip, trip, whip
ick: kick, lick, nick, pick, sick, tick, wick, brick, chick, click, flick, quick, slick, stick, thick, trick
ish: fish, dish, wish, swish
in: bin, din, fin, pin, sin, tin, win, chin, grin, shin, skin, spin, thin, twin
ot: cot, dot, got, hot, jot, lot, not, pot, rot, tot, blot, knot, plot, shot, slot, spot
ob: cob, gob, job, lob, mob, rob, sob, blob, glob, knob, slob, snob
og: bog, cog, dog, fog, hog, jog, log, blog, clog, frog
op: cop, hop, mop, pop, top, chop, crop, drop, flop, glop, plop, shop, slop, stop
ock: dock, lock, rock, sock, tock, block, clock, flock, rock, shock, smock, stock
ut: but, cut, gut, hut, jut, nut, rut, shut
ub: cub, hub, nub, rub, sub, tub, grub, snub, stub
ug: bug, dug, hug, jug, lug, mug, pug, rug, tug, drug, plug, slug, snug
um: bum, gum, hum, mum, sum, chum, drum, glum, plum, scum, slum
un: bun, fun, gun, nun, pun, run, sun, spun, stun
ud: bud, cud, dud, mud, spud, stud, thud
uck: buck, duck, luck, muck, puck, suck, tuck, yuck, chuck, cluck, pluck, stuck, truck
ush: gush, hush, lush, mush, rush, blush, brush, crush, flush, slush
Note:
pandas treats the first row, which has 13 values, as the template and skips every row that has more than 13 fields.
I couldn't figure out a pandas way to extend the columns, but converting the rows to a dictionary made things easier.
ss = '''
ab: cab, dab, gab, jab, lab, nab, tab, blab, crab, grab, scab, stab, slab
at: bat, cat, fat, hat, mat, pat, rat, sat, vat, brat, chat, flat, gnat, spat
ad: bad, dad, had, lad, mad, pad, sad, tad, glad
.......
un: bun, fun, gun, nun, pun, run, sun, spun, stun
ud: bud, cud, dud, mud, spud, stud, thud
uck: buck, duck, luck, muck, puck, suck, tuck, yuck, chuck, cluck, pluck, stuck, truck
ush: gush, hush, lush, mush, rush, blush, brush, crush, flush, slush
'''.strip()
with open ('kids.cvc','w') as f: f.write(ss) # write data file
######################################
import pandas as pd
dd = {}
maxcnt=0
with open('kids.cvc') as f:
    lines = f.readlines()
for line in lines:
    line = line.strip() # remove \n
    len1 = len(line) # words have leading space
    line = line.replace(' ','')
    cnt = len1 - len(line) # get word (space) count
    if cnt > maxcnt: maxcnt = cnt # max word count
    rec = line.split(':') # header : words
    dd[rec[0]] = rec[1].split(',') # split words
for k in dd:
    dd[k] = dd[k] + ['']*(maxcnt-len(dd[k])) # add extra values to match max column
df = pd.DataFrame(dd) # convert dictionary to dataframe
print(df.to_string(index=False))
Output
ab at ad an ag ap am ack ash ed eg et en eck ell it id ig im ip ick ish in ot ob og op ock ut ub ug um un ud uck ush
cab bat bad ban bag cap bam back bash bed beg bet den beck bell bit bid big dim dip kick fish bin cot cob bog cop dock but cub bug bum bun bud buck gush
dab cat dad can gag gap dam hack cash fed keg get hen deck cell fit did dig him hip lick dish din dot gob cog hop lock cut hub dug gum fun cud duck hush
gab fat had fan hag lap ham jack dash led leg jet men neck dell hit hid fig rim lip nick wish fin got job dog mop rock gut nub hug hum gun dud luck lush
jab hat lad man lag map jam lack gash red peg let pen peck jell kit kid gig brim nip pick swish pin hot lob fog pop sock hut rub jug mum nun mud muck mush
lab mat mad pan nag nap ram pack hash wed met ten check sell lit lid jig grim rip sick sin jot mob hog top tock jut sub lug sum pun spud puck rush
nab pat pad ran rag rap yam rack lash bled net then fleck tell pit rid pig skim sip tick tin lot rob jog chop block nut tub mug chum run stud suck blush
tab rat sad tan sag sap clam sack mash bred pet when speck well sit skid rig slim tip wick win not sob log crop clock rut grub pug drum sun thud tuck brush
blab sat tad van tag tap cram tack rash fled set wreck yell wit slid wig swim zip brick chin pot blob blog drop flock shut snub rug glum spun yuck crush
crab vat glad clan wag yap scam black sash pled vet dwell knit zig trim chip chick grin rot glob clog flop rock stub tug plum stun chuck flush
grab brat plan brag zap slam crack clash sled wet shell quit twig whim clip click shin tot knob frog glop shock drug scum cluck slush
scab chat scan drag chap spam shack crash shed yet smell slit drip flick skin blot slob plop smock plug slum pluck
stab flat than flag clap swam snack flash fret spell spit flip quick spin knot snob shop stock slug stuck
slab gnat snag flap tram stack slash swell grip slick thin plot slop snug truck
spat stag slap wham quack smash ship stick twin shot stop
snap track skip thick slot
trap slip trick spot
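The padding step can also be done without pandas. The following is only a sketch, assuming the same "header: word, word" line format as the data file; the three sample lines are a shortened, made-up subset. itertools.zip_longest fills the short columns with empty strings, which is what the ['']*(maxcnt-len(...)) line does by hand.

```python
from itertools import zip_longest

# Shortened sample in the same "header: words" format as the data file
text = """\
ab: cab, dab, gab
at: bat, cat
uck: buck, duck, luck, muck
"""

columns = {}
for line in text.splitlines():
    header, _, words = line.partition(":")
    columns[header.strip()] = [w.strip() for w in words.split(",")]

# zip_longest turns the columns into rows and pads short columns with ""
rows = list(zip_longest(*columns.values(), fillvalue=""))

# One shared column width: the longest header or word, plus one space
cells = list(columns) + [w for col in columns.values() for w in col]
width = max(len(c) for c in cells) + 1

print("".join(h.ljust(width) for h in columns))
for row in rows:
    print("".join(w.ljust(width) for w in row))
```

Because every cell is padded to the same width, the empty cells keep their place and the columns stay aligned.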

Extracting countries from string

I am trying to go through a column of a data frame in Python 3. From each row I need to extract the countries that are mentioned, once for each time they are mentioned.
i.e. if I have this row:
['[Aydemir, Deniz', ' Gunduz, Gokhan', ' Asik, Nejla] Bartin Univ, Fac Forestry, Dept Forest Ind Engn, TR-74100 Bartin, Turkey', ' [Wang, Alice] Lulea Univ Technol, Wood Technol, Skelleftea, Sweden']
it needs to output a list: ['Turkey', 'Sweden']
and if I have this row:
['[Fang, Qun', ' Cui, Hui-Wang] Zhejiang A&F Univ, Sch Engn, Linan 311300, Peoples R China', ' [Du, Guan-Ben] Southwest Forestry Univ, Kunming 650224, Yunnan, Peoples R China']
the output should be: ['China', 'China'].
I have written this code but it is not working as I want to:
from geotext import GeoText
sentence = df.iloc[0,0]
places = GeoText(sentence)
print(places.countries)
It prints each country only once, and in some cases it doesn't recognize the abbreviation USA. Can you help me figure out what to do?
l = [['[Aydemir, Deniz\', \' Gunduz, Gokhan\', \' Asik, Nejla] Bartin Univ, Fac Forestry, Dept Forest Ind Engn, TR-74100 Bartin, Turkey\', \' [Wang, Alice] Lulea Univ Technol, Wood Technol, Skelleftea, Sweden',1990],
['[Fang, Qun\', \' Cui, Hui-Wang] Zhejiang A&F Univ, Sch Engn, Linan 311300, Peoples R China\', \' [Du, Guan-Ben] Southwest Forestry Univ, Kunming 650224, Yunnan, Peoples R China',2005],
['[Blumentritt, Melanie\', \' Gardner, Douglas J.\', \' Shaler, Stephen M.] Univ Maine, Sch Resources, Orono, ME USA\', \' [Cole, Barbara J. W.] Univ Maine, Dept Chem, Orono, ME 04469 USA',2012]]
dataf = pd.DataFrame(l, columns = ['Authors', 'Year'])
I also tried this code, but I have the same problem: it doesn't give all the countries, only one per row:
import pycountry
def find_country(n):
    for c in pycountry.countries:
        if str(c.name).lower() in n.lower():
            return c.name
country1 = (dataf['Authors']
            .replace(r"\bUSA\b", "United States", regex=True)
            .apply(lambda x: find_country(x)))
USA does not seem to be detected correctly by geotext - it's worth trying to raise an issue with that package. As a workaround here, I replace USA with United States, which is correctly detected.
import geotext
df = (dataf['Authors']
      .replace(r"\bUSA\b", "United States", regex=True)
      .apply(lambda x: geotext.GeoText(x).countries)
      )
I'm not sure what you were doing before, but this returns the list of countries for each row of Authors, including duplicates.
0 [Turkey, Sweden]
1 [China, China]
2 [United States, United States]
Name: Authors, dtype: object
As mentioned in the comment, if you want to have an actual list of lists, just add tolist() to the end.
df.tolist()
[['Turkey', 'Sweden'], ['China', 'China'], ['United States', 'United States']]
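If installing geotext is not an option, the same idea can be sketched with the standard library alone. This is only an illustration, not how geotext works: COUNTRY_ALIASES is a hypothetical, hand-written lookup that covers just the spellings in the sample rows, and the rows are shortened versions of the ones in the question.

```python
import re
from collections import Counter

# Hypothetical lookup: spelling as it appears in the text -> name to report
COUNTRY_ALIASES = {
    "Turkey": "Turkey",
    "Sweden": "Sweden",
    "Peoples R China": "China",
    "USA": "United States",
}

# Longest aliases first, so a multi-word alias wins over any shorter one
_aliases = sorted(COUNTRY_ALIASES, key=len, reverse=True)
COUNTRY_RE = re.compile(r"\b(?:" + "|".join(re.escape(a) for a in _aliases) + r")\b")

def find_countries(text):
    """Return every country mention in order, duplicates included."""
    return [COUNTRY_ALIASES[m] for m in COUNTRY_RE.findall(text)]

rows = [
    "[Aydemir, Deniz] Bartin Univ, TR-74100 Bartin, Turkey; [Wang, Alice] Lulea Univ Technol, Skelleftea, Sweden",
    "[Fang, Qun] Zhejiang A&F Univ, Linan 311300, Peoples R China; [Du, Guan-Ben] Southwest Forestry Univ, Kunming 650224, Yunnan, Peoples R China",
    "[Blumentritt, Melanie] Univ Maine, Orono, ME USA; [Cole, Barbara J. W.] Univ Maine, Dept Chem, Orono, ME 04469 USA",
]

for row in rows:
    found = find_countries(row)
    print(found, dict(Counter(found)))
```

collections.Counter gives the number of times each country is mentioned in a row, which the question also asks for.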

Creating a table using a list of strings

I need to convert a list of lists of strings into a three-column table where the first column is 1 space wider than the longest string. I have figured out how to find the longest string and its length, but getting the table to form has been quite tricky. Here is the program with the lists in it; it shows that the longest one is 26 characters long.
def main():
    mycities = [['Cape Girardeau', 'MO', '63780'], ['Columbia', 'MO', '65201'],
                ['Kansas City', 'MO', '64108'], ['Rolla', 'MO', '65402'],
                ['Springfield', 'MO', '65897'], ['St Joseph', 'MO', '64504'],
                ['St Louis', 'MO', '63111'], ['Ames', 'IA', '50010'], ['Enid',
                'OK', '73773'], ['West Palm Beach', 'FL', '33412'],
                ['International Falls', 'MN', '56649'], ['Frostbite Falls',
                'MN', '56650']]
    col_width = max(len(item) for sub in mycities for item in sub)
    print(col_width)
main()
Now I just need to get it to print like this:
Cape Girardeau      MO 63780
Columbia            MO 65201
Kansas City         MO 64108
Springfield         MO 65897
St Joseph           MO 64504
St Louis            MO 63111
Ames                IA 50010
Enid                OK 73773
West Palm Beach     FL 33412
International Falls MN 56649
Frostbite Falls     MN 56650
You're off to a good start. Given the structure of your lists, you can use the col_width you calculated to work out how many spaces to append after each city name:
for city in mycities:
    # pad the city name with the number of spaces required
    city_padded = city[0] + " " + " "*(col_width-len(city[0]))
    print(city_padded + city[1] + " " + city[2])
Given your example, this will produce:
Cape Girardeau      MO 63780
Columbia            MO 65201
Kansas City         MO 64108
Rolla               MO 65402
Springfield         MO 65897
St Joseph           MO 64504
St Louis            MO 63111
Ames                IA 50010
Enid                OK 73773
West Palm Beach     FL 33412
International Falls MN 56649
Frostbite Falls     MN 56650
Note that the original version of your question was missing commas in the sublists of your mycities variable; I've added them in an edit.
As a side note, it is convention in Python that words be separated by underscores in variable names for readability, so you might rename mycities to my_cities.
pep8 ref: (https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)
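The manual space arithmetic can also be replaced by a width in the format specification. A minimal sketch, assuming the same list-of-lists layout as in the question (shortened here); format_rows is a made-up helper name, and the width is taken from the first column only rather than from every item.

```python
my_cities = [
    ["Cape Girardeau", "MO", "63780"],
    ["Ames", "IA", "50010"],
    ["International Falls", "MN", "56649"],
]

def format_rows(rows):
    """Left-align the city in a column one space wider than the longest city."""
    width = max(len(row[0]) for row in rows) + 1
    # "<" left-aligns; the nested {width} supplies the column width at run time
    return [f"{city:<{width}}{state} {zipcode}" for city, state, zipcode in rows]

for line in format_rows(my_cities):
    print(line)
```

Computing the width from the data means the table still lines up if a longer city name is added later.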
"String name".ljust(26) will add spaces to the end of your string until it is 26 characters long. For example,
'Ames'.ljust(26) will result in 'Ames (22 spaces here)', and then the next column will print after it. If you are not sure what the longest city will be, you could replace the 26 with len(sortedCities[-1]) after sorting the city names by length. To do this, you can do sortedCities = sorted(cityListVariable, key=len)
def main():
    # create list of information
    cities = ['Cape Girardeau, MO 63780', 'Columbia, MO 65201', 'Kansas City, MO 64108', 'Rolla, MO 65402',
              'Springfield, MO 65897', 'St Joseph, MO 64504', 'St Louis, MO 63111', 'Ames, IA 50010',
              'Enid, OK 73773', 'West Palm Beach, FL 33412', 'International Falls, MN 56649', 'Frostbite Falls, MN 56650',
              'Charlotte, NC 28241', 'Upper Marlboro, MD 20774', 'Camdenton, MO 65020', 'San Fransisco, CA 94016']
    for x in cities:
        col = x.split(",")
        if len(col) == 2:
            # "City, ST 12345": split on the comma
            city = col[0].strip()
            temp = col[1].strip()
        else:
            # no single comma: fall back to fixed-width slicing
            city = x[:15].strip()
            temp = x[15:].strip()
        state = temp[:2]
        zipCode = int(temp[-5:])
        print("%-20s\t%s\t%d" % (city, state, zipCode))
main()
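One thing to watch in the code above: converting the ZIP code with int() and printing it with %d drops a leading zero, so 02134 would print as 2134. A possible variant, sketched under the assumption that every entry looks like "City, ST 12345", keeps the ZIP as a string and parses the entry with a regular expression. ADDRESS_RE and split_address are made-up names, and the Boston entry is added only to show the leading zero.

```python
import re

# "City, ST 12345": lazy city name, two-letter state, five-digit ZIP
ADDRESS_RE = re.compile(r"^(?P<city>.+?),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$")

def split_address(text):
    """Return (city, state, zip) as strings, or raise ValueError."""
    m = ADDRESS_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised address: {text!r}")
    return m.group("city"), m.group("state"), m.group("zip")

cities = ["Cape Girardeau, MO 63780", "Ames, IA 50010", "Boston, MA 02134"]
for entry in cities:
    city, state, zip_code = split_address(entry)
    # %s-style string formatting keeps the ZIP exactly as written
    print(f"{city:<20}\t{state}\t{zip_code}")
```

Raising on entries that do not match makes malformed rows visible instead of silently slicing them at a fixed position.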

Using unstack with Pandas

I am getting an exception when applying unstack, and would like to understand it.
For a reproducible example:
(to load the data: pd.DataFrame(json.loads(titanic)))
titanic
'{"home.dest":{"0":"St Louis, MO","1":"Montreal, PQ \\/ Chesterville, ON","2":"Montreal, PQ \\/ Chesterville, ON","3":"Montreal, PQ \\/ Chesterville, ON","4":"Montreal, PQ \\/ Chesterville, ON","5":"New York, NY","6":"Hudson, NY","7":"Belfast, NI","8":"Bayside, Queens, NY","9":"Montevideo, Uruguay","10":"New York, NY","11":"New York, NY","12":"Paris, France","13":null,"14":"Hessle, Yorks","15":"New York, NY","16":"Montreal, PQ","17":"Montreal, PQ","18":null,"19":"Winnipeg, MN"},"pclass":{"0":1,"1":1,"2":1,"3":1,"4":1,"5":1,"6":1,"7":1,"8":1,"9":1,"10":1,"11":1,"12":1,"13":1,"14":1,"15":1,"16":1,"17":1,"18":1,"19":1},"survived":{"0":1,"1":1,"2":0,"3":0,"4":0,"5":1,"6":1,"7":0,"8":1,"9":0,"10":0,"11":1,"12":1,"13":1,"14":1,"15":0,"16":0,"17":1,"18":1,"19":0},"name":{"0":"Allen, Miss. Elisabeth Walton","1":"Allison, Master. Hudson Trevor","2":"Allison, Miss. Helen Loraine","3":"Allison, Mr. Hudson Joshua Creighton","4":"Allison, Mrs. Hudson J C (Bessie Waldo Daniels)","5":"Anderson, Mr. Harry","6":"Andrews, Miss. Kornelia Theodosia","7":"Andrews, Mr. Thomas Jr","8":"Appleton, Mrs. Edward Dale (Charlotte Lamson)","9":"Artagaveytia, Mr. Ramon","10":"Astor, Col. John Jacob","11":"Astor, Mrs. John Jacob (Madeleine Talmadge Force)","12":"Aubart, Mme. Leontine Pauline","13":"Barber, Miss. Ellen \\"Nellie\\"","14":"Barkworth, Mr. Algernon Henry Wilson","15":"Baumann, Mr. John D","16":"Baxter, Mr. Quigg Edmond","17":"Baxter, Mrs. James (Helene DeLaudeniere Chaput)","18":"Bazzani, Miss. Albina","19":"Beattie, Mr. 
Thomson"},"sex":{"0":"female","1":"male","2":"female","3":"male","4":"female","5":"male","6":"female","7":"male","8":"female","9":"male","10":"male","11":"female","12":"female","13":"female","14":"male","15":"male","16":"male","17":"female","18":"female","19":"male"},"age":{"0":29.0,"1":0.92,"2":2.0,"3":30.0,"4":25.0,"5":48.0,"6":63.0,"7":39.0,"8":53.0,"9":71.0,"10":47.0,"11":18.0,"12":24.0,"13":26.0,"14":80.0,"15":null,"16":24.0,"17":50.0,"18":32.0,"19":36.0},"sibsp":{"0":0,"1":1,"2":1,"3":1,"4":1,"5":0,"6":1,"7":0,"8":2,"9":0,"10":1,"11":1,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":0,"19":0},"parch":{"0":0,"1":2,"2":2,"3":2,"4":2,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":1,"17":1,"18":0,"19":0},"ticket":{"0":"24160","1":"113781","2":"113781","3":"113781","4":"113781","5":"19952","6":"13502","7":"112050","8":"11769","9":"PC 17609","10":"PC 17757","11":"PC 17757","12":"PC 17477","13":"19877","14":"27042","15":"PC 17318","16":"PC 17558","17":"PC 17558","18":"11813","19":"13050"},"fare":{"0":211.3375,"1":151.55,"2":151.55,"3":151.55,"4":151.55,"5":26.55,"6":77.9583,"7":0.0,"8":51.4792,"9":49.5042,"10":227.525,"11":227.525,"12":69.3,"13":78.85,"14":30.0,"15":25.925,"16":247.5208,"17":247.5208,"18":76.2917,"19":75.2417},"cabin":{"0":"B5","1":"C22 C26","2":"C22 C26","3":"C22 C26","4":"C22 C26","5":"E12","6":"D7","7":"A36","8":"C101","9":null,"10":"C62 C64","11":"C62 C64","12":"B35","13":null,"14":"A23","15":null,"16":"B58 B60","17":"B58 
B60","18":"D15","19":"C6"},"embarked":{"0":"S","1":"S","2":"S","3":"S","4":"S","5":"S","6":"S","7":"S","8":"S","9":"C","10":"C","11":"C","12":"C","13":"S","14":"S","15":"S","16":"C","17":"C","18":"C","19":"C"},"boat":{"0":"2","1":"11","2":null,"3":null,"4":null,"5":"3","6":"10","7":null,"8":"D","9":null,"10":null,"11":"4","12":"9","13":"6","14":"B","15":null,"16":null,"17":"6","18":"8","19":"A"},"body":{"0":null,"1":null,"2":null,"3":135.0,"4":null,"5":null,"6":null,"7":null,"8":null,"9":22.0,"10":124.0,"11":null,"12":null,"13":null,"14":null,"15":null,"16":null,"17":null,"18":null,"19":null}}'
I create a multi index with the following command:
titanic = titanic.set_index(['name', 'home.dest'])
Then I want to unstack.
titanic.unstack(level = 'home.dest')
I get the following exception message:
ValueError: Index contains duplicate entries, cannot reshape
The error is saying that the combination of columns you used to build the MultiIndex is not unique, so unstack cannot decide which value belongs in each cell.
One way to fix this is to guarantee uniqueness by adding a counter.
counts = titanic.groupby(['name', 'home.dest']).cumcount().rename('Counter')
titanic = titanic.set_index(['name', 'home.dest', counts])
Then your unstack will work
titanic.unstack(level = 'home.dest')
But I'd suggest unstacking both levels instead:
titanic.unstack(['home.dest', 'Counter'])
Otherwise, you'll have to aggregate with a groupby, for example by keeping the first row of each group:
titanic.groupby(['name', 'home.dest']).first().unstack()
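To see why duplicate index entries block the reshape, here is a plain-Python sketch of the idea. It is not how pandas implements unstack: the records are made-up (row key, column key, value) triples, and unstack and add_counter are illustrative helpers that only mirror the behaviour described above.

```python
from collections import defaultdict

# Made-up records; the first two share the same (row, column) pair
records = [
    ("A", "x", 1),
    ("A", "x", 2),
    ("B", "y", 3),
]

def unstack(records):
    """Pivot (row, col, value) triples into {row: {col: value}}."""
    table = defaultdict(dict)
    for row, col, value in records:
        if col in table[row]:
            # Two values compete for one cell: the reshape is ambiguous
            raise ValueError("Index contains duplicate entries, cannot reshape")
        table[row][col] = value
    return {row: dict(cols) for row, cols in table.items()}

def add_counter(records):
    """Number repeats of each (row, col) pair, similar in spirit to cumcount()."""
    seen = defaultdict(int)
    out = []
    for row, col, value in records:
        out.append((row, (col, seen[(row, col)]), value))
        seen[(row, col)] += 1
    return out

try:
    unstack(records)
except ValueError as exc:
    print("failed:", exc)

print(unstack(add_counter(records)))
```

With the counter folded into the column key, every cell has exactly one value, which corresponds to unstacking both 'home.dest' and 'Counter'.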

Resources