How to extract JSON from a text string (Python) - python-3.x
I have a text string/script which I extracted from a webpage. I would like to clean/structure it so that I get only the JSON out of it, but it is so long that I cannot find the beginning and end of the JSON in the text. Can anyone help me out, or suggest an online tool that can help find the beginning and end of the JSON in this text? Many thanks.
window.__NUXT__=function(e,l,a,t,r,s,i,o,n,d){return{layout:s,data:[{product:{active_gtin:"5711555000616",active_supplier:"0000009002",active_supplier_product_id:"000000000091052931-EA",brand:"Prosonic",description:"Prosonic 32\" TV med Android og Full-HD opløsning. Android styresystemet giver dig let adgang til Netflix, Viaplay og TV2 Play samt mange andre apps og med indbygget Chromecast kan du let caste indhold til TV'et.",display_list_price:l,display_sales_price:l,energy_class:"A+",energy_class_color_code:"lev_3",energy_label:i,erp_product_id:o,gallery_images:[i,"https://sg-dam.imgix.net/services/assets.img/id/13a13e85-efe7-48eb-bb6c-953abc94fb08/size/original","https://sg-dam.imgix.net/services/assets.img/id/e0c39be1-eb82-4652-88f4-992226390a3f/size/original","https://sg-dam.imgix.net/services/assets.img/id/9bc81449-64ba-44c0-b691-31b22bf5dc91/size/original"],hybris_code:n,id:n,image_primary:"https://sg-dam.imgix.net/services/assets.img/id/f8d59494-3da7-4cb7-9dd8-e8d16577e7c4/size/original",in_stock_stores_count:15,is_approved_for_sale:t,is_exposed:t,is_reservable:t,name:'Prosonic 32" 32and6021 LED tv',online_from:16000344e5,online_to:2534022108e5,primary_category_path:"/elektronik/tv",product_url:"/produkter/prosonic-32-32and6021-led-tv/100553115/",sales_price:e,show_discount_message:a,sku:o,specifications:'[{"features":[{"code":"text-TvMemory","label":"Tekst TV hukommelse","value":"1000"}],"label":"Tekst TV hukommelse"},{"features":[{"code":"tvFeatures","label":"TV funktioner","value":"Netflix"},{"code":"tvFeatures","label":"TV funktioner","value":"SmartTV"},{"code":"tvFeatures","label":"TV funktioner","value":"Wi-Fi indbygget"}],"label":"TV funktioner"},{"features":[{"code":"TV.tvApps","label":"TV Apps","value":"Amazon"},{"code":"TV.tvApps","label":"TV Apps","value":"Apple TV"},{"code":"TV.tvApps","label":"TV Apps","value":"Blockbuster"},{"code":"TV.tvApps","label":"TV Apps","value":"Boxer"},{"code":"TV.tvApps","label":"TV Apps","value":"Dplay"},{"code":"TV.tvApps","label":"TV Apps","value":"DR TV"},{"code":"TV.tvApps","label":"TV Apps","value":"Google Play Store"},{"code":"TV.tvApps","label":"TV Apps","value":"HBO Nordic"},{"code":"TV.tvApps","label":"TV Apps","value":"Min Bio"},{"code":"TV.tvApps","label":"TV Apps","value":"Netflix"},{"code":"TV.tvApps","label":"TV Apps","value":"Rakuten TV"},{"code":"TV.tvApps","label":"TV Apps","value":"SF Anytime"},{"code":"TV.tvApps","label":"TV Apps","value":"Skype"},{"code":"TV.tvApps","label":"TV Apps","value":"Spotify"},{"code":"TV.tvApps","label":"TV Apps","value":"TV2 play"},{"code":"TV.tvApps","label":"TV Apps","value":"Viaplay"},{"code":"TV.tvApps","label":"TV Apps","value":"YouSee"},{"code":"TV.tvApps","label":"TV Apps","value":"Youtube"}],"label":"TV Apps"},{"features":[{"code":"connectivity.videoConnectivity","label":"Video tilslutning","value":"composite"}],"label":"Video tilslutning"},{"features":[{"code":"screen.monitorLanguageList","label":"Skærmsprog","value":"Dansk"}],"label":"Skærmsprog"},{"features":[{"code":"builtInSpeakers.soundFunction","label":"Lydfunktioner","value":"Bluetooth"}],"label":"Lydfunktioner"},{"features":[{"code":"productionYear","label":"Produktionsår","value":"2.020"}],"label":"Produktionsår"},{"features":[{"code":"electronics.manufacturerNum","label":"Producentens Varenummer","value":"32AND6021"}],"label":"Producentens Varenummer"},{"features":[{"code":"TV.hdrLOV","label":"HDR","value":"HDR 10"}],"label":"HDR"},{"features":[{"code":"TV.isSleepTimerPresent","label":"Sleep 
timer","value":"Ja"}],"label":"Sleep timer"},{"features":[{"code":"isPVRFunctionPresent","label":"PVR funktion","value":"Ja"}],"label":"PVR funktion"},{"features":[{"code":"accessoriesIncluded","label":"Tilbehør inkluderet","value":"stand og remote"}],"label":"Tilbehør inkluderet"},{"features":[{"code":"screenTechnologyDesc","label":"Skærmteknologi","value":"LED"}],"label":"Skærmteknologi"},{"features":[{"code":"tvTunerList","label":"TV-tuners","value":"CI+"},{"code":"tvTunerList","label":"TV-tuners","value":"DVB-C"},{"code":"tvTunerList","label":"TV-tuners","value":"DVB-S"},{"code":"tvTunerList","label":"TV-tuners","value":"DVB-T2"},{"code":"tvTunerList","label":"TV-tuners","value":"MPEG4 tuner"}],"label":"TV-tuners"},{"features":[{"code":"TV.vesaStandardList","label":"Vægbeslag Vesa standard","value":"75x75"}],"label":"Vægbeslag Vesa standard"},{"features":[{"code":"connectivity.hdmiCount","label":"Antal HDMI","value":"3"}],"label":"Antal HDMI"},{"features":[{"code":"builtInSpeakers.speakerEffect","label":"Højtalereffekt","value":"12"}],"label":"Højtalereffekt"},{"features":[{"code":"usbCount","label":"Antal USB stik","value":"1"}],"label":"Antal USB stik"},{"features":[{"code":"TVResolution","label":"TV opløsning","value":"Full HD"}],"label":"TV opløsning"},{"features":[{"code":"picturePlayers.supportedImageFormats","label":"Understøttede Billed Formater","value":"JPG,BMP,PNG,GIF"}],"label":"Understøttede Billed Formater"},{"features":[{"code":"scartCount","label":"Antal scartstik","value":"0"}],"label":"Antal scartstik"},{"features":[{"code":"connectivity.usbcount2","label":"Antal USB 2.0 porte","value":"1"}],"label":"Antal USB 2.0 porte"},{"features":[{"code":"Color","label":"Produktfarve","value":"sort"}],"label":"Produktfarve"},{"features":[{"code":"TV.isWatchAndTimerFunctionOnOffPresent","label":"Ur og timerfunktion til\\/fra","value":"Ja"}],"label":"Ur og timerfunktion til\\/fra"},{"features":[{"code":"TV.isAutomaticChannelSearchAvailable","label":"Automatisk kanalsøgning","value":"Ja"}],"label":"Automatisk kanalsøgning"},{"features":[{"code":"screen.screenResolution","label":"Skærmopløsning","value":"Full-HD 1920 x 1080"}],"label":"Skærmopløsning"},{"features":[{"code":"TV.software","label":"TV software","value":"Android"}],"label":"TV software"},{"features":[{"code":"connectivity.connectivityDesc","label":"Andre tilslutningsmuligheder","value":"Composite, Audio in, VGA, optisk lyd ud,"}],"label":"Andre tilslutningsmuligheder"},{"features":[{"code":"TV.twinTuner","label":"Twin Tuner","value":"Nej"}],"label":"Twin Tuner"},{"features":[{"code":"picturePlayers.supportedVideoFileFormats","label":"Understøttede videofil formater","value":".MPG .MPEG.DAT.VOB.MKV.MP4 \\/ .M4A \\/ .M4V.MOV.FLV.3GP \\/ 3GPP.TS \\/ .M2TS.RMVB .RM.AVI.ASF .WMV.WEBM"}],"label":"Understøttede videofil formater"},{"features":[{"code":"isInternetBrowserPresent","label":"Internet browser","value":"Ja"}],"label":"Internet browser"},{"features":[{"code":"wirelessConnectivityOptionList","label":"Trådløse tilslutningsmuligheder","value":"Bluetooth"},{"code":"wirelessConnectivityOptionList","label":"Trådløse tilslutningsmuligheder","value":"Wi-Fi indbygget"}],"label":"Trådløse tilslutningsmuligheder"}]',step_product_id:"GR14425172",stock_count_online:2874,stock_count_status_online:"in_stock",stock_type:"NORMAL",summary:"Med Android og indbygget 
Chromecast",msg_sales_price_per_unit:l,package_display_sales_price:l,promotion_text:e,f_campaign_name:[]},loadingProduct:a}],error:e,state:{User:{UID:l,isLoggedIn:a,nickname:l,address:{firstName:l,lastName:l,address:l,postalCode:l,city:l,mobile:l,email:l,country:l},isDeliveryMethodSet:a,lastSeenProducts:[],wishlistProducts:[]},Tracking:{trackedOrders:[],activeRoute:e,oldRoute:e,cookieConsentGiven:a,initialRouteTracked:a},Search:{showDrawer:a,hideGlobalSearch:a,query:l,queryString:l,queries:[],brands:[],categories:[]},Products:{products:[]},ProductDialog:{showType:a,productId:e,quantity:e,error:e},plugins:{Cart:{checkoutErrorPlugin:{},productDialogPlugin:{}},TechnicalError:{technicalErrorPlugin:{}},Tracking:{gtmPlugin:{},gtmHandlers:{appInitializedHandler:{},bannerClickedHandler:{},bannerViewedHandler:{},checkoutStepChangedHandler:{},clickCollectCompletedHandler:{},cookieConsentGivenHandler:{},externalLinkClickedHandler:{},helpers:{},notFoundPageViewedHandler:{},orderCompletedHandler:{},plpProductsViewedHandler:{},productAddedHandler:{},productClickedHandler:{},productDetailViewedHandler:{},productQuantityChangeHandler:{},productRemovedHandler:{},recommendationsClickedHandler:{},recommendationsViewedHandler:{},routeChangedHandler:{},siteSearchHandler:{}}},User:{userPlugin:{}}},Payment:{paymentMethod:e,termsAccepted:a},OAuth:{accessToken:e,expiry:0,timestamp:e,trackingId:e},Navigation:{hierarchy:e,path:[],loading:a,lastFetchedTopNode:l},Layout:{eyebrow:{default:e},footer:{default:e},layout:s},InfoBar:{infoBars:[],infoBarMappers:{}},Delivery:{isFetchingPickups:a,deliveries:{},pickups:{},selectedDeliveries:{}},ClickCollect:{loading:a,showDrawer:a,baseMapLocation:e,stores:[],selectedStore:e,product:e,quantity:1,form:{name:l,email:l,countryDialCode:"45",phone:l,terms:a},reservation:e,error:a,filters:{inStockOnly:t}},Checkout:{panelState:{userInfo:{},delivery:{},payment:{mustVisit:t},store:{}},desiredPanel:"auto",panelValidators:{}},Cart:{data:{id:l,lineItems:[],totalLineItemsQuantity:0,totalSalesPrice:r,totalShippingSalesPrice:r,employeeNumber:e,loyaltyNumber:e,deliveries:[],totalLineItemSalesPrice:r,totalLineItemListPrice:r,totalLineItemDiscount:r,totalShippingListPrice:r,totalShippingPriceDiscount:r,orderNumber:e,totalSalesPriceNumber:0,isActive:t,isAllLineItemsValid:t,shippingAddress:d,billingAddress:d,hash:l,discountCodes:[],source:"USER_DEVICE"},loading:{},error:e,assistedSalesMode:a,assistedSalesStoreNumber:e},Breadcrumb:{categoryTree:{},productCategory:l,lookupBreadcrumbTasks:{},currentCategoryPage:[],helpers:{}}},serverRendered:t}}(null,"",!1,!0,"0,00","default","https://sg-dam.imgix.net/services/assets.img/id/87a045c1-0923-4575-81ce-fd9b7c3bfbf6/size/original","91052931-EA","100553115",void 0)
You can use a regex to extract the JSON objects from your string.
I have used this pattern: {(?:[^{}]*{[^{]*})*[^{}]*}
Note that this regex only handles JSON nested one level deep.
Code:

import re
import json

input_data = """window.__NUXT__=funct ... A","100553115",void 0)"""

def json_validate(input_str):
    founds = re.findall(r"{(?:[^{}]*{[^{]*})*[^{}]*}", input_str)
    valid_jsons = []
    for x in founds:
        try:
            valid_jsons.append(json.loads(x))
        except json.JSONDecodeError:
            continue
    return valid_jsons

getting_jsons = json_validate(input_data)

for one_json in getting_jsons:
    print(one_json)

print(len(getting_jsons))
It finds several (32) valid JSON objects in your string:
>>> python3 test.py
{'features': [{'code': 'text-TvMemory', 'label': 'Tekst TV hukommelse', 'value': '1000'}], 'label': 'Tekst TV hukommelse'}
{'features': [{'code': 'tvFeatures', 'label': 'TV funktioner', 'value': 'Netflix'}, {'code': 'tvFeatures', 'label': 'TV funktioner', 'value': 'SmartTV'}, {'code': 'tvFeatures', 'label': 'TV funktioner', 'value': 'Wi-Fi indbygget'}], 'label': 'TV funktioner'}
{'features': [{'code': 'TV.tvApps', 'label': 'TV Apps', 'value': 'Amazon'}, {'code ...
I have found another solution which approaches the issue in a completely different way: https://stackoverflow.com/a/54235803/11502612
I have tested the code from that answer and got the same output, so the result is most likely correct.
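For completeness, here is a minimal sketch of a brace-scanning approach in that spirit (my own adaptation, not necessarily the same code as the linked answer). It walks the string, attempts json.JSONDecoder().raw_decode at every opening brace, and yields whatever parses, which also handles arbitrarily nested objects:

import json

def extract_json_objects(text):
    # Yield every substring of `text` that parses as a JSON value
    # starting at an opening brace.
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, consumed = decoder.raw_decode(text[start:])
            yield obj
            pos = start + consumed
        except json.JSONDecodeError:
            pos = start + 1

for obj in extract_json_objects(input_data):
    print(obj)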
Would it not be easier to do something like

import json
data = json.loads(your_string)

and then iterate over it to find the values? (json.loads will only work once you have isolated a valid JSON substring.) Alternatively, you can look for the value locations with

find("{")

I don't know if this is what you're looking for, but it may spark an idea or an alternative view.
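A rough sketch of that find("{") idea (my own illustration, assuming input_data holds the scraped string from the question). It slices from the first opening brace to its matching closing brace by counting depth; note the naive counter ignores braces inside quoted strings, so json.loads may still reject the slice:

def slice_balanced(text, start):
    # Return text[start:...] up to the brace matching text[start],
    # or None if the braces never balance.
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None

first = input_data.find("{")
candidate = slice_balanced(input_data, first) if first != -1 else None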
Related
New to using Spotipy and Python 3: I want to use new_releases() to print the artist name and album name
So far I was able to print out all albums by an artist of my choosing using this:

spotify = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(client_id, client_secret))
results = spotify.artist_albums(posty_uri, album_type='album')
albums = results['items']
while results['next']:
    results = spotify.next(results)
    albums.extend(results['items'])
for album in albums:
    print(album['name'])

I was trying to do a similar process for new_releases() by doing this:

newReleases = spotify.new_releases()
test = newReleases['items']

but this throws me an error on the line test = newReleases['items']. If anyone is familiar with Spotipy and knows how to return things like release date, artist name and album name from new_releases(), I would greatly appreciate it.
I'm a little confused because the documentation says that the new_releases method returns a list. In any event, it is a one-item dictionary which contains a list. However, that list contains dictionaries which seem a bit unwieldy, so I understand why you're asking this question. You can make use of the collections.namedtuple data structure to make it easier to see the relevant information. I don't claim that this is the best way to transform this data, but it seems to me a decent way.

import collections as co

# namedtuple data structure that will be easier to understand and use
Album = co.namedtuple(typename='Album', field_names=['album_name', 'artist_name', 'release_date'])

newReleases2 = []  # couldn't think of a better name
for album in newReleases['albums']['items']:
    artist_sublist = []
    for artist in album['artists']:
        artist_sublist.append(artist['name'])
    newReleases2.append(Album(album_name=album['name'],
                              artist_name=artist_sublist,
                              release_date=album['release_date']))

This results in the following list of namedtuples:

[Album(album_name='Only Wanna Be With You (Pokémon 25 Version)', artist_name=['Post Malone'], release_date='2021-02-25'), Album(album_name='AP (Music from the film Boogie)', artist_name=['Pop Smoke'], release_date='2021-02-26'), Album(album_name='Like This', artist_name=['2KBABY', 'Marshmello'], release_date='2021-02-26'), Album(album_name='Go Big (From The Amazon Original Motion Picture Soundtrack Coming 2 America)', artist_name=['YG', 'Big Sean'], release_date='2021-02-26'), Album(album_name='Here Comes The Shock', artist_name=['Green Day'], release_date='2021-02-21'), Album(album_name='Spaceman', artist_name=['Nick Jonas'], release_date='2021-02-25'), Album(album_name='Life Support', artist_name=['Madison Beer'], release_date='2021-02-26'), Album(album_name="Drunk (And I Don't Wanna Go Home)", artist_name=['Elle King', 'Miranda Lambert'], release_date='2021-02-26'), Album(album_name='PROBLEMA', artist_name=['Daddy Yankee'], release_date='2021-02-26'), Album(album_name='Leave A Little Love', artist_name=['Alesso', 'Armin van Buuren'], release_date='2021-02-26'), Album(album_name='Rotate', artist_name=['Becky G', 'Burna Boy'], release_date='2021-02-22'), Album(album_name='BED', artist_name=['Joel Corry', 'RAYE', 'David Guetta'], release_date='2021-02-26'), Album(album_name='A N N I V E R S A R Y (Deluxe)', artist_name=['Bryson Tiller'], release_date='2021-02-26'), Album(album_name='Little Oblivions', artist_name=['Julien Baker'], release_date='2021-02-26'), Album(album_name='Money Long (feat. 42 Dugg)', artist_name=['DDG', 'OG Parker'], release_date='2021-02-26'), Album(album_name='El Madrileño', artist_name=['C. Tangana'], release_date='2021-02-26'), Album(album_name='Skegee', artist_name=['JID'], release_date='2021-02-23'), Album(album_name='Coyote Cry', artist_name=['Ian Munsick'], release_date='2021-02-26'), Album(album_name='Rainforest', artist_name=['Noname'], release_date='2021-02-26'), Album(album_name='The American Negro', artist_name=['Adrian Younge'], release_date='2021-02-26')]

If you wanted to see the artist(s) associated with the 11th album in this list, you could do this:

In [62]: newReleases2[10].artist_name
Out[62]: ['Becky G', 'Burna Boy']

Edit: in a comment on this answer, the OP requested getting the album cover as well.
Please see the helper function and slightly modified code below:

import os
import requests

def download_album_cover(url):
    # helper function to download an album cover
    # using code from: https://stackoverflow.com/a/13137873/42346
    download_path = os.getcwd() + os.sep + url.rsplit('/', 1)[-1]
    r = requests.get(url, stream=True)
    if r.status_code == 200:
        with open(download_path, 'wb') as f:
            for chunk in r.iter_content(1024):
                f.write(chunk)
    return download_path

# modified data structure
Album = co.namedtuple(typename='Album', field_names=['album_name', 'album_cover', 'artist_name', 'release_date'])

# modified retrieval code
newReleases2 = []
for album in newReleases['albums']['items']:
    album_cover = download_album_cover(album['images'][0]['url'])
    artist_sublist = []
    for artist in album['artists']:
        artist_sublist.append(artist['name'])
    newReleases2.append(Album(album_name=album['name'],
                              album_cover=album_cover,
                              artist_name=artist_sublist,
                              release_date=album['release_date']))

Result:

[Album(album_name='Scary Hours 2', album_cover='/home/adamcbernier/ab67616d0000b2738b20e4631fa15d3953528bbc', artist_name=['Drake'], release_date='2021-03-05'), Album(album_name='Boogie: Original Motion Picture Soundtrack', album_cover='/home/adamcbernier/ab67616d0000b27395e532805e8c97be7a551e3a', artist_name=['Various Artists'], release_date='2021-03-05'), Album(album_name='Hold On', album_cover='/home/adamcbernier/ab67616d0000b273f33d3618aca6b3cfdcd2fc43', artist_name=['Justin Bieber'], release_date='2021-03-05'), Album(album_name='Serotonin', album_cover='/home/adamcbernier/ab67616d0000b2737fb30ee0638c764d6f3247d2', artist_name=['girl in red'], release_date='2021-03-03'), Album(album_name='Leave The Door Open', album_cover='/home/adamcbernier/ab67616d0000b2736f9e6abbd6fa43ac3cdbeee0', artist_name=['Bruno Mars', 'Anderson .Paak', 'Silk Sonic'], release_date='2021-03-05'), Album(album_name='Real As It Gets (feat. EST Gee)', album_cover='/home/adamcbernier/ab67616d0000b273f0f6f6144929a1ff72001f5e', artist_name=['Lil Baby', 'EST Gee'], release_date='2021-03-04'), Album(album_name='Life’s A Mess II (with Clever & Post Malone)', album_cover='/home/adamcbernier/ab67616d0000b2732e8d23414fd0b81c35bdedea', artist_name=['Juice WRLD'], release_date='2021-03-05'), Album(album_name='slower', album_cover='/home/adamcbernier/ab67616d0000b273b742c96d78d9091ce4a1c5c1', artist_name=['Tate McRae'], release_date='2021-03-03'), Album(album_name='Sacrifice', album_cover='/home/adamcbernier/ab67616d0000b27398bfcce8be630dd5f2f346e4', artist_name=['Bebe Rexha'], release_date='2021-03-05'), Album(album_name='Poster Girl', album_cover='/home/adamcbernier/ab67616d0000b273503b16348e47bc3c1c823eba', artist_name=['Zara Larsson'], release_date='2021-03-05'), Album(album_name='Beautiful Mistakes (feat. Megan Thee Stallion)', album_cover='/home/adamcbernier/ab67616d0000b273787f41be59050c46f69db580', artist_name=['Maroon 5', 'Megan Thee Stallion'], release_date='2021-03-03'), Album(album_name='Pay Your Way In Pain', album_cover='/home/adamcbernier/ab67616d0000b273a1e1b4608e1e04b40113e6e1', artist_name=['St. Vincent'], release_date='2021-03-04'), Album(album_name='My Head is a Moshpit', album_cover='/home/adamcbernier/ab67616d0000b2733db806083e3b649f1d969a4e', artist_name=['Verzache'], release_date='2021-03-05'), Album(album_name='When You See Yourself', album_cover='/home/adamcbernier/ab67616d0000b27377253620f08397c998d18d78', artist_name=['Kings of Leon'], release_date='2021-03-05'), Album(album_name='Mis Manos', album_cover='/home/adamcbernier/ab67616d0000b273d7210e8d6986196b28d084ef', artist_name=['Camilo'], release_date='2021-03-04'), Album(album_name='Retumban2', album_cover='/home/adamcbernier/ab67616d0000b2738a79a82236682469aecdbbdf', artist_name=['Ovi'], release_date='2021-03-05'), Album(album_name='Take My Hand', album_cover='/home/adamcbernier/ab67616d0000b273b7839c3ba191de59f5d3a3d7', artist_name=['LP Giobbi'], release_date='2021-03-05'), Album(album_name="Ma' G", album_cover='/home/adamcbernier/ab67616d0000b27351b5ebb959c37913ac61b033', artist_name=['J Balvin'], release_date='2021-02-28'), Album(album_name='Aspen', album_cover='/home/adamcbernier/ab67616d0000b27387d1d17d16cf131765ce4be8', artist_name=['Young Dolph', 'Key Glock'], release_date='2021-03-05'), Album(album_name='Only The Family - Lil Durk Presents: Loyal Bros', album_cover='/home/adamcbernier/ab67616d0000b273a3df38e11e978b34b47583d0', artist_name=['Only The Family'], release_date='2021-03-05')]
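As a small usage example of this structure (my own illustration, reusing the newReleases2 list built above), filtering the releases down to a single artist becomes a one-line comprehension:

drake_releases = [a.album_name for a in newReleases2 if 'Drake' in a.artist_name]
# -> ['Scary Hours 2'] with the sample data shown above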
TypeError: Object of type 'Location' is not JSON serializable
I am using the geopy library for my Flask web app. I want to save the user location, which I get from my modal (HTML form), in my database (I am using MongoDB), but every single time I get this error:

TypeError: Object of type 'Location' is not JSON serializable

Here's the code:

@app.route('/register', methods=['GET', 'POST'])
def register_user():
    if request.method == 'POST':
        login_user = mongo.db.mylogin
        existing_user = login_user.find_one({'email': request.form['email']})
        # final_location = geolocator.geocode(session['address'].encode('utf-8'))
        if existing_user is None:
            hashpass = bcrypt.hashpw(request.form['pass'].encode('utf-8'), bcrypt.gensalt())
            login_user.insert({'name': request.form['username'],
                               'email': request.form['email'],
                               'password': hashpass,
                               'address': request.form['add'],
                               'location': session['location']})
            session['password'] = request.form['pass']
            session['username'] = request.form['username']
            session['address'] = request.form['add']
            session['location'] = geolocator.geocode(session['address'])
            flash(f"You are Registered as {session['username']}")
            return redirect(url_for('home'))
        flash('Username is taken !')
        return redirect(url_for('home'))
    return render_template('index.html')

Please help, and let me know if you want more info.
According to the geopy documentation, the geocode function "Return a location point by address", i.e. a geopy.location.Location object.

JSON serialization supports the following types by default:

Python           | JSON
-----------------|-------
dict             | object
list, tuple      | array
str, unicode     | string
int, long, float | number
True             | true
False            | false
None             | null

All other objects/types are not JSON serializable by default, and therefore you need to handle them yourself.

geopy.location.Location.raw
    Location's raw, unparsed geocoder response. For details on this, consult the service's documentation.
    Return type: dict or None

You might be able to use the raw attribute of the Location (the geolocator.geocode return value), and that value will be JSON serializable.
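For illustration, a minimal sketch of how that could look in the registration route (my own assumption, reusing the geolocator and login_user objects from the question; only the raw dict gets stored):

loc = geolocator.geocode(request.form['add'])
login_user.insert({'name': request.form['username'],
                   'email': request.form['email'],
                   'address': request.form['add'],
                   # store the plain dict, which MongoDB/JSON can serialize,
                   # instead of the Location object itself
                   'location': loc.raw if loc is not None else None})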
Location is indeed not JSON serializable: there are many properties in this object and there is no single way to represent a location, so you'd have to choose one by yourself. What type of value do you expect to see in the location key of the response? Here are some examples:

Textual address

In [9]: json.dumps({'location': geolocator.geocode("175 5th Avenue NYC").address})
Out[9]: '{"location": "Flatiron Building, 175, 5th Avenue, Flatiron District, Manhattan Community Board 5, Manhattan, New York County, New York, 10010, United States of America"}'

Point coordinates

In [10]: json.dumps({'location': list(geolocator.geocode("175 5th Avenue NYC").point)})
Out[10]: '{"location": [40.7410861, -73.9896298241625, 0.0]}'

Raw Nominatim response

(That's probably not what you want to expose in your API, assuming you want to preserve the ability to change the geocoding service to another one in the future, which might have a different raw response schema.)

In [11]: json.dumps({'location': geolocator.geocode("175 5th Avenue NYC").raw})
Out[11]: '{"location": {"place_id": 138642704, "licence": "Data \\u00a9 OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright", "osm_type": "way", "osm_id": 264768896, "boundingbox": ["40.7407597", "40.7413004", "-73.9898715", "-73.9895014"], "lat": "40.7410861", "lon": "-73.9896298241625", "display_name": "Flatiron Building, 175, 5th Avenue, Flatiron District, Manhattan Community Board 5, Manhattan, New York County, New York, 10010, United States of America", "class": "tourism", "type": "attraction", "importance": 0.74059885426854, "icon": "https://nominatim.openstreetmap.org/images/mapicons/poi_point_of_interest.p.20.png"}}'

Textual address + point coordinates

In [12]: location = geolocator.geocode("175 5th Avenue NYC")
    ...: json.dumps({'location': {
    ...:     'address': location.address,
    ...:     'point': list(location.point),
    ...: }})
Out[12]: '{"location": {"address": "Flatiron Building, 175, 5th Avenue, Flatiron District, Manhattan Community Board 5, Manhattan, New York County, New York, 10010, United States of America", "point": [40.7410861, -73.9896298241625, 0.0]}}'
Python flickrapi search.photos returns the same picture on every page
I'm testing the flickrapi for Python and have some code that randomly chooses a picture of Chinese food. It does this by getting 1 result on 1 page and using the total number of pages in that result to choose 1 result on 1 random page. Here is the code I'm using to get the images:

flickr = flickrapi.FlickrAPI(api_key='mykey', secret='mysecret', format='parsed-json', cache=False)

data1 = flickr.photos.search(tags='Chinese Food', page=1, per_page=1, tag_mode='all',
                             media='photos', content_type=1)
data2 = flickr.photos.search(tags='Chinese Food', page=randint(1, data1['photos']['pages']),
                             per_page=1, tag_mode='all', media='photos', content_type=1,
                             extras='url_l')

No matter what I do, the result in data2 is always the exact same image returned in data1. I could get the first result from page 1 and the first result from page 3472 and the image is exactly the same every time. Here is a sample of the data returned:

// From data1
{'photos': {'page': 1, 'pages': 70007, 'perpage': 1, 'total': '70007', 'photo': [{'id': '35800805325', 'owner': '24171591@N06', 'secret': '408928a034', 'server': '4261', 'farm': 5, 'title': 'Personalized Maple Wood Chopsticks', 'ispublic': 1, 'isfriend': 0, 'isfamily': 0}]}, 'stat': 'ok'}

// From data2
{'photos': {'page': 41043, 'pages': 70007, 'perpage': 1, 'total': '70007', 'photo': [{'id': '35800805325', 'owner': '24171591@N06', 'secret': '408928a034', 'server': '4261', 'farm': 5, 'title': 'Personalized Maple Wood Chopsticks', 'ispublic': 1, 'isfriend': 0, 'isfamily': 0, 'url_l': 'https://farm5.staticflickr.com/4261/35800805325_408928a034_b.jpg', 'height_l': '859', 'width_l': '1024'}]}, 'stat': 'ok'}

Notice that the id and title in both sets of data are exactly the same, while the page numbers are different. I've tested this in the Flickr API explorer with the exact same parameters: I do get the same image when I specify page 1, but I get a completely different image if I specify any other page, so this seems to be an issue with the Python flickrapi implementation or maybe one of its dependencies? I can't seem to find the issue. What is going on?
It looks like other people have been having this issue with Flickr's API since 2011 and it still hasn't been fixed, so it doesn't seem to be related to Python or the Python flickrapi module. I was able to improve the "randomness" by increasing the number of results per page, which is something I didn't want to do, but it's the only thing that works.
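A sketch of that workaround, reusing the calls from the question (the key values are placeholders): request a larger page and then pick a random photo from it, so the choice no longer depends only on the unreliable page parameter:

from random import randint, choice
import flickrapi

flickr = flickrapi.FlickrAPI(api_key='mykey', secret='mysecret', format='parsed-json', cache=False)

PER_PAGE = 100  # more results per page improves the effective randomness

first = flickr.photos.search(tags='Chinese Food', page=1, per_page=PER_PAGE,
                             tag_mode='all', media='photos', content_type=1)
pages = first['photos']['pages']

result = flickr.photos.search(tags='Chinese Food', page=randint(1, pages),
                              per_page=PER_PAGE, tag_mode='all', media='photos',
                              content_type=1, extras='url_l')
photo = choice(result['photos']['photo'])
print(photo.get('url_l'))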
Dictionary text file Python
text file:

Donald Trump:
791697302519947264,1477604720,Ohio USA,Twitter for iPhone,5251,1895
Join me live in Springfield, Ohio!
Lit
<<<EOT
781619038699094016,1475201875,United States,Twitter for iPhone,31968,17246
While Hillary profits off the rigged system, I am fighting for you! Remember the simple phrase: #FollowTheMoney...
<<<EOT

def read(text):
    with open(text, 'r') as f:
        for line in f:

Is there a way that I can separate the information for each candidate? So for example for Donald Trump it should be:

[
  ['Donald Trump'],
  [[791697302519947264, 1477604720, 'Ohio USA', 'Twitter for iPhone', 5251, 1895],
   ['Join me live in Springfield, Ohio! Lit']],
  [[781619038699094016, 1475201875, 'United States', 'Twitter for iPhone', 31968, 17246],
   ['While Hillary profits off the rigged system, I am fighting for you! Remember the simple phrase: #FollowTheMoney...']]
]

The format of the file is the following:

ID,DATE,LOCATION,SOURCE,FAVORITE_COUNT,RETWEET_COUNT
text (the tweet)

So basically, after the 6 headings, everything that follows is a tweet until '<<<EOT'. Also, is there a way I can do this for every candidate in the file?
I'm not sure why you need a multi-dimensional list (I would pick tuples and dictionaries if possible), but this seems to produce the output you asked for:

>>> txt = """Donald Trump:
... 791697302519947264,1477604720,Ohio USA,Twitter for iPhone,5251,1895
... Join me live in Springfield, Ohio!
... Lit
... <<<EOT
... 781619038699094016,1475201875,United States,Twitter for iPhone,31968,17246
... While Hillary profits off the rigged system, I am fighting for you! Remember the simple phrase: #FollowTheMoney...
... <<<EOT
... Another Candidate Name:
... 12312321,123123213,New York USA, Twitter for iPhone,123,123
... This is the tweet text!
... <<<EOT"""
>>>
>>> buffer = []
>>> tweets = []
>>>
>>> for line in txt.split("\n"):
...     if not line.startswith("<<<EOT"):
...         buffer.append(line)
...     else:
...         if buffer[0].strip().endswith(":"):
...             tweets.append([buffer.pop(0).rstrip().replace(":", "")])
...         metadata = buffer.pop(0).split(",")
...         tweet = [" ".join(line for line in buffer).replace("\n", " ")]
...         tweets.append([metadata, tweet])
...         buffer = []
...
>>> from pprint import pprint
>>> pprint(tweets)
[['Donald Trump'],
 [['791697302519947264', '1477604720', 'Ohio USA', 'Twitter for iPhone', '5251', '1895'],
  ['Join me live in Springfield, Ohio! Lit']],
 [['781619038699094016', '1475201875', 'United States', 'Twitter for iPhone', '31968', '17246'],
  ['While Hillary profits off the rigged system, I am fighting for you! Remember the simple phrase: #FollowTheMoney... ']],
 ['Another Candidate Name'],
 [['12312321', '123123213', 'New York USA', ' Twitter for iPhone', '123', '123'],
  ['This is the tweet text!']]]
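For comparison, a minimal sketch of the tuple/dictionary shape hinted at above (same parsing logic, different output structure; the field names are my own choice, not from the question):

fields = ["id", "date", "location", "source", "favorite_count", "retweet_count"]
tweets_by_candidate = {}   # {candidate_name: [tweet_record, ...]}

buffer, current = [], None
for line in txt.split("\n"):
    if not line.startswith("<<<EOT"):
        buffer.append(line)
        continue
    if buffer[0].strip().endswith(":"):
        current = buffer.pop(0).strip().rstrip(":")
        tweets_by_candidate[current] = []
    record = dict(zip(fields, buffer.pop(0).split(",")))
    record["text"] = " ".join(buffer)
    tweets_by_candidate[current].append(record)
    buffer = []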
I am not quite understanding, but here is my example of reading a file line by line and then adding each line to a string of text to post to Twitter:

candidates = open("FILEPATH WITH DOUBLE BACKSLASHES")  # example: "C:\\users\\fox\\desktop\\candidates.txt"

for candidate in candidates:
    candidate = candidate.rstrip('\n')  # removes the newline (this is mandatory)
    # the next line posts to Twitter
    post("propaganda here " + candidate + " more propaganda")

Note: for every line in that file this code will post to Twitter, e.g. 20 lines means twenty Twitter posts.
'location' filter on database.search filtering everything
I have imported the ecoinvent database as eidb. The search function works quite well:

In [0]: eidb.search("glass", filter={'name': 'green', 'product': 'packaging'})
Excluding 296 filtered results
Out[0]:
['packaging glass production, green' (kilogram, RER w/o CH+DE, None),
 'packaging glass production, green' (kilogram, DE, None),
 'packaging glass production, green' (kilogram, RoW, None),
 'packaging glass production, green' (kilogram, CH, None),
 'packaging glass production, green, without cullet' (kilogram, GLO, None),
 'market for packaging glass, green' (kilogram, GLO, None)]

This is exactly as one would hope. However, filtering on 'location' does not work so well:

In [1]: eidb.search("glass", filter={'location': 'DE'})
Excluding 304 filtered results
Out[1]: []

According to the above result, I should have at least two results. 'location' is definitely an accepted filter, and DE is definitely one of the locations (e.g. eidb.get('d2db85e14baf9e47bdbb824797420f08').get('location') returns DE). I observe this anytime location is used as a filter, e.g. eidb.search('*', filter={'location': 'CA-QC'}) returns an empty list. Why?
I have no idea why this is occurring, but you can get the behaviour you are looking for by putting the location code in lowercase:

In [1]: db.search("glass", filter={"location": "de"})
Excluding 103 filtered results
Out[1]:
['glass tube plant' (unit, DE, ['glass', 'construction']),
 'glass tube, borosilicate, at plant' (kilogram, DE, ['glass', 'construction']),
 'packaging glass, white, at plant' (kilogram, DE, ['glass', 'packaging']),
 'packaging glass, brown, at plant' (kilogram, DE, ['glass', 'packaging']),
 'packaging glass, green, at plant' (kilogram, DE, ['glass', 'packaging']),
 'solar collector glass tube, with silver mirror, at plant' (kilogram, DE, ['glass', 'construction']),
 'photovoltaic laminate, CdTe, at plant' (square meter, DE, ['photovoltaic', 'production of components'])]

Please file this as a bug for bw2data.
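If this comes up a lot, a tiny hypothetical wrapper (my own suggestion, not part of bw2data) keeps the workaround in one place until the bug is fixed:

def search_by_location(db, query, location, **kwargs):
    # lower-case the location code before filtering, per the workaround above
    return db.search(query, filter={"location": location.lower()}, **kwargs)

# e.g. search_by_location(eidb, "glass", "DE")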
You have probably already noticed, but for the case of Quebec, using just the final part of the code works (e.g. eidb.search('*', filter={'location': 'qc'})). I've checked, and in ecoinvent there are no regions with the location code QC, so there is no risk of including activities from other regions.
The problem does not seem to be only with uppercase and punctuation characters:

ei.search('photovoltaic laminate, CdTe', filter={"location": "US"})
Excluding 7 filtered results
[]

P.S.: Strangely, in this case filter={"location": "DE"} finds the correct dataset even with the uppercase code.