[{'id': 2, 'Registered Address': 'Line 1: 1 Any Street Line 2: Any locale City: Any City Region / State: Any Region Postcode / Zip code: BA2 2SA Country: GB Jurisdiction: Any Jurisdiction'}]
I have the above read into a dataframe, and that is the output so far. The issue is that I need to break out the individual elements. Because of place names and the like, the values may or may not contain spaces. Looking at the above, my keys are Line 1, Line 2, City, Region / State, Postcode / Zip code, Country, and Jurisdiction.
The output required for the "Registered Address" key is the keys and values:
"Line 1": "1 Any Street"
"Line 2": "Any locale"
"City": "Any City"
"Region / State": "Any Region"
"Postcode / Zip code": "BA2 2SA"
"Country": "GB"
"Jurisdiction": "Any Jurisdiction"
I'm just struggling to find a way to get to the end result. I have tried to pop the value out and use urllib.parse, but fell short. Is anyone able to point me in the best direction, please?
I tried to write code that generalizes your question, but there were some limitations regarding your data format. Anyway, I would do this:
def address_spliter(my_data, my_keys):
    address_data = my_data[0]['Registered Address']
    key_address = {}
    for i, k in enumerate(my_keys):
        if k == 'Jurisdiction:':
            # The last key has nothing following it, so take everything after it.
            key_address[k] = address_data.split('Jurisdiction:')[1].strip()
        else:
            # Take the text between this key and the next one in the list.
            key_address[k] = address_data.split(k)[1].split(my_keys[i + 1])[0].strip()
    return key_address
where you can call this function like this:
my_data = [{'id': 2, 'Registered Address': 'Line 1: 1 Any Street Line 2: Any locale City: Any City Region / State: Any Region Postcode / Zip code: BA2 2SA Country: GB Jurisdiction: Any Jurisdiction'}]
and
my_keys = ['Line 1:', 'Line 2:', 'City:', 'Region / State:', 'Postcode / Zip code:', 'Country:', 'Jurisdiction:']
As you can see, it will only work if the sequence of keys does not change. But you can work around this idea and adjust it to your problem if it doesn't go as expected.
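For completeness, here is a minimal usage sketch with the sample data above; the printed pairs assume the splitting logic shown in the function (note that the keys keep their trailing colon):

result = address_spliter(my_data, my_keys)
for key, value in result.items():
    print('"{}": "{}"'.format(key, value))

# "Line 1:": "1 Any Street"
# "Line 2:": "Any locale"
# "City:": "Any City"
# "Region / State:": "Any Region"
# "Postcode / Zip code:": "BA2 2SA"
# "Country:": "GB"
# "Jurisdiction:": "Any Jurisdiction"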
I want to get the actor name out of the page_title field of this JSON file and then match it against a database. I tried using NLTK and spaCy, but there I have to train data. Do I have to annotate each and every sentence? I have more than 100k sentences, and if I sit down to annotate them it will take a month or more. Is there any way that I can dump the K_actor database to train spaCy, NLTK, or any other tool?
{"page_title": "Sonakshi Sinha To Auction Sketch Of Buddha To Help Migrant Labourers", "description": "Sonakshi Sinha took to Instagram to share a timelapse video of a sketch of Buddha that she made to auction to raise funds for migrant workers affected by Covid-19 crisis. ", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589815261_1589815196489_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/sonakshi-sinha-to-auction-sketch-of-buddha-to-help-migrant-labourers-2626123.html"}
{"page_title": "Anushka Sharma Calls Virat Kohli 'A Liar' on IG Live, Nushrat Bharucha Gets Propositioned on Twitter", "description": "In an Instagram live interaction with Sunil Chhetri, Virat Kohli was left embarrassed after Anushka Sharma called him a 'jhootha' from behind the camera. This and more in today's wrap.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589813980_1589813933996_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/anushka-sharma-calls-virat-kohli-a-liar-on-ig-live-nushrat-bharucha-gets-propositioned-on-twitter-2626093.html"}
{"page_title": "Ranveer Singh Shares a Throwback to the Days When WWF was His Life", "description": "Ranveer Singh shared a throwback picture from his childhood where he could be seen posing in front of a poster of WWE legend Hulk Hogan.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589812401_screenshot_20200518-195906_chrome_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/ranveer-singh-shares-a-throwback-to-the-days-when-wwf-was-his-life-2626067.html"}
{"page_title": "Salman Khan's Love Song 'Tere Bina' Gets 26 Million Views", "description": "Salman Khan's song Tere Bina, which was launched a few days ago, had garnered 12 million views within 24 hours. As it continues to trend, it has garnered 26 million views in less than a week.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589099778_screenshot_20200510-135934_chrome_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/salman-khans-love-song-tere-bina-gets-26-million-views-2626077.html"}
{"page_title": "Yash And Radhika Pandit Pose With Their Kids For a Perfect Family Picture", "description": "Kannada actor Yash tied the knot with actress Radhika Pandit in 2016. The couple shares two kids together.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589812187_yash.jpg", "post_url": "https://www.news18.com/news/movies/yash-and-radhika-pandit-pose-with-their-kids-for-a-perfect-family-picture-2626055.html"}
{"page_title": "Malaika Arora Shares Beach Vacay Boomerang With Hopeful Note", "description": "Malaika Arora shared a throwback boomerang from a beach vacation where she could be seen playfully spinning. She also shared a hopeful message along with it.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589810291_screenshot_20200518-192603_chrome_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/malaika-arora-shares-beach-vacay-boomerang-with-hopeful-note-2626019.html"}
{"page_title": "Actor Nawazuddin Siddiqui's Wife Aaliya Sends Legal Notice To Him Demanding Divorce, Maintenance", "description": "The notice was sent to the ", "image_url": "https://images.news18.com/ibnlive/uploads/2019/10/Nawazuddin-Siddiqui.jpg", "post_url": "https://www.news18.com/news/movies/actor-nawazuddin-siddiquis-wife-aaliya-sends-legal-notice-to-him-demanding-divorce-maintenance-2626035.html"}
{"page_title": "Lisa Haydon Celebrates Son Zack\u2019s 3rd Birthday With Homemade Cake And 'Spiderman' Surprise", "description": "Lisa Haydon took to Instagram to share some glimpses from the special day. In the pictures, we can spot a man wearing a Spiderman costume.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589807960_lisa-rey.jpg", "post_url": "https://www.news18.com/news/movies/lisa-haydon-celebrates-son-zacks-3rd-birthday-with-homemade-cake-and-spiderman-surprise-2625953.html"}
{"page_title": "Chiranjeevi Recreates Old Picture with Wife, Says 'Time Has Changed'", "description": "Chiranjeevi was last seen in historical-drama Sye Raa Narasimha Reddy. He was shooting for his next film, Acharya, before the coronavirus lockdown.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589808242_pjimage.jpg", "post_url": "https://www.news18.com/news/movies/chiranjeevi-recreates-old-picture-with-wife-says-time-has-changed-2625973.html"}
{"page_title": "Amitabh Bachchan, Rishi Kapoor\u2019s Pout Selfie Recreated By Abhishek, Ranbir is Priceless", "description": "A throwback picture that has gone viral on the internet shows Ranbir Kapoor and Abhishek Bachchan recreating a selfie of their fathers Rishi Kapoor and Amitabh Bachchan.", "image_url": "https://images.news18.com/ibnlive/uploads/2020/05/1589807772_screenshot_20200518-184521_chrome_copy_875x583.jpg", "post_url": "https://www.news18.com/news/movies/amitabh-bachchan-rishi-kapoors-pout-selfie-recreated-by-abhishek-ranbir-is-priceless-2625867.html"}
Something you can do is create an annotator script in which actor names are replaced with '###' or some other placeholder string, which is later substituted with the actor names (the entities) to build the training data.
I trained 68K sentences in 9 hours on my i3 laptop. You can dump data like this, and the output file can be used for training the model.
That will save time and also give you a ready-made training data format for spaCy.
from nltk import word_tokenize
from pandas import read_csv
import re
import os.path

def annot(Label, entity, textlist):
    # Builds spaCy-style training tuples: (text, {'entities': [(start, end, label)]})
    finaldict = []
    for text_token in textlist:
        textbk = text_token
        for value in entity:
            # Substitute the placeholder with the entity value and normalise the text.
            text = str(textbk).replace('###', value)
            text = text.lower()
            text = re.sub(r'[^a-zA-Z0-9\n\.]', ' ', text)
            if len(word_tokenize(value)) < 2:
                # Single-token entity: walk the tokens and track character offsets,
                # assuming the tokens are separated by single spaces.
                newtext = word_tokenize(text)
                traindata = []
                prev_length = 0
                prev_pos = 0
                k = 0
                while k != len(newtext):
                    if k == 0:
                        prev_pos = 0
                        prev_length = len(newtext[k])
                    else:
                        prev_pos = prev_length + 1
                        prev_length = prev_length + len(newtext[k]) + 1
                    if value.lower() == str(newtext[k]):
                        traindata.append((prev_pos, prev_length, Label))
                    k = k + 1
                finaldict.append((text, {'entities': traindata}))
            else:
                # Multi-token entity: locate it directly in the normalised text.
                traindata = []
                try:
                    begin = text.index(value.lower())
                    traindata.append((begin, begin + len(value), Label))
                except ValueError:
                    pass
                finaldict.append((text, {'entities': traindata}))
    return finaldict

def getEntities(csv_file, column):
    df = read_csv(csv_file)
    return df[column].to_list()

def getSentences(file_name):
    with open(file_name) as file1:
        sentences = [line1.rstrip('\n') for line1 in file1]
    return sentences

def saveData(data, filename, path):
    filename = os.path.join(path, filename)
    with open(filename, 'a') as file:
        for sent in data:
            file.write("{}\n".format(sent))

# csv_file, column_name, filepathandname and path are placeholders for your own files.
ents = getEntities(csv_file, column_name)  # Actor names in your case
entities = [ent for ent in ents if str(ent) != 'nan']
sentences = getSentences(filepathandname)  # Considering you have the sentences in a text file
label = 'ACTOR_NAMES'
data = annot(label, entities, sentences)
saveData(data, 'train_data.txt', path)
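For reference, each element written to train_data.txt is a (text, annotations) pair in the span-based format spaCy's NER training expects. A single generated example might look roughly like this (illustrative sentence and offsets only, assuming 'sonakshi sinha' is one of the entities and '###' was the placeholder in the sentence file):

training_example = (
    'sonakshi sinha to auction sketch of buddha to help migrant labourers',
    {'entities': [(0, 14, 'ACTOR_NAMES')]},  # (start_char, end_char, label)
)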
Hope this is a relevant answer for your question.
I received a CSV file that includes a combination of plain string fields and embedded JSON-like elements, and I cannot find a way to parse it properly. Am I missing something obvious?
csvfile
"presentation_id","presentation_name","sectionId","sectionNumber","courseId","courseIdentifier","courseName","activity_id","activity_prompt","activity_content","solution","event_timestamp","answer_id","answer","isCorrect","userid","firstname","lastname","email","role"
"26cc7957-5a6b-4bde-a996-dd823f54ece7","3-Axial Skeleton F18","937c47b0-cc66-4938-81de-1b1b58388499","001","3b5b5e49-1798-4eab-86d7-186cf59149b4","MOVESCI 230","Human Musculoskeletal Anatomy","62d059e8-9ab4-41d4-9eb8-00ba67d9fac9","A blow to which side of the knee might tear the medial collateral ligament?","{"choices":["medial","lateral"],"type":"MultipleChoice"}","{"solution":[1],"selectAll":false,"type":"MultipleChoice"}","2018-09-30 23:54:16.000","7b5048e5-7460-49f8-a64a-763b7f62d771","{"solution":[1],"type":"MultipleChoice"}","1","57ba970d-d02b-4a10-a64d-56f02336ee08","Student","One","student1#example.com","Student"
"26cc7957-5a6b-4bde-a996-dd823f54ece7","3-Axial Skeleton F18","937c47b0-cc66-4938-81de-1b1b58388499","001","3b5b5e49-1798-4eab-86d7-186cf59149b4","MOVESCI 230","Human Musculoskeletal Anatomy","f82cb32b-45ce-4d3a-aa74-b3fa1a1038a2","What is the name of this movement?","{"choices":["right rotation","left rotation","right lateral rotation","left lateral rotation"],"type":"MultipleChoice"}","{"solution":[1],"selectAll":false,"type":"MultipleChoice"}","2018-09-30 23:20:33.000","d6cce4d9-37ae-409e-afc5-54ad79f86226","{"solution":[3],"type":"MultipleChoice"}","0","921d1b9b-f550-4289-89f1-2a805b27eeb3","Student","Two","student2#example.com","Student"
where the 1st row is the header and the data starts on the 2nd row
import csv

with open(filepathcsv) as csvfile:
    readCSV = csv.reader(csvfile)
    for row in readCSV:
        numcolumns = len(row)
        print(numcolumns, ": ", row)
yields:
20 : ['presentation_id', 'presentation_name', 'sectionId', 'sectionNumber', 'courseId', 'courseIdentifier', 'courseName', 'activity_id', 'activity_prompt', 'activity_content', 'solution', 'event_timestamp', 'answer_id', 'answer', 'isCorrect', 'userid', 'firstname', 'lastname', 'email', 'role']
25 : ['26cc7957-5a6b-4bde-a996-dd823f54ece7', '3-Axial Skeleton F18', '937c47b0-cc66-4938-81de-1b1b58388499', '001', '3b5b5e49-1798-4eab-86d7-186cf59149b4', 'MOVESCI 230', 'Human Musculoskeletal Anatomy', '62d059e8-9ab4-41d4-9eb8-00ba67d9fac9', 'A blow to which side of the knee might tear the medial collateral ligament?', '{choices":["medial"', 'lateral]', 'type:"MultipleChoice"}"', '{solution":[1]', 'selectAll:false', 'type:"MultipleChoice"}"', '2018-09-30 23:54:16.000', '7b5048e5-7460-49f8-a64a-763b7f62d771', '{solution":[1]', 'type:"MultipleChoice"}"', '1', '57ba970d-d02b-4a10-a64d-56f02336ee08', 'Student', 'One', 'student1#example.com', 'Student']
27 : ['26cc7957-5a6b-4bde-a996-dd823f54ece7', '3-Axial Skeleton F18', '937c47b0-cc66-4938-81de-1b1b58388499', '001', '3b5b5e49-1798-4eab-86d7-186cf59149b4', 'MOVESCI 230', 'Human Musculoskeletal Anatomy', 'f82cb32b-45ce-4d3a-aa74-b3fa1a1038a2', 'What is the name of this movement?', '{choices":["right rotation"', 'left rotation', 'right lateral rotation', 'left lateral rotation]', 'type:"MultipleChoice"}"', '{solution":[1]', 'selectAll:false', 'type:"MultipleChoice"}"', '2018-09-30 23:20:33.000', 'd6cce4d9-37ae-409e-afc5-54ad79f86226', '{solution":[3]', 'type:"MultipleChoice"}"', '0', '921d1b9b-f550-4289-89f1-2a805b27eeb3', 'Student', 'Two', 'student2#example.com', 'Student']
csv.reader is parsing each row differently because of the complicated structure with embedded curly-braced elements.
...but I expect 20 elements in each row.
The problem is in the records, not the code. Your code works fine. To solve the problem you need to fix the CSV file, because the fields with JSON content weren't serialised correctly.
Just change each embedded quote character " to two characters "" to escape it.
Here the example of fixed csv row.
"26cc7957-5a6b-4bde-a996-dd823f54ece7","3-Axial Skeleton F18","937c47b0-cc66-4938-81de-1b1b58388499","001","3b5b5e49-1798-4eab-86d7-186cf59149b4","MOVESCI 230","Human Musculoskeletal Anatomy","f82cb32b-45ce-4d3a-aa74-b3fa1a1038a2","What is the name of this movement?","{""choices"":[""right rotation"",""left rotation"",""right lateral rotation"",""left lateral rotation""],""type"":""MultipleChoice""}","{""solution"":[1],""selectAll"":false,""type"":""MultipleChoice""}","2018-09-30 23:20:33.000","d6cce4d9-37ae-409e-afc5-54ad79f86226","{""solution"":[3],""type"":""MultipleChoice""}","0","921d1b9b-f550-4289-89f1-2a805b27eeb3","Student","Two","student2#example.com","Student"
And the result of your code after fix:
20 : ['26cc7957-5a6b-4bde-a996-dd823f54ece7', '3-Axial Skeleton F18', '937c47b0-cc66-4938-81de-1b1b58388499', '001', '3b5b5e49-1798-4eab-86d7-186cf59149b4', 'MOVESCI 230', 'Human Musculoskeletal Anatomy', 'f82cb32b-45ce-4d3a-aa74-b3fa1a1038a2', 'What is the name of this movement?', '{"choices":["right rotation","left rotation","right lateral rotation","left lateral rotation"],"type":"MultipleChoice"}', '{"solution":[1],"selectAll":false,"type":"MultipleChoice"}', '2018-09-30 23:20:33.000', 'd6cce4d9-37ae-409e-afc5-54ad79f86226', '{"solution":[3],"type":"MultipleChoice"}', '0', '921d1b9b-f550-4289-89f1-2a805b27eeb3', 'Student', 'Two', 'student2#example.com', 'Student']
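Once the embedded quotes are escaped, the JSON columns come back as intact strings and can be decoded with the standard json module if you need the structured values. A minimal sketch, assuming the repaired file is saved as fixed.csv, still has the original header row, and keeps the JSON payloads in columns 9, 10 and 13 as in the sample data:

import csv
import json

with open("fixed.csv", newline="") as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)  # skip the header row
    for row in reader:
        activity_content = json.loads(row[9])   # {"choices": [...], "type": ...}
        solution = json.loads(row[10])           # {"solution": [...], ...}
        answer = json.loads(row[13])             # {"solution": [...], ...}
        print(activity_content["choices"], solution["solution"], answer["solution"])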
Thank you all for your suggestions!
Also, my apologies: I did not include the raw CSV file I was trying to parse (example below):
"b5ae18d3-b6dd-4d0a-84fe-7c43df472571"|"Climate_Rapid_Change_W18.pdf"|"18563b1e-a467-44b3-aed7-3607a1acd712"|"001"|"c86c8c8d-dca6-41cd-a010-a83e40d93e75"|"CLIMATE 102"|"Extreme Weather"|"278c4561-c834-4343-a770-3f544966f633"|"Which European city is at the same latitude as Ann Arbor?"|"{"choices":["Stockholm, Sweden","Berlin, Germany","London, England","Paris, France","Madrid, Spain"],"type":"MultipleChoice"}"|"{"solution":[4],"selectAll":false,"type":"MultipleChoice"}"|"2019-01-31 22:11:08.000"|"81392cd3-28e9-4e2e-8a33-018104b1f4d1"|"{"solution":[3,4],"type":"MultipleChoice"}"|"0"|"2db10c95-b507-4211-8244-394361148b22"|"Student"|"One"|"student1#umich.edu"|"Student"
"ee73fdaf-a926-4899-b0f7-9b942f1b44ad"|"6-Elbow, Wrist, Hand W19"|"48539109-529e-4359-83b9-2ae81be0532c"|"001"|"3b5b5e49-1798-4eab-86d7-186cf59149b4"|"MOVESCI 230"|"Human Musculoskeletal Anatomy"|"fcd7c673-d944-48c3-8a09-f458e03f8c44"|"What is the name of this movement?"|"{"choices":["first phalangeal joint","first proximal interphalangeal joint","first distal interphalangeal joint","first interphalangeal joint"],"type":"MultipleChoice"}"|"{"solution":[3],"selectAll":false,"type":"MultipleChoice"}"|"2019-01-31 22:07:32.000"|"9016f36c-41f5-4e14-84a9-78eea682c802"|"{"solution":[3],"type":"MultipleChoice"}"|"1"|"7184708d-4dc7-42e0-b1ea-4aca51f00fcd"|"Student"|"Two"|"student2#umich.edu"|"Student"
You are correct that the problem was the form of the CSV file.
I changed readCSV = csv.reader(csvfile) to readCSV = csv.reader(csvfile, delimiter="|", quotechar='|')
I then took the resulting list and removed the extraneous quotation marks from each element.
The rest of the program now works properly.
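For anyone hitting the same pipe-delimited variant, here is a minimal sketch of the approach described above; the reader settings are the ones mentioned (delimiter and quotechar both set to '|'), and the surrounding double quotes are then stripped from each field:

import csv

with open(filepathcsv, newline="") as csvfile:
    readCSV = csv.reader(csvfile, delimiter="|", quotechar="|")
    for row in readCSV:
        # Each field still carries its surrounding double quotes; strip them off.
        cleaned = [field.strip('"') for field in row]
        print(len(cleaned), ": ", cleaned)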
I am fairly new to Python, and I was trying to parse this string in a certain way (taken from a database):
6392079|||| 1.0|03/09/2017|PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP|INSULIN INFUSION PUMP / SENSOR AUGMENTED|MEDTRONIC MINIMED|18000 DEVONSHIRE STREET||NORTHRIDGE|CA|91325||US|91325||MMT-723LNAH|MMT-723LNAH|||0LP|R|01/29/2014|OYC||Y
This is the standard format for these types of strings:
MDR_REPORT_KEY|DEVICE_EVENT_KEY|IMPLANT_FLAG|DATE_REMOVED_FLAG|DEVICE_SEQUENCE_NO|DATE_RECEIVED|BRAND_NAME|GENERIC_NAME|MANUFACTURER_D_NAME|MANUFACTURER_D_ADDRESS_1|MANUFACTURER_D_ADDRESS_2|MANUFACTURER_D_CITY|MANUFACTURER_D_STATE_CODE|MANUFACTURER_D_ZIP_CODE|MANUFACTURER_D_ZIP_CODE_EXT|MANUFACTURER_D_COUNTRY_CODE|MANUFACTURER_D_POSTAL_CODE|EXPIRATION_DATE_OF_DEVICE|MODEL_NUMBER|CATALOG_NUMBER|LOT_NUMBER|OTHER_ID_NUMBER|DEVICE_OPERATOR|DEVICE_AVAILABILITY|DATE_RETURNED_TO_MANUFACTURER|DEVICE_REPORT_PRODUCT_CODE|DEVICE_AGE_TEXT|DEVICE_EVALUATED_BY_MANUFACTUR
Is there any way I can print out this string with the specific field name next to each value?
For example as an output I would like to have
Report key: 6392079
Device sequence number: 1.0
Date received: 03/09/2017
Brand name: PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP
and so on with the other values. I think I would need to use the "|" as a divider to separate the data, but I'm not sure how. I also cannot rely on hard-coded index positions alone, because there are many variations of the string above, all of different lengths.
Also, as you can see in the string, some of the data, such as device_event_key, implant_flag, and date_removed_flag, is absent, but the corresponding positions between the vertical bars are still there, just empty.
Any help would be greatly appreciated, thanks.
@nsortur, you can try the code below to get the output.
I have used a list comprehension, the zip() function, and the split() and join() string methods.
You can run the code online at http://rextester.com/MBDXB29573 (it works with both Python 2 and Python 3).
string1 = "6392079|||| 1.0|03/09/2017|PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP|INSULIN INFUSION PUMP / SENSOR AUGMENTED|MEDTRONIC MINIMED|18000 DEVONSHIRE STREET||NORTHRIDGE|CA|91325||US|91325||MMT-723LNAH|MMT-723LNAH|||0LP|R|01/29/2014|OYC||Y"
keys = ["Report key", "Device sequence number", "Date received", "Brand name"]
# Drop empty fields so the remaining values line up with the keys above.
values = [field.strip() for field in string1.split("|") if field.strip()]
output = "\n".join([key + ": " + value for key, value in zip(keys, values)])
print(output)
Output:
Report key: 6392079
Device sequence number: 1.0
Date received: 03/09/2017
Brand name: PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP
Use zip to merge the two lists into tuple pairs:
data = '6392079|||| 1.0|03/09/2017|PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP|INSULIN INFUSION PUMP / SENSOR AUGMENTED|MEDTRONIC MINIMED|18000 DEVONSHIRE STREET||NORTHRIDGE|CA|91325||US|91325||MMT-723LNAH|MMT-723LNAH|||0LP|R|01/29/2014|OYC||Y'
format = 'MDR_REPORT_KEY|DEVICE_EVENT_KEY|IMPLANT_FLAG|DATE_REMOVED_FLAG|DEVICE_SEQUENCE_NO|DATE_RECEIVED|BRAND_NAME|GENERIC_NAME|MANUFACTURER_D_NAME|MANUFACTURER_D_ADDRESS_1|MANUFACTURER_D_ADDRESS_2|MANUFACTURER_D_CITY|MANUFACTURER_D_STATE_CODE|MANUFACTURER_D_ZIP_CODE|MANUFACTURER_D_ZIP_CODE_EXT|MANUFACTURER_D_COUNTRY_CODE|MANUFACTURER_D_POSTAL_CODE|EXPIRATION_DATE_OF_DEVICE|MODEL_NUMBER|CATALOG_NUMBER|LOT_NUMBER|OTHER_ID_NUMBER|DEVICE_OPERATOR|DEVICE_AVAILABILITY|DATE_RETURNED_TO_MANUFACTURER|DEVICE_REPORT_PRODUCT_CODE|DEVICE_AGE_TEXT|DEVICE_EVALUATED_BY_MANUFACTUR'
for label, value in zip(format.split('|'), data.split('|')):
    print("%s: %s" % (label.replace('_', ' ').capitalize(), value))
This outputs:
Mdr report key: 6392079
Device event key:
Implant flag:
Date removed flag:
Device sequence no: 1.0
Date received: 03/09/2017
Brand name: PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP
Generic name: INSULIN INFUSION PUMP / SENSOR AUGMENTED
Manufacturer d name: MEDTRONIC MINIMED
Manufacturer d address 1: 18000 DEVONSHIRE STREET
Manufacturer d address 2:
Manufacturer d city: NORTHRIDGE
Manufacturer d state code: CA
Manufacturer d zip code: 91325
Manufacturer d zip code ext:
Manufacturer d country code: US
Manufacturer d postal code: 91325
Expiration date of device:
Model number: MMT-723LNAH
Catalog number: MMT-723LNAH
Lot number:
Other id number:
Device operator: 0LP
Device availability: R
Date returned to manufacturer: 01/29/2014
Device report product code: OYC
Device age text:
Device evaluated by manufactur: Y
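If you would rather keep the parsed record around for later lookups instead of only printing it, the same zip can feed a dict; a small sketch building on the format and data strings above (record is just an illustrative name):

# Map each field name to its (possibly empty) value.
record = dict(zip(format.split('|'), data.split('|')))

print(record['BRAND_NAME'])     # PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP
print(record['DATE_RECEIVED'])  # 03/09/2017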
This can be achieved with the str.split() method: split('|') yields empty strings for the empty values between two | characters, and you can then match the result against a dict that maps each attribute name to its index.
query = '6392079|||| 1.0|03/09/2017|PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP|INSULIN INFUSION PUMP / SENSOR AUGMENTED|MEDTRONIC MINIMED|18000 DEVONSHIRE STREET||NORTHRIDGE|CA|91325||US|91325||MMT-723LNAH|MMT-723LNAH|||0LP|R|01/29/2014|OYC||Y'
def get_detail(str_):
    key_finder = {'Report Key': 0, 'Device Sequence Number': 4, 'Device Received': 5, 'Brand Name': 6}
    # Strip each field so values like ' 1.0' print cleanly.
    split_by = [field.strip() for field in str_.split('|')]
    print('Report Key : {}'.format(split_by[key_finder['Report Key']]))
    print('Device Seq Num : {}'.format(split_by[key_finder['Device Sequence Number']]))
    print('Device Received : {}'.format(split_by[key_finder['Device Received']]))
    print('Brand Name : {}'.format(split_by[key_finder['Brand Name']]))
>>> get_detail(query)
Report Key : 6392079
Device Seq Num : 1.0
Device Received : 03/09/2017
Brand Name : PARADIGM REAL-TIME REVEL INSULIN INFUSION PUMP
This works because the split string is indexed from 0, so the Report Key value sits at index 0 of the split string, and so on for the other values. Each field is looked up through the key_finder dict, which stores the index for each value.