How to create an object from a Python array

I have the following structure, which I build from a .txt file with pandas:
[[000001, 'PEPE ', 'S', 'LAST_NAME ', 'CIP ', 'CELLPHONE'],
[0000002, 'LUIS ', 'S', 'ADRESS ', ' ', 'nan'],
[0000003, 'PEDRO ', 'S', 'STREET ', 'CITY', ' nan']]
My code
import pandas as pd
file = 'C:\\Users\\Admin\\Desktop\\PRUEBA.txt'
columns = ("service", "name", "Active", "reference1", "reference2", "reference3")
df = pd.read_csv(file, sep="|", names=columns, header=None)
cl = df.values.tolist()
print(cl)
To process it, I need to remove the empty strings and nan values, convert service to int, and create one object per service and reference, like this:
[
{ service: 1, name: 'PEPE', order: 0, ref: 'LAST_NAME' },
{ service: 1, name: 'PEPE', order: 1, ref: 'CIP' },
{ service: 1, name: 'PEPE', order: 2, ref: 'CELLPHONE' },
{ service: 2, name: 'LUIS', order: 0, ref: 'ADRESS' },
{ service: 3, name: 'PEDRO', order: 0, ref: 'STREET' },
{ service: 3, name: 'PEDRO', order: 1, ref: 'CITY' }
]
How can I achieve this? I would be very grateful for your comments.

Key: use df.melt() to unpivot the table, then df.to_dict(orient='records') to convert the dataframe to a list of record dicts, as mentioned by @QuangHoang. The rest is regular filtering and minor adjustments.
import pandas as pd

# data
ls = [['000001', 'PEPE ', 'S', 'LAST_NAME ', 'CIP ', 'CELLPHONE'],
      ['0000002', 'LUIS ', 'S', 'ADRESS ', ' ', 'nan'],
      ['0000003', 'PEDRO ', 'S', 'STREET ', 'CITY', ' nan']]
df = pd.DataFrame(ls, columns=("service", "name", "Active", "reference1", "reference2", "reference3"))
# convert service to int and strip whitespace from every other column
for col in df:
    if col == "service":
        df[col] = df[col].astype(int)
    else:
        df[col] = df[col].str.strip()  # .str accessor
# unpivot and adjust; a stable sort keeps the reference1..3 order within each service
df2 = df.melt(id_vars=["service", "name"],
              value_vars=["reference1", "reference2", "reference3"],
              value_name="ref")\
        .sort_values(by="service", kind="mergesort")\
        .drop("variable", axis=1)\
        .reset_index(drop=True)
# filter out empty or nan; copy so the new column below is added to an independent frame
df2 = df2[~df2["ref"].isin(["", "nan"])].copy()
# generate order numbering by group
df2["order"] = df2.groupby("service").cumcount()
df2 = df2[["service", "name", "order", "ref"]]  # reorder columns
# convert to a list of record dicts
df2.to_dict(orient='records')
Out[99]:
[{'service': 1, 'name': 'PEPE', 'order': 0, 'ref': 'LAST_NAME'},
{'service': 1, 'name': 'PEPE', 'order': 1, 'ref': 'CIP'},
{'service': 1, 'name': 'PEPE', 'order': 2, 'ref': 'CELLPHONE'},
{'service': 2, 'name': 'LUIS', 'order': 0, 'ref': 'ADRESS'},
{'service': 3, 'name': 'PEDRO', 'order': 0, 'ref': 'STREET'},
{'service': 3, 'name': 'PEDRO', 'order': 1, 'ref': 'CITY'}]
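For comparison, the same records can be built without pandas. This is only a sketch, not the method from the answer above: build_records is a hypothetical helper name, and it assumes every row has the six-column layout shown in the question (service, name, Active, then the references).

```python
# Plain-Python sketch (no pandas). build_records is a hypothetical helper,
# and the rows are assumed to follow the six-column layout from the question.
rows = [
    ["000001", "PEPE ", "S", "LAST_NAME ", "CIP ", "CELLPHONE"],
    ["0000002", "LUIS ", "S", "ADRESS ", " ", "nan"],
    ["0000003", "PEDRO ", "S", "STREET ", "CITY", " nan"],
]


def build_records(rows):
    """Return one dict per non-empty reference, numbered per service."""
    records = []
    for service, name, _active, *refs in rows:
        # str() also turns a float NaN into the text "nan" before filtering.
        cleaned = [str(ref).strip() for ref in refs]
        kept = [ref for ref in cleaned if ref not in ("", "nan")]
        for order, ref in enumerate(kept):
            records.append({
                "service": int(service),
                "name": str(name).strip(),
                "order": order,
                "ref": ref,
            })
    return records


records = build_records(rows)
for record in records:
    print(record)
```

Because order comes from enumerate over the references that survive filtering, the numbering restarts at 0 for each row, which matches the requested output as long as each service appears on a single row.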

Related

Flattening JSON to a pandas DataFrame

I'm trying to flatten this JSON into a pandas dataframe, but it's getting the better of me.
[{
'contact': {
'id': 101,
'email': 'email1#address.com',
},
'marketingPreference': [{
'marketingId': 1093,
'isOptedIn': True,
'dateModifed': '2022-05-10T14:29:24Z'
}]
},
{
'contact': {
'id': 102,
'email': 'email2#address.com',
},
'marketingPreference': [{
'marketingId': 1093,
'isOptedIn': True,
'dateModifed': '2022-05-10T14:29:24Z'
}]
}
]
I am looking for the columns to be: Id, Email, MarketingId, IsOptedIn, DateModifed.
Even though marketingPreference is an array, there is only ever one json object inside.
You can use pd.json_normalize:
df = pd.json_normalize(data, record_path='marketingPreference', meta=[['contact', 'id'], ['contact', 'email']])
print(df)
   marketingId  isOptedIn           dateModifed  contact.id       contact.email
0         1093       True  2022-05-10T14:29:24Z         101  email1#address.com
1         1093       True  2022-05-10T14:29:24Z         102  email2#address.com
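To see what this flattening does, here is a plain-Python sketch that produces the same rows by hand. It is an illustration rather than a replacement for pd.json_normalize; the variable name flat_rows and the short column names id and email are choices made for this example.

```python
# Plain-Python sketch of the flattening step; flat_rows is an illustrative name.
data = [
    {
        "contact": {"id": 101, "email": "email1#address.com"},
        "marketingPreference": [
            {"marketingId": 1093, "isOptedIn": True,
             "dateModifed": "2022-05-10T14:29:24Z"},
        ],
    },
    {
        "contact": {"id": 102, "email": "email2#address.com"},
        "marketingPreference": [
            {"marketingId": 1093, "isOptedIn": True,
             "dateModifed": "2022-05-10T14:29:24Z"},
        ],
    },
]

# One flat row per (contact, preference) pair; keys keep the source spelling.
flat_rows = [
    {
        "id": entry["contact"]["id"],
        "email": entry["contact"]["email"],
        **preference,
    }
    for entry in data
    for preference in entry["marketingPreference"]
]

for row in flat_rows:
    print(row)
```

Passing flat_rows to pd.DataFrame(flat_rows) would then give one column per key, with id and email first.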

Inserting into python dictionary without quotes

First of all, thank you to anybody who can help me here. I am a Python beginner, and having searched high and low I cannot find the answer, so I am reluctantly posting for help. I am trying to build an API request in Python. I followed a tutorial and can make it work with fixed values, but I need to swap the "metrics" out for my own: a list of around 200 entries that is not sequential.
A working example is as follows:
body = {
'rsid': 'vrs_xxx_abgglobalvrs',
'globalFilters': [
{
'type': 'dateRange',
'dateRange': '2022-01-05T00:00:00.000/2022-02-04T00:00:00.000'
}
],
'metricContainer': {
'metrics': [{
"columnId": "0",
"id": "metrics/event13"
},
{
"columnId": "1",
"id": "metrics/event23"
},
{
'columnId': '2',
'id': 'metrics/event149'
}
]
},
'dimension': 'variables/daterangeday',
'settings': {
'countRepeatInstances': 'true',
'limit': 50,
'page': 0,
'dimensionSort': 'asc'
}
}
If you print this the results show as the following:
....0/2022-02-04T00:00:00.000'}], 'metricContainer': {'metrics': [{'columnId': '0', 'id': 'metrics/event1'}, {'columnId': '1', 'id': 'metrics/event2'}, {'columnId': '2', 'id': 'metrics/event45'}]}, 'dimension': 'variabl...
However, when I build the value dynamically and update the body dictionary, I get an extra quote at the start and end of my dynamic value:
....0/2022-02-04T00:00:00.000'}], 'metricContainer': {'metrics': [**"**{'columnId':'0','id': 'metrics/event1'},{'columnId': '1','id':......
For reference this dynamic value (string) is created by using a list of events generated in the following way:
list_of_events = df['id'].tolist()
list_of_cleaned_event = []
metricstring = ""
columnId = 0
strcolumn = str(columnId)
for events in list_of_events[0:3]:
    metric = str("{'columnId': '"+strcolumn+"','id': '"+events+"'},")
    columnId += 1
    strcolumn = str(columnId)
    list_of_cleaned_event.append(metric)
for i in list_of_cleaned_event:
    metricstring=metricstring+i
final = (metricstring[:-1])
and the body looks like this:
body = {
'rsid': 'vrs_avisbu0_abgglobalvrs',
'globalFilters': [
{
'type': 'dateRange',
'dateRange': '2022-01-05T00:00:00.000/2022-02-04T00:00:00.000'
}
],
'metricContainer': {
'metrics': [final]
},
'dimension': 'variables/daterangeday',
'settings': {
'countRepeatInstances': 'true',
'limit': 50,
'page': 0,
'dimensionSort': 'asc'
}
}
Try this:
final = []
list_of_events = ["metrics/event13", "metrics/event20", "metrics/event25"]
for columnId, events in enumerate(list_of_events[0:3]):
    # build a dict, not a string, so no extra quotes appear
    metric = {'columnId': str(columnId), 'id': events}
    final.append(metric)
Then define body as below:
body = {
'rsid': 'vrs_avisbu0_abgglobalvrs',
'globalFilters': [
{
'type': 'dateRange',
'dateRange': '2022-01-05T00:00:00.000/2022-02-04T00:00:00.000'
}
],
'metricContainer': {
'metrics': final
},
'dimension': 'variables/daterangeday',
'settings': {
'countRepeatInstances': 'true',
'limit': 50,
'page': 0,
'dimensionSort': 'asc'
}
}
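The extra quotes come from putting one long string inside the list, where the API expects a list of dicts. The sketch below contrasts the two; event_ids reuses the event ids from the working example in the question, and as_text only imitates the string-building approach for comparison.

```python
import json

# Event ids taken from the working example in the question.
event_ids = ["metrics/event13", "metrics/event23", "metrics/event149"]

# A list of dicts: each metric stays a real object.
metrics = [
    {"columnId": str(index), "id": event_id}
    for index, event_id in enumerate(event_ids)
]

# A single string that merely looks like dicts (the approach that went wrong).
as_text = ",".join(str(metric) for metric in metrics)

# Serialized as three JSON objects.
print(json.dumps({"metrics": metrics}))
# Serialized as one quoted string inside the list.
print(json.dumps({"metrics": [as_text]}))
```

With around 200 events the comprehension stays the same; only event_ids grows.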

Spotipy - Listing only track and artists names in a playlist

Hello all, and thank you in advance for your help :)
Can someone help me understand how I can take the code below, which displays data for a specified playlist, and have it show only the artist and track names? I have been toying around with the API documentation for several hours and have not been able to make heads or tails of it. Right now it prints a whole jumble of data. Also, note that I put dummy values in the client_id and secret parts of this code.
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy
import json
PlaylistExample = '37i9dQZEVXbMDoHDwVN2tF'
cid = '123'
secret = 'xyz'
auth_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(auth_manager=auth_manager)
playlist_id = 'spotify:user:spotifycharts:playlist:37i9dQZEVXbJiZcmkrIHGU'
results = sp.playlist(playlist_id)
print(json.dumps(results, indent=4))
Would something like this be useful?:
print("Song - Artist - Album\n")
for item in results['tracks']['items']:
    print(
        item['track']['name'] + ' - ' +
        item['track']['artists'][0]['name'] + ' - ' +
        item['track']['album']['name']
    )
Your output will look similar to this:
Song - Artist - Album
ONLY - ZHU - ONLY
Bad - 2012 Remaster - Michael Jackson - Bad 25th Anniversary
Orion - Rodrigo y Gabriela - Rodrigo y Gabriela
Shape of You - Ed Sheeran - ÷ (Deluxe)
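If a track has several artists, the loop above prints only the first one. The following is a hedged sketch of a small helper that joins all artist names; track_lines is a hypothetical name, sample_results is hand-written data shaped like the response used above (not real API output), and the entry whose track is None is an assumption about how unavailable tracks might appear.

```python
# Hand-written sample shaped like the playlist response; not real API output.
# "Song A", "Artist One" and "Artist Two" are made-up placeholder names.
sample_results = {
    "tracks": {
        "items": [
            {"track": {"name": "ONLY", "artists": [{"name": "ZHU"}]}},
            {"track": {"name": "Song A",
                       "artists": [{"name": "Artist One"}, {"name": "Artist Two"}]}},
            {"track": None},  # assumed possible for unavailable tracks
        ]
    }
}


def track_lines(results):
    """Return 'track - artist[, artist...]' strings for each usable item."""
    lines = []
    for item in results["tracks"]["items"]:
        track = item.get("track")
        if not track:
            continue  # skip entries without track data
        artists = ", ".join(artist["name"] for artist in track["artists"])
        lines.append(f"{track['name']} - {artists}")
    return lines


for line in track_lines(sample_results):
    print(line)
```

Calling track_lines(results) with the dictionary returned by sp.playlist(playlist_id) should work the same way, provided the response has the tracks/items/track layout shown in this answer.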
Alternatively, you could create your own structure based on the one Spotify returns, keeping only what you need:
result_dict = {
'tracks': {
'items': [],
'limit': 100,
'next': None,
'offset': 0,
'previous': None,
'total': 16
},
'type': 'playlist',
'uri': '<playlist_uri>'
}
And your track structure that goes inside 'items' from above:
track_dict = {
'track': {
'album': {
'name': item['track']['album']['name'],
},
'artists': [{
'name': item['track']['artists'][0]['name'],
}],
'name': item['track']['name'],
}
}
Then iterate and insert one by one:
for item in results['tracks']['items']:
    track_dict = {
        'track': {
            'album': {
                'name': item['track']['album']['name'],
            },
            'artists': [{
                'name': item['track']['artists'][0]['name'],
            }],
            'name': item['track']['name'],
        }
    }
    # Append the track dict structure to your results dict structure
    result_dict['tracks']['items'].append(track_dict)
Printing result_dict then gives:
{
'tracks': {
'items': [{
'track': {
'album': {
'name': 'ONLY'
},
'artists': [{
'name': 'ZHU'
}],
'name': 'ONLY'
}
}, {
'track': {
'album': {
'name': 'Bad 25th Anniversary'
},
'artists': [{
'name': 'Michael Jackson'
}],
'name': 'Bad - 2012 Remaster'
}
}, {
'track': {
'album': {
'name': 'Rodrigo y Gabriela'
},
'artists': [{
'name': 'Rodrigo y Gabriela'
}],
'name': 'Orion'
}
}, {
'track': {
'album': {
'name': '÷ (Deluxe)'
},
'artists': [{
'name': 'Ed Sheeran'
}],
'name': 'Shape of You'
}
}],
'limit': 100,
'next': None,
'offset': 0,
'previous': None,
'total': 4
},
'type': 'playlist',
'uri': '<playlist_uri>'
}

Returning wrong intent should be buy_food not hello

I'm new to node-nlp. According to the examples I saw and the data I provided, I should get a buy_food intent, but it returns the hello intent instead. If I remove the manager.addDocument('en', 'hi', 'hello') line, it returns the right intent.
Any suggestions?
const { NlpManager } = require('node-nlp');
const manager = new NlpManager({ languages: 'en', ner: { threshold: 0.8 } });
manager.addNamedEntityText('drink', 'sprite', ['en'], ['sprite']);
manager.addNamedEntityText('size', 'large', ['en'], ['large', 'big']);
manager.addDocument('en', 'hello', 'hello');
manager.addDocument('en', 'hi', 'hello');
manager.addDocument('en', '%size% %drink%', 'buy_food');
manager.addDocument('en', 'i want %drink% please', 'buy_food');
manager.addDocument('en', 'i want %size% %drink% please', 'buy_food');
manager.addDocument('en', 'i want a %drink% please', 'buy_food');
manager.addDocument('en', 'i want a %size% %drink% please', 'buy_food');
manager.addDocument('en', 'i want to order a %size% %drink%', 'buy_food');
manager.addDocument('en', 'i want to order a %drink%', 'buy_food');
manager.addDocument('en', 'bye', 'bye');
manager.addDocument('en', 'bye bye', 'bye');
manager.addAnswer('en', 'hello', 'Welcome! How may I help you?');
manager.addAnswer('en', 'bye', 'Till next time');
manager.addAnswer('en', 'bye', 'see you soon!');
manager.addAnswer('en', 'buy_food', 'Got it, what else?');
manager.addAnswer('en', 'buy_food', 'Ok. Next item.');
var text = 'i want a large sprite';
(async () => {
await manager.train();
var result= await manager.process('en', text);
console.log(result);
})();
The result is
{ utterance: 'i want a large sprite',
locale: 'en',
languageGuessed: false,
localeIso2: 'en',
language: 'English',
domain: 'default',
classifications:
[ { label: 'hello', value: 0.5484776863751949 },
{ label: 'buy_food', value: 0.36024868417489125 },
{ label: 'bye', value: 0.09127362944991388 } ],
intent: 'hello',
score: 0.5484776863751949,
entities:
[ { start: 9,
end: 13,
len: 5,
levenshtein: 0,
accuracy: 1,
option: 'large',
sourceText: 'large',
entity: 'size',
utteranceText: 'large' },
{ start: 15,
end: 20,
len: 6,
levenshtein: 0,
accuracy: 1,
option: 'sprite',
sourceText: 'sprite',
entity: 'drink',
utteranceText: 'sprite' } ],
sentiment:
{ score: 0.29699999999999993,
comparative: 0.05939999999999999,
vote: 'positive',
numWords: 5,
numHits: 6,
type: 'senticon',
language: 'en' },
actions: [],
srcAnswer: 'Welcome! How may I help you?',
answer: 'Welcome! How may I help you?' }

Transform a list of dicts into a simpler dict

I have a list of dicts like this:
[{
'attr': 'bla',
'status': '1',
'id': 'id1'
}, {
'attr': 'bla',
'status': '1',
'id': 'id2'
}, {
'attr': 'bli',
'status': '0',
'id': 'id1'
}, {
'attr': 'bli',
'status': '1',
'id': 'id2'
}]
I want to get a simpler result dict like this:
result = {
'bla' : True,
'bli' : False
}
If both ids have a status of '1' for an attr, the value should be True; otherwise it should be False.
I've tried:
for elem in dict:
    for key, value in enumerate(elem):
        # ???
But I don't see how to do it. I've also tried something like:
if all( val == '1' for val in list ):
    # ..
Here you go:
dicts = [{
'attr': 'bla',
'status': '1',
'id': 'id1'
}, {
'attr': 'bla',
'status': '1',
'id': 'id2'
}, {
'attr': 'bli',
'status': '0',
'id': 'id1'
}, {
'attr': 'bli',
'status': '1',
'id': 'id2'
}]
# First pass: create every necessary key in the new
# dictionary so the `and` operator can be applied to it later.
newDict = {}
for dictio in dicts:
    for key, value in dictio.items():
        if key == 'attr':
            newDict[value] = True
# Second pass: combine the statuses with the `and` operator.
# This relies on 'attr' coming before 'status' in each dict.
for dictio in dicts:
    for key, value in dictio.items():
        if key == 'attr':
            tmpAttr = value
        if key == 'status':
            newDict[tmpAttr] = newDict[tmpAttr] and (value == '1')
print(newDict)
Have a nice day!
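The two passes can also be folded into one by looking keys up directly instead of iterating over every item. This is a sketch of that variant, not the answer's original code; result is simply the name used in the question.

```python
dicts = [
    {"attr": "bla", "status": "1", "id": "id1"},
    {"attr": "bla", "status": "1", "id": "id2"},
    {"attr": "bli", "status": "0", "id": "id1"},
    {"attr": "bli", "status": "1", "id": "id2"},
]

result = {}
for entry in dicts:
    attr = entry["attr"]
    # Start from True, then stay True only while every status seen is '1'.
    result[attr] = result.get(attr, True) and entry["status"] == "1"

print(result)  # {'bla': True, 'bli': False}
```

Direct key lookups also remove the dependence on the order of keys inside each dict.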
