Creating a 'SS' item in DynamoDB using boto3 - python-3.x

I'm trying to create an item in AWS DynamoDB using boto3, and regardless of what I try I can't manage to get an attribute of type 'SS' created. Here's my code:
client = boto3.resource('dynamodb', region_name=region)
table = client.Table(config[region]['table'])
sched = {
    "begintime": begintime,
    "description": description,
    "endtime": endtime,
    "name": name,
    "type": "period",
    "weekdays": [weekdays]
}
table.put_item(Item=sched)
The other columns work fine, but regardless of what I try, weekdays always ends up as an 'S' type. For reference, this is what one of the other items from the same table looks like:
{'begintime': '09:00', 'endtime': '18:00', 'description': 'Office hours', 'weekdays': {'mon-fri'}, 'name': 'office-hours', 'type': 'period'}
Trying to convert this to a Python structure obviously fails so I'm not sure how it's possible to insert a new item.

To indicate an attribute of type SS (String Set) using the boto3 DynamoDB resource-level methods, you need to supply a set rather than a simple list. For example:
import boto3
res = boto3.resource('dynamodb', region_name=region)
table = res.Table(config[region]['table'])
sched = {
    "begintime": '09:00',
    "description": 'Hello there',
    "endtime": '14:00',
    "name": 'james',
    "type": "period",
    "weekdays": set(['mon', 'wed', 'fri'])
}
table.put_item(Item=sched)
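To make the difference between a list and a set concrete, here is a minimal sketch of the mapping rule described above. The function `dynamodb_type_tag` is a hypothetical illustration written for this answer, not part of boto3, and it only covers the three cases relevant here (string, list, set of strings).

```python
def dynamodb_type_tag(value):
    """Return the DynamoDB type descriptor a Python value maps to (simplified sketch)."""
    if isinstance(value, str):
        return "S"   # plain string
    if isinstance(value, list):
        return "L"   # a list becomes a List, not a String Set
    if isinstance(value, set):
        if not value:
            # DynamoDB does not allow empty sets
            raise ValueError("DynamoDB does not allow empty sets")
        if all(isinstance(member, str) for member in value):
            return "SS"  # set of strings becomes a String Set
        raise TypeError("this sketch only handles sets of strings")
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


weekdays = "mon,wed,fri"              # assumed input format, for illustration only
as_list = [weekdays]                  # what the question does
as_set = set(weekdays.split(","))     # what the answer suggests

print(dynamodb_type_tag(weekdays))  # S
print(dynamodb_type_tag(as_list))   # L
print(dynamodb_type_tag(as_set))    # SS
```

Only the last form gives a String Set, which is why wrapping the value in a list never produces 'SS'.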

As a follow-up to @jarmod's answer:
If you want to call update_item with a String Set, pass a set through the ExpressionAttributeValues parameter, as shown below:
entry = table.update_item(
    # Key={...} is required as well: supply your table's primary key here
    # (key attributes must not appear in UpdateExpression).
    ExpressionAttributeNames={
        "#begintime": "begintime",
        "#description": "description",
        "#endtime": "endtime",
        "#name": "name",
        "#type": "type",
        "#weekdays": "weekdays"
    },
    ExpressionAttributeValues={
        ":begintime": '09:00',
        ":description": 'Hello there',
        ":endtime": '14:00',
        ":name": 'james',
        ":type": "period",
        ":weekdays": set(['mon', 'wed', 'fri'])
    },
    UpdateExpression="""
        SET #begintime = :begintime,
            #description = :description,
            #endtime = :endtime,
            #name = :name,
            #type = :type,
            #weekdays = :weekdays
    """
)
(Hint: AttributeUpdates, the update_item counterpart of Item in put_item calls, is deprecated, so I recommend using ExpressionAttributeNames, ExpressionAttributeValues and UpdateExpression instead.)
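Writing the three expression parameters by hand gets repetitive, so here is a small sketch that derives them from a plain dict. `build_update_kwargs` is a hypothetical helper (not a boto3 function), and it assumes the attribute names are simple identifiers that are safe to reuse as placeholder names.

```python
def build_update_kwargs(attributes):
    """Build the expression parameters for update_item from a dict (sketch)."""
    # "#attr" placeholders avoid clashes with DynamoDB reserved words such as "name"
    names = {f"#{name}": name for name in attributes}
    # ":attr" placeholders carry the values; a Python set is passed through unchanged
    values = {f":{name}": value for name, value in attributes.items()}
    assignments = ", ".join(f"#{name} = :{name}" for name in attributes)
    return {
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "UpdateExpression": f"SET {assignments}",
    }


kwargs = build_update_kwargs({"endtime": "14:00", "weekdays": {"mon", "wed", "fri"}})
print(kwargs["UpdateExpression"])  # SET #endtime = :endtime, #weekdays = :weekdays
# Usage (assumption: Key must match your own table's primary key):
# table.update_item(Key={...}, **kwargs)
```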

Related

Python: How to populate nested dict with attributes and values from parsed XML string?

I have a dict containing identifiers as keys, with an XML string as their respective values. I want to parse the attributes and values from the XML and automagically populate a dict with them, under their respective identifier keys.
import xml.etree.ElementTree as etree
employees = {
    'employee_0': '<Person><Attribute name="name"><Value>Bill Johnson</Value></Attribute><Attribute name="city"><Value>New York</Value></Attribute><Attribute name="email"><Value>bill.johnson@email.com</Value></Attribute></Person>',
    'employee_1': '<Person><Attribute name="name"><Value>Amanda Philips</Value></Attribute><Attribute name="city"><Value>Los Angeles</Value></Attribute><Attribute name="email"><Value>amanda.philips@email.com</Value></Attribute></Person>'
}
for identifier_key in employees:
    xml = etree.fromstring(employees[identifier_key])
    for key in xml:
        key_str = key.attrib["name"]
        for value in key:
            value_str = value.text
            employees[identifier_key][key_str] = value_str
I want the employees dict to result in this:
{
    "employee_0": {
        "name": "Bill Johnson",
        "city": "New York",
        "email": "bill.johnson@email.com"
    },
    "employee_1": {
        "name": "Amanda Philips",
        "city": "Los Angeles",
        "email": "amanda.philips@email.com"
    }
}
But in the code above, we get a TypeError: 'str' object does not support item assignment. My questions are:
Why do we get this error? It seems like this should be the proper way to populate the dict. If I instead use employees[identifier_key] = { key_str: value_str } it will overwrite the previous iteration. I have tried .update() too, without luck. How can this operation be accomplished?
How can the operation be accomplished in a nice and clean way, e.g. using dict comprehension? I'm having difficulty putting together the syntax for it.
Another method.
from simplified_scrapy import SimplifiedDoc
employees = {
    'employee_0': '<Person><Attribute name="name"><Value>Bill Johnson</Value></Attribute><Attribute name="city"><Value>New York</Value></Attribute><Attribute name="email"><Value>bill.johnson@email.com</Value></Attribute></Person>',
    'employee_1': '<Person><Attribute name="name"><Value>Amanda Philips</Value></Attribute><Attribute name="city"><Value>Los Angeles</Value></Attribute><Attribute name="email"><Value>amanda.philips@email.com</Value></Attribute></Person>'
}
for identifier_key in employees:
    dic = {}
    xml = SimplifiedDoc(employees[identifier_key])
    for attr in xml.Attributes:
        dic[attr['name']] = attr.text
    employees[identifier_key] = dic
print(employees)
Result:
{'employee_0': {'name': 'Bill Johnson', 'city': 'New York', 'email': 'bill.johnson@email.com'}, 'employee_1': {'name': 'Amanda Philips', 'city': 'Los Angeles', 'email': 'amanda.philips@email.com'}}
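As for why the original loop fails: employees[identifier_key] is still the XML string at that point, and a str does not support item assignment. Building a fresh dict per employee avoids this. Below is a sketch using only the standard library and dict comprehensions; `parse_person` is a hypothetical helper name, and the sample data is shortened for illustration.

```python
import xml.etree.ElementTree as ET

employees = {
    "employee_0": '<Person><Attribute name="name"><Value>Bill Johnson</Value></Attribute>'
                  '<Attribute name="city"><Value>New York</Value></Attribute></Person>',
    "employee_1": '<Person><Attribute name="name"><Value>Amanda Philips</Value></Attribute>'
                  '<Attribute name="city"><Value>Los Angeles</Value></Attribute></Person>',
}


def parse_person(xml_text):
    """Map each <Attribute name="..."> to the text of its <Value> child."""
    root = ET.fromstring(xml_text)
    return {
        attribute.get("name"): attribute.findtext("Value")
        for attribute in root.findall("Attribute")
    }


# Build a new dict instead of assigning into the XML strings
parsed = {key: parse_person(xml_text) for key, xml_text in employees.items()}
print(parsed)
```

Keeping the parsed result in a separate dict also avoids rebinding values of the dict being iterated over.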

How to create custom field via API?

This is a self-answered question, so you won't find further details here.
After a few hours of Googling I found the solution below.
simple_salesforce
from simple_salesforce import Salesforce
def custom_field_create():
    """
    Based on https://salesforce.stackexchange.com/a/212747/65221
    Examples of field types you can find in the column "Data Type" of the salesforce front end,
    on the page where you can create/edit/delete fields for your selected object.
    NOTE: case of "type" is important. For example the type "DateTime"
    must be exactly "DateTime" and not like "datetime".
    """
    email = 'your_email'
    password = 'your_password'
    security_token = 'your_token'
    object_api_name = 'contact'  # replace with your object name
    field_api_name = 'Activity_Time'  # replace with your field name
    field_label = 'Activity Time'  # replace with your field label
    sf = Salesforce(username=email, password=password, security_token=security_token)
    url = 'tooling/sobjects/CustomField/'
    payload = {
        "Metadata":
            {"type": "Text", "inlineHelpText": "", "precision": None, "label": f"{field_label}", "length": 90, "required": False},
        "FullName": f"{object_api_name}.{field_api_name}__c"
    }
    result = sf.restful(url, method='POST', json=payload)
    print('result:', result)

How to insert another item programmatically into body?

I am trying to build a free/busy request body for the Google Calendar API in Python 3.8. However, when I try to insert new items into the request body, I get a bad request error and can't use it.
This code is working:
SUBJECTA = '3131313636@resource.calendar.google.com'
SUBJECTB = '34343334@resource.calendar.google.com'
body = {
    "timeMin": now,
    "timeMax": nownext,
    "timeZone": 'America/New_York',
    "items": [{'id': SUBJECTA}, {"id": SUBJECTB}]
}
Good Body result:
{'timeMin': '2019-11-05T11:42:21.354803Z',
 'timeMax': '2019-11-05T12:42:21.354823Z',
 'timeZone': 'America/New_York',
 'items': [{'id': '131313636@resource.calendar.google.com'},
           {'id': '343334@resource.calendar.google.com'}]}
However, while using this code:
items = "{'ID': '1313636@resource.calendar.google.com'},{'ID': '3383137@resource.calendar.google.com'},{'ID': '383733@resource.calendar.google.com'}"
body = {
    "timeMin": now,
    "timeMax": nownext,
    "timeZone": 'America/New_York',
    "items": items
}
The resulting body contains additional quotes at the start and end of the items value, which makes the request fail:
{'timeMin': '2019-11-05T12:04:41.189784Z',
 'timeMax': '2019-11-05T13:04:41.189804Z',
 'timeZone': 'America/New_York',
 'items': ["{'ID': 13131313636@resource.calendar.google.com},{'ID':
 53333383137@resource.calendar.google.com},{'ID':
 831383733@resource.calendar.google.com},{'ID':
 33339373237@resource.calendar.google.com},{'ID':
 393935323035@resource.calendar.google.com}"]}
What is the proper way to build the item list and send it correctly?
In your situation, the value of items is the string "{'ID': '1313636@resource.calendar.google.com'},{'ID': '3383137@resource.calendar.google.com'},{'ID': '383733@resource.calendar.google.com'}".
You want to parse that string in Python and use the result as an object.
The result you expect is [{'ID': '1313636@resource.calendar.google.com'}, {'ID': '3383137@resource.calendar.google.com'}, {'ID': '383733@resource.calendar.google.com'}].
You are already able to use the Calendar API.
If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.
Sample script:
import json  # Added
items = "{'ID': '1313636@resource.calendar.google.com'},{'ID': '3383137@resource.calendar.google.com'},{'ID': '383733@resource.calendar.google.com'}"
items = json.loads(("[" + items + "]").replace("\'", "\""))  # Added
body = {
    "timeMin": now,
    "timeMax": nownext,
    "timeZone": 'America/New_York',
    "items": items
}
print(body)
Result:
If now and nownext are set to the strings "now" and "nownext", respectively, the result is as follows.
{
    "timeMin": "now",
    "timeMax": "nownext",
    "timeZone": "America/New_York",
    "items": [
        {
            "ID": "1313636@resource.calendar.google.com"
        },
        {
            "ID": "3383137@resource.calendar.google.com"
        },
        {
            "ID": "383733@resource.calendar.google.com"
        }
    ]
}
Note:
If you can retrieve the IDs as string values, I recommend building the list directly, as in the following sample script.
ids = ['1313636@resource.calendar.google.com', '3383137@resource.calendar.google.com', '383733@resource.calendar.google.com']
items = [{'ID': id} for id in ids]
If I misunderstood your question and this is not the result you want, I apologize.
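If the string really has to be parsed, one alternative to replacing quotes before json.loads is ast.literal_eval from the standard library, which reads Python literal syntax directly and so is not thrown off by single quotes. The sketch below uses made-up calendar IDs. It also rewrites the key to lowercase "id", on the assumption that the key name matters, since the working request body in the question uses "id".

```python
import ast

# Hypothetical IDs, in the same comma-separated format as in the question
items_text = (
    "{'ID': 'room-a@resource.calendar.google.com'},"
    "{'ID': 'room-b@resource.calendar.google.com'}"
)

# Wrapping the text in brackets turns it into a Python list literal
parsed = ast.literal_eval("[" + items_text + "]")

# Normalize the key to the lowercase "id" used by the working body
items = [{"id": entry["ID"]} for entry in parsed]
print(items)
```

ast.literal_eval only evaluates literals (strings, numbers, lists, dicts and so on), so it does not execute arbitrary code the way eval would.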

Unable to map nested datasource field of cosmos db to a root index field of Azure indexer using REST APIs

I have a MongoDB collection, users, with the following data format:
{
    "name": "abc",
    "email": "abc@xyz.com",
    "address": {
        "city": "Gurgaon",
        "state": "Haryana"
    }
}
Now I'm creating a datasource, an index, and an indexer for this collection using the Azure REST APIs.
Datasource
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>"
        },
        "container": {"name": "users"},
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
Index for the above datasource
def create_index(config):
    request_body = {
        'name': "users-index",
        'fields': [
            {
                'name': 'name',
                'type': 'Edm.String'
            },
            {
                'name': 'email',
                'type': 'Edm.DateTimeOffset'
            },
            {
                'name': 'address',
                'type': 'Edm.String'
            },
            {
                'name': 'doc_id',
                'type': 'Edm.String',
                'key': True
            }
        ]
    }
    resp = requests.post(url="<azure-create-index-api-url>", data=json.dumps(request_body),
                         headers=config.headers)
Now the indexer for the above datasource and index
def create_interviews_indexer(config):
    request_body = {
        "name": "users-indexer",
        "dataSourceName": "users-datasource",
        "targetIndexName": "users-index",
        "schedule": {"interval": "PT5M"},
        "fieldMappings": [
            {"sourceFieldName": "address.city", "targetFieldName": "address"},
        ]
    }
    resp = requests.post("<create-indexer-api-url>", data=json.dumps(request_body),
                         headers=config.headers)
This creates the indexer without any exception, but when I check the retrieved data in the Azure portal for users-indexer, the address field is null: it gets no value from the address.city field mapping provided when creating the indexer.
I have also tried the following mapping, but it's also not working.
"fieldMappings": [
    {"sourceFieldName": "/address/city", "targetFieldName": "address"},
]
The Azure documentation does not say anything about this kind of mapping either, so any help would be very much appreciated.
The container element in the data source definition lets you specify a query that flattens your JSON document (Ref: https://learn.microsoft.com/en-us/rest/api/searchservice/create-data-source). So instead of mapping columns in the indexer definition, you can write a query that returns the output in the desired format.
Your code for creating the data source would then be:
def create_datasource():
    request_body = {
        "name": 'users-datasource',
        "description": "",
        "type": "cosmosdb",
        "credentials": {
            "connectionString": "<db connection url>",
        },
        "container": {
            "name": "users",
            "query": "SELECT a.name, a.email, a.address.city as address FROM a",
        },
        "dataChangeDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
            "highWaterMarkColumnName": "_ts"
        }
    }
    resp = requests.post(url="<create-datasource-api-url>", data=json.dumps(request_body),
                         headers=headers)
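To show what shape that query is meant to produce, here is a client-side sketch of the same projection in plain Python. It only illustrates the flattening and is not something Azure Search runs; `flatten_user` is a hypothetical name.

```python
def flatten_user(doc):
    """Mirror SELECT a.name, a.email, a.address.city AS address FROM a (illustration only)."""
    return {
        "name": doc.get("name"),
        "email": doc.get("email"),
        # The nested city value is lifted into the top-level "address" field
        "address": doc.get("address", {}).get("city"),
    }


user = {
    "name": "abc",
    "email": "abc@xyz.com",
    "address": {"city": "Gurgaon", "state": "Haryana"},
}
print(flatten_user(user))
```

With the document flattened this way, the index field address receives a plain string and no field mapping is needed for it.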
Support for the MongoDB API flavor is in public preview: you need to explicitly indicate Mongo in the datasource's connection string, as described in this article. Also note that, as far as I know, the custom queries suggested in the previous answer are not supported with Mongo datasources. Hopefully someone from the team will clarify the current state of this support.
It works for me with the field mapping below. The Azure Search query returns values for address correctly.
"fieldMappings": [{"sourceFieldName": "address.city", "targetFieldName": "address"}]
I did make a few changes to the code you provided: when creating the indexer, I removed the extra comma at the end of fieldMappings, and when creating the index, I kept the email field as Edm.String rather than Edm.DateTimeOffset.
Please make sure you are using the Preview API version, since MongoDB API support is in preview in Azure Search.
E.g. https://{azure search name}.search.windows.net/indexers?api-version=2019-05-06-Preview

Not Getting the Shape Right in DocumentDb Select

I'm trying to get only the person's membership info i.e. ID, name and committee memberships in a SELECT query. This is my object:
{
    "id": 123,
    "name": "John Smith",
    "memberships": [
        {
            "id": 789,
            "name": "U.S. Congress",
            "yearElected": 2012,
            "state": "California",
            "committees": [
                {
                    "id": 444,
                    "name": "Appropriations Committee",
                    "position": "Member"
                },
                {
                    "id": 555,
                    "name": "Armed Services Committee",
                    "position": "Chairman"
                },
                {
                    "id": 678,
                    "name": "Veterans' Affairs Committee",
                    "position": "Member"
                }
            ]
        }
    ]
}
In this example, John Smith is a member of the U.S. Congress and three committees in it.
The result that I'm trying to get should look like this. Again, this is the "DESIRED RESULT":
{
    "id": 789,
    "name": "U.S. Congress",
    "committees": [
        {
            "id": 444,
            "name": "Appropriations Committee",
            "position": "Member"
        },
        {
            "id": 555,
            "name": "Armed Services Committee",
            "position": "Chairman"
        },
        {
            "id": 678,
            "name": "Veterans' Affairs Committee",
            "position": "Member"
        }
    ]
}
Here's my SQL query:
SELECT m.id, m.name,
    [
        {
            "id": c.id,
            "name": c.name,
            "position": c.position
        }
    ] AS committees
FROM a
JOIN m IN a.memberships
JOIN c IN m.committees
WHERE a.id = "123"
I'm getting the following results, which contain the right data but in the wrong shape: the same membership comes back 3 times. Here's what I'm getting, which is NOT the desired result:
[
    {
        "id": 789,
        "name": "U.S. Congress",
        "committees": [
            {
                "id": 444,
                "name": "Appropriations Committee",
                "position": "Member"
            }
        ]
    },
    {
        "id": 789,
        "name": "U.S. Congress",
        "committees": [
            {
                "id": 555,
                "name": "Armed Services Committee",
                "position": "Chairman"
            }
        ]
    },
    {
        "id": 789,
        "name": "U.S. Congress",
        "committees": [
            {
                "id": 678,
                "name": "Veterans' Affairs Committee",
                "position": "Member"
            }
        ]
    }
]
As you can see here, the "U.S. Congress" membership is repeated 3 times.
The following SQL query gets me exactly what I want in Azure Query Explorer but when I pass it as the query in my code -- using DocumentDb SDK -- I don't get any of the details for the committees. I simply get blank results for committee ID, name and position. I do, however, get the membership data i.e. "U.S. Congress", etc. Here's that SQL query:
SELECT m.id, m.name, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = 123
I'm including the code that makes the DocumentDb call, with our internal comments left in to help clarify its purpose.
First, the ReadQuery function that we call whenever we need to read something from DocumentDb:
public async Task<IEnumerable<T>> ReadQuery<T>(string collectionId, string sql, Dictionary<string, object> parameterNameValueCollection)
{
    // Prepare collection self link
    var collectionLink = UriFactory.CreateDocumentCollectionUri(_dbName, collectionId);
    // Prepare query
    var query = getQuery(sql, parameterNameValueCollection);
    // Creates the query and returns IQueryable object that will be executed by the calling function
    var result = _client.CreateDocumentQuery<T>(collectionLink, query, null);
    return await result.QueryAsync();
}
The following function prepares the query -- with any parameters:
protected SqlQuerySpec getQuery(string sql, Dictionary<string, object> parameterNameValueCollection)
{
    // Declare query object
    SqlQuerySpec query = new SqlQuerySpec();
    // Set query text
    query.QueryText = sql;
    // Convert parameters received in a collection to DocumentDb parameters
    if (parameterNameValueCollection != null && parameterNameValueCollection.Count > 0)
    {
        // Go through each item in the parameters collection and process it
        foreach (var item in parameterNameValueCollection)
        {
            query.Parameters.Add(new SqlParameter($"@{item.Key}", item.Value));
        }
    }
    return query;
}
This function makes async call to DocumentDb:
public async static Task<IEnumerable<T>> QueryAsync<T>(this IQueryable<T> query)
{
    var docQuery = query.AsDocumentQuery();
    // Batches give us the ability to read data in chunks in an async fashion.
    // If we use the ToList<T>() LINQ method to read ALL the data, the call will be synchronous, which is why we prefer the batches approach.
    var batches = new List<IEnumerable<T>>();
    do
    {
        // Actual call is made to the backend DocumentDb database
        var batch = await docQuery.ExecuteNextAsync<T>();
        batches.Add(batch);
    }
    while (docQuery.HasMoreResults);
    // Because batches are collections of collections, we use the following line to merge all into a single collection.
    var docs = batches.SelectMany(b => b);
    // Return data
    return docs;
}
I just wrote a demo to test your query and I get the expected result, so I think the query is correct. You mentioned that you don't get any data when you make the call in your code; would you mind sharing that code? Perhaps there is a mistake in it. Anyway, here is my test for reference; I hope it helps.
Query used:
SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees
FROM c
JOIN m IN c.memberships
WHERE c.id = "123"
The code here is very simple; sp_db.InnerText represents a span that I used to show the result in my test page:
var docs = client.CreateDocumentQuery("dbs/" + databaseId + "/colls/" + collectionId,
    "SELECT m.id AS membershipId, m.name AS membershipName, m.committees AS committees " +
    "FROM c " +
    "JOIN m IN c.memberships " +
    "WHERE c.id = \"123\"");
foreach (var doc in docs)
{
    sp_db.InnerText += doc;
}
I think there may be a typo in the query you pass to client.CreateDocumentQuery() that makes the result come back empty. It would be better to share the code so that we can help check it.
Updates:
I just tried your code and I still get the expected result. One thing I found is that when I specify the where clause as "where c.id = \"123\"", I get the result.
However, if you leave out the escaped quotes and just use "where c.id = 123", you get nothing. I think this could be the reason. You can check whether you have run into this scenario.
Just updated my original post. All the code provided in the question is correct and works. I was having a problem because I was using aliases in the SELECT query and as a result some properties were not binding to my domain object.
The code provided in the question is correct.
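For readers wondering why the first query repeats the membership three times, the following Python sketch imitates the two query shapes on an in-memory copy of the document. It is only an analogy for how the JOINs expand rows, not DocumentDb code, and the function names are hypothetical.

```python
person = {
    "id": "123",
    "memberships": [
        {
            "id": 789,
            "name": "U.S. Congress",
            "committees": [
                {"id": 444, "name": "Appropriations Committee", "position": "Member"},
                {"id": 555, "name": "Armed Services Committee", "position": "Chairman"},
                {"id": 678, "name": "Veterans' Affairs Committee", "position": "Member"},
            ],
        }
    ],
}


def join_per_committee(doc):
    """Like JOIN m IN a.memberships JOIN c IN m.committees: one row per (m, c) pair."""
    return [
        {"id": m["id"], "name": m["name"], "committees": [c]}
        for m in doc["memberships"]
        for c in m["committees"]
    ]


def select_per_membership(doc):
    """Like JOIN m IN c.memberships with m.committees selected whole: one row per m."""
    return [
        {"id": m["id"], "name": m["name"], "committees": m["committees"]}
        for m in doc["memberships"]
    ]


print(len(join_per_committee(person)))     # 3
print(len(select_per_membership(person)))  # 1
```

Joining on committees multiplies the rows, while selecting m.committees as a whole keeps one row per membership with the full array attached.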
