I'm having trouble adding data to my Azure Cognitive Search index. The data is read from SQL Server tables by a Python script, which sends it to the index using the SearchIndexClient from the Azure Search SDK.
The problem occurs when sending Python "int" values into a search index field of type Edm.String. The link below seems to indicate that this should be possible: any number type is allowed to go into an Edm.String.
https://learn.microsoft.com/en-us/rest/api/searchservice/data-type-map-for-indexers-in-azure-search#bkmk_sql_search
However I get this error:
Cannot convert the literal '0' to the expected type 'Edm.String'.
Am I misunderstanding the docs? Is the Python int different from the SQL Server int when it goes through the Azure Search SDK?
I'm using pyodbc to connect to an Azure Synapse database and retrieving the rows with a cursor loop. This is basically what I'm doing:
search_client = SearchIndexClient(env.search_endpoint,
                                  env.search_index,
                                  SearchApiKeyCredential(env.search_api_key),
                                  logging_enable=True)

conn = pyodbc.connect(env.sqlconnstr_synapse_connstr, autocommit=True)
query = f"SELECT * FROM [{env.source_schema}].[{source_table}]"
cursor = conn.cursor()
cursor.execute(query)
source_table_columns = [source_table_column[0] for source_table_column in cursor.description]

rows = []
for source_table_values in cursor.fetchmany(MAX_ROWS_TO_FETCH):
    source_table_row = dict(zip(source_table_columns,
                                source_table_values))
    rows.append(source_table_row)

upload = search_client.upload_documents(documents=rows)
If a row contains an int value and the corresponding search index field is Edm.String, we get the error:
Cannot convert the literal '0' to the expected type 'Edm.String'.
Thank you for providing the code snippet. The data type mapping link applies only when an indexer is used to populate an index.
Indexers provide a convenient mechanism to load documents into an index from a data source. They perform the mapping outlined in that link by default, or can take optional fieldMappings.
In the code snippet the index is being updated manually, so any type mismatch between source and target has to be handled by the caller, by casting or converting the value. After you have built the dictionary, you can convert the int to a string with str() before uploading the batch to the index:
source_table_row[column_name] = str(source_table_row[column_name])
This is a python sample that creates an indexer to update an index
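The str() conversion can be applied to every field that the index declares as Edm.String before the batch is uploaded. The following is a minimal sketch, not the SDK's own mechanism: the helper name stringify_fields, the field names, and the sample rows are assumptions made up for illustration.

```python
def stringify_fields(row, string_fields):
    """Return a copy of row with the named fields converted to str.

    None is left alone so that a SQL NULL stays null in the index.
    """
    converted = dict(row)
    for name in string_fields:
        value = converted.get(name)
        if value is not None and not isinstance(value, str):
            converted[name] = str(value)
    return converted


# Hypothetical set of index fields declared as Edm.String.
EDM_STRING_FIELDS = {"Status", "PostalCode"}

# Hypothetical rows, shaped like the dicts built from the cursor.
rows = [
    {"Id": "1", "Status": 0, "PostalCode": 98052, "Quantity": 3},
    {"Id": "2", "Status": None, "PostalCode": "10001", "Quantity": 5},
]

documents = [stringify_fields(row, EDM_STRING_FIELDS) for row in rows]
print(documents)
```

The converted list would then be passed to upload_documents in place of the raw rows. Fields that are not listed, such as Quantity here, keep their original type.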
I'm working on Azure Cognitive Search with an MS SQL table as the backend, and I need help defining a query for a couple of scenarios.
Sample table structure and data:
Scenario 1: I need a query that returns data based on category.
I have tried a query using search.ismatch, but it uses prefix search and also matches other categories with similar values, i.e. "Embedded" and "Embedded Vision".
$filter=Region eq 'AA' and search.ismatch('Embedded*','Category')
https://{AZ_RESOURCE_NAME}.search.windows.net/indexes/{INDEX_NAME}/docs?api-version=2020-06-30-Preview&$count=true&$filter=Region eq 'AA' and search.ismatch('Embedded*','Category')
It responds with the result below, which includes both the "Embedded" and "Embedded Vision" categories.
But my expectation is to fetch data only if it matches the "Embedded" category, as highlighted below.
Scenario 2: Building on Scenario 1, I need a small enhancement to find records matching multiple categories.
For example, if I pass multiple categories (i.e. "Embedded", "Automation"), I need the output highlighted below.
You'll need to use a different analyzer, just for the Category field, that breaks tokens on every ';' rather than on whitespace.
You should first ensure your Category data is populated as a Collection(Edm.String) in the index. See Supported Data Types in the official documentation. Each of your semicolon-separated values should be separate values in the collection, in a property called Category (or similar).
You can then filter by string values in the collection. See rules for filtering string collections. Assuming that your index contains a string collection field called Category, you can filter by categories containing Embedded like this:
Category/any(c: c eq 'Embedded')
You can filter by multiple values like this:
Category/any(c: search.in(c, 'Embedded, Automation'))
Start with clean data in your index using proper types for the data you have. This allows you to implement proper facets and you can utilize the syntax made specifically for this. Trying to work around this with wildcards is a hack that should be avoided.
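The two filter forms above can be generated from a list of categories. This is a sketch under assumptions: the helper names are made up, and it passes an explicit '|' delimiter as the third argument of search.in, because the default delimiters are space and comma and would split a value such as "Embedded Vision" in two.

```python
def odata_quote(value):
    """Quote a string as an OData literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def category_filter(categories, field="Category", delimiter="|"):
    """Build a filter matching documents whose collection holds any given value."""
    if not categories:
        raise ValueError("at least one category is required")
    if len(categories) == 1:
        # Single value: plain equality inside any().
        return f"{field}/any(c: c eq {odata_quote(categories[0])})"
    for value in categories:
        if delimiter in value:
            raise ValueError(f"{value!r} contains the delimiter {delimiter!r}")
    joined = delimiter.join(categories)
    # Several values: search.in with an explicit delimiter.
    return (
        f"{field}/any(c: search.in(c, {odata_quote(joined)}, "
        f"{odata_quote(delimiter)}))"
    )


print(category_filter(["Embedded"]))
print(category_filter(["Embedded Vision", "Automation"]))
```

The resulting string can be combined with other conditions, for example by prefixing it with "Region eq 'AA' and ".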
To solve the problem mentioned above, I used the SQL function below, which converts the category to a JSON string array supported by the Collection(Edm.String) data type in Azure Search.
SQL function
CREATE FUNCTION dbo.GetCategoryAsArray
(
    @ID VARCHAR(20)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @result NVARCHAR(MAX) = ''
    SET @result = REPLACE(
        STUFF(
            (SELECT
                ',''' + TRIM(Value) + ''''
             FROM dbo.TABLEA p
             CROSS APPLY STRING_SPLIT(Category, ';')
             WHERE p.ID = @ID
             FOR XML PATH('')
            ), 1, 1, ''), '&amp;', '&')
    RETURN '[' + @result + ']'
END
GO
View that uses the function and returns the desired data
CREATE VIEW dbo.TABLEA_VIEW AS
SELECT
    id
    ,dbo.GetCategoryAsArray(id) AS CategoryArr
    ,type
    ,region
    ,Category
FROM dbo.TABLEA
I then defined a new Azure Search index using the SQL view above as the data source and, during index column mapping, defined the CategoryArr column as the Collection(Edm.String) data type.
Query used to achieve the expected output from Azure Search:
$filter=Region eq 'AA' and CategoryArr/any(c: search.in(c, 'Embedded, Automation'))
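For readers who prepare the data outside SQL, the same split-and-trim transformation can be sketched in Python. This is an illustration rather than a port of the function above: the helper name is made up, and json.dumps emits double-quoted, escaped strings, whereas the SQL function wraps each value in single quotes.

```python
import json


def category_to_json_array(category):
    """Turn 'A; B;C' into a JSON array string of trimmed, non-empty values."""
    parts = [part.strip() for part in category.split(";")]
    return json.dumps([part for part in parts if part])


print(category_to_json_array("Embedded; Embedded Vision;Automation"))
```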
OK Here we go.
Using Kentico 11/Portal Engine (no hot fixes)
I have a table that holds content-only page types. One field of importance is a date and time field.
I am trying to get rows out of this table that match a certain month and year, for instance all records where Month=2 and Year=2018. These arguments will be passed via the query string.
I have a custom stored procedure that I would like to receive two int (or string) arguments and return a collection of all matching rows.
I am using a RepeaterWithCustomQuery to call the procedure and handle the resulting rows. As you can see below the querystring arguments are named "year" and "monthnumber".
The Query
Me.PR.PREDetailSelect
When my Webpart is set up in this configuration I get the following error:
In my Query, I have tried:
EXEC Proc_Custom_PRDetails @MonthNumber = ##ORDERBY##; @Year = ##WHERE##
EXEC Proc_Custom_PRDetails @MonthNumber = ##ORDERBY##, @Year = ##WHERE##
EXEC Proc_Custom_PRDetails @MonthNumber = ##ORDERBY## @Year = ##WHERE##
Any help would be appreciated (Thanks in advance Brendan). Lastly, don't get too caught up in the names of specific objects as I tried to change names to protect the innocent.
Those query macros are not meant to be used with stored procedures. The system generates the dummy condition 1=1 if you don't pass anything, so that it won't break a SQL statement like the one below:
SELECT ##TOPN## ##COLUMNS##
FROM View_CMS_Tree_Joined AS V
INNER JOIN CONTENT_MenuItem AS C
ON V.DocumentForeignKeyValue = C.MenuItemID AND V.ClassName = N'CMS.MenuItem'
WHERE ##WHERE##
ORDER BY ##ORDERBY##
You need to convert your stored procedure to a SQL statement so that you can use these SQL macros, or use the stored procedure without parameters.
Looking at the query above, TOPN and WHERE are not good candidates because the system adjusts them, but you can use ORDERBY and COLUMNS, as long as both are present (I think it passes them as-is):
exec proc_test ##ORDERBY##, ##COLUMNS##
Honestly, I would advise against doing this; you won't gain much by calling a stored procedure.
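One way to read the "convert it to a SQL statement" advice is to move the month and year test into the condition that replaces ##WHERE##. The sketch below only illustrates how such a condition could be built from the query string; it is written in Python for illustration, the column name PRDate is hypothetical, and how the condition is handed to the web part is not shown.

```python
from urllib.parse import parse_qs


def build_where(query_string, date_column="PRDate"):
    """Build a month/year condition from 'year' and 'monthnumber' arguments."""
    params = parse_qs(query_string)
    # int() rejects anything that is not a number, so no raw text reaches the SQL.
    year = int(params["year"][0])
    month = int(params["monthnumber"][0])
    if not 1 <= month <= 12:
        raise ValueError("monthnumber must be between 1 and 12")
    return f"MONTH({date_column}) = {month} AND YEAR({date_column}) = {year}"


print(build_where("year=2018&monthnumber=2"))
```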
I have inputs in the form of JSON stored in Blob Storage.
I have output in the form of a SQL Azure table.
I wrote a query that successfully moves the values of specific JSON properties to the corresponding columns of the SQL Azure table.
Now, for one column, I want to copy the entire JSON payload as a serialized string into one SQL column, but I can't find a suitable library function to do that.
SELECT
    CASE
        WHEN GetArrayLength(E.event) > 0
            THEN GetRecordPropertyValue(GetArrayElement(E.event, 0), 'name')
        ELSE ''
    END AS EventName
    ,E.internal.data.id AS DataId
    ,E.internal.data.documentVersion AS DocVersion
    ,E.context.custom AS CustomDimensionsPayload
INTO OutputTblEvents
FROM InputBlobEvents E
CustomDimensionsPayload should actually be stored as a JSON string.
I made a user defined function which did the job for me:
function main(InputJSON) {
    var InputJSONString = JSON.stringify(InputJSON);
    return InputJSONString;
}
Then, inside the Query, I used the function like this:
SELECT udf.ConvertToJSONString(COLLECT()) AS InputJSON
INTO outputX
FROM inputY
If you want the entire payload converted, just reference the input object itself instead of COLLECT(). I was trying to do this too, so I figured I'd add what I did.
I used the same function suggested by PerSchjetne; the query then becomes:
SELECT udf.JSONToString(IoTInputStream)
INTO [SQLTelemetry]
FROM [IoTInputStream]
Your output will now be the full JSON string, including all the extra metadata that IoT Hub adds.
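To make the effect of the UDF concrete, here is a small Python sketch of the same serialization step, applied once to a nested property and once to the whole record. The sample event and the helper name are made up for illustration; the compact separators are chosen to match the spacing JSON.stringify produces.

```python
import json

# Hypothetical input record, shaped like the events in the question.
event = {
    "event": [{"name": "PageView"}],
    "internal": {"data": {"id": "abc-123", "documentVersion": "1.61"}},
    "context": {"custom": {"dimensions": [{"Plan": "Free"}]}},
}


def to_json_string(value):
    """Serialize any nested value to a compact JSON string."""
    return json.dumps(value, separators=(",", ":"))


row = {
    "DataId": event["internal"]["data"]["id"],
    # Only the nested property, as for CustomDimensionsPayload.
    "CustomDimensionsPayload": to_json_string(event["context"]["custom"]),
    # The whole record, as when the input object itself is passed to the UDF.
    "InputJSON": to_json_string(event),
}
print(row["CustomDimensionsPayload"])
```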
I'm trying to create an index using the Import data tool.
The data source is an Azure SQL view.
SELECT
    b.Name,
    b.ID,
    (SELECT
        '[' + STUFF((
            SELECT
                ',{"name":"' + p.Name + '"}'
            FROM Product p WHERE p.Brand = b.ID
            FOR XML PATH (''), TYPE)
        .value('.', 'nvarchar(max)'), 1, 1, '') + ']') AS TAry,
    b.IsDelete,
    b.ModifyDatetime
FROM Brand b
The TAry column returns a JSON-formatted string like:
[{"name":"Test1"},{"name":"Test2"}]
In the indexer properties, I chose the type Collection(Edm.String) for the TAry field.
After creating it, I get an error with the message below:
"The data field 'TAry' has an invalid value. The expected type was 'Collection(Edm.String)'."
Thanks for your reply.
I have also tried this format: ["Test1","Test2"]. It still doesn't work.
To do this, you need to use the Azure Search REST API to set up a field mapping with the jsonArrayToStringCollection function. Take a look at this article for detailed instructions.
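As a rough sketch of what such a field mapping looks like, the snippet below builds an indexer definition whose fieldMappings entry applies jsonArrayToStringCollection to TAry. The indexer, data source, and index names are hypothetical, and sending the body to the REST API is left out. As far as I understand the function, it expects a JSON array of plain strings, so the small check at the end flags the array-of-objects shape that the view currently returns.

```python
import json


def indexer_definition(indexer_name, data_source_name, index_name, field):
    """Build an indexer body that maps a JSON-array string column to a collection."""
    return {
        "name": indexer_name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "fieldMappings": [
            {
                "sourceFieldName": field,
                "targetFieldName": field,
                "mappingFunction": {"name": "jsonArrayToStringCollection"},
            }
        ],
    }


def is_json_string_array(raw):
    """Return True if raw parses as a JSON array containing only strings."""
    try:
        parsed = json.loads(raw)
    except ValueError:
        return False
    return isinstance(parsed, list) and all(isinstance(item, str) for item in parsed)


# Hypothetical names, for illustration only.
body = indexer_definition("brand-indexer", "brand-datasource", "brand-index", "TAry")
print(json.dumps(body, indent=2))

print(is_json_string_array('["Test1","Test2"]'))
print(is_json_string_array('[{"name":"Test1"},{"name":"Test2"}]'))
```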
Background: The project is a data import utility for importing data from TSV files into an EF5 database through DbContext.
Problem: I need to do a lookup for foreign keys while doing the import. I have a way to do that, but retrieving the ID is not working.
So I have a TSV file; an example would be:
Code        Name        MyFKTableId
codevalue   namevalue   select * from MyFKTable where Code = 'SE'
So when I process the file and find a '...Id' column, I know I need to do a lookup to find the FK. The '...' is always the entity type, so this is super simple. The problem is that I don't have access to the properties of the results in foundEntity:
string childEntity = column.Substring(0, column.Length - 2);
DbEntityEntry recordType = myContext.Entry(childEntity.GetEntityOfReflectedType());
DbSqlQuery foundEntity = myContext.Set(recordType.Entity.GetType()).SqlQuery(dr[column]);
Any suggestions would be appreciated. I need to keep this generic, so we can't use known type casting. The Id property is accessible from IBaseEntity, so I can cast to that, but all other entity types must not be fixed.
Note: The SQL in the MyFKTableId value is not a requirement. If there is a better option allowing to get away from SqlQuery() I would be open to suggestions.
SOLVED:
OK, what I did was create a class called IdClass that only has a Guid property for Id. I modified my SQL to only return the Id, then made the SqlQuery(sql) call on the Database rather than on Set([Type]).SqlQuery(sql), like so:
IdClass x = ImportFactory.AuthoringContext.Database.SqlQuery<IdClass>(sql).FirstOrDefault();