SQL Execution in Azure taking a very long time

I'm using ASP.NET Core 1.1.
The following is the query that takes the most time and causes all this trouble:
C#
List<Message> messages = await _context.Messages
    .Where(m => m.UserId.Equals(_userManager.GetUserId(User)))
    .Select(m => new Message { ID = m.ID, DateTime = m.DateTime, Text = m.Text })
    .ToListAsync();
SQL
SELECT [m].[ID], [m].[DateTime], [m].[Text] FROM [Messages] AS [m] WHERE [m].[UserId] = #__GetUserId_0
Execution plan statistics:
My website becomes very slow and unresponsive, and sometimes shows errors.

As I stated in the comments: your execution plan shows an index scan of nearly 500,000 rows. It looks like you're scanning the whole table instead of seeking on an index.
I suspect adding an index on UserId would resolve this issue, as that's the only field used in your WHERE clause.
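For example, with EF Core's fluent API you could declare the index in the DbContext and add it via a migration. A minimal sketch only (the Message entity and UserId property come from the query above; everything else is illustrative):
C#
// Declare an index on UserId so the WHERE [UserId] = @param predicate can seek
// instead of scanning ~500,000 rows.
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Message>()
        .HasIndex(m => m.UserId);
}
// Then generate and apply a migration (dotnet ef migrations add / dotnet ef database update),
// or create the equivalent nonclustered index directly on the Azure SQL database.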

How to avoid race condition when updating Azure Table Storage record

Azure Function utilising Azure Table Storage
I have an Azure Function which is triggered from an Azure Service Bus topic subscription; let's call it the "Process File Info" function.
The message on the subscription contains file information to be processed. Something similar to this:
{
    "uniqueFileId": "adjsdakajksajkskjdasd",
    "fileName": "mydocument.docx",
    "sourceSystemRef": "System1",
    "sizeBytes": 1024,
    ... and other data
}
The function carries out the following two operations:
1. Check the individual file storage table for the existence of the file. If it exists, update that file; if it's new, add the file to the storage table (stored on a per system | per fileId basis).
2. Capture metrics on the file size in bytes and store them in a second storage table, called metrics (constantly incrementing the bytes, stored on a per system | per year/month basis).
The following diagram gives a brief summary of my approach:
The difference between the individualFileInfo table and the fileMetric table is that the individual table has one record per file, whereas the metric table stores one record per month that is constantly updated (incremented), gathering the total bytes passed through the function.
Data in the fileMetrics table is keyed per system and per year/month.
The issue...
Azure Functions are brilliant at scaling; in my setup I have a maximum of 6 of these functions running at any one time. Presuming each file message being processed is unique, updating (or inserting) the record in the individualFileInfo table works fine, as there are no race conditions.
However, updating the fileMetric table is proving problematic: if all 6 functions fire at once, they all try to update the metrics table at the same time (incrementing either the new file counter or the existing file counter).
I have tried using the ETag for optimistic updates, along with a little recursion to retry should a 412 response come back from the storage update (code sample below), but I can't seem to avoid this race condition. Does anyone have a suggestion on how to work around this constraint, or has anyone come up against something similar before?
Sample code that is executed in the function for storing the fileMetric update:
internal static async Task UpdateMetricEntry(IAzureTableStorageService auditTableService,
    string sourceSystemReference, long addNewBytes, long addIncrementBytes, int retryDepth = 0)
{
    const int maxRetryDepth = 3; // only recursively attempt a maximum of 3 times
    var todayYearMonth = DateTime.Now.ToString("yyyyMM");
    try
    {
        // Attempt to get the existing record from table storage.
        var result = await auditTableService.GetRecord<VolumeMetric>("VolumeMetrics", sourceSystemReference, todayYearMonth);
        // If the volume metrics table exists in storage - add or edit the record as required.
        if (result.TableExists)
        {
            VolumeMetric volumeMetric = result.RecordExists
                ?
                // Existing metric record.
                (VolumeMetric)result.Record.Clone()
                :
                // Brand new metrics record.
                new VolumeMetric
                {
                    PartitionKey = sourceSystemReference,
                    RowKey = todayYearMonth,
                    SourceSystemReference = sourceSystemReference,
                    BillingMonth = DateTime.Now.Month,
                    BillingYear = DateTime.Now.Year,
                    ETag = "*"
                };

            volumeMetric.NewVolumeBytes += addNewBytes;
            volumeMetric.IncrementalVolumeBytes += addIncrementBytes;

            await auditTableService.InsertOrReplace("VolumeMetrics", volumeMetric);
        }
    }
    catch (StorageException ex)
    {
        if (ex.RequestInformation.HttpStatusCode == 412)
        {
            // Retry the update with an increased retry depth.
            if (retryDepth < maxRetryDepth)
                await UpdateMetricEntry(auditTableService, sourceSystemReference, addNewBytes, addIncrementBytes, retryDepth + 1);
        }
        else
        {
            throw;
        }
    }
}
The ETag keeps track of conflicts, and if this code gets a 412 HTTP response it will retry, up to a maximum of 3 times (an attempt to mitigate the issue). My issue here is that I cannot guarantee the updates to table storage across all instances of the function.
Thanks for any tips in advance!!
You can put the second part of the work into a second queue and function, maybe even put a trigger on the file updates.
Since the other operation sounds like it might take most of the time anyway, it could also take some of the heat off the second step.
You can then solve any remaining race conditions by focusing only on that function. You can use sessions to limit the concurrency effectively; in your case, the source system id could be a possible session key. If you use that, only one Azure Function instance will be processing data from one system at a time, effectively solving your race conditions.
https://dev.to/azure/ordered-queue-processing-in-azure-functions-4h6c
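A minimal sketch of what that could look like, assuming the Azure.Messaging.ServiceBus package on the sending side and a session-enabled Service Bus trigger on the function side (the queue name, types and helper names are illustrative; IsSessionsEnabled and SessionId are the essential parts):
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;

public static class MetricUpdates
{
    // Sender side: publish the metric update with the source system as the session id,
    // so all updates for one system are delivered to a single processor in order.
    public static async Task SendMetricUpdateAsync(ServiceBusSender sender, string sourceSystemRef, long bytes)
    {
        var message = new ServiceBusMessage(BinaryData.FromObjectAsJson(new { sourceSystemRef, bytes }))
        {
            SessionId = sourceSystemRef
        };
        await sender.SendMessageAsync(message);
    }

    // Receiver side: with a session-enabled trigger, only one function instance handles a given
    // session (source system) at any moment, so the ETag conflict on the metrics row cannot occur.
    [FunctionName("UpdateFileMetrics")]
    public static async Task Run(
        [ServiceBusTrigger("metric-updates", Connection = "ServiceBusConnection", IsSessionsEnabled = true)]
        string messageBody)
    {
        // Deserialize messageBody, then read, increment and write the month's VolumeMetric row here.
        await Task.CompletedTask;
    }
}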
Edit: If you can't use Sessions to logically lock the resource, you can use locks via blob storage:
https://www.azurefromthetrenches.com/acquiring-locks-on-table-storage/
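If you go the blob-lock route, the general shape with the current Azure.Storage.Blobs SDK is sketched below (the linked article uses the older storage SDK; the lock blob naming, lease duration and helper method are my own illustrative assumptions):
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

public static class MetricLock
{
    // Acquire a short-lived lease on a per-system lock blob before touching the metrics row,
    // so only one function instance updates a given system's metrics at a time.
    public static async Task WithMetricLockAsync(BlobContainerClient container, string sourceSystemRef, Func<Task> updateAction)
    {
        BlobClient lockBlob = container.GetBlobClient($"locks/{sourceSystemRef}.lck");
        if (!await lockBlob.ExistsAsync())
            await lockBlob.UploadAsync(BinaryData.FromString(string.Empty), overwrite: true);

        BlobLeaseClient lease = lockBlob.GetBlobLeaseClient();
        await lease.AcquireAsync(TimeSpan.FromSeconds(30)); // fails with 409 if another instance holds it; back off and retry

        try
        {
            await updateAction(); // read, increment and write the VolumeMetric row inside the lock
        }
        finally
        {
            await lease.ReleaseAsync();
        }
    }
}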

Documentdb performance when using pagination

I have pagination code that works great with Azure Search and SQL, but when using it with DocumentDB it takes up to 60 seconds to load.
We believe it's a latency issue, but I can't find a workaround to speed it up.
Any documentation, or ideas on where to start looking?
public PagedList(IQueryable<T> superset, int pageNumber, int pageSize, string sortExpression = null)
{
    if (pageNumber < 1)
        throw new ArgumentOutOfRangeException("pageNumber", pageNumber, "PageNumber cannot be below 1.");
    if (pageSize < 1)
        throw new ArgumentOutOfRangeException("pageSize", pageSize, "PageSize cannot be less than 1.");

    // Set source to blank list if superset is null to prevent exceptions.
    TotalItemCount = superset == null ? 0 : superset.Count();

    if (superset != null && TotalItemCount > 0)
    {
        Subset.AddRange(pageNumber == 1
            ? superset.Skip(0).Take(pageSize).ToList()
            : superset.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList()
        );
    }
}
While the LINQ provider for DocumentDB translates .Take() into a TOP SQL clause under certain circumstances, DocumentDB has no equivalent for Skip. So, I'm a little surprised it works at all, but I suspect that the provider is rerunning the query from scratch to simulate Skip. In the comments here is a discussion led by a DocumentDB product manager on why they chose not to implement SKIP. tl;dr: it doesn't scale for NoSQL databases. I can confirm this with MongoDB (which does have skip functionality): later pages simply scan and throw away earlier documents, and the later in the list you go, the slower it gets. I suspect that the LINQ implementation is doing something similar, except client-side.
DocumentDB does have a mechanism for getting documents in chunks but it works a bit differently than SKIP. It uses a continuation token. You can even set a maxPageSize, however there is no guarantee that you'll get that number back.
I recommend that you implement a client-side cache of your own and use a fairly large maxPageSize. Let's say each page in your UI is 10 rows and your cache currently has 27 rows in it. If the user selects page 1 or page 2, you have enough rows to render the result from the data already cached. If the user selects page 7, then you know that you need at least 70 rows in your cache: use the last continuation token to get more until you have at least 70 rows in your cache, and then render rows 61-70. On the plus side, continuation tokens are long-lived, so you can use them later based upon user input.
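As a rough sketch of that loop with the DocumentDB SDK (Microsoft.Azure.DocumentDB) - the cache, the page size of 100 and the method shape are my own illustrative assumptions; FeedOptions and the continuation token handling are the point:
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

public static class DocumentCache
{
    // Pulls pages from DocumentDB via continuation tokens until the local cache holds at
    // least `neededRows` items, then returns the latest token so a later request can resume.
    public static async Task<string> FillCacheAsync<T>(DocumentClient client, Uri collectionUri,
        List<T> cache, string continuationToken, int neededRows)
    {
        while (cache.Count < neededRows)
        {
            var options = new FeedOptions
            {
                MaxItemCount = 100,                     // a fairly large page; not guaranteed to come back full
                RequestContinuation = continuationToken // null on the first call
            };

            var query = client.CreateDocumentQuery<T>(collectionUri, options).AsDocumentQuery();
            FeedResponse<T> page = await query.ExecuteNextAsync<T>();

            cache.AddRange(page);
            continuationToken = page.ResponseContinuation;

            if (string.IsNullOrEmpty(continuationToken))
                break; // no more documents in the collection
        }

        return continuationToken;
    }
}
The UI then renders rows (pageNumber - 1) * pageSize through pageNumber * pageSize - 1 out of the cache, as described above.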

Missing ETW EventSource table in Azure SDK 2.6

I'm trying to use ETW for logging with several custom EventSource classes in Azure SDK 2.6.
When testing locally with the compute/storage emulator, three of my custom WADMyEventXYZ tables show up; however, the final expected table "WADMyDataSets" never seems to be created. How should I determine what is causing this problem? I see no errors from the compute emulator when the debugger is attached and stepping through the code in the debugger shows that WriteEntry on the EventSource is definitely called. The other tables show up in SchemasTable in the developer storage account, but there is no entry there for WADMyDataSets.
I exported the WADDiagnosticInfrastructureLogsTable to CSV, examined it in Excel, and saw the following messages that reference "MyDataSets":
Validating table MyDataSets; DiskMB:451; RequiredQuota:451 RetentionSeconds:7776000 Pri:2 MinQuotaMB:0 RunningTotal:3757
Table does not exist
table C:\Users\Caleb\AppData\Local\dftmp\Resources\b316f531-f673-4db3-ac1c-e4649e289871\WAD0104\Tables\MyDataSets does not exist, CreationDisposition = 4
Table MyDataSets does not exist, will create a new one
Delaying the creation of table MyDataSets until the schema is known
Later on:
Converted EventSource provider name "MyDataSets" to {74a2b9c9-0bd8-547f-6cad-453da47055be}
Matched task with query id MyDataSetsQuery and regex ^MyDataSets$ to source table MyDataSets
Registering query MyDataSetsQuery_MyDataSets_XTableWadAccount:
Adding standard PkRk (MA) fields to 'MyDataSetsQuery_MyDataSets'
Successfully compiled the query 'MyDataSetsQuery_MyDataSets'
Added task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from MyDataSets - Partitions:-1 Pri:normal TSPolicy:start StoreType:Central Repeat:2147483647 Timeout:3600s Deadline:300s DelayRange:0.00
Later on:
No checkpoint found for task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount after time 2015-05-13T00:44:21.000Z; retry time out is 3600 seconds
First scheduled task for MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount is at 2015-05-13T01:44:00.000Z (plus a delay of 20s)
Later on:
Increasing query delay of task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 20 to 40 seconds to introduce randomness to the upload schedule
Later on:
Starting scheduled task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 2015-05-13T01:43:00.000Z to 2015-05-13T01:44:00.000Z; query delay 40 seconds
Table C:\Users\Caleb\AppData\Local\dftmp\Resources\b316f531-f673-4db3-ac1c-e4649e289871\WAD0104\Tables\MyDataSets does not exist
Ending scheduled task MyDataSetsQuery_MyDataSets_WADMyDataSets_PT1M_XTableWadAccount from 2015-05-13T01:43:00.000Z to 2015-05-13T01:44:00.000Z in 1ms
Update
The EventSource in question had one event on it:
[Event(1)]
public void DataSetLoaded(string traceActivityId, string userId, string reportCode, long timeToLoadMs)
Removing the fourth parameter "timeToLoadMs" resulted in the WAD event table showing up as expected. I tried changing the last parameter to a string, and it failed to show up again. Is there a documented limit on the number of parameters for an event method? I'm pretty sure I've seen samples that have four parameters.
I upgraded my web project to .NET 4.5.1 and now the WAD table shows up as expected (I had been running on just .NET 4.5 before this).
It would seem that there might be a bug with having 4 parameters on an EventSource event when using .NET 4.5.0.
As a side note, with 4.5.1, I now have the System.Diagnostics.Tracing.EventSource.SetCurrentThreadActivityId method which will let me get rid of manually including the CorrelationManager.ActivityId in my event output.
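For reference, a 4-parameter event like the one in the question would normally be declared along these lines (a sketch only - the event source name is inferred from the WAD table names, and the WriteEvent body is an assumption since the original class isn't shown):
using System.Diagnostics.Tracing;

[EventSource(Name = "MyDataSets")] // name inferred from the "MyDataSets" tables in the logs above
public sealed class MyDataSetsEventSource : EventSource
{
    public static readonly MyDataSetsEventSource Log = new MyDataSetsEventSource();

    [Event(1)]
    public void DataSetLoaded(string traceActivityId, string userId, string reportCode, long timeToLoadMs)
    {
        // Mixed string/long arguments fall back to the params object[] WriteEvent overload.
        WriteEvent(1, traceActivityId, userId, reportCode, timeToLoadMs);
    }
}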
The video released today at https://channel9.msdn.com/Series/ConnectOn-Demand/240 says there is now full support for Azure table logging for ETW EventSources.

Entity Framework & Oracle: Cannot Insert VARCHAR2 > 1,999 Characters

I created a 4,000-character VARCHAR2 field in an Oracle table. I am inserting string values into the field using LINQ to Entities with Visual Studio 2010, .NET Framework 4, and ODAC 11.2 Release 4 and Oracle Developer Tools for Visual Studio (11.2.0.3.0). When I try to insert a string value greater than 1,999 characters, I get the following inner exception:
Oracle.DataAccess.Client.OracleException
ORA-00932: inconsistent datatypes: expected - got NCLOB
However, I can insert a 4,000 character string value into the field without any issue when doing so using SQL Developer.
There is a known ODAC bug (source #2) in which there is a 2,000 character limit when saving to an XMLTYPE field, but I am not saving to an XMLTYPE field. I have Oracle.DataAccess 2.112.3.0 in my GAC, and I considered updating to release 5 (11.2.0.3.20) of the aforementioned Oracle software, but "Oracle Developer Tools for Visual Studio" is the only component that appears to have been updated from release 4, and I believe that "Oracle Data Provider for .NET 4" is the component that needs updating. In my .NET project, System.Data.Entity and System.Data.OracleClient are both runtime version 4.0.30319.
Anyway, I am just wondering if anyone else has encountered this error, and if so, whether any solution has been found - aside from the one in the Oracle forum thread linked above, which proposes using stored procedures as a workaround. Google tells me that people are encountering this error only when working with XMLTYPE fields, but I can't be the only person who is encountering it when working with a VARCHAR2 field, can I?
(FWIW, I am also hoping to receive a response to my post as user "997340" in the Oracle forum thread that is linked above. If I receive a useful response, I'll be sure to share the knowledge on this end.)
EDIT: In case it helps, below are the two blocks in my code that are failing. I created the second block when troubleshooting the first, just to see if there was any difference. I get the exception when checking to see if the string values were already inserted (the "if" statements), and when actually inserting the string values (the "AddObject" statements).
1:
if (!(from q in db.MSG_LOG_MESSAGE where q.MESSAGE == msg select q.MESSAGE).Any())
{
    db.MSG_LOG_MESSAGE.AddObject(new MSG_LOG_MESSAGE { MESSAGE = msg });
    db.SaveChanges();
}
2:
if (!db.MSG_LOG_MESSAGE.Any(q => q.MESSAGE == msg))
{
    db.MSG_LOG_MESSAGE.AddObject(new MSG_LOG_MESSAGE { MESSAGE = msg });
    db.SaveChanges();
}
APRIL 3 UPDATE:
I was able to trace the SQL that is being sent to Oracle from the "if" statement in the first code block above. Here it is:
SELECT
CASE WHEN ( EXISTS (SELECT
1 AS "C1"
FROM "SEC"."MSG_LOG_MESSAGE" "Extent1"
WHERE ("Extent1"."MESSAGE" = :p__linq__0)
)) THEN 1 WHEN ( NOT EXISTS (SELECT
1 AS "C1"
FROM "SEC"."MSG_LOG_MESSAGE" "Extent2"
WHERE ("Extent2"."MESSAGE" = :p__linq__0)
)) THEN 0 END AS "C1"
FROM ( SELECT 1 FROM DUAL ) "SingleRowTable1" ;
Unfortunately, the DBA that I worked with did not provide me with the value of the "p_linq_0" parameter, but as previously stated, when it is over 1,999 characters, an exception occurs. (When this SQL was traced, I passed a 4,000-character string as the parameter, and of course an exception occurred.) The DBA also mentioned something about certain SQL clients - such as SQLPlus - not being able to handle VARCHAR2s over 2,000 characters. I did not entirely follow. Whether using SQLPlus, SQL Developer, or any other tool, Oracle is still going to be querying a 4,000-character VARCHAR2 field. Plus, my magic number is 1,999 characters; not 2,000 characters. Did the DBA perhaps mean there is a limitation with how many characters can be in a parameter? More importantly, when I execute this SQL in SQL Developer and I enter a 4,000-character string for the parameter, it works perfectly. So I am still utterly confused about why it is not working via LINQ to Entities. I also tried the following code in my program to run a similar query with a 4,000-character string in the "msg" variable, which worked perfectly as well:
using Oracle.DataAccess;
using Oracle.DataAccess.Client;
using System.Data;
...
OracleConnection conn = new OracleConnection("Data Source=[MASKED];User Id=[MASKED];Password=[MASKED]");
conn.Open();
OracleCommand cmd = new OracleCommand();
cmd.Connection = conn;
cmd.CommandText = "SELECT message FROM msg_log_message WHERE message = '" + msg + "'";
cmd.CommandType = CommandType.Text;
OracleDataReader dr = cmd.ExecuteReader();
dr.Read();
string result1 = dr.GetString(0);
conn.Dispose();
For now, I am still pointing fingers at ODAC being buggy as it pertains to LINQ to Entities...
The latest ODP.NET documentation - for "11.2 Release 5 Production (11.2.0.3.0)" from September 2012 - states the following known issue under the "Entity Framework Related Tips, Limitations and Known Issues" section, which addresses the error from the "if" statements in the question's code blocks:
An "ORA-00932 : inconsistent datatypes" error can be encountered if a string of 2,000 or more characters, or a byte array with 4,000 bytes or more in length, is bound in a WHERE clause of a LINQ/ESQL query. The same error can be encountered if an entity property that maps to a BLOB, CLOB, NCLOB, LONG, LONG RAW, XMLTYPE column is used in a WHERE clause of a LINQ/ESQL query.
Older ODP.NET documentation - for "Release 11.2.0.3.0 Production" from May 2011 - states the same known issue, so apparently this has been a known issue for a while.
Neither of the aforementioned documents mentions encountering the same error from the "AddObject" statements in the question's code blocks, but that issue is very similar to another known issue for XMLType fields that is mentioned:
An "ORA-00932: inconsistent datatypes: expected - got NCLOB" error will be encountered when trying to bind a string that is equal to or greater than 2,000 characters in length to an XMLType column or parameter. [Bug 12630958]

Why is GetPaged() Executing two database calls?

I'm a bit new to SubSonic (i.e. evaluating 3.0.0.3) and have come across some strange behavior in GetPaged(int pageIndex, int pageSize). When I execute the method it makes two SQL calls. Any ideas why?
Details
Let's say I have a "Cultures" table with 200 rows. In my code I do something like ...
var sonicCollection = from c in RTE.Culture.GetPaged(1, 25)
                      select c;
Now, I would expect this executes a single query returning the first 25 entries in my cultures table. When I watch SQL profiler I see two queries run by.
First this--
SELECT [dbo].[Cultures].[cultureCode], [dbo].[Cultures].[cultureName]
FROM [dbo].[Cultures]
Then This--
SELECT *
FROM (SELECT ROW_NUMBER() OVER (
ORDER BY cultureID ASC) AS Row,
[dbo].[Cultures].[cultureCode], [dbo].[Cultures].[cultureName]
FROM [dbo].[Cultures]
)
AS PagedResults
WHERE Row >= 1 AND Row <= 25
I expect the 2nd query to roll by, as it is the one returning the 25 rows I politely requested of SubSonic. The first query, however, appears to return 200 rows (at least according to SQL Profiler).
Any ideas what's going on?
It's a bug in the code: it actually queries every record and then iterates over each one for the count. I've created an issue in the GitHub repo here:
https://github.com/subsonic/SubSonic-3.0/issues/259
You can download the source, fix the issue and recompile pretty easily. I've done this and it fixed my issue.
You just want to use RTE.Culture.GetPaged() - it runs the paged query for you.
