Xamarin Forms Bulk data insertion hangs application - multithreading

I have added data sync to my application. When the application is installed, it prompts the user to sync data. During my testing, I had a limited number of records and it was working fine. Now that I have attached the real database, the app hangs and becomes unresponsive during sync. An alert to close the app is also shown on the Android device.
For syncing, I am doing this:
bool response1 = await syncCustomers();
bool response2 = await syncItems();
if (response1 && response2)
{
    // do something
}
Both syncCustomers() and syncItems() are tasks which return bool. They fetch the data from API, clear the existing database tables and populate the table with the newly fetched data.
During the syncing process, an activity indicator is shown. When the dataset is small, the app remains responsive and indicator keeps on spinning until the data is synced (which is not long).
But with a large dataset, the app hangs and needs to be closed. There are about 6 to 7 thousand records in the live database.
When I start the app the next time, the items and customers exist in the database, which means that the data was added properly. But for some reason, the app hangs.

I came up with a solution. I was calling the above given snippet directly which I believe was running in the UI thread. Now I am running the code in
await Task.Run(async () => { /* the sync snippet from above */ });
and it runs perfectly. The activity indicator keeps on spinning and the app does not hang.

Related

Why is Azure MySQL database unresponsive at first

I have recently set up an 'Azure Database for MySQL flexible server' using the burstable tier. The database is queried by a React frontend via a node.js API, each of which runs on its own separate Azure app service.
I've noticed that when I come to the app first thing in the morning, there is a delay before database queries complete. The React app is clearly running when I first come to it, which is serving the html front-end with no delays, but queries to the database do not return any data for maybe 15-30 seconds, like it is warming up. After this initial slow performance though, it then runs with no delays.
The database contains about 10 records at the moment, and 5 tables, so it's tiny.
This delay could conceivably be due to some delay with the node.js server, but as the React server is running on the same type of infrastructure (an app service), configured in the same way, and is immediately available when I go to its URL, I don't think this is the issue. I also have no such delays in my dev environment which runs on my local PC.
I therefore suspect there is some delay with the database server, but I'm not sure how to troubleshoot. Before I dive down that rabbit hole though, I was wondering whether a delay when you first start querying a database (after, say, 12 hours of inactivity) is simply a characteristic of the burstable tier on Azure?
There may be more factors affecting this (see comments from people on my original question), but my solution has been to set two global variables which cache data, improving initial load times. The following should be set to ON in the Azure config:
'innodb_buffer_pool_dump_at_shutdown'
'innodb_buffer_pool_load_at_startup'
This is explained further in the following best practices documentation: https://learn.microsoft.com/en-us/azure/mysql/single-server/concept-performance-best-practices in the section marked 'Use InnoDB buffer pool Warmup'

Node js REST Client Scaling the Data collection

I have a scenario where my Node.js client collects data from a REST API.
Scenario: my API endpoint looks like this: http://url/{project}
where project is a parameter. The project comes from a database table.
Here is my procedure:
I get all the project names from the database into a list.
Using a loop, I call the REST endpoint for every project in the list.
My query: with a small number of projects in the database this procedure works fine, but with around 1000 projects to collect, the requests take a long time and sometimes fail due to timeout errors.
How can I scale this process so that it finishes collecting data in a reasonable amount of time?
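One common way to approach this kind of problem is to cap how many requests are in flight at once, rather than sending them strictly one by one or all 1000 together. The following is only a minimal sketch under assumptions: the helper name mapWithConcurrency, the limit, and the retry count are all illustrative and not part of any library.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`; a failed call is retried up to `retries` times.
async function mapWithConcurrency(items, limit, worker, retries = 2) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function runOne(item) {
    for (let attempt = 0; ; attempt++) {
      try {
        return await worker(item);
      } catch (err) {
        if (attempt >= retries) throw err; // give up after the last retry
      }
    }
  }

  async function lane() {
    // Each lane pulls the next unclaimed index until the list is exhausted.
    while (next < items.length) {
      const i = next++;
      results[i] = await runOne(items[i]);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

With the project list from the database, the worker could be something like (project) => fetch(`http://url/${project}`).then((r) => r.json()), assuming a Node.js version with a global fetch. A call that still fails after its retries rejects the whole run, so a real collector would probably record per-project errors instead.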

Nodemon fs.writeFileSync Crash

I have a queue of data from the AWS SQS service, and I am retrieving this data, posting it to a webpage created and hosted via Node.js, and then telling the SQS service to delete the file. I use Nodemon to create and update the page, such that every time I pull a new event, the page updates and users logged into the page see fresh data. I achieve this with code that goes something like:
sqs.receiveMessage(data){
  if (data === 1) {
    dataForWebPage = something
    fs.writeFileSync("dataFile.json", JSON.stringify(dataForWebPage, null, 2), "utf8");
  }
  if (data === 2) {
    dataForWebPage = somethingDifferent
    fs.writeFileSync("dataFile.json", JSON.stringify(dataForWebPage, null, 2), "utf8");
  }
}
sqs.deleteMessage(data)
When testing this on Windows using Visual Studio Code, this works well. Running 'nodemon myscript.js' and opening localhost:3000 displays the page. As events come in, nodemon restarts, the page updates seamlessly, and the events are purged from the queue.
However, if I zip the files and modules up and move the script over to a Linux machine, running an identical script via SSH means that I can view the webpage, the page gets updated, nodemon restarts and behaves in the way that I expect, but the messages from the SQS queue do not get deleted. They simply stay in the queue and are never removed. Moments later, my script will pull them again, making the webpage inaccurate. They will continue to loop forever and never get deleted.
If I do not use nodemon or if I comment out the fs.writeFileSync, the app works as expected and the events from the SQS queue are deleted as expected. However, my webpage is not then updated.
I had a theory that this was due to nodemon restarting the service and, as a result, causing the script to stop and restart before it reached the 'deleteMessage' part. However, if I simply move the delete call so that it happens before any restart, it does not solve the problem. For example, the following code is still broken on Linux, but like the previous version, DOES work on Windows:
sqs.receiveMessage(data){
  if (data === 1) {
    dataForWebPage = something
    sqs.deleteMessage(data)
    fs.writeFileSync("dataFile.json", JSON.stringify(dataForWebPage, null, 2), "utf8");
  }
  if (data === 2) {
    dataForWebPage = somethingDifferent
    sqs.deleteMessage(data)
    fs.writeFileSync("dataFile.json", JSON.stringify(dataForWebPage, null, 2), "utf8");
  }
}
It seems that if I use the asynchronous version of this call, fs.writeFile, the SQS events are also deleted as expected, but as I receive a lot of events, I am using the synchronous version of this service to ensure that data does not queue, and is updated simultaneously.
Later in the code, I use fs.readFileSync, and that does not seem to be interfering with the call to delete the SQS events.
My questions are:
1) What is happening, and why is it happening?
2) Why only Linux, and not windows?
3) What's the best way to solve this to ensure I get live updates to the page, but events are being deleted as expected?
1) What is happening, and why is it happening?
Guessing: deleteMessage is asynchronous, and a synchronous file write blocks the event loop, so your deleteMessage HTTP call may be held up; since the process then restarts, the call is never actually executed.
2) Why only Linux, and not windows?
No idea.
3) What's the best way to solve this to ensure I get live updates to the page, but events are being deleted as expected?
I will be blunt: you have to redo the whole architecture of your system.
Deliberately crashing your web server and restarting it to refresh a web page won't scale to more than one user, and apparently not even to one. It's not meant to work that way.
Depending on the constraints of the system you are trying to build (scale, speed, etc.), many different solutions can work.
To stay as simple as possible:
A first improvement could be to keep your file storage but expose it through an API, so the frontend gets the data with an AJAX request and polls it at a regular interval. You will have a lot more requests, but a lot fewer problems. It's maybe less "live", but few systems actually need updates faster than every few seconds.
Secondly, don't do sync operations in Node.js; they are a huge performance bottleneck, leading to strange errors and huge latencies.
Once that works: file storage is usually a pain and not really performant, so ask yourself whether you need a database or memcached/Redis. You can also check whether to replace polling an API from the web page with a socket, which avoids a lot of requests and allows updates in under one second.
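As a rough illustration of the "no sync operations" advice, the sketch below waits for the delete to be confirmed and only then writes the file asynchronously. It makes several assumptions: queue is a hypothetical wrapper whose deleteMessage() returns a promise (it stands in for whatever SQS client call the real script uses), and the message shape ({ id, body }) is invented for the example.

```javascript
const fs = require('fs/promises');

// `queue` is a hypothetical wrapper whose deleteMessage() returns a promise;
// it stands in for whatever SQS client call the real script uses.
async function handleMessage(queue, message, filePath) {
  // 1. Wait until the queue confirms the delete before touching the file,
  //    so a restart triggered by the file change cannot cut the request short.
  await queue.deleteMessage(message.id);

  // 2. Write to a temporary file, then rename it, so readers never see
  //    a half-written JSON file.
  const tmpPath = `${filePath}.tmp`;
  await fs.writeFile(tmpPath, JSON.stringify(message.body, null, 2), 'utf8');
  await fs.rename(tmpPath, filePath);
}
```

Deleting before writing means a message is lost if the write fails afterwards; whether that trade-off is acceptable depends on the system. Serving the data from an API endpoint instead of restarting the process, as suggested above, avoids the race altogether.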

What is the best way to keep local copy of Firebase Database on node.js

I have an app where I need to check people's posts constantly. I am trying to make sure that the -server- handles more than 100,000 posts. I tried to explain the program and specify the issues I am worried about by numbers.
I am running a simple node.js program on my terminal that runs as firebase admin controlling the Firebase Database. The program has no connectivity with clients(users), it just keeps the database locally to check users' posts every 2-3 seconds. I am keeping the posts in local hash variables by using on('child_added') to simply push the post to a posts hash and so on for on('child_removed') and on('child_changed').
Are these functions able to handle more than 5 requests per second?
Is this the proper way of keeping data locally for faster processing(and not abusing firebase limits)? I need to check every post on the platform every 2-3 seconds, so I am trying to keep a local copy of the -posts data.
That local copy of the posts is looped through every 2-3 seconds.
If there are thousands of posts, will a simple array variable handle that load?
Second part of the program:
I run a for loop to loop through the posts in a function. I run the function every 2-3 seconds using setInterval(). The program needs not only to check new added posts but it constantly needs to check all posts on the database.
If(specific condition for a post) => the program changes the state of the post
.on(child_changed) function => sends an API request to a website after that state change
Can this function run asynchronously ? When it is called, the function should not wait for the previous call to finish because the old call is sending an API request and it might not complete fast. How can I make sure that .on(child_changed) doesn't miss a single change on the -posts data?
The "Listen for Value Events" documentation shows how to observe changes; namely, you use the .on method.
In terms of backing up your Realtime Database, you simply export the data manually, or if you have the paid plan you can automate it.
I don't understand why you would want to reinvent the wheel, so to speak, and have your server ping Firebase for updates. Simply use Firebase observers.
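To make the observer-based approach concrete, here is a small sketch of a local mirror that is updated only from listener callbacks, plus a periodic scan that never overlaps itself. The names createPostCache, createScanner, shouldUpdate and update are hypothetical, and the code deliberately avoids the Firebase SDK so that it runs on its own.

```javascript
// Local mirror of the posts, keyed by post id. The three handlers are meant
// to be called from child_added / child_changed / child_removed listeners.
function createPostCache() {
  const posts = new Map();
  return {
    posts,
    onAdded: (key, value) => posts.set(key, value),
    onChanged: (key, value) => posts.set(key, value),
    onRemoved: (key) => posts.delete(key),
  };
}

// Scan all cached posts; `shouldUpdate` and `update` are hypothetical callbacks.
// If the previous scan is still running, the new call returns immediately.
function createScanner(cache, shouldUpdate, update) {
  let running = false;
  return async function scan() {
    if (running) return 0; // previous scan still in progress
    running = true;
    try {
      let changed = 0;
      for (const [key, post] of cache.posts) {
        if (shouldUpdate(post)) {
          await update(key, post);
          changed++;
        }
      }
      return changed;
    } finally {
      running = false;
    }
  };
}
```

With the Admin SDK, the handlers would presumably be wired up along the lines of ref.on('child_added', (snap) => cache.onAdded(snap.key, snap.val())), and the scan scheduled with setInterval(scan, 2000). A Map with a few thousand entries is cheap to iterate every couple of seconds; the slow part is more likely the outgoing API requests.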

nodejs - run a function at a specific time

I'm building a website where users enter some input, and after a specific amount of time an algorithm has to run, take the users' input stored in the database, and create results for them, storing the results in the database as well. The problem is that in Node.js I can't figure out where and how I should implement this algorithm so that it runs after a specific amount of time and only once (every few minutes or seconds).
The app is built with Node.js and Express.
For example, let's say that I start the application; after 3 minutes the algorithm should run, take some data from the database, and, once it has created some output, store it in the database again.
What are the typical solutions for that (at least one is enough)? Thank you!
Let's say you have a user request that saves a URL to crawl and gets the listed products.
One of the simplest ways would be:
On user request, create a record in a DB "tasks" table:
userId | urlToCrawl | dateAdded | isProcessing | ....
Then in the main Node site you have something like setInterval(findAndProcessNewTasks, 60000),
so it picks up all tasks that are not currently in work (where isProcessing is false)
every minute, or at whatever interval you need.
findAndProcessNewTasks
queries the DB and runs your algorithm for every record that is not processed yet;
it also sets isProcessing to true.
Once the algorithm has finished, it removes the record from tasks (or sets another field, such as "finished", to true).
Depending on the load and the number of tasks, it may make sense to run your algorithm in another Node app.
Typically you would have a message bus (Kafka, RabbitMQ, etc.), with the main app just sending events and worker Node.js apps doing the actual job and inserting products into the DB.
This keeps the main app lightweight and allows scaling the worker apps.
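The polling approach described in this answer could look roughly like the sketch below. It is written under loud assumptions: an in-memory array stands in for the "tasks" table, the finished, result and error fields are illustrative additions, and algorithm is whatever function does the real work.

```javascript
// In-memory stand-in for the "tasks" table (an assumption; a real app would
// query and update the database instead).
const tasks = [];

function addTask(userId, urlToCrawl) {
  tasks.push({ userId, urlToCrawl, dateAdded: new Date(), isProcessing: false, finished: false });
}

// Claim every task that is not yet in work, run `algorithm` on it, and mark it finished.
async function findAndProcessNewTasks(algorithm) {
  const pending = tasks.filter((t) => !t.isProcessing && !t.finished);
  // Mark all claimed tasks first so an overlapping tick cannot pick them up again.
  pending.forEach((t) => { t.isProcessing = true; });
  for (const task of pending) {
    try {
      task.result = await algorithm(task);
      task.finished = true;
    } catch (err) {
      task.error = err; // left unfinished, so a later tick retries it
    } finally {
      task.isProcessing = false;
    }
  }
  return pending.length;
}
```

In the main app this would be scheduled with something like setInterval(() => findAndProcessNewTasks(runAlgorithm), 60000), where runAlgorithm is a hypothetical name. With a real database and several worker processes, the "claim" step has to be an atomic update rather than an in-memory flag.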
From your question it's not clear whether you want to run the algorithm on the web server (perhaps processing input from multiple users) or on the client (processing the input from a particular user).
If the former, then use setTimeout(), or something similar, in your main javascript file that creates the web server listener. Your server can then be handling inputs from users (via the app listener) and in parallel running algorithms that look at the database.
If the latter, then use setTimeout(), or something similar, in the javascript code that is being loaded into the user's browser.
You may actually need some combination of the above: code running on the server to periodically do some processing on a central database, and code running in each user's browser to periodically refresh the user's display with new data pulled down from the server.
You might also want to implement a websocket and json rpc interface between the client and the server. Then, rather than having the client "poll" the server for the results of your algorithm, you can have the client listen for events arriving on the websocket.
Hope that helps!
If I understand you correctly - I would just send the data to the client-side while rendering the page and store it into some hidden tag (like input type="hidden"). Then I would run a script on the server-side with setTimeout to display the data to the client.
