I contribute to an open source application that has a series of internal queues (implemented as arrays). Monitoring of these queues is an important part of diagnosing any performance issues.
In other languages and operating systems, I've been able to expose performance counters that monitoring tools can read for a real-time view (or to build up historical information).
What's the best way of exposing these in the node/linux ecosystem?
Currently I am logging them to disk through winston, but that creates a lot of log volume...
You can use pm2 custom metrics for that.
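A minimal sketch of what that can look like: under pm2 you would register a metric through the `@pm2/io` package and update it whenever the queue changes, and `pm2 monit` / `pm2 show <app>` then display the value live. The stub gauge below stands in for the pm2 metric object so the sketch runs standalone; the class and metric name are illustrative, not part of any pm2 API.

```javascript
// A queue that reports its depth through any gauge-like object exposing set().
// Under pm2 you would pass the object returned by
//   require('@pm2/io').metric({ name: 'queue_depth' })
// instead of the stand-in stub used here.
class MonitoredQueue {
  constructor(gauge) {
    this.items = [];
    this.gauge = gauge;
  }
  enqueue(item) {
    this.items.push(item);
    this.gauge.set(this.items.length); // report depth on every change
  }
  dequeue() {
    const item = this.items.shift();
    this.gauge.set(this.items.length);
    return item;
  }
}

// Stand-in gauge with the same set() interface as a pm2 metric
const stub = { value: 0, set(v) { this.value = v; } };
const q = new MonitoredQueue(stub);
q.enqueue('job-1');
q.enqueue('job-2');
q.dequeue();
```

The upside over logging to disk is that the current value is pulled on demand by the monitoring UI rather than appended to a file on every change.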
We are looking into options to monitor our Acumatica instance to identify performance issues at the application level as well as the SQL Server level. We have experience with New Relic and a few others, but we also read about Retrace (https://stackify.com/retrace/), which looks worth trying.
I'm curious to know if it's possible/recommended to install such tools within Acumatica?
Does anyone have any experience or feedback on the topic?
Acumatica includes a built-in request profiler that can be used to monitor requests, performance, and SQL. It is probably not as sophisticated as New Relic, but powerful enough when you have performance issues to resolve. Read more here: https://help-2017r2.acumatica.com/(W(2))/Wiki/ShowWiki.aspx?wikiname=HelpRoot_User&PageID=e7612f3f-fc6f-494d-8532-cc2ceef7147b
I am using Node.js as my server with Express. I am logging all my requests and responses on the server. Is there any package available to read my logs and generate a graphical report, such as how many requests we got, how many succeeded, and what was received and responded? Is there a package that can track all these details for me?
It sounds like you're trying to get some performance metrics about your application which is great. There are many different ways you can go with this, here are a few suggestions for you to weigh up.
Non-real-time performance metrics
If you don't care about seeing the service's real-time metrics, you might want to create something to process them into a CSV and use something like Excel or Google Sheets to generate graphs from them. If you need something immediately and don't need to respond to things "in the moment" when a dip happens, then this is a good quick and dirty solution.
Real-time performance metrics using SaaS software
If you want the metrics but don't want to host the systems yourself, you might want to check out services such as DataDog. They provide dashboards and graphs as a service. You can use something like statsd to get metrics into DataDog, or use their own integrations. They have a lot of integrations with cloud providers like AWS, GCP, and Azure for machine metrics (CPU etc.). They also have packages for interacting with your application itself, such as their ExpressJS package.
Real-time performance metrics using self-hosted solutions
I've often used a self-hosted approach as I find the pricing often scales a bit better. The setup is fairly simple.
Use a statsd package for all system components (nginx, nodejs, postgres, etc) to publish metrics to the statsd daemon.
The statsd daemon self-hosted somewhere (maybe a proxy cluster if you're working on large applications).
Self-hosted Graphite to consume metrics from the statsd daemon. Graphite is a software package designed for aggregating metrics and has an API for producing static graph images.
Self-hosted Grafana that pulls metrics from graphite. Grafana is a real-time dashboarding software. It allows you to create multiple dashboards that hook into various data sources such as Graphite or other time series data stores.
The self-hosting route can take a day to set up, but it does mean you don't increase your costs per host. It's also easy to put behind internal networks if that's a requirement for your organisation.
Personally, I would recommend either of the real-time performance metrics approaches. If your application is small and doesn't have many hosts, then services like DataDog can be useful and cost-effective, but if you need to scale up you'll find your costs skyrocketing. At that point you might decide to move over to self-hosted infrastructure.
We're considering using Service Fabric on-premises, fully or partially replacing our old solution built on NServiceBus, though our knowledge of SF is still a bit limited. What we like about NServiceBus is the out-of-the-box ability to declaratively throttle any service to a maximum number of threads. If we have multiple services and one of them starts hiccuping due to some external factor, we do not want the other services affected by that. The "problem" service would just take the maximum number of threads we allocate to it in its configuration, and its queue would start growing, but the other services would keep working fine as computer resources are still available. In Service Fabric, if we let our application create as many "problem" actors as it wants, that will lead to uncontrollable growth of the "problem" actors, which will consume all server resources.
Any ideas on how, with SF, we can protect our resources in the situation I described? My first impression is that no queuing or actor-throttling mechanism is implemented in Service Fabric, and everything must be done manually.
P.S. I don't think it's a rare requirement to be able to balance resources between different types of actors inside one application, so that they are less dependent on each other in terms of resource consumption. I just can't believe there is nothing offered for that in SF.
Thanks
I am not sure how you would compare NServiceBus (which is a messaging solution) with Service Fabric, which is a platform for building microservices. Service Fabric is a platform that supports many different types of workload, so it makes sense that it does not provide out-of-the-box throttling of threads and the like.
Also, what would you expect from Service Fabric when it comes to resource consumption by actors or services? It is up to you what you want to do and how to react. I wouldn't want SF to kill my actors or throttle service requests automatically. I would expect mechanisms to notify me when that happens, and those are available.
That said, SF does have a mechanism to react to load using metrics. See the docs:
Metrics are the resources that your services care about and which are provided by the nodes in the cluster. A metric is anything that you want to manage in order to improve or monitor the performance of your services. For example, you might watch memory consumption to know if your service is overloaded. Another use is to figure out whether the service could move elsewhere where memory is less constrained in order to get better performance.
Things like Memory, Disk, and CPU usage are examples of metrics. These metrics are physical metrics, resources that correspond to physical resources on the node that need to be managed. Metrics can also be (and commonly are) logical metrics. Logical metrics are things like "MyWorkQueueDepth" or "MessagesToProcess" or "TotalRecords". Logical metrics are application-defined and indirectly correspond to some physical resource consumption. Logical metrics are common because it can be hard to measure and report consumption of physical resources on a per-service basis. The complexity of measuring and reporting your own physical metrics is also why Service Fabric provides some default metrics.
You can define your own custom metrics and have the cluster react to those by moving services to other nodes. Or you could use the health reporting system to issue a health event and have your application or an outside process act on it.
I am now working on Performance Testing of a Java Application that runs on GlassFish Server 4.1.
After going through some statistics I got from the AppDynamics tool, I find that there is no way for me to drill down to code/method-level issues. For example, I can see the time taken by each method or function using dotTrace or JProfiler, but the AppDynamics tool seems to lack all these features.
I was also looking for a free solution, hence I chose AppDynamics. Now I feel I am not on the right track. Can someone let me know if I am missing something about this tool, or suggest another quick and easy solution?
Is there a possibility that the monitors on GlassFish Server 4.1 can do the same at no cost?
Generally, monitoring tools cannot record method-level data continuously, because they have to operate at a much lower level of overhead compared to profiling tools. They focus on "business transactions" that show you high-level performance measurements with associated semantic information, such as the processing of an order in your web shop.
Method-level data only comes in when these business transactions are too slow. The monitoring tool will then start sampling the executing thread and show you a call tree or hot spots. However, you will not get this information for the entire VM over a continuous interval the way you're used to from a profiler.
You mentioned JProfiler, so if you are already familiar with that tool, you might be interested in perfino as a monitoring solution. It shows you samples on the method level and has cross-over functionality into profiling with the native JVMTI interface. It allows you to do full sampling of the entire JVM for a selected amount of time and look at the results in the JProfiler GUI.
Disclaimer: My company develops JProfiler and perfino.
I'm currently using Microsoft Network Monitor to parse through debug event traces. It is not a bad tool, but not a very good one either. Do you know of better solutions?
These are readers for exploring custom ETW traces:
SvcPerf - End-to-End ETW trace viewer for manifest based traces
LINQPad + Tx (LINQ for Logs and traces) driver - Simple reader that allows you to query ETW traces
PerfView - a multitool that allows you to do almost everything with ETW, but it is not particularly user-friendly
PerfView http://www.microsoft.com/download/en/details.aspx?id=28567
If you're after graphical visualization of traces for the sake of performance analysis, you may use the following:
1. Windows Reliability and Performance Monitor which is an MMC snap-in and is easy to use for basic analysis (locally, from the server)
2. xperf, a stand-alone tool from the Windows Performance Toolkit.
Xperf itself is a command-line tool for capturing and processing traces, and Xperfview allows creating graphs and tables from the captured data. Look at this blog post for an overview.
3. The Visual Studio 2010 profiler contains a "Concurrency Visualizer", which is actually a nice tool to collect and visualize ETW traces, specifically tailored to the analysis of thread contention issues (but it can also be used to analyze network traces, I think). See this blog post on using the tool; you may also use the underlying tools directly: VSPerfCmd and VSPerfReport.
I like to use Log Parser [link] to parse through the logs for the events that I am most interested in. I love the SQL-like query structure.