How is Alteryx deploying data in a decentralized way?

From the link https://reviews.financesonline.com/p/alteryx/, I see the following details:
Alteryx is an advanced data analytics platform intended to serve the
needs of business analysts looking for a self-service solution. It
contains 3 basic components: Gallery, Designer, and Server, which
blend data from external sources and generate comprehensive reports.
Each of them, however, can be used separately.
The software structures and evaluates data from multiple external
sources, and organizes it into comprehensive insights that can be used
for business deciding and shared with multiple internal/external
users. Basically, Alteryx is deploying data in a decentralized way,
and eliminating in such way the risk of underestimating it. At the
same time, Alteryx is well-integrated, easy to use, and ran both on
premise and in cloud.
Can anyone help me understand what the text above in bold ("Basically, Alteryx is deploying data in a decentralized way, and eliminating in such way the risk of underestimating it") is trying to explain? I would like to understand it in detail, with some explanation.

The basic idea is that the tool can blend just about any kind of data and dump the result to your own local extract... the local extract is "decentralized" in that, obviously, it's local, and also you didn't need to rely on some core ETL team to build a process for you (which they would probably dump in a central location). The use of the term "underestimating" probably indicates that, if you're not building in your own insights (say, something you find online that you can blend into your analysis), you're "underestimating" the importance of that data.
It's worth noting that your custom extract could be turned into a nightly job and the output could itself be dumped to a centralized database server if desired. So the tool can be used to build centralized assets too. It really just depends on how you're using it. (With Alteryx this would require either their Desktop Automation, or their Server.)
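To make the blend-and-extract idea concrete outside of Alteryx, here is a rough Python/pandas sketch; the file names, column names, and table name are all made up for illustration:

import pandas as pd
import sqlite3

# Blend two arbitrary sources: a local CSV export and a web-hosted lookup file.
sales = pd.read_csv("sales_export.csv")                           # hypothetical local export
regions = pd.read_csv("https://example.com/region_lookup.csv")    # hypothetical online lookup

# Join ("blend") the sources and add a derived column as a home-grown insight.
blended = sales.merge(regions, on="region_id", how="left")
blended["revenue_per_unit"] = blended["revenue"] / blended["units"]

# "Decentralized": dump the result to your own local extract...
blended.to_csv("my_local_extract.csv", index=False)

# ...or, if this becomes a scheduled/nightly job, push the same output to a central database.
with sqlite3.connect("central_warehouse.db") as conn:
    blended.to_sql("blended_sales", conn, if_exists="replace", index=False)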
So... it seems that any self-service data blending tool would be capable of the same. What's special about Alteryx? The distinguishing factors lie elsewhere: the number of data types supported, overall functionality and power, performance, built-in examples, ease of use, service, support, the online community, and perhaps other areas.

Related

Questions on Azure Data Explorer normalization

Our customer currently builds out a number of use cases for the client and facilitates the onboarding of logs from 300+ applications. The client is limited in the number of use cases they can support, so they have been looking into the option of creating a custom schema with parsers, etc.
The focus is insider threat, so they are primarily collecting audit/activity logs for these applications.
The challenge they see is that application audit/activity logs vary greatly, which will make it difficult to bring the data together from multiple applications. The client has a non-standard architecture: they ingest their logs through ADX instead of Sentinel and then forward a subset of the data for alerting. They also don't make wide use of native tables yet.
Please refer to the architecture diagram in the attachment.
Question:
Is there a way of normalizing application audit and activity logs so they can build out insider threat use cases over multiple applications?
Any guidance for this requirement would be of great help. Thanks in advance.
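For what it's worth, whatever the tooling ends up being (a pre-ingestion transform, or something like ADX update policies), the conceptual shape of that normalization is a per-application mapping into one shared activity schema. A rough Python sketch with entirely invented field names, just to illustrate the idea:

# Purely illustrative: map two hypothetical application log shapes onto one
# common activity schema so insider-threat queries can span all applications.
COMMON_FIELDS = ["timestamp", "application", "actor", "action", "target", "result"]

def normalize_app_a(raw):
    # Hypothetical HR application audit record.
    return {"timestamp": raw["eventTime"], "application": "hr-app",
            "actor": raw["userName"], "action": raw["operation"],
            "target": raw["recordId"], "result": raw["status"]}

def normalize_app_b(raw):
    # Hypothetical file-share activity record with different field names.
    return {"timestamp": raw["ts"], "application": "file-share",
            "actor": raw["account"], "action": raw["activity"],
            "target": raw["path"], "result": "Success" if raw["ok"] else "Failure"}

events = [
    normalize_app_a({"eventTime": "2023-01-01T10:00:00Z", "userName": "jdoe",
                     "operation": "ExportReport", "recordId": "R-17", "status": "Success"}),
    normalize_app_b({"ts": "2023-01-01T10:05:00Z", "account": "jdoe",
                     "activity": "Download", "path": "/finance/payroll.xlsx", "ok": True}),
]

# One query shape ("who did what, to what, in which app") now works across both applications.
for event in events:
    print({field: event[field] for field in COMMON_FIELDS})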

Design for a Cloud Native Application in Azure for ML Insights and Actions

I have an idea whereby I intend to build a cloud-native application for algorithmic trading, ideally by consuming all PaaS and SaaS (no IaaS), and I'd like to get some feedback on how I intend to build it. The concept is pretty straightforward: I intend to consume financial trading data from an external SaaS solution via an API query, feed that data into various Azure PaaS solutions (most notably ML for modeling), and then take some action. Here is a high-level diagram I've come up with so far:
Solution Overview
As a note, while I'm familiar with Azure, I'm not an Azure cloud engineer and have limited experience in actually building solutions myself. Consequently, I intend to use this project as a foundation to further educate myself.
When starting on the build, I immediately questioned whether or not I should use Event Hubs. Conceptually it makes sense, in that I'm decoupling the production of a data stream from the consumption of it. Presumably, this means fewer complications when/if I need to update the data feed(s) in the future. I also thought about where the data is stored: should it be a SQL database, or more simply, an Azure Table? The idea here is that the trading data will need to be stored for regression testing as I iterate through my models. All that said, I'm looking for some insights from anybody that may have experience in this space.
Thanks!
There's no real question in here. Take a look at the architecture references provided by Microsoft: https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/
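That said, as a rough illustration of the Event Hubs decoupling the question considers (a producer publishing the market-data stream independently of whoever consumes it), a minimal Python sketch with the azure-eventhub package might look like the following; the connection string, hub name, and payload fields are placeholders:

import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholder connection details for a hypothetical "market-data" Event Hub.
CONNECTION_STR = "<event-hubs-namespace-connection-string>"
EVENTHUB_NAME = "market-data"

def publish_quotes(quotes):
    # Publish a batch of quote dicts; downstream consumers (ML scoring, storage
    # for regression testing) read the same stream independently of this producer.
    producer = EventHubProducerClient.from_connection_string(
        CONNECTION_STR, eventhub_name=EVENTHUB_NAME
    )
    with producer:
        batch = producer.create_batch()
        for quote in quotes:
            batch.add(EventData(json.dumps(quote)))
        producer.send_batch(batch)

publish_quotes([{"symbol": "ABC", "price": 101.25, "ts": "2023-01-01T09:30:00Z"}])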

Is it a good idea to perform data migration with a generic language?

There are two kinds of migration. One is updating the database schema during the development period. The other is migrating existing data into a new system (with a different schema).
There are a lot of tools available for the former scenario, such as Flyway and Liquibase. However, I am not aware of tools for the latter purpose.
We are currently using PL/SQL to do the migration. However, not all our Java developers have a DBA background. I wonder if anyone has experience using generic languages (Java, Scala, C#, etc.) with database access libraries (Hibernate, NHibernate, etc.) to perform the migration.
I'm unsure what the question is, but if I understand you correctly:
Sure, you can develop an application in any language that reads data from a data source and puts it into a data target.
A data migration between data sources does not have to be SQL-to-SQL only (in the case where the source and target are relational databases).
In fact, it often makes sense to have an application between the source and target if there's logic that needs to handle or transform data between various structures or various data sources.
For example, migrating data from an ERP system into an e-commerce system.
Another advantage of doing it via an application is that you can often include more tools/features for reporting and error handling.
Especially if the integration/migration will run often, such error handling/reporting to verify the data movement is beneficial.
Also, if the data source and data target are located in different areas/on different servers, it can be easier to do the migration via an application, to avoid needlessly opening up access between the servers and linking them together.
So basically, such an application (Java, C# ... anything) would read data from the data source, transform it into the data structure of the target, and then store it in the data target.
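As a minimal sketch of that read-transform-store pattern (Python with the standard sqlite3 module purely for illustration; the same shape applies with JDBC/Hibernate in Java or ADO.NET in C#, and all table and column names here are invented):

import sqlite3

# Hypothetical source and target databases with different schemas.
source = sqlite3.connect("legacy_system.db")
target = sqlite3.connect("new_system.db")

target.execute("""CREATE TABLE IF NOT EXISTS customer (
                      id INTEGER PRIMARY KEY,
                      full_name TEXT,
                      email TEXT)""")

errors = []
for cust_id, first, last, email in source.execute(
        "SELECT cust_id, first_name, last_name, email FROM customers"):
    try:
        # Transform: the old schema splits the name over two columns, the new one uses a single column.
        full_name = f"{first.strip()} {last.strip()}"
        target.execute("INSERT INTO customer (id, full_name, email) VALUES (?, ?, ?)",
                       (cust_id, full_name, email))
    except Exception as exc:
        # Error handling/reporting: keep going and report failed rows at the end.
        errors.append((cust_id, str(exc)))

target.commit()
print(f"Migration finished; {len(errors)} rows failed: {errors}")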
Making an application to do things is just another tool in a developer's toolbox.
However, if the data migration is basically a 1:1 movement of data from one structure to another duplicate of that structure and no transformation is needed, then it would be faster/easier to handle it directly in SQL or with a data-sync program.
That holds even without a "DBA background": not many developers have one, but that shouldn't prevent people from learning SQL, as it's also just another language. You don't need to be a DBA to be able to write SQL effectively.
So, in conclusion: yes, you can write an application, and yes, it can be a good idea. But as with almost everything in our field, it's a case-by-case/situation-by-situation evaluation whether or not it is the "better" way.

IIS 7 Logs Vs Custom

I want to log some information about my visitors. Is it better to use the IIS-generated log or to create my own in a SQL Server 2008 database?
I know I should probably provide more information about my specific scenario, but generally, I'd just like the pros and cons of either approach.
You can add additional information to the IIS logs from ASP.NET using HttpResponse.AppendToLog. Additionally, you could use the Advanced Logging module to create your own logs with custom filters and custom data, including data from performance counters and more.
It all depends on what information you want to analyse.
If you're doing aggregations and rollups then you'd want to pull this data into a database for analysis. Pulling your data into a database will give you access to indexes and better querying tools.
If you're doing infrequent, one-off simple queries then LogParser might be sufficient for your needs. However, you'll be constantly scanning unindexed flat files looking for data, which is I/O intensive.
But as you say, without knowing more about your specific scenario it's hard to say what would be best.
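If you do go the database route, a rough sketch of what pulling W3C-format IIS log lines into an indexed table might look like (Python with SQLite purely for illustration; the log file name and the handful of fields kept are assumptions):

import sqlite3

conn = sqlite3.connect("weblogs.db")
conn.execute("""CREATE TABLE IF NOT EXISTS hits (
                    date TEXT, time TEXT, cs_uri_stem TEXT,
                    sc_status TEXT, c_ip TEXT)""")
conn.execute("CREATE INDEX IF NOT EXISTS ix_hits_uri ON hits (cs_uri_stem)")

fields = []
with open("u_ex230101.log") as log:                     # hypothetical IIS log file
    for line in log:
        if line.startswith("#Fields:"):
            # The #Fields directive names the space-separated columns that follow.
            fields = line.split()[1:]
            continue
        if line.startswith("#") or not line.strip():
            continue
        row = dict(zip(fields, line.split()))
        conn.execute("INSERT INTO hits VALUES (?, ?, ?, ?, ?)",
                     (row.get("date"), row.get("time"), row.get("cs-uri-stem"),
                      row.get("sc-status"), row.get("c-ip")))
conn.commit()

# Aggregations/rollups can now use indexes instead of rescanning flat files.
for uri, count in conn.execute(
        "SELECT cs_uri_stem, COUNT(*) FROM hits GROUP BY cs_uri_stem ORDER BY 2 DESC LIMIT 10"):
    print(uri, count)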

What design decisions can I make today that would make a migration to Azure and Azure Tables easier later?

I'm rebuilding an application from the ground up. At some point in the future (not sure if it's near or far yet) I'd like to move it to Azure. What decisions can I make today that will make that migration easier?
I'm going to be dealing with large amounts of data, and I like the idea of Azure Tables. Are there some specific persistence choices I can make now that will mimic Azure Tables, so that when the time comes the pain of migration will be lessened?
A good place to start is the Windows Azure Guidance
If you want to use Azure Tables eventually, you could design your database so that every table is just a primary key plus a field holding XML data.
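A minimal sketch of that shape (Python, with SQLite standing in for the interim relational store; the two-part key mimics Azure Table storage's PartitionKey/RowKey, a JSON payload stands in here for the XML field, and everything else is made up):

import json
import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("""CREATE TABLE IF NOT EXISTS entities (
                    partition_key TEXT NOT NULL,
                    row_key       TEXT NOT NULL,
                    payload       TEXT NOT NULL,  -- serialized entity (XML or JSON)
                    PRIMARY KEY (partition_key, row_key))""")

def upsert_entity(partition_key, row_key, entity):
    conn.execute("INSERT OR REPLACE INTO entities VALUES (?, ?, ?)",
                 (partition_key, row_key, json.dumps(entity)))
    conn.commit()

def get_entity(partition_key, row_key):
    row = conn.execute("SELECT payload FROM entities WHERE partition_key = ? AND row_key = ?",
                       (partition_key, row_key)).fetchone()
    return json.loads(row[0]) if row else None

# If all access goes through key-based operations like these, swapping this layer
# for the real Azure Table storage API later is mostly mechanical.
upsert_entity("customer", "42", {"name": "Ada", "city": "London"})
print(get_entity("customer", "42"))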
I would advise planning along the lines of almost-infinitely scalable solutions (see Pat Helland's paper "Life beyond Distributed Transactions") and the CQRS approach in general. This way you'll be able to avoid common pitfalls of distributed apps in general and the peculiarities of Azure Table storage.
This really helps us to work with Azure and cloud computing at Lokad (our data sets are quite large, and various levels of scalability are needed).
