I have a project collection running in Azure DevOps Online (Services), and I would like to migrate it to Azure DevOps Server (on-premises).
Help me out here with the incompatibility issues I will face and how to overcome them.
What are the options to migrate from Azure DevOps Online (Services) to Azure DevOps Server (on-premises)?
Is there any service available in Azure to achieve this migration without any data loss?
Must I use a third-party tool to do the migration without any data loss?
Also, help me understand the downtime required for a 100 GB project collection with multiple repositories.
Project collection size: 100 GB
One of the previous answers (since deleted?) captured most of the critical points, including that no tool can migrate 100% of the data with zero loss. In practice, a fully lossless migration is not feasible, because some automatically generated and configuration values, such as work item IDs, will inherently differ between the two instances. The only truly lossless approach would be to lift and shift the complete project collection image from Azure DevOps Services to Azure DevOps Server, which is not supported by the official Azure DevOps migration tool. Given that, the remaining way to migrate data is through the Azure DevOps APIs.
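As an illustration of the API route, here is a minimal, hedged sketch of the export side for work items only, using the standard WIQL and work items REST endpoints with placeholder organization, project, and token values; a real migration would also have to replay history, links, and attachments:

```python
import base64
import requests

# Placeholder values for illustration only.
ORG_URL = "https://dev.azure.com/your-organization"
PROJECT = "YourProject"
PAT = "your-personal-access-token"

auth = base64.b64encode(f":{PAT}".encode()).decode()
headers = {"Authorization": f"Basic {auth}"}

# 1. Find the work item IDs in the project with a WIQL query.
wiql = {"query": "SELECT [System.Id] FROM WorkItems WHERE [System.TeamProject] = @project"}
resp = requests.post(
    f"{ORG_URL}/{PROJECT}/_apis/wit/wiql?api-version=7.0",
    json=wiql, headers=headers,
)
resp.raise_for_status()
ids = [item["id"] for item in resp.json()["workItems"]]

# 2. Fetch the full work items in batches of up to 200 IDs per call.
for start in range(0, len(ids), 200):
    batch = ",".join(str(i) for i in ids[start:start + 200])
    items = requests.get(
        f"{ORG_URL}/{PROJECT}/_apis/wit/workitems?ids={batch}&$expand=all&api-version=7.0",
        headers=headers,
    ).json()["value"]
    for wi in items:
        print(wi["id"], wi["fields"]["System.Title"])
        # A real migration would now recreate each item on the on-premises
        # server through its own work item tracking endpoints.
```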
So the best approach is to understand what data cannot be migrated by the migration tools you are evaluating, and then decide what works best for you. Choosing a migration solution is also not a black-and-white decision: first define the must-haves you expect from the migration, then evaluate the different migrators available on the market. Here are a few common selection criteria:
Data Loss:
Understand what data can and cannot be migrated by the migration solution. Ideally, the tool should be able to migrate work items (along with history, attachments, mentions, and inline images), test management (including test results), source code, dashboards, and areas and iterations. For builds and pipelines, you can use the native export/import feature, since those definitions need manual changes to tweak connections anyway (see the sketch after this list of criteria).
Zero Downtime:
Downtime adds operational costs and impacts development operations, because teams cannot use the Azure DevOps tools while it lasts. Understand thoroughly whether there is any scenario in which downtime will be required for any type of data.
Ease of Use:
Some tools are a collection of unsupported scripts (Naked Agility) that require a very high degree of sophistication to use. These can be extremely expensive to operate (even though the scripts are open source), error-prone, and can hinder operations.
Project Consolidation or Customized Templates:
Analyze whether you want to consolidate multiple projects into one while migrating, or whether the process templates need to be customized. If so, evaluate whether the migration tool supports such configuration with ease and has a UI for it. Manually configuring mappings for each project can be tedious and highly error-prone.
Migration Time:
Many migration tools migrate projects one by one, which consumes a lot of effort and time when the data is spread across multiple projects. Understand how many projects can be migrated in parallel to speed up the migration.
Reverse Synchronization:
Do you want to keep the data in sync between Services and Server for some time post-migration? Will data be integrated bidirectionally or unidirectionally? Answer these questions first, then evaluate whether the migration solution meets those requirements.
Commercial Support:
Migration can be tricky and time-consuming, as over time different teams will have created all sorts of odd things in there. It is better to have a team of experts do the migration for you while you focus on defining requirements and validating the completeness of the migration.
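On the export/import point for builds and pipelines under Data Loss above, here is a minimal, hedged sketch of pulling one build definition as JSON over the REST API so it can be re-imported into the new collection and then manually re-pointed at the right service connections and agent pools; the organization, project, definition ID, and token are placeholders:

```python
import base64
import json
import requests

# Placeholder values for illustration only.
ORG_URL = "https://dev.azure.com/your-organization"
PROJECT = "YourProject"
DEFINITION_ID = 42
PAT = "your-personal-access-token"

auth = base64.b64encode(f":{PAT}".encode()).decode()
headers = {"Authorization": f"Basic {auth}"}

# Export a single build (pipeline) definition as JSON.
resp = requests.get(
    f"{ORG_URL}/{PROJECT}/_apis/build/definitions/{DEFINITION_ID}?api-version=7.0",
    headers=headers,
)
resp.raise_for_status()

# Save it for import into the on-premises collection; connections, agent
# pools, and repository links still need manual tweaking after import.
with open(f"build-definition-{DEFINITION_ID}.json", "w") as f:
    json.dump(resp.json(), f, indent=2)
```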
I hope this helps. Full disclosure: I work for OpsHub, where we are experts at data migration and, using the OpsHub Azure DevOps Migrator, have migrated many organizations to and from Azure DevOps Services and Server over the last decade. Contact us if you need more help.
Related
Currently, our product is a web application with SQL Server as the DBMS, an ASP.NET backend, and a classic HTML/JavaScript/CSS frontend. The product is actively developed, and each month we have to deploy a new version of it to production.
During this deployment we update all the components listed above (apply some SQL scripts, update binaries and client files), but we deploy only the delta (the set of files that changed since the last release). This has some benefits, for example we do not reset custom data/configs/client adjustments.
Now we are going to move to clouds such as Azure or AWS, adjust the product architecture to work with Docker/Kubernetes, and provide the product as SaaS.
And now the question itself: which deployment approach is recommended in the cloud? Can we keep applying only the delta, or do we have to reorganize the process to always deploy from scratch?
If there are some Internet resources I have missed, please share.
This question is extremely broad, but maybe some clarification can steer you in the right direction anyway:
Source code deployments (like applying deltas) and container deployments are two very different directions, in the sense that the tooling you invest in during the entire SDLC can differ substantially. Some testing pipelines/products focus heavily (or exclusively) on working with one or the other. There are, of course, tools that can handle both.
They also differ in the problems they attempt to solve and come with their own pros and cons:
Source Code Deployments/Apply Diffs:
Good for small teams and quick deployments, as they're simple to understand and set up.
Starts to introduce risk when you need to upgrade the host OS or application dependencies.
Starts to introduce risk when the hosts in production begin to drift (have more differing files than expected) more dramatically over time.
Slack has a good write-up of their experience here.
Container deployments
Provides isolation between the application (developer space) and the host OS (sysadmin/ops space). This usually means the two sides can work independently of each other.
Gives you an "artifact" that won't change between deployments, i.e. the container image tagged v1 will always be the same unless you do something really funky; you can't really guarantee this with source-code deployments (see the sketch after this list).
The practice of isolating stateless components makes autoscaling those components very easy, so you can eventually spend more time on the harder ones (usually the stateful ones).
Introduces a new abstraction with new concerns that your team will have to mature into. Testing pipelines, dev tooling, and monitoring/logging architectures might all need to be adjusted over time, and that comes with cost and risk.
Stateful containers are hardly a solved problem (i.e. shoving an existing database into a container can be a surprising challenge).
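To make the "artifact" point above concrete, here is a minimal sketch, assuming the Docker CLI is installed and the image has been pulled from a registry, of resolving a mutable tag to its content-addressed digest so that deployments can pin exactly what was tested:

```python
import subprocess

def resolve_digest(image: str) -> str:
    """Return the repo digest (e.g. myapp@sha256:...) for a pulled image.

    A tag like myapp:v1 can be re-pushed and silently change;
    the digest cannot.
    """
    out = subprocess.run(
        ["docker", "inspect", "--format", "{{index .RepoDigests 0}}", image],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

if __name__ == "__main__":
    # Hypothetical image name; deployment manifests would reference the
    # digest rather than the tag, so every environment runs identical bits.
    print(resolve_digest("registry.example.com/myapp:v1"))
```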
In order to work with Kubernetes, you need a containerized application. That doesn't mean you need to containerize your entire product overnight: splitting out the frontend to deploy with CloudFront/S3 and containerizing one stateless app will get your feet wet.
Some books that talk about DevOps philosophies (in which this transition plays a part):
The DevOps Handbook
Accelerate
Effective DevOps
The SRE book
I have an idea whereby I intend to build a cloud-native application for algorithmic trading, ideally by consuming only PaaS and SaaS (no IaaS), and I'd like to get some feedback on how I intend to build it. The concept is pretty straightforward: I intend to consume financial trading data from an external SaaS solution via an API query, feed that data into various Azure PaaS services (most notably Machine Learning for modeling), and then take some action. Here is the high-level diagram I've come up with so far:
Solution Overview
As a note, while I'm familiar with Azure, I'm not an Azure cloud engineer and have limited experience actually building solutions myself. Consequently, I intend to use this project as a foundation to further educate myself.
When starting the build, I immediately questioned whether or not I should use Event Hubs. Conceptually it makes sense, in that I'm decoupling the production of a data stream from its consumption. Presumably, this means fewer complications when/if I need to update the data feed(s) in the future. I also thought about where the data should be stored: should it be a SQL database or, more simply, an Azure Table? The idea here is that the trading data will need to be stored for regression testing as I iterate through my models. All that said, I'm looking for insights from anybody who has experience in this space.
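For context, here is a rough sketch of what I have in mind for the producer side of the Event Hubs decoupling, assuming the azure-eventhub Python SDK and a placeholder connection string; the feed handler would push each quote into the hub and the downstream PaaS components would consume from there:

```python
import json
from datetime import datetime, timezone

# Requires: pip install azure-eventhub
from azure.eventhub import EventHubProducerClient, EventData

# Placeholder values for illustration only.
CONNECTION_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."
EVENTHUB_NAME = "market-data"

def publish_quotes(quotes):
    """Send a small batch of quote dicts to Event Hubs."""
    producer = EventHubProducerClient.from_connection_string(
        conn_str=CONNECTION_STR, eventhub_name=EVENTHUB_NAME
    )
    with producer:
        batch = producer.create_batch()
        for quote in quotes:
            batch.add(EventData(json.dumps(quote)))
        producer.send_batch(batch)

if __name__ == "__main__":
    publish_quotes([
        {"symbol": "MSFT", "price": 410.25,
         "ts": datetime.now(timezone.utc).isoformat()},
    ])
```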
Thanks!
There's no real question in here. Take a look at the reference architectures provided by Microsoft: https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/
I have been tasked with creating a scheduled job that first calls an API, converts the response to a new format, and then passes that data to another API. It doesn't sound like there is any logic in between.
The company I work for has a lot of SSIS packages doing a variety of things, but it also has a healthy Azure platform with a few WebJobs running. Several developers on my team have expressed a dislike for SSIS packages, so I would like to implement this in Azure, but I want to make sure that is the most reasonable thing to do.
What I am asking for is a pro/con list showing where each option is strong or weak. A good answer will help readers decide whether their specific situation is better solved with an SSIS package or an Azure WebJob, assuming the needed environment is already set up for either.
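For what it's worth, the job itself is small enough to sketch in either technology; here is a minimal, hedged Python version of the call-transform-forward flow (endpoint URLs and field names are hypothetical), which is roughly what an Azure WebJob would run on a schedule:

```python
import requests

# Hypothetical endpoints and field names for illustration only.
SOURCE_URL = "https://source.example.com/api/orders"
TARGET_URL = "https://target.example.com/api/ingest"

def transform(record: dict) -> dict:
    """Map the source schema onto the format the target API expects."""
    return {
        "orderId": record["id"],
        "totalCents": int(round(record["total"] * 100)),
        "placedAt": record["created_at"],
    }

def run_once() -> None:
    source = requests.get(SOURCE_URL, timeout=30)
    source.raise_for_status()
    payload = [transform(r) for r in source.json()]
    result = requests.post(TARGET_URL, json=payload, timeout=30)
    result.raise_for_status()

if __name__ == "__main__":
    run_once()
```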
I'm building an Azure data lake using Data Factory at the moment, and I'm after some advice on having multiple data factories versus just one.
I currently have one data factory that is sourcing data from one EBS instance, for one specific company under an enterprise. In the future, though, there might be other EBS instances, and other companies (with other applications as sources) to incorporate into the factory, and I'm thinking the diagram might get a bit messy.
I've searched around and found the site below, which recommends keeping everything in a single data factory so that linked services can be reused. I guess that is a good thing; however, as I have scripted the build for one data factory, it would be pretty easy to build the linked services again to point at the same data lake, for instance.
https://www.purplefrogsystems.com/paul/2017/08/chaining-azure-data-factory-activities-and-datasets/
Pros of having only one instance of Data Factory:
Only have to create the datasets and linked services once
Can see the overall lineage in one diagram
Cons:
Could get messy over time
Could get quite big, making it hard to even find the pipeline you are after
Has anyone got some large deployments of Azure Data Factory out there that bring in potentially thousands of data sources, mix them together, and transform them? I would be interested in hearing your thoughts.
My suggestion is to have only one, as it makes it easier to configure multiple integration runtimes (gateways). If you decide to have more than one data factory, take into consideration that a machine can only have one integration runtime installed, and that an integration runtime can be registered to only one data factory instance.
I think the cons you are listing are both fixed by having naming rules. It's not messy to find the pipeline you want if you name them like Pipeline_[Database name][db schema][table name], for example.
I have a project with thousands of datasets and pipelines, and it's not harder to handle than smaller projects.
Hope this helped!
I'd initially have agreed that an integration runtime being tied to a single data factory is a restriction; however, I suspect it either is no longer, or will soon no longer be, one.
In the March 13th update to AzureRm.DataFactories, there is a comment stating "Enable integration runtime to be shared across data factory".
I think it will depend on the complexity of the data factory and whether there are interdependencies between the various sources and destinations.
The UI in particular (even more so in V2) makes managing a large data factory easy.
However, if you choose an ARM deployment technique, the data factory JSON can soon become unwieldy in even a modestly complex data factory, and in that sense I'd recommend splitting them.
You can of course mitigate maintainability issues, as people have mentioned, by breaking your ARM templates into nested deployments, by using ARM parameterisation or Data Factory V2 parameterisation, or by using the SDK directly with separate files. Or even just use the UI (now with git support :-) ).
Perhaps more importantly, particularly as you mention sourcing from separate companies: it sounds like the data isn't related, and if it isn't, should it be isolated to avoid any coding errors? Or perhaps even to have segregated roles and responsibilities per data factory.
On the other hand, if the data is interrelated, having it in one data factory makes it far easier for Data Factory to manage data dependencies and to re-run failed slices in one go.
After the March release, you can share integration runtimes among different data factories.
The other thing to do is to create different folders for the various pipelines and datasets.
My suggestion is to create one Data Factory service per project. If you need to transfer data from two sources to one destination, and each transformation needs several pipelines, linked services, and other components, I suggest creating two separate ADF services, one per source. In that case I would treat each source as a separate integration project.
You will then also have separate CI/CD for each project.
In your source control you will likewise need two separate repositories.
If you are using ADF v1, it will get messy. At one of our clients we have over 1,000 pipelines in a single data factory. If you are starting fresh, I would recommend looking at v2, because it allows you to parameterize things and should make your scripts more reusable.
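To give a flavour of the v2 parameterization mentioned above, here is a simplified, hedged sketch, written as a Python dict that mirrors the pipeline JSON (dataset definitions and the Copy activity's source/sink properties are omitted), of a single pipeline that takes the table name as a parameter instead of needing one pipeline per table:

```python
# Simplified, illustrative shape of an ADF v2 pipeline definition that is
# parameterized by table name; not a complete, deployable definition.
parameterized_pipeline = {
    "name": "CopyOneTable",
    "properties": {
        "parameters": {
            "tableName": {"type": "String"},
        },
        "activities": [
            {
                "name": "CopyTable",
                "type": "Copy",
                "inputs": [{
                    "referenceName": "SourceTableDataset",
                    "type": "DatasetReference",
                    # The dataset declares its own tableName parameter and
                    # uses it inside its typeProperties.
                    "parameters": {
                        "tableName": "@pipeline().parameters.tableName",
                    },
                }],
                "outputs": [{
                    "referenceName": "LakeDataset",
                    "type": "DatasetReference",
                }],
                # Copy source/sink typeProperties omitted for brevity.
            },
        ],
    },
}

# One definition can then be triggered many times with different run
# parameters, e.g. {"tableName": "dbo.Customers"} or {"tableName": "dbo.Orders"}.
```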
I'm rebuilding an application from the ground up. At some point in the future (not sure yet whether it's near or far), I'd like to move it to Azure. What decisions can I make today that will make that migration easier?
I'm going to be dealing with large amounts of data, and I like the idea of Azure Tables. Are there some specific persistence choices I can make now that will mimic Azure Tables, so that when the time comes the pain of migration will be lessened?
A good place to start is the Windows Azure Guidance
If you want to use Azure Tables eventually, you could design your database so that every table consists of just a primary key plus a field containing XML data.
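A minimal sketch of that idea, using SQLite purely as a stand-in for SQL Server: every entity is stored as a partition key, a row key, and an XML payload, which later maps almost one-to-one onto an Azure Table entity:

```python
import sqlite3
import xml.etree.ElementTree as ET

# SQLite stands in for SQL Server here; the table shape is what matters.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        PartitionKey TEXT NOT NULL,
        RowKey       TEXT NOT NULL,
        Payload      TEXT NOT NULL,   -- XML blob holding all other properties
        PRIMARY KEY (PartitionKey, RowKey)
    )
""")

def save_customer(region: str, customer_id: str, name: str, email: str) -> None:
    """Store everything except the keys inside the XML payload."""
    root = ET.Element("Customer")
    ET.SubElement(root, "Name").text = name
    ET.SubElement(root, "Email").text = email
    conn.execute(
        "INSERT OR REPLACE INTO Customers VALUES (?, ?, ?)",
        (region, customer_id, ET.tostring(root, encoding="unicode")),
    )

save_customer("uk-south", "cust-001", "Ada Lovelace", "ada@example.com")
row = conn.execute(
    "SELECT Payload FROM Customers WHERE PartitionKey = ? AND RowKey = ?",
    ("uk-south", "cust-001"),
).fetchone()
print(row[0])
```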
I would advise planning along the lines of almost-infinitely scalable solutions (see Pat Helland's paper, Life beyond Distributed Transactions) and the CQRS approach in general. This way you'll be able to avoid the common pitfalls of distributed apps in general and the peculiarities of Azure Table Storage.
This approach really helps us to work with Azure and cloud computing at Lokad (our data sets are quite large, and various levels of scalability are needed).