Do you do automated testing on a complex workflow system like K2?
We are building a system with extensive integration between SharePoint 2007 and K2. I can't even imagine where to start with automated testing, as the workflow involves multiple users interacting with SharePoint, K2 workflows and custom web pages.
Has anyone done automated testing on a workflow server like K2? Is it more effort than it's worth?
I'm having a similar problem testing a workflow-heavy MOSS-based application. In our case the workflows are based on Windows Workflow Foundation (WF).
My idea is to mock pretty much everything that you can't control from unit tests - document storage, authentication, user rights and actions, and the SharePoint-specific parts of the workflows (these mocks should be thoroughly tested to mirror the behavior of the real components).
Use inversion of control so that the code chooses which component to use at runtime - the real one or the mock.
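For illustration, here's a minimal sketch of that swap, assuming a hand-rolled document-storage abstraction; the interface, class names and the SharePoint-backed implementation are all made up for the example, and any IoC container (or plain constructor injection) would do the wiring.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over document storage. The workflow code depends
// only on this interface, never on SharePoint types directly.
public interface IDocumentStore
{
    void Save(string library, string fileName, byte[] content);
    byte[] Load(string library, string fileName);
}

// Production implementation - would call the SharePoint object model.
public class SharePointDocumentStore : IDocumentStore
{
    public void Save(string library, string fileName, byte[] content)
    {
        throw new NotImplementedException("Calls SharePoint in the real system");
    }

    public byte[] Load(string library, string fileName)
    {
        throw new NotImplementedException("Calls SharePoint in the real system");
    }
}

// Mock used from unit tests - keeps documents in memory.
public class InMemoryDocumentStore : IDocumentStore
{
    private readonly Dictionary<string, byte[]> files = new Dictionary<string, byte[]>();

    public void Save(string library, string fileName, byte[] content)
    {
        files[library + "/" + fileName] = content;
    }

    public byte[] Load(string library, string fileName)
    {
        return files[library + "/" + fileName];
    }
}
```

In production the container is wired to SharePointDocumentStore; the tests register InMemoryDocumentStore instead, so the workflow code under test never touches a real SharePoint farm.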
Then you can write system-wide tests that exercise workflow behavior - setting up your own environment and checking how the workflow engine reacts. These tests are too big to call unit tests, but they are still automated tests.
This approach seems to work on trivial cases, but I still have to prove it is worthwhile for real-world workflows.
Here's the solution I use. It is a simple wrapper around the runtime that allows executing a single activity, simplifies passing parameters, blocks the invoking thread until the workflow or activity is done, and translates/rethrows exceptions, if any. Since my workflow only sends or waits for messages through a custom workflow service, I can mock out the service to expect certain messages from the workflow and post certain messages to it - and with that I have real unit tests for my WF! Credit for the technique goes to Michael Kennedy.
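For reference, here is a rough sketch of that kind of wrapper against the WF 3.x runtime. It is not the actual helper (nor Michael Kennedy's code): registration of the custom workflow service is omitted and the exception translation is simplified.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Workflow.Runtime;

// Minimal synchronous wrapper around the WF 3.x runtime: starts a workflow,
// blocks the calling thread until it completes or terminates, and rethrows
// any terminating exception wrapped in a more test-friendly type.
public static class SynchronousWorkflowRunner
{
    public static Dictionary<string, object> Run(Type workflowType,
                                                 Dictionary<string, object> inputs)
    {
        using (var runtime = new WorkflowRuntime())
        {
            Dictionary<string, object> outputs = null;
            Exception failure = null;
            var finished = new AutoResetEvent(false);

            // Capture the results or the terminating exception, then unblock the caller.
            runtime.WorkflowCompleted += (sender, e) =>
            {
                outputs = e.OutputParameters;
                finished.Set();
            };
            runtime.WorkflowTerminated += (sender, e) =>
            {
                failure = e.Exception;
                finished.Set();
            };

            WorkflowInstance instance = runtime.CreateWorkflow(workflowType, inputs);
            instance.Start();
            finished.WaitOne();   // block the invoking thread until the workflow is done

            if (failure != null)
                throw new InvalidOperationException("Workflow terminated", failure);
            return outputs;
        }
    }
}
```

A test would register the mocked workflow service with the runtime (omitted above) before starting the instance, then assert on the returned output parameters.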
If you are going to do unit testing, Typemock Isolator is the only tool that can currently mock SharePoint objects.
And by the way, Richard Fennell is working on a workflow mocking solution here.
We've just today written an application that monitors our K2 worklist, picks up certain tasks from it, fills in some data and submits the tasks for completion. This allows us to perform automated testing, find regressions, and run through many different paths of the workflow in a fraction of the time it would take people to do it. I'd imagine a similar program could be written to pretend to be SharePoint.
As for unit testing the workflow items themselves, we have a DLL referenced from K2 which contains all of our line rule and processing logic. We don't have any code in the K2 workflows themselves; it is all referenced from these DLLs. This allows us to easily write unit tests against them to cover all of the individual line rules.
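As a hypothetical illustration of what that buys you (the rule class and threshold below are invented, not from the actual project), the line-rule logic can be exercised with a plain NUnit test and no K2 server at all:

```csharp
using NUnit.Framework;

// Example line-rule class living in an ordinary assembly referenced from K2.
public class ManagerApprovalLineRule
{
    public bool RequiresManagerApproval(decimal amount)
    {
        // Hypothetical business rule: large amounts need a manager's sign-off.
        return amount >= 10000m;
    }
}

[TestFixture]
public class ManagerApprovalLineRuleTests
{
    [Test]
    public void Amounts_at_or_above_threshold_require_manager_approval()
    {
        var rule = new ManagerApprovalLineRule();

        Assert.IsTrue(rule.RequiresManagerApproval(10000m));
        Assert.IsFalse(rule.RequiresManagerApproval(9999.99m));
    }
}
```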
I've done automated integration testing on K2 workflows using the K2ROM API (probably SourceCode.Workflow.Client if you're using K2 blackpearl).
Basically you start a process on a test server with a known folio (I generate a GUID), then use the management API to delete it afterwards. I wrote helper methods like AssertAtClientActivity (basically calls ProvideWorkItem with criteria).
Use the IsSynchronous parameter to StartProcessInstance, WorklistItem.Finish, etc. so that relevant method calls will not return until the process instance has reached a stable state.
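A sketch of the shape such a test can take is below. The K2TestHelper type and its members are hypothetical stand-ins for the wrapper described above; their real bodies would call the K2 client API (StartProcessInstance, ProvideWorkItem, ...) and the management API, and the exact member names differ between K2ROM and SourceCode.Workflow.Client.

```csharp
using System;
using NUnit.Framework;

// Hypothetical helper wrapping the K2 client and management APIs.
// Bodies are stubs here; the real implementation would live in your test project.
public static class K2TestHelper
{
    public static void StartTestProcess(string processFullName, string folio, bool synchronous)
    {
        throw new NotImplementedException("Wraps the K2 client API");
    }

    public static void AssertAtClientActivity(string folio, string activityName)
    {
        throw new NotImplementedException("Calls ProvideWorkItem with folio/activity criteria");
    }

    public static void DeleteProcessInstance(string folio)
    {
        throw new NotImplementedException("Wraps the K2 management API");
    }
}

[TestFixture]
public class LeaveRequestProcessTests
{
    [Test]
    public void New_request_routes_to_manager_approval()
    {
        string folio = Guid.NewGuid().ToString();   // unique folio per test run
        try
        {
            // Start synchronously so the call only returns once the instance is stable.
            K2TestHelper.StartTestProcess(@"HR\LeaveRequest", folio, synchronous: true);

            // Fails the test if no worklist item is waiting at the expected activity.
            K2TestHelper.AssertAtClientActivity(folio, "Manager Approval");
        }
        finally
        {
            // Clean up so test instances don't pile up on the test server.
            K2TestHelper.DeleteProcessInstance(folio);
        }
    }
}
```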
Expect tests to be slow and to occasionally fail. These are not unit tests.
If you want to write unit tests against other systems, you'll probably want to wrap the K2 API.
Consider looking at Windows Workflow 4 and the new workflow features in SharePoint 2010. You may not need K2.
I have done some research on the question below, but couldn't find the right information.
I have a scenario where a user creates some data using a create REST API and saves it in the backend. Later, the user retrieves the saved data using a get API to validate that the data was saved in the backend as part of the create API call.
Now, can creating the data in the backend and retrieving it be combined into one feature, or should there be two features - one for creating the data and the other for retrieving it? If it can be done both ways, what are the advantages of one over the other?
There is no specific rule of thumb for how to group business logic into features. However, there are some technical details that make your code behave differently depending on how you group it. Here is some advice:
Background is defined once per feature. So if your tests require different backgrounds, it probably makes sense to put them in different features (testing the get would probably imply inserting some data before the test, which is not necessary for testing the put).
If you're not "gluing" your files explicitly, they are picked up based on the position of your runner classes within the package structure. So you can play with different configurations not only at the Gherkin level but also at the level of the particular test framework (like JUnit or TestNG). This is much like the previous point, but using the capabilities provided by the underlying unit test framework.
If you need to run your tests in parallel, the way you group things into features can also matter. When you run Cucumber-JUnit4 in parallel, it runs feature files in parallel but runs all the scenarios inside a single feature sequentially.
You might also need to tag some tests in a special way. If there are a lot of such tests, it makes sense to put them in a separate file and apply the tag to the entire feature rather than tagging each test individually.
I would suggest having two separate scenarios to validate the POST and the GET. That way, you have better visibility of the two separate APIs. During later runs, you would also be able to tell from the title which API works and which one is broken (if any), without having to go into the step definitions to check whether the scenario for the POST API also validates the GET or whether that's a separate scenario.
So, one scenario to validate the POST and whether it returns 201 Created. And another scenario to validate the GET.
I'm working on a strategy for my company, which provides testing/development services. I implement test automation for both web and mobile apps using Selenium/Appium, JUnit, and Cucumber.
In my company test cases are written in traditional form:
1) Go to X
2) Perform action Y
3) Go to W
4) Perform action Z
Expected result: The application does ... .
But in Cucumber I use behavioral language which describes more or less similar actions. I have also read this article: http://markoh.co.uk/posts/three-reasons-to-use-cucumber-for-test-automation and I wonder if we should write all our test cases in the Cucumber language. For test automation, having a feature written would then just be a matter of copy and paste. I assume this is a web or mobile app with a GUI.
Is this a good idea?
Have you got any experience with documenting test cases this way over the long term?
Could manual testers have difficulties using test cases written in this manner instead of the traditional form?
Any input appreciated!
The main advantage of Cucumber test cases is their reliability: you cannot change a test scenario without a corresponding code update. Cucumber also lets you factor out common procedures, which can be useful even in manual tests. The test cases are self-documenting, so we usually have no difficulty with technical personnel reading the scenarios. I successfully used this approach in my previous job and I'm going to introduce it now as well. I would also suggest using Cucumber's Background feature, which allows you to define test prerequisites.
I'd like to know about specific problems you - the SO reader - have solved using Workflow Engines and what libraries/frameworks you used if you didn't roll your own. I'd also like to know when a Workflow Engine wasn't the best choice and if/how you chose something simpler, like a TaskList/WorkList/Task-Management type application using state machines.
Questions:
What problems have you used workflow engines to solve?
What libraries/frameworks did you use?
When did a simpler State Machine/Task Management like system suffice?
Bonus: How did/do you make the distinction between Task Management and Workflow Engine?
I'm looking for first-hand experiences.
Some of the resources I've checked out:
Ruote and State Machine != Workflow Engine
StonePath and Docs
Creating and Managing Worklist Task Plans with Oracle
Design and Implementation of a Workflow Engine - Thesis
What to use Windows Workflow Foundation For
JBoss jBPM Docs
I'm biased as well, as I am the main author of StonePath.
I have developed workflow applications for the U.S. State Department, the Geneva Centre for Humanitarian Demining, several Fortune 500 clients, and most recently the Washington DC Public School System. Every time I have seen a 'workflow engine' that tried to be the one master reference for business processes, I have seen an organization fighting itself to work around the tool. This may be due to the fact that these solutions have always been vendor/product driven, which then ends up with a tactical team of 'consultants' constantly feeding the app... but because of this, I tend to react negatively when I hear the benefits of process-based tools that promise to 'centralize the workflow definitions in one place and make them repeatable'.
I very much like Ruote - I have been following that project for some time, and should I need that kind of solution, it will be the next tool I'll be willing to try. StonePath has a very different purpose than Ruote:
Ruote is useful for Ruby in general,
StonePath is aimed at Rails, the web framework written in Ruby.
Ruote is about long-lived business processes and their associated definitions (Note - active development on ruote ceased).
StonePath is about managing State-based workflow and tasking.
Frankly, I think the distinction from the outside looking in might be subtle - many times the same kinds of business processes can be represented either way - the state-and-task-based model tends to map to my mental model though.
Let me describe the highlights of a state-based workflow.
States
Imagine a workflow revolving around the processing of something like a mortgage loan or a passport renewal. As the document moves 'around the office', it travels from state to state.
If you are responsible for the document, and your boss asked you for a status update, you'd say things like:
"It is in data entry"...
"We are checking the applicant's credentials now"...
"we are awaiting quality review"...
"We are done"... and so on.
These are the states in a state-based workflow. We move from state to state via transitions - like "approve", "apply", "kickback", "deny", and so on. These tend to be action verbs. Things like this are modeled all the time in software as a state machine.
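As a bare-bones illustration (the type and transition names below are invented for the example), such a state machine can be as small as a transition table:

```csharp
using System;
using System.Collections.Generic;

// Possible states of the application being processed.
public enum ApplicationState { DataEntry, CredentialCheck, QualityReview, Done, Denied }

public class ApplicationStateMachine
{
    // (current state, transition verb) -> next state
    private static readonly Dictionary<(ApplicationState, string), ApplicationState> Transitions =
        new Dictionary<(ApplicationState, string), ApplicationState>
        {
            { (ApplicationState.DataEntry,       "apply"),    ApplicationState.CredentialCheck },
            { (ApplicationState.CredentialCheck, "approve"),  ApplicationState.QualityReview },
            { (ApplicationState.CredentialCheck, "deny"),     ApplicationState.Denied },
            { (ApplicationState.QualityReview,   "kickback"), ApplicationState.DataEntry },
            { (ApplicationState.QualityReview,   "approve"),  ApplicationState.Done },
        };

    public ApplicationState Current { get; private set; } = ApplicationState.DataEntry;

    public void Fire(string transition)
    {
        // Only transitions listed in the table are legal from the current state.
        if (!Transitions.TryGetValue((Current, transition), out var next))
            throw new InvalidOperationException($"'{transition}' is not valid from {Current}");
        Current = next;
    }
}
```

Real engines layer guards, callbacks, audit logging and persistence on top of this core idea.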
Tasks
The next part of a state/task-based workflow is the creation of tasks.
A Task is a unit of work, typically with a due date and handling instructions, that connects a work item (the loan application or passport renewal, for instance) to a user's "inbox".
Tasks can happen in parallel with each other or sequentially.
Tasks can be created automatically when we enter states.
Tasks can be created manually as people realize work needs to get done.
Tasks can be required to be completed before we can move on to a new state.
This kind of behavior is optional, and part of the workflow definition.
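Continuing the same hypothetical sketch, the task side can hang off state entry, with a guard that blocks leaving the state until the tasks are done (again, all names here are invented):

```csharp
using System;
using System.Collections.Generic;

// A unit of work routed to someone's inbox.
public class WorkTask
{
    public string Instructions { get; set; }
    public DateTime DueDate { get; set; }
    public string AssignedTo { get; set; }   // the user's "inbox"
    public bool Completed { get; set; }
}

public class WorkItem
{
    public List<WorkTask> OpenTasks { get; } = new List<WorkTask>();

    // Called automatically when the work item enters a state.
    public void OnEnterState(string state)
    {
        if (state == "QualityReview")
            OpenTasks.Add(new WorkTask
            {
                Instructions = "Review the application for completeness",
                DueDate = DateTime.Today.AddDays(3),
                AssignedTo = "quality-team"
            });
    }

    // Guard checked before firing a transition out of the current state.
    public bool CanLeaveState()
    {
        return OpenTasks.TrueForAll(t => t.Completed);
    }
}
```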
The rabbit hole can go a lot deeper than this, and I wrote an article about it for Issue #4 of PragPub, the Pragmatic Programmer's Magazine. Check out the repo link above for an updated PDF of that article.
In working with StonePath over the last few months, I have found that the state-based model maps really well to RESTful web architectures - in particular, tasks and state transitions map nicely onto nested resources. Expect to see future writing from me on this subject.
I'm biased, I'm one of the authors of ruote.
variant 1) state machine attached to a resource (document, order, invoice, book, piece of furniture).
variant 2) state machine attached to a virtual resource named a task
variant 3) workflow engine interpreting workflow definitions
Now, since your question is tagged "BPM", which we can expand into "Business Process Management", how does that kind of management occur in each of these variants?
In variant 1, the business process (or workflow) is scattered in the application. The state machine attached to the resource enforces some of the aspects of the workflow, but only those related to the resource. There may be other resources with their own state machine following the same business process.
In variant 2, the workflow can be concentrated around the task resource and represented by the state machine around that resource.
In variant 3, the workflow is enacted by interpreting a resource called a workflow definition (or business process definition).
What happens when the business process changes ? Is it worth having a workflow engine where business processes are manageable resources ?
Most state machine libraries have one set of states + transitions. Workflow engines are, most of them, workflow definition interpreters, and they allow multiple different workflows to run together.
What will be the cost of changing the workflow ?
The variants are not mutually exclusive. I have seen many examples where a workflow engine changes the state of multiple resources some of them guarded by state machines.
I also use variant 3 + 2 a lot for human tasks: at certain points while running a process instance, the workflow engine hands a task (workitem) to a human participant (a task resource is created and placed in the state 'ready').
You can go a long way with variant 2 alone (the task manager variant).
We could also mention variant 0), where there is no state machine, no workflow engine, and the business process(es) are scattered and/or hardcoded in the application.
You can ask many questions, but if you don't take the time to read the answers and don't take the time to try out and experiment, you won't go very far, and will never acquire any flair for when to use this or that tool.
On a previous project I was working on, I added some workflow-type rules to a set of government forms in the healthcare industry.
Forms needed to be filled out by the end user, and depending on some answers, other forms were scheduled to be filled out at a later date. There were also external events that would cancel scheduled forms or schedule new ones.
Sample flow:
Patient Admitted -> Schedule Initial Assessment Form -> Schedule Quarterly Review Form -> Patient Died -> Cancel Review -> Schedule Discharge Assessment Form
Many other rules were based on things such as patient age, where they were being admitted, etc.
This was an ASP.NET app, and the rules were basically a table in the database. I added scripting, so a script would run on form completion to determine what to do next. This was a horrid design, and would have been a perfect fit for a proper workflow engine.
I'm one of the authors of the open source Temporal workflow engine, which we initially developed at Uber as Cadence. The difference between Temporal and the majority of existing workflow engines is that it is developer focused and extremely flexible and scalable (to tens of thousands of updates per second and up to billions of open workflows). Workflows are written as object oriented programs, and the engine ensures that the state of the workflow objects, including thread stacks and local variables, is fully preserved in case of host failures.
What problems have you used workflow engines to solve?
Temporal is used for practically any backend application that lives beyond a single request-reply. Examples of usage are:
Distributed CRON jobs
Managing ML/Data pipelines
Reacting to business events. For example trip events at Uber. The workflow can accumulate state based on events received and execute activities when necessary.
Services Deployment to Mesos/ Kubernetes
CI Pipeline implementation
Ensuring that multiple service calls complete when a request is received. Including SAGA pattern implementation
Managing human worker tasks (similar to Amazon MTurk)
Media processing
Customer Support Ticket Routing
Order processing
Testing service similar to ChaosMonkey
and many others
The other set of use cases is based on porting existing workflow engines to run on Temporal. Practically any existing engine workflow specification language can be ported to run on Temporal. This way a single backend service can power multiple domain specific workflow systems.
What libraries/frameworks did you use?
Temporal is a self-contained service written in Go, with Go, Java, PHP, and TypeScript client-side SDKs (.NET and Python are coming in 2022). The only external dependency is storage: Cassandra, MySQL, and PostgreSQL are supported. Elasticsearch can be used for advanced indexing.
Temporal also supports asynchronous cross-region (using AWS terminology) replication.
When did a simpler State Machine/Task Management like system suffice?
The open source Temporal service can be self-hosted, or the temporal.io cloud offering can be used. So the overhead of building any custom state machine/task management is always higher than using Temporal. The only real overhead is that the service and the storage for it need to be set up. If you already have an SQL database, the service deployment is trivial through a Docker image. The Docker image is also used to run a local Temporal service for development on a personal computer or laptop.
I am one of the authors of Imixs-Workflow. Imixs-Workflow is an open source workflow engine based on BPMN 2.0 and fully integrated into the Java EE technology stack.
I have been developing workflow engines myself for more than 10 years. I will try to answer your questions briefly:
> What problems have you used workflow engines to solve?
My personal goal when I started thinking about workflow engines was to avoid hard-coding the business logic within my application. Many things in a business application can be reused, so it makes sense to keep them configurable. For example:
sending out a notification
viewing open tasks
assigning a task to a person
describing the current task
From this function list you can see I am talking about human-centric workflows. In short: A human-centric workflow engine answers the questions: Who is responsible for a task and who needs to be informed next? And these are the typical questions in business requirements.
> What libraries/frameworks did you use?
Five years ago we started reimplementing the Imixs-Workflow engine with a focus on BPMN 2.0. BPMN is the common standard for process modeling. The surprising thing for me was that we were suddenly able to describe even highly complex business processes that could be both visualized and executed. I recommend that everyone use BPMN for modeling business processes.
> When did a simpler State Machine/Task Management like system suffice?
A simple state machine is sufficient if you just want to track the status of a business object. This is the case when you begin to introduce the 'status' attribute into your object model. But in case you need business processes with responsibilities, logging and flow control, then a state machine is no longer sufficient.
> Bonus: How did/do you make the distinction between Task Management and Workflow Engine?
This is exactly the point where many of the workflow engines mentioned here differ. For a human-centric workflow you typically need task management to distribute tasks among human actors. For process automation this point is less relevant; it is sufficient if the engine performs certain tasks. Task management and workflow engines cannot really be compared, because task management is always one function of a workflow engine.
Check out the rails_workflow gem - I think it is close to what you are searching for.
I have experience with using the Activiti BPMN 2.0 engine to handle high-performance, high-throughput data transfer processes in an infrastructure of network nodes. The basic task was to allow configuration and monitoring of such transfer processes and to control each network node (i.e. request node1 to send a data file to node2 via a specific transport layer).
There could be thousands of processes running at a time, and overall tens or low hundreds of thousands of processes per day.
There were a bunch of different process definitions, but it was not necessarily required that an operator of the system be able to create custom workflows. So the primary requirement for the BPM engine itself was to be robust and scalable and to allow monitoring of each process flow.
In the end it basically worked, but what we learned from that project was that a BPMN platform, or rather the Activiti engine specifically, was not the best bet for such a high-throughput system.
The main challenges were task execution prioritization, DB locking, and execution retries, to name a few concerning the BPM itself. So we had to develop custom handling for these, for example:
Handling of retries in the BPM for cases when a node had no free worker for a given task, or when the node was not running at all.
Execution of parallel transfer tasks in a single process and synchronization of the results (success/failure).
I don't know whether other BPMN engines would be more suitable for such a scenario, since BPMN is mostly intended for long-running business tasks involving user interaction, where performance is not the same kind of issue it was in our case.
I rolled my own workflow engine to support phased processing of documents - cataloging, sending for image processing (we work with redaction software), sending to validation if needed, then release, and finally shipping back to the client. In our case we have a truckload of documents to process, so sometimes we need to run each service separately to control delivery and resource usage. Simple in concept, but high performance and distributed processing were needed, and we couldn't find any off-the-shelf product that fit the bill for us.
We need to build a couple applications that require fairly advanced workflow functionality. The plan is to store the data in SQL Server, use Windows Workflow Foundation as the workflow engine, and build the frontend using an RIA technology such as Flex or Silverlight.
We already have SharePoint 2007 set up, and some of us (including me) have a little experience creating custom SharePoint workflows that work with data in SharePoint lists.
My question is, would it make sense to use SharePoint for the workflow, while the actual data is stored outside of SharePoint in a separate database? We need the task, authentication, and email functionality of SharePoint, but our data model is a bit complex, so we'd rather not store the data in SharePoint. We'd also rather not start from scratch with Workflow Foundation, because SharePoint already gives us 90% of the functionality we need.
Any thoughts / advice?
I think this is a great example of using SharePoint as a platform. I don't see any conceptual problems with using it in the way you describe; I see SharePoint as a development platform. One thing you might want to keep in mind: if you want the workflow to continue based on events happening in the separate database, you might have to update, for instance, the workflow task items from an external program.
Your use case is a perfect fit and one that SharePoint adds great value to. I would highly recommend using SharePoint to host your workflows.
I have developed many SharePoint-hosted WF workflows, and the only real problem I ever experienced was making calls to long-running web services (asynchronous operations), as SharePoint's WF host has some limitations on the types of external providers it can listen for events from.
The solution I developed (which was a bit of a hack at first but ended up being of real value to my customers) was to create a service proxy (WCF) that sat outside of SharePoint and would route calls to remote services and wait for their responses. In parallel with making that asynchronous call, a parallel activity would create a SharePoint task associated with the asynchronous operation. The WF would then stop on an OnTaskCompleted activity, which causes the WF resources to be released and the state to be persisted to SQL. As the long-running operation sent back status updates or a completion notification, the external service would update the related SharePoint task. Once the task was marked completed, the WF was rehydrated and continued executing. The neat thing about this approach was that I could then create a dashboard showing the status of all the long-running processes going on outside of SharePoint. Lastly, I packaged all of this up into a composite activity so that it didn't clutter up my pretty workflow diagrams.
SharePoint is ideally suited for this scenario. I would suggest using the Business Data Catalog (BDC) to access external data sources. It provides tremendous benefit, primarily by making your data source searchable, as well as providing out-of-the-box web parts to display the data with master-child relationships, filtering, and a rich API.
I would caution against making workflows too complex; instead, break up the process into stages using smaller workflows, InfoPath and user actions to facilitate the entire process. This is where SharePoint really shines, as you can give others in the organization visibility into the process stages using dashboards (if that makes sense for your scenario), as well as collaboration, approvals... the list goes on.
I agree that SP can provide a nice WF engine, but let me ask this... are you storing anything IN SharePoint? (tasks, data sources, etc)
I ask because it may be as easy (and more appropriate) to run your own WF engine. If you are running all native WF functionality, and just need an engine, you can write a quick console app that can start workflows.
If you are using SP for anything beyond WF, then I absolutely agree to use SP.
I just started getting into BizTalk at work and would love to keep using everything I've learned about DDD, TDD, etc. Is this even possible, or am I always going to have to use the Visio-like editors when creating things like pipelines and orchestrations?
You can certainly apply a lot of the concepts of TDD and DDD to BizTalk development.
You can design and develop around the concept of domain objects (although in BizTalk and integration development I often find interface objects or contract first design to be a more useful way of thinking - what messages get passed around at my interfaces). And you can also follow the 'Build the simplest possible thing that will work' and 'only build things that make tests pass' philosophies of TDD.
However, your question sounds like you are asking more about the code-centric sides of these design and development approaches.
Am I right that you would like to be able to follow the test-driven development approach of first writing a unit test that exercises a requirement and fails, then writing a method that fulfils the requirement and causes the test to pass - all within a traditional programming language like C#?
For that, unfortunately, the answer is no. The majority of BizTalk artifacts (pipelines, maps, orchestrations...) can only really be built using the Visual Studio BizTalk plugins. There are ways of viewing the underlying C# code, but you would never want to try to develop that code directly.
There are two tools, BizUnit and BizUnit Extensions, that give some ability to control the execution of BizTalk applications and test them, but this really only gets you to the point of performing more controlled and more test-driven integration tests.
The shapes that you drag onto the Orchestration design surface will largely just do their thing as one opaque unit of execution. And Orchestrations, pipelines, maps etc... all these things are largely intended to be executed (and tested) within an entire BizTalk solution.
Good design practices (taking pointers from approaches like TDD) will lead to breaking BizTalk solutions into smaller, more modular and more testable chunks, and there are ways of testing things like pipelines in isolation.
But the detailed specifics of TDD and DDD in code sadly don't translate.
For some related discussion that may be useful see this question:
Mocking WebService consumed by a Biztalk Request-Response port
If you often make use of pipelines and custom pipeline components in BizTalk, you might find my own PipelineTesting library useful. It allows you to use NUnit (or whatever other testing framework you prefer) to create automated tests for complete pipelines, specific pipeline components or even schemas (such as flat file schemas).
It's pretty useful if you use this kind of functionality, if I may say so myself (I make heavy use of it on my own projects).
You can find an introduction to the library here, and the full code on github. There's also some more detailed documentation on its wiki.
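For a rough idea of the shape of such a test (the OrderSchema type below stands in for a schema class generated from your own .xsd, and the member names are only approximate - see the wiki for the exact API):

```csharp
using Microsoft.BizTalk.DefaultPipelines;
using NUnit.Framework;
using Winterdom.BizTalk.PipelineTesting;

[TestFixture]
public class OrderReceivePipelineTests
{
    [Test]
    public void Xml_receive_pipeline_disassembles_an_order_message()
    {
        // Run the stock XMLReceive pipeline outside of BizTalk.
        var pipeline = PipelineFactory.CreateReceivePipeline(typeof(XMLReceive));

        // Register the schema the XML disassembler should use.
        // OrderSchema is hypothetical - use your own generated schema type.
        pipeline.AddDocSpec(typeof(OrderSchema));

        var input = MessageHelper.CreateFromString("<Order xmlns='http://example/order'/>");
        var output = pipeline.Execute(input);

        // One disassembled message is expected out of the pipeline.
        Assert.AreEqual(1, output.Count);
    }
}
```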
I agree with the comments by CKarras. Many people have cited that as their reason for not liking the BizUnit framework. But take a look at BizUnit 3.0. It has an object model that allows you to write the entire test step in C#/VB instead of XML. BizUnitExtensions is being upgraded to the new object model as well.
The advantage of the XML-based system is that it is easier to generate test steps and there is no need to recompile when you update the steps. In my own Extensions library, I found the XmlPokeStep (inspired by NAnt) to be very useful. My team could update test step XML on the fly. For example, let's say we had to call a web service that created a customer record and then checked a database for that same record. If the web service returned the ID (dynamically generated), we could update the test step for the next step on the fly (not in the same XML file, of course) and then use that to check the database.
From a coding perspective, IntelliSense should be addressed now in BizUnit 3.0. The lack of an XSD did make things difficult in the past. I'm hoping to get an XSD out that will aid IntelliSense. There were some snippets as well for an old version of BizUnit, but those haven't been updated; maybe if there's time I'll give that a go.
But coming back to the TDD issue: if you take some of the intent behind TDD - the specification or behavior driven element - then you can apply it to some extent to BizTalk development as well, because BizTalk is based heavily on contract-driven development. So you can specify your interfaces first, create stub orchestrations etc. to handle them, and then build out the core. You could write the BizUnit tests at that time. I wish there were some tools that could automate this process, but right now there aren't.
Using frameworks such as the ESB Guidance can also help give you a base platform to work from, so you can implement the major use cases in your system iteratively.
Just a few thoughts. Hope this helps. I think it's worth blogging about more extensively.
This is a good topic to discuss. Do ping me if you have any questions, or we can always discuss more over here.
Rgds
Benjy
You could use BizUnit to create and reuse generic test cases, both in code and in Excel (for functional scenarios):
http://www.codeplex.com/bizunit
BizTalk Server 2009 is expected to have more IDE integrated testability.
Cheers
Hemil.
BizUnit is really a pain to use because all the tests are written in XML instead of a programming language.
In our projects, we have "ported" parts of BizUnit to a plain old C# test framework. This allows us to use BizUnit's library of steps directly in C# NUnit/MSTest code. This makes tests easier to write (using VS IntelliSense), more flexible, and, most importantly, easier to debug in case of a test failure. The main drawback of this approach is that we have forked from the main BizUnit source.
Another interesting option I would consider for future projects is BooUnit, which is a Boo wrapper on top of BizUnit. It has advantages similar to our BizUnit "port", but also has the advantage of still using BizUnit instead of forking from it.