Visio Process-Diagram connected/embedded to detailed Narrative

How can I add a lot more detail, e.g. an (Excel) table, to each process step of a Visio process diagram?
I have to draw process diagrams for all kinds of purposes. Visio is one of the best tools I know for this, but it lacks a way (or I don't know of one) to add a lot more detail to a process step.
I've asked countless people about the best approach, but so far I have only heard of the following, and none of them is really good:
Use index numbers for the process steps and then embed the Visio process diagram into Word. Reference the process steps by index number and describe them in detail. (Disadvantage: multiple documents; every change has to be reflected in all of them, which is time-consuming.)
Don't use Visio at all and draw the process diagram in Excel. Excel is perfect for adding a description right next to the process steps of a simpler diagram. (Disadvantage: only possible for simpler workflow diagrams; Excel is not a process drawing tool.)
Link process steps to other sub-process diagrams. (Disadvantage: everything is spread over a lot of places, and it is still not really possible to attach table-level detail.)
I can't believe that in the year 2017 this is still a problem. Hundreds of thousands of people must have this challenge, so I'm quite sure I must be overlooking something.
Is it, for example, possible to embed an Excel table into a Visio document and link from the process steps to a specific row of this table (with auto-referencing/-sorting in the best case)? If this is the best practice, how do I print it on paper afterwards?
Research on the web shows that I'm not alone and that people use all kinds of hacks to get it done, but none of them looks very practical. In the worst case I can live with a solution that basically only works in a digital document, printing the parts separately in two steps.
Any help would be very welcome, because it really bugs me to struggle with this simple barrier. My feeling tells me that there must be a much better solution and that I'm overlooking something. I'm also willing to use a tool other than Visio if it makes this possible in a practical way.

As so often, as soon as I had posted (after long research), I found a possible approach:
MS Blog: https://blogs.office.com/en-us/2017/05/01/automatically-create-process-diagrams-in-visio-from-excel-data/?eu=true
Alternate Description: http://www.free-power-point-templates.com/articles/how-to-automatically-create-process-diagrams-in-visio-using-excel-data/
Youtube Video-Tutorial: https://www.youtube.com/watch?v=Q6_vop2kHCg
Alternate (Visio only) approach: https://www.youtube.com/watch?v=432EvYZs26Y
I'm still more than open to better ideas and approaches. If there is a simpler, better way to do it, please post it here. It would also be interesting to go the Excel/Visio route in the opposite direction: manage all data in Visio and be able to export it to Excel in a nice way.
Bonus (adding/connecting Data/Excel to a Visio Diagram): https://www.youtube.com/watch?v=gxYu78sOxSY
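For the Excel-to-Visio route in the links above, one way to keep the long descriptions next to the step data is a single table keyed by step ID, which can be checked before it is imported. A minimal Python sketch; the column names (`step_id`, `description`, `next_step_id`, `details`) and the sample steps are hypothetical, and the headers the Visio template actually expects may differ:

```python
import csv
import io

COLUMNS = ["step_id", "description", "next_step_id", "details"]

# Hypothetical process table: one row per step, long details in the same row.
STEPS = [
    {"step_id": "P100", "description": "Receive order",
     "next_step_id": "P200", "details": "Order arrives by email or web form."},
    {"step_id": "P200", "description": "Check stock",
     "next_step_id": "P300,P400", "details": "Look up every item in the inventory list."},
    {"step_id": "P300", "description": "Ship order",
     "next_step_id": "", "details": "Print the label and hand over to the carrier."},
    {"step_id": "P400", "description": "Reorder item",
     "next_step_id": "P300", "details": "Contact the supplier, then ship when stock arrives."},
]


def find_broken_links(steps):
    """Return (step_id, target) pairs whose target step does not exist."""
    known = {step["step_id"] for step in steps}
    broken = []
    for step in steps:
        targets = [t.strip() for t in step["next_step_id"].split(",") if t.strip()]
        for target in targets:
            if target not in known:
                broken.append((step["step_id"], target))
    return broken


def to_csv(steps):
    """Serialize the table as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(steps)
    return buffer.getvalue()
```

Because diagram structure and narrative share one row per step, a change to a step only has to be made in one place.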

Related

Best practices for creating a customized report based on user form input?

My Question
What are the best practices for creating a customized report based on user form input? Specifically, how do I create an easy-to-maintain system that takes user input collected in a form and generates multiple paragraphs explaining the results of an analysis?
Background
I am working on a very large multiyear project with a startup (which is my client). My job is to program the analysis and generate reports for users. The pipeline for data looks like this:
Users enter information into a form -> results are calculated based on user input -> reports are displayed to users that share analysis.
It is really important to my client that some of the analysis results are displayed in paragraphs in an informal, user-friendly tone. The challenge is that the form and analysis are quite complex and will only get more complex over time. An example of the type of template for the paragraphs looks something like this:
resultsParagraphText=`Hi ${userName}. We found that the best ice cream flavor for you is ${bestIceCreamFlavor}. These other flavors ${otherFlavors} might be good for you. Here are the reasons why you might enjoy these flavors: ${reasonsWhyGoodFlavors}.
However, we would not recommend these other flavors ${badFlavors}. Here are the reasons you should avoid these bad flavors: ${reasonsWhyBadFlavors}.`
These results paragraphs, of which there are many, have several minor problems that are significant in combination:
If there is a bug in the code, minor visual errors would be visible to end users (capitalization errors, missing/extra commas, and so on).
A lot of string comparisons (e.g. if answers.previousFlavors.includes("Vanilla")) are required to generate the results paragraphs. Minor errors in the forms (e.g. vanilla in the form is not capitalized, so answers.previousFlavors.includes("Vanilla") returns false even when the user enters vanilla) can cause errors in the results paragraph.
Changes in different parts of the project (form, analysis) directly affect how the results paragraph is made. Bad types, differences in string values, and null or undefined values that are not caught all have a direct impact on how the results paragraph is made.
There are many edge cases (e.g. what if the user has no other suitable good flavors? Then the sentence These other flavors ${otherFlavors} might be good for you. needs to be excluded).
It is hard to write paragraphs that use templates and have a non-formal tone.
and so on.
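Several of the issues listed above (case-sensitive comparisons, empty lists, stray commas) can be contained by normalizing answers once and assembling the paragraph from optional sentences. A minimal Python sketch of that idea; the helper names (`normalize`, `join_list`, `build_paragraph`) are hypothetical and the wording is adapted from the template above:

```python
def normalize(values):
    """Trim and lowercase form answers once, so later comparisons ignore case."""
    return [v.strip().lower() for v in values if v and v.strip()]


def join_list(items):
    """Join items as a natural-language list: 'a', 'a and b', 'a, b and c'."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def build_paragraph(user_name, best_flavor, other_flavors, bad_flavors):
    """Assemble the paragraph sentence by sentence, skipping empty sections."""
    sentences = [
        f"Hi {user_name}.",
        f"We found that the best ice cream flavor for you is {best_flavor}.",
    ]
    if other_flavors:  # edge case: no other suitable flavors, so drop the sentence
        sentences.append(
            f"These other flavors might be good for you: {join_list(other_flavors)}."
        )
    if bad_flavors:
        sentences.append(
            f"However, we would not recommend these flavors: {join_list(bad_flavors)}."
        )
    return " ".join(sentences)
```

Keeping each sentence behind its own condition means an edge case removes a whole sentence instead of leaving a half-filled template, and list punctuation lives in one function that can be unit-tested.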
I have charts and other types of ways to display results and have explained to the client the challenges of sharing the information in paragraph form.
What I am looking for
I need examples, how-tos, and best practices on how to build a maintainable system for generating customized paragraphs based on user input. I know how to solve each of the individual issues (as they are fairly simple), but in a large project this will become very hard to maintain.
Notes
I have no clue what tags to use for the post. Feel free to edit/add tags if you know more appropriate ones.
The project is planning to use machine learning in other parts of the project in the future. If there is an ML/AI solution that is useful, please tell me.
I am working primarily in JavaScript, Python, C, and R, but if there is a library or tool in any other language, please tell me. Finding a solution is very important to me, and I would be willing to learn a lot to find the best solution.
To avoid this question being removed, I have rephrased it so that it asks for existing examples or how-tos instead of personal opinion. I can also imagine that others might find a solution fairly useful. If you can edit it to make the question less subjective, please do so.
If you have any questions or need clarification feel free to ask. Any help is appreciated.

Automating Raw Export Data Cleansing for Client Onboarding - Format is Always Different

So, a bit of a general question. I work as a data analyst for a startup. My primary process involves taking a client's existing customer data and cleansing/normalizing it to fit into our platform, once, as part of our onboarding process. A member of our team exports the data from the system the client is transitioning from or, if they kept track of it in house, we receive the Excel log they used to track it. It is always in a different format and requires extensive cleansing (avg. 1 min/record). We take what is usually one large table (.xlsx format) and, after cleansing, split it into four .csv files, which we load as four tables on our platform.
I feel I have optimized the process quite well in terms of the process steps and cleansing with Excel functions (if, concat, text-to-columns, etc.). I have beginner-intermediate skills in VBA and SQL and have just scratched the surface in R. What is frustrating is that I know there is the potential to automate this process, but I just don't know where to start. If anyone has experience with something like this, then code, a link to an article or another thread, or just some general direction would be much appreciated. Please ask for clarification where you feel it is needed. Thanks.

This will be really hard to do in Excel. If you have the time, you can try out Optimus, a data cleansing library written in Python and PySpark (you don't need to know Spark). Here is the webpage: https://hioptimus.com.
You can create data pipelines with it, and I recommend that you do that: try to generalize your processes, and ask the client for a more structured way of passing the data.
The good thing is that you don't need Big Data to run Optimus, but if you have it some day, the same code will work.
Check out the documentation for more:
http://optimus-ironmussa.readthedocs.io/en/latest/
Let me know if you have doubts!
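Generalizing the process, as suggested above, can start without any particular library: map each client's column headers onto one canonical schema, then split the cleaned rows into the target tables. A minimal sketch using only the Python standard library (it does not use Optimus); the alias map, the canonical column names, and the two-table split are hypothetical, whereas the question describes four tables:

```python
import csv
import io

# Hypothetical aliases: every header variant seen so far maps to one canonical name.
HEADER_ALIASES = {
    "customer name": "name", "client": "name", "name": "name",
    "e-mail": "email", "email address": "email", "email": "email",
    "phone #": "phone", "telephone": "phone", "phone": "phone",
}

# Hypothetical target tables and the canonical columns each one receives.
TABLES = {
    "customers": ["name"],
    "contacts": ["name", "email", "phone"],
}


def canonical_rows(raw_csv_text):
    """Yield rows with canonical keys and trimmed values; unknown columns are skipped."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    for row in reader:
        clean = {}
        for header, value in row.items():
            key = HEADER_ALIASES.get((header or "").strip().lower())
            if key and isinstance(value, str):
                clean[key] = value.strip()
        yield clean


def split_tables(rows):
    """Distribute each canonical row over the target tables, filling gaps with ''."""
    tables = {name: [] for name in TABLES}
    for row in rows:
        for name, columns in TABLES.items():
            tables[name].append({col: row.get(col, "") for col in columns})
    return tables
```

With this shape, onboarding a client whose export uses new header names means adding entries to `HEADER_ALIASES` rather than writing new cleansing steps.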

OLAP cube powering Excel Pivot. What's a better solution?

I'm looking to build a dynamic data environment for non-technical marketers.
I want to provide large sets of data in an Excel pivot table form so even marketers without analytics/technical backgrounds can access relevant performance information. I'm trying to avoid non-Excel front ends since I don't want users to have to constantly export data when they need to manipulate it in some way.
My first thought was to just throw together an OLAP cube populated with pre-aggregated data, but I got pushback from the IT team as OLAP is "obsolete." I don't disagree with them - there are definitely faster data processing architectures out there.
So my question is this: are there any other ways to structure the data so that marketers can access it easily but still manipulate it to some degree in Excel? I'm working with probably 50-100m rows of data and need the ability to scale dimensionality.

These are just my thoughts.
Really the question could be thrown back at your IT team. Your first thought was to throw together an OLAP cube. IT didn't like this. If they're so achingly hip that they consider OLAP "obsolete", what do they suggest as a better, more up-to-date alternative?
Or, to put it a different way - what is the substance of their objection to an OLAP solution? (I'm assuming there is one beyond "MS gave us an awesome presentation of PowerPivot/Azure tabular, with really great free snacks and coffee").
Your requirements are pretty clear:
1. Easy access for non-technical people
2. Structured data so that they don't have to interpret the raw data
3. Access through Excel
4. Scalability
I'll be paying close attention to any other answers to your question, because I'm always interested in finding out that I don't know something; but personally I haven't come across a better solution to these requirements than OLAP.
What makes me suspicious of the "post-OLAP" sentiment is related to point (2) in the list above. Non-technical users can tend to think of the cube data they consume as being somehow effortlessly produced, by some kind of magic. That in itself is an indicator of success, demonstrating just how easy it is for users to get what they want from a well-designed OLAP system.
But this effortlessness is an illusion: to structure the raw data into this form takes design effort, and the resulting structure incorporates design decisions and assertions: that is how it can be easy to use, because the hard stuff has been encapsulated in the cube design.
I have a definite Han Solo-like bad feeling about "post-OLAP": that it amounts to pandering to this illusion of effortless transformation of data into a usable form, and propagates further illusions.
Under OLAP, users get their wonderful magic usable data structure, and the hard work is done out of sight by developers like you or me. Perhaps we get something wrong so that they can't see data exactly as they'd like to - but at least the users can then talk to us and ask for what they do want.
My impression of the "post-OLAP" sales pitch is that it tries to dispense with the design work. We don't need those pesky expensive developers, we don't need to make specific design decisions (which necessarily enable some functionality while precluding some other functionality), we don't need cube-processing time-lags. We can somehow deliver this:
Input any data you like. Don't worry if it's completely unstructured or full of dirt!
Any scale
Immediate access to analytics without ETL/processing delays
Somehow, the output is usable, structured data. Structured by... no-one in particular. The user can structure it as they like, but somehow this will be easy
Call me cynical, but this sounds like magical thinking to me.

Open source projects for email scrubbing generating structured data from unstructured source?

I don't know where to start on this one, so hopefully you guys can clear up my question. I have a project where email will be searched for specific words/patterns and the results stored in a structured manner, something like what TripIt does.
The article states that they developed a DataMapper:
"The DataMapper is responsible for taking inbound email messages addressed to plans [at] tripit.com and transforming them from the semi-structured format you see in your mail reader into a highly structured XML document."
There is a comment that also states:
"If you're looking to build this yourself, reading a little bit about Wrappers and Wrapper Induction might be helpful"
I Googled and read about wrapper induction, but the definition was just too broad and didn't help me understand how one would go about solving such a problem.
Is there some open source project out there that does similar things?

There are a couple of different ways and things you can do to accomplish this.
The first part, which involves getting access to the email content, I'll not answer here. Basically, I'll assume that you have access to the text of the emails; if you don't, there are libraries that allow you to connect Java to a mailbox, like Camel (http://camel.apache.org/mail.html).
So now you've got the email. Then what?
A handy thing that could help is that LingPipe (http://alias-i.com/lingpipe/) has an entity recognizer that you can populate with your own terms. Specifically, look at some of their extraction tutorials and their dictionary extractor (http://alias-i.com/lingpipe/demos/tutorial/ne/read-me.html). Inside the LingPipe dictionary extractor (http://alias-i.com/lingpipe/docs/api/com/aliasi/dict/ExactDictionaryChunker.html) you'd simply import the terms you're interested in and use that to associate labels with an email.
You might also find the following question helpful: Dictionary-Based Named Entity Recognition with zero edit distance: LingPipe, Lucene or what?

Really a very broad question, but I can try to give you some general ideas, which might be enough to get started. Basically, it sounds like you're talking about an elaborate parsing problem - scanning through the text and looking to apply meaning to specific chunks. Depending on what exactly you're looking for, you might get some good mileage out of a few regular expressions to start - things like phone numbers, email addresses, and dates have fairly standard structures that should be matchable. Other data points might benefit from some indicator words - the phrase "departing from" might indicate that what follows is an address. The natural language processing community also has a large tool set available for text processing - check out things like parts of speech taggers and semantic analyzers if they're appropriate to what you're trying to do.
Armed with those techniques, you can follow a basic iterative development process: For each data point in your expected output structure, define some simple rules for how to capture it. Then, run the application over a batch of test data and see which samples didn't capture that datum. Look at the samples and revise your rules to catch those samples. Repeat until the extractor reaches an acceptable level of accuracy.
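The iterative process above can be sketched with a small rule table plus a report of which samples each rule missed. A minimal Python sketch; the field names, the patterns, and the sample emails are illustrative assumptions, and the patterns are deliberately simple rather than complete:

```python
import re

# One simple rule per expected data point (hypothetical fields and patterns).
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    # Indicator phrase: whatever follows "departing from" up to punctuation.
    "origin": re.compile(r"departing from\s+([^.,\n]+)", re.IGNORECASE),
}

# Hypothetical batch of test data.
SAMPLES = [
    "Booking for ana@example.com, departing from Boston, MA on 2024-05-01.",
    "Itinerary: leaves Denver on 05/02/2024. Contact bob@example.org",
]


def extract(text):
    """Apply every rule to the text and return the fields that matched."""
    found = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            # Use the capture group when the rule defines one, else the whole match.
            value = match.group(1) if pattern.groups else match.group(0)
            found[field] = value.strip()
    return found


def misses(samples):
    """Map each field to the indexes of the samples where its rule found nothing."""
    report = {field: [] for field in RULES}
    for index, text in enumerate(samples):
        found = extract(text)
        for field in RULES:
            if field not in found:
                report[field].append(index)
    return report
```

Here the report points at the second sample for both `date` and `origin`, which is the cue to add a rule for the other date format and another indicator word, then rerun.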
Depending on the specifics of your problem, there may be machine learning techniques that can automate much of that process for you.

How do I manage specs in Scrum? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
Referring to this buddy question, I want to know how one can manage specs in a Scrum process. I'm facing this problem while assigning tasks to my team for the sprint. Needless to say, I'm new to Agile/Scrum.
Currently, we are using our own specs sheet to map StoryId to SpecId and vice versa. I'm getting the feeling that Scrum is more about project management (getting things done on time) and that you need a separate process to manage specs and requirements.
How do we manage specs in a Scrum process?

The short answer is, you don't.
The important question to ask yourself when writing these specs, is why do we do them? What is the value in the spec?
The value in the spec usually comes in communicating the ideas of the business with the development team. Scrum is designed to bring the business (in the form of the Product Owner) to the development team. By interacting with the team frequently (remember, individuals and interactions over processes and tools), and by seeing working software frequently, the business can work hand in hand with developers to produce software that solves business problems better than by trying to spec out the whole thing before you get to try it out.
This is how Agile projects do a better job of delivering the product the business wants instead of the product they requested.
That said, there are certain base criteria that need to be met. We can test for this, and as with any good tests, we can automate it.
Have a look at BDD and Cucumber. In addition to your User Story, it's good to have a basic set of conditions of satisfaction, preferably in the "Given/When/Then" format. These conditions are the minimum set of criteria for the story to be accepted as complete.
For example, "Given I am logged in, when I log out, then I am taken back to the home page".
If you're going to have acceptance criteria, you're going to want to automate it. The worst part of most specifications is they often end up out of date and collecting dust when the project is complete.
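As a rough illustration of automating such a condition, the log-out example above can be written as an executable test. This sketch uses Python's `unittest` rather than Cucumber, and the `Session` class is a hypothetical stand-in for the application under test:

```python
import unittest


class Session:
    """Hypothetical stand-in for the application under test."""

    def __init__(self):
        self.logged_in = False
        self.current_page = "home"

    def log_in(self):
        self.logged_in = True
        self.current_page = "dashboard"

    def log_out(self):
        self.logged_in = False
        self.current_page = "home"


class LogOutStory(unittest.TestCase):
    def test_log_out_returns_to_home_page(self):
        # Given I am logged in
        session = Session()
        session.log_in()
        # When I log out
        session.log_out()
        # Then I am taken back to the home page
        self.assertEqual(session.current_page, "home")
        self.assertFalse(session.logged_in)
```

Saved in a test module, this runs with `python -m unittest`. Because the criterion is executed on every run, it cannot quietly go out of date the way a written spec can.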
Also, you shouldn't be assigning tasks to the team. Scrum teams are self organizing and anyone should be able to grab any task they feel they can work on while respecting the priority of the stories. Swarming is a big part of the performance benefits of Scrum.
You may want to consider bringing in an outside coach to assist with your transition.

I think that the easiest way is to make the specifications a part of the user stories within the tasks, themselves. Clearly list the acceptance criteria in each one (or if your issue tracking software allows you, create them as first class work item types). Let the issue in whatever you use for work item tracking become the living document.
There are drawbacks, such as finding related issues as specs change over time, but this can usually be managed in the work item tracking tooling, assuming you can relate issues to each other.
The way that we do it is that we (actually a BA, not the developers) create a sign-off deck for the product owner to review, and we collaboratively create tasks off of that. If we cannot create a task, or there are open questions, we go back to the product owner with those questions and update the deck. All of our decks are organized (in SharePoint) so that we can easily find them in the future.

For me, the specs are in the user stories. We define the specs and the tasks during our initial scrum meeting along with the product owner. The specs and tasks only last for the lifetime of the scrum iteration, as everything might change in the next iteration (that is the worst case, but there will definitely be changes).
We usually keep track of the specifications and tasks on a spreadsheet, just so that everybody knows what they are working on. I have also tried a few software tools for this, and the most interesting ones I have come across are from VersionOne and Rally.
But I still find that using a simple spreadsheet is the fastest and simplest solution.

As I understand Scrum, it does not take care of specs management. You have to break down/map your specs or spec changes to stories and tasks separately. But you can have a task for this :).

There is a real tension between Scrum and other agile dev methodologies and spec writing. I think there are two big points of tension:
Because agile says everything should be on an index card, that means you have to have stuff planned out enough to fit on an index card. (E.g. you have to know how it's all going to work.)
Some things don't make sense in isolation (what's the use of an upload file page without a manage uploaded files page, for instance.)
You don't have to design the whole app all at once, but you have to have a vision of the whole app. Then, especially if you have a separation of designers and programmers, you do functional design for a sprint-sized chunk at a time. Those designs then have to be broken down to story-sized chunks.
This is a lot of up front functional design, and I think that's overlooked in a lot of the talk about agile methodologies. Perhaps some shops have the devs do more of the design. Also, I think it's a lot easier to use scrum/agile for making changes/bug fixes to existing apps rather than building new ones.
The thing I've found most helpful is to fight back on story size. A lot of organizations have gone crazy, saying stories need to be only a few hours. The original scrum book says 16 hours, I think, which is often large enough to fit an entire screen of a web app. So "implement manage my account" could be a story (as opposed to the hundreds-of-tiny-stories approach of "implement username", "implement password" etc.) Then reference your design doc for "Manage My Account" and make sure to have word-perfect screenshots/prototype/mockup so the dev can look at them and copy/paste the text directly into the code they're writing, and they know for sure which fields need to be there (or which links, or which pictures, or whatever).
