What UML diagrams can be used for a data science project?

I am working on a data science project for my 3-2 mini project. My project analyzes the performance of a country in the Olympics based on some attributes. But I am confused about the UML diagrams I should be using in my project.

There are 14 UML diagram types out there (as of UML 2.5). A sensible sequence of diagrams to be created depends on your approach.
If you'd like to create an analysis model, that is, a conceptual model of your problem domain, then a sensible sequence of diagrams might be:
1. Use case diagrams
2. Activity diagrams
3. Class diagrams
If your project gets bigger, you might also need package diagrams.
If you'd like to create a design model, that is, a conceptual model of your solution domain, then a sensible sequence of diagrams might be:
1. Component diagrams
2. Class diagrams
3. Sequence diagrams
4. State machine diagrams (statecharts)
In both cases a starting point is having a diagram for your system context. Some people like to mix component and use case diagram features to denote a system context.
The aspects you might want to take into consideration when choosing your diagrams are:
syntax - how strictly would you like to follow the UML standard, and what use does adhering to the standard have for you
semantics - what is your need, what do you want to document, and who needs to understand it
pragmatics - what is the best way to achieve your project's goal, e.g. being efficient and effective
tool - what tools do you have at hand that are used and known by your peers, and what can you afford to invest in keeping the tool infrastructure up

While your question is very broad, I could imagine that in view of:
My project analyzes the performance of a country in the Olympics based on some attributes.
you'll certainly need a class diagram, because the class diagram will clarify what kinds of objects your software will manipulate (e.g. Olympic Games, Participating Countries, Teams, Athletes, Disciplines, Competitions), how they are related, and which attributes are associated with each.
This will enable you to determine, for each analysis you want to perform, the access path to the relevant attributes. It will also allow you to find missing attributes and to design a convenient interface for the different classes.
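As a rough illustration of what such a class diagram captures, here is a minimal Python sketch. The class names follow the objects listed above; the attributes (`name`, `medal`, `year`), the `Result` class and the `medal_count` helper are assumptions made up for the example, not something taken from the question.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Country:
    name: str


@dataclass
class Athlete:
    name: str
    country: Country  # association: each athlete represents one country


@dataclass
class Result:
    athlete: Athlete
    medal: Optional[str]  # "gold", "silver", "bronze", or None if no medal


@dataclass
class Competition:
    discipline: str
    results: List[Result] = field(default_factory=list)  # one-to-many


@dataclass
class OlympicGame:
    year: int
    competitions: List[Competition] = field(default_factory=list)  # one-to-many


def medal_count(game: OlympicGame, country: Country) -> int:
    """Follow the access path game -> competition -> result -> athlete -> country."""
    return sum(
        1
        for competition in game.competitions
        for result in competition.results
        if result.medal is not None and result.athlete.country == country
    )
```

Writing out the path that `medal_count` has to walk is exactly the kind of check the diagram makes possible: if an attribute such as `medal` were missing, the analysis could not be expressed.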
You may also use other diagrams. But with the few requirements you've shared, it's difficult to guess which ones, and I do not want to guess too much. I could nevertheless imagine that a use case diagram could help to give the big picture of who is going to do what with your software.

Related

What is the difference between Conceptual Class Diagram and Detailed Class Diagram?

Can someone briefly explain the difference between a Conceptual Class Diagram and a Detailed Class Diagram?

While a "Conceptual Class Diagram" expresses a conceptual (domain) model, it's not clear what you (or your professor) mean(s) with "Detailed Class Diagram": it could refer to a (language-/platform-independent) design model or to an implementation model like a C++ class model or a Java class model.
See also my answer to this related SO question.
The one-to-many relationships between conceptual models and design models, and between design models and implementation models are illustrated in the following Figure:
As an example that illustrates how the derivation chain from concept via design to implementation works, consider the following model of a people/Person concept/class:
Domain models are solution-independent descriptions of a problem domain produced in the analysis phase of a software engineering project. The term "conceptual model" is often used as a synonym of "domain model". A domain model may include both descriptions of the domain’s state structure (in conceptual information models) and descriptions of its processes (in conceptual process models). They are solution-independent, or ‘computation-independent’, in the sense that they are not concerned with making any system design choices or with other computational issues. Rather, they focus on the perspective and language of the subject matter experts for the domain under consideration.
In the design phase, first a platform-independent design model, as a general computational solution to the given software engineering problem, is developed on the basis of the domain model. The same domain model can potentially be used to produce a number of (even radically) different design models representing different design choices. Then, by taking into consideration a number of implementation issues, ranging from architectural styles and nonfunctional quality criteria to be maximized (e.g., performance, adaptability) to target technology platforms, one or more platform-specific implementation models are derived from the design model.

A conceptual class diagram is used to understand and analyze a problem domain. A detailed class diagram is a design artifact, where many things may have been optimized away. For example, every dog might bark, but a dog-salon application doesn't care, so it can optimize away that fact.

I don't know of any standard or methodology that defines both these concepts. For example, the UML specification does not mention them. I think every answer will be subjective. I will give my own answer, based on more than 25 years of experience with IT-related modeling.
In a conceptual class diagram, every class is a concept, usually related to the business domain, the real world, e.g. Customer, Order etc. It may also show concepts that cannot be directly found in the business domain, but are needed to model the functionality of a particular application, e.g. BackupCopy. These are concepts the user of the application must understand. See also www.agilemodeling.com
There are other types of class diagram, e.g. class diagrams that model the source code, where every class corresponds to a Java class or a C# class, or class diagrams that model the physical database structure, where every class corresponds to a database table.
Each of these types of class diagrams may or may not be detailed. If a class diagram is not detailed, it typically does not show any attributes, or only the main attributes. If a class diagram is detailed, it shows all attributes relevant for the problem at hand and the data types of these attributes.

The concept of a conceptual class diagram is explained, for example, by Scott Ambler at http://www.agilemodeling.com/artifacts/classDiagram.htm#ConceptualClassDiagrams.
Basically, "conceptual" here means that the content of the diagram is taken from an analytic viewpoint that takes the "concepts" of a domain and describes them.
For "concept" you could also say:
thing
item
aspect
object
topic
The conceptual diagram is basically what you get if you ask people what problem they'd like to get solved by your software. So you analyze the situation/problem by asking questions that will help you create your diagram:
what are the things that are relevant? - these will be your candidates for classes
what are the features of those things? - these are the candidates for your attributes
how are the things related to each other, e.g. is one part of another? Does it need the other? - these are the candidates for your relations
what should you be able to do with these things in your system? - these are the candidates for your operations
In the past this step was called OOA (object-oriented analysis). The steps following it are OOD (object-oriented design) and OOI (object-oriented implementation). Many years ago some authors proposed creating three different models for OOA, OOD and OOI, so you would have different and usually more detailed diagrams for OOD and OOI. For the term "Detailed Class Diagram", I'd guess that one of the OOD and/or OOI views is meant. Be careful though: some of the diagrams created this way will have patterns or pattern-like ideas as a basis. Your diagrams would tend to be very repetitive and redundant if you keep capturing such patterns in concrete diagrams for every conceptual diagram. I'd rather recommend giving just one example of how to go from problem to solution and then commenting "do it this way for all other concepts that are similar".
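One such example of going from the analysis questions to an implementation-level view could look like the following Python sketch. The `Customer`/`Order`/`OrderLine` domain and all attribute and method names are hypothetical; the comments tie each element back to the question that would have produced it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrderLine:
    # "What are the features of those things?" -> attributes, now with data types
    product: str
    quantity: int
    unit_price: float


@dataclass
class Order:
    # "Is one part of another?" -> relation: an order is composed of order lines
    lines: List[OrderLine] = field(default_factory=list)

    # "What should you be able to do with these things?" -> operations
    def add_line(self, product: str, quantity: int, unit_price: float) -> None:
        self.lines.append(OrderLine(product, quantity, unit_price))

    def total(self) -> float:
        return sum(line.quantity * line.unit_price for line in self.lines)


@dataclass
class Customer:
    # "What are the things that are relevant?" -> candidate class
    name: str
    # "How are the things related to each other?" -> a customer places orders
    orders: List[Order] = field(default_factory=list)
```

A conceptual diagram would stop at the names and relations; the data types and method signatures are the kind of detail that only shows up in the design or implementation view.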

Need of UML Diagram

Please help me understand when UML is necessary. I was told that UML diagrams are generally drawn for web-based application development, while DFDs and ERDs are used for desktop-based applications. My university requires all of these diagrams (UML, DFD, ER). Please let me know if my information is correct. Thank you.

UML diagrams are important because they help a person understand the relationships and dependencies between the different classes present in the code (class diagram).
They show the flow of the program (sequence diagram, activity diagram).
They help to improve the program architecture, etc.
Read about the different types of UML diagrams and you will get more information.

Your need for UML depends on your position and your (self-)education.
Some companies use UML, so you would need it to get a job with them. Right now your university requires UML diagrams, so there IS a need for them, isn't there?
If you know UML a bit, you can understand the thoughts of a colleague who wants to share them with you this way.
If you understand the language of a UML diagram, you can use it to improve your thinking about the problem. So you can think into the problem deeper and faster than without a tool. You should be really well acquainted with the tool though, because when inventing something new you need to think about the domain problem, not about the language problems. But you don't need to know all the rules for this level of use yet.
If you know UML well enough to draw diagrams according to their strict rules, you have two more uses for it.
By translating your knowledge of the problem from one level of abstraction to another and modelling these levels according to the strict rules, you filter out many misunderstandings in the already accepted model and can practically debug the model before coding. This can save much time and money.
When you make the diagrams according to strict rules, you can collaborate on the model with your colleagues. It is always better if you can express your ideas more precisely.
As for technology limitations, you can use UML very widely, even outside of IT. Within IT, only GUI creation is badly supported. And anonymous classes are almost not supported in class diagrams at all (in behaviour diagrams they work OK).

DFDs (data flow diagrams) and ERDs (entity-relationship diagrams) are tools of structured analysis and design; they are a way to build structured applications (databases and functions). UML supports a quite different paradigm, the object paradigm, in which we build an application as collaborating objects. DFDs and ER diagrams are not part of UML. We can use an ER diagram for database modeling and join it to the UML domain model through ORM (object-relational mapping, implemented e.g. by Hibernate).
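To make the ORM remark a bit more concrete, here is a hand-rolled sketch in Python that uses only the standard-library `sqlite3` module. It is not Hibernate (which is a Java library) and not a real ORM; the `Athlete` class, the `athlete` table and the `create_schema`/`save`/`load` helpers are hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Athlete:
    # Object side: a class from the UML domain model
    name: str
    country: str
    id: Optional[int] = None


def create_schema(conn: sqlite3.Connection) -> None:
    # Relational side: the entity from the ER diagram as a table
    conn.execute(
        "CREATE TABLE athlete ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, country TEXT NOT NULL)"
    )


def save(conn: sqlite3.Connection, athlete: Athlete) -> Athlete:
    # Map object attributes to table columns
    cur = conn.execute(
        "INSERT INTO athlete (name, country) VALUES (?, ?)",
        (athlete.name, athlete.country),
    )
    athlete.id = cur.lastrowid
    return athlete


def load(conn: sqlite3.Connection, athlete_id: int) -> Optional[Athlete]:
    # Map a table row back to an object
    row = conn.execute(
        "SELECT name, country, id FROM athlete WHERE id = ?", (athlete_id,)
    ).fetchone()
    return Athlete(*row) if row else None
```

The class corresponds to what the UML domain model shows, the `CREATE TABLE` statement to what the ER diagram shows, and `save`/`load` are the mapping that an ORM library would normally generate for you.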

How to categorize UML diagrams based on priorities/levels?

I am new to UML. I have studied several tutorials. I learned about two broad categories:
UML Diagrams:
1. Structural Diagrams
Class diagram
Object diagram
Component diagram
Deployment diagram
2. Behavioral Diagrams
Use case diagram
Sequence diagram
Collaboration diagram
Statechart diagram
Activity diagram
But I don't know which ones are high-level design and which are low-level design. Can anyone list the UML diagram types based on priorities (from high-level to low-level diagrams)?

There is not really a well-defined order of higher-level versus lower-level diagram languages in UML. The same diagram language (e.g. class diagrams) can be used at different levels of abstraction. For instance, a conceptual information model, but also a Java data model, can be expressed as a class diagram.
Generally, a use case diagram is higher-level, since it describes requirements, while a deployment diagram is lower-level, since it describes system deployment structures.
But all other diagram languages can be used at different levels of abstraction.

UML diagrams, from the most general to the most detailed level.
Please note that at present (the start of 2014) there is no special instrument for UI modelling. So I'll also explain how to do this part of the work with the tools we have, although they will be used in a more or less nonstandard way.
Human level. Use case diagrams and state machines: how people will work with the system.
Use cases are about what the system does, who works with it and, possibly, how those subjects are grouped. Subsystems can be defined here. Try not to show much structure or behaviour. Do not use any IT slang!
State machines show what states the system, subsystems and actors can have, what actions/events can happen in these states, and to which other states they can lead. Do not use any IT slang!
Do not forget that administrators, programmers and testers are users of the system, too. So plan not only how the system helps the work of the common user and their superior, but also the installation/administration/testing/support processes. Don't forget to continue this work on all deeper diagramming levels. These use cases/state machines needn't be so human-oriented.
You can draw activity, sequence and timing diagrams for some dialogues between actors and subsystems if they are part of the requirements, or make them part of the requirements if they are important. Do not use any IT slang!
Draw sketches of the UI and talk them over with the client. The work on UI art design should be connected to UI planning and realization.
Start to work on the User Guide: create its plan and structure. (I use class diagrams for that.)
Deployment and component diagrams. Here you start to imagine the inner construction of your system.
Components: what compact parts the system has. They needn't have much in common with the subsystems as the user sees them. Only some components are visible to the user. You could decide on the use of some interfaces between them. Think about the licensing problems of third-party components.
Deployment: how the components could be distributed among PCs. The same question about interfaces arises, but more from the physical side.
A special deployment diagram for the licensing policy of your product could be drawn, too. You can use other diagrams for it as well. The choice is yours.
You could already plan your user interface with these diagrams, too. In an MVC (model-view-controller) construction, only the components of the controller level are mutually connected and obviously need modelling on this level. But the components of the view layer (UI) are connected in a conceptual way, as they should be for the sake of the user. So they should be planned too, with the same diagrams.
On this level you also plan the architecture of the development environment. It consists of components, too.
Draw interaction overview and communication diagrams to see the cooperation of components as a whole or in complex groups.
Package, activity, sequence and timing diagrams.
Package diagrams are for planning the hierarchy of your code and the mutual visibility of its parts. Don't forget a place for testing packages, too. Note that the package and component hierarchies have different structures, but they have to work together. This is a very important part that is frequently overlooked.
Use behavioral diagrams for a better understanding of how different processes could run.
System analysis: the class diagram level.
Some important classes could already appear on the diagrams of the previous level, as definitions of intercomponent interfaces or subjects of processes. But now you should do all of them, with at least one diagram per component. You should create these class diagrams using the finished package diagrams.
Plan the content of the UI, defining elements, functionalities and the connections between them WITHOUT choosing the concrete components. Use whichever diagrams you like. Class diagrams are usable, but in a nonstandard reading.
Deeper insight.
If you have instances with specific behavior, use object diagrams to plan them.
If you have some very complex classes or tightly coupled groups of classes, use composite structure diagrams.
UI: plan the content of screen elements WITH a choice of UI components (frames, buttons and so on) and connect functionalities to them. On this level you can again use class/object and sequence/timing diagrams.
Code. In reality, coding starts, at least at the prototype level, already at the stage of component planning. You have to check if and how different technologies will cooperate. But the real coding should be done only after you are sure you understand what you are doing. And creating all or some of the diagrams correctly is the best way to be sure of it.
Note the rule of thumb: structure diagrams set the sequence of levels. Behavioral diagrams support them on all levels. You can use a state machine on the lowest level and a timing diagram for a discussion with a client. But try not to mix the levels within the structural diagrams!
Also, do not try to mix diagrams, especially behavioral with structural ones. You should clearly set the rules by which you can say what information can be on the diagram and what cannot. And break these rules only in the most exceptional cases.
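As a small illustration of the state machines mentioned at the human level: a state machine boils down to a table of (state, event) pairs and the states they lead to. The login scenario below, with its states and event names, is invented for the example.

```python
from enum import Enum, auto


class State(Enum):
    LOGGED_OUT = auto()
    LOGGED_IN = auto()
    LOCKED = auto()


# Each entry reads: in this state, this event leads to that state
TRANSITIONS = {
    (State.LOGGED_OUT, "login_ok"): State.LOGGED_IN,
    (State.LOGGED_OUT, "too_many_failures"): State.LOCKED,
    (State.LOGGED_IN, "logout"): State.LOGGED_OUT,
    (State.LOCKED, "admin_unlock"): State.LOGGED_OUT,
}


def next_state(state: State, event: str) -> State:
    """Return the target state, or raise if the event is not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state.name}"
        ) from None
```

Note that the `admin_unlock` event already brings in the administrator as a user of the system, which is the point made above about not planning for the common user only.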

As gwag noted, there is no separation of UML diagrams into high and low levels. The different diagrams are used for describing different aspects, not different levels, of a (software) system.
But if you look at UML in a broader context, the Unified Modelling Language is just one of a whole family of modelling languages standardized by OMG. These different languages do have more specific scopes.
SysML (Systems Modelling Language) shares many features with UML and looks very similar, but is specifically intended for the higher levels of systems analysis / design. It also includes a visual representation of requirements, which are conspicuously absent from UML.
Another related language is BPMN (Business Process Model and Notation), which is used for business processes. So you could for instance use BPMN for business analysis, SysML for system design and UML for software design.

UML does not specify the level of detail you define in a diagram. Every diagram can be used for description on the business level as well as on the design or implementation level.
It is up to the modeler which type of diagram to use to describe the modeled system. The information in the diagrams must be mutually consistent, and together the diagrams must give a complete view of the system.
For example, you can declare the services of a bank using a use case diagram on the business level, or use a use case diagram to declare the services implemented by a concrete physical component of a program written in Java.

Entity-Relationship Modeling and Object Oriented Design - Is it relevant?

I am not sure if this is a good question, as I'm unsure if there's any agreement on the subject. However, due to the lack of information on the internet, I'm compelled to ask anyway.
Let's say I'm making a system that is mainly object-oriented, with its corresponding UML diagrams (use case, class, collaboration, etc.). However, none of the UML diagrams is helpful when dealing with the database, which should be relevant for the development team so they can know what exists in the database and what does not.
There are two ways to represent a database: Entity-Relationship and Relational (I don't know if there are more, but those two are relevant within the relational database paradigm). ER deals with the representation of the DB in terms of business rules, and Relational deals with the actual, physical implementation. But neither is "UML standard" (unless I'm missing something here).
Which modeling should I use, and why? Is ER relevant in terms of UML, or should I stick to relational? Thank you beforehand.

If you want to use UML only, you could use limited class diagrams, without m-n associations and methods. But if you are using a class-table mapping tool, you can use anything except m-n relationships.
Nobody ever said that you can use class diagrams for OOP classes only. You can use them for any more or less formal concepts, as long as their needs can be covered by the set of class diagram elements. I use class diagrams for UI planning and even for planning formal texts. And tables are very close to classes. So, no problem.
You can use data model diagrams if you need something that is CALLED a data diagram. But they are fully covered by class diagrams. That is the reason they are not supported anymore.
Your task is to make the model understandable for everybody who may get their hands on it. The class diagram is the most widely known UML diagram. A good title and a couple of comments will resolve all possible misunderstandings.
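The restriction on m-n associations can be illustrated with a small sketch: at the table level, a many-to-many association between two classes is usually resolved into a separate link table with two 1-n references. The example uses Python's standard `sqlite3` module; all table names, column names and rows are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE athlete (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE discipline (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The m-n association becomes its own table with two 1-n references
    CREATE TABLE athlete_discipline (
        athlete_id INTEGER NOT NULL REFERENCES athlete(id),
        discipline_id INTEGER NOT NULL REFERENCES discipline(id),
        PRIMARY KEY (athlete_id, discipline_id)
    );
    INSERT INTO athlete VALUES (1, 'A'), (2, 'B');
    INSERT INTO discipline VALUES (1, 'Swimming'), (2, 'Triathlon');
    INSERT INTO athlete_discipline VALUES (1, 1), (1, 2), (2, 2);
    """
)


def disciplines_of(athlete_name: str) -> list:
    """Navigate the association from an athlete to its disciplines via the link table."""
    rows = conn.execute(
        "SELECT d.name FROM discipline d "
        "JOIN athlete_discipline ad ON ad.discipline_id = d.id "
        "JOIN athlete a ON a.id = ad.athlete_id "
        "WHERE a.name = ? ORDER BY d.name",
        (athlete_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

In a class diagram used for table planning, `athlete_discipline` would simply be drawn as a third class with two one-to-many associations.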

The two are different: ER diagrams show the relationships between entities, while UML diagrams show the behaviour of objects and how they communicate with each other. In my view, a DFD (data flow diagram) is an option. It has different levels, based on the number of processes, and will better explain the data entities.

What is the UML analogue to the Data Flow Diagram from Structured Analysis?

Back in the Dark Ages (mid-1980s), I used Data Flow Diagrams from Structured Analysis a fair amount, and found them very useful.
My current employer loves UML. I normally use BOUML, which doesn't do non-UML drawings.
What is the UML drawing that corresponds to the Data Flow Diagram?
If there isn't one, what is the recommended UML diagram to present the corresponding data?

Probably the closest thing is the activity diagram (AD). It's not quite the same; it is more influenced by flow charts than by DFDs. However, you can do some of the useful things from DFDs: e.g. ADs do support concurrency and differentiate control flow from data flow.
More details on comparisons & differences in this question.
[fwiw, I still use DFDs: they're simpler and more elegant in many circumstances]
hth.

UML 2 has a very good analogue to a data flow diagram:
the "information flow diagram".
Information flow diagrams are explained here:
https://web.archive.org/web/20121118061853/http://www.uml-diagrams.org/information-flow-diagrams.html
Note that UML 2.5 has information flows and information items, but the term "information flow diagram" is not part of the official UML 2.5 diagram taxonomy. So formally, you just create a class or component diagram with lots of information flows in it to obtain your "information flow diagram".
I do this all the time, using information items of UML to represent my data.
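To give an idea of what such a diagram records, here is a minimal Python sketch in which each information flow is a source, a target and the information item it conveys. The field names are only loosely modelled on UML, and the order-handling elements and the helper functions are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class InformationFlow:
    source: str    # element that sends the information
    target: str    # element that receives the information
    conveyed: str  # information item carried by the flow


# Hypothetical flows for an order-handling system
FLOWS: List[InformationFlow] = [
    InformationFlow("Customer", "OrderEntry", "Order"),
    InformationFlow("OrderEntry", "Billing", "Invoice data"),
    InformationFlow("OrderEntry", "Warehouse", "Picking list"),
    InformationFlow("Billing", "Customer", "Invoice"),
]


def items_received_by(element: str, flows: List[InformationFlow]) -> Set[str]:
    """Return the information items flowing into the given element."""
    return {flow.conveyed for flow in flows if flow.target == element}


def downstream_of(element: str, flows: List[InformationFlow]) -> Set[str]:
    """Return the elements that receive information directly from the given element."""
    return {flow.target for flow in flows if flow.source == element}
```

This is the same information a DFD arrow carries: where the data comes from, where it goes, and what it is.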

There is no equivalent model in OOD. The emphasis in DFDs is on data separated from function. This is most helpful when working in a procedural way. DFDs scale much better than OOD: if you try to scale out (to the world view) using OOD, you end up using use case diagrams, which are useful for capturing essences. I loved DFDs: they are so high-level, and yet they can be expanded by opening up a DFD box and calling it level 1, etc.
I am currently in the process of learning the Go programming language. It does not use objects whatsoever, and in some respects I feel that DFD modelling would suit it much better.
I too am looking for a diagram that could do this sort of work. In Go, structs, which are basic data types, are used intensively. You can attach a primitive extension method to a struct, which resembles OO, but if you look at the assembly code it appears to be syntactic sugar for a function whose first parameter is the struct you wish the function to operate on.
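The "method as a function with the record as first parameter" idea is not specific to Go. A quick sketch in Python (used here purely for illustration, with a hypothetical `Rectangle` type) shows the same equivalence: calling a method on an instance is shorthand for calling a function with that instance as its first argument.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    width: float
    height: float

    def area(self) -> float:
        # Method form: the instance arrives as the first parameter, `self`
        return self.width * self.height


def area(rect: Rectangle) -> float:
    # Plain function form: the record is passed explicitly
    return rect.width * rect.height


r = Rectangle(3.0, 2.0)

# These three calls compute the same value
method_call = r.area()
unbound_call = Rectangle.area(r)
function_call = area(r)
```

Seen this way, the data (the record) and the function remain separate things, which is why a data-and-process view such as a DFD can feel like a natural fit.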
My advice is that if you're writing OO code, then use OOD. They map better and do help in thinking about a system. It takes a while to get your head out of procedural code, especially if you come from programming in the 80s/90s. Once you're in the zone of thinking about objects, the OOD methods work fine. It's not strictly a methodology, as there is no straight answer to which parts you use; just thinking in objects is what I find to be the hardest part. A good book on this is "Object Thinking" by David West; it helps to think about objects first. Once you start it's very difficult to stop. You may even, like some, end up trapped in the kingdom of the nouns, which is a horrible place to be, because you write endless boilerplate code just so that the system is described perfectly. This is a form of coding hell which I have stayed clear of for many years.
If you are coding in a language that allows procedural code, or even mixed OO/procedural code, you need to decide on your paradigm before you start coding. For example, in both Python and Object Pascal (Delphi) you can go either the OO or the procedural route, or mix the two into a mess of paradigms. This decision determines which diagramming tools should be used and how you are going to analyze the system.
Recently there have been shifts in Java and C# to provide functional programming techniques. These, I have discovered, don't fall into either category of programming (OO or procedural). Trying to map functional programming code onto objects is a nightmare.
I am sorry I haven't provided an answer, but it depends on what code you are writing.

There is no direct analogue, since UML emphasises OO design whereas DFD comes from structured systems analysis and design (SSAD). In UML a number of diagrams, specifically those in the interaction diagrams group, have characteristics that might model elements of data flow and processing. A communication diagram can be used to reflect most aspects of a DFD in general, while a sequence diagram may model specific sequences of flow. If you wanted to suggest DFD semantics, you could use stereotyped objects for data processes and data stores, and use actors for external entities.
It may be worth noting that Sparx Systems Enterprise Architect, while primarily a UML tool, includes DFD as an extension.

Similar diagrams would be:
information flow diagram
communication diagram
sequence diagram

Theoretically, new diagram kinds can be defined in UML, optionally extending one or more conventional diagram kinds. The canonical diagram kinds defined in UML are essentially defined as a part of the UML metamodel itself.
Formally, the UML metamodel is defined in the UML specification published by the Object Management Group (OMG). The corresponding meta-metamodel is defined in MOF, which has its own specification. These are accompanied by the OCL specification, which covers the definition of constraints in UML models, and by the XMI specification, which covers how UML models may be stored in a machine-readable format.
Ostensibly, all of these specifications may be combined "under the hood" of any single framework for UML modeling, whether it applies the Ecore subset of the UML metamodel or canonical UML.
Having reviewed a short academic presentation about Data Flow Diagrams, and departing somewhat from the formal definitions of UML diagram kinds while staying within the broader context of applications of the MOF meta-metamodel: perhaps the canonical BPMN metamodel, in its conventional graphical syntax, may provide something of an analogy to Data Flow Diagrams?
Of course, modeling practices may vary by vendor and by application environment.

I consider a Data Flow Diagram as a Sequence Diagram, with Data Producers and Data Consumers creating, using and destroying Data objects by means of synchronous and/or asynchronous messages.
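A rough Python sketch of this reading, assuming a single producer and a single consumer that exchange messages through a queue. The `producer`/`consumer` functions and the message layout are made up for the example, and both run one after the other here to keep the sketch deterministic.

```python
from queue import Queue
from typing import Iterable


def producer(outbox: Queue, readings: Iterable[int]) -> None:
    """Create one data object per reading and send it as an asynchronous message."""
    for value in readings:
        outbox.put({"value": value})  # the data object is created here
    outbox.put(None)  # end-of-stream marker


def consumer(inbox: Queue) -> int:
    """Receive data objects, use them, and let them go out of scope afterwards."""
    total = 0
    while True:
        message = inbox.get()
        if message is None:
            break
        total += message["value"]  # the data object is used here
    return total
```

In a sequence diagram, `producer` and `consumer` would be the lifelines, each `put`/`get` pair a message, and the dictionaries the data objects that are created and later discarded.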

I use Enterprise Architect 'Dynamic View' Analysis diagram.
Control = Process
Information = Data Store
In many ways their Analysis diagram is much better than a data flow diagram, as you can also show events in the form of sending and receiving. There is a process symbol too, but I prefer Control. It also includes object and decision symbols.
