Data structure for Hungarian Rings in Haskell

So I'm working on a solver for the Hungarian Rings puzzle (https://www.jaapsch.net/puzzles/rings.htm) in Haskell.
I'm not great at the language and still have a lot of blind spots. I'm struggling to figure out what data structure to use to represent the puzzle, and would love any hints, tips or answers! (By the way, my current idea represents the coloured balls as a series of numbers that will be in order when the puzzle is solved.)

Much like representing a Rubik's Cube in a data structure, a naive model contains redundant information, and the most compact model depends on an algebraic analysis of the object. So on the one hand, an operation on a model with redundant information may be inefficient; on the other, an operation on a compact model (e.g. a permutation group) may be quite abstract to translate into physical operations.
So you may find that a permutation group of higher order describes it more easily; here is a quote from the Rubik's Cube Group article on Wikipedia:
The Rubik's Cube group is the subgroup of the symmetric group S₄₈ generated by the six permutations corresponding to the six clockwise cube moves.
And that might well correspond to a set of double-ended queues, as luqui suggests, as long as you take into account that rotating one ring also rearranges the other, because the two rings share the balls at their intersection points.
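To make that concrete, here is a minimal Haskell sketch of the deque-style representation. The ring sizes and the intersection offsets below are assumptions for illustration, not the real puzzle's geometry:

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence (Seq, ViewR (..), index, update, viewr, (<|))

type Ball = Int -- balls numbered so that the solved state is in order

data Rings = Rings
  { ring1 :: Seq Ball
  , ring2 :: Seq Ball
  } deriving (Eq, Show)

-- Slots shared by the two rings, as (slot in ring1, slot in ring2).
-- These offsets are assumed; the physical puzzle fixes where the
-- rings actually intersect.
intersections :: [(Int, Int)]
intersections = [(0, 0), (5, 5)]

-- Rotate a ring one step: a cyclic shift moving the last ball to the front.
rotateOnce :: Seq a -> Seq a
rotateOnce s = case viewr s of
  EmptyR    -> s
  rest :> x -> x <| rest

-- Turning ring 1 also changes ring 2: the balls now sitting in the
-- shared slots are copied across so both rings stay consistent.
turnRing1 :: Rings -> Rings
turnRing1 (Rings r1 r2) = Rings r1' r2'
  where
    r1' = rotateOnce r1
    r2' = foldl (\r (i1, i2) -> update i2 (index r1' i1) r) r2 intersections
```

A symmetric turnRing2 would copy the shared balls in the other direction, and a counter-clockwise turn is the analogous shift using viewl.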

Related

What kind of relationship should "made of" and "made from" belong to in UML?

Please look at the following examples:
1) The chairs are made of wood.
2) Paper is made from trees.
3) Biogas is produced by the fermentation of waste.
4) Asphalt is produced through the refining of petroleum.
Should these be Composition or Dependency?
In general
You can exclude UML composition: composition implies exclusive ownership, but chairs are not the only products made of wood.
Dependency is either too much (once the biogas is made, there is no dependency on the waste anymore) or not enough (the wood of the chair is part of the chair; that is a stronger relationship than a need-to-know-about).
Moreover, it is not clear whether you want Chair to be a class of its own or an instance of a class Product.
Specific cases
"Made of" expresses a relationship with parts that make a product. A typical pattern here is the bill of material. The relation is a simple association. Some people tend to use UML aggregation, but UML specifications do not define precisely the semantics, so leave it out to avoid ambiguity.
A (very simplified) diagram would show the product linked to its parts by a plain association (the original diagram is not reproduced here).
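Since the diagram is not available here, a rough Haskell sketch of the same bill-of-material idea may help (all names are illustrative):

```haskell
-- A product is merely *associated* with its parts: the parts are not
-- exclusively owned, so this is a plain association, not a UML composition.
newtype Part = Part { partName :: String } deriving Show

data BomLine = BomLine -- one line of the bill of material
  { component :: Part
  , quantity  :: Int
  } deriving Show

data Product = Product
  { productName    :: String
  , billOfMaterial :: [BomLine]
  } deriving Show

woodenChair :: Product
woodenChair = Product "Chair"
  [ BomLine (Part "wooden leg")  4
  , BomLine (Part "wooden seat") 1
  ]
```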
"Made from" expresses a transformation process, where some of the original products disappear in the process and others are created. Paper is made from cellulose which is extracted from trees. But the original tree is not in the sheet of paper.
Typically, in process industries like chemicals, this is represented by a "recipe": a sequence of inputs (waste) and operations used to obtain products (biogas) and co-products (fermented residual waste). In other industries, it is represented by a "routing", a sequence of operations performed on a BOM (in this case the BOM would contain not only components, as shown previously, but also raw products that are transformed by the operations in the routing).
I will not show a diagram because this quickly gets very complex, but again, it would use simple associations.
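As a bare-bones illustration of the recipe idea only (every name below is made up, and real process-industry models are far richer):

```haskell
type Quantity = Double

-- A recipe transforms inputs through operations into products and co-products.
data Recipe = Recipe
  { inputs     :: [(String, Quantity)]
  , operations :: [String]
  , products   :: [(String, Quantity)]
  , coProducts :: [(String, Quantity)]
  } deriving Show

biogasRecipe :: Recipe
biogasRecipe = Recipe
  { inputs     = [("waste", 100)]
  , operations = ["fermentation"]
  , products   = [("biogas", 20)]
  , coProducts = [("fermented residual waste", 80)]
  }
```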
In the end, "made of" and "made from" would both be represented in UML with associations; only the semantics that you attach to them will change.
I don't think that every "made of" should be represented by some static relation. To make a wooden chair, the making just depends on wood (a UML Dependency). There is no other relation here, because some complicated process turns the wood into a chair, or a tree into paper, or anything else of your choice. In order to describe the relation, you would need to describe the process; this is possible using activities. You could also describe physical components with UML, but that would take us too far away from your question.
You might search for answers about composition here. But even those can be regarded as controversial. Modeling reality yields a model, which is not reality.
Your question highlights the difference between an ontology and an information model or application design. For example, in a conceptual ontology (which accounts for necessary and possible situations in the world rather than for data formats of observations and measurements), every person has a mother. In a particular database or application, that knowledge is usually irrelevant. What stuff is made of and how it is produced belongs more to a conceptual ontology, for which UML is not quite expressive enough. This is why languages such as First Order Logic (FOL) and OWL are used instead, and why some tools (such as my company’s) plug holes in UML. (One example of a hole in UML is the inability to express exactly the intersection of two classes.)

How can I design a sequence diagram where the actors don't interfere with each other?

I've been trying to design a sequence diagram for a storage system in a shop.
There are 3 actors (boss, employee and supplier) who can do different jobs on the system.
For example, the supplier can only inform the system of new arrivals. The employee can only check whether a product is in stock. The boss can check products in stock, get information about the products, etc.
The problem here is that the actors don't interfere with each other. Should I design 3 different diagrams, or should I design one diagram where the 3 actors are next to each other but one does not affect the others?
Any help would be valuable.
A sequence diagram aims to represent an interaction scenario.
You can of course represent several islands of unrelated interactions (e.g. supplier/product/stock and employee/stock/product), but this makes the diagram appear unnecessarily complex without adding value.
Conclusion: if the interactions are unrelated, you'd better represent them in 3 separate diagrams, each simple to understand and focused on one logical sequence.
Hint: keep your diagrams as simple as possible, but no simpler.
Why? Any diagram of more than 6-7 elements will appear difficult to understand to the average reader. This is a consequence of findings from Herbert Simon's experiments on short-term memory in the 70s. Experienced readers can understand much more complex diagrams because they work with chunks of elements (patterns) that they are able to group together and handle mentally as if each group were a single element. Keeping a diagram simple therefore helps most readers focus on the elements that matter, and making more diagrams makes it easier to grasp a complex system without having to chunk elements.

Using Conditional Random Fields for Nested named entity recognition

My question is the following.
When we work on named entity recognition (NER) tasks, in most cases the classic LSTM-CRF architecture is used, where the CRF uses the Viterbi decoder and the transition matrix to find the best tag sequence for a sentence.
My question is: if a token is now associated with multiple entities rather than just one (which is the case in nested NER), as in "Bank of China", where "China" is a location and "Bank of China" is an organization, can the CRF algorithm be adapted to this case? That is, can it find more than one possible path through the sequence?
This issue is related to the dataset format more than to the LSTM-CRF itself; you could indeed implement an LSTM-CRF that recognizes nested entities without depth limitation, but such implementations are rather rare.
Most machine-learning software (including LSTM-CRF implementations) is trained on the CoNLL (tab-separated) dataset format, which is not convenient for nesting of unlimited depth. Many datasets and systems therefore implement fixed-depth nesting using additional columns (roughly one per nesting depth). Software may use separate or joint learning for each depth, or use cascading models.
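For illustration, a fixed-depth encoding of the question's own example could dedicate one tag column per nesting level; this layout is a common convention rather than a single fixed standard:

```
Bank    B-ORG   O
of      I-ORG   O
China   I-ORG   B-LOC
```

A cascading setup would then train one model per column, while a joint setup would learn over the combined labels.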

NLP of Legal Texts?

I have a corpus of a few hundred thousand legal documents (mostly from the European Union): laws, commentary, court documents, etc. I am trying to algorithmically make some sense of them.
I have modeled the known relationships (temporal, this-changes-that, etc.). But at the single-document level, I wish I had better tools to allow fast comprehension. I am open to ideas, but here's a more specific question:
For example: are there NLP methods to determine the relevant/controversial parts of documents, as opposed to boilerplate? The recently leaked TTIP papers are thousands of pages with data tables, but one sentence somewhere in there may destroy an industry.
I have played around with Google's new Parsey McParseface and other NLP solutions in the past, but while they work impressively well, I am not sure how good they are at isolating meaning.
In order to make sense of documents you need to perform some sort of semantic analysis. You have two main possibilities, each with an example:
Use Frame Semantics:
http://www.cs.cmu.edu/~ark/SEMAFOR/
Use Semantic Role Labeling (SRL):
http://cogcomp.org/page/demo_view/srl
Once you are able to extract information from the documents, you can apply some post-processing to determine which information is relevant. Finding the relevant information is task-dependent, and I don't think you will find a generic tool that extracts "the relevant" information.
I see you have an interesting use case. You've also mentioned the presence of a corpus (which is a really good plus). Let me relate a solution I once sketched for extracting the crux of research papers.
To make sense of documents, you need "triggers", and you have to tell (or train) the computer to look for them. You can approach this with a supervised learning algorithm, at the most basic level as a simple text-classification problem. But this needs prior work, with initial help from domain experts to discern the "triggers" in the textual data. There are also tools to extract the gist of sentences: for example, take the noun phrases in a sentence, assign weights based on co-occurrences, and represent them as vectors (see the sketch below). This is your training data.
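As a minimal sketch of that weighting idea, assuming sentences have already been reduced to their noun phrases (the representation is an assumption for illustration):

```haskell
import qualified Data.Map.Strict as M

type Phrase = String

-- Count how often each ordered pair of phrases co-occurs in a sentence.
coOccurrences :: [[Phrase]] -> M.Map (Phrase, Phrase) Int
coOccurrences sentences =
  M.fromListWith (+) [ ((a, b), 1) | s <- sentences, a <- s, b <- s, a /= b ]

-- The feature "vector" of a phrase: its co-occurrence counts with all others.
vectorFor :: Phrase -> M.Map (Phrase, Phrase) Int -> M.Map Phrase Int
vectorFor p m = M.fromList [ (b, n) | ((a, b), n) <- M.toList m, a == p ]
```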
This can be a really good start to incorporating NLP into your domain.
Don't use triggers. What you need is word sense disambiguation and domain adaptation. You want to make sense of what is in the documents, i.e. understand the semantics to figure out the meaning.
You can build a legal ontology of terms in SKOS or JSON-LD format, represent it ontologically in a knowledge graph, and use it with dependency parsing such as tensorflow/parseymcparseface. Or you can stream your documents in using a kappa-based architecture: something like Kafka-Flink-Elasticsearch with intermediate NLP layers using CoreNLP/TensorFlow/UIMA, caching your indexing setup between Flink and Elasticsearch with Redis to speed up the process.
To understand relevancy, you can apply specific cases of boosting in your search. Furthermore, apply sentiment analysis to work out intents and truthfulness.
Your use case is one of information extraction, summarization, and semantic web/linked data. As the EU has a different legal system, you would need to generalize first on what really constitutes a legal document, then narrow it down to specific legal concepts as they relate to a topic or region. You could also use topic-modelling techniques here, from LDA or Word2Vec/Sense2Vec. Lemon might also help with converting lexical to semantics and semantics to lexical, i.e. NLP -> ontology and ontology -> NLP.
Essentially, feed the clustering into your classification for named entity recognition. You can also use the clustering to help build out the ontology, or to see which word vectors occur in a document or set of documents using cosine similarity. But in order to do all that, it would be best to visualize the word sparsity of your documents first. Something like commonsense reasoning + deep learning might help in your case as well.

What is the technical definition of theoretical computer science? What subfields are included?

What is the technical definition of theoretical computer science? (Or, what should it be?)
What main subfields does it include, and what is the commonality that separates them from the rest of computer science?
More specifically: if some particular research has direct practical motivations, goals and outcomes but mostly involves very abstract methods, is it theoretical computer science or not?
Two examples to consider:
"Dual quaternions for rigid transformation blending" (Better mathematical representation of rotation and transform for animation)
https://www.cs.tcd.ie/publications/tech-reports/reports.06/TCD-CS-2006-46.pdf
"Relational Semantics for Effect-Based Program Transformations
with Dynamic Allocation" (Complier optimisation via denotational semantics): http://research.microsoft.com/pubs/67977/ppdprelational.pdf
[The Wikipedia article gives only a vague definition and a long list of subfields. Should just accept that there's no better definition than this? http://en.wikipedia.org/wiki/Theoretical_computer_science ]
EDIT: I guess this question comes down to "What does the term 'theory' mean in the context of computer science?". Looking at the six different meanings of the word on Wiktionary, I don't think any of them fully fits. I guess the mathematical sense of a theory fits well for the completely mathematical fields but not for the others; for VLSI, machine learning and computational biology (listed in wikipedia:TCS) it basically doesn't fit.
I think the easiest way to distinguish theory from application is to look at the field's definition of a computer. If work in the field is based on the assumption that a computer is a physical object or system, then it's probably application. On the other hand, if work in the field is based on the assumption that a computer is an abstract (usually mathematical) object, it's probably theory. So, when you decide whether to say you are a theoretical computer scientist, I think you just have to ask yourself, "what is a computer?"
(For me, it's definitely an abstract object)
This link contains a list of subfields: http://arxiv.org/corr/home. I won't reproduce the list here, as it may change and it would be redundant.
Also, I'm reminded of a quote from someone, I can't remember who, along the lines of:
Mathematics is whatever mathematicians do.
It would seem to apply here as well.
