How to translate a syntactic parse into a dependency parse tree? - nlp

Using Link Grammar I can get a syntactic parse of sentences, something like the following:
+-------------------Xp------------------+
+------->WV------->+------Ost------+ |
+-----Wd----+ | +----Ds**x---+ |
| +Ds**c+--Ss--+ +-PHc+---A---+ |
| | | | | | | |
LEFT-WALL a koala.n is.v a cute.a animal.n .
+---------------------Xp--------------------+
+------->WV------>+---------Osm--------+ |
+-----Wd----+ | +------Ds**x------+ |
| +Ds**c+--Ss-+ +--PHc-+-----A----+ |
| | | | | | | |
LEFT-WALL a wolf.n is.v a dangerous.a animal.n .
+--------------------Xp--------------------+
+------->WV------>+--------Ost--------+ |
+-----Wd----+ | +------Ds**x-----+ |
| +Ds**c+--Ss-+ +--PHc-+----A----+ |
| | | | | | | |
LEFT-WALL a dog.n is.v a faithful.a animal.n .
+-----------------------Xp----------------------+
+------->WV------->+----------Osm----------+ |
+-----Wd----+ | +-------Ds**x-------+ |
| +Ds**c+--Ss--+ +--PHv--+-----A-----+ |
| | | | | | | |
LEFT-WALL a monkey.n is.v an independant.a animal.n .
The problem is that, AFAIK, it's not possible to make sense
of that output programmatically. It seems like the way to go
is to convert that syntactic output to a dependency parse tree.
How can I achieve that?

You may want to look at RelEx (at GitHub).
From link-grammar at Wikipedia (emphasis mine):
The semantic relationship extractor RelEx, layered on top of the
Link Grammar library, generates a dependency grammar output by
making explicit the semantic relationships between words in a
sentence. Its output can be classified as being at a level between
that of SSyntR and DSyntR of Meaning-Text Theory. It also provides
framing/grounding, anaphora resolution, head-word identification,
lexical chunking, part-of-speech identification, and tagging,
including entity, date, money, gender, etc. tagging. It includes a
compatibility mode to generate dependency output compatible with
the Stanford parser, and Penn Treebank-compatible POS tagging.
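To give a feel for what such a conversion involves, here is a deliberately naive sketch in Python. It is not RelEx's algorithm and does not call the Link Grammar library. The link list is transcribed by hand from the first diagram in the question, and the `HEAD_SIDE` table (which end of each link type is treated as the head) is an assumption made for illustration that covers only the link types in that sentence.

```python
import re

# Assumed head direction per link type (uppercase prefix of the label).
# This table is illustrative only and far from complete.
HEAD_SIDE = {
    "WV": "left",   # wall -- main verb: the wall acts as an artificial root
    "S": "right",   # subject -- verb: the verb is the head
    "O": "left",    # verb -- object: the verb is the head
    "D": "right",   # determiner -- noun: the noun is the head
    "A": "right",   # adjective -- noun: the noun is the head
}

def links_to_heads(links):
    """Map each dependent word index to (head index, link label)."""
    heads = {}
    for left, label, right in links:
        link_type = re.match(r"[A-Z]+", label).group()
        side = HEAD_SIDE.get(link_type)
        if side is None:
            continue  # link types not in the table (W, X, PH here) are skipped
        head, dep = (left, right) if side == "left" else (right, left)
        heads[dep] = (head, label)
    return heads

# "a koala is a cute animal ." as (left index, label, right index) triples.
words = ["LEFT-WALL", "a", "koala.n", "is.v", "a", "cute.a", "animal.n", "."]
links = [
    (0, "Xp", 7), (0, "WV", 3), (0, "Wd", 2),
    (1, "Ds**c", 2), (2, "Ss", 3), (3, "Ost", 6),
    (4, "Ds**x", 6), (4, "PHc", 5), (5, "A", 6),
]

heads = links_to_heads(links)
for dep in sorted(heads):
    head, label = heads[dep]
    print(f"{words[dep]} <-{label}- {words[head]}")
```

A hand-written table like this breaks down quickly on real text, which is why a maintained tool such as RelEx is the more practical route.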

Related

Recursively add prefix to file names and moving these files from all subdirectories to a specified directory (linux environment)

I'd like to rename the files with the unique sample name (which is the title of the subdirectory 2 levels above the files).
Here is a snippet of the directory structure:
|-RNAdata
| |-Sample1
| | |-cufflinks
| | | |-genes.fpkm_tracking
| | | |-skipped.gtf
| | | |-isoforms.fpkm_tracking
| | | |-transcripts.gtf
| |-Sample2
| | |-cufflinks
| | | |-genes.fpkm_tracking
| | | |-skipped.gtf
| | | |-isoforms.fpkm_tracking
| | | |-transcripts.gtf
There are about 1000 files like this. I'd like to be able to see something like this:
|-RNAdata
| |-Sample1_genes.fpkm_tracking
| |-Sample1_skipped.gtf
| |-Sample1_isoforms.fpkm_tracking
| |-Sample1_transcripts.gtf
| |-Sample2_genes.fpkm_tracking
| |-Sample2_skipped.gtf
| |-Sample2_isoforms.fpkm_tracking
| |-Sample2_transcripts.gtf
I'm working in a Linux environment and have only very basic knowledge of file management there. Any advice or suggestions for resources on this type of work would be great! I'd like to learn this so I can be more independent. Thank you!
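One possible approach, sketched in Python rather than shell. It assumes the layout shown above (`RNAdata/<sample>/cufflinks/<file>`); the function name `flatten_samples` and its parameters are made up for this example. Running it with `dry_run=True` first lets you inspect the planned destinations without moving anything.

```python
import shutil
from pathlib import Path

def flatten_samples(root, subdir="cufflinks", dry_run=False):
    """Move root/<sample>/<subdir>/<file> to root/<sample>_<file>.

    Returns the list of destination paths. Refuses to overwrite
    an existing destination file.
    """
    root = Path(root)
    moved = []
    # sorted() materialises the matches before any file is moved.
    for src in sorted(root.glob(f"*/{subdir}/*")):
        if not src.is_file():
            continue
        sample = src.parent.parent.name  # directory two levels above the file
        dest = root / f"{sample}_{src.name}"
        if dest.exists():
            raise FileExistsError(dest)
        if not dry_run:
            shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved

# Example call (path is a placeholder):
# flatten_samples("RNAdata", dry_run=True)
```

The now-empty `Sample*/cufflinks` directories are left in place; removing them would be a separate step.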

How to perform a series of steps in a single thread, with an async flow in spring-integration?

I currently have a spring-integration (v4.3.24) flow that looks like the following:
           |
           | list of
           | filepaths
      +----v---+
      |splitter|
      +----+---+
           | filepath
           |
+----------v----------+
|sftp-outbound-gateway|
|        "get"        |
+----------+----------+
           | file
+---------------------+
|     +----v----+     |
|     |decryptor|     |
|     +----+----+     |
|          |          |
|    +-----v------+   |  set of transformers
|    |decompressor|   |  (with routers before them
|    +-----+------+   |  because some steps are optional)
|          |          |  that process the file;
|       +--v--+       |  call this "FileProcessor"
|       | ... |       |
|       +--+--+       |
+---------------------+
           |
      +----v----+
      |save file|
      | to disk |
      +----+----+
           |
All of the channels above are DirectChannels. Yup, I know this is a poor structure. This worked fine for small numbers of files, but now I have to deal with thousands of files that need to go through the same flow, and benchmarks show that this takes about 1 day to finish. So I'm planning to introduce some parallel processing into this flow. I want to modify my flow to achieve something like this:
|
|
+----------v----------+
|sftp-outbound-gateway|
| "mget" |
+----------+----------+
| list of files
|
+----v---+
|splitter|
+----+---+
one thread one | thread ...
+------------------------+---------------+--+--+--+--+
| file | file | | | | |
+---------------------+ +---------------------+
| +----v----+ | | +----v----+ |
| |decryptor| | | |decryptor| |
| +----+----+ | | +----+----+ |
| | | | | |
| +-----v------+ | | +-----v------+ | ...
| |decompressor| | | |decompressor| |
| +-----+------+ | | +-----+------+ |
| | | | | |
| +--v--+ | | +--v--+ |
| | ... | | | | ... | |
| +--+--+ | | +--+--+ |
+---------------------+ +---------------------+
| |
+----v----+ +----v----+
|save file| |save file|
| to disk | | to disk |
+----+----+ +----+----+
| |
| |
For parallel processing, I output the files from the splitter onto an ExecutorChannel with a ThreadPoolTaskExecutor.
Some of the questions that I have:
1. I want all of the "FileProcessor" steps for one file to happen on the same thread, while multiple files are processed in parallel. How can I achieve this?
   I saw from this answer that an ExecutorChannel to MessageHandlerChain flow would offer such functionality. But some of the steps inside "FileProcessor" are optional (using selector-expression with routers to skip some of the steps), which rules out using a MessageHandlerChain. I can rig up a couple of MessageHandlerChains with Filters inside, but this more or less becomes the approach mentioned in #2.
2. If #1 cannot be achieved, will changing all of the channel types from the splitter onwards from DirectChannel to ExecutorChannel help in introducing some parallelism? If yes, should I create a new TaskExecutor for each channel, or can I reuse one TaskExecutor bean for all channels (I cannot set scope="prototype" on a TaskExecutor bean)?
3. In your opinion, which approach (#1 or #2) is better? Why?
4. If I perform global error handling, like the approach mentioned here, will the other files continue to process even if one file errors out?
It will work as you need if you use an ExecutorChannel as the input to the decryptor and leave all the rest as direct channels. The remaining flow does not have to be a chain; because direct channels invoke the next component on the caller's thread, every downstream step for a message runs on the executor thread that picked it up.
You will need to be sure all your downstream components are thread-safe.
Error handling should remain as is; each sub flow is independent.
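The threading behaviour described above can be illustrated outside Spring with a small Python analogy. This is not Spring Integration code: the thread pool plays the role of the ExecutorChannel, the plain function calls play the role of direct channels, and all function names are invented for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def decrypt(payload, trace):
    trace.append(("decrypt", threading.get_ident()))
    return payload + ":decrypted"

def decompress(payload, trace):
    trace.append(("decompress", threading.get_ident()))
    return payload + ":decompressed"

def save(payload, trace):
    trace.append(("save", threading.get_ident()))
    return payload + ":saved"

def process_file(name, needs_decompress=True):
    """All steps for one file: direct calls, so they share one thread."""
    trace = []
    payload = decrypt(name, trace)
    if needs_decompress:  # optional step, comparable to a router skipping it
        payload = decompress(payload, trace)
    payload = save(payload, trace)
    return payload, trace

def run(files, workers=4):
    # The only hand-off to another thread happens here, once per file.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_file, files))

results = run(["a.bin", "b.bin", "c.bin"])
for payload, trace in results:
    thread_ids = {tid for _, tid in trace}
    print(payload, "ran on", len(thread_ids), "thread(s)")
```

Each file's steps record the same thread id, while different files may land on different worker threads.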

Interrupt back-propagation in branched neural networks

I've a neural network that looks something like this.
input_layer_1 input_layer_2
| |
| |
some_stuff some_other_stuff
| /|
| _________________/ |
| / |
multiply |
| |
| |
output_1 output_2
Is there any way to cut the connection between some_other_stuff and multiply during back-propagation? I was thinking of dropout, but that is also applied during forward-propagation.
So during back-propagation it should be like two networks:
input_layer_1 input_layer_2
| |
| |
some_stuff some_other_stuff
| |
| |
| |
multiply |
| |
| |
output_1 output_2
Output_1 errors only influence weight adjustment in the left part of the network and Output_2 errors only on the right part.
I'm using keras with tensorflow so maybe there are some functions/Layers that achieve this.
Thanks.
If anyone wonders, you could use K.stop_gradient() inside a Lambda layer.
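In Keras the backend function is spelled `K.stop_gradient`, and it would typically be wrapped as `Lambda(lambda t: K.stop_gradient(t))` on the branch that feeds multiply. To show what that does without depending on a framework, here is a toy reverse-mode autodiff in plain Python. The `Value` class and this `stop_gradient` helper are illustrative stand-ins written for this example, not Keras or TensorFlow code.

```python
class Value:
    """Scalar node that remembers its parents and local derivatives."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents  # tuples of (parent Value, d(self)/d(parent))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self, upstream=1.0):
        # Accumulate the gradient and push it to the parents (chain rule).
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)

def stop_gradient(v):
    # Same forward value, but no link back to v: nothing flows backwards.
    return Value(v.data)

some_stuff = Value(2.0)
some_other_stuff = Value(3.0)
right_weight = Value(5.0)

output_1 = some_stuff * stop_gradient(some_other_stuff)
output_2 = some_other_stuff * right_weight

output_1.backward()
grad_from_output_1 = some_other_stuff.grad  # stays 0.0: the branch is cut

output_2.backward()
print(some_stuff.grad, grad_from_output_1, some_other_stuff.grad)
```

The forward value of `output_1` still uses `some_other_stuff`, but only `output_2` contributes to its gradient, which matches the two-network picture in the question.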

Resort key-Value combination

The following example just shows the pattern; my data is much bigger.
I have a Table like
| Variable | String |
|:---------|-------:|
| V1 | Hello |
| V2 | little |
| V3 | World |
I have another table where different arrangements are defined
| Arrangement1 | Arrangement2 |
|:-------------|-------------:|
| V3 | V2 |
| V2 | V1 |
| V1 | V3 |
My output depending on the asked Arrangement (e.g. Arrangement1) should be
| Variable | Value |
|:---------|------:|
| V3 | World |
| V2 | little|
| V1 | Hello |
So far I have tried an approach with .find and an array, but I think there might be an easier way (maybe with a dictionary?). Does anyone have an idea that performs well?
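The question does not name a language, so here is the dictionary idea sketched in Python with the sample data from the tables above. The variable and function names are invented for the example; the same pattern carries over to any language with a hash map.

```python
# Variable -> String, built once from the first table.
strings = {"V1": "Hello", "V2": "little", "V3": "World"}

# Arrangement name -> ordered list of variables, from the second table.
arrangements = {
    "Arrangement1": ["V3", "V2", "V1"],
    "Arrangement2": ["V2", "V1", "V3"],
}

def arrange(name):
    """Return (variable, value) pairs in the order the arrangement asks for."""
    return [(var, strings[var]) for var in arrangements[name]]

for variable, value in arrange("Arrangement1"):
    print(variable, value)
```

Each lookup is a constant-time hash access on average, so the cost grows with the length of the arrangement rather than with a repeated search through the whole table.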

Text -> Diagram Tool [closed]

Closed 10 years ago.
I'm looking for a diagram tool for producing diagrams from text. I only really need sequence and state diagrams for now, but I'm curious what people would recommend. I need something standalone (not a web-based tool) that works on Linux, OS X and Windows.
I'm not sure what you mean by "producing diagrams from text", but if you mean a tool where diagrams are specified by a text file, Graphviz is good. If you mean something that literally converts ASCII art like
+--------+   +-------+    +-------+
|        | --+ ditaa +--> |       |
|  Text  |   +-------+    |diagram|
|Document|   |!magic!|    |       |
|     {d}|   |       |    |       |
+---+----+   +-------+    +-------+
    :                         ^
    |       Lots of work      |
    +-------------------------+
to a graphic:
You can try ditaa (that ASCII art is from their website, so it's a good example of the input format it expects).
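For the Graphviz route, where the diagram is specified by a text file, a state diagram can be written in the DOT language by hand or generated. The Python sketch below builds DOT text from a list of transitions; the `to_dot` helper and the example states are invented for illustration.

```python
def to_dot(transitions, name="states"):
    """Build Graphviz DOT text from (source, event, target) triples."""
    lines = [f"digraph {name} {{"]
    for src, event, dst in transitions:
        lines.append(f'    "{src}" -> "{dst}" [label="{event}"];')
    lines.append("}")
    return "\n".join(lines)

dot_text = to_dot([
    ("Idle", "start", "Running"),
    ("Running", "pause", "Paused"),
    ("Paused", "resume", "Running"),
    ("Running", "stop", "Idle"),
])
print(dot_text)
```

Saved to a file such as `states.dot`, the text can be rendered with the Graphviz command-line tool, for example `dot -Tpng states.dot -o states.png`.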
Look at PlantUML, LaTeX+MetaUML, sdedit, TextUML, yUML, ...
There are plenty of quite good tools.
I recommend TextDiagram http://weidagang.github.com/text-diagram/. It creates UML sequence diagrams from plain text.
Example input
object April Todd Monad
note left of April: Lunch is ready
April->Todd: Todd, what are you doing?
note right of Todd: Programming #_#
Todd->April: Well, I'm programming.
April->Monad: And you?
Monad->April: I'm reading book.
April->Monad: Good boy!
note right of Monad: Smile ^_^
produces:
+-------+ +-------+ +-------+
| April | | Todd | | Monad |
+-------+ +-------+ +-------+
-----------------\ | | |
| Lunch is ready |-| | |
------------------ | | |
| | |
| Todd, what are you doing? | |
|------------------------------>| |
| | ------------------\ |
| |-| Programming #_# | |
| | ------------------- |
| | |
| Well, I'm programming. | |
|<------------------------------| |
| | |
| And you? | |
|------------------------------------------------------>|
| | |
| | I'm reading book. |
|<------------------------------------------------------|
| | |
| Good boy! | |
|------------------------------------------------------>|
| | | ------------\
| | |-| Smile ^_^ |
| | | -------------
| | |
I'd recommend PlantUML. It is an excellent tool that lets you draw all kinds of UML diagrams from a simple textual specification.
EventStudio supports generation of sequence diagrams and collaboration diagrams from text input.
