Copying or visiting the same tree multiple times? - antlr4

I have a program that runs multiple analyses on the same AST, one analysis per thread. Each analysis is implemented as a visitor. For now, each thread parses the program input, transforms it into an AST, and runs its visitor. However, parsing the same program multiple times is very time-consuming.
I tried parsing the program once and having the threads run their visitors on the shared tree, but it does not seem that multiple visitors can share the same tree (especially concurrently).
One alternative would be to parse the program once, build the AST, and make a copy of the AST for each analysis.
Any recommendations on how to proceed?
Is there any way to safely process the same AST by different threads?
Is there a way to make a copy (e.g. deepcopy) of an AST?
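For reference, here is a minimal sketch of the per-thread workflow described above, assuming the Java target; the grammar name MyLang, the start rule program, and AnalysisVisitor are placeholders:

    // Each thread currently repeats all of this work; only the last
    // line differs between analyses.
    CharStream input = CharStreams.fromString(source);
    MyLangLexer lexer = new MyLangLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    MyLangParser parser = new MyLangParser(tokens);
    ParseTree tree = parser.program();       // assumed start rule
    new AnalysisVisitor().visit(tree);       // the per-thread analysis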

Related

Sharing NetworkX graph between processes with no additional memory cost (read-only)

I am using python's multiprocessing module. I have a networkx graph which I wish to share between many sub processes. These subprocesses do not modify the graph in any way, and only read its attributes (nodes, edges, etc). Right now every subprocess has its own copy of the graph, but I am looking for a way to share the graph between all of them, which will result in the memory footprint of the entire program being reduced. Since the computations are very CPU-intensive, I would want this to be done in a way that would not cause big performance issues (avoiding locks if possible, etc).
Note: I want this to work on various operating systems, including Windows, which means COW (copy-on-write) does not help (if I understand this correctly, it probably wouldn't have helped regardless, due to reference counting).
I found https://docs.python.org/3/library/multiprocessing.html#proxy-objects and
https://docs.python.org/3/library/multiprocessing.shared_memory.html, but I'm not sure which (or if either) is suitable. What is the right way to go about this? I'm using python 3.8, but can use later versions if helpful.
There are a few options for sharing data in Python during multiprocessing, but you may not be able to do exactly what you want.
In C++ you could use simple shared memory for ints, floats, structs, etc. Python's shared memory manager does allow this type of sharing for simple objects, but it doesn't work for classes or anything more complex than a list of base types. For shared complex Python objects, you really only have a few choices:
1. Create a copy of the object in your forked process (which it sounds like you don't want to do).
2. Put the object in a centralized process (i.e., Python's Manager / proxy objects) and interact with it via pipes and pickled data.
3. Convert your networkX graph to a list of simple ints and put it in shared memory.
What works for you is going to depend on some specifics. Option #2 has a bit of overhead because every time you need to access the object, data has to be pickled and piped to the centralized process and the result pickled/piped for return. This works well if you only need a small portion of the centralized data at a time and your processing steps are relatively long (compared to the pickle/pipe time).
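A minimal sketch of option #2, using multiprocessing.Manager to host one copy of the graph's adjacency and handing workers only proxies; degree_of is a stand-in for your real read-only computation:

    import multiprocessing as mp
    import networkx as nx

    def degree_of(shared_adj, node):
        # Each proxy access is pickled, piped to the manager process, and
        # the reply pickled back - that round trip is the overhead above.
        return len(shared_adj[node])

    if __name__ == "__main__":
        g = nx.erdos_renyi_graph(100, 0.1)
        with mp.Manager() as manager:
            # One managed copy of the adjacency; workers hold only proxies.
            shared_adj = manager.dict({n: list(g.neighbors(n))
                                       for n in g.nodes})
            with mp.Pool(4) as pool:
                degrees = pool.starmap(degree_of,
                                       [(shared_adj, n) for n in g.nodes])
        print(max(degrees))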
Option #3 could be a lot of work. You would fundamentally be changing the data format from networkX objects to a list of ints, so it's going to change the way you do your processing a lot.
A while back I put together PythonDataServe, which allows you to serve your data to multiple processes from another process. It's a very similar solution to #2 above. This type of approach works if you only need a small portion of the data at a time; if you need all of it, it's much easier to just create a local copy.

Clojure: Create and manage multiple threads

I wrote a program which needs to process a very large dataset and I'm planning to run it with multiple threads on a high-end machine.
I'm a beginner in Clojure and I'm lost in the myriad of tools at my disposal: agents, futures, core.async (and Quartzite?). I would like to know which is best suited for this job.
The following describes my situation:
- I have a function which transforms some data and stores it in a database.
- The argument to this function is popped from a Redis set.
- I want to run the function in several separate threads as long as there is a value in the Redis set.
For simplicity, futures can't be beat. They create a new thread, and return a value from it. However, often you need more fine-grained control than they provide.
The core.async library has nice support for parallelism (via pipeline, see below), and it also provides automatic back-pressure. You need a way to control the flow of data so that no one is starved for work or burdened by too much of it; core.async channels must be bounded, which helps with this problem. It's also a pretty logical model of your problem: take a value from a source, transform it (maybe using a transducer?) with some given parallelism, and put the result into your database.
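A minimal sketch of that shape; pop-from-redis!, transform, and store-in-db! are hypothetical helpers standing in for your Redis pop, transformation, and database write:

    (require '[clojure.core.async :as a])

    ;; Bounded channels give automatic back-pressure.
    (def in  (a/chan 100))
    (def out (a/chan 100))

    ;; Producer: pop values from the Redis set on a real thread
    ;; (blocking I/O should stay off the go-block thread pool).
    (a/thread
      (loop []
        (if-let [v (pop-from-redis!)]        ; hypothetical helper
          (do (a/>!! in v) (recur))
          (a/close! in))))

    ;; Transform with parallelism 8; pipeline manages its workers
    ;; and closes `out` once `in` is exhausted.
    (a/pipeline 8 out (map transform) in)    ; `transform` is a placeholder

    ;; Consumer: write results to the database, again on a real thread.
    (a/<!! (a/thread
             (loop []
               (when-let [r (a/<!! out)]
                 (store-in-db! r)            ; hypothetical helper
                 (recur)))))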
You can also go the manual route of using Java's excellent j.u.concurrent library. There are low-level primitives as well as thread-management tools for thread pools. All of this is accessible from Clojure.
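For example, a fixed thread pool via interop (same hypothetical helpers as above):

    (import '(java.util.concurrent Executors TimeUnit))

    (let [pool   (Executors/newFixedThreadPool 8)
          worker (fn []
                   (loop []
                     (when-let [v (pop-from-redis!)]
                       (store-in-db! (transform v))
                       (recur))))]
      (dotimes [_ 8]
        (.submit pool ^Runnable worker))   ; hint disambiguates Runnable
      (.shutdown pool)
      (.awaitTermination pool 1 TimeUnit/HOURS))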
From a design standpoint, it comes down to whether you are more CPU-bound or I/O-bound. This affects decisions such as whether or not you will perform parallel reads from redis and writes to your database. If you are CPU-bound and thus your bottleneck is the computation, then it wouldn't make much sense to parallelize your reads from redis, or your writes to your database, would it? These are the types of things to consider.
You really have two problems to solve: (1) your familiarity with clojure's/java's concurrency mechanisms, and (2) your approach to this problem (i.e., how would you approach this problem, irrespective of the language you're using?). Once you solve #2, you will have a much better idea of which tools to use that I mentioned above, and how to use them.
Sounds like you may have a good embarrassingly parallel problem to solve. In that case, you could start simply by coding up your processing into a top-level function that processes the first datum. Once that's working, wrap it in a map to handle all of the data sequentially (serially, one at a time).
You might want to start tackling the bigger problem with just a few items from your data set. That will make your testing smoother and faster.
After you have the map working, it's time to just add a p (parallel) to your code to make it a pmap. This is a very rewarding way to heat up your machine.
Here is a discussion about the number of threads pmap uses.
The above is the simplest approach. If you need finer control over the concurrency, this concurrency screencast explores the use cases.
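A minimal sketch of that progression; process-datum, expensive-transform, and data are placeholders:

    (defn process-datum [datum]
      ;; the expensive, pure transformation goes here
      (expensive-transform datum))

    ;; Serial first - easy to test on a handful of items:
    (def sample-results (doall (map process-datum (take 5 data))))

    ;; Then parallel: a one-character change.
    (def all-results (doall (pmap process-datum data)))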
It is hard to be precise without knowing the details of your problem. There are several choices, as you mention:
- Plain Java threads & thread pools. If your problem is similar to a pre-existing Java solution, this may be the most straightforward.
- Simple Clojure threading with future et al. Kicking off a thread with future and getting the result in a promise is very easy (see the sketch after this list).
- Replace map with pmap (parallel map). This can help in simple cases that are primarily map/reduce oriented.
- The Claypoole library: lots of tools to make multithreading simpler and easier. Please see their GitHub project and the Clojure/West talk.
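A minimal future sketch, reusing the hypothetical pop-from-redis!, transform, and store-in-db! helpers from above:

    ;; One future per worker; each drains the Redis set until it is empty.
    ;; deref blocks until the future's value is ready.
    (let [workers (doall
                    (repeatedly 8
                      #(future
                         (loop []
                           (when-let [v (pop-from-redis!)]
                             (store-in-db! (transform v))
                             (recur))))))]
      (run! deref workers))   ; block until all workers finish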

Trying to apply the concept of coroutines to existing code

I have a very basic sitemap scraper built in Python 3 using requests and lxml. The aim is to build a database of the URLs of a certain website. Currently it works as follows: for each top-level sitemap to be scraped, I trigger a celery task. In this task, the sitemap is parsed to check whether it's a sitemapindex or a urlset. Sitemapindexes point to other sitemaps hierarchically, whereas urlsets point to end URLs - they're like the leaves of the tree.
If the sitemap is identified as a sitemapindex, each URL it contains, which points to a sub-sitemap, is processed in a separate thread, repeating the process from the beginning.
If the sitemap is identified as a urlset, the URLs within are stored in the database and this branch finishes.
I've been reading about coroutines, asyncio, gevent, async/await, etc and I'm not sure if my problem is suitable to be developed using these technologies or whether performance would be improved.
As far as I've read, coroutines are useful when dealing with IO operations, in order to avoid blocking execution while the IO operation is running. However, I've also read that they're inherently single-threaded, so I understand there's no parallelization when, e.g., the code starts parsing the XML response from the IO operation.
So essentially the questions are: how could I implement this using coroutines/asyncio/insert_similar_technology, and would I benefit from it performance-wise?
Edit: by the way, I know Twisted has a specialized SitemapSpider, just in case anyone suggests using it.
Sorry, I'm not sure I fully understand how your code works, but here are some thoughts:
Does your program download multiple urls?
If yes, asyncio can be used to reduce the time your program spends waiting for network I/O. If not, asyncio won't help you.
How does your program download urls?
If one-by-one, then asyncio can help you grab them much faster. On the other hand, if you're already grabbing them in parallel (with different threads, for example), you won't get much benefit from asyncio.
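A minimal sketch of how the same tree walk could look on asyncio, assuming aiohttp for HTTP; parse_sitemap (an lxml-based classifier returning the kind and its entries) and store_urls are hypothetical helpers:

    import asyncio
    import aiohttp

    async def process_sitemap(session, url):
        async with session.get(url) as resp:
            body = await resp.read()
        kind, entries = parse_sitemap(body)    # hypothetical lxml helper
        if kind == "sitemapindex":
            # Recurse into sub-sitemaps concurrently (was: one thread each).
            await asyncio.gather(*(process_sitemap(session, u)
                                   for u in entries))
        else:  # urlset: store the leaf URLs
            store_urls(entries)                # hypothetical DB helper

    async def main(top_sitemaps):
        async with aiohttp.ClientSession() as session:
            await asyncio.gather(*(process_sitemap(session, u)
                                   for u in top_sitemaps))

    asyncio.run(main(["https://example.com/sitemap.xml"]))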
I advise you to read my answer about asyncio here. It's short and it can help you understand why and when to use asynchronous code.

How should I make my parser concurrent?

I'm working on implementing a music programming language parser in Clojure. The idea is that you run the parser program with a text file as a command-line argument; the text file contains code in this music language I'm developing; the parser interprets the code and figures out what "instrument instances" have been declared, and for each instrument instance, it parses the code and returns a sequence of musical "events" (notes, chords, rests, etc.) that the instrument does. So before that last step, we have multiple strings of "music code," one string per instrument instance.
I'm somewhat new to Clojure and still learning the nuances of how to use reference types and threads/concurrency. My parser is going to be doing some complex parsing, so I figured it would benefit from using concurrency to boost performance. Here are my questions:
The simplest way to do this, it seems, would be to save the concurrency for after the instruments are "split up" by the initial parse (a single-thread operation), then parse each instrument's code on a different thread at the same time (rather than wait for each instrument to finish parsing before moving onto the next). Am I on the right track, or is there a more efficient and/or logical way to structure my "concurrency plan"?
What options do I have for how to implement this concurrent parsing, and which one might work best, from either a performance or a code-maintenance standpoint? It seems like it could be as simple as (map #(future (process-music-code %)) instrument-instances) (see the sketch after these questions), but I'm not sure if there is a better way to do it, like with an agent, or manual threads via Java interop, or what. I'm new to concurrent programming, so any input on different ways to do this would be great.
From what I've read, it seems that Clojure's reference types play an important role in concurrent programming, and I can see why, but is it always necessary to use them when working with multiple threads? Should I worry about making some of my data mutable? If so, what in particular should be mutable in the code for the parser I'm writing, and what reference type(s) would be best suited for what I'm doing? The nature of the way my program works (the user runs it with a text file as an argument, and the program processes the file and turns it into audio) makes it seem like I don't need anything to be mutable, since the input data never changes. My gut tells me I won't need any reference types, but then again, I might not fully understand the relationship between reference types and concurrency in Clojure.
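A minimal sketch of the future-per-instrument idea from question 2, using the names from the snippet above; note the eager mapv, since a lazy map would delay starting the futures:

    ;; Start one future per instrument eagerly, then block for each result.
    (def parsed-instruments
      (->> instrument-instances
           (mapv #(future (process-music-code %)))  ; all threads start now
           (mapv deref)))                           ; deref blocks until done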
I would suggest that you might be distracting yourself from more important things (like working out the details of your music language) by premature optimization. It would be better to write the simplest, easiest-to-code parser which you can first, to get up and running. If you find it too slow, then you can look at how to optimize for better performance.
The parser should be fairly self-contained, and will probably not take a whole lot of code anyways, so even if you later throw it out and rewrite it, it will not be a big loss. And the experience of writing the first parser will help if and when you write the second one.
Other points:
You are absolutely right about reference types -- you probably won't need any. Your program is a compiler -- it takes input, transforms it, writes output, then exits. That is the ideal situation for pure functional programming, with nothing mutable and all flow of data going purely through function arguments and return values.
Using a parser generator is usually the quickest way to get a working parser, but I haven't found a really good parser generator for Clojure. Parsley has a really nice API, but it generates LR(0) parsers, which are almost useless for anything which does not have clear, unambiguous markers for the beginning/end of each "section". (Like the way S-expressions open and close with parens.) There are a couple parser combinator libraries out there, like squarepeg, but I don't like their APIs and prefer to write my own hand-coded, recursive-descent parsers using my own implementation of something like parser combinators. (They're not fast, but the code reads really well.)
I can only support Alex D's point that writing parsers is an excellent exercise. You should definitely do it in C one time. From my own experience, it's a lot of debugging training, at least.
Aside from that, given that you are in the beautiful world of Clojure notice the following:
Your parser will transform ordinary strings into data structures, like:
    {:command :declare,
     :args {:name "bazooka-violin",
            ...},
     ...}
In Clojure you can read such data structures easily from EDN files. It might be more valuable to play around with finding suitable structures directly, before you constrain the syntax of your language so much that it loses the flexibility for later changes in the way your language works.
Don't ever think about writing for performance. Unless your user describes the collected works of Bach in a file, it's unlikely that it will take more than a second to parse.
If you write your interpreter in a functional, modular and concise way, it should be easy to decompose it into steps that can be parallelized using various techniques from pmap to core.reducers. The same of course goes for all other code and your parser as well (if multi-threading is a necessity there).
Note that even Clojure itself is not compiled in parallel. It does, however, support recompilation (on the JVM), which in contrast is a far more valuable feature to think about.
As an aside, I've been reading The Joy of Clojure, and I just learned that there is a nifty clojure.core function called pmap (parallel map) that provides a nice, easy way to perform an operation in parallel on a sequence of data. Its syntax is just like map, but the difference is that it performs the function on each item of the sequence in parallel and returns a lazy sequence of the results! This can generally give a performance boost, but it depends on the inherent cost of coordinating the sequence of results, so whether pmap gives a performance boost will depend on the situation.
At this stage in my MPL parser, my plan is to map a function over a sequence of instruments/music data, transforming each instrument's music data from a parse tree into audio. I have no idea how costly this transformation will be, but if it turns out that it takes a while to generate the audio for each instrument individually, I suppose I could try changing my map to pmap and see if that improves performance.
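A minimal sketch of that plan; parse-tree->audio and instruments are placeholders for the music-data transformation and the per-instrument parse results:

    ;; Serial version first:
    (def audio (doall (map parse-tree->audio instruments)))

    ;; If each transformation proves expensive, the parallel version is a
    ;; one-character change:
    ;; (def audio (doall (pmap parse-tree->audio instruments)))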

Massive number of XML edits

I need to load a mid-sized XML file into memory, make many random-access modifications to it (perhaps hundreds of thousands), then write the result to standard output. Most of these modifications will be node insertions/deletions, as well as character insertions/deletions within the text nodes. These XML files will be small enough to fit into memory, but large enough that I won't want to keep multiple copies around.
I am trying to settle on the architecture/libraries and am looking for suggestions.
Here is what I have come up with so far-
I am looking for the ideal XML library for this, and so far I haven't found anything that seems to fit the bill. The libraries generally store nodes in Haskell lists and text in Data.Text values. This only allows linear-time node and text inserts, and I believe the text inserts will have to do a full rewrite on every insert/delete.
I think storing both nodes and text in sequences seems to be the way to go: it supports O(log n) inserts and deletes, and only needs to rewrite a small fraction of the tree on each alteration. None of the XML libraries are based on this, though, so I will have to either write my own library or use one of the others to parse and then convert to my own form (given how easy it is to parse XML, I would almost as soon do the former rather than have a shadow parse of everything).
I had briefly considered the possibility that this might be a rare case where Haskell might not be the best tool.... But then I realized that mutability doesn't offer much of an advantage here, because my modifications aren't char replacements, but rather add/deletes. If I wrote this in C, I would still need to store the strings/nodes in some sort of tree structure to avoid large byte moves for each insert/delete. (Actually, Haskell probably has some of the best tools to deal with this, but I would be open to suggestions of a better choice of language for this task if you feel there is one).
To summarize:
Is Haskell the right choice for this?
Does any Haskell library support fast (O(log n)) node/text inserts and deletes?
Is a sequence (Data.Sequence) the best data structure to store a list of items (in my case, nodes and chars) for fast inserts and deletes?
I will answer my own question-
I chose to wrap a Text.XML tree in a custom structure that stores nodes and text in Data.Sequence values. Because Haskell is lazy, I believe it only holds the Text.XML data in memory temporarily, node by node as the data streams in, and that it is garbage-collected before I actually start any real work modifying the Sequence trees.
(It would be nice if someone here could verify that this is how Haskell works internally, but I've implemented it, and the performance seems reasonable - not great, about 30k inserts/deletes per second, but this should do.)
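A minimal sketch of that wrapper idea; the type and function names here are illustrative, not taken from the actual implementation:

    import qualified Data.Sequence as Seq
    import           Data.Sequence (Seq)
    import qualified Data.Text as T

    -- Children and text both live in Seq, so inserts and deletes at an
    -- arbitrary position cost O(log n) instead of a full rewrite.
    data SeqNode
      = SeqElement T.Text (Seq SeqNode)  -- tag name, child nodes
      | SeqText (Seq Char)               -- editable character sequence

    -- O(log n) character edits inside a text node:
    insertChar :: Int -> Char -> Seq Char -> Seq Char
    insertChar = Seq.insertAt

    deleteChar :: Int -> Seq Char -> Seq Char
    deleteChar = Seq.deleteAt

    main :: IO ()
    main = print (deleteChar 0 (insertChar 1 'x' (Seq.fromList "hello")))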
