Can someone suggest a high performance shared memory API that supports complex data? [closed] - linux

I'm looking at porting an old driver that generates a large, complex set of tables of data into user space - because the tables have become large enough that memory consumption is a serious problem.
Since performance is critical and there will be 16-32 simultaneous readers of the data, we thought we'd replace the old /dev-based interface to the code with a shared-memory model that would allow clients to search the tables directly rather than querying the daemon.
The question is - what's the best way to do that? I could use shm_open() directly, but that would probably require me to devise my own record locking and even, possibly, an ISAM data structure for the shared memory.
Rather than writing my own code to re-visit the 1970s, is there a high-performance shared memory API that provides a hash-based lookup mechanism? The data is completely numeric; the search keys are fixed-length bit fields that may be 8, 16, or 32 bytes long.
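For reference, a minimal sketch of the raw shm_open()/mmap() route mentioned above, with a process-shared read-write lock standing in for real record locking (the object name, table layout, and sizes are illustrative, not a finished design):

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

struct SharedTable {
    pthread_rwlock_t lock;   // many concurrent readers, one writer
    size_t count;            // number of records currently stored
    uint64_t keys[1024];     // illustrative fixed-size record area
};

int main() {
    // Create (or open) the named shared-memory object and size it.
    int fd = shm_open("/table_demo", O_CREAT | O_RDWR, 0600);
    if (fd < 0 || ftruncate(fd, sizeof(SharedTable)) < 0) return 1;

    auto* tbl = static_cast<SharedTable*>(mmap(nullptr, sizeof(SharedTable),
        PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (tbl == MAP_FAILED) return 1;

    // The lock must be marked process-shared; in real code only the
    // creating process should run this initialisation.
    pthread_rwlockattr_t attr;
    pthread_rwlockattr_init(&attr);
    pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_rwlock_init(&tbl->lock, &attr);

    // Writer: take the exclusive lock while updating the table.
    pthread_rwlock_wrlock(&tbl->lock);
    tbl->keys[tbl->count++] = 0xDEADBEEF;
    pthread_rwlock_unlock(&tbl->lock);

    // Reader (typically another process): shared lock for lookups.
    pthread_rwlock_rdlock(&tbl->lock);
    std::printf("records: %zu\n", tbl->count);
    pthread_rwlock_unlock(&tbl->lock);

    munmap(tbl, sizeof(SharedTable));
    close(fd);
    return 0;
}

Compile with -lrt -lpthread on Linux. Note that a hash index over the keys would still have to be laid out by hand inside the mapped region - exactly the 1970s-revisiting work the question hopes to avoid.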

This is something I've wanted to write for some time, but there's always some more pressing thing to do...
Still, for most use cases of a shared key-value RAM store, memcached would be the simplest answer.
In your case it looks like it's lower-level, so memcached, fast as it is, might not be the best answer. I'd try Judy arrays on a shmem block. They're really fast, so even if you wrap the access with a simplistic lock, you'd still get high-performance access.
For more complex tasks, I'd search for lock-free structures (some links: 1, 2, 3, 4). I even wrote one some time ago, with hopes of integrating it into a Lua kernel, but it proved really hard to reconcile with the existing implementation. Still, it might interest you.
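To give a flavour of what the lock-free approach involves, here is a minimal sketch of a Treiber-style lock-free stack in C++11 (not the structure from the answer above; note the deliberately dodged memory-reclamation problem):

#include <atomic>
#include <utility>

template <typename T>
class LockFreeStack {
    struct Node { T value; Node* next; };
    std::atomic<Node*> head_{nullptr};
public:
    void push(T v) {
        Node* n = new Node{std::move(v), head_.load(std::memory_order_relaxed)};
        // CAS loop: retry until we swing head_ from n->next to n.
        while (!head_.compare_exchange_weak(n->next, n,
                std::memory_order_release, std::memory_order_relaxed)) {}
    }
    bool pop(T& out) {
        Node* n = head_.load(std::memory_order_acquire);
        while (n && !head_.compare_exchange_weak(n, n->next,
                std::memory_order_acquire, std::memory_order_relaxed)) {}
        if (!n) return false;
        out = std::move(n->value);
        // Node deliberately leaked: safe reclamation (hazard pointers,
        // epochs, RCU) is the genuinely hard part of lock-free design.
        return true;
    }
};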

Related

how to achieve disk write speed as high as fio does? [closed]

I bought a HighPoint HBA card with 4 Samsung 960 PROs in it. As the official site says, this card can perform 7500 MB/s writing and 13000 MB/s reading.
When I tested this card with fio on my Ubuntu 16.04 system, I got a write speed of about 7000 MB/s. Here are my test arguments:
sudo fio -filename=/home/xid/raid0_dir0/fio.test -direct=1 -rw=write -ioengine=sync -bs=2k -iodepth=32 -size=100G -numjobs=1 -runtime=100 -time_base=1 -group_reporting -name=test-seq-write
I have made a RAID 0 array on the card and created an XFS filesystem. I want to know how to achieve write speeds as high as fio's if I use functions such as open(), read(), write() or fopen(), fread(), fwrite() in my console applications.
I'll just note that the fio job you specified seems a little flawed:
-direct=1 -rw=write -ioengine=sync -bs=2k -iodepth=32
(for the sake of simplicity let's assume the dashes are actually double)
The above is trying to ask a synchronous ioengine to use an iodepth of greater than one. This usually doesn't make sense, and the iodepth section of the fio documentation warns about this:
iodepth=int
Number of I/O units to keep in flight against the file. Note that
increasing iodepth beyond 1 will not affect synchronous ioengines
(except for small degrees when verify_async is in use). Even async
engines may impose OS restrictions causing the desired depth not to be
achieved. [emphasis added]
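For comparison, an asynchronous engine does honour the queue depth; a corrected job along these lines (libaio engine; other parameters kept from the original and purely illustrative) would be:
sudo fio --filename=/home/xid/raid0_dir0/fio.test --direct=1 --rw=write --ioengine=libaio --iodepth=32 --bs=2k --size=100G --numjobs=1 --runtime=100 --time_based --group_reporting --name=test-seq-write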
You didn't post the fio output, so we can't tell whether you ever achieved an iodepth of greater than one. 7.5 GByte/s seems high for such a job, and I can't help thinking your filesystem quietly went and did buffering behind your back - but who knows? I can't say more because the output of your fio run is unavailable, I'm afraid.
Also note the data fio was writing might not have been random enough to defeat compression thus helping to achieve an artificially high I/O speed...
Anyway to your main question:
how to achieve disk write speed as high as fio does?
Your example shows you are telling fio to use an ioengine that does regular write calls. With this in mind, theoretically you should be able to achieve a similar speed by:
Preallocating your file and only writing into the allocated portions of it (so you are not doing extending writes)
Fulfilling all the requirements of using O_DIRECT (there are strict memory alignment and size constraints that MUST be fulfilled)
Making sure your write operations work on buffers and write in chunks of exactly 2048 bytes (or greater, so long as they are powers of two) - see the sketch below
Submitting your writes as soon as possible :-)
You may find not using O_DIRECT (and thus allowing buffered I/O to do coalescing) is better if for some reason you are unable to submit "large" well aligned buffers every time.
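A minimal sketch of the O_DIRECT route (the 4096-byte alignment and 1 MiB chunk size are assumptions; in real code query the device's logical block size, e.g. via ioctl(BLKSSZGET)):

// Build with g++, which defines _GNU_SOURCE (needed for O_DIRECT).
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

int main() {
    const size_t kAlign    = 4096;        // assumed logical block size
    const size_t kChunk    = 1 << 20;     // 1 MiB per write, a multiple of kAlign
    const off_t  kFileSize = 100L << 20;  // 100 MiB for the demo

    // O_DIRECT requires the buffer address, file offset, and I/O size
    // to all be aligned to the device's logical block size.
    void* buf = nullptr;
    if (posix_memalign(&buf, kAlign, kChunk) != 0) return 1;
    std::memset(buf, 0xAB, kChunk);

    // Path is the one from the question; preallocate so the writes
    // below are not extending writes.
    int fd = open("/home/xid/raid0_dir0/direct.test",
                  O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) return 1;
    if (posix_fallocate(fd, 0, kFileSize) != 0) return 1;

    // Submit large, aligned writes back to back.
    for (off_t off = 0; off < kFileSize; off += (off_t)kChunk)
        if (pwrite(fd, buf, kChunk, off) != (ssize_t)kChunk) return 1;

    close(fd);
    std::free(buf);
    return 0;
}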

High Level language with Low level Graphics [closed]

I am looking for a high level language which will still allow me to work directly with graphics. I want to be able to modify screen pixels, for instance. But I do not want to write huge amounts of code for each operation. I want simple one line commands for graphics somewhat like those listed below. What are some programming languages which would have these features?
Possible pseudocode:
Screen.clear
Graphics.line(4,5,20,25).color=green
Circle(centerx,centery,radius)
Depending on what you want to do (i.e., how complex do you need to get?), Processing is a very high-level, graphics-focused environment. Note, however, that it seems to be focused on the fixed-function OpenGL pipeline, which is deprecated (though arguably the easiest and most intuitive way to get started).
Processing is built in Java, runs in a web browser (or from your desktop), and abstracts most of the initialization and cleanup code required to use OpenGL.
Edit
I've just noticed your comment that says you're not an experienced programmer. In that case, I'd recommend starting with Processing. Once you get the hang of it, move on to Python.
Another, slightly more complex, option is Python. Python is very powerful, fairly easy to pick up (depending upon your prior development experience), and widely supported. It'll also allow you to use shaders and other features from the 21st century, and is cross-platform. See this link for PyOpenGL, the first Python OpenGL site that popped up in Google.
Then, there's C# + OpenTK. This can get pretty complex pretty quickly, but is very powerful, and since it's compiled (under .NET or Mono), can potentially give you better performance than Python.
Finally, for close-to-bare-metal performance, C++ is unbeatable, though arguably the most complex of these options, with a significant learning curve. However, most of the example code you'll find online is in C++, which can be an issue if you're not using C++ and aren't comfortable reading it.
Using Qt you can create QImage objects and draw on them using QPainter.
You of course have pixel-level control at that abstraction level, but you can also access the underlying memory directly using the bits() and bytesPerLine() methods, thus accessing the image memory directly.
The format that is easiest to use for fast pixel computations is, in my opinion, QImage::Format_ARGB32, with 32 bits per pixel.
Qt is a C++ library portable across many OSs and platforms, and bindings are available for many very high-level languages (e.g. Python).
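A minimal sketch of that Qt approach, matching the pseudocode in the question (file name, image size, and pixel coordinates are arbitrary):

#include <QGuiApplication>
#include <QColor>
#include <QImage>
#include <QPainter>
#include <QPoint>

int main(int argc, char** argv) {
    QGuiApplication app(argc, argv);

    // One pixel = 32 bits (0xAARRGGBB) in Format_ARGB32.
    QImage img(640, 480, QImage::Format_ARGB32);
    img.fill(Qt::white);                          // Screen.clear

    QPainter p(&img);
    p.setPen(Qt::green);
    p.drawLine(4, 5, 20, 25);                     // Graphics.line(4,5,20,25).color=green
    p.drawEllipse(QPoint(320, 240), 100, 100);    // Circle(centerx,centery,radius)
    p.end();

    // Direct pixel access through the raw buffer, as described above.
    QRgb* row = reinterpret_cast<QRgb*>(img.bits() + 120 * img.bytesPerLine());
    row[60] = qRgb(255, 0, 0);                    // set pixel (60, 120) to red

    img.save("out.png");
    return 0;
}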

Good sources to learn about viruses and other security tools? [closed]

Anti-virus, malware, botnets and the like are becoming larger and larger parts of our daily lives. Are there any resources that discuss creating anti-virus tools, security tools and such? Seems like an interesting topic, but I have not been able to find any real source to refer to in order to learn more.
Suggestions? (Good and bad?)
I assume most languages used for this are C++ or assembly? Or are there others that work well for this sort of thing?
Alex's suggestion of Bruce Schneier's work is excellent, and everyone should read his stuff, but probably won't address what you're talking about. Even so, you should read it. He's the clearest writer on security topics today, and a voice of sanity in an often hysterical industry.
A free place to start for the bare bones is the SANS reading room. It's far from enough, but it's the basics.
I was fairly pleased with The Shellcoder's Handbook. It's a good introduction with some practical code to work with. It shows how real exploits are written, which is the first step in understanding how to protect against them.
Exploit work is done in a variety of things, but for the classic stack-smashing attacks, you need to know C and the assembler of the target platform (generally Intel). C++ is much less common in this world. It's too twisty-turny by the time the compiler gets done with it, and too bloated for the kinds of things needed. If anything, Objective-C is more useful, in my opinion, so that you can understand Mac reverse engineering. But that isn't where security work is usually done. In this I'm speaking of exploits themselves; many security tools are of course written in C++.
For the security tools side, you probably want to ask on serverfault. There are many, and the SANS link above should have links to some of the common tools (Nessus, nmap, hping, metasploit and the like). sectools.org maintains a big list that I like.
If you're going to be a security developer, you need a lot of breadth and a lot of depth. You need to understand the network protocols as well as the code that talks to them. You should be reasonably comfortable in languages from assembler to ruby. Much of it is more a way of thinking than an actual skill set, but those who are good at it tend to have broad skills and pick up new things quickly and often.
Since you noted specifically detecting and monitoring for exploits, you should dig into tools like snort (for learning how to detect) and metasploit (for generating the attacks to detect).
Go to http://www.milw0rm.com/ to see the exploits.
For a holistic view on security, anything by Bruce Schneier comes highly recommended -- not the threat-specific focus you have in mind, but a background that will make you more effective at security issues in whatever role you play, whatever background you have.
For more specific views, I would recommend this book (and just about every book I've looked at in depth in the same series, but I can't personally vouch for all of them - there are dozens!).
As well as what Alex Martelli posted, this book might be something you can consider.

Interactive Statistical Analysis tool [closed]

I'm looking for a basic software for statistical analysis. Most important is simple and intuitive use, getting started "right out of the box". At least basic operations should be interactive. Free would be a bonus :)
The purpose is analysis of data dumps and logs of various processes.
Importing a comma/tab separated file
Sorting and filtering rows on conditions
Basic aggregates: count, average, deviation, regression, trend
Visualization - plotting the data, bin distributions, etc.
Excel fails (at least for me) at filtering and re-combining data; I guess something like "Excel with SQL" would be nice. I've been using MS Access + Excel and copying data around before, but that's a pain.
Do you have any recommendations?
Clarification: I am not looking for a specific tool for IIS/web server logs, but for various data and event logs (mostly from custom applications) with tab-separated values.
Specifically for log file analysis I would recommend Microsoft's Log Parser (free), which will allow you to run queries with basic aggregation against all types of text-based files (and across sets of files), XML, CSV, the Event Log, the Registry, the file system, Active Directory, etc.
There is also a free GUI built on top of it, called Log Parser Lizard GUI, which makes it more user-friendly and can do basic graphing, etc.
I would consider looking at R, it is:
Free
Widely used by statisticians
Fairly easy to use.
Can easily do everything you mention in your post
I used Tableau Software at a previous gig, and it's pretty amazing - extremely intuitive and easy to use.
Unfortunately it's also pricey.

Analyzing Multithreaded Programs [closed]

We have a codebase that is several years old, and all the original developers are long gone. It uses many, many threads, but with no apparent design or common architectural principles. Every developer had his own style of multithreaded programming, so some threads communicate with one another using queues, some lock data with mutexes, some lock with semaphores, some use operating-system IPC mechanisms for intra-process communications. There is no design documentation, and comments are sparse. It's a mess, and it seems that whenever we try to refactor the code or add new functionality, we introduce deadlocks or other problems.
So, does anyone know of any tools or techniques that would help to analyze and document all the interactions between threads? FWIW, the codebase is C++ on Linux, but I'd be interested to hear about tools for other environments.
Update
I appreciate the responses received so far, but I was hoping for something more sophisticated or systematic than advice that is essentially "add log messages, figure out what's going on, and fix it." There are lots of tools out there for analyzing and documenting control-flow in single-threaded programs; is there nothing available for multi-threaded programs?
See also Debugging multithreaded applications
Invest in a copy of Intel's VTune and its thread profiling tools. It will give you both a system and a source level view of the thread behaviour. It's certainly not going to autodocument the thing for you, but should be a real help in at least visualising what is happening in different circumstances.
I think there is a trial version that you can download, so may be worth giving that a go. I've only used the Windows version, but looking at the VTune webpage it also has a Linux version.
As a starting point, I'd be tempted to add tracing log messages at strategic points within your application. This will allow you to analyse how your threads are interacting with no danger that the act of observing the threads will change their behaviour (as could be the case with step-by-step debugging).
My experience is with the .NET platform and my favoured logging tool would be log4net since it's free, has extensive configuration options and, if you're sensible in how you implement your logging, it won't noticeably hinder your application's performance. Alternatively, there is .NET's built in Debug (or Trace) class in the System.Diagnostics namespace.
I'd focus on the shared memory locks first (the mutexes and semaphores) as they are most likely to cause issues. Look at which state is being protected by locks and then determine which state is under the protection of several locks. This will give you a sense of potential conflicts. Look at situations where code that holds a lock calls out to methods (don't forget virtual methods). Try to eliminate these calls where possible (by reducing the time the lock is held).
Given the list of mutexes that are held and a rough idea of the state that they protect, assign a locking order (i.e., mutex A should always be taken before mutex B). Try to enforce this in the code.
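A minimal sketch of one way to enforce such an ordering at runtime, in C++ (the OrderedMutex wrapper and its ranks are illustrative, not an existing library):

#include <cassert>
#include <mutex>

// Every mutex gets a rank; a thread may only acquire mutexes in
// strictly increasing rank order, so "A before B" is checked at runtime.
class OrderedMutex {
public:
    explicit OrderedMutex(int rank) : rank_(rank) {}
    void lock() {
        assert(rank_ > current_rank && "lock ordering violated");
        mtx_.lock();
        prev_rank_ = current_rank;
        current_rank = rank_;
    }
    void unlock() {                // assumes LIFO release, as lock_guard gives
        current_rank = prev_rank_;
        mtx_.unlock();
    }
private:
    std::mutex mtx_;
    const int rank_;
    int prev_rank_ = 0;            // only touched while mtx_ is held
    static thread_local int current_rank;
};
thread_local int OrderedMutex::current_rank = 0;

OrderedMutex mutexA(1);            // rank 1: always take before mutexB
OrderedMutex mutexB(2);            // rank 2

void correct() {
    std::lock_guard<OrderedMutex> a(mutexA);
    std::lock_guard<OrderedMutex> b(mutexB);   // OK: rank 1 then rank 2
}
void will_assert() {
    std::lock_guard<OrderedMutex> b(mutexB);
    std::lock_guard<OrderedMutex> a(mutexA);   // fires the assert: 2 then 1
}

int main() { correct(); return 0; }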
See if you can combine several locks into one if concurrency won't be adversely affected. For example, if mutexes A and B seem like they might deadlock and an ordering scheme is not easily imposed, combine them into one lock initially.
It's not going to be easy, but I'm for simplifying the code at the expense of concurrency to get a handle on the problem.
This is a really hard problem for automated tools. You might want to look into model checking your code. Don't expect magical results: model checkers are very limited in the amount of code and the number of threads they can effectively check.
A tool that might work for you is CHESS (although it is unfortunately Windows-only). BLAST is another fairly powerful tool, but is very difficult to use and may not handle C++. Wikipedia also lists StEAM, which I haven't heard of before, but sounds like it might work for you:
StEAM is a model checker for C++. It detects deadlocks, segmentation faults, out of range variables and non-terminating loops.
Alternatively, it would probably help a lot to try to converge the code towards a small number of well-defined (and, preferably, high-level) synchronization schemes. Mixing locks, semaphores, and monitors in the same code base is asking for trouble.
One thing to keep in mind when using log4net or a similar tool is that it changes the timing of the application and can often hide the underlying race conditions. We had some poorly written code to debug; introducing logging actually removed the race conditions and deadlocks (or greatly reduced their frequency).
In Java, you have choices like FindBugs (for static bytecode analysis) to find certain kinds of inconsistent synchronization, or the many dynamic thread analyzers such as those from Coverity, JProbe, OptimizeIt, etc.
Can't UML help you here?
If you reverse-engineer your codebase into UML, you should be able to draw class diagrams that show the relationships between your classes. Starting from the classes whose methods are the thread entry points, you can see which thread uses which class. Based on my experience with Rational Rose, this can be achieved using drag-and-drop; if there is no relationship between the added class and the previous ones, then the added class is not directly used by the thread whose entry method you began the diagram with. This should give you hints about the role of each thread.
This will also show the "data objects" that are shared and the objects that are thread-specific.
If you draw a big class diagram and remove all the "data objects", you should be able to lay out that diagram as clouds, each cloud being a thread - or a group of threads, unless the coupling and cohesion of the code base is awful.
This will only give you one part of the puzzle, but it could be helpful; I just hope your codebase is not too muddy or too "procedural", in which case...
