What is your programming language of choice for a multi-threaded HTTP downloading application?

I'm eager to learn a new programming language.
Which one(s) would you suggest for a program that:
downloads millions of URLs, in a multi-threaded manner
interacts with a DB of some sort to store downloaded data
Think web crawler/search engine styled projects. And know that I'm up for learning literally anything.
Please post your favorite language, why you chose it, and your favorite tutorial/reference manual (preferably free!) for said language.
Note: I will update this post occasionally to include everyone's best answers.

F# is a nice choice, because idiomatic patterns for async operations (especially I/O) and parallelization are key strengths of the language.
You can do it easily, and the .NET Framework's BCL is at your service as well.

Personally, I use Python for stuff like this. You can use the urllib2 module (urllib.request in Python 3) to download content via HTTP, and I find the syntax of Python to be pleasing.
Furthermore, you can thread easily in Python.
Good Luck.
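As a rough illustration of the Python approach, here is a minimal sketch that downloads URLs on a thread pool and stores the results in a database. It assumes Python 3 (so urllib.request instead of urllib2) and uses SQLite as a stand-in for "a DB of some sort"; the names fetch and crawl and the pages table are made up for this example.

```python
import sqlite3
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, db_path=":memory:", workers=8, fetcher=fetch):
    """Fetch urls on a thread pool and store each result in SQLite.

    Returns the open connection so the caller can query the results.
    (crawl, fetcher and the pages table are hypothetical names.)
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, body BLOB, error TEXT)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetcher, url): url for url in urls}
        # Only this (main) thread touches the database; workers just download.
        for future in as_completed(futures):
            url = futures[future]
            try:
                body, error = future.result(), None
            except Exception as exc:  # record the failure and keep going
                body, error = None, repr(exc)
            conn.execute(
                "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
                (url, body, error),
            )
    conn.commit()
    return conn
```

For millions of URLs you would probably feed the pool in batches rather than submitting everything at once, and add politeness delays and retries; this sketch leaves all of that out.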

Related

Options for wrapping a C++ library for Haskell (and other languages)

This question is about design / is fairly open-ended.
I'd like to use OpenCV, a large C++ library, from Haskell.
The closest solution at the moment is probably Arjun Comar's attempt to adapt the Python / Java binding generator.
See here, here, and here.
His approach generates a C interface, which is then wrapped using hsc2hs.
Due to OpenCV's lack of referential transparency in its API, as well as its frequent use of call parameters for output, for Arjun's approach to fully succeed he'll need to define a new API for OpenCV, and implement it in terms of the existing one.
So, it seems it might not be too much extra work to go whole-hog and define an API using an interface description language (IDL), such as SWIG, protobuf-with-RPC, or Apache Thrift.
This would provide interfaces to a number of languages besides Haskell.
My questions:
Is there anything better than SWIG for a server-free solution?
(I just want to call into C++; I'd rather not go through a local server.)
If there's no good server-free solution, should I use protobuf-with-RPC or Thrift?
Related: How good is Thrift's Haskell support?
From the code, it looks like it needs updating (I see references to GHC 6).
Related: What's a good protobuf-with-RPC solution?
With Apache Thrift, you get Haskell support. You are correct that the code is not generally the "latest", but that rarely matters. You can do complex things at other abstraction levels and keep things as simple as possible at the messaging level.
Google Protobuf has no official support for Haskell, nor does SWIG. With Protobuf you get C++, Java, JavaScript and Python, which are, to my knowledge, the main languages at Google. Have a look at this presentation. Thrift and Protobuf are, without contest, the best in class.
It seems in your case you have to go with Thrift, as it supports Haskell.
It sounds like the foreign function interface for C++ is what you want:
Hackage,
Github
Disclaimer: I haven't used it, only heard good things about it.

Platform for creating a visual programming language

I'm interested in creating a visual programming language that can help non-programmers (like children) write simple programs, much like LabVIEW or Simulink allows engineers to connect functional blocks together without knowing how they are built internally. Is this called programming by demonstration? What are example applications?
What would be an ideal platform for doing this? (It can be a desktop or a web app.)
Check out Google Blockly. Blockly allows a developer to create their own blocks, translations (generators) to virtually any programming language (or even JSON/XML) and includes a graphical interface to allow end users to create their own programs.
Brief summary:
Blockly was influenced by App Inventor, which itself was based off Scratch
App Inventor now uses Blockly (?!)
So does the BBC micro:bit
Blockly itself runs in a browser (typically) using JavaScript
Focused on (visual) language developers
language independent blocks and generators
includes a Block Factory - which allows visual programming to create new Blocks (?!) - I didn't find this useful myself...except for understanding
includes generators to map blocks to JavaScript/Python
[Figure: example blocks and the code generated from them]
See https://developers.google.com/blockly/about/showcase for more details
Best wishes - Andy
The adventure on which you are about to embark is the design and implementation of a visual programming language. I don't know of any good textbooks in this area, but there are an IEEE conference and refereed journal devoted to this field. Margaret Burnett of Oregon State University, who is a highly regarded authority, has assembled a bibliography on visual programming languages; I suggest you start there.
You might consider writing to Professor Burnett for advice. If you do, I hope you will report the results back here.
There is Scratch, developed at MIT, which is much like what you are looking for.
http://scratch.mit.edu/
A restricted form of programming is dataflow (aka flow-based) programming, where the application is built from components by connecting their ports. Depending on the platform and purpose, the components are simple (like a path selector) or complex (like an image transformer). There are several dataflow systems (I've built two myself): some of them have no visual editor, some are just part of a bigger system, and some don't even mention the approach. (Did you realize that make, MS Excel and Unix shell pipes are all forms of this?)
All modern digital synths are based on the dataflow approach; here is an amazing visual example: http://www.youtube.com/watch?v=0h-RhyopUmc
AFAIK, there's no dataflow system made specifically for educational purposes. For more information, you should check this site: http://flowbased.org/start
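To make the "components connected through ports" idea concrete, here is a minimal sketch using Python generators, in the spirit of Unix shell pipes. The component names (source, selector, transform) and the connect helper are invented for this example and do not come from any particular dataflow system.

```python
from functools import partial


def source(items):
    """Component with a single output port: emits each item in turn."""
    yield from items


def selector(predicate, upstream):
    """Path selector: forwards only the items that satisfy predicate."""
    for item in upstream:
        if predicate(item):
            yield item


def transform(func, upstream):
    """Transformer: applies func to every item flowing through."""
    for item in upstream:
        yield func(item)


def connect(upstream, *components):
    """Wire components together, output port to input port, in order."""
    stream = upstream
    for component in components:
        stream = component(stream)
    return stream


pipeline = connect(
    source(range(10)),
    partial(selector, lambda n: n % 2 == 0),   # keep even numbers
    partial(transform, lambda n: n * n),       # square them
)
result = list(pipeline)  # [0, 4, 16, 36, 64]
```

A visual editor for such a system would essentially let the user draw the connect call: boxes for components, wires for the streams between them.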
There is a new open source library out there: TUM.CMS.VPLControl. Get it here. This library may serve as a basis for your purposes.
There is Snap!, developed at UC Berkeley. It is another option for understanding VPLs.
Take a look at CoSpaces Edu. It is an online platform that enables the creation of virtual worlds and learning experiences whilst providing a more flexible approach to the learning curriculum.
It has a visual coding language named "CoBlocks".
Learners can animate and code their creations with "CoBlocks" before exploring and sharing them in mobile VR.
It is also possible to use JavaScript or TypeScript.
If you want to go ahead with this, the platform that I suggest is the one used to implement Scratch (which already does what you want, IMHO), which is Squeak Smalltalk. The Squeak environment was designed with visual programming explicitly in mind. It's free, and Smalltalk syntax can be learned in half an hour. Learning the gigantic class library may take just a little longer.
The blocks editor with the most support and development for the micro:bit is Microsoft MakeCode.
Scratch is a horrible language for teaching programming (I'm biased, but check out Pipes Visual Programming Language).
What you seem to want to do sounds a lot like function block programming (as in the IEC 61499 function block standard and other VPLs for mechatronics development). There is already a lot of research into VPLs, so you might want to make sure that A) what you are trying to do has an audience and B) what you are trying to do can be done easily.
It sounds a bit negative in tone, but a good place to start to test the plausibility of your idea is by reading Davor Babic's short blog post at http://blog.davor.se/blog/2012/09/09/Visual-programming/
As for what platform to use: you could use pretty much anything, just make sure it has good graphics libraries (you could use Java with Swing, if you like pain, or Python with Tkinter). It just depends on what you are familiar with. Keep in mind who you eventually want to launch the language to (if it's iOS, then look at using Objective-C, etc.).

A common set of problems to learn new languages

With "polyglot" programming techniques becoming more relevant, it is almost a necessity to use the "right" PL for the problem. However, learning new languages takes time, which most project teams usually can't afford. What is the best way to learn a new programming language? Is there a common set of problems that can be solved to reach a certain level of competence?
Well, it depends what you want to do. (web, db, whatever).
Generally I'd want to know:
What's the library like, how do I reference it
What ORMs are there
What build/deployment platforms exist for it
How does it handle updates
How do I do general things, like:
DB Access
File things
Display UI's
and so on.
Really, learning is only by doing -- you need a project that you can use the given language for.
Project Euler is the first thing to come to mind as an oft-used set of problems to try in a new language, even if it's not something I've ever tried.
If the language is another JVM or CLR hosted one, the issues about learning the environment can be set aside -- you can use all your familiar APIs in your Clojure/Scala/F#... code -- and concentrate on the syntax and idiom.
Otherwise, you're probably using the new language because it has a good fit for the particular problem you want to solve (e.g. native code and functional -> Haskell; distributed and concurrent -> Erlang) so the fit of the feature set is known in advance but you have the extra load of learning the standard APIs. And that's what prototyping is for.
The book Programming Challenges and the associated website provide a large list of algorithmic problems, with automatic online judging in several languages (Java, C, C++). Any algorithm textbook can give you lots of examples of basic data structures and procedures to try and implement, which is often a nice way to get some practice with basic language syntax and features. My personal favourite for this is The Algorithm Design Manual, which is language agnostic, but there are plenty of good language-specific books available as well (Mastering Algorithms in Perl or Data Structures and Algorithms in Java, for example).
If you're interested in a general set of mathematical problems to try and solve, Project Euler is a great resource.
For more day to day problems, I find the cookbook approach most helpful. For example, both Perl and Python have excellent O'Reilly cookbooks, as well as online resources, which provide short examples of many common and important problems. As mentioned in another answer, the key here is to find canonical examples of basic features you will need, particularly by leveraging what's available in standard libraries. I usually try and build up my own small library of examples as I go along, e.g. a socket example, a DB access example, a file reading example, a simple numerical solver, etc, which I then pillage for ideas when it's time to write production code.

Which libraries are indispensable?

If you moved to a new programming language, which libraries do you feel must be supported if you're to keep using the language?
I am interested in both specific libraries (eg, bindings for libXYZ should exist) and categories (eg, a regular expression library should exist).
As an extension to this, what are the deal breaker features or design decisions (language level or library level) that would persuade you to switch to another language or to ignore it? Does your current main language support these well? How could they be improved upon?
I am interested to hear what people find most important for their choice of programming language besides syntax, platform support, efficiency and paradigm.
String handling is still essential today. So either the language or the standard library should have a nice set of string handling features.
A strong xUnit-like library.
Webservice support
XML Processing
A database connectivity library
A Networking library
A threading library
A File IO library
In terms of frameworks:
A Rich GUI library
An AJAX library
An application server.
It wasn't that important a few decades ago, but support for networking is very important now.
At the very least high-level stuff like HTTP.
The things I use all the time are only the basics, like collections, networking and I/O. And I would expect the language to support them directly, not through an add-on library.
A solid Math library helps quite a bit.
Regular expressions
Logging & other diagnostics
Cryptography
Collections (lists/maps/stacks/etc)

How do you create a computer or scripting language for an application?

Duplicate of:
Learning to write a compiler
Documentation on creating a programming language
Learning Resources on Parsers, Interpreters, and Compilers
Suggestions for writing a programming language?
Compiler-Programming: What are the most fundamental ingredients?
Are there some online resources about compiler principle?
and others I'm too lazy to find right now.
I'm not asking how to make an incredibly complex language. I just wanted to understand the basics. I would use C# as the underlying language. I know it's vague. I was hoping for something very basic to direct me.
I think I'm mostly interested in creating scripting languages. For example, I see people that write programs but then they have a scripting language for their application. I do not want to rewrite a windows scripting language. Say I had a text file reader and for some reason wanted a scripting language to automate something. I'm not sure how to ask.
Thank you.
EDIT - Thank you for the answers. I was looking at it more for the learning than the doing at the moment. I would probably use Lua, but I am trying to learn more about the concept in general.
You could take a look at Lua - I've used it to great success each time I asked myself the question "How would I automate insert task here in insert one of my apps here?"
Edit: Here are some examples (taken from the links page of the admittedly unwieldy Lua wiki) of how you could embed Lua in your app:
Embedding Lua in C: Using Lua from inside C
Embedding a scripting language inside your C/C++ code
Embeddable scripting with Lua
You can use an existing language like Python or JavaScript. For example, for JavaScript, there is http://www.mozilla.org/rhino/ for Java apps. So typically you don't need to actually invent a new language; you would just provide a custom API for a language that already exists.
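The "custom API for an existing language" idea can be sketched as follows, using Python as both the host and the scripting language. The TextReader class and the functions it exposes (line_count, get_line, emit) are hypothetical, loosely modelled on the text file reader from the question. Note that exec with a trimmed-down namespace is not a security sandbox; this only shows the shape of the approach.

```python
class TextReader:
    """Hypothetical host application: a tiny text file reader."""

    def __init__(self, text):
        self.lines = text.splitlines()
        self.output = []

    def script_api(self):
        """The custom API that scripts are allowed to call."""
        return {
            "line_count": lambda: len(self.lines),
            "get_line": lambda index: self.lines[index],
            "emit": self.output.append,
        }

    def run_script(self, source):
        """Run a user script with only the custom API (plus range) visible."""
        namespace = {"__builtins__": {"range": range}}
        namespace.update(self.script_api())
        exec(source, namespace)  # NOT a sandbox; fine for trusted scripts only
        return self.output


# A user-written automation script: report every line containing "TODO".
SCRIPT = """
for i in range(line_count()):
    if "TODO" in get_line(i):
        emit((i + 1, get_line(i)))
"""

reader = TextReader("alpha\nTODO: beta\ngamma")
found = reader.run_script(SCRIPT)  # [(2, 'TODO: beta')]
```

The host application stays in control of what scripts can see, and the scripting language itself (parser, interpreter, error messages) comes for free.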
First you need a lexical analyzer (lexer) generator like lex, then a parser generator like bison.
Then you can work with the parser's output to create an interpreter that 'executes' the parsed syntax.
That's how most scripting languages do it.
P.S.: Another way is to practice by writing shells - shell scripts (bash, csh, or sh) are highly simplified scripting languages.
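The lexer, parser and interpreter stages described above can be sketched for a toy arithmetic language. This version is written by hand in Python rather than generated with lex and bison, and the grammar and all function names are made up for the example.

```python
import operator
import re


def tokenize(text):
    """Lexer: turn source text into a list of (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+(?:\.\d+)?|\S", text):
        if lexeme[0].isdigit():
            tokens.append(("num", float(lexeme)))
        elif lexeme in "+-*/()":
            tokens.append(("op", lexeme))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    return tokens


class Parser:
    """Recursive-descent parser for this grammar:

    expr   := term (('+' | '-') term)*
    term   := factor (('*' | '/') factor)*
    factor := NUMBER | '-' factor | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "eof":
            raise SyntaxError("unexpected trailing input")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            node = (self.advance()[1], node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            node = (self.advance()[1], node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token[0] == "num":
            return token[1]
        if token == ("op", "-"):
            return ("neg", self.factor())
        if token == ("op", "("):
            node = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(node):
    """Interpreter: walk the syntax tree and compute its value."""
    if isinstance(node, float):
        return node
    if node[0] == "neg":
        return -evaluate(node[1])
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


def run(text):
    return evaluate(Parser(tokenize(text)).parse())
```

For instance, run("2 + 3 * (4 - 1)") evaluates to 11.0. A real scripting language adds variables, statements and function calls, but the three stages stay the same.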
Some terminology is in order. You may be talking about a domain-specific language.
There are two basic ways to transform a text file into an "executable": a compiler or an interpreter. An interpreter fits the scripting concept better, as it is easier to build and executes lines one at a time. Note that beyond a very simple language, writing either a decent parser or a decent interpreter is non-trivial. The classic work on interpreters is SICP, but this is quite a hard book for beginners.
Scott Hanselman mentioned in his latest Hanselminutes podcast that integrating IronPython to allow scripting of an existing application was very easy to do.
If you're interested in the end target of having your application be scriptable, then you should definitely consider using an existing language rather than attempting to write your own.
If you are more interested in the educational experience of writing your own scripting language, then you should go for it!
There's no need to create a new scripting language; there are several you can use directly in your product, e.g. Rhino, a widely used embeddable JavaScript engine (http://www.mozilla.org/rhino/), or JScript from MS.
I've gone the way that you are asking about - I once created my own Scheme interpreter. This worked really well, but we re-invented a lot of technology and didn't really get a lot of additional benefit. We would have been far better off just using one of the Scheme implementations that were available. I would not make that decision again, even though it was fun and successful.

Resources