Implementing a Text Editor

I know this question may be a bit involved, but I would like to know the basic skeleton of how to make a desktop text editor that one can use for coding. Very generally speaking, what tools should I use to display text to a window (how to display that window), and how to handle text (I think this is with a split buffer).
Not looking for any details, just a very broad and general skeleton of how this is done. I am thinking about working in Java or C++. Thanks!

I'm sorry people downvoted you without explaining why. I'm guessing people think your question isn't informed enough? In any case, I'll try to get you started. I don't know enough to answer your question directly, but I can show you how to answer it yourself, and you will probably learn a lot more that way than you would have from an answer here.
https://github.com/vim/vim/blob/master/src/README.txt is the readme for the Vim source code, which is all written in C. That is not exactly C++, but the better you are at C, the better you are at certain facets of C++. And if you look at the list of source files in the readme along with their short descriptions, you do get a kind of skeleton.
Notepad++ actually is written in C++, but I suspect the GUI overhead would make it significantly harder to trace. Still, if you want to look: https://github.com/notepad-plus-plus/notepad-plus-plus/tree/master/PowerEditor/src
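On the text-handling half of the question: the "split buffer" mentioned there is presumably what is usually called a gap buffer, where the text sits in one array with a block of free slots at the cursor. Below is a minimal sketch in Python (chosen only for brevity; the same idea carries over to Java or C++). The class and method names are made up for illustration and are not taken from Vim or Notepad++.

```python
class GapBuffer:
    """Text stored in one list with a movable gap at the cursor position."""

    def __init__(self, text="", gap_size=16):
        self._buf = list(text) + [None] * gap_size
        self._gap_start = len(text)      # cursor position in the text
        self._gap_end = len(self._buf)   # one past the last free slot

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def move_cursor(self, pos):
        """Slide the gap so that it starts at text position pos."""
        pos = max(0, min(pos, len(self)))
        while pos < self._gap_start:     # move the gap left
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while pos > self._gap_start:     # move the gap right
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def insert(self, text):
        """Insert text at the cursor; cheap because it only fills the gap."""
        for ch in text:
            if self._gap_start == self._gap_end:
                self._grow()
            self._buf[self._gap_start] = ch
            self._gap_start += 1

    def delete_before_cursor(self, count=1):
        """Backspace: widen the gap to the left."""
        self._gap_start = max(0, self._gap_start - count)

    def _grow(self):
        extra = max(16, len(self._buf))
        self._buf[self._gap_end:self._gap_end] = [None] * extra
        self._gap_end += extra

    def text(self):
        return "".join(self._buf[:self._gap_start] + self._buf[self._gap_end:])
```

Typing at the cursor only fills the gap, so it is cheap; the cost is paid when the cursor jumps far away and the gap has to be moved. Displaying the window is a separate concern that a GUI toolkit would handle.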

Related

How does someone even begin to code something like this? What are the ideas/thoughts behind this?

I recently came across this website, http://nkwiatek.com/, and it totally blew my mind. How does someone begin to program something like that smoky/fluid effect? Another thing that I can't even begin to conceptualize is a visualizer for a music program.
I only have two years of programming experience under my belt, but I believe I can see (well, at least I think I can) the vague ideas behind the code that goes into various programs and what those programs require. However, programs that create abstract visual renderings (for lack of better words), such as the site I linked to or visualizers, completely baffle me when I try to think of how something like that is done.
For an answer, I'm looking for a pretty high level definition of the program, but low enough that it includes coding concepts and ideas that I can further research.
Because this question isn't exactly as 'concrete' as some of the other questions on this site, an appropriate answer might include:
Thought process of the coder (what you imagine is happening in abstract visual code/high level definition of the code)
APIs
Pseudocode
Source code
Links to content that explains topics similar to this
However, these are just guidelines to the type of answer I'm looking for. Just keep in mind, I am not interested in that site alone, but more of the coding ideas and concepts behind the abstract visual programs. I hope I made sense of what I am confused/interested in. I will gladly clarify if anyone has questions on what I am asking. Thank you in advance for your replies!
Edit: To further define the ideas that I am interested in, here is an article on an interesting visual rendering: http://www.iquilezles.org/www/articles/warp/warp.htm

For the nkwiatek.com example, I would start like this:
Create a JavaScript function that makes characters follow the mouse. It could be, for example, a simple shape like this at first:
 OOO
OOOOO
OOOOO
 OOO
Once this is working, make it leave a trail and keep a reference to each character that has been added to the screen (this will be needed later).
Now make each generated character semi-random and use the previously mentioned references to constantly update the characters on screen. The farther a character is from the mouse, the smaller it should look: characters near the mouse could be "big" like AMHIJKL, characters farther away could be smaller like -~=, and the farthest could be dots like '.'.
This should already make a nice animation. After that, I think there's some function that makes everything move in a kind of wave. It seems to be based on the velocity of the mouse. Maybe there's a research paper on how to generate such an effect.
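The distance-to-character step can be sketched as below. This is Python rather than the JavaScript the site uses, and it is only a guess at the idea, not the site's actual code; the tier strings come from the examples above, while the radius values and function names are assumptions.

```python
import math
import random

# Character tiers, visually heaviest first (taken from the examples above).
TIERS = ["AMHIJKL", "-~=", "."]


def pick_char(px, py, mouse_x, mouse_y, radius=4.0, rng=random):
    """Return a semi-random character whose visual weight shrinks with distance."""
    distance = math.hypot(px - mouse_x, py - mouse_y)
    tier = min(int(distance // radius), len(TIERS) - 1)
    return rng.choice(TIERS[tier])


def render(width, height, mouse_x, mouse_y, max_distance=12.0):
    """Build one text frame: characters near the mouse, blanks elsewhere."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            if math.hypot(x - mouse_x, y - mouse_y) <= max_distance:
                row += pick_char(x, y, mouse_x, mouse_y)
            else:
                row += " "
        rows.append(row)
    return "\n".join(rows)
```

Calling render(40, 20, 20, 10) once per mouse move and printing the result gives a crude version of the "blob follows the cursor" stage; the trail and the wave motion would be layered on top of that.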

That is one amazing background.
How to start? Go to the web page and hit Ctrl+U. It's JavaScript, so the source is right there. From that... study. The guy's code looks pretty clear, but of course what he's DOING is complicated, so it will take some time. Time well spent, I'd think.
Higher-level things like what the guy was thinking... you'll know that after studying the code.

For what reasons do some programmers vehemently hate languages where whitespace matters (e.g. Python)? [closed]

C++ is my first language, and as such I'm used to whitespace being ignored. However, I've been toying around with Python, and I don't find it too hard to get used to the whitespace rules. It seems, however, that a lot of programmers on the Internet can't get past the whitespace rules. From what I've seen, people's C++ programs tend to be formatted very consistently with respect to whitespace (or else they're pretty hard to read), so why do some people have such a problem with whitespace-based languages like Python?

It violates the Principle of Least Astonishment, because we have it ingrained in ourselves (whether for good or bad) that whitespace Does Not Matter in a programming language. Whitespace is one of those issues that has been left up to personal style.
I still have bad memories from my student days of learning the hard way that 8 spaces are not equivalent to a tab in a Makefile... Ah, the sleep I lost...

The only valid reason I have come across is that refactoring using cut-and-paste (not copy) without refactoring tools (or syntax-aware cut-and-paste) can end up changing semantics if an easy mistake is made.
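A small illustration of that kind of mistake, using two made-up functions that differ only in the indentation of one line:

```python
def report(values):
    """Log a marker and the value, but only for negative values."""
    log = []
    for v in values:
        if v < 0:
            log.append("negative")
            log.append(v)
    return log


def report_after_paste(values):
    """Same lines, but the last append landed one level too shallow."""
    log = []
    for v in values:
        if v < 0:
            log.append("negative")
        log.append(v)  # now runs for every value, not just negative ones
    return log
```

Both versions are valid Python, so nothing complains; only the behaviour differs. With braces, the pasted line would still sit inside the same block no matter how it was indented.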

There are several different types of whitespace (spaces, tabs, weird Unicode characters, carriage returns, line breaks, etc.), they aren't necessarily visually distinct, and languages and editors may treat them capriciously. This isn't an argument against well-designed whitespace semantics, but many people are against all forms of it simply because of the possibility of poor design.
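As a concrete case, here is a small sketch of two Python sources that look identical in an editor showing tabs as eight columns, yet only one of them compiles under Python 3. The helper name try_compile is made up for the example.

```python
# Both bodies look the same when a tab is displayed as eight columns.
SPACES_ONLY = "if True:\n        x = 1\n        y = 2\n"
TAB_THEN_SPACES = "if True:\n\tx = 1\n        y = 2\n"


def try_compile(source):
    """Return 'ok', or the name of the exception Python raises for this source."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as exc:  # TabError and IndentationError are subclasses
        return type(exc).__name__


if __name__ == "__main__":
    print(try_compile(SPACES_ONLY))
    print(try_compile(TAB_THEN_SPACES))
```

The second source is rejected because one line in the block is indented with a tab and the next with spaces, which Python 3 treats as inconsistent rather than guessing a tab width.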

People hate it because it violates common sense. Not a single one of the replies I have read here decided that it was OK to simply forgo periods and other punctuation. In fact, the grammar has been very good. If the nonsense about indentation actually carrying the meaning were true, we would all just forget about using punctuation entirely.
No one learned that newlines terminate a sentence in a horizontal language like English; instead, we learned to infer when a sentence ended regardless of whether the punctuation was present.
The same is true for programming languages, especially for those of us who started out with a programming language that used explicit block termination. You learn to infer where a block starts and stops over time; that does not mean the spacing did it for you. The semantics of the language itself did.
Most literate people would have no problem understanding posts without punctuation. Having to rely on what is a representation of the absence of a character is not a good idea. Do any of you count from zero when you make your to-do list?

Alright, this is a very narrow perspective, but I haven't seen it mentioned elsewhere: keeping track of whitespace is a hassle if you are trying to autogenerate a script.
I don't remember the details, but around the time I first encountered Python, I had developed a Windows tool with a GUI that allowed novice users to configure several settings and then press OK. The output of the tool was a script, which the user could copy to a Unix machine and execute there to do something or other that was too complicated or tedious to do manually. Since nobody maintained the generated scripts, there was no reason they needed to look nice. So, from that perspective, keeping track of indentation seemed like an unnecessary burden.
For most purposes, though, I find that Python is much easier than any other language.
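For what it's worth, the bookkeeping a generator needs can be kept fairly small. Below is a minimal sketch, not the tool described above: the ScriptWriter class, the generate function, and the shape of the settings dict are all invented for illustration.

```python
class ScriptWriter:
    """Collects lines of generated Python and tracks the current indent level."""

    def __init__(self, indent="    "):
        self._indent = indent
        self._level = 0
        self._lines = []

    def line(self, text):
        self._lines.append(self._indent * self._level + text)

    def begin_block(self, header):
        """Emit a header such as 'if value:' and indent what follows."""
        self.line(header)
        self._level += 1

    def end_block(self):
        if self._level == 0:
            raise ValueError("no open block to close")
        self._level -= 1

    def source(self):
        return "\n".join(self._lines) + "\n"


def generate(settings):
    """Turn a dict of on/off settings (hypothetical GUI output) into a script."""
    w = ScriptWriter()
    w.line("results = []")
    w.begin_block("for name, value in %r:" % sorted(settings.items()))
    w.begin_block("if value:")
    w.line("results.append(name)")
    w.end_block()
    w.end_block()
    return w.source()
```

The generator still has to pair every begin_block with an end_block, which is exactly the tracking a brace-based target language would let it skip.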

Perhaps your C++ background (and thus who your peers are) is clouding your perception of this (i.e., selective sampling), but in my experience the reaction to Python's "white space is intent" meme ranges from ambivalence to absolute love. The reason a lot of people love it is that it forces people to format their code.
I can't say I've ever met anyone who "hates" it because hating it is much like hating the idea of well-formatted code.
Edit: let me put this in some perspective.
In the Java world there are two main methods of packaging and deploying Web apps: Ant and Maven.
Ant is basically an XML-based Make facility that has tasks for the common things you do. It's a blank slate, which is powerful, but it also means you have to write a lot of common things yourself and every installation is free to do things slightly differently. All of this is well-intentioned but can make it hard to figure out someone's Ant scripts.
Maven is far more fully featured. It has archetypes, which are basically project types. Depending on which archetype(s) you use, you won't have to write any tasks to start, stop, clean, build, etc., but you will have a mandated directory structure, which is quite deep.
The advantage of that is if you've seen one Maven Web app you've seen them all. You know the commands. You know the structure. That's extremely useful.
But you have people who absolutely hate Maven and I think it comes down to this: they don't like giving up control, even when it's ultimately in their interest to do so. Also, you'll find a certain brand of person who thinks that their use case is a justifiable exception. You see this personality trait a lot. For example, I think an old Joel post mentioned a story where someone wanted to use "enter" to go from the username to password form fields even though the convention was that enter executed the default action (usually "OK") so they had to write a custom dialog class for Windows for this.
Basically some people just don't like being told what to do and others are completely obstinate in their belief that they're right even when all evidence points to the contrary.
This probably explains why some supposedly hate Python's white space: they don't like being told how to format their code. They like the freedom of C/C++.

Because change is scary. And maybe, among certain developers, there are some faint memories of languages with capricious rules about whitespacing that were hard to remember and arbitrary, meant more for compiler convenience than expressiveness.
Most likely, not giving whitespace-significance a fair shake before dismissing it is the real reason. Ask someone to fix a bug in a reasonably complex but well-written Python program, then ask them to go fix a bug in a 20 year old system in C, VB or Cobol and ask them which they prefer.
As for me, I have as much trouble with whitespace in Python or Boo as I have with parentheses in Lisp. Which is to say, none.

They will have to get used to it. Initially I had a problem myself trying to read some examples, but after using the language for some time I started liking it.
I believe it is a habit that people have to overcome.

Some have developed habits (for example: deeply nested loops, unnecessarily large functions) that they perceive would be hard to support in a whitespace sensitive language.
Some have developed an aesthetic dislike for hanging indents.

Because they are used to languages like C and JavaScript where they can align items as they please.
When it comes to Python, you have to indent code based on its context:
def Print():
    ManyArgumentFunction(LongParam1,LongParam2,LongParam3,LongParam4...
In C, you could do:
void Print()
{
    ManyArgumentFunction(LongParam1,
                         LongParam2,
                         LongParam3,...
}
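One caveat on the comparison: Python's indentation rules apply to the start of a logical line. Inside parentheses, brackets, or braces, lines are joined implicitly, so continuation lines can be aligned in much the same way as in C. A minimal sketch, with many_argument_function as a hypothetical stand-in for the function in the example:

```python
def many_argument_function(*params):
    """Hypothetical stand-in for ManyArgumentFunction; just returns its arguments."""
    return params


def print_args():
    # Inside the parentheses, continuation lines may be indented freely.
    return many_argument_function("LongParam1",
                                  "LongParam2",
                                  "LongParam3",
                                  "LongParam4")
```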

The only complaints I (also of C++ background) have heard about Python are from people who don't like using the "Replace Tabs with Space" option in their IDE.

An Emacs alternative which exposes a text model to a scriptable environment?

Hello everybody out there!
I'm currently looking for a technical solution to create a nice literate programming environment. Unfortunately, most editors are too hard-coded: their functionality covers only the most common needs and can't cleanly cover special ones.
I came to Emacs (after trying some others), but I also ran into numerous troubles with Emacs (I will not talk about these; this is not the topic).
However, there is one thing I like about Emacs, which is exactly what I was looking for: it exposes a full text model to a scriptable environment, and the overall UI is designed so that it is well suited to either graphical or text UIs (because it is mostly text based). Last but not least, it is scriptable with a kind of LISP, and LISP seems a good choice to me in the area of text manipulation and interpretation.
I've searched the web for a text editor which would expose a full text model to a scriptable environment, but I have not found anything. I guess this is not an everyday request on the web, so it is probably better to ask some humans about it than to ask a robot.
That was long, so in short, I'm looking for: an editor which exposes a full text model [*], and which exposes this model to a script engine (preferably LISP, but I would enjoy Python as well, or any other after all).
[*] Talking about text model, I mean: text attributes (optionally font face), text visibility, text read-write property, and text content iteration, at a level as low as the character basis.
Have a nice day! :)
--
Yannick Duchêne
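To make the footnote concrete, here is a minimal sketch in Python of the kind of text model meant: per-character attributes, visibility, a read-write flag, and character-level iteration. The names CharProps and TextModel are invented for illustration and do not come from Emacs or any other editor.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CharProps:
    """Properties attached to a single character."""
    bold: bool = False
    visible: bool = True
    read_only: bool = False


class TextModel:
    """Text as a sequence of (character, properties) pairs."""

    def __init__(self, text=""):
        self._chars = [(ch, CharProps()) for ch in text]

    def set_props(self, start, end, **changes):
        """Change properties on the half-open range [start, end)."""
        for i in range(start, end):
            ch, props = self._chars[i]
            self._chars[i] = (ch, replace(props, **changes))

    def replace_char(self, index, new_char):
        ch, props = self._chars[index]
        if props.read_only:
            raise PermissionError("character %d is read-only" % index)
        self._chars[index] = (new_char, props)

    def __iter__(self):
        """Iterate character by character, yielding (char, props) pairs."""
        return iter(self._chars)

    def visible_text(self):
        return "".join(ch for ch, props in self._chars if props.visible)
```

A script engine sitting on top of something like this could hide regions, lock them, or restyle them one character at a time, which is the level of control being asked for.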

JEdit seems to be very scriptable with Java, BeanShell, Jython and other languages targeting the JVM. Most of its functionality is implemented with OSGI plugins. If you really like LISP, maybe you could even try with Clojure! :-)

Emacs, Climacs, Portable Hemlock (and to some extent Hemlock).
I am sure there are other editors around that expose a full text model to a script engine and are NOT in "the Emacs family", but I don't know them.
Oh, yes, there's the VMS editor framework, but I cannot recall its name.

What Vatine said, plus there's a very minimal Scheme editor built into Fluxus, which I extended with Emacs key-bindings (in my personal copy), so I know it would work as something close to a stubbed implementation (if you rip out all the OpenGL stuff).
Edit:
Looks like I was working with fluxus-0.8, which doesn't even seem to be on the site anymore. If you end up needing to go that low-level to start, let me know and I'll send it over.

Not sure if this is useful, but there is a long list of Emacs-like editors: http://www.finseth.com/emacs.html
Btw., Craig A. Finseth also wrote a book on implementing an Emacs-like editor: http://www.finseth.com/craft/
The Book as PDF.

Report of an (unsuccessful) quest:
Although the one possible technical choice I found will not work for me (see below), I still mention it here in case it is useful to someone running a UNIX-like system (I'm running Windows).
Context and state of the "art": nearly all (or all) of the so-called Emacsen and Emacs clones have nothing to compare with Emacs. They just mimic terms like major mode and minor mode, mimic key-bindings, and most of the time also the UI look and feel. But the core is not there. I've learned these are called "Emacs Ersatz".
Disclaimer: for some reasons, I have not tested Climacs and Hemlock, so the above comment does not apply to these.
EFuns: the last one I came to was EFuns, but unfortunately I could not compile it on Windows (I suspect something is wrong with the sources; some directories are missing from the archive). Interested parties may get it here: EFuns, an Emacs-like editor scripted in OCaml. Fortunately for UNIX-like users, binaries are provided (not for Windows).
Implementations list: to complete the list Rainer Joswig pointed to, here is another one, shorter but more up-to-date: [ Sorry I can't post this link, it seems I'm not allowed to post more than one link - I'm sorry for interested parties (sad) ]

Resources for learning a new language quickly?

The title may seem slightly self-contradictory, and I accept that you can't really learn a language quickly. However, an experienced programmer who already has knowledge of a few languages and different styles (functional, OO, imperative, etc.) often wants to get started quickly. I've seen a few websites doing effective "translations" in the form of "just show me syntax equivalence". I can't remember the sites now, but for related languages (e.g. Perl/PHP) it's quite common.
Is there a better resource that covers more languages? Is there a resource that covers idioms as well as syntax? I think this would be incredibly useful for doing small amounts of work on existing code bases where you are not familiar with the language. Looking at the existing code, as we know, is not always a good indicator of quality. Likewise, for a "learn by doing" weekend project I always have the urge to write reasonably idiomatic, clean code from the start. Such a resource could also link to known good example projects of varying sizes for those who prefer to learn by reading. Reading a well-written medium-sized code base can also be much more practical when access to development environments might be limited.
I think it's possible to find tutorials and summaries for individual languages that provide some of this functionality in disparate web locations but I'm hoping there is a good, centralised, comparative place that the busy programmer can turn to.

You generally have two main things to overcome:
Syntax
Reference
Syntax you can pick up fairly quickly with a language tutorial and a stack of sample code.
Reference (library/API calls) you need to find a proper guide to; perhaps the language reference, or perhaps Google...
With those two in place, following a walkthrough (to get you used to using the development environment) will have you pretty much ready - you'll be able to look up what you want to say (reference), and know how to say it (syntax).
This, of course, applies principally to procedural/oop languages; languages that require a paradigm switch (ML/Haskell) you should go to lectures for ;)
(and for the weirder moments, there's SO!)

In the past, my favourite approach has been "learning by doing". Say I know a little bit of C++ and a lot of C#.Net, but I must write an FTP tool in Python.
So I sit down for an hour and study the syntax differences with a tutorial, then I develop the form itself and look at the generated code. Then I find an open source Python FTP client and take pieces of code from it (not by copy and paste: type it out yourself to see, feel and remember the code!).
After a few hours I get it.
So: the mix is best. A book, a piece of good code, the willingness to learn and a free night with lots of coffee.

At the risk of sounding cheesy, I would start with the language's website tutorial and/or FAQ, followed by asking more specific questions here. SO is my centralized location for programming knowledge.
I remember when I learned Perl. I was asked to modify some Perl code at work and I'd never seen the language before. I had experience with several other languages, however, so it wasn't hard to figure out the syntax with the online Perl docs in one window and the code in another, side-by-side. I don't know that solely reading existing code is necessarily the best way to learn. In my case, I didn't know Perl but I could tell that the person who originally wrote the code didn't know Perl either. I'm not sure I could've distinguished between good Perl and really confusing Perl. It would've been nice to be able to ask questions here at the time.

Language isn't important. What is important is learning your way around designing algorithms and the proper application of design patterns. Focus on the technique, not the language that implements a certain technique. Once you understand the proper development techniques, any programming language will become really easy, no matter how obscure it is...
When you put a focus on a language, you're restricting your own knowledge.

http://devcheatsheet.com/ seems to be a step in the right direction: it aggregates cheat sheets/quick references and they are (somewhat) manually reviewed. It's also wide-ranging. It still comes up short a bit in terms of "idiomatic" quick reference: for example, the page on Ruby doesn't mention yield.

Rosetta Code appears to be an excellent resource that includes hints on coding idiomatically and moves from simple (like for-loops) to things like drawing. I haven't checked out how comprehensive it is, but there are a large number of languages and tasks listed. The drawbacks re: original question are:
Some of the linking is not accurate (navigating Python->ForLoop will take you to the top of the ForLoop page, not the Python section). It's a wiki, so this can be improved.
Ideally you could "slice" the wiki however you chose, to see e.g. the top 20 tasks for two languages side-by-side.

http://hyperpolyglot.org/ seems to be an almost perfect match for what I was looking for. The quality is not always there, or idiom can be lacking, but it has the same intention and is pretty comprehensive.

Understanding a Large, Undocumented Set of Source Code? [closed]

I have always been astonished by Wine. Sometimes I want to hack on it, fix little things and generally understand how it works. So, I download the Wine source code and right after that I feel overwhelmed. The codebase is huge and - unlike the Linux Kernel - there are almost no guides about the code.
What are the best practices for understanding such a huge codebase?

With a complex code base the biggest mistake you can make is trying to be a computer. Get the computer to run the code, and use a debugger to help find out what is going on.
Figure out how to compile, install and run your own version of Wine from the existing source code.
Learn how to debug (e.g. use gdb) a running instance of your version of Wine.
Run Wine under the debugger and make it demonstrate the undesired behaviour.
The fun part: find where the code execution path goes and start learning how it all goes together.
Yes, reading lots and lots of code will help, but the compiler/debugger/computer can run code a lot faster than you.

A professor once told us to compare such a situation with climbing a mountain. You might be listening to someone who did this and tells you what it's like to look out into the country. And you believe without hesitation that that's a spectacular sight.
However, you have to start climbing yourself to really understand what the view from the top is like.
And it's not that important to climb all the way to the top. It might be perfectly sufficient just to reach a fair height above ground level.
But don't ever be afraid to start climbing. The view is always worth the effort.
This has always been a nice analogy for me. I know this question was more about specific tips on how to efficiently deal with code bases once you started climbing. But nevertheless it instantly reminded me of our physics classes way back then.

(This is an answer I posted to a question a while back. I modified it a bit to fit this question.)
Experience has shown me that there are 3 major goals you have when learning a legacy system:
Learn what the code is supposed to do.
Learn how it does it.
(crucially) Learn why it does it the way it does.
All three of those parts are very important, and there are a few tricks to help you get started.
First, resist the temptation to just ctrl-click (or whatever your IDE uses) your way around the code to understand everything. You probably won't be able to keep everything in perspective this way, especially when each line forces you to look at multiple other classes to understand what it does, which means holding several levels of the stack in your head at once.
Read documentation where possible; it usually helps you quickly gain a mental framework upon which to build everything that follows.
Run test cases where possible.
Don't be afraid to ask someone who knows if you have a question. Granted, you shouldn't waste others' time with inane queries, but if there's something that you simply don't understand (this is especially true with more conceptual questions like, "Wouldn't it make much more sense to implement this as a ___" or something), it's probably worth finding out the answer before you mess something up and don't know why.
When you do finally get down to reading the code, start at a logical "main" place and go from there. Don't just read the code top to bottom, or in alphabetical order, or anything (this is probably obvious).

The best way to get acquainted with a large codebase is to dive in. Many projects have a list of easy tasks that need to be done, and they're usually reserved to help ease people in. You should find and work on some of these; you'll learn a lot about the general code outline and structure, contribute to the project, and get an easy payoff that will help encourage you to take on larger tasks.
Like most projects, WINE has good resources available to its developers; IRC, wiki, mailing list, and guides/overviews. With most daunting codebases, it's not so scary after the first few fixes. WINE is truly large, and much like the kernel, I doubt there's any expert in all systems; don't feel like you need to be either. Start working on something that matters to you and take it from there.
I've started a few patches to WINE myself, and it's a good community with a good structure. There are lots of very helpful debug messages, and it's a really cool project to work on, which helps you stick with it longer too.
We all appreciate your valor and willingness to help with WINE (it needs it). Thanks, and good luck.

Dig in. Think of a question you'd like to have answered, and try to find the answer. When you get tired of reading code, go read the dev mailing list, the developer's guide, or the wiki.
Unfortunately, there's no royal road to understanding a large code base. If you enjoy that sort of thing (I do) you're in for some fun. If not, guide books won't really help, so you aren't really that much worse off.

Look for one particular feature you are interested in improving. Search for its implementation. Once you have found it, pull on that straw and all the rest will follow.

The best way is through comments.
I'm being ironic, but as you come to understand tiny bits of the beast, add comments so you can retrace your trail.
The other developers will also appreciate it if you add the missing guides to the code.

Try to implement some tiny little change in the code, something that will be visible to you. That might be figuring out a workable way to output debugging statements (and figuring out where the output appears), it might be changing the default size of windows or desktop color, or something. Once you can make something happen in the codebase, you've scratched the surface of understanding and can begin to move on toward more complicated things. At that point, select a goal of something slightly more useful that you'd like the code to do, and implement that. Or check out the project's bug tracker and look for something small to start with.
Document as you go, and write unit tests as you go, and refactor as you go. When you figure out what a routine does, comment it!!

As others have suggested, dig in! Read all the available documentation you can absorb. Then see if you can find other people who are interested or knowledgeable and learn with/from them. It helps to have people to bounce ideas off of and ask questions.
For C source code, once you get a feel for what areas of the code you'd like to work on, generate ctags and cscope databases for that code. These tools make it a lot easier to jump around and understand the code. Many text editors (one example is gvim) have support for ctags and cscope so you can jump around easily.
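To give a feel for what a tags database buys you, here is a toy sketch in Python of the underlying idea: map each function name to the file and line where it is defined. This is not how ctags or cscope are implemented; the regular expression is deliberately simplistic (it assumes definitions start in column 0 and fit on one line), and the names FUNC_DEF and build_tags are made up.

```python
import re

# Simplified pattern: one or more type words (with optional '*'), a name,
# an opening parenthesis, and no semicolon on the line (so prototypes and
# ordinary statements are skipped). Real C parsing handles far more than this.
FUNC_DEF = re.compile(r"^(?:[A-Za-z_]\w*[\s*]+)+([A-Za-z_]\w*)\s*\([^;]*$")


def build_tags(sources):
    """Map function name -> list of (filename, line number) definition sites.

    sources is a dict of filename -> file contents.
    """
    tags = {}
    for filename, text in sources.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            match = FUNC_DEF.match(line)
            if match:
                tags.setdefault(match.group(1), []).append((filename, line_no))
    return tags
```

An editor with tags support does the lookup in the other direction: with the cursor on a name, it consults the index and jumps to the recorded file and line, which is what makes hopping around an unfamiliar codebase quick.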

(warning: shameless marketing ahead)
For Java developers using Eclipse, there's nWire. It is an Eclipse plugin for navigating and visualizing large codebases.

A good way to understand a large system is to break it down into its constituent parts and focus on specific paths through the application.
Your debugger is your friend here: set a breakpoint in the thread you want to investigate, then step through it line by line, looking at what each part does... hope that helps...
