DirectShow Editing Services 4GB memory limit under Windows 7 x64

I compiled XTLTest as 64-bit and tested some XTLs under Windows 7 x64.
All these tests were done using an XTL with one clip from the WMV showcase, with a timeline sized at 1440x1080.
Buffering set to 300: plays back fine.
Buffering set to 600: got a "can't run graph" error. Recompiled as large-address-aware (which should be the default for 64-bit apps): same thing.
Buffering set to 310: worked fine.
Tried playing 2 different instances of the 64-bit XTLTest at the same time with 310 buffering: the second one fails with "can't run graph".
Buffering set to 80: I was able to play 4 instances of XTLTest using a combined 4GB of memory. Starting any more instances gives "can't run graph".
Compiled a .NET application targeting Any CPU using DirectShowLib, and confirmed it runs as a native 64-bit app. I was able to load 4 XTLs at 80 buffering until I got
System.Runtime.InteropServices.COMException (0x8007000E): Not enough storage is available to complete this operation.
So I can only conclude that the DES subsystem has a 4GB memory limit for all applications combined.
Is this true? If so, is this a DES limit or a DirectShow limit, and is there any way to work around it?
best,
Tuviah Snyder
Lead programmer, MediaWan
Solid State Logic, Inc

I haven't worked with DES directly before, but my impression has always been that it was deprecated quite a long time ago. The COM objects that it is made up of are likely 32-bit.

How do you increase NodeJS memory limit on Windows?

I need to run npm install with 8GB memory limit on NodeJS 14 on Windows 10 (with 16GB RAM).
I have tried the following to no avail:
node --max-old-space-size=8192 "C:\Program Files\nodejs\npm" install - unexpected behavior but still seems to be at 4GB after using CTRL + C
set NODE_OPTIONS=--max-old-space-size=8192&&npm install - still 4GB
Adding NODE_OPTIONS environment variable (to both User and System variables) - still 4GB
Related questions:
Where do I set 'NODE_OPTIONS="--max-old-space-size=2048"'
npm install - javascript heap out of memory

TROUBLESHOOTING THE ISSUE
Nothing you have shown so far leads me to believe that the issue is caused by the heap's max-size environment variable, nor have you shown any evidence that the variable isn't being assigned the value you are trying to assign to it. I am not saying the environment variable isn't the cause. I am saying that more troubleshooting is needed to know for certain what the underlying cause of your issue is.
Personally, from what you have shown the community so far, I believe the problem is running out of memory somewhere other than the heap. If you're using a typical PC with Windows 10, it's probably your machine that doesn't have enough memory: not in general, but at the moment you attempt the download.
This is why I think you may be running out of memory:
My machine has:
16GB of RAM,
4x HDD @ 2TB each, for a total of 8TB of storage
an 8-thread quad-core Intel i7-6700 (6th gen)
ASUS motherboard with a Z190 chipset.
My Operating Systems
I use Linux, but I got Windows for free because it was cheaper for me to buy a new PC and upgrade it than to buy the parts separately. I never boot to Windows, so it's always sitting as an untouched fresh install.
I just upgraded to Fedora Workstation 36.
HD #1 (2TB): Windows 10
HD #2 (2TB): Fedora Workstation
HD #3 & #4 (2TBx2 for 4TB total): Local Storage
TESTING
I ran a performance test using a Node program that runs child processes on other threads, and the Firefox browser with up to 70 tabs open, which I would attempt to refresh all at once.
The results were staggering.
I won't go into detail, but the important thing to note is that I have 16GB of DDR4 RAM.
When Windows would need 15+GB of memory, Linux would only be using about 10+GB. That is a 37.5% - 42.5% gain.
However, I have Windows Pro, which allowed me to disable much of the telemetry, and I found that with telemetry and some other features disabled, Windows was much more performant. When Windows would use 15+GB, Linux would be at 12+GB; that's still around a 20-25% memory gain, though.
One important thing to note, though: sometimes Linux would freeze up on me. Windows was more robust at recovering when I overloaded the system, but it took quite a bit less to bog Windows down.
The point that I am making:
...is that you're only using 8GB, and that's not very much nowadays.
TBH, I feel like the 16GB I have is far too little; I plan on upgrading soon.
STEP #1 — Troubleshooting the Issue Further
When troubleshooting, always check that your syntax and semantics are correct, so you can rule out a silly typo as the cause.
"I know what you're thinking..."
"It's not a typo!",
But as it turns out, there is a typo: the code snippet in your question uses the Node command-line flag syntax as the syntax for the Node environment variable. This is incorrect, as they use different sets of characters. At first glance they look identical, but if you look at the 2-part MD table I created below, you'll see there is actually a pretty big difference between the two.
| SEMANTICS            | SYNTAX                 | DIFFERENCE  |
|----------------------|------------------------|-------------|
| Command-line Flag    | "--max-old-space-size" | DASHES      |
| Environment Variable | "--max_old_space_size" | UNDERSCORES |
"For those with less-than-perfect vision, the difference is that the Environment Variable is typed using underscores, and the other is typed using dashes"
STEP #2 — TEST LOG "MAX-OLD-SPACE" POST-CONFIG
"If you used the right syntax and you are still having an issue, you can check that the max size of the heap is actually being configured to the value you are assigning to it. You can log the configured V8 max heap size to the console by completing the steps below."
TO START: Create a completely empty JavaScript file in an environment where you can run it using the node command — name the file test.js.
Add the code below to test.js, and don't add anything else.
/**
 * @file "./test.js"
 * @desc "PRINTS: The upper memory limit for V8's HEAP"
 */
// heap_size_limit is reported in bytes.
const maxHeapSz = require('v8').getHeapStatistics().heap_size_limit;
// Convert bytes to GB (1024^3 bytes), rounded to one decimal place.
const maxHeapSz_GB = (maxHeapSz / 1024 ** 3).toFixed(1);
console.log('--------------------------');
console.log(`${maxHeapSz_GB}GB`);
STEP #3 — Run the Following Commands
~/$ node --max-old-space-size=8192 test.js; node --max-old-space-size=4096 test.js; node --max-old-space-size=2048 test.js; node --max-old-space-size=1024 test.js; node --max-old-space-size=512 test.js
It would be very strange if you produced a different result than what the editor in the image (my editor) produced, and here is why:
The command above configures the "Max_Old_Space" Env Var. The program that I asked you to copy and paste into the file "test.js" prints the result of each command's environment configuration change. It shows that every time you set the Env Var to something different, the Env Var is different at run time. It's important to note that it's a variable, an "Environment Variable": it doesn't actually change the size of the heap. The value it represents is just saying:
_"I the developer approve of the heap growing to the size that is set in the Environment Variable: "MAX_OLD_SPACE_SIZE"
...which does not mean Node is actually able to reach such sizes. You could set it to 10e1000 MB if you wanted, and it would most likely accept that number as a valid configuration. I tested all kinds of values, and there is no doubt that you can set it to several hundred times the amount of RAM you actually have. In other words, it's a variable (i.e. an address to a location in memory that holds a binary value), so there shouldn't be any issue setting it. It is far more likely that your machine is unable to allocate the amount of space needed. 8GB of RAM, when you only have 16GB, can be a large amount for some systems to offer to a single process, especially if you're running Windows or macOS. Linux is king in resource-intensive situations.
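A minimal sketch of the distinction drawn above: the configured limit is a ceiling, not memory that has already been handed to the process. It relies only on `v8.getHeapStatistics()` from Node's standard library; the helper name `heapReport` is made up for this illustration.

```javascript
const v8 = require('v8');

// Hypothetical helper: summarises V8's heap figures in MiB.
function heapReport() {
  const s = v8.getHeapStatistics();
  const toMiB = (bytes) => Math.round(bytes / 1024 ** 2);
  return {
    limitMiB: toMiB(s.heap_size_limit), // ceiling the heap is allowed to grow to
    totalMiB: toMiB(s.total_heap_size), // heap memory currently reserved by V8
    usedMiB: toMiB(s.used_heap_size),   // heap memory currently in use
  };
}

console.log(heapReport());
```

Running it with different `--max-old-space-size` values should change `limitMiB` while `totalMiB` and `usedMiB` stay small, since a tiny script never needs the space it has been permitted to use.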
STEP #4 — Let's See What Your System Has to Offer
"I assume you're on Windows from the image you uploaded into your question; therefore, I will continue to answer this question for Windows only. If you need help with a different platform, please let me know by editing your question."
Check your OS performance tool and see what the memory performance chart looks like for your machine. If 8GB isn't available, it won't matter how high you set the heap limit. This is a very probable case, because you can set the heap size limit to more RAM than you physically have, and just because you have a 16GB stick doesn't mean half of it isn't being used; you might find that your machine is using 7-10GB for background processes. Windows is hard on system resources; that's why people use Linux/Unix for launching applications."_
Perform the Following Steps
Press the following keys: CTRL + ALT + DEL
Click on “Task Manager”
Click on the “Performance” tab and check the section titled “Memory”
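As a rough cross-check of the Task Manager reading, the same kind of figures can be read from Node with the standard `os` module. This is only a sketch: `systemMemory` is a hypothetical helper name, and what the operating system counts as "free" may not match the Task Manager chart exactly.

```javascript
const os = require('os');

// Hypothetical helper: system-wide memory figures as seen by Node.
function systemMemory() {
  const total = os.totalmem(); // bytes of physical memory
  const free = os.freemem();   // bytes the OS reports as free
  return {
    totalGiB: total / 1024 ** 3,
    freeGiB: free / 1024 ** 3,
    usedPercent: (1 - free / total) * 100,
  };
}

const mem = systemMemory();
console.log(
  `total ${mem.totalGiB.toFixed(1)} GiB, ` +
  `free ${mem.freeGiB.toFixed(1)} GiB, ` +
  `in use ${mem.usedPercent.toFixed(0)}%`
);
```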
If the performance tool shows that your memory use is above 45%, then that is your problem. A quick solution to not having enough memory is simply to add more: use additional sticks of memory if your machine has room, upgrade to sticks with more memory, upgrade your entire machine, switch to a VM in the cloud with enough memory, upgrade your VM if you're already using one, etc. There are endless ways to go about increasing memory, but all of them require money. It would therefore be best to make the most of your current memory, which includes some of the following options:
Eliminating background processes.
Changing your platform to something less resource-demanding. If you want to keep it GUI-based, use Ubuntu 21.04; Mint is a good alternative if you're used to Windows.
If you really want to conserve as many resources as possible and spend most of your memory on your application/program/server, you will want a command-line-only system such as Debian, Red Hat, or Ubuntu. Make sure to choose an LTS version: unlike with programming languages, non-LTS versions don't give you many extra goodies, while LTS is extremely important for security.
Lastly, this is obvious, but if it is possible, don't require downloads that need more memory than you can give.
"If this question is still not resolved at this point, I will need a bit more info from you, such as the error messages you receive, and a more in-depth explanation of what is actually happening. For instance, a more in-depth explanation of the statement below would be very helpful, as would the screenshot asked for above."
What does this mean? What was unexpected?
"unexpected behavior but seems to still be 4GB after hitting CTRL + C"

NODE_OPTIONS should work, especially as an environment variable. However, some of the options should use underscores instead of hyphens. Try --max_old_space_size=8192 in the NODE_OPTIONS environment variable instead. Here is a link to support that.
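One way to check whether a given NODE_OPTIONS value actually takes effect is to start a child Node process with it and ask that child for its heap limit. The sketch below does this for both spellings; Node's CLI documentation describes dashes and underscores as interchangeable separators in option names, so it tries each rather than assuming which one is needed. `heapLimitWith` is a hypothetical helper name, and the small 512 MB value is just an arbitrary test value.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: runs a child Node process with NODE_OPTIONS set to
// `nodeOptions` and returns the heap_size_limit that child reports, in MiB.
function heapLimitWith(nodeOptions) {
  const script =
    "console.log(require('v8').getHeapStatistics().heap_size_limit)";
  const out = execFileSync(process.execPath, ['-e', script], {
    env: { ...process.env, NODE_OPTIONS: nodeOptions },
    encoding: 'utf8',
  });
  return Number(out.trim()) / 1024 ** 2;
}

// Try both spellings and report what the child process actually saw.
for (const opt of ['--max-old-space-size=512', '--max_old_space_size=512']) {
  try {
    console.log(opt, '->', Math.round(heapLimitWith(opt)), 'MiB');
  } catch (err) {
    console.log(opt, '-> rejected by this Node version');
  }
}
```

If the printed limit follows the value in NODE_OPTIONS, the variable is being honoured and the 4GB ceiling is coming from somewhere else, for example from how npm launches its own child processes.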

Why does VK_PRESENT_MODE_FIFO_KHR cause catastrophic performance issues in Ubuntu MATE?

I am implementing a simple Vulkan renderer according to a popular Vulkan tutorial (https://vulkan-tutorial.com/Introduction), and I've run into an interesting issue with the presentation mode and the desktop environment performance.
I wrote the triangle demo on Windows, and it performed well; however, I ported it to my Ubuntu installation (running MATE 1.20.1) and discovered a curious problem with the performance of the entire desktop environment while running it; certain swapchain presentation modes seem to wreak utter havoc with the desktop environment.
When setting up a Vulkan swapchain with presentMode set to VK_PRESENT_MODE_FIFO_KHR and subsequently running the application, the entire desktop environment grinds to a halt whenever any window is dragged. When literally any window on the entire desktop is dragged, the entire desktop environment slows to a crawl, appearing to run at roughly 4-5 fps. However, when I replace the presentMode with VK_PRESENT_MODE_IMMEDIATE_KHR, the desktop environment is immune to this issue and does not suffer the performance issues when dragging windows.
When I researched this before asking here, I saw that several people discovered that they experienced this behavior when their application was delivering frames as fast as possible (not vsync'd), and that properly synchronizing with vsync resolved this stuttering. However, in my case, it's the opposite; when I use VK_PRESENT_MODE_IMMEDIATE_KHR, i.e., not waiting for vsync, the dragging performance is smooth, and when I synchronize with vsync with VK_PRESENT_MODE_FIFO_KHR, it stutters.
VK_PRESENT_MODE_FIFO_RELAXED_KHR produces the same (catastrophic) results as the standard FIFO mode.
I tried using the Compton GPU compositor instead of Compiz; the effect was still there (regardless of what window was being dragged, the desktop still became extremely slow) but was slightly less pronounced than when using Compiz.
I have fully implemented the VkSemaphore-based frame/image/swapchain synchronization scheme as defined in the tutorial, and I verified that while using VK_PRESENT_MODE_FIFO_KHR the application is only rendering frames at the target 60 frames per second. (When using IMMEDIATE, it runs at 7,700 fps.)
Most interestingly, when I measured the frametimes (using glfwGetTime()), during the periods when the window is being dragged, the frametime is extremely short. The screenshot shows this; you can see the extremely short/abnormal frame time when a window is being dragged, and then the "typical" frametime (locked to 60fps) while the window is still.
In addition, only while using VK_PRESENT_MODE_FIFO_KHR, while this extreme performance degradation is being observed, Xorg pegs the CPU to 100% on one core, while the running Vulkan application uses a significant amount of CPU time as well (73%) as shown in the screenshot below. This spike is only observed while dragging windows in the desktop environment, and is not observed at all if VK_PRESENT_MODE_IMMEDIATE_KHR is used.
I am curious if anyone else has experienced this and if there is a known fix for this window behavior.
System info: Ubuntu 18.04, MATE 1.20.1 w/ Compiz, Nvidia proprietary drivers.
Edit: This Reddit thread seems to have a similar description of an issue; the VK_PRESENT_MODE_FIFO_KHR causing extreme desktop performance issues under Nvidia proprietary drivers.
Edit 2: This bug can be easily reproduced using vkcube from vulkan-tools. Compare the desktop performance of vkcube using --present-mode 0 vs --present-mode 2.

Code working on windows but launch failures on Linux

First and foremost: I am completely unable to create an MCVE, as I can only reproduce this when running the full code; any attempt to measure or replicate the error in a simpler environment makes it disappear. TL;DR: I suspect it's not a code problem, but a configuration problem.
I have a piece of code for some mathematics on kernels in CUDA. I have a Windows machine (Win10 x64, GTX 1050, CUDA 9.2) and an Ubuntu 17.04 machine (2x GTX 1080 Ti, CUDA 9.1).
My code runs well on the Windows machine. It is long-running (~700ms per kernel call for big samples), so I needed to increase the TDR value in Windows. The code also (for now) forces it to run on 1 GPU, the first one, selected with cudaSetDevice(0).
When I copy the same input data and code to the Linux machine (I am using git, it is the same code), I get either
an illegal memory access was encountered
or
unspecified launch failure
in my error checking after the GPU call.
If I change the kernel so that, instead of doing the math, it just writes a number to the output, the kernel executes properly. Other CUDA code (different functions that I have) works fine too. All this leads me to think that there is a problem outside the code, not with the code itself, nor with the general configuration of the drivers/environment variables.
I read that xorg.conf can have an effect on the timeout of the kernels. I generated an xorg.conf (I had none) and removed the devices from it, as suggested here. I am connecting to the server remotely and have no monitor plugged in. This changes nothing in the behavior; my kernels still error.
My question is: what else should I look at? What Linux-specific configuration should I check to pinpoint the cause of the kernel halts?

The error ended up being indeed illegal memory access.
These were caused by the fact that sizeof(unsigned long) is platform-specific: my Linux machine returns 8 while my Windows machine returns 4. As this code is called from MATLAB, and MATLAB (like some other high-level languages such as Python) defines the sizes of variables in bits (such as uint32(1)), there was a mismatch on the Linux machine when doing memcpys. It turns out this happened in a variable that is an index, so the kernels were reading garbage (due to the bad memcpy) and then trying to access another array at that location, causing an illegal memory access.
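The effect of that width mismatch can be reproduced outside CUDA. The sketch below is not the poster's code: it uses a Node `Buffer` as a stand-in for the copied array, with little-endian byte order as an explicit assumption, and shows what happens when 32-bit indices are read back as 8-byte values.

```javascript
// Four 32-bit indices written the way a uint32 array would sit in memory
// (little-endian is assumed here and made explicit in the calls).
const indices = [1, 2, 3, 4];
const buf = Buffer.alloc(indices.length * 4);
indices.forEach((value, i) => buf.writeUInt32LE(value, i * 4));

// Correct: read each element with the width it was written with.
const as32 = indices.map((_, i) => buf.readUInt32LE(i * 4));

// Mismatch: treat the same bytes as 8-byte elements, as happens when the
// receiving side assumes sizeof(unsigned long) == 8. Each read fuses two
// neighbouring indices into one huge value.
const as64 = [0, 1].map((i) => buf.readBigUInt64LE(i * 8));

console.log(as32); // [ 1, 2, 3, 4 ]
console.log(as64); // [ 8589934593n, 17179869187n ]
```

Using such a fused value as an array index points far outside any reasonably sized allocation, which is consistent with the illegal memory access reported above. Fixed-width types such as uint32_t on the C side avoid the platform dependence.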
Too specific? yeah.

Pocket PC 2003 C# Performance Issues...Should I Thread It?

Environment
Windows XP SP3 x32
Visual Studio 2005 Standard Edition
Honeywell Dolphin 9500 Pocket PC/Windows Mobile 2003 Platform
Using the provided Honeywell Dolphin 9500 VS2005 SDK
.NET Framework 1.1 and .NET Compact Framework 1.0 SP3
Using VC#
Problem
When I save an image from the built in camera and Honeywell SDK ImageControl to the device's storage card or internal memory, it takes 6 - 7 seconds.
I am currently saving the image as a PNG but have the option of a BMP or JPG as well.
Relevant lines in the code: 144-184 and 222, specifically 162,163 and 222.
Goal
I would like to reduce that time down to something like 2 or 3 seconds, and even less if possible.
As a secondary goal, I am looking for a profiling suite for Pocket PC 2003 devices, specifically one supporting the .NET Compact Framework version 1.0. Ideally free, but an unfettered short trial would work as well.
Things I Have Tried
I looked into asynchronous I/O via System.Threading a little bit but I do not have the experience to know whether this is a good idea, nor exactly how to implement threading for a single operation.
With threading implemented as it is in the code below, there seems to be a trivial speed increase of maybe a second or less. However, something on the next Form requires the image, possibly in the act of being saved, and I do not know how to mitigate the wait or handle that scenario at all, really.
EDIT: Changing the save format from PNG to BMP or JPG, with the threading, seems to reduce the save time considerably.
Code
http://friendpaste.com/3J1d5acHO3lTlDNTz7LQzB
Let me know if the code should just be posted here in code tags. It is a little long (~226 lines) so I went ahead and friendpasted it as that seemed to be acceptable in my last post.

By changing the save format from PNG to BMP and including the Threading code shown in the Code link, I was able to reduce the save time to ~1 second.

You're at the mercy of the Honeywell SDK for this one, since their control is doing the actual saving of the image. Calling this on a separate thread (i.e. not the UI thread) isn't going to help at all (as you've found out), and it will actually make things more difficult for you since you need to wait until the save task is completed before moving on to the next form.
The only suggestion I can make is to make sure you're saving the image to internal memory (and not to the SD card), since writing to an SD card usually takes significantly longer than writing to memory. Or see if you can get technical support from Honeywell - 6-7 seconds seems way too long for a task like this.
Or see if the Honeywell SDK lets you get the image as a byte array (instead of saving to disk). If this call returns in less than 6-7 seconds, you can handle persisting it yourself.

What are the available interactive languages that run in tiny memory? [closed]

I am looking for general purpose programming languages that
have an interactive (live coding) prompt
work in 32 KB of RAM by itself or 8 KB when the compiler is hosted on a separate machine
run on a microcontroller with as little as 8-32 KB RAM total (without an MMU).
Below is my list so far, what am I missing?
Python: The PyMite VM needs 64K flash, 8K RAM. Targets LPC, SAM7 and ATmegas with 8K or more. Hosted.
Lua: The eLua FAQ recommends 256K flash, 64K RAM.
FORTH: amforth needs 8K flash, 150 bytes RAM, 30 bytes EEPROM on an ATmega.
Scheme: armpit Scheme. The smallest target is the LPC2103 with 32K flash, 4K SRAM.
C: Interactive C runs on 68HC11 with no flash and 32K SRAM. Hosted.
C: picoc, an open source, cross-compiling, interactive C system. When compiled for AVR, it takes 63K flash, 8K RAM. The RAM could be reduced with effort to keep tables in flash.
C++: AngelScript, an open source, byte-code based, C/C++ like scripting language with easy native calls.
Tcl: TinyTCL runs on DOS, 60K binary. Looks easy to port.
BASIC: TinyBasic: Initializes with a 64K heap, might be adjustable.
Lisp
PostScript: (I haven't found a FOSS implementation for low memory yet)
Shell: bitlash: An interactive command shell for Arduino (ATmega). See also AVRSH.

A homebrew Forth runtime can be implemented in very little memory indeed. I know someone who made one on a Cosmac in the 1970s. The core runtime was just 30 bytes.

I hear that CHIP-8, XPL0, PicoC, and Objective Caml have been ported to graphing calculators.
The Wikipedia "Lego Mindstorms" article lists a bunch of programming languages that allegedly run on the Lego RCX or Lego NXT platform.
Do any of them meet your "live coding" criteria?
You might want to check out the other microcontroller Forths at the Forth wiki. It lists at least 4 Forths for the Atmel AVR: amforth (which you already mention), PFAVR, avrforth, and ByteForth.
(Links to those interpreters, as well as this StackOverflow question, are included in the "Embedded Systems" wikibook).

I would recommend Lua (or eLua http://www.eluaproject.net/ ). I "ported" Lua to a Cortex-M3 a while back. Off the top of my head, it had a flash size of 60~100KB and needed about 20KB of RAM to run. I did strip it down to the bare essentials, but depending on your application, that might be enough. There's still room for optimization, especially regarding RAM requirements, but I doubt you can run it comfortably in 8KB.

Some AVR interpreters/VMs:
http://www.cqham.ru/tbcgroup/index_eng.htm
http://www.jcwolfram.de/projekte/avr/chipbasic2/main.php
http://www.jcwolfram.de/projekte/avr/chipbasic8/main.php
http://www.jcwolfram.de/projekte/avr/main.php
http://code.google.com/p/python-on-a-chip/
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=688&item_type=project
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=626&item_type=project
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=460&item_type=project
http://www.harbaum.org/till/nanovm/index.shtml

Wren fits your criteria -- by default it's configured to use just 4k of RAM. AFAIK it hasn't seen any actual use, since the guy I wrote it for decided he didn't need an interpreter running wholly on the target system after all.
The language is influenced most obviously by ML and Forth.

Have you considered a C port of Tiny Basic? Or, perhaps, rewriting the UCSD Pascal p-machine for your architecture from the Z-80 version?
Seriously, though, JavaScript would make a good embedded scripting language, but I've no clue what the minimum memory requirements are for the VM + GC, nor how difficult it is to remove OS dependencies. I played with NJS a while back, which could possibly fit your needs. This one is interesting in that the compiler is written in JavaScript (self-hosting).

You can take a look at the very powerful AvrCo Multitasking Pascal for AVR. You can try it at http://www.e-lab.de. The MEGA8/88 version is free. There are tons of drivers, plus a simulator with a JTAG debugger and nice live or simulated visualizations of all standard devices (LCDCHAR, LCDGRAPH, 7SEG, 14SEG, LEDDOT, KEYBOARD, RC5, SERVO, STEPPER...).

You're missing EmbedVM, homepage here, svn repo here. Remember to check out both [1,2] videos on the front page ;)
From the homepage:
EmbedVM is a small embeddable virtual machine for microcontrollers
with a C-like language frontend. It has been tested with GCC and AVR
microcontrollers. But as the Virtual machine is rather simple it
should be easy to port it to other architectures.
The VM simulates a 16bit CPU that can access up to 64kB of memory. It
can only operate on 16bit values and arrays of 16bit and 8bit values.
There is no support for complex data structures (struct, objects,
etc.). A function can have a maximum of 32 local variables and 32
arguments.
Besides the memory for the VM, a small structure holding the VM state
and the reasonable amount of memory the EmbedVM functions need on the
stack there are no additional memory requirements for the VM.
Especially the VM does not depend on any dynamic memory management.
EmbedVM is optimized for size and simplicity, not execution speed. The
VM itself takes up about 3kB of program memory on an AVR
microcontroller. On an AVR ATmega168 running at 16MHz the VM can
execute about 75 VM instructions per millisecond.
All memory accesses done by the VM are performed using user callback
functions. So it is possible to have some or all of the VM memory on
external memory devices, flash memory, etc. or "memory-map" hardware
functions to the VM.
The compiler is a UNIX/Linux commandline tool that reads in a *.evm
file and generates bytecode in various formats (binary file, intel hex,
C array initializers and a special debug output format). It also
generates a symbol file that can be used to access data in the VM
memory from the host application.
The C-like language looks like this: http://svn.clifford.at/embedvm/trunk/examples/numberquizz/vmcode.evm

I would recommend MY-BASIC; it runs in a minimum of 8 KB of RAM and is easy to port.

There's also JavaScript, via Espruino.
This is built specifically for Microcontrollers and there are builds for various different chips (mainly STM32s) that fit a full system into as little as 8kB RAM.

Have you considered simply using the /bin/sh supplied by busybox? Or one of the smaller scripting languages they recommend?

Prolog - http://www.gprolog.org/
According to a google search "prolog small" the size of the executable can be made quite small by avoiding linking the built-in predicates.

None of the languages in the list in the question or in the answers proved satisfactory for the requirement of super easy compilation and integration into an existing micro controller project (disclosure: I didn't actually try every single one of the suggestions).
I found instead tinyscript which is a single .c+.h file that compiled with the rest of the source files on my project with the only additional configuration required being to provide a void outchar(int c) which can be empty if you don't require output from the scripts.
For me speed of execution is far less important than ease of build and integration and interop with C, as my use case is mainly just calling some C functions in order.

At my previous job I used busybox on a Blackfin.
We compiled perl + php for it; after changing s/fork/vfork/g it worked pretty well... more or less. Not having an MMU is not a good idea. The memory fragmentation will kill the server pretty easily. All I did was:
for i in `seq 1 100`; do wget http://black-fin-ip/test.php; done
It died while I was walking to my boss and telling him that the server is going to die in production :)

I would suggest using Python. The only problem then is the memory overhead, right? So I have a great idea for people who may be stuck on this problem later on.
First things first: write a bf interpreter (or just get the source code from somewhere). The interpreter will be really small, and bf is a Turing-complete language. Then write your code in Python and transpile it to bf using bfpy ( https://github.com/felko/bfpy/blob/master/README.md ). This is the solution with the least overhead, and I am pretty sure a bf interpreter will easily stay under 10KB of RAM usage.
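To give a sense of how little logic such an interpreter needs, here is a minimal sketch. JavaScript is used purely for illustration; on a microcontroller the same loop would be written in C or assembly. The function name `runBf`, the 1024-cell wrapping tape, and the string-based input are all choices made for this example.

```javascript
// Minimal bf interpreter sketch: byte cells, wrapping tape, string I/O.
function runBf(program, input = '', tapeSize = 1024) {
  const tape = new Uint8Array(tapeSize);

  // Pre-compute the matching position for every bracket.
  const jump = new Map();
  const stack = [];
  for (let i = 0; i < program.length; i++) {
    if (program[i] === '[') {
      stack.push(i);
    } else if (program[i] === ']') {
      if (stack.length === 0) throw new Error(`unmatched ] at ${i}`);
      const open = stack.pop();
      jump.set(open, i);
      jump.set(i, open);
    }
  }
  if (stack.length > 0) throw new Error(`unmatched [ at ${stack.pop()}`);

  let ptr = 0;
  let inPos = 0;
  let out = '';
  for (let pc = 0; pc < program.length; pc++) {
    switch (program[pc]) {
      case '>': ptr = (ptr + 1) % tapeSize; break;
      case '<': ptr = (ptr - 1 + tapeSize) % tapeSize; break;
      case '+': tape[ptr]++; break; // Uint8Array wraps 255 -> 0
      case '-': tape[ptr]--; break; // and 0 -> 255
      case '.': out += String.fromCharCode(tape[ptr]); break;
      case ',': // read one input character, or 0 when input is exhausted
        tape[ptr] = inPos < input.length ? input.charCodeAt(inPos++) & 255 : 0;
        break;
      case '[': if (tape[ptr] === 0) pc = jump.get(pc); break;
      case ']': if (tape[ptr] !== 0) pc = jump.get(pc); break;
      // any other character is treated as a comment
    }
  }
  return out;
}

console.log(runBf('++++++++[>++++++++<-]>+.')); // prints "A"
```

The tape is the dominant memory cost, so its size is the obvious knob on a small target. Note that the transpiled bf program itself also has to be stored somewhere.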

Erlang - http://erlang.org/
it can fit in 2MB
http://www.experts123.com/q/is-erlang-small-enough-for-embedded-systems.html
