How to convert a large Chisel design to a C++ model with Verilator?

I encountered a bad-memory-allocation issue while compiling a large Chisel hardware design to a C++ model using the Verilator backend.
When I want to build a large PE array (36x36, for example), I write my code like this:
val PEArray = Seq.fill(height)(Seq.fill(width)(Module(new PE).io))
and Verilator crashes and throws:
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
It looks like running out of memory causes this problem. Is there a more memory-efficient Chisel3 built-in construct, or some other way to solve this?

Related

What Linux entity is responsible for generating Illegal Instruction Traps?

I am working on a custom version of Rocket Chip that features some extra instructions that I would like to be properly handled by Linux. Although bare-metal programs using these instructions run fine, Linux makes the same benchmarks crash with "Illegal Instruction" messages.
Does anyone know which software element of Linux - loader, disassembler, something else - is responsible for detecting illegal instructions?
My goal is to modify that piece of software so that Linux stops complaining about my instructions. If anyone knows about an easier way to suppress this kind of error, that would be very useful too.
The RISC-V implementation (the processor) raises an illegal instruction trap whenever it encounters an instruction it has not implemented. These illegal instruction traps are piped through to Linux (either via trap delegation or after being handled by the machine-mode software), where they flow through the standard trap handling path:
stvec points to handle_exception, which does a bunch of bookkeeping to avoid trashing userspace and then directs traps to the correct location.
For illegal instruction traps, you'll fall through to the excp_vect_table jump table, which handles all the boring traps.
This is indexed by scause, which in this case points to do_trap_insn_illegal.
do_trap_insn_illegal is just a generic Linux trap handler: it passes SIGILL to whatever caused the trap. This may raise a signal to a userspace task, a kernel task, or just panic the kernel directly.
There are a bunch of levels of indirection here that we're currently not doing anything with, but may serve to emulate unimplemented instructions in the future.
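If the immediate goal is just to keep a userspace benchmark alive, one option (not part of the kernel flow described above, and entirely a sketch of mine) is to catch SIGILL in the process itself and step over or emulate the offending instruction. A minimal C++ sketch, assuming the custom instructions are 4 bytes wide, have no side effects the program depends on, and that this runs on riscv-linux where the program counter is __gregs[0] of the saved context:

#include <signal.h>
#include <ucontext.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical handler: advance the saved pc past the faulting instruction
// so execution resumes after it. Only safe if skipping (or emulating) the
// custom instruction is acceptable for the program's semantics.
static void sigill_handler(int, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = static_cast<ucontext_t *>(ctx);
#if defined(__riscv)
    uc->uc_mcontext.__gregs[0] += 4;   // gregs[0] is REG_PC on riscv-linux
#else
    std::fprintf(stderr, "SIGILL at %p and no emulation for this arch\n", info->si_addr);
    std::exit(1);
#endif
}

int main()
{
    struct sigaction sa = {};
    sa.sa_sigaction = sigill_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, nullptr);

    // ... run the code that contains the custom instructions ...
    return 0;
}

This does nothing for faults taken inside the kernel itself; for those, the do_trap_insn_illegal path above is still the place to hook in emulation.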

I-7188ex's weird behaviour

I have a quite complex I/O program (written by someone else) for the ICPDAS i-7188ex controller, and I am writing a library (.lib) for it that does some calculations based on data from that program.
The problem is that if I import a function containing only one line, printf("123"), and embed it inside the I/O program, the program crashes at some point. Without the imported function the I/O program works fine, and the same goes for the imported function without the I/O program.
Maybe it is a memory issue, but why should considerable memory be allocated for a function that only outputs a string? Or am I completely wrong?
I am using Borland C++ 3.1. And yes, I can't use anything newer, since the controller only supports the 80186 instruction set.
If your code is complex, then sometimes your compiler can get stuck and compile it wrongly, messing things up with unpredictable behavior. This has happened to me many times as code grows. In such cases, swapping a few lines of code (if you can do so without breaking functionality) or even adding a few empty or comment lines inside the code sometimes helps. The problem is finding the place where it does this. You can also divide your program into several files, compile each separately to an .obj, and then just link them into the final file.
The error description reminds me of one I fought with for a long time. If you are using class/struct/template, try this:
bds 2006 C hidden memory manager conflicts
Maybe it will help (I did not test this with old Turbo).
What do you mean by embed into the I/O program? Are you creating a sys driver file? If that is the case, you need to make sure you are not messing with CPU registers. That could cause a lot of problems; try to use:
#include <stdio.h>

void some_function_or_whatever()
{
    asm { pusha };   // save all general-purpose registers on entry
    // your code here
    printf("123");
    asm { popa };    // restore the registers before returning
}
If you are writing ISR handlers, then you need to use the interrupt keyword so the compiler returns from them properly.
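For illustration, a minimal sketch of what that looks like in Borland's dialect (the handler name and body are placeholders; such a handler would typically be installed with setvect from dos.h):

// Hypothetical Borland C++ 3.1 style ISR: the 'interrupt' modifier makes the
// compiler save and restore the registers and return with IRET instead of RET.
void interrupt my_isr(...)
{
    // keep the handler short; avoid printf and other DOS calls in here
}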
Without actual code and/or an MCVE it is hard to point out any specifics.
If you can port this to BDS2006 or a newer version (just for debugging, not a really functional build), then it will analyse your code more carefully and can detect a lot of hidden errors (I was surprised when I ported from the BCB series to BDS2006). There is also the CodeGuard option in the compiler, which is ideal for finding such errors at runtime (but I fear you will not be able to run your lib without the I/O hardware present in emulated DOS).

Valgrind hangs when reading large HDF5 dataset in Fortran

I have an application written in Fortran which makes use of parallel HDF5 for input / output.
A matching post-processing code is used to read its output, in the form of a *.h5 file, and process it.
When I try to use valgrind to check for memory leaks, however, it stalls when reading large datasets.
More exactly, the stalling occurs at a call to H5Dread_f for large datasets, for example 1069120 doubles (where doubles are defined as H5kind_to_type(REAL64,H5_REAL_KIND)), whereas for smaller ones it is okay.
I tried recompiling the HDF5 library using --enable-using-memchecker, as described here, but it didn't help.
Does anybody have more experience with this?
I found the cause / solution: due to an error of mine, the chunk size I used in these HDF5 routines was often only 1 byte, which is of course much too small.
Fixing this also makes valgrind much, much faster and actually usable.
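For reference, the chunk shape is set on the dataset creation property list. A minimal sketch using the HDF5 C API (the dataset name, sizes, and the 8192-element chunk are only illustrative; the Fortran h5pset_chunk_f call plays the same role):

#include <hdf5.h>

// Sketch: create a 1-D chunked dataset of doubles with a reasonably sized
// chunk (8192 elements = 64 KiB) instead of a tiny one. Assumes n >= 8192,
// since a fixed-size dataset's chunk may not exceed the dataset extent.
void create_chunked_dataset(hid_t file, hsize_t n)
{
    hsize_t dims[1]  = { n };
    hsize_t chunk[1] = { 8192 };

    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);          // attach the chunk shape to the plist

    hid_t dset = H5Dcreate2(file, "/data", H5T_NATIVE_DOUBLE, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
}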

Run-time errors and memory leaks detection

The application I created is working far too slowly; it looks like there are a lot of memory leaks, and there are a lot of pointers. So, please, can you advise an effective tool for run-time error and memory-leak detection in Visual Studio C++?
You can use Deleaker. It should help you.
I know two good tools for Windows: Purify and Insure++.
For linux: Valgrind.
If you use the debug version of the CRT library, you can find all memory leaks very easily.
Basically, after including the appropriate headers, you call
_CrtSetDbgFlag ( _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF );
somewhere at the beginning of your program.
Before the program exits you should call
_CrtSetReportMode( _CRT_ERROR, _CRTDBG_MODE_DEBUG );
which dumps all the memory leaks to the Debug Output window.
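Put together, a minimal sketch of how those calls might sit in a program (debug build only; the deliberately leaked array is just there to produce a report):

#include <crtdbg.h>

int main()
{
    // Track debug-heap allocations and run an automatic leak check at exit.
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    // Route CRT error reports to the debugger's Output window.
    _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_DEBUG);

    int *leaked = new int[16];   // never freed, so it shows up in the leak dump
    (void)leaked;
    return 0;                    // the report appears as the CRT shuts down
}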
But the application being slow might be unrelated to memory leaks. For performance profiling, you can follow the directions in Find Application Bottlenecks with Visual Studio Profiler.
For catching bad C++ constructs at compile time, you can use the static code analysis feature of Visual Studio 2010 or later.

Porting Unix ada app to Linux: Seg fault before program begins

I am an intern who was offered the task of porting a test application from Solaris to Red Hat. The application is written in Ada. It works just fine on the Unix side. I compiled it on the Linux side, but now it is giving me a seg fault. I ran the debugger to see where the fault was and got this:
Warning: In non-Ada task, selecting an Ada task.
=> runtime tasking structures have not yet been initialized.
<non-Ada task> with thread id 0b7fe46c0
process received signal "Segmentation fault" [11]
task #1 stopped in _dl_allocate_tls
at 0870b71b: mov edx, [edi] ;edx := [edi]
This seg fault happens before any calls are made or anything is initialized. I have been told that 'tasks' in Ada get started before the rest of the program, and the problem could be with a task that is running.
But here is the kicker: this program just generates some code for another program to use. The OTHER program, when compiled under Linux, gives me the same kind of seg fault with the same kind of error message. This leads me to believe there might be some little tweak I can use to fix all of this, but I just don't have enough knowledge about Unix, Linux, and Ada to figure this one out all by myself.
This is a total shot in the dark, but you can have tasks blow up like this at startup if they are trying to allocate too much local memory on the stack. Your main program can safely use the system stack, but tasks have to have their stacks allocated at startup from dynamic memory, so typically your runtime has a default stack size for tasks. If your task tries to allocate a large array, it can easily blow past that limit. I've had it happen to me before.
There are multiple ways to fix this. One way is to move all your task-local data into package global areas. Another is to dynamically allocate it all.
If you can figure out how much memory would be enough, you have a couple more options. You can make the task a task type, and then use a
for My_Task_Type_Name'Storage_Size use Some_Huge_Number;
statement. You can also use a "pragma Storage_Size(My_Task_Type_Name)", but I think the "for" statement is preferred.
Lastly, with GNAT you can also change the default task stack size with the -d flag to gnatbind.
Off the top of my head, if the code was used on SPARC machines and you're now running on an x86 machine, you may be running into endianness problems.
It's not much help, but it is a common gotcha when going multiplat.
Hunch: the linking step didn't go right. Perhaps the wrong run-time startup library got linked in?
(How likely are we to find out what the real trouble was, months after the question was asked?)
