Random segmentation fault in D lang on switch break - linux

I was debugging a fairly simple program written in D that seems to receive a SEGV signal at random.
Upon further inspection I observed that using different compilers and build modes yielded different results.
Results of my tests:
DMD Debug = works 99% of the time
DMD Release = 50/50
LDC Debug = 50/50
LDC Release = 50/50
Because the binary from the default compiler (DMD) crashed only once I couldn't really debug it, and release mode didn't help either due to lack of debug symbols.
Building the binary with LDC in debug mode let me test it with gdb and valgrind. To summarize what I gathered, this is the relevant information from valgrind:
Invalid read of size 4 # ctor in file video.d line 46
Access not within mapped region at address 0x0 # ctor in file video.d line
Gdb doesn't give me any more insight: there are 3 stack frames, of which only frame 0 is of interest, and its backtrace points to file video.d line 46, which is a break statement. So what now?
This is the snippet of code producing the segfault:
module video;
import ffmpeg.libavformat.avformat;
import ffmpeg.libavcodec.avcodec;
import ffmpeg.libavutil.avutil;
class Foo
{
private
{
AVFormatContext* _format_ctx;
AVStream* _stream_video;
AVStream* _stream_audio;
}
...
public this(const(string) path)
{
import std.string : toStringz;
_format_ctx = null;
enforce(avformat_open_input(&_format_ctx, path.toStringz, null, null) == 0);
scope (failure) avformat_close_input(&_format_ctx);
enforce(avformat_find_stream_info(_format_ctx, null) == 0);
debug av_dump_format(_format_ctx, 0, path.toStringz, 0);
foreach (i; 0 .. _format_ctx.nb_streams)
{
AVStream* stream = _format_ctx.streams[i];
if (stream == null)
continue;
enforce (stream.codecpar != null);
switch (stream.codecpar.codec_type)
{
case AVMediaType.AVMEDIA_TYPE_VIDEO:
_stream_video = stream;
break;
case AVMediaType.AVMEDIA_TYPE_AUDIO:
_stream_audio = stream;
break;
default:
stream.discard = AVDiscard.AVDISCARD_ALL;
break; // Magic line 46
}
}
}
}
// Might contain spelling errors, had to write it by hand.
So does anyone have an idea what causes this behaviour, or more precisely how to go about fixing it?

Try checking the validity of _stream_audio:
default:
enforce( _stream_audio, new Exception( "_stream_audio is null" ))
.discard = AVDiscard.AVDISCARD_ALL;
break; // Magic line 46

You are not heeding the warning in the toStringz documentation:
“Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may become invalid during a garbage collection cycle and cause a nasty bug when the C code tries to use it.”
This may not be the cause of your problem, but the way you use toStringz is risky.

Resolving code analysis warnings with the BOLDDAY macro (used with CMonthCalCtrl)

I have some issues with the CMonthCalCtrl control and modernizing my code. The first problem is related to the BOLDDAY macro.
This macro is used to adjust day states (making specific dates bold on the calendar) and the concept is described in detail here. As documented, you need to define a macro:
#define BOLDDAY(ds, iDay) if(iDay > 0 && iDay < 32) \
(ds) |= (0x00000001 << (iDay-1))
Here is my code that uses this macro so that you have some context:
void CMeetingScheduleAssistantDlg::InitDayStateArray(int iMonthCount, LPMONTHDAYSTATE pDayState, COleDateTime datStart)
{
int iMonth = 0;
COleDateTimeSpan spnDay;
CString strKey;
SPECIAL_EVENT_S *psEvent = nullptr;
if (pDayState == nullptr)
return;
memset(pDayState, 0, sizeof(MONTHDAYSTATE)*iMonthCount);
if (m_pMapSPtrEvents == nullptr && m_Reminders.Count() == 0)
{
return;
}
spnDay.SetDateTimeSpan(1, 0, 0, 0);
auto datDay = datStart;
const auto iStartMonth = datStart.GetMonth();
auto iThisMonth = iStartMonth;
auto iLastMonth = iThisMonth;
do
{
strKey = datDay.Format(_T("%Y-%m-%d"));
if (m_pMapSPtrEvents != nullptr)
{
psEvent = nullptr;
m_pMapSPtrEvents->Lookup(strKey, reinterpret_cast<void*&>(psEvent));
if (psEvent != nullptr)
{
BOLDDAY(pDayState[iMonth], datDay.GetDay());
}
}
if (m_Reminders.HasReminder(datDay))
{
BOLDDAY(pDayState[iMonth], datDay.GetDay());
}
datDay = datDay + spnDay;
iThisMonth = datDay.GetMonth();
if (iThisMonth != iLastMonth)
{
iLastMonth = iThisMonth;
iMonth++;
}
} while (iMonth < iMonthCount);
}
Everywhere I use this BOLDDAY macro I get a code analysis warning (C26481):
warning C26481: Don't use pointer arithmetic. Use span instead (bounds.1).
It is not clear to me if the problem is with the BOLDDAY macro or my own code?
Update
I still get the warning when I turn the macro into a function:
Update 2
If it helps, I currently call the InitDayStateArray function in the following ways:
Method 1:
void CMeetingScheduleAssistantDlg::SetDayStates(CMonthCalCtrl &rCalendar)
{
COleDateTime datFrom, datUntil;
const auto iMonthCount = rCalendar.GetMonthRange(datFrom, datUntil, GMR_DAYSTATE);
auto pDayState = new MONTHDAYSTATE[iMonthCount];
if (pDayState != nullptr)
{
InitDayStateArray(iMonthCount, pDayState, datFrom);
VERIFY(rCalendar.SetDayState(iMonthCount, pDayState));
delete[] pDayState;
}
}
Method 2
void CMeetingScheduleAssistantDlg::OnGetDayStateEnd(NMHDR* pNMHDR, LRESULT* pResult)
{
NMDAYSTATE* pDayState = reinterpret_cast<NMDAYSTATE*>(pNMHDR);
MONTHDAYSTATE mdState[3]{}; // 1 = prev 2 = curr 3 = next
const COleDateTime datStart(pDayState->stStart);
if (pDayState != nullptr)
{
InitDayStateArray(pDayState->cDayState, &mdState[0], datStart);
pDayState->prgDayState = &mdState[0];
}
if (pResult != nullptr)
*pResult = 0;
}
Perhaps tweaking the container for the LPMONTHDAYSTATE information would help resolve this span issue?
Sample code provided by Microsoft used to be published as code that compiles both with a C and C++ compiler. That limits availability of language features, frequently producing code that particularly C++ clients shouldn't be using verbatim.
The case here being the BOLDDAY function-like macro, that's working around not having reference types in C. C++, on the other hand, does, and the macro can be replaced with a function instead:
void bold_day(DWORD& day_state, int const day) noexcept {
if (day > 0 && day < 32) {
day_state |= (0x00000001 << (day - 1));
}
}
Using this function in place of the BOLDDAY macro silences the C26481 diagnostic.
While that works, I'm at a complete loss to understand where the compiler is seeing pointer arithmetic in the macro version. Regardless, replacing a function-like macro with an actual function (or function template) where possible is always desirable.
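As a minimal, portable sketch of how such a function behaves, the following uses std::uint32_t as a stand-in for the Windows DWORD type (an assumption, so that it compiles without the Windows headers). The helper is_bold is hypothetical and exists only to inspect the result:

```cpp
#include <cstdint>

// Assumption: std::uint32_t stands in for the Windows DWORD / MONTHDAYSTATE.
using day_state_t = std::uint32_t;

// Sets the bit for `day` (1..31); any other value leaves the state untouched.
void bold_day(day_state_t& day_state, int const day) noexcept {
    if (day > 0 && day < 32) {
        day_state |= (day_state_t{1} << (day - 1));
    }
}

// Hypothetical helper: reports whether the bit for `day` is set.
bool is_bold(day_state_t const day_state, int const day) noexcept {
    return day > 0 && day < 32 &&
           (day_state & (day_state_t{1} << (day - 1))) != 0;
}
```

Unlike the macro, the function evaluates its arguments exactly once and cannot attach itself to a following else by accident.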
Update
Things are starting to make sense now. While replacing the function-like macro with a function, as suggested above, is desirable, it will not resolve the issue. My test happened to have used pDayState[0] which still raises C26481 for the macro, but not for the function. Using pDayState[1] instead, the diagnostic is raised in either case.
Let's put the pieces of the puzzle together: Recall that the array subscript expression p[N] is exactly identical to the expression *(p + N) when p is a pointer type and N an integral type. That explains why the compiler is complaining about "pointer arithmetic" when it sees pDayState[iMonth].
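The equivalence can be seen in a small sketch. The type alias and both function names are illustrative only, with std::uint32_t assumed as a stand-in for MONTHDAYSTATE:

```cpp
#include <cstdint>

// Assumption: std::uint32_t stands in for MONTHDAYSTATE.
using month_day_state = std::uint32_t;

// Reads element n using subscript syntax on a raw pointer.
month_day_state read_subscript(month_day_state const* p, int n) {
    return p[n];
}

// Reads element n using explicit pointer arithmetic; this is what p[n] means.
month_day_state read_arithmetic(month_day_state const* p, int n) {
    return *(p + n);
}
```

Both functions perform the same operation, which is why the subscript form on a raw pointer draws the same diagnostic as the explicit arithmetic.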
Solving that is fairly straightforward. As suggested by the diagnostic, use a std::span (requires C++20). The following changes to InitDayStateArray() make the C26481 diagnostic go away:
void CMeetingScheduleAssistantDlg::InitDayStateArray(int iMonthCount,
LPMONTHDAYSTATE pDayState,
COleDateTime datStart)
{
std::span const day_month_state(pDayState, iMonthCount);
// ...
// memset(pDayState, 0, sizeof(MONTHDAYSTATE)*iMonthCount);
std::fill(begin(day_month_state), end(day_month_state), 0);
// ...
do
{
// ...
{
bold_day(day_month_state[iMonth], datDay.GetDay());
}
}
if (m_Reminders.HasReminder(datDay))
{
bold_day(day_month_state[iMonth], datDay.GetDay());
}
// ...
} while (iMonth < day_month_state.size());
}
A std::span "describes an object that can refer to a contiguous sequence of objects". It takes the decomposed pointer and size arguments that describe an array and reunites them into a single object, recovering the full fidelity of the array.
That sounds great. But remember, this is C++, and there's a caveat: Just like its evil C++17 ancestor std::string_view, a std::span is an unhesitating factory for dangling pointers. You can freely pass them around, and hang on to them far beyond the referenced data being alive. And this is guaranteed for every specialization, starting with C++23.
The other issue is that addressing this one diagnostic makes several others pop up out of nowhere, suggesting that std::span isn't good enough and that gsl::span should be used instead. Addressing those would probably warrant another Q&A altogether.

Getting a Hard Fault when trying to list all tasks using vTaskList()

I am trying to list the state of all the tasks that are currently running using vTaskList(). Whenever I call the function I get a HardFault, and I have no idea where it faults. I tried increasing the heap size and stack size. This causes vTaskList() to work once, but the second time it throws a hard fault again.
Following is how I am using vTaskList() in osThreadList()
osStatus osThreadList (uint8_t *buffer)
{
#if ( ( configUSE_TRACE_FACILITY == 1 ) && ( configUSE_STATS_FORMATTING_FUNCTIONS == 1 ) )
vTaskList((char *)buffer);
#endif
return osOK;
}
Following is how I use osThreadList() to print all the tasks on my serial terminal.
uint8_t TskBuf[1024];
bool IOParser::TSK(bool print_help)
{
if(print_help)
{
uart_printf("\nTSK: Display list of tasks.\r\n");
}
else
{
uart_printf("\r\nName State Priority Stack Num\r\n" );
uart_printf("---------------------------------------------\r\n");
/* The list of tasks and their status */
osThreadList(TskBuf);
uart_printf( (char *)TskBuf);
uart_printf("---------------------------------------------\r\n");
uart_printf("B : Blocked, R : Ready, D : Deleted, S : Suspended");
}
return true;
}
When I comment out any one of the tasks I am able to get it working. I am guessing it is something related to memory, but I haven't been able to find a solution.
vTaskList depends on sprintf, so your guess about memory and heap is right. Instead of passing a static array as you do now, allocate the buffer with pvPortMalloc and, once you have finished with it, release it with vPortFree.
It is also worth noting that vTaskList is a blocking function.
I do not have a working code example to show this at the moment, but this should work.
Hard faults are very often caused by an uninitialised pointer. The approach above will eliminate that.

mutex unlocking and request_module() behaviour

I've observed the following code pattern in the Linux kernel, for example in net/sched/act_api.c and many other places:
rtnl_lock();
rtnetlink_rcv_msg(skb, ...);
replay:
ret = process_msg(skb);
...
/* try to obtain symbol which is in module. */
/* if fail, try to load the module, otherwise use the symbol */
a = get_symbol();
if (a == NULL) {
rtnl_unlock();
request_module();
rtnl_lock();
/* now verify that we can obtain symbols from requested module and return EAGAIN.*/
a = get_symbol();
module_put();
return -EAGAIN;
}
...
if (ret == -EAGAIN)
goto replay;
...
rtnl_unlock();
After request_module has succeeded, the symbol we are interested in becomes available in kernel memory space, and we can use it. However, I don't understand why the code returns EAGAIN and re-reads the symbol. Why can't it just continue right after request_module()?
If you look at the current implementation in the Linux kernel, there is a comment right after the 2nd call equivalent to get_symbol() in your above code (it is tc_lookup_action_n()) that explains exactly why:
rtnl_unlock();
request_module("act_%s", act_name);
rtnl_lock();
a_o = tc_lookup_action_n(act_name);
/* We dropped the RTNL semaphore in order to
* perform the module load. So, even if we
* succeeded in loading the module we have to
* tell the caller to replay the request. We
* indicate this using -EAGAIN.
*/
if (a_o != NULL) {
err = -EAGAIN;
goto err_mod;
}
Even though the module could be requested and loaded, the semaphore was dropped in order to load it, which is an operation that can sleep (and is not the "standard way" this function is executed). The function therefore returns EAGAIN to signal this to the caller.
EDIT for clarification:
If we look at the call sequence when a new action is added (which could cause a required module to be loaded) we have this sequence: tc_ctl_action() -> tcf_action_add() -> tcf_action_init() -> tcf_action_init_1().
Now if we propagate the EAGAIN error back up to tc_ctl_action(), in the case RTM_NEWACTION: we see that with the EAGAIN ret value the call to tcf_action_add is repeated.
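The replay pattern can be modelled in user space. The following is only a sketch under stated assumptions, not kernel code: big_lock stands in for the RTNL lock, registered for the table of known actions, load_module for request_module(), and every function name is hypothetical.

```cpp
#include <cerrno>
#include <mutex>
#include <string>
#include <unordered_set>

std::mutex big_lock;                         // stands in for the RTNL lock
std::unordered_set<std::string> registered;  // stands in for the registered actions

// Stand-in for request_module(): it registers the action under the lock
// itself, so the caller must not hold big_lock while calling it.
void load_module(const std::string& name) {
    std::lock_guard<std::mutex> guard(big_lock);
    registered.insert(name);
}

// Called with `lock` held. Returns 0 if the action is usable right away,
// -EAGAIN if the lock had to be dropped, -ENOENT if loading did not help.
int init_action(std::unique_lock<std::mutex>& lock, const std::string& name) {
    if (registered.count(name) != 0) return 0;
    lock.unlock();
    load_module(name);
    lock.lock();
    // Whatever was checked before the unlock may be stale now, so the
    // caller is asked to start over instead of continuing from here.
    return registered.count(name) != 0 ? -EAGAIN : -ENOENT;
}

// Top-level handler: repeats the whole request while it reports -EAGAIN.
int handle_request(const std::string& name, int& attempts) {
    std::unique_lock<std::mutex> lock(big_lock);
    int ret;
    do {
        ++attempts;
        ret = init_action(lock, name);
    } while (ret == -EAGAIN);
    return ret;
}
```

In this model the first request for an unknown action takes two passes (one that loads, one that replays), while later requests succeed on the first pass.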

Vala: Invalid read when a joined thread gets unreferenced

When I compile and run the code below in valgrind, it looks like the thread gets free'd when I join the thread, and then later when it gets unreferenced some memory that is already free'd gets read.
Is this a "false positive" from valgrind? If not, is it in general safe to ignore in larger parallel programs? How do I get around it?
int main (string[] args) {
Thread<int> thread = new Thread<int>.try ("ThreadName", () => {
stdout.printf ("Hello World");
return 0;
});
thread.join ();
return 0;
}
==2697== Invalid read of size 4
==2697== at 0x50F2350: g_thread_unref (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3800.1)
==2697== by 0x400A65: _vala_main (in /home/lockner/test)
==2697== by 0x400A9C: main (in /home/lockner/test)
==2697== Address 0x5dc17e8 is 24 bytes inside a block of size 72 free'd
==2697== at 0x4C2B60C: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2697== by 0x50F2547: g_thread_join (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3800.1)
==2697== by 0x400A4B: _vala_main (in /home/lockner/test)
==2697== by 0x400A9C: main (in /home/lockner/test)
When I manually add "thread = NULL;" between the join call and the _g_thread_unref0 macro in the generated C code, the invalid read is gone in the valgrind output.
g_thread_join (thread);
result = 0;
thread = NULL;
_g_thread_unref0 (thread);
return result;
It turns out it was a missing annotation in glib-2.0.vapi.
Adding [DestroysInstance] above join() solves the problem.
The issue is that g_thread_join already removes 1 reference. So the generated code does a double-free.
If you needed to add [DestroysInstance] this is clearly a bug in valac/the GThread binding.
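The ownership problem can be illustrated with a toy model in C++. This is not GLib code: FakeThread, fake_unref, fake_join and join_and_forget are hypothetical names, and the model assumes only what is stated above, namely that join drops one reference.

```cpp
int live_threads = 0;  // counts FakeThread objects that have not been freed

// Hypothetical stand-in for GThread: a plain reference-counted object.
struct FakeThread {
    int ref_count = 1;
    FakeThread() { ++live_threads; }
    ~FakeThread() { --live_threads; }
};

// Drops one reference and frees the object when none remain.
void fake_unref(FakeThread* t) {
    if (--t->ref_count == 0) delete t;
}

// Models a join that consumes the caller's reference.
void fake_join(FakeThread* t) {
    fake_unref(t);
}

// Models the effect of the annotation: once joined, the handle is treated
// as gone, so no second unref is issued on it.
void join_and_forget(FakeThread*& t) {
    fake_join(t);
    t = nullptr;
}
```

Calling fake_unref on the pointer after fake_join would touch freed memory, which corresponds to the invalid read valgrind reported.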

CString::Format() causes debug assertion

CString::Format causes a debug assertion in Visual Studio 2008 at vsprintf.c line 244 with "buffer too small".
//inside the function.
somefile.Open (//open for mode read) //somefile is CFile.
char* buff = new char[somefile.GetLength()];
somefile.Read ((void*)buff, somefile.GetLength());
CString cbuff;
cbuff.Format ("%s",buff); //this line causes the debug assertion.
//and so on
Any idea why CString::Format() causes the "buffer too small" error? The debug assertion doesn't always fire.
An alternate solution is:
somefile.Open (//open for mode read) //somefile is CFile.
int buflen = somefile.GetLength();
CString cbuff;
somefile.Read ((void*)cbuff.GetBuffer(buflen), buflen);
cbuff.ReleaseBuffer();
It reads directly into a string buffer instead of the intermediate variable. The CString::GetBuffer() function automatically adds the extra byte to the string which you forgot to do when you allocated the "new char[]".
A string ends with '\0', so the buffer size will not be enough.
The problem is that CFile::Read() does not guarantee that it reads as much data as you ask for. Sometimes it's reading less and leaving your buffer without a null terminator. You have to assume that you might only get one byte on each read call. This will also crash sometimes, when an un-readable memory block immediately follows your buffer.
You need to keep reading the file until you get to the end. Also, the null terminator is generally not written to the file at all, so you shouldn't assume that it will be read in but rather ensure that your buffer is always null-terminated no matter what is read.
In addition, you shouldn't use the file size as the buffer size; there's no reason to think you can read it all in at once, and the file size might be huge, or zero.
You should also avoid manual memory management, and instead of new[]/delete[], use a vector, which will ensure that you don't forget to free the buffer or use delete instead of delete[], and that the memory is released even in case of an exception. (I wouldn't recommend using CString or CFile either, for that matter, but that's another topic...)
// read from the current file position to the end of
// the file, appending whatever is read to the string
void ReadFile(CFile& somefile, CString& result)
{
std::vector<char> buffer(1024 + 1);
for (;;)
{
int read = somefile.Read(&buffer[0], buffer.size() - 1);
if (read > 0)
{
// force a null right after whatever was read
buffer[read] = '\0';
// add whatever was read to the result
result += &buffer[0];
}
else
{
break;
}
}
}
Note that there's no error handling in this example.
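For comparison, here is a portable sketch of the same read-until-the-end loop using only the standard library in place of CFile and CString (a swapped-in technique, not the code above). The names read_all and round_trip are hypothetical. It appends exactly the number of bytes each read returned, so it does not rely on a null terminator at all:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads from the current stream position to the end, one chunk at a time,
// appending only the bytes that each read actually delivered.
std::string read_all(std::istream& in, std::size_t chunk_size = 1024) {
    std::vector<char> buffer(chunk_size);
    std::string result;
    for (;;) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;  // nothing more to read
        result.append(buffer.data(), static_cast<std::size_t>(got));
    }
    return result;
}

// Usage example: pushes a string through the chunked reader.
std::string round_trip(const std::string& text, std::size_t chunk_size) {
    std::istringstream in(text);
    return read_all(in, chunk_size);
}
```

Because the length is tracked explicitly, embedded '\0' bytes in the input survive, which a "%s" format or a C-string append would truncate.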
