Seg fault while trying to print strings in a loop - string

I'm currently learning Zig (having very little experience in C) and I'm doing some experiments with strings to make sure I understand the concept properly.
For example, since strings are arrays of u8 I figured out that I can print the character 'C' using the following code:
std.debug.print("{}", .{[_]u8{67}});
I then tried to make a loop in order to print some basic characters with codes ranging from 33 to 126:
var i: u8 = 33;
while (i < 127) {
std.debug.print("{}", .{[_]u8{i}});
i += 1;
}
But when I run it the following error happens:
Segmentation fault at address 0x202710
/home/cassidy/learning-zig/hello-world/src/main.zig:9:39: 0x22ae15 in main (main)
std.debug.print("{}", .{[_]u8{i}});
^
/snap/zig/2222/lib/zig/std/start.zig:272:37: 0x204b9d in std.start.posixCallMainAndExit (main)
const result = root.main() catch |err| {
^
/snap/zig/2222/lib/zig/std/start.zig:143:5: 0x2048df in std.start._start (main)
@call(.{ .modifier = .never_inline }, posixCallMainAndExit, .{});
^
[1] 24766 abort (core dumped) ./main
The strangest thing is that when I change the code a little to create a variable holding the u8 array, then it works as intended:
var i: u8 = 33;
while (i < 127) {
const c = [_]u8{i};
std.debug.print("{}", .{c});
i += 1;
}
Returns:
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Could someone explain why the second code triggers a seg fault?

After asking on the Zig Discord it appears that it is indeed a bug. Hopefully it will be fixed soon since the language is still under heavy development!

What you are experiencing is a bug, but it is still possible to print characters.
The standard way is to use {c} to print a character (or, once #6390 is merged, {u} to print a Unicode codepoint as UTF-8):
std.debug.print("{c}", .{i});
It is also possible to print [_]u8{i} if you take its address first, so that the argument is a *const [1]u8 instead of a [1]u8:
std.debug.print("{}", .{&[_]u8{i}});


Make msvc C4706 go away without pragmas

The following code makes MSVC generate a warning about an assignment in a conditional expression.
https://godbolt.org/z/i_rwY9
int main()
{
int a;
if ((a = 5)) {
return 1;
}
return a;
}
Note that I tried wrapping the condition in double parentheses, since that makes the warning go away with g++, but I do not know how to make it go away in MSVC without moving the assignment out of the condition.
Is there a way to nudge MSVC into recognizing that this assignment is intentional?
I know I can use pragmas to disable this warning, but the pattern is very common, so I would like a solution without pragmas if one exists.
The MSVC compiler will give this warning unless you can convince it that you really do know what you're doing. Adding at least one 'real' logical test will achieve this:
int main()
{
int a;
if ((a = 5) != 0) {
return 1;
}
return a;
}
Note that the constant 5 can readily be replaced with any variable or valid expression: adding the explicit != 0 test does nothing to actually change the outcome of the code (and it is unlikely to change the generated assembly).
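The pattern turns up most often in loops, so here is a minimal sketch of the same idea there. count_chars is a hypothetical helper made up for illustration, not something from the question. The assignment stays inside the condition, and the explicit comparison against '\0' plays the role of the 'real' logical test, which per the above should keep C4706 quiet.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: counts the characters before
 * the terminating null byte, like a hand-written strlen. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    char c;

    /* Assignment inside the condition, followed by an explicit test.
     * The comparison does not change the outcome; it only states the
     * intent. */
    while ((c = *s++) != '\0') {
        ++n;
    }
    return n;
}
```

For example, count_chars("abc") yields 3 and count_chars("") yields 0.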

Random segmentation fault in D lang on switch break

I was debugging a fairly simple program written in D, that seems to have a random chance to receive a SEGV signal.
Upon further inspection I observed that using different compilers and build modes yielded different results.
Results of my tests:
DMD Debug = works 99% of the time
DMD Release = 50/50
LDC Debug = 50/50
LDC Release = 50/50
Because the binary from the default compiler (DMD) crashed only once I couldn't really debug it, and release mode didn't help either due to lack of debug symbols.
Building the binary with LDC in debug mode let me test it with gdb and valgrind. To summarize what I gathered:
Relevant information from valgrind:
Invalid read of size 4 # ctor in file video.d line 46
Access not within mapped region at address 0x0 # ctor in file video.d line
Gdb doesn't give me any more insight, 3 stack frames, of which only 0th is of interest, backtrace of frame 0 shows file video.d line 46 which is a break statement, so what now?
This is the snippet of code producing a seg fault
module video;
import ffmpeg.libavformat.avformat;
import ffmpeg.libavcodec.avcodec;
import ffmpeg.libavutil.avutil;
class Foo
{
private
{
AVFormatContext* _format_ctx;
AVStream* _stream_video;
AVStream* _stream_audio;
}
...
public this(const(string) path)
{
import std.string : toStringz;
_format_ctx = null;
enforce(avformat_open_input(&_format_ctx, path.toStringz, null, null) == 0);
scope (failure) avformat_close_input(&_format_ctx);
enforce(avformat_find_stream_info(_format_ctx, null) == 0);
debug av_dump_format(_format_ctx, 0, path.toStringz, 0);
foreach (i; 0 .. _format_ctx.nb_streams)
{
AVStream* stream = _format_ctx.streams[i];
if (stream == null)
continue;
enforce (stream.codecpar != null);
switch (stream.codecpar.codec_type)
{
case AVMediaType.AVMEDIA_TYPE_VIDEO:
_stream_video = stream;
break;
case AVMediaType.AVMEDIA_TYPE_AUDIO:
_stream_audio = stream;
break;
default:
stream.discard = AVDiscard.AVDISCARD_ALL;
break; // Magic line 46
}
}
}
}
// Might contain spelling errors, had to write it by hand.
So does anyone have an idea what causes this behaviour, or more precisely how to go about fixing it?
Try checking the validity of _stream_audio:
default:
enforce( _stream_audio, new Exception( "_stream_audio is null" ))
.discard = AVDiscard.AVDISCARD_ALL;
break; // Magic line 46
You are not heeding the warning in the toStringz documentation:
“Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may become invalid during a garbage collection cycle and cause a nasty bug when the C code tries to use it.”
This may not be the cause of your problem, but the way you use toStringz is risky.

How to add pointer char datas (created using malloc) to a char array in C?

In my MPI code in C, I'm receiving a word from each of my slave processes. I want to add all these words to a char array on the master side (part of the code is below). I can print these words, but I cannot collect them into a single char array.
(I take the maximum word length to be 10, and the number of slaves to be slavenumber.)
char* word = (char*)malloc(sizeof(char)*10);
char words[slavenumber*10];
for (int p = 0; p<slavenumber; p++){
MPI_Recv(word, 10, MPI_CHAR, p, 0,MPI_COMM_WORLD, MPI_STATUS_IGNORE);
printf("Word: %s\n", word); //it works fine
words[p*10] = *word; //This does not work, i think there is a problem here.
}
printf(words); //This does not work correctly, it gives something like: ��>;&�>W�
Can anybody help me on this?
Let's break it down line by line
// allocate a buffer large enough to hold 10 elements of type `char`
char* word = (char*)malloc(sizeof(char)*10);
// define a variable-length-array large enough to
// hold 10*slavenumber elements of `char`
char words[slavenumber*10];
for (int p = 0; p<slavenumber; p++){
// dereference `word` which is exactly the same as writing
// `word[0]` assigning it to `words[p*10]`
words[p*10] = *word;
// words[p*10+1] to words[p*10+9] are unchanged,
// i.e. uninitialized
}
// printing from an array. For this to work properly all
// accessed elements must be initialized and the buffer
// terminated by a null byte. You have neither
printf(words);
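Because you left elements uninitialized and didn't null terminate, you're invoking undefined behavior. Be happy that you didn't get demons crawling out of your nose.
In all seriousness, though: in C you cannot copy strings by mere assignment. Your use case calls for strncpy, which copies up to 10 characters of the received word into its slot. Keep in mind that strncpy does not write a terminating null byte when the source fills all 10 characters.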
for (int p = 0; p<slavenumber; p++){
strncpy(&words[p*10], word, 10);
}
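To make the termination issue concrete, here is a minimal sketch that leaves MPI out entirely. store_word, join_words and WORD_SLOT are hypothetical names introduced for illustration. It assumes each word arrives as a null-terminated string and that a 10-byte slot may hold at most 9 characters plus the terminator, so longer words are truncated.

```c
#include <stddef.h>
#include <string.h>

#define WORD_SLOT 10  /* assumed slot width: 9 characters plus '\0' */

/* Copies one received word into slot `index` of `words` and always
 * null-terminates the slot, truncating words longer than 9 characters. */
void store_word(char *words, size_t index, const char *word)
{
    char *slot = words + index * WORD_SLOT;
    strncpy(slot, word, WORD_SLOT - 1);
    slot[WORD_SLOT - 1] = '\0';
}

/* Joins `count` slots into `out`, separated by single spaces.
 * `out` must have room for count * WORD_SLOT bytes (at least 1). */
void join_words(const char *words, size_t count, char *out)
{
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        if (i > 0) {
            strcat(out, " ");
        }
        strcat(out, words + i * WORD_SLOT);
    }
}
```

In the loop from the question, store_word(words, p, word) would go right after the MPI_Recv call. Once every slot is filled, join_words produces one printable string, which is better passed to printf as an argument to "%s" than used as the format string itself.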

Read a String with spaces till a new line in C

I am in a pickle right now. I'm having trouble reading input such as this:
1994 The Shawshank Redemption
1994 Pulp Fiction
2008 The Dark Knight
1957 12 Angry Men
I first read the number into an integer, then I need to read the name of the movie into a string using a character array. However, I have not been able to get this done.
Here is the code at the moment:
while(scanf("%d", &myear) != EOF)
{
i = 0;
while(scanf("%[^\n]", &ch))
{
title[i] = ch;
i++;
}
addNode(makeData(title,myear));
}
The title array is arbitrarily large, and the function adds the data as a node to a linked list. Right now the output I keep getting for each node is as follows:
" hank Redemption"
" ion"
" Knight"
" Men"
Yes, it oddly prints a space in front of the cut-off title. I checked the variables and it adds the space in the data. (I am not printing the year as that is taken in correctly)
How can I fix this?
You are passing the wrong type of argument to scanf(): instead of scanning into a single character, scan directly into the string buffer. %[^\n] scans an entire string up to (but not including) the newline; it does not scan only one character.
(Secondary problem: comparing the result of scanf() against EOF alone is not a reliable loop condition. scanf() returns EOF only when input ends or fails before the first conversion; when the input does not match the format it returns 0, so the loop never terminates. Compare the return value against the number of conversions you expect instead.)
I hope you see now: scanf() is hard to get right. It's evil. Why not input the whole line at once then parse it using sane functions?
char buf[LINE_MAX];
while (fgets(buf, sizeof buf, stdin) != NULL) {
int year = strtol(buf, NULL, 0);
const char *p = strchr(buf, ' ');
if (p != NULL) {
char name[LINE_MAX];
strcpy(name, p + 1); // safe because strlen(p) <= sizeof(name)
}
}
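A slightly fuller sketch of the line-based approach, with the assumptions stated up front. parse_movie_line is a hypothetical helper, not part of the question's code. It assumes every line looks like "year, space, title", reads the year in base 10, and drops the trailing newline that fgets() leaves in the buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits a line such as "1994 Pulp Fiction\n" into
 * a year and a title. Returns 1 on success and 0 if the line is
 * malformed or the title does not fit into `title`. */
int parse_movie_line(const char *line, int *year, char *title, size_t title_size)
{
    char *end;
    long value = strtol(line, &end, 10);

    /* No digits were read, or the year is not followed by a space. */
    if (end == line || *end != ' ') {
        return 0;
    }
    while (*end == ' ') {
        end++;  /* skip the separator so the title has no leading space */
    }

    /* Length of the title, excluding the newline kept by fgets(). */
    size_t len = strcspn(end, "\n");
    if (len == 0 || len >= title_size) {
        return 0;
    }

    memcpy(title, end, len);
    title[len] = '\0';
    *year = (int)value;
    return 1;
}
```

Inside the fgets() loop you would call parse_movie_line(buf, &myear, title, sizeof title) and only call addNode(makeData(title, myear)) when it returns 1. Because the whole line is consumed each time, a title that itself starts with a number, such as "12 Angry Men", is handled correctly.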

C++'s char * by swig got problem in Python 3.0

Our C++ lib works fine with Python 2.4 using SWIG, returning a C++ char* to Python as a str. But this solution runs into a problem in Python 3.0. The error is:
Exception=(, UnicodeDecodeError('utf8', b"\xb6\x9d\xa.....",0, 1, 'unexpected code byte')
Our definition looks like this (it works fine in Python 2.4):
void cGetPubModulus(
void* pSslRsa,
char* cMod,
int* nLen );
%include "cstring.i"
%cstring_output_withsize( char* cMod, int* nLen );
I suspect SWIG is doing a bytes->str conversion automatically. In Python 2.4 it can be implicit, but in Python 3.0 it is no longer allowed. Does anyone have a good idea? Thanks.
It's rather Python 3 that does that conversion. In Python 2 bytes and str are the same thing, in Python 3 str is unicode, so something somewhere tries to convert it to Unicode with UTF8, but it's not UTF8.
Your Python 3 code needs to return not a Python str, but a Python bytes. This will not work with Python 2, though, so you need preprocessor statements to handle the differences.
I came across a similar problem. I wrote a SWIG typemap for a custom char array (an unsigned char array, in fact) and it segfaulted under Python 3. So I debugged the code within the typemap and realized the problem was the one Lennart describes.
My solution to that problem was doing the following in that typemap:
%typemap(in) byte_t[MAX_FONTFACE_LEN] {
if (PyString_Check($input))
{
$1 = (byte_t *)PyString_AsString($input);
}
else if (PyUnicode_Check($input))
{
$1 = (byte_t *)PyUnicode_AsEncodedString($input, "utf-8", "Error ~");
$1 = (byte_t *)PyBytes_AS_STRING($1);
}
else
{
PyErr_SetString(PyExc_TypeError,"Expected a string.");
return NULL;
}
}
That is, I check what kind of string object the PyObject is. PyString_Check() and PyUnicode_Check() return nonzero if their input is a byte string or a Unicode string respectively. If it is a Unicode string, PyUnicode_AsEncodedString() converts it to a bytes object, and PyBytes_AS_STRING() then gives us a char * pointing at those bytes.
Note that I reuse the same variable to hold the encoded bytes object and, afterwards, the pointer to its data. That is questionable and could start a coding-style discussion of its own, but it solved my problem. I have tested it with python3 and python2.7 binaries without any problems so far.
Lastly, the else branch raises an exception in the Python call to report that the input was neither a byte string nor a Unicode string.
