Strange behaviour, or maybe a bug, in sprintf() in C++

I just found something strange about sprintf() (the C/C++ library function).
Have a look at these two solutions:
Time Limit Exceeded solution
Accepted solution
The only difference between them is that I used
sprintf(a,"%d%c",n,'\0');
in the TLE solution, while in the AC solution I replaced that sprintf() call with
sprintf(a,"%d",n);
You can also see that the AC solution took only 0.01s and 2.8MB of memory, whereas the TLE solution took around 11.8MB
(check here).
One more thing: the program that gave TLE runs in 0s on IDEONE with extreme input data, so is this a bug in CODECHEF itself?
Could somebody please explain whether this is a bug, or whether some significant operation I am unaware of is happening here?
Thanks in advance.

First of all, the difference between the two programs is in sprintf, not sscanf. A diff of the code shows it:
--- ac.c 2014-05-24 14:31:18.074977661 -0500
+++ tle.c 2014-05-24 14:30:52.270650109 -0500
@@ -4,7 +4,7 @@
string mul(string m, int n){
char a[4];
-sprintf(a,"%d",n);
+sprintf(a,"%d%c",n,'\0');
int l1 = strlen(a);
//printf("len : %d\n",l1);
int l2 = m.length();
Second, by explicitly packing the string with %c and '\0', you are reducing the largest integer that can be stored in a by one digit. You need to check the return value of sprintf. From man printf:
Upon successful return, these functions return the number of characters printed (not including the trailing '\0' used to end output to strings).
In your case you are most likely writing beyond the end of your a[4] array, which is undefined behavior. With a[4] you have space for at most 999\0. When you explicitly add %c with '\0', you reduce that to 99\0\0: sprintf writes the digits, then your explicit '\0', then its own terminating '\0'. If your number exceeds 99, sprintf will write beyond the end of the array. In the original case, sprintf(a,"%d",n);, 999 can be stored without issue, relying on sprintf to append '\0' as a[3].
Test with n = 9999: sprintf will still write the digits starting at a, but it returns 4 for "%d" (5 bytes written, counting the terminator) and 5 for "%d%c" (6 bytes written). Both exceed the 4 bytes available in a, meaning that the behavior of your code is a dice-roll at that point.
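To make the byte counting concrete, here is a small Python sketch (a model of the arithmetic, not C and not the poster's code). The names BUF_SIZE and bytes_written are made up for illustration, and the sketch assumes that Python's str(n) produces the same text as C's %d for ordinary integers.

```python
BUF_SIZE = 4  # matches `char a[4]` in the question


def bytes_written(n, extra_nul=False):
    """Model how many bytes sprintf stores for "%d", or for "%d%c" with a NUL.

    Counts the decimal text of n, the optional explicit NUL passed to %c,
    and the terminating NUL that sprintf always appends.
    """
    text_length = len(str(n))  # digits plus a '-' sign if n is negative
    return text_length + (1 if extra_nul else 0) + 1


for n in (99, 100, 999, 1000):
    plain = bytes_written(n)
    packed = bytes_written(n, extra_nul=True)
    # Print the byte count and whether it fits in the buffer for each format.
    print(n, plain, plain <= BUF_SIZE, packed, packed <= BUF_SIZE)
```

Under this model, 999 needs exactly 4 bytes with "%d", while 100 already needs 5 bytes with "%d%c", which is one more than a[4] can hold.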

Related

XOR two strings of different length

So I am trying to XOR two strings together but am unsure if I am doing it correctly when the strings are different length.
The method I am using is as follows.
def xor_two_str(a,b):
    xored = []
    for i in range(max(len(a), len(b))):
        xored_value = ord(a[i%len(a)]) ^ ord(b[i%len(b)])
        xored.append(hex(xored_value)[2:])
    return ''.join(xored)
I get output like so.
abc XOR abc: 000
abc XOR ab: 002
ab XOR abc: 5a
space XOR space: 0
I know something is wrong and I will eventually want to convert the hex value to ascii so am worried the foundation is wrong. Any help would be greatly appreciated.
Your code looks mostly correct (assuming the goal is to reuse the shorter input by cycling back to the beginning), but your output has a minor problem: it's not fixed width per character, so you could get the same output from two pairs of characters with a small (< 16) difference as from a single pair of characters with a large difference.
Assuming you're only working with "bytes-like" strings (all inputs have ordinal values below 256), you'll want to pad your hex output with zeroes to a fixed width of two, changing:
xored.append(hex(xored_value)[2:])
to:
xored.append('{:02x}'.format(xored_value))
which saves a temporary string (hex + slice makes the longer string then slices off the prefix, when format strings can directly produce the result without the prefix) and zero-pads to a width of two.
There are other improvements possible for more Pythonic/performant code, but that should be enough to make your code produce usable results.
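As a quick illustration of the ambiguity, the sketch below compares the two formatting styles on hand-picked XOR results. The helper names to_hex_unpadded and to_hex_padded and the sample values are invented for this example.

```python
def to_hex_unpadded(values):
    # Same formatting as the original code: hex() with the '0x' prefix cut off.
    return ''.join(hex(v)[2:] for v in values)


def to_hex_padded(values):
    # Fixed width of two hex digits per value, zero-padded.
    return ''.join('{:02x}'.format(v) for v in values)


one_pair = [0x12]         # one XOR result with a large value
two_pairs = [0x01, 0x02]  # two XOR results with small values

print(to_hex_unpadded(one_pair), to_hex_unpadded(two_pairs))
print(to_hex_padded(one_pair), to_hex_padded(two_pairs))

# The padded form can be turned back into the original values.
print(list(bytes.fromhex(to_hex_padded(two_pairs))))
```

Without padding, both inputs produce the text 12, so the original values cannot be recovered; with padding they become 12 and 0102, which bytes.fromhex can decode unambiguously.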
Side-note: When running your original code, xor_two_str('abc', 'ab') and xor_two_str('ab', 'abc') both produced the same output, 002, which is what you'd expect (since XOR is commutative and you cycle the shorter input, reversing the arguments to any call should produce the same result). Not sure why you think it produced 5a. My fixed code just makes the outputs 000000, 000002, 000002, and 00: padded properly, but otherwise unchanged from your results.
As far as other improvements to make, manually converting character by character, and manually cycling the shorter input via remainder-and-indexing is a surprisingly costly part of this code, relative to the actual work performed. You can do a few things to reduce this overhead, including:
Convert from str to bytes once, up-front, in bulk (runs in roughly one seventh the time of the fastest character by character conversion)
Determine up front which string is shortest, and use itertools.cycle to extend it as needed, and zip to directly iterate over paired byte values rather than indexing at all
Together, this gets you:
from itertools import cycle
def xor_two_str(a,b):
    # Convert to bytes so we iterate by ordinal, determine which is longer
    short, long = sorted((a.encode('latin-1'), b.encode('latin-1')), key=len)
    xored = []
    for x, y in zip(long, cycle(short)):
        xored_value = x ^ y
        xored.append('{:02x}'.format(xored_value))
    return ''.join(xored)
or to make it even more concise/fast, we just make the bytes object without converting to hex (and just for fun, use map+operator.xor to avoid the need for Python level loops entirely, pushing all the work to the C layer in the CPython reference interpreter), then convert to hex str in bulk with the (new in 3.5) bytes.hex method:
from itertools import cycle
from operator import xor
def xor_two_str(a,b):
    short, long = sorted((a.encode('latin-1'), b.encode('latin-1')), key=len)
    xored = bytes(map(xor, long, cycle(short)))
    return xored.hex()
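Since the question mentions eventually converting the hex back to ASCII, here is a hedged sketch of the round trip. It relies on XOR being its own inverse, and assumes the shorter string acts as a repeating key that you still have when decoding. The helper xor_bytes and the sample message and key are made up for this example; unlike xor_two_str, it always cycles its second argument.

```python
from itertools import cycle


def xor_bytes(data, key):
    # XOR every byte of data with the key, repeating the key as needed.
    return bytes(x ^ y for x, y in zip(data, cycle(key)))


message = 'hello world'.encode('latin-1')
key = 'key'.encode('latin-1')

# Encode: XOR, then render as fixed-width hex (two digits per byte).
hex_text = xor_bytes(message, key).hex()

# Decode: parse the hex back into bytes, then XOR with the same key again.
recovered = xor_bytes(bytes.fromhex(hex_text), key).decode('latin-1')
print(recovered)
```

This only works because the hex text uses exactly two digits per byte; the unpadded output from the original code could not be parsed back reliably.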

How to enter two strings together with gets();

I was wondering if there is a way to enter a string together like how we enter cin>>a>>b;
How can we do the same using gets for strings?
This is three months late, but it's probably worth writing since the only other answer is wildly off.
First, and most importantly, don't use gets(). There is no way to use it safely: it might as well be called overrun_my_buffer_and_crash_my_program_if_I_am_lucky(). There's a reason it's been removed from the latest C standard.
Second, cin >> a >> b; reads two whitespace-delimited strings, such as two words in a sentence. gets() and its much safer cousin fgets() read in a whole line, delimited by the newline character '\n' (note that they do not stop at other whitespace characters!). It is therefore much harder to replicate the behavior of cin with them: you'll have to do all the parsing yourself. (This is not to say it can't be done; if you have to do it in C, check out strtok().)
The C equivalent of cin >> a >> b; is scanf():
char a[80], b[80];
scanf("%79s %79s", a, b);
However, note that you can only read in strings up to a fixed maximum length (in this example, 79 characters). It's fairly complicated to safely read in a string of arbitrary length in C; you'll need to read it in fixed-size chunks and allocate sufficient memory to combine the chunks together.
If your input strings are on separate lines (i.e. separated by "\n"), then it will simply be:
gets(s1);
gets(s2);
Another way is to use fgets(char* str, int num, FILE* stream). It can be called twice, once for str1 and once for str2, and reads input from stream into the string until either num-1 characters have been read or a newline ("\n") or the end of the file is reached. This function is strongly recommended, as it is safer than gets.
There is also a third, much more complicated way of doing this: you can define your own string class (inherited from std::string) and define an operator there that stands in for gets() or fgets(); but this option is really too complicated and not useful for your situation.
If it is necessary for you to use one function, then you can create an overloaded gets() for two strings:
void gets(char * str1, char * str2)
{
gets(str1);
gets(str2);
}
The same trick can be done with fgets. But again, isn't it too complicated for you just to make 2 calls of gets() or fgets()?

Parsing strings in Fortran

I am reading from a file in Fortran which has an undetermined number of floating point values on each line (for now, there are about 17 values on a line). I would like to read the 'n'th value on each line into a given floating point variable. How should I go about doing this?
In C the way I wrote it was to read the entire line onto the string and then do something like the following:
for(int il = 0; il < l; il++)
{
for(int im = -il; im <= il; im++)
pch = strtok(NULL, "\t ");
}
for(int im = -l; im <= m; im++)
pch = strtok(NULL, "\t ");
dval = atof(pch);
Here I am continually reading a value and throwing it away (thus shortening the string) until I am ready to accept the value I am trying to read.
Is there any way I can do this in Fortran? Is there a better way to do this in Fortran? The problem with my Fortran code seems to be that read(tline, '(f10.15)') tline1 does not shorten tline (tline is my string holding the entire line and tline1 is what I am trying to parse it into), so I cannot use the same method as I did in my C routine.
Any help?
The issue is that Fortran is a record-based I/O system while C is stream-based.
If you have access to a Fortran 2003 compliant compiler (modern versions of gfortran should work), you can use the stream ACCESS specifier to do what you want.
An example can be found here.
Of course, if you were really inclined, you could just use your C function directly from Fortran. Interfacing the two languages is generally simple, typically only requiring a wrapper with a lowercase name and an appended underscore (depending on compiler and platform of course). Passing arrays or strings back and forth is not so trivial typically; but for this example that wouldn't be needed.
Once the data is in a character array, you can read it into another variable as you are doing, with the ADVANCE="no" specifier, i.e.
do i = 1, numberIWant
read(tline, '(F10.15)', ADVANCE="no") tline1
end do
where tline1 should contain your number at the end of the loop.
Because of the record-based I/O, a READ statement will typically throw out what is after the end of the record. But ADVANCE="no" tells it not to.
If you know exactly at what position the value you want starts, you can use the T edit descriptor to initiate the next read from that position.
Let's say, for instance, that the width of each field is 10 characters and you want to read the fifth value. The read statement will then look something like the following.
read(file_unit, '(t41, f10.5)') value1
P.S.: You can dynamically create a format string at runtime, with the correct number after the t, by using a character variable as the format and an internal file write to put this number into it.
Let's say you want the value that starts at position n. It will then look something like this (I alternated between single and double quotes to try to make it more clear where each string starts and stops):
write(my_format, '(a, i0, a)') "(t", n, ', f10.5)'
read(file_unit, my_format) value1

Convert an integer to a string

I am trying to learn assembler and want to write a function to convert a number to a string. The signature of the function I want to write would look like this in a C-like fashion:
int numToStr(long int num, unsigned int bufLen, char* buf)
The function should return the number of bytes that were used if conversion was successful, and 0 otherwise.
My current approach is a simple algorithm. In all cases, if the buffer is full, return 0.
1. Check if the number is negative. If it is, write a - char into buf[0] and increment the current place in the buffer.
2. Repeatedly divide by 10 and store the remainders in the buffer, until the division yields 0.
3. Reverse the number in the buffer.
Is this the best way to do this conversion?
This is pretty much how every single implementation of itoa that I've seen works.
One thing that you don't mention but do want to take care of is bounds checking (i.e. making sure you don't write past bufLen).
With regards to the sign: once you've written the -, you need to negate the value. Also, the - needs to be excluded from the final reversal; an alternative is to remember the sign at the start but only write it at the end (just before the reversal).
One final corner case is to make sure that zero gets written out correctly, i.e. as 0 and not as an empty string.
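The points above can be sketched in Python to show the control flow before translating it to assembler. This is only an illustration under stated assumptions: the name num_to_str, the use of a bytearray as the buffer, and taking the buffer length from len(buf) instead of a separate bufLen argument are all choices made for this example.

```python
def num_to_str(num, buf):
    """Write the decimal text of num into buf (a bytearray).

    Returns the number of bytes used, or 0 if buf is too small.
    """
    pos = 0
    if num < 0:
        if pos >= len(buf):   # bounds check before every write
            return 0
        buf[pos] = ord('-')
        pos += 1
        num = -num            # negate once the sign has been written
    start = pos               # the sign is excluded from the reversal
    while True:               # loop runs at least once, so 0 yields "0"
        if pos >= len(buf):
            return 0
        num, digit = divmod(num, 10)
        buf[pos] = ord('0') + digit
        pos += 1
        if num == 0:
            break
    # Digits were produced least significant first; reverse them in place.
    buf[start:pos] = buf[start:pos][::-1]
    return pos


buf = bytearray(8)
used = num_to_str(-1203, buf)
print(used, buf[:used].decode('ascii'))
```

One difference to keep in mind: Python integers do not overflow, whereas negating the most negative value of a fixed-width register does, so an assembler version needs to handle that case separately.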

Modifying a character in a string in Lua

Is there any way to replace a character at position N in a string in Lua?
This is what I've come up with so far:
function replace_char(pos, str, r)
return str:sub(pos, pos - 1) .. r .. str:sub(pos + 1, str:len())
end
str = replace_char(2, "aaaaaa", "X")
print(str)
I can't use gsub either as that would replace every capture, not just the capture at position N.
Strings in Lua are immutable. That means, that any solution that replaces text in a string must end up constructing a new string with the desired content. For the specific case of replacing a single character with some other content, you will need to split the original string into a prefix part and a postfix part, and concatenate them back together around the new content.
This variation on your code:
function replace_char(pos, str, r)
return str:sub(1, pos-1) .. r .. str:sub(pos+1)
end
is the most direct translation to straightforward Lua. It is probably fast enough for most purposes. I've fixed the bug that the prefix should be the first pos-1 chars, and taken advantage of the fact that if the last argument to string.sub is missing it is assumed to be -1 which is equivalent to the end of the string.
But do note that it creates a number of temporary strings that will hang around in the string store until garbage collection eats them. The temporaries for the prefix and postfix can't be avoided in any solution. But this also has to create a temporary for the first .. operator to be consumed by the second.
It is possible that one of two alternate approaches could be faster. The first is the solution offered by Paŭlo Ebermann, but with one small tweak:
function replace_char2(pos, str, r)
return ("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1))
end
This uses string.format to do the assembly of the result in the hopes that it can guess the final buffer size without needing extra temporary objects.
But do beware that string.format is likely to have issues with any \0 characters in any string that it passes through its %s format. Specifically, since it is implemented in terms of standard C's sprintf() function, it would be reasonable to expect it to terminate the substituted string at the first occurrence of \0. (Noted by user Delusional Logic in a comment.)
A third alternative that comes to mind is this:
function replace_char3(pos, str, r)
return table.concat{str:sub(1,pos-1), r, str:sub(pos+1)}
end
table.concat efficiently concatenates a list of strings into a final result. It has an optional second argument which is text to insert between the strings, which defaults to "" which suits our purpose here.
My guess is that unless your strings are huge and you do this substitution frequently, you won't see any practical performance differences between these methods. However, I've been surprised before, so profile your application to verify there is a bottleneck, and benchmark potential solutions carefully.
You should use pos inside your function instead of literal 1 and 3, but apart from this it looks good. Since Lua strings are immutable you can't really do much better than this.
Maybe
("%s%s%s"):format(str:sub(1,pos-1), r, str:sub(pos+1, str:len()))
is more efficient than the .. operator, but I doubt it - if it turns out to be a bottleneck, measure it (and then decide to implement this replacement function in C).
With LuaJIT, you can use the FFI library to cast the string to an array of unsigned chars:
local ffi = require 'ffi'
txt = 'test'
ptr = ffi.cast('uint8_t*', txt)
ptr[1] = string.byte('o')

Resources