Read Hex file and append string matlab - string

I am reading an H.264 bitstream as a hex file in MATLAB. I want to insert a string whenever a certain condition is met. For example, as in the attached image, if the hex value 00 00 00 01 occurs anywhere in the file, I want to add a string such as ABC before the 00 00 00 01. String comparison is easy, but how do I do a hex comparison?
Here is my code for reading the file as hex:
f = fopen(theFile);
if f==-1
return
end
c = fread(f);
theSize=prod(size((c)));
c=sprintf('%02x\n',c);
c(3:3:end)='';
m=floor(length(c)/nChars);
hex='';
hex=reshape(c(1:m*nChars),nChars,m)';
if mod(length(c),nChars)
hex=strvcat(hex,c(m*nChars+1:end));
end
More specifically, I want this C++ (Qt) code converted to MATLAB:
QByteArray data, basePattern;
basePattern.resize(3);
//start code:
basePattern[0] = (char) 0x00;
basePattern[1] = (char) 0x00;
basePattern[2] = (char) 0x01;
char end1 = 0x25, end2 = 0x45, end3 = 0x65;
x = myfile;//read using fopen
if (x == end1 || x == end2) {
}

Hex values are really just integers:
x = uint8(hex2dec({'01', '02', '0A', '0B', '25', '45', '65', '00', '01', 'AA'}))
x =
1
2
10
11
37
69
101
0
1
170
And they can be compared directly:
x(3) == uint8(hex2dec('0a'))
ans =
1
So, putting it all together: create a new buffer and search through the bytes for the pattern. Where the pattern is found, insert your data followed by the pattern; otherwise, just append the current byte:
pat0 = uint8(hex2dec('00'));
pat1 = uint8(hex2dec('00'));
pat2 = uint8(hex2dec('01'));
my_data_to_insert = uint8('ABC');   % bytes to insert before each match
data = fread(f, Inf, '*uint8').';   % the uint8 array read in from the file (f from fopen), as a row vector
new_data = uint8([]);
pos = 1;
while pos <= length(data) - 2
    if data(pos) == pat0 && data(pos+1) == pat1 && data(pos+2) == pat2
        % insert new data buffer and append pattern
        new_data = [new_data my_data_to_insert pat0 pat1 pat2];
        pos = pos + 3;
    else
        % append
        new_data = [new_data data(pos)];
        pos = pos + 1;
    end
end
% append whatever is left (at most 2 bytes, too short to hold the pattern)
new_data = [new_data data(pos:end)];
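For comparison, the same scan-and-insert idea can be sketched in Python. This is only an illustration under assumptions: the function name insert_before_pattern and the sample bytes are made up here, and the pattern is the 3-byte start code 00 00 01 used in the answer above.

```python
def insert_before_pattern(data: bytes, pattern: bytes, payload: bytes) -> bytes:
    """Return a copy of data with payload inserted before every match of pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    out = bytearray()
    pos = 0
    while pos < len(data):
        if data.startswith(pattern, pos):
            # Match: emit the payload, then the pattern itself, and skip past it.
            out += payload + pattern
            pos += len(pattern)
        else:
            # No match at this offset: copy one byte and move on.
            out.append(data[pos])
            pos += 1
    return bytes(out)


START_CODE = bytes.fromhex("000001")         # hypothetical choice of pattern
sample = bytes.fromhex("0000012500000165")   # made-up sample bytes
result = insert_before_pattern(sample, START_CODE, b"ABC")
print(result.hex(" "))
```

Collecting the output in a bytearray avoids reallocating the whole buffer on every appended byte, which is what the repeated concatenation in a loop would do.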

Related

How to find the lexicographically smallest string by reversing a substring?

I have a string S which consists of a's and b's. Perform the below operation once. Objective is to obtain the lexicographically smallest string.
Operation: Reverse exactly one substring of S
e.g.
if S = abab then Output = aabb (reverse ba of string S)
if S = abba then Output = aabb (reverse bba of string S)
My approach
Case 1: If all characters of the input string are same then output will be the string itself.
Case 2: if S is of the form aaaaaaa....bbbbbb.... then answer will be S itself.
Otherwise: find the first occurrence of b in S, say at position i. String S will look like
aa...bbb...aaaa...bbbb....aaaa....bbbb....aaaaa...
     |
     i
In order to obtain the lexicographically smallest string the substring that will be reversed starts from index i. See below for possible ending j.
aa...bbb...aaaa...bbbb....aaaa....bbbb....aaaaa...
     |           |               |               |
     i           j               j               j
Reverse substring S[i:j] for every j and find the smallest string.
The complexity of the algorithm will be O(|S|*|S|) where |S| is the length of the string.
Is there a better way to solve this problem? Probably O(|S|) solution.
What I am thinking if we can pick the correct j in linear time then we are done. We will pick that j where number of a's is maximum. If there is one maximum then we solved the problem but what if it's not the case? I have tried a lot. Please help.
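The quadratic baseline described above can be sketched as follows. This is a minimal sketch, not an optimized solution: the function name is made up for illustration, and it tries every ending position j that holds an a rather than only the ends of runs, which changes nothing about the result.

```python
def smallest_after_one_reversal(s: str) -> str:
    """Brute-force baseline: fix i at the first 'b', try every later 'a' as the end j."""
    i = s.find("b")
    if i == -1:
        return s  # only a's: nothing can be improved
    best = s      # reversing a single character leaves s unchanged
    for j in range(i + 1, len(s)):
        if s[j] == "a":
            candidate = s[:i] + s[i:j + 1][::-1] + s[j + 1:]
            if candidate < best:
                best = candidate
    return best


print(smallest_after_one_reversal("abab"))
print(smallest_after_one_reversal("abba"))
```

Each candidate costs O(|S|) to build and compare, and there are up to |S| candidates, which gives the O(|S|*|S|) bound mentioned in the question. A function like this is also handy as a reference when testing a faster algorithm.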
So, I came up with an algorithm that seems to be more efficient than O(|S|^2), but I'm not quite sure of its complexity. Here's a rough outline:
1. Strip off the leading a's, storing them in the variable start.
2. Group the rest of the string into letter chunks.
3. Find the indices of the groups with the longest sequences of a's.
4. If only one index remains, proceed to 10.
5. Filter these indices so that the length of the [first] group of b's after reversal is at a minimum.
6. If only one index remains, proceed to 10.
7. Filter these indices so that the length of the [first] group of a's (not including the leading a's) after reversal is at a maximum.
8. If only one index remains, proceed to 10.
9. Go back to 5, except inspect the [second/third/...] groups of a's and b's this time.
10. Return start, plus the reversed groups up to index, plus the remaining groups.
Since any substring that is being reversed begins with a b and ends in an a, no two hypothesized reversals are palindromes and thus two reversals will not result in the same output, guaranteeing that there is a unique optimal solution and that the algorithm will terminate.
My intuition says this approach is probably O(log(|S|)*|S|), but I'm not too sure. An example implementation in Python (albeit not a very good one) is provided below.
from itertools import groupby

def get_next_bs(i, groups, off):
    d = 1 + 2*off
    before_bs = len(groups[i-d]) if i >= d else 0
    after_bs = len(groups[i+d]) if i <= d and len(groups) > i + d else 0
    return before_bs + after_bs

def get_next_as(i, groups, off):
    d = 2*(off + 1)
    return len(groups[d+1]) if i < d else len(groups[i-d])

def maximal_reversal(s):
    # example input: 'aabaababbaababbaabbbaa'
    first_b = s.find('b')
    start, rest = s[:first_b], s[first_b:]
    # 'aa', 'baababbaababbaabbbaa'
    groups = [''.join(g) for _, g in groupby(rest)]
    # ['b', 'aa', 'b', 'a', 'bb', 'aa', 'b', 'a', 'bb', 'aa', 'bbb', 'aa']
    try:
        max_length = max(len(g) for g in groups if g[0] == 'a')
    except ValueError:
        return s # no a's after the start, no reversal needed
    indices = [i for i, g in enumerate(groups) if g[0] == 'a' and len(g) == max_length]
    # [1, 5, 9, 11]
    off = 0
    while len(indices) > 1:
        min_bs = min(get_next_bs(i, groups, off) for i in indices)
        indices = [i for i in indices if get_next_bs(i, groups, off) == min_bs]
        # off 0: [1, 5, 9], off 1: [5, 9], off 2: [9]
        if len(indices) == 1:
            break
        max_as = max(get_next_as(i, groups, off) for i in indices)
        indices = [i for i in indices if get_next_as(i, groups, off) == max_as]
        # off 0: [1, 5, 9], off 1: [5, 9]
        off += 1
    i = indices[0]
    groups[:i+1] = groups[:i+1][::-1]
    return start + ''.join(groups)
    # 'aaaabbabaabbabaabbbbaa'
TL;DR: Here's an algorithm that only iterates over the string once (with O(|S|)-ish complexity for limited string lengths). The example with which I explain it below is a bit long-winded, but the algorithm is really quite simple:
Iterate over the string, and update its value interpreted as a reverse (lsb-to-msb) binary number.
If you find the last zero of a sequence of zeros that is longer than the current maximum, store the current position, and the current reverse value. From then on, also update this value, interpreting the rest of the string as a forward (msb-to-lsb) binary number.
If you find the last zero of a sequence of zeros that is as long as the current maximum, compare the current reverse value with the current value of the stored end-point; if it is smaller, replace the end-point with the current position.
So you're basically comparing the value of the string if it were reversed up to the current point, with the value of the string if it were only reversed up to a (so-far) optimal point, and updating this optimal point on-the-fly.
Here's a quick code example; it could undoubtedly be coded more elegantly:
function reverseSubsequence(str) {
var reverse = 0, max = 0, first, last, value, len = 0, unit = 1;
for (var pos = 0; pos < str.length; pos++) {
var digit = str.charCodeAt(pos) - 97; // read next digit
if (digit == 0) {
if (first == undefined) continue; // skip leading zeros
if (++len > max || len == max && reverse < value) { // better endpoint found
max = len;
last = pos;
value = reverse;
}
} else {
if (first == undefined) first = pos; // end of leading zeros
len = 0;
}
reverse += unit * digit; // update reverse value
unit <<= 1;
value = value * 2 + digit; // update endpoint value
}
return {from: first || 0, to: last || 0};
}
var result = reverseSubsequence("aaabbaabaaabbabaaabaaab");
document.write(result.from + "→" + result.to);
(The code could be simplified by comparing reverse and value whenever a zero is found, and not just when the end of a maximally long sequence of zeros is encountered.)
You can create an algorithm that only iterates over the input once, and can process an incoming stream of unknown length, by keeping track of two values: the value of the whole string interpreted as a reverse (lsb-to-msb) binary number, and the value of the string with one part reversed. Whenever the reverse value goes below the value of the stored best end-point, a better end-point has been found.
Consider this string as an example:
aaabbaabaaabbabaaabaaab
or, written with zeros and ones for simplicity:
00011001000110100010001
We iterate over the leading zeros until we find the first one:
0001
^
This is the start of the sequence we'll want to reverse. We will start interpreting the stream of zeros and ones as a reversed (lsb-to-msb) binary number and update this number after every step:
reverse = 1, unit = 1
Then at every step, we double the unit and update the reverse number:
0001 reverse = 1
00011 unit = 2; reverse = 1 + 1 * 2 = 3
000110 unit = 4; reverse = 3 + 0 * 4 = 3
0001100 unit = 8; reverse = 3 + 0 * 8 = 3
At this point we find a one, and the sequence of zeros comes to an end. It contains 2 zeros, which is currently the maximum, so we store the current position as a possible end-point, and also store the current reverse value:
endpoint = {position = 6, value = 3}
Then we go on iterating over the string, but at every step, we update the value of the possible endpoint, but now as a normal (msb-to-lsb) binary number:
00011001 unit = 16; reverse = 3 + 1 * 16 = 19
endpoint.value *= 2 + 1 = 7
000110010 unit = 32; reverse = 19 + 0 * 32 = 19
endpoint.value *= 2 + 0 = 14
0001100100 unit = 64; reverse = 19 + 0 * 64 = 19
endpoint.value *= 2 + 0 = 28
00011001000 unit = 128; reverse = 19 + 0 * 128 = 19
endpoint.value *= 2 + 0 = 56
At this point we find that we have a sequence of 3 zeros, which is longer than the current maximum of 2, so we throw away the end-point we had so far and replace it with the current position and reverse value:
endpoint = {position = 10, value = 19}
And then we go on iterating over the string:
000110010001 unit = 256; reverse = 19 + 1 * 256 = 275
endpoint.value *= 2 + 1 = 39
0001100100011 unit = 512; reverse = 275 + 1 * 512 = 787
endpoint.value *= 2 + 1 = 79
00011001000110 unit = 1024; reverse = 787 + 0 * 1024 = 787
endpoint.value *= 2 + 0 = 158
000110010001101 unit = 2048; reverse = 787 + 1 * 2048 = 2835
endpoint.value *= 2 + 1 = 317
0001100100011010 unit = 4096; reverse = 2835 + 0 * 4096 = 2835
endpoint.value *= 2 + 0 = 634
00011001000110100 unit = 8192; reverse = 2835 + 0 * 8192 = 2835
endpoint.value *= 2 + 0 = 1268
000110010001101000 unit = 16384; reverse = 2835 + 0 * 16384 = 2835
endpoint.value *= 2 + 0 = 2536
Here we find that we have another sequence with 3 zeros, so we compare the current reverse value with the end-point's value, and find that the stored endpoint has a lower value:
endpoint.value = 2536 < reverse = 2835
so we keep the end-point set to position 10 and we go on iterating over the string:
0001100100011010001 unit = 32768; reverse = 2835 + 1 * 32768 = 35603
endpoint.value *= 2 + 1 = 5073
00011001000110100010 unit = 65536; reverse = 35603 + 0 * 65536 = 35603
endpoint.value *= 2 + 0 = 10146
000110010001101000100 unit = 131072; reverse = 35603 + 0 * 131072 = 35603
endpoint.value *= 2 + 0 = 20292
0001100100011010001000 unit = 262144; reverse = 35603 + 0 * 262144 = 35603
endpoint.value *= 2 + 0 = 40584
And we find another sequence of 3 zeros, so we compare this position to the stored end-point:
endpoint.value = 40584 > reverse = 35603
and we find it has a smaller value, so we replace the possible end-point with the current position:
endpoint = {position = 21, value = 35603}
And then we iterate over the final digit:
00011001000110100010001 unit = 524288; reverse = 35603 + 1 * 524288 = 559891
endpoint.value *= 2 + 1 = 71207
So at the end we find that position 21 gives us the lowest value, so it is the optimal solution:
00011001000110100010001 -> 00000010001011000100111
   ^                 ^
start = 3        end = 21
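The outcome of the walkthrough can be double-checked with a few lines of Python, whose integers have no size limit. This is a checking aid under assumptions, not the one-pass algorithm itself: the helper name value_after_reversal is made up here, and it simply evaluates every zero after the start as a candidate end-point.

```python
def value_after_reversal(bits: str, start: int, end: int) -> int:
    """Binary value of bits after reversing the slice bits[start..end] (inclusive)."""
    flipped = bits[:start] + bits[start:end + 1][::-1] + bits[end + 1:]
    return int(flipped, 2)


bits = "00011001000110100010001"
start = bits.index("1")  # first one after the leading zeros

# Every zero after the start is a possible end of the reversed part.
candidates = [p for p in range(start + 1, len(bits)) if bits[p] == "0"]
best_end = min(candidates, key=lambda end: value_after_reversal(bits, start, end))

best_value = value_after_reversal(bits, start, best_end)
best_string = format(best_value, "0{}b".format(len(bits)))
print(start, best_end, best_string)
```

Because all candidates have the same number of digits, comparing the integer values is the same as comparing the strings lexicographically.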
Here's a C++ version that uses a vector of bool instead of integers. It can parse strings longer than 64 characters, but the complexity is probably quadratic.
#include <string>
#include <vector>
struct range {unsigned int first; unsigned int last;};
range lexiLeastRev(std::string const &str) {
unsigned int len = str.length(), first = 0, last = 0, run = 0, max_run = 0;
std::vector<bool> forward(0), reverse(0);
bool leading_zeros = true;
for (unsigned int pos = 0; pos < len; pos++) {
bool digit = str[pos] - 'a';
if (!digit) {
if (leading_zeros) continue;
if (++run > max_run || run == max_run && reverse < forward) {
max_run = run;
last = pos;
forward = reverse;
}
}
else {
if (leading_zeros) {
leading_zeros = false;
first = pos;
}
run = 0;
}
forward.push_back(digit);
reverse.insert(reverse.begin(), digit);
}
return range {first, last};
}

MATLAB string 2 number of table

I have 3 years of data in a text file, tableformat.txt. Three of its lines are given below:
12-13 Jan -10.5
14-15 Jan -9.992
15-16 Jan -8
How can I convert the 3rd column from strings (-10.5, -9.992 and -8) to numbers formatted as (-10.500, -9.992 and -8.000)?
I have made the following script:
clear all; clc;
filename='tableformat.txt';
fid = fopen(filename);
N = 3;
for i = [1:N]
line = fgetl(fid)
a = line(10:12);
na = str2num(a);
ma(i) = na;
end
ma
which gives:
ma = -1 -9 -8
When I did this change: a = line(10:15);, I got:
Error message: Index exceeds matrix dimensions.
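A fixed character range such as line(10:15) fails because the lines have different lengths. Split each line on spaces instead and convert the third field. This will work for you: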
clear all;
clc;
filename='tableformat.txt';
filename2='tableformat2.txt';
fid = fopen(filename);
fid2 = fopen(filename2,'w');
formatSpec = '%s %s %.3f\n';   % three decimals, e.g. -10.500
N = 3;
for row = 1:N
    line = fgetl(fid);
    a = strsplit(line,' ');
    a{3} = cellfun(@str2num,a(3));   % convert the third field to a number
    fprintf(fid2, formatSpec, a{1,:});
end
fclose(fid);
fclose(fid2);
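The split-then-format idea is language independent. As a hedged illustration in Python (the function name format_line is made up here, and the three sample lines are the ones quoted in the question):

```python
def format_line(line: str) -> str:
    """Split on whitespace and print the third field with three decimals."""
    day_range, month, value = line.split()
    return f"{day_range} {month} {float(value):.3f}"


lines = ["12-13 Jan -10.5", "14-15 Jan -9.992", "15-16 Jan -8"]
formatted = [format_line(line) for line in lines]
print("\n".join(formatted))
```

Splitting on whitespace sidesteps the fixed column positions entirely, so it does not matter that the third field is 2, 5 or 6 characters wide.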

Counting substring that begin with character 'A' and ends with character 'X'

PYTHON QN:
Using just one loop, how do I devise an algorithm that counts the number of substrings that begin with character A and end with character X? For example, given the input string CAXAAYXZA, there are four substrings that begin with A and end with X, namely: AX, AXAAYX, AAYX, and AYX.
For example:
>>>count_substring('CAXAAYXZA')
4
Since you didn't specify a language, I'm doing this in C++-ish:
int count_substring(string s)
{
int inc = 0;
int substring_count = 0;
for(int i = 0;i < s.length();i++)
{
if(s[i] == 'A') inc++;
if(s[i] == 'X') substring_count += inc;
}
return substring_count;
}
and in Python
def count_substring(s):
    inc = 0
    substring_count = 0
    for c in s:
        if c == 'A':
            inc += 1
        if c == 'X':
            substring_count += inc
    return substring_count
First count number of "A" in the string
Then count "X" in the string
using
Public Function CountCharacter(ByVal value As String, ByVal ch As Char) As Integer
Dim cnt As Integer = 0
For Each c As Char In value
If c = ch Then cnt += 1
Next
Return cnt
End Function
then take each "A" as a start position and "X" as an end position and get the substring. Do this for each "X" and then start with second "A" and run that for "X" count times. Repeat this and you will get all the substrings starting with "A" and ending with "X".
Just another solution In python:
def count_substring(s):
    length = len(s) + 1
    found = []
    for i in range(0, length):
        for j in range(i+1, length):
            if s[i] == 'A' and s[j-1] == 'X':
                found.append(s[i:j])
    return found
string = 'CAXAAYXZA'
print(count_substring(string))
Output:
['AX', 'AXAAYX', 'AAYX', 'AYX']

How to read a C generated binary file in Lua

I want to read a 32 bit integer binary file provided by another program. The file contains only integer and no other characters (like spaces or commas). The C code to read this file is as follows:
FILE* pf = fopen("C:/rktemp/filename.dat", "r");
int sz = width*height;
int* vals = new int[sz];
int elread = fread((char*)vals, sizeof(int), sz, pf);
for( int j = 0; j < height; j++ )
{
for( int k = 0; k < width; k++ )
{
int i = j*width+k;
labels[i] = vals[i];
}
}
delete [] vals;
fclose(pf);
But I don't know how to read this file into array using Lua.
I've tried to read this file using io.read, but part of the array looks like this:
~~~~~~xxxxxxxxyyyyyyyyyyyyyyzzzzzzzz{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxxxxxxyyyyyyyyyyyyyyzzzzzz{{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxyyyyyyyyyyyyyzzzzz{{{{{{|||}}}yyyyyyyyyyyz{{{yyyyyyyyÞľūơǿȵɶʢ˺̤̼ͽаҩӱľǿجٴȵɶʢܷݸ˺໻⼼ӱľǿ
Also the Matlab code to read this file is like this:
row = image_size(1);
colomn = image_size(2);
fid = fopen(data_path,'r');
A = fread(fid, row * colomn, 'uint32')';
A = A + 1;
B = reshape(A,[colomn, row]);
B = B';
fclose(fid);
I've tried a function to convert bytes to integer, my code is like this:
function bytes_to_int(b1, b2, b3, b4)
if not b4 then error("need four bytes to convert to int",2) end
local n = b1 + b2*256 + b3*65536 + b4*16777216
n = (n > 2147483647) and (n - 4294967296) or n
return n
end
local sup_filename = '1.dat'
fid = io.open(sup_filename, "r")
st = bytes_to_int(fid:read("*all"):byte(1,4))
print(st)
fid:close()
But it still does not read the file properly.
You are only calling bytes_to_int once. You need to call it for every int you want to read. e.g.
fid = io.open(sup_filename, "rb")
while true do
local bytes = fid:read(4)
if bytes == nil then break end -- EOF
local st = bytes_to_int(bytes:byte(1,4))
print(st)
end
fid:close()
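To make the byte arithmetic explicit, here is the same chunk-by-chunk decoding as a short Python sketch. It assumes the file holds signed 32-bit little-endian integers, which is what bytes_to_int above implies; the function name and the sample bytes are made up for illustration.

```python
def decode_int32_le(data: bytes) -> list:
    """Decode consecutive 4-byte little-endian signed integers."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    return [
        int.from_bytes(data[i:i + 4], "little", signed=True)
        for i in range(0, len(data), 4)
    ]


# Made-up sample: the integers 1, -1 and 256 in little-endian order.
sample = bytes([1, 0, 0, 0, 0xFF, 0xFF, 0xFF, 0xFF, 0, 1, 0, 0])
values = decode_int32_le(sample)
print(values)
```

As in the Lua loop, the key point is that each integer consumes exactly four bytes, so the decoding step has to be repeated once per integer rather than once per file.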
In Lua 5.3 and later you can also call string.unpack, which supports many conversion options in its format string. The following options may be useful:
< sets little endian
> sets big endian
= sets native endian
i[n] a signed int with n bytes (default is native size)
I[n] an unsigned int with n bytes (default is native size)
The architecture of your PC is unknown, so I assume the data to read is unsigned and native-endian.
Since you are reading binary data from the file, you should use io.open(sup_filename, "rb").
The following code may be useful:
local fid = io.open(sup_filename, "rb")
local contents = fid:read("a")
fid:close()
local now = 1
while now <= #contents - 3 do
    local n
    -- assign to the outer `now`; `local n, now = ...` would shadow it and loop forever
    n, now = string.unpack("=I4", contents, now)
    print(n)
end
see also: Lua 5.4 manual

Separate chars of a file in matlab

I have strings of 32 chars in a file (multiple lines).
What I want to do is to make a new file and put them there by making columns of 4 chars each.
For example I have:
00000000000FDAD000DFD00ASD00
00000000000FDAD000DFD00ASD00
00000000000FDAD000DFD00ASD00
....
and in the new file, I want them to appear like this:
0000 0000 000F DAD0 00DF D00A SD00
0000 0000 000F DAD0 00DF D00A SD00
Can anybody help me? I have been working on this for hours and can't find the solution.
First, open the input file and read the lines as strings:
infid = fopen(infilename, 'r');
C = textscan(infid, '%s', 'delimiter', '');
fclose(infid);
Then use regexprep to split the string into space-delimited groups of 4 characters:
C = regexprep(C{:}, '(.{4})(?!$)', '$1 ');
Lastly, write the modified lines to the output file:
outfid = fopen(outfilename, 'w');
fprintf(outfid, '%s\n', C{:});
fclose(outfid);
Note that this solution is robust enough to work on lines of variable length.
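The grouping step itself is small enough to sketch in Python for comparison. This is an illustration only: the function name group_chars is made up here, and the sample line is the one from the question.

```python
def group_chars(line: str, width: int = 4) -> str:
    """Insert a space after every `width` characters; a short last group is kept."""
    return " ".join(line[i:i + width] for i in range(0, len(line), width))


sample = "00000000000FDAD000DFD00ASD00"
print(group_chars(sample))
```

Like the regexprep version, this works on lines of any length, because the last group simply takes whatever characters remain.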
Import
fid = fopen('test.txt');
txt = textscan(fid,'%s');
fclose(fid);
Transform into an M-by-28 char array, then transpose and reshape so that each column holds one 4-char block. Then add a row of blanks at the bottom and reshape back. Store each line in a cell.
txt = reshape(char(txt{:})',4,[]);
txt = cellstr(reshape([txt; repmat(' ',1,size(txt,2))],35,[])')
Write each cell/line to new file
fid = fopen('test2.txt','w');
fprintf(fid,'%s\r\n',txt{:});
fclose(fid);
Here's one way to do it in Matlab:
% read in file
fid = fopen('data10.txt');
data = textscan(fid,'%s');
fclose(fid);
% save new file
s = size(data{1});
newFid = fopen('newFile.txt','wt');
for t = 1:s(1) % format and save each row
line = data{1}{t};
newLine = '';
index = 1;
for k = 1:7 % seven sets of 4 characters
count = 0;
while count < 4
newLine(end + 1) = line(index);
index = index + 1;
count = count + 1;
end
newLine(end + 1) = ' ';
end
fprintf(newFid, '%s\n', newLine);
end
fclose(newFid);
