Verilog Data Casting - verilog

I know it might seems a bit dumb for a question but i had to learn verilog by my own and sometimes i get a little confused about the basic things,
so my question is ,when i read a file in verilog with signed decimals inside and special characters like commas,when i store the line i am reading firstly does the type of data matter? Like if i store it as an integer will the data be converted as a string of ascii characters?And if i store it as a reg type will it be automatically converted to a binary string?
Sorry if this looks dumb but i am a little confused about how verilog is processing external data
Thanks!
I am trying to read a file with sensor data and to put it in a reg type ,the maximum total characters in a line of the file are 25 ,so i've assigned the width of my variable as 8*25 but because of my question above i am not sure how to progress with regarding the manipulation of my data

If you read the file using $fscanf, the file is treated as ASCII text and will be converted to a binary string if you use any of the integral formatting specifiers like %d for signed decimal and %h for hexadecimal. For example, suppose you have the file text:
123,-456, 789
-987, 654, -321
And the code
module top;
integer A,B,C;
integer file, code;
initial begin
file = $fopen("text","r");
repeat(2) begin
code = $fscanf(file,"%d,%d,%d",A,B,C);
$display(code,,A,,B,,C);
end
end
endmodule
This displays
# run -all
# 3 123 -456 789
# 3 -987 654 -321
# exit
If you need to read the file as an ASCII string (.i.e the ASCII character "0" is the binary string '00110000') then you would use the $s format specifier which would read the entire line. I doubt that is what you want to do.

Related

Is there a way to compare the format in which a line of a text file was written in Fortran?

I'm developing a Fortran program that must obtain some data from a text file and generate another text file using specific data from the first one.
The input file have many lines written in several specific formats which I know of. Although I know the formats, the lines in this file are generated in a "random way".
It would be much easier to generate the output file if I could compare the format in which each line was written, then I would know exactly what data I can get from that line of the input file to use it in the output file.
What I need is something like, for example, knowing that the format of the line read and stored in the LINHA variable is described in the FORMATO variable, do something like:
    
IF (FORMATO = '(1X, 15,3F8.1,2 (5A, 1X))') THEN
READ (LINHA, '(6X, F8.1)') my_variable
END IF
Because there might be another format such as
'(6A, 2F8.1, F8.6,2 (6A))'
in which, if I use the same READ statement, I will read an F8.1 variable in my_variable, however this value is not the correct one.
A (not so elegant) work-around that I can think of is to read the entire line using the advance = no option of read() and parse each character in the line separately. While doing so, you may count white spaces or other specific characters that you know of and then identify the different formats from there.
It would be helpful if you could give more specifications of the nature of the task.
The best option is to read without format, keeping each line in a character array. Then read the line variable as an internal file with the required format using the variable IOSTAT in order to check if the format is the correct.
INT max_size = 80
CHARACTER(LEN=max_size) :: line
READ(*,*) line
READ(line,'(1X, 15,3F8.1,2 (5A, 1X))',IOSTAT=ios) var1, var2, ...
Problem solved using a mixture of some of the suggestions posted.
I read each of the lines of the input file in an internal variable (RLINFILE) in the format '(A165)'. After that, I read all the contents of the string that I put in this internal variable in several dummy variables, using the format I knew of the lines from where I wanted to get some information (read all the information of the line in the desired and get IOSTAT = 0 guarantee that this is the correct line), so if the result of the reading is ok (IOSTAT = 0), it is because the line I just read was the correct one for the information I wanted, so I store the contents of some of the dummy variables that represent the values that interest me. In the code, the solution looked something like this:
OPEN(UNIT=LU1,FILE=RlinName,STATUS='OLD')
ilin = 0
formato = '(14X,A,1X,F7.1,1X,F7.1,5X,A,1X,A,1X,A,5X,A,I5,1X,A,I3,3F8.1,A,A,A,1X,A,2(1X,F8.2),1X,A,1X,A)'
DO WHILE (.TRUE.)
READ(LU1,'(A165)',END=300) RLINFILE
READ(RLINFILE,formato,IOSTAT=linhaok) dum2_a1,dum2_f1,dum2_f2,dum2_a2,dum2_a3,dum2_a4,dum2_a5,dum2_i1,dum2_a6,dum2_i2,dum2_f3,dum2_f4,dum2_f5,dum2_a7,dum2_a8,dum2_a9,dum2_a10,dum2_f6,dum2_f7,dum2_a11,dum2_a12
IF(linhaok.EQ.0) THEN
ilin = ilin+1
rlin_lshu(ilin) = dum2_a4
rlin_nbpa(ilin) = dum2_i1
rlin_ncir(ilin) = dum2_i2
rlin_ppij(ilin) = dum2_f3
rlin_pqij(ilin) = dum2_f4
rlin_tapn(ilin) = dum2_a7
END IF
END DO
300 CLOSE(UNIT=LU1)
The description of the problem you are trying to solve is a bit vague to me, but the simplest solutions that comes to my mind, given the description of the problem, is to modify the original code that generates the input data file, to write the used Fortran READ format before the data line in the input file. This way, you can read the format as a string and use it in the subsequent data IO in your second code.
If you describe the specific task your tryting to accomplish in more details, perhaps more experienced Fortranners could help.

Converting character to real fails if character consists of an integer (i.e. without dot)

I am reading a text file with my fortran code. I parse the text file (which contain a bunch of stuff such as names and numbers) and I end up with strings containing real number (they are real time measuraments) such as:
string = 1.34
I simply write this string in a real number by doing
read(levelCHAR,'(f)') level
And everything worked great for a month until today, when the number in the input file was exactly 1 and I had:
string = 1
and the read statement above gave me
level=0
Therefore to fix this I added before the read statement:
if (index(string ,'.')<=0) then
string = trim(string )//'.'
endif
And this seems to have fixed the issue. However, I wanted to know if I am missing something and there is a more elegant way to do this in one line for example by replacing the format '(f)' in the read statement with a more suitable expression.
Your program is not valid Fortran:
read(levelCHAR,'(f)') level
1
Error: Nonnegative width required in format string at (1)
form.f90:5.5:
You must indicate the input field with such as f5.0. Or you can use the list-directed input read(levelChar,*) level.
Also, be sure to use the .0 and not any other number in the fw.d descriptor for input. Otherwise strange results are to be expected for integer inputs as they will be multiplied by 10**(-d).

Converting bits to string (data)

I have file which contains some data (text copied and pasted from the "What You Will Learn" portion of this PDF). Firstly, I have converted the contents in the file to bits successfully. However, when I try to convert it back to the original format, some of the characters are not correctly converted, as shown below:
Cisco has
developed the Cisco Open Network Environment (ONE)
architecture as a multifaceted approach to network
programmability delivered across three pillars:
??)É¥ Í?н??ÁÁ±¥?Ñ¥½¸ÁɽÉ?µµ¥¹?¥¹Ñ?É???Ì?¡A%̤?)?áÁ½Í??¥É?ѱ佸Íݥѡ?Ì?¹É½ÕÑ?ÉÌѼ?Õµ?¹Ð?)?á¥ÍÑ¥¹?=Á?¹±½ÜÍÁ?¥?¥?Ñ¥½¹Ì* ¤&öGV7F?öâ×&VG?÷VäfÆ÷r6öçG&öÆÆW"æB÷VäfÆ÷r ¦vVçG0¨?HÝZ]HÙ??ÙXÝÈÈ[]?\??\X[Ý?\?^\Ë?\X[?Ù\?XÙ\Ë[??\ÛÝ\?ÙHÜ?Ú\Ý?][Û?Ø\X?[]Y\È[?H?]HÙ[
As you can see here some characters are converted successfully, others are not.
My code is below:
file = open("test.txt",'r')
myfile = ''.join(map(str,file))
l = []
for i in myfile:
asc11 = ord(i)
b = "{0:08b}".format(asc11)
l.extend(int(y) for y in b)
string_bin = ''.join(map(str,l))
mydata = ''.join(chr(int(string_bin[i:i+8], 2)) for i in range(0,len(string_bin), 8))
print(mydata)
What wrong with my code? What I need to change to make it work properly?
What's Going On?
You are running into an encoding issue because some characters in the PDF are non-ASCII characters. For example, the bullet points are U+2022 which require 3 bytes of storage.
When Python reads from your file, it doesn't know what encoding you used to write that data. Thus it reads bytes from the file and uses a character encoding to translate them into strs which are stored using Python's own internal unicode format. (This differs from Python 2 where open() returned raw bytes stored in a str which you could then manually decoded to unicode.)
Thus, in Python 3, open() accepts a named encoding parameter. For example open("test.txt",'r', encoding='ascii'). Because you don't specify the encoding when you call open(), you end up using your system's default encoding. For instance, on my laptop, the default encoding is CP1252 (LATIN-1). Yours may differ.
Whatever encoding Python uses to interpret your file, it then internally uses it's own unicode format to store your string. This means that your string may internally use mutli-byte characters even if the original encoding did not. For example, my laptop uses CP1252 to interpret U+2022 as • which is internally stored as U+00e2, U+20AC and U+00A2 -- € is stored using a multi-byte character even though it was just one byte in the original file.
Let's assume you computer is sane and uses UTF-8 by default (this explanation is similar for many multi-byte characters). When you reach a bullet point, it is stored as U+2022. When you call ord('\u2022') the result is 8226. When you then call "{0:08b}".format(8226) this returns "10000000100010". That's a 14 character string. Your parsing code assumes all of the ordinals will generate 8 character strings. Because of this, the "binary" output becomes misaligned. This means that when you then parse the binary string in 8-character segments, it gets thrown off and starts interpreting things as control characters and all sorts of foreign language characters.
If you call open(..., encoding='ascii'), Python will actually throw an exception because it reads non-valid ASCII characters.
Possible Solutions
I'm not sure why exactly you are converting the input string into the representation that you are using. It's not binary, as your question title would suggest. Rather, you've converted the data into a textual representation of it's binary encoding.
Technically speaking, when you store encoded text to a file, it's stored using a binary representation. Python, and any text editor, has to decode those bytes into it's internal character representation before it can display them as text. Thus, calling open("test.txt", "r", encoding="utf-8") reads the binary data out of your text file and converts it into Python's internal unicode format. Similarly, calling myfile.encode('utf-8') will return the UTF-8 encoded bytes which can then be written to a file, network socket, etc.
If, however, you do need to use a format similar to what you are currently using, first, I still recommend you specify an encoding when you call open() (I recommend UTF-8). Then you can consider these options:
Detect and omit non-ASCII characters. They will have an ordinal >= 128.
Mimic UTF-16 or UTF-32 and output multi-byte output for all characters. For example, use "{0:032b}".format(asc11) and then parse the result in 32-character chunks. It's memory and storage inefficient, but it will preserve multi-byte characters.
Regardless, I highly recommend reading the Dive Into Python 3 chapter about strings.

Parse Fortran String Like A File

I want to pass a string from C to Fortran and then process it line-by-line as if I was reading a file. Is this possible?
Example String - contains newlines
File description: this file contains stuff
3 Values
1 Description
2 Another description
3 More text
Then, I would like to parse the string line-by-line, like a file. Similar to this:
subroutine READ_STR(str, len)
character str(len),desc*70
read(str,'(a)') desc
read(str,*) n
do 10 i=1,n
read(str,*) parm(i)
10 continue
Not without significant "manual" intervention. There are a couple of issues:
The newline character has no special meaning in an internal file. Records in an internal file correspond to elements in an array. You would either need to first manually preprocess your character scalar into an array, or use a single READ that skipped over the newline characters.
If you did process the string into an array, then internal files do not maintain a file position between parent READ statements. You would need to manually track the current record yourself, or process the entire array with a single READ statement that accessed multiple records.
If you have variable width fields, then writing a format specification to process everything in a single READ could be problematic, though this depends on the details of the input.

How to store binary data in a Lua string

I needed to create a custom file format with embedded meta information. Instead of whipping up my own format I decide to just use Lua.
texture
{
format=GL_LUMINANCE_ALPHA;
type=GL_UNSIGNED_BYTE;
width=256;
height=128;
pixels=[[
<binary-data-here>]];
}
texture is a function that takes a table as its sole argument. It then looks up the various parameters by name in the table and forwards the call on to a C++ routine. Nothing out of the ordinary I hope.
Occasionally the files fail to parse with the following error:
my_file.lua:8: unexpected symbol near ']'
What's going on here?
Is there a better way to store binary data in Lua?
Update
It turns out that storing binary data is a Lua string is non-trivial. But it is possible when taking care with 3 sequences.
Long-format-string-literals cannot have an embedded closing-long-bracket (]], ]=], etc).
This one is pretty obvious.
Long-format-string-literals cannot end with something like ]== which would match the chosen closing-long-bracket.
This one is more subtle. Luckily the script will fail to compile if done wrong.
The data cannot embed \n or \r.
Lua's built in line-end processing messes these up. This problem is much more subtle. The script will compile fine but it will yield the wrong data. 0x13 => 0x10, 0x1013 => 0x10, etc.
To get around these limitations I split the binary data up on \r, \n, then pick a long-bracket that works, finally emit Lua that concats the various parts back together. I used a script that does this for me.
input: XXXX\nXX]]XX\r\nXX]]XX]=
texture
{
--other fields omitted
pixels= '' ..
[[XXXX]] ..
'\n' ..
[=[XX]]XX]=] ..
'\r\n' ..
[==[XX]]XX]=]==];
}
Lua is able to encode most characters in long bracket format including nulls. However, Lua opens the script file in text mode and this causes some problems. On my Windows system the following characters have problems:
Char code(s) Problem
-------------- -------------------------------
13 (CR) Is translated to 10 (LF)
13 10 (CR LF) Is translated to 10 (LF)
26 (EOF) Causes "unfinished long string near '<eof>'"
If you are not using windows than these may not cause problems, but there may be different text-mode based problems.
I was only able to produce the error you received by encoding multiple close brackets:
a=[[
]]] --> a.lua:2: unexpected symbol near ']'
But, this was easily fixed with the following:
a=[==[
]]==]
The binary data needs to be encoded into printable characters. The simplest method for decoding purposes would be to use C-like escape sequences for all bytes. For example, hex bytes 13 41 42 1E would be encoded as '\19\65\66\30'. Of course, then the encoded data is three to four times larger than the source binary.
Alternatively, you could use something like Base64, but that would have to be decoded at runtime instead of relying on the Lua interpreter. Personally, I'd probably go the Base64 route. There are Lua examples of Base64 encoding and decoding.
Another alternative would be have two files. Use a well defined image format file (e.g. TGA) that is pointed to by a separate Lua script with the additional metadata. If you don't want two files to move around then they could be combined in an archive.

Resources