Vim opens PowerShell dump files with null characters

In PowerShell I run a command-line script and tee its output to a file:
PS> cmd /c D:\script.bat | tee ~/out.log
Then, I open this log in vim, and this is roughly what I see:
яю^M^#
^#C^#:^#\^#U^#s^#e^#r^#s^#>^#s^#e^#t^# ^#D^#E^#B^#U^#G^#_^#F^#U^#L^#F^#I^#L^#L^#M^#E^#N^#T^#=^#-^#X^#r^#u^#n^#j^#d^#w^#p^#:^#t^#r^#a^#n^#s^#p^#o^#r^#t^#=^#d^#t^#_^#s^#o^#c^#k^#e^#t^#,^#a^#d^#d^#r^#e^#s^#s^#=^#3^#0^#0^#4^#,^#s^#e^#r^#v^#e^#r^#=^#y^#,^#s^#u^#s^#p^#e^#n^#d^#=^#n^# ^#^M^#
And the hex dump:
0000000: d18f d18e 0d00 0a00 4300 3a00 5c00 5500 ........C.:.\.U.
0000010: 7300 6500 7200 7300 3e00 7300 6500 7400 s.e.r.s.>.s.e.t.
0000020: 2000 4400 4500 4200 5500 4700 5f00 4600 .D.E.B.U.G._.F.
0000030: 5500 4c00 4600 4900 4c00 4c00 4d00 4500 U.L.F.I.L.L.M.E.
0000040: 4e00 5400 3d00 2d00 5800 7200 7500 6e00 N.T.=.-.X.r.u.n.
0000050: 6a00 6400 7700 7000 3a00 7400 7200 6100 j.d.w.p.:.t.r.a.
0000060: 6e00 7300 7000 6f00 7200 7400 3d00 6400 n.s.p.o.r.t.=.d.
0000070: 7400 5f00 7300 6f00 6300 6b00 6500 7400 t._.s.o.c.k.e.t.
0000080: 2c00 6100 6400 6400 7200 6500 7300 7300 ,.a.d.d.r.e.s.s.
0000090: 3d00 3300 3000 3000 3400 2c00 7300 6500 =.3.0.0.4.,.s.e.
00000a0: 7200 7600 6500 7200 3d00 7900 2c00 7300 r.v.e.r.=.y.,.s.
00000b0: 7500 7300 7000 6500 6e00 6400 3d00 6e00 u.s.p.e.n.d.=.n.
00000c0: 2000 0d00 0a ....
How do I configure vim to open such files normally, without the hassle of deleting those nulls every time?
P.S. Notepad opens the file without the nulls being visible.

Your file is encoded in UTF-16; each character is represented by 2 bytes, with the low byte first (little-endian).
As long as 'encoding' is set to something that can represent the characters in the file (utf-8 is recommended; latin1 won't do), this is fine and there is no need to fiddle with it.
You need to tell Vim the file's encoding. Either explicitly when opening
:edit ++enc=utf-16le out.log
or by prepending the value to 'fileencodings' (after ucs-bom, since files with a byte order mark can be identified unambiguously):
:set fileencodings=ucs-bom,utf-16le,utf-8,default,latin1
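To make this the default for every session, the same setting (without the leading colon) can go into your ~/.vimrc:
set fileencodings=ucs-bom,utf-16le,utf-8,default,latin1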

It seems that your file is UTF-8 encoded.
To open it, start vim with an encoding argument like this:
vim "+set encoding=utf-8"
or if you are already editing the file, you can set the encoding as follows:
:set enc=utf-8

Related

Linux mmc-utils erase command not working as expected

I have a system that has an eMMC card on it, and I am trying to use the erase functionality defined in the eMMC specification (6.6.9 Erase) using the mmc-utils user-space tool.
The implementation looks correct to me, but when I run the command the erase does not work as expected.
mmc erase secure-erase <start address in erase blocks> <end address in erase blocks> <device/path>
root@sys:~# # write 0x02 to every byte on the block device
root@sys:~# tr '\0' '\2' < /dev/zero > /dev/mmcblk0
root@sys:~# mmc erase secure-erase 0 2 /dev/mmcblk0
Executing Secure Erase from 0x00000000 to 0x00000002
High Capacity Erase Unit Size=524288 bytes
High Capacity Erase Timeout=300 ms
High Capacity Write Protect Group Size=8388608 bytes
Secure Erase Succeed
root#sys:~# hexdump /dev/mmcblk0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0080000 0202 0202 0202 0202 0202 0202 0202 0202
*
I was expecting 2 blocks of 524288 bytes (0x0080000) to be erased, but only one block was erased. I tried several other values and they also did not erase more than one block.
Am I using the tool correctly? Does it work for others? Could this be an issue with my mmc driver, or does the issue lie with the firmware on the eMMC?

How to change file encoding from hexadecimal to UTF-8

I have a log file from virtual terminal that looks like this:
0025 0200 c7c8 da20 0d00 2c01 0400 3d01
1822 0000 0012 a0f5 cd02 2810 0030 0010
0030 0110 0030 0210 0030 0310 0030 0410
0030 0510 0030 0610 0030 0710 0030 0810
0030 0a87 1332 203c 1000 3000 0011 1000
3001 0010 1000 3002 0000 1000 3003 0000
1000 3004 0000 1000 3005 0000 1000 3006
................
By default the program Hercules (a serial terminal) that appends to the log file AFAIK doesn't have the ability to change the log file's encoding. So when I open the file with Sublime Text 3 it shows that the file is encoded with hexadecimal encoding. The problem arises when I use the text from the file in Node.js: it is not converted properly to a string.
For now the solution I found is to change the file encoding to UTF-8 manually with ST3 via File -> Save With Encoding -> UTF-8. But this will happen very often and for many files, so some automation wouldn't hurt.
The question is: can I change the file encoding from hexadecimal to UTF-8 using Node.js or a batch script?
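A minimal Node.js sketch of one possible approach, assuming the log consists purely of whitespace-separated hex digit pairs and that the decoded bytes are meant to be read as text (the file names here are hypothetical):
import { readFileSync, writeFileSync } from "node:fs";

// Read the hex log, drop all whitespace, and parse the remaining
// hex digit pairs into raw bytes.
const hex = readFileSync("terminal.log", "utf8").replace(/\s+/g, "");
const bytes = Buffer.from(hex, "hex");

// Interpret the bytes as UTF-8 text and write a plain UTF-8 copy.
writeFileSync("terminal.txt", bytes.toString("utf8"), "utf8");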

How to create zip file with data-descriptor section

Is there a way to create a zip file and to force it to have data-descriptor section from the command line?
In a comment on Github (https://github.com/adamhathcock/sharpcompress/issues/88#issuecomment-215696631), I found a suggestion to use the -fd flag:
Just FYI, when creating the ZIP file I also used the command line parameter -fd which enforces usage of data descriptors. Not sure whether the ZIP tool on OSX provides this parameter, but I noticed that you didn't use it when creating your ZIP file.
So I tested it (with the standard zip tool on OS X, "Zip 3.0 (July 5th 2008)"), and confirmed that it indeed generates a zip file with the data descriptor set, as follows:
/tmp> touch empty.txt
/tmp> zip -fd foo.zip empty.txt
adding: empty.txt (stored 0%)
/tmp> xxd foo.zip
00000000: 504b 0304 0a00 0800 0000 698d 7c49 0000 PK........i.|I..
00000010: 0000 0000 0000 0000 0000 0900 1c00 656d ..............em
00000020: 7074 792e 7478 7455 5409 0003 a65e 3c58 pty.txtUT....^<X
00000030: a65e 3c58 7578 0b00 0104 f501 0000 0400 .^<Xux..........
00000040: 0000 0050 4b07 0800 0000 0000 0000 0000 ...PK...........
00000050: 0000 0050 4b01 021e 030a 0008 0000 0069 ...PK..........i
00000060: 8d7c 4900 0000 0000 0000 0000 0000 0009 .|I.............
00000070: 0018 0000 0000 0000 0000 00b0 8100 0000 ................
00000080: 0065 6d70 7479 2e74 7874 5554 0500 03a6 .empty.txtUT....
00000090: 5e3c 5875 780b 0001 04f5 0100 0004 0000 ^<Xux...........
000000a0: 0000 504b 0506 0000 0000 0100 0100 4f00 ..PK..........O.
000000b0: 0000 5300 0000 0000 ..S.....
The 16-byte sequence beginning at offset 0x43 above, 50 4b 07 08 followed by twelve zero bytes, is the data descriptor section.
Its signature 50 4b 07 08 (or PK..) and the data descriptor format are specified by the zip specification (https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT):
4.3.9 Data descriptor:
crc-32 4 bytes
compressed size 4 bytes
uncompressed size 4 bytes
4.3.9.1 This descriptor MUST exist if bit 3 of the general
purpose bit flag is set (see below). It is byte aligned
and immediately follows the last byte of compressed data.
This descriptor SHOULD be used only when it was not possible to
seek in the output .ZIP file, e.g., when the output .ZIP file
was standard output or a non-seekable device. For ZIP64(tm) format
archives, the compressed and uncompressed sizes are 8 bytes each.
...
4.3.9.3 Although not originally assigned a signature, the value
0x08074b50 has commonly been adopted as a signature value
for the data descriptor record. Implementers should be
aware that ZIP files may be encountered with or without this
signature marking data descriptors and SHOULD account for
either case when reading ZIP files to ensure compatibility.
To find out whether bit 3 of the general purpose bit flag is set, we have to parse the zip file to locate the file header for empty.txt.
See Wikipedia for a brief overview and tables describing the meaning of bytes in a zip file - https://en.wikipedia.org/wiki/Zip_(file_format) .
The last 22 bytes, starting on the penultimate line with 50 4b 05 06 (or PK..), form the end of central directory (EOCD) record. At offset 16 within this EOCD record, a 4-byte unsigned integer specifies the start of the central directory. We have 5300 0000 (little endian), or 0x53 = 83. This happens to be the offset right after the data descriptor section that we identified above. The general purpose bit flag is the pair of bytes at offset 8 within the central directory file header, after the 4-byte signature, the 2-byte "version made by" field, and the 2-byte "version needed to extract" field:
08 00 (little endian) = 00000000 00001000 (binary, big endian)
                                     ^
                                     bit 3 of the general purpose flag
Indeed, bit 3 (counting from the right, starting at 0) is set, so the zip file created above indeed has a data descriptor section. As a cross-check, the same flag value 08 00 appears at offset 6 of the local file header at the very start of the file.
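For what it's worth, this check can also be scripted instead of done by eye. A rough Node.js sketch, assuming a small archive with no zip64 structures and an empty archive comment (as with the foo.zip above), so that the EOCD record is simply the last 22 bytes:
import { readFileSync } from "node:fs";

const buf = readFileSync("foo.zip");

// With an empty archive comment, the EOCD record is the last 22 bytes.
const eocd = buf.length - 22;
if (buf.readUInt32LE(eocd) !== 0x06054b50) throw new Error("EOCD not found");

// Offset 16 within the EOCD holds the start of the central directory.
const cdOffset = buf.readUInt32LE(eocd + 16);

// The 2-byte general purpose bit flag sits at offset 8 of the
// central directory file header.
const flags = buf.readUInt16LE(cdOffset + 8);
console.log("data descriptor (bit 3):", (flags & 0x0008) !== 0);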

How to make a valid input for "xxd -r" in vim when removing a byte?

Create a file named test containing the following content:
0123456789abcdef0123456789abcdef
I want to remove the first 0 using xxd. Open the file with vim -b test, then run :%!xxd inside vim. The result is:
0000000: 3031 3233 3435 3637 3839 6162 6364 6566 0123456789abcdef
0000010: 3031 3233 3435 3637 3839 6162 6364 6566 0123456789abcdef
0000020: 0a .
Then I remove the hex code 30 for the first 0:
0000000: 31 3233 3435 3637 3839 6162 6364 6566 0123456789abcdef
0000010: 3031 3233 3435 3637 3839 6162 6364 6566 0123456789abcdef
0000020: 0a .
Then I run :%!xxd -r to read the hex back. The result is:
^#23456789abcdef^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#
The result is garbled. I know the reason: the content above is not valid input for xxd. If I remove the offset column and the text column:
31 3233 3435 3637 3839 6162 6364 6566
3031 3233 3435 3637 3839 6162 6364 6566
0a
Then I run :%!xxd -r -p to read the hex back, and the result is correct:
123456789abcdef0123456789abcdef
Is there a better way to do this job with the use of vim and xxd? A better way means less editing to make a valid input for xxd -r.
You've discovered on your own that removing the byte offset column and text column allows you to use :%!xxd -r -p to get what you want. So how about creating a custom command to remove those columns and then do the conversion? :%!xxd -r -p is too much to type, anyway.
Something like:
:command MyXXDR %s#^[^:]*: \(\%(\x\+ \)\+\) .*#\1# | %!xxd -r -p
This exact command may cause problems if you have messed up the format of the file too much with your edits (i.e. if the substitute command doesn't match to remove the necessary text), but you get the idea.
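For illustration, with the command defined, the round trip for the example above could look like this (starting from vim -b test):
:%!xxd
(delete the 30 for the first 0 in the hex column)
:MyXXDR
:w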

Switched to a Linux Environment, now fscanf doesn't work

I have a program that works when compiled on Windows with both Visual Studio and CodeBlocks, but when I compile it on Kubuntu using Qt Creator, the fscanf calls don't work the same way.
I have a file containing the names of other files, each of which is separated by a space and a line break. Using fscanf with "%s" in the working environments reads the file name into char entity[21], which in this test case holds "ent001.txt" and ends in a null byte. On Linux, however, entity[0] holds -74 (182 as an unsigned byte), followed by several null bytes, then several values that aren't in the file being read, none of them letters. fscanf returns -1.
Is there a deeper problem in portability, or are my standard libraries a bit off?
EDIT: For some sample code:
fin = fopen( levelfile, "r" ) ;
test = fscanf(fin, "%s", entity ) ;
Where 'levelfile' is 'char* levelfile[21]' whose value is hard-coded right now. 'test' is an 'int' that holds the return value. 'fin' is not equal to NULL.
EDIT2: Output from xxd on the level file:
0000000: 656e 7430 3031 2e74 7874 200a 656e 7430 ent001.txt .ent0
0000010: 3032 2e74 7874 2024 200a 5472 6967 6765 02.txt $ .Trigge
0000020: 7230 3031 2e74 7874 2024 200a 3020 3531 r001.txt $ .0 51
0000030: 3220 3531 3220 3020 0a31 2037 3132 2037 2 512 0 .1 712 7
Where did your data file get created? Any chance it has DOS-style line breaks (CR+LF) instead of Unix newlines?
If that's the problem, then text mode (fopen(fname, "rt")) may help, or you can run the file through the dos2unix utility (just d2u on some Linux distributions).
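For example, assuming the level file is named level001.txt (the real name is hard-coded in the program), file(1) can confirm the diagnosis and dos2unix converts the line endings in place:
file level001.txt      # reports "ASCII text, with CRLF line terminators" for DOS files
dos2unix level001.txt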
