Deflate Compression Specification - zip

I'm currently looking through the DEFLATE compression specification and am confused about this part:
0 - 15: Represent code lengths of 0 - 15
16: Copy the previous code length 3 - 6 times.
The next 2 bits indicate repeat length
(0 = 3, ... , 3 = 6)
Example: Codes 8, 16 (+2 bits 11),
16 (+2 bits 10) will expand to
12 code lengths of 8 (1 + 6 + 5)
17: Repeat a code length of 0 for 3 - 10 times.
(3 bits of length)
18: Repeat a code length of 0 for 11 - 138 times
(7 bits of length)
If I'm understanding correctly, 0-15 are the lengths of the Huffman codes for the code length sequences. However, I do not understand what 16-18 is supposed to be. Thanks for your help!

The 16-18 codes are instructions to the decoder to generate several lengths, either zeros or the repeats of the last length.
So for example:
18(12) 14 4 3 3 3 4 4 5 17(3) 5 16(6) 16(3) 7
becomes:
0 0 0 0 0 0 0 0 0 0 0 0 14 4 3 3 3 4 4 5 0 0 0 5 5 5 5 5 5 5 5 5 5 7
where the numbers in parentheses are coded as 7, 3, and 2 bits respectively immediately after the Huffman code for that symbol.
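The expansion described above can be sketched in a few lines of Python. This is only an illustration, not code from the specification: the function name `expand_code_lengths` and the `(symbol, repeat)` pair representation are assumptions, and `repeat` is taken to be the final repeat count (extra bits already read and the base already added). Because the 2 extra bits of symbol 16 can only encode 3 to 6 repeats, a run of nine copies is written here as 16(6) followed by 16(3).

```python
def expand_code_lengths(symbols):
    """Expand (symbol, repeat) pairs into a flat list of code lengths.

    `repeat` is the already-decoded repeat count and is ignored for
    symbols 0-15 (hypothetical representation, chosen for illustration).
    """
    lengths = []
    for symbol, repeat in symbols:
        if 0 <= symbol <= 15:
            # A literal code length.
            lengths.append(symbol)
        elif symbol == 16:
            # Copy the previous code length 3-6 times.
            if not lengths:
                raise ValueError("symbol 16 needs a previous length")
            if not 3 <= repeat <= 6:
                raise ValueError("symbol 16 repeats 3-6 times")
            lengths.extend([lengths[-1]] * repeat)
        elif symbol == 17:
            # Repeat a code length of 0 for 3-10 times.
            if not 3 <= repeat <= 10:
                raise ValueError("symbol 17 repeats 3-10 times")
            lengths.extend([0] * repeat)
        elif symbol == 18:
            # Repeat a code length of 0 for 11-138 times.
            if not 11 <= repeat <= 138:
                raise ValueError("symbol 18 repeats 11-138 times")
            lengths.extend([0] * repeat)
        else:
            raise ValueError(f"invalid symbol {symbol}")
    return lengths


stream = [(18, 12), (14, None), (4, None), (3, None), (3, None), (3, None),
          (4, None), (4, None), (5, None), (17, 3), (5, None),
          (16, 6), (16, 3), (7, None)]
print(expand_code_lengths(stream))
```

The same function reproduces the example from the specification: 8, 16(6), 16(5) expands to twelve code lengths of 8.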

Related

How can I lengthen a PCM audio sample buffer?

For example, I have a 100-byte PCM buffer and want to increase it to 300 bytes.
What I tried:
Assume the original buffer was 9, 4, 1, 7, 5
insert 0 - 9 0 0 4 0 0 1 0 0 7 0 0 5 0 0
average - 9 7 5 4 3 2 1 3 5 7 7 6 5 5 5
insert 0 in back - 9 4 1 7 5 0 0 0 0 0 0 0 0 0 0
They all produced weird noise in the resulting audio file.
How can I change the length of the buffer without affecting the sound?
Is there any formula I can use?
Usually linear interpolation works. What is the bit-resolution of your PCM file? If it is 16 bits (pretty typical), you'll have to first convert two bytes into a single value before applying the interpolation, and then disassemble the values back to bytes. You will need to know the byte order, as it can be either little-endian or big-endian.
EDIT: I should have added that the pitch will drop with this method of lengthening the file, unless the playback frame rate increases. To stretch out a sound in time without affecting its pitch is considerably more complicated.
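A minimal sketch of that approach, assuming mono, signed 16-bit samples and an integer stretch factor. The function name `stretch_pcm16` and its parameters are hypothetical; `byteorder` is a `struct` prefix, `"<"` for little-endian or `">"` for big-endian. Stereo data would need each channel interpolated separately.

```python
import struct


def stretch_pcm16(data: bytes, factor: int, byteorder: str = "<") -> bytes:
    """Lengthen mono signed 16-bit PCM by linear interpolation."""
    count = len(data) // 2
    # Combine each pair of bytes into one signed 16-bit sample value.
    samples = struct.unpack(f"{byteorder}{count}h", data[:count * 2])
    if not samples:
        return b""
    out = []
    for a, b in zip(samples, samples[1:]):
        # Emit `factor` values on the straight line from a towards b.
        for step in range(factor):
            out.append(round(a + (b - a) * step / factor))
    # The last sample has no successor, so its value is held.
    out.extend([samples[-1]] * factor)
    # Split the sample values back into bytes.
    return struct.pack(f"{byteorder}{len(out)}h", *out)


original = struct.pack("<5h", 9, 4, 1, 7, 5)
stretched = stretch_pcm16(original, 3)
print(struct.unpack("<15h", stretched))
```

Interpolating the raw bytes of 16-bit audio one by one, instead of the combined sample values, is one plausible source of the noise described in the question.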

How to list the number of words in a row with the most words?

I am trying to print the number of words in the longest line. I was able to print the number of words in each line, but I can't print the maximum. The max() function does not work. Can anyone help me?
import os
import sys
import numpy as np
with open('demofile.txt') as f:
    lines = f.readlines()
    for index, value in enumerate(lines):
        number_of_words = len(value.split())
        print(number_of_words)
demofile.txt
<=4 1 2 3 4 5 6 7 8 9 10 11
<=4 1 2 3 4 5 6 7 8 9
<=4 1 2 3 4 5 6 7 8 9 10 11 sdad adada affg
<=4 1 2 3 4 5 6 7 8 9 10 11
Output:
12
10
15
12
0
0
0
0
0
0
0
0
0
0
0
I also don't understand why it lists the number of words in the next lines where there are no words
If I understood correctly, the max() function doesn't work because you are searching for the max of strings, so you need to convert them to ints (or floats).
lines = [int(x) for x in lines.split(" ")]  # converts to ints
maximum = max(lines)  # should work now
UPD:
Edited with comment below.
Before:
int(x) for x in lines
Now:
int(x) for x in lines.split(" ")
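Another way to get the maximum is to collect the word count of every line and call max() on those counts. The sketch below is an assumption-laden illustration: the helper name `max_words_per_line` is hypothetical, and it assumes the zeros in the output come from blank lines at the end of the file, which it skips. Since `readlines()` returns a list, each line is split individually rather than the list as a whole.

```python
def max_words_per_line(lines):
    """Return the largest word count among the non-blank lines."""
    counts = [len(line.split()) for line in lines if line.strip()]
    # default=0 covers an empty file or one with only blank lines.
    return max(counts, default=0)


# Stand-in for the result of f.readlines() on demofile.txt.
sample = [
    "<=4 1 2 3 4 5 6 7 8 9 10 11\n",
    "<=4 1 2 3 4 5 6 7 8 9\n",
    "<=4 1 2 3 4 5 6 7 8 9 10 11 sdad adada affg\n",
    "<=4 1 2 3 4 5 6 7 8 9 10 11\n",
    "\n",
    "\n",
]
print(max_words_per_line(sample))  # prints 15
```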

How to set a variable space with right alignment for a string in Python?

I'm trying to do this program where given a number N, one has to print out the decimal, octal, hexadecimal and binary for all the numbers in range 1 to N. The trouble is that the platform requires the solution in a particular format.
Suppose the number is 17, so the output should be like :
1 1 1 1
2 2 2 10
3 3 3 11
4 4 4 100
5 5 5 101
6 6 6 110
7 7 7 111
8 10 8 1000
9 11 9 1001
10 12 A 1010
11 13 B 1011
12 14 C 1100
13 15 D 1101
14 16 E 1110
15 17 F 1111
16 20 10 10000
17 21 11 10001
For 7 it would be like :
1 1 1 1
2 2 2 10
3 3 3 11
4 4 4 100
5 5 5 101
6 6 6 110
7 7 7 111
If you notice, the above has to be printed so that the decimal, octal and hexadecimal numbers have a minimum of 2 spaces to their left, whereas the binary numbers have at least one space to their left. As the length of the numbers increases, the padding has to grow accordingly, so that the minimum space is there even for the longest number. So, how do I print them using a variable space? So far I have tried this:
Code
def print_formatted(number):
    space=len(str(bin(number))[2:])
    for i in range(1,number+1):
        print('{:2d}'.format(i), end='')
        print('{:>3s}'.format(str(oct(i))[2:]), end='')
        print('{:>3s}'.format(str(hex(i))[2:]), end='')
        print('{:>'+str(space)+'s}'.format(str(bin(i))[2:]))
print_formatted(17)
Here, I just tried doing the required with just the binary numbers but it's giving me an error
print('{:>'+str(space)+'s}'.format(str(bin(i))[2:]))
ValueError: Single '}' encountered in format string
Is there any fix/alternative for this?
Your problem is operator precedence: the + for string concatenation binds more weakly than the method call in
'{:>' + str(space) + 's}'.format(str(bin(i))[2:])
That's why .format(...) is called only on 's}', not on the whole string, and that's where the
ValueError: Single '}' encountered in format string
comes from.
Putting the complete format string in parentheses before applying .format to it fixes that.
You also need one more space for binary and can skip some str() calls that are not needed:
def print_formatted(number):
    space=len(str(bin(number))[2:])+1 # fix here
    for i in range(1,number+1):
        print('{:2d}'.format(i), end='')
        print('{:>3s}'.format(oct(i)[2:]), end='')
        print('{:>3s}'.format(hex(i)[2:]), end='')
        print(('{:>'+str(space)+'s}').format(bin(i)[2:])) # fix here
print_formatted(17)
Output:
 1  1  1     1
 2  2  2    10
 3  3  3    11
 4  4  4   100
 5  5  5   101
 6  6  6   110
 7  7  7   111
 8 10  8  1000
 9 11  9  1001
10 12  a  1010
11 13  b  1011
12 14  c  1100
13 15  d  1101
14 16  e  1110
15 17  f  1111
16 20 10 10000
17 21 11 10001
Judging from your expected output above, you might need to prepend 2 spaces to each line; I'm not sure whether that is a formatting error in your post or part of the requirements.
You could also shorten this by using f-strings (and by removing the superfluous str() around bin, oct and hex: they all return strings already).
Then you need to calculate the widths used to space out your values:
def print_formatted(number):
    de,bi,oc,he = len(str(number)), len(bin(number)), len(oct(number)), len(hex(number))
    for i in range(1,number+1):
        print(f' {i:{de}d}{oct(i)[2:]:>{oc}s}{hex(i)[2:]:>{he}s}{bin(i)[2:]:>{bi}s}')
print_formatted(26)
to accommodate values other than 17, e.g. 128:
   1    1   1         1
   2    2   2        10
   3    3   3        11
...
   8   10   8      1000
...
  16   20  10     10000
...
  32   40  20    100000
...
  64  100  40   1000000
...
 128  200  80  10000000

Variable string formatting in python 3

Input is a number, e.g. 9 and I want to print decimal, octal, hex and binary value from 1 to 9 like:
1 1 1 1
2 2 2 10
3 3 3 11
4 4 4 100
5 5 5 101
6 6 6 110
7 7 7 111
8 10 8 1000
9 11 9 1001
How can I achieve this in python3 using syntax like
dm, oc, hx, bn = len(str(9)), len(bin(9)[2:]), ...
print("{:dm%d} {:oc%s}" % (i, oct(i[2:]))
I mean, if the number is 999, I want decimal 10 to be printed like ' 10', and since the binary equivalent of 999 is 1111100111, I want binary 1010 to be printed like '      1010'.
You can use str.format() and its mini-language to do the whole thing for you:
for i in range(1, 10):
    print("{v} {v:>6o} {v:>6x} {v:>6b}".format(v=i))
Which will print:
1      1      1      1
2      2      2     10
3      3      3     11
4      4      4    100
5      5      5    101
6      6      6    110
7      7      7    111
8     10      8   1000
9     11      9   1001
UPDATE: To define field 'widths' in a variable you can use a format-within-format structure:
w = 5  # field width, i.e. offset to the right for all octal/hex/binary values
for i in range(1, 10):
    print("{v} {v:>{w}o} {v:>{w}x} {v:>{w}b}".format(v=i, w=w))
Or define a different width variable for each field type if you want them non-uniformly spaced.
Btw. since you've tagged your question with python-3.x, if you're using Python 3.6 or newer, you can use Literal String Interpolation to simplify it even more:
w = 5  # field width, i.e. offset to the right for all octal/hex/binary values
for v in range(1, 10):
    print(f"{v} {v:>{w}o} {v:>{w}x} {v:>{w}b}")
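To size each column from the input number, as the question asks, the widths can be computed from the representations of the largest value. The following is a sketch under that assumption: the function name `format_table` is hypothetical, and the names `dm`, `oc`, `hx`, `bn` are borrowed from the question. It relies on the largest number having the longest representation in every base.

```python
def format_table(n):
    """Return rows for 1..n, each column right-aligned to its widest value."""
    # Width of n itself in decimal, octal, hex and binary.
    dm, oc, hx, bn = (len(format(n, spec)) for spec in ("d", "o", "x", "b"))
    return [
        f"{v:>{dm}d} {v:>{oc}o} {v:>{hx}x} {v:>{bn}b}"
        for v in range(1, n + 1)
    ]


for row in format_table(9):
    print(row)
```

With n = 999 the decimal column is three characters wide, so 10 comes out as ' 10', and the binary column is ten characters wide, so 1010 is padded with six spaces.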

Average of multiple files with unequal row sizes in Shell

I have 15 data files with unequal numbers of rows, but the number of columns in each file is the same, e.g.
ifile1.dat     ifile2.dat     ifile3.dat     and so on ............
0    0         0    0         1    6
1    2         5    3         2    7
2    5         6    10        4    6
5    2         8    9         5    9
10   2         10   3         8    2
In each file, the first column represents the index number.
I would like to compute the average over all these files for each index number in column 1, i.e.
ofile.txt
0 0 [This is computed as (0+0)/2]
1 4 [This is computed as (2+6)/2]
2 6 [This is computed as (5+7)/2]
3 [no value]
4 6 [This is computed as (6)/1]
5 4.66 [This is computed as (2+3+9)/3]
6 10
7
8 5.5
9
10 2.5
I can't think of any simple method to do it. I was thinking of one method, but it seems very lengthy: taking the average after converting all the files to the same number of rows, e.g.
ifile1.dat     ifile2.dat     ifile3.dat     and so on ............
0    0         0    0         0    0
1    2         1              1    6
2    5         2              2    7
3              3              3
4              4              4    6
5    2         5    3         5    9
6              6    10        6
7              7              7
8              8    9         8    2
9              9              9
10   2         10   3         10
$ awk '{s[$1]+=$2; c[$1]++;} END{for (i in s) print i,s[i]/c[i];}' ifile*.dat
0 0
1 4
2 6
4 6
5 4.66667
6 10
8 5.5
10 2.5
In the above code, there are two arrays, s and c. s[i] is the sum of all entries with index i and c[i] is the number of entries with index i. After we have read all the files, we print the average, s[i]/c[i], for each index i.
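For comparison, the same sum-and-count idea can be sketched in Python. This is an illustration rather than a replacement for the awk one-liner: the function name `average_by_index` is hypothetical, the index column is assumed to hold integers, and the three sample lists stand in for the contents of ifile1.dat to ifile3.dat. Printing over the full index range leaves a blank for indices that have no value, as in the desired ofile.txt.

```python
from collections import defaultdict


def average_by_index(lines):
    """Average the second column per index (first column) over all lines."""
    sums = defaultdict(float)   # plays the role of s[] in the awk script
    counts = defaultdict(int)   # plays the role of c[] in the awk script
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete rows
        index = int(fields[0])
        sums[index] += float(fields[1])
        counts[index] += 1
    return {i: sums[i] / counts[i] for i in sorted(sums)}


file1 = ["0 0", "1 2", "2 5", "5 2", "10 2"]
file2 = ["0 0", "5 3", "6 10", "8 9", "10 3"]
file3 = ["1 6", "2 7", "4 6", "5 9", "8 2"]

averages = average_by_index(file1 + file2 + file3)
for i in range(0, 11):
    # Indices without any value are printed with an empty second column.
    print(i, averages.get(i, ""))
```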
