decompressing a compressed string


My question is that I don't know where to go from here with the code I have for a decompress function. I get the error (TypeError: can't multiply sequence by non-int of type 'str') and assume it's because I'm not multiplying the string correctly. Also, I can't use lists, just string manipulation, for this assignment.
Just as an example, the output is supposed to look like this: cat2dog1qwerty3 -> catcatdogqwertyqwertyqwerty
Function:
def decompress(compressed_in):
new_word = True
char_holder = ""
decompressed_out = ""
for char in compressed_in:
if char.isalpha and new_word:
char_holder += char
new_word = False
elif char.isalnum:
decompressed_out += char * char_holder
new_word = True
return decompressed_out
Main:
# Import
from compress import decompress
# Inputs
compressed_in = str(input("Enter a compressed string: ")) # compressed
# Outputs
decompressed_out = decompress(compressed_in)
print(decompressed_out)

Since this is apparently a homework assignment, I won't give you the code, but here are several problems I see with what you've presented.
Indentation. This is probably an artifact of copying-and-pasting, but every line after the def should be indented.
Not calling functions. When you write char.isalpha, that probably isn't doing what you want it to. .isalpha() is a function, so you need to call it with parentheses, like char.isalpha().
isalnum() is probably not the function you want. That checks if something is a letter or a number, but you've already checked for letters, so you probably want the function that checks if something is a number. This isn't strictly necessary, since the other if condition will still trigger first, but it's something you could get marked down for.
You never clear char_holder. It looks like you meant to, since you have a boolean new_word that you keep track of, but you aren't using it properly. At some point, you should be doing char_holder = char (i.e. not +=). I'll let you decide where to put that logic.
Finally, for the error you're getting. You are correct that you are not multiplying things together correctly. Think about what the types are in the multiplication statement, and what values the variables would have. For example, once the problems above are fixed, the first time the multiplication runs char_holder would be equal to 'cat', and char would be equal to '2'. Try typing '2' * 'cat' into a Python interpreter and see what happens. It should be evident from here what you need to do to fix this.
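The two language rules behind the "not calling functions" point and the TypeError can be checked in isolation. This is only a small sketch of how Python behaves, not a solution to the assignment; the variable names are made up for illustration.

```python
# A method that is referenced but not called is just an object, and such
# objects are always truthy, so `if char.isalpha:` passes for any character.
char = "2"
uncalled_is_truthy = bool(char.isalpha)   # True, even though "2" is a digit
called_result = char.isalpha()            # False: the call does the real check

# String repetition needs a str on one side and an int on the other.
repeated = "ab" * 3                       # "ababab"

# With a str on both sides Python refuses, which is the error in the question.
try:
    "ab" * "3"
    str_times_str_error = None
except TypeError as exc:
    str_times_str_error = str(exc)

print(uncalled_is_truthy, called_result, repeated)
print(str_times_str_error)
```

Comparing the operand types in the failing line with the ones in the working line points at the conversion that is missing.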

Related

Inner workings of map() in a specific parsing situation

I know there are already at least two topics that explain how map() works but I can't seem to understand its workings in a specific case I encountered.
I was working on the following Python exercise:
Write a program that computes the net amount of a bank account based on a
transaction log from console input. The transaction log format is
shown as following:
D 100
W 200
D means deposit while W means withdrawal. Suppose the following input
is supplied to the program:
D 300
D 300
W 200
D 100
Then, the output should be:
500
One of the answers offered for this exercise was the following:
total = 0
while True:
    s = input().split()
    if not s:
        break
    cm,num = map(str,s)
    if cm=='D':
        total+=int(num)
    if cm=='W':
        total-=int(num)
print(total)
Now, I understand that map applies a function (str) to an iterable (s), but what I'm failing to see is how the program identifies what is a number in the s string. I assume str converts each letter/number/etc in a string type, but then how does int(num) know what to pick as a whole number? In other words, how come this code doesn't produce some kind of TypeError or ValueError, because the way I see it, it would try and make an integer of (for example) "D 100"?

First,
cm,num = map(str,s)
could be simplified to
cm,num = s
since s is already a list of two strings (if the input is correct). There is no need to convert strings that are already strings; s is just unpacked into two variables.
the way I see it, it would try and make an integer of (for example) "D 100"?
No, it cannot, since num is only the second element of the split input.
If the input is "D 100", then s is ['D','100'], so cm is 'D' and num is '100'.
Since num holds the text of an integer, int(num) converts it to its integer value.
The above code is completely devoid of error checking (number of parameters, parameter "types"), but with correct input it works.
And map is completely useless in that particular example.
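As a rough sketch of the points above (plain unpacking instead of map, plus the error checking the original lacks), the loop could look like the following. The function name net_amount and the choice to raise ValueError on bad lines are assumptions made for illustration; the exercise itself does not specify how invalid input should be handled.

```python
def net_amount(lines):
    """Add deposits (D) and subtract withdrawals (W) from 'D 100'-style lines."""
    total = 0
    for line in lines:
        parts = line.split()              # "D 100" -> ['D', '100']
        if not parts:
            break                         # a blank line ends the log, as in the original
        if len(parts) != 2:
            raise ValueError(f"expected '<D|W> <amount>', got {line!r}")
        cm, num = parts                   # plain unpacking; both are already str
        amount = int(num)                 # raises ValueError if num is not an integer
        if cm == "D":
            total += amount
        elif cm == "W":
            total -= amount
        else:
            raise ValueError(f"unknown transaction type {cm!r}")
    return total


print(net_amount(["D 300", "D 300", "W 200", "D 100"]))  # 500
```

Taking the lines as an argument rather than calling input() inside the loop is only there to make the sketch easy to try without typing at a console.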

The reason is the .split() call in s = input().split(). It creates a list of the values D and 100 (that is, ['D', '100']), because by default split() splits on whitespace. Then the map function applies str to both 'D' and '100'.
The map call is not really required, because both values are already of type str (strings) when they come from input().
The second question is how int(num) knows how to convert a string. This has to do with the second (implicit) argument, base. Just as .split() has a default for what to split on, int has a default base to convert from.
The full call is equivalent to int(num, base=10). So as long as num consists only of the digits 0-9 (optionally with a leading sign and surrounding whitespace), int can convert it as a base-10 number. A string containing a decimal point, such as '1.5', raises ValueError. For more examples, see the documentation for the built-in int.
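The defaults described above can be seen directly in a few lines. This is just an illustrative sketch; the variable names are invented for the example.

```python
# split() with no argument splits on any run of whitespace.
split_result = "D 100".split()            # ['D', '100']
split_tabs = "D \t 100\n".split()         # ['D', '100']

# int() on a string defaults to base 10; other bases must be given explicitly.
as_decimal = int("100")                   # 100
explicit_base = int("100", base=10)       # 100, same as the default
as_binary = int("100", base=2)            # 4

# A whole line, or a string with a decimal point, is not a valid integer literal.
errors = []
for text in ("D 100", "1.5"):
    try:
        int(text)
    except ValueError as exc:
        errors.append(str(exc))

print(split_result, as_decimal, as_binary)
print(errors)
```

So the program never passes "D 100" to int; by the time int is called, split() has already isolated '100'.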

What is the Pythonic way to ID when in a bytestring values stop being in a particular range?

I need a function which takes a bytestring of length n, and returns only those first m bytes that can be decoded into the Latin1 character set. (I am writing a translator for a very old file format, and one of the fields providing a human readable label is, unlike much of the rest of the file format, of fixed length, so after the human readable name there's a byte that is either less than b'\x20' (space) or greater than b'\x7e' (tilde), and then uninitialized byte values for the remainder of the fixed field.)
Currently I make do thus:
n = 27
def extract_name(bytestring):
    for i in range(0,n):
        if (bytestring[i:i+1]<b' ') or (bytestring[i:i+1]>b'~'):
            name_length = i
            break
    else:
        name_length = n
    name = bytestring[0:name_length].decode('Latin1')
    return name
This works:
>>> my_string = b'My Great Name\xf0\x1e\x23\x23\xe1\x06\xbc\x1b\x8a\xf7\x00\x00\x00\x00\x00\x00\x00'
>>> extract_name(my_string)
'My Great Name'
However, I feel clunky about the code, and wonder if I can make it more Pythonic.

Regular expressions are your friend!
import re
def extract_name(bytestring):
    return re.sub(b'[\x00-\x1f].*|[\x7f-\xff].*', b'', bytestring).decode('Latin1')
This gives the desired behavior:
>>> my_string = b'My Great Name\xf0\x1e\x23\x23\xe1\x06\xbc\x1b\x8a\xf7\x00\x00\x00\x00\x00\x00\x00'
>>> extract_name(my_string)
'My Great Name'
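A possible variation on the same idea is to match the leading run of printable bytes instead of deleting everything after it. The sketch below assumes "printable" means the range b'\x20' to b'\x7e' used in the question's loop; the names PRINTABLE_RUN and extract_name_prefix are made up for this example.

```python
import re

# Zero or more bytes in the printable ASCII range, anchored at the start by match().
PRINTABLE_RUN = re.compile(rb'[\x20-\x7e]*')


def extract_name_prefix(bytestring):
    """Return the leading printable run of bytestring, decoded as Latin-1."""
    # The pattern can match an empty run, so match() never returns None here.
    return PRINTABLE_RUN.match(bytestring).group().decode('latin-1')


my_string = b'My Great Name\xf0\x1e\x23\x23\xe1\x06\xbc\x1b\x8a\xf7\x00\x00\x00\x00\x00\x00\x00'
print(extract_name_prefix(my_string))  # My Great Name
```

Because it only describes what to keep, this version does not depend on how the unwanted tail is matched.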

Python get character position matches between 2 strings

I'm looking to encode text using a custom alphabet. While I have a decoder for such a thing, I'm finding encoding more difficult.
I have tried string.find, string.index, itertools and several loops. I would like to take the position of each match and add it, as an integer, to a list. I know it's something simple I'm overlooking, and any of these options will probably yield the desired results; I'm just hitting a roadblock for some reason.
alphabet = '''h8*jklmnbYw99iqplnou b'''
toencode = 'You win'
I would like the result to be a list containing the integer position of each match between the two strings. I imagine the output to look similar to this:
[9,18,19,20,10,13,17]

Ok, I just tried a bit harder and got this working. For anyone who ever wants to reference this, I did the following:
newlist = []
for p in enumerate(toencode):
    for x in enumerate(alphabet):
        # p and x are (index, character) pairs
        if p[1] == x[1]:
            newlist.append(x[0])
print(newlist)
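Note that the nested loop appends every position at which a character occurs, so a character that appears twice in the alphabet (such as 'n' here) contributes two entries. A sketch of two alternatives, assuming either the first position or all positions per character is what is wanted; the names first_positions and all_positions are only for illustration.

```python
alphabet = 'h8*jklmnbYw99iqplnou b'
toencode = 'You win'

# str.find returns the index of the first occurrence of the character,
# or -1 if the character is not in the alphabet at all.
first_positions = [alphabet.find(ch) for ch in toencode]
print(first_positions)  # [9, 18, 19, 20, 10, 13, 7]

# Every position per character, kept grouped so duplicates stay visible.
all_positions = [
    [i for i, a in enumerate(alphabet) if a == ch]
    for ch in toencode
]
print(all_positions[-1])  # [7, 17]
```

If a missing character should be an error rather than -1, str.index raises ValueError instead.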

Pyparsing - matching the outermost set of nested brackets

I'm trying to use pyparsing to build a parser that will match on all text within an arbitrarily nested set of brackets. If we consider a string like this:
"[A,[B,C],[D,E,F],G] Random Middle text [H,I,J]"
What I would like is for a parser to match in a way that it returns two matches:
[
"[A,[B,C],[D,E,F],G]",
"[H,I,J]"
]
I was able to accomplish a somewhat-working version of this using a barrage of originalTextFor mashed up with nestedExpr, but this breaks when your nesting is deeper than the number of OriginalTextFor expressions.
Is there a straightforward way to only match on the outermost expression grabbed by nestedExpr, or a way to modify its logic so that everything after the first paired match is treated as plaintext rather than being parsed?
update: One thing that seems to come close to what I want to accomplish is this modified version of the logic from nestedExpr:
def mynest(opener='{', closer='}'):
    content = (empty.copy()+CharsNotIn(opener+closer+ParserElement.DEFAULT_WHITE_CHARS))
    ret = Forward()
    ret <<= ( Suppress(opener) + originalTextFor(ZeroOrMore( ret | content )) + Suppress(closer) )
    return ret
This gets me most of the way there, although there's an extra level of list wrapping in there that I really don't need, and what I'd really like is for those brackets to be included in the string (without getting into an infinite recursion situation by not suppressing them).
parser = mynest("[","]")
result = parser.searchString("[A,[B,C],[D,E,F],G] Random Middle text [H,I,J]")
result.asList()
>>> [['A,[B,C],[D,E,F],G'], ['H,I,J']]
I know I could strip these out with a simple list comprehension, but it would be ideal if I could just eliminate that second, redundant level.

Not sure why this wouldn't work:
from pyparsing import originalTextFor, nestedExpr
sample = "[A,[B,C],[D,E,F],G] Random Middle text [H,I,J]"
scanner = originalTextFor(nestedExpr('[',']'))
for match in scanner.searchString(sample):
    print(match[0])
prints:
'[A,[B,C],[D,E,F],G]'
'[H,I,J]'
What is the situation where "this breaks when your nesting is deeper than the number of OriginalTextFor expressions"?
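For cross-checking what the outermost matches should be at any nesting depth, a plain depth counter can serve as a reference. This is a standard-library sketch, not a pyparsing technique; the function name outermost_brackets is hypothetical, and unbalanced closers are simply ignored.

```python
def outermost_brackets(text, opener="[", closer="]"):
    """Return each top-level bracketed span of text, brackets included."""
    matches = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == opener:
            if depth == 0:
                start = i                 # remember where a top-level group begins
            depth += 1
        elif ch == closer and depth > 0:
            depth -= 1
            if depth == 0:
                matches.append(text[start:i + 1])
    return matches


sample = "[A,[B,C],[D,E,F],G] Random Middle text [H,I,J]"
print(outermost_brackets(sample))
```

Comparing its output with the parser's on deeply nested input is one way to pin down where a grammar stops behaving as expected.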

How can I write the following script in Python?

So the program that I want to write adds two strings S1 and S2 that are made of digits.
Example: S1='129782004977', S2='754022234930', SUM='883804239907'
So far I've done this, but it still has a problem: it does not give me the whole SUM.
def addS1S2(S1,S2):
    N=abs(len(S2)-len(S1))
    if len(S1)<len(S2):
        S1=N*'0'+S1
    if len(S2)<len(S1):
        S2=N*'0'+S2
    #the first part was to make the two strings with the same len.
    S=''
    r=0
    for i in range(len(S1)-1,-1,-1):
        s=int(S1[i])+int(S2[i])+r
        if s>9:
            r=1
            S=str(10-s)+S
        if s<9:
            r=0
            S=str(s)+S
        print(S)
        if r==1:
            S=str(r)+S
    return S

This appears to be homework, so I will not give full code but just a few pointers.
There are three problems with your algorithm. If you fix those, then it should work.
10-s will give you negative numbers, thus all those - signs in the sum. Change it to s-10
You are missing all the 9s. Change if s<9: to if s<=9:, or even better, just else:
You should not add r to the string in every iteration, but just at the very end, after the loop.
Also, instead of using those convoluted if statements to check r and subtract 10 from s, you can just use integer division and modulo: r = s // 10 and s = s % 10, or just r, s = divmod(s, 10).
If this is not homework: Just use int(S1) + int(S2).
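For anyone who wants to check their own result against the pointers above, here is one way they could fit together. It is a sketch rather than the asker's intended solution: the function name add_digit_strings is made up, and str.zfill is swapped in for the manual zero padding.

```python
def add_digit_strings(a, b):
    """Add two non-negative integers given as digit strings, digit by digit."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)     # left-pad the shorter one with zeros
    digits = ""
    carry = 0
    for i in range(width - 1, -1, -1):        # rightmost digit first
        carry, digit = divmod(int(a[i]) + int(b[i]) + carry, 10)
        digits = str(digit) + digits
    if carry:                                 # leftover carry is added once, after the loop
        digits = str(carry) + digits
    return digits


print(add_digit_strings('129782004977', '754022234930'))  # 883804239907
```

The result can be compared against str(int(S1) + int(S2)) for any pair of inputs.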
