Python Challenge # 2 = removing characters from a string - python-3.x

I have the code:
theory = """}#)$[]_+(^_#^][]_)*^*+_!{&$##]((](}}{[!$#_{&{){
*_{^}$#!+]{[^&++*#!]*)]%$!{#^&%(%^*}#^+__])_$#_^#[{{})}$*]#%]{}{][#^!#)_[}{())%)
())&##*[#}+#^}#%!![#&*}^{^(({+#*[!{!}){(!*#!+#[_(*^+*]$]+#+*_##)&)^(#$^]e#][#&)(
%%{})+^$))[{))}&$(^+{&(#%*#&*(^&{}+!}_!^($}!(}_##++$)(%}{!{_]%}$!){%^%%#^%&#([+[
_+%){{}(#_}&{&++!#_)(_+}%_#+]&^)+]_[#]+$!+{#}$^!&)#%#^&+$#[+&+{^{*[#]#!{_*[)(#[[
]*!*}}*_(+&%{&#$&+*_]#+#]!&*#}$%)!})#&)*}#(#}!^(]^#}]#&%)![^!$*)&_]^%{{}(!)_&{_{
+[_*+}]$_[##_^]*^*##{&%})*{&**}}}!_!+{&^)__)#_#$#%{+)^!{}^#[$+^}&(%%)&!+^_^#}^({
*%]&#{]++}#$$)}#]{)!+#[^)!#[%#^!!"""
#theory = open("temp.txt")
key = "##!$%+{}[]_-&*()*^#/"
new2 =""
print()
for letter in theory:
    if letter not in key:
        new2 += letter
print(new2)
This is a test piece of code to solve the python challenge #2: http://www.pythonchallenge.com/pc/def/ocr.html
The only trouble is, the code I wrote seems to leave a lot of whitespace, but I'm not sure why.
Any ideas on how to remove the unnecessary whitespace? In other words, I want the code to return "e", not " e ".

The challenge is to find a rare character. You could use collections.Counter for that:
from collections import Counter
c = Counter(theory)
print(c.most_common()[-1])
Output
('e', 1)
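The unnecessary whitespace could be removed using .strip(), which returns a new string without leading and trailing whitespace:
print(new2.strip())
Adding '\n' to the key works too, and also removes newlines that fall between letters.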
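The Counter idea above can be extended to keep every character that occurs only once, in order of appearance. This is a minimal sketch; the helper name rare_characters, its max_count parameter, and the short sample string are assumptions for illustration, not part of the challenge data.

```python
from collections import Counter


def rare_characters(text, max_count=1):
    """Return the characters that occur at most max_count times, in order."""
    counts = Counter(text)
    return ''.join(ch for ch in text if counts[ch] <= max_count)


# Tiny stand-in for the puzzle text (hypothetical, not the real data)
sample = "#$#$\n#e$#\n$#q$#\n"
print(rare_characters(sample))
```

Because the newline occurs more than once in the sample, it is filtered out along with the punctuation, so no separate .strip() call is needed here.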

The best approach would be to use the regular expression library, like so:
import re
characters = re.findall("[a-zA-Z]", sourcetext)
print("".join(characters))
The resulting string will contain ONLY alphabetic characters.

If you look at the distribution of characters (using collections.Counter), you get:
6000+ each of )#(]#_%[}!+$&{*^ (which you are correctly excluding from the output)
1220 newlines (which you are not excluding from the output)
1 each of — no, I'm not going to give away the answer
Just add \n to your key variable to exclude the unwanted newlines. This will leave you with just the rare (i.e., 1 occurrence only) characters you need.
P.S., it's highly inefficient to concatenate strings in a loop. Instead of:
new2 =""
for letter in theory:
    if letter not in key:
        new2 += letter
write:
new2 = ''.join(letter for letter in theory if letter not in key)
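Putting the two suggestions from this answer together (newline added to the key, join instead of repeated concatenation) might look like the sketch below. The function name strip_characters and the short sample string are hypothetical; converting the key to a set is an optional extra, not something the answer requires.

```python
def strip_characters(text, unwanted):
    """Return text with every character listed in unwanted removed."""
    unwanted = set(unwanted)  # set membership tests are fast
    return ''.join(ch for ch in text if ch not in unwanted)


# The punctuation key plus the newline character
key = "#!$%+{}[]_-&*()^/" + "\n"

# Short stand-in for the multi-line puzzle string (hypothetical)
sample = "}#)$[\n]_+e(^\n_#^]"
result = strip_characters(sample, key)
print(result)
```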

The theory string contains several newlines. They get printed by your code. You can either get rid of the newlines, like this:
theory = "}#)$[]_+(^_#^][]_)*^*+_!{&$##]((](}}{[!$#_{&{){" \
"*_{^}$#!+]{[^&++*#!]*)]%$!{#^&%(%^*}#^+__])_$#_^#[{{})}$*]#%]{}{][#^!#)_[}{())%)" \
"())&##*[#}+#^}#%!![#&*}^{^(({+#*[!{!}){(!*#!+#[_(*^+*]$]+#+*_##)&)^(#$^]e#][#&)(" \
"%%{})+^$))[{))}&$(^+{&(#%*#&*(^&{}+!}_!^($}!(}_##++$)(%}{!{_]%}$!){%^%%#^%&#([+[" \
"_+%){{}(#_}&{&++!#_)(_+}%_#+]&^)+]_[#]+$!+{#}$^!&)#%#^&+$#[+&+{^{*[#]#!{_*[)(#[[" \
"]*!*}}*_(+&%{&#$&+*_]#+#]!&*#}$%)!})#&)*}#(#}!^(]^#}]#&%)![^!$*)&_]^%{{}(!)_&{_{" \
"+[_*+}]$_[##_^]*^*##{&%})*{&**}}}!_!+{&^)__)#_#$#%{+)^!{}^#[$+^}&(%%)&!+^_^#}^({" \
"*%]&#{]++}#$$)}#]{)!+#[^)!#[%#^!!"
or you can filter them out, like this:
key = "##!$%+{}[]_-&*()*^#/\n"
Both work fine (yes, I tested).

A simpler way to output the answer is:
print(''.join(c for c in theory if c not in key))
and in your case you might want to add the newline character to key to also filter it out:
key += "\n"

You'd better work in reverse, something like this:
out = []
for i in theory:
    a = ord(i)
    # keep a-z (97-122) and A-Z (65-90), both ends inclusive
    if 97 <= a <= 122 or 65 <= a <= 90:
        out.append(i)
print(''.join(out))
Or better, use a regexp.

Related

Caesar Cipher in Python - how to replace characters

I'm trying to re-arrange long sentence from a puzzle that is encoded using a Caesar Cipher.
Here is my code.
sentence="g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj."
import string
a=string.ascii_lowercase
b=a[2:]+a[:2]
for i in range(26):
    sentence.replace(sentence[sentence.find(a[i])],b[i])
Am I missing anything in the replace function?
When I tried sentence.replace(sentence[sentence.find(a[0])],b[0])
it worked, but why can't I loop through?
Thanks.
sentence.replace returns a new string, which you are immediately throwing away. Note that replacing each character repeatedly will cause duplicate replacements in your cipher. See @RemcoGerlich's answer for a more detailed explanation of what is wrong. As for the solution, what about
import string
letters = string.ascii_lowercase
shifted = {l: letters[(i + 2) % len(letters)] for i, l in enumerate(letters)}
sentence = ''.join(shifted.get(c, c) for c in sentence.lower())
or if you really want the tabled way:
from string import ascii_lowercase
rotated_lowercase = ascii_lowercase[2:] + ascii_lowercase[:2]
translation_table = str.maketrans(ascii_lowercase, rotated_lowercase)
sentence = sentence.translate(translation_table)
There are a few problems:
One, sentence[sentence.find(a[i])] is strange. It tries to look up where in the sentence the character a[i] occurs, and then looks up which character is there. Well, you already know -- a[i]. Unless that character doesn't occur in the string; then .find will return -1, and sentence[-1] is the last character in the sentence. Probably not what you meant. So instead you meant sentence.replace(a[i], b[i]).
But, you don't save the result anywhere. You meant sentence = sentence.replace(a[i], b[i]).
But that still doesn't work! What if a should be changed into b, and then b into c? Then the original a's are also changed into c! That's a fundamental problem with your approach.
Better solutions are given by modesitt. Mine would have been something like
lookupdict = {a_char: b_char for (a_char, b_char) in zip(a, b)}
sentence_translated = [lookupdict.get(s, s) for s in sentence]  # characters not in the table (spaces, punctuation) are kept as-is
sentence = ''.join(sentence_translated)
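The chained-replacement problem described above is easy to reproduce on a three-letter input. The sketch below compares a loop of str.replace calls with a single str.translate pass; the function names shift_by_replace and shift_by_translate are made up for this illustration.

```python
import string

a = string.ascii_lowercase
b = a[2:] + a[:2]  # alphabet rotated by two positions


def shift_by_replace(text):
    """Sequential replace: letters that were already shifted get shifted again."""
    for src, dst in zip(a, b):
        text = text.replace(src, dst)
    return text


def shift_by_translate(text):
    """Single pass: every character is mapped exactly once."""
    return text.translate(str.maketrans(a, b))


print(shift_by_replace("ace"))    # not the intended shift
print(shift_by_translate("ace"))  # each letter moved forward by two
```

In the loop version, the "a" that became "c" is picked up again when the loop reaches "c", and so on down the alphabet, which is why a one-pass mapping is the safer choice.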

delete characters that are not letters, numbers, whitespace?

community,
I need to clean a string, so that it will contain only letters, numbers and whitespace.
The string currently consists of several sentences.
I tried:
for entry in s:
    if not isalpha() or isdigit() or isspace:
        del (entry)
    else: s.append(entry) # the wanted characters should be saved in the string, the rest should be deleted
I am using python 3.4.0
You can use this:
clean_string = ''.join(c for c in s if c.isalnum() or c.isspace())
It iterates through each character, leaving you only with the ones that satisfy at least one of the two criteria, then joins them all back together. I am using isalnum() to check for alphanumeric characters, rather than both isalpha() and isdigit() separately.
You can achieve the same thing using filter. In Python 3 it returns an iterator rather than a string, so join the result:
clean_string = ''.join(filter(lambda c: c.isalnum() or c.isspace(), s))
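The or does not work the way you think it works in English: each test has to be written out as a method call on the character, and not applies only to the first test. Instead, you should do:
new_s = ''
for entry in s:
    if entry.isalpha() or entry.isdigit() or entry.isspace():
        new_s += entry
print(new_s)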
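To make the point about or concrete: not binds more tightly than or, so "not a or b or c" is evaluated as "(not a) or b or c". The sketch below shows the two groupings disagreeing for a digit; the variable names loose and strict and the helper clean are hypothetical.

```python
def clean(s):
    """Keep only letters, digits and whitespace."""
    return ''.join(c for c in s if c.isalnum() or c.isspace())


c = '7'

# Parsed as (not c.isalpha()) or c.isdigit() or c.isspace()
loose = not c.isalpha() or c.isdigit() or c.isspace()

# True only when c is none of the three kinds of character
strict = not (c.isalpha() or c.isdigit() or c.isspace())

print(loose, strict)
print(clean("Hi, there! 42"))
```

Here loose is true even though "7" is a digit that should be kept, while strict correctly reports that "7" is not an unwanted character.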

Finding mean of ascii values in a string MATLAB

The string I am given is as follows:
scrap1 =
a le h
ke fd
zyq b
ner i
You'll notice there are 2 blank spaces indicating a space (ASCII 32) in each row. I need to find the mean ASCII value in each column without taking into account the spaces (32). So first I would convert to numbers with double(scrap1), but then how do I find the mean without taking into account the spaces?
If it's only the ASCII 32 you want to omit:
d = double(scrap1);
result = mean(d(d~=32)); %// logical indexing to remove unwanted value, then mean
You can remove the intermediate spaces in the string with scrap1(scrap1 == ' ') = ''; This replaces any space in the input with an empty string. Then you can do the conversion to double and average the result. See here for other methods.
You can probably use a regex to find the spaces and ignore them, with the pattern "\s":
findSpace = regexp(scrap1, '\s', 'ignore')
% I am not sure about the 'ignore' option; this is just what comes to mind. You can read more about regexp by typing doc regexp.

Convert underscores to spaces in Matlab string?

So say I have a string with some underscores like hi_there.
Is there a way to auto-convert that string into "hi there"?
(the original string, by the way, is a variable name that I'm converting into a plot title).
Surprising that no-one has yet mentioned strrep:
>> strrep('string_with_underscores', '_', ' ')
ans =
string with underscores
which should be the official way to do simple string replacements. For such a simple case, regexprep is overkill: yes, regular expressions are Swiss Army knives that can do everything possible, but they come with a long manual. String indexing as shown by AndreasH only works for replacing single characters; it cannot do this:
>> s = 'string*-*with*-*funny*-*separators';
>> strrep(s, '*-*', ' ')
ans =
string with funny separators
>> s(s=='*-*') = ' '
Error using ==
Matrix dimensions must agree.
As a bonus, it also works for cell-arrays with strings:
>> strrep({'This_is_a','cell_array_with','strings_with','underscores'},'_',' ')
ans =
'This is a' 'cell array with' 'strings with' 'underscores'
Try this Matlab code for a string variable 's'
s(s=='_') = ' ';
If you ever have to do anything more complicated, say replacing multiple variable-length strings, s(s == '_') = ' ' will be a huge pain. If your replacement needs ever get more complicated, consider using regexprep:
>> regexprep({'hi_there', 'hey_there'}, '_', ' ')
ans =
'hi there' 'hey there'
That being said, in your case @AndreasH.'s solution is the most appropriate and regexprep is overkill.
A more interesting question is why you are passing variables around as strings?
regexprep() may be what you're looking for and is a handy function in general.
regexprep('hi_there','_',' ')
Will take the first argument string, and replace instances of the second argument with the third. In this case it replaces all underscores with a space.
In Matlab strings are vectors, so performing simple string manipulations can be achieved using standard operators e.g. replacing _ with whitespace.
text = 'variable_name';
text(text=='_') = ' '; % replace all occurrences of underscore with whitespace
=> text = variable name
I know this was already answered; however, in my case I was looking for a way to correct plot titles so that I could include a filename (which could have underscores). So, I wanted to print them with the underscores NOT displaying as subscripts. Using the great info above, rather than substituting a space, I escaped the underscore in the substitution.
For example:
% Have the user select a file:
[infile inpath]=uigetfile('*.txt','Get some text file');
figure
% this is a problem for filenames with underscores
title(infile)
% this correctly displays filenames with underscores
title(strrep(infile,'_','\_'))

Octave string manipulation

I have a problem in Octave.
I want to find all different(!) pairs of two letters in a text (with no spaces, only letters).
For example:
my text = "abcdabcd"
I want to find an array (or vector?) that looks like: ab bc cd da
How do I do this in the easiest way possible?
Thanks for your help.
You can use the unique() function to do this. The only trick is in creating the list of two characters which can be done by using two lines, shifted by one character.
str = "abcdabcd";
str(2,:) = shift (str, -1);
str(:,end) = []; # remove last column
unique (str', "rows")

Resources