I am trying to take the letters from the string 'Hello, World!' and convert them into values using a dictionary.
I've tried using strings and lists to see if this would work but can't figure it out.
d ={'H':1, 'e':2, 'l':3, 'o':4,',':5, ' ':6, 'W':7, 'r':8, 'd':9, '!':10}
mystr = 'Hello, World!'
mystr1 = d(mystr)
print(mystr1)
The error I keep getting is TypeError: 'dict' object is not callable.
'Hello, World!'
My expected output is: '12334567483910'
If possible, I would also like a way to convert the number back to the string 'Hello, World!'.
You can do what you're trying to do by converting your dictionary to a translation table and then using the str.translate method.
d = {'H':'1', 'e':'2', 'l':'3', 'o':'4',',':'5', ' ':'6', 'W':'7', 'r':'8', 'd':'9', '!':'10'}
tt = str.maketrans(d)
print("Hello, World!".translate(tt))
# 12334567483910
Note that we had to change the values of the dictionary from integers to strings, otherwise the str.maketrans method treats them like Unicode ordinals.
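To see why the values have to be strings, here is a minimal sketch comparing the two kinds of table. The names int_table and str_table are only illustrative and do not come from the code above.

```python
# Integer values are interpreted as Unicode code points (ordinals).
int_table = str.maketrans({'H': 1, 'e': 2})
# String values are inserted into the result as-is.
str_table = str.maketrans({'H': '1', 'e': '2'})

as_ordinals = 'He'.translate(int_table)
as_digits = 'He'.translate(str_table)

print(repr(as_ordinals))  # the control characters U+0001 and U+0002
print(repr(as_digits))    # the digits we actually wanted
```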
You need to iterate over each character of mystr. In the code below, for c in mystr loops through each character, and d.get(c, '') retrieves its value from the dictionary without raising an error if the character isn't there (it is simply left out). The str function converts the integer from the dictionary to a string; if the values in the dictionary were already strings you wouldn't need it, although then you could use translate as in Patrick's answer. Finally, ''.join joins all the pieces back together into a new string.
If you want an error to be raised when a character is not in the dictionary, use d[c] instead of d.get(c, '').
d ={'H':1, 'e':2, 'l':3, 'o':4,',':5, ' ':6, 'W':7, 'r':8, 'd':9, '!':10}
mystr = 'Hello, World'
encoded_string = ''.join(str(d.get(c, '')) for c in mystr)
print('Encoded String:', encoded_string)
r = dict((value, key) for key, value in d.items())
decoded_string = ''.join(r.get(int(c)) for c in encoded_string)
print('Decoded String:', decoded_string)
Result
Encoded String: 123345674839
Decoded String: Hello, World
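Note that decoding one digit at a time only works while every code is a single digit. The '!' character maps to 10, which would be read back as 1 and 0. One way around this, sketched here under the assumption that a separator between codes is acceptable (the '-' separator and the encode/decode helper names are hypothetical choices, not part of the original question):

```python
d = {'H': 1, 'e': 2, 'l': 3, 'o': 4, ',': 5, ' ': 6, 'W': 7, 'r': 8, 'd': 9, '!': 10}
reverse = {value: key for key, value in d.items()}

def encode(text, sep='-'):
    # d[c] raises KeyError for characters missing from the dictionary.
    return sep.join(str(d[c]) for c in text)

def decode(encoded, sep='-'):
    if not encoded:
        return ''
    # Split on the separator so multi-digit codes such as 10 stay intact.
    return ''.join(reverse[int(code)] for code in encoded.split(sep))

encoded = encode('Hello, World!')
print(encoded)
print(decode(encoded))
```

The output is no longer the exact '12334567483910' requested, but without some separator (or fixed-width codes) the digit string cannot be decoded unambiguously.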
# the list to turn into a string
a = ['h','e','l','l','o','','w','o','r','l','d','!']
#wanted output
hello world!
Some suggestions have already been given above for solving this problem. Another approach is to create an empty string variable and concatenate every character from the list onto it. For example:
a = ['h','e','l','l','o','','w','o','r','l','d','!']
temp_string = ""
for eachChar in a:
    temp_string += eachChar
print(temp_string)
This outputs:
helloworld!
This is because the element between hello and world is an empty string rather than a space. If it were a space instead, as in ['h','e','l','l','o',' ','w','o','r','l','d','!'], the output would match what you have shown above.
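The same result can be had in one step with str.join. A short sketch, assuming the list contains a real space between the two words:

```python
# The sixth element is a space, not an empty string.
a = ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!']

# Join every element using the empty string as the separator.
joined = ''.join(a)
print(joined)  # hello world!
```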
The following works:
import re
text = "I\u2019m happy"
text_p = text
text_p = re.sub("[\u2019]","'",text_p)
print(text_p)
Output: I'm happy
This doesn't work:
import re
import pandas as pd
training_data = pd.read_csv('train.txt')
text = training_data['tweet_text'][0] # Assume that this returns a string "I\u2019m happy"
text_p = text
text_p = re.sub("[\u2019]","'",text_p)
print(text_p)
Output: I\u2019m happy
I tried running your code and got I'm happy returned from both the string and the list item when passing each into re.sub(...) as outlined in your question.
If you're just looking to parse (decode) the unicode characters you probably don't need to be using re. Something like the below could be used to parse the unicode characters without having to run re to check each possibility.
text = training_data['tweet_text'][0]
if type(text) == str:  # if value is str then encode to utf-8 byte string then decode back to str
    text = text.encode()
    text = text.decode()
elif type(text) == bytes:  # elif value is bytes just decode to str
    text = text.decode()
else:  # else printout to console if value is neither str or bytes
    print("Value not recognised as str or bytes!")
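One possible explanation, which is only an assumption since the file itself is not shown, is that train.txt contains the six literal characters \u2019 (a backslash, a u, and four digits) rather than the apostrophe character itself. In that case the pattern [\u2019] never matches. A minimal sketch that converts such literal escapes, where the helper name unescape_unicode is hypothetical:

```python
import re

def unescape_unicode(text):
    # Replace each literal \uXXXX sequence with the character it names.
    return re.sub(
        r'\\u([0-9a-fA-F]{4})',
        lambda m: chr(int(m.group(1), 16)),
        text,
    )

# Backslash, 'u' and '2019' stored as six separate characters,
# standing in for what the CSV might contain.
raw = 'I\\u2019m happy'

unescaped = unescape_unicode(raw)
fixed = unescaped.replace('\u2019', "'")
print(fixed)
```

If the data already holds the real U+2019 character, the helper leaves it untouched and the final replace still applies.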
My question is more or less similar to:
Is there a way to substring a string in Python?
but it's more specifically oriented.
How can I get a part of a string that is located between two known words in the initial string?
Example:
myString = "this is the initial string"
Substring = "initial"
Here "the" and "string" are the two known words in the string that can be used to get the substring.
Thank you!
You can start with simple string manipulation here. str.index is your best friend there, as it will tell you the position of a substring within a string; and you can also start searching somewhere later in the string:
>>> myString = "this is the initial string"
>>> myString.index('the')
8
>>> myString.index('string', 8)
20
Looking at the slice [8:20], we already get close to what we want:
>>> myString[8:20]
'the initial '
Of course, since we found the beginning position of 'the', we need to account for its length. And finally, we might want to strip whitespace:
>>> myString[8 + 3:20]
' initial '
>>> myString[8 + 3:20].strip()
'initial'
Combined, you would do this:
startIndex = myString.index('the')
substring = myString[startIndex + 3 : myString.index('string', startIndex)].strip()
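The combined steps can be wrapped in a small helper. This is a sketch only: the function name between and its parameter names are hypothetical, and it searches for the end word after the start word rather than from the start word's position.

```python
def between(text, start_word, end_word):
    # Position just past the end of the first occurrence of start_word.
    start = text.index(start_word) + len(start_word)
    # First occurrence of end_word after that position.
    end = text.index(end_word, start)
    return text[start:end].strip()

result = between("this is the initial string", "the", "string")
print(result)  # initial
```

As with the code above, str.index raises a ValueError if either word is missing, so callers may want to catch that.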
If you want to look for matches multiple times, then you just need to repeat doing this while looking only at the rest of the string. Since str.index will only ever find the first match, you can use this to scan the string very efficiently:
searchString = 'this is the initial string but I added the relevant string pair a few more times into the search string.'
startWord = 'the'
endWord = 'string'
results = []
index = 0
while True:
    try:
        startIndex = searchString.index(startWord, index)
        endIndex = searchString.index(endWord, startIndex)
        results.append(searchString[startIndex + len(startWord):endIndex].strip())
        # move the index to the end
        index = endIndex + len(endWord)
    except ValueError:
        # str.index raises a ValueError if there is no match; in that
        # case we know that we’re done looking at the string, so we can
        # break out of the loop
        break
print(results)
# ['initial', 'relevant', 'search']
You can also try something like this:
mystring = "this is the initial string"
mystring = mystring.strip().split(" ")
for i in range(1, len(mystring) - 1):
    if mystring[i - 1] == "the" and mystring[i + 1] == "string":
        print(mystring[i])
I suggest using a combination of list, split and join methods.
This should help if you are looking for more than one word in the substring.
Split the string into a list of words:
words = string.split()
Get the index of your opening and closing markers then return the substring:
start = words.index('the')
end = words.index('string')
substring = ' '.join(words[start + 1:end])
You may want to add validity checks (for example, that both markers are present and in the right order) before proceeding.
If your problem gets more complex, e.g. multiple occurrences of the pair of markers, I suggest using a regular expression.
import re
substring = ''.join(re.findall(r'the (.+?) string', string))
re.findall itself returns the matches as a list, so drop the ''.join if you want the substrings kept separate.
I include the spaces around the captured group in the pattern so that the result has no leading or trailing spaces; you can modify this to suit your needs.
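For the case of multiple occurrences, here is a short sketch that keeps the matches as a list. The \b word boundaries and \s+ are additions of mine, assumed useful so that "the" does not match inside a longer word such as "other".

```python
import re

text = ('this is the initial string but I added the relevant string '
        'pair a few more times into the search string.')

# Non-greedy group between the whole words "the" and "string".
pattern = r'\bthe\s+(.+?)\s+string\b'

matches = re.findall(pattern, text)
print(matches)  # ['initial', 'relevant', 'search']

# A multi-word substring is captured with its inner spaces intact.
multi = re.findall(pattern, 'take the very first string')
print(multi)  # ['very first']
```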
That's the source code:
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            if str_one[i] == str_two[j]:
                str_one = (str_one - str_one[i]).split()
                print(str_one)
            else:
                print('There is no relation')

if __name__ == '__main__':
    str_one = input('Put your First String: ').split()
    str_two = input('Put your Second String: ')
    print(revers_e(str_one, str_two))
How can I remove a letter that occurs in both strings from the first string then print it?
How about a simple, Pythonic way of doing it?
def revers_e(s1, s2):
    print(*[i for i in s1 if i in s2])  # Print all characters to be deleted from s1
    s1 = ''.join([i for i in s1 if i not in s2])  # Delete them from s1
    return s1  # Return the result, since reassigning s1 does not change the caller's string
This answer says, "Python strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings."
First of all, you don't need the rather suboptimal approach of iterating over a string with range and len: strings are iterable, so you can loop over them directly with a simple for loop.
To find the intersection of two strings, you can use set.intersection, which returns all the characters common to both strings, and then use str.translate to remove those common characters:
intersect=set(str_one).intersection(str_two)
trans_table = dict.fromkeys(map(ord, intersect), None)
str_one.translate(trans_table)
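Since str.translate returns a new string rather than changing str_one, the result has to be kept. A self-contained sketch of the same idea, where the function name remove_common is hypothetical:

```python
def remove_common(str_one, str_two):
    # Characters that occur in both strings.
    common = set(str_one).intersection(str_two)
    # Map each common character's ordinal to None, which deletes it.
    table = dict.fromkeys(map(ord, common), None)
    return str_one.translate(table)

result = remove_common('abcdefg', 'efghijk')
print(result)  # abcd
```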
def revers_e(str_one,str_two):
    for i in range(len(str_one)):
        for j in range(len(str_two)):
            try:
                if str_one[i] == str_two[j]:
                    first_part=str_one[0:i]
                    second_part=str_one[i+1:]
                    str_one =first_part+second_part
                    print(str_one)
                else:
                    print('There is no relation')
            except IndexError:
                return

str_one = input('Put your First String: ')
str_two = input('Put your Second String: ')
revers_e(str_one, str_two)
I've modified your code, taking out a few bits and adding a few more.
str_one = input('Put your First String: ').split()
I removed the .split() because, for input without spaces, all it does is create a list of length 1, so in your loop you'd be comparing the entire first string to one letter of the second string.
str_one = (str_one - str_one[i]).split()
You can't remove a character from a string like this in Python, so I split the string into two parts (you could also convert it into a list, as I did in my other code, which I deleted): everything before the matching character, and everything after it. The two parts are then concatenated into one string.
I used exception handling because the outer loop uses the original length of str_one, but the string gets shorter as characters are removed, which could otherwise result in an IndexError.
Lastly, I call the function without wrapping it in print, because the function returns None, so printing the call would only print None.
The set and list-comprehension approaches work in Python 2.7+ and Python 3; str.maketrans as used here is Python 3 only.
Given:
>>> s1='abcdefg'
>>> s2='efghijk'
You can use a set:
>>> set(s1).intersection(s2)
{'f', 'e', 'g'}
Then use that set with str.maketrans to build a translation table that maps each common character to None, which deletes it:
>>> s1.translate(str.maketrans({e:None for e in set(s1).intersection(s2)}))
'abcd'
Or use a list comprehension to collect the common characters:
>>> ''.join([e for e in s1 if e in s2])
'efg'
And a regex with a character class to produce a new string without the common characters:
>>> import re
>>> re.sub('[' + re.escape(''.join(e for e in s1 if e in s2)) + ']', '', s1)
'abcd'
I'm trying to merge a cell array of strings into one string in MATLAB, separating the elements with newlines.
The following method merges the strings, but the final string contains a literal \n instead of newlines:
function str = toString(self)
    % some not important logic that creates cell array called strings
    % ...
    str = '';
    for i = 1 : 9
        str = strcat(str, strings(i), '\n');
    end
end
It returns: ' 111\n 111\n 111\n333666444555\n333666444555\n333666444555\n 222\n 222\n 222\n'
When I add str = sprintf(str); before the end of the method, it returns an Invalid format error. However, when I type sprintf(' 111\n 111\n 111\n333666444555\n333666444555\n333666444555\n 222\n 222\n 222\n'); into the MATLAB command window, it returns the formatted string without any errors.
Does anyone know what the problem could be? Why does it work in the command window but not in the .m file?
sprintf will loop over the elements of your cell array:
sprintf('%s\n', strings{:})
The problem with your loop is that '\n' is a two-element char array (a backslash followed by n), whereas what you want is the newline character produced by sprintf('\n').