Return emoji name instead of emoji - python-3.x

I have this: '1⃣' (without the single quotes) in Python 3, which is :one:. Is there a way to take an emoji like the one above and print its name (in this case :one:) instead of the emoji itself?
I'm getting the emoji from a discord.py reaction object.

In your case, that emoji is a two-character string. You can get the number by getting the first character of the string:
char = '1⃣'
print(char[0]) # 1
With another emoji that isn't just two characters, you can use the unicodedata module:
import unicodedata
char = '❤'
name = unicodedata.name(char)
print(name) # HEAVY BLACK HEART
For keycap emoji like this one, the name of the emote is the last word of the Unicode name of the first character (this also works for some other emoji, but not all):
import unicodedata
char = '1⃣'
name = unicodedata.name(char[0])
name = name.split(' ')[-1]
print(f':{name.lower()}:')
# :one:
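The two snippets above can be folded into one helper. This is only a sketch: emoji_to_label is a hypothetical name, and it assumes the label you want is the last word of the Unicode name of the first base character. That holds for keycap digits and the heart, but it will not match Discord's shortcode for every emoji.

```python
import unicodedata

# Code points that decorate a base character rather than name it.
VARIATION_SELECTOR_16 = "\ufe0f"
COMBINING_KEYCAP = "\u20e3"

def emoji_to_label(emoji):
    """Hypothetical helper: return ':lastword:' for the emoji's base character."""
    base_chars = [ch for ch in emoji
                  if ch not in (VARIATION_SELECTOR_16, COMBINING_KEYCAP)]
    if not base_chars:
        raise ValueError("no base character found")
    # unicodedata.name raises ValueError for unnamed code points;
    # passing a default gives the caller a predictable result instead.
    name = unicodedata.name(base_chars[0], "UNKNOWN")
    return ":" + name.split(" ")[-1].lower() + ":"

print(emoji_to_label("1\u20e3"))        # :one:
print(emoji_to_label("2\ufe0f\u20e3"))  # :two:
print(emoji_to_label("\u2764"))         # :heart:
```

Filtering out the variation selector matters because some clients send the keycap as three code points rather than two.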

Related

Get number from string in Python

I have a string, I have to get digits only from that string.
url = "www.mylocalurl.com/edit/1987"
Now from that string, I need to get 1987 only.
I have been trying this approach,
id = [int(i) for i in url.split() if i.isdigit()]
But I am getting [] list only.
You can use regex and get the digit alone in the list.
import re
url = "www.mylocalurl.com/edit/1987"
digit = re.findall(r'\d+', url)
output:
['1987']
Replace all non-digits with blank (effectively "deleting" them):
import re
url = "www.mylocalurl.com/edit/1987"
num = re.sub(r'\D', '', url)  # '1987'
You aren't getting anything because, by default, .split() splits a string on whitespace. Since your URL contains no spaces, nothing is split, and the single resulting piece is not all digits. Instead, you can capture the digits with a regular expression. For example:
import re
url = "www.mylocalurl.com/edit/1987"
regex = r'(\d+)'
numbers = re.search(regex, url)
captured = numbers.groups()[0]
If you do not know what regular expressions are, here is what the code does: the pattern r'(\d+)' captures a run of one or more digits, and re.search() scans the url for the first such run. captured then holds the first captured group, which is '1987'.
If you don't want to use regex, you can keep your .split() approach but pass / as the separator, for example url.split('/').
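As a rough illustration of the split-on-slash route, the sketch below takes the last path segment and converts it only if it is numeric. The helper name trailing_id is made up for this example, and it assumes the number is always the final segment of the URL.

```python
def trailing_id(url):
    """Hypothetical helper: return the last path segment as an int, or None."""
    # rstrip("/") tolerates a trailing slash such as ".../edit/1987/".
    last_segment = url.rstrip("/").split("/")[-1]
    return int(last_segment) if last_segment.isdecimal() else None

print(trailing_id("www.mylocalurl.com/edit/1987"))  # 1987
print(trailing_id("www.mylocalurl.com/edit/"))      # None
```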

How to extract text between specific letters from a string in Python(3.9)?

How can I pull a value out of a string in Python when it sits between two known pieces of text?
e.g.
"Kahoot : ID:1234567 Name:RandomUSERNAME"
I want to get the 1234567 and the RandomUSERNAME into 2 different variables.
The rule I came up with is: take everything after "ID:" up to the next space, and everything after "Name:" up to the end of the text.
How do I code this?
If I haven't explained this clearly, tell me; I don't know how to ask/format this question! Sorry for any inconvenience!
If the text always follows the same format you could just split the string. Alternatively, you could use regular expressions using the re library.
Using split:
string = "Kahoot : ID:1234567 Name:RandomUSERNAME"
string = string.split(" ")
id = string[2][3:]
name = string[3][5:]
print(id)
print(name)
Using re:
import re
string = "Kahoot : ID:1234567 Name:RandomUSERNAME"
id = re.search(r'(?<=ID:).*?(?=\s)', string).group(0)
name = re.search(r'(?<=Name:).*', string).group(0)
print(id)
print(name)
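Both fields can also be read in one pass with named groups, which lets you handle a non-matching string without an AttributeError. This is a sketch that assumes the "ID:... Name:..." layout from the question; parse_player and PATTERN are illustrative names.

```python
import re

# One pattern with named groups; \S+ stops at the first whitespace after "ID:".
PATTERN = re.compile(r"ID:(?P<id>\S+)\s+Name:(?P<name>.+)")

def parse_player(text):
    """Hypothetical helper: return (id, name), or None if the text does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return match.group("id"), match.group("name")

print(parse_player("Kahoot : ID:1234567 Name:RandomUSERNAME"))  # ('1234567', 'RandomUSERNAME')
print(parse_player("no fields here"))                           # None
```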

Count up misspelled words in a sentence of variable size

As part of a large project, I need a function that checks for misspelt words in a sentence. The sentence can be one word, 30 words, or any size really.
It needs to be fast. If possible I would like to use TextBlob or pyspellchecker, as python_language_tool has problems installing on my computer.
My code so far (non-working):
def spell2():
    from textblob import TextBlob
    count = 0
    sentence = "Tish soulhd al be corrrectt"
    split_sen = sentence.split(" ")
    for thing in split_sen:
        thing = Word(thing)
        thing.spellcheck()
        # if thing is not spelt correctly add to count, if it is go to
        # next word
spell2()
This gives me the following error:
thing = Word(thing)
NameError: name 'Word' is not defined
Any suggestions appreciated:)
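The NameError occurs because only TextBlob is imported, not Word. With pyspellchecker you can count the unknown words directly:
def spell3():
    from spellchecker import SpellChecker
    s = "Tish soulhd al be corrrectt, riiiigghtttt?"
    wordlist = s.split()
    spell = SpellChecker()
    # unknown() returns the set of words not found in the dictionary
    amount_miss = len(spell.unknown(wordlist))
    print("Possible amount of misspelled words in the text:", amount_miss)
spell3()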
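The counting logic itself is independent of the library. The sketch below shows the loop the question was aiming for, using a tiny made-up word set (KNOWN_WORDS) in place of a real dictionary; a spell-checking library would supply that lookup in practice. It also strips punctuation first, so "corrrectt," is checked without its comma.

```python
import re

# Hypothetical stand-in for a real dictionary; a spell-checking library
# would supply this lookup instead.
KNOWN_WORDS = {"this", "should", "all", "be", "correct", "right"}

def count_misspelled(sentence, known_words=KNOWN_WORDS):
    """Count the words in `sentence` that are not in `known_words`."""
    # Keep letters and apostrophes only, so "corrrectt," loses its comma.
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(1 for word in words if word not in known_words)

print(count_misspelled("Tish soulhd al be corrrectt, riiiigghtttt?"))  # 5
print(count_misspelled("This should all be correct"))                  # 0
print(count_misspelled(""))                                            # 0
```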

Getting a value error: invalid literal for int() with base 10: '56,990'

So I am trying to scrape a website containing the price of a laptop. However, the price is a string, and for comparison purposes I need to convert it to int. But when I do so I get a ValueError: invalid literal for int() with base 10: '56,990'
Below is the code:
from bs4 import BeautifulSoup
import requests
r = requests.get("https://www.flipkart.com/apple-macbook-air-core-i5-5th-gen-8-gb-128-gb-ssd-mac-os-sierra-mqd32hn-a-a1466/p/itmevcpqqhf6azn3?pid=COMEVCPQBXBDFJ8C&srno=s_1_1&otracker=search&lid=LSTCOMEVCPQBXBDFJ8C5XWYJP&fm=SEARCH&iid=2899998f-8606-4b81-a303-46fd62a7882b.COMEVCPQBXBDFJ8C.SEARCH&qH=9e3635d7234e9051")
data = r.text
soup = BeautifulSoup(data,"lxml")
data=soup.find('div',{"class":"_1vC4OE _37U4_g"})
cost=(data.text[1:].strip())
print(int(cost))
PS: I used text[1:] to remove the currency character.
I get the error on the last line. Basically I need to get the int value of the cost.
The value has a comma in it, so you need to replace the comma with an empty string before converting it to an integer.
print(int(cost.replace(',','')))
Python's int() does not understand , as a group separator, so you'll need to remove the commas. Try:
cost = data.text[1:].strip().translate(str.maketrans('', '', ','))  # Python 3; translate(None, ',') only works on Python 2 byte strings
Rather than invent a new solution for every character you don't want (strip() function for whitespace, [1:] index for the currency, something else for the digit separator) consider a single solution to gather what you do want:
>>> import re
>>> text = "\u20B956,990\n"
>>> cost = re.sub(r"\D", "", text)
>>> print(int(cost))
56990
The re.sub() replaces anything that isn't a digit with nothing.
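One caveat: \D also deletes a decimal point, so a price such as "1,299.50" would become 129950. If fractional prices can appear, a variant along these lines keeps the dot. It is only a sketch: parse_price is a hypothetical name, and it assumes that "," is always a group separator and that "." appears only as the decimal point.

```python
import re
from decimal import Decimal

def parse_price(text):
    """Hypothetical helper: strip everything except digits and '.', then parse."""
    cleaned = re.sub(r"[^0-9.]", "", text)
    return Decimal(cleaned)

print(int(parse_price("\u20b956,990\n")))  # 56990
print(parse_price("\u20b91,299.50"))       # 1299.50
```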

String manipulations using Python Pandas

I have some name and ethnicity data, for example:
John Wick English
Black Widow French
I then do a bit of manipulation to make the name as below
John Wick -> john#wick??????????????????????????????????
Black Widow -> black#widow????????????????????????????????
I then create multiple variables through the for loop, each containing one of the 3-character substrings.
I also try to count the alphabetic characters using re.findall.
I have two questions:
1) Is the for loop efficient? Can I replace it with better code even though it is working as is?
2) I can't get the code that counts the alphabetic characters to work. Any suggestions?
import pandas as pd
from pandas import DataFrame
import re
# Get csv file into data frame
data = pd.read_csv(r"C:\Users\KubiK\Desktop\OddNames_sampleData.csv")  # raw string so the backslashes are not treated as escapes
frame = DataFrame(data)
frame.columns = ["name", "ethnicity"]
name = frame.name
ethnicity = frame.ethnicity
# Remove missing ethnicity data cases
index_missEthnic = frame.ethnicity.isnull()
index_missName = frame.name.isnull()
frame2 = frame.loc[~index_missEthnic, :]
frame3 = frame2.loc[~index_missName, :]
# Make all letters into lowercase
frame3.loc[:, "name"] = frame3["name"].str.lower()
frame3.loc[:, "ethnicity"] = frame3["ethnicity"].str.lower()
# Remove all non-alphabetical characters in Name
frame3.loc[:, "name"] = frame3["name"].str.replace(r'[^a-zA-Z\s\-]', '') # Retain space and hyphen
# Replace empty space as "#"
frame3.loc[:, "name"] = frame3["name"].str.replace('[\s]', '#')
# Find the longest name in the dataset
##frame3["name_length"] = frame3["name"].str.len()
##nameLength = frame3.name_length
##print nameLength.max() # Longest name has !!!40 characters!!! including spaces and hyphens
# Add "?" to fill spaces up to 43 characters
frame3["name_filled"] = frame3["name"].str.pad(side="right", width=43, fillchar="?")
# Split into three-character strings
for i in range(1, 41):
    substr = "substr" + str(i)
    frame3[substr] = frame3["name_filled"].str[i-1:i+2]
# Count number of characters
frame3["name_len"] = len(re.findall('[a-zA-Z]', name))
# Test outputs
print(frame3)
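1) Regarding the loop, I can't think of a better way than what you're already doing.
2) re.findall expects a single string, but name is a whole Series, so apply the count to each value instead. Try frame3["name_len"] = frame3["name"].map(lambda x : len(re.findall('[a-zA-Z]', x)))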
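To make the per-row work concrete, here is a plain-Python sketch (no pandas) of what is computed for a single name: the letter count from the lambda, and the 40 sliding 3-character slices from the loop. The function names are invented for illustration; the width of 43 and count of 40 are taken from the question's code.

```python
import re

def count_letters(name):
    """Count ASCII letters in one name, as the lambda does for each row."""
    return len(re.findall(r"[a-zA-Z]", name))

def three_char_windows(name, width=43, count=40):
    """Pad `name` with '?' to `width`, then return `count` sliding 3-char slices."""
    padded = name.ljust(width, "?")
    return [padded[i:i + 3] for i in range(count)]

windows = three_char_windows("john#wick")
print(count_letters("john#wick"))  # 8
print(windows[:3])                 # ['joh', 'ohn', 'hn#']
print(len(windows), windows[-1])   # 40 ???
```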
