How to strip whitespace from element in list - string

I read over a file, scraped all the artist names from it, and put them all in a list. I'm trying to pull out just one artist from the list, then remove the spaces and shuffle all the letters (word scrabble).
import random
artist_names = []  # filled with the names scraped from the file
rand_artist = artist_names[random.randrange(len(artist_names))]  # picks a random artist from the list
print(rand_artist)
However, when I print rand_artist, I sometimes get an artist with, let's say, 2 or 3 words, such as "A Northern Chorus" or "The Beatles". I would like to remove the whitespace between the words and then shuffle the letters.

First, replace the whitespace with empty strings. Then turn the string into a list of characters. Since I guess you want the letters to be lowercase, I included that as well.
import random
s = "A Northern Chorus".replace(' ','').lower()
l=list(s)
random.shuffle(l)
print(l)
Also, you can use random.choice(artist_names) instead of randrange().
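Putting both suggestions together, a minimal sketch could look like the one below. The helper name scramble_artist and the two sample names are assumptions for illustration; the real artist_names list would come from the scraped file.

```python
import random

def scramble_artist(name):
    """Remove all whitespace from name, lowercase it, and shuffle its letters."""
    # split() with no arguments drops spaces, tabs and newlines alike
    letters = list("".join(name.split()).lower())
    random.shuffle(letters)  # shuffles the list in place
    return "".join(letters)

# Sample data standing in for the names scraped from the file (assumption).
artist_names = ["A Northern Chorus", "The Beatles"]
rand_artist = random.choice(artist_names)
print(rand_artist, "->", scramble_artist(rand_artist))
```

Joining the shuffled list back into a string gives a single scrambled word instead of a list of characters.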

Related

How to print a specific string containing a particular word - Python

I want to print out the entire string if it contains a particular word. For example:
a = ['www.facebook.com/xyz','www.google.com/xyz','www.amazon.com/xyz','www.instagram.com/xyz']
If I am looking for the word amazon, then the code should print www.amazon.com/xyz
I have found many examples in which you can find out if a string contains a word but I need to print out the entire string which contains the word.
Try this -
your_list = ['www.facebook.com/xyz', 'www.google.com/xyz', 'www.amazon.com/xyz', 'www.instagram.com/xyz']
word = 'amazon'
res = [x for x in your_list if word in x]
print(*res)
Output:
www.amazon.com/xyz
This works fine if only one or two strings contain the word, but if multiple strings in the list contain it, they are all printed on one horizontal line.
Each match needs to be printed on its own line, but I do not know how to incorporate this in the code. It would be interesting to see how that looks.
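One way to get each match on its own line is to pass sep='\n' to print. The sketch below reuses the list from the answer; the helper name find_matches and the search word 'am' (chosen only because it matches two entries) are assumptions for illustration.

```python
def find_matches(strings, word):
    """Return every string that contains word, in the original order."""
    return [s for s in strings if word in s]

your_list = ['www.facebook.com/xyz', 'www.google.com/xyz',
             'www.amazon.com/xyz', 'www.instagram.com/xyz']

res = find_matches(your_list, 'am')
# sep='\n' puts a newline between the unpacked items instead of a space
print(*res, sep='\n')
```

An equivalent alternative is print('\n'.join(res)), or a plain for loop that prints one match per iteration.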

Extracting acronyms from each string in list index

I have a list of strings (the other posts only dealt with single words or ints) that is imported from a file. I am having trouble using nested loops to split the words at each index into their own list and then take the first letter of each word to create acronyms.
I have tried picking apart each index and processing it through another loop to get the first letter of each word, but the closest I got was pulling the first letter of each index at the outer level.
text = (infile.read()).splitlines()
acronym = []
separator = "."
for i in range(len(text)):
    substring = [text[i]]
    for j in range(len(substring)):
        substring2 = [substring[j][:1]]
        acronym.append(substring2)
print("The Acronym is: ", separator.join(acronym))
Happy Path: The list of multi-word strings will be translated into acronyms that are listed with line breaks.
Example of what should be output at the end: D.O.D. \n N.S.A. \n etc.
What's happened so far: Before I had gotten it to take the first letter of the first word of every index at the sentence level but I haven't figured out how to nest these loops to get to the single words of each index.
Useful knowledge: THE BEGINNING FORMAT AFTER SPLITLINES (Since people couldn't read this) is a list with indexes with syntax like this: ['Department of Defense', 'National Security Agency', ...]
What you have is kind of a mess. If you are going to be re-using code, it is often better to just make it into a function. Try this out.
def get_acronym(the_string):
    words = the_string.split(" ")
    return_string = ""
    for word in words:
        return_string += word[0]
    return return_string

text = ['Department of Defense', 'National Security Agency']
for agency in text:
    print("The acronym is: " + get_acronym(agency))
I figured out how to do it from a file. File format was like this:
['This is Foo', 'Coming from Bar', 'Bring Your Own Device', 'Department of Defense']
So if this also helps anyone, enjoy~
infile = open(iname, 'r')
text = (infile.read()).splitlines()
print("The Strings To Become Acronyms Are As Followed: \n", text, "\n")
acronyms = []
for string in text:
    words = string.split()
    letters = [word[0] for word in words]
    acronyms.append(".".join(letters).upper())
print("The Acronyms For These Strings Are: \n",acronyms)
This code outputs like this:
The Strings To Become Acronyms Are As Followed:
['This is Foo', 'Coming from Bar', 'Bring Your Own Device', 'Department of Defense']
The Acronyms For These Strings Are:
['T.I.F', 'C.F.B', 'B.Y.O.D', 'D.O.D']
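The output above has no period after the last letter, while the format asked for in the question was D.O.D. with one acronym per line. A minimal sketch of that variant follows; the function name dotted_acronym is an assumption for illustration.

```python
def dotted_acronym(phrase):
    """Return the upper-cased first letter of every word, each followed by a period."""
    # split() with no arguments ignores repeated spaces, so word[0] is always safe
    return "".join(word[0].upper() + "." for word in phrase.split())

text = ['Department of Defense', 'National Security Agency']
# One acronym per line, as requested in the question
print("\n".join(dotted_acronym(line) for line in text))
```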

How do I select a random word from a list which was made by importing a .txt file?

So I have a .txt file which contains the name, abbreviation, nickname, and capital of each state in the United States per line.
I need to print the names of 5 random states on different lines, one below the other.
The text file looks like -
Alabama,AL,Cotton State,Montgomery
Alaska,AK,The Last Frontier,Juneau
Arizona,AZ,Grand Canyon State,Phoenix
Arkansas,AR,Land of Opportunity,Little Rock
and so on...
I imported the file and put it into a list by splitting at the comma. I have a list of all the names in it. But when I run random.choice(data2) instead of getting a random word, I get a random letter. This means that the list is made of each letter as an element and not each word as an element.
My code -
import random
inflie=open("F:\\SKKU\\study\\ISS3178 - Python\\11\\StatesANC.txt","r")
for line in inflie:
    data1=line.split(",")
    data2=data1[0]
print(random.choice(data2))
I expect a random state name but what I get is a random letter from one of the state names.
You need to append each result to a list and then call random.choice on that list.
Ex:
import random
data2 = []
inflie=open("F:\\SKKU\\study\\ISS3178 - Python\\11\\StatesANC.txt","r")
for line in inflie:
    data1=line.split(",")
    data2.append(data1[0]) #Append state
print(random.choice(data2))
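The question asked for 5 random states, one per line. A sketch of that using random.sample, which picks distinct items, is shown below. The in-memory sample_file and the helper name state_names are assumptions standing in for the real StatesANC.txt; any open file object could be passed the same way.

```python
import io
import random

def state_names(lines):
    """Collect the first comma-separated field of every non-empty line."""
    return [line.split(",")[0] for line in lines if line.strip()]

# In-memory stand-in for the real file (assumption, sample rows only).
sample_file = io.StringIO(
    "Alabama,AL,Cotton State,Montgomery\n"
    "Alaska,AK,The Last Frontier,Juneau\n"
    "Arizona,AZ,Grand Canyon State,Phoenix\n"
    "Arkansas,AR,Land of Opportunity,Little Rock\n"
    "California,CA,Golden State,Sacramento\n"
    "Colorado,CO,Centennial State,Denver\n"
)

names = state_names(sample_file)
# random.sample returns 5 different names; random.choice could repeat one
for name in random.sample(names, 5):
    print(name)
```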

How to make tokenize not treat contractions and their counterparts as the same when comparing two text files?

I am currently working on a data structure that is supposed to compare two text files and make a list of the strings they have in common. My program receives the contents of the two files as two strings, a and b (one file's contents per variable). I then use the tokenize function in a for loop to split each string into sentences. The sentences are stored in a set to avoid duplicate entries, so all duplicate lines within each variable are removed before the comparison. I then compare the two variables against each other and keep only the strings they have in common.
I have a bug in this last part, where the two are compared against each other. The program treats contractions and their full counterparts as the same when it should not. For example, it reads "Should not" and "Shouldn't" as the same and produces an incorrect answer. I want it to stop treating contractions and their counterparts as the same.
import nltk
def sentences(a, b):  # the variables store the contents of the files in the form of strings
    a_placeholder = a
    set_a = set()
    a = []
    for punctuation_a in nltk.sent_tokenize(a_placeholder):
        if punctuation_a not in set_a:
            set_a.add(punctuation_a)
            a.append(punctuation_a)
    b_placeholder = b
    set_b = set()
    b = []
    for punctuation_b in nltk.sent_tokenize(b_placeholder):
        if punctuation_b not in set_b:
            set_b.add(punctuation_b)
            b.append(punctuation_b)
    a_new = a
    for punctuation in a_new:
        if punctuation not in set_b:
            set_a.remove(punctuation)
            a.remove(punctuation)
        else:
            pass
    return []
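Two things stand out in the code as posted: it ends with return [], so the caller always receives an empty list, and it removes items from a while looping over that same list, which can skip elements. The sketch below shows only the comparison step under the assumption that both inputs have already been split into sentences (for example with nltk.sent_tokenize); the function name common_sentences and the sample sentences are illustrative, not part of the original code.

```python
def common_sentences(sentences_a, sentences_b):
    """Return the sentences of sentences_a that also occur in sentences_b,
    without duplicates and in their original order."""
    in_b = set(sentences_b)
    # dict.fromkeys de-duplicates while keeping first-seen order
    unique_a = dict.fromkeys(sentences_a)
    # Build a new list instead of removing items from the list being iterated
    return [s for s in unique_a if s in in_b]

a = ["I should not go.", "It is late.", "It is late."]
b = ["I shouldn't go.", "It is late."]
print(common_sentences(a, b))
```

Because the comparison is an exact string match, "I should not go." and "I shouldn't go." are kept apart in this sketch.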

How to decode a text file by extracting alphabet characters and listing them into a message?

So we were given an assignment to write code that would sort through a long message filled with special characters (e.g. [, {, %, $, *), with only a few alphabet characters scattered throughout, to reveal a special message.
I've been searching on this site for a while and haven't found anything specific enough that would work.
I put the text file into a pastebin if you want to see it
https://pastebin.com/48BTWB3B
Anywho, this is what I've come up with for code so far
code = open('code.txt', 'r')
lettersList = code.readlines()
lettersList.sort()
for letters in lettersList:
    print(letters)
It prints the code.txt out but into short lists, essentially cutting it into smaller pieces. I want it to find and sort out the alphabet characters into a list and print the decoded message.
This is something you can do pretty easily with regex.
import re
with open('code.txt', 'r') as filehandle:
    contents = filehandle.read()
letters = re.findall("[a-zA-Z]+", contents)
If you want to condense the list into a single string, you can use a join:
single_str = ''.join(letters)
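As a quick check without the actual code.txt, the same two steps can be wrapped in a function and run on a short made-up string. The name decode and the sample text are assumptions for illustration.

```python
import re

def decode(contents):
    """Keep only the ASCII letters of contents, in their original order."""
    letters = re.findall("[a-zA-Z]+", contents)  # each match is a run of letters
    return "".join(letters)

# Made-up sample standing in for the contents of code.txt
sample = "[%h{e$*l]]l{o%"
print(decode(sample))
```

Note that the letters keep their original order here; calling sort() as in the question would rearrange the lines and scramble the message.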
