Replacing spaces in lists - python-3.x

I'm creating a Google searcher in Python. Is there any way that I can replace each space in a list with a "+" for my URL? This is my code so far:
q=input("Question=")
qlist=list(q)
#print(qlist)
Can I replace any spaces in my list with a plus, and then turn that back into a string?

Just to add another line of thought: try the urllib library, which can build URL query strings for you.
Here's an example:
import urllib.parse
## Create an empty dictionary to hold values (for questions and answers).
data = dict()
## Sample input (named "question" so it does not shadow the built-in input())
question = 'This is my question'
### Data key can be 'Question='; note that the "=" in the key gets percent-encoded as %3D
data['Question='] = question
### We'll pass that dictionary through the urlencode method
url_values = urllib.parse.urlencode(data)
### And print results
print(url_values)
#-------------------------------------------------------------------------------------------------------
#-------------------------------------------------------------------------------------------------------
# Alternatively, you can set up the dictionary more concisely if you only have a couple of key-value pairs
## Input
question = 'This is my question'
# Our dictionary; we can set the input value as the value of the Question key
data = {
    'Question=': question
}
print(urllib.parse.urlencode(data))
Output:
Question%3D=This+is+my+question
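As a rough sketch of how this could feed the search URL from the question: the base URL and the "q" parameter name below are assumptions made for illustration, and build_search_url is a hypothetical helper, not something given in the question.

```python
import urllib.parse

# Assumed base URL and parameter name, for illustration only
BASE_URL = "https://www.google.com/search"


def build_search_url(question):
    """Return BASE_URL with the question encoded as the q parameter."""
    # urlencode turns spaces into "+" and percent-encodes reserved characters
    query = urllib.parse.urlencode({"q": question})
    return BASE_URL + "?" + query


url = build_search_url("This is my question")
print(url)
```

Using a key without a trailing "=" avoids the %3D seen in the output above, because urlencode inserts the "=" between key and value itself.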

You can split the string on spaces and then join the pieces back together with "+" to create one long string.
qlist = q.split(" ")
result = "+".join(qlist)
print("Output string: {}".format(result))

Look at the split and join operations in Python.
q = 'dog cat'
list_info = q.split()  # ['dog', 'cat']
https://docs.python.org/3/library/stdtypes.html#str.split
q = ['dog', 'cat']
s_info = ''.join(q)  # 'dogcat'; use '+'.join(q) to get 'dog+cat'
https://docs.python.org/3/library/stdtypes.html#str.join
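To tie the answers back to the original question, here is a minimal sketch of three ways to get a "+"-separated string. The sample input is made up, and urllib.parse.quote_plus is an alternative that the answers above do not mention by name.

```python
import urllib.parse

q = "how do I replace spaces"  # stands in for input("Question=")

# 1) Replace every space directly; no list is needed
replaced = q.replace(" ", "+")

# 2) Split on whitespace, then join with "+" (this collapses repeated spaces)
joined = "+".join(q.split())

# 3) quote_plus turns spaces into "+" and percent-encodes reserved characters
quoted = urllib.parse.quote_plus(q)

print(replaced, joined, quoted, sep="\n")
```

For plain words all three give the same result; they differ only when the input has repeated spaces or characters such as "&" or "?".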

Related

How to insert variable length list into string

I have what I think is a basic question in Python:
I have a list that can be variable in length and I need to insert it into a string for later use.
Formatting is simple, I just need a comma between each name up to nameN and parenthesis surrounding the names.
List = ['name1', 'name2' .... 'nameN']
string = "Their Names are <(name1 ... nameN)> and they like candy."
Example:
List = ['tom', 'jerry', 'katie']
print(string)
Their Names are (tom, jerry, katie) and they like candy.
Any ideas on this? Thanks for the help!
# Create a comma-separated string with names
the_names = ', '.join(List) # 'tom, jerry, katie'
# Interpolate it into the "main" string
string = f"Their Names are ({the_names}) and they like candy."
There are numerous ways to achieve that.
You could use print + format + join, similar to the example from @ForceBru.
Using format would make it compatible with both Python2 and Python3.
names_list = ['tom', 'jerry', 'katie']
"""
Convert the list into a string with .join (in this case we are separating with commas)
"""
names_string = ', '.join(names_list)
# names_string == "tom, jerry, katie"
# Now add one string inside the other:
string = "Their Names are ({}) and they like candy.".format(names_string)
print(string)
>> Their Names are (tom, jerry, katie) and they like candy.

How do I print out results on a separate line after converting them from a set to a string?

I am currently trying to compare two text files to see whether they have any words in common.
The text files are as follows:
ENGLISH.TXT
circle
table
year
competition
FRENCH.TXT
bien
competition
merci
air
table
My current code gets them to print. I've removed all the unnecessary curly brackets and so on, but I can't get them to print on different lines.
List = open("english.txt").readlines()
List2 = open("french.txt").readlines()
anb = set(List) & set(List2)
anb = str(anb)
anb = (str(anb)[1:-1])
anb = anb.replace("'","")
anb = anb.replace(",","")
anb = anb.replace('\\n',"")
print(anb)
The output is expected to separate both results onto new lines.
Currently Happening:
Competition Table
Expected:
Competition
Table
Thanks in advance!
- Xphoon
Hi, I'd suggest trying two things as good practice:
1) Use "with" for opening files
with open('english.txt', 'r') as englishfile, open('french.txt', 'r') as frenchfile:
    pass  # your Python operations on the files go here
2) Try to use f-strings if you're using Python 3.6 or later:
print(f"Hello\nWorld!")
File read using "open()" vs "with open()"
This post explains very well why to use the "with" statement :)
In addition, if you want to print variables with f-strings, do it like this:
print(f"{variable[index]}\n{variable2[index2]}")
If those elements hold "Hello" and "World!", this prints them on separate lines.
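A minimal runnable version of that f-string idea, with a made-up list and indices standing in for variable, index, and index2:

```python
words = ["Hello", "World!"]  # hypothetical data for illustration
index, index2 = 0, 1

# Each {...} is replaced by the value of the expression inside it;
# the \n between them starts a new line.
message = f"{words[index]}\n{words[index2]}"
print(message)
```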
Here is one solution including converting between sets and lists:
with open('english.txt', 'r') as englishfile, open('french.txt', 'r') as frenchfile:
    english_words = englishfile.readlines()
    english_words = [word.strip('\n') for word in english_words]
    french_words = frenchfile.readlines()
    french_words = [word.strip('\n') for word in french_words]
anb = set(english_words) & set(french_words)
anb_list = [item for item in anb]
for item in anb_list:
    print(item)
Here is another solution by keeping the words in lists:
with open('english.txt', 'r') as englishfile, open('french.txt', 'r') as frenchfile:
    english_words = englishfile.readlines()
    english_words = [word.strip('\n') for word in english_words]
    french_words = frenchfile.readlines()
    french_words = [word.strip('\n') for word in french_words]
for english_word in english_words:
    for french_word in french_words:
        if english_word == french_word:
            print(english_word)
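The set-based approach can also be wrapped in a small function and printed with "\n".join. This is only a sketch: common_words is a hypothetical helper, and the demo writes the two word lists from the question into temporary files so that it runs on its own.

```python
import os
import tempfile


def common_words(path_a, path_b):
    """Return the words present in both files, sorted alphabetically."""
    with open(path_a) as file_a, open(path_b) as file_b:
        # strip() removes the trailing newline; blank lines are skipped
        words_a = {line.strip() for line in file_a if line.strip()}
        words_b = {line.strip() for line in file_b if line.strip()}
    return sorted(words_a & words_b)


# Demo: recreate the two files from the question in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    english_path = os.path.join(tmp, "english.txt")
    french_path = os.path.join(tmp, "french.txt")
    with open(english_path, "w") as f:
        f.write("circle\ntable\nyear\ncompetition\n")
    with open(french_path, "w") as f:
        f.write("bien\ncompetition\nmerci\nair\ntable\n")
    shared = common_words(english_path, french_path)

# One word per line
print("\n".join(shared))
```

Sorting is optional; it is there because sets have no defined order, so the printed order would otherwise vary between runs.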

Extract characters within certain symbols

I have extracted text from an HTML file, and have the whole thing in a string.
I am looking for a method to loop through the string, and extract only values that are within square brackets and put strings in a list.
I have looked into several questions, among them this one: Extract character before and after "/"
But I am having a hard time modifying it. Can someone help?
Solved!
Thank you for all your input; I will definitely look more into regex. I managed to do what I wanted in a pretty manual way (it may not be beautiful):
import html2text
# html_file is the iterable of HTML chunks read earlier (not shown here)
html_string = ''
clean_string = ''
# remove all HTML code and append the plain text to one string
for i in html_file:
    html_string += str(html2text.html2text(i))
# this flag is True while the current character is inside [ ... ]
add = False
# extract only values within [ and ], based on add = True/False
for i in html_string:
    if i == '[':
        add = True
    if i == ']':
        add = False
        clean_string += str(i)
    if add == True:
        clean_string += str(i)
# strip the outermost brackets, then split into a list without square brackets
clean_string_list = clean_string.strip('[]').split('][')
The HTML file that I want as pure text (and later as a dataframe) instead of HTML is my personal Facebook data, which I have downloaded.
Try out this regex, given a string it will place all text inside [ ] into a list.
import re
print(re.findall(r'\[(\w+)\]','spam[eggs][hello]'))
>>> ['eggs', 'hello']
Also this is a great reference for building your own regex.
https://regex101.com
EDIT: If you have nested square brackets here is a function that will handle that case.
import re
test ='spam[eg[nested]gs][hello]'
def square_bracket_text(test_text,found):
    """Find text enclosed in square brackets within a string"""
    matches = re.findall(r'\[(\w+)\]',test_text)
    if matches:
        found.extend(matches)
        for word in found:
            test_text = test_text.replace('[' + word + ']','')
        square_bracket_text(test_text,found)
    return found
match = []
print(square_bracket_text(test,match))
>>>['nested', 'hello', 'eggs']
Hope it helps!
You can also use re.finditer() for this; see the example below.
Suppose we have word characters inside brackets, so the regular expression will be \[\w+\].
If you wish, check it at https://rextester.com/XEMOU85362.
import re
s = "<h1>Hello [Programmer], you are [Excellent]</h1>"
# Raw string so the backslashes reach the regex engine unchanged
g = re.finditer(r"\[\w+\]", s)
l = list() # or, l = []
for m in g:
    text = m.group(0)
    # Drop the surrounding square brackets
    l.append(text[1: -1])
print(l) # ['Programmer', 'Excellent']
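The patterns above use \w+, which only matches letters, digits, and underscores. If the bracketed values may contain spaces or punctuation, one possible variation is a character class that matches anything except brackets. The function name and the sample string here are made up for illustration.

```python
import re


def extract_bracketed(text):
    """Return the contents of every non-nested [...] pair in text."""
    # [^\[\]]* matches any run of characters that are not square brackets
    return re.findall(r"\[([^\[\]]*)\]", text)


sample = "<h1>Hello [Dear Programmer], you are [Excellent!]</h1>"
print(extract_bracketed(sample))  # ['Dear Programmer', 'Excellent!']
```

Like the \w+ version, this only captures the innermost pair when brackets are nested.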

String items in list: how to remove certain keywords?

I have a set of links that looks like the following:
links = ['http://www.website.com/category/subcategory/1',
'http://www.website.com/category/subcategory/2',
'http://www.website.com/category/subcategory/3',...]
I want to extract the 1, 2, 3, and so on from this list, and store the extracted data in subcategory_explicit. They're stored as str, and I'm having trouble getting at them with the following code:
subcategory_explicit = [cat.get('subcategory') for cat in links if cat.get('subcategory') is not None]
Do I have to change my data type from str to something else? What would be a better way to obtain and store the extracted values?
subcategory_explicit = [i[i.find('subcategory'):] for i in links if 'subcategory' in i]
This uses a substring via slicing, starting at the "s" in "subcategory" until the end of the string. By adding len('subcategory') to the value from find, you can exclude "subcategory" and get "/#" (where # is whatever number).
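A short sketch of that adjustment, using the three links from the question. The marker variable is a name introduced here, and the rsplit variant is an alternative that the answer does not mention.

```python
links = [
    'http://www.website.com/category/subcategory/1',
    'http://www.website.com/category/subcategory/2',
    'http://www.website.com/category/subcategory/3',
]

# Including the trailing slash in the marker skips past "subcategory/"
marker = 'subcategory/'
subcategory_explicit = [
    link[link.find(marker) + len(marker):]
    for link in links
    if marker in link
]
print(subcategory_explicit)  # ['1', '2', '3']

# Alternative: take whatever follows the last "/" in each link
last_segments = [link.rsplit('/', 1)[-1] for link in links]
print(last_segments)  # ['1', '2', '3']
```

The values stay as str; wrap them in int(...) if numbers are needed.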
Try this (using the re module):
import re
links = [
    'http://www.website.com/category/subcategory/1',
    'http://www.website.com/category/subcategory/2',
    'http://www.website.com/category/subcategory/3']
d = "|".join(links)
# 'http://www.website.com/category/subcategory/1|http://www.website.com/category/subcategory/2|http://www.website.com/category/subcategory/3'
# Raw string so \w and \d reach the regex engine unchanged
pattern = re.compile(r"/category/(?P<category_name>\w+)/\d+", re.I)
# findall returns the contents of the single capturing group for each match
subcategory_explicit = pattern.findall(d)
print(subcategory_explicit)  # ['subcategory', 'subcategory', 'subcategory']

How to hash only value to right of "=" sign (or any other delimiter) and output new text file

I output a text file in a certain format, based on user input and a CSV file (with help from user suggestions). Now I want to hash only the value to the right of the "=" sign and output a new text file with the same format but with hashed values on the right. Here is the code suggested to me, with some of my modifications, which worked for the first part:
import csv
device = input("Enter the device name: ").upper()
# Raw strings keep the Windows backslashes literal
output_file = r'C:\path_to_scripts\{}_items.txt'.format(device)
with open(r'C:\path_to_script\filename_Brief.csv') as infh, \
        open(output_file, 'wt') as outfh:
    reader = csv.DictReader(infh)
    for row in reader:
        if row['ALIAS'] == device:
            outfh.write('Full_Name = {Full_Name}\n'
                        'PHONE_NO = {PHONE_NO}\n'
                        'ALIAS = {ALIAS}\n'.format(**row))
I can hash the whole line using code such as:
import hashlib
# Reopen the text file by its path (outfh is a closed file handle, not a path)
with open(output_file, 'r') as file:
    lines = [x.strip('\n') for x in file.readlines()]
hash_file = r'C:\path_to_scripts\{}_hashed.csv'.format(device)
with open(hash_file, 'w') as save_file:
    # Write the saved file header
    header = '\"Value\",\"Algorithm\",\"Hash\"\n'
    save_file.write(header)
    # For each line in file
    for line in lines:
        # Create a list of all the available algorithms
        algorithms = ['md5','sha1','sha224','sha256','sha384','sha512']
        # For each algorithm
        for algo in algorithms:
            # Encode the original value as utf8
            val_enc = line.encode('utf-8')
            # Create the hashing object using the current algorithm
            hashed = hashlib.new(algo)
            # Set the value we want the hash of
            hashed.update(val_enc)
            # Get the hash
            hash_digest = hashed.hexdigest()
            results = '\"{}\",\"{}\",\"{}\"\n'.format(line, algo, hash_digest)
            save_file.write(results)
But I can't figure out how to split the line (for example, "Full_Name = Jack Flash") to obtain only "Jack Flash" as the object to be hashed, hash "Jack Flash", and write the values to a new text file in the format key = hashed value. The above code saves to a CSV file. I am cutting and pasting the relevant sections, so I hope that makes sense. Any ideas on how to accomplish this? Thanks in advance!
What you are looking for is str.split('=')
>>> line = "Full_Name = Jack Flash"
>>> parts = line.split('=')
>>> parts
['Full_Name ', ' Jack Flash']
>>> parts[1]
' Jack Flash'
If you do not want the leading ' ' hashed, remove it, for example with .strip(). Or, if '=' is *always* followed by ' ', use .split('= ').
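Putting the split together with hashlib, a minimal sketch could look like the following. hash_value is a hypothetical helper, sha256 is an arbitrary choice from the algorithms listed in the question, and the phone number is made-up sample data. It uses str.partition, a close relative of split that always returns three parts.

```python
import hashlib


def hash_value(line, algo="sha256"):
    """Return 'key = <hex digest of value>' for a 'key = value' line."""
    # partition splits at the first "=" only, so an "=" inside the value survives
    key, sep, value = line.partition("=")
    if not sep:
        # No delimiter found: leave the line unchanged
        return line
    hasher = hashlib.new(algo)
    # strip() drops the spaces around the value before hashing
    hasher.update(value.strip().encode("utf-8"))
    return "{} = {}".format(key.strip(), hasher.hexdigest())


lines = ["Full_Name = Jack Flash", "PHONE_NO = 555-0100"]
hashed_lines = [hash_value(line) for line in lines]
for hashed in hashed_lines:
    print(hashed)
```

Each returned string can then be written to the new text file with a trailing "\n", in the same way as the first script writes its lines.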
