Itertools.product object to string - python-3.x

Hi, I'm starting to learn Python.
I have to do a small project to generate passwords using itertools. The only thing I don't understand is how to convert this result:
itertools.product object at 0x03493BE8
to something readable.
How can I convert it to get a string or something similar?
Here is my code:
import itertools

for CharLength in range(12):
    words = itertools.product(Alphabet, repeat=CharLength)
    print(words)

itertools.product() returns an iterator, not a list.
To print the actual combinations you can unpack it with the * operator:
for char_len in range(12):
    words = itertools.product(alphabet, repeat=char_len)
    print(*words)

itertools.product returns an iterator object. If you wish, you could iterate over it and convert it to a list so it's easier to view its contents, e.g. by using a list comprehension (or simply list(...)):
words = [w for w in itertools.product(Alphabet, repeat=CharLength)]
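If the goal is readable password strings rather than tuples, you can also join each tuple of characters into a single string. A minimal sketch, assuming Alphabet is a string or list of single characters (the tiny alphabet and short lengths here are just to keep the output small):
import itertools

Alphabet = "ab"  # assumption: any iterable of single characters
for CharLength in range(1, 4):
    for combo in itertools.product(Alphabet, repeat=CharLength):
        password = "".join(combo)  # tuple of characters -> one readable string
        print(password)
For Alphabet = "ab" this prints a, b, aa, ab, ba, bb, aaa, and so on.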

Related

Convert everything in a dictionary to lower case, then filter on it?

import pandas as pd
import nltk
import os

directory = os.listdir(r"C:\...")
x = []
num = 0
for i in directory:
    x.append(pd.read_fwf("C:\\..." + i))
    x[num] = x[num].to_string()
    num += 1
So, once I have a dictionary x = [ ] populated by the read_fwf for each file in my directory:
I want to know how to make it so every single character is lowercase. I am having trouble understanding the syntax and how it is applied to a dictionary.
I want to define a filter that I can use to count occurrences of a list of words in this newly defined dictionary, e.g.,
list = [bus, car, train, aeroplane, tram, ...]
Edit: Quick unrelated question:
Is pd.read_fwf the best way to read .txt files? If not, what else could I use?
Any help is very much appreciated. Thanks
Edit 2: Sample data and output that I want:
Sample:
The Horncastle boar's head is an early seventh-century Anglo-Saxon
ornament depicting a boar that probably was once part of the crest of
a helmet. It was discovered in 2002 by a metal detectorist searching
in the town of Horncastle, Lincolnshire. It was reported as found
treasure and acquired for £15,000 by the City and County Museum, where
it is on permanent display.
Required output - changes everything in uppercase to lowercase:
the horncastle boar's head is an early seventh-century anglo-saxon
ornament depicting a boar that probably was once part of the crest of
a helmet. it was discovered in 2002 by a metal detectorist searching
in the town of horncastle, lincolnshire. it was reported as found
treasure and acquired for £15,000 by the city and county museum, where
it is on permanent display.
You shouldn't need to use pandas or dictionaries at all. Just use Python's built-in open() function:
# Open a file in read mode with a context manager
with open(r'C:\path\to\your\file.txt', 'r') as file:
    # Read the file into a string
    text = file.read()

# Use the string's lower() method to make everything lowercase
text = text.lower()
print(text)

# Split text by whitespace into a list of words
word_list = text.split()

# Get the number of elements in the list (the word count)
word_count = len(word_list)
print(word_count)
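If you only want to count occurrences of specific words (as in your bus/car/train example), a rough sketch building on word_list from above could look like this; the filter list is just illustrative:
# Hypothetical filter list based on the example in the question
filter_words = ['bus', 'car', 'train', 'aeroplane', 'tram']

# Count how often each filter word appears in the lowercased word list
counts = {word: word_list.count(word) for word in filter_words}
print(counts)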
If you want, you can do it in the reverse order:
# Open a file in read mode with a context manager
with open(r'C:\path\to\your\file.txt', 'r') as file:
    # Read the file into a string
    text = file.read()

# Split text by whitespace into a list of words
word_list = text.split()

# Use a list comprehension to create a new list with the lower() method applied to each word
lowercase_word_list = [word.lower() for word in word_list]
print(lowercase_word_list)
Using a context manager for this is good since it automatically closes the file for you as soon as it goes out of scope (de-indented from the with statement block). Otherwise you would have to call open() and then remember to call file.close() yourself.
I think there are some other benefits to using context managers, but someone please correct me if I'm wrong.
I think what you are looking for is a dictionary comprehension:
# Python 3
new_dict = {key: val.lower() for key, val in old_dict.items()}
# Python 2
new_dict = {key: val.lower() for key, val in old_dict.iteritems()}
items() (Python 3) / iteritems() (Python 2) gives you the (key, value) pairs in the dictionary (e.g. [('somekey', 'SomeValue'), ('somekey2', 'SomeValue2')]).
The comprehension iterates over each of these pairs, creating a new dictionary in the process. In the key: val.lower() section, you can do whatever manipulation you want to create the new dictionary.
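For example, a quick before/after with a made-up dictionary:
old_dict = {'somekey': 'SomeValue', 'somekey2': 'SomeValue2'}
new_dict = {key: val.lower() for key, val in old_dict.items()}
print(new_dict)  # {'somekey': 'somevalue', 'somekey2': 'somevalue2'}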

How can I extract text from string in python?

Say I have the code txt = "Hello my name is bob. I really like pies.", how would I extract each sentence individually and add them to a list? I created this messy script which gives me a rough count of the sentences in a string...
sentences = 0
capitals = [
    'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S',
    'T','U','V','W','X','Y','Z'
]
finish_markers = [
    '.','?','!'
]
newTxt = txt.split()
for x in newTxt[1:-1]:
    for caps in capitals:
        if caps in x:
            for fin in finish_markers:
                if fin in newTxt[newTxt.index(x) - 1]:
                    sentences += 1
for caps in capitals:
    if caps in newTxt[0]:
        sentences += 1
print("Sentence count...")
print(sentences)
It is using the txt variable mentioned above. However I would now like to extract each sentence and put them into a list so the final product would look something like this...
['Hello my name is bob.','I really like pies.']
I would prefer not to use any non standard packages because I want this script to work independent of everything and offline. Thank you for any help!
Use nltk.tokenize
import nltk
sentences = nltk.sent_tokenize(txt)
This will give you a list of sentences.
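For the example string above, that list would be ['Hello my name is bob.', 'I really like pies.'].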
You could work with a regex that matches all of the ending characters (".", "?", "!") and then split the text into separate strings.
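A minimal sketch of that idea with the standard re module (it keeps the ending punctuation attached by splitting on the whitespace that follows it):
import re

txt = "Hello my name is bob. I really like pies."
# Split on whitespace that follows ".", "?" or "!"
sentences = re.split(r'(?<=[.?!])\s+', txt)
print(sentences)  # ['Hello my name is bob.', 'I really like pies.']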
You are trying to split a string into sentences, which is a bit hard to do with regular expressions or plain string handling. For your use case, I'd recommend an NLP library like NLTK. Then take a look at Tokenize a paragraph into sentence and then into words in NLTK.

Find a specific item from a list using python

I have a list of 20000 products with their descriptions.
This shows the variety of the products.
I want to be able to write code that searches for a particular word, say 'TAPA', and gives an output of all the TAPAs.
I found this Find a specific word from a list in python, but it uses startswith, which only matches at the beginning of the string, for example:
new = [x for x in df1['A'] if x.startswith('00320')]
## output ['00320671-01 Guide rail 25N/1660', '00320165S02 - Miniature rolling table']
How shall I search for a match at the second letter, the third, or any other position?
P.S. - the list consists of strings, integers, floats
You can use string.find(substring) for this purpose. So in your case this should work:
new = [x for x in df1['A'] if x.find('00320') != -1]
The find() method returns the lowest index of the substring found else returns -1.
To know more about usage of find() refer to Geeksforgeeks.com - Python String | find()
Edit 1:
As suggested by #Thierry in comments, a cleaner way to do this is:
new = [x for x in df1['A'] if '00320' in x]
You can use the built-in functions of Pandas to find partial string matches and generate lists:
new = df1['A'][df1['A'].astype(str).str.contains('00320')].tolist()
An advantage of pandas str.contains() is that the use of regex is possible.
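For example, a rough sketch matching either of two patterns with a regex (the sample data here is made up from the snippets in the question):
import pandas as pd

df1 = pd.DataFrame({'A': ['00320671-01 Guide rail 25N/1660', 'TAPA cover', 99]})
# astype(str) handles the mixed strings/integers/floats in the column
mask = df1['A'].astype(str).str.contains('00320|TAPA', regex=True)
new = df1['A'][mask].tolist()
print(new)  # ['00320671-01 Guide rail 25N/1660', 'TAPA cover']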

Proper Syntax for List Comprehension Involving an Integer and a Float?

I have a List of Lists that looks like this (Python3):
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
I'm trying to use a list comprehension to convert the first item of each inner list to an int and the second item to a float so that I end up with this:
newLOL = [[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
I'm learning list comprehensions, can somebody please help me with this syntax?
Thank you!
[edit - to explain why I asked this question]
This question is a means to an end - the syntax requested is needed for testing. I'm collecting sensor data on a ZigBee network, and I'm using an Arduino to format the sensor messages in JSON. These messages are published to an MQTT broker (Mosquitto) running on a Raspberry Pi. A Redis server (also running on the Pi) serves as an in-memory message store. I'm writing a service (python-MQTT client) to parse the JSON and send a LoL (a sample of the data you see in my question) to Redis. Finally, I have a dashboard running on Apache on the Pi. The dashboard utilizes Highcharts to plot the sensor data dynamically (via a web socket connection between the MQTT broker and the browser). Upon loading the page, I pull historical chart data from my Redis LoL to "very quickly" populate the charts on my dashboard (before any realtime data is added dynamically). I realize I can probably format the sensor data the way I want in the Redis store, but that is a problem I haven't worked out yet. Right now, I'm trying to get my historical data to plot correctly in Highcharts. With the data properly formatted, I can get this piece working.
Well, you could use ast.literal_eval:
from ast import literal_eval
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
items = [[int(literal_eval(i)[0]), float(literal_eval(i)[1])] for i in myLOL]
Try:
import json
newLOL = [[int(a[0]), float(a[1])] for a in (json.loads(s.replace("'", '"')) for s in myLOL)]
Here I'm considering each element of the list as a JSON, but since it's using ' instead of " for the strings, I have to replace it first (it only works because you said there will be only numbers).
This may work? I wish I was more clever.
newLOL = []
for listObj in myLOL:
    listObj = listObj.replace('[', '').replace(']', '').replace("'", '').split(',')
    newListObj = [int(listObj[0]), float(listObj[1])]
    newLOL.append(newListObj)
This iterates through your current list and peels each string apart by replacing the unwanted characters and splitting on the comma. Then we take the modified list object and create a new list with the values converted to the respective int and float, and append the prepared newListObj to the newLOL list. This assumes you want an actual set of lists within your list; your documented input list actually contains strings which merely look like lists.
This is a very strange format and the best solution is likely to change the code which generates that.
That being said, you can use ast.literal_eval to safely evaluate the elements of the list as Python tokens:
>>> import ast
>>> lit = ast.literal_eval
>>> [[lit(str_val) for str_val in lit(str_list)] for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
We need to do it twice - once to turn the string into a list containing two strings, and then once per resulting string to convert it into a number.
Note that this will succeed even if the strings contain other valid tokens. If you want to validate the format too, you'd want to do something like:
>>> def process_str_list(str_list):
... l = ast.literal_eval(str_list)
... if not isinstance(l, list):
... raise TypeError("Expected list")
... str_int, str_float = l
... return [int(str_int), float(str_float)]
...
>>> [process_str_list(str_list) for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
Your input consists of a list of strings, where each string is the string representation of a list. The first task is to convert the strings back into lists:
import ast
lol2 = list(map(ast.literal_eval, myLOL))  # [['1466279297', '703.0'], ...]
Now, you can simply get int and float values from lol2:
newlol = [[int(a[0]), float(a[1])] for a in lol2]

How to get dictionary values in Python

I'm working with Python dictionaries and nltk on some reviews. I have an input (txt) file which is a simple review. In a dictionary, all_dict.txt, I have all words (negative and positive) with word polarities and a value.
all_dict.txt looks like this:
"acceptable":("positive",1),"good":("positive",1),"shame":("negative",2),"bad":("negative",4),...
I want to know how I can get these polarities from the dictionary and a number value for each word, so that I can get an output like this:
"acceptable_positive":1,"good_positive":1,"shame_negative":2,"bad_negative":4
I tried dict.get() and dict.values(), but I don't get what I want. Is there a method to fetch keys and values automatically?
I tried with my code:
f_all_dict = open('all_dict.txt','r',encoding='utf-8').read()
f = eval(f_all_dict)

result_all = {}
for word in f.items():
    suffix, pol = result_all[word]  # pol -> polarity
    result_all[word + "_" + suffix] = pol
But I get a KeyError if the word doesn't exist in the input file (review).
Thank you for your help
First off, dict.items() returns a view of (key, value) tuples, so when you pass one of those tuples as a key to your dictionary it raises a KeyError:
suffix, pol = result_all[word]
Secondly, you'd better use a with statement to deal with external resources like files, and use ast.literal_eval() for evaluating your dictionary. You can also access your value's items by using tuple unpacking :-) within a dict comprehension.
from ast import literal_eval

with open('all_dict.txt','r',encoding='utf-8') as f_all_dict:
    dictionary = literal_eval(f_all_dict.read().strip())

result_all = {"{}_{}".format(word, suffix): pol for word, (suffix, pol) in dictionary.items()}
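With the sample all_dict.txt from the question, result_all should come out as {"acceptable_positive": 1, "good_positive": 1, "shame_negative": 2, "bad_negative": 4} (assuming the file's contents evaluate to a dictionary).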
After modification my code looks like this. I didn't use the with statement and it works fine.
from ast import literal_eval

f_all_dict = open('all_dict.txt','r',encoding='utf-8').read()
f = literal_eval(f_all_dict)

# tokens is the list of words from the input review, built earlier in the script
result_all = {"{}_{}".format(word, suffix): pol * tokens.count(word) for word, (suffix, pol) in f.items()}
print(result_all)
