How to bring text data into given format using python

I have JSON data which looks like this:
{
"text":"Dispute Case ID MM-E-904982837 the amount of $20.06 should be AU dollars total is $62.34 US dollars.",
"spans":[
{"start":82,"end":99,"label":"dis_amt","ngram":"$62.34 US dollars"},
{"start":45,"end":51,"label":"dis_amt","ngram":"$20.06"}
]
}
I want to convert this data into the format below.
Wherever a span in spans has a label, the matching tokens of the original text should be replaced with that label. A span may comprise more than one token.
**First step**: ['Dispute', 'Case', 'ID', 'MM-E-904982837', 'the', 'amount', 'of', '$20.06', 'should', 'be', 'AU', 'dollars', 'total', 'is', '$62.34', 'US', 'dollars.']
**Second step**: ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'dis_amt', 'O', 'O', 'O', 'O', 'O', 'O', 'dis_amt', 'dis_amt', 'dis_amt']
My code:
import ast

for data in data_set:
    data = ast.literal_eval(data)
    text = data['text']
    split_txt = text.split()
    print(split_txt)
    nerd_label = ['O' for i in range(len(split_txt))]
    for sp in data['spans']:
        ngrams = sp['ngram']
        split_ngram = ngrams.split()
        for ngram in split_ngram:
            if ngram in split_txt:
                idx = split_txt.index(ngram)
                nerd_label[idx] = sp['label']
I get this wrong output:
['O', 'O', 'O', 'O', 'O', 'O', 'O', 'dis_amt', 'O', 'O', 'O', 'dis_amt', 'O', 'O', 'dis_amt', 'dis_amt', 'O']
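The wrong output seems to come from matching on token text: text.split() keeps the trailing period in 'dollars.', so the ngram token 'dollars' never matches it, and split_txt.index('dollars') returns the first 'dollars' (the one after 'AU') instead. One possible way around this is to ignore the ngram strings and use the start/end character offsets that the spans already carry. The following is only a sketch under that assumption; the helper name label_tokens is hypothetical, and a token is labeled whenever its character range overlaps a span.

```python
import re

sample = {
    "text": "Dispute Case ID MM-E-904982837 the amount of $20.06 should be AU dollars total is $62.34 US dollars.",
    "spans": [
        {"start": 82, "end": 99, "label": "dis_amt", "ngram": "$62.34 US dollars"},
        {"start": 45, "end": 51, "label": "dis_amt", "ngram": "$20.06"},
    ],
}

def label_tokens(text, spans):
    """Split text on whitespace and label each token by span overlap.

    Hypothetical helper: a token gets a span's label if the token's
    character range overlaps the span's [start, end) range.
    """
    tokens, labels = [], []
    for match in re.finditer(r"\S+", text):
        label = "O"
        for span in spans:
            # Two half-open ranges overlap if each starts before the other ends.
            if match.start() < span["end"] and span["start"] < match.end():
                label = span["label"]
                break
        tokens.append(match.group())
        labels.append(label)
    return tokens, labels

tokens, labels = label_tokens(sample["text"], sample["spans"])
print(tokens)
print(labels)
```

With the sample data this labels '$20.06' and the three tokens '$62.34', 'US', 'dollars.' as dis_amt and leaves every other token, including the first 'dollars', as 'O'.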

Related

python 3 how to generate multiple random element in list for loops

I'm doing a coding exercise to build a password generator. I understand that I need a for loop over the list of elements, but I'm having trouble getting multiple random elements. If the user enters 5, I can generate one random letter repeated 5 times, but I can't get 5 different elements. What code do I need to generate random elements based on the user input? I know my code and logic are incorrect, but I can't figure out how else to approach this. Any feedback is much appreciated, thank you.
import random
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
nr_letters= int(input("How many letters would you like in your password?\n"))
for letter in letters:
    random_letter = random.choice(letters) * nr_letters
print(random_letter)

There may be better ways; I've just adapted your code.
The for loop over letters is redundant, and random.choice(letters) * nr_letters picks a single letter and repeats it.
You can do something like this instead:
import random
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
nr_letters= int(input("How many letters would you like in your password?\n"))
random_letter = ''
for i in range(nr_letters):
    random_letter += random.choice(letters)
print(random_letter)

You don't actually need a for loop to get the desired password.
import random
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
nr_letters= int(input("How many letters would you like in your password?\n"))
random_letter = "".join(random.choices(letters, k=nr_letters))
print(random_letter)
If you must use a loop, you can place the code above inside one. Happy coding.
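As a compact variant of the same idea, the sketch below wraps the generation in a function and takes the alphabet from string.ascii_letters, which holds the same 52 letters as the hand-typed list. The function name generate_password is hypothetical. Note that random.choices draws with replacement, so a letter may appear more than once.

```python
import random
import string

def generate_password(length):
    """Return `length` random ASCII letters (hypothetical helper)."""
    # random.choices samples with replacement, so letters may repeat
    # and `length` may be larger than the 52-letter alphabet.
    return "".join(random.choices(string.ascii_letters, k=length))

print(generate_password(5))
```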

Merging given list into a list of list

I have a list which looks like seen=['poll','roll','toll','told']
I need to compare the characters of each element in that list.
When I try to split out those characters using
for i in range(len(seen)):
    chain1 = []
    for j in range(len(seen)):
        chain1.append(seen[i][j])
    print(chain1)
I get an output like this
['p', 'o', 'l', 'l']
['r', 'o', 'l', 'l']
['t', 'o', 'l', 'l']
['t', 'o', 'l', 'd']
Since these are all separate lists, I can't iterate over them together.
My thinking is that if I can get those lists into a single list of lists, I can do my iterations.
Any suggestions on how to make this a list of lists, or on some other way to iterate over those words?

You can merge them like this:
seen=['poll','roll','toll','told']
alist=[]
for i in seen:
    chain=[]
    for j in i:
        chain.append(j)
    alist.append(chain)
print(alist)
Output:
[['p', 'o', 'l', 'l'], ['r', 'o', 'l', 'l'], ['t', 'o', 'l', 'l'], ['t', 'o', 'l', 'd']]
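The same list of lists can also be built with a list comprehension, and if the goal is to compare the characters at the same position across words, zip(*seen) may be a more direct route. A minimal sketch, assuming all words have the same length (zip stops at the shortest word otherwise):

```python
seen = ['poll', 'roll', 'toll', 'told']

# list(word) splits a string into its individual characters.
chars = [list(word) for word in seen]
print(chars)

# zip(*seen) groups the characters found at the same position in every word.
for position, column in enumerate(zip(*seen)):
    print(position, column)
```

Each column tuple can then be inspected directly, for example len(set(column)) == 1 is true when every word has the same character at that position.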

Password Generator - implementation with minimum password length constraint e.g. password length = 8 (in python)

I am not able to modify the code to enforce a minimum password length, for example a minimum length of 8. I tried using a while loop, but the code does not run as expected. Please help me with this.
import random
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N',
'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']
print("Welcome to the PyPassword Generator!")
nr_letters= int(input("How many letters would you like in your password?\n"))
nr_symbols = int(input(f"How many symbols would you like?\n"))
nr_numbers = int(input(f"How many numbers would you like?\n"))
password = []
password.extend(random.sample(letters, nr_letters))
password.extend(random.sample(symbols, nr_symbols))
password.extend(random.sample(numbers, nr_numbers))
random.shuffle(password)
finalPassword = ""
print(f"Here is your password: {finalPassword.join(password)}")

Simply keep the input statement within a while loop that checks your criteria:
import random
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N',
'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']
print("Welcome to the PyPassword Generator!")
nr_letters = 0
nr_symbols = 0
nr_numbers = 0
while nr_letters + nr_symbols + nr_numbers < 8:
    nr_letters= int(input("How many letters would you like in your password?\n"))
    nr_symbols = int(input(f"How many symbols would you like?\n"))
    nr_numbers = int(input(f"How many numbers would you like?\n"))
password = []
password.extend(random.sample(letters, nr_letters))
password.extend(random.sample(symbols, nr_symbols))
password.extend(random.sample(numbers, nr_numbers))
random.shuffle(password)
finalPassword = ""
print(f"Here is your password: {finalPassword.join(password)}")
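One thing to watch with this approach: random.sample draws without replacement and raises ValueError when asked for more items than the list holds, for example 10 symbols from the 9 available. The sketch below is one possible way to separate the length check from the input prompts and to allow repeated characters by using random.choices instead. The names build_password and MIN_LENGTH are hypothetical, and swapping sample for choices is a design choice of this sketch, not something the original code does.

```python
import random
import string

MIN_LENGTH = 8  # minimum total length, taken from the question
SYMBOLS = "!#$%&()*+"  # same nine symbols as the list above

def build_password(nr_letters, nr_symbols, nr_numbers):
    """Build a shuffled password or raise ValueError if it is too short."""
    if nr_letters + nr_symbols + nr_numbers < MIN_LENGTH:
        raise ValueError(f"password must have at least {MIN_LENGTH} characters")
    # random.choices samples with replacement, so a count larger than
    # the size of the pool is allowed (unlike random.sample).
    chars = (
        random.choices(string.ascii_letters, k=nr_letters)
        + random.choices(SYMBOLS, k=nr_symbols)
        + random.choices(string.digits, k=nr_numbers)
    )
    random.shuffle(chars)
    return "".join(chars)

print(build_password(5, 2, 3))
```

The input loop from the answer can then call build_password inside a try/except ValueError block and ask again when the total is too small.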

how to make the first for statement to appear on new line

I want to make the output like:
..OO.OO..
.OOOOOOO.
.OOOOOOO.
..OOOOO..
...OOO...
....O....
grid = [
['.', '.', '.', '.', '.', '.'] ,
['.', 'O', 'O', '.', '.', '.'] ,
['O', 'O', 'O', 'O', '.', '.'] ,
['O', 'O', 'O', 'O', 'O', '.'] ,
['.', 'O', 'O', 'O', 'O', 'O'] ,
['O', 'O', 'O', 'O', 'O', '.'] ,
['O', 'O', 'O', 'O', '.', '.'] ,
['.', 'O', 'O', '.', '.', '.'] ,
['.', '.', '.', '.', '.', '.']
]
x = 0
y = 0
for y in range(0,6):
    for x in range(0,9):
        print(grid[x][y] , end = '')
But the output is:
..OO.OO...OOOOOOO..OOOOOOO...OOOOO.....OOO.......O....
What should I change in the code so that each pass of the outer for loop is printed on a new line?

If you want to stick with for loops, you can build each line in a variable and print it once the inner loop has collected the whole line:
for y in range(0,6):
    line = ""
    for x in range(0,9):
        line += grid[x][y]
    print(line)
Gives:
..OO.OO..
.OOOOOOO.
.OOOOOOO.
..OOOOO..
...OOO...
....O....
Otherwise, a quick way to solve this is to transpose the grid with NumPy (np.ndarray.T, after import numpy as np).
Then join each row into a string.
>>> print("\n".join("".join(l) for l in np.array(grid).T))
..OO.OO..
.OOOOOOO.
.OOOOOOO.
..OOOOO..
...OOO...
....O....

No need for NumPy methods or other imports.
You can transpose your lists using zip() and use str.join() to join them before printing:
grid = [
['.', '.', '.', '.', '.', '.'] ,
['.', 'O', 'O', '.', '.', '.'] ,
['O', 'O', 'O', 'O', '.', '.'] ,
['O', 'O', 'O', 'O', 'O', '.'] ,
['.', 'O', 'O', 'O', 'O', 'O'] ,
['O', 'O', 'O', 'O', 'O', '.'] ,
['O', 'O', 'O', 'O', '.', '.'] ,
['.', 'O', 'O', '.', '.', '.'] ,
['.', '.', '.', '.', '.', '.']
]
transposed = '\n'.join(''.join(k) for k in zip(*grid))
print(transposed)
Output:
..OO.OO..
.OOOOOOO.
.OOOOOOO.
..OOOOO..
...OOO...
....O....
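If the goal is to keep the original nested loops with end='' as they are, another option is to add a bare print() after the inner loop, which only emits the newline that end='' suppressed. A minimal sketch, with a hypothetical function name print_by_column and the loop bounds taken from the grid itself rather than hard-coded as 6 and 9:

```python
grid = [
    ['.', '.', '.', '.', '.', '.'],
    ['.', 'O', 'O', '.', '.', '.'],
    ['O', 'O', 'O', 'O', '.', '.'],
    ['O', 'O', 'O', 'O', 'O', '.'],
    ['.', 'O', 'O', 'O', 'O', 'O'],
    ['O', 'O', 'O', 'O', 'O', '.'],
    ['O', 'O', 'O', 'O', '.', '.'],
    ['.', 'O', 'O', '.', '.', '.'],
    ['.', '.', '.', '.', '.', '.'],
]

def print_by_column(grid):
    """Print the grid column by column, one column per output line."""
    for y in range(len(grid[0])):      # 6 columns in the sample grid
        for x in range(len(grid)):     # 9 rows in the sample grid
            print(grid[x][y], end='')  # stay on the current line
        print()                        # end the line after each column

print_by_column(grid)
```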

Pandas array to columns ( need to convert alphabets into words)

I have data in the array format below. I need to convert the letters
in each tuple into a word, e.g.
(' ', 'a', 'd', 'e', 'l', 'o', 'r', 't') = 'adelort'
How can I do this?
Array =[(' ', 'a', 'd', 'e', 'l', 'o', 'r', 't'),
(' ', 'a', 'd', 'e', 'l', 'o', 'r', 't'),
(' ', 'e', 'i', 'o', 't', 'v'),
('d', 'e', 'g', 'i', 'n', 'r', 't'),
('d', 'e', 'g', 'i', 'n', 'r', 't'),
('a', 'd', 'e', 'i', 'l', 'm', 'n', 't')]
I get the above array while working on an NLP problem; see the code below:
xtest_tfidf = tfidf_vectorizer.transform(pred_test)
y_pred_test = clf.predict(xtest_tfidf)
multilabel_binarizer.inverse_transform(y_pred_test)

I'm not sure about the NLP part, but you asked how to convert the array into words:
Array =[(' ', 'a', 'd', 'e', 'l', 'o', 'r', 't'),
(' ', 'a', 'd', 'e', 'l', 'o', 'r', 't'),
(' ', 'e', 'i', 'o', 't', 'v'),
('d', 'e', 'g', 'i', 'n', 'r', 't'),
('d', 'e', 'g', 'i', 'n', 'r', 't'),
('a', 'd', 'e', 'i', 'l', 'm', 'n', 't')]
words = ["".join(x).strip() for x in Array]
yields
['adelort', 'adelort', 'eiotv', 'deginrt', 'deginrt', 'adeilmnt']
