How to split a Sinhala word at a specific character of the word in Python? I tried using the length of the word. Are there any other methods? - python-3.x

I used this, but it does not work in all cases.
def splitword(verb_list):
    # Ceiling of len/2, so the first part gets the extra character for odd lengths
    split = -((-len(verb_list))//2)
    return verb_list[:split], verb_list[split:]
print(verb_list)

If you want to remove a specific substring from each string in a list, you can use the str.replace() method in a list comprehension:
l = ['බලනවා', 'නටනවා']
l = [s.replace('නවා', '') for s in l]
l would become:
['බල', 'නට']
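Note that str.replace() removes every occurrence of the substring, not only one at the end of the word. If the goal is to split a verb into a stem and a known ending, a minimal sketch could look like the following. The helper name split_suffix is hypothetical, and it assumes the ending you want to split at is known in advance:

```python
def split_suffix(word, suffix):
    """Return (stem, suffix) if word ends with suffix, else (word, '')."""
    if suffix and word.endswith(suffix):
        # Slice by the suffix length, counted in code points
        return word[:-len(suffix)], suffix
    return word, ''

verbs = ['බලනවා', 'නටනවා']
pairs = [split_suffix(v, 'නවා') for v in verbs]
print(pairs)
```

Splitting by a known ending avoids cutting at a fixed length, which can separate a consonant from its vowel sign because Sinhala vowel signs are stored as separate code points.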

Related

string appears as subset of character in list element python

So far, I have:
my_list = ['hello', 'oi']
comparison_list = ['this hellotext', 'this oitext']
for w in my_list:
    if w in comparison_list: print('yes')
However, nothing prints because no element in my_list equals any element in comparison_list.
So how do I make this check work for a substring as well as for a complete match?
Ideal output:
yes
yes
You are currently checking whether the complete string occurs as an element of the list. Instead, check whether the string occurs inside each comparison string and decide based on that. A simple approach is to rewrite the loop as below:
for w in my_list:
    # Check every comparison string; any() is true if at least one value is true
    if any([True for word in comparison_list if w in word]):
        print('yes')
It's because you're comparing w to the list elements themselves. If you want to find w inside each string in your comparison_list, you can use any:
my_list = ['hello', 'oi', 'abcde']
comparison_list = ['this hellotext', 'this oitext']
for w in my_list:
    if any(w in s for s in comparison_list):
        print('yes')
    else:
        print('no')
I added a string to your list and handled the 'no' case in order to get an output for each element.
Output:
yes
yes
no
Edited Solution:
Apologies for the older solution; I was confused somehow.
Using the re module, we can use re.search to determine whether any of the strings is present in each item. To do this, we build an expression by using str.join to concatenate all the strings with |. Once the expression is created, we iterate through comparison_list. Note that | means 'or', so the search finds a match (and bool gives True) if any of the searched strings is present. Note that I am assuming there are no special regex characters in my_list.
import re
reg = '|'.join(my_list)
for item in comparison_list:
    print(bool(re.search(reg, item)))
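If my_list might contain regex metacharacters, one possible variation is to pass each entry through re.escape before joining. This is only a sketch: the entries 'a.b' and 'axb' are made-up additions that show the difference, since an unescaped '.' would match any character.

```python
import re

my_list = ['hello', 'oi', 'a.b']  # 'a.b' is an illustrative extra entry
comparison_list = ['this hellotext', 'this oitext', 'axb']

# Escape each search string so characters like '.' are matched literally
pattern = re.compile('|'.join(re.escape(w) for w in my_list))

results = [bool(pattern.search(item)) for item in comparison_list]
print(results)  # [True, True, False]
```

As in the answer above, this reports one result per item of comparison_list rather than one per item of my_list.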

Python 3: Add string to OBJECTS within a list

I am sure it is a very trivial question, but I cannot seem to find anything online, perhaps because I am not using the right terminology.
I have a list that looks like so:
list = ["abc",
"ded"]
I know how to append elements to this list, how to add elements to the beginning, etc.
What I need to do is to add a string (more specifically an asterisk (*)) before and after each object in this list. So it should look like:
list = ["*abc*",
"*ded*"]
I have tried:
asterisk = '*'
list = [asterisk] + list[0]
list = asterisk + List[0]
list = ['*'] + list[0]
list = * + list[0]
asterisk = list(asterisk)
list = [asterisk] + list[0]
and I always get:
TypeError: can only concatenate list (not "str") to list
Then of course there is the problem with adding it before and after each of the objects in the list.
Any help will be appreciated.
Just string interpolate it in as follows:
[f'*{s}*' for s in ["abc","ded"]]
Output
['*abc*', '*ded*']
Note this is for Python 3.6+ only.
For your list you use this beautiful syntax, called "list comprehension":
lst = ['abc', 'ded']
lst = ['*'+s+'*' for s in lst]
print(lst)
This would get you:
['*abc*', '*ded*']
Have you tried using a list comprehension to adjust each element? Here lst is your list:
[('*' + i + '*') for i in lst]
You can give the result a name, just for good measure, so that you can use it later.
You can join the asterisks with each word
asterisked = [w.join('**') for w in lst]
The trick here is to remember that strings are iterables, so you can pass a string containing two asterisks to the word's join method. The word is then placed between the two asterisks, which prepends the first one and appends the second.
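To see why the join trick works, here is a small illustration (the variable names are only for this example). str.join inserts the string it is called on between the items of the iterable, so a two-character string gives exactly one insertion:

```python
word = 'abc'

# The word acts as the separator between the two '*' characters
two = word.join('**')
same = word.join(['*', '*'])

# With three asterisks the word is inserted twice
three = word.join('***')

print(two)    # *abc*
print(same)   # *abc*
print(three)  # *abc*abc*
```

This is compact but less obvious to readers than '*' + word + '*' or an f-string, so which form to prefer is a matter of taste.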

What is the time complexity for a nested loop in this case?

I'm trying to tokenize a text file. I created a list of the lines in the text file using readlines() and plan to loop through each sentence in that list, splitting each one with re.split(). I then plan to loop through the resulting list and add each word to a dictionary to count how many times each word occurs. Would this nested-loop implementation result in O(N^2) or O(N)? Thanks.
This code is just an example of how I plan to implement it.
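for sentence in list:
    result = re.split(sentence)
    for word in result:
        dictionary[word] += 1
for sentence in list:  # n times (n = length of list)
    result = re.split(sentence)
    for word in result:  # m times (m = number of words in sentence)
        dictionary[word] += 1
So the runtime is O(n * m). That is n-squared only if each sentence has about as many words as there are sentences. Measured against the total number of words in the file it is linear, because each word is visited once and a dictionary update takes constant time on average.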
A better way to solve a counting problem is by using collections.Counter.
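A minimal sketch of the collections.Counter suggestion follows. The function name count_words, the sample lines, and the r'\W+' split pattern are assumptions made for illustration; note that re.split needs a pattern as its first argument, which the example in the question leaves out.

```python
import re
from collections import Counter

def count_words(lines):
    """Count word occurrences across an iterable of lines."""
    counts = Counter()
    for line in lines:
        # Split on runs of non-word characters; drop empty strings
        words = [w for w in re.split(r'\W+', line.lower()) if w]
        counts.update(words)
    return counts

sample = ["the cat sat", "the cat ran, the dog sat"]
counts = count_words(sample)
print(counts['the'], counts['cat'], counts['dog'])  # 3 2 1
```

Each word is still visited once, so the work stays proportional to the total number of words. With a real file, the lines from readlines() could be passed in place of sample.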

Find all the locations of `0000` from string by python [duplicate]

I'm trying to find every 10 digit series of numbers within a larger series of numbers using re in Python 2.6.
I'm easily able to grab non-overlapping matches, but I want every match in the number series, including overlapping ones. For example,
in "123456789123456789"
I should get the following list:
[1234567891,2345678912,3456789123,4567891234,5678912345,6789123456,7891234567,8912345678,9123456789]
I've found references to a "lookahead", but the examples I've seen only show pairs of numbers rather than larger groupings and I haven't been able to convert them beyond the two digits.
Use a capturing group inside a lookahead. The lookahead captures the text you're interested in, but the actual match is technically the zero-width substring before the lookahead, so the matches are technically non-overlapping:
import re
s = "123456789123456789"
matches = re.finditer(r'(?=(\d{10}))',s)
results = [int(match.group(1)) for match in matches]
# results:
# [1234567891,
# 2345678912,
# 3456789123,
# 4567891234,
# 5678912345,
# 6789123456,
# 7891234567,
# 8912345678,
# 9123456789]
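The same lookahead idea can also report where each overlapping match starts, which is what the title asks for with `0000`. This is a sketch with a made-up input string; match.start() gives the index of each zero-width match:

```python
import re

s = "1000001000000"  # illustrative input: five zeros, then six zeros

# A bare lookahead matches at every position where '0000' begins
positions = [m.start() for m in re.finditer(r'(?=0000)', s)]
print(positions)  # [1, 2, 7, 8, 9]
```

Because the lookahead consumes no characters, the scan advances one position at a time and overlapping occurrences are all found.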
You can also try using the third-party regex module (not re), which supports overlapping matches.
>>> import regex as re
>>> s = "123456789123456789"
>>> matches = re.findall(r'\d{10}', s, overlapped=True)
>>> for match in matches: print(match) # print match
...
1234567891
2345678912
3456789123
4567891234
5678912345
6789123456
7891234567
8912345678
9123456789
I'm fond of regexes, but they are not needed here.
Simple slicing does it (Python 2 syntax, as in the question):
s = "123456789123456789"
n = 10
li = [ s[i:i+n] for i in xrange(len(s)-n+1) ]
print '\n'.join(li)
result
1234567891
2345678912
3456789123
4567891234
5678912345
6789123456
7891234567
8912345678
9123456789
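For Python 3, the slicing approach might be written as below, with range in place of xrange and print as a function. The helper name windows is hypothetical. Unlike the \d{10} pattern, plain slicing does not check that the characters are digits.

```python
def windows(s, n):
    """Return every length-n substring of s, including overlapping ones."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

s = "123456789123456789"
li = windows(s, 10)
print('\n'.join(li))
```

If s is shorter than n, the range is empty and the function returns an empty list.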
Piggybacking on the accepted answer, the following currently works as well
import re
s = "123456789123456789"
matches = re.findall(r'(?=(\d{10}))',s)
results = [int(match) for match in matches]
A conventional way:
import re
S = '123456789123456789'
result = []
while len(S):
    m = re.search(r'\d{10}', S)
    if m:
        result.append(int(m.group()))
        S = S[m.start() + 1:]
    else:
        break
print(result)

Make Strings In List Uppercase - Python 3

I'm in the process of learning Python, and with a practical example I've come across a problem I can't seem to find the solution for.
The error I get with the following code is
'list' object has no attribute 'upper'.
def to_upper(oldList):
    newList = []
    newList.append(oldList.upper())
words = ['stone', 'cloud', 'dream', 'sky']
words2 = (to_upper(words))
print (words2)
Since the upper() method is defined for strings only and not for lists, you should iterate over the list and uppercase each string in it, like this:
def to_upper(oldList):
    newList = []
    for element in oldList:
        newList.append(element.upper())
    return newList
This will solve the issue with your code; however, there are shorter, more compact versions if you want to uppercase a list of strings.
The map function, map(f, iterable). In this case your code will look like this:
words = ['stone', 'cloud', 'dream', 'sky']
words2 = list(map(str.upper, words))
print (words2)
A list comprehension, [func(i) for i in iterable]. In this case your code will look like this:
words = ['stone', 'cloud', 'dream', 'sky']
words2 = [w.upper() for w in words]
print (words2)
You can use the list comprehension notation and apply the upper method to each string in words:
words = ['stone', 'cloud', 'dream', 'sky']
words2 = [w.upper() for w in words]
Or alternatively use map to apply the function:
words2 = list(map(str.upper, words))
AFAIK, upper() method is implemented for strings only. You have to call it from each child of the list, and not from the list itself.
It's great that you're learning Python! In your example, you are trying to uppercase a list. If you think about it, that simply can't work. You have to uppercase the elements of that list. Additionally, you are only going to get an output from your function if you return a result at the end of the function. See the code below.
Happy learning!
def to_upper(oldList):
    newList = []
    for l in oldList:
        newList.append(l.upper())
    return newList
words = ['stone', 'cloud', 'dream', 'sky']
words2 = (to_upper(words))
print (words2)
