How to find the second most repetitive character in string using python


How can I find the second most repeated character in a string? For example, given "abcdaabdefaggcbd":
Output: d (because 'd' occurs 3 times while 'a' occurs 4 times)
How can I get this output? Please help.
My code is given below:
s = "abcdaabdefaggcbd"
d = {}
for i in s:
    d[i] = d.get(i, 0) + 1
print(d, "ddddd")
max2 = 0
for k, v in d.items():
    if v > max2 and v < max(d.values()):
        max2 = v
if max2 in d.values():
    print(k, "kkk")

The magnificent Python Counter and its most_common() method are very handy here.
import collections
my_string = "abcdaabdefaggcbd"
result = collections.Counter(my_string).most_common()
print(result[1])
Output
('b', 3)
If you need to capture every character that shares the second-highest count (in case of a tie), you can use the following:
import collections
my_string = "abcdaabdefaggcbd"
result = collections.Counter(my_string).most_common()
second_value = result[1][1]
seconds = []
for item in result:
    if item[1] == second_value:
        seconds.append(item)
print(seconds)
Output
[('b', 3), ('d', 3)]
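One caveat: result[1] is only the runner-up when a single character holds the top count. If two characters tie for first place, result[1] is one of the winners. A minimal sketch that ranks the distinct counts instead (the function name second_most_common is hypothetical, not part of the answer above):

```python
from collections import Counter

def second_most_common(text):
    """Return all (char, count) pairs sharing the second-highest distinct count."""
    counts = Counter(text)
    # Distinct counts, highest first, so ties for first place collapse into one rank.
    ranked = sorted(set(counts.values()), reverse=True)
    if len(ranked) < 2:
        return []  # no second-highest count exists
    second = ranked[1]
    return [(char, n) for char, n in counts.items() if n == second]

print(second_most_common("abcdaabdefaggcbd"))  # [('b', 3), ('d', 3)]
print(second_most_common("aabbc"))             # [('c', 1)]
```

Returning an empty list when there is no second rank is a design choice for this sketch; raising an exception would be equally reasonable.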
I also wanted to add an example of solving the problem using a methodology more similar to the one that you showed in your question:
my_string = "abcdaabdefaggcbd"
result = {}
for character in my_string:
    if character in result:
        result[character] = result.get(character) + 1
    else:
        result[character] = 1
sorted_data = sorted([(value, key) for (key, value) in result.items()])
second_value = sorted_data[-2][0]
result = []
for item in sorted_data:
    if item[0] == second_value:
        result.append(item)
print(result)
Output
[(3, 'b'), (3, 'd')]
P.S. Please forgive me for taking the liberty of changing the variable names; I think it makes my answer more readable for a broader audience.

Sort the dict's items on their values (descending) and get the second item:
>>> from collections import Counter
>>> c = Counter("abcdaabdefaggcbd")
>>> vals = sorted(c.items(), key=lambda item:item[1], reverse=True)
>>> vals
[('a', 4), ('b', 3), ('d', 3), ('c', 2), ('g', 2), ('e', 1), ('f', 1)]
>>> print(vals[1])
('b', 3)
>>>
EDIT: Or just use Counter.most_common():
>>> from collections import Counter
>>> c = Counter("abcdaabdefaggcbd")
>>> print(c.most_common()[1])

Both b and d are the second most repeated, so I think both should be displayed. This is how I would do it:
Code:
s = "abcdaabdefaggcbd"
d = {}
for i in s:
    ctr = s.count(i)
    d[i] = ctr
fir = max(d.values())
sec = 0
for j in d.values():
    if j > sec and j < fir:
        sec = j
for k, v in d.items():
    if v == sec:
        print(k, v)
Output:
b 3
d 3

To find the second most repeated character in a string, you can use collections.Counter().
Here's an example:
import collections
s='abcdaabdefaggcbd'
count=collections.Counter(s)
print(count.most_common(2)[1])
Output: ('b', 3)
You can do a lot with Counter(). See the collections.Counter documentation for further reading.
I hope this answers your question. Cheers!

Related

PySpark sort values

I have the following data:
[(u'ab', u'cd'),
(u'ef', u'gh'),
(u'cd', u'ab'),
(u'ab', u'gh'),
(u'ab', u'cd')]
I would like to run a map-reduce over this data to find out how often each pair appears.
As a result I get:
[((u'ab', u'cd'), 2),
((u'cd', u'ab'), 1),
((u'ab', u'gh'), 1),
((u'ef', u'gh'), 1)]
As you can see, this is not quite right: (u'ab', u'cd') should have a count of 3 instead of 2, because (u'cd', u'ab') is the same pair.
My question is: how can I make the program count (u'cd', u'ab') and (u'ab', u'cd') as the same pair? I was thinking about sorting the values in each row but could not find a way to do it.

You can sort the values then use reduceByKey to count the pairs:
rdd1 = rdd.map(lambda x: (tuple(sorted(x)), 1))\
          .reduceByKey(lambda a, b: a + b)
rdd1.collect()
# [(('ab', 'gh'), 1), (('ef', 'gh'), 1), (('ab', 'cd'), 3)]
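The key step is normalising each pair with tuple(sorted(x)), so that both orderings map to the same key. A plain-Python sketch of the same idea, using collections.Counter in place of Spark's reduceByKey (the variable names are illustrative and no Spark session is involved):

```python
from collections import Counter

pairs = [("ab", "cd"), ("ef", "gh"), ("cd", "ab"), ("ab", "gh"), ("ab", "cd")]

# Sorting each pair makes ("cd", "ab") and ("ab", "cd") the same key.
pair_counts = Counter(tuple(sorted(pair)) for pair in pairs)

print(sorted(pair_counts.items()))
# [(('ab', 'cd'), 3), (('ab', 'gh'), 1), (('ef', 'gh'), 1)]
```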

You can key by the sorted element, and count by key:
result = rdd.keyBy(lambda x: tuple(sorted(x))).countByKey()
print(result)
# defaultdict(<class 'int'>, {('ab', 'cd'): 3, ('ef', 'gh'): 1, ('ab', 'gh'): 1})
To convert the result into a list, you can do:
result2 = sorted(result.items())
print(result2)
# [(('ab', 'cd'), 3), (('ab', 'gh'), 1), (('ef', 'gh'), 1)]

Python: How to use one dictionary to decode the other?

Say I had two dictionaries:
d1 = {'a':1, 'b':2}
d2 = {'a':'b', 'b':'b', 'a':'a'}
How can I use dictionary d1 as the rules to decode d2, such as:
def decode(dict_rules, dict_script):
    # do something
    return dict_result
decode(d1,d2)
>> {1:2, 2:2, 1:1}

Of course it can be written much shorter, but here is a version that shows the principle:
result_list = list()
result_dict = dict()
for d2_key in d2.keys():
    d2_key_decoded = d1[d2_key]
    d2_value = d2[d2_key]
    d2_value_decoded = d1[d2_value]
    result_dict[d2_key_decoded] = d2_value_decoded
    # add a tuple to the result list
    result_list.append((d2_key_decoded, d2_value_decoded))
The result might be unexpected: the resulting dict would have entries with the same key, which is not possible, so key 1 is overwritten:
>>> # equals to :
>>> result_dict[1] = 2
>>> result_dict[2] = 2
>>> result_dict[1] = 1
>>> # Result : {1:1, 2:2}
>>> # therefore I added a list of Tuples as result :
>>> # [(1, 2), (2, 2), (1, 1)]
But as @Patrik Artner pointed out, that situation cannot arise, because the input dictionary cannot have duplicate keys in the first place!
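For completeness, a shorter sketch of decode using a dict comprehension. It assumes every key and value of the script appears in the rules. The literal for d2 here drops the repeated 'a' key, since Python keeps only the last value for a duplicated key anyway:

```python
def decode(dict_rules, dict_script):
    """Translate both keys and values of dict_script through dict_rules."""
    # Raises KeyError if a key or value is missing from dict_rules.
    return {dict_rules[k]: dict_rules[v] for k, v in dict_script.items()}

d1 = {'a': 1, 'b': 2}
d2 = {'a': 'a', 'b': 'b'}  # what {'a':'b', 'b':'b', 'a':'a'} evaluates to
print(decode(d1, d2))  # {1: 1, 2: 2}
```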

When utilizing a for loop what does each argument specify exactly?

I'm new to learning Python and have a clarifying question regarding for loops.
For instance:
dictionary_a = {"A": "Apple", "B": "Ball", "C": "Cat"}
dictionary_b = {"A": "Ant", "B": "Basket", "C": "Carrot"}
temp = ""
for k_a, v_a in dictionary_a.items():
    temp = dictionary_b[k_a]
    dictionary_b[k_a] = v_a
    dictionary_a[k_a] = temp
How exactly is k_a run through the interpreter? I understand v_a in dictionary_a.items() as simply iterating through the sequence in whatever collection.
But when for loops have the syntax for x, y in z I don't quite understand what values x takes with each iteration.
Hope I'm making some sense. Appreciate any help.

When iterating over dict.items(), each iteration yields a 2-tuple, so when you provide two variables in the for loop, the tuple's elements are assigned to them in order.
Here is another example to help you understand the mechanics:
coordinates = [(1, 2, 3), (4, 5, 6)]
for x, y, z in coordinates:
    print(x)
Edit: you can do even more complicated unpacking. For example, if you only want to collect the first and last items of a long list, you can proceed as follows:
long_list = 'This is a very long list to process'.split()
first_item, *_, last_item = long_list

In Python you can unpack an iterable into multiple variables in a single assignment.
Let's use this example:
>>> a, b = [1, 2]
>>> a
1
>>> b
2
The above behavior is what is happening when you loop over a dictionary with the dict.items() method.
Here is an example of what is happening:
>>> a = {"abc":123, "def":456}
>>> a.items()
dict_items([('abc', 123), ('def', 456)])
>>> for i in a.items():
...     i
...
('abc', 123)
('def', 456)
>>>
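Put together, for k, v in d.items() behaves like taking each tuple and unpacking it on the first line of the loop body. A sketch of the two equivalent forms, using one of the dictionaries from the question (the list names unpacked and by_hand are illustrative):

```python
dictionary_a = {"A": "Apple", "B": "Ball", "C": "Cat"}

# Form 1: unpack in the for statement itself.
unpacked = []
for k_a, v_a in dictionary_a.items():
    unpacked.append((k_a, v_a))

# Form 2: take the whole tuple, then unpack it by hand.
by_hand = []
for pair in dictionary_a.items():
    k_a, v_a = pair
    by_hand.append((k_a, v_a))

print(unpacked == by_hand)  # True
print(unpacked[0])          # ('A', 'Apple')
```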

Printing a list where values are in a specific distance from each other using Python

I have a list,
A = ['A','B','C','D','E','F','G','H']
If the user inputs x = 4, I need an output that shows every pair of values that are 4 positions apart.
Starting from 'A', after printing the values that are 4 positions apart, i.e. {'A', 'E'}, the code should go back and start from 'B' to print the values from there, i.e. {'B', 'F'}.
No value can be in more than one group.
Any help is appreciated, since I am very new to Python.
This is what I have done:
x = input("enter the number to divide with: ")
A = ['A','B','C','D','E','F','G','H']
print("Team A is divided by " +x+ " groups")
print("---------------------")
out = [A[i] for i in range(0, len(A), int(x))]
print(out)
My code prints only the following when the user inputs x = 4:
{'A', 'E'}
But I need it to look like the following
{'A', 'E'}
{'B', 'F'}
{'C', 'G'}
{'D', 'H'}
What am I doing wrong?

Use zip:
out = list(zip(A, A[x:]))
For example:
x = 4 # int(input("enter the number to divide with: "))
A = ['A','B','C','D','E','F','G','H']
print(f"Team A is divided by {x} groups")
print("---------------------")
out = list(zip(A, A[x:]))
print(out)
Outputs:
[('A', 'E'), ('B', 'F'), ('C', 'G'), ('D', 'H')]
If you want to keep the comprehension:
out = [(A[i], A[i+x]) for i in range(0, len(A)-x)]
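Note that the pairs from zip(A, A[x:]) overlap once the list has more than 2 * x items; with x = 2, for example, 'C' appears in both ('A', 'C') and ('C', 'E'). If the requirement that no value may be in more than one group has to hold for any x, one possible sketch uses a stride slice instead (the function name group_by_stride is hypothetical):

```python
def group_by_stride(items, step):
    """Split items into groups, each holding every step-th element."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    # items[i::step] takes items i, i+step, i+2*step, ... so groups never overlap.
    return [items[i::step] for i in range(min(step, len(items)))]

A = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
for group in group_by_stride(A, 4):
    print(group)
# ['A', 'E']
# ['B', 'F']
# ['C', 'G']
# ['D', 'H']
```

With a step of 2 the same function returns two groups of four, so the groups grow rather than overlap.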

You can find my answer below.
def goutham(alist):
    for passchar in range(0, len(alist) - 4):
        i = alist[passchar]
        j = alist[passchar + 4]
        print("{" + i + "," + j + "}")
alist = ['a','b','c','d','e','f','g','h']
goutham(alist)

How can i check if a string has some of the same characters in it in Python?

In my program, when a user inputs a word, it needs to be checked for letters that are the same.
For example, in string = "hello", hello has 2 'l's. How can I check for this in a Python program?

Use a Counter object to count characters, returning those that have counts over 1.
from collections import Counter
def get_duplicates(string):
    c = Counter(string)
    return [(k, v) for k, v in c.items() if v > 1]
In [482]: get_duplicates('hello')
Out[482]: [('l', 2)]
In [483]: get_duplicates('helloooo')
Out[483]: [('l', 2), ('o', 4)]

You can accomplish this with a defaultdict:
from collections import defaultdict
def get_dupl(some_string):
    # create the counter per call, so repeated calls do not accumulate counts
    d = defaultdict(int)
    # iterate over the characters in some_string
    for item in some_string:
        d[item] += 1
    # select all characters with count > 1
    return dict(filter(lambda x: x[1] > 1, d.items()))
print(get_dupl('hellooooo'))
which yields
{'l': 2, 'o': 5}
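If you only need a yes/no answer rather than the counts, a shorter sketch compares the string's length with the size of its character set (has_duplicates is an illustrative name):

```python
def has_duplicates(text):
    """Return True if any character occurs more than once in text."""
    # A set keeps one copy of each character, so a smaller set means repeats.
    return len(set(text)) < len(text)

print(has_duplicates("hello"))  # True
print(has_duplicates("world"))  # False
```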
