I am using split_at from more_itertools. I have a list of 70k items in which markers of the form "Record i" (Record 1, Record 2, ..., Record n) occur, and I need to split the list at each of these markers. How can I implement this in the following code?
Note that n is unknown.
Result = list(split_at(lst, lambda x: x == 'Record i'))
This is my input (subset)
[1] "Record 1"
[1] 1010286
[1] 7
[1] F
[1] "Record 2"
[1] 1000152
[1] 5
[1] M
This is my desired output (in a dataframe)
"Record 1", 1010286, 7, F
"Record 2", 1000152, 5, M
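One possible approach, as a rough sketch only: the equality test x == 'Record i' never matches, because the real items are 'Record 1', 'Record 2' and so on, so the predicate has to test the prefix instead. The helper name split_before_records is made up for this example, and it assumes the list already holds clean values (the [1] prefixes stripped). If I remember correctly, more_itertools also has split_before, which keeps the marker in each group, but the plain-Python version below avoids the dependency.

```python
def split_before_records(items, prefix="Record "):
    """Start a new row each time an item looks like 'Record <i>'.

    Hypothetical helper: assumes every marker is a string starting
    with `prefix`, and that the values after it belong to that record.
    """
    rows = []
    for item in items:
        if isinstance(item, str) and item.startswith(prefix):
            rows.append([item])      # marker opens a new row
        elif rows:
            rows[-1].append(item)    # value belongs to the current record
    return rows


lst = ["Record 1", 1010286, 7, "F", "Record 2", 1000152, 5, "M"]
rows = split_before_records(lst)
print(rows)
```

Each inner list is one row, so passing rows to pandas.DataFrame should give one record per line of the dataframe.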
What I have:
series = ['foo', 'bar', 'baz', 'foo', 'baz', 'foo' ]
column = [1, 2, -3, -4, 5, -6]
list = [column[function(x)].count() for x in series]
list:
foo = 3
bar = 1
baz = 2
Works fine, each instance in series is counted.
Want only positive number instances counted as well, so:
list = [column[function(x)].count() for x in series if (x := function(x)) >= 0]
list:
foo = 1
bar = 1
baz = 1
I discovered the walrus operator, but x in my case is a string. Is that perhaps the core problem?
I get a syntax error from the walrus portion of the code.
I need both total and positive-number counts. Creating, say, "total" and "positive totals" columns in the function seems clunky. Is there a way to do this with a list comprehension?
Thank you in advance for your assistance.
Since you tagged pandas:
pd.Series(column).gt(0).groupby(series).agg(['count', 'sum'])
Output:
     count  sum
bar      1    1
baz      2    1
foo      3    1
You are calling function(x) where x is the result of function(x) already. Try:
vals = [column[y].count() for x in series if (y := function(x)) >= 0]
Notes:
Use a different variable name than x so that it is less confusing. This is also the source of your syntax error: an assignment expression cannot rebind the comprehension's own iteration variable.
list is a type name; choose a different name for the list of values.
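To make the notes above concrete, here is a minimal sketch using only the standard library. It assumes series[i] and column[i] belong together, which is my reading of the question. The helper shifted is a hypothetical stand-in for the asker's function(), used only to show the walrus pattern with a separate name.

```python
from collections import Counter

series = ['foo', 'bar', 'baz', 'foo', 'baz', 'foo']
column = [1, 2, -3, -4, 5, -6]

# Total occurrences of each label.
total = Counter(series)

# Occurrences whose paired value is positive (assumes series[i] pairs with column[i]).
positive = Counter(name for name, value in zip(series, column) if value > 0)

# Both counts side by side: {label: (total, positive)}.
counts = {name: (total[name], positive[name]) for name in total}
print(counts)


def shifted(value):
    """Hypothetical stand-in for the asker's function()."""
    return value - 1


# Walrus pattern (Python 3.8+): bind the result to y, not to the loop variable x.
kept = [y for x in column if (y := shifted(x)) >= 0]
print(kept)
```

The counts match the totals and positive totals shown in the question, without adding extra columns.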
I'm trying to write Python code for a problem where I am given a list of single-character strings, for example ["A", "B", "B", "C"], and the output should be B. If more than one value is repeated with an equal number of repetitions, or the list has no elements, it should give "NONE" as output. My code works so far, but as the list grows it gives the wrong output. Please help me fix the code so that it takes a list of any size and gives the correct output.
lis = ["A","B","B","A"] #INPUT LIST
catch = []
final_catch=[]
for i in range(len(lis)):
    for j in range(i + 1, len(lis)):
        if lis[i] == lis[j]:
            catch.append(lis[i])
final_catch =list(set(catch))
print(final_catch)
if len(final_catch)>=2 or len(final_catch) == 0:
    print("NONE")
else:
    print(final_catch.pop())
For input ["A","B","B","A"], expected output: "NONE", actual output: "NONE"
For input ["A","A","A","A"], expected output: "A", actual output: "A"
For input ["A","A","B","B","A","B","A","B","B"], expected output: "B"
Try this,
>>> from collections import Counter
>>> l = ["A","A","B","B","A","B","A","B","B"]
>>> d = Counter(l)
>>> result = d.most_common() # result looks like [('B', 5), ('A', 4)]
Output:
>>> result[0][0] if result[0][1] >2 else 'None' # conditional if-else statement
'B'
Explanation:
Use Counter to count the occurrences of each element in the list.
Use .most_common() to get a list of (element, count) tuples. From the documentation: "Return a list of the n most common elements and their counts from the most common to the least."
result[0] selects the first tuple in the list.
result[0][0] is the first item of that tuple, i.e. the element.
result[0][1] is the second item of that tuple, i.e. its count.
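Note that the check result[0][1] > 2 above does not detect ties and fails on an empty list. A minimal sketch of a variant that covers both cases follows. The function name most_repeated is made up here, and it assumes a tie for the highest count (including a list where every element occurs once) should give "NONE".

```python
from collections import Counter


def most_repeated(items):
    """Return the single most frequent element, or "NONE".

    Assumption: "NONE" is returned for an empty list and whenever
    the two highest counts are equal (a tie for first place).
    """
    ranked = Counter(items).most_common(2)   # at most the top two (element, count) pairs
    if not ranked:
        return "NONE"                        # empty input
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return "NONE"                        # tie for the highest count
    return ranked[0][0]


print(most_repeated(["A", "B", "B", "C"]))
print(most_repeated(["A", "B", "B", "A"]))
print(most_repeated(["A", "A", "B", "B", "A", "B", "A", "B", "B"]))
```

Only the top two counts are needed to decide whether the winner is unique, so the list size does not affect correctness.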
a = [1, 2, 2]
value = 2
for i in a:
    if i == value:
        a.remove(i)
I want to delete both occurrences of 2 from the list, but only one of them is removed. When I debug it, I find the loop runs only 2 times, not 3 as I expected.
You can use a simple list comprehension for a 1-line solution:
In [2764]: a = [1, 2, 2]
In [2755]: value = 2
In [2768]: a = [i for i in a if i != value]
In [2769]: a
Out[2769]: [1]
You can write the above as:
ans = []
for i in a:
    if i != value:
        ans.append(i)
OR, you can also use filter to remove all occurrences:
Python-2
In [2778]: filter(lambda x: x!= value, a)
Out[2778]: [1]
Python-3
In [5]: list(filter(lambda x: x!= value, a))
Out[5]: [1]
You don't have to use a comparison to remove a certain value from the list.
Here is a little modification to your code:
a = [1, 2, 2]
value = 2
try:
    for i in a:
        a.remove(value)
except ValueError:
    pass
print(a)
Whenever the remove() method can't find the value you are looking for, it raises an error: ValueError: list.remove(x): x not in list. To handle this, surround the loop with try/except. That'll do the trick.
However, there are easier ways to use remove(). For example, you could use a while loop.
Look at this code:
a = [1, 2, 2]
value = 2
while value in a:
    a.remove(value)
print(a)
It's far easier.
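For anyone wondering why the loop in the question runs only twice: removing an element shifts the later ones left, while the loop's internal position still moves forward, so one element is never visited. A small sketch that records what the loop actually sees (both function names are made up for this illustration):

```python
def remove_while_iterating(items, value):
    """Reproduce the question's loop on a copy and record the visited items."""
    items = list(items)          # work on a copy so the caller's list is untouched
    visited = []
    for item in items:
        visited.append(item)
        if item == value:
            items.remove(item)   # shifts later elements left; the next one is skipped
    return items, visited


def remove_all(items, value):
    """Build a new list instead of mutating the one being iterated."""
    return [item for item in items if item != value]


leftover, visited = remove_while_iterating([1, 2, 2], 2)
print(leftover, visited)         # [1, 2] [1, 2]
print(remove_all([1, 2, 2], 2))  # [1]
```

The second 2 is never visited because it moved into the slot the loop had already passed.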
I am trying to build a list of zeros using a list comprehension, but I also want to put a 1 at an index I choose. For example, myList 5 2 = [0,1,0,0,0], where 5 is the number of elements and 2 is the index.
myList el index = [0 | n <- [1..el], if n == index then 1 else 0]
but this results in an error.
The smallest change that fixes that is
myList el index = [if n == index then 1 else 0 | n <- [1..el]]
Note that what's at the left of | is what generates the list elements. A list comprehension of the form [ 0 | ...] will only generate zeros, and the ... part only decides how long the resulting list is.
Further, in your code the compiler complains because at the right of | we allow only generators (e.g. n <- someList), conditions (e.g. x > 23), or new definitions (let y = ...). In your code the if ... is interpreted to be a condition, and for that it should evaluate to a boolean, but then 1 makes the result a number, triggering a type error.
Another solution could be
myList el index = replicate (index-1) 0 ++ [1] ++ replicate (el-index) 0
where replicate m 0 generates a list with m zeros, and ++ concatenates.
Finally, note that your index is 1-based. In many programming languages, that's unconventional, since 0-based indexing is more frequently used.
I'd like to replace a character in a string with another character, by first sampling by the character. I'm having trouble having it print out the character instead of the index.
Example data, labelled "try":
L 0.970223325 - 0.019851117 X 0.007444169
K 0.962779156 - 0.027295285 Q 0.004962779
P 0.972704715 - 0.027295285 NA 0
C 0.970223325 - 0.027295285 L 0.00248139
V 0.970223325 - 0.027295285 T 0.00248139
I'm trying to sample a character for a given row using weighted probabilities.
samp <- function(row) {
sample(try[row,seq(1, length(try), 2)], 1, prob = try[row,seq(2, length(try), 2)])
}
Then, I want to use the selected character to replace a position in a given string.
subchar <- function(string, pos, new) {
paste(substr(string, 1, pos-1), new , substr(string, pos+1, nchar(string)), sep='')
}
My question is - if I do, for example
> subchar("KLMN", 3, samp(4))
[1] "KL1N"
But I want it to read "KLCN". as.character(samp(4)) doesn't work either. How do I get it to print out the character instead of the index?
The problem arises because your letters are stored as factors rather than characters, and samp is returning a data.frame.
C is the first level in your factor so that is stored as 1 internally, and as.character (which gets invoked by the paste statement) pulls this out when working on the mini-data.frame:
samp(4)
V1
4 C
as.character(samp(4))
[1] "1"
You can solve this in 2 ways, either dropping the data.frame of the samp output in your call to subchar, or modifying samp to do so:
subchar("KLMN", 3, samp(4)[,1])
[1] "KLCN"
samp2 <- function(row) {
  sample(try[row,seq(1, length(try), 2)], 1, prob = try[row,seq(2, length(try), 2)])[,1]
}
subchar("KLMN",3,samp2(4))
[1] "KLCN"
You may also find it easier to sample within your subsetting, and you can drop the data.frame from there:
samp3 <- function(row){
try[row,sample(seq(1,length(try),2),1,prob=try[row,seq(2,length(try),2)]),drop=TRUE]
}