Writing a line in a text file in a given sequence - python-3.x

I have found some code and I changed it for my purpose. The code goes to the given line number and adds a new line with a certain format. But it does not work if I have a sequence of numbers. I could not find out why.
This is my code:
import fileinput
x = [2, 4, 5, 6]
for line in fileinput.FileInput("1.txt", inplace=1):
    print(line, end="")
    for index, item in enumerate(x):
        if line.startswith("ND "+str(x[index]-1)):
            print("ND "+str(x[index])+" 0 0 0 0")
This is the input file "1.txt":
ND 1 12 11 8 9
ND 3 15 11 7 9
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
This is the result now :
ND 1 12 11 8 9
ND 2 0 0 0 0
ND 3 15 11 7 9
ND 4 0 0 0 0
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
What I need should be like this:
ND 1 12 11 8 9
ND 2 0 0 0 0
ND 3 15 11 7 9
ND 4 0 0 0 0
ND 5 0 0 0 0
ND 6 0 0 0 0
ND 7 8 9 2 3
ND 8 4 5 1 12
ND 9 2 3 6 10
Can you please give me a hint! How should I change my code?

Would something like this work for you? (You don't need to maintain the list of missing lines in x.) It's not the most elegant code, but it can be improved later if it works for you.
import fileinput
n = 1
for line in fileinput.FileInput("1.txt", inplace=1):
    # emit filler rows until n catches up with the number on the current line
    while not line.startswith("ND %d" % n):
        print("ND %d 0 0 0 0" % n)
        n+=1
    # the line already ends with a newline, so suppress print's own
    print(line, end="")
    n+=1

You are missing ND 4 and ND 5 lines in your 1.txt. That's why you cannot print the ND 5 0 0 0 0 and ND 6 0 0 0 0.
You can use a regex to extract the line number from each line and compare:
import fileinput
import re
x = [2, 4, 5, 6]
last = 0
for line in fileinput.FileInput("1.txt", inplace=1):
    # using regex to extract the "current" line number from ND...
    current = int(re.search(r'\d+', line).group())
    for index, item in enumerate(x):
        # "=" because there's a case that your given line already exists
        if item > last and item <= current:
            print("ND "+str(item)+" 0 0 0 0")
    last = current
    print(line, end="")

As I said in the comment, the while loop needs a break at the maximum ND number in the file, which in this case is n == 10.
import fileinput
n = 1
for line in fileinput.FileInput("1.txt", inplace=1):
    while not line.startswith("ND %d" % n):
        print("ND %d 0 0 0 0" % n)
        n+=1
        # stop filling once the largest ND number has been passed
        if n == 10:
            break
    print(line, end="")
    n+=1
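
The answers above all rewrite the file in place, which makes them awkward to try out. As a rough sketch only, the same gap-filling idea can be written as a function over a list of lines. The name fill_missing_nd and the filler parameter are made up for this illustration, and it assumes every data line starts with "ND " followed by an integer and that the numbers are ascending.

```python
import re


def fill_missing_nd(lines, filler="0 0 0 0"):
    """Return a new list of lines with a filler row for every skipped ND number.

    Hypothetical helper: assumes lines look like "ND <int> ..." in ascending order.
    """
    result = []
    expected = 1  # the next ND number we expect to see
    for line in lines:
        match = re.match(r"ND (\d+)\b", line)
        if match is None:
            # not an ND line (for example a blank line): keep it untouched
            result.append(line)
            continue
        current = int(match.group(1))
        # insert one filler row for each number between expected and current
        while expected < current:
            result.append("ND %d %s\n" % (expected, filler))
            expected += 1
        result.append(line)
        expected = current + 1
    return result


sample = [
    "ND 1 12 11 8 9\n",
    "ND 3 15 11 7 9\n",
    "ND 7 8 9 2 3\n",
    "ND 8 4 5 1 12\n",
    "ND 9 2 3 6 10\n",
]
print("".join(fill_missing_nd(sample)), end="")
```

Because the ND number is parsed and compared as an integer rather than matched with startswith, "ND 1" is not confused with "ND 12", and a blank line cannot cause an endless loop. To apply it to a file, read the lines with readlines() and write the returned list back.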

Related

How to list the number of words in a row with the most words?

I am trying to print the number of words in the longest line. I was able to print the number of words in each line, but I can't print the maximum. The max() function does not work. Can anyone help me?
import os
import sys
import numpy as np
with open('demofile.txt') as f:
    lines = f.readlines()
for index, value in enumerate(lines):
    number_of_words = len(value.split())
    print(number_of_words)
demofile.txt
<=4 1 2 3 4 5 6 7 8 9 10 11
<=4 1 2 3 4 5 6 7 8 9
<=4 1 2 3 4 5 6 7 8 9 10 11 sdad adada affg
<=4 1 2 3 4 5 6 7 8 9 10 11
Output:
12
10
15
12
0
0
0
0
0
0
0
0
0
0
0
I also don't understand why it lists a word count for the following lines, where there are no words.
If I understood correctly, the max() function doesn't work because you are taking the max of strings, so you need to convert them to ints (or floats).
lines = [int(x) for x in lines.split(" ")]  # converts to ints
maximum = max(lines)  # should work now
UPD:
Edited per the comment below.
Before:
int(x) for x in lines
Now:
int(x) for x in lines.split(" ")
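
For what it's worth, here is a minimal sketch of one way to get the maximum, assuming (as in the question) that lines is the list returned by readlines(). Since that is a list of strings, each line is split on its own. The function name max_words_per_line is invented for this example. The zeros in the question's output most likely come from empty lines at the end of the file, so blank lines are skipped here.

```python
def max_words_per_line(lines):
    """Return the largest word count over all non-blank lines (0 if there are none)."""
    # split() with no argument splits on any whitespace and drops the newline
    counts = [len(line.split()) for line in lines if line.strip()]
    return max(counts, default=0)


# same content as demofile.txt, plus two trailing blank lines
sample = [
    "<=4 1 2 3 4 5 6 7 8 9 10 11\n",
    "<=4 1 2 3 4 5 6 7 8 9\n",
    "<=4 1 2 3 4 5 6 7 8 9 10 11 sdad adada affg\n",
    "<=4 1 2 3 4 5 6 7 8 9 10 11\n",
    "\n",
    "\n",
]
print(max_words_per_line(sample))  # 15
```

With a real file, the list would come from open('demofile.txt') and f.readlines() as in the question.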

Tracing sorting algorithms

I am trying to trace the changes made by a selection sort algorithm in Python. Here is a piece of my code and what I've tried. The problem I am facing is printing the results in a table-like format.
l = [2,5,1,7,9,5,3,0,-1]
iterat = 1
print('Iteration' + '\t\t\t' + 'Results')
for i in range(1, len(l)):
    val_to_sort = l[i]
    while l[i-1] > val_to_sort and i > 0:
        l[i-1], l[i] = l[i], l[i-1]
        i -= 1
        print(iterat, '\t\t\t', l[0:iterat + 1],'|',l[iterat:])
    iterat += 1
From the code above, I obtain the following results:
But I am trying to obtain results like these:
Unindent print one level to the left, so it is inside the for block instead of the while block.
Use join and map to print the lists as a string.
You can use enumerate instead of manually incrementing iterat.
def format_list(l):
    return ' '.join(map(str, l))
l = [2,5,1,7,9,5,3,0,-1]
print('Iteration' + '\t\t\t' + 'Results')
for iterat, i in enumerate(range(1, len(l)), 1):
    val_to_sort = l[i]
    while l[i-1] > val_to_sort and i > 0:
        l[i-1], l[i] = l[i], l[i-1]
        i -= 1
    print(iterat, '\t\t\t', format_list(l[0:iterat + 1]),'|', format_list(l[iterat:]))
Outputs
Iteration Results
1 2 5 | 5 1 7 9 5 3 0 -1
2 1 2 5 | 5 7 9 5 3 0 -1
3 1 2 5 7 | 7 9 5 3 0 -1
4 1 2 5 7 9 | 9 5 3 0 -1
5 1 2 5 5 7 9 | 9 3 0 -1
6 1 2 3 5 5 7 9 | 9 0 -1
7 0 1 2 3 5 5 7 9 | 9 -1
8 -1 0 1 2 3 5 5 7 9 | 9
I can't help you with the Cyrillic text though ;)
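
If the tab-separated columns drift out of line, fixed-width formatting may give a steadier table. The following is only a sketch: the loop in the question swaps adjacent elements, which behaves like an insertion sort, and that is what is traced here. The names trace_insertion_sort and format_trace are invented for this example. Unlike the answer above, the two parts printed on each row do not overlap, which is an assumption about the desired layout.

```python
def trace_insertion_sort(values):
    """Sort a copy of values and return one (step, sorted_part, rest) row per pass."""
    items = list(values)
    rows = []
    for step in range(1, len(items)):
        j = step
        # checking j > 0 first keeps items[j - 1] from wrapping around to the end
        while j > 0 and items[j - 1] > items[j]:
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
        sorted_part = " ".join(map(str, items[:step + 1]))
        rest = " ".join(map(str, items[step + 1:]))
        rows.append((step, sorted_part, rest))
    return rows


def format_trace(rows):
    """Render the rows as a table, padding the sorted part to a common width."""
    width = max((len(sorted_part) for _, sorted_part, _ in rows), default=0)
    lines = ["{:<10}{}".format("Iteration", "Results")]
    for step, sorted_part, rest in rows:
        lines.append("{:<10}{:<{w}} | {}".format(step, sorted_part, rest, w=width))
    return "\n".join(lines)


print(format_trace(trace_insertion_sort([2, 5, 1, 7, 9, 5, 3, 0, -1])))
```

Keeping the tracing separate from the printing means the rows can be checked or reformatted without re-running the sort.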

How to take mean of 3 values before flag change 0 to 1 in Python

I have a dataframe with columns A, B and flag. I want to calculate the mean of the 2 values before the flag changes from 0 to 1, record the value when the flag changes from 0 to 1, and record the value when the flag changes from 1 to 0.
import pandas as pd
# Input dataframe
df = pd.DataFrame({'A':[1,3,4,7,8,11,1,15,20,15,16,87],
                   'B':[1,3,4,6,8,11,1,19,20,15,16,87],
                   'flag':[0,0,0,0,1,1,1,0,0,0,0,0]})
# Expected output
df_out = pd.DataFrame({'A_mean_before_flag_change':[5.5],
                       'B_mean_before_flag_change':[5],
                       'A_value_before_change_flag':[7],
                       'B_value_before_change_flag':[6]})
I tried to create a more general solution:
df=pd.DataFrame({'A':[1,3,4,7,8,11,1,15,20,15,16,87],
                 'B':[1,3,4,6,8,11,1,19,20,15,16,87],
                 'flag':[0,0,0,0,1,1,1,0,0,1,0,1]})
print (df)
     A   B  flag
0    1   1     0
1    3   3     0
2    4   4     0
3    7   6     0
4    8   8     1
5   11  11     1
6    1   1     1
7   15  19     0
8   20  20     0
9   15  15     1
10  16  16     0
11  87  87     1
First, create groups from a mask that marks each 0 in flag that is followed by a 1:
m1 = df['flag'].eq(0) & df['flag'].shift(-1).eq(1)
df['g'] = m1.iloc[::-1].cumsum()
print (df)
     A   B  flag  g
0    1   1     0  3
1    3   3     0  3
2    4   4     0  3
3    7   6     0  3
4    8   8     1  2
5   11  11     1  2
6    1   1     1  2
7   15  19     0  2
8   20  20     0  2
9   15  15     1  1
10  16  16     0  1
11  87  87     1  0
Then filter out groups with fewer than N rows:
N = 4
df1 = df[df['g'].map(df['g'].value_counts()).ge(N)].copy()
print (df1)
    A   B  flag  g
0   1   1     0  3
1   3   3     0  3
2   4   4     0  3
3   7   6     0  3
4   8   8     1  2
5  11  11     1  2
6   1   1     1  2
7  15  19     0  2
8  20  20     0  2
Keep the last N rows of each group:
df2 = df1.groupby('g').tail(N)
And aggregate with mean and last:
d = {'mean':'_mean_before_flag_change', 'last': '_value_before_change_flag'}
# select the columns with a list; newer pandas versions reject the tuple form ['A','B']
df3 = df2.groupby('g')[['A','B']].agg(['mean','last']).sort_index(axis=1, level=1).rename(columns=d)
df3.columns = df3.columns.map(''.join)
print (df3)
   A_value_before_change_flag  B_value_before_change_flag  \
g
2                          20                          20
3                           7                           6

   A_mean_before_flag_change  B_mean_before_flag_change
g
2                      11.75                      12.75
3                       3.75                       3.50
I'm assuming that this needs to work for cases with more than one rising edge and that the consecutive values and averages get appended to the output lists:
# the first step is to extract the rising and falling edges using diff(), identify sections and length
df['flag_diff'] = df.flag.diff().fillna(0)
df['flag_sections'] = (df.flag_diff != 0).cumsum()
df['flag_sum'] = df.flag.groupby(df.flag_sections).transform('sum')
# then you can get the relevant indices by checking for the rising edges
rising_edges = df.index[df.flag_diff==1.0]
val_indices = [i-1 for i in rising_edges]
avg_indices = [(i-2,i-1) for i in rising_edges]
# and finally iterate over the relevant sections
df_out = pd.DataFrame()
df_out['A_mean_before_flag_change'] = [df.A.loc[tpl[0]:tpl[1]].mean() for tpl in avg_indices]
df_out['B_mean_before_flag_change'] = [df.B.loc[tpl[0]:tpl[1]].mean() for tpl in avg_indices]
df_out['A_value_before_change_flag'] = [df.A.loc[idx] for idx in val_indices]
df_out['B_value_before_change_flag'] = [df.B.loc[idx] for idx in val_indices]
df_out['length'] = [df.flag_sum.loc[idx] for idx in rising_edges]
df_out.index = rising_edges
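
For comparison, here is a compact sketch of the same idea that targets the expected output in the question directly: for every 0 to 1 transition of flag, take the mean of the n rows before it and the last value before it. The function name summarize_rising_edges and the parameter n are made up for this example, and it assumes the columns are called A, B and flag as in the question.

```python
import pandas as pd


def summarize_rising_edges(df, n=2):
    """Return one row per 0 -> 1 change of df['flag'] (hypothetical helper)."""
    flag = df["flag"]
    # True on rows where flag is 1 and the previous row's flag was 0
    rising = flag.eq(1) & flag.shift(1).eq(0)
    rows = []
    for pos in [i for i, is_rising in enumerate(rising) if is_rising]:
        # up to n rows immediately before the change, selected by position
        window = df.iloc[max(pos - n, 0):pos]
        rows.append({
            "A_mean_before_flag_change": window["A"].mean(),
            "B_mean_before_flag_change": window["B"].mean(),
            "A_value_before_change_flag": window["A"].iloc[-1],
            "B_value_before_change_flag": window["B"].iloc[-1],
        })
    return pd.DataFrame(rows)


example = pd.DataFrame({"A": [1, 3, 4, 7, 8, 11, 1, 15, 20, 15, 16, 87],
                        "B": [1, 3, 4, 6, 8, 11, 1, 19, 20, 15, 16, 87],
                        "flag": [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]})
print(summarize_rising_edges(example))
```

On the question's input this gives a single row with means 5.5 and 5.0 and values 7 and 6. Values recorded at 1 to 0 changes are not covered by this sketch.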

How to recognize a repeating [1,X,X,X,1] pattern in a pandas Series

I have a boolean column in a csv file for example:
1 1
2 0
3 0
4 0
5 1
6 1
7 1
8 0
9 0
10 1
11 0
12 0
13 1
14 0
15 1
You can see here that 1 is repeating every 5 lines.
I want to recognize this repeating pattern [1,0,0,0] as soon as it has repeated more than 10 times, in Python (I have ~20,000 rows per file).
The pattern can start at any position.
How could I manage this in Python while avoiding if .....
import numpy as np
import pandas as pd
# Generate 20000 of 0s and 1s
data = pd.Series(np.random.randint(0, 2, 20000))
# Keep indices of 1s
idx = data[data > 0].index
# Check distance of current index with next index whether is 4 or not,
# Say if position 2 and position 6 is found as 1, so 6 - 2 = 4
found = []
for i, v in enumerate(idx):
    if i == len(idx) - 1:
        break
    next_value = idx[i + 1]
    if (next_value - v) == 4:
        found.append(v)
print(found)
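
The loop above finds single pairs of 1s that are 4 rows apart, but it does not count how often the spacing repeats. As a rough sketch of that missing step, the function below tracks, for every 1, how many 1s precede it at steps of exactly period positions, and reports where the first long-enough chain starts. The name first_periodic_run and both parameters are assumptions made for this example, and values in between are treated as "don't care", as in the [1,X,X,X,1] pattern from the title.

```python
def first_periodic_run(values, period=4, min_repeats=10):
    """Return the start index of the first chain of 1s spaced `period` apart.

    A chain counts once it holds at least `min_repeats` ones; None if no
    such chain exists. Hypothetical helper for illustration.
    """
    chain = [0] * len(values)  # chain[i]: length of the chain ending at index i
    for i, value in enumerate(values):
        if value != 1:
            continue
        previous = chain[i - period] if i >= period else 0
        chain[i] = previous + 1
        if chain[i] >= min_repeats:
            # walk back to the first 1 of the chain
            return i - period * (min_repeats - 1)
    return None


print(first_periodic_run([1, 0, 0, 0] * 10))  # 0
```

For a pandas Series, passing data.tolist() should work. The scan is a single pass, so ~20,000 rows per file is not a concern.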

Slicing a pandas dataframe

import pandas as pd
x = pd.DataFrame([[1,2,3],[4,5,6]])
x[::2]
What does the above command mean and how does it work?
It is easier to see with more data. The slice returns only the rows at even positions:
x = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9],[0,1,2]])
print (x)
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9
3  0  1  2
print (x[::2])
   0  1  2
0  1  2  3
2  7  8  9
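
The [::2] part is ordinary Python slice syntax, start:stop:step, with start and stop left empty and a step of 2. On a DataFrame it is applied to row positions. The same slice on a plain list of rows behaves analogously, which may make the mechanics easier to see. The variable names below are just for illustration.

```python
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 1, 2]]

# empty start and stop mean "from the beginning to the end"; step 2 skips every other row
every_second = rows[::2]
print(every_second)   # [[1, 2, 3], [7, 8, 9]]

# starting at position 1 selects the other half
the_others = rows[1::2]
print(the_others)     # [[4, 5, 6], [0, 1, 2]]
```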
