Split list into randomised ordered sub lists - python-3.x

I would like to improve the code below, which splits a list of values into two randomised, sorted sublists. It works, but I'm sure there is a better/cleaner way to do it.
import random
data = list(range(1, 61))
random.shuffle(data)
Intervention = data[:30]
Control = data[30:]
Intervention.sort()
Control.sort()
f = open('Randomised_Groups.txt', 'w')
f.write('Intervention Group = ' + str(Intervention) + '\n' + 'Control Group = ' + str(Control))
f.close()
The expected output is:
Intervention = [1,3,7,9]
Control = [2,4,5,6,8,10]

I think your code is short and clean already. Some changes you can make:
Call sorted() on the slice instead of sorting the list afterwards:
Intervention = sorted(data[:30])
You can also define both Intervention and Control on one line:
Intervention, Control = data[:30], data[30:]
I would replace the 30 with a variable:
half = len(data)//2
It is safer to open the file with a with statement. That closes the file automatically when the block ends.
with open('Randomised_Groups.txt', 'w') as f:
    ...
With the use of f-strings you can make the write statement shorter:
f.write(f'Intervention Group = {Intervention} \nControl Group = {Control}')
All combined:
import random
data = list(range(1, 61))
random.shuffle(data)
half = len(data)//2
Intervention, Control = sorted(data[:half]), sorted(data[half:])
with open('Randomised_Groups.txt', 'w') as f:
    f.write(f'Intervention Group = {Intervention}\nControl Group = {Control}')

Something like this might be what you want:
import random
my_rng = [random.randint(0,1) for i in range(60)]
Control = [i for i in range(60) if my_rng[i] == 0]
Intervention = [i for i in range(60) if my_rng[i] == 1]
print(Control)
The idea is to create 60 random 1s or 0s to use as indicators for which list to put each number in. This will only work if you do not need the two lists to be the same length. To get the same length would require changing how my_rng is created in this example.
I have tinkered a bit further and got the lists of the same length:
import random
my_rng = [0 for i in range(30)]
my_rng.extend([1 for i in range(30)])
random.shuffle(my_rng)
Control = [i for i in range(60) if my_rng[i] == 0]
Intervention = [i for i in range(60) if my_rng[i] == 1]
Here, instead of appending a random 1 or 0 to my_rng, I build a list of 30 zeros and 30 ones, shuffle it, and then continue as before.
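As a small variation of the same idea (my own sketch, not part of the answer above), random.sample can pick which indices go into one group, which gives the equal-length split in one step:
import random

# Pick 30 of the 60 positions at random for the Intervention group;
# the remaining positions form the Control group.
chosen = set(random.sample(range(60), 30))
Intervention = sorted(i for i in range(60) if i in chosen)
Control = sorted(i for i in range(60) if i not in chosen)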

Here is another, more dynamic solution that uses the built-in random functionality, creates only the lists it needs (no extra memory), and works with lists containing any type of object (provided the objects can be sorted):
import random
def convert_to_random_list(data, num_list):
    """
    Takes in the data as one large list and converts it into
    [num_list] random sorted lists.
    """
    result_lists = [list() for _ in range(num_list)]  # one empty list per requested sub-list
    for x in data:
        # Using randint we pick which list to insert into
        result_lists[random.randint(0, num_list - 1)].append(x)
    # You could use a list comprehension with sorted(...) here, but it would take a little extra memory.
    for _list in result_lists:
        _list.sort()
    return result_lists
Can be tested with:
data = list(range(1, 61))
random.shuffle(data)
temp = convert_to_random_list(data, 3)
print(temp)


Parallelize a list item append to dict using multiprocessing

I have a large list containing strings. I wish to create a dict from this list such that:
list = [str1, str2, str3, ....]
dict = {str1:len(str1), str2:len(str2), str3:len(str3),.....}
My go-to solution was a for loop, but it's taking too much time (my list contains almost 1M elements):
for i in list:
    d[i] = len(i)
I wish to use the multiprocessing module in Python in order to leverage all cores and reduce the time taken for the process to execute. I have come across some crude examples involving the Manager module to share a dict between different processes, but I am unable to implement it. Any help would be appreciated!
I don't know if using multiple processes will be faster, but it's an interesting experiment.
General flow:
Create list of random words
Split list into segments, one segment per process
Run processes, pass segment as parameter
Merge result dictionaries to single dictionary
Try this code:
import concurrent.futures
import random
from multiprocessing import Process, freeze_support

def todict(lst):
    print(f'Processing {len(lst)} words')
    return {e: len(e) for e in lst}  # convert list to dictionary

if __name__ == '__main__':
    freeze_support()  # needed for Windows

    # create random word list - max 15 chars
    letters = [chr(x) for x in range(65, 65+26)]  # A-Z
    words = [''.join(random.sample(letters, random.randint(1, 15))) for w in range(10000)]  # 10000 words
    words = list(set(words))  # remove dups, count will drop
    print(len(words))

    ########################

    cpucnt = 4  # process count to use

    # split word list for each process
    wl = len(words)//cpucnt + 1  # word count per process
    lstsplit = []
    for c in range(cpucnt):
        lstsplit.append(words[c*wl:(c+1)*wl])  # create word list for each process

    # start processes
    with concurrent.futures.ProcessPoolExecutor(max_workers=cpucnt) as executor:
        procs = [executor.submit(todict, lst) for lst in lstsplit]
        results = [p.result() for p in procs]  # block until results are gathered

    # merge results to single dictionary
    dd = {}
    for r in results:
        dd.update(r)
    print(len(dd))  # confirm match word count

    with open('dd.txt', 'w') as f: f.write(str(dd))  # write dictionary to text file
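As a side note, the same split-and-merge idea can be written more compactly by letting ProcessPoolExecutor.map do the chunking itself. This is only a sketch under the same assumptions (random words, 4 workers), not a benchmark:
import concurrent.futures
import random

def word_len(word):
    # helper must live at module level so it can be pickled for the worker processes
    return word, len(word)

if __name__ == '__main__':
    letters = [chr(x) for x in range(65, 65+26)]  # A-Z
    words = list({''.join(random.sample(letters, random.randint(1, 15))) for _ in range(10000)})
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        # chunksize batches the words so each worker gets large pieces instead of one word at a time
        dd = dict(executor.map(word_len, words, chunksize=len(words) // 4 + 1))
    print(len(dd))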

Is there any way to make this more efficient?

I have 24 more attempts to submit this task. I have spent hours on it and my brain does not work anymore. I am a beginner with Python; can you please help me figure out what is wrong? I would love to see the correct code if possible.
Here is the task itself and the code I wrote below.
Note that you can have access to all standard modules/packages/libraries of your language. But there is no access to additional libraries (numpy in python, boost in c++, etc).
You are given the content of a CSV file with information about a set of trades. It contains the following columns:
TIME - Timestamp of a trade in format Hour:Minute:Second.Millisecond
PRICE - Price of one share
SIZE - Count of shares executed in this trade
EXCHANGE - The exchange that executed this trade
For each exchange, find the one-minute window during which the largest number of trades took place on that exchange.
Note that:
You need to send the source code of your program.
You have only 25 attempts to submit a solution for this task.
You have access to all standard modules/packages/libraries of your language. But there is no access to additional libraries (numpy in python, boost in c++, etc).
Input format
The input contains several lines. You can read it from standard input or from the file “trades.csv”.
Each line contains information about one trade: TIME, PRICE, SIZE and EXCHANGE. Values are separated by commas.
Lines are listed in ascending order of timestamps. Several lines can contain the same timestamp.
Size of input file does not exceed 5 MB.
See the example below to understand the exact input format.
Output format
If the input contains information about k exchanges, print k lines to standard output.
Each line should contain a single number: the maximum number of trades during a one-minute window.
You should print answers for exchanges in lexicographical order of their names.
Sample
Input:
09:30:01.034,36.99,100,V
09:30:55.000,37.08,205,V
09:30:55.554,36.90,54,V
09:30:55.556,36.91,99,D
09:31:01.033,36.94,100,D
09:31:01.034,36.95,900,V
Output:
2
3
Notes
In the example, four trades were executed on exchange “V” and two trades were executed on exchange “D”. Not all of the “V” trades fit in one one-minute window, so the answer for “V” is three.
X = []
with open('trades.csv', 'r') as tr:
    for line in tr:
        line = line.strip('\xef\xbb\xbf\r\n ')
        X.append(line.split(','))
dex = {}
for item in X:
    dex[item[3]] = []
for item in X:
    dex[item[3]].append(float(item[0][:2])*60. + float(item[0][3:5]) + float(item[0][6:8])/60. + float(item[0][9:])/60000.)
for item in dex:
    count = 1
    ccount = 1
    if dex[item][len(dex[item])-1] - dex[item][0] < 1:
        count = len(dex[item])
    else:
        for t in range(len(dex[item])-1):
            for tt in range(len(dex[item])-t-1):
                if dex[item][tt+t+1] - dex[item][t] < 1:
                    ccount += 1
                else:
                    break
            if ccount > count:
                count = ccount
            ccount = 1
    print(count)
First of all, it is not necessary to use the datetime and csv modules for such a simple case (as in Ed-Ward's example).
If we remove the colon and dot signs from the time strings, they can be converted with int() directly, which is easier than the conversion you tried in your example.
CSV features like dialects and special formatting are not used here, so I suggest a simple split(",").
Now about efficiency. Efficiency here means time complexity.
The more times you go through your array of dates from beginning to end, the more complex the algorithm becomes.
So the goal is to minimize the number of loops: ideally make only one pass over all rows, and in particular avoid nested loops that scan collections from beginning to end.
For such a task it is better to use a deque instead of a tuple or list, because you can pop the first element and append a new last element with O(1) complexity.
Just append each trade's timestamp to the end of the corresponding exchange's queue; whenever the difference between the current and the first element becomes more than one minute, remove the first element with popleft() and continue. Since at most one element is removed per append, a queue never shrinks, so after the whole file is done the length of each queue is the size of that exchange's largest one-minute window.
Example with linear time complexity O(n):
from collections import deque

ex_list = {}
s = open("trades.csv").read().replace(":", "").replace(".", "")
for line in s.splitlines():
    s = line.split(",")
    curr_tm = int(s[0])
    curr_ex = s[3]
    if curr_ex not in ex_list:
        ex_list[curr_ex] = deque()
    ex_list[curr_ex].append(curr_tm)
    if curr_tm >= ex_list[curr_ex][0] + 100000:
        ex_list[curr_ex].popleft()
print("\n".join([str(len(ex_list[k])) for k in sorted(ex_list.keys())]))
This code should work:
import csv
import datetime

diff = datetime.timedelta(minutes=1)

def date_calc(start, dates):
    for i, date in enumerate(dates):
        if date >= start + diff:
            return i
    return i + 1

exchanges = {}

with open("trades.csv") as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        this_exchange = row[3]
        if this_exchange not in exchanges:
            exchanges[this_exchange] = []
        time = datetime.datetime.strptime(row[0], "%H:%M:%S.%f")
        exchanges[this_exchange].append(time)

ex_max = {}

for name, dates in exchanges.items():
    ex_max[name] = 0
    for i, d in enumerate(dates):
        x = date_calc(d, dates[i:])
        if x > ex_max[name]:
            ex_max[name] = x

print('\n'.join([str(ex_max[k]) for k in sorted(ex_max.keys())]))
Output:
2
3
( obviously please check it for yourself before uploading it :) )
I think the issue with your current code is that you don't put the output in lexicographical order of their names...
If you want to use your current code, then here is a (hopefully) fixed version:
X = []
with open('trades.csv', 'r') as tr:
    for line in tr:
        line = line.strip('\xef\xbb\xbf\r\n ')
        X.append(line.split(','))
dex = {}
counts = []
for item in X:
    dex[item[3]] = []
for item in X:
    dex[item[3]].append(float(item[0][:2])*60. + float(item[0][3:5]) + float(item[0][6:8])/60. + float(item[0][9:])/60000.)
for item in dex:
    count = 1
    ccount = 1
    if dex[item][len(dex[item])-1] - dex[item][0] < 1:
        count = len(dex[item])
    else:
        for t in range(len(dex[item])-1):
            for tt in range(len(dex[item])-t-1):
                if dex[item][tt+t+1] - dex[item][t] < 1:
                    ccount += 1
                else:
                    break
            if ccount > count:
                count = ccount
            ccount = 1
    counts.append((item, count))
counts.sort(key=lambda x: x[0])
print('\n'.join([str(x[1]) for x in counts]))
Output:
2
3
I do think you can make your life easier in the future by using Python's standard library, though :)

How to concatenate data frames from two different dictionaries into a new data frame in python?

This is my sample code
dataset_current = dataset_seq['Motor_Current_Average']
dataset_consistency = dataset_seq['Consistency_Average']

# technique with non-overlapping the values (for current)
dataset_slide = dataset_current.tolist()
from window_slider import Slider
import numpy
list = numpy.array(dataset_slide)
bucket_size = 336
overlap_count = 0
slider = Slider(bucket_size, overlap_count)
slider.fit(list)
empty_dictionary = {}
count = 0
while True:
    count += 1
    window_data = slider.slide()
    empty_dictionary['df_current%s' % count] = window_data
    empty_dictionary['df_current%s' % count] = pd.DataFrame(empty_dictionary['df_current%s' % count])
    empty_dictionary['df_current%s' % count] = empty_dictionary['df_current%s' % count].rename(columns={0: 'Motor_Current_Average'})
    if slider.reached_end_of_list(): break
locals().update(empty_dictionary)

# technique with non-overlapping the values (for consistency)
dataset_slide_consistency = dataset_consistency.tolist()
list = numpy.array(dataset_slide_consistency)
slider_consistency = Slider(bucket_size, overlap_count)
slider_consistency.fit(list)
empty_dictionary_consistency = {}
count_consistency = 0
while True:
    count_consistency += 1
    window_data_consistency = slider_consistency.slide()
    empty_dictionary_consistency['df_consistency%s' % count_consistency] = window_data_consistency
    empty_dictionary_consistency['df_consistency%s' % count_consistency] = pd.DataFrame(empty_dictionary_consistency['df_consistency%s' % count_consistency])
    empty_dictionary_consistency['df_consistency%s' % count_consistency] = empty_dictionary_consistency['df_consistency%s' % count_consistency].rename(columns={0: 'Consistency_Average'})
    if slider_consistency.reached_end_of_list(): break
locals().update(empty_dictionary_consistency)

import pandas as pd
output_current = {}
increment = 0
while True:
    increment += 1
    output_current['dataframe%s' % increment] = pd.concat([empty_dictionary_consistency['df_consistency%s' % count_consistency], empty_dictionary['df_current%s' % count]], axis=1)
My question is: I have two dictionaries, "empty_dictionary_consistency" and "empty_dictionary", each containing 79 data frames. I want to create a new data frame for each pair, concatenating df1 from empty_dictionary_consistency with df1 from empty_dictionary, and so on up to df79 from each dictionary. I tried using a while loop to increment the index, but it does not show any output.
output_current = {}
increment = 0
while True:
    increment += 1
    output_current['dataframe%s' % increment] = pd.concat([empty_dictionary_consistency['df_consistency%s' % count_consistency], empty_dictionary['df_current%s' % count]], axis=1)
Can anyone help me with this? How can I do it?
I am not near my computer now, so I cannot test the code, but it seems that the problem is in the indices. In the last loop you increment a variable called 'increment' on every iteration, but you still index the dictionaries you want to concatenate with the counters left over from the previous loops. Try changing the variable you use to index all the dictionaries to 'increment'.
And one more thing: I can't see when this loop is supposed to finish.
UPD
I mean this:
length = len(empty_dictionary_consistency)
increment = 0
while increment < length:
    increment += 1
    output_current['dataframe%s' % increment] = pd.concat([empty_dictionary_consistency['df_consistency%s' % increment], empty_dictionary['df_current%s' % increment]], axis=1)
While iterating over your dictionaries you should use the variable you increment as the index into all three dictionaries. And since you no longer use a Slider object in this loop, you have to stop it yourself once the first dictionary is exhausted.

Python For Loop Slows Due To Large List

So currently I have a for loop which causes the Python program to die with the message 'Killed'. It slows down around 6000 items in, and the program finally dies at around 6852 list items. How do I fix this?
I assume it's due to the list being too large.
I've tried splitting the list in two around 6000. Maybe it's due to memory management or something. Help would be appreciated.
for id in listofids:
    connection = psycopg2.connect(user = "username", password = "password", host = "localhost", port = "5432", database = "darkwebscraper")
    cursor = connection.cursor()
    cursor.execute("select darkweb.site_id, darkweb.site_title, darkweb.sitetext from darkweb where darkweb.online='true' AND darkweb.site_id = %s", ([id]))
    print(len(listoftexts))
    try:
        row = cursor.fetchone()
    except:
        print("failed to fetch one")
    try:
        listoftexts.append(row[2])
        cursor.close()
        connection.close()
    except:
        print("failed to print")
You're right, it's probably because the list becomes large: Python lists are contiguous blocks of memory. Each time you append to the list, Python checks whether there is a free spot at the next position, and if not it relocates the whole array somewhere with enough room. The bigger your array, the more Python has to relocate.
One way around this is to create an array of the right size beforehand.
EDIT: Just to make sure it was clear, I made up an example to illustrate my point. I've made 2 functions. The first one appends the stringified index (to make it bigger) to a list at each iteration, and the other just fills a numpy array:
import numpy as np
import matplotlib.pyplot as plt
from time import time

def test_bigList(N):
    L = []
    times = np.zeros(N, dtype=np.float32)
    for i in range(N):
        t0 = time()
        L.append(str(i))
        times[i] = time() - t0
    return times

def test_bigList_numpy(N):
    L = np.empty(N, dtype="<U32")
    times = np.zeros(N, dtype=np.float32)
    for i in range(N):
        t0 = time()
        L[i] = str(i)
        times[i] = time() - t0
    return times

N = int(1e7)
res1 = test_bigList(N)
res2 = test_bigList_numpy(N)

plt.plot(res1, label="list")
plt.plot(res2, label="numpy array")
plt.xlabel("Iteration")
plt.ylabel("Running time")
plt.legend()
plt.title("Evolution of iteration time with the size of an array")
plt.show()
I get the following result (the plot itself is not reproduced here):
You can see in the figure that the list case regularly shows spikes (probably due to relocation), and they seem to increase with the size of the list. This example uses short appended strings, but the bigger the strings, the more pronounced the effect will be.
If it does not do the trick, then it might be linked to the database itself, but I can't help you without knowing the specifics of the database.
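One database-side thing I would rule out first (my suggestion, not something the answer above covers): the loop in the question opens a new psycopg2 connection for every id. Reusing a single connection and cursor removes that per-iteration overhead; a sketch using the query from the question:
import psycopg2

# one connection and cursor for the whole loop instead of one per id
connection = psycopg2.connect(user="username", password="password", host="localhost",
                              port="5432", database="darkwebscraper")
cursor = connection.cursor()
listoftexts = []
for id in listofids:
    cursor.execute("select darkweb.site_id, darkweb.site_title, darkweb.sitetext "
                   "from darkweb where darkweb.online='true' AND darkweb.site_id = %s", ([id]))
    row = cursor.fetchone()
    if row is not None:
        listoftexts.append(row[2])
cursor.close()
connection.close()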

Analysis of Eye-Tracking data in python (Eye-link)

I have data from eye-tracking (.edf file - from Eyelink by SR-research). I want to analyse it and get various measures such as fixation, saccade, duration, etc.
Is there an existing package to analyse Eye-Tracking data?
Thanks!
At least for importing the .edf-file into a pandas DF, you can use the following package by Niklas Wilming: https://github.com/nwilming/pyedfread/tree/master/pyedfread
This should already take care of saccades and fixations - have a look at the readme. Once they're in the data frame, you can apply whatever analysis you want to it.
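For instance, once the fixations are in a pandas DataFrame, the usual pandas operations apply. A toy sketch with made-up column names (not pyedfread's actual output format):
import pandas as pd

# toy fixation table: start/end in milliseconds, one row per fixation
fixations = pd.DataFrame({'trial': [1, 1, 2],
                          'start': [100, 400, 120],
                          'end':   [350, 900, 300]})
fixations['duration'] = fixations['end'] - fixations['start']
print(fixations.groupby('trial')['duration'].agg(['count', 'mean']))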
pyeparse seems to be another (though apparently unmaintained) library that can be used for EyeLink data analysis.
Here is a short excerpt from their example:
import numpy as np
import matplotlib.pyplot as plt
import pyeparse as pp
fname = '../pyeparse/tests/data/test_raw.edf'
raw = pp.read_raw(fname)
# visualize initial calibration
raw.plot_calibration(title='5-Point Calibration')
# create heatmap
raw.plot_heatmap(start=3., stop=60.)
EDIT: After I posted my answer I found a nice list compiling lots of potential tools for eyelink edf data analysis: https://github.com/davebraze/FDBeye/wiki/Researcher-Contributed-Eye-Tracking-Tools
Hey, the question seems rather old, but maybe I can reactivate it, because I am currently facing the same situation.
To start, I recommend converting your .edf file to an .asc file. That way it is easier to read and to get a first impression.
There are many tools for this, but I used the SR Research Eyelink Developers Kit (here).
I don't know your setup, but the Eyelink 1000 itself detects saccades and fixations. In my case the .asc file looks like this:
SFIX L 10350642
10350642 864.3 542.7 2317.0
...
...
10350962 863.2 540.4 2354.0
EFIX L 10350642 10350962 322 863.1 541.2 2339
SSACC L 10350964
10350964 863.4 539.8 2359.0
...
...
10351004 683.4 511.2 2363.0
ESACC L 10350964 10351004 42 863.4 539.8 683.4 511.2 5.79 221
The first number corresponds to the timestamp, the second and third to x-y coordinates and the last is your pupil diameter (what the last numbers after ESACC are, I don't know).
SFIX -> start fixation
EFIX -> end fixation
SSACC -> start saccade
ESACC -> end saccade
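Based on that format, a minimal sketch of my own (not an official parser) that pulls the fixation events out of such an .asc file could look like this:
def read_fixations(asc_path):
    """Collect eye, start, end and duration from every EFIX line of an EyeLink .asc file."""
    fixations = []
    with open(asc_path) as f:
        for line in f:
            parts = line.split()
            if parts and parts[0] == 'EFIX':
                # EFIX <eye> <start> <end> <duration> <x> <y> <pupil>
                fixations.append({'eye': parts[1],
                                  'start': int(parts[2]),
                                  'end': int(parts[3]),
                                  'duration': int(parts[4])})
    return fixations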
You can also check out PyGaze. I haven't worked with it, but while searching for a toolbox this one always popped up.
EDIT
I found this toolbox here. It looks cool and works fine with the example data, but sadly it does not work with mine.
EDIT No 2
Revisiting this question after working on my own eye-tracking data, I thought I might share a function I wrote to work with my data:
import csv
import numpy as np
import pandas as pd

def eyedata2pandasframe(directory):
    '''
    This function takes a path from which it tries to read in an ASCII file containing eyetracking data.
    It returns:
    eye_data: a pandas dataframe containing data from fixations AND saccades
    fix_data: a pandas dataframe containing only data from fixations
    sac_data: a pandas dataframe containing only data from saccades
    fix_timestamps: information about fixation onsets and offsets
    sac_timestamps: information about saccade onsets and offsets
    blink_timestamps: information about blink onsets and offsets
    trials: numpy array containing information about trial onsets
    sample_rate: the sampling rate read from the file header
    '''
    eye_data = []
    fix_data = []
    sac_data = []
    data_header = {0: 'TimeStamp', 1: 'X_Coord', 2: 'Y_Coord', 3: 'Diameter'}
    event_header = {0: 'Start', 1: 'End'}
    start_reading = False
    in_blink = False
    in_saccade = False
    fix_timestamps = []
    sac_timestamps = []
    blink_timestamps = []
    trials = []
    sample_rate_info = []
    sample_rate = 0
    # read the file and store the data depending on the messages
    # we have the following structure:
    # a header -- every line starts with a '**'
    # a bunch of messages containing information about calibration/validation and so on, all starting with 'MSG'
    # followed by:
    # START 10350638 LEFT SAMPLES EVENTS
    # PRESCALER 1
    # VPRESCALER 1
    # PUPIL AREA
    # EVENTS GAZE LEFT RATE 500.00 TRACKING CR FILTER 2
    # SAMPLES GAZE LEFT RATE 500.00 TRACKING CR FILTER 2
    # followed by the actual data:
    # normal data --> [TIMESTAMP]\t [X-Coords]\t [Y-Coords]\t [Diameter]
    # Start of EVENTS [BLINKS FIXATION SACCADES] --> S[EVENTNAME] [EYE] [TIMESTAMP]
    # End of EVENTS --> E[EVENT] [EYE] [TIMESTAMP_START]\t [TIMESTAMP_END]\t [TIME OF EVENT]\t [X-Coords start]\t [Y-Coords start]\t [X_Coords end]\t [Y-Coords end]\t [?]\t [?]
    # Trial messages --> MSG timestamp\t TRIAL [TRIALNUMBER]
    try:
        with open(directory) as f:
            csv_reader = csv.reader(f, delimiter='\t')
            for i, row in enumerate(csv_reader):
                if any('RATE' in item for item in row):
                    sample_rate_info = row
                if any('SYNCTIME' in item for item in row):  # only start reading after this message
                    start_reading = True
                elif any('SFIX' in item for item in row):
                    pass
                    # fix_timestamps[0].append(row)
                elif any('EFIX' in item for item in row):
                    fix_timestamps.append([row[0].split(' ')[4], row[1]])
                    # fix_timestamps[1].append(row)
                elif any('SSACC' in item for item in row):
                    # sac_timestamps[0].append(row)
                    in_saccade = True
                elif any('ESACC' in item for item in row):
                    sac_timestamps.append([row[0].split(' ')[3], row[1]])
                    in_saccade = False
                elif any('SBLINK' in item for item in row):  # stop reading here because the blinks contain NaN
                    # blink_timestamps[0].append(row)
                    in_blink = True
                elif any('EBLINK' in item for item in row):  # start reading again, the blink ended
                    blink_timestamps.append([row[0].split(' ')[2], row[1]])
                    in_blink = False
                elif any('TRIAL' in item for item in row):
                    # the first element is 'MSG', we don't need it; then we split the second element to separate the timestamp and keep it as an integer
                    trials.append(int(row[1].split(' ')[0]))
                elif start_reading and not in_blink:
                    eye_data.append(row)
                    if in_saccade:
                        sac_data.append(row)
                    else:
                        fix_data.append(row)
        # drop the last data point, because it is the 'END' message
        eye_data.pop(-1)
        sac_data.pop(-1)
        fix_data.pop(-1)
        # convert every item into a float, subtract the start of the first trial to set the start of the first video to t0=0,
        # then divide by 1000 to convert from milliseconds to seconds
        for row in eye_data:
            for i, item in enumerate(row):
                row[i] = float(item)
        for row in fix_data:
            for i, item in enumerate(row):
                row[i] = float(item)
        for row in sac_data:
            for i, item in enumerate(row):
                row[i] = float(item)
        for row in fix_timestamps:
            for i, item in enumerate(row):
                row[i] = (float(item) - trials[0]) / 1000
        for row in sac_timestamps:
            for i, item in enumerate(row):
                row[i] = (float(item) - trials[0]) / 1000
        for row in blink_timestamps:
            for i, item in enumerate(row):
                row[i] = (float(item) - trials[0]) / 1000
        sample_rate = float(sample_rate_info[4])
        # convert into pandas data frames for a better overview
        eye_data = pd.DataFrame(eye_data)
        fix_data = pd.DataFrame(fix_data)
        sac_data = pd.DataFrame(sac_data)
        fix_timestamps = pd.DataFrame(fix_timestamps)
        sac_timestamps = pd.DataFrame(sac_timestamps)
        trials = np.array(trials)
        blink_timestamps = pd.DataFrame(blink_timestamps)
        # rename headers for an even better overview
        eye_data = eye_data.rename(columns=data_header)
        fix_data = fix_data.rename(columns=data_header)
        sac_data = sac_data.rename(columns=data_header)
        fix_timestamps = fix_timestamps.rename(columns=event_header)
        sac_timestamps = sac_timestamps.rename(columns=event_header)
        blink_timestamps = blink_timestamps.rename(columns=event_header)
        # subtract the first timestamp of trials to set the start of the first video to t0=0
        eye_data.TimeStamp -= trials[0]
        fix_data.TimeStamp -= trials[0]
        sac_data.TimeStamp -= trials[0]
        trials -= trials[0]
        trials = trials / 1000  # does not work with trials /= 1000
        # divide TimeStamp to get time in seconds
        eye_data.TimeStamp /= 1000
        fix_data.TimeStamp /= 1000
        sac_data.TimeStamp /= 1000
        return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate
    except:
        print('Could not read ' + str(directory) + ' properly!!! Returned empty data')
        return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate
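For reference, calling the function could look like this (the file name is just a placeholder):
# unpack all eight return values; 'subject01.asc' is a hypothetical file name
eye_data, fix_data, sac_data, fix_ts, sac_ts, blink_ts, trials, sample_rate = eyedata2pandasframe('subject01.asc')
print(fix_data.head())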
Hope it helps you guys. You may need to change some parts of the code, like the indices used to split the strings to get the crucial information about event on-/offsets. Or maybe you don't want to convert your timestamps into seconds or set the onset of your first trial to 0. That is up to you.
Additionally, in my data we sent a message ('SYNCTIME') to know when we started measuring, and I had only ONE condition in my experiment, so there is only one 'TRIAL' message.
Cheers
