Processing of sensor data - statistics

I am working on a system with laser trip detectors (if something breaks the laser path, I get a 1 on the output of the laser receiver).
I have many of these trip detectors and I want to detect if one is malfunctioning, but I do not know how to go about doing this. The lasers should not trip all that often, maybe a few times a day.
A typical case would be that the laser gets tripped for 0.5-2 seconds, or there is brief intermittent tripping for a short period, possibly again shortly after that (within 2-10 seconds)...
Are there any good ways to check whether a sensor is malfunctioning using a sound statistical methodology?

You could just create a "profile" for each sensor that includes the mean/min/max of how often it is tripped, how long each trip lasts, how long the gap is between one trip and the next, and so on, using the data of some period of time like the last week/month or similar...
Then you can compare the current state of a sensor to its profile... when the deviation is "big enough" you can assume an exceptional situation, perhaps a malfunction... The hardest part is tuning the threshold for the deviation from the profile which, when exceeded, triggers for example the "malfunction handling"...
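As a rough illustration of that profile idea, here is a minimal Python sketch; the (start, duration) input format, the field names, and the 3-sigma threshold are my own assumptions, not anything fixed above:

    import statistics

    def build_profile(trips):
        """Build a per-sensor profile from historical trips.
        `trips` is a list of (start_seconds, duration_seconds) tuples, oldest first."""
        durations = [d for _, d in trips]
        gaps = [b[0] - a[0] for a, b in zip(trips, trips[1:])]  # time between consecutive trips
        span_days = max((trips[-1][0] - trips[0][0]) / 86400.0, 1e-9)
        return {
            "dur_mean": statistics.mean(durations),
            "dur_std": statistics.pstdev(durations),
            "gap_mean": statistics.mean(gaps) if gaps else None,
            "gap_std": statistics.pstdev(gaps) if gaps else None,
            "trips_per_day": len(trips) / span_days,
        }

    def looks_suspicious(profile, recent_trips, max_sigma=3.0):
        """Flag a sensor whose recent trip durations deviate 'big enough' from its profile."""
        for _, duration in recent_trips:
            if profile["dur_std"] > 0:
                sigma = abs(duration - profile["dur_mean"]) / profile["dur_std"]
                if sigma > max_sigma:
                    return True
        return False

The max_sigma value is the "big enough" threshold mentioned above; in practice you would tune it per sensor or per installation, and the same check can be applied to the gap statistics and the daily trip count.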

Pausing device without visual cues

I'm doing a competition that requires our device to pause and stop midway on a pole. The device carries an increasing load (chain). The midway point is marked with a slight score mark (which we don't believe will be that noticeable). The device is supposed to travel up on the outside of the pole without pausing, pause on the way down and again on the way up, not pause going down the second time and then stop midway going up again.
I don't have experience with programming sensors and motors so, while I have ideas about how to do this, I don't quite know how to actually program this. There's a cap on the top I can use to switch direction of travel but the end of the pipe is open. I'm thinking of using either height off the floor or the weight of the chain as a constant to mark position along the pole. I'm having trouble finding inexpensive sensors so I don't have specifics.
Right now, I'm toying with the idea of using the switch at the top as a sort of on/off for the pausing/not-pausing code. Alternatively, I can use the constant measurement (height/weight) to count the cycles and switch on the code that way. Is this an effective way of doing this? I'm looking at a Raspberry Pi to control everything.
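For what it's worth, here is a rough Python outline of the switch-counting idea on a Raspberry Pi using the RPi.GPIO library; the GPIO pin number, the debounce delay, the set of cycles that pause, and the motor functions are all placeholders I made up, and the midway pause itself would still need a timed delay or an actual position measurement:

    import time
    import RPi.GPIO as GPIO

    TOP_SWITCH_PIN = 17        # hypothetical BCM pin wired to the switch under the cap
    PAUSE_CYCLES = {1, 2}      # hypothetical: which top-of-pole hits enable the pausing code

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(TOP_SWITCH_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

    def reverse_motor():
        """Placeholder: tell the motor driver to change direction of travel."""
        pass

    def pause_midway():
        """Placeholder: stop partway along the pole (timed, or from a height/weight reading)."""
        pass

    cycle = 0
    try:
        while True:
            # Each press of the cap switch marks the top of the pole and one more cycle.
            GPIO.wait_for_edge(TOP_SWITCH_PIN, GPIO.FALLING)
            time.sleep(0.05)   # crude debounce
            cycle += 1
            reverse_motor()
            if cycle in PAUSE_CYCLES:
                pause_midway()
    finally:
        GPIO.cleanup()

Counting the cap-switch presses is what turns the pausing code on and off for particular legs of the run; the same counter could instead be driven by the height or weight measurement if that proves more reliable.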

EEGLAB preprocessing for sleep EEG data

I have 29-channel EEG data from overnight sleep recordings (~8 hours of EEG data). The EEG data are not continuous, as I paused the recording if the subject needed a loo break (~5-10 minutes) or to check the impedance (~2 minutes). This inserts a ‘boundary event’ in my EEG data. On some occasions, I have also reapplied a couple of electrodes if the impedance was high. In addition, sometimes I have had to stop the EEG recording and start again for technical reasons, because of which I have two EEG files for the same subject during the same night, i.e. the same session.
In my experiment, I have 15-20 events overnight and plan to use 1-2 minutes of data before each event for preprocessing and analyses. I would like to use ICA to correct for artifacts, but before I do that, I would like to know whether I need to split the data at each ‘boundary event’ and process them as separate EEG files, or whether I can just treat the EEG recording from each subject as a single recording and perform ICA (I can append the EEG files in cases where I have two EEG files per subject in the same session). Any suggestions will be highly appreciated as I am very new to EEG and MATLAB.
You can find a detailed comment on your issue here
https://sccn.ucsd.edu/pipermail/eeglablist/2014/007380.html
Changing the impedance of the electrodes introduces non-stationarity into the signal, which could affect ICA. However, ICA seems robust to this kind of violation, so you could concatenate all the sessions and run the ICA.
A bigger issue could be if the electrodes moved during the breaks (ICA is a spatial filter, so spatial relationships are very important; see also here https://sccn.ucsd.edu/pipermail/eeglablist/2016/011878.html).
As a personal comment, I would use an "empirical" approach to get a better grasp of the data and of the problem. In EEGLAB you could easily write a script and perform two parallel analyses, keeping everything equal except the ICA decomposition (one on the concatenated data, one on the data separated by session). Then you can compare the differences in the data at the last step. In this way you can appreciate how the different strategies affect your results, and whether the choice is really relevant.
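If it helps to see the shape of that two-branch comparison as a script, here is a rough equivalent in Python with MNE-Python rather than EEGLAB (the file names and ICA parameters are made up; the same idea applies to an EEGLAB script):

    import mne
    from mne.preprocessing import ICA

    # Hypothetical files: two recordings from the same subject and session.
    raw_a = mne.io.read_raw_fif("subject01_part1_raw.fif", preload=True)
    raw_b = mne.io.read_raw_fif("subject01_part2_raw.fif", preload=True)

    # Branch 1: concatenate the recordings and run a single ICA decomposition.
    raw_concat = mne.concatenate_raws([raw_a.copy(), raw_b.copy()])
    ica_concat = ICA(n_components=20, random_state=97)
    ica_concat.fit(raw_concat)

    # Branch 2: run ICA separately on each recording, keeping everything else identical.
    ica_a = ICA(n_components=20, random_state=97)
    ica_a.fit(raw_a)
    ica_b = ICA(n_components=20, random_state=97)
    ica_b.fit(raw_b)

    # Compare the component maps (and the cleaned data) from the two branches
    # at the last step to see whether the choice really changes your results.
    ica_concat.plot_components()
    ica_a.plot_components()
    ica_b.plot_components()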

What do the ALSA timestamping functions return and how do the results relate to each other?

There are several "hi-res" timestamping functions in ALSA:
snd_pcm_status_get_trigger_htstamp
snd_pcm_status_get_audio_htstamp
snd_pcm_status_get_driver_htstamp
snd_pcm_status_get_htstamp
I would like to understand what points in time the values returned by these functions represent.
My current understanding is that trigger_htstamp represents the time when the stream was started/stopped/paused. snd_pcm_status_get_trigger_htstamp returns a constant value, and when I add audio_htstamp to that value the result is very close to the current system time.
audio_htstamp seems to start from zero on my system and is incremented by a value equal to the period size I use. Hence on my system it is a simple frame counter. If I understand ALSA correctly, audio_htstamp can also work in a different, more accurate way depending on the system's capabilities.
driver_htstamp is, I guess from the name, a timestamp generated by the audio driver.
Question 1: When is the timestamp driver_htstamp usually generated?
With htstamp I am really unsure where and when it is generated. I have a hunch that it may be related to DMA.
Question 2: Where is htstamp generated?
Question 3: When is htstamp generated?
Question 4: Is the assumption audio_htstamp < htstamp < driver_htstamp generally correct?
It seems to hold with a little test program I wrote, but I want to verify my assumption.
I cannot find this information in the ALSA documentation.
I just dug through the code for this stuff for my own purposes, so I figured I would share what I found.
The purpose of these timestamps is to allow you to determine subtle differences in the rate of different clocks; most importantly in this case the main system clock that Linux uses for general timekeeping compared with the different clock that determines the rate at which samples move in and out of the sound device. This can be very important for applications that need to keep audio from different hardware devices in sync, since the rates of different physical clocks are never exactly the same.
The technique used is sometimes called "cross-timestamping"; you capture timestamps from the clocks you want to compare as close to simultaneously as possible, and repeat this at regular intervals. There is usually some measurement error introduced, but some relatively simple filtering can get you a good characterization of the difference in the rate at which the clocks count.
The core PCM driver arranges to take a system clock timestamp as closely as possible to when an audio stream starts, and then it does a cross-timestamp between the system clock and audio clock (which can be measured in different ways) whenever it is asked to check the state of the hardware pointers for the DMA engine that moves samples around.
The default method of measuring the audio clock is via DMA hardware pointer comparison. This isn't terribly precise, but over longer periods of time you can still get a good measure of the rate difference. At the start of snd_pcm_update_hw_ptr0, a system timestamp is captured; this will end up being htstamp. The DMA pointers are then checked, and if it's determined that they've moved since the last check, audio_htstamp is calculated based on the number of frames DMA has copied and the nominal frequency of the audio clock. Then, once all the DMA pointer updates are done and right before snd_pcm_update_hw_ptr0 returns, another system timestamp is captured in driver_htstamp. This isn't meant to be used when you're using the DMA hw_ptr method of calculating the audio_htstamp, though.
If you happen to have an audio device using the HDAudio driver, you can use an alternate and much more precise method of measuring the audio clock. It supplies an extra operation callback called get_time_info that is used instead of the default method of capturing the system and audio timestamps. In the HDAudio case, it takes a system timestamp for htstamp as close as possible to when it reads an internal counter driven by the same clock source as the audio clock; this forms the audio_htstamp. Afterwards, the same DMA hw_ptr bookkeeping is done, but the code that translates the pointer movement into time is skipped. The driver_htstamp is still taken right before the routine ends, though; this is "to let apps detect if the reference tstamp read by low-level hardware was provided with a delay", as the comment says in the code. This is because there's no guarantee that the get_time_info callback is going to take a new system timestamp; it may have previously recorded an audio timestamp along with a system timestamp as part of an interrupt handler. In this case, the timestamps you get might not match the available frames and delay frames counts calculated by hw_ptr bookkeeping, but the driver_htstamp will let you know the closest system time to when those calculations were made.
In both cases, the code is designed to capture htstamp and audio_htstamp as closely together as possible, and for htstamp - trigger_htstamp to represent the amount of system time that passed during the period measured by audio_htstamp of the audio clock. You mostly shouldn't need to use driver_htstamp, but I guess it might be used with the USB Audio driver, as I think it and HDAudio are the only ones that do anything special with these interfaces right now.
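To put a number on the clock difference from these values, a simple least-squares slope over repeated cross-timestamps is enough; here is a small Python sketch where the (system, audio) pairs are made up and would in practice come from htstamp - trigger_htstamp and audio_htstamp, both converted to seconds (reading the timestamps themselves is done through the C functions listed in the question):

    def estimate_rate_ratio(samples):
        """Least-squares slope of audio-clock time vs system-clock time.

        `samples` is a list of (system_elapsed_s, audio_elapsed_s) pairs.
        A result of 1.0 means the clocks agree; 1.00005 means the audio
        clock runs about 50 ppm fast relative to the system clock."""
        n = len(samples)
        sum_x = sum(x for x, _ in samples)
        sum_y = sum(y for _, y in samples)
        sum_xx = sum(x * x for x, _ in samples)
        sum_xy = sum(x * y for x, y in samples)
        return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)

    # Made-up cross-timestamps: the audio clock here runs ~50 ppm fast.
    pairs = [(1.0, 1.00005), (2.0, 2.0001), (3.0, 3.00015), (4.0, 4.0002)]
    print(estimate_rate_ratio(pairs))   # ~1.00005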
The documentation for this, although it doesn't contain all the details you might want to know, is part of the kernel documentation: http://lxr.free-electrons.com/source/Documentation/sound/alsa/timestamping.txt?v=4.9

Explain the difference in how MRTG measures incoming data

Everyone knows that MRTG needs at least one value to be passed on its input.
In its per-target options, MRTG has 'gauge', 'absolute', and a default (no option) behaviour for what to do with incoming data, or how to count it.
Let's look at an elementary yet popular example:
We pass cumulative data from network interface statistics, i.e. 'how many packets were received by the interface'.
We take it from '/proc/net/dev' or look at the 'ifconfig' output for a certain network interface. The number of received bytes increases every time we read it. It's cumulative.
So, as I imagine it, there could be two types of possible statistics:
1. How fast this value changes over the time interval. In other words, activity.
2. A simple, as-is growing graph that just plots every new value each minute (or any other time interval).
The first graph will be saltatory (activity). The second will just keep growing.
I have read rrdtool's and MRTG's docs twice and can't understand which of the options mentioned above counts what.
I suppose (I am not sure) that 'gauge' plots values as-is, without any differentiation calculation (good for measuring how much memory or CPU is used every 5 minutes), and that the default or 'absolute' behaviour tries to calculate the rate between neighbouring measurements, but what's the difference between the last two?
Can you explain in a simple manner which behaviour stands behind each of the three possible options?
Thanks in advance.
MRTG assumes that everything is being measured as a rate (even if it isn't a rate).
Type 'gauge' assumes that you have already calculated the rate; thus, the provided value is stored as-is (after Data Normalisation). This is appropriate for things like CPU usage.
Type 'absolute' assumes the value passed is the count since the last update. Thus, the value is divided by the number of seconds since the last update to get a rate in thingies per second. This is rarely used, and only for certain unusual data sources that reset their value on being read - eg, a script that counts the number of lines in a log file, then truncates the log file.
Type 'counter' (the default) assumes the value passed is a constantly growing count, possibly one that wraps around at 32 or 64 bits. The difference between the value and its previous value is divided by the number of seconds since the last update to get a rate in thingies per second. If it sees the value decrease, it will assume a counter wraparound at 32 or 64 bits. This is appropriate for something like network traffic counters, which is why it is the default behaviour (MRTG was originally written for network traffic graphs).
Type 'derive' is like 'counter', but will allow the counter to decrease (resulting in a negative rate). This is not possible directly in MRTG but you can manually create the necessary RRD if you want.
All types subsequently perform Data Normalisation to adjust the timestamp to a multiple of the Interval. This will be more noticeable for Gauge types where the value is small than for counter types where the value is large.
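To make the three rate behaviours concrete, here is a small Python sketch of the calculations described above; it leaves out Data Normalisation and assumes a 32-bit wrap in the counter case:

    def mrtg_rate(dtype, value, prev_value, seconds_since_last_update):
        """Return the rate (thingies per second) that would be stored for one data type."""
        if dtype == "gauge":
            # The value is already a rate: store it as-is.
            return value
        if dtype == "absolute":
            # The value is the count since the last read (the source resets itself on read).
            return value / seconds_since_last_update
        if dtype == "counter":
            # The value is an ever-growing count; a decrease is treated as a wraparound.
            delta = value - prev_value
            if delta < 0:
                delta += 2 ** 32          # assume a 32-bit counter wrapped
            return delta / seconds_since_last_update
        if dtype == "derive":
            # Like counter, but a decrease is allowed and simply gives a negative rate.
            return (value - prev_value) / seconds_since_last_update
        raise ValueError(dtype)

    # Example: an interface byte counter read 300 seconds apart.
    print(mrtg_rate("counter", 1_500_000, 1_200_000, 300))   # 1000.0 bytes per second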
For information on this, see Alex van der Bogaerdt's excellent tutorial

Digital delay decay

I am developing a digital delay on a microcontroller and I am stuck with the delay decay. The delay is implemented with a comb filter.
Here it is: http://www.tonmeister.ca/main/textbook/intro_to_sound_recording837x.png
The delay line, "emulating the tape", is implemented as a circular buffer. The effect can be killed, and that case does not represent an issue; when turning the effect off, though, I still have the tail of the delay left in the buffer to process, as if the delay had been frozen and the tail slowly decays (depending on the feedback gain).
My question is: how many times do I have to recirculate samples through the buffer?
One way I thought of approaching this could be by modelling the physical process ... assuming that the input sequence has a level of 0 dB for its entire duration and that, after going through the delay line, it gets attenuated by a factor of 1/10. In terms of level this corresponds to a drop of 20 dB (as power = voltage^2) every time the sequence goes through the feedback path. The weakest audible sound has a level of −130 dB but, taking the ambient noise into consideration as well, −120 dB will be sufficient as the lowest reference power. Hence, after the echoes have been through the feedback path 6 times (120 dB / 20 dB), they will no longer be audible.
Is there a more efficient way?
Thank you!
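For what it's worth, the same calculation can be written as a general formula: with feedback gain g, each pass through the feedback path drops the level by 20*log10(g) dB, so the number of passes to reach a given noise floor is the floor in dB divided by that drop. A small Python sketch (the −120 dB floor is the one used above):

    import math

    def recirculations_until_inaudible(feedback_gain, floor_db=-120.0):
        """Passes through the feedback path before the tail drops below floor_db.

        Each pass multiplies the amplitude by feedback_gain, i.e. attenuates the
        level by 20*log10(feedback_gain) dB (power goes as voltage squared)."""
        drop_per_pass_db = 20.0 * math.log10(feedback_gain)   # a negative number
        return math.ceil(floor_db / drop_per_pass_db)

    print(recirculations_until_inaudible(0.1))   # 6 passes for a 1/10 feedback gain
    print(recirculations_until_inaudible(0.5))   # 20 passes for a 1/2 feedback gain

Multiplying that count by the delay length gives how long the tail has to keep being processed after the effect is switched off.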
