I wish to automate the visualisation of my activity monitoring process.
To do so, I import the starting time, ending time and name of each activity, and then distribute these over separate sheets.
Now I'd like to first build a graph manually that visualises the data in the desired way, and that's where I run into trouble.
Data and graph, nearly visualised as desired:
Getting to the desired result involves 2 more small steps. The first and most important one, which I'm having trouble with, is:
1. Visualising the difference between two plotted bars at the actual location/height of that difference, instead of two bars running up to the starting and ending point of an activity (practically inverting the area that is plotted with the area that is not plotted*). A sketch of this floating-bar idea follows after the footnote below.
2. Clustering the activities into 1 vertical bar consisting of several smaller bars floating at the right time, visualising the time of each activity. But that's something I'll look into after solving 1.
PS: if you are interested in the actual .xlsm for your own use, feel free to send me a PM.
*As shown, I already tried plotting just the actual duration of the activities against their respective starting times, but it only yielded 1 line of identical bars.
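A minimal matplotlib sketch of that floating-bar idea, with made-up activities rather than the actual workbook data: the bar's bottom is the start time and its height is the duration, so only the start-to-end span is drawn, and each day becomes one clustered column of smaller bars.

import matplotlib.pyplot as plt

# Hypothetical example data: (day, start hour, end hour, activity name)
activities = [
    ("Mon",  9.0, 10.5, "Email"),
    ("Mon", 13.0, 15.0, "Meeting"),
    ("Tue",  8.5, 12.0, "Project"),
    ("Tue", 14.0, 16.5, "Review"),
]

days = ["Mon", "Tue"]                 # one clustered column per day
fig, ax = plt.subplots()
for day, start, end, name in activities:
    x = days.index(day)
    # bottom = start time, height = duration: the bar "floats" at the
    # actual location of the difference between start and end
    ax.bar(x, end - start, bottom=start, width=0.6)
    ax.text(x, (start + end) / 2, name, ha="center", va="center")

ax.set_xticks(range(len(days)))
ax.set_xticklabels(days)
ax.set_ylabel("Hour of day")
plt.show()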
I am trying to get a variable-base column bar graph in Excel. Basically, with reference to the figure below, I have a number of periods. Each period can range from 1 to 5 months. So the various periods do not have the same length. Each period, however, has only one value associated with it that is representative of the entire period.
What I wish to achieve can be seen in the first figure (A), which also highlights two characteristics the graph must have: it consists of columns of varying width as well as height, and the final month of each period is placed in the center of its column rather than at the bottom right.
I have tried several ways to get this graph with Excel, but all I have been able to get is shown in the second figure (B). Basically, I had to create a second, auxiliary table by hand from the first, and then generate a normal column chart.
Clearly the result is different from what I wanted, although I came close.
Do you think it is possible, with Excel, to get exactly the chart I need without having to use a second table? Or, alternatively, do you think other programs, e.g. Holoviews (Python), could produce the expected result?
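For what it's worth, outside Excel this kind of chart is straightforward. A minimal matplotlib sketch with made-up periods (start month, length in months, value), drawing variable-width columns with the final month centered under each column:

import matplotlib.pyplot as plt

# Hypothetical periods: (start month, length in months, value)
periods = [(1, 3, 10), (4, 1, 25), (5, 5, 15), (10, 2, 30)]

fig, ax = plt.subplots()
centers, labels = [], []
for start, length, value in periods:
    # align="edge" makes the bar start at `start` and span `length` months
    ax.bar(start, value, width=length, align="edge", edgecolor="black")
    centers.append(start + length / 2)
    labels.append(f"M{start + length - 1}")   # final month of the period

ax.set_xticks(centers)        # label sits in the center of each column
ax.set_xticklabels(labels)
ax.set_ylabel("Value")
plt.show()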
I'm new to python and matplotlib. I need to plot a live graph from a CSV file which is being updated in real time. This is what I'm trying to do:
Just keep plotting as soon as a new value is written to the file. It's procedural: as soon as I write new data into the file, I read it again and plot. The number of readings may go higher than 1000, maybe even 10000 (1000 or 10000 lines in the CSV file, each line containing a unique x value and a corresponding y value). I'm plotting in a tkinter canvas. I need to plot the latest 50 values in the file, but also keep the previous values so that I can stop the graph, drag, and see previous values. Plotting itself is fine; I understand how to get it done. But how much RAM, time and other resources does this process take? How does it affect the performance of the application, and is there a better way to accomplish this? Note that after a while I'll have an array with maybe 10000 values in it, which I'll then have to plot.
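As a rough sketch of one way to do it (the file name, column layout and the 500 ms polling interval are all assumptions): poll the CSV with tkinter's after(), append only the rows added since the last read, and update the existing line instead of replotting from scratch. Memory is not the real concern here: 10,000 (x, y) float pairs is well under a megabyte of Python objects; the cost to watch is redrawing, which updating a single line keeps cheap.

import csv
import tkinter as tk
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg

CSV_PATH = "data.csv"          # hypothetical file being appended to elsewhere
xs, ys = [], []                # full history, kept so you can scroll back
rows_read = 0

root = tk.Tk()
fig = Figure(figsize=(6, 4))
ax = fig.add_subplot(111)
(line,) = ax.plot([], [])
canvas = FigureCanvasTkAgg(fig, master=root)
canvas.get_tk_widget().pack(fill="both", expand=True)

def poll():
    global rows_read
    with open(CSV_PATH, newline="") as f:
        rows = list(csv.reader(f))
    for x, y in rows[rows_read:]:          # only the newly appended rows
        xs.append(float(x))
        ys.append(float(y))
    rows_read = len(rows)
    if len(xs) >= 2:
        line.set_data(xs, ys)              # update the line, don't replot
        ax.set_xlim(xs[max(0, len(xs) - 50)], xs[-1])   # show latest 50
        ax.relim()
        ax.autoscale_view(scalex=False)    # rescale y only
        canvas.draw_idle()
    root.after(500, poll)                  # poll again in 500 ms

root.after(500, poll)
root.mainloop()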
I have a table of data similar to this:
I want to create a bar chart like this:
But I get this:
or, when I add major gridlines, I get this:
However, I want a quick way to visually differentiate between the different quarters (Q1... Q4), either by shading each with a different background color or by drawing a border around them.
I don't want to export the chart as an image and edit it because:
1. This is a weekly report, it would be very repetitive and error prone.
2. It would be time consuming when I need to do it for 100s of records.
3. My manager prefers the data and the chart to be sent as part of the report. Hence, changing it to other formats is not possible.
Is it possible to create such a graph using Excel 2010? If so, how? I don't mind writing a macro for this, but am currently lost on the approach.
If you want to do something like this:
It's a bit tricky, but can be done. First you must add a fourth data series so you have the data like this:
Then you have to put the three "real" data series on the secondary axis. You must set the maximum values of both the secondary and the primary axis to the same value (30 in my example). Next, you delete the secondary axis. And finally, in the fourth series' settings, you set it to overlap and set the gap width (the separation) to zero. Sorry, I don't know the exact English names of the settings, as my Excel is not in English.
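Since the question mentions being open to a macro, here is a hedged sketch of the same trick generated from Python with xlsxwriter (the sheet layout, series names, colours and the axis maximum of 30 are all assumptions): the background series sits on the primary axis with full overlap and zero gap, the three real series go on the secondary axis, both value axes get the same maximum, and the secondary axis is hidden.

import xlsxwriter

wb = xlsxwriter.Workbook("quarter_shading.xlsx")
ws = wb.add_worksheet()                    # default name: Sheet1

ws.write_row("A1", ["Week", "A", "B", "C", "Background"])
data = [
    # week, A,  B,  C, background (30 shades the quarter, 0 leaves it blank)
    ["W1",  5,  8, 12, 30], ["W2",  7,  9, 11, 30], ["W3",  6, 10, 13, 30],
    ["W4",  8, 11, 10,  0], ["W5",  9, 12,  9,  0], ["W6",  7, 13,  8,  0],
    ["W7", 10,  9, 14, 30], ["W8", 11,  8, 15, 30], ["W9", 12,  7, 13, 30],
]
for i, row in enumerate(data):
    ws.write_row(i + 1, 0, row)
last = len(data) + 1                       # last data row on the sheet

chart = wb.add_chart({"type": "column"})

# Background series on the primary axis: full-height bars, no gaps.
chart.add_series({
    "name":       "Background",
    "categories": f"=Sheet1!$A$2:$A${last}",
    "values":     f"=Sheet1!$E$2:$E${last}",
    "gap":        0,
    "overlap":    100,
    "fill":       {"color": "#DDEBF7"},
})

# The three "real" data series on the secondary axis.
for col, name in zip("BCD", "ABC"):
    chart.add_series({
        "name":       name,
        "categories": f"=Sheet1!$A$2:$A${last}",
        "values":     f"=Sheet1!${col}$2:${col}${last}",
        "y2_axis":    True,
    })

# Same maximum on both value axes, then hide the secondary one.
chart.set_y_axis({"min": 0, "max": 30})
chart.set_y2_axis({"min": 0, "max": 30, "visible": False})

ws.insert_chart("G2", chart)
wb.close()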
I'm trying to:
- determine where, in a set of measurement data, the data takes a dive,
- so I can plot a vertical line there, and
- plot a horizontal line in the graph.
I have no problem doing the 2nd and 3rd bullet points above on my own, so that's taken care of.
The problem I need help with is the first bullet point - determining WHERE the data takes a dive - WHERE the data crosses a threshold that basically says, "Whatever it is you're measuring is no longer performing as expected."
Here's what I'm doing:
I am taking measurements using a measuring device; the device logs the measurements in its internal memory and lets me download that measurement data to my computer as a CSV when the test session is complete.
I pull that CSV into an XLS and plot the data on a graph (see attached image).
Here's what I want to do:
If you look at the attached image I would like to find the value where the data DEFINITELY crosses BELOW the horizontal line so I can say, "Here is where the device being tested 'gave up the ghost' and was no longer able to perform as desired."
What the data roughly looks like:
Each measurement set will have the rough look and feel of the attached image, but will be slightly different each time (because each object I am testing has roughly the same performance characteristics, but they all have their own manufacturing defects and variations).
The data set for the attached image is a data set of 7000 measurements.
I never really know where the horizontal line will be.
Examples of the data sets I have gotten in the past several tests look like this:
(394 to 0)
(390000 to 0)
(3.88 to 0)
(375000 to 0)
(39.55 to 0)
(59200 to 0)
and each data set will have about 1,000 to 7,000 measurements.
Here's how I was trying to solve this issue:
I was using SLOPE() and trying to latch onto where the slope of the line took a dive / started working its way towards a steep negative slope (the line approaching vertical), so that once the slope becomes strongly negative the data MUST be taking a dive. That didn't really work.
I was looking at using STDEV.P() in Excel and feeding it the entire data set. Then I was looking at doing the same thing but feeding it only the first 10, 30, 60 measurements but then I thought - we never really know just how many measurements will come through. Then I thought I would use the first 10% of the measurements and feed that to STDEV.P().
Please let me know what you think of this and please let me know of any ideas you may have.
Thanks.
H
Something like this should work to flag when the decay rate increases.
To find what 'direction' your data is going in you need the derivative.
Excel doesn't have a derivative formula but you can set it up pretty easily by using the (change in y)/(change in x) as demonstrated here:
http://faculty.educ.ubc.ca/sanderson/lab/CLFbiom/demo/diff.htm
I would then use a formula which counts how many data rows you have (=COUNTA(A:A) or similar),
then use that to get a step equal to 10% of your data,
and then check the value of the derivative in a cell against the cell 10% further down. If it is still negative (to account for the slight downhill at the start), then you'll know the decline is sustained and the data has genuinely started its dive.
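A rough numpy translation of that idea, using noiseless synthetic data for illustration (the 10% step and the break at index 3000 are made up; real, noisy data would want some smoothing or averaging first):

import numpy as np

# Synthetic data: flat at 394, then a dive starting at index 3000.
x = np.arange(7000, dtype=float)
y = np.where(x < 3000, 394.0, 394.0 * np.exp(-(x - 3000) / 500.0))

deriv = np.diff(y) / np.diff(x)       # (change in y) / (change in x)
step = len(y) // 10                   # 10% of the number of data rows

# Flag rows whose derivative is negative AND still negative 10% further down.
candidates = np.where((deriv[:-step] < 0) & (deriv[step:] < 0))[0]
print(candidates[0] if candidates.size else None)   # prints 2999, just before the dive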
The right way to go about this is to model the data with an unknown discontinuity, something like "if time < break_time then (some constant plus noise) else (decaying exponential)". A maximum likelihood estimation for that model might require iteration or other operations which are clumsy in Excel -- maybe you should consider VB or Python or some other programming language. I.e. choose the tool to fit the problem and not the other way around.
See Seber and Wild, "Nonlinear Regression", for an extensive discussion of models with discontinuities.
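For example, a minimal least-squares sketch of that piecewise model in Python (equivalent to maximum likelihood under a Gaussian-noise assumption; the synthetic data, the candidate step of 25 and the exponential form are all assumptions): try each candidate break point, fit a constant on the left and a decaying exponential on the right, and keep the break with the smallest total squared error.

import numpy as np
from scipy.optimize import curve_fit

def fit_break_point(t, y, min_pts=25, step=25):
    """Grid-search the break index of a 'constant, then decaying exponential'
    model; returns the candidate index with the lowest total squared error."""
    best_sse, best_k = np.inf, None
    for k in range(min_pts, len(t) - min_pts, step):
        left, right = y[:k], y[k:]
        sse_left = np.sum((left - left.mean()) ** 2)      # constant fit

        def decay(x, a, b):                               # a * exp(-b * (x - t[k]))
            return a * np.exp(-b * (x - t[k]))
        try:
            p, _ = curve_fit(decay, t[k:], right,
                             p0=(left.mean(), 1e-3), bounds=(0, np.inf))
        except RuntimeError:                              # fit didn't converge
            continue
        sse_right = np.sum((right - decay(t[k:], *p)) ** 2)

        if sse_left + sse_right < best_sse:
            best_sse, best_k = sse_left + sse_right, k
    return best_k

# Synthetic example: plateau at 394, dive starting at index 3000, plus noise.
t = np.arange(7000, dtype=float)
y = np.where(t < 3000, 394.0, 394.0 * np.exp(-(t - 3000) / 500.0))
y += np.random.normal(0, 2.0, t.size)
print(fit_break_point(t, y))          # should land near 3000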
If your data can be generally characterized as having:
(A) a more or less flat plateau region, followed by
(B) a downward trending region
then a basic strategy could be to start at the end of the data and march towards the beginning one point at a time, checking to see that the values are increasing. Once they stop increasing, you've found the break point.
The strategy assumes (unwisely?) that the downward trending region is smooth/noiseless. To make the solution more robust to noise, you could compare values that are 5 apart, or 10 apart, or whatever interval works to filter out the noise. Or you could use a moving average.
This strategy could potentially be made more efficient by starting the search somewhere in the middle of the data but still within the downward trending portion. If you know (based on experience) that any value that is (say) 0.5X the maximum is in the downward trending portion, you could start the search there.
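A small numpy sketch of that backwards march, compared at a coarse interval to tolerate a little noise (the function name, the interval of 10 and the synthetic data are made up):

import numpy as np

def find_break_from_end(y, interval=10):
    """Walk from the last point towards the first. Seen backwards, the
    downward-trending region looks like a rise; the first index where the
    values stop increasing is the break point."""
    i = len(y) - 1
    while i - interval >= 0 and y[i - interval] > y[i]:
        i -= interval
    return i

# Synthetic data: plateau at 100, then a decay starting at index 2000.
y = np.where(np.arange(5000) < 2000, 100.0,
             100.0 * np.exp(-(np.arange(5000) - 2000) / 400.0))
print(find_break_from_end(y))         # prints a value close to 2000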
Hope that helps.
It appears as though you want to detect when the slope changes from something near zero to something negative. One way to detect this is to calculate the 2nd derivative of the values (calculate the slope of the slope). The 2nd derivative should be near zero in the flat portion of the data AND in the downward trending portion of the data. It should go negative at the break point. So finding the minimum (most negative) value of the 2nd should locate the break point.
To implement this, you probably will need to filter noise. So calculate the first derivative (slope) over some suitable window of data, e.g. with x in column A and y in column B:
=SLOPE(B2:B26,A2:A26)   (a moving window of, say, 25 raw values, filled down the sheet)
Then calculate the second derivative (slope of slope), with the first-derivative values in, say, column C:
=SLOPE(C2:C26,A2:A26)   (a moving window of, say, 25 slope values, filled down the sheet)
Then look for the minimum.
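The same idea in numpy, for anyone doing this outside Excel (the window of 25 and the synthetic data are just examples): a windowed regression slope stands in for the moving SLOPE(), and the break is where the slope-of-the-slope is most negative.

import numpy as np

def moving_slope(x, y, window=25):
    """Least-squares slope of y against x over each sliding window of
    `window` points (the numpy analogue of =SLOPE over a moving range)."""
    return np.array([np.polyfit(x[i:i + window], y[i:i + window], 1)[0]
                     for i in range(len(y) - window + 1)])

window = 25
x = np.arange(7000, dtype=float)
y = np.where(x < 3000, 394.0, 394.0 * np.exp(-(x - 3000) / 500.0))
y += np.random.normal(0, 1.0, x.size)            # measurement noise

d1 = moving_slope(x, y, window)                  # filtered 1st derivative
d2 = moving_slope(x[:len(d1)], d1, window)       # slope of the slope
break_index = int(np.argmin(d2)) + window        # rough window re-alignment
print(break_index)                               # lands near index 3000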
Hope that helps.
I need to make a graph from a list of values that don't line up with each other. There are samples being taken in a process at certain times, but they aren't always the same. For Sample A, the times are 1pm, 3pm, 5pm, etc. For Sample B the times are 2pm, 4pm, 6pm, etc. For Sample C the times are 1:30pm, 3:30pm, 5:30pm etc.
If I graph each sample individually it's fine, but when I graph them together I can only get the XY scatter points with no lines, since it thinks there are missing values. I just need a rough comparison of increase/decrease over time. If I could connect the dots with lines, ignoring the missing values, that would be great! I just don't know how to do that... Any suggestions?
This is for Access, otherwise this would work.
Excel - Connect Data Points with Line
You have to use old Excel knowledge. In Access, set up the graph as an XY scatter the way you want it. Then go to Tools > Options > Chart and select Interpolated. That should do it! It just took a while to find out what the setting might be called and where it might be located.
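And if this ever moves out of Access: in matplotlib, each series drawn against its own timestamps gets its own connected line, so the mismatched sample times never create missing-value gaps (the times and values below are made up):

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime

# Hypothetical samples: each has its own timestamps (hours) and values.
samples = {
    "Sample A": ([13.0, 15.0, 17.0], [1.0, 1.4, 1.1]),
    "Sample B": ([14.0, 16.0, 18.0], [0.8, 1.2, 1.5]),
    "Sample C": ([13.5, 15.5, 17.5], [1.1, 0.9, 1.3]),
}

fig, ax = plt.subplots()
for name, (hours, values) in samples.items():
    times = [datetime(2024, 1, 1, int(h), int((h % 1) * 60)) for h in hours]
    ax.plot(times, values, marker="o", label=name)   # one connected line each

ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.legend()
plt.show()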