Graphing and comparing two data sets

I have two sets of data, one with 130 elements and one with 75 elements.
In both sets, each element is between 0 and 1000.
My professor wants me to plot a "graph" that will allow him to compare these two sets of data.
I plotted a histogram, but since the two sets have different numbers of data points (130 and 75), it was of no use.
Which graph should I plot to show the relationship between these two sets of data?
How do I "normalize" the data so that the two sets can be compared on a common basis?
Thanks in advance.

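If there is only one variable (the value), I would plot density graphs over value ranges, so that you could see, for example, that 25% of all values fall between 0 and 250 in one data set but only 20% of the other fall within the same range. Because these are percentages rather than raw counts, the different sizes of the two sets no longer matter. These could be pie charts or bar graphs.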
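
To illustrate the percentage idea, here is a minimal sketch in Python. It assumes four equal-width bins over 0 to 1000 and non-empty input; the function name `relative_frequencies` and the sample values are made up for the example and are not from the question.

```python
def relative_frequencies(values, lo=0, hi=1000, bins=4):
    """Return the fraction of values that falls in each equal-width bin.

    Assumes every value lies between lo and hi and that values is non-empty.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so that v == hi lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


# Hypothetical data: the two sets have different sizes, as in the question.
set_a = [10, 120, 260, 400, 510, 640, 770, 990]  # 8 values
set_b = [50, 100, 620, 880]                      # 4 values

print(relative_frequencies(set_a))  # [0.25, 0.25, 0.25, 0.25]
print(relative_frequencies(set_b))  # [0.5, 0.0, 0.25, 0.25]
```

Each list sums to 1, so the two sets can be drawn side by side as bars on the same axis even though one has twice as many values as the other.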

Related

Excel scatterplot graph update automatically

My problem is as follows:
The user inputs two numbers between 2 and 25, and these numbers are used to create a grid. Every point on the grid has (x,y) coordinates. Based on the number of points the user chose, my Excel sheet is filled with up to 25x25 (x,y) coordinates.
Example: A 6x7 grid is chosen by the user, the table is filled with 42 (x,y) coordinates, and all other values in the table are set to "".
Now I want to use a scatterplot with lines connecting each array to plot the data.
Problem 1: If I select only the 6x7 part of the table that has values in it and create the scatterplot, the result is correct, but only until the user specifies a different grid. With an 8x9 grid, for example, the graph is obviously missing two rows and two columns of input data.
Problem 2: If I select the entire 25x25 part of the table, including all the "" values, the graph axes get messed up. The y-axis works properly, but the x-axis shows sequential values (0-7) instead of the x-coordinates.
Problem 3: If I replace all the "" values in the table with 0 or NaN and plot the entire table, the axes are correct, but the lines between the scatter data get messed up.
Question:
Is there a way to automatically change the input data for the plot, or is there a way to correctly display the values on the x-axis if I select all the data?

Not sure this will work in your case, but it's worth a try, especially since no one's addressed your post in 3+ hours. I've had success with this approach: 1) charting the largest data set, 2) copying the resulting chart, and 3) trimming the data it draws from to produce all smaller data sets.
Getting this to work takes a lot of thought in laying out that largest data set so that all the other plots follow as needed. To illustrate, I've somewhat mimicked your data, and in the animated gif I show the largest data set plus two others produced by copying it. Then I demonstrate how to make the second one, including the rescaling required to make all plots scaled equally. Notice that I've arranged things so that only one set of x-values feeds all the series. If you can do this, it makes working with Excel's interface much easier.

After wrestling with it all night I came to the following solution:
Instead of setting all the empty cells to "" or zero, the cells should be set to #N/A (not available). The graph properly ignores the #N/A cells, exactly as I want it to, and updates when values are entered into them.

Excel - Plot average of two plots with inconsistent time (X) axis

I have managed to plot two different data sets on the same axis; however, I'm also looking to plot another line showing their average.
The main problem is that both data sets have different X (time) values so it's not possible to add an average column at the end and plot that. (See the highlighted row 22 for example, corresponding Time values are different)
Is there any way I can plot an average of two plots on the same axis?

One idea that might work is to place the values of both series, one above the other in two new columns, sort this new data according to time, smooth it, then plot the smoothed combined data. Alternatively, you could do the smoothing by simply plotting the new sorted series, adding a moving average trendline to it, then change the formatting of the new series so that it is no longer visible (but the trendline is). Something like this:
In the above picture, series 3 is the plot of the sorted aggregate data of series 1 and 2. If you change the formatting of series 3 so that there is no line, you get something like this:
For my relatively small mock data sets, the results are admittedly poor (it was based on just 25 data points in each series), but if you have a large amount of closely spaced data, and you play around with the moving average window size, you might get something acceptable. If not, you should probably just interpolate both datasets to obtain two consistent time series.
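
For the interpolation route mentioned at the end, here is a rough sketch in Python rather than Excel. It assumes linear interpolation, strictly increasing time stamps in each series, and averaging only over the interval where both series have data; the function names and the sample numbers are invented for the example.

```python
def interpolate(times, values, t):
    """Linearly interpolate the series (times, values) at time t.

    Assumes times is strictly increasing and times[0] <= t <= times[-1].
    """
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the range of the series")


def average_series(t1, v1, t2, v2):
    """Average two series on the union of their time stamps,
    restricted to the interval covered by both series."""
    start = max(t1[0], t2[0])
    end = min(t1[-1], t2[-1])
    grid = sorted({t for t in t1 + t2 if start <= t <= end})
    avg = [(interpolate(t1, v1, t) + interpolate(t2, v2, t)) / 2 for t in grid]
    return grid, avg


# Hypothetical series with mismatched time stamps.
grid, avg = average_series([0, 2, 4], [0, 2, 4], [1, 3, 5], [10, 10, 10])
print(grid)  # [1, 2, 3, 4]
print(avg)   # [5.5, 6.0, 6.5, 7.0]
```

The resulting `grid` and `avg` columns can then be plotted as a third scatter series alongside the two originals.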

plot two data sets on same chart

I have two sets of data that I wish to plot on the same chart in Excel 2013.
The first data set is time series data and has about 100 daily observations. I would like to plot this as a line chart.
The second data set has only 6 data points, and I wish to plot these as columns. Is this possible when the number of observations in each data set is different?
I know it can be done if you have the same number of observations in both data sets.

You will make things easier if you give them the same categories and use blanks to skip missing entries. Excel is not very smart about matching categories between different sets of data unless you are using scatter plots.
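
As a rough illustration of the shared-categories layout (in Python rather than Excel), the sketch below lines up a sparse series against the dates of a daily series and leaves blanks where it has no entry. The dates and values are made up, and `None` stands in for an empty cell.

```python
from datetime import date, timedelta

# Hypothetical data: 10 daily observations and 3 sparse points.
start = date(2024, 1, 1)
days = [start + timedelta(days=i) for i in range(10)]
daily = {d: 100 + i for i, d in enumerate(days)}
sparse = {days[0]: 5, days[4]: 7, days[9]: 6}

# One row per shared category (date); None marks a blank cell
# in the sparse column.
rows = [(d, daily[d], sparse.get(d)) for d in days]

for d, line_value, column_value in rows:
    print(d.isoformat(), line_value, "" if column_value is None else column_value)
```

Laid out this way, both series have the same number of rows, so one can be charted as a line and the other as columns against the same category axis.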

How to produce the data points for a circle in Excel using ROW INDIRECT

The page linked to here has been a great help to me. The method of using the named function (=(ROW(INDIRECT("1:361"))-1)*PI()/180) to produce the circle data points is very slick compared to my original method, which was to calculate them individually and write them into rows.
My data set includes some 50k rows of data, each one defining a circle. The set is divided into 50 groups and I need to plot one circle from each group as selected via a scroll bar controlling a LOOKUP routine.
Please can someone suggest how I might modify the function (=(ROW(INDIRECT("1:361"))-1)*PI()/180) to reduce the number of data points it produces? I want to reduce the computing load and also, it's not practical to display & format data markers with such high data density. My existing circles are produced with just 18 coordinate pairs and are satisfactorily rounded.
Thanks in advance. Steve.

This would give you 19 data points, with 0 and 360 degrees as the start and end points and another point every 20 degrees:
=(ROW(INDIRECT("1:19"))-1)*PI()/9
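
To check what that formula produces, here is a small Python sketch of the same angle sequence turned into circle coordinates. The function `circle_points` and its parameters (centre `cx`, `cy`, radius `r`) are assumptions for illustration; the original post does not say how the circles are defined.

```python
import math


def circle_points(cx, cy, r, n_points=19):
    """Return n_points (x, y) pairs around a circle.

    The first and last points coincide (0 and 360 degrees), so the
    plotted line closes. With n_points=19 the angular step is
    2*pi/18 = pi/9 radians, i.e. 20 degrees, as in the formula above.
    """
    step = 2 * math.pi / (n_points - 1)
    return [
        (cx + r * math.cos(k * step), cy + r * math.sin(k * step))
        for k in range(n_points)
    ]


points = circle_points(0.0, 0.0, 1.0)
print(len(points))  # 19
```

Changing `n_points` corresponds to changing the "1:19" range and the divisor in the Excel formula together, for example "1:13" with PI()/6 for 30 degree steps.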

How to use Excel column chart for datasets that have very different scales

There are 2 datasets that have values in the interval [0; 1]. I need to visualize these 2 datasets in Excel as a column chart. The problem is that some data points have values such as 0.0001 or 0.0002, while other data points have values such as 0.8 or 0.9. The difference is huge, and therefore it's impossible to see the data points with small values. What could be the solution? Should I use a logarithmic scale? I would appreciate any example.

Two possible ways below
Graph the smaller data set as a second series against a right hand Y axis (with same ratio from min to max as left hand series)
Multiply the smaller data set by 1000 and compare the multiplied data set to the larger one
Note that a log scale will give negative results given you are working with fractions, so that isn't really an option
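
A quick numeric check of the last two points, sketched in Python with made-up values in the ranges the question mentions: the base-10 logarithm of every value between 0 and 1 is negative, while multiplying the small set by 1000 brings it into the same range as the large one.

```python
import math

# Hypothetical values in the ranges described in the question.
small = [0.0001, 0.0002, 0.0005]
large = [0.8, 0.9, 0.7]

# Base-10 logs of fractions between 0 and 1 are all negative.
logs = [math.log10(v) for v in small + large]
print(all(x < 0 for x in logs))  # True

# Scaling the small set by 1000 makes it comparable to the large set.
scaled_small = [v * 1000 for v in small]
print([round(v, 6) for v in scaled_small])  # [0.1, 0.2, 0.5]
```

If the scaled series is charted, the axis title or legend should say that it has been multiplied by 1000.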
