Please don't eat me because of this question :)
I have some data in Excel and I would like to make a graphical representation of it. Structure of my data:
Person ID: from 1 to 485. For every person there is one parameter such as average jumping distance, another parameter (such as height), and finally a class to which the person belongs (1, 2 or 3).
To assign persons to classes I used the k-means algorithm.
Now I would like to make a graph of this result. How can I do this in Excel (or with another tool), please?
Thank you
I would use a scatter (XY chart with markers and no lines). Plot average jumping distance on one axis, height on the second axis. Then for the classes I would separate all the data into 3 series and use different colors for each series. I would adjust the marker size to see which one works best with the data.
Here is a quick example to give you an idea of how it would look. It's not as easy as just clicking once to insert the chart from the data, though:
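Since the question allows another tool, the "separate all the data into 3 series" step can be sketched in Python. This is only a rough illustration: the sample rows and the function name split_by_class are made up here, not taken from the original data.

```python
from collections import defaultdict

# Hypothetical sample rows: (person_id, avg_jump_distance, height, k_means_class)
rows = [
    (1, 2.10, 1.75, 1),
    (2, 1.80, 1.62, 2),
    (3, 2.45, 1.88, 3),
    (4, 2.05, 1.71, 1),
    (5, 1.76, 1.60, 2),
]


def split_by_class(rows):
    """Return {class: [(distance, height), ...]}, one entry per chart series."""
    series = defaultdict(list)
    for _person_id, distance, height, cls in rows:
        series[cls].append((distance, height))
    return dict(series)


series = split_by_class(rows)
for cls in sorted(series):
    # Each list would become one XY series with its own marker color.
    print(f"class {cls}: {series[cls]}")
```

Each resulting list corresponds to one block of X/Y columns that you would add to the scatter chart as its own series.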
I'm trying to format the "segments" of a sunburst chart. The chart has one series and many points.
I can format the entire series like this:
With ActiveChart
.FullSeriesCollection(1).Format.Fill.ForeColor.ObjectThemeColor = msoThemeColorAccent1
End With
and I can format an individual point like this:
With ActiveChart
.FullSeriesCollection(1).points(1).Format.Fill.ForeColor.ObjectThemeColor = msoThemeColorAccent1
End With
but I cannot figure out how many points there are in each "segment" so that I can format them all the same. For example, the 5th point is in the second "segment", but I can't see a way to determine that.
By "segment" I mean all the points in that wedge of the pie from the centre out.
I was recently struggling with this kind of chart, too (see here). Sunburst charts are indeed very poorly documented.
My solution to the same problem was to go through the underlying data in order to get to know how many data points there are per column. Example:
The first category with result 50% has 3 out of 5 data points, which means the innermost point has index 1, while the index of the innermost data point of the second category is 4. Third one index 6, fourth 7 and so on. Knowing this index you can color the columns as you wish.
So, to answer your question: with the sunburst chart you have to know how many data points you have per column. As far as I could find out, you cannot figure this out by going through the data points themselves.
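The index bookkeeping described above is just a running total of the point counts per column. A minimal sketch in Python, assuming you have already counted the points per column from the underlying data; the counts below are hypothetical values chosen to reproduce the indices 1, 4, 6, 7 from the example, and the function names are made up:

```python
def segment_start_indices(points_per_segment):
    """Return the 1-based index of the first point of each segment."""
    starts = []
    next_index = 1
    for count in points_per_segment:
        starts.append(next_index)
        next_index += count
    return starts


def segment_point_ranges(points_per_segment):
    """Return, per segment, the range of point indices that belong to it."""
    starts = segment_start_indices(points_per_segment)
    return [range(start, start + count)
            for start, count in zip(starts, points_per_segment)]


# Hypothetical counts: 3 points in the first column, 2 in the second, then 1 and 1.
counts = [3, 2, 1, 1]
print(segment_start_indices(counts))  # [1, 4, 6, 7]
for indices in segment_point_ranges(counts):
    print(list(indices))
```

With these ranges in hand, a VBA loop could apply the same fill to every Points(i) whose index falls in one range.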
My problem is as follows:
The user inputs two numbers between 2 and 25, and these numbers are used to create a grid. Every point on the grid has (x,y) coordinates. Based on the number of points the user chose, my Excel sheet is filled with up to 25x25 (x,y) coordinates.
Example: A 6x7 grid is chosen by the user, the table is filled with 42 (x,y) coordinates and all other values in the table are set to "".
Now I want to use a scatterplot with lines connecting each array to plot the data.
Problem 1: If I select only the 6x7 part of the table that has values in it and create the scatterplot, the result is correct, but only until the user specifies a different grid. With an 8x9 grid, for example, the graph is obviously missing two rows and two columns of input data.
Problem 2: If I select the entire 25x25 part of the table, including all the "" values, the graph axes get messed up. The y-axis works properly, but the x-axis shows sequential values (0-7) instead of the x-coordinates.
Problem 3: If I replace all the "" values in the table with 0 or NaN and plot the entire table, the axes are correct, but the lines between the scatter data get messed up.
Question:
Is there a way to automatically change the input data for the plot, or is there a way to correctly display the values on the x-axis if I select all the data?
Not sure this will work in your case, but it's worth a try, especially since no one's addressed your post in 3+ hours. I've had success with this approach: 1) charting the largest data set, 2) copying the resulting chart, and 3) trimming the data it draws from to produce all smaller data sets.
Getting this to work takes a lot of thought in laying out that largest data set so that all the other plots follow as needed. To illustrate, I've somewhat mimicked your data, and in the animated gif I show the largest data set plus 2 others produced by copying it. Then I demonstrate how to make the second one, including the rescaling required to make all plots scaled equally. Notice that I've arranged things so that only one set of x-values feeds all the series. If you can do this, it makes working with Excel's interface much easier.
After wrestling with it all night I came to the following solution:
Instead of setting all the empty cells to "" or zero, the cells should be set to #N/A (not available). The graph properly ignores the #N/A cells, exactly as I want it to, and updates when values are entered into them.
The page linked to here has been a great help to me. The method of using the named function (=(ROW(INDIRECT("1:361"))-1)*PI()/180) to produce the circle data points is very slick compared to my original method, which was to calculate them individually and write them into rows.
My data set includes some 50k rows of data, each one defining a circle. The set is divided into 50 groups and I need to plot one circle from each group as selected via a scroll bar controlling a LOOKUP routine.
Please can someone suggest how I might modify the function (=(ROW(INDIRECT("1:361"))-1)*PI()/180) to reduce the number of data points it produces? I want to reduce the computing load and also, it's not practical to display & format data markers with such high data density. My existing circles are produced with just 18 coordinate pairs and are satisfactorily rounded.
Thanks in advance. Steve.
This would give you 19 data points: 0 and 360 degrees as the start/end points, with another point every 20 degrees.
=(ROW(INDIRECT("1:19"))-1)*PI()/9
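To check the arithmetic outside Excel: 19 points means 18 equal steps of 2*PI/18 = PI/9 radians (20 degrees), which is where the PI()/9 factor comes from. A small Python sketch of the same idea; the function name and its parameters are made up for illustration:

```python
import math


def circle_points(cx, cy, radius, n_points=19):
    """Return n_points (x, y) pairs on a circle.

    The last point repeats the first so that a connected line closes the circle.
    n_points must be at least 2.
    """
    step = 2 * math.pi / (n_points - 1)  # PI/9 radians when n_points is 19
    return [
        (cx + radius * math.cos(i * step), cy + radius * math.sin(i * step))
        for i in range(n_points)
    ]


points = circle_points(0.0, 0.0, 1.0)
print(len(points))
print(math.degrees(2 * math.pi / 18))
```

Changing n_points here corresponds to changing the "1:19" range and the PI()/9 divisor together in the formula.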
I have a table (came from a pivot table) where I have formatted the column 4 cells to show 1 billion as 1. But when I select the table and insert a chart, I am getting my units in millions. So the 14.8 billion number for Mexico is showing up as 14,800 on the chart. Why might this be happening and how can I fix this? This is also making all my other bars negligibly small. Note that the first three columns are not in billions and are totally different things. Some are percentages, some are other small numbers.
Table:
Chart:
You need a secondary horizontal axis and some formatting on the Axes.
In Excel 2013
First change the Chart Type to Combo and select Clustered Bar for both sets of data, then check Secondary Axis for the Percentage Series.
Then set up the axis limits so they match, e.g.
Percentage: min -.5 max 2
Billions: min -5e9 max 20e9
Then set the percentage format on the source data to a custom Number format of "";(0)%;0%
Then set the Billions format as 0,,,;"";0
You will get something like this:
EDIT
Now that we have the general principles, we can apply them to your specific data.
I will also switch to Excel 2010 to show the different menus.
The data selection looks like this
Select the non-Billion series (plural!) and check the secondary axis
If the larger data is always positive then you can use custom formatting to clean up the axis
Align the primary and secondary axes so that the grid lines match on both
The end result is clean and readable.
Mixing percentages and numbers for the smaller numbers is not handled by this but I would suggest that that would be confusing anyway?
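A note on why the limits in the first example "match": -0.5 to 2 and -5e9 to 20e9 have the same min/max ratio (-0.25), so zero sits at the same position on both axes and the grid lines coincide. A rough Python sketch of that calculation; the function name is made up:

```python
def matching_secondary_min(primary_min, primary_max, secondary_max):
    """Secondary-axis minimum that puts zero at the same position as on the primary axis."""
    # Both axes must share the same min/max ratio for their zero lines to coincide.
    return secondary_max * primary_min / primary_max


# Billions axis from -5e9 to 20e9, percentage axis with a maximum of 2 (200%).
print(matching_secondary_min(-5e9, 20e9, 2))  # -0.5
```

The same ratio rule applies if you pick different limits for your own data.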
The simplest way to fix this might be to plot cells containing the billions values divided by 10^9 rather than the billions themselves, though using a secondary axis may also be possible.
Using Excel 2007. For the purple bars, the example on the left uses ColumnE values, on the right ColumnF values. E1 contains =F1/10^9 and F1 contains =14800000000:
It appears that there are 3 questions here: 1) "Why might this be happening", 2) "how can I fix this", and 3) something like "how can I plot data which lie on two widely differing ranges, and make them all reasonably visible anyway", even if there was no explicit question on this.
There are several ways to solve issue #2 about the units (e.g., billions) and numbers (e.g., 14.8 vs. 14,800.0) shown in the axis, each one with its own pros and cons:
Use Format Axis -> Axis Options -> Display units.
This might be the answer to your issue #1 as well: you might have Display units -> Millions selected, with Show display units... unchecked. Otherwise, I wouldn't know why your chart shows what it shows.
Use faked tick marks, as described on the (excellent) site of Jon Peltier
http://peltiertech.com/Excel/Charts/ArbitraryAxis.html
It gives detailed instructions on how to create tick marks on an axis with arbitrary labels (which may be text, numbers, etc.), which is more generic than what the OP wants here. In this particular case, the labels will be the desired numbers.
Create new cells containing data that would be plotted exactly the way you want.
As for your issue #3, I guess the only option is to have a Secondary Axis (see the answer by pnuts).
Thus, to come up with the best final chart, you might use a combination of one of the options I gave here and a secondary axis.
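The display-units guess for issue #1 is consistent with the numbers in the question, as a quick check of the arithmetic shows. This Python sketch only illustrates the scaling; the dictionary and function names are made up and are not Excel settings:

```python
# Divisors corresponding to the display-unit choices discussed above.
DISPLAY_UNITS = {"none": 1, "thousands": 10**3, "millions": 10**6, "billions": 10**9}


def axis_value(raw_value, display_unit):
    """Value shown on the axis when the raw value is scaled by a display unit."""
    return raw_value / DISPLAY_UNITS[display_unit]


mexico = 14_800_000_000  # the 14.8 billion figure from the question
print(axis_value(mexico, "millions"))  # 14800.0
print(axis_value(mexico, "billions"))  # 14.8
```

So an axis set to millions would show 14,800 for a value that the cell format shows as 14.8.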
Most simple thing to do on paper, but somehow near impossible to work out in Excel.
I need to interpolate off a standard curve in Excel.
I have a standard curve and need to find an unknown concentration of a known absorbance.
Just like this;
http://class.fst.ohio-state.edu/fst601/Lectures/spectt/Image161.gif
My lecturer would not give us any more hints than to either use the linear regression equation (which I think I worked out, but couldn't get it to calculate the concentration) or use the point finding/picking option (no idea what this could mean).
If anyone could help me out with this I would really appreciate it, and so would my whole class!
In Excel, you can do it two ways:
Method 1: Use a standard interpolation/extrapolation formula. (Bingle)
Method 2: Plot any existing data points that you have as an x-y(scatter) plot. Right-click on the data line in the chart and choose 'Add trendline'. Excel will calculate a best-fit line for your data (using linear regression) and display the line over the top of your existing data. If you right-click on this trendline, you can adjust its settings and display properties of it, including the equation used to draw it. Once you have that equation, you can plug in any value of x and get the corresponding value of y.
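For a standard curve you usually need the reverse direction: given a measured absorbance y, solve the trendline equation y = m*x + b for the concentration x = (y - b)/m. A minimal Python sketch of that calculation, assuming concentration is on the x-axis and absorbance on the y-axis; the standards and the unknown absorbance below are hypothetical values, and the function names are made up:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert y = slope * x + intercept to get x for a measured y."""
    return (absorbance - intercept) / slope


# Hypothetical standards: known concentrations and their measured absorbances.
concentrations = [0.0, 2.0, 4.0, 6.0, 8.0]
absorbances = [0.02, 0.21, 0.39, 0.60, 0.79]

slope, intercept = fit_line(concentrations, absorbances)
unknown = concentration_from_absorbance(0.45, slope, intercept)
print(f"y = {slope:.4f}x + {intercept:.4f}; unknown concentration = {unknown:.2f}")
```

In Excel itself, the slope and intercept are the two numbers in the equation that the trendline displays on the chart.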