I have data that looks like this:
1000 13 75.2
1000 21 79.21
1000 29 80.02
5000 29 87.9
5000 37 88.54
5000 45 88.56
10000 29 90.11
10000 37 90.79
10000 45 90.87
I want to use the first column as x axis labels, the second column as y axis labels and the third column as the z values. I want to display a surface in that manner. What is the best way to do this? I tried Excel but didn't really get anywhere. Does anyone have any suggestions for a tool to do this? Does anyone know how to do this in Excel?
Thanks
I ended up using matplotlib :)
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import matplotlib.pyplot as plt
import numpy as np
x = [1000,1000,1000,1000,1000,5000,5000,5000,5000,5000,10000,10000,10000,10000,10000]
y = [13,21,29,37,45,13,21,29,37,45,13,21,29,37,45]
z = [75.2,79.21,80.02,81.2,81.62,84.79,87.38,87.9,88.54,88.56,88.34,89.66,90.11,90.79,90.87]
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') is no longer accepted by recent Matplotlib releases
ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0.2)
plt.show()
You really can't display 3 columns of data as a 'surface'. Only having one column of 'Z' data will give you a line in 3 dimensional space, not a surface (Or in the case of your data, 3 separate lines). For Excel to be able to work with this data, it needs to be formatted as shown below:
       13     21     29     37     45
1000   75.2
1000          79.21
1000                 80.02
5000                 87.9
5000                        88.54
5000                               88.56
10000                90.11
10000                       90.79
10000                              90.87
Then, to get an actual surface, you would need to fill in all the missing cells with the appropriate Z-values. If you don't have those, then you are better off showing this as 3 separate 2D lines, because there isn't enough data for a surface.
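For anyone preparing the grid outside Excel, here is a minimal sketch of the same reshaping in Python, assuming pandas is available. The column names x, y and z are made up for illustration; the nine rows are the ones from the question.

```python
import pandas as pd

# The nine rows from the question; the column names x, y, z are illustrative.
rows = [
    (1000, 13, 75.2), (1000, 21, 79.21), (1000, 29, 80.02),
    (5000, 29, 87.9), (5000, 37, 88.54), (5000, 45, 88.56),
    (10000, 29, 90.11), (10000, 37, 90.79), (10000, 45, 90.87),
]
long_df = pd.DataFrame(rows, columns=["x", "y", "z"])

# One row per x value, one column per y value.
# Combinations that have no measurement show up as NaN.
grid = long_df.pivot(index="x", columns="y", values="z")

# Count the cells that would still have to be filled in for a full surface.
missing = int(grid.isna().sum().sum())

print(grid)
print("cells still to fill:", missing)
```

The NaN cells are exactly the gaps that would need Z-values before a surface chart makes sense.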
The best 3D representation that Excel will give you of the above data is pretty confusing:
Representing this limited dataset as 2D data might be a better choice:
As a note for future reference, these types of questions usually do a little better on superuser.com.
You can use R libraries for 3D plotting.
Steps are:
First, create a data frame using the data.frame() command.
Create a 3D plot using the scatterplot3d library.
Or you can rotate your chart using the rgl library with the plot3d() command.
Alternatively, you can use the plot3d() command from the Rcmdr library.
In MATLAB, you can use surf(), mesh() or surfl() command as per your requirement.
[http://in.mathworks.com/help/matlab/examples/creating-3-d-plots.html]
You can also use Gnuplot, which is also available from gretl. Put your x y z data in a text file and enter the following:
splot 'test.txt' using 1:2:3 with points palette pointsize 3 pointtype 7
Then you can set labels, etc. using
set xlabel "xxx" rotate parallel
set ylabel "yyy" rotate parallel
set zlabel "zzz" rotate parallel
set grid
show grid
unset key
Why not merge the rows that contain the same values?
       13     21     29     37     45
1000   75.2   79.21  80.02
5000                 87.9   88.54  88.56
10000                90.11  90.79  90.87
Excel can use that pretty well..
Related
I am coding in Python 3.8.
I have two variables, x and y which when varied output a different value of z. I would like to make a contour plot, however I am struggling to find a way to make the M x N data other than manually making it in a CSV.
Sample data:
x = [4, 4, 2, 2, 6, 12, 4, 2]
y = [1, 4, 2, 15, 1, 4, 4, 1]
z = [100, 24, 54, 21, 24, 50, 29, 19]
How do I create a sorted matrix with x rows, and y columns for my contour plot?
I have also just tried doing:
plt.contourf(x,y,z)
However this does not give me the output I want.
I believe I need to use np.meshgrid in some way, but I cannot figure out how.
The dataset is a lot larger than this, and I would like to understand the best way to tackle this.
Thanks!
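One way to build the M x N matrix without a hand-made CSV is sketched below, using only NumPy. It is a sketch under assumptions, not a definitive answer: the helper name to_grid is made up, and because the sample contains the pair (x=4, y=4) twice, the sketch averages duplicates, which may or may not be what you want. Cells with no sample are left as NaN.

```python
import numpy as np

x = np.array([4, 4, 2, 2, 6, 12, 4, 2])
y = np.array([1, 4, 2, 15, 1, 4, 4, 1])
z = np.array([100, 24, 54, 21, 24, 50, 29, 19], dtype=float)


def to_grid(x, y, z):
    """Hypothetical helper: scatter z into a (unique x) by (unique y) matrix."""
    # Sorted unique coordinates, plus the row/column index of every sample.
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    total = np.zeros((xs.size, ys.size))
    count = np.zeros((xs.size, ys.size))
    # np.add.at accumulates correctly even when an (x, y) pair repeats.
    np.add.at(total, (xi, yi), z)
    np.add.at(count, (xi, yi), 1)
    # Average duplicates (an assumption); cells without data become NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        grid = total / count
    return xs, ys, grid


xs, ys, grid = to_grid(x, y, z)
print(xs, ys)
print(grid)
```

Here grid has one row per unique x and one column per unique y. plt.contourf(X, Y, Z) expects Z with one row per Y value, so with x on the horizontal axis the call would be plt.contourf(xs, ys, grid.T). With data this sparse most cells are NaN, so a triangulation-based plot such as plt.tricontourf may be worth a look instead.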
I'm using MS Excel 2019 and I'm trying to copy the coordinates of only specific, selected data points in a scatter plot. Does anyone know whether this is possible, or is there a workaround? My input to the Excel scatter plot is x and y coordinates in all 4 quadrants.
Data used:
x y
-2 -10
39 -8
56 10
34 8
-89 -8
43 5
-9 4
45 3
67 -16
-87 -19
Scatter plot:
What I need is to select specific points in the first quadrant (marked by the red circle) from the Excel plot itself and export the values of the selected data points to a separate table. Hovering the mouse pointer over a point shows its value, but I can't capture the values of multiple data points with the mouse.
What about creating a helper column, which only allows coordinates in the first quadrant? You can achieve this, using following formulas:
in C2 : =IF(AND(A2>=0;B2>=0);A2;0)
in D2 : =IF(AND(A2>=0;B2>=0);B2;0)
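For comparison, the condition used in the helper columns can be sketched outside Excel as well. This is only an illustration in Python of the same test, applied to the data from the question; unlike the formulas, it drops the non-matching points instead of replacing them with 0.

```python
# Points from the question as (x, y) pairs.
points = [(-2, -10), (39, -8), (56, 10), (34, 8), (-89, -8),
          (43, 5), (-9, 4), (45, 3), (67, -16), (-87, -19)]

# Same condition as AND(A2>=0;B2>=0) in the helper-column formulas.
first_quadrant = [(x, y) for x, y in points if x >= 0 and y >= 0]

print(first_quadrant)
```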
I have a dataframe in Pandas in which the rows are observations at different times and each column is a size bin where the values represent the number of particles observed for that size bin. So it looks like the following:
bin1 bin2 bin3 bin4 bin5
Time1 50 200 30 40 5
Time2 60 60 40 420 700
Time3 34 200 30 67 43
I would like to use plotly/cufflinks to create a scatterplot in which the x axis will be each size bin, and the y axis will be the values in each size bin. There will be three colors, one for each observation.
As I'm more experienced in Matlab, I tried indexing the values using iloc (note the example below is just trying to plot one observation):
df.iplot(kind="scatter",theme="white",x=df.columns, y=df.iloc[1,:])
But I just get a KeyError: 0 message.
Is it possible to use indexing when choosing x and y values in Pandas?
Rather than indexing, I think you need to better understand how pandas and matplotlib interact with each other.
Let's go by steps for your case:
As the pandas.DataFrame.plot documentation says, the plotted series is a column. You have the series in the row, so you need to transpose your dataframe.
To create a scatterplot, you need both x and y coordinates in different columns, but you are missing the x column, so you also need to create a column with the x values in the transposed dataframe.
Apparently pandas does not change color by default with consecutive calls to plot (matplotlib does it), so you need to pick a color map and pass a color argument, otherwise all points will have the same color.
Here is a working example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Here I copied your data into a data.txt text file and imported it into pandas as a CSV.
# You may have a different way to get your data.
df = pd.read_csv('data.txt', sep=r'\s+', engine='python')
# I assume there is a column named 'time', which is set as the index, as you show in your post.
# set_index returns a new dataframe, so the result has to be assigned.
df = df.set_index('time')
tdf = df.transpose()  # transpose the dataframe: each observation becomes a column
# Create the x values. I go from 1 to 5, but they can be different.
tdf['xval'] = np.arange(1, len(tdf) + 1)
# Choose a colormap and make a list of colors, one per observation.
colormap = plt.cm.rainbow
colors = [colormap(i) for i in np.linspace(0, 1, len(df))]
# Make an empty plot; the columns will be added to the axes in the loop.
fig, axes = plt.subplots(1, 1)
for i, cl in enumerate([datacol for datacol in tdf.columns if datacol != 'xval']):
    # Wrapping the RGBA tuple in a list makes it unambiguous that it is a single color.
    tdf.plot(x='xval', y=cl, kind="scatter", ax=axes, color=[colors[i]])
plt.show()
This plots the following image:
Here is a tutorial on picking colors in matplotlib.
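On the narrower question of whether indexing can be used to pick x and y values: positional indexing with iloc does return plain values that can be handed to a plotting call. A minimal sketch, assuming the dataframe from the question is rebuilt by hand as below (Time1..Time3 as the index, bin1..bin5 as the columns):

```python
import pandas as pd

# The example table from the question, rebuilt by hand.
df = pd.DataFrame(
    [[50, 200, 30, 40, 5], [60, 60, 40, 420, 700], [34, 200, 30, 67, 43]],
    index=["Time1", "Time2", "Time3"],
    columns=["bin1", "bin2", "bin3", "bin4", "bin5"],
)

# iloc[1] selects the second observation by position and returns a Series
# whose index holds the bin names.
row = df.iloc[1]
x_labels = row.index.tolist()  # the size bins, for the x axis
y_values = row.tolist()        # the counts, for the y axis

print(row.name, x_labels, y_values)
```

Whether a particular plotting wrapper accepts such arrays for its x and y arguments, rather than column names, is a separate matter and depends on that wrapper.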
I have written code to customize my x ticks; a snippet is below:
arr_label = ['sum_msg_len','log_count','info_hit','debug_hit','error_hit']
for label in arr_label:
    fig = plt.figure(figsize=(15,6))
    axes = fig.add_axes([1,1,1,1])
    axes.xaxis.set_major_locator(plt.LinearLocator(30))
    axes.tick_params(axis='x', labelsize=6)
    axes.plot(df.index, df[label], 'g', label=label)
    axes.legend()
    fig.autofmt_xdate()
    fig.savefig('images_indv/'+app_index+"_"+label+".png", bbox_inches='tight')
    #fig.close()
    fig.clf()
My requirement is this: I have timestamps spaced one minute apart, and I want to plot timestamp vs. each of 'sum_msg_len', 'log_count', 'info_hit', 'debug_hit' and 'error_hit', one by one.
The problem is the x ticks: I want a specified number of ticks to appear within the range of the data I am plotting.
Earlier, when I was not specifying any locator, all the timestamps overlapped and could not be read properly. When I use a locator, it labels the x-axis without any relation to the plotted values.
For example, if I use LinearLocator(30), the labels show only the first 00 to 29 minutes, and if I use LinearLocator(50), they show only the first 00 to 49 minutes, with no change to the y-axis values. Plots of both are below. I also tried other locators such as MultipleLocator and MaxNLocator, but the issue persists.
In short, I want the graph plotted for 21 July 00:00:00 to 22 July 00:00:00, which is 1440 entries, but I want only around 30-40 intermediate entries labelled on the plot.
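One possible approach is to choose the tick positions yourself instead of relying on a locator. The sketch below only computes about 30 evenly spaced positions and their labels from a per-minute index; it rests on assumptions: the index is a hypothetical stand-in for df.index, and the year is made up because the question only gives the day.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.index: one timestamp per minute for a single day.
# The year is an assumption; the question only says 21 July.
index = pd.date_range("2020-07-21 00:00", periods=1440, freq="min")

n_ticks = 30  # desired number of labelled ticks

# Evenly spaced integer positions from the first to the last entry.
tick_positions = np.linspace(0, len(index) - 1, n_ticks).round().astype(int)

# The timestamps at those positions and a short label for each.
tick_values = index[tick_positions]
tick_labels = tick_values.strftime("%H:%M")

print(list(tick_labels))
```

If the x data are real datetimes, axes.set_xticks(tick_values) followed by axes.set_xticklabels(tick_labels) should place these labels. If the index holds strings, Matplotlib treats each entry as a category at positions 0, 1, 2, ..., which would match the symptom described, and tick_positions would be the values to pass to set_xticks instead.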
I have a file that contains 4 numbers per line (min, max, mean, standard deviation) and I would like to plot it with gnuplot.
Sample:
24 31 29.0909 2.57451
12 31 27.2727 5.24129
14 31 26.1818 5.04197
22 31 27.7273 3.13603
22 31 28.1818 2.88627
If I have 4 files with one column, then I can do:
plot "file1.txt" with lines, "file2.txt" with lines, "file3.txt" with lines, "file4.txt" with lines
And it will plot 4 curves. I do not care about the x-axis, it should just be a constant increment.
How can I plot this? I can't seem to find a way to get 4 curves from 1 file with 4 columns, using just a constantly incrementing x value.
Thank you.
You can plot different columns of the same file like this:
plot 'file' using 0:1 with lines, '' using 0:2 with lines ...
(... means continuation.) A couple of notes on this notation: using specifies which columns to plot, i.e. columns 0 and 1 in the first using statement. The 0th column is a pseudo-column that translates to the current line number in the data file. Note that if only one argument is given to using (e.g. using n), it is equivalent to using 0:n (thanks to mgilson for pointing that out).
If your Gnuplot version is recent enough, you can plot all 4 columns with a for-loop:
set key outside
plot for [col=1:4] 'file' using 0:col with lines
Result:
Gnuplot can use column headings for the title if they are in the data file, e.g.:
min max mean std
24 31 29.0909 2.57451
12 31 27.2727 5.24129
14 31 26.1818 5.04197
22 31 27.7273 3.13603
22 31 28.1818 2.88627
and
set key outside
plot for [col=1:4] 'file' using 0:col with lines title columnheader
Results in:
Just to add that you can specify the increment in the for loop as third argument. It is useful if you want to plot every nth column.
plot for [col=START:END:INC] 'file' using col with lines
In this case it changes nothing but anyway:
plot for [col=1:4:1] 'file' using col with lines
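To make the START:END:INC notation concrete, here is a small Python sketch of which columns such a loop visits. The function name is made up for illustration and a positive increment is assumed; the point is that END is inclusive in gnuplot, unlike the stop value of Python's range.

```python
def gnuplot_for_columns(start, end, inc=1):
    """Hypothetical helper: columns visited by `plot for [col=start:end:inc]`.

    Assumes inc is positive. gnuplot includes `end` in the iteration,
    so Python's range needs end + 1 as its stop value.
    """
    return list(range(start, end + 1, inc))


print(gnuplot_for_columns(1, 4))      # every column from 1 to 4
print(gnuplot_for_columns(1, 4, 2))   # every second column
```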