Difficulties creating a scatter graph in Excel - excel

I have 4 columns of data to display in a scatter graph in Excel:
Trade Name     Irl DDD   AVE Irl   EU DDD
Amatib         16        15.8      16
AMOXICI        15        15.8      16
Amoxinsol      15        15.8      16
Amoxival       20        15.8      16
Amoxy Activ    20        15.8      16
Bioamoxi       20        15.8      16
Biocillin      15        15.8      16
Citramox       15        15.8      16
CITRAMOX 50    15        15.8      16
MAXYL          15        15.8      16
Octacillin     12        15.8      16
Rhemox         15        15.8      16
SOLAMOCTA      13.1      15.8      16
Trioxyl500     15        15.8      16
I want the y axis to list the names and the x axis to run from about 10 to 22, with each value shown as an unconnected point. Alternatively, I could plot just the first two columns of data and then add straight lines for the 15.8 and 16 values. I can't figure out how to do it!
Thanks

lbr, you will have to use a Line Chart and then make it look like a scatter chart. Scatter charts seem to accept only numeric x and y values.
Here I selected the Trade Name and Irl DDD columns and chose Insert > Line Chart. Then I set:
Line - No Line
Marker Options - Built-in
Let me know if you need more help with this. To set the axis range to 10 to 22, format the y axis (double-click the axis and set the minimum bound to 10, for example).

Related

Missing Date xticks on chart for matplotlib on Python 3. Bug?

I am following this section. I realize the code was written for Python 2, but their chart shows xticks on the 'Start Date' axis and mine does not: my chart shows only the 'Start Date' label, with no dates. I tried converting the object to datetime; that shows the dates, but it breaks the graph below it and the line goes missing:
Graph
# Set as_index=False to keep the 0,1,2,... index. Then we'll take the mean of the polls on that day.
poll_df = poll_df.groupby(['Start Date'],as_index=False).mean()
# Let's go ahead and see what this looks like
poll_df.head()
   Start Date  Number of Observations  Obama  Romney  Undecided  Difference
0  2009-03-13                    1403     44      44         12        0.00
1  2009-04-17                     686     50      39         11        0.11
2  2009-05-14                    1000     53      35         12        0.18
3  2009-06-12                     638     48      40         12        0.08
4  2009-07-15                     577     49      40         11        0.09
Great! Now plotting the Difference versus time should be straightforward.
# Plotting the difference in polls between Obama and Romney
fig = poll_df.plot('Start Date','Difference',figsize=(12,4),marker='o',linestyle='-',color='purple')
Notebook is here

Parsing Data Output in Python

So I have this code:
si.get_stats("aapl")
which returns this junk:
0 Market Cap (intraday) 5 877.04B
1 Enterprise Value 3 966.56B
2 Trailing P/E 15.52
3 Forward P/E 1 12.46
4 PEG Ratio (5 yr expected) 1 1.03
5 Price/Sales (ttm) 3.30
6 Price/Book (mrq) 8.20
7 Enterprise Value/Revenue 3 3.64
8 Enterprise Value/EBITDA 6 11.82
9 Fiscal Year Ends Sep 29, 2018
10 Most Recent Quarter (mrq) Sep 29, 2018
11 Profit Margin 22.41%
12 Operating Margin (ttm) 26.69%
13 Return on Assets (ttm) 11.96%
14 Return on Equity (ttm) 49.36%
15 Revenue (ttm) 265.59B
16 Revenue Per Share (ttm) 53.60
17 Quarterly Revenue Growth (yoy) 19.60%
18 Gross Profit (ttm) 101.84B
19 EBITDA 81.8B
20 Net Income Avi to Common (ttm) 59.53B
21 Diluted EPS (ttm) 11.91
22 Quarterly Earnings Growth (yoy) 31.80%
23 Total Cash (mrq) 66.3B
24 Total Cash Per Share (mrq) 13.97
25 Total Debt (mrq) 114.48B
26 Total Debt/Equity (mrq) 106.85
27 Current Ratio (mrq) 1.12
28 Book Value Per Share (mrq) 22.53
29 Operating Cash Flow (ttm) 77.43B
30 Levered Free Cash Flow (ttm) 48.42B
31 Beta (3Y Monthly) 1.21
32 52-Week Change 3 5.27%
33 S&P500 52-Week Change 3 4.97%
34 52 Week High 3 233.47
35 52 Week Low 3 150.24
36 50-Day Moving Average 3 201.02
37 200-Day Moving Average 3 203.28
38 Avg Vol (3 month) 3 38.6M
39 Avg Vol (10 day) 3 42.36M
40 Shares Outstanding 5 4.75B
41 Float 4.62B
42 % Held by Insiders 1 0.07%
43 % Held by Institutions 1 61.16%
44 Shares Short (Oct 31, 2018) 4 36.47M
45 Short Ratio (Oct 31, 2018) 4 1.06
46 Short % of Float (Oct 31, 2018) 4 0.72%
47 Short % of Shares Outstanding (Oct 31, 2018) 4 0.77%
48 Shares Short (prior month Sep 28, 2018) 4 40.2M
49 Forward Annual Dividend Rate 4 2.92
50 Forward Annual Dividend Yield 4 1.51%
51 Trailing Annual Dividend Rate 3 2.72
52 Trailing Annual Dividend Yield 3 1.52%
53 5 Year Average Dividend Yield 4 1.73
54 Payout Ratio 4 22.84%
55 Dividend Date 3 Nov 15, 2018
56 Ex-Dividend Date 4 Nov 8, 2018
57 Last Split Factor (new per old) 2 1/7
58 Last Split Date 3 Jun 9, 2014
This is a third party function, scraping data off of Yahoo Finance. I need something like this
def func( si.get_stats("aapl") ):
**magic**
return Beta (3Y Monthly)
Specifically, I want it to return the number associated with Beta, not the actual text.
I'm assuming that the function call returns either a single string or a list of strings (one per line of the table), and that it is not writing to stdout.
To get the value associated with Beta (3Y Monthly) or any of the other parameter names:
1) If the return value is a single string formatted to print as the table above, each line should end with \n. Split the string into a list of lines, iterate over it to find the parameter name, then split that line again to fetch the associated number:
def func():
    # Split the single formatted string into a list of elements; each
    # element is one line of the table.
    str_lst = si.get_stats("aapl").split('\n')
    for line in str_lst:
        # Change 'Beta (3Y Monthly)' to any other parameter required.
        if 'Beta (3Y Monthly)' in line:
            # split() with no arguments splits on whitespace, giving e.g.
            # ['31', 'Beta', '(3Y', 'Monthly)', '1.21']; the numeric value
            # is the last element. Strip to remove trailing space/newline.
            num_value_asStr = line.split()[-1].strip()
            return num_value_asStr
2) If a list is already returned, just iterate over its items, use the same if condition, and split the matching element to get the numeric value associated with the parameter:
def func():
    str_lst = si.get_stats("aapl")
    for line in str_lst:
        # Change 'Beta (3Y Monthly)' to any other parameter required.
        if 'Beta (3Y Monthly)' in line:
            # split() with no arguments splits on whitespace, giving e.g.
            # ['31', 'Beta', '(3Y', 'Monthly)', '1.21']; the numeric value
            # is the last element. Strip to remove trailing space/newline.
            num_value_asStr = line.split()[-1].strip()
            return num_value_asStr
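Both cases above come down to the same step: find the line that contains the label and take its last whitespace-separated token. Here is a minimal sketch of that step as one helper, assuming the table is available as plain text. The name find_stat and the two sample lines are made up for illustration, and what si.get_stats actually returns is not confirmed here.

```python
def find_stat(lines, label):
    """Return the last whitespace-separated token of the first line containing label.

    `lines` may be one newline-separated string or an iterable of strings.
    Returns None if no line contains the label.
    """
    if isinstance(lines, str):
        lines = lines.splitlines()
    for line in lines:
        if label in line:
            tokens = line.split()
            if tokens:
                return tokens[-1]
    return None


# Two lines copied from the table in the question, used as sample input.
sample = "30 Levered Free Cash Flow (ttm) 48.42B\n31 Beta (3Y Monthly) 1.21\n"

# The helper returns text, so convert it to get a number rather than a string.
beta = float(find_stat(sample, "Beta (3Y Monthly)"))
print(beta)
```

Taking the last token still works for rows that carry a footnote digit after the label (such as "Forward P/E 1 12.46"), but for date rows such as "Fiscal Year Ends Sep 29, 2018" it would return only "2018".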

PM3d and Impulses combined not scaling

I am new to gnuplot, but I think I have all the basics. I am trying to plot a 3d surface with some impulses. When I do each splot individually, they look great, but when I splot them together, the scale gets all messed up. Any thoughts? Autoscale is set in all cases.
1st splot:
splot "C:/data/file1.dat" matrix rowheaders columnheaders with pm3d
2nd splot:
splot "C:/Data/file2.dat" with impulses, "C:/Data/file2.dat" with points pt 7
Combined:
splot "C:/data/file1.dat" matrix rowheaders columnheaders with pm3d, \
"C:/Data/file2.dat" with impulses, \
"C:/Data/file2.dat" with points pt 7
See how the scale gets all messed up, and the first chart gets scrunched down to one corner? Both data sets have roughly the same ranges in data.
file1.dat
6 8 10 12 16 20 24
30 3.513999939 4.515999794 5.293000221 5.894999981 6.633999825 6.870999813 6.901000023
35 4.235000134 5.330999851 6.169000149 6.72300005 7.196000099 7.374000072 7.434000015
40 4.818999767 5.940999985 6.776000023 7.171000004 7.558000088 7.722000122 7.802999973
45 5.291999817 6.453999996 7.136000156 7.480999947 7.831999779 7.997000217 8.092000008
50 5.656000137 6.791999817 7.393000126 7.718999863 8.057999611 8.232999802 8.340000153
55 5.968999863 7.014999866 7.587999821 7.913000107 8.255000114 8.44299984 8.565999985
60 6.225999832 7.176000118 7.741000175 8.079999924 8.434000015 8.642000198 8.788000107
65 6.414000034 7.326000214 7.859000206 8.225999832 8.602000237 8.840000153 9.015000343
70 6.624000072 7.494999886 7.956999779 8.357000351 8.767000198 9.039999962 9.25
75 6.801000118 7.638999939 8.100999832 8.468000412 8.930000305 9.251999855 9.496999741
80 6.93599987 7.758999825 8.222000122 8.56799984 9.107999802 9.491000175 9.772000313
85 7.035999775 7.855000019 8.322999954 8.690999985 9.289999962 9.748999596 10.10700035
90 7.102000237 7.919000149 8.409999847 8.80300045 9.470999718 10.03199959 10.47500038
95 7.125 7.933000088 8.479000092 8.901000023 9.642999649 10.31599998 10.83600044
100 7.107999802 7.907999992 8.534000397 8.987000465 9.812000275 10.60000038 11.18799973
105 7.053999901 7.849999905 8.515999794 9.06000042 9.972999573 10.86600018 11.52400017
110 6.965000153 7.769999981 8.43500042 9.090999603 10.11800003 11.10400009 11.84200001
115 6.840000153 7.663000107 8.309000015 8.961000443 10.24100018 11.31099987 12.14299965
120 6.672999859 7.524000168 8.149999619 8.75399971 10.32299995 11.48900032 12.42500019
125 6.436999798 7.349999905 7.961999893 8.529000282 9.987000465 11.64599991 12.68999958
130 6.044000149 7.133999825 7.749000072 8.298000336 9.579000473 11.67500019 12.96199989
135 5.572000027 6.856999874 7.513000011 8.06499958 9.237999916 11.11900043 13.27099991
140 5.127999783 6.440000057 7.257999897 7.831999779 8.937999725 10.52499962 12.90999985
145 4.683000088 5.933000088 6.981999874 7.598999977 8.670000076 10.0170002 12.10299969
150 4.30700016 5.52699995 6.657999992 7.363999844 8.425999641 9.602999687 11.39599991
155 3.996999979 5.196000099 6.294000149 7.122000217 8.194000244 9.262000084 10.79100037
160 3.730999947 4.887000084 5.936999798 6.868999958 7.973999977 8.970999718 10.27600002
165 3.506999969 4.620999813 5.642000198 6.610000134 7.78000021 8.737999916 9.892000198
170 3.342999935 4.421999931 5.427999973 6.385000229 7.625 8.56499958 9.626999855
175 3.233999968 4.288000107 5.281000137 6.217000008 7.506999969 8.43900013 9.44299984
180 3.170000076 4.209000111 5.191999912 6.111000061 7.428999901 8.354000092 9.32199955
file2.dat
7.5 172.0 4.5
5.6 56.8 4.7
6.7 35.0 5.1
11.0 158.7 5.3
13.8 24.8 5.6
12.1 180.0 6.0
5.1 83.2 6.4
13.2 158.0 6.6
15.8 34.5 6.67
15.6 32.9 6.69
11.8 180.0 6.8
13.7 96.0 7.2
15.0 62.4 7.3
11.2 76.2 7.3
11.7 84.9 7.4
13.8 121.8 7.46
9.7 90.9 7.6
13.2 66.0 7.64
14.3 61.3 7.8
14.8 124.6 8.0
9.5 118.8 8.20
15.1 148.8 8.29
12.2 81.8 8.4
You can see in your first image that the spacing between x=10 and x=12 is as big as the spacing between x=12 and x=16, which gives a clue to what's going on: while the first plot looks like gnuplot is using the x coordinates 6, 8, 10, 12, 16, 20, 24, those are really only labels. Numerically, gnuplot uses the x coordinates 0, 1, 2, 3, 4, 5, 6. So when you then plot the second graph on the same scale, its data points have x values between 5.1 and 15.8 and show up off to the side of the pm3d surface.
If you want gnuplot to use the first column and first row as actual coordinates, you have to use the nonuniform matrix format (see help matrix nonuniform). First, you need to change your data file file1.dat to start with the number 7, the number of columns. The beginning of the file should look like this:
7 6 8 10 12 16 20 24
30 3.513999939 4.515999794 5.293000221 5.894999981 6.633999825 6.870999813 6.901000023
35 4.235000134 5.330999851 6.169000149 6.72300005 7.196000099 7.374000072 7.434000015
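If editing the file by hand is inconvenient, the header change described above can be scripted. This is a minimal sketch, assuming the first line of the file holds only the column coordinates; the function name to_nonuniform and the output file name are made up for illustration, and the sample row is shortened from the data in the question.

```python
def to_nonuniform(text):
    """Prepend the number of columns to the header line of a matrix file's text."""
    lines = text.splitlines()
    if not lines:
        return text
    header = lines[0].split()
    # The count of header entries becomes the new first field of the header.
    lines[0] = " ".join([str(len(header))] + header)
    return "\n".join(lines) + "\n"


# Hypothetical usage with the file name from the question:
# with open("file1.dat") as f:
#     converted = to_nonuniform(f.read())
# with open("file1_nonuniform.dat", "w") as f:
#     f.write(converted)

# Shortened sample: the header from the question plus one truncated data row.
print(to_nonuniform("6 8 10 12 16 20 24\n30 3.5 4.5 5.3 5.9 6.6 6.9 6.9\n"), end="")
```

Run it only once per file: applying it to a file that already starts with the count would prepend a second number.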
Then you can plot the data as follows:
splot "file1.dat" nonuniform matrix w pm3d, \
"file2.dat" with impulses, \
"file2.dat" with points pt 7

read many lines with specific position

Thank you for the time you spend reading this; maybe it is a newbie question.
I have a file of 10081 lines. This is an example of the file (a Nordic seismic bulletin):
2016 1 8 0921 21.5 L -22.382 -67.835 148.9 OSC 18 0.3 4.7LOSC 1
2016 1 8 1515 43.7 L -20.762 -67.475 188.7 OSC 16 .30 3.7LOSC 1
2016 1 9 0529 35.9 L -18.811 -67.278 235.9 OSC 16 0.5 3.9LOSC 1
2016 110 313 55.6 L -22.032 -67.375 172.0 OSC 14 .30 3.0LOSC 1
2016 110 1021 36.5 L -16.923 -66.668 35.0 OSC 16 0.4 4.5LOSC 1
I tried the following code to extract some information from the file and save them in a separate file.
awk 'NR==1 {print substr($0,24,7), substr($0,32,7), substr($0,40,5)}' select.inp > lat_lon_depth.xyz
substr($0,24,7) means that I take 7 characters starting from the 24th position, which is the latitude information (-22.382), and the same for the others (longitude: 7 characters from the 32nd position; depth: 5 characters from the 40th position).
So the question is: is it possible to go through all the lines of the file and get the latitude, longitude and depth of each one?
Thank you for your time.
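In the awk command above, the NR==1 pattern restricts the action to the first line, so dropping it would apply the same substr calls to every line. The same fixed-width slicing can also be sketched in Python. This is a minimal sketch assuming the column positions given in the question; the function names are made up for illustration.

```python
def extract_lat_lon_depth(line):
    """Slice latitude, longitude and depth out of one fixed-width line.

    awk's substr($0, start, length) is 1-based, so it corresponds to the
    Python slice line[start - 1 : start - 1 + length].
    """
    lat = line[23:30].strip()    # substr($0,24,7)
    lon = line[31:38].strip()    # substr($0,32,7)
    depth = line[39:44].strip()  # substr($0,40,5)
    return lat, lon, depth


def convert(in_path, out_path):
    """Write one 'lat lon depth' line to out_path for each line of in_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(" ".join(extract_lat_lon_depth(line)) + "\n")


# Hypothetical usage with the file names from the question:
# convert("select.inp", "lat_lon_depth.xyz")
```

The slices rely on the original column alignment of the bulletin, so they will not line up with text whose spacing has been altered by copying and pasting.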

Dividing excel chart into sections

I want to create this type of chart in Excel:
With the vertical gridlines dividing the chart by year, and the labels for each year. The guy who made this chart said he thinks he just drew in the lines and added the labels manually somehow. But can this be done any other way? Drawing lines in charts isn't very exact, and the only other solutions I've found can't really produce the same result.
If you have data that looks something like:
Jan-14 4
Feb-14 30
Mar-14 56
Apr-14 23
May-14 3
Jun-14 62
Jul-14 74
Aug-14 12
Sep-14 3
Oct-14 15
Nov-14 63
Dec-14 74
Jan-15 45
Feb-15 3
Mar-15 4
Apr-15 56
May-15 23
Jun-15 3
Jul-15 62
Aug-15 74
Sep-15 12
Oct-15 3
Nov-15 15
Dec-15 63
Jan-16 74
You can select that data and add a new scatter plot style chart. It will, by default, look very similar to the one above. To get vertical lines at the years, you can right-click the x-axis and choose "Format Axis". Click "Fixed" for the "Major Unit" and enter 365 (the number of days in a year) as the number.
Right click again on the x-axis and choose "Add Major Gridlines". You should get a vertical line for each year.
As for the boxes/labels with the years, you may have to do that manually or get creative with VBA.
