Optimal way to display data with different ranges - python-3.x

I have an application that pulls data from an FPGA and displays it for the engineers. It works well until you plot signals with very different ranges,
say a signal fluctuating around +4000 and another around zero (both with a small peak-to-peak amplitude).
At the moment the only real workaround is to "export to CSV" and view the data in Excel, but I would like to improve the application so that this isn't needed.
Option 1 is a more dynamic pointer that gives readings of ALL visible plots at the current x.
Option 2 is multiple y-axes. This is where it gets a bit tight with respect to UI area.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
import numpy as np

t = np.arange(0, 1, 0.00001)
# Seven test signals with very different offsets and amplitudes
data = [5000*np.sin(t*2*np.pi*10),
        10*np.sin(t*2*np.pi*20),
        20*np.sin(t*2*np.pi*30),
        np.sin(t*2*np.pi*40)+5000,
        np.sin(t*2*np.pi*50)-5000,
        np.sin(t*2*np.pi*60),
        np.sin(t*2*np.pi*70),
        ]

# Version 1: one y-axis per signal, each placed 60 points further to the right
fig = plt.figure()
host = host_subplot(111, axes_class=AA.Axes)
axis_list = [None]*7
for i in range(len(axis_list)):
    axis_list[i] = host.twinx()
    new_axis = axis_list[i].get_grid_helper().new_fixed_axis
    axis_list[i].axis['right'] = new_axis(loc='right',
                                          axes=axis_list[i],
                                          offset=(60*i, 0))
    axis_list[i].axis['right'].toggle(all=True)
    axis_list[i].plot(t, data[i])
plt.show()

# Version 2: all signals on a single shared y-axis, for comparison
for i in data:
    plt.plot(t, i)
plt.show()
This code snippet doesn't resize the figure to ensure all 7 y-axes are visible, but ignoring that, you can see the result is quite large...
Any advice on multiple y-axes, or on a better way to display up to 7 datasets?
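One possible direction, sketched here with the standard library only, is to subtract each trace's mean (reporting it in the legend instead) and then let traces with a similar peak-to-peak span share a y-axis, so that fewer than seven axes are needed. The helper names `center` and `group_by_span`, the decade-based grouping rule, and the sample values are assumptions made for illustration, not part of the original application.

```python
import math
from statistics import mean


def center(trace):
    """Split a trace into its mean offset and the residual around that offset."""
    offset = mean(trace)
    return offset, [value - offset for value in trace]


def group_by_span(traces):
    """Group trace names by the decade of their peak-to-peak span.

    Traces in the same group have spans within a factor of 10 of each
    other, so they could reasonably share one y-axis.
    """
    groups = {}
    for name, values in traces.items():
        span = max(values) - min(values)
        # A flat trace has no span; keep such traces in their own group.
        decade = math.floor(math.log10(span)) if span > 0 else None
        groups.setdefault(decade, []).append(name)
    return groups


# Hypothetical samples: two small signals with different offsets, one large one.
traces = {
    "near_4000": [3999.0, 4001.0, 4000.0],
    "near_zero": [-1.0, 1.0, 0.0],
    "large": [-4000.0, 4000.0, 0.0],
}

centered = {}
offsets = {}
for name, values in traces.items():
    offsets[name], centered[name] = center(values)

groups = group_by_span(centered)
```

Each group would then get one axis (or one stacked subplot with a shared x-axis), with the removed offset shown in the legend label for that trace.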

Related

Is there a way using librosa's waveplot to store the coordinates of the graph rather than show the image of the waveplot?

I am working on an audio project using librosa and have the following code from an example online. Rather than opening an image with a graph of amplitude versus time, I want to store the coordinates that make up the graph in an array. I have tried many examples found on Stack Overflow and other websites, with no luck. I am relatively new to Python and this is my first question on Stack Overflow, so please be kind.
import librosa.display
import matplotlib.pyplot as plt
from IPython.display import display, Audio
filename = 'queen2.mp3'
samples, sampleRate = librosa.load(filename)
display(Audio(filename))
plt.figure(figsize=(12, 4))
librosa.display.waveplot(samples, sr=sampleRate, max_points=200)
plt.show()
librosa is open-source (under the ISC license), so you can look at the code to see how it does this. The documentation for each function has a handy [source] link which takes you to the code. For librosa.display.waveplot you will see that it calls a function __envelope() to compute the envelope. Presumably it is these coordinates you are after.
import numpy as np
from librosa import util

def __envelope(x, hop):
    '''Compute the max-envelope of non-overlapping frames of x at length hop
    x is assumed to be multi-channel, of shape (n_channels, n_samples).
    '''
    x_frame = np.abs(util.frame(x, frame_length=hop, hop_length=hop))
    return x_frame.max(axis=1)

# Excerpt from waveplot: y is the audio signal with shape (n_channels, n_samples)
hop_length = 1
y = __envelope(y, hop_length)
y_top = y[0]
y_bottom = -y[-1]
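To make the envelope idea concrete, here is a plain-Python sketch of the same max-envelope computation for a mono signal, returning the coordinates as lists instead of drawing them. It is not librosa's code: the function name `max_envelope`, the choice to drop a trailing partial frame, and the sample values are assumptions for illustration.

```python
def max_envelope(samples, hop, sample_rate):
    """Return (times, top, bottom) for non-overlapping frames of length `hop`.

    `top` holds the largest absolute sample in each frame, `bottom` is its
    mirror image, and `times` is the start time of each frame in seconds.
    """
    n_frames = len(samples) // hop  # a trailing partial frame is dropped
    top = [
        max(abs(value) for value in samples[i * hop:(i + 1) * hop])
        for i in range(n_frames)
    ]
    bottom = [-value for value in top]
    times = [i * hop / sample_rate for i in range(n_frames)]
    return times, top, bottom


# Hypothetical mono signal sampled at 4 Hz, reduced with frames of 2 samples.
times, top, bottom = max_envelope([0.1, -0.5, 0.3, 0.2, -0.9], hop=2, sample_rate=4)
```

The three lists are the x coordinates and the upper and lower y coordinates of the filled region that a waveform plot of this kind would draw.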

Show only some bar labels for matplotlib bar chart

I have a bar chart with a lot of columns (around 100). I want to show only some of the bar labels (they are ordered in such a way that this is a perfectly reasonable way to present the data). Is there a simple way to do this, say show every 3rd or 5th label? I know I can manually pull together the list, but I figure there's likely an elegant option.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(100)
groupings = np.arange(0,100)
x_pos = [i for i, _ in enumerate(groupings)]
plt.bar(x_pos,data)
plt.xticks(x_pos,groupings)
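One common approach is to pass only every n-th position and label to `plt.xticks`. The slicing step can be sketched on its own as follows; the helper name `every_nth` and the step of 5 are assumptions chosen for illustration.

```python
def every_nth(positions, labels, step):
    """Return the positions and labels of every `step`-th bar."""
    return list(positions)[::step], list(labels)[::step]


# 100 bars labelled 0..99, as in the question; keep every 5th label.
tick_pos, tick_labels = every_nth(range(100), range(100), 5)
```

With the bar chart from the question, `plt.xticks(tick_pos, tick_labels)` would then replace `plt.xticks(x_pos, groupings)`, leaving ticks only at the selected bars.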

Bokeh plot line not updating after checking CheckboxGroup in server mode (python callback)

I have just started using the Bokeh library and I would like to add interactivity to my dashboard. To do so, I want to use the CheckboxGroup widget to select which columns of a pandas DataFrame to plot.
I have followed tutorials but I must have misunderstood the use of ColumnDataSource, as I cannot make a simple example work...
I am aware of previous questions on the matter, and one that seems relevant on Stack Overflow is the following:
Bokeh not updating plot line update from CheckboxGroup
Sadly I did not succeed in reproducing the right behavior.
I have also tried, without success, to reproduce an example following the same updating structure presented in "Bokeh Server plot not updating as wanted, also it keeps shifting and axis information vanishes" by #bigreddot.
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.palettes import Spectral
from bokeh.layouts import row
from bokeh.models.widgets import CheckboxGroup
from bokeh.io import curdoc
# UPDATE FUNCTION ------------------------------------------------
# make update function
def update(attr, old, new):
    feature_selected_test = [feature_checkbox.labels[i] for i in feature_checkbox.active]
    # add index to plot
    feature_selected_test.insert(0, 'index')
    # create new DataFrame
    new_df = dummy_df.filter(feature_selected_test)
    plot_src.data = ColumnDataSource.from_df(data=new_df)
# CREATE DATA SOURCE ------------------------------------------------
# create dummy data for debugging purpose
index = list(range(0, 890))
index.extend(list(range(2376, 3618)))
feature_1 = np.random.rand(len(index))
feature_2 = np.random.rand(len(index))
feature_3 = np.random.rand(len(index))
feature_4 = np.random.rand(len(index))
dummy_df = pd.DataFrame(dict(index=index, feature_1=feature_1, feature_2=feature_2, feature_3=feature_3,feature_4=feature_4))
# CREATE CONTROL ------------------------------------------------------
# list available data to plot
available_feature = list(dummy_df.columns[1:])
# initialize control
feature_checkbox = CheckboxGroup(labels=available_feature, active=[0, 1], name='checkbox')
feature_checkbox.on_change('active', update)
# INITIALIZE DASHBOARD ---------------------------------------------------
# initialize ColumnDataSource object
plot_src = ColumnDataSource(dummy_df)
# create figure
line_fig = figure()
feature_selected = [feature_checkbox.labels[i] for i in feature_checkbox.active]
# feature_selected = ['feature_1', 'feature_2', 'feature_3', 'feature_4']
for index_int, col_name_str in enumerate(feature_selected):
    line_fig.line(x='index', y=col_name_str, line_width=2, color=Spectral[11][index_int % 11], source=plot_src)
curdoc().add_root(row(feature_checkbox, line_fig))
The program should work with a copy/paste... well, without interactivity...
Would someone please help me? Thanks a lot in advance.
You are only adding glyphs for the initial subset of selected features:
for index_int, col_name_str in enumerate(feature_selected):
    line_fig.line(x='index', y=col_name_str, line_width=2, color=Spectral[11][index_int % 11], source=plot_src)
So that is all that is ever going to show.
Adding new columns to the CDS does not automatically make anything happen; it is just extra data that is available for glyphs or hover tools to use. To actually show it, there have to be glyphs configured to display those columns. You could add and remove glyphs dynamically, but it would be far simpler to add everything once up front and use the checkbox to toggle only the visibility. There is an example of just this in the repo:
https://github.com/bokeh/bokeh/blob/master/examples/app/line_on_off.py
That example passes the data as literals to the glyph function, but you could put all the data in a CDS up front, too.
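The visibility toggle can be sketched as a small helper that works on any objects with a `visible` attribute, which Bokeh glyph renderers (the return value of `figure.line`) have. The helper name `apply_visibility` is an assumption for illustration, and the `SimpleNamespace` objects are stand-ins for real renderers so that the snippet runs without a Bokeh server.

```python
from types import SimpleNamespace


def apply_visibility(renderers, active):
    """Show renderer i only if index i is in the checkbox's active list."""
    active = set(active)
    for i, renderer in enumerate(renderers):
        renderer.visible = i in active


# Stand-ins for the four line renderers, one per feature column.
lines = [SimpleNamespace(visible=True) for _ in range(4)]

# Simulate the user leaving only the first and third boxes checked.
apply_visibility(lines, active=[0, 2])
```

In the app from the question, one line would be created per entry in `available_feature` up front, and the callback would reduce to something like `feature_checkbox.on_change('active', lambda attr, old, new: apply_visibility(lines, new))`.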

How to change scatter plot marker color in plotting loop using pandas?

I'm trying to write a simple program that reads in a CSV with various datasets (all of the same length) and automatically plots them all (as a Pandas Dataframe scatter plot) on the same figure. My current code does this well, but all the marker colors are the same (blue). I'd like to figure out how to make a colormap so that in the future, if I have much larger data sets (let's say, 100+ different X-Y pairings), it will automatically color each series as it plots. Eventually, I would like for this to be a quick and easy method to run from the command line. I did not have luck reading the documentation or stack exchange, hopefully this is not a duplicate!
I've tried the recommendations from these posts:
1) Setting different color for each series in scatter plot on matplotlib
2) https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.scatter.html
3) https://matplotlib.org/users/colormaps.html
However, the first one essentially grouped the data points according to their position on the x-axis and made those groups of data the same color (not what I want, each series of data is roughly a linearly increasing function). The second and third links seemed to have worked, but I don't like the colormap choices (e.g. "viridis", many colors are too similar and it's hard to distinguish data points).
This is a simplified version of my code so far (took out other lines that automatically named axes, etc. to make it easier to read). I've also removed any attempts I've made to specify a colormap, for more of a blank canvas feel:
''' Importing multiple scatter data and plotting '''
import pandas as pd
import matplotlib.pyplot as plt

### Data file path (please enter Dataframe however you like)
path = r'/Users/.../test_data.csv'
### Read in data CSV
data = pd.read_csv(path)
### List of headers
header_list = list(data)
### Set data type to float so modified data frame can be plotted
data = data.astype(float)
### X-axis limits
xmin = 1e-4
xmax = 3e-3
## Create subplots to be plotted together after loop
fig, ax = plt.subplots()
### Since there are multiple X-axes (every other column), this loop only plots every other x-y column pair
for i in range(len(header_list)):
    if i % 2 == 0:
        dfplot = data.plot.scatter(x = "{}".format(header_list[i]), y = "{}".format(header_list[i + 1]), ax=ax)
        dfplot.set_xlim(xmin,xmax) # Setting limits on X axis
plt.show()
The dataset can be found in the google drive link below. Thanks for your help!
https://drive.google.com/drive/folders/1DSEs8D7lIDUW4NIPBl2qW2EZiZxslGyM?usp=sharing
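One way to get a distinct color per series without relying on a named colormap is to space hues evenly around the color wheel and pass one color to each plotting call. The sketch below uses only the standard library; the function name `distinct_colors` and the saturation and brightness values are assumptions for illustration.

```python
import colorsys


def distinct_colors(n, saturation=0.8, value=0.9):
    """Return `n` hex color strings with evenly spaced hues."""
    colors = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors


# One color per x-y pair, e.g. for a CSV with six columns (three series).
colors = distinct_colors(3)
```

Inside the plotting loop, `c=colors[i // 2]` could then be passed to `data.plot.scatter(...)`. With 100+ series, neighbouring hues become hard to tell apart, so varying the marker shape as well may help.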

Creating a structured grid of subplots with Seaborn FacetGrid

My attempt to use FacetGrid in Seaborn does not produce the expected results.
Moreover, I would like to control the white space in the grid.
My data and code are the following:
toy.to_json()
'{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":
12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(sns.barplot, 'reg_month', 'Total_Revenue')
g.add_legend();
If I use bar in pyplot I get this:
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(plt.bar, 'reg_month', 'Total_Revenue')
g.add_legend();
Again, I would like to be able to define the white space of the grid.
In addition I would not like to have the bars stacked one over the other but rather one next to the other.
Some values for the year 2018 are really large compared to any of the values where has_cus_id_but_not_acc_id is 1. Hence the right plot is almost empty. It might make sense to use a logarithmic scale.
Now you have 6 years, so each month would need to show 6 bars next to each other. That makes the bars pretty small and the chart harder to read. Still, it's possible.
The following does not use seaborn, but pandas and matplotlib:
import matplotlib.pyplot as plt
import pandas as pd
toy = '{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0
,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
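from io import StringIO  # newer pandas versions expect a file-like object for JSON text

df = pd.read_json(StringIO(toy))
# astype returns a new Series, so assign it back
df['reg_year'] = df['reg_year'].astype(int)
u = df["has_cus_id_but_not_acc_id"].unique()
y = df['reg_year'].unique()
fig, axes = plt.subplots(1, len(u), sharey=True)
axes[0].set_yscale("log")
for ax, (n, grp) in zip(axes.flat, df.groupby("has_cus_id_but_not_acc_id")):
    piv = grp.pivot(index='reg_month', columns='reg_year', values='Total_Revenue')
    # NaN-filled frame covering months 1 to 12 and every year, so both subplots get the same bars
    empty = pd.DataFrame(index=range(1, 13), columns=y, dtype=float)
    empty.combine_first(piv).plot.bar(ax=ax, width=0.8, legend=False)
axes[1].legend()
plt.show()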
