Optimal way to display data with different ranges - python-3.x
I have an application that pulls data from an FPGA and displays it for the engineers. It works well until you display signals whose ranges differ greatly,
say one signal fluctuating around +4000 and another around zero (both with small peak-to-peak amplitude).
At the moment the only real workaround is to "export to csv" and then view the data in Excel, but I would like to improve the application so that this isn't needed.
Option 1 is a more dynamic pointer that gives readings of ALL visible plots at the current x.
Option 2 is multiple Y axes. This is where it gets a bit tight with respect to UI area:
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
import numpy as np
t = np.arange(0, 1, 0.00001)
data = [5000*np.sin(t*2*np.pi*10),
        10*np.sin(t*2*np.pi*20),
        20*np.sin(t*2*np.pi*30),
        np.sin(t*2*np.pi*40)+5000,
        np.sin(t*2*np.pi*50)-5000,
        np.sin(t*2*np.pi*60),
        np.sin(t*2*np.pi*70),
        ]
fig = plt.figure()
host = host_subplot(111, axes_class=AA.Axes)
axis_list = [None]*7
for i in range(len(axis_list)):
    axis_list[i] = host.twinx()
    new_axis = axis_list[i].get_grid_helper().new_fixed_axis
    axis_list[i].axis['right'] = new_axis(loc='right',
                                          axes=axis_list[i],
                                          offset=(60*i, 0))
    axis_list[i].axis['right'].toggle(all=True)
    axis_list[i].plot(t, data[i])
plt.show()
for i in data:
    plt.plot(t, i)
plt.show()
This snippet does not resize the figure to make all 7 y-axes visible, but even ignoring that, you can see that the result is quite large.
Any advice on multiple Y axes, or a better way to display up to 7 datasets?
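For what it's worth, the readout behind Option 1 could be sketched roughly as below. This is only an illustration under assumptions: the helper name readings_at, the series names, and the sample signals are all made up here, and linear interpolation between samples is one possible choice, not necessarily what the application should do.

```python
import numpy as np


def readings_at(x, t, series):
    """Return the linearly interpolated value of every series at position x.

    `series` maps a (hypothetical) signal name to an array sampled on `t`.
    """
    return {name: float(np.interp(x, t, y)) for name, y in series.items()}


# Made-up stand-ins for two FPGA signals with very different offsets.
t = np.linspace(0.0, 1.0, 1001)
series = {
    "around +4000": 4000 + np.sin(2 * np.pi * 10 * t),
    "around zero": np.sin(2 * np.pi * 20 * t),
}

# One readout per visible signal at the chosen x position.
print(readings_at(0.25, t, series))
```

In a matplotlib UI, a helper like this could be called from a handler registered with fig.canvas.mpl_connect('motion_notify_event', ...), using event.xdata as x, so the values are shown as text and no longer depend on the y-axis scaling.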
Related
Is there a way using librosa's waveplot to store the coordinates of the graph rather than show the image of the waveplot?
I am working on an audio project using Librosa and have the following code from an example online. Rather than opening an image with a graph of amplitude versus time, I want to store the coordinates that make up the graph in an array. I have tried many examples from Stack Overflow and other websites with no luck. I am relatively new to Python and this is my first question on Stack Overflow, so please be kind.
import librosa.display
import matplotlib.pyplot as plt
from IPython.display import display, Audio
filename = 'queen2.mp3'
samples, sampleRate = librosa.load(filename)
display(Audio(filename))
plt.figure(figsize=(12, 4))
# pass the loaded samples and their sample rate (the original call used an undefined y and sr=None)
librosa.display.waveplot(samples, sr=sampleRate, max_points=200)
plt.show()
librosa is open source (under the ISC license), so you can look at the code to see how it does this. The documentation for each function has a handy [source] link which takes you to the code. For librosa.display.waveplot you will see that it calls a function __envelope() to compute the envelope. Presumably these are the coordinates you are after.
import numpy as np
from librosa import util
def __envelope(x, hop):
    '''Compute the max-envelope of non-overlapping frames of x at length hop
    x is assumed to be multi-channel, of shape (n_channels, n_samples).
    '''
    x_frame = np.abs(util.frame(x, frame_length=hop, hop_length=hop))
    return x_frame.max(axis=1)
# y is the audio signal, with shape (n_channels, n_samples)
hop_length = 1
y = __envelope(y, hop_length)
y_top = y[0]
y_bottom = -y[-1]
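As a rough, librosa-free illustration of the same idea, the sketch below computes a max-envelope for a mono signal with NumPy only and returns it together with time coordinates. The function names max_envelope and envelope_coordinates are invented for this example, and dropping the trailing partial frame is an assumption made here for simplicity, not a statement about what librosa does.

```python
import numpy as np


def max_envelope(samples, hop):
    """Max of |samples| over non-overlapping frames of length hop (1-D input)."""
    samples = np.asarray(samples, dtype=float)
    n_frames = len(samples) // hop  # assumption: drop the trailing partial frame
    frames = np.abs(samples[: n_frames * hop]).reshape(n_frames, hop)
    return frames.max(axis=1)


def envelope_coordinates(samples, sample_rate, hop):
    """Return (times, upper curve, lower curve) for a symmetric envelope."""
    env = max_envelope(samples, hop)
    times = np.arange(len(env)) * hop / sample_rate  # start time of each frame
    return times, env, -env
```

With the arrays returned by librosa.load, a call such as envelope_coordinates(samples, sampleRate, hop=512) would give three arrays that can be stored or plotted directly; the hop value is a free choice.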
Show only some bar labels for matplotlib bar chart
I have a bar chart with a lot of columns (around 100). I want to show only some of the bar labels (the bars are ordered in such a way that this is a perfectly reasonable way to present the data). Is there a simple way to do this, say to show every 3rd or 5th label? I know I can build the list manually, but I figure there's likely an elegant option.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(100)
groupings = np.arange(0, 100)
x_pos = [i for i, _ in enumerate(groupings)]
plt.bar(x_pos, data)
plt.xticks(x_pos, groupings)
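One possible approach, sketched here under the assumption that slicing is acceptable, is to pass only every n-th position and label to the tick setters. The variable step is introduced for this example; the data mirrors the random data in the question.

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(100)
groupings = np.arange(0, 100)
x_pos = np.arange(len(groupings))

step = 5  # assumption: label every 5th bar

fig, ax = plt.subplots()
ax.bar(x_pos, data)
# Ticks and labels are sliced with the same step so they stay aligned.
ax.set_xticks(x_pos[::step])
ax.set_xticklabels(groupings[::step])
```

All 100 bars are still drawn; only the ticks are thinned out. Call plt.show() to display the figure.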
Bokeh plot line not updating after checking CheckboxGroup in server mode (python callback)
I have just started using the Bokeh library and I would like to add interactivity to my dashboard. To do so, I want to use a CheckboxGroup widget to select which columns of a pandas DataFrame to plot. I have followed tutorials, but I must have misunderstood the use of ColumnDataSource, as I cannot make a simple example work.
I am aware of previous questions on the matter. One that seems relevant on Stack Overflow is "Bokeh not updating plot line update from CheckboxGroup", but I did not succeed in reproducing the right behavior. I have also tried to follow the updating structure presented by bigreddot in "Bokeh Server plot not updating as wanted, also it keeps shifting and axis information vanishes", without success.
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.palettes import Spectral
from bokeh.layouts import row
from bokeh.models.widgets import CheckboxGroup
from bokeh.io import curdoc
# UPDATE FUNCTION ------------------------------------------------
# make update function
def update(attr, old, new):
    feature_selected_test = [feature_checkbox.labels[i] for i in feature_checkbox.active]
    # add index to plot
    feature_selected_test.insert(0, 'index')
    # create new DataFrame
    new_df = dummy_df.filter(feature_selected_test)
    plot_src.data = ColumnDataSource.from_df(data=new_df)
# CREATE DATA SOURCE ------------------------------------------------
# create dummy data for debugging purpose
index = list(range(0, 890))
index.extend(list(range(2376, 3618)))
feature_1 = np.random.rand(len(index))
feature_2 = np.random.rand(len(index))
feature_3 = np.random.rand(len(index))
feature_4 = np.random.rand(len(index))
dummy_df = pd.DataFrame(dict(index=index, feature_1=feature_1, feature_2=feature_2, feature_3=feature_3, feature_4=feature_4))
# CREATE CONTROL ------------------------------------------------------
# list available data to plot
available_feature = list(dummy_df.columns[1:])
# initialize control
feature_checkbox = CheckboxGroup(labels=available_feature, active=[0, 1], name='checkbox')
feature_checkbox.on_change('active', update)
# INITIALIZE DASHBOARD ---------------------------------------------------
# initialize ColumnDataSource object
plot_src = ColumnDataSource(dummy_df)
# create figure
line_fig = figure()
feature_selected = [feature_checkbox.labels[i] for i in feature_checkbox.active]
# feature_selected = ['feature_1', 'feature_2', 'feature_3', 'feature_4']
for index_int, col_name_str in enumerate(feature_selected):
    line_fig.line(x='index', y=col_name_str, line_width=2, color=Spectral[11][index_int % 11], source=plot_src)
curdoc().add_root(row(feature_checkbox, line_fig))
The program should work with a copy/paste, although without interactivity. Would someone please help me? Thanks a lot in advance.
You are only adding glyphs for the initial subset of selected features:
for index_int, col_name_str in enumerate(feature_selected):
    line_fig.line(x='index', y=col_name_str, line_width=2, color=Spectral[11][index_int % 11], source=plot_src)
So that is all that is ever going to show. Adding new columns to the CDS does not automatically make anything happen; it is just extra data that is available for glyphs or hover tools to potentially use. To actually show it, there have to be glyphs configured to display those columns. You could add and remove glyphs dynamically, but it would be far simpler to add everything once up front and use the checkbox to toggle only the visibility. There is an example of just this in the repo: https://github.com/bokeh/bokeh/blob/master/examples/app/line_on_off.py
That example passes the data as literals to the glyph function, but you could put all the data in a CDS up front, too.
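A minimal sketch of the "add everything once, toggle visibility" approach described above might look like the following. It is not the linked repo example: the data, the column name x, and the names renderers and update are invented for illustration, and a plain dict is used for the ColumnDataSource to keep the sketch independent of pandas.

```python
from bokeh.io import curdoc
from bokeh.layouts import row
from bokeh.models import CheckboxGroup, ColumnDataSource
from bokeh.palettes import Spectral
from bokeh.plotting import figure

# Hypothetical data: one shared x column and four feature columns.
labels = ["feature_1", "feature_2", "feature_3", "feature_4"]
x = list(range(50))
columns = {"x": x}
for k, name in enumerate(labels):
    columns[name] = [(k + 1) * 0.1 * v for v in x]
source = ColumnDataSource(data=columns)

checkbox = CheckboxGroup(labels=labels, active=[0, 1])
fig = figure()

# Add one line glyph per column up front; only the visibility changes later.
renderers = []
for k, name in enumerate(labels):
    renderer = fig.line(x="x", y=name, source=source, line_width=2,
                        color=Spectral[4][k])
    renderer.visible = k in checkbox.active
    renderers.append(renderer)


def update(attr, old, new):
    """Show exactly the lines whose checkbox indices are listed in `new`."""
    for k, renderer in enumerate(renderers):
        renderer.visible = k in new


checkbox.on_change("active", update)
curdoc().add_root(row(checkbox, fig))
```

Run under bokeh serve, ticking a box should then show or hide the matching line without touching the data source.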
How to change scatter plot marker color in plotting loop using pandas?
I'm trying to write a simple program that reads in a CSV with various datasets (all of the same length) and automatically plots them all (as a pandas DataFrame scatter plot) on the same figure. My current code does this well, but all the marker colors are the same (blue). I'd like to figure out how to make a colormap so that in the future, if I have much larger datasets (let's say 100+ different X-Y pairings), it will automatically color each series as it plots. Eventually, I would like this to be a quick and easy method to run from the command line. I did not have luck reading the documentation or Stack Exchange; hopefully this is not a duplicate!
I've tried the recommendations from these posts:
1) Setting different color for each series in scatter plot on matplotlib
2) https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.scatter.html
3) https://matplotlib.org/users/colormaps.html
However, the first one essentially grouped the data points according to their position on the x-axis and made those groups the same color (not what I want; each series of data is roughly a linearly increasing function). The second and third links seemed to work, but I don't like the colormap choices (e.g. with "viridis", many colors are too similar and it's hard to distinguish data points).
This is a simplified version of my code so far (I took out other lines that automatically named axes, etc. to make it easier to read). I've also removed any attempts I've made to specify a colormap, for more of a blank canvas feel:
''' Importing multiple scatter data and plotting '''
import pandas as pd
import matplotlib.pyplot as plt
### Data file path (please enter Dataframe however you like)
path = r'/Users/.../test_data.csv'
### Read in data CSV
data = pd.read_csv(path)
### List of headers
header_list = list(data)
### Set data type to float so modified data frame can be plotted
data = data.astype(float)
### X-axis limits
xmin = 1e-4
xmax = 3e-3
## Create subplots to be plotted together after loop
fig, ax = plt.subplots()
### Since there are multiple X-axes (every other column), this loop only plots every other x-y column pair
for i in range(len(header_list)):
    if i % 2 == 0:
        dfplot = data.plot.scatter(x="{}".format(header_list[i]), y="{}".format(header_list[i + 1]), ax=ax)
dfplot.set_xlim(xmin, xmax)  # Setting limits on X axis
plt.show()
The dataset can be found in the Google Drive link below. Thanks for your help!
https://drive.google.com/drive/folders/1DSEs8D7lIDUW4NIPBl2qW2EZiZxslGyM?usp=sharing
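One way this could be approached, shown only as a sketch, is to sample a colormap at evenly spaced points and pass one color per series to the plotting call. The DataFrame below is a made-up stand-in for the CSV (columns alternating x0, y0, x1, y1, ...), and "gist_rainbow" is an arbitrary colormap choice assumed for this example; any other registered colormap name could be substituted.

```python
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: four roughly linear x-y series.
rng = np.random.default_rng(0)
columns = {}
for k in range(4):
    x = np.linspace(1e-4, 3e-3, 20)
    columns[f"x{k}"] = x
    columns[f"y{k}"] = (k + 1) * x + rng.normal(0, 1e-4, x.size)
data = pd.DataFrame(columns)

header_list = list(data)
n_series = len(header_list) // 2

# Sample the colormap at evenly spaced positions, one color per series.
cmap = plt.get_cmap("gist_rainbow")
colors = [mcolors.to_hex(cmap(i / max(n_series - 1, 1))) for i in range(n_series)]

fig, ax = plt.subplots()
for k in range(n_series):
    x_col, y_col = header_list[2 * k], header_list[2 * k + 1]
    data.plot.scatter(x=x_col, y=y_col, ax=ax, color=colors[k], label=y_col)
ax.set_xlim(1e-4, 3e-3)
```

Because the number of colors is derived from the number of column pairs, the same loop should scale to larger files, although with 100+ series neighbouring colors will inevitably become hard to tell apart.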
Creating a structured grid of subplots with Seaborn FacetGrid
My attempt to use FacetGrid in Seaborn does not produce the expected results. Moreover, I would like to control the white space in the grid. My data and code are the following:
toy.to_json()
'{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0
,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}' g = sns.FacetGrid(toy, 
col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(sns.barplot, 'reg_month', 'Total_Revenue')
g.add_legend()
If I use bar in pyplot I get this:
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(plt.bar, 'reg_month', 'Total_Revenue')
g.add_legend()
Again, I would like to be able to control the white space of the grid. In addition, I would like the bars to be placed side by side rather than stacked on top of one another.
Some values of the year 2018 are really large compared to any of the values where has_cus_id_but_not_acc_id is 1. Hence the right plot is almost empty. It might make sense to use a logarithmic scale. Now, you have 6 years, so each month would need to show 6 bars next to each other. That makes the bars pretty small and the chart hard to read. Still, it's possible. The following does not use seaborn, but pandas and matplotlib:
import matplotlib.pyplot as plt
import pandas as pd
toy = '{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"7
2":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822
.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
df = pd.read_json(toy)
# assign the result; astype() returns a new Series instead of converting in place
df['reg_year'] = df['reg_year'].astype(int)
u = df["has_cus_id_but_not_acc_id"].unique()
y = df['reg_year'].unique()
fig, axes = plt.subplots(1, len(u), sharey=True)
axes[0].set_yscale("log")
for ax, (n, grp) in zip(axes.flat, df.groupby("has_cus_id_but_not_acc_id")):
    # keyword arguments, since recent pandas versions no longer accept them positionally
    piv = grp.pivot(index='reg_month', columns='reg_year', values='Total_Revenue')
    # empty frame covering all 12 months and all years, so every slot gets a bar position
    empty = pd.DataFrame(index=range(1, 13), columns=y)
    empty.combine_first(piv).plot.bar(ax=ax, width=0.8, legend=False)
axes[1].legend()
plt.show()