How to make a categorical count bar plot with time on x-axis - python-3.x

I want to count the number of occurrences of categories in a variable and plot it against time.
The data looks like the following:
Date_column Categorical_variable
20-01-2019 A
20-01-2019 B
20-01-2019 C
21-01-2019 A
21-02-2019 A
22-02-2019 B
........................
23-04-2020 A
I want to show that in the month of Jan I had 1 occurrence each of B and C and 2 occurrences of A. In Feb, I had 1 occurrence each of A and B, and so on. The bars can be stacked to show the total number of occurrences.
I've come very close, but I haven't been able to draw a plot from it.
df['Date_column'].groupby([df.Date_column.dt.year, df.Date_column.dt.month]).agg('count')
The other way is to change the dates to the 1st of every month and then group by to count the occurrences. But I'm unable to draw a plot from that either.
df.groupby([df['Date_column'], df['Categorical_variable']]).count()

Use crosstab with Series.dt.to_period:
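import pandas as pd
# The dates are day-first (dd-mm-yyyy), so state the format explicitly
df['Date_column'] = pd.to_datetime(df['Date_column'], format='%d-%m-%Y')
# Rows are months, columns are categories, cells are occurrence counts
counts = pd.crosstab(df['Date_column'].dt.to_period('M'), df['Categorical_variable'])
counts.plot.bar(stacked=True)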
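The groupby attempt from the question can probably be made to work as well. The sketch below is one way to do it, assuming the sample rows shown in the question and a helper column named month (a name introduced here for illustration only). It groups by month and category, counts the rows, and pivots the categories into columns so the result has the same shape that crosstab produces.

```python
import pandas as pd

# Sample rows copied from the question (dates are dd-mm-yyyy)
df = pd.DataFrame({
    "Date_column": ["20-01-2019", "20-01-2019", "20-01-2019",
                    "21-01-2019", "21-02-2019", "22-02-2019"],
    "Categorical_variable": ["A", "B", "C", "A", "A", "B"],
})
df["Date_column"] = pd.to_datetime(df["Date_column"], format="%d-%m-%Y")

# "month" is a hypothetical helper column holding the monthly period
df["month"] = df["Date_column"].dt.to_period("M")

# Count rows per (month, category), then move categories into columns;
# combinations that never occur are filled with 0
counts = (
    df.groupby(["month", "Categorical_variable"])
      .size()
      .unstack(fill_value=0)
)
print(counts)

# With matplotlib installed, the stacked bars can then be drawn with:
# counts.plot.bar(stacked=True)
```

Each row of counts is one month and each column one category, which is the layout that a stacked bar plot expects.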

Related

Histogram bins size to equal 1 day - pyplot

I have this list of delivery times in days for cars that are 0 years old. The list contains nearly 20,000 delivery days, with many days repeated. My question is: how do I get the histogram to show bin sizes of 1 day? I have set the bin size to the number of unique delivery days with:
len(set(list))
But when I generate the histogram, the frequency of 0 delivery days is over 5000, whereas list.count(0) returns 4500.
As you pointed out, len(set(list)) is the number of unique values for the "delivery days" variable. This is not the same thing as the bin size; it's the number of distinct bins. I would use "bin size" to describe the number of items in one bin; "bin count" would be a better name for the number of bins.
If you want to generate a histogram, supposing the original list of days is called days_list, a quick high-level approach is:
1. Make a new set unique_days = set(days_list).
2. Iterate over each value day in unique_days.
3. For the current day, set the height of the bar (or size of the bin) in the histogram to days_list.count(day). This tells you how many times the current "day" value appeared in the days_list list of delivery times.
Does this make sense?
If the problem is not that you're manually calculating the histogram wrong but that pyplot is doing something wrong, it would help if you included some code for how you are using pyplot.
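The manual counting steps above can be sketched as follows. The contents of days_list here are made up for illustration. The Counter comparison at the end is an addition, not part of the steps: it gives the same counts in a single pass, which may matter for a list of 20,000 entries because days_list.count(day) rescans the whole list for every unique value.

```python
from collections import Counter

# Hypothetical delivery times in days (not the asker's data)
days_list = [0, 0, 1, 3, 3, 3, 7]

# Steps from the answer: one bar per unique value, height = number of occurrences
unique_days = set(days_list)
heights = {day: days_list.count(day) for day in sorted(unique_days)}
print(heights)

# Counter produces the same mapping with a single pass over the list
assert heights == dict(Counter(days_list))
```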
For one-day bins, the number of bins is determined by the number of days from 0 up to the maximum delivery time.
Say daylist is the list you want to histogram (never call a list list, because that shadows the Python built-in of the same name). You would take the maximum of that list and create a range of bin edges like
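import matplotlib.pyplot as plt
maxi = max(daylist)
# Bin edges 0, 1, ..., maxi + 1 give one bin per day, including the last day
bins = range(0, maxi + 2)
plt.hist(daylist, bins=bins)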
or, if you want to use numpy,
import numpy as np
# np.arange excludes its stop value, so add 2 to get edges up to maxi + 1
bins = np.arange(0, np.max(daylist) + 2)
plt.hist(daylist, bins=bins)
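The choice of bin edges matters more than it may seem, because the bins argument is a list of edges, not of bin centres. The sketch below uses np.histogram, which follows the same edge convention as plt.hist, on a made-up daylist. It compares edges that stop at the maximum minus one with edges that run to the maximum plus one.

```python
import numpy as np

# Hypothetical delivery times in days (not the asker's data)
daylist = [0, 0, 1, 3, 3, 3, 7]
maxi = max(daylist)

# Edges 0..maxi+1: every integer day gets its own bin
full_edges = np.arange(0, maxi + 2)
full_counts, _ = np.histogram(daylist, bins=full_edges)
print(full_counts)   # full_counts[d] is the number of deliveries taking d days

# Edges 0..maxi-1: values above the last edge are silently dropped
short_edges = np.arange(0, maxi)
short_counts, _ = np.histogram(daylist, bins=short_edges)
print(short_counts.sum(), "of", len(daylist), "values counted")
```

With the shorter edges, the final bin is also closed on both sides, so two adjacent day values can end up counted together in one bar.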

Plotting time on X axis in excel

I have done a 24-hour measurement, and the results contain around 1400 entries. I want to plot them so that the x-axis represents time and the y-axis the corresponding value. The x-axis should be divided into 24 sections, each representing 1 hour. My exact start time is 14:00 and the end time is 14:00 the next day.
For clarification, I am adding a simplified version of my data below.
The resulting plot I am getting is this.
I look forward to your answers. Thank you.
If the time values go across midnight, you need to add a date part to the time value so they can be plotted correctly as before and after midnight. At the very least, the time values for the first day should have a 0 before the decimal, e.g. 0.875 for 9 pm, and the values after midnight should have a 1 before the decimal, e.g. 1.125 for 3 am, so that they fall on the next day and not on the same day as the 9 pm value.
Then plot an XY Scatter chart.
Work out Excel's internal number (the date/time value shown in General format) for the desired X axis minimum, maximum and major/minor increments, and set the X axis options accordingly. Set the number format to hh:mm.
Edit: For example, say you want the minimum X axis value to be 24-Dec-2015 11 pm. Write that into a cell as a date/time. Format the cell as General. Then use the number you see in the format dialog for the X axis minimum.
If you want the major unit to be 1 hour, write the time value 1:00 into a cell and format it as General. Use that number in the dialog for Major.
Format the X axis labels to show time values, not dates.
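To sanity-check the numbers Excel shows in General format, the same arithmetic can be sketched in Python. This assumes Excel's default 1900 date system, where day 0 corresponds to 1899-12-30 for any date after February 1900. The function name excel_serial is made up for this example.

```python
from datetime import datetime, timedelta

# Day 0 of Excel's default 1900 date system (valid for dates after Feb 1900)
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial(dt):
    """Return the Excel serial number (days, with the time as a fraction)."""
    return (dt - EXCEL_EPOCH) / timedelta(days=1)

# X axis minimum from the example above: 24-Dec-2015 11 pm
axis_min = excel_serial(datetime(2015, 12, 24, 23, 0))

# Major unit of one hour, expressed as a fraction of a day
major_unit = timedelta(hours=1) / timedelta(days=1)

# 9 pm as a fraction of a day
nine_pm = timedelta(hours=21) / timedelta(days=1)

print(round(axis_min, 6), round(major_unit, 6), nine_pm)
```

The integer part of a serial number is the date and the fractional part is the time of day, which is why times after midnight need the integer part increased by one.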

how to show 0 data point in visualization when data is missing in data table?

I have one visualization:
on the x-axis I have the months of a date column,
on the y-axis I have the unique count of issues.
I have 2 filters in a text area.
When I select some values in filter 1 or filter 2 and no count is available for a month, I need 0 for that month.
Suppose I have 5 months (Jan to May) of data in my data table and I select 1 value from filter 1 and 1 value from filter 2. If no data exists for that combination in March and April, the trend should show a count of 0 for March and April.
It is a trend line, so if no count is available for a month under the selected filters, it should show 0. Any lead will help.
TIA
Go to the x-axis settings of your visualization, click on Settings next to your month variable and, under the categorical options, choose Show all values. If you have a bar chart, it will show you the 0 values. In a line chart, Spotfire will continue the line as if there were no missing values. In this case you need to add markers to your line.

how to put the different times in y axis in matlab

Greetings everyone,
I have data in an Excel file and I want to draw a plot in MATLAB in which the y-axis represents time, with a starting time of 10:45 for 24 hours, i.e., from 10:00 am to 10:00 am the next day. The x-axis represents the Excel file data (the frequency values during the 24 hours).
How can I put the different times on the y-axis, shown in time format (00:00 am/pm), using MATLAB?
If I use the code ylim(subplot2,[1 24]) and xlim(subplot2,[170 230]), it is plotted, but the y-axis shows only the hours from 1 to 24, and I need the y-axis to run from 10:45 am (starting time) to 10:45 am over a 24-hour interval.
You can create custom tick labels by specifying tick strings with the command:
time_cells = {'10:45','11:45',...,'9:45','10:45'};
set(gca, 'YTickLabel', time_cells)
Here gca is the handle of your current axes, and time_cells is a cell array containing all your required tick labels (without the ellipsis). It is probably easiest to generate this using a for-loop to create the numbers you want, and then num2str to convert them to the strings you need.
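The label-generation loop described above is sketched here in Python rather than MATLAB, only to show the idea. It assumes hourly ticks starting at 10:45 and running for 24 hours, so the first and last labels coincide. The date used for start is arbitrary, and the labels come out zero-padded (09:45 rather than 9:45).

```python
from datetime import datetime, timedelta

# Arbitrary date; only the time of day matters for the labels
start = datetime(2000, 1, 1, 10, 45)

# One label per hour for 24 hours, including both end points (25 labels)
labels = [(start + timedelta(hours=h)).strftime("%H:%M") for h in range(25)]

print(labels[0], labels[1], labels[-2], labels[-1])
```

In MATLAB the resulting strings would go into the cell array passed as YTickLabel.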

how to plot time points in a day on excel

I have a set of points as times of day, like
05:36:37
06:31:41
06:38:24
06:39:42
07:03:47
07:04:18
07:09:28
07:28:40
07:29:20
07:29:49
07:31:57
.....
Now I would like to plot these over the entire 24-hour range of a day.
Basically, the x-axis is the 24-hour range and I need a point for each of the above times on the y-axis. Of course the y-axis may not vary, but I am looking to find the coverage, i.e. when during the day the event occurred most often. It is the same event each time. If there are better ways to represent this, you can suggest those as well.
Let's say your data is in A1 to A_N.
In column B, enter =HOUR(A1)
In column C, enter =MINUTE(A1)
Then Insert -> XY Scatter Chart
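For the coverage part of the question, a per-hour count may be easier to read than a scatter of points. The sketch below does in Python what =HOUR(A1) does in Excel and then counts events per hour. It uses only the times listed in the question, and the variable names are made up for this example.

```python
from collections import Counter
from datetime import datetime

# The times listed in the question
times = [
    "05:36:37", "06:31:41", "06:38:24", "06:39:42", "07:03:47", "07:04:18",
    "07:09:28", "07:28:40", "07:29:20", "07:29:49", "07:31:57",
]

# Equivalent of =HOUR(A1): extract the hour of each time
hours = [datetime.strptime(t, "%H:%M:%S").hour for t in times]

# Number of events in each of the 24 hours of the day (0 where none occurred)
per_hour = Counter(hours)
coverage = [per_hour.get(h, 0) for h in range(24)]

print(coverage)
```

Plotting coverage as a column chart with hours 0 to 23 on the x-axis shows directly in which hours the event occurred most often.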
