How can I analyze a Google Voice data dump? - excel

I have used GV for business since about 2011. Over that time I've amassed about 10,000 calls with various clients. I'd like to analyze this data to understand things like which days of the week had the most calls, which months had the highest call volume, which hour of the day has the highest call volume, et cetera. (Eventually I would also like to compare that to my Google Calendar data to analyze my conversion rates for a given month, but that's step 2.)
My question is, is there any easy way to do this short of actually learning to use Excel? Are there any free or relatively cheap statistics programs that will cut some of the work out for me? It's easy enough to clean the data and drop it into Excel, but there are so many intermediary steps between having a good clean data set and actually getting a histogram out of it that it's starting to feel like it isn't worth it.
I have a list of about 10k calls in this format:
Col. A    Col. B    Col. C
client    date      24hr time
I'm not particularly concerned with who the client is... I just want to analyze the second two columns.
Any help at all would be greatly appreciated.
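For what it's worth, the tallies described above need very little code outside Excel. Here is a minimal Python sketch using only the standard library. The sample rows, the column names, the ISO date format, and the HH:MM time format are all assumptions made for illustration; the real export will need its own parsing pattern.

```python
import csv
import io
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical sample rows in the client / date / 24hr-time layout from the question.
SAMPLE = """client,date,time
A,2013-03-04,09:15
B,2013-03-04,14:40
C,2013-03-05,09:55
D,2013-04-12,16:05
"""

def call_counts(rows):
    """Tally calls per weekday, per month, and per hour of day."""
    by_weekday, by_month, by_hour = Counter(), Counter(), Counter()
    for row in rows:
        # Assumed formats: ISO date and HH:MM; adjust the pattern to match the export.
        stamp = datetime.strptime(f"{row['date']} {row['time']}", "%Y-%m-%d %H:%M")
        by_weekday[DAYS[stamp.weekday()]] += 1
        by_month[stamp.strftime("%Y-%m")] += 1
        by_hour[stamp.hour] += 1
    return by_weekday, by_month, by_hour

by_weekday, by_month, by_hour = call_counts(csv.DictReader(io.StringIO(SAMPLE)))
for hour, n in sorted(by_hour.items()):
    print(f"{hour:02d}:00  {'#' * n}")  # crude text histogram of calls per hour
```

To run it on a real file, replace `io.StringIO(SAMPLE)` with an opened CSV file. The client column is read but never used, which matches the question.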

Related

Forecast or Estimate Next Month Sales For Each Customer

The problem statement that I am currently working on has data available for 27 customers and the purchase amount they have transacted on (in total) for each month in 2021 from Jan until Sept. The data looks like the attached image with this question/post.
sample dataset
I could simply use the average to find the next value, but that would not be very precise. In the absence of any other data or features/columns, is that the only way to solve this question, or are there other methods anyone can suggest? Note that both Excel and Python examples are fine.
Additional Note: I have already tried the FORECAST functions in Excel, but I am not sure whether the outcome is correct, since the Microsoft documentation merely provides the formula by which the function performs its calculations. Excel provides five FORECAST.* functions in total, but the documentation is poor, which will make it hard to write the same solution in Python or any other programming language later.
Taking a cursory glance at the data, there is probably complexity that I'm missing, such as seasonality, trend, noise, and outliers, but let's just assume that this data is a simple trend line for each client.
At a purely high level, Excel can do this with a simple FORECAST.ETS(target_date, values, timeline, [seasonality], [data_completion], [aggregation]) formula.
It can be streamlined with Excel's built-in data tool, Forecast Sheet.
I could go into Python, but a time series forecast there is a little more hands-on.
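Under the "simple trend line for each client" assumption above, a Python version can be sketched as a straight-line least-squares fit per customer. Note that this is a plain linear trend (the kind of fit I understand FORECAST.LINEAR to compute), not the exponential smoothing behind FORECAST.ETS. The customer names and monthly totals below are made up for illustration.

```python
def linear_forecast(values):
    """Fit y = a + b*x by least squares over x = 1..n and predict x = n + 1."""
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n + 1)

# Hypothetical monthly totals (Jan..Sep) for two made-up customers.
purchases = {
    "customer_01": [100, 110, 120, 130, 140, 150, 160, 170, 180],
    "customer_02": [50, 50, 50, 50, 50, 50, 50, 50, 50],
}

# One forecast for the next month (October) per customer.
october = {name: linear_forecast(v) for name, v in purchases.items()}
for name, value in october.items():
    print(f"{name}: {value:.2f}")
```

With only nine points per customer, a straight line is about as much structure as the data can support; comparing it against the plain average on a held-out month is a cheap sanity check.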

Excel bill paying checklist with different currencies

I will be starting my studies abroad very soon, so I'll need to handle my own bills.
I would like to handle them in Excel.
The problem is that I have different currencies: EUR, USD, HUF and ILS.
Problem
Adding new lines with the stock .price (currencies) so that all dates are kept and it is easy to keep track.
I tried doing it with macros (and found out I don't really know how to).
I can't keep the currency rates from different dates while also adding "today" automatically every day.
Thanks to anyone who can help.
If anybody has a different way or solution for handling money management with different currencies (when I need the total in one currency), I would like to hear your idea and learn from you. Thanks.
similar to
XE Travel Expense Calculator
https://www.xe.com/travel-expenses-calculator/
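Outside Excel, the bookkeeping described in the question (keep a rate per date, convert each bill at its own date's rate, total in one currency) can be sketched as follows. This is only an illustration: the rates are placeholder numbers, not real market data, and `RATES_TO_HUF`, `rate_on` and `total_in` are hypothetical names.

```python
from datetime import date

# Placeholder rates (HUF per 1 unit of each currency); NOT real market data.
RATES_TO_HUF = {
    date(2024, 1, 1): {"HUF": 1.0, "EUR": 400.0, "USD": 350.0, "ILS": 100.0},
    date(2024, 2, 1): {"HUF": 1.0, "EUR": 380.0, "USD": 360.0, "ILS": 100.0},
}

def rate_on(day, currency):
    """Use the most recent stored rate on or before the given day."""
    known = [d for d in RATES_TO_HUF if d <= day]
    if not known:
        raise ValueError(f"no rate stored on or before {day}")
    return RATES_TO_HUF[max(known)][currency]

def total_in(target, bills):
    """Sum bills in the target currency, each converted at its own date's rate."""
    total = 0.0
    for day, amount, currency in bills:
        in_huf = amount * rate_on(day, currency)   # convert to the pivot currency
        total += in_huf / rate_on(day, target)     # then to the target currency
    return total

# Hypothetical bills: (date, amount, currency).
bills = [
    (date(2024, 1, 15), 10.0, "EUR"),
    (date(2024, 2, 10), 10.0, "USD"),
    (date(2024, 2, 20), 5000.0, "HUF"),
]
print(total_in("HUF", bills))
```

Because each rate row is keyed by date and never overwritten, old bills keep the rate they were recorded under, and adding "today" is just one more row in the table.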

Calculate most common time of day from spreadsheet values

Preliminary
This question applies to any spreadsheet system. I would like help in breaking down the problem, as opposed to an answer to the problem. (Although the latter would be most useful.)
I understand Stack Overflow is good for specific programming problems, and I understand it may take me a few attempts to get my question right, so please help me clarify my question by providing suggestions and I will update it.
Like many data novices I have good experience with discrete data (e.g. how many enquiries last month), but I struggle to understand how to deal with continuous data (e.g. how to discover patterns, and where the criteria for a query are not yet known).
The question
I have a spreadsheet where each row represents a "website enquiry". There is a datetime column, and I'd like to discover patterns in this data, to answer questions like:
what is the most common time of day to receive an enquiry
what is the most common day of the week to receive an enquiry
other useful information I can glean from the data, to allow me to target possible customers
This would be similar to the functions you often see in Social Media analytics, such as "best time to tweet".
I understand that calculating the most common day of the week is very simple, as days are discrete objects. So I don't need help with this!
I would like to avoid simply splitting up the day into four arbitrary time periods (e.g. breakfast, lunch, dinner, nighttime) and counting the number of rows that fall into these bounds. What if these time periods are not best to use to segment the data?
Is there another way, other than quantizing my data using arbitrary bounds?
You could use clustering to find out what the most common times are. Basically, you compare the time separation of enquiries and cluster them just like a discrete 1D set of numbers using, for example, the average linkage clustering criterion. Once you reach a reasonably small number of clusters, you will start to see the most dominant times of day (and if you want to evaluate those, you can take the time values which are the weighted centres of the biggest clusters).
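The clustering idea above can be sketched in plain Python. This is a naive illustration under a few assumptions: times are expressed as minutes after midnight, the wrap-around at midnight is ignored, and the sample times are made up. The pairwise search is slow, so it suits small samples only; a library implementation would be the practical choice for large data.

```python
from itertools import combinations

def average_linkage(a, b):
    """Mean absolute distance over all cross-cluster pairs."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

def cluster_times(minutes, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[m] for m in minutes]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps i valid
        del clusters[j]
    return sorted(clusters, key=len, reverse=True)  # biggest cluster first

# Hypothetical enquiry times, as minutes after midnight.
times = [9*60 + 5, 9*60 + 20, 9*60 + 40, 10*60, 14*60 + 10, 14*60 + 30, 21*60]
result = cluster_times(times, 3)
for c in result:
    centre = int(sum(c) / len(c))  # centre of the cluster, in whole minutes
    print(f"{len(c)} enquiries around {centre // 60:02d}:{centre % 60:02d}")
```

The number of clusters `k` is still a choice, but unlike fixed breakfast/lunch/dinner bins, the boundaries come from where the data actually bunches.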

How Can I Model Many Short Time Series Samples?

How Can I Model Multiple Short Time Series Samples?
For example, let's say I have a new subject each month, and I measure each subject every day for the entire month. I then want to model these multiple strings of independent time series because I assume that there is an underlying pattern that applies to all 12 subjects. However, a time series with an n of 30 is too short to model, so is there some way to group these 12 time series together for a parallel analysis?
I imagine the way to handle this is similar to how one might handle a time series with multiple breaks of unknown length. Unfortunately, I am unaware of how to deal with this type of data structure.
Any thoughts on where to even begin? What terms I should research?
Well, it depends on what you're interested in. It would be a lot easier if we knew what kind of data you have and what you're trying to analyse.
Trying to answer your question: if you assume that there is some underlying structure which is homogeneous for, say, 6 of the subjects, and different for the other half, you can just pool the two data sets and do some kind of group-mean analysis. If you're interested in a temporal change over the 12 months, then you need to assume that each subject is homogeneous across whatever variable you're measuring.
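As a rough illustration of the pooling idea, here is a minimal Python sketch that lines the subjects up by day of the month and averages across them. The subject names and values are hypothetical, and it assumes the subjects really do share one underlying pattern.

```python
from statistics import mean

def pooled_daily_mean(series_by_subject):
    """Average the subjects day by day, giving one pooled profile over the month."""
    # zip(*...) lines up day 1 of every subject, then day 2, and so on;
    # it stops at the shortest series, so months of unequal length are truncated.
    return [mean(day_values) for day_values in zip(*series_by_subject.values())]

# Hypothetical data: 3 subjects, 5 days each (the question has 12 subjects, ~30 days).
series = {
    "subject_jan": [1.0, 2.0, 3.0, 4.0, 5.0],
    "subject_feb": [2.0, 3.0, 4.0, 5.0, 6.0],
    "subject_mar": [3.0, 4.0, 5.0, 6.0, 7.0],
}
profile = pooled_daily_mean(series)
print(profile)
```

The pooled profile has 12 observations behind every day instead of one, which is what makes the short individual series usable.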
Normally, e.g. for time series in economics, what you're describing is called "censored" or "truncated" data.
Say we want to measure the income of everyone in a country, and we do this by checking electronic paychecks or something similar. Some people at the end of each tail may not have a visible income: poor people may be earning income in other ways, and rich people may want to hide some of their income. This is censored data, and any advanced time series stats book will have something on that.
Truncated data is similar. Just imagine income again. If we exclude everyone who makes less than $10,000 a year, this will "cut off the end" of the distribution. There are remedies for this as well. Again, check an advanced time series book.
Hope this helped a bit.

What kind of stats and info can I get (mine) from time series data?

I have a database with time series data of different solar power plants: how strong was the sun and how much power that plant created / harvested. This data is in 15 min increments.
I would like to use data mining to get new insights and to then visualize the findings to the users.
I know this falls into the domain of data mining, but my problem is maybe more specific (dealing with time series data). So what can I extract from this kind of data or where can I read about this?
Time Series Analysis is a whole field in itself. That said, you can always start with a few basics and keep adding more to your analysis.
Here are a few things to try for starters from your solar power data:
First, profile your solar power data. That is, calculate min, max, daily averages, hourly peaks and lows, etc. to get a feel for the data. Plotting with time on the x-axis will give you visual information.
Time series data can be decomposed into "Trend" and "Seasonality" (which can be for any repeating time interval).
Look for outliers and abnormalities in your data stream: missing values, repeats, etc.
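The profiling and gap-checking steps above can be sketched in plain Python. This is a minimal illustration: the `(timestamp, power)` tuple layout and the sample readings are assumptions, and it reports hourly means rather than peaks and lows to keep the block short.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

def profile(readings, step=timedelta(minutes=15)):
    """Summarise (timestamp, power) readings: extremes, daily and hourly means, gaps."""
    readings = sorted(readings)
    by_day, by_hour = defaultdict(list), defaultdict(list)
    for stamp, power in readings:
        by_day[stamp.date()].append(power)
        by_hour[stamp.hour].append(power)
    # A gap is any pair of consecutive readings further apart than the expected step.
    gaps = [(a, b) for (a, _), (b, _) in zip(readings, readings[1:]) if b - a > step]
    powers = [p for _, p in readings]
    return {
        "min": min(powers),
        "max": max(powers),
        "daily_mean": {d: mean(v) for d, v in by_day.items()},
        "hourly_mean": {h: mean(v) for h, v in by_hour.items()},
        "gaps": gaps,
    }

# Hypothetical readings with the 12:30 sample missing.
start = datetime(2024, 6, 1, 12, 0)
readings = [
    (start, 10.0),
    (start + timedelta(minutes=15), 12.0),
    (start + timedelta(minutes=45), 8.0),
    (start + timedelta(minutes=60), 6.0),
]
summary = profile(readings)
print(summary["min"], summary["max"], summary["gaps"])
```

The hourly means across many days give a first look at the daily seasonality; a proper trend/seasonality decomposition would build on that.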
If you want to learn more about time series (and if you know R), then the forecast package is a good way to get started. (Especially this free e-book.)
Any search on Time Series will take you to Prof. Hyndman's pages, and I have found the free chapters of his forecasting book very useful.
Hope that helps you get started.
