Is there a recent or known issue with the #flurry Data Download? - flurry

Our #flurry App Data Download appears to be bugged.
We recently (Oct 2nd, 2020) requested raw data for analytics, but the result contained far less data than we expected; only a few raw records were included. For example, we compared the same arbitrary period in an older export (retrieved around Sept 11th) and a newer one (retrieved after Oct 5th):
- Data retrieved around Sept 11th: 16 MB
- Data retrieved after Oct 5th: 18.6 kB
Both exports cover the same period and the same data selection.
Very little raw data is exported, yet the event counts in Flurry Analytics look fine; every graph on the dashboard is normal.
Flurry Analytics website --> about 30,000 events
Exported data --> about 60 records
It is not related to the export file format (CSV, XML, JSON); the result is the same for each.
Additional information, Oct 7th 2020:
I performed the data download as follows:
- Log in to the Flurry Analytics console.
- Click Data Download for Sessions.
- Select the application SmartSync (iOS) or SmartSync (Android).
- Set Event for any period, and choose CSV or another format.
Is this a known issue or a recent bug?
If anyone knows any tips or the correct settings, could you please advise?

This is now fixed. Please email support if you have further difficulties.

Related

Rally CumulativeFlowData endpoint showing data > 24hours old

Using the CumulativeFlowData diagram available in Rally we can see a chart that includes data up to the current day. When pulling data from the Rally API, however, we can only get data that is at least 24 hours old.
We have tried pulling CumulativeFlowData by both Iteration Id and Created date, and both only provide data with a 24-hour delay.
Does anyone know why there is such a big lag with the data being available via the API?
https://rally1.rallydev.com/slm/webservice/v2.0/iterationcumulativeflowdata?workspace=<<myworkspace>>&project=https://rally1.rallydev.com/slm/webservice/v2.0/project/<<myprojectId>>&query=(IterationObjectID = <<myIterationId>>)&fetch=CreationDate,CardEstimateTotal,CardState&start=1&pagesize=200
e.g. [screenshots: the Rally view of the CFD vs. the data as retrieved from the API]
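For reference, here is a minimal sketch of issuing that same request from Python. The <<...>> values are placeholders as in the URL above, and it assumes key-based authentication via the ZSESSIONID header is enabled for your subscription:

import requests

BASE = "https://rally1.rallydev.com/slm/webservice/v2.0"
params = {
    "workspace": "<<myworkspace>>",
    "project": f"{BASE}/project/<<myprojectId>>",
    "query": "(IterationObjectID = <<myIterationId>>)",
    "fetch": "CreationDate,CardEstimateTotal,CardState",
    "start": 1,
    "pagesize": 200,
}
resp = requests.get(
    f"{BASE}/iterationcumulativeflowdata",
    params=params,
    headers={"ZSESSIONID": "<<myApiKey>>"},  # Rally API key (assumption: key auth is enabled)
)
resp.raise_for_status()
result = resp.json()["QueryResult"]
for row in result["Results"]:
    print(row["CreationDate"], row["CardState"], row["CardEstimateTotal"])

The most recent CreationDate returned this way is still roughly a day behind the chart, which is the lag described above.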

Data being overwritten when outputting data from Stream Analytics to Power BI

Lately I've been playing around with Stream Analytics queries with Power BI as an output sink. I made a simple query that retrieves the total count of HTTP response codes of our website requests over time and groups them by date and response code.
The input data is retrieved from a storage account holding blob storage. This is my query:
SELECT
DATETIMEFROMPARTS(DATEPART(year,R.context.data.eventTime), DATEPART(month,R.context.data.eventTime),DATEPART(day,R.context.data.eventTime),0,0,0,0) as datum,
request.ArrayValue.responseCode,
count(request.ArrayValue.responseCode)
INTO
[requests-httpresponsecode]
FROM
[cvweu-internet-pr-sa-requests] R TIMESTAMP BY R.context.data.eventTime
OUTER APPLY GetArrayElements(R.request) as request
GROUP BY DATETIMEFROMPARTS(DATEPART(year,R.context.data.eventTime), DATEPART(month,R.context.data.eventTime),DATEPART(day,R.context.data.eventTime),0,0,0,0), request.ArrayValue.responseCode, System.TimeStamp
Since continuous export became active on 3 September 2018, I chose a job start time of 3 September 2018. Since I am interested in the statistics up to today, I did not include a date interval, so I am expecting to see data from 3 September 2018 until now (20 December 2018). The job is running fine without errors and I chose Power BI as the output sink. Immediately I saw the chart being populated starting from 3 September, grouped by day and counting. So far, so good. A few days later I noticed the output dataset no longer started from 3 September but from 2 December until now. Apparently the data is being overwritten.
The following link says:
https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-power-bi-dashboard
"defaultRetentionPolicy: BasicFIFO: Data is FIFO, with a maximum of 200,000 rows."
But my output table does not have anywhere close to 200,000 rows:
datum,count,responsecode
2018-12-02 00:00:00,332348,527387
2018-12-03 00:00:00,3178250,3282791
2018-12-04 00:00:00,3170981,4236046
2018-12-05 00:00:00,2943513,3911390
2018-12-06 00:00:00,2966448,3914963
2018-12-07 00:00:00,2825741,3999027
2018-12-08 00:00:00,1621555,3353481
2018-12-09 00:00:00,2278784,3706966
2018-12-10 00:00:00,3160370,3911582
2018-12-11 00:00:00,3806272,3681742
2018-12-12 00:00:00,4402169,3751960
2018-12-13 00:00:00,2924212,3733805
2018-12-14 00:00:00,2815931,3618851
2018-12-15 00:00:00,1954330,3240276
2018-12-16 00:00:00,2327456,3375378
2018-12-17 00:00:00,3321780,3794147
2018-12-18 00:00:00,3229474,4335080
2018-12-19 00:00:00,3329212,4269236
2018-12-20 00:00:00,651642,1195501
EDIT: I have created the STREAM input source according to
https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-quick-create-portal. I can create a REFERENCE input as well, but this invalidates my query since APPLY and GROUP BY are not supported and I also think STREAM input is what I want according to https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-add-inputs.
What am I missing? Is it my query?
It looks like you are streaming to a Streaming dataset. Streaming datasets don't store the data in a database, but keep only the last hour of data. If you want to keep the data pushed to them, you must enable the Historic data analysis option when you create the dataset.
This will create a PushStreaming dataset (a.k.a. Hybrid) with a basicFIFO retention policy (i.e. roughly 200k-210k records are kept).
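(For context, that retention policy is an explicit choice when a push dataset is created through the Power BI REST API. A minimal sketch below, assuming you already have an AAD access token with dataset write permissions; the dataset, table and column names are illustrative, not the ones ASA creates for you.)

import requests

token = "<AAD access token>"  # assumption: acquired separately, with Dataset.ReadWrite.All
body = {
    "name": "requests-httpresponsecode",
    "defaultMode": "PushStreaming",  # hybrid: streaming tiles plus a stored table
    "tables": [{
        "name": "requests",
        "columns": [
            {"name": "datum", "dataType": "Datetime"},
            {"name": "responsecode", "dataType": "Int64"},
            {"name": "count", "dataType": "Int64"},
        ],
    }],
}
resp = requests.post(
    "https://api.powerbi.com/v1.0/myorg/datasets",
    params={"defaultRetentionPolicy": "basicFIFO"},  # keep ~200k rows, oldest dropped first
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
print(resp.json())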
You're correct that Azure Stream Analytics should be creating a "PushStreaming" or "Hybrid" dataset. Can you confirm that your dataset is correctly configured as "Hybrid"? (You can check this attribute even after creation.)
If it is the correct type, can you please clarify the following:
Does the schema of your data change? If, for example, you send the datum {a: 1, b: 2} and then {c: 3, d: 4}, Azure Stream Analytics will attempt to change the schema of your table, which can invalidate older data.
How are you confirming the number of rows in the dataset?
Looks like my query was the problem. I had to use TUMBLINGWINDOW(day,1) instead of System.TimeStamp.
TUMBLINGWINDOW and System.Timestamp produce exactly the same chart output on the frontend, but they seem to be processed differently in the backend. This is not reflected in the frontend in any way, which was confusing. I suspect something happens in the backend, due to the way the query is processed when not using TUMBLINGWINDOW, that makes you hit the 200k-rows-per-dataset limit sooner than expected. The query below is the one producing the expected result.
SELECT
request.ArrayValue.responseCode,
count(request.ArrayValue.responseCode),
DATETIMEFROMPARTS(DATEPART(year,R.context.data.eventTime), DATEPART(month,R.context.data.eventTime),DATEPART(day,R.context.data.eventTime),0,0,0,0) as date
INTO
[requests-httpstatuscode]
FROM
[cvweu-internet-pr-sa-requests] R TIMESTAMP BY R.context.data.eventTime
OUTER APPLY GetArrayElements(R.request) as request
GROUP BY DATETIMEFROMPARTS(DATEPART(year,R.context.data.eventTime), DATEPART(month,R.context.data.eventTime),DATEPART(day,R.context.data.eventTime),0,0,0,0),
TUMBLINGWINDOW(day,1),
request.ArrayValue.responseCode
As we speak, my Stream Analytics job is running smoothly and producing the expected output from 3 September until now, without data being overwritten.

Azure: Resource usage API issue

I am trying to pull Azure resource usage data for billing metrics. I followed the steps mentioned in the blog below to get usage data for resources.
https://msdn.microsoft.com/en-us/library/azure/mt219001.aspx
Even if I set the start and end time parameters in the URL, they do not take effect. It returns the entire output [from the time the resource was created/added].
For example :
https://management.azure.com/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Commerce/UsageAggregates?api-version=2015-06-01-preview&reportedStartTime=2017-03-03T00%3a00%3a00%2b00%3a00&reportedEndTime=2017-03-04T00%3a00%3a00%2b00%3a00&aggregationGranularity=Hourly&showDetails=true
As per the above URL, it should return the data between 2017-03-03 and 2017-03-04, but it also shows data from 2nd March [2017-03-02]. I don't know why it returns the entire output and why the time filter is not working.
Note: the end time parameter takes effect, meaning the output only goes up to the time mentioned in the end time, but the start time is not taken into account.
Does anyone have a suggestion on this?
So there are a few things to consider:
There is usage date/time and then there is reported date/time. The former tells you the date/time when the resources were used, while the latter tells you the date/time when this information was received by the billing sub-system. There will be some delay between when the resources were used and when they were reported. From this link:
"Set {dateTimeOffset-value} for reportedStartTime and reportedEndTime to valid dateTime values. Please note that this dateTimeOffset value represents the timestamp at which the resource usage was recorded within the Azure billing system. As Azure is a distributed system, spanning across 19 datacenters around the world, there is bound to be a delay between the resource usage time (when the resource was actually consumed) and the resource usage reported time (when the usage event reached the billing system), and callers need a predictable way to get all usage events for a subscription for a given time period."
The query only lets you search by reported date/time; there is no provision for searching by usage date/time. However, the data returned to you contains the usage date/time, not the reported date/time.
Long story short, because of the delay in propagating the usage information to the billing sub-system, the behavior you're seeing is correct. In my experience, it takes about 24 hours for all the usage information to show up in the billing sub-system.
The way we handle this scenario in our application is to fetch the data for a longer duration and then pick up only the data we're interested in (a rough sketch of this is below). For example, if I need to see the data for the 1st of March, we query by reported date/time from 1st March to, say, 4th March (i.e. today's date) and then discard any data where the usage date is not the 1st of March.
If we don't find any data (which is quite possible, and is happening in your case as well), we simply tell the users that the usage information is not yet available.
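A rough sketch of that "query a wider reported window, then filter on usage date" approach in Python, assuming you already have an ARM bearer token; the subscription id is the placeholder from the question, and the property names follow the usage aggregates schema as I recall it:

import requests

token = "<ARM bearer token>"  # assumption: acquired separately
sub = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
url = (
    "https://management.azure.com/subscriptions/"
    + sub
    + "/providers/Microsoft.Commerce/UsageAggregates"
)
params = {
    "api-version": "2015-06-01-preview",
    # Reported window deliberately wider than the usage day we actually care about.
    "reportedStartTime": "2017-03-01T00:00:00+00:00",
    "reportedEndTime": "2017-03-04T00:00:00+00:00",
    "aggregationGranularity": "Hourly",
    "showDetails": "true",
}
resp = requests.get(url, params=params, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
payload = resp.json()
# Keep only aggregates whose *usage* time falls on 1 March, whenever they were reported.
march_first = [
    item for item in payload.get("value", [])
    if item["properties"]["usageStartTime"].startswith("2017-03-01")
]
print(len(march_first), "usage records for 1 March")
# A real implementation would also follow payload.get("nextLink") to page through results.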

Oracle Responsys - send record to table in warehouse on event

I'm trying to figure out how to send Responsys record(s) to a table in our data warehouse (MS SQL) in real time, triggered by an interaction event.
The use case is:
- A mass email is sent
- Customer X interacts with the email (e.g. open, click)
- Responsys sends the contact, along with a unique identifier (let's call it 'customer_key') and phone number, to the table in the warehouse within several minutes of the customer interaction
Once in the table, I can pass it to our third-party call centre platform.
Any help would be greatly appreciated!
Thanks
Alex
From what I know of Responsys, the most often you can download interaction data is 6 times a day, via the Export Event Data Feed.
If you need it more often than that, I think you will have to set up a filter in Responsys that checks user interactions in the last 15 minutes, and then schedule a download for every 15-minute interval via Connect.
It would have to be 15 minutes, as you can only schedule a custom download within a 15-minute window on Responsys.
You'd then need to automate downloading the file, loading it and importing it (a rough sketch of that is below).
I highly doubt this is responsive enough for you, however!
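If you do go that route, the download-and-load part is straightforward to script. A very rough sketch in Python, assuming the scheduled export lands as a CSV on the Responsys SFTP file server and that the warehouse table below exists; every host, path, credential and column name here is a placeholder:

import csv
import io

import paramiko  # SFTP client, for fetching the scheduled export
import pyodbc    # MS SQL driver, for loading the warehouse table

SFTP_HOST = "files.responsys.example.com"        # placeholder
EXPORT_PATH = "/export/interactions_latest.csv"  # placeholder

# 1. Pull the latest export file from the file server.
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username="<user>", password="<password>")
sftp = paramiko.SFTPClient.from_transport(transport)
fh = sftp.open(EXPORT_PATH)
rows = list(csv.DictReader(io.StringIO(fh.read().decode("utf-8"))))
fh.close()
sftp.close()
transport.close()

# 2. Load the rows into the table the call-centre platform reads from.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;UID=<uid>;PWD=<pwd>"
)
with conn:  # commits on success
    cur = conn.cursor()
    cur.fast_executemany = True
    cur.executemany(
        "INSERT INTO dbo.responsys_interactions (customer_key, phone_number) VALUES (?, ?)",
        [(r["CUSTOMER_KEY"], r["PHONE_NUMBER"]) for r in rows],  # column names are assumed
    )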

Server logs / Webalizer, 206 partial content for audio and video files – how do I calculate the number of downloads?

I need to calculate the number of video and audio file downloads from our media server. Our media server only hosts audio/video files (mp3 and mp4) and we parse our IIS log files monthly using Stone Steps Webalizer.
When I look at the Webalizer stats most of the ‘hits’ are ‘code 206 partial content’ and most of the remainder are ‘code 200 ok’. So for instance our most recent monthly Webalizer stats look something like this -
Total hits: 1,600,000
Code 200 - ok: 300,000
Code 206 - Partial Content: 1,300,000
The total hits figure is much larger than I would expect it to be in relation to the amount of data being served (Total Kbytes).
When I analyse the log files it looks as though media players (iTunes, QuickTime etc.) create multiple 206s for a single download/play, and I suspect that Webalizer does not group these multiple 206s from the same IP/visit and instead records each 206 as a ‘hit’ - because of this the total hits figure is vastly inflated. There is a criticism of Webalizer on its Wikipedia page which appears to confirm this - http://en.wikipedia.org/wiki/Webalizer
Am I correct about the 206s and Webalizer, and if so, how would I calculate the number of downloads? Is there an industry-standard methodology and/or are there alternative web analytics applications better suited to the task?
Any help or advice would be much appreciated.
Didn't receive any response to my question but thought I would give an update.
We have analysed a one-hour sample of our log files and done some testing of different browsers/media players with an mp3 and an mp4 file.
Here are our findings:
- Some media players, particularly iTunes/QuickTime, produce a series of 206 requests but do not produce a 200 request.
- Most but not all web browsers (Chrome is the exception) produce a 200 request and no 206 requests when downloading a media file, i.e. downloading to the desktop as opposed to playing it in a desktop media player or media player plug-in.
- If the file is cached by the browser/media player, it may produce a 304 request and no 200 or 206 requests.
Given the above, we think it's impossible to count 'downloads' of media files from log file analysis unless the software has an intelligent algorithm designed specifically for that purpose. For example, it would need to group all requests for a specific media file from the same IP within a set time period (say 30 minutes) and count that as one download (a rough sketch of this kind of grouping is below). As far as I'm aware there isn't any log file analysis software on the market which offers that functionality.
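For what it's worth, if you parse the raw IIS logs yourself the grouping is not hard to script. A minimal sketch of the idea (the tuple layout of a parsed log row and the 30-minute window are assumptions):

from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)

def count_downloads(rows):
    # rows: iterable of (timestamp, client_ip, requested_file, status) parsed from the logs
    hits = sorted((ts, ip, path) for ts, ip, path, status in rows if status in (200, 206))
    last_seen = {}  # (ip, path) -> timestamp of the most recent request in the current "download"
    downloads = 0
    for ts, ip, path in hits:
        key = (ip, path)
        if key not in last_seen or ts - last_seen[key] > WINDOW:
            downloads += 1  # first request, or a new session after 30 quiet minutes
        last_seen[key] = ts
    return downloads

# e.g. three 206 hits from one IP: the first two fall in one window, the third starts a new one
sample = [
    (datetime(2013, 5, 1, 10, 0), "1.2.3.4", "/episode1.mp3", 206),
    (datetime(2013, 5, 1, 10, 1), "1.2.3.4", "/episode1.mp3", 206),
    (datetime(2013, 5, 1, 11, 0), "1.2.3.4", "/episode1.mp3", 206),
]
print(count_downloads(sample))  # -> 2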
I did a quick Google search to find out more about podcast/video metrics and log file analysis, and it does seem to be a very real, albeit niche, problem. Google Analytics and other web metrics tools that use web beacons, e.g. SiteStat, are not an option unless your media files are only available for download from your website, i.e. no RSS or iTunes syndication etc. Even then I'm not sure they could do the job.
I think this is why companies such as podtrac and blubrry offer specialised podcast/video measurement tools using redirects as opposed to log file analysis.
Podtrac
http://podtrac.com/publisher/measurement
Blubrry
http://www.blubrry.com/podcast_statistics/
If anyone has experience or expertise in this area feel free to chime in and offer advice or correct me if I'm wrong.
Try my software. I encountered the same issue with mp3s being split into multiple streams for iPods and iPhones. It is really easy to implement and works a treat.
Github
This is probably WAY too late to help you specifically, but if you have parsed your server logs and stored them somewhere sensible like a DBMS, a quick bit of SQL will give you the combined results you're after. Given a very simple log table where each 206 is recorded with a 'hit time', the IP address of the endpoint and an id/foreign key of the item fetched, you could run this query:
select min(hit_time) as hit_time, ip_address, episode_id
from podcast_hit
group by DATE(hit_time), ip_address, episode_id
This will group up all the 206 records and make them unique by day and user giving you more accurate stats. Hope this helps someone!

Resources