Analytics counter using sofa and couchApp - couchdb

Couch has a REST interface.
This means that data-updates are exclusive to PUT calls.
I'm inspecting ways to implement a humble analyics counters, and came accross the features of couchdb, sofa and couchapp - which are kin'da cool, having in mind my strong JavaScript orientation.
However, most web-analytics servcies end with making count update calls using requesting some resource, usually in an IMG or SCRIPT tag.
Is there a way I can use couchApp to
use GET request to perform my counts?
Would that be abuse of the
architecture? I mean, not everything in couch is REST - i,g, - the administration parts are not.
I'd be very happy to hear what the experts have to say :)
** Editted *
I just noted that CouchDB and Sofa are shipped with a Mochiweb web-server!
Maybe there's a way I could hook on that?

Fork or plugin idea
If you are an Erlang programmer (or you're looking for a new project to learn Erlang), then you definitely can write anything you want as a plugin/extension to CouchDB. The smallest example I know of is Die CouchDB, my proof-of-concept which adds one query that will simply stop the server.
https://github.com/iriscouch/die_couchdb
You could in principle write a plugin or fork of CouchDB to handle GET requests and do anything with them.
Note about REST architecture
I am not super familiar with analytics implementations, but the point of REST and HTTP is that GET queries have no side-effects and/or are idempotent (running 50 queries is no different from running one).
The upshot is, proxies can and will cache many GET responses, in both standard and nonstandard ways. That seems incompatible with user tracking and data gathering techniques; however maybe analytics tools still think the benefits outweigh the costs.
For most people, it's probably easier to use external tools.
Log idea
One trick is to GET anything from Couch, and then check the log entry from couch. You can get the couch log by querying /_log as the admin. The log will show users' IP address, request path, and any query parameters.
For example
$ curl -X GET http://localhost:5984/?userid=abcde\&windowsize=1024x768\&color=blue
{"couchdb":"Welcome","version":"1.1.0"}
$ curl localhost:5984/_log | grep userid
[Mon, 23 May 2011 00:34:54 GMT] [info] [<0.1409.0>] 127.0.0.1 - - 'GET' /?userid=abcde&windowsize=1024x768&color=blue 200
Next you can process that log entry and re-insert into your actual analytics database yourself.
Wrapper idea
A final solution is to run a simple reverse-proxy which converts your GET requests into whatever you need. NodeJS is getting popular for tasks like that, but you can use any web platform you prefer: PHP, ASP, JSP, whatever you know already.
You simply respond to the GET request and do whatever you need on the server side, such as inserting the relavant information into your analytics db.
Good luck!

Related

Tracking Discord with GA4

Bots are amazing, unless you're Google Analytics
After many months of learning to host my own Discord bot, I finally figured it out! I now have a node server running on my localhost that sends and receives data from my Discord server; it works great. I can do all kinds of the things I want to with my Discord bot.
Given that I work with analytics everyday, one project I want to figure out is how to send data to Google Analytics (specifically GA4) from this node server.
NOTE: I have had success in sending data to my Universal Analytics property. However, as awesome as that was to finally see pageviews coming into, it was equally heartbreaking to recall that Google will be getting rid of Universal Analytics in July of this year.
I have tried the following options:
GET/POST requests to the collect endpoint
This option presented itself as impossible from the get-go. In order to send a request to the collection endpoint, a client_id must be sent along with the request itself. And this client_id is something that must be generated using Google's client id algorithm. So, I can't just make one up.
If you consider this option possible, please let me know why.
Install googleapis npm package
At first, I thought I could just install the googleapis package and be ready to go, but that idea fell on its face immediately too. With this package, I can't send data to GA, I can only read with it.
Find and install a GTM npm package
There are GTM npm packages out there, but I quickly found out that they all require there to be a window object, which is something my node server would not have because it isn't a browser.
How I did this for Universal Analytics
My biggest goal is to do this without using Python, Java, C++ or any other low level languages. Because, that route would require me to learn new languages. Surely it's possible with NodeJS alone... no?
I eventually stumbled upon the idea of actually hosting a webpage as some sort of pseudo-proxy that would send data from the page to GA when accessed by something like a page scraper. It was simple. I created an HTML file that has Google Tag Manager installed on it, and all I had to do was use the puppeteer npm package.
It isn't perfect, but it works and I can use Google Tag Manager to handle and manipulate input, which is wonderful.
Unfortunately, this same method will not work for GA4 because GA4 automatically excludes all identified bot traffic automatically, and there is no way to turn that setting off. It is a very useful feature for GA4, giving it quite a bit more integrity than UA, and I'm not trying to get around that fact, but it is now the Bane of my entire goal.
https://support.google.com/analytics/answer/9888366?hl=en
Where to go from here?
I'm nearly at the end of my wits on figuring this one out. So, either an npm package exists out there that I haven't found yet, or this is a futile project.
Does anyone have any experience in sending data from NodeJS to GA4? (or even GTM?) How did you do it?
...and this client_id is something that must be generated using Google's client id algorithm. So, I can't just make one up...
Why, of course you can. GA4 generates it pretty much the same as UA does. You don't need anything from google to do it.
Besides, instead of mimicking just requests to the collect endpoint, you may just wanna go the MP route right away: https://developers.google.com/analytics/devguides/collection/protocol/ga4 The links #dockeryZ gave, work perfectly fine. Maybe try opening them in incognito, or in a different browser? Maybe you have a plugin blocking analytics urls.
Moreover, you don't really need to reinvent the bicycle. Node already has a few packages to send events to GA4, here's one looking good: https://www.npmjs.com/package/ga4-mp?activeTab=readme
Or you can just use gtag directly to send events. I see a lot of people doing it even on the front-end: https://www.npmjs.com/package/ga-gtag Gtag has a whole api not described in there. Here's more on gtag: https://developers.google.com/tag-platform/gtagjs/reference Note how the library allows you to set the client id there.
The only caveat there is that you'll have to track client ids and session ids manually. Shouldn't be too bad though. Oh, and you will have to redefine the concept of a pageview, I guess. Well, the obvious one is whenever people post in the chan that is different from the previous post in a session. Still, this will have to be defined in the code.
Don't worry about google's bot traffic detection. It's really primitive. Just make sure your useragent doesn't scream "bot" in it. Make something better up.

BreezeJS with a NodeJS API (not directly to MongoDB)

I'm researching BreezeJS for a big upcoming project.
Our goal is a offline first web app.
But here is what I can't fully understand (and would take to much time to test) - Does BreezeJS allow for the backend to be a REST API (built with NodeJS and Express)?
We need this because we don't want to simply sync to a remote DB (in our case Mongo), but use a remote REST API so that we can embed some business logic. Things like workflow triggering on a POST to a particular entity.
Is this possible with BreezeJS? If not what would be a good option?
Thanks in advance
It is certainly possible, simply take the breeze-mongo server implementation and strip out the mongo specific code. This should be fairly straight forward, express and mongo are pretty well separated in the code.
That said, you would lose or have to rewrite much of the server side code that converts an OData query string into a mongo query, but if you are going pure 'REST' you probably don't want that anyway.
You would have to do something similar on the save/POST side, but this is presumably something you are already familiar with.

storing metadata from the spotify api

I want to use the spotify api to create a webapp. Without going into too much detail about the project, I want to clear up whether it would be against the terms and conditions or not.
After reading the terms and conditions, i read this line under things NOT to do: "aggregate Metadata to create data bases, or any other compilations of Metadata".
I don't plan to do any automated requests, for example, hammering the service with different queries to build a database... I'm just wondering whether I can store results from users who have performed searches via my application to the api, so that I can build content from my database on other parts of the application.
Thanks
I'm not a lawyer, so you'll need to have a lawyer confirm this (contracts, including ToS contracts, are important), but the general gist is that if you cache the results of user-generated requests to create features then you're ok. If you start caching stuff not generated by a user, you're in muddy water.
Good:
Other users who searched for "Madonna" in MyAwesomeApp also searched for "Backstreet Boys"!
Bad:
Here's a list of all the blue cover arts on Spotify: [list]
To generate the first example you can cache and work with searches explicitly done by users of your application. The second would require scraping all of the coverart in the service, which isn't allowed.

How do I run server-side code from couchdb?

Couchdb is great at storing and serving data, but I'm having some trouble getting to grips with how to do back-end processing with it. GWT, for example, has out of the box support for synchronous and asynchronous call backs, which allow you to run arbitrary Java code on the server. Is there any way to do something like this with couchdb?
For example, I'd like to generate and serve a PDF file when the user clicks a button a web app. Ideally the workflow would look something like this:
User enters some data
User clicks a generate button
A call is made to the server, and the PDF is generated server side. The server code can be written in any language, but preferably Java.
When PDF generation is finished, the user is prompted to download and save the document.
Is there a way to do this with out of the box couchdb, or is some additional, third-party software required to communicate between the web client and backend data processing code?
EDIT:Looks like I did a pretty poor job of explaining my question. What I'm interested in is essentially serving servlets from Couchdb similarly to the way that you can serve Java servlets along side web pages from a war file. I used GWT as an example because it has support for developing the servlets and client side code together and compiling everything into a single war file. I'd be very interested in something like this because it would make deploying fully functional websites a breeze through Couchdb replication.
By the looks of it, however, the answer to my question is no, you can't serve servlets from couchdb. The database is set up for CRUD style interactions, and any servlet style components need to either be served separately, or done by polling the db for changes and acting accordingly.
Here's what I would propose as the general workflow:
When user clicks Generate: serialize the data they've entered and any other relevant metadata (e.g. priority, username) and POST it to couchdb as a new document. Keep track of the _id of the document.
Code up a background process that monitors couchdb for documents that need processing.
When it sees such a document, have it generate the PDF and attach it to that same couch doc.
Now back to the client side. You could use ajax polling to repeatedly GET the couch doc and test whether is has an attachment or not. If it does, then you can show the user the download link.
Of course the devil is in the details...
Two ways your background process(es) can identify pending documents:
Use the _changes API to monitor for new documents with _rev beginning with "1-"
Make requests on a couchdb view that only returns docs that do not have an "_attachments" property. When there are no documents to process it will return nothing.
Optionally: If you have multiple PDF-making processes working on the queue in parallel you will want to update the couch doc with a property like {"being-processed":true} and filter these out the view as well.
Some other thoughts:
I do not recommend using the couchdb externals API for this use case because it (basically) means couchdb and your PDF-generating code must be on the same machine. But it's something to be aware of.
I don't know a thing about GWT, but it doesn't seem necessary to accomplish your goals. Certainly CouchDB can serve any static files (js or other) you want either as attachments to docs in a db or from the filesystem. You could even eval() JSON properties you put into couch docs. So you can use GWT to make ajax calls or whatever but GWT can be completely decoupled from couchdb. Might be simpler that way.
GWT has two parts to it. One is a client that the GWT compiler translates to Java, and the other is a Servlet if you do any RPC. Typically you would run your Client code on a browser and then when you made any RPC calls you would contact a Java Servlet Engine (Such as Tomcat or Jetty or ...) , which in turn calls you persistence layer.
GWT does have the ability to do JSON requests over HTTP and coincidentally, this is what CouchDB uses. So in theory it should be possible. (I do not know if anybody has tried it). There would be a couple of issues.
CouchDB would need to serve up the .js files that have the compiled GWT client code.
The main issue I see in your case is that couchDB would need to generate your PDF files, while couchDB is just a storage engine and does not typically do any processing. I guess you could extend it if you are any good with the Erlang programming language.

Simple sip based client interaction... Any Ideas

I am tring to do the following:
I want a SIP User Agent to perform the following steps on receiving an inbound call (call set up request).
1) Read the caller ID from the SIP request and Log the details to file
2) Drop the call (terminate the call without picking up the call)
I have not been able to find a high level api that will let me script this interaction. I have taken a look at Jain but it seems to be a very low level API and I imagine will require a lot of work to get the above interaction coded up and working. Can anyone suggest an apropriate API to implement the above.
NOTE: I have tried ROXEO.com and their CCXML based apps are great but their pricing is aimed at big companies, so Voxeo is not an option.
There are quite a few open source SIP stacks around two examples of many are pjsip and sipsorcery (as a disclaimer I do some dev work on the latter). It will all depend on your language and prefeences as to which one suits. There are also lots of SIP tools around that may be a more efficient approach for you such as SIPp.
Apart from those options and given your very simple requirements you could probably get away with 20 or 30 lines of code that listens on a UDP socket, parses the incoming INVITE to extract the From header and then sends back a rejection response by changing the top line of the request to make it a response and sending it back to where it came from.
If you're using C, try eXosip, you could easily whatever you want.
Here
It's clear that Jain SIP could be quite painful (actually all the configuration but the API otherwise is quite high-level, to manipulate messages) , but you can take the jain-sip-presence-proxy and removes almost everything from their INVITE handler and build your own message
if you're using java, you can use peers which provides a high level api in package net.sourceforge.peers.sip.core.useragent. The entry point is UserAgent class, take a look at gui package if you want to see how it is used. Traces are in log files so you can track calls.
ivrworx but it can handle one scenarion at a time only
Asterisk pbx can act as a simple sip client, and do just that, however if you wante to integrate something in your own solution, take a look at: http://sipsimpleclient.org/projects/sipsimpleclient/wiki/SipMiddlewareApi

Resources