Share files and data with external clients - Azure

We have a use case to share data and some associated files with external clients. The data is stored in a data lake (Snowflake), and we are thinking of storing the related files in S3 or Azure Blob Storage. These files are supplementary information/additional attachments to the data.
The data is securely served from snowflake.
Is it possible to generate a secure link to the file and serve that along with the data for users to access?
Pre-signed or SAS URLs will not work because of security concerns.
Is it possible to generate links to the files that are easy to open in a browser with B2B-type authentication? Do we need to build a custom function/app to achieve this, or are there other options? Has anyone worked through a similar use case before?

Moving from comments to an answer for closure:
There are upcoming features that might be exactly what you need. These features are in private preview now, and official documentation will come soon.
In the meantime check the presentation at https://events.snowflake.com/summit/agenda/session/546417?widget=true to learn more.
I'll update this answer when the features get released and publicly documented.
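If you do end up building the custom function/app the question mentions, one common pattern is a small authenticated API that verifies the caller and then streams the blob, so no storage URL is ever handed out. A minimal sketch, assuming Flask, the azure-storage-blob SDK, and a placeholder verify_b2b_token helper for whatever B2B identity provider you use (none of these are prescribed by the question or the answer):

```python
# Hedged sketch: serve attachments through an authenticated endpoint instead of
# handing out pre-signed/SAS URLs. Flask, azure-storage-blob and the
# verify_b2b_token() placeholder are illustrative assumptions.
import os

from azure.storage.blob import BlobServiceClient
from flask import Flask, Response, abort, request

app = Flask(__name__)
blob_service = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)

def verify_b2b_token(auth_header):
    """Placeholder: validate the caller's token against your identity provider."""
    return auth_header is not None  # replace with real validation

@app.route("/files/<path:blob_name>")
def download(blob_name):
    if not verify_b2b_token(request.headers.get("Authorization")):
        abort(401)
    blob = blob_service.get_blob_client(container="attachments", blob=blob_name)
    data = blob.download_blob().readall()
    return Response(data, mimetype="application/octet-stream")
```

The "secure link" served alongside the Snowflake data would then point at this endpoint rather than at the storage account itself.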

Related

Azure QnA with data source as a secure API

My legacy system is a CRM; it has lots of knowledge base articles, and users keep adding data to it, so it's dynamic.
Now I want to bring the knowledge base data into my QnA service and take advantage of LUIS with QnA to develop a chatbot that will be installed on my company's website.
As per the documentation here:
Content is brought into a knowledge base from a data source. Data source locations are public URLs or files, which do not require authentication.
The important parts: the content must come from a public URL or file that is accessible from the internet (not only from an internal network) AND does not require authentication. If both hold, QnA Maker will be able to import it.
Support for secured SharePoint files has recently been added; you can read about it here.
In your case you might have to do one of the following to get the data out:
Write a piece of software to crawl through and scrape the contents from the CRM system
Write a piece of software that accesses the data store/API behind the CRM system
Then use the QnA Maker REST APIs to update your knowledge base.
There may be other options that are a better approach in your case, but due to my limited knowledge of your internal systems I cannot make any more specific recommendations.
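For the last step, here is a minimal sketch of calling the QnA Maker v4.0 update-knowledgebase REST operation, with placeholder endpoint, knowledge base ID, and subscription key (none of these values come from the original answer):

```python
# Hedged sketch: push scraped CRM articles into a QnA Maker knowledge base via the
# v4.0 update-knowledgebase operation. Endpoint, kb_id and key are placeholders.
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
kb_id = "<knowledge-base-id>"                                      # placeholder
headers = {
    "Ocp-Apim-Subscription-Key": "<subscription-key>",             # placeholder
    "Content-Type": "application/json",
}

# Each scraped CRM article becomes a question/answer pair.
payload = {
    "add": {
        "qnaList": [
            {
                "id": 0,
                "answer": "Article body text pulled from the CRM.",
                "questions": ["Article title or a common phrasing of the question"],
                "source": "crm-export",
                "metadata": [],
            }
        ]
    }
}

resp = requests.patch(
    f"{endpoint}/qnamaker/v4.0/knowledgebases/{kb_id}",
    headers=headers,
    json=payload,
)
resp.raise_for_status()  # the update runs asynchronously; poll the returned operation if needed
```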

Is there any way to use the wordnik API for a desktop app without server-side access?

I am writing a desktop app using PyQt5 which uses the Wordnik API to get word definitions. I do not have server-side access, nor do I wish to invest in acquiring it. Is there any way I can reliably hide my key so I can share my program on GitHub?
At the very least, you could store your API key in a separate source file (which you would exclude from the repository via .gitignore) and catch the exception when importing that file (see this), alerting the user to provide their own API key if the import fails.
Storing the API key in a non-source configuration file is another option, but then your worry becomes storing that file in a way that is not accessible to the end user of your application.
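A minimal sketch of that first approach, assuming a gitignored file named api_key.py and a WORDNIK_API_KEY variable (both names are illustrative, not part of the original answer):

```python
# Hedged sketch: keep the key in a gitignored api_key.py and fail gracefully when
# someone clones the repository without one. File and variable names are illustrative.
import sys

try:
    from api_key import WORDNIK_API_KEY  # api_key.py is listed in .gitignore
except ImportError:
    sys.exit(
        "No API key found. Create api_key.py containing "
        "WORDNIK_API_KEY = '<your key>' (keys are available at developer.wordnik.com)."
    )

# ... pass WORDNIK_API_KEY to your Wordnik client from here on ...
```

Note that this only keeps the key out of the repository; anything bundled into a distributed desktop app can still be extracted by an end user, which is the concern the Wordnik answer below raises.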
Unfortunately, no, our Wordnik terms of service don't allow for sharing keys where they are accessible by end-users. If your app is noncommercial you can share instructions for users to help them apply for and add their own Wordnik keys to their copy of the application (and this also helps you, in that your key won't hit our API limits based on your users).
If this is a commercial application, please get in touch with us (apiteam#wordnik) with more details about your use case as we are looking into how to make this easier. As a small nonprofit with limited engineering resources we can't promise a quick solution but since our mission is to find & share every English word we're always interested in learning more about how folks are using our API. :)
Thanks for using Wordnik!

Gather details of the browser used to view a web page

I am tasked with the development of a web page.
As part of this, I also need to collect details of the browser each user uses to view the page (along with version, timestamp & IP).
I tried searching the internet, but maybe my search was not properly directed; I am not sure how to go about this.
Any pointer to get me started would help me a long way.
I would also like to know how to store the information collected.
Not sure if this is a good idea - but just thinking out loud - are there any online services/channels available where the data can be uploaded in real time, like ThingSpeak?
Thanks
Google Analytics is great, but it doesn't give you the raw data (for free). So if you want your data in e.g. SQL format, I suggest you use a tool that collects the data for you and then sends it to Google Analytics.
We use Segment (segment.io, but there are probably other tools out there too) for this; they have a free plan. You will need to create a warehouse in AWS, and all your data will be stored there, including details of the browser (version, timestamp, IP).
Disclaimer: I do not work for Segment, but I am a customer.
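If you did end up storing these fields yourself rather than going through a vendor, a minimal server-side sketch might look like the following (Flask and SQLite are illustrative assumptions, not something the answers in this thread prescribe):

```python
# Hedged sketch: record the browser user agent, timestamp and IP for each request.
# Flask and SQLite are assumptions for illustration only.
import sqlite3
from datetime import datetime, timezone

from flask import Flask, request

app = Flask(__name__)

def init_db():
    with sqlite3.connect("visits.db") as db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS visits (user_agent TEXT, ip TEXT, visited_at TEXT)"
        )

@app.before_request
def log_visit():
    with sqlite3.connect("visits.db") as db:
        db.execute(
            "INSERT INTO visits VALUES (?, ?, ?)",
            (
                request.headers.get("User-Agent", ""),
                request.remote_addr,
                datetime.now(timezone.utc).isoformat(),
            ),
        )

@app.route("/")
def index():
    return "Hello"

init_db()
```

The User-Agent header carries the browser name and version; parsing it into structured fields is what services like Segment or Google Analytics do for you, which is part of why the answers in this thread point to them.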
Enable Google Analytics on your website; then, after a week, take a look at the Google Analytics page to see the data that was collected.
Follow the guide here to configure Google Analytics on your website: https://support.google.com/analytics/answer/1008080?hl=en
You should go for alternatives like Segment (https://segment.com/), Mixpanel (https://mixpanel.com/) and similar tools. They can guarantee consistency for your data and also integrate with many different analytics tools and platforms.

How can I diagnose and remedy strange download behavior from an Azure public blob?

I host a software product on Azure, and store the downloads themselves in a public container, which the website links to via URL. You can see my downloads page here: https://flyinside-fsx.com/Download
Normally I get somewhere in the range of 200MB-500MB worth of downloads per day, with the downloaded files themselves being 15-30MB. Starting this week, I've seen spikes of up to 220GB per day from this storage container. It hasn't harmed the website in any way, but the transfer is costing me money. I'm certainly not seeing an increase in website traffic that would account for 220GB worth of downloads, so this appears to be either some sort of DoS attack or a broken automated downloader.
Is there a way to remedy this situation? Can I set the container to detect and block malicious traffic? Or should I be using a different type of file hosting entirely, which offers these sorts of protections?
To see what's going on with your storage account, the best approach is to use Storage Analytics, in particular the storage activity logs. These logs are stored in a special blob container called $logs. You can download the contents of that container using any storage explorer that supports browsing it.
I would highly recommend starting there and identifying what exactly is going on. Based on the findings, you can take corrective action. For example, if the traffic is coming from bots, you can put a simple CAPTCHA on the download page.
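If you would rather script it than click through a storage explorer, here is a minimal sketch using the azure-storage-blob Python SDK (the $logs container name is standard; the connection string and date prefix are placeholders):

```python
# Hedged sketch: pull Storage Analytics logs from the $logs container so you can
# look for the offending IPs and user agents. Connection string is a placeholder.
import os

from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)
logs = service.get_container_client("$logs")

# Logs are organized as <service>/<year>/<month>/<day>/<hour>/..., so narrow the
# prefix to the days where the spike happened.
for blob in logs.list_blobs(name_starts_with="blob/"):
    text = logs.download_blob(blob.name).readall().decode("utf-8")
    for line in text.splitlines():
        # Each line is semicolon-delimited; requester IP and user agent are among the fields.
        print(line)
```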

Automate the export of Facebook Insights data

I'm looking for a way of programmatically exporting Facebook insights data for my pages, in a way that I can automate it. Specifically, I'd like to create a scheduled task that runs daily, and that can save a CSV or Excel file of a page's insights data using a Facebook API. I would then have an ETL job that puts that data into a database.
I checked out the OData service for Excel, which appears to be broken. Does anyone know of a way to programmatically automate the export of insights data for Facebook pages?
It's possible and not too complicated once you know how to access the insights.
Here is how I proceed:
Log the user in with the offline_access and read_insights permissions.
read_insights allows me to access the insights for all the pages and applications the user is admin of.
offline_access gives me a permanent token that I can use to update the insights without having to wait for the user to login.
Retrieve the list of pages and applications the user is admin of, and store those in database.
When I want to get the insights for a page or application, I don't query FQL, I query the Graph API: first I calculate how many queries to graph.facebook.com/[object_id]/insights are necessary, according to the date range chosen. Then I generate a query to use with the Batch API (http://developers.facebook.com/docs/reference/api/batch/). That allows me to get all the data for all the available insights, for all the days in the date range, in only one query (see the sketch after these steps).
I parse the rather large JSON object obtained (be aware that it can weigh a few MB) and store everything in the database.
Now that you have all the insights parsed and stored in database, you're just a few SQL queries away from manipulating the data the way you want, like displaying charts, or exporting in CSV or Excel format.
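A minimal sketch of that batch call, using the requests library with placeholder page ID, token, and date range (the answer predates current Graph API versions, so check the endpoint against today's documentation):

```python
# Hedged sketch: fetch a page's insights for a date range in one round trip via the
# Graph API Batch endpoint. Token, page ID and dates are placeholders.
import json

import requests

ACCESS_TOKEN = "<long-lived-page-token>"  # placeholder
PAGE_ID = "<page-id>"                     # placeholder

batch = [
    {
        "method": "GET",
        "relative_url": f"{PAGE_ID}/insights?since=2023-01-01&until=2023-01-31",
    },
    # ...append more requests here, up to the batch size limit...
]

resp = requests.post(
    "https://graph.facebook.com",
    data={"access_token": ACCESS_TOKEN, "batch": json.dumps(batch)},
)
resp.raise_for_status()

for item in resp.json():
    body = json.loads(item["body"])  # each batch item wraps its own JSON response
    for metric in body.get("data", []):
        print(metric["name"], metric["values"])
```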
I have the code already made (and published as a temporarily free tool on www.social-insights.net), so exporting to excel would be quite fast and easy.
Let me know if I can help you with that.
It can be done before the weekend.
You would need to write something that uses the Insights part of the Facebook Graph API. I haven't seen something already written for this.
Check out http://megalytic.com. This is a service that exports FB Insights (along with Google Analytics, Twitter, and some others) to Excel.
A new tool is available: the Analytics Edge add-ins now have a Facebook connector that makes downloads a snap.
http://www.analyticsedge.com/facebook-connector/
There are a number of ways that you could do this. I would suggest your choice depends on two factors:
What is your level of coding skill?
How much data are you looking to move?
I can't answer 1 for you, but in your case you aren't moving that much data (in relative terms). I will still share three options of many.
HARD CODE IT
This would require a script that accesses Facebook's Graph API, and a computer/server to run that script automatically. I primarily use AWS and would suggest launching an EC2 instance and scheduling it to run your script at set times. I haven't used AWS Data Pipeline, but I do know it is designed so that it can run a script automatically as well, supposedly with a little less server know-how.
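A minimal sketch of that hard-coded route, assuming the requests library, a placeholder page ID and token, and an illustrative metric name; the daily scheduling would be handled by cron or an EC2 scheduled task:

```python
# Hedged sketch of the "hard code it" option: pull one insights metric daily and
# append it to a CSV that a downstream ETL job can pick up. Token, page ID and the
# metric name are placeholders/illustrative.
import csv
from datetime import date, timedelta

import requests

ACCESS_TOKEN = "<page-access-token>"  # placeholder
PAGE_ID = "<page-id>"                 # placeholder
METRIC = "page_impressions"           # illustrative metric name

yesterday = date.today() - timedelta(days=1)
resp = requests.get(
    f"https://graph.facebook.com/{PAGE_ID}/insights/{METRIC}",
    params={
        "access_token": ACCESS_TOKEN,
        "period": "day",
        "since": yesterday.isoformat(),
        "until": date.today().isoformat(),
    },
)
resp.raise_for_status()

with open("insights.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for metric in resp.json().get("data", []):
        for value in metric["values"]:
            writer.writerow([metric["name"], value["end_time"], value["value"]])
```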
USE THIRD PARTY ADD-ON
There are a lot of people who have similar data needs, which has led to a number of easy-to-use tools. I use Supermetrics Free to run occasional audits and make sure that our tools are running properly. Supermetrics is fast and has a really easy interface for accessing Facebook's APIs and several others. I believe that you can also schedule refreshes and updates with it.
USE THIRD PARTY FULL-SERVICE ETL
There are also several services and freelancers that can set this up for you with little to no work on your part, depending on where you want the data. Stitch is a service I have worked with for FB ads. There might be better services, but it has fulfilled our needs for now.
MY SUGGESTION
You would probably be best served by using a third-party add-on like Supermetrics. It's fast and easy to use. The other methods might be more worth looking into if you had a lot more data to move, or needed it to be refreshed more often than daily.
