Elastic Search vs Azure Search [closed]

I'm trying to decide between Elastic Search and Azure Search.
I've found that they are very similar (except that Azure Search supports multiple languages and has built-in AI for processing blobs and files, plus autocomplete).
But what made me think more was the price. For Elasticsearch I would pay $0.0615 per hour, which works out to $44.28 per month (35 GB storage | 1 GB RAM | up to 2.1 vCPU), while for similar infrastructure Azure Cognitive Search costs $73.73 per month (Basic tier).
And going forward, the price difference is very big.
Can anyone help with more details on this? Are there any hidden costs with Elasticsearch, or why is there such a huge difference that nobody seems to talk about?

The price is different because they are different products to solve different problems.
Elasticsearch is a search engine where you will have to build your own ingest pipeline and queries.
Azure Search is a search-as-a-service platform that includes AI features in the pipeline, such as NER (Named Entity Recognition), image detection, audio transcription, NLP, etc.
If you need all these features and don't want to implement them yourself, then you should stick with Azure Search.
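To make the "build it yourself" point concrete, here is a minimal sketch of the Elasticsearch side, assuming the official Python client (8.x) and a local cluster at localhost:9200; the index name and fields are made up, and anything like entity recognition or OCR would be extra pipeline work you write yourself:

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

# Assumed local cluster; Elastic Cloud would use cloud_id/api_key instead.
es = Elasticsearch("http://localhost:9200")

# Index a document into a hypothetical "articles" index.
es.index(index="articles", id="1", document={
    "title": "Pricing search services",
    "body": "Comparing Elasticsearch and Azure Cognitive Search costs.",
})

# Run a plain full-text query; any enrichment (NER, OCR, etc.) is on you.
resp = es.search(index="articles", query={"match": {"body": "pricing"}})
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["title"])
```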

Related

using serverless for large scale cloud database [closed]

What guidelines have been useful for folks looking to run a very large (100 TB - PB) cloud database with multiple readers/writers (IoT sources)?
We expect to have a Redis cache backed by either DynamoDB, Azure Cosmos DB, or something else (not yet decided).
But is it a problem to rely purely on Lambda and serverless to service the read/write requests? There are some guidelines from AWS about this:
https://aws.amazon.com/blogs/architecture/how-to-design-your-serverless-apps-for-massive-scale/
https://aws.amazon.com/blogs/compute/best-practices-for-organizing-larger-serverless-applications/
and one case study:
https://www.serverless.com/blog/how-droplr-scales-to-millions-serverless-framework
Your best bet for information like this is the Azure Architecture Center, which has articles on best practices and architectural guidance.
Regarding using DynamoDB or Cosmos DB to back Redis, I can't offer any guidance on the efficacy of doing such a thing. What I can say is that I do see customers opt out of Redis altogether and use DynamoDB or Cosmos DB as a key/value cache layer because the latency is good enough.
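If you do put Lambda in front of Redis and DynamoDB, a common pattern is a read-through cache: check Redis first, fall back to the database, then repopulate the cache. Below is a minimal sketch; the table name, partition key, Redis endpoint, event shape, and TTL are all assumptions, not a recommendation for your schema:

```python
import json
import os

import boto3
import redis  # pip install redis

# Hypothetical endpoints/names; in a real deployment these come from config.
cache = redis.Redis(host=os.environ.get("REDIS_HOST", "localhost"), port=6379)
table = boto3.resource("dynamodb").Table(os.environ.get("TABLE_NAME", "iot-readings"))
TTL_SECONDS = 300

def handler(event, context):
    device_id = event["deviceId"]  # assumed event shape

    # 1. Try the cache first.
    cached = cache.get(device_id)
    if cached is not None:
        return json.loads(cached)

    # 2. Fall back to DynamoDB (assumed partition key "deviceId").
    item = table.get_item(Key={"deviceId": device_id}).get("Item")

    # 3. Repopulate the cache for subsequent reads.
    if item is not None:
        cache.setex(device_id, TTL_SECONDS, json.dumps(item, default=str))
    return item
```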

reading sql server log files (ldf) with spark [closed]

This is probably far-fetched, but... can Spark - or any advanced "ETL" technology you know of - connect directly to SQL Server's log file (the .ldf) and extract its data?
The goal is to get SQL Server's real-time operational data without replicating the whole database first (and without selecting directly from it).
Appreciate your thoughts!
Rea
To answer your question, I have never heard of any tech that reads an LDF directly, but there are several products on the market that can "link-clone" a database almost instantly using some internal tricks. Keep in mind that the data is not copied with these tools, but they allow instant access for use cases like yours.
There may be some free ways to do this, especially using cloud functions, or perhaps the linked-clone features that virtual machines offer, but the ones I know about at this time are paid products such as Dell EMC's, Redgate's, and Windocks.
The easiest ones to try that are not in the cloud are:
Red Gate SQL Clone, with a 14-day free trial:
Red Gate SQL Clone Link
Windocks.com (this is free for some cases, but harder to get started with)
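Once a clone (or any readable copy) is attached to a SQL Server instance, Spark can read from it over JDBC rather than touching the .ldf at all. Here is a minimal PySpark sketch; the server, database, table, and credentials are placeholders, and you would need Microsoft's SQL Server JDBC driver on the Spark classpath:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-sqlserver-clone").getOrCreate()

# Placeholder connection details for a hypothetical cloned database.
jdbc_url = (
    "jdbc:sqlserver://sqlhost:1433;"
    "databaseName=SalesDb_Clone;encrypt=true;trustServerCertificate=true"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.Orders")  # or a "(SELECT ...) AS q" pushdown query
    .option("user", "spark_reader")
    .option("password", "change-me")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

df.show(10)
```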

Benefits of Azure PluralSight [closed]

I am a beginner in the IT world.
I would like to know what the advantages are of taking the Microsoft Azure Pluralsight courses. What kinds of IT jobs could I apply for with this certificate?
Doing the Azure courses on Pluralsight will give you knowledge of Microsoft's Azure cloud computing platform. Azure is typically used for hosting databases in the cloud, deploying applications to a cloud environment, and similar tasks.
While many companies look for developers with experience on cloud platforms like Azure and AWS, most often the person who performs deployments (the task of taking an application from the development stage to the point where it is publicly available and usable) specializes specifically in that skill and those tools.
The position where a person is responsible for the deployment, integration, and maintenance of the application is typically called "DevOps".

Cron bigquery jobs [closed]

Which is the best way to schedule BigQuery jobs?
BigQuery doesn't offer a direct approach, and the best I found from searching is using the App Engine cron service, but from what I understood I would have to create a web application to use it.
My use case is to do some aggregations over clicks and impressions, daily or weekly and use them in our admin portal.
I used Hive as a data warehouse before and Oozie as our scheduler.
Is there a way to accomplish the same logic with BigQuery?
Unfortunately, there is no built-in scheduler within BigQuery, although the engineering team takes requests (link).
However, there are a few interesting alternatives.
As you mentioned, using the cron service from App Engine would absolutely work, and you could write a small, simple web service that invokes the query you want on a regular cadence. This service will not be web-facing, so the charges should remain extremely small.
Apache Airflow is a tool I have been playing around with that looks very promising; it allows you to define more complex data manipulation tasks across a variety of cloud services in Python and execute them on whatever cadence you choose. Very handy.
Regular cron - if you have a server available to you, you could just set up a basic cron job that uses the 'bq' command-line tool to execute whatever queries you want and save the results to tables in BigQuery.
Hope that helps! I'm positive there are other options as well, just wanted to give you a few.
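For the plain-cron route, the job can be as small as a single script that rewrites an aggregate table. Below is a minimal sketch using the google-cloud-bigquery client (the dataset, table, and column names are made up); the same script could be triggered by crontab, an App Engine handler, or an Airflow task:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()  # uses application default credentials

# Hypothetical daily rollup of clicks and impressions.
sql = """
CREATE OR REPLACE TABLE analytics.daily_ad_stats AS
SELECT
  DATE(event_time) AS day,
  COUNTIF(event_type = 'click') AS clicks,
  COUNTIF(event_type = 'impression') AS impressions
FROM analytics.raw_events
GROUP BY day
"""

client.query(sql).result()  # .result() blocks until the job finishes
print("daily_ad_stats refreshed")

# Example crontab entry (path hypothetical), running the script nightly at 02:00:
#   0 2 * * * python3 /opt/jobs/refresh_stats.py
```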

Photo Sharing Vs Storage [closed]

There are a lot of photo-sharing applications out there; some make money and some don't. Photo sharing takes a lot of space, so I wonder where they host it all. Rich services are probably using Amazon or their own servers, but what about the rest? Do they have access to some kind of free service, or have they purchased terabytes from their web host?
AWS S3 is what you are generally referring to. The cost is mainly due to the reliability it gives to the data it stores. For photo sharing, that much reliability is generally not required (compared with, say, a financial statement).
AWS also has other options like S3 RRS (Reduced Redundancy Storage) and Glacier, which are a lot cheaper. For example, photos not accessed for a long time may be kept in Glacier (retrieval takes time, but it's cheap), and RRS can be used for any transformed images that can be reconstructed even if lost, like thumbnails. So good photo-sharing services make a lot of these kinds of storage decisions to manage cost.
You can read more on these types here : http://aws.amazon.com/s3/faqs/
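For example, the "keep cold photos in Glacier" decision can be automated with an S3 lifecycle rule, and reconstructable files can be uploaded with a cheaper storage class directly. A minimal boto3 sketch, where the bucket name, prefixes, and the 90-day threshold are assumptions:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix; transitions originals to Glacier after 90 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-photo-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-cold-photos",
                "Status": "Enabled",
                "Filter": {"Prefix": "originals/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }
        ]
    },
)

# Thumbnails can be regenerated, so they can go straight to the cheaper RRS
# class mentioned above (newer classes like STANDARD_IA are also options).
s3.put_object(
    Bucket="example-photo-bucket",
    Key="thumbnails/photo-123.jpg",
    Body=b"...jpeg bytes...",
    StorageClass="REDUCED_REDUNDANCY",
)
```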
There is also a case study of SmugMug on AWS. I also heard them speak once about initially storing photos on their own hard disks; later, S3 costs came down and they moved to AWS. Read the details here:
AWS Case Study: SmugMug's Cloud Migration: http://aws.amazon.com/solutions/case-studies/smugmug/
