Reading SQL Server log files (LDF) with Spark [closed] - apache-spark

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
This is probably far-fetched, but can Spark, or any advanced "ETL" technology you know of, connect directly to SQL Server's log file (the .ldf) and extract its data?
The goal is to get SQL Server's real-time operational data without replicating the whole database first or selecting directly from it.
Appreciate your thoughts!
Rea

To answer your question: I have never heard of any technology that reads an LDF directly, but several products on the market can "link-clone" a database almost instantly using some internal tricks. Keep in mind that these tools do not copy the data; they give you instant access to it, which suits use cases like yours.
There may be free ways to do this, especially with cloud functions or the linked-clone features that virtual machines offer, but at this time I only know of paid products, such as those from Dell EMC, Redgate, and Windocks.
The easiest non-cloud options to try are:
Redgate SQL Clone, which has a 14-day free trial
Windocks.com (free in some cases, but harder to get started with)

Related

Does Bot Framework store any data in Azure if I replace the default bot storage with my own custom storage? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I have a client who is very protective of her data, and she asked me to replace my bot's default storage with a custom storage that saves all the data in an on-premises database.
If I replace the storage, will the Bot Framework permanently save any conversation data anywhere else (say, somewhere in Azure)? My client would like to avoid that for security reasons.
Thanks!
Saving and loading of all session data is handled by the ChatConnector's getData() and saveData() unless you provide your own via settings.storage. In real-life (non-emulator) scenarios the data goes to https://state.botframework.com/v3/botstate/...
The bot framework doesn't store anything else, I believe. I explored this exact question very recently. Take a look:
http://www.pveller.com/smarter-conversations-part-3-breadcrumbs/
http://www.pveller.com/smarter-conversations-part-4-transcript/
I had to read the source (many times, actually) to trace the inner workings of the Bot Framework, and I didn't see anything that would make me think there's another persistence layer somewhere.
You are probably better off asking on the official support channel to confirm this and reassure your client, but I think you're good.
As to how reasonable it is... companies do far crazier things for all kinds of reasons :) By the way, will you also use Microsoft's LUIS for NLU? Does your client have similar concerns about all incoming messages going through that service? It's a deep rabbit hole. I think of engagement bots (as opposed to back-office automation bots) as very much cloud-native. It is not easy to shield yourself from the cloud and still benefit from all the new tech built for it.

Distributed file system for Linux [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I've got a web app where I use the plain file system for my custom logs: a lot of small files that I don't want to put into a DB, and that works quite well for me. But now I need to scale the app by putting a load balancer in front, so I also need to keep those logs in sync between servers. Is there a reliable solution for such cases? I know I could sync them by some OS means or by scripting, but I wonder whether there is a better solution for this scenario. Is this a case for MongoDB or something more modern, or is it better to keep the logs on the file system as plain files?
This question is going to get you some heat, since essentially you're asking for our opinion. I'll be frank though, and I won't argue with anyone, since it's just MY opinion. With web apps, in my humble opinion, it's always better to keep your data in a DB, both for scalability and for analytical research. I know little about what your app does, but it's easier to write third-party data apps that tell you how many of X or Y etc. when the data is centrally stored in a DB, since the app that reads the data can be anywhere. I know I probably wasted time with an argument but hey, hope I helped a bit.

Photo Sharing Vs Storage [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
There are a lot of photo sharing applications out there; some are making money and some aren't. Photo sharing takes a lot of space, so I wonder where they host these photos. Rich services are probably using Amazon or their own servers, but the rest? Do they have access to some kind of free service? Or have they purchased terabytes from their web host?
AWS S3 is what you are generally referring to. The cost is mainly due to the reliability it provides for stored data. For photo sharing, this much reliability is generally not required (compared with, say, a financial statement).
AWS also has other options, like S3 RRS (Reduced Redundancy Storage) and Glacier, which are a lot cheaper. Photos not accessed for a long time may be kept on Glacier (retrieval takes time, but it is cheap). RRS can be used for transformed images, like thumbnails, which can be reconstructed even if lost. So good photo-sharing services make a lot of such storage decisions to manage cost.
You can read more about these types here: http://aws.amazon.com/s3/faqs/
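As a rough illustration of the kind of storage decision described above, here is a minimal Python sketch. The function name, the is_derived flag, and the 180-day threshold are made up for illustration; only the class names (STANDARD, REDUCED_REDUNDANCY, GLACIER) are real S3 storage class values. It does not call AWS at all, it only picks a class name.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: originals untouched this long are treated as "cold".
COLD_AFTER = timedelta(days=180)


def choose_storage_class(last_accessed, is_derived, now=None):
    """Pick an S3 storage class name for a photo object.

    is_derived is True for thumbnails and other images that can be
    regenerated from the original if they are lost.
    """
    if now is None:
        now = datetime.now()
    if is_derived:
        # Losing a thumbnail is cheap to fix, so lower durability is acceptable.
        return "REDUCED_REDUNDANCY"
    if now - last_accessed > COLD_AFTER:
        # Rarely viewed originals: cheap to keep, slow to retrieve.
        return "GLACIER"
    return "STANDARD"
```

A real service would weigh more than two inputs (retrieval fees, object size, access patterns), so treat this as the shape of the decision, not a pricing model.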
There is also a case study of SmugMug on AWS. I also heard him speak once, where he described initially using his own hard disks for storage; later, S3 costs came down and he moved to AWS. Read the details here:
AWS Case Study: SmugMug's Cloud Migration: http://aws.amazon.com/solutions/case-studies/smugmug/

Put all images in a database or just in a folder [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am developing a website which uses a lot of images.
The images get manipulated very often (every few seconds, by the clients). All images are on a Linux server. It is also possible that two clients try to change an image at the same time.
So my question is: should I put the images into a database or just leave them in a folder? (How does the OS handle write-write collisions?)
I use Node.js and MongoDB on the server.
You usually store a reference to the file location in the database. As for write-write collisions, in most cases whoever has the file open first gets it, but it mostly depends on the OS you are working with. You will want to look into file locking. This Wikipedia article gives a good overview:
http://en.wikipedia.org/wiki/File_locking
It is also considered good practice for your code to check whether the file is in use and notify the user when write collisions are likely to occur.
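For illustration only, here is a minimal sketch of advisory file locking on Linux. It is written in Python rather than Node.js, and the function name write_locked is hypothetical. It uses flock(), which is advisory: it only protects against other processes that also take the lock before writing. With blocking=False the call returns False if someone else holds the lock, which is the point where you could notify the user.

```python
import fcntl
import os


def write_locked(path, data, blocking=True):
    """Overwrite path with data while holding an exclusive advisory lock.

    Returns True on success, or False if blocking is False and another
    open file already holds the lock.
    """
    flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
    # Open without truncating, so the old content survives until we own the lock.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        try:
            fcntl.flock(f, flags)
        except BlockingIOError:
            return False  # file is in use by another lock holder
        try:
            f.seek(0)
            f.truncate()
            f.write(data)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return True
```

Note that flock() locks are tied to the open file, so every writer has to go through a function like this for the lock to mean anything.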
I suggest you store your images in MongoDB using GridFS. This allows you to keep images and their metadata together, it has an atomic update, and it has two more advantages if you are using replica sets for your database:
Your images have the same high availability as the rest of your data and get backed up together with it
You can, if needed, direct queries to secondary members of the set
For more information, see
http://docs.mongodb.org/manual/applications/gridfs
http://docs.mongodb.org/manual/replication/
Does this help?
Cheers
Ronald

Is the G-WAN web server already dead? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
We have been using this server for almost a year now.
The last forum post we saw was in November 2011.
The last server version was released on 28/03/12.
Just wondering if anyone knows what's happening inside the company?
Should we expect something or should we start looking for alternatives?
I did what you did not do: I emailed the question to the people able to answer it.
And they replied that:
the forum was closed because they could not cope with the number of accounts created daily to publish junk
the next version will be the most important ever made for G-WAN, with new features like a caching reverse proxy and an elastic load balancer, as well as system replacements like a wait-free memory allocator.
Given such developments, a 3-month period without publishing releases sounds reasonable.
It is more reasonable than assuming that such an 'inactivity period' means that "the project is dead".
Would you say that about other web servers like Apache, which have much longer release cycles?
You should always be expecting something from G-WAN. It's a great piece of software. Here's the other thing too: G-WAN was expertly engineered. That doesn't mean that there are no bugs in it, or that features can't be implemented, but G-WAN is incredibly tight.
It has lean code, it does what it is supposed to do, very well, and it is built for the developer to add in the functionality that hasn't been put in there yet.
That's the beauty of it, or one facet of the beauty.
