Script to retrieve the (large) contents of a Rackspace cloud files container? - rackspace-cloud

I've decided that Cloud Files is getting too expensive for me now that I'm approaching 1TB of files, and it'll be silly when I get to 2-3TB within a year, so I'm going down the dedicated box route instead.
Can someone point me to a simple/bulletproof way to download 600,000 items from a container? I've searched around and found conflicting advice on the best way to do this, but figured I trust this community more than most random pages that Google throws up!
Thanks

I've had good success with turbolift for rapidly uploading or downloading large batches of files.
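If you'd rather script it yourself, Cloud Files speaks the OpenStack Swift REST API, so you can page through the container listing and fetch each object over HTTP. Below is a minimal TypeScript/Node sketch of that loop; it assumes you've already obtained an auth token and the container's storage URL from the Rackspace auth endpoint (STORAGE_URL, AUTH_TOKEN and CONTAINER are placeholders), and a real run would want concurrency and retries, which is exactly what turbolift adds for you.

```typescript
// Minimal sketch: page through a Swift/Cloud Files container and download every object.
// Assumes Node 18+ (global fetch). STORAGE_URL, AUTH_TOKEN and CONTAINER are placeholders
// you'd obtain from the Rackspace auth endpoint -- they are not real values.
import { mkdir, writeFile } from "node:fs/promises";
import { dirname, join } from "node:path";

const STORAGE_URL = process.env.STORAGE_URL!; // the container's storage endpoint
const AUTH_TOKEN = process.env.AUTH_TOKEN!;   // X-Auth-Token from the auth service
const CONTAINER = process.env.CONTAINER!;     // container to mirror
const DEST = "./download";

// Swift returns at most 10,000 names per listing request, so we page with ?marker=
async function listObjects(marker: string): Promise<{ name: string }[]> {
  const url = `${STORAGE_URL}/${CONTAINER}?format=json&limit=10000&marker=${encodeURIComponent(marker)}`;
  const res = await fetch(url, { headers: { "X-Auth-Token": AUTH_TOKEN } });
  if (!res.ok) throw new Error(`listing failed: ${res.status}`);
  return (await res.json()) as { name: string }[];
}

async function downloadObject(name: string): Promise<void> {
  const encoded = name.split("/").map(encodeURIComponent).join("/");
  const res = await fetch(`${STORAGE_URL}/${CONTAINER}/${encoded}`, {
    headers: { "X-Auth-Token": AUTH_TOKEN },
  });
  if (!res.ok) throw new Error(`${name}: HTTP ${res.status}`);
  const target = join(DEST, name);
  await mkdir(dirname(target), { recursive: true });
  // Buffers each object in memory; large objects would be better streamed to disk.
  await writeFile(target, Buffer.from(await res.arrayBuffer()));
}

async function main(): Promise<void> {
  let marker = "";
  for (;;) {
    const page = await listObjects(marker);
    if (page.length === 0) break;
    // Sequential to keep the sketch short; a real run wants a concurrency pool and retries.
    for (const obj of page) await downloadObject(obj.name);
    marker = page[page.length - 1].name;
  }
}

main().catch((err) => { console.error(err); process.exit(1); });
```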

Related

Azure Web Apps: Fastest Way(s) to Transfer Files Between Apps

I've found many questions and answers that are similar to mine, but they all seem to be very specific use cases - or, at least, different/old enough to not really apply to me (I think).
What I want to do is something that I thought would be simple today. The most inefficient thing with the web apps is that copying files between them can be slow and/or time-consuming. You have to FTP (or similar) down, then send it back up.
There must be a way to do this same thing, but natively within Azure so the files don't necessarily have to go far and certainly not with the same bandwidth restrictions.
Are there any solid code samples or open-source/commercial tools out there that help make this possible? So far, I haven't come across any code samples, products, or anything else that makes it possible (aside from many very old PowerShell blogs from 5+ years ago). (I'm not opposed to a PowerShell-based solution, either.)
In my case, these are all the same web apps that have minor configuration-based customization differences between them. So, I don't think webdeploy is an option because it's not about deployment of code. Sometimes it's simply creating a clone for a new launch, and other times it's creating a copy for staging/development.
Any help in the right direction is appreciated.
As you've noticed, copying files over to App Service is not the way to go. For managing files across different App Service instances, try using a Storage Account. You can attach an Azure Storage file share to the App Service and mount it. There's a comprehensive answer below on how to get files into the storage account.
Alternatively, if you have control over the app code, you can use blob storage instead of files and read the content directly from the app.
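If you do have control over the app code, the blob-storage approach is straightforward with the @azure/storage-blob SDK. Here is a minimal sketch of reading a shared file directly from Blob Storage so it never has to round-trip over FTP; the connection-string app setting, container name and blob name are placeholders, not values from your apps.

```typescript
// Minimal sketch of the "read from Blob Storage directly" approach with @azure/storage-blob.
// The connection-string app setting, container name and blob name are placeholders.
import { BlobServiceClient } from "@azure/storage-blob";

async function readSharedFile(blobName: string): Promise<Buffer> {
  const service = BlobServiceClient.fromConnectionString(
    process.env.AZURE_STORAGE_CONNECTION_STRING! // defined as an App Service app setting
  );
  const container = service.getContainerClient("shared-config"); // hypothetical container
  // downloadToBuffer pulls the whole blob into memory, which is fine for small shared files
  return container.getBlobClient(blobName).downloadToBuffer();
}

// Every app instance can read the same shared file without any FTP round-trip.
readSharedFile("site-settings.json").then((buf) => console.log(buf.toString("utf8")));
```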

Huge file downloads: what should I use to build this?

I want to build an API that lets users download files, but there is one problem: the files can be huge, sometimes more than 100GB. I'm thinking about building the API with Node.js, but I don't know if it's a good idea to implement the download feature in Node. Some users may spend more than a day on a single download, and since Node is single-threaded I'm afraid that could tie it up for too long and make the other requests slower, or worse, block them.
I'm going to use cloud computing to host this API, and I'm going to start studying serverless hosting to see if it's worth it in my case. Do you have any idea what I should use to build the download feature? Is there any open-source code to use as an example?
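For what it's worth, Node handles very large downloads fine as long as you stream the file instead of reading it into memory: a stream piped to the response moves small chunks with backpressure and doesn't block the event loop for other requests. Below is a minimal sketch using only built-in modules (the file path is a placeholder). Note that most serverless platforms cap request duration at a few minutes, so for day-long downloads you'd more likely keep the files in object storage and hand out pre-signed URLs so your API never proxies the bytes itself.

```typescript
// Minimal sketch: streaming a very large file with Node's built-in http module.
// pipe() moves data in small chunks and applies backpressure, so a 100GB download
// uses little memory and does not block other requests on the event loop.
import { createReadStream, statSync } from "node:fs";
import { createServer } from "node:http";

const FILE_PATH = "/data/huge-file.bin"; // placeholder path

createServer((req, res) => {
  if (req.url !== "/download") {
    res.statusCode = 404;
    res.end();
    return;
  }
  const { size } = statSync(FILE_PATH);
  res.writeHead(200, {
    "Content-Type": "application/octet-stream",
    "Content-Length": size,
  });
  const stream = createReadStream(FILE_PATH);
  stream.pipe(res); // chunks flow only as fast as the client reads them
  stream.on("error", () => res.destroy());
}).listen(3000);
```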

Azure - OneDrive transfer

I am very new to Microsoft Azure. I would like to transfer 5GB of files (datasets) from my Microsoft OneDrive account to Azure storage (Blob storage, I guess), and then share those files with about 10 other Azure accounts (I have some idea of how to share these files). I am not really sure how to go about it, and I would prefer not to download the 5GB of files from OneDrive and then upload them to Azure. Help would be greatly appreciated, thanks a lot.
David's comment is correct, but I still want to provide a couple of links to get you started. Like he mentioned, if you can break this into several questions that are more specific you can probably get a much better Stack Overflow response. I think the first part of the question could be phrased as 'How can I quickly transfer 5GB of files to Azure storage?'. This is still opinion-based to some degree but has a couple of more finite answers:
AzCopy and DmLib (the Azure Storage Data Movement Library) are, respectively, a command-line tool and an Azure library that specialize in bulk transfer. There are a couple of options, including async copy and sync copy. These tools are geared primarily toward upload/download to and from the file system, but they will get you started.
There is a variety of language-specific storage client libraries you can use to write custom code to connect up with OneDrive; for example, there's a getting-started guide for .NET.
I think this is a very genuine question, as downloading huge files and uploading them back is a very expensive and time-consuming task. You can refer to a template here which would allow you to do a server-side copy.
Hopefully this helps you, or someone else in the same situation.
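To make the server-side copy concrete, here is a minimal sketch using the @azure/storage-blob SDK. It assumes you can produce a direct, publicly reachable download URL for each OneDrive file (obtaining that URL from OneDrive is not shown), and the connection string, container and blob names are placeholders. Azure pulls the data from the source URL itself, so nothing passes through your machine.

```typescript
// Minimal sketch of a server-side copy into Blob Storage with @azure/storage-blob.
// Azure pulls the bytes from the source URL itself, so nothing is downloaded to or
// uploaded from your machine. The URLs, names and connection string are placeholders.
import { BlobServiceClient } from "@azure/storage-blob";

async function copyFromUrl(sourceUrl: string, blobName: string): Promise<void> {
  const service = BlobServiceClient.fromConnectionString(
    process.env.AZURE_STORAGE_CONNECTION_STRING!
  );
  const container = service.getContainerClient("datasets"); // hypothetical container
  await container.createIfNotExists();
  // beginCopyFromURL starts an asynchronous copy on the service side and returns
  // a poller we can wait on until the copy finishes.
  const poller = await container.getBlobClient(blobName).beginCopyFromURL(sourceUrl);
  await poller.pollUntilDone();
}

// The source must be a direct, publicly reachable download URL for the OneDrive file.
copyFromUrl("https://example.com/direct-download-link", "dataset-part1.zip")
  .then(() => console.log("copy complete"));
```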

If I use Google Analytics can I delete my IIS logs?

My IIS logs are taking up a lot of space and I don't know what to do with them. If Google Analytics is meeting all of my needs, can I just delete all my IIS logs and turn off their daily creation?
I've spent the past hour or so understanding what IIS logs are and what they do. I've seen that a lot of people delete them after "x" number of days, or archive and then delete them. Do I need to archive and delete them? I didn't even know they existed until today, so needless to say I'm not really using them. I'm hoping I can just delete them this once and turn off future creation of them.
Can you let me know if this is a good idea?
You might want to archive them just in case. For example, if something odd happens it is always useful to have the logs.
If you are just monitoring stats, Google Analytics does not require them.
First, remember that web server logs can tell you lots of things GA can't. For example, they can easily tell you who downloaded which non-HTML files, or whether you have a nasty string of HTTP 500 errors that prevent pages from being rendered.
All that said, there is no technical reason why one would need to keep them. We actually disable IIS logging on some internal servers entirely so as not to have to clean up after it.

Is there such a thing as a reverse CDN? (content 'retrieval' network)

Our clients upload a serious amount of data from all over the world and we'd like to do our best to make that as painless as possible. Our clients upload 2GB worth of files over their sometimes very 'retail' broadband packages (with capped upload speeds), which draws out upload times to 24-48 hours. At any given time we have 10 or more concurrent uploads, and at peak periods we can have 100 concurrent uploads. So we decided to consider ways to reduce latency and keep our clients' traffic local... just as a CDN has download servers in various locations, we'd like upload servers.
Any experience or thoughts?
We're not a huge company but this is a problem worth solving so we'll consider all options.
What about putting some servers physically closer to your clients?
Same ISP, or at the very least in the same countries. Then you just collect it on a schedule. I don't imagine they're getting top speeds when there are 100 of them uploading to you either, so the sooner you can get them completed the better.
Also, do they need to upload this stuff immediately? Can some of them post DVDs for whatever isn't time-sensitive? I know it sucks dealing with media in the post... so it's hardly ideal.
A reverse CDN sort of situation would only really happen if you had multiple clients using torrents and seeding their uploads (somehow) to one of your servers.
You haven't really said if this is a problem for you, or your clients. So, some more info is going to get you a better answer here.
2GB per what time period? Hour? Day?
If your operation is huge, I wouldn't be too surprised if Akamai or one of the other usual CDN suspects can provide this service to you for the right price. You might get your bizdev folks (or purchasing) in touch with them.