How to download a public folder from SharePoint using wget

I want to download these public folders from SharePoint using a command in the terminal (an Ubuntu server connected via ssh).
When I select all the folders and hit the Download button, Firefox starts downloading them. I tried copying that download link, which is the following, and giving it to wget:
wget https://japaneast1-mediap.svc.ms/transform/zip?cs=fFNQTw
However, it only downloads a few bytes and stops:
--2021-05-06 21:41:27-- https://japaneast1-mediap.svc.ms/transform/zip?cs=fFNQTw
Resolving japaneast1-mediap.svc.ms (japaneast1-mediap.svc.ms)... 13.107.136.13
Connecting to japaneast1-mediap.svc.ms (japaneast1-mediap.svc.ms)|13.107.136.13|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 0
Saving to: ‘zip?cs=fFNQTw.5’
zip?cs=fFNQTw.5 [ <=> ] 0 --.-KB/s in 0s
2021-05-06 21:41:29 (0.00 B/s) - ‘zip?cs=fFNQTw.5’ saved [0/0]

I was able to download the file in Firefox since it uses some session and cookie variables. For example, I couldn't download it in Chrome when I wasn't logged in to my Microsoft account, and when I logged in again I still couldn't access the page.
Anyway, the following Firefox add-on was the solution. It copies every required variable and builds a command that can be used with curl.
Download login-protected files from the command line using curl, wget or aria2.
https://addons.mozilla.org/en-US/firefox/addon/cliget/
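For reference, the command cliget produces is roughly of this shape (a sketch only; the actual headers, cookies, and any form data are whatever cliget captures from your Firefox session):
curl --header 'Cookie: <cookies captured by cliget>' --header 'User-Agent: <copied from Firefox>' --output folders.zip '<download URL captured by cliget>'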

Related

Problems downloading packages using "ptxdist get"

I'm trying to compile Linux for an embedded system, using the ptxdist command to build it.
My colleague told me to use ptxdist get so we can download all dependencies before we start compiling.
We also sit behind a proxy, and my /etc/environment is configured with the http/https/ftp proxy (same IP and port for all), as is apt (meaning that wget works fine without extra parameters).
PROBLEM:
Running ptxdist get will download some of the packages; others it simply will not download, and the following error appears (example for 2 packages).
Here is an example where it worked fine:
---------------------------
target: cmake-3.13.4.tar.gz
---------------------------
--2019-07-23 20:54:28-- https://cmake.org/files/v3.13/cmake-3.13.4.tar.gz
Connecting to <PROXY IP HERE>:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 8617881 (8.2M) [application/x-gzip]
Saving to: '/home/xxxxxxx/xxxxxxx/xxxxxxx/src/cmake-3.13.4.tar.gz.RSjrC7k7mC'
/home/xxxxxxx/x 100%[===================>] 8.22M 2.00MB/s in 4.7s
2019-07-23 20:54:34 (1.75 MB/s) - '/home/xxxxxxx/xxxxxxx/xxxxxxx/src/cmake-3.13.4.tar.gz.RSjrC7k7mC' saved [8617881/8617881]
And here is an example where it failed:
-----------------------
target: lzo-2.08.tar.gz
-----------------------
--2019-07-23 20:55:13-- http://www.oberhumer.com/opensource/lzo/download/lzo-2.08.tar.gz
Connecting to <PROXY IP HERE>:8080... connected.
Proxy request sent, awaiting response... 502 Proxy Error ( The specified network name is no longer available. )
2019-07-23 20:55:13 ERROR 502: Proxy Error ( The specified network name is no longer available. ).
--2019-07-23 20:55:13-- http://www.pengutronix.de/software/ptxdist/temporary-src/lzo-2.08.tar.gz
Connecting to <PROXY IP HERE>:8080... connected.
Proxy request sent, awaiting response... 502 Proxy Error ( The specified network name is no longer available. )
2019-07-23 20:55:13 ERROR 502: Proxy Error ( The specified network name is no longer available. ).
make: *** [/home/xxxxxxxx/xxxxxxx/xxxxxxx/src/lzo-2.08.tar.gz] Error 1
Could not download package
URL: http://www.oberhumer.com/opensource/lzo/download/lzo-2.08.tar.gz
/usr/local/lib/ptxdist-2019.03.1/rules/post/ptxd_make_world_get.make:17: recipe for target '/home/xxxxxxx/xxxxxxx/xxxxxxx/src/lzo-2.08.tar.gz' failed
Even if I repeat ptxdist get, it stops at the same exact package that fails.
BUT:
If I just use
wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.08.tar.gz
it downloads the file just fine, and the same happens with ANY other package that fails, using the URL shown above.
When I then re-run ptxdist get, it moves on past that download since it sees the file is already there, and then hangs on the next download that fails.
ALSO TRIED:
I also tried to run ptxdist setup and configure the proxy there. Nothing changes in the terminal, so no luck...
ALSO TRIED:
I also wrote a script that takes the package names from ptxdist print PACKAGES, gets each URL using ptxdist package-info <PKG>, and then downloads the file via wget <URL> (a rough sketch is below). This solved most of the errors while running ptxdist get, since wget works fine, but it seems there are many other packages that are needed and not listed by ptxdist print PACKAGES...
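A minimal sketch of that script (the output format of ptxdist package-info and the src/ destination directory are assumptions here):
#!/bin/sh
# Pre-fetch sources for every package ptxdist lists, so ptxdist get finds them already downloaded.
for pkg in $(ptxdist print PACKAGES); do
    # Grab the first http/https/ftp URL from whatever package-info prints (assumed format).
    url=$(ptxdist package-info "$pkg" | grep -oE '(https?|ftp)://[^ ]+' | head -n 1)
    [ -n "$url" ] && wget -nc -P src/ "$url"
done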
Any idea what is wrong and what can I do?
Since I'm supposed to automate this task, manual interaction (running wget on the failing packages by hand) is not an option...

lftp pget 401 Unauthorized

Yo
I'm trying to download a file from DigitalBlasphemy.com using lftp and pget in Cygwin on Windows.
Now, the usual route involves logging in to the website via a web browser (it asks for a username and password).
When I try to use lftp's pget command to download the file, lftp just farts out with "401 Unauthorized". How can I provide the relevant credentials to my command?
You have to edit the URL of the file you are downloading.
For example instead of
pget https://example.com/directory-structure/filename.ext
you have to do
pget https://username:password@example.com/directory-structure/filename.ext
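If the username or password itself contains characters such as @ or :, percent-encode them (@ becomes %40), otherwise the URL will not parse correctly:
pget https://username:p%40ssword@example.com/directory-structure/filename.ext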

Download file/folder from sharepoint using Curl/Wget automatically

I have been trying to use curl and wget to download a file from SharePoint. I am planning to make it a script which runs automatically every day and downloads the file from the URL.
I tried curl with the following command:
curl -O --user Myusername:Mypassword https://OurDomain.sharepoint.com/_XXX&file=IPS_cleaned.xlsx&action=default
But it gave me an error about the SSL connection. I learned that there is an existing bug in curl 7.35, so I downgraded it to 7.22, but it still gives me the same error.
I also tried using Wget
wget --user=Myusername --password=MyPassword --no-check-certificate https://OurDomain.sharepoint.com/_XXX&file=IPS_cleaned.xlsx&action=default
But it still gives me the error: Unable to establish SSL connection.
Can someone please let me know how I can accomplish my task?
UPDATE
I was able to resolve the error in curl. Below is the command that I used:
curl -O -L --sslv3 -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13" --user Myusername:Mypassword 'https://OurDomain.sharepoint.com/_%7BB21r-9CA2-345DEF%7D&file=IPS_cleaned.xlsx&action=default'
Now what it downloads is a file which, when I open it, shows me the SharePoint login page. It does not download the actual Excel file.
Any reason?
Another potential solution to this involves taking your sharepoint link and replacing the text after the '?' with download=1:
This:
https://my.sharepoint.com/:u:/g/XXX/XXXX-bunchofRandomText?e=kRlVi
Becomes this:
https://my.sharepoint.com/:u:/g/XXX/XXXX-bunchofRandomText?download=1
Now, you can just:
wget https://my.sharepoint.com/:u:/g/XXX/XXXX-bunchofRandomText?download=1
*Note, this example used a single file and a link where anyone with the link could access the file (no credentials required)
Please use rclone
Download and install the latest one from https://rclone.org/downloads
First option: Use OneDrive to access SharePoint sites/personal folder. This option will help you to upload large files.
1. Create the rclone configuration using the rclone config command
2. Select New remote and give it a name
3. Select OneDrive as the cloud storage
4. Leave the client ID and secret blank
5. Edit advanced config: n
6. Remote config: Use auto-config: y
7. Open the URL in a browser and give rclone access
8. Select the personal or shared site URL option
8a. With the shared site URL option you have to give the site URL, i.e. https://sharepoint.com/sites/SiteName
9. Select the personal/Documents drive. The Documents drive is shown if you selected the shared site URL option in step 8
Save the config and quit
The configuration file contents will then look like the following. If you selected the Personal option, the drive type will be personal.
[onedrive]
type = onedrive
token =
drive_id =
drive_type = documentLibrary
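With that remote configured, downloading works just like the WebDAV example further down; for instance (folder and file names are placeholders):
rclone copy --verbose onedrive:SourceFolder/file.txt DestFolder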
Second option: with this option, you can upload files of up to 2 GB in size.
1. Create the rclone configuration using the rclone config command
2. Select New remote and give it a name
3. Select WebDAV as the cloud storage
4. Give the site URL, username and password
5. Save and quit
The configuration file contents will then look like the following. The password will be stored in an encrypted format.
vim /root/.config/rclone/rclone.conf
[sharepoint]
type = webdav
url = https://sharepoint.com/sites/SiteName/Documents
vendor = sharepoint
user =
pass =
Download a file from SharePoint.
rclone copy --ignore-times --ignore-size --verbose sharepoint:SourceFolder/file.txt DestFolder
A Firefox plugin that captures the link together with the session ID etc. and provides a command you can paste into the console for curl or wget.
If anyone has a better suggestion, please let me know.
It gives you a curl or wget command with headers, cookies and all, with a copy-to-clipboard button, right on the download dialogue.
Download URL: https://addons.mozilla.org/en-US/firefox/addon/cliget
Reference: https://superuser.com/questions/27243/how-to-find-out-the-real-download-url-on-download-sites-that-use-redirects/1239026#1239026
Struggled with the same issue myself, and had my not-so-automatic-but-man-so-convenient way, with a daily log-in:
logged in to SharePoint with a browser,
exported the cookies to cookies.txt,
ran the following command.
wget --cookies=on --load-cookies cookies.txt --keep-session-cookies --no-check-certificate -m https://yoursharepoint.com
And files were downloaded just fine.
For anyone using curl to download a file on SharePoint with an "Anyone with the link" download option, below are the steps I had to follow. Essentially you have to use the cookie from the share link, and then download the file from a different download link they don't expose easily for you.
When sending the curl request for the “share link”, it returns a 302 message, a forward link, and a cookie. If we save that cookie and use it to hit a “download” link, we can download the file. Essentially, Microsoft uses the initial “share link” to send the cookie to the browser and then redirects to their “View File” page.
On that page you need to use the cookie provided (authentication) and select your next action (on-screen view, print, download, etc.). When you click the download button you hit a different link. I found this link by going to the “view page” for the file, turning on developer tools, and watching the link the browser follows when hitting download. You can then replicate that link for each file. Using that download link along with the cookie, we can download the file.
curl -i -c cookies.txt SHARE LINK
curl -o docsdownloaded.pdf -b cookies.txt DOWNLOAD LINK
Share Link Ex: https://tenant.sharepoint.com/:b:/s/Folder/EdNUf4xAVzFJgBoO0MqkfppR5tgobxLrmCnRqU4LFJQ?e=rOGNSD
Download Link Ex: https://tenant.sharepoint.com/sites/Folder/_layouts/15/download.aspx?SourceUrl=%2Fsites%2FFolder%2FShared%20Documents%2FGeneral%2FBig%2Dfile%2Epdf
Similar to the answer Zyglute gave, using cURL:
You can export your login cookie using the cookies.txt Chrome extension: https://chrome.google.com/webstore/detail/njabckikapfpffapmjgojcnbfjonfjfg
Then use the following code:
curl -b cookie.txt 'https://OurDomain.sharepoint.com/_XXX&file=IPS_cleaned.xlsx&action=default'
At some point your Sharepoint session will expire (not sure how long that takes), and you will need a new cookie file.
EDIT: If a malicious user gets a hold of your cookie.txt, they could get into your SharePoint account, so be sure to keep it safe.
Use wget adding &download=1 at the end of the link.
wget "<yourlink>&download=1"
It will be downloaded with the <yourlink> string as its name; just mv it to the correct name afterwards.
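Alternatively, wget's -O option lets you pick the output file name directly, so no mv is needed (the file name here is just an example):
wget -O myfile.xlsx "<yourlink>&download=1"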

ftp: Name or Service not known

In the command line:
> ftp ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/
This works on one computer but does not work on my other one. The error returned is:
ftp: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/: Name or service not known
I also tried the raw IP address which is
> ftp ftp://130.14.250.10/1000genomes/ftp/data/
But it didn't work.
What is the problem here? How can I fix this?
The ftp command accepts the server name, not a URL. Your session likely should look like:
ftp ftp-trace.ncbi.nih.gov
(Server asks for login and password)
cd /1000genomes/ftp/data/
mget *
This depends on the ftp client you are using. On Mac OS X, for example, the default command-line ftp client (from BSD) accepts the full URL, while in CentOS the default client doesn't, and you need to connect just to the hostname. So it depends on the flavor of Linux and the installed default ftp client.
Default ftp client in CentOS (ARPANET):
ftp ftp-trace.ncbi.nih.gov
cd 1000genomes/ftp/data
If you want to use the full URL in CentOS 5.9 or Fedora 18 (where I tested it), you could install an additional ftp client. For example, ncftp and lftp have the behavior you are looking for.
ncftp, available through yum or your favorite package manager:
ncftp ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/
NcFTP 3.2.2 (Aug 18, 2008) by Mike Gleason (http://www.NcFTP.com/contact/).
Connecting to ...
...
Logged in to ftp-trace.ncbi.nih.gov.
Current remote directory is /1000genomes/ftp/data
lftp, also available through your favorite package manager:
lftp ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/
cd ok, cwd=/1000genomes/ftp/data
lftp ftp-trace.ncbi.nih.gov:/1000genomes/ftp/data>
Another, more efficient, way to retrieve files is to use wget or curl. These work for HTTP, FTP and other protocols.
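For example, wget can mirror that directory over FTP (-r fetches everything under it recursively, and --no-parent keeps it from climbing back up the tree):
wget -r --no-parent "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/"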
It looks to me like the computer that isn't working is already adding the ftp: to the URL. Have you tried removing it from yours and seeing if that works?
> ftp ftp-trace.ncbi.nih.gov/1000genomes/ftp/data

Download non web accessible file with wget

Is it possible to download a file in, say, /home/... using wget to my local machine? I'm pretty newbish on the bash shell side, so perhaps this is just a matter of using the options correctly. What I've gleaned is that something like this should work, but my tests aren't downloading the file locally; they're placing it within the folder I'm running wget in.
root@mysite [/home/username/public_html/themes/themename/images]# wget -O "tester.png"
"http://www.mysite.com/themes/themename/images/previous.png"
--2011-09-08 14:28:49-- http://www.mysite.com/themes/themename/images/previous.png
Resolving www.mysite.com... 173.193.xxx.xxx
Connecting to www.mysite.com|173.193.xxx.xxx|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 352 [image/png]
Saving to: `tester.png'
100%[==============================================================================================>] 352 --.-K/s in 0s
2011-09-08 14:28:49 (84.3 MB/s) - `tester.png' saved [352/352]
Perhaps the above is a bad example, but I can't figure out how to use wget (or some other command) to get something from a non-web-accessible directory (it's a backup file). Is wget the correct command for this?
wget uses the HTTP (or FTP) protocol to transfer its files, so no, you can't use it to transfer anything which is not available through those services. What you should do is use scp. It uses ssh, and you can use it to get any file (which you have permission to read, that is).
Say you want /home/myuser/test.file from the computer mycomp, and you want to save it as test.newext. Then you'd invoke it like this:
scp myuser@mycomp:/home/myuser/test.file test.newext
You can do a lot of other nifty stuff with scp so read the manual for more possibilities!
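Note that scp is run from your local machine, not on the server where you were running wget. For instance (host and path are placeholders):
scp username@mysite.com:/home/username/backup.tar.gz ~/Downloads/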
This belongs on Super User, but you want to use scp to copy the file to your local machine.
When a file isn't web accessible, you can't get it with wget.
