How can I download a website's content into a string?


I would simply like to download a website and put its contents into a String.
Similar to how this is done in C#:
WebClient c = new WebClient();
string ex = c.DownloadString("http://url.com");

Rust has no HTTP functionality in the standard library, so you will want to use a crate (library) to handle HTTP for you. There are several crates for this purpose.
reqwest: "higher level HTTP client library"
// Blocking API; requires reqwest's `blocking` feature.
let body = reqwest::blocking::get("http://url.com")?.text()?;
ureq: "Minimal HTTP request library"
// ureq 2.x: `call()` returns a `Result`, so it needs its own `?`.
let body = ureq::get("http://url.com").call()?.into_string()?;
isahc: "The practical HTTP client that is fun to use."
use isahc::prelude::*; // brings `ReadResponseExt::text` into scope
let mut response = isahc::get("https://example.org")?;
let body = response.text()?;
curl: "Rust bindings to libcurl for making HTTP requests"
use curl::easy::Easy;
// First write everything into a `Vec<u8>`
let mut data = Vec::new();
let mut handle = Easy::new();
handle.url("http://url.com").unwrap();
{
    let mut transfer = handle.transfer();
    transfer.write_function(|new_data| {
        data.extend_from_slice(new_data);
        Ok(new_data.len())
    }).unwrap();
    transfer.perform().unwrap();
}
// Convert it to `String`
let body = String::from_utf8(data).expect("body is not valid UTF8!");
hyper?
Hyper is a very popular HTTP library, but it's rather low-level. This usually makes it too verbose for small scripts. However, if you want to write an HTTP server, Hyper is the way to go (which is why most Rust web frameworks use it).
Many others!
I couldn't list all available libraries in this answer. So feel free to search crates.io for more crates that could help you.
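To see what "no HTTP in the standard library" means in practice, here is a minimal sketch of a plain-HTTP GET written with only `std::net`. The names `fetch_plain_http` and `split_response` are made up for this illustration, and it assumes a server that speaks unencrypted HTTP/1.0 and closes the connection after responding. It handles no TLS, redirects, or chunked encoding, which is why the crates above are usually the better choice.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Split a raw HTTP/1.x response into (status line, body).
/// Returns `None` if the blank line separating headers and body is missing.
fn split_response(raw: &str) -> Option<(&str, &str)> {
    let (head, body) = raw.split_once("\r\n\r\n")?;
    let status_line = head.lines().next()?;
    Some((status_line, body))
}

/// Fetch `path` from `host:port` over plain HTTP and return the body.
/// Hypothetical helper: no TLS, no redirects, no chunked encoding.
fn fetch_plain_http(host: &str, port: u16, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((host, port))?;
    // HTTP/1.0 plus `Connection: close` so the server ends the stream for us.
    write!(
        stream,
        "GET {path} HTTP/1.0\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    )?;

    // `read_to_string` returns an error if the response is not valid UTF-8.
    let mut raw = String::new();
    stream.read_to_string(&mut raw)?;

    let (_status_line, body) = split_response(&raw).ok_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::InvalidData, "malformed HTTP response")
    })?;
    Ok(body.to_owned())
}
```

Calling `fetch_plain_http("example.org", 80, "/")` would return the page body on success, but only for sites that are still served over plain HTTP.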

Related

Authenticating to MS Graph without user interaction (Rust)

I have a Java application that uses the Microsoft Graph SDK to read from Azure AD. I am supposed to translate that application to Rust. In Java, I use code similar to this:
import com.azure.identity.ClientSecretCredentialBuilder;
import com.azure.identity.TokenCachePersistenceOptions;
import com.microsoft.graph.authentication.TokenCredentialAuthProvider;
import com.microsoft.graph.requests.GraphServiceClient;
import java.util.List;
public class GraphSdkClient {
    public static void testApplicationClient() {
        TokenCachePersistenceOptions options = new TokenCachePersistenceOptions()
            .setName("ILM-Demo")
            .setUnencryptedStorageAllowed(false);
        ClientSecretCredentialBuilder builder = new ClientSecretCredentialBuilder()
            .clientId("<MyClientId>")
            .tenantId("<MyTenantId>")
            .clientSecret("<MyClientSecret>")
            .tokenCachePersistenceOptions(options);
        TokenCredentialAuthProvider provider = new TokenCredentialAuthProvider(
            List.of("Calendars.Read", "Calendars.ReadBasic.All"),
            builder.build()
        );
        GraphServiceClient<?> client = GraphServiceClient
            .builder()
            .authenticationProvider(provider)
            .buildClient();
        client.me().calendar().calendarView().buildRequest().get();
    }
}
It authenticates as an application, using only the client secret. The permission was granted half a year ago, and as long as the three values passed to the ClientSecretCredentialBuilder are correct, it works perfectly fine. Now I tried a similar concept in Rust, taken from the graph-rs-sdk crate:
#[cfg(test)]
mod test {
    use graph_rs_sdk::oauth::OAuth;
    use warp::Filter;
    use crate::{CLIENT_ID, CLIENT_SECRET, TENANT_ID};
    use serde::{Deserialize, Serialize};

    #[derive(Default, Debug, Clone, Serialize, Deserialize)]
    pub struct ClientCredentialsResponse {
        admin_consent: bool,
        tenant: String,
    }

    #[tokio::test]
    async fn ut_client_credentials() {
        let query = warp::query::<ClientCredentialsResponse>()
            .map(Some)
            .or_else(|_| async {
                Ok::<(Option<ClientCredentialsResponse>,), std::convert::Infallible>((None,))
            });
        let routes = warp::get()
            .and(warp::path("redirect"))
            .and(query)
            .and_then(handle_redirect);
        // Get the oauth client and request a browser sign in
        let mut oauth = get_oauth_client();
        let mut request = oauth.build_async().client_credentials();
        request.browser_authorization().open().unwrap();
        warp::serve(routes).run(([127, 0, 0, 1], 8300)).await;
    }

    async fn handle_redirect(client_credential_option: Option<ClientCredentialsResponse>)
        -> Result<Box<dyn warp::Reply>, warp::Rejection> {
        match client_credential_option {
            Some(client_credential_response) => {
                // Print out for debugging purposes.
                println!("{:#?}", client_credential_response);
                // Request an access token.
                request_access_token().await;
                // Generic login page response.
                Ok(Box::new(
                    "Successfully Logged In! You can close your browser.",
                ))
            }
            None => Err(warp::reject()),
        }
    }

    async fn request_access_token() {
        let mut oauth = get_oauth_client();
        let mut request = oauth.build_async().client_credentials();
        let access_token = request.access_token().send().await.unwrap();
        println!("{:#?}", access_token);
        oauth.access_token(access_token);
    }

    fn get_oauth_client() -> OAuth {
        let mut oauth = OAuth::new();
        oauth
            .client_id(CLIENT_ID)
            .client_secret(CLIENT_SECRET)
            .tenant_id(TENANT_ID)
            .add_scope("https://graph.microsoft.com/User.Invite.All")
            .redirect_uri("http://localhost:8300/redirect")
            // .authorize_url("https://login.microsoftonline.com/common/adminconsent")
            .access_token_url("https://login.microsoftonline.com/common/oauth2/v2.0/token");
        oauth
    }
}
Note that I commented out the authorize URL. If this URL exists, a browser window opens, asking an admin to log in. This must not happen. When it is commented out, the request goes directly to <tenantId>/oauth2/v2.0/authorize instead of <tenantId>/oauth2/v2.0/adminconsent, which is what I want, but it then complains: AADSTS900144: The request body must contain the following parameter: 'scope'. The scope is given, though.
I already tried fiddling with the URLs for hours and also tried every other authentication concept in the crate, but none of them seems to work without interaction, which is not possible in my use case. Does someone know how to achieve that? (All permissions are granted as application permissions, not delegated.)
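For comparison, the client credentials grant as documented for the Microsoft identity platform is a single POST to the tenant-specific token endpoint, with no authorize or adminconsent step, and with the scope set to the resource's `/.default` rather than an individual permission. The sketch below only builds that endpoint URL and form body using the standard library. The helper names are hypothetical, it does not use graph-rs-sdk, and whether the crate can be configured to send the same request is not something this sketch establishes.

```rust
/// Percent-encode everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Tenant-specific v2.0 token endpoint (hypothetical helper name).
fn token_endpoint(tenant_id: &str) -> String {
    format!("https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token")
}

/// Form body for the client credentials grant against Microsoft Graph.
/// The scope is the resource's `/.default`, not a single permission name.
fn client_credentials_body(client_id: &str, client_secret: &str) -> String {
    let fields = [
        ("client_id", client_id),
        ("scope", "https://graph.microsoft.com/.default"),
        ("client_secret", client_secret),
        ("grant_type", "client_credentials"),
    ];
    fields
        .iter()
        .map(|(key, value)| format!("{key}={}", percent_encode(value)))
        .collect::<Vec<_>>()
        .join("&")
}
```

The resulting string would be sent as an `application/x-www-form-urlencoded` body to `token_endpoint(TENANT_ID)` with any HTTP client.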
Edit request: Please create and add the tag "graph-rs-sdk". I cannot create it myself but knowing the crate being used would be useful.

How to handle multipart/form-data upload in rocket (rust)?

I'm trying to handle POST requests on my web server for file uploads, but I haven't found good documentation or examples on how to do it.
Here is the POST function I came up with after a long time:
#[post("/upload",data="<file>")]
fn upload_handler(file: Data) -> Redirect {
    // Read the whole request body into memory.
    let mut buffer = Vec::new();
    file.stream_to(&mut buffer).unwrap();
    // The part headers end at the first blank line; the content ends at the closing boundary.
    let split_pos = buffer.windows(4).position(|pos|pos == b"\r\n\r\n").unwrap();
    let split_end = buffer.windows(8).position(|pos|pos == b"\r\n------").unwrap();
    let headers = String::from_utf8_lossy(&buffer[0..split_pos]);
    let content = &buffer[split_pos+4..split_end];
    println!("{}",&headers);
    let re = Regex::new("filename=\"(?P<filename>.*)\"").unwrap();
    let captures = re.captures(&headers).unwrap();
    let filename = &captures["filename"].to_owned();
    let mut local_file = File::create(filename).unwrap();
    // `write_all` instead of `write`, which may write only part of the buffer.
    local_file.write_all(content).unwrap();
    Redirect::to("/")
}
It kinda works but has a lot of flaws like:
I can't limit the file upload size.
I don't know the size of the file being uploaded.
I had to manually regex the filename (which I don't think is something you should have to do when using a framework).
I can't see the entire request header.
When splitting the headers from the content, this is all I got:
Content-Disposition: form-data; name="file"; filename="image.jpg"
Content-Type: image/jpeg
Here's some simple working code I had on a Python Flask server that did the job:
@app.route("/upload",methods=["POST"])
def upload():
    if int(request.headers["Content-Length"]) > app.config["MAX_CONTENT_LENGTH"]:
        abort(413)
    f = request.files["file"]
    filename = secure_filename(f.filename)
    if filename.endswith(ALLOWED_EXTENSIONS):
        f.save("src/"+filename)
        return render_template("upload.html",file=filename),{"Refresh": "2; url=/"}
    else:
        return render_template("denied.html")
If anyone knows a better/right way of doing it please tell me.
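Two of the flaws listed above, the regex for the file name and the missing size limit, can be handled with plain standard-library code once the raw part is in a buffer. This is a minimal sketch, not Rocket's API: `MAX_UPLOAD_BYTES`, `split_headers_and_body`, `extract_filename`, `sanitize_filename`, and `within_limit` are hypothetical names, and the sanitizing rule is only a rough analogue of Flask's `secure_filename`.

```rust
/// Hypothetical upload limit for this sketch (1 MiB).
const MAX_UPLOAD_BYTES: usize = 1024 * 1024;

/// Split a multipart part at the first blank line into (headers, rest).
fn split_headers_and_body(buffer: &[u8]) -> Option<(&[u8], &[u8])> {
    let pos = buffer.windows(4).position(|w| w == b"\r\n\r\n")?;
    Some((&buffer[..pos], &buffer[pos + 4..]))
}

/// Extract the value of `filename="..."` from the part headers, without a regex.
fn extract_filename(headers: &str) -> Option<&str> {
    let marker = "filename=\"";
    let start = headers.find(marker)? + marker.len();
    let len = headers[start..].find('"')?;
    Some(&headers[start..start + len])
}

/// Keep only the last path component and a conservative set of characters,
/// so a client-supplied name cannot escape the upload directory.
fn sanitize_filename(name: &str) -> Option<String> {
    let base = name.rsplit(|c| c == '/' || c == '\\').next()?;
    let cleaned: String = base
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-'))
        .collect();
    // Drop leading dots so names like ".." or hidden files are rejected or trimmed.
    let cleaned = cleaned.trim_start_matches('.').to_owned();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned)
    }
}

/// Check the content size against the limit.
fn within_limit(content: &[u8]) -> bool {
    content.len() <= MAX_UPLOAD_BYTES
}
```

Note that `within_limit` only checks after the data has been buffered; to actually cap memory use, the limit would have to be enforced while reading the request body.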

How to return contents as a file download in Axum?

I have a Rust application that is acting as a proxy. From the user perspective there is a web UI front end. This contains a button that when invoked will trigger a GET request to the Rust application. This in turn calls an external endpoint that returns the CSV file.
What I want is have the file download to the browser when the user clicks the button. Right now, the contents of the CSV file are returned to the browser rather than the file itself.
use std::net::SocketAddr;
use axum::{Router, Server};
use axum::extract::Json;
use axum::routing::get;

pub async fn downloadfile() -> Result<Json<String>, ApiError> {
    let filename = ...;
    let endpoint = "http://127.0.0.1:6101/api/CSV/DownloadFile";
    let path = format!("{}?filename={}", endpoint, filename);
    let response = reqwest::get(path).await?;
    let content = response.text().await?;
    Ok(Json(content))
}

pub async fn serve(listen_addr: SocketAddr) {
    let app = Router::new()
        .route("/downloadfile", get(downloadfile));
    Server::bind(&listen_addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}
I understand the reason I'm getting the contents is because I'm returning the content string as JSON. This makes sense. However, what would I need to change to return the file itself so the browser downloads it directly for the user?

I've managed to resolve it, and the handler now returns the CSV as a file. Here's the function that works:
use axum::response::Headers;
use http::header::{self, HeaderName};

pub async fn downloadfile() -> Result<(Headers<[(HeaderName, &'static str); 2]>, String), ApiError> {
    let filename = ...;
    let endpoint = "http://127.0.0.1:6101/api/CSV/DownloadFile";
    let path = format!("{}?filename={}", endpoint, filename);
    let response = reqwest::get(path).await?;
    let content = response.text().await?;
    let headers = Headers([
        (header::CONTENT_TYPE, "text/csv; charset=utf-8"),
        (header::CONTENT_DISPOSITION, "attachment; filename=\"data.csv\""),
    ]);
    Ok((headers, content))
}
I'm still struggling with setting a dynamic file name, but that's for another question.
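For the dynamic file name, the header value itself is easy to build as an owned `String`; a minimal sketch follows. The function name `attachment_header_value` is hypothetical, and how the resulting `String` is attached to the response depends on the axum version in use (the `&'static str` in the return type above would have to change). Non-ASCII characters are replaced here because a plain `filename="..."` parameter is limited to ASCII; RFC 6266 describes `filename*` for other characters.

```rust
/// Build a `Content-Disposition` value for a download with the given file name.
/// Quotes and backslashes are escaped; control and non-ASCII characters are
/// replaced with `_` so the value stays a valid single-line ASCII header.
fn attachment_header_value(filename: &str) -> String {
    let mut safe = String::with_capacity(filename.len());
    for c in filename.chars() {
        match c {
            '"' | '\\' => {
                safe.push('\\');
                safe.push(c);
            }
            c if c.is_ascii() && !c.is_ascii_control() => safe.push(c),
            _ => safe.push('_'),
        }
    }
    format!("attachment; filename=\"{safe}\"")
}
```

Replacing control characters also keeps a caller-supplied name from smuggling line breaks into the header.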

SPO Modern: How to inject and execute webpart programmatically, using js?

I have only the URL of webpart (example 'https://sitename.com/sites/site-colection/ClientSideAssets/hereisguid/webpartname.js') and I need to inject and run it programmatically via js, is it possible?

It's not officially supported, but you can use the global variable _spComponentLoader (available on every modern page). The problem is that it requires you to provide a WebPartContext, which you cannot simply get outside of SPFx.
If you want to do it in SPFx, here is some sample code:
// webPartId is the GUID ("hereisguid") from your URL
let component = await _spComponentLoader.loadComponentById(webPartId);
let manifest = _spComponentLoader.tryGetManifestById(webPartId);
let wpInstance = new component.default();
context.manifest = manifest;
// @ts-ignore
context._domElement = document.getElementById("<id-of-element-you-want-wp-to-render-in>");
await wpInstance._internalInitialize(context, {}, 1);
wpInstance._properties = webPart.properties;
await wpInstance.onInit();
wpInstance.render();
wpInstance._renderedOnce = true;
Again, I don't think it's supported, so try it at your own risk.
Note that this web part must be available on the site where you are going to execute this script.

How to simulate the browser to login in https website using c++ based on Linux?

My goal is to log in to an HTTPS website and download a web page from a C++ background service running on Linux.
The detailed requirements are as follows:
(1) connect to "https://www.space-track.org/auth/login"
(2) enter a username and password to log in
(3) post some form data to the website
(4) download the web page.
My current approach uses MFC's CInternetSession (code below; it runs on MS Windows), but it is not successful, so there must be some problems in the code. I hope you can help me solve the problem, or perhaps suggest a better way to simulate the browser in C++ on Linux. Thank you very much!
Url = "https://www.space-track.org/auth/login/";
nPort = INTERNET_DEFAULT_HTTP_PORT;
CString strHeaders = _T("Content-Type: application/x-www-form-urlencoded");
if (AfxParseURL(Url,dwSeviceType,strServerName,strTarget,nPort) == 0)
    return false;
CInternetSession sess;
sess.SetOption(INTERNET_OPTION_CONNECT_TIMEOUT,1000*20);
sess.EnableStatusCallback(TRUE);
CHttpConnection* pHttpConnect = sess.GetHttpConnection(strServerName,nPort);
CHttpFile* pHttpFile = pHttpConnect->OpenRequest(CHttpConnection::HTTP_VERB_POST,
    strTarget,NULL,1,NULL,NULL,INTERNET_FLAG_SECURE);
CString strUserName = "*****";
CString strPassword = "*****";
CString strUserinfo;
strUserinfo.Format(_T("identity=%s&password=%s"),strUserName,strPassword);
try
{
    BOOL bResult =pHttpFile->SendRequest(strHeaders,(LPVOID)(LPCTSTR)strUserinfo,strUserinfo.GetLength()* sizeof(TCHAR));
    //BOOL bResult =pHttpFile->SendRequest(strHeaders);
}
catch (CInternetException* pException)
{
    pException->m_dwError;
    pException->Delete();
}
pHttpFile->SetReadBufferSize(2048);
CString str;
CString strGetData;
while(pHttpFile->ReadString(strGetData))
{
    str +="\r\n";
    str +=strGetData;
}
CString fileName("index.html");
CFile file(fileName,CFile::modeCreate | CFile::modeWrite);
file.Write(str,str.GetLength());
file.Close();
pHttpFile->Close();
delete pHttpFile;
pHttpConnect->Close();
delete pHttpConnect;
sess.Close();
return TRUE;

There are a couple of Linux libraries that implement an HTTP client API, that can be used to implement HTTP/HTTPS requests in C or C++.
The grand-daddy of them all is W3C's own libwww:
http://www.w3.org/Library/
A more recent HTTP/HTTPS client library is libcurl:
http://curl.haxx.se/libcurl/
Either one of them can be used to implement an HTTP/HTTPS client in C or C++. In either case, however, you do need some understanding of how the HTTP and HTTPS protocols work before using them, specifically how HTTPS handles certificate validation and verification.
Both of these libraries are fairly common, and most Linux distributions already have them packaged. You probably have one or both of them installed already.
