How to get (first n bytes of) file from s3 url

This is a basic question, but I am a bit stuck.
I am trying to get the first n bytes of a file hosted on S3. I understand the basic building blocks of the issue, and I know how to request a byte range. Here is an example from the Amazon S3 docs:
GET /example-object HTTP/1.1
Host: example-bucket.s3.amazonaws.com
x-amz-date: Fri, 28 Jan 2011 21:32:02 GMT
Range: bytes=0-9
Authorization: AWS AKIAIOSFODNN7EXAMPLE:Yxg83MZaEgh3OZ3l0rLo5RTX11o=
Now I have an s3 url, like this:
http://s3.amazonaws.com/bucket-name/foo/1/2/3/4/file.jpg
How do I translate this into the kind of request specified in the docs? I know this is remedial, but I am stuck; this feels somewhat opaque, though that could just be me.
Please help me deconstruct the S3 URL into the components of the GET request example. Help is appreciated!
UPDATE: If I am to use the GetObjectRequest API, what would I use for the bucket and key in the constructor?
UPDATE2: In other words, is it as simple as:
// modelled after http://s3.amazonaws.com/foo/
private String s3UrlToBucket(String s3Url)
{
    Pattern pattern = Pattern.compile("^https?://[^/]+/([^/]+)/.*$");
    Matcher matcher = pattern.matcher(s3Url);
    if (matcher.find())
    {
        return matcher.group(1);
    }
    return null;
}
// modelled after http://s3.amazonaws.com/foo/1/2/3/4.jpg
private String s3UrlToKey(String s3Url)
{
    Pattern pattern = Pattern.compile("^https?://[^/]+/[^/]+/(.*)$");
    Matcher matcher = pattern.matcher(s3Url);
    if (matcher.find())
    {
        return matcher.group(1);
    }
    return null;
}
UPDATE 3: Can you please explain to me what the key refers to???

If you use the AWS SDK for Java, you just set the range on the GetObjectRequest. Here is a code example:
AmazonS3Client s3Client = new AmazonS3Client();
GetObjectRequest request = new GetObjectRequest("bucket-name", "foo/1/2/3/4/file.jpg");
// The range is inclusive on both ends, so the first n bytes are offsets 0 through n - 1.
request.withRange(0, numberOfBytesToGet - 1);
S3Object s3Object = s3Client.getObject(request);
// s3Object.getObjectContent() returns a stream over the requested bytes.
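On the question of what the key refers to: for a path-style URL like the one in the question, the bucket is the first path segment and the key is everything after it, slashes included (the key is simply the object's full name inside the bucket). Here is a minimal Python sketch of that split. It assumes that a host starting with s3 means a path-style URL and that any other host is virtual-hosted style, as in the docs example above. split_s3_url and first_n_bytes_range are hypothetical helper names, not SDK functions.

```python
from urllib.parse import urlparse, unquote


def split_s3_url(url):
    """Return (bucket, key) for a path-style or virtual-hosted-style S3 URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path = unquote(parsed.path.lstrip("/"))
    if host.startswith("s3.") or host.startswith("s3-"):
        # Path-style: http://s3.amazonaws.com/<bucket>/<key>
        bucket, _, key = path.partition("/")
    else:
        # Virtual-hosted style: http://<bucket>.s3.amazonaws.com/<key>
        # (assumption: the bucket name itself does not contain ".s3")
        bucket = host.split(".s3", 1)[0]
        key = path
    return bucket, key


def first_n_bytes_range(n):
    """Range header value for the first n bytes; the end offset is inclusive."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "bytes=0-{}".format(n - 1)


bucket, key = split_s3_url("http://s3.amazonaws.com/bucket-name/foo/1/2/3/4/file.jpg")
print(bucket, key, first_n_bytes_range(10))
```

The two values returned by split_s3_url correspond to the two constructor arguments of GetObjectRequest in the answer above.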

If you are struggling to do this in Ruby (which we were for ages) the correct syntax is:
s3_resource.bucket(bucket).object(key).get(range: "bytes=0-3").body.read

Related

What is the correct way to encode SAML request to the Azure AD endpoint?

I am using the following code to encode the SAMLRequest value to the endpoint, i.e. the XYZ when calling https://login.microsoftonline.com/common/saml2?SAMLRequest=XYZ.
Is this the correct way to encode it?
private static string DeflateEncode(string val)
{
    var memoryStream = new MemoryStream();
    using (var writer = new StreamWriter(new DeflateStream(memoryStream, CompressionMode.Compress, true), new UTF8Encoding(false)))
    {
        writer.Write(val);
        writer.Close();
        return Convert.ToBase64String(memoryStream.GetBuffer(), 0, (int)memoryStream.Length, Base64FormattingOptions.None);
    }
}

If you just want to convert a string to a Base64-encoded string, you can do it this way:
var encoded = Convert.ToBase64String(System.Text.Encoding.Default.GetBytes(val));
Console.WriteLine(encoded);
return encoded;

Yes, that looks correct for the HTTP Redirect binding.
But don't do this yourself unless you really know what you are doing. Sending the AuthnRequest is the simple part. Correctly validating the received response, including guarding against XML signature wrapping attacks, is hard. Use an existing library; both commercial and open source libraries are available.
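For reference, the encoding used by the HTTP Redirect binding is raw DEFLATE, then Base64, then URL-encoding of the result, since Base64 output can contain +, / and =. A minimal Python sketch of that pipeline follows. The function names are hypothetical, and it only covers the encoding step, not signing or response validation.

```python
import base64
import zlib
from urllib.parse import quote, unquote


def encode_saml_redirect(xml):
    """Raw DEFLATE (no zlib header), Base64, then percent-encode for a query string."""
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    deflated = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    b64 = base64.b64encode(deflated).decode("ascii")
    return quote(b64, safe="")


def decode_saml_redirect(value):
    """Inverse of encode_saml_redirect, useful for checking what was sent."""
    deflated = base64.b64decode(unquote(value))
    return zlib.decompress(deflated, -15).decode("utf-8")


request_xml = "<samlp:AuthnRequest ID=\"example\"/>"  # placeholder, not a valid request
encoded = encode_saml_redirect(request_xml)
print(decode_saml_redirect(encoded) == request_xml)
```

The negative window size (-15) is what selects raw DEFLATE in zlib, which is the format a .NET DeflateStream produces.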

Spring cloud data flow Httpclient

I have the following stream.
Context of the problem:
1. The stream definition:
rabbit --password='******' --queues=springdataflow-q --virtual-host=springdataflow --host=172.24.172.184 --username=springdataflow | transform | httpclient --url-expression='http://172.20.24.47:8080/push' --http-method=POST --headers-expression={'Content-Type':'application/x-www-form-urlencoded'} --body-expression={arg1:payload} | log
2. I have Spring Boot running locally:
@RestController
public class HelloController {
    @RequestMapping(value = "/push", method = RequestMethod.POST, produces = {MediaType.TEXT_PLAIN})
    public String pushMessage(@RequestParam(value="arg1") String payload) {
        System.out.println(payload);
        return payload;
    }
}
I would like the rabbit message to arrive at httpclient as the value of the 'arg1' parameter of the POST request. The intent is that a message published on the rabbit queue is consumed by a REST POST endpoint, with the message captured by the SpEL payload variable.
For this I am using body-expression={arg1:payload}, but this is not working; maybe it is syntactically wrong.
Any suggestions?

@RequestParam(value="arg1") really refers to a request parameter, the part of the URL after the ?, which is called the query string: https://en.wikipedia.org/wiki/Query_string.
So, if you really would like to have an arg1=payload pair in the query string, you need to use a proper url-expression:
--url-expression='http://172.20.24.47:8080/push?arg1='+payload
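One thing to keep in mind with plain concatenation is that the payload has to be percent-encoded, or a message containing & or = will be split into several parameters. A small Python sketch of the two places an arg1=payload pair can travel, the query string and a form-encoded body, follows. The function names are hypothetical, and whether the httpclient app encodes the value for you is not something this sketch assumes either way.

```python
from urllib.parse import urlencode, parse_qs


def push_url(base, payload):
    """Build <base>?arg1=<payload> with the payload percent-encoded."""
    return base + "?" + urlencode({"arg1": payload})


def form_body(payload):
    """Same pair as an application/x-www-form-urlencoded request body."""
    return urlencode({"arg1": payload}).encode("ascii")


message = "hello world&x=1"
url = push_url("http://172.20.24.47:8080/push", message)
print(url)
# Decoding the query string recovers the original message as a single parameter.
print(parse_qs(url.split("?", 1)[1])["arg1"][0] == message)
```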

This seems to work for passing strings as payloads. It seems that by default the payload becomes the request body.
So on the rest service I made a change:
@RequestMapping(value = "/pushbody", method = RequestMethod.POST, consumes = {MediaType.TEXT_PLAIN})
public String pushBody(@RequestBody String payload) {
    System.out.println(payload);
    return payload;
}
And the stream that seems to work now is :
rabbit --password='******' --queues=springdataflow-q1 --host=172.24.172.184 --virtual-host=springdataflow --username=springdataflow | httpclient --http-method=POST --headers-expression={'Content-Type':'text/plain'} --url=http://172.20.24.47:8080/pushbody | log
I did try the inputType=text/plain suggestion on both the httpclient and the log sink, and removing the consumes and produces attributes on the REST service's POST method, but no luck there.

Dotnet Core 2.0.3 Migration | Encoding unable to translate bytes [8B]

I'm not sure if this should go straight to GitHub, but I thought I'd check here first in case anyone has encountered this issue before.
I recently upgraded one of my apps from .NET Core 1.1.4 to 2.0.3.
Everything works fine locally, but when I deploy to my App Service in Azure I get the following exception:
System.Text.DecoderFallbackException: Unable to translate bytes [8B] at index 1 from specified code page to Unicode.
The code that calls it is an HttpClient that talks between the apps.
public async Task<T1> Get<T1>(string url, Dictionary<string, string> urlParameters = null) where T1 : DefaultResponse, new()
{
    var authToken = _contextAccessor.HttpContext.Request.Cookies["authToken"];
    using (var client = new HttpClient().AcceptJson().Acceptgzip().AddAuthToken(authToken))
    {
        var apiResponse = await client.GetAsync(CreateRequest(url, urlParameters));
        T1 output;
        if (apiResponse.IsSuccessStatusCode)
        {
            output = await apiResponse.Content.ReadAsAsync<T1>();
            //output.Succeeded = true;
        }
        else
        {
            output = new T1();
            var errorData = GlobalNonSuccessResponseHandler.Handle(apiResponse);
            output.Succeeded = false;
            output.Messages.Add(errorData);
        }
        return output;
    }
}
public static HttpClient AcceptJson(this HttpClient client)
{
    client.DefaultRequestHeaders.Clear();
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    return client;
}
public static HttpClient Acceptgzip(this HttpClient client)
{
    // Commenting this out fixes the issue.
    //client.DefaultRequestHeaders.AcceptEncoding.Add(StringWithQualityHeaderValue.Parse("gzip"));
    client.DefaultRequestHeaders.AcceptEncoding.Add(StringWithQualityHeaderValue.Parse("deflate"));
    return client;
}
public static HttpClient AddAuthToken(this HttpClient client, string authToken)
{
    client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", authToken);
    return client;
}
I'm a bit stumped as to what's going on.
I have two apps, which I'll call client and server from now on.
The client uses the above code to talk to the server.
Locally this is fine, but on Azure it is not. This all worked fine before upgrading.
When I set up the client locally to talk to the server on Azure, I was able to replicate the issue.
I had a look at the response in Fiddler, and it is able to decode it correctly.
If anyone has seen this before or has any idea where I should look, any info would be great :D.
UPDATE 1
After some more digging I decided to remove gzip, and then everything started working.
client.DefaultRequestHeaders.AcceptEncoding.Add(StringWithQualityHeaderValue.Parse("gzip"));
Can anyone explain this?

8B can be the second byte of a multi-byte UTF-8 character. The DecoderFallbackException tells you that the data is being interpreted as some other encoding, probably Latin-1, which doesn't have an 8B character.
In Fiddler, you should look at the Content-Type HTTP header in the response. If it says application/json or application/json; charset=utf-8, it's probably a bug in .NET, because even without charset=utf-8, RFC 4627 says the default encoding is already UTF-8.
If it says something else, I would try changing the server so it sends the correct Content-Type header in the response.
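Another reading worth checking, given that removing gzip from Accept-Encoding made the error go away: every gzip stream starts with the two bytes 1F 8B, so an undecodable 8B at index 1 is what you would see if a still-compressed body were decoded as text. The Python sketch below only demonstrates that byte pattern. That this is what happens on Azure is an assumption, not a confirmed diagnosis, and the JSON body is a made-up placeholder.

```python
import gzip
import json

# Placeholder JSON body standing in for the server's response.
body = json.dumps({"succeeded": True}).encode("utf-8")
compressed = gzip.compress(body)


def first_undecodable_index(data):
    """Return the index of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


print(compressed[:2].hex())                 # gzip magic number
print(first_undecodable_index(compressed))  # where text decoding gives up
print(gzip.decompress(compressed) == body)  # decompressing first recovers the JSON
```

If that is the cause, the usual place to look in .NET is HttpClientHandler.AutomaticDecompression, which controls whether HttpClient decompresses gzip and deflate responses before the content is read.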

How do I escape a '/' in the URI for a GET request?

I'm trying to use Groovy to script a GET request to our GitLab server to retrieve a file. The API URI format is:
https://githost/api/v4/projects/<namespace>%2F<repo>/files/<path>?ref=<branch>
Note that there is an encoded '/' between namespace and repo. The final URI needs to look like the following to work properly:
https://githost/api/v4/projects/mynamespace%2Fmyrepo/files/myfile.json?ref=master
I have the following code:
File f = HttpBuilder.configure {
    request.uri.scheme = scheme
    request.uri.host = host
    request.uri.path = "/api/v4/projects/${apiNamespace}%2F${apiRepoName}/repository/files/${path}/myfile.json"
    request.uri.query.put("ref", "master")
    request.contentType = 'application/json'
    request.accept = 'application/json'
    request.headers['PRIVATE-TOKEN'] = apiToken
    ignoreSslIssues execution
}.get {
    Download.toFile(delegate as HttpConfig, new File("${dest}/myfile.json"))
}
However, the %2F is re-encoded as %252F. I've tried multiple ways to try and create the URI so that it doesn't encode the %2F in between the namespace and repo, but I can't get anything to work. It either re-encodes the '%' or decodes it to the literal "/".
How do I do this using Groovy + http-builder-ng to set the URI in a way that will preserve the encoded "/"? I've searched but can't find any examples that have worked.
Thanks!

As of the 1.0.0 release you can handle requests with encoded characters in the URI. An example would be:
def result = HttpBuilder.configure {
    request.raw = "http://localhost:8080/projects/myteam%2Fmyrepo/myfile.json"
}.get()
Notice the use of raw rather than uri in the example. With this approach you have to do any other encoding or decoding of the URI yourself.
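Since raw leaves the encoding to you, it helps to see exactly what goes wrong in the question: encoding the namespace/repo segment once gives %2F, and encoding the already-encoded string again turns the % into %25, which is where %252F comes from. Below is a minimal Python sketch of building the URL with each segment encoded exactly once. The function names are hypothetical, and the path layout follows the code in the question.

```python
from urllib.parse import quote


def gitlab_file_url(host, namespace, repo, file_path, ref):
    """Percent-encode each variable segment once, then assemble the URL."""
    project = quote(namespace + "/" + repo, safe="")  # "/" becomes %2F
    encoded_path = quote(file_path, safe="")
    return "https://{}/api/v4/projects/{}/repository/files/{}?ref={}".format(
        host, project, encoded_path, quote(ref, safe="")
    )


def encode_twice(segment):
    """What happens when an already-encoded segment is encoded again."""
    return quote(quote(segment, safe=""), safe="")


print(gitlab_file_url("githost", "mynamespace", "myrepo", "myfile.json", "master"))
print(encode_twice("mynamespace/myrepo"))
```

The string produced by gitlab_file_url is the kind of value you would assign to request.raw.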

Possible Workaround
The GitLab API allows you to query by project id or by project name. Look up the project id first, then query the project by id.
Look up the project id. See https://docs.gitlab.com/ee/api/projects.html#list-all-projects
def projects = // GET /projects
def project = projects.find { it['path_with_namespace'] == 'diaspora/diaspora-client' }
Fetch the project by :id. See https://docs.gitlab.com/ee/api/projects.html#get-single-project
GET /projects/${project.id}

Swagger generated API for the Microsoft Cognitive Services Recommendations

I can't seem to figure out how to include the CSV file content when calling the Swagger API generated methods for the Microsoft Cognitive Services Recommendations API method Uploadacatalogfiletoamodel(modelID, catalogDisplayName, accountKey);. I've tried setting the catalogDisplayName to the full path to the catalog file, however I'm getting "(EXT-0108) Passed argument is invalid."
When calling any of the Cognitive Services APIs that require HTTP body content, how do I include the body content when the exposed API doesn't have a parameter for the body?

I guess Swagger can't help you test functions that need to pass data through a form. Sending the CSV content in the form data should do the trick, if you know the proper headers.
I work with the NuGet package called "Microsoft.Net.Http", and the code looks like this:
HttpContent stringContent = new StringContent(someStringYouWannaSend);
HttpContent bytesContent = new ByteArrayContent(someBytesYouWannaSend);
using (var client = new HttpClient())
using (var formData = new MultipartFormDataContent())
{
    formData.Add(stringContent, "metadata", "metadata");
    formData.Add(bytesContent, "bytes", "bytes");
    HttpResponseMessage response = client.PostAsync(someWebApiEndPoint.ToString(), formData).Result;
    if (!response.IsSuccessStatusCode)
    {
        return false; //LOG
    }
    string responseContent = response.Content.ReadAsStringAsync().Result;
    jsonResult = JsonConvert.DeserializeObject<someCoolClass>(responseContent);
    return true;
}
Sorry that the some* placeholder variables won't compile as-is. I hope you can figure it out from this.

When you base your code on the Swagger definition, you depend on the goodwill of the person who created that Swagger definition. Maybe it is not complete yet.
If you are working in C#, try looking at the Samples repo.
Particularly for the Uploading of the catalog there are several functions on the ApiWrapper class that might be helpful, one has this signature: public CatalogImportStats UploadCatalog(string modelId, string catalogFilePath, string catalogDisplayName), another has this other signature public UsageImportStats UploadUsage(string modelId, string usageFilePath, string usageDisplayName) (where it seems like you can point to a public url).
In your case I'd probably try the second one.
Download the sample and use the Wrapper code defined there in your project.
