I would like to implement a Lambda in AWS which receives pixel coordinates (x/y) as input, retrieves that pixel's RGB from one image, and then does something with it.
The catch is that the image is very large: 21600x10800 pixels (a 684 MB TIFF file).
Many of the image's pixels will likely never be accessed (it's a world map, so it includes e.g. oceans, for which no Lambda calls will happen), but I don't know in advance which pixels will be needed.
The result of the Lambda will be persisted so that the image operation is only done once per pixel.
My main concern is that I would like to avoid large, unnecessary processing time and costs. I expect multiple calls per second to the Lambda. The naive way would be to throw the image into an S3 bucket and read it in the Lambda to get one pixel, but I would think that each Lambda invocation would then become very heavy. I could build a custom solution such as storing the rows separately, but I was wondering whether there is a set of technologies that handles this more elegantly.
Right now I am using Node.js 14.x, but that's not a strong requirement.
The image is in TIFF format, but I could convert it to another image format beforehand if needed (just not as part of the Lambda's answer, as that is even bigger).
How can I design this Lambda efficiently?
As I said in the comments, I think Lambda is the wrong solution unless your traffic is very bursty. If you have continuous traffic with "multiple calls per second," it will be more cost-effective to use an alternate technology, such as EC2 or ECS. And these give you far more control over storage.
However, if you're set on using Lambda, then I think the best solution is to store the file on an EFS volume, then mount that filesystem onto your Lambda. In my experiments, it takes roughly 150 ms for a Lambda to read an arbitrary byte from a file on EFS, using Python and the mmap package.
Of course, if your TIFF library attempts to read the file into memory before performing any operations, this is moot. The TIFF format is designed so that shouldn't be necessary, but some libraries take the easy way out (because in most cases you're displaying or transforming the entire image). You may need to pre-process your file, to produce a raw byte format, in order to be able to make single-pixel gets.
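To make the single-pixel access concrete, here is a minimal sketch along those lines, assuming the TIFF has been pre-processed into a raw, uncompressed RGB file (3 bytes per pixel, row-major) on the EFS mount. The file path, event shape, and handler name are illustrative, not from the original question.

    import mmap

    WIDTH, HEIGHT = 21600, 10800        # dimensions of the source image
    BYTES_PER_PIXEL = 3                 # raw RGB, no alpha
    RAW_PATH = "/mnt/efs/worldmap.rgb"  # hypothetical EFS mount point and file name

    def handler(event, context):
        x, y = int(event["x"]), int(event["y"])
        offset = (y * WIDTH + x) * BYTES_PER_PIXEL
        with open(RAW_PATH, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # Only the page(s) containing these three bytes are actually read.
                r, g, b = mm[offset:offset + BYTES_PER_PIXEL]
        return {"r": r, "g": g, "b": b}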
Thanks everyone for the useful information!
So after some testing I settled on the solution from luk2302's comment, with 100x100-pixel sub-images hosted on S3, but I can't flag a comment as the solution. My tests showed that the Lambda accesses a pixel within 110 ms (from the now only ~4 kB files), which I think is quite sufficient for my expectations. (The very first request took about 1 s, but now even requests for sub-images that have never been touched before are answered within 110 ms.)
Parsifal's solution is what I originally envisioned as ideal in order to really read only the relevant data (the open question being which image library actually avoids loading the entire file), but I don't have the means to examine the filesystem aspect more closely to see whether it has more potential. In my case the requests are indeed very much burst-driven (with long periods of expected inactivity after the bursts), so for the moment I will stay with a Lambda, but I will keep the mentioned alternatives in mind.
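For reference, a minimal sketch of the tile lookup under the 100x100 sub-image approach. The bucket name, key scheme, and tile format (PNG) are assumptions for illustration, not taken from the original posts.

    import io

    import boto3
    from PIL import Image

    TILE = 100                      # tile edge length in pixels
    BUCKET = "my-worldmap-tiles"    # hypothetical bucket name
    s3 = boto3.client("s3")

    def get_pixel(x: int, y: int):
        # Which 100x100 tile the pixel falls into, and its offset within that tile.
        tile_x, tile_y = x // TILE, y // TILE
        key = f"tiles/{tile_x}_{tile_y}.png"   # hypothetical key scheme
        body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
        tile = Image.open(io.BytesIO(body))
        return tile.getpixel((x % TILE, y % TILE))  # (R, G, B)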
I want to use the dataset at https://ingmec.ual.es/datasets/lidar3d-pf-benchmark/ in my project. The available map is a .simplemap file. My understanding is that it stores both the map and the robot poses. I want to get a point cloud representation of this map (which I can later convert into an octomap), as well as the vehicle's ground-truth pose in the map.
I have been able to get the CPose3DPDF, from which I obtained a CPose3D, which I believe is the desired ground-truth pose of the vehicle. Please correct me if I am wrong. Now I have two problems. First, the length of the trajectory is just 97, which makes me suspicious about my code for obtaining it. Second is the CSensoryFrame which I obtain along with the CPose3DPDF. When I get a CObservation by calling CSensoryFrame->getObservationByIndex and write it to a file, it suggests that it stores Velodyne readings, but I am unable to recover a point cloud from it. Could anyone please guide me to a tool which can convert a .simplemap into a point cloud or an octomap representation, and which can also obtain the vehicle's pose from it? Many thanks in advance.
For the record: this one was answered here:
Your assumptions were all correct.
I realized the full UAL campus map was not included in the downloads. It's now available to download inside 2018-02-26-ual-campus-map.zip, at the bottom of the dataset page.
You can also regenerate the point cloud or octomap from the .simplemap using the observations2map application.
Example .ini files can be found under MRPT/share/mrpt/config_files/*
You can also visually inspect .simplemap files with the robot-map-gui app.
I wonder if there is an obvious and elegant way to add additional data to a JPEG while keeping it readable for standard image viewers. More precisely, I would like to embed a picture of the back side of a (scanned) photo into it. Old photos often have personal messages written on the back, be it the date or some notes. Sure, you could use EXIF and add some text, but an actual image of the back is preferable.
Sure, I could also save two files, xyz.jpg and xyz_back.jpg, or arrange both images side by side so they are always visible in one picture, but that's not what I'm looking for.
It is possible and has been done: on the Samsung Note 2 and 3, for example, you can add handwritten notes to the photos you've taken as an image, and some smartphones allow embedding voice recordings in the image files while preserving the readability of those files on other devices.
There are two ways you can do this.
1) Use an Application Marker (APP0–APPF), the preferred method
2) Use a Comment Marker (COM)
If you use an APPn marker:
1) Do not make it the first APPn in the file. Every known JPEG file format expects some kind of format-specific APPn marker right after the SOI marker. Make sure that your marker is not there.
2) Place a unique application identifier (null terminated string) at the start of the data (something done by convention).
All kinds of applications store additional data this way.
One issue is that the length field is only 16 bits (big-endian). If you have a lot of data, you will have to split it across multiple markers.
If you use a COM marker, make sure it comes after the first APPn marker in the file. However, I would discourage using a COM marker for something like this as it might choke applications that try to display the contents.
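As an illustration of the APPn approach, here is a minimal sketch that inserts custom APP15 segments (with a null-terminated identifier and the 16-bit-length splitting mentioned above) after the first APPn segment of an existing JPEG. The marker choice, identifier string, and file names are assumptions for illustration.

    import struct

    APP15 = 0xEF                       # any APPn not claimed by another format would do
    IDENTIFIER = b"MYBACKSIDE\x00"     # hypothetical null-terminated application identifier
    MAX_PAYLOAD = 65535 - 2 - len(IDENTIFIER)  # the 16-bit length includes its own two bytes

    def build_app_segments(data: bytes) -> bytes:
        """Split arbitrary data across as many APP15 segments as needed."""
        out = bytearray()
        for i in range(0, len(data), MAX_PAYLOAD):
            chunk = data[i:i + MAX_PAYLOAD]
            length = 2 + len(IDENTIFIER) + len(chunk)
            out += bytes([0xFF, APP15]) + struct.pack(">H", length) + IDENTIFIER + chunk
        return bytes(out)

    def embed(jpeg_path: str, payload_path: str, out_path: str) -> None:
        with open(jpeg_path, "rb") as f:
            jpeg = f.read()
        with open(payload_path, "rb") as f:
            payload = f.read()

        assert jpeg[:2] == b"\xFF\xD8", "not a JPEG"
        pos = 2
        # Skip the first APPn segment (e.g. JFIF or Exif) so ours is not the first one.
        if jpeg[pos] == 0xFF and 0xE0 <= jpeg[pos + 1] <= 0xEF:
            seg_len = struct.unpack(">H", jpeg[pos + 2:pos + 4])[0]
            pos += 2 + seg_len

        with open(out_path, "wb") as f:
            f.write(jpeg[:pos] + build_app_segments(payload) + jpeg[pos:])

    # embed("front.jpg", "back.jpg", "front_with_back.jpg")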
An interesting question. There are file formats that support multiple images per file (multipage TIFF comes to mind) but JPEG doesn't support this natively.
One feature of the JPEG file format is the concept of APP segments. These are regions of the JPEG file that can contain arbitrary information (as a sequence of bytes). Exif is actually stored in one of these segments, and is identified by a preamble.
Take a look at this page: http://www.sno.phy.queensu.ca/~phil/exiftool/#JPEG
You'll see many segments there that start with APP such as APP0 (which can store JFIF data), APP1 (which can contain Exif) and so forth.
There's nothing stopping you from storing data in one of these segments. Conformant JPEG readers will ignore this unrecognised data, but you could write software to store/retrieve data from within there. It's even possible to embed another JPEG file within such a segment! There's no precedent I know of for doing this, however.
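To sketch the retrieval side, here is a minimal reader that walks the JPEG segment list and concatenates the payload of custom APP15 segments tagged with a known identifier. The marker and identifier are illustrative assumptions and would have to match whatever the embedding software used.

    import struct

    APP15 = 0xEF
    IDENTIFIER = b"MYBACKSIDE\x00"  # must match the identifier used when embedding

    def extract(jpeg_path: str) -> bytes:
        with open(jpeg_path, "rb") as f:
            jpeg = f.read()
        payload = bytearray()
        pos = 2  # skip the SOI marker
        while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
            marker = jpeg[pos + 1]
            if marker == 0xDA:  # SOS: entropy-coded image data follows, stop scanning
                break
            seg_len = struct.unpack(">H", jpeg[pos + 2:pos + 4])[0]
            body = jpeg[pos + 4:pos + 2 + seg_len]
            if marker == APP15 and body.startswith(IDENTIFIER):
                payload += body[len(IDENTIFIER):]
            pos += 2 + seg_len
        return bytes(payload)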
Another option would be to include the second image as the thumbnail of the first. Thumbnails are normally very small, but nothing stops you from storing a full second image there. Some software might replace or remove it, though.
In general I think using two files and a naming convention would be the simplest and least confusing, but you do have options.
I'm trying to compare the performance of using data URIs versus a large number of separate images. What I've done is set up two tests:
Regular Images (WPT)
Base64 (WPT)
Both pages are exactly the same, other than "how" these images/resources are being served. I've run a WebPageTest against each (noted above - WPT), and it looks like the average load time for base64 is a lot faster, but the cached view of the regular version is faster. I've implemented HTML5 Boilerplate's .htaccess to make sure resources are properly gzipped, but as you can see I'm getting an F for base64 for not caching static resources (which I'm not sure is right or not). What I'm ultimately trying to figure out here is which is the better way to go (assuming, for argument's sake, there'd be that many resources on a single page). Some things I know:
The GET request for base64 is big
There's 1 resource for base64 compared to 300-odd for the regular version (which is the bigger downer here: the size of the GET request or the number of resources?). The thing to remember about the regular version is that only so many resources can be loaded in parallel due to browser restrictions, while for base64 you're really only waiting until the HTML can be read, so nothing is technically loaded other than the page itself.
Really appreciate any help - thanks!
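For anyone reproducing this kind of test, here is a minimal sketch of inlining an image as a base64 data URI; the file name is hypothetical.

    import base64

    with open("icon.png", "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")

    # Drop the resulting tag straight into the test page instead of a separate request.
    img_tag = f'<img src="data:image/png;base64,{encoded}" alt="icon">'
    print(img_tag)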
For comparison I think you need to run a test with the images sharded across multiple hostnames.
Another option would be to sprite the images into logical sets.
If you're going to go down the BASE64 route, then perhaps you need to find a way to cache them on the client.
If these are the images you're planning on using then there's plenty of room for optimisation in them, for example: http://yhmags.com/profile-test/img_scaled15/interior-flooring.jpg
I converted this to a PNG and ran it through ImageOptim and it came out as 802 bytes (vs 1.7KB for the JPG)
I'd optimise the images and then re-run the tests, including one with multiple hostnames.
I am learning DirectX. It provides a huge amount of freedom in how to do things, but presumably different strategies perform differently, and it provides little guidance as to what well-performing usage patterns might be.
When using DirectX, is it typical to have to swap in a bunch of new data multiple times on each render?
The most obvious, and probably really inefficient, way to use it would be like this.
Strategy 1
On every single render
Load everything for model 0 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything for model 1 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
I am guessing you can make this more efficient partly by giving the biggest things to load dedicated slots, e.g. if the texture for model 0 is really complicated, don't reload it on each step; just load it into slot 1 and leave it there. Of course, since I'm not sure how many registers of each type there are guaranteed to be in DX11, this is complicated (can anyone point to documentation on that?).
Strategy 2
Choose some texture slots for loading and others for perpetual storage of your most complex textures.
Once only
Load most complicated models, shaders and textures into slots dedicated for perpetual storage
On every single render
Load everything not already present for model 0 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything not already present for model 1 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
Strategy 3
I have no idea, but the above are probably all wrong because I am really new at this.
What are the standard strategies for making rendering in DirectX (specifically DX11) as efficient as possible?
DirectX manages the resources for you and tries to keep them in video memory as long as it can to optimize performance, but can only do so up to the limit of video memory in the card. There is also overhead in every state change even if the resource is still in video memory.
A general strategy for optimizing this is to minimize the number of state changes during the rendering pass. Commonly this means drawing all polygons that use the same texture in a batch, and all objects using the same vertex buffers in a batch. So generally you would try to draw as many primitives as you can before changing state in order to draw more primitives.
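To make the batching idea concrete, here is a small illustration (plain Python, not DirectX API calls) of sorting draw calls by a state key so that identical state becomes adjacent and the expensive state changes happen once per group. The data and names are made up.

    from operator import itemgetter
    from itertools import groupby

    # Hypothetical per-object render data: (shader, texture, vertex_buffer, mesh)
    draw_list = [
        ("lit",   "brick", "vb0", "wall_a"),
        ("lit",   "grass", "vb1", "terrain"),
        ("lit",   "brick", "vb0", "wall_b"),
        ("unlit", "sky",   "vb2", "skydome"),
    ]

    state_key = itemgetter(0, 1, 2)
    draw_list.sort(key=state_key)  # identical state groups become adjacent

    for (shader, texture, vb), group in groupby(draw_list, key=state_key):
        print(f"set state once: shader={shader}, texture={texture}, vb={vb}")
        for _, _, _, mesh in group:
            print(f"  draw {mesh}")  # one draw call per object, no redundant state changes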
This often will make the rendering code a little more complicated and harder to maintain, so you will want to do some profiling to determine how much optimization you are willing to do.
Generally you will get better performance increases through more general algorithm changes beyond the scope of this question. Some examples would be reducing polygon counts for distant objects and occlusion queries. A popular true phrase is "the fastest polygons are the ones you don't draw". Here are a couple of quick links:
http://msdn.microsoft.com/en-us/library/bb147263%28v=vs.85%29.aspx
http://www.gamasutra.com/view/feature/3243/optimizing_direct3d_applications_.php
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter06.html
Other answers are better answers to the question per se, but by far the most relevant thing I found since asking is this discussion on gamedev.net, in which some big-title games are profiled for state changes and draw calls.
What comes out of it is that big-name games don't appear to actually worry too much about this; it can take significant time to write code that addresses this sort of issue, and the time spent fussing with it probably isn't worth the time lost in getting your application finished.
I want to create an application to read and write DICOM files without using any third-party software.
How can I do that?
Can anyone help me?
"I my project, I need only to update pixel data. So it was not too tough to handle. I just parsde the DICOM file till I reach pixel data, and then I replaced the same with my own data. and It become success."
Even though there are quite a few research applications that do the same thing that you've done, it is precisely The Wrong Thing To Do(TM). Why is this such a bad practice? DICOM images are supposed to be uniquely identified by their SOP Instance UIDs. When you take an existing DICOM image and replace the pixel data, leaving the original header information unaltered, you are creating two data objects that share the same primary key.
Consider what will happen if you take this image and send it to a DICOM Storage SCP that already has a copy of the original image. The Storage SCP has to invoke a conflict resolution procedure because it can't have two SOP Instances with the same UID. Upon receipt of your new image, the Storage SCP detects that the new image has the same UID as an existing image, and the required behavior of the SCP is not well defined. It can treat your new image as a retransmission of the original and ignore it; it can treat it as a corrected version of the original and replace the original with it; or it can give up, admit that it has no idea what to do with this new image, throw it into a holding area, and require a human being to interact with the application to decide what to do with the two images. You, the creator of the new image, have no way of knowing or controlling what the behavior of the Storage SCP will be when it receives your new image.
At a minimum, you need to generate a new valid SOP Instance UID when you create a new image. Your image type should also be one of the DERIVED\SECONDARY types because it is a post-processed image, not a primary acquisition generated by the modality. You should also look at the other DICOM tags present in the original header and seriously consider whether they accurately describe the new image that you've created.
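As an illustration of which attributes have to change, here is a minimal sketch using pydicom (a third-party library, so not what the original question asked for, but it makes the required header updates explicit). The file names are hypothetical.

    import pydicom
    from pydicom.uid import generate_uid

    ds = pydicom.dcmread("original.dcm")

    # ... replace the pixel data with your own here (same rows/columns/bit depth assumed):
    # ds.PixelData = new_pixel_bytes

    # The modified object is a new SOP Instance: give it a new, unique UID
    # and keep the file meta header consistent with it.
    new_uid = generate_uid()
    ds.SOPInstanceUID = new_uid
    ds.file_meta.MediaStorageSOPInstanceUID = new_uid

    # Mark the image as post-processed rather than a primary acquisition.
    ds.ImageType = ["DERIVED", "SECONDARY"]

    ds.save_as("derived.dcm")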
That would pretty much mean starting from the DICOM standard and writing a lot of code.