unsigned char* buffer to System::Drawing::Bitmap - visual-c++

I'm trying to create a tool/asset converter that rasterises a font to a texture page for an XNA game using the FreeType2 engine.
Below, the first image is the direct output from the FreeType2 engine. The second image is the result after attempting to convert it to a System::Drawing::Bitmap.
target http://www.freeimagehosting.net/uploads/fb102ee6da.jpg currentresult http://www.freeimagehosting.net/uploads/9ea77fa307.jpg
Any hints/tips/ideas on what is going on here would be greatly appreciated. Links to articles explaining byte layout and pixel formats would also be helpful.
FT_Bitmap *bitmap = &face->glyph->bitmap;
int width = (face->bitmap->metrics.width / 64);
int height = (face->bitmap->metrics.height / 64);
// must be aligned on a 32 bit boundary or 4 bytes
int depth = 8;
int stride = ((width * depth + 31) & ~31) >> 3;
int bytes = (int)(stride * height);
// as *.bmp
array<Byte>^ values = gcnew array<Byte>(bytes);
Marshal::Copy((IntPtr)glyph->buffer, values, 0, bytes);
Bitmap^ systemBitmap = gcnew Bitmap(width, height, PixelFormat::Format24bppRgb);
// create bitmap data, lock pixels to be written.
BitmapData^ bitmapData = systemBitmap->LockBits(Rectangle(0, 0, width, height), ImageLockMode::WriteOnly, bitmap->PixelFormat);
Marshal::Copy(values, 0, bitmapData->Scan0, bytes);
systemBitmap->UnlockBits(bitmapData);
systemBitmap->Save("Test.bmp");
Update. Changed PixelFormat to 8bppIndexed.
FT_Bitmap *bitmap = &face->glyph->bitmap;
// stride must be aligned on a 32 bit boundary or 4 bytes
int depth = 8;
int stride = ((width * depth + 31) & ~31) >> 3;
int bytes = (int)(stride * height);
target = gcnew Bitmap(width, height, PixelFormat::Format8bppIndexed);
// create bitmap data, lock pixels to be written.
BitmapData^ bitmapData = target->LockBits(Rectangle(0, 0, width, height), ImageLockMode::WriteOnly, target->PixelFormat);
array<Byte>^ values = gcnew array<Byte>(bytes);
Marshal::Copy((IntPtr)bitmap->buffer, values, 0, bytes);
Marshal::Copy(values, 0, bitmapData->Scan0, bytes);
target->UnlockBits(bitmapData);

Ah ha. Worked it out.
FT_Bitmap is an 8-bit image, so the correct PixelFormat was 8bppIndexed, which resulted in this output.
Not aligned to 32-bit boundary http://www.freeimagehosting.net/uploads/dd90fa2252.jpg
Each row of a System::Drawing::Bitmap needs to be aligned on a 32-bit boundary.
I was calculating the stride but was not padding it when writing the bitmap. Copied the FT_Bitmap buffer to a byte[] and then wrote that to a MemoryStream, adding the necessary padding.
int stride = ((width * pixelDepth + 31) & ~31) >> 3;
int padding = stride - (((width * pixelDepth) + 7) / 8);
array<Byte>^ pad = gcnew array<Byte>(padding);
array<Byte>^ buffer = gcnew array<Byte>(size);
Marshal::Copy((IntPtr)source->buffer, buffer, 0, size);
MemoryStream^ ms = gcnew MemoryStream();
for (int i = 0; i < height; ++i)
{
ms->Write(buffer, i * width, width);
ms->Write(pad, 0, padding);
}
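The stride arithmetic and the row-by-row padding can be checked outside of C++/CLI. The following is a minimal sketch in standard C++, not the code used above: alignedStride and padRows are hypothetical helper names, and the sketch assumes 8-bit pixels and a positive source pitch (FreeType reports the distance between rows in FT_Bitmap::pitch, which is not guaranteed to equal the width).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per row after rounding the row up to a 32-bit (4-byte) boundary.
inline int alignedStride(int width, int bitsPerPixel)
{
    return ((width * bitsPerPixel + 31) & ~31) >> 3;
}

// Copy `height` rows of `width` 8-bit pixels from `src`, whose rows start
// `srcPitch` bytes apart, into a zero-filled buffer whose rows start
// alignedStride(width, 8) bytes apart.
inline std::vector<std::uint8_t> padRows(const std::uint8_t* src,
                                         int width, int height, int srcPitch)
{
    assert(src != nullptr && width > 0 && height > 0 && srcPitch >= width);
    const int stride = alignedStride(width, 8);
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(stride) * height, 0);
    for (int y = 0; y < height; ++y)
    {
        const std::uint8_t* srcRow = src + static_cast<std::size_t>(y) * srcPitch;
        std::copy(srcRow, srcRow + width,
                  dst.begin() + static_cast<std::ptrdiff_t>(y) * stride);
    }
    return dst;
}
```

For a glyph 5 pixels wide, alignedStride(5, 8) is 8, so each row carries 3 bytes of padding. Indexing the source by its pitch rather than by its width keeps the copy correct if the rasteriser ever pads its own rows.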
Pinned the memory so the GC would leave it alone.
// pin memory and create bitmap
GCHandle handle = GCHandle::Alloc(ms->ToArray(), GCHandleType::Pinned);
target = gcnew Bitmap(width, height, stride, PixelFormat::Format8bppIndexed, handle.AddrOfPinnedObject());
ms->Close();
As there is no greyscale variant of Format8bppIndexed, the image was still not correct.
alt text http://www.freeimagehosting.net/uploads/8a883b7dce.png
Then changed the bitmap palette to grey scale 256.
// 256-level greyscale palette
ColorPalette^ palette = target->Palette;
for (int i = 0; i < palette->Entries->Length; ++i)
palette->Entries[i] = Color::FromArgb(i,i,i);
target->Palette = palette;
alt text http://www.freeimagehosting.net/uploads/59a745269e.jpg
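To make the palette step concrete, here is a small sketch in standard C++ of what the loop above stores. The function names are hypothetical, and the packed 0xAARRGGBB layout is chosen to match what Color::ToArgb() returns in .NET.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Opaque grey for one palette index, packed as 0xAARRGGBB.
inline std::uint32_t greyEntry(int index)
{
    assert(index >= 0 && index <= 255);
    const std::uint32_t v = static_cast<std::uint32_t>(index);
    return 0xFF000000u | (v << 16) | (v << 8) | v;
}

// 256-level greyscale palette: entry i maps pixel value i to grey level i.
inline std::array<std::uint32_t, 256> makeGreyPalette()
{
    std::array<std::uint32_t, 256> palette{};
    for (int i = 0; i < 256; ++i)
        palette[static_cast<std::size_t>(i)] = greyEntry(i);
    return palette;
}
```

With this mapping each FreeType coverage byte is used directly as its own grey level, which is why replacing the default palette fixes the colours.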
Final solution.
error = FT_Load_Char(face, ch, FT_LOAD_RENDER);
if (error)
throw gcnew InvalidOperationException("Failed to load and render character");
FT_Bitmap *source = &face->glyph->bitmap;
int width = (face->glyph->metrics.width / 64);
int height = (face->glyph->metrics.height / 64);
int pixelDepth = 8;
int size = width * height;
// stride must be aligned on a 32 bit boundary or 4 bytes
// padding is the number of bytes to add to make each row a 32bit aligned row
int stride = ((width * pixelDepth + 31) & ~31) >> 3;
int padding = stride - (((width * pixelDepth) + 7) / 8);
array<Byte>^ pad = gcnew array<Byte>(padding);
array<Byte>^ buffer = gcnew array<Byte>(size);
Marshal::Copy((IntPtr)source->buffer, buffer, 0, size);
MemoryStream^ ms = gcnew MemoryStream();
for (int i = 0; i < height; ++i)
{
ms->Write(buffer, i * width, width);
ms->Write(pad, 0, padding);
}
// pin memory and create bitmap
GCHandle handle = GCHandle::Alloc(ms->ToArray(), GCHandleType::Pinned);
target = gcnew Bitmap(width, height, stride, PixelFormat::Format8bppIndexed, handle.AddrOfPinnedObject());
ms->Close();
// 256-level greyscale palette
ColorPalette^ palette = target->Palette;
for (int i = 0; i < palette->Entries->Length; ++i)
palette->Entries[i] = Color::FromArgb(i,i,i);
target->Palette = palette;
FT_Done_FreeType(library);

Your "depth" value doesn't match the PixelFormat of the Bitmap. It needs to be 24 to match Format24bppRgb. The pixel format of the Bitmap also needs to match the pixel format and stride of the FT_Bitmap, and I don't see you taking care of that.

DirectX 11 changing the pixel bytes

Followed this guide here
I am tasked with "using map and unmap methods to draw a line across the screen by setting pixel byte data to rgb red values".
I have the sprite and background displaying but have no idea how to get the data.
I also tried doing this:
//Create device
D3D11_TEXTURE2D_DESC desc;
ZeroMemory(&desc, sizeof(D3D11_TEXTURE2D_DESC));
desc.Width = 500;
desc.Height = 300;
desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
desc.Usage = D3D11_USAGE_DYNAMIC;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
desc.MiscFlags = 0;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
m_d3dDevice->CreateTexture2D(&desc, nullptr, &texture);
m_d3dDevice->CreateShaderResourceView(texture, 0, &textureView);
// Render
D3D11_MAPPED_SUBRESOURCE mapped;
m_d3dContext->Map(texture, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
data = (BYTE*)mapped.pData;
rows = (BYTE)sizeof(data);
std::cout << "hi" << std::endl;
m_d3dContext->Unmap(texture, 0);
The problem is that in this case the data array has size 0 but still has a valid pointer. Does this mean I am pointing to a texture that doesn't have any data, or am I misunderstanding something?
Edit:
currently I found
D3D11_SHADER_RESOURCE_VIEW_DESC desc;
m_background->GetDesc(&desc);
desc.Buffer; // buffer
I felt the need to add an answer because, when I searched for how to do this, this question popped up first, and the supplied answer didn't really solve the problem for me and wasn't quite the way I wanted to do it anyway.
In my program I have a method as below.
void ContentLoader::WritePixelsToShaderIndex(uint32_t *data, int width, int height, int index)
{
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = width;
desc.Height = height;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
desc.CPUAccessFlags = 0;
desc.MiscFlags = 0;
D3D11_SUBRESOURCE_DATA initData;
initData.pSysMem = data;
initData.SysMemPitch = width * 4;
initData.SysMemSlicePitch = width * height * 4;
Microsoft::WRL::ComPtr<ID3D11Texture2D> tex;
Engine::device->CreateTexture2D(&desc, &initData, tex.GetAddressOf());
Engine::device->CreateShaderResourceView(tex.Get(), NULL, ContentLoader::GetTextureAddress(index));
}
Then, using the code below, I tested drawing a blue square with a white line, and it works perfectly fine. The issue I had was setting SysMemPitch and SysMemSlicePitch; after looking in the WICTextureLoader class I was able to figure out how the data is stored. It appears that:
SysMemPitch = the size of one row in bytes.
SysMemSlicePitch = the total size of the image's pixel data in bytes.
const int WIDTH = 200;
const int HEIGHT = 200;
const uint32_t RED = 255 | (0 << 8) | (0 << 16) | (255 << 24);
const uint32_t WHITE = 255 | (255 << 8) | (255 << 16) | (255 << 24);
const uint32_t BLUE = 0 | (0 << 8) | (255 << 16) | (255 << 24);
uint32_t *buffer = new uint32_t[WIDTH * HEIGHT];
bool flip = false;
for (int X = 0; X < WIDTH; ++X)
{
for (int Y = 0; Y < HEIGHT; ++Y)
{
int pixel = X + Y * WIDTH;
buffer[pixel] = flip ? BLUE : WHITE;
}
flip = true;
}
WritePixelsToShaderIndex(buffer, WIDTH, HEIGHT, 3);
delete [] buffer;
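The pitch arithmetic and the colour constants can be written as small helpers. This is only a sketch with hypothetical function names, assuming a little-endian CPU and a tightly packed DXGI_FORMAT_R8G8B8A8_UNORM image, where red occupies the lowest byte of each 32-bit pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack one R8G8B8A8 pixel: red in the lowest byte, alpha in the highest.
inline std::uint32_t packRGBA(std::uint8_t r, std::uint8_t g,
                              std::uint8_t b, std::uint8_t a)
{
    return static_cast<std::uint32_t>(r)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(b) << 16)
         | (static_cast<std::uint32_t>(a) << 24);
}

// Value for SysMemPitch: bytes from the start of one row to the next.
inline std::size_t rowPitchBytes(std::size_t width, std::size_t bytesPerPixel)
{
    assert(width > 0 && bytesPerPixel > 0);
    return width * bytesPerPixel;
}

// Value for SysMemSlicePitch: bytes occupied by one whole 2D image.
inline std::size_t slicePitchBytes(std::size_t width, std::size_t height,
                                   std::size_t bytesPerPixel)
{
    return rowPitchBytes(width, bytesPerPixel) * height;
}
```

For the 200 x 200 test image this gives a row pitch of 800 bytes and a slice pitch of 160000 bytes. As far as I know, Direct3D 11 only consults the slice pitch for 3D textures, so for a 2D texture the row pitch is the value that has to be right.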
First of all, most of those functions return HRESULT values that you are ignoring. That's not safe as you will miss important errors that invalidate the remaining code. You can use if(FAILED(...)) if you want, or you can use ThrowIfFailed, but you can't just ignore the return value in a functioning app.
HRESULT hr = m_d3dDevice->CreateTexture2D(&desc, nullptr, &texture);
if (FAILED(hr))
{
// error! report it and stop here
}
hr = m_d3dDevice->CreateShaderResourceView(texture, 0, &textureView);
if (FAILED(hr))
{
// error! report it and stop here
}
// Render
D3D11_MAPPED_SUBRESOURCE mapped;
hr = m_d3dContext->Map(texture, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
if (FAILED(hr))
{
// error! report it and stop here
}
Second, you should enable the Debug Device and look for diagnostic output which will likely point you to the reason for the failure.
sizeof(data) is always going to be 4 or 8 since data is a BYTE* i.e. the size of a pointer. It has nothing to do with the size of your data array. The locked buffer pointed to by mapped.pData is going to be mapped.RowPitch * desc.Height bytes in size.
You have to copy your pixel data into it row-by-row. Depending on the format and other factors, mapped.RowPitch is not necessarily going to be 4 * desc.Width (the 4 bytes per pixel come from the DXGI_FORMAT_B8G8R8A8_UNORM format you are using). It will be at least that big, but it can be bigger because each row may be padded for alignment.
This is pseudo-code and not necessarily an efficient way to do it, but:
for(UINT y = 0; y < desc.Height; ++y )
{
for(UINT x = 0; x < desc.Width; ++x )
{
// Find the byte offset of the pixel at (x,y)
size_t pixel = size_t(y) * mapped.RowPitch + size_t(x) * 4;
BYTE* blue = &data[pixel];
BYTE* green = &data[pixel + 1];
BYTE* red = &data[pixel + 2];
BYTE* alpha = &data[pixel + 3];
*blue = /* value between 0 and 255 */;
*green = /* value between 0 and 255 */;
*red = /* value between 0 and 255 */;
*alpha = /* value between 0 and 255 */;
}
}
You should take a look at DirectXTex which does a lot of this kind of row-by-row processing.
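For the task in the question (a red line across the screen), the row addressing can be tried on an ordinary byte buffer first. This is a sketch, not Direct3D code: fillRowBGRA is a hypothetical helper, and its data and rowPitch parameters stand in for mapped.pData and mapped.RowPitch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fill row `y` of a B8G8R8A8 image with a single colour.
// Only the first width * 4 bytes of the row are written; any padding
// between that point and rowPitch is left untouched.
inline void fillRowBGRA(std::uint8_t* data, std::size_t rowPitch,
                        unsigned width, unsigned height, unsigned y,
                        std::uint8_t b, std::uint8_t g,
                        std::uint8_t r, std::uint8_t a)
{
    assert(data != nullptr);
    assert(y < height);
    assert(rowPitch >= static_cast<std::size_t>(width) * 4);
    std::uint8_t* row = data + static_cast<std::size_t>(y) * rowPitch;
    for (unsigned x = 0; x < width; ++x)
    {
        std::uint8_t* px = row + static_cast<std::size_t>(x) * 4;
        px[0] = b;
        px[1] = g;
        px[2] = r;
        px[3] = a;
    }
}
```

Between Map and Unmap, a call along the lines of fillRowBGRA(data, mapped.RowPitch, desc.Width, desc.Height, 150, 0, 0, 255, 255) would write an opaque red line on row 150. With D3D11_MAP_WRITE_DISCARD the previous contents are undefined, so the rest of the texture has to be written as well.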

Why are these shapes the wrong color?

So I'm writing a Processing sketch to test a randomized terrain generator for a Scorched Earth clone I'm working on. It seems to work as intended, but with one minor problem. In the code I generate 800 one-pixel-wide rectangles and set the fill to brown beforehand. The combination of the rectangles should be a solid mass with a brown, dirt-like color (77,0,0).
However, the combination shows up as black regardless of the rgb fill value set. I think it might have something to do with each rectangle's border being black? Does anyone know what is happening here offhand?
final int w = 800;
final int h = 480;
void setup() {
size(w, h);
fill(0,128,255);
rect(0,0,w,h);
int t[] = terrain(w,h);
fill(77,0,0);
for(int i=0; i < w; i++){
rect(i, h, 1, -1*t[i]);
}
}
void draw() {
}
int[] terrain(int w, int h){
width = w;
height = h;
//min and max bracket the freq's of the sin/cos series
//The higher the max the hillier the environment
int min = 1, max = 6;
//allocating horizon for screen width
int[] horizon = new int[width];
double[] skyline = new double[width];
//ratio of amplitude of screen height to landscape variation
double r = (int) 2.0/5.0;
//number of terms to be used in sine/cosine series
int n = 4;
int[] f = new int[n*2];
//calculating omegas for sine series
for(int i = 0; i < n*2 ; i ++){
f[i] = (int) random(max - min + 1) + min;
}
//amp is the amplitude of the series
int amp = (int) (r*height);
for(int i = 0 ; i < width; i ++){
skyline[i] = 0;
for(int j = 0; j < n; j++){
skyline[i] += ( sin( (f[j]*PI*i/height) ) + cos(f[j+n]*PI*i/height) );
}
skyline[i] *= amp/(n*2);
skyline[i] += (height/2);
skyline[i] = (int)skyline[i];
horizon[i] = (int)skyline[i];
}
return horizon;
}
"I think it might have something to do with each rectangle's border being black?"
I believe this is the case. In your setup() function, I added a call to noStroke() before the rectangles are drawn. This removes the black outline from the rectangles. Since each rectangle is only 1 pixel wide, the black stroke (which is on by default) covers it completely, so every rectangle comes out black no matter what fill color you choose.
Here is an updated setup() function - I now see a reddish brown terrain:
void setup() {
size(w, h);
fill(0, 128, 255);
rect(0, 0, w, h);
int t[] = terrain(w, h);
fill(77, 0, 0);
noStroke(); // here
for (int i=0; i < w; i++) {
rect(i, h, 1, -1*t[i]);
}
}

A simple Vertex Buffer Object (C++) that doesn't render

I'm trying to use VBOs to render just a normal 2D textured square onto an FBO. Immediate mode functions work flawlessly, but this VBO does not. GL_TEXTURE_2D is already enabled for the code. What is wrong with it?
unsigned int VBOid = 0;
unsigned int Iid = 0;
float *geometry;
unsigned int *indices;
int num_geometry = 1;
int num_vertices = 4;
int num_indices = num_geometry*num_vertices;
geometry = new float[num_geometry*num_vertices*4];
indices = new unsigned int[num_indices];
indices[0] = 0;
indices[1] = 1;
indices[2] = 2;
indices[3] = 3;
/* Fill geometry: 0, 1, = vertex_xy
* 2, 3 = tex_coord_uv
*/
geometry[0] = 0.0f;
geometry[1] = 0.0f;
geometry[2] = 0.0f;
geometry[3] = 0.0f;
geometry[4] = 50.0f;
geometry[5] = 0.0f;
geometry[6] = 1.0f;
geometry[7] = 0.0f;
geometry[8] = 50.0f;
geometry[9] = 50.0f;
geometry[10] = 1.0f;
geometry[11] = 1.0f;
geometry[12] = 0.0f;
geometry[13] = 50.0f;
geometry[14] = 0.0f;
geometry[15] = 1.0f;
glGenBuffers(1, &VBOid);
glBindBuffer(GL_ARRAY_BUFFER, VBOid);
glBufferData(GL_ARRAY_BUFFER, sizeof(geometry), geometry, GL_STATIC_DRAW);
glGenBuffers(1, &Iid);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, Iid);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);
//GL_TEXTURE_2D is already enabled here
//Buffers are already bound from above
glBindTexture( GL_TEXTURE_2D, 2); //I used 2 just to test to see if it is rendering a texture correctly. Yes, 2 does exist in my program thats why i arbitrarily used it
//glClientActiveTexture(GL_TEXTURE0); I dont know what this is for and where to put it
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
//glActiveTexture(GL_TEXTURE0); same here I dont know what this is for or where to put it
glVertexPointer(2, GL_FLOAT, sizeof(GLfloat)*4, 0);
glTexCoordPointer(2, GL_FLOAT, sizeof(GLfloat)*4, (float*)(sizeof(GLfloat)*2));
glDrawElements(GL_QUADS, num_indices, GL_UNSIGNED_INT, indices);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
The problem is your usage of sizeof(geometry) (and the same for indices) inside the glBufferData calls. Those variables are actually just pointers, no matter if they point to dynamically allocated arrays (which the compiler doesn't know). So you will always get the size of a pointer (4 or 8 bytes, depending on platform).
Replace sizeof(geometry) with num_geometry*num_vertices*4*sizeof(float) and sizeof(indices) with num_indices*sizeof(unsigned int). Well, in fact you don't need any indices here at all and can just draw the whole thing with a simple
glDrawArrays(GL_QUADS, 0, 4);
Always be aware of the differences between an actual (compile-time sized) array and a mere pointer to a dynamically allocated array. The result of the sizeof operator is one of those differences; the requirement to free the memory of the latter with delete[] at some later point is another, no less important, one.
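The difference is easy to see in isolation. The sketch below uses hypothetical helper names; it computes the byte count that a call like glBufferData needs, either from an explicit element count or from a std::vector, which carries its own size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte count for `count` elements behind a raw pointer. The count has to be
// passed in, because sizeof(pointer) only yields the size of the pointer.
template <typename T>
std::size_t byteSize(const T* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    return count * sizeof(T);
}

// Byte count for the elements held by a vector.
template <typename T>
std::size_t byteSize(const std::vector<T>& v)
{
    return v.size() * sizeof(T);
}

// Size of the pointer itself, which is what sizeof(geometry) evaluated to.
inline std::size_t pointerSize(const float* p)
{
    return sizeof(p);
}
```

With a std::vector<float> named geometry (an assumption, the question uses new[]), the upload would read glBufferData(GL_ARRAY_BUFFER, byteSize(geometry), geometry.data(), GL_STATIC_DRAW), and no delete[] is needed afterwards.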

How to scale font when the font-face is having only one font-size using MFC's DrawText

void CMainWindow::OnPaint()
{
CPaintDC DC(this);
//CRect rc(5, 5, 191, 99);
CRect rc(1, 1, 38, 19);
CBrush brush(DC.GetBkColor());
CBrush* pOldBrush = DC.SelectObject(&brush);
DC.FillRect(&rc, &brush);
DC.SelectObject(pOldBrush);
DC.SetBkMode(TRANSPARENT);
LOGFONT LogFont;
LogFont.lfHeight = -13;
LogFont.lfWidth = 0;
LogFont.lfEscapement = 0;
LogFont.lfOrientation = 0;
LogFont.lfWeight = 400;
LogFont.lfItalic = 0;
LogFont.lfUnderline = 0;
LogFont.lfStrikeOut = 0;
LogFont.lfCharSet = 0;
LogFont.lfOutPrecision = 0;
LogFont.lfClipPrecision = 0;
LogFont.lfQuality = 0;
LogFont.lfPitchAndFamily = 0;
wcscpy_s(LogFont.lfFaceName, _T("System"));
//float OffSetY = 1.0;
//float OffSetX = 1.0;
float OffSetY = 0.2;
float OffSetX = 0.2;
LogFont.lfHeight = (int)(LogFont.lfHeight * OffSetY);
LogFont.lfWidth = (int)(LogFont.lfWidth * OffSetX);
CFont* pFont = new CFont;
pFont->CreateFontIndirect(&LogFont);
CFont* pOldFont = DC.SelectObject( pFont );
CString sTemp(_T("Title current_folder\r\nField1\r\nComment:\r\nControl #:\r\nDescription:\r\nMagnification:\r\n"));
sTemp.Replace(_T("&"), _T("&&"));
int alignment = 0;
switch(alignment)
{
case 1:
DC.DrawText(sTemp, -1, rc, DT_WORDBREAK | DT_RIGHT | DT_EDITCONTROL);
break;
case 2:
DC.DrawText(sTemp, -1, rc, DT_WORDBREAK | DT_CENTER | DT_EDITCONTROL);
break;
default:
DC.DrawText(sTemp, -1, rc, DT_WORDBREAK | DT_EDITCONTROL);
break;
}
DC.SelectObject( pOldFont );
delete pFont;
}
When using the System or FixedSys font (which each have only one size, 10 and 9 respectively), the text is drawn perfectly when OffSetX and OffSetY are 1 and the rectangle is rc(5, 5, 191, 99). But if I change OffSetX and OffSetY to 0.2 and use rc(1, 1, 38, 19), the text is truncated at the bottom right. This happens only with the fonts mentioned, which have just one size; with other fonts the text is scaled properly.
Since the font has only one size, DrawText uses that size in all cases, and the given rect is too small to hold the text, so only a few characters are shown.
Is there any way to fix this so that the text is scaled under these zoom conditions? This is the behavior I get in one of my MFC projects when I zoom in the scenario described above.
Any suggestions or alternatives would be very helpful and appreciated.
Thanks.
Don't hard-code your font size like that! Instead, perform necessary calculations:
const int SIZE_IN_POINTS = 12;
LogFont.lfHeight = -MulDiv(SIZE_IN_POINTS, DC.GetDeviceCaps(LOGPIXELSY), 72);
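The arithmetic behind that line can be checked without the Win32 headers. This is a sketch under stated assumptions: mulDivRounded is a hypothetical portable stand-in for MulDiv that only handles non-negative inputs, and dpiY plays the role of the value returned by GetDeviceCaps(LOGPIXELSY).

```cpp
#include <cassert>

// Compute a * b / c with a 64-bit intermediate and round to nearest,
// for non-negative a and b and positive c.
inline int mulDivRounded(int a, int b, int c)
{
    assert(a >= 0 && b >= 0 && c > 0);
    const long long product = static_cast<long long>(a) * b;
    return static_cast<int>((product + c / 2) / c);
}

// Convert a size in points (1/72 inch) to a negative lfHeight, which asks
// the font mapper to match the character height rather than the cell height.
inline int pointsToLfHeight(int points, int dpiY)
{
    return -mulDivRounded(points, dpiY, 72);
}
```

At 96 DPI, 12 points gives -16 and 10 points gives -13, the value hard-coded in the question. A zoom factor could be applied to the point size or to the DPI before converting, but a font that ships in a single size will still be drawn at that size.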

OpenCL image2d_t writing mostly zeros

I am trying to use OpenCL and image2d_t objects to speed up image convolution. When I noticed that the output was a blank image of all zeros, I simplified the OpenCL kernel to a basic read from the input and write to the output (shown below). With a little bit of tweaking, I got it to write a few scattered pixels of the image into the output image.
I have verified that the image is intact up until the call to read_imageui() in the OpenCL kernel. I wrote the image to GPU memory with CommandQueue::enqueueWriteImage() and immediately read it back into a brand new buffer in CPU memory with CommandQueue::enqueueReadImage(). The result of this call matched the original input image. However, when I retrieve the pixels with read_imageui() in the kernel, the vast majority of the pixels are set to 0.
C++ source:
int height = 112;
int width = 9216;
unsigned int numPixels = height * width;
unsigned int numInputBytes = numPixels * sizeof(uint16_t);
unsigned int numDuplicatedInputBytes = numInputBytes * 4;
unsigned int numOutputBytes = numPixels * sizeof(int32_t);
cl::size_t<3> origin;
origin.push_back(0);
origin.push_back(0);
origin.push_back(0);
cl::size_t<3> region;
region.push_back(width);
region.push_back(height);
region.push_back(1);
std::ifstream imageFile("hri_vis_scan.dat", std::ifstream::binary);
checkErr(imageFile.is_open() ? CL_SUCCESS : -1, "hri_vis_scan.dat");
uint16_t *image = new uint16_t[numPixels];
imageFile.read((char *) image, numInputBytes);
imageFile.close();
// duplicate our single channel image into all 4 channels for Image2D
cl_ushort4 *imageDuplicated = new cl_ushort4[numPixels];
for (int i = 0; i < numPixels; i++)
for (int j = 0; j < 4; j++)
imageDuplicated[i].s[j] = image[i];
cl::Buffer imageBufferOut(context, CL_MEM_WRITE_ONLY, numOutputBytes, NULL, &err);
checkErr(err, "Buffer::Buffer()");
cl::ImageFormat inFormat;
inFormat.image_channel_data_type = CL_UNSIGNED_INT16;
inFormat.image_channel_order = CL_RGBA;
cl::Image2D bufferIn(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, inFormat, width, height, 0, imageDuplicated, &err);
checkErr(err, "Image2D::Image2D()");
cl::ImageFormat outFormat;
outFormat.image_channel_data_type = CL_UNSIGNED_INT16;
outFormat.image_channel_order = CL_RGBA;
cl::Image2D bufferOut(context, CL_MEM_WRITE_ONLY, outFormat, width, height, 0, NULL, &err);
checkErr(err, "Image2D::Image2D()");
int32_t *imageResult = new int32_t[numPixels];
memset(imageResult, 0, numOutputBytes);
cl_int4 *imageResultDuplicated = new cl_int4[numPixels];
for (int i = 0; i < numPixels; i++)
for (int j = 0; j < 4; j++)
imageResultDuplicated[i].s[j] = 0;
std::ifstream kernelFile("convolutionKernel.cl");
checkErr(kernelFile.is_open() ? CL_SUCCESS : -1, "convolutionKernel.cl");
std::string imageProg(std::istreambuf_iterator<char>(kernelFile), (std::istreambuf_iterator<char>()));
cl::Program::Sources imageSource(1, std::make_pair(imageProg.c_str(), imageProg.length() + 1));
cl::Program imageProgram(context, imageSource);
err = imageProgram.build(devices, "");
checkErr(err, "Program::build()");
cl::Kernel basic(imageProgram, "basic", &err);
checkErr(err, "Kernel::Kernel()");
basic.setArg(0, bufferIn);
basic.setArg(1, bufferOut);
basic.setArg(2, imageBufferOut);
queue.finish();
cl_ushort4 *imageDuplicatedTest = new cl_ushort4[numPixels];
for (int i = 0; i < numPixels; i++)
{
imageDuplicatedTest[i].s[0] = 0;
imageDuplicatedTest[i].s[1] = 0;
imageDuplicatedTest[i].s[2] = 0;
imageDuplicatedTest[i].s[3] = 0;
}
double gpuTimer = clock();
err = queue.enqueueReadImage(bufferIn, CL_FALSE, origin, region, 0, 0, imageDuplicatedTest, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadImage()");
// Output from above matches input image
err = queue.enqueueNDRangeKernel(basic, cl::NullRange, cl::NDRange(height, width), cl::NDRange(1, 1), NULL, NULL);
checkErr(err, "CommandQueue::enqueueNDRangeKernel()");
queue.flush();
err = queue.enqueueReadImage(bufferOut, CL_TRUE, origin, region, 0, 0, imageResultDuplicated, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadImage()");
queue.flush();
err = queue.enqueueReadBuffer(imageBufferOut, CL_TRUE, 0, numOutputBytes, imageResult, NULL, NULL);
checkErr(err, "CommandQueue::enqueueReadBuffer()");
queue.finish();
OpenCL kernel:
__kernel void basic(__read_only image2d_t input, __write_only image2d_t output, __global int *result)
{
const sampler_t smp = CLK_NORMALIZED_COORDS_TRUE | //Natural coordinates
CLK_ADDRESS_NONE | //Clamp to zeros
CLK_FILTER_NEAREST; //Don't interpolate
int2 coord = (get_global_id(1), get_global_id(0));
uint4 pixel = read_imageui(input, smp, coord);
result[coord.s0 + coord.s1 * 9216] = pixel.s0;
write_imageui(output, coord, pixel);
}
The coordinates in the kernel are currently mapped to (x, y) = (width, height).
The input image is a single channel greyscale image with 16 bits per pixel, which is why I had to duplicate the channels to fit into OpenCL's Image2D. The output after convolution will be 32 bits per pixel, which is why numOutputBytes is set to that. Also, although the width and height appear weird, the input image's dimensions are 9216x7824, so I'm only taking a portion of it to test the code first, so it doesn't take forever.
I added in a write to global memory after reading from the image in the kernel to see if the issue was reading the image or writing the image. After the kernel executes, this section of global memory also contains mostly zeros.
Any help would be greatly appreciated!
The documentation for read_imageui states that
Furthermore, the read_imagei and read_imageui calls that take integer coordinates must use a sampler with normalized coordinates set to CLK_NORMALIZED_COORDS_FALSE and addressing mode set to CLK_ADDRESS_CLAMP_TO_EDGE, CLK_ADDRESS_CLAMP or CLK_ADDRESS_NONE; otherwise the values returned are undefined.
But you're creating a sampler with CLK_NORMALIZED_COORDS_TRUE (but seem to be passing in non-normalized coords :S ?).
