There's a pretty clear problem right off the bat: this format isn't particularly friendly to linear textures. If you simply convert sRGB values into linear space and store the result as YCoCg, you will see severe banding, owing largely to the loss of precision at low values. Gamma space devotes a lot of precision to low intensity values, where the human visual system is most sensitive.

sRGB texture modes exist as a cheap way to convert from gamma space to linear: the GPU can use a look-up table to fetch the linear values. A YCoCg texture, however, can't be treated as an sRGB texture, and doing the sRGB decode in the shader is fairly slow, since it involves a divide, a power raise, and a conditional.

This can be resolved by first converting from the 2.2-ish sRGB gamma ramp to a 2.0 gamma ramp, which preserves most of the original precision: the 255 input values map to about 240 distinct output values, low-intensity values keep most of their precision, and the result can be linearized by simply squaring it in the shader.

Another concern, which doesn't really matter if you're aiming for speed and doing things in real time, but does if you're considering this technique for offline processing, is the limited scale factor. DXT5 provides enough resolution for 32 possible scale factor values, so there's no reason to limit yourself to 1, 2, or 4 if you don't have to: using the full range gives you more color resolution to work with.

Here's some sample code:

```c
#include <math.h> // for pow, sqrt, floor, ceil

// Converts an sRGB byte to a linear value, then re-encodes it with a
// 2.0 gamma ramp so the shader can linearize it with a single multiply.
unsigned char Linearize(unsigned char inByte)
{
    float srgbVal = ((float)inByte) / 255.0f;
    float linearVal;

    if(srgbVal < 0.04045f)
        linearVal = srgbVal / 12.92f;
    else
        linearVal = pow( (srgbVal + 0.055f) / 1.055f, 2.4f );

    return (unsigned char)(floor(sqrt(linearVal) * 255.0 + 0.5));
}
```

```c
void ConvertBlockToYCoCg(const unsigned char inPixels[16*3], unsigned char outPixels[16*4])
{
    unsigned char linearizedPixels[16*3]; // Convert to linear values
    for(int i=0;i<16*3;i++)
        linearizedPixels[i] = Linearize(inPixels[i]);

    // Calculate Co and Cg extents
    int extents = 0;
    int iY, iCo, iCg;
    int blockCo[16];
    int blockCg[16];
    const unsigned char *px = linearizedPixels;
    for(int i=0;i<16;i++)
    {
        iCo = (px[0]<<1) - (px[2]<<1);
        iCg = (px[1]<<1) - px[0] - px[2];
        if(-iCo > extents) extents = -iCo;
        if( iCo > extents) extents = iCo;
        if(-iCg > extents) extents = -iCg;
        if( iCg > extents) extents = iCg;
        blockCo[i] = iCo;
        blockCg[i] = iCg;
        px += 3;
    }

    // Co = -510..510
    // Cg = -510..510
    float scaleFactor = 1.0f;
    if(extents > 127)
        scaleFactor = (float)extents * 4.0f / 510.0f;

    // Convert to quantized scale factor (5 bits, 32 possible values)
    unsigned char scaleFactorQuantized = (unsigned char)(ceil((scaleFactor - 1.0f) * 31.0f / 3.0f));

    // Unquantize
    scaleFactor = 1.0f + ((float)scaleFactorQuantized / 31.0f) * 3.0f;

    // Expand the 5-bit value to 8 bits for storage in the blue channel
    unsigned char bVal = (unsigned char)((scaleFactorQuantized << 3) | (scaleFactorQuantized >> 2));

    unsigned char *outPx = outPixels;
    px = linearizedPixels;
    for(int i=0;i<16;i++)
    {
        // Calculate components
        iY = ( px[0] + (px[1]<<1) + px[2] + 2 ) / 4;
        iCo = (int)((blockCo[i] / scaleFactor) + 128);
        iCg = (int)((blockCg[i] / scaleFactor) + 128);

        if(iCo < 0) iCo = 0; else if(iCo > 255) iCo = 255;
        if(iCg < 0) iCg = 0; else if(iCg > 255) iCg = 255;
        if(iY < 0) iY = 0; else if(iY > 255) iY = 255;

        px += 3;

        outPx[0] = (unsigned char)iCo;
        outPx[1] = (unsigned char)iCg;
        outPx[2] = bVal;
        outPx[3] = (unsigned char)iY;
        outPx += 4;
    }
}
```

... and to decode it in the shader:

```hlsl
float3 DecodeYCoCg(float4 inColor)
{
    // Y is stored in alpha, Co in red, Cg in green; recenter the chroma terms
    float3 base = inColor.arg + float3(0, -0.5, -0.5);

    // Reconstruct the scale factor from the blue channel
    float scale = (inColor.b * 0.75 + 0.25);

    float4 multipliers = float4(1.0, 0.0, scale, -scale);

    float3 result;
    result.r = dot(base, multipliers.xzw); // Y + scale*Co - scale*Cg
    result.g = dot(base, multipliers.xyz); // Y + scale*Cg
    result.b = dot(base, multipliers.xww); // Y - scale*Co - scale*Cg

    // Convert from 2.0 gamma to linear
    return result * result;
}
```
