Friday, March 30, 2018

BC7 endpoint search using endpoint extrapolation

Convection Texture Tools is now roughly equal quality-wise with NVTT at compressing BC7 textures despite being about 140 times faster, making it one of the fastest and highest-quality BC7 compressors.

How this was accomplished turned out to be simpler than expected.  Recall that Squish became the gold standard of S3TC compressors by implementing a "cluster fit" algorithm that ordered all of the input colors on a line and tried every possible grouping of them to least-squares fit them.

Unfortunately, this technique isn't practical in BC7 because the number of orderings scales extremely badly with index size.  While 2-bit indices have a few hundred possible orderings, 4-bit indices have millions.  Most BC7 mode indices are 3 bits, and some are 4.

With that option gone, most BC7 compressors until now have tried to solve endpoints using various types of endpoint perturbation, which tends to require a lot of iterations.

Convection just uses 2 rounds of K-means clustering and a much simpler technique based on a guess about why Squish's cluster fit algorithm is actually useful: It can create endpoint mappings that don't use some of the terminal ends of the endpoint line, causing the endpoint to be extrapolated out, possibly to a point that loses less accuracy to quantization.

Convection just tries cutting off 1 index at each end, then 1 index at both ends.  That turned out to be enough to place it near the top of the quality benchmarks.
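As a rough one-dimensional illustration of what "cutting off" an index means (this is a sketch, not Convection's actual code; `Endpoints` and `ExtrapolateEndpoints` are hypothetical names), assume the smallest and largest input values should land on the first and last indices that remain in use. The endpoints then get pushed outward past the data:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not taken from Convection: the two ends of a 1D
// interpolation line with numIndexes evenly spaced steps.
struct Endpoints
{
    float low;
    float high;
};

// Places the endpoints so that minValue lands on index skipLow and maxValue
// lands on index (numIndexes - 1 - skipHigh). Nonzero skips extrapolate the
// endpoints out past the data range.
Endpoints ExtrapolateEndpoints(float minValue, float maxValue, int numIndexes, int skipLow, int skipHigh)
{
    int usedSteps = (numIndexes - 1) - skipLow - skipHigh;
    assert(usedSteps > 0);

    float step = (maxValue - minValue) / static_cast<float>(usedSteps);

    Endpoints ep;
    ep.low = minValue - step * static_cast<float>(skipLow);
    ep.high = maxValue + step * static_cast<float>(skipHigh);
    return ep;
}
```

The candidates described above would correspond to skips of (1,0), (0,1), and (1,1), alongside the untouched (0,0) fit. A real encoder would still have to clamp and quantize each candidate and keep whichever one produces the lowest error.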

Now I just need to add color weighting and alpha weighting and it'll be time to move on to other formats.

Friday, December 2, 2011

Spherical harmonics self-shadowing

Valve's self-shadowing radiosity normal maps concept can be used with spherical harmonics in approximately the same way: Integrate over a sphere based on how much light will affect a sample if incoming from numerous sample directions, accounting for collision with other samples due to elevation.

You can store this as three DXT1 textures, though you can improve quality by packing channels with similar spatial coherence. Coefficients 0, 2, and 6 in particular tend to pack well, since they're all dominated primarily by directions aimed perpendicular to the texture.

I use the following packing:
Texture 1: Coefs 0, 2, 6
Texture 2: Coefs 1, 4, 5
Texture 3: Coefs 3, 7, 8

You can reference an earlier post on this blog for code on how to rotate an SH vector by a matrix, in turn allowing you to get it into texture space. Once you've done that, simply multiply each SH coefficient from the self-shadowing map by the matching SH coefficient created from your light source (also covered in that earlier post) and add the products together.
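The multiply-and-add step is just a 9-element dot product. A minimal sketch, assuming both sets of coefficients are already in the same space (`SHVector` and `ShadowedLight` are hypothetical names, not from the earlier posts):

```cpp
#include <array>
#include <cstddef>

// Nine coefficients, matching the coefficient numbering 0..8 used above.
using SHVector = std::array<float, 9>;

// Hypothetical helper: sums the per-coefficient products of the visibility
// coefficients (sampled from the self-shadowing textures) and the light
// coefficients. Both are assumed to be in texture space.
float ShadowedLight(const SHVector &visibility, const SHVector &light)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < visibility.size(); ++i)
        sum += visibility[i] * light[i];
    return sum;
}
```

In a shader this would typically be done per color channel, with the nine visibility values unpacked from the three textures.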

Tuesday, October 18, 2011

Introducing RDX

Has it really been a year since the last update?

Well, things have been chugging along with less discovery and more actual work. However, development on TDP is largely on hold due to the likely impending release of the Doom 3 source code, which has numerous architectural improvements like rigid-body physics and much better customization of entity networking.

In the meantime, however, a component of TDP has been spun off into its own project: The RDX extension language. Initially planned as a resource manager, it has evolved into a full-fledged programmability API. The main goal was to have a runtime with very straightforward integration, to the point that you can easily use it for managing your C++ resources, but also to be much higher performance than dynamically-typed interpreted languages, especially when dealing with complex data types such as float vectors.

Features are still being implemented, but the compiler seems to be stable and load-time conversion to native x86 code is functional. Expect a real release in a month or two.

The project now has a home on Google Code.

Thursday, October 7, 2010

YCoCg DXT5 - Stripped down and simplified

You'll recall some improvements I proposed to the YCoCg DXT5 algorithm a while back.

I recently realized something else about it: As a YUV-style color space, the Co and Cg channels are constrained to a range that's directly proportional to the Y channel. The scalar blue channel was mainly introduced to deal with resolution issues that caused banding artifacts on colored objects changing value, but that entire issue can be sidestepped by simply using the Y channel as a multiplier for the Co and Cg channels, causing them to only respect tone and saturation while the Y channel becomes fully responsible for intensity.

This is not a quality improvement; in fact, the measured error nearly doubles in testing. However, it does result in considerable simplification of the algorithm, both on the encode and decode sides, and the perceptual loss compared to the old algorithm is minimal.

The encoder reduces to:

int iY = px[0] + 2*px[1] + px[2]; // 0..1020
int iCo, iCg;

if (iY == 0)
{
    // Black pixel: avoid dividing by zero
    iCo = 0;
    iCg = 0;
}
else
{
    iCo = (px[0] + px[1]) * 255 / iY;
    iCg = (px[1] * 2) * 255 / iY;
}

px[0] = (unsigned char)iCo;
px[1] = (unsigned char)iCg;
px[2] = 0;
px[3] = (unsigned char)((iY + 2) / 4);

... And to decode:

float3 DecodeYCoCgRel(float4 inColor)
{
    return (float3(4.0, 0.0, -4.0) * inColor.r
        + float3(-2.0, 2.0, -2.0) * inColor.g
        + float3(0.0, 0.0, 4.0)) * inColor.a;
}

While this does the job with much less perceptual loss than DXT1, and eliminates banding artifacts almost entirely, it is not quite as precise as the old algorithm, so using that is recommended if you need the quality.
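To sanity-check that the encode and decode steps are inverses of each other, here is a small CPU-side port of both. It is only a sketch: the separate input/output arrays and the `EncodeYCoCgRel` name are my own choices for illustration, and it ignores the DXT5 compression step entirely, so it only shows the 8-bit quantization loss.

```cpp
// CPU port of the encoder above, writing to a separate output pixel.
void EncodeYCoCgRel(const unsigned char rgb[3], unsigned char out[4])
{
    int iY = rgb[0] + 2 * rgb[1] + rgb[2]; // 0..1020
    int iCo = 0;
    int iCg = 0;

    if (iY != 0)
    {
        iCo = (rgb[0] + rgb[1]) * 255 / iY;
        iCg = (rgb[1] * 2) * 255 / iY;
    }

    out[0] = (unsigned char)iCo;
    out[1] = (unsigned char)iCg;
    out[2] = 0;
    out[3] = (unsigned char)((iY + 2) / 4);
}

// CPU port of the decode shader. Channels are normalized to 0..1 first,
// the way a texture fetch would deliver them.
void DecodeYCoCgRel(const unsigned char in[4], float outRgb[3])
{
    float r = in[0] / 255.0f;
    float g = in[1] / 255.0f;
    float a = in[3] / 255.0f;

    outRgb[0] = (4.0f * r - 2.0f * g) * a;
    outRgb[1] = (2.0f * g) * a;
    outRgb[2] = (-4.0f * r - 2.0f * g + 4.0f) * a;
}
```

Feeding a pixel through both functions and scaling the result back up by 255 should land within a couple of levels of the original values.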

Friday, June 4, 2010

... and they're still compressible

As a corollary to the last entry, an orthogonal tangent basis is commonly compressed by storing the normal and one of the texture axis vectors, along with a "handedness" multiplier which is either -1 or 1. The second texture axis is regenerated by taking the cross product of the normal and the stored axis, and multiplying it by the handedness.

The method I proposed was faulted for breaking this scheme, but there's no break at all. Since the two texture axes are on the triangle plane, and the normal is perpendicular, you can use the same compression scheme by simply storing the two texture axis vectors, and regenerating the normal by taking the cross product of them, multiplying it by a handedness multiplier, and normalizing it.
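A minimal sketch of that reconstruction, using a throwaway `Vec3` type rather than any particular math library (`RegenerateNormal` is a hypothetical name):

```cpp
#include <cmath>

struct Vec3
{
    float x, y, z;
};

Vec3 Cross(const Vec3 &a, const Vec3 &b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rebuilds the normal from the two stored texture axis vectors.
// handedness is expected to be +1 or -1. The axes do not need to be
// perpendicular to each other, only non-parallel.
Vec3 RegenerateNormal(const Vec3 &sAxis, const Vec3 &tAxis, float handedness)
{
    Vec3 n = Cross(sAxis, tAxis);
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);

    // Degenerate (parallel) axes produce a zero vector instead of NaNs
    float scale = (len > 0.0f) ? (handedness / len) : 0.0f;
    return { n.x * scale, n.y * scale, n.z * scale };
}
```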

This does not address mirroring concerns if you use my "snap-to-normal" recommendation, though you could detect those cases in a vertex shader by using special handedness values.

Thursday, April 22, 2010

Tangent-space basis vectors: Don't assume your texture projection is orthogonal

How do you generate the tangent vectors, which represent which way the texture axes on a textured triangle are facing?

Hitting up Google tends to produce articles like this one, or maybe even that exact one. I've seen others linked too, and the basic formulae tend to be the same. Have you looked at what you're pasting into your code, though? Have you noticed that you're using the T coordinates to calculate the S vector, and vice versa? Well, you can look at the underlying math, and you'll find that it's because that's what happens when you assume the normal, S vector, and T vector form an orthonormal matrix and attempt to invert it. In a sense, you're not really using the S and T vectors, but rather vectors perpendicular to them.

But that's fine, right? I mean, this is an orthogonal matrix, and they are perpendicular to each other, right? Well, does your texture project on to the triangle with the texture axes at right angles to each other, like a grid?

... Not always? Well, you might have a problem then!

So, what's the real answer?

Well, what do we know? First, translating the vertex positions will not affect the axial directions. Second, scrolling the texture will not affect the axial directions.

So, for triangle (A,B,C), with coordinates (x,y,z,t), we can create a new triangle (LA,LB,LC) = (0, B-A, C-A) by subtracting A from every vertex, and the directions will be the same.

We also know that both axis directions are on the same plane as the points, so to resolve that, we can convert this into a local coordinate system and force one axis to zero.

Now we need triangle (Origin, PLB, PLC) in this local coordinate space. We know PLB[y] is zero since LB was used as the X axis.

Now we can solve this. Remember that PLB[y] is zero, so...
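Written out, and reconstructed from the source code later in this post rather than from the original figures, the system for one texture axis looks like this. Here $(s_x, s_y)$ is the axis direction in the local plane space, and $LB_t$, $LC_t$ are the texture coordinate deltas for that axis (my notation, chosen to match the names above):

```latex
\begin{align*}
s_x \, PLB_x + s_y \, PLB_y &= LB_t \\
s_x \, PLC_x + s_y \, PLC_y &= LC_t \\[4pt]
PLB_y = 0 \;\Rightarrow\; s_x &= \frac{LB_t}{PLB_x} \\
s_y &= \frac{LC_t - s_x \, PLC_x}{PLC_y}
\end{align*}
```

The axis vector in the original space is then $s_x$ times the local X axis plus $s_y$ times the local Y axis.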

Do this for both axes and you have your correct texture axis vectors, regardless of the texture projection. You can then multiply the results by your tangent-space normalmap, normalize the result, and have a proper world-space surface normal.

As always, the source code spoilers:

terVec3 lb = ti->points[1] - ti->points[0];
terVec3 lc = ti->points[2] - ti->points[0];
terVec2 lbt = ti->texCoords[1] - ti->texCoords[0];
terVec2 lct = ti->texCoords[2] - ti->texCoords[0];

// Generate local space for the triangle plane
terVec3 localX = lb.Normalize2();
terVec3 localZ = lb.Cross(lc).Normalize2();
terVec3 localY = localX.Cross(localZ).Normalize2();

// Determine X/Y vectors in local space
float plbx = lb.DotProduct(localX);
terVec2 plc = terVec2(lc.DotProduct(localX), lc.DotProduct(localY));

terVec2 tsvS, tsvT;

tsvS[0] = lbt[0] / plbx;
tsvS[1] = (lct[0] - tsvS[0]*plc[0]) / plc[1];
tsvT[0] = lbt[1] / plbx;
tsvT[1] = (lct[1] - tsvT[0]*plc[0]) / plc[1];

ti->svec = (localX*tsvS[0] + localY*tsvS[1]).Normalize2();
ti->tvec = (localX*tsvT[0] + localY*tsvT[1]).Normalize2();

There's an additional special case to be aware of: Mirroring.

Mirroring across an edge can cause wild changes in a vector's direction, possibly even degenerating it. There isn't a clear-cut solution to these, but you can work around the problem by snapping the vector to the normal, effectively cancelling it out on the mirroring edge.

Personally, I check the angle between the two vectors, and if they're more than 90 degrees apart, I cancel them, otherwise I merge them.
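One way that check could look, as a sketch only: I'm assuming here that "cancel" means snapping to the normal as described above, that "merge" means a normalized sum, and that both input vectors are unit length. `ResolveSharedAxis` is a hypothetical name.

```cpp
#include <cmath>

struct Vec3
{
    float x, y, z;
};

float Dot(const Vec3 &a, const Vec3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Combines the texture axis vectors from two triangles sharing a vertex.
// A negative dot product means the vectors are more than 90 degrees apart,
// which is treated as a mirroring seam: the axis is snapped to the normal.
// Otherwise the two are merged by normalizing their sum (an assumption).
Vec3 ResolveSharedAxis(const Vec3 &axisA, const Vec3 &axisB, const Vec3 &normal)
{
    if (Dot(axisA, axisB) < 0.0f)
        return normal;

    Vec3 sum = { axisA.x + axisB.x, axisA.y + axisB.y, axisA.z + axisB.z };
    float len = std::sqrt(Dot(sum, sum));
    return { sum.x / len, sum.y / len, sum.z / len };
}
```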

Thursday, February 11, 2010

Volumetric fog spoilers

Okay, so you want to make volumetric fog. Volumetric fog has gone from being largely a gimmick to being situationally useful, but there are still some difficulties: It's really difficult to model changes in the light inside the fog. There are techniques you can use for volumetric shadows within the fog, like rendering the depths of the front and back sides of non-overlapping volumes into a pair of accumulation textures, and using the difference between the two to determine the amount of distance penetrated.

Let's focus on a simpler implementation though: Planar, infinite, and with a linear transitional region. A transitional region is nice because it means the fog appears to gradually taper off instead of being conspicuously contained entirely below a flat plane.

In practice, there is one primary factor that needs to be determined: The amount of fog penetrated by the line from the viewpoint to the surface. In determining that, the transitional layer and the surface layer actually need to be calculated separately:

Transition layer

For the transition layer, what you want to do is multiply the distance traveled through the transition layer by the average density of the fog. Fortunately, because the density changes linearly, there's a very easy way to get this: The midpoint of the entry and exit points of the transitional region will be located at a point where the fog density is equal to the average density passed through. The entry and exit points can be found by taking the viewpoint and target distances and clamping them to the entry and exit planes.
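A CPU-side sketch of just the average density part, assuming distances are measured along the fog plane normal and that density ramps linearly from 0 at the shallow plane to 1 at the deep plane (`TransitionAverageDensity` and its parameter names are hypothetical):

```cpp
#include <algorithm>

// cameraDist and pointDist are distances along the fog plane normal.
// shallowPlaneDist is where the fog starts (density 0) and deepPlaneDist is
// where it reaches full density (density 1), with deepPlaneDist < shallowPlaneDist.
float TransitionAverageDensity(float cameraDist, float pointDist,
                               float shallowPlaneDist, float deepPlaneDist)
{
    // Clamp both ends of the ray to the transition zone to get entry and exit
    float clampedCamera = std::max(deepPlaneDist, std::min(shallowPlaneDist, cameraDist));
    float clampedPoint = std::max(deepPlaneDist, std::min(shallowPlaneDist, pointDist));

    // Density is linear, so the density at the midpoint is the average density
    float midpoint = (clampedCamera + clampedPoint) * 0.5f;
    return (shallowPlaneDist - midpoint) / (shallowPlaneDist - deepPlaneDist);
}
```

For example, with the zone spanning 6 down to 2, a ray from 5 to 0 passes through the zone from 5 (density 0.25) down to 2 (density 1), and the function returns the average of those two, 0.625.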

Full-density layer

The full-density layer is a bit more complex, since it behaves differently depending on whether the camera is inside or outside of the fog. For a camera inside the fog, the fogged portion is represented by the distance from the camera to the fog plane. For a camera outside of the fog, the fogged portion is represented by the distance from the object to the fog plane. If you want to do it in one pass, both of these modes can be represented by dividing one linearly interpolated value by the linearly interpolated camera-to-point distance, measured relative to the fog plane.

Since the camera being inside or outside the fog is completely determinable in advance, you can easily make permutations based on it and skip a branch in the shader. With a deferred renderer, you can use depth information and the fog plane to determine all of the distances. With a forward renderer, most of the distance factors interpolate linearly, allowing you to do some clamps and divides entirely in the shader.

Regardless of which you use, once you have the complete distance traveled, the most physically accurate determination of the amount still visible is:

min(1, e^(-(length(cameraToVert) * coverage * density)))

You don't have to use e as the base though: Using 2 is a bit faster, and you can rescale the density coefficient to achieve any behavior you could have attained with using e.
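The rescale is just a factor of log2(e) folded into the density coefficient. A small CPU-side sketch of the equivalence (function names are hypothetical; the pre-negated scalar mirrors how the shader code in this post takes its density):

```cpp
#include <algorithm>
#include <cmath>

// log2(e): multiplying an exponent by this converts base e to base 2
const float kLog2E = 1.44269504f;

float VisibilityBaseE(float distance, float coverage, float density)
{
    return std::min(1.0f, std::exp(-(distance * coverage * density)));
}

// fogDensityScalar is expected to be precomputed as -density * kLog2E
float VisibilityBase2(float distance, float coverage, float fogDensityScalar)
{
    return std::min(1.0f, std::exp2(distance * coverage * fogDensityScalar));
}
```

Both functions should agree to within floating-point error when `fogDensityScalar` is set to `-density * kLog2E`.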

As usual, the shader code spoilers:

// VEncodeFog : Encodes a 4-component vector containing fraction components used
// to calculate fog factor
float4 VEncodeFog(float3 cameraPos, float3 vertPos, float4 fogPlane, float fogTransitionDepth)
{
    float cameraDist, pointDist;

    // Distances along the fog plane normal (fogPlane.xyz); fogPlane.w is the plane distance
    cameraDist = dot(cameraPos, fogPlane.xyz);
    pointDist = dot(vertPos, fogPlane.xyz);

    return float4(cameraDist, fogPlane.w, fogPlane.w - fogTransitionDepth, pointDist);
}

// PDecodeFog : Returns the fraction of the original scene to display given
// an encoded fog fraction and the camera-to-vertex vector
// rcpFogTransitionDepth = 1/fogTransitionDepth
float PDecodeFog(float4 fogFactors, float3 cameraToVert, float fogDensityScalar, float rcpFogTransitionDepth)
{
    // x = cameraDist, y = shallowFogPlaneDist, z = deepFogPlaneDist (< shallow), w = pointDist
    float3 diffs = fogFactors.wzz - fogFactors.xxw;

    float cameraToPointDist = diffs.x;
    float cameraToFogDist = diffs.y;
    float nPointToFogDist = diffs.z;

    float rAbsCameraToPointDist = 1.0 / abs(cameraToPointDist);

    // Calculate the average density of the transition zone fog
    // Since density is linear, this will be the same as the density at the midpoint of the ray,
    // clamped to the boundaries of the transition zone
    float clampedCameraTransitionPoint = max(fogFactors.z, min(fogFactors.y, fogFactors.x));
    float clampedPointTransitionPoint = max(fogFactors.z, min(fogFactors.y, fogFactors.w));
    float transitionPointAverage = (clampedPointTransitionPoint + clampedCameraTransitionPoint) * 0.5;

    float transitionAverageDensity = (fogFactors.y - transitionPointAverage) * rcpFogTransitionDepth;

    // Determine a coverage factor based on the density and the fraction of the ray that passed through the transition zone
    float transitionCoverage = transitionAverageDensity *
        abs(clampedCameraTransitionPoint - clampedPointTransitionPoint) * rAbsCameraToPointDist;

    // Calculate coverage for the full-density portion of the volume as the fraction of the ray intersecting
    // the bottom part of the transition zone
    // CAMERA_IN_FOG is an assumed name for the camera-inside-fog shader permutation
#if CAMERA_IN_FOG
    float fullCoverage = cameraToFogDist * rAbsCameraToPointDist;
    if(nPointToFogDist >= 0.0)
        fullCoverage = 1.0;
#else
    float fullCoverage = max(0.0, nPointToFogDist * rAbsCameraToPointDist);
#endif

    float totalCoverage = fullCoverage + transitionCoverage;

    // Use inverse exponential scaling with distance
    // fogDensityScalar is pre-negated
    return min(1.0, exp2(length(cameraToVert) * totalCoverage * fogDensityScalar));
}