DEGA: Real-Time Photorealistic Full-Body Avatars

DEGA is a system that creates real-time photorealistic full-body avatars using 3D Gaussian splatting. It's designed to handle dynamic scenarios, including challenging cases like sliding garments over the body.

Instead of polygons or voxels, DEGA represents an avatar as a collection of 3D Gaussians: many small, overlapping primitives that together form the shape and appearance of a person.

Building the avatar starts from a large set of captured poses, which are turned into a flexible, animatable model. Tetrahedral cages are first transformed with linear blend skinning, which gives a coarse deformation that pose-dependent refinements then correct.
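
To make the linear blend skinning step concrete, here is a minimal sketch in plain Python. It is not DEGA's code: the two bones, the skinning weights, and the helper names (`rot_z`, `transform`, `skin_vertex`) are all made up for illustration. It only shows the general technique, where a cage vertex is moved by every bone that influences it and the results are averaged with the skinning weights.

```python
import math

def rot_z(angle):
    """3x3 rotation about the z-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(rotation, translation, point):
    """Apply one rigid bone transform: rotation @ point + translation."""
    return [
        sum(rotation[r][k] * point[k] for k in range(3)) + translation[r]
        for r in range(3)
    ]

def skin_vertex(vertex, weights, bones):
    """Linear blend skinning: weighted sum of the per-bone transformed vertex."""
    out = [0.0, 0.0, 0.0]
    for w, (rotation, translation) in zip(weights, bones):
        moved = transform(rotation, translation, vertex)
        out = [o + w * m for o, m in zip(out, moved)]
    return out

# Two hypothetical bones: one at rest, one rotated 90 degrees about z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
bones = [(identity, [0.0, 0.0, 0.0]), (rot_z(math.pi / 2), [0.0, 0.0, 0.0])]

# One cage vertex influenced equally by both bones (assumed weights).
cage_vertex = [1.0, 0.0, 0.0]
posed = skin_vertex(cage_vertex, [0.5, 0.5], bones)
print(posed)  # approximately [0.5, 0.5, 0.0]
```

The blended vertex ends up closer to the joint than either rigid result would place it. Artifacts like this are typical of plain linear blend skinning, which is one reason a learned, pose-dependent correction on top is useful.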

At the core of the system is a set of neural networks, each responsible for the deformations of one part of the avatar, so that every pose keeps a natural shape. Together the networks produce the final avatar, including its shape, texture, and color.

The method is also evaluated side by side with other approaches to realistic virtual humans, comparing how well each one captures motion.

The result brings together mathematics, computer science, and artistry.

Key Features

  • 3D Gaussian Splatting: Extends Gaussian splatting, a technique originally designed for static scenes, to animated characters.

  • Input and Composition: Utilizes multiview video captures from 200 synchronized cameras. Inputs include character pose, 3D facial keypoints, RGB ground truth, and segmentation masks.

Technical Aspects

  1. Gaussian Deformations:

    • Uses tetrahedral cages for 3D Gaussian deformations.

    • Each Gaussian's 3D mean is a combination of tetrahedron vertices and barycentric coordinates.

    • Incorporates scaling and rotation into the Gaussian covariance matrix.

  2. Gaussian Nets:

    • Employs small MLPs for independently modeling avatar parts (face, upper garment, lower garment, body).

    • Predicts corrections to cage node positions, Gaussian positions, rotations, and scales.

    • Determines per-Gaussian color and opacity.

  3. Volumetric Rendering:

    • Uses forward mapping from canonical space with motion vectors, face embeddings, color features, and view direction.

    • The final image is achieved through alpha blending-based volumetric rendering.
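
The three steps above reduce to a small amount of linear algebra. The following sketch, in plain Python, is an illustration under assumptions rather than DEGA's implementation: the function names are hypothetical, the covariance uses the standard Gaussian splatting factorization Σ = R S Sᵀ Rᵀ (the article only says that rotation and scaling enter the covariance), and the blending is the usual front-to-back compositing of depth-sorted samples.

```python
def barycentric_mean(tet_vertices, bary):
    """Gaussian mean as the barycentric combination of four tetrahedron vertices."""
    return [sum(b * v[axis] for b, v in zip(bary, tet_vertices)) for axis in range(3)]

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(m):
    """Transpose a matrix stored as nested lists."""
    return [list(row) for row in zip(*m)]

def covariance(rotation, scales):
    """Sigma = R S S^T R^T with S = diag(scales)."""
    s = [[scales[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    rs = matmul(rotation, s)
    return matmul(rs, transpose(rs))

def alpha_blend(colors, alphas):
    """Front-to-back compositing: C = sum_i c_i * a_i * prod_{j<i} (1 - a_j)."""
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for color, alpha in zip(colors, alphas):
        weight = alpha * transmittance
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

# A Gaussian anchored at the centroid of a tetrahedron follows the cage
# when one of its vertices moves.
bary = [0.25, 0.25, 0.25, 0.25]
rest_tet = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
posed_tet = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rest_mean = barycentric_mean(rest_tet, bary)    # [0.25, 0.25, 0.25]
posed_mean = barycentric_mean(posed_tet, bary)  # [0.5, 0.25, 0.25]

# A Gaussian stretched 2x along its local x-axis, then rotated 90 degrees about z:
# the long axis ends up along world y.
rot90_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
sigma = covariance(rot90_z, [2.0, 1.0, 1.0])    # diag(1, 4, 1)

# Two depth-sorted samples on one ray: half-opaque red in front of half-opaque blue.
pixel, remaining = alpha_blend([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.5, 0.5])
print(posed_mean, sigma, pixel, remaining)
```

Because each mean is tied to its tetrahedron through fixed barycentric coordinates, any correction a network applies to the cage nodes moves the attached Gaussians with it. In the blending example the front sample contributes half of the pixel and the one behind it only a quarter, leaving a quarter of the light for the background.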

Comparisons and Advantages

  • Compared with Other Methods:

    • Models garments and captures motion better than mesh-based body decoders, MVP (Mixture of Volumetric Primitives), and DVA (Drivable Volumetric Avatars).

    • Excels in handling view-dependent effects.

  • Advantages:

    • Effective in modeling sliding garments and dynamic motions.

    • Allows decomposition into independent garments for flexible animation.

  • Limitations:

    • Still struggles to capture fine garment details such as wrinkles and self-shadows.

Conclusion

DEGA marks a significant advancement in creating photorealistic, controllable avatars, suitable for applications in virtual reality, gaming, and digital media.
