Model Overview

Description:

UniRelight is a relighting framework that jointly models the distribution of scene intrinsics and illumination. It enables high-quality relighting and intrinsic decomposition from a single input image or video, producing temporally consistent shadows, reflections, and transparency, and outperforms state-of-the-art methods. This model is ready for non-commercial use.

License/Terms of Use:

The model is distributed under the NVIDIA Source Code License.

Deployment Geography:

Global

Use Case:

UniRelight supports studies and prototyping in intrinsic decomposition and controllable relighting. This release is an open-source implementation of our research paper, intended for AI research, development, and benchmarking for image/video delighting and relighting tasks.

Release Date:

GitHub 04/04/2026 via https://github.com/nv-tlabs/UniRelight

Reference(s):

Project page: https://research.nvidia.com/labs/toronto-ai/UniRelight/

Model Architecture:

Architecture Type: Transformer

Network Architecture: Transformer

** This model was developed based on Cosmos-Predict1.

** This model has 7 billion (7B) parameters.

Input:

Input Type(s): Video

Input Format(s): Red, Green, Blue (RGB) image frames.

Input Parameters: The input video is a five-dimensional tensor of shape [batch_size, num_frames, height, width, 3], where the last dimension holds the Red, Green, Blue (RGB) channels.

Other Properties Related to Input: The input resolution is 480 × 848.
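
The input layout described above can be sketched as follows. This is a minimal illustration using NumPy; the helper name, the [0, 1] float normalization, and the clip length are assumptions not confirmed by this card — only the [batch_size, num_frames, height, width, 3] layout and the 480 × 848 resolution come from the specification above.

```python
import numpy as np

# Hypothetical helper: pack a list of RGB frames into the 5-D layout
# [batch_size, num_frames, height, width, 3] described in this card.
# The float32 [0, 1] normalization is an illustrative assumption.
def frames_to_batch(frames, height=480, width=848):
    video = np.stack(frames, axis=0).astype(np.float32) / 255.0
    assert video.shape == (len(frames), height, width, 3)
    return video[None, ...]  # prepend the batch dimension

frames = [np.zeros((480, 848, 3), dtype=np.uint8) for _ in range(8)]
batch = frames_to_batch(frames)
print(batch.shape)  # (1, 8, 480, 848, 3)
```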

Output:

Output Type(s): Video

Output Format(s): Red, Green, Blue (RGB) image frames.

Output Parameters: The output video is a five-dimensional tensor of shape [batch_size, num_frames, height, width, 3], where the last dimension holds the Red, Green, Blue (RGB) channels.

Other Properties Related to Output: The output resolution is 480 × 848.

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Software Integration:

Runtime Engine(s): Not applicable; the model runs via Python scripts and PyTorch

Supported Hardware Microarchitecture Compatibility: NVIDIA Ampere (A100)

Supported Operating System(s): Linux

Model Version(s):

  • UniRelight_Cosmos_7B: initial release, supports albedo estimation and video relighting

Training, Testing, and Evaluation Datasets:

Training Dataset:

** Data Modality

  • Video

** Video Training Data Size

  • ~108,000 rendered videos. Each video contains 57 frames at 704×1280 resolution.

** Data Collection Method by dataset

  • [Synthetic] — all data is generated via a physically based path tracer

** Labeling Method by dataset

  • [Synthetic] — all labels are produced by the renderer

** Properties: The SyntheticScenes dataset consists of 108k synthetic multi-illumination videos generated entirely using an OptiX-based physically based path tracer. Each sample includes four machine-generated modalities:

  • Input RGB video (LDR)
  • Albedo video (reflectance)
  • HDR environment lighting
  • Relit target video under a second environment map
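
A sample with the four modalities above might be organized as sketched below. The container class, field names, dtypes, and the environment-map resolution are illustrative assumptions; only the list of modalities and the 57-frame 704×1280 sample format come from this card (a short 4-frame clip is used here to keep the example small).

```python
from dataclasses import dataclass

import numpy as np

# Illustrative container mirroring the four machine-generated modalities
# listed above. Names and dtypes are assumptions, not the dataset's schema.
@dataclass
class RelightSample:
    input_rgb: np.ndarray     # LDR input video, [T, H, W, 3], uint8
    albedo: np.ndarray        # albedo (reflectance) video, same shape
    env_light: np.ndarray     # HDR environment map, [He, We, 3], float32
    relit_target: np.ndarray  # video relit under a second environment map

t, h, w = 4, 704, 1280  # real samples use T = 57
sample = RelightSample(
    input_rgb=np.zeros((t, h, w, 3), dtype=np.uint8),
    albedo=np.zeros((t, h, w, 3), dtype=np.uint8),
    env_light=np.zeros((256, 512, 3), dtype=np.float32),
    relit_target=np.zeros((t, h, w, 3), dtype=np.uint8),
)
print(sample.relit_target.shape)  # (4, 704, 1280, 3)
```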

Testing Dataset:

** Data Modality

  • Video

** Video Testing Data Size

  • ~10,800 rendered videos (a held-out 10% subset of the ~108,000-video dataset). Each video contains 57 frames at 704×1280 resolution.

** Data Collection Method by dataset

  • [Synthetic] — all data is generated via a physically based path tracer

** Labeling Method by dataset

  • [Synthetic] — all labels are produced by the renderer

** Properties:

The testing split is a held-out 10% subset of the same dataset used for training.

Evaluation Dataset:

** Data Modality

  • Video

** Video Evaluation Data Size

  • ~10,800 rendered videos (a held-out 10% subset of the ~108,000-video dataset). Each video contains 57 frames at 704×1280 resolution.

** Data Collection Method by dataset

  • [Synthetic] — all data is generated via a physically based path tracer

** Labeling Method by dataset

  • [Synthetic] — all labels are produced by the renderer

** Properties:

The evaluation split is a held-out 10% subset of the same dataset used for training.
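
A deterministic 90/10 holdout like the one described above could be produced as follows. This is a sketch only: the card states just the 10% holdout fraction, so the seed, shuffling scheme, and function name are assumptions.

```python
import random

# Illustrative 90/10 split over video IDs. The seed and ordering are
# assumptions; the card only states a held-out 10% subset.
def split_dataset(video_ids, holdout_frac=0.10, seed=0):
    ids = sorted(video_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_holdout = int(len(ids) * holdout_frac)
    return ids[n_holdout:], ids[:n_holdout]  # (train, held-out)

train, held_out = split_dataset(range(108_000))
print(len(train), len(held_out))  # 97200 10800
```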

Inference:

Engine: PyTorch

Test Hardware:

A100 GPUs

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please ensure you have the proper rights and permissions for all input image and video content; if the input includes people, personal health information, or intellectual property, the generated image or video will not blur or preserve the proportions of the subjects it depicts.
Users are responsible for model inputs and outputs. Users are responsible for ensuring safe integration of this model, including implementing guardrails as well as other safety mechanisms, prior to deployment.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
