Project Description
We are looking for an experienced PyTorch optimization expert to accelerate the DOVE (Video Super-Resolution) model. The goal is to achieve an end-to-end inference speedup of 1.5x to 1.8x using strictly training-free methods.
Acceptable Optimization Techniques (Training-Free only):
You are free to explore and combine the following training-free approaches:
Token-level routing, Token Merging (ToMe), or Token Pruning.
Post-Training Quantization (PTQ).
Attention simplification (e.g., efficient attention mechanisms).
Coordination/Synergy optimization between VAE and DiT.
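As a rough illustration of the token-merging option, the sketch below averages the `r` most redundant tokens using a simplified bipartite matching in the style of ToMe. It is not taken from the DOVE repository or from our existing baseline; the function name `merge_tokens` and the `(batch, tokens, dim)` layout are assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def merge_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
    """Merge the r most redundant tokens of x (batch, tokens, dim) by averaging.

    Hypothetical helper for illustration; assumes at least two tokens.
    """
    dim = x.shape[-1]
    # Split tokens alternately into a source set and a destination set.
    src_set, dst_set = x[:, ::2], x[:, 1::2]
    r = min(r, src_set.shape[1])

    # Cosine similarity between every source and every destination token.
    sim = F.normalize(src_set, dim=-1) @ F.normalize(dst_set, dim=-1).transpose(-1, -2)
    best_sim, best_dst = sim.max(dim=-1)

    # Sources with the most similar partner get merged; the rest are kept.
    order = best_sim.argsort(dim=-1, descending=True)
    merge_idx, keep_idx = order[:, :r], order[:, r:]
    dst_idx = best_dst.gather(1, merge_idx)

    def rows(idx: torch.Tensor) -> torch.Tensor:
        # Broadcast a (batch, k) token index over the feature dimension.
        return idx.unsqueeze(-1).expand(-1, -1, dim)

    merged_src = src_set.gather(1, rows(merge_idx))
    kept_src = src_set.gather(1, rows(keep_idx))

    # Average each destination token with the sources assigned to it.
    sums = dst_set.scatter_add(1, rows(dst_idx), merged_src)
    counts = torch.ones_like(dst_set[..., :1]).scatter_add(
        1, dst_idx.unsqueeze(-1), torch.ones_like(merged_src[..., :1])
    )
    return torch.cat([kept_src, sums / counts], dim=1)
```

Calling `merge_tokens(x, r=2)` on a tensor of shape `(2, 8, 16)` returns shape `(2, 6, 16)`. The sketch reorders tokens and does not restore the original sequence length, so a real integration would also need an unmerge step and handling for positional information before the output reaches the VAE decoder.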
Core Requirements:
Target Model: DOVE (Repository: https://github.com/zhengchen1999/DOVE)
Acceleration Target: 1.5x - 1.8x end-to-end inference speedup.
Hardware Baseline: The speedup must be achieved and evaluated on a high-end NVIDIA GPU with 40GB+ VRAM (specifically targeting NVIDIA L40S, H100, or equivalent).
Testing Condition: The inference speed MUST be measured under the "no tiling" setting.
Quality Metrics Constraints:
Image Quality Assessment (IQA) metrics and temporal metrics must NOT degrade compared to the original DOVE baseline (for lower-is-better metrics, this means they must not increase).
Exception: PSNR and SSIM are allowed to drop, but the degradation must strictly not exceed 8%.
Tech Stack: Native PyTorch.
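The acceptance criteria above could be checked with something like the following minimal sketch. All function names are hypothetical. It assumes the 8% limit is measured relative to the baseline value, which the requirements do not state explicitly, and that baseline values are nonzero. LPIPS appears only as an example of a lower-is-better metric, and the numbers are made up for illustration.

```python
import time
from statistics import median


def median_latency(fn, warmup=2, iters=5, sync=None):
    """Median wall-clock seconds per call of fn().

    sync is called around each timed run so that asynchronous work is counted.
    """
    sync = sync or (lambda: None)
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        sync()
        start = time.perf_counter()
        fn()
        sync()
        samples.append(time.perf_counter() - start)
    return median(samples)


def speedup(baseline_s, optimized_s):
    """End-to-end speedup factor; the target range is 1.5 to 1.8."""
    return baseline_s / optimized_s


def failed_metrics(baseline, candidate, higher_is_better, tolerance):
    """Return the names of metrics that degrade by more than their tolerance.

    tolerance maps a metric name to its allowed relative degradation;
    metrics not listed there must not degrade at all.
    """
    failed = []
    for name, base in baseline.items():
        # Positive value means the candidate is worse than the baseline.
        degradation = (base - candidate[name]) / abs(base)
        if not higher_is_better[name]:
            degradation = -degradation
        if degradation > tolerance.get(name, 0.0):
            failed.append(name)
    return failed


# Illustrative, made-up values only.
baseline = {"PSNR": 30.0, "SSIM": 0.80, "LPIPS": 0.20}
candidate = {"PSNR": 28.0, "SSIM": 0.78, "LPIPS": 0.20}
direction = {"PSNR": True, "SSIM": True, "LPIPS": False}
allowed = {"PSNR": 0.08, "SSIM": 0.08}
print(failed_metrics(baseline, candidate, direction, allowed))
```

When timing on a GPU, pass `sync=torch.cuda.synchronize`, since CUDA kernels launch asynchronously and the timer would otherwise stop before the work finishes. Both the baseline and the optimized model should be timed on the same GPU, with the same inputs, and with tiling disabled.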
Current Progress:
We already have a preliminary working version implemented with Token Merging. This can be provided to the hired freelancer as a baseline/reference.
Budget & Timeline:
Budget: 700 EUR (fixed price, paid upon successful delivery and testing).
Timeline: 1 Month (4 Weeks).
To Apply, Please Provide:
Briefly describe your proposed training-free pipeline (e.g., which combination of pruning/quantization/attention simplification you plan to use) and confirm your availability for the 1-month timeline.