Official repo for VGen: a holistic video generation ecosystem built on diffusion models
Updated Jan 10, 2025 - Python
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" [TMLR 2024]
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation [TMLR 2024]
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
HeyGem — Your AI face, made free
AI Talking Head: create a video from plain text or an audio file in minutes; supports 100+ languages and 350+ voice models.
Diverse Video Generation using a Gaussian Process Trigger
Videosyn (视频合成) is a multi-source video compositing plugin developed by the PlugLink team, mainly used for multi-account ("matrix") publishing, automating otherwise manual work.
Generating Diverse Audio-Visual 360º Soundscapes for Sound Event Localization and Detection
devola2 is an open source real-time audio visualizer (video-synth) tool. Purposely designed for live-music performance of B.L.M.D and Even Tide.
This model synthesises high-fidelity fashion videos from single images featuring spontaneous and believable movements.
Collection of openFrameworks video synthesis examples
SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis (MICCAI 2025 - ORAL)
AI-Text-Video is an open-source project that leverages artificial intelligence and deep learning to automatically convert written text into engaging videos. This tool generates video content—including visuals, animations, and voiceovers—from simple text input, making it ideal for content creators, educators, marketers, and more.
NeRF: Real-time View Synthesis
An automated video storytelling pipeline that turns online articles into narrated clips with custom scripts, titles, and visuals. Combines scraping, summarization, TTS, and video generation—ideal for tech news recaps, AI-powered media channels, or hands-free content creation.
LipGANs is a text-to-viseme GAN framework that generates realistic mouth movements directly from text, without requiring audio. It maps phonemes → visemes, predicts phoneme durations, and uses per-viseme 3D GANs to synthesize photorealistic frames that can be exported as PNG sequences, GIFs, or MP4 videos.