Breakthroughs in Video-Infinity, LongVA, Text-Animator

🚨Text-Animator: Controllable Visual Text Video Generation 🌟Proj: https://t.co/w0IN7cZ9ak 🚀Abs: https://t.co/4epuSFo8r8 Improves the stability of generated visual text by controlling the camera movement as well as the motion of the visualized text. https://t.co/fCQuqfae0M

Zhengzhong Tu@_vztu

4 d

🚨MotionBooth: Motion-Aware Customized Text-to-Video Generation 🌟Proj: https://t.co/hPNkvQBgGP 🚀Abs: https://t.co/wl7UONW0HE Animating customized subjects with precise control over both object and camera movements. https://t.co/D1DGJjXqa2

AI Bites | YouTube Channel@ai_bites

4 d

MotionBooth is an innovative framework designed for animating customized subjects with precise control over both object and camera movements. Paper: MotionBooth: Motion-Aware Customized Text-to-Video Generation Link: https://t.co/dZcHTg0ifh Project: https://t.co/bGOuIVyki6 #AI… https://t.co/B7Rc671tyJ

AI Bites | YouTube Channel@ai_bites

4 d

While recent advances in text-to-image (T2I) visual text generation show promise, transitioning these techniques into the video domain faces problems, notably in preserving textual fidelity and motion coherence. This paper proposes an innovative approach termed Text-Animator for… https://t.co/Vz19GR2gZI

Fuzhao Xue on the job market!@XueFz

4 d

A simple but smart way to achieve real-time video generation! Training-free with almost no performance drop -> real-time DiT video generation (over 20 fps). I like the analysis that led to this technique: 1) Found the attention patterns are extremely similar in nearby diffusion… https://t.co/3Obw2AJr1I
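The observation in the post above (attention patterns that barely change between nearby diffusion steps) suggests reusing a computed result for a few consecutive steps instead of recomputing it. The following is only a minimal sketch of that general caching idea, not the authors' Pyramid Attention Broadcast implementation. `BroadcastCache`, `fake_attention`, `run`, and the `interval` parameter are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Optional, Tuple


class BroadcastCache:
    """Hypothetical helper: recompute every `interval` steps, reuse the result otherwise."""

    def __init__(self, compute: Callable[[int], List[float]], interval: int) -> None:
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.compute = compute
        self.interval = interval
        self.cached: Optional[List[float]] = None
        self.calls = 0  # number of real (non-reused) computations

    def __call__(self, step: int) -> List[float]:
        # Recompute on the first call and at every interval boundary.
        if self.cached is None or step % self.interval == 0:
            self.cached = self.compute(step)
            self.calls += 1
        # Otherwise the previously computed output is reused as-is.
        return self.cached


def fake_attention(step: int) -> List[float]:
    # Stand-in for an expensive attention block (assumption: a real model
    # would return a tensor; a short list keeps the sketch dependency-free).
    return [1.0 / (1 + step), 0.5]


def run(num_steps: int, interval: int) -> Tuple[List[List[float]], int]:
    cache = BroadcastCache(fake_attention, interval)
    outputs = [cache(step) for step in range(num_steps)]
    return outputs, cache.calls


if __name__ == "__main__":
    _, calls = run(num_steps=8, interval=4)
    print(f"expensive computations: {calls} of 8 steps")
```

With `interval=1` the cache never reuses anything, so the loop behaves like the uncached baseline. Larger intervals trade fidelity for speed, and how large an interval is safe depends on how similar neighbouring steps really are.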

Ziwei Liu@liuziwei7

4 d

🏆Our #LongVA is now the **best** open-source video LLM among the 7B-scale models🏆 * Long-context capability of processing 2000+ frames or over 200K visual tokens - Code: https://t.co/Jb7P5F59Bf - Blog: https://t.co/FMxFY4dIEx - Demo @huggingface : https://t.co/BYyiXj8pc2 https://t.co/V1wGg50iZp https://t.co/fUuRMiFqbo

Xuanlei Zhao@oahzxl

4 d

Real-Time Video Generation: Achieved 🥳 Share our latest work with @JxlDragon, @VictorKaiWang1, and @YangYou1991: "Real-Time Video Generation with Pyramid Attention Broadcast." 3 features: real-time, lossless quality, and training-free! Blog: https://t.co/e6nTwd5J0L (🧵1/6) https://t.co/tPvBvSvcMp

Aran Komatsuzaki@arankomatsuzaki

4 d

Tencent and Huawei present Text-Animator: Controllable Visual Text Video Generation. Demonstrates that their approach generates more accurate visual text than state-of-the-art video generation methods. abs: https://t.co/hXR7QRC2xz proj: https://t.co/QDqF9xKhnS https://t.co/r4gTmy7X3i

Aran Komatsuzaki@arankomatsuzaki

5 d

Video-Infinity: Distributed Long Video Generation. Can generate very long videos, up to 2300 frames within 5 minutes, using clip parallelism and dual-scope attention. proj: https://t.co/7ywX3pY4h9 abs: https://t.co/CUkAGG8RSn repo: https://t.co/PdR05xhyrj https://t.co/qWrREC09vL

Ziwei Liu@liuziwei7

5 d

🔥Long Context from Language to Vision🔥 #LongVA can process 2000 frames or over 200K visual tokens with SoTA performance on Video-MME among 7B models - Paper: https://t.co/iCVi2EISeB - Code: https://t.co/Jb7P5F59Bf - Demo @Gradio: https://t.co/BYyiXj8pc2. Thanks to @_akhaliq! https://t.co/ZBdWx4HrlG

Gradio@Gradio

5 d

🤩Long Video Assistant (LongVA): Breakthrough in long 🎥video understanding! - Transfers long context capability from language to vision 🧠 - Only open-source model supporting 384 input frames🤩 - Handles 2000+ frames (200K+ visual tokens) 🤯 - SoTA on Video-MME among 7B models -… https://t.co/GH4g0q9hhV

Aran Komatsuzaki@arankomatsuzaki

5 d

Long Context Transfer from Language to Vision - Can process 2000 frames or over 200K visual tokens - SotA perf on VideoMME among 7B-scale models abs: https://t.co/JlXz5TPbVP repo: https://t.co/Nyi6fTS5qh https://t.co/ehRr5V0syo
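As a rough sanity check on the figures quoted in these LongVA posts, the two numbers can be divided to get an approximate visual-token budget per frame. This is a back-of-the-envelope sketch only: it uses the rounded figures from the posts ("2000 frames", "over 200K visual tokens"), so the result is a lower bound rather than the model's actual per-frame token count. `tokens_per_frame` is a hypothetical helper name.

```python
def tokens_per_frame(total_visual_tokens: int, frames: int) -> float:
    """Average number of visual tokens spent on each frame."""
    if frames <= 0:
        raise ValueError("frames must be positive")
    return total_visual_tokens / frames


if __name__ == "__main__":
    # Figures as quoted in the posts; "over 200K" makes this a lower bound.
    budget = tokens_per_frame(total_visual_tokens=200_000, frames=2000)
    print(f"at least {budget:.0f} visual tokens per frame")
```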

Xingyi Yang@yxy2168

5 d

🚀 Introducing Video-Infinity! Our new distributed framework revolutionizes long video generation. 🎥✨ 🌟 Generate videos up to 2,300 frames in just 5 minutes, 100x faster than previous methods! #AI #Video #AIGC Project Page: https://t.co/QTN0uoxv8f Paper: https://t.co/R7X9QADFzN https://t.co/CzoZiYupj5
