Hunyuan-GameCraft

High-dynamic Interactive Game Video Generation with Hybrid History Condition

Jiaqi Li1,2* Junshu Tang1* Zhiyong Xu1 Longhuang Wu1
Yuan Zhou1 Shuai Shao1 Tianbao Yu1 Zhiguo Cao2 Qinglin Lu1†
1 Tencent Hunyuan 2 Huazhong University of Science and Technology
* Equal Contribution † Corresponding author
We are looking for collaboration and self-motivated interns. Contact: juliatang@tencent.com.
Abstract
Recent advances in diffusion-based and controllable video generation have enabled high-quality and temporally coherent video synthesis, laying the groundwork for immersive interactive gaming experiences. However, current methods are limited in dynamics, generality, long-term consistency, and efficiency, which restricts the variety of gameplay videos they can create. To address these gaps, we introduce Hunyuan-GameCraft, a novel framework for high-dynamic interactive video generation in game environments. To achieve fine-grained action control, we unify standard keyboard and mouse inputs into a shared camera representation space, facilitating smooth interpolation between various camera and movement operations. We then propose a hybrid history-conditioned training strategy that extends video sequences autoregressively while preserving game scene information. Additionally, to enhance inference efficiency and playability, we apply model distillation to reduce computational overhead while maintaining consistency across long temporal sequences, making the model suitable for real-time deployment in complex interactive environments. The model is trained on a large-scale dataset comprising over one million gameplay recordings across over 100 AAA games, ensuring broad coverage and diversity, and is then fine-tuned on a carefully annotated synthetic dataset to enhance precision and control. The curated game scene data significantly improves visual fidelity, realism, and action controllability. Extensive experiments demonstrate that Hunyuan-GameCraft significantly outperforms existing models, advancing the realism and playability of interactive game video generation.
Method
Hunyuan-GameCraft Pipeline Diagram
Overall architecture of Hunyuan-GameCraft. Given a reference image, its corresponding prompt, and keyboard or mouse signals, we transform these inputs into a continuous camera space. We then design a lightweight action encoder to encode the input camera trajectory. The action and image features are added together after patchification. For long video extension, we design a variable mask indicator, where 1 and 0 indicate history frames and predicted frames, respectively.
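The mask indicator and the additive fusion described in the caption can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function names `history_mask` and `fuse_after_patchify` and the token layout (frames, tokens per frame, channels) are assumptions made only for illustration.

```python
import numpy as np


def history_mask(num_history: int, num_predicted: int) -> np.ndarray:
    """Per-frame indicator: 1 marks a history frame, 0 marks a frame to predict."""
    return np.concatenate([
        np.ones(num_history, dtype=np.int64),
        np.zeros(num_predicted, dtype=np.int64),
    ])


def fuse_after_patchify(image_tokens: np.ndarray, action_tokens: np.ndarray) -> np.ndarray:
    """Element-wise addition of image and action features.

    Assumes both inputs were already patchified to the same
    (frames, tokens_per_frame, channels) shape; this layout is hypothetical.
    """
    if image_tokens.shape != action_tokens.shape:
        raise ValueError("image and action tokens must share one shape")
    return image_tokens + action_tokens


# Two history frames followed by four frames to be generated.
mask = history_mask(num_history=2, num_predicted=4)

# Stand-in features: 6 frames, 8 tokens per frame, 16 channels.
image_tokens = np.ones((6, 8, 16))
action_tokens = np.full((6, 8, 16), 0.5)
fused = fuse_after_patchify(image_tokens, action_tokens)
```

Because the mask length is simply the number of history frames plus the number of predicted frames, the same helper covers both a single reference image and a longer history clip.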
Qualitative Performance
We demonstrate and compare control accuracy, long-term consistency, history preservation, and dynamic performance across multiple scenarios and styles to reveal the power of Hunyuan-GameCraft.
Single-action Comparison
We compare single-action control accuracy against other interactive video methods across multiple game scenarios and art styles.
00
A vibrant, sunlit street lined with colorful European-style buildings and a tram track stretches into the distance, framed by modern skyscrapers in the background.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
01
A serene landscape features a river winding through lush green fields under a bright blue sky dotted with fluffy clouds.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
08
A vibrant, idyllic garden scene features lush greenery, blooming purple flowers, a rustic wooden fence, and a small basket resting on the ground amidst the natural beauty.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
11
A sniper rifle with a mounted scope is positioned on a rocky outcrop overlooking a mountainous landscape under a clear sky.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
13
A charming Parisian street scene featuring "Chez Marceau Brasserie" with its vibrant red awning and outdoor seating area, surrounded by quaint shops and lush greenery under a bright blue sky.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
19
A picturesque village scene featuring quaint houses, a windmill, lush greenery, and a serene mountain backdrop under a bright blue sky.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
22
A picturesque rural landscape featuring a traditional windmill surrounded by golden fields under a partly cloudy sky.
Panels: Input, Ours, Matrix-Game, MotionCtrl, CameraCtrl, WanX-Cam
Multi-action Visualization
Given multiple sequential action inputs, we focus on the quality, continuity, and consistency of the generated output. Hunyuan-GameCraft also supports exploring multiple complex trajectories within a single scene; the related comparison and visualization results are shown below.
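One way to picture how a sequence of discrete inputs could become a continuous camera trajectory is sketched below. It is only a sketch under stated assumptions: the 5-D motion vector (dx, dy, dz, yaw, pitch), the `KEY_TO_CAMERA` table, and the linear blending at action boundaries are hypothetical choices for illustration, not the representation used by Hunyuan-GameCraft.

```python
import numpy as np

# Hypothetical table: each discrete input maps to a per-frame camera motion
# (dx, dy, dz, yaw, pitch). The axes and signs are illustrative assumptions.
KEY_TO_CAMERA = {
    "W": (0.0, 0.0, 1.0, 0.0, 0.0),       # move forward
    "S": (0.0, 0.0, -1.0, 0.0, 0.0),      # move backward
    "A": (-1.0, 0.0, 0.0, 0.0, 0.0),      # strafe left
    "D": (1.0, 0.0, 0.0, 0.0, 0.0),       # strafe right
    "LEFT": (0.0, 0.0, 0.0, -1.0, 0.0),   # turn left
    "RIGHT": (0.0, 0.0, 0.0, 1.0, 0.0),   # turn right
    "UP": (0.0, 0.0, 0.0, 0.0, 1.0),      # look up
    "DOWN": (0.0, 0.0, 0.0, 0.0, -1.0),   # look down
}


def actions_to_trajectory(actions, blend_frames=4):
    """Convert [(key, num_frames, speed), ...] into a (total_frames, 5) array.

    The first `blend_frames` frames of each new action are linearly
    interpolated from the previous action's motion to the new one, so the
    trajectory changes smoothly at action boundaries.
    """
    frames = []
    prev = None
    for key, num_frames, speed in actions:
        target = np.asarray(KEY_TO_CAMERA[key], dtype=float) * speed
        for i in range(num_frames):
            if prev is None or i >= blend_frames:
                frames.append(target)
            else:
                w = (i + 1) / (blend_frames + 1)
                frames.append((1.0 - w) * prev + w * target)
        prev = target
    if not frames:
        return np.zeros((0, 5))
    return np.stack(frames)


# Move forward for 4 frames, then strafe right for 4 frames.
traj = actions_to_trajectory([("W", 4, 1.0), ("D", 4, 1.0)], blend_frames=1)
```

In this example the first frame of the second action is an even mix of forward and rightward motion, and the remaining frames are pure rightward motion.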
A pixelated, blocky landscape featuring a wooden house surrounded by lush greenery and a serene pond, set against a backdrop of towering, jagged mountains under a twilight sky.
Panels: Ours, Matrix-Game
A cozy, whimsical bedroom filled with colorful decorations, books, and personal items, bathed in soft, ambient lighting.
Panels: Ours (two videos)
A sunlit courtyard features white adobe buildings with arched doorways and windows, surrounded by lush greenery and palm trees, creating a serene Mediterranean ambiance.
Panels: Ours (two videos)
History Preservation
Immersive game experiences require 3D consistency and scene coherence when modeling interactive video of game scenes. Thanks to the hybrid history condition, Hunyuan-GameCraft effectively preserves the original scene information even after significant movement.
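The idea of extending a video chunk by chunk while conditioning on recent history can be sketched as a simple loop. Everything here is an illustrative assumption: `extend_video`, `dummy_generate_chunk`, the chunk length, and the history window are hypothetical, and the dummy generator only stands in for the actual diffusion model.

```python
import numpy as np


def extend_video(first_frame, actions, generate_chunk, chunk_len=3, max_history=4):
    """Autoregressively extend a video, one chunk per action.

    Each step conditions on up to `max_history` of the most recent frames and
    passes a mask in which 1 marks history frames and 0 marks frames to predict.
    """
    video = [first_frame]
    for action in actions:
        history = np.stack(video[-max_history:])
        mask = np.concatenate([np.ones(len(history)), np.zeros(chunk_len)])
        video.extend(generate_chunk(history, action, mask))
    return np.stack(video)


def dummy_generate_chunk(history, action, mask):
    """Stand-in for a generative model (hypothetical).

    Emits one frame per zero in the mask, each equal to the last history
    frame shifted by the scalar action, so the loop can run end to end.
    """
    num_new = int((mask == 0).sum())
    return [history[-1] + action for _ in range(num_new)]


# Start from a 2x2 "image" and apply two actions in sequence.
first_frame = np.zeros((2, 2))
video = extend_video(first_frame, [1.0, 2.0], dummy_generate_chunk)
```

The history window grows from a single reference frame in the first step to several previously generated frames in later steps, which is the point where scene information from earlier chunks can be carried forward.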
A serene temple courtyard features multiple seated Buddha statues surrounded by misty, lush greenery under a cloudy sky.
A sunken pirate ship rests on a sandy beach surrounded by lush tropical vegetation and towering cliffs under a bright blue sky.
A serene forest scene featuring a wooden bridge crossing over a small stream, surrounded by lush green trees and vibrant wildflowers under a bright blue sky.
A scenic view of a forested hillside under a clear blue sky.
A serene landscape features vibrant hot air balloons floating above a lush green meadow with snow-capped mountains in the background under a bright blue sky.
A wide, open street lined with trees and modern buildings under a partly cloudy sky.
A red-cloaked figure rides a white horse past a lakeside castle with domed towers, surrounded by lush greenery and purple flowers.
A quaint cottage with a red-tiled roof sits nestled in a vibrant garden filled with colorful flowers and lush greenery.
A medieval stone castle stands tall under a dark sky, its glowing windows contrasting with the surrounding snow-covered landscape.
Third-person Perspective
Third-person games are an important category, and Hunyuan-GameCraft generalizes to third-person gaming scenarios with natural controls.
A dark, sleek car is driving down a winding road at night, its headlights illuminating the path ahead.
A small sailboat navigates through choppy waters under a vast sky.
A first-person view of a character riding a horse through a dense forest, holding a bow.
A blue car drives alone on a foggy two-lane country road through rolling hills, with its red taillights glowing in the mist.
A sniper rifle with a mounted scope is aimed across a vast field of dry grass toward distant trees under a partly cloudy sky.
A yellow sports car speeds down a tree-lined two-lane road with white walls under a clear blue sky.
Application
We show results generated by Hunyuan-GameCraft for a wider range of applications, further illustrating the model's generalization and dynamic performance.
A giraffe stands gracefully among golden grasses with a lush green forest in the background.
A white dog wearing a blue harness runs energetically along a grassy path surrounded by greenery.
A figure in dark clothing walks through a sunlit ancient courtyard with cobblestone paths, stone pillars, and vine-covered arches surrounded by lush greenery.
Two shadowy figures on horseback ride through a foggy medieval village with thatched-roof huts and muddy paths under a gloomy sky.
A beige Toyota SUV with off-road tires stands parked amidst rugged desert rock formations under a cloudy sky.
A young girl blows bubbles in a sunny park setting, surrounded by vibrant colors and natural light.
A young girl in a blue vest blows bubbles in a sunny park, surrounded by trees and a path.
A man wearing a hat and vest sits on a brick wall, playing a guitar, with a scenic view of trees and buildings in the background.

BibTeX

@misc{li2025hunyuangamecrafthighdynamicinteractivegame,
    title={Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition}, 
    author={Jiaqi Li and Junshu Tang and Zhiyong Xu and Longhuang Wu and Yuan Zhou and Shuai Shao and Tianbao Yu and Zhiguo Cao and Qinglin Lu},
    year={2025},
    eprint={2506.17201},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2506.17201}, 
}