LayerAnimate


Layer-level Control for Animation


Yuxue Yang1,2     Lue Fan2     Zuzeng Lin3     Feng Wang4     Zhaoxiang Zhang1,2
1UCAS     2CASIA     3TJU     4CreateAI

We have updated LayerAnimate to support trajectory control, enabling flexible composition of various layer-level controls. The video on BiliBili and YouTube illustrates our original framework and will be updated soon.

Abstract

Traditional animation production decomposes visual elements into discrete layers to enable independent processing for sketching, refining, coloring, and in-betweening. Existing anime video generation methods typically treat animation as a data domain distinct from real-world videos and lack fine-grained control at the layer level. To bridge this gap, we introduce LayerAnimate, a novel video diffusion framework with a layer-aware architecture that enables the manipulation of individual layers through layer-level controls. Developing a layer-aware framework faces a significant data scarcity challenge due to the commercial sensitivity of professional animation assets. To address this limitation, we propose a data curation pipeline featuring Automated Element Segmentation and Motion-based Hierarchical Merging. Through quantitative and qualitative comparisons and a user study, we demonstrate that LayerAnimate outperforms current methods in animation quality, control precision, and usability, making it an effective tool for both professional animators and amateur enthusiasts. This framework opens up new possibilities for layer-level animation applications and creative flexibility.

Layer Curation

Layer Curation Pipeline. The bottom orange dashed box illustrates curated layer masks with different motion scores; each motion score remains constant over time throughout the animation clip. Yellow dashed boxes denote new elements absent from the first frame, demonstrating our pipeline's capability to segment dynamically appearing elements. Some masklet frames are rendered semi-transparently to highlight the new elements in Key Frame Ki.
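As a rough illustration of how a temporally constant motion score and motion-based merging could fit together, the sketch below pools optical-flow magnitude inside one element's mask over a whole clip, then greedily merges the elements whose scores are closest. This is a minimal sketch under stated assumptions, not the pipeline used by LayerAnimate: the functions `motion_score` and `merge_by_motion`, the flow-magnitude definition of the score, and the greedy closest-pair merging rule are all hypothetical choices made for the example.

```python
from math import hypot


def motion_score(flows, masks):
    """Mean flow magnitude inside the mask, pooled over all frames of a clip.

    flows: list of frames, each a 2D grid of (dx, dy) tuples (assumed precomputed).
    masks: list of frames, each a 2D grid of 0/1 values for one element.
    Returns one float per clip, so the score does not vary over time.
    """
    total, count = 0.0, 0
    for flow, mask in zip(flows, masks):
        for flow_row, mask_row in zip(flow, mask):
            for (dx, dy), inside in zip(flow_row, mask_row):
                if inside:
                    total += hypot(dx, dy)
                    count += 1
    return total / count if count else 0.0


def merge_by_motion(scores, num_layers):
    """Hypothetical merging rule: repeatedly merge the two groups with the closest scores.

    scores: dict mapping element name -> motion score.
    Returns a list of (sorted member names, mean score), ordered by score.
    """
    groups = [([name], score) for name, score in scores.items()]
    while len(groups) > max(num_layers, 1):
        groups.sort(key=lambda g: g[1])
        # After sorting, the closest pair of scores is always adjacent.
        i = min(range(len(groups) - 1), key=lambda k: groups[k + 1][1] - groups[k][1])
        (a, sa), (b, sb) = groups[i], groups[i + 1]
        mean = (sa * len(a) + sb * len(b)) / (len(a) + len(b))
        groups[i:i + 2] = [(a + b, mean)]
    groups.sort(key=lambda g: g[1])
    return [(sorted(members), score) for members, score in groups]


# Toy usage: four segmented elements merged into two layers.
element_scores = {"sky": 0.1, "tree": 0.2, "girl": 3.0, "hair": 3.4}
layers = merge_by_motion(element_scores, num_layers=2)
```

In this toy run the near-static elements end up in one layer and the fast-moving ones in another, which is the kind of grouping the figure's per-layer motion scores suggest.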

Architecture

Overview of LayerAnimate. LayerAnimate establishes a layer-level control architecture for animation generation. It enables the flexible composition of control signals at the layer level, allowing for injecting distinct conditions (e.g., motion scores, trajectories, and sketches) for different layers. For simplicity, the text and image injection branches are omitted from the core architecture schematic.
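To make the idea of composing distinct conditions per layer concrete, the following sketch describes each layer with a mask and optional control signals, then rasterizes the per-layer motion scores into a single map. It is only a minimal sketch of one possible data layout: the `LayerControl` class, its field names, and the helper functions are hypothetical and do not reflect the actual LayerAnimate code or model inputs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

CONTROL_TYPES = ("motion_score", "trajectory", "sketch")


@dataclass
class LayerControl:
    """Hypothetical container: one layer mask plus whichever controls the user supplies."""
    name: str
    mask: List[List[int]]                               # 2D grid of 0/1
    motion_score: Optional[float] = None                # scalar, constant over the clip
    trajectory: Optional[List[Tuple[int, int]]] = None  # (x, y) points over time
    sketch: Optional[List[List[int]]] = None            # 2D line-art grid


def active_controls(layers):
    """Report which control types each layer carries."""
    return {
        layer.name: [c for c in CONTROL_TYPES if getattr(layer, c) is not None]
        for layer in layers
    }


def compose_motion_map(layers, height, width, default=0.0):
    """Paint each layer's motion score into its mask; later layers overwrite earlier ones."""
    out = [[default] * width for _ in range(height)]
    for layer in layers:
        if layer.motion_score is None:
            continue  # this layer is driven by another control type
        for y in range(height):
            for x in range(width):
                if layer.mask[y][x]:
                    out[y][x] = layer.motion_score
    return out


# Toy usage on a 1x3 canvas: a background with a motion score, a character with a trajectory.
background = LayerControl("background", mask=[[1, 1, 0]], motion_score=0.5)
character = LayerControl("character", mask=[[0, 1, 1]], trajectory=[(1, 0), (2, 0)])
controls = active_controls([background, character])
motion_map = compose_motion_map([background, character], height=1, width=3)
```

Keeping every control optional per layer mirrors the composition described above, where different layers may receive different conditions within the same clip.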

Comparison

Qualitative comparison with competing methods. We select several clips that exemplify representative characteristics of animation, including particle effects in Image-to-Video, a knife appearing off-screen in Image-to-Video with Sketch, and an unconventional fade-in visual style in Interpolation with Sketch.

Composite Control

Composite Control. LayerAnimate provides multiple user-friendly control options at the layer level, which can be combined into a composite control scheme.

More Examples

BibTeX

@article{yang2025layeranimate,
  author    = {Yang, Yuxue and Fan, Lue and Lin, Zuzeng and Wang, Feng and Zhang, Zhaoxiang},
  title     = {LayerAnimate: Layer-level Control for Animation},
  journal   = {arXiv preprint arXiv:2501.08295},
  year      = {2025},
}

Acknowledgements

We sincerely thank the authors of ToonCrafter, LVCD, and AniDoc for their inspiring work and contributions to the animation community.