TwinWeaver

License Python 3.8+ GitHub Repo

TwinWeaver is a framework for building LLM-based patient digital twins from longitudinal data. It serializes a structured patient history (demographics, labs, treatments, and genetics) into a single, human-readable text prompt, so that one large language model (LLM) can both forecast continuous biomarkers and predict discrete clinical events.


Get Started

  • Installation


    Install TwinWeaver with pip in seconds

    Installation Guide

  • Tutorials


    Step-by-step notebooks to learn TwinWeaver

    View Tutorials

  • Quick Start


    Minimal code example for experienced users

    Quick Start

  • Dataset Format


    Understand the expected data structure

    Dataset Format

  • Data Splitting


    How patient timelines become training examples

    Data Splitting

  • Pro Tips


    Debugging, scaling, and optimization advice

    Pro Tips

  • API Index


    Quick reference to all classes and functions

    API Index


Why TwinWeaver?

TwinWeaver addresses the challenge of modeling sparse, multi-modal clinical time series by leveraging the generative capabilities of LLMs:

  • Text Serialization: Transforms multi-modal inputs into structured textual representations
  • Unified Tasks: Supports both time-series forecasting and landmark event prediction
  • Flexible Horizons: Avoids overfitting to specific canonical time points
  • MEDS Integration: Integrates existing MEDS (Medical Event Data Standard) datasets
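
To make the text-serialization idea concrete, here is a minimal sketch in plain Python. It does not use the TwinWeaver API: the `Event` class, the `serialize_history` function, and the prompt layout are hypothetical names and choices made up for illustration, and the real framework may format prompts differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Event:
    """One timestamped entry in a patient history (hypothetical structure)."""
    day: date
    category: str  # e.g. "lab", "treatment", "genetics"
    name: str
    value: str


def serialize_history(demographics: Dict[str, object], events: List[Event]) -> str:
    """Render demographics and events as one chronological text block.

    Dates are converted to day offsets from the earliest event, so the
    text carries relative timing rather than calendar dates.
    """
    lines = ["Patient demographics:"]
    for key, val in demographics.items():
        lines.append(f"- {key}: {val}")

    lines.append("History:")
    if events:
        start = min(e.day for e in events)
        for e in sorted(events, key=lambda ev: ev.day):
            offset = (e.day - start).days
            lines.append(f"Day {offset} [{e.category}] {e.name}: {e.value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative, made-up values; not real patient data.
    history = [
        Event(date(2024, 1, 10), "lab", "hemoglobin", "11.2 g/dL"),
        Event(date(2024, 1, 1), "treatment", "carboplatin", "started"),
    ]
    print(serialize_history({"age": 61, "sex": "F"}, history))
```

In this sketch the events are sorted by date before rendering, so the treatment start appears first as `Day 0 [treatment] carboplatin: started` and the lab value follows on day 9. A forecasting question or an event-prediction question could then be appended to the same text, which is what lets a single model serve both task types.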

Learn more about the framework


GDT is a pan-cancer model instantiated using TwinWeaver and trained on over 93,000 patients across 20 cancer types. It achieves a median mean absolute scaled error (MASE) of 0.87 for forecasting and an average concordance index (C-index) of 0.703 for risk stratification.
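
For readers unfamiliar with the two metrics, the sketch below computes them in their common textbook forms: MASE as the forecast's mean absolute error divided by the mean absolute error of a one-step naive forecast on the history, and Harrell's C-index as the fraction of comparable patient pairs ranked correctly by risk. These definitions are assumptions here. The exact scaling and tie handling used to evaluate GDT may differ, and the function names are illustrative only.

```python
from typing import Sequence


def mase(history: Sequence[float], actual: Sequence[float],
         forecast: Sequence[float]) -> float:
    """Mean absolute scaled error against a one-step naive baseline."""
    if len(history) < 2 or len(actual) != len(forecast) or not actual:
        raise ValueError("need >= 2 history points and equal-length, non-empty targets")
    # Scale: mean absolute change between consecutive historical values.
    naive_errors = [abs(history[i] - history[i - 1]) for i in range(1, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("history is constant, so MASE is undefined")
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / scale


def c_index(times: Sequence[float], events: Sequence[bool],
            risks: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before time j. It is concordant when i has the higher risk.
    Tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under these definitions, a MASE below 1 means the forecast beats the naive baseline on average, and a C-index of 0.5 corresponds to random ranking while 1.0 is a perfect ranking.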

GDT Repository


Citation

If you use TwinWeaver in your research, please cite our paper:

@misc{makarov2026twinweaver,
      title={TwinWeaver: An LLM-Based Foundation Model Framework for Pan-Cancer Digital Twins},
      author={Nikita Makarov and Maria Bordukova and Lena Voith von Voithenberg and Estrella Pivel-Villanueva and Sabrina Mielke and Jonathan Wickes and Hanchen Wang and Mingyu Derek Ma and Keunwoo Choi and Kyunghyun Cho and Stephen Ra and Raul Rodriguez-Esteban and Fabian Schmich and Michael Menden},
      year={2026},
      eprint={2601.20906},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2601.20906},
}