TwinWeaver Examples

This directory contains examples demonstrating how to use TwinWeaver for various tasks including data preparation, inference, and fine-tuning.

Data Preprocessing

  • data_preprocessing/raw_data_preprocessing.ipynb: Start here if you have raw clinical data. Shows how to transform raw EHR exports into the three TwinWeaver dataframes (df_events, df_constant, df_constant_description), including handling death events and other time-to-event outcomes.

Basic Examples

Advanced Examples

Located in the advanced/ directory, these examples cover more specific use cases.

Custom Splitting (advanced/custom_splitting/)

Custom Output (advanced/custom_output/)

  • customizing_text_generation.ipynb: A comprehensive tutorial on customizing every textual component of the instruction generation pipeline, including preambles, event formatting, time units, genetic data tags, forecasting prompts, and more.
  • custom_summarized_row.ipynb: Shows how to customize the summarized row section of the instruction prompt using set_custom_summarized_row_fn(). Includes minimal and advanced examples, plus error handling guidance.

Pretraining (advanced/pretraining/)

TTE Probability Inference (advanced/tte_inference/)

  • tte_probability_inference.ipynb: Demonstrates how to estimate probabilities for time-to-event outcomes (e.g., death, disease progression) using a fine-tuned LLM served via vLLM. For each patient, the notebook scores three mutually exclusive completions, length-normalises each completion's log-probability, and applies a softmax across the three scores to obtain probabilities. Includes evaluation across multiple time horizons. Requires a fine-tuned model and a GPU with enough memory for vLLM.
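
The scoring step described above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the function names and the three completion labels are hypothetical, and the per-token log-probabilities are made-up numbers standing in for what a served model would return. Length normalisation is assumed here to mean the mean log-probability per token.

```python
import math


def length_normalised_score(token_logprobs):
    """Mean log-probability per token (assumed form of length normalisation)."""
    if not token_logprobs:
        raise ValueError("a completion needs at least one token log-probability")
    return sum(token_logprobs) / len(token_logprobs)


def completion_probabilities(logprobs_by_completion):
    """Softmax over the length-normalised scores of mutually exclusive completions."""
    scores = {
        label: length_normalised_score(logprobs)
        for label, logprobs in logprobs_by_completion.items()
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exps = {label: math.exp(score - max_score) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Hypothetical labels and made-up token log-probabilities for one patient.
example = {
    "event_occurred": [-0.2, -0.4],
    "event_not_occurred": [-1.0, -1.2, -1.1],
    "censored": [-2.0, -2.5],
}

probs = completion_probabilities(example)
for label, p in probs.items():
    print(f"{label}: {p:.3f}")
```

Averaging per token keeps a longer completion from being penalised simply for containing more tokens; summing raw log-probabilities instead would favour the shortest completion.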

Integrations

Located in the integrations/ directory.

Data

  • example_data/: Contains the generator script and sample CSV files (events, constants, etc.) used by the examples.