BoneCoT

A whole-body skeleton foundation model for bone metastases, guided by clinician-derived chain of thought and validated across multiple centers.

Hui Zhao1,*,#, Ruipeng Zhang2,*, Zhiyu Wang1,*, Yifeng Gu2, Shengyuan Xu3, Sheng Wang4,#, Yuehua Li2,#

  1. Metastatic Bone Tumor Clinical Center, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
  2. Institute of Diagnostic and Interventional Radiology, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
  3. Mailman School of Public Health, Columbia University, New York, NY, USA
  4. Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA

*These authors contributed equally: Hui Zhao, Ruipeng Zhang, Zhiyu Wang

#Email: zhao-hui@sjtu.edu.cn; swang@cs.washington.edu; liyuehua77@sjtu.edu.cn

Journal: Nature Biomedical Engineering
DOI: 10.1038/s41551-026-01736-1
Model: BoneFM on Hugging Face
Input focus: Bone-window CT, WL 300 / WW 1500
Figure 1d. BoneCoT model training and clinician-guided reasoning inference.

Method Overview

BoneCoT separates representation learning, task-specific modelling, and clinician-derived dependency reasoning so the public release can be used as a code and model reference without exposing private clinical data.

01

Bone-window CT input

CT images are rendered with a bone window (window level 300 HU, window width 1500 HU) before being loaded by the public inference code.

02

BoneFM backbone

BoneFM provides skeleton-focused visual representations for downstream bone lesion and metastasis tasks.

03

Task-specific heads

Local users can connect their own de-identified manifests and task checkpoints through the released configuration templates.

04

Clinician-derived CoT

BoneCoT uses task dependencies curated from clinical reasoning to support downstream inference for bone-related disease assessment.
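One way to picture dependency-guided inference is to run the task heads in an order that respects the curated dependencies, so each task can see the outputs of the tasks it depends on. The sketch below is illustrative only and is not the released BoneCoT implementation: the task names loosely follow the Figure 1c categories, while the edges, the `TASK_DEPENDENCIES` map, and the `predict` callback are hypothetical stand-ins.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: task -> tasks whose outputs it consumes.
# The edges are invented for illustration and are not the published graph.
TASK_DEPENDENCIES = {
    "image_quality": set(),
    "bone_metastasis": {"image_quality"},
    "tumour_type": {"bone_metastasis"},
    "biomarkers": {"tumour_type"},
    "complications": {"bone_metastasis"},
}


def inference_order(dependencies):
    """Return tasks ordered so every task follows the tasks it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def run_chain(dependencies, predict):
    """Run tasks in dependency order, passing upstream outputs to each task.

    `predict(task, upstream)` is a caller-supplied stand-in for a task head.
    """
    outputs = {}
    for task in inference_order(dependencies):
        upstream = {dep: outputs[dep] for dep in sorted(dependencies[task])}
        outputs[task] = predict(task, upstream)
    return outputs
```

A cyclic dependency map would make `TopologicalSorter` raise `graphlib.CycleError`, which is a convenient check that a hand-curated graph is still a valid ordering.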

Supplementary model framework. BoneFM extracts image features from bone-window CT images; BoneCoT adds task-related hidden features before classification.

Bone-window preprocessing

The public inference path assumes that raw CT data have already been converted into de-identified image slices. The key public convention is the bone-window normalization, which is applied before the slices are loaded with PIL and passed through ImageNet-style tensor normalization.

Window level: 300
Window width: 1500
Mapping: clip((HU - (WL - WW / 2)) / WW, 0, 1)
Default crop: 518
Model path: finetune/checkpoints/BoneFM.pth
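The mapping above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the repository's preprocessing code: the function names are hypothetical, the crop is assumed to be a center crop, the grayscale slice is assumed to be replicated to three channels, and the intermediate 8-bit image that would normally be loaded with PIL is skipped. The mean and standard deviation are the standard ImageNet channel statistics.

```python
import numpy as np

WINDOW_LEVEL = 300.0   # WL, from the parameters above
WINDOW_WIDTH = 1500.0  # WW, from the parameters above
CROP_SIZE = 518        # default crop, from the parameters above

# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def bone_window(hu, level=WINDOW_LEVEL, width=WINDOW_WIDTH):
    """Map Hounsfield units to [0, 1]: clip((HU - (WL - WW / 2)) / WW, 0, 1)."""
    hu = np.asarray(hu, dtype=np.float32)
    lower = level - width / 2.0
    return np.clip((hu - lower) / width, 0.0, 1.0)


def center_crop(image, size=CROP_SIZE):
    """Take the central size x size region of an (H, W) array (assumed crop style)."""
    height, width = image.shape
    if height < size or width < size:
        raise ValueError(f"image {image.shape} is smaller than crop size {size}")
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size]


def to_model_input(hu_slice):
    """Window, crop, replicate to 3 channels, then apply ImageNet normalization."""
    windowed = center_crop(bone_window(hu_slice))
    rgb = np.stack([windowed] * 3, axis=0)  # shape (3, H, W), gray replicated
    return (rgb - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
```

With WL 300 and WW 1500 the window spans -450 HU to 1050 HU, so `bone_window([-450, 300, 1050])` gives 0, 0.5, and 1, and values outside that range are clipped.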

Public Release Boundary

The repository is code-first: large model weights are hosted on Hugging Face, and sensitive assets stay outside the public documentation.

Included

  • BoneFM and BoneCoT model/evaluation code.
  • Inference notebooks with private-path placeholders.
  • Configuration templates for local data.
  • BoneFM model card for Hugging Face.
  • This GitHub Pages project page.

Not included

  • Training, validation, or test datasets.
  • Patient-level clinical metadata or annotations.
  • Reviewer-only reproduction packages.
  • Internal training launch recipes and logs.
  • Task-specific fine-tuned checkpoints unless separately released.

Selected Figures

This page intentionally shows only the core orientation artwork and the Fig. 2a-b summary panels. Additional experimental result figures are kept out of the public landing page.

Figure 1a. Multi-center data growth by modality, 2013 to 2023.
Figure 1b. Skeletal distribution across normal, primary, and metastatic cohorts.
Figure 1c. Clinician-derived task relation graph for BoneCoT reasoning, covering bone metastasis diagnosis, quality classification, tumour classification, biomarkers, and complications.
Figure 2a. Task-level comparison of BoneCoT, DINOv2, and Merlin across bone-related tasks (radar plot).
Figure 2b. Task association structure measured by Phi coefficients among bone-related diagnostic and clinical tasks (heatmap).
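For readers unfamiliar with the statistic in Figure 2b, the Phi coefficient measures the association between two binary variables from their 2x2 contingency table. The sketch below shows the textbook formula in plain Python; the function name and the toy labels are illustrative and do not come from the study data.

```python
import math


def phi_coefficient(a, b):
    """Phi coefficient between two equal-length sequences of 0/1 labels."""
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(a, b))
    n11 = sum(1 for x, y in pairs if x == 1 and y == 1)
    n10 = sum(1 for x, y in pairs if x == 1 and y == 0)
    n01 = sum(1 for x, y in pairs if x == 0 and y == 1)
    n00 = sum(1 for x, y in pairs if x == 0 and y == 0)
    # Product of the four marginal totals of the 2x2 table.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # undefined when either variable is constant
    return (n11 * n00 - n10 * n01) / denom
```

Two identical label vectors give 1, perfectly opposed vectors give -1, and a balanced table with no association gives 0.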

Citation

@article{bonecot2026,
  title = {BoneCoT: Multi-center validation of a whole-body skeleton foundation model for bone metastases guided by clinician-derived chain of thought},
  author = {Zhao, Hui and Zhang, Ruipeng and Wang, Zhiyu and Gu, Yifeng and Xu, Shengyuan and Wang, Sheng and Li, Yuehua},
  journal = {Nature Biomedical Engineering},
  year = {2026},
  doi = {10.1038/s41551-026-01736-1}
}