This paper establishes a benchmark for 3D content-based image retrieval (CBIR) in medical imaging using the TotalSegmentator dataset. It evaluates supervised embeddings trained on medical images against self-supervised embeddings from non-medical datasets, testing retrieval at both organ and region levels. By introducing a late interaction re-ranking method inspired by text retrieval, the study achieves near-perfect recall across diverse anatomical structures. The results provide a much-needed benchmark and roadmap for future development of AI-powered medical image retrieval systems, enabling more reliable, precise, and efficient radiology workflows.

Medical Image Retrieval Needs a New Benchmark

:::info Authors:

(1) Farnaz Khun Jush, Bayer AG, Berlin, Germany (farnaz.khunjush@bayer.com);

(2) Steffen Vogler, Bayer AG, Berlin, Germany (steffen.vogler@bayer.com);

(3) Tuan Truong, Bayer AG, Berlin, Germany (tuan.truong@bayer.com);

(4) Matthias Lenga, Bayer AG, Berlin, Germany (matthias.lenga@bayer.com).

:::

Abstract and 1. Introduction

  2. Materials and Methods

    2.1 Vector Database and Indexing

    2.2 Feature Extractors

    2.3 Dataset and Pre-processing

    2.4 Search and Retrieval

    2.5 Re-ranking retrieval and evaluation

  3. Evaluation and 3.1 Search and Retrieval

    3.2 Re-ranking

  4. Discussion

    4.1 Dataset and 4.2 Re-ranking

    4.3 Embeddings

    4.4 Volume-based, Region-based and Localized Retrieval and 4.5 Localization-ratio

  5. Conclusion, Acknowledgement, and References

ABSTRACT

While content-based image retrieval (CBIR) has been extensively studied for natural images, its application to medical images presents ongoing challenges, primarily due to the 3D nature of medical images. Recent studies have shown the potential of pre-trained vision embeddings for CBIR in radiology. However, a benchmark for the retrieval of 3D volumetric medical images is still lacking, which hinders objective evaluation and comparison of proposed CBIR approaches in medical imaging. In this study, we extend previous work and establish a benchmark for region-based and localized multi-organ retrieval using the TotalSegmentator dataset (TS) with detailed multi-organ annotations. We benchmark embeddings from models pre-trained with supervision on medical images against embeddings from models pre-trained without supervision on non-medical images, for 29 coarse and 104 detailed anatomical structures at the volume and region levels. For volumetric image retrieval, we adopt a late interaction re-ranking method inspired by text matching. We compare it against the original method proposed for volume and region retrieval and achieve a retrieval recall of 1.0 for diverse anatomical regions with a wide size range. The findings and methodologies presented in this paper provide insights and benchmarks for further development and evaluation of CBIR approaches in medical imaging.

1 Introduction

In the realm of computer vision, content-based image retrieval (CBIR) has been the subject of extensive research for several decades [Dubey, 2021]. CBIR systems typically utilize low-dimensional image representations stored in a database and subsequently retrieve similar images based on distance metrics or similarity measures of the image representations. Early approaches to CBIR involved manually crafting distinctive features, which led to a semantic gap, resulting in the loss of crucial image details due to the limitations of low-dimensional feature design [Dubey, 2021, Wang et al., 2022]. However, recent studies in deep learning have redirected attention towards the creation of machine-generated discriminative feature spaces, effectively addressing and bridging this semantic gap [Qayyum et al., 2017]. This shift has significantly enhanced the potential for more accurate and efficient CBIR methods [Dubey, 2021].
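
To make the retrieval step described above concrete, the following is a minimal sketch of similarity-based lookup over stored image representations. It assumes cosine similarity as the similarity measure and a plain dictionary as the database; the embeddings, the image ids, and the function names `cosine_similarity` and `retrieve` are made up for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=2):
    """Return the ids of the k stored embeddings most similar to the query."""
    scored = [
        (cosine_similarity(query, embedding), image_id)
        for image_id, embedding in database.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}

top = retrieve([1.0, 0.02, 0.0], database, k=2)
print(top)
```

A real system would typically replace the exhaustive loop with an approximate nearest-neighbor index, but the ranking principle is the same.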

While natural image retrieval has been extensively researched, the application of retrieval frameworks to medical images, particularly radiology images, presents ongoing challenges. CBIR offers numerous advantages for medical images. Radiologists can use CBIR to search for similar cases and review their history, reports, diagnoses, and prognoses, which supports their decision-making. In real-world use cases, we often encounter huge anonymized and unannotated datasets from different studies or institutions in which the meta-information, such as DICOM header data, has been removed or is inconsistent. Manually searching for relevant images in such databases is extremely time-consuming. Moreover, the development of new tools and research in the medical field requires trustworthy data sources and therefore a reliable method for retrieving images, making CBIR an essential component in advancing computer-aided medical image analysis and diagnosis. One of the key challenges in applying standard CBIR techniques to medical images is that algorithms developed for natural images are typically designed for 2D images, while medical images are often 3D volumes, which adds a layer of complexity to the retrieval process.

Recent studies have proposed and demonstrated the potential use of pre-trained vision embeddings for CBIR in the context of radiology image retrieval [Khun Jush et al., 2023, Abacha et al., 2023, Denner et al., 2024, Truong et al., 2023]. However, these studies have primarily focused on 2D images [Denner et al., 2024] or specific pathologies or tasks [Abacha et al., 2023, Khun Jush et al., 2023, Truong et al., 2023], overlooking the presence of multiple organs in the volumetric images, which is a critical aspect of real-world scenarios. Large multi-organ medical image datasets can be leveraged to thoroughly evaluate the efficacy of the proposed methods, enabling a more comprehensive assessment of CBIR approaches for radiology images. Despite previous efforts, there is still no established benchmark available for comparing methods for the retrieval of 3D volumetric medical images. This absence of a benchmark impedes the ability to objectively evaluate and compare the efficiency of the proposed CBIR approaches in the context of medical imaging.

Our previous work [Khun Jush et al., 2023] demonstrated the potential of pre-trained embeddings, originally trained on natural images, for various medical image retrieval tasks using the Medical Segmentation Decathlon Challenge (MSD) dataset [Antonelli et al., 2022]. The approach is outlined in Figure 1. Building upon this, the current study extends the methodology proposed in Khun Jush et al. [2023] to establish a benchmark for anatomical region-based and localized multi-organ retrieval. Khun Jush et al. [2023] focused on evaluating the feasibility of using 2D embeddings and on benchmarking different strategies for aggregating 2D information for 3D medical image retrieval on the single-organ MSD dataset [Antonelli et al., 2022]; that work observed that single-organ labeling hinders evaluation on images containing multiple organs. The main objective of this study is to set a benchmark for organ retrieval at the localized level, which is particularly valuable in practical scenarios, such as when users zoom in on specific regions of interest to retrieve similar images of the precise organ under examination. To achieve this, we evaluate a count-based method in regions using the TotalSegmentator dataset (TS) [Wasserthal et al., 2023]. The TS dataset, with its detailed multi-organ annotations, is a valuable resource for medical image analysis and research. It provides comprehensive annotations for 104 organs or anatomical structures, which allow us to derive fine-grained retrieval tasks and comprehensively evaluate the proposed methods.

The contributions of this work are as follows:

• We benchmarked 2D embeddings pre-trained with supervision on medical images against embeddings pre-trained with self-supervision on non-medical images for 3D radiology image retrieval. We used a count-based method to aggregate slice-level similarity search results into volume-level retrieval results.

• We proposed evaluation schemes based on the TotalSegmentator dataset [Wasserthal et al., 2023] for 29 aggregated coarse anatomical regions and all 104 original anatomical regions. The proposed evaluation assesses the capabilities of a 3D image search system at different levels, including a fine-grained measure related to the localization of anatomical regions.

• We adopted ColBERT [Khattab and Zaharia, 2020], a late interaction re-ranking method originally used for text retrieval, for volumetric image retrieval. For a 3D image query, this two-stage method first generates a candidate list of 3D images using a fast slice-wise similarity search and count-based aggregation. In the second stage, the full similarity information between all query and candidate slices is aggregated to determine re-ranking scores.

• We benchmarked the proposed re-ranking method against the original method proposed in Khun Jush et al. [2023] for volume, region, and localized retrieval on 29 modified coarse anatomical regions and 104 original anatomical regions from the TS dataset [Wasserthal et al., 2023].
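
The two-stage retrieval named in the contributions can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: stage 1 lets each query slice vote for the volume that owns its single most similar slice (a top-1 voting rule assumed here), and stage 2 scores each candidate with the sum-of-maxima operator used by ColBERT for text. The exhaustive search, the 2D toy embeddings, and the names `count_based_candidates`, `late_interaction_score`, and `rerank` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def count_based_candidates(query_slices, db_volumes, n_candidates=2):
    """Stage 1: every query slice votes for the volume holding its best-matching slice."""
    votes = Counter()
    for q in query_slices:
        best_volume = max(
            db_volumes,
            key=lambda vid: max(cosine(q, s) for s in db_volumes[vid]),
        )
        votes[best_volume] += 1
    # Volumes with the most votes come first.
    return [vid for vid, _ in votes.most_common(n_candidates)]


def late_interaction_score(query_slices, candidate_slices):
    """Stage 2: for each query slice take its best match in the candidate, then sum."""
    return sum(max(cosine(q, s) for s in candidate_slices) for q in query_slices)


def rerank(query_slices, db_volumes, candidates):
    """Re-order stage-1 candidates by their late interaction score, best first."""
    return sorted(
        candidates,
        key=lambda vid: late_interaction_score(query_slices, db_volumes[vid]),
        reverse=True,
    )


# Toy database: volume id -> list of 2D slice embeddings (hypothetical values).
db_volumes = {
    "vol_a": [[1.0, 0.0]],
    "vol_b": [[0.9, 0.1], [0.0, 1.0]],
    "vol_c": [[-1.0, 0.0]],
}
query = [[1.0, 0.0], [1.0, 0.02], [0.0, 1.0]]

candidates = count_based_candidates(query, db_volumes, n_candidates=2)
reranked = rerank(query, db_volumes, candidates)
print(candidates, reranked)
```

In this toy case `vol_a` wins two of three votes by a narrow margin, so counting ranks it first, while `vol_b` matches every query slice well and moves to the top once the full slice-to-slice similarities are aggregated.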

:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
