Annotating regions of interest in medical images, a process known as segmentation, is often one of the first steps clinical researchers take when starting a new biomedical imaging study.
For instance, to determine how the size of the brain’s hippocampus changes as patients age, a researcher first outlines each hippocampus in a series of brain scans. For many structures and image types, this is a manual process that can be extremely time-consuming, especially if the regions being studied are challenging to delineate.
To streamline the process, MIT researchers developed an artificial intelligence-based system that enables a researcher to rapidly segment new biomedical imaging datasets by clicking, scribbling, and drawing boxes on the images. This new AI model uses these interactions to predict the segmentation.
As the user marks additional images, the number of interactions they need to perform decreases, eventually dropping to zero. The model can then segment each new image accurately without user input.
It can do this because the model’s architecture has been specially designed to use information from images it has already segmented to make new predictions.
Unlike other medical image segmentation models, this system allows the user to segment an entire dataset without repeating their work for each image.
In addition, the interactive tool does not require a presegmented image dataset for training, so users don’t need machine-learning expertise or extensive computational resources. They can use the system for a new segmentation task without retraining the model.
In the long run, this tool could accelerate studies of new treatment methods and reduce the cost of clinical trials and medical research. It could also be used by physicians to improve the efficiency of clinical applications, such as radiation treatment planning.
“Many scientists might only have time to segment a few images per day for their research because manual image segmentation is so time-consuming. Our hope is that this system will enable new science by allowing clinical researchers to conduct studies they were prohibited from doing before because of the lack of an efficient tool,” says Hallee Wong, an electrical engineering and computer science graduate student and lead author of a paper on this new tool.
She is joined on the paper by Jose Javier Gonzalez Ortiz PhD ’24; John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering; and senior author Adrian Dalca, an assistant professor at Harvard Medical School and MGH, and a research scientist in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the International Conference on Computer Vision.
Streamlining segmentation
Researchers typically use one of two methods to segment new sets of medical images. With interactive segmentation, they input an image into an AI system and use an interface to mark areas of interest. The model predicts the segmentation based on those interactions.
A tool previously developed by the MIT researchers, ScribblePrompt, allows users to do this, but they must repeat the process for each new image.
Another approach is to develop a task-specific AI model that segments the images automatically. This approach requires the user to manually segment hundreds of images to create a dataset, then use them to train a machine-learning model. Once trained, the model predicts the segmentation for each new image. But the user must start this complex, machine-learning-based process from scratch for every new task, and there is no way to correct the model if it makes a mistake.
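To make the repetition concrete, here is a minimal sketch of the purely interactive workflow, in which every image starts from zero. The helper names (`collect_user_marks`, `is_acceptable`) and the `model.predict` signature are hypothetical stand-ins for illustration, not the actual ScribblePrompt API.

```python
def segment_dataset_interactively(images, model):
    """Segment every image from scratch using only user interactions."""
    results = []
    for image in images:
        interactions = []  # clicks, scribbles, and boxes gathered so far
        prediction = None
        while not is_acceptable(prediction):  # hypothetical accept/reject check
            # The user marks areas of interest or corrects the last prediction.
            interactions.append(collect_user_marks(image, prediction))
            # The model sees only this image and its marks; nothing learned
            # from earlier images carries over to the next one.
            prediction = model.predict(image, interactions)
        results.append(prediction)
    return results
```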
This new system, MultiverSeg, combines the best of both approaches. It predicts a segmentation for a new image based on user interactions, like scribbles, but it also keeps each segmented image in a growing context set that it refers back to later.
When the user uploads a new image and marks areas of interest, the model draws on the examples in its context set to make a more accurate prediction, with less user input.
The researchers designed the model’s architecture to use a context set of any size, so the user doesn’t need to have a certain number of images. This gives MultiverSeg the flexibility to be used in a range of applications.
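Continuing the sketch above, with the same hypothetical helpers, a context set changes the loop in two ways: the model's first prediction is already conditioned on prior examples, and each finished segmentation is added to the set, so later images need fewer (and eventually zero) interactions. The three-argument `model.predict` signature is again an assumption for illustration, not MultiverSeg's actual interface.

```python
def segment_dataset_with_context(images, model):
    """Segment a dataset while accumulating a context set of finished examples."""
    context_set = []  # (image, mask) pairs; the model accepts any number
    results = []
    for image in images:
        interactions = []
        # First attempt a fully automatic prediction conditioned on the context.
        prediction = model.predict(image, interactions, context_set)
        while not is_acceptable(prediction):
            # The user steps in only when the prediction falls short; the
            # richer the context set, the fewer marks this loop needs.
            interactions.append(collect_user_marks(image, prediction))
            prediction = model.predict(image, interactions, context_set)
        # The finished segmentation joins the context set, so every later
        # image benefits from the work already done.
        context_set.append((image, prediction))
        results.append(prediction)
    return results
```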
“At some point, for many tasks, you shouldn’t need to provide any interactions. If you have enough examples in the context set, the model can accurately predict the segmentation on its own,” Wong says.
The researchers carefully engineered the model and trained it on a diverse collection of biomedical imaging data to ensure it could incrementally improve its predictions based on user input.
The user doesn’t need to retrain or customize the model for their data. To use MultiverSeg for a new task, they simply upload a new medical image and start marking it.
When the researchers compared MultiverSeg to state-of-the-art tools for in-context and interactive image segmentation, it outperformed each baseline.
Fewer clicks, better results
Unlike these other tools, MultiverSeg required less user input with each successive image. By the ninth new image, it needed only two clicks from the user to generate a segmentation more accurate than one produced by a model designed specifically for the task.
For some image types, like X-rays, the user might only need to segment one or two images manually before the model becomes accurate enough to make predictions on its own.
The tool’s interactivity also enables the user to make corrections to the model’s prediction, iterating until it reaches the desired level of accuracy. Compared with the researchers’ previous system, ScribblePrompt, MultiverSeg reached 90 percent accuracy with roughly two-thirds the number of scribbles and three-quarters the number of clicks.
“With MultiverSeg, users can always provide more interactions to refine the AI predictions. This still dramatically accelerates the process because it is usually faster to correct something that exists than to start from scratch,” Wong says.
Moving forward, the researchers want to test this tool in real-world situations with clinical collaborators and improve it based on user feedback. They also want to enable MultiverSeg to segment 3D biomedical images.
This work is supported, in part, by Quanta Computer, Inc. and the National Institutes of Health, with hardware support from the Massachusetts Life Sciences Center.