Keeping Surgical AI Human-Centered With Our Labeling Manager Shauna Otto

Clara Scholes
March 11, 2026
Shauna Otto

Computers are stupid.

You read that correctly. Computers are, in fact, really, really stupid. But that’s okay, because they have us.

Humans.

Surgical Data Science Collective (SDSC) takes extreme pride in the rigor and excellence of our surgical artificial intelligence models. Thanks to our incredible team members, we are at the forefront of the field when it comes to data privacy, security, and transparency, and surgical data science is quickly becoming the future of improved education, safety, and quality of surgery for patients worldwide. But despite all the hype (which is entirely deserved), there are fundamental elements of these high-performing AI models that do not get the recognition they deserve, especially considering that without them, the whole operation would fall apart.

So today we would like to bring your attention to a slightly less mentioned but extremely important part of the surgical data science world: data labeling.

If ML is an engine, labeling is the fuel. And for the engine to work, this fuel must be kept clean, consistent, and of the highest quality. To explore how this is achieved, we gathered insightful expertise from the wonderful woman who leads this operation at SDSC, our labeling manager, Shauna Otto, PhD:

“Machine learning is all well and good, it’s the future, it’s amazing. But a lot of the hype sort of obfuscates the fact that you need people to tell the computer what it’s seeing. AI is just the sum of aggregated intelligence from a whole bunch of people.”
Shauna in the laboratory during her PhD.

Why labeling is so important

It’s a simple reality: AI systems are only as good as the data they are trained on.

When we train a surgical tool detection model, for example, one that identifies scalpels, graspers, suction devices, etc., across a whole procedure, we need thousands upon thousands of accurately labeled examples. Each bounding box drawn around a tool, each timestamp, it all becomes part of the aggregated intelligence the model learns from.
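To make "labeled example" concrete, here is a minimal sketch of what one annotated frame might look like. The field names and tool classes are illustrative only, not SDSC's actual schema:

```python
# One labeled frame: bounding boxes plus class names tied to a moment
# in the procedure. Field names here are illustrative, not SDSC's schema.
annotation = {
    "video_id": "case_0042",
    "timestamp_s": 137.5,  # when in the procedure this frame occurs
    "boxes": [
        # (x_min, y_min, x_max, y_max) in pixels, plus the tool label
        {"bbox": (412, 230, 566, 311), "label": "scalpel"},
        {"bbox": (120, 455, 240, 602), "label": "grasper"},
    ],
}

def box_area(bbox):
    """Area of an axis-aligned box; a basic sanity check on labels."""
    x0, y0, x1, y1 = bbox
    return max(0, x1 - x0) * max(0, y1 - y0)

# Degenerate (zero-area) boxes are a common labeling mistake worth catching.
for box in annotation["boxes"]:
    assert box_area(box["bbox"]) > 0, "degenerate box"
```

Thousands of records like this, each one drawn and checked by a person, are what the model actually learns from.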

The difficulty is that even if a model has been shown 2,000 correctly labeled scalpels, give it one curette labeled “scalpel” and it will learn that mistake too. As Shauna puts it: “Garbage in, garbage out.”

Computers don’t know when they’ve made a mistake, and if left alone, AI can get into a doom cycle of eating its own tail. It will happily absorb outliers and inconsistencies, leaving systems to gradually drift away from ground truth. 

In surgery, that’s not acceptable, and it makes human oversight absolutely imperative.

No matter how strong a model becomes, human-verified data will always be essential.

Humans define the outputs. Humans define the ontology. Humans interpret what the model’s predictions mean in practice. We previously discussed with Dr. Sandeep Nayak that AI will never replace doctors because clinical judgement cannot be automated away. Similarly, in surgical AI, labeling and oversight will remain permanent infrastructure.

The link between multiple worlds

Shauna acts as a go-between for multiple teams within SDSC:

  • The product team, who define what problems we’re solving (e.g. our new outcomes feature on SVP).
  • The ML team, who need structured, high-quality label-image pairs to train models.
  • The labeling team, who manually annotate thousands of surgical frames.

Before this role existed, individual members of the ML team had to source their own labels. Now, with Shauna’s meticulous, strategic work, we have one workflow, consistent formatting, and cohesive communication with the labeling team, all of which makes the work far more scalable and brings both operational efficiency and quality assurance to our AI models.

Take our sampling strategy as an example. A surgical video can run over six hours, and a naive strategy that samples one frame per second would generate 20,000+ frames, many of which would not be informative or useful: the camera may have left the body, the image may be blurry, or no tools may be present. By working closely with the ML team, Shauna has designed a smarter strategy that minimizes wasted labeling effort. Together, they built a tool that scans relevant videos to surface the most “interesting” frames, those most likely to contain informative content, which can then be approved for labeling.
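One simple, generic proxy for "interesting" is image sharpness: blurry or featureless frames (camera out of the body, lens fogged) have weak gradients. Here is a sketch of that idea using NumPy; this is a common heuristic, not a description of SDSC's actual tool, and the threshold and stride values are placeholders:

```python
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Variance of the vertical/horizontal gradients of a grayscale frame.
    Low values suggest a blurry or featureless image."""
    gy, gx = np.gradient(frame.astype(float))
    return float(np.var(gx) + np.var(gy))

def select_frames(frames, threshold=5.0, stride=30):
    """Subsample (e.g. one frame per second at 30 fps), then keep only
    the indices of frames that pass the sharpness threshold."""
    keep = []
    for i in range(0, len(frames), stride):
        if sharpness(frames[i]) > threshold:
            keep.append(i)
    return keep
```

A filter like this can discard a large share of uninformative frames before any human ever sees them, which is exactly the kind of saving that matters at 20,000+ frames per video.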

Looking ahead, we’re moving towards more productive approaches like active learning. Simpler models already run on surgical video to tell us exactly when the camera is looking at the surgical field and when it is out of frame. We could also begin to ask a model which frames it is uncertain about, and what it would find most useful to label next. It’s a bit like asking the algorithm: “if you could label any of these frames to make your job easier, which would you like labeled?”
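That question has a standard implementation called uncertainty sampling: score each frame by the entropy of the model's class probabilities and label the highest-entropy frames first. A minimal sketch in plain Python (the names and numbers are illustrative, not a real SDSC pipeline):

```python
import math

def entropy(probs):
    """Shannon entropy of a model's class probabilities for one frame.
    Higher entropy means the model is less sure what it's looking at."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(predictions, k=2):
    """Return indices of the k frames the model is least certain about,
    i.e. the frames it would most 'like labeled' next."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]),
                    reverse=True)
    return ranked[:k]
```

For example, a frame scored [0.34, 0.33, 0.33] across three tool classes has near-maximal entropy and would be queued for labeling ahead of a confident [0.98, 0.01, 0.01] frame.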

Other parts of Shauna’s role include reviewing labels with a fine-tooth comb, defining ontologies, and tracking trends in mistakes and feeding them back into training materials. For example, here are some fun animations Shauna made to help the labeling team distinguish between tools used in endoscopic endonasal surgical videos:

From left to right: straight forceps, rongeur, grasper.

Human quality control at scale

When you see a heatmap of tool presence on our analytics dashboard, or explore phase breakdowns of your procedure, you’re interacting with a system built on thousands of carefully reviewed decisions. Our labeling system is a core component of the quality assurance that keeps our AI top-notch – similar to how our partnership with X.Y. Han keeps our models heavily contextualized in surgery.

SDSC builds technology that is accountable to people. The future of surgical AI isn’t about replacing human clinicians, but rather empowering them to do more and better surgery, and that future depends on people like Shauna.

Coming soon: A deeper dive into the technicalities behind some of SDSC’s recent machine learning projects.

 
