When customer success (CS) teams talk about data, the conversation often gravitates toward dashboards, health scores, and the latest AI tooling. That focus is understandable, but it misses the point.
Buried inside those conversations, though, are more fundamental questions about how data is structured, connected, and maintained: questions that ultimately determine whether any of those tools produce something useful or just expensive noise.
Across several recent Customer Success Summits, four CS leaders explored challenges around AI adoption, data quality, support insights, and system integration. None of them set out to talk about “data architecture” explicitly.
And yet, taken together, their perspectives point to something more foundational: the architecture underneath the data is what shapes whether customer success teams can trust, interpret, and act on what they see.
The speakers were:
- Phillip Morris, Director of Strategy & Operations at WordPress VIP, who spoke at Customer Success Summit Denver 2025 about driving CS with strategic, data-backed insights.
- Shipra Nirola, Director of Customer Success (EMEA/AMER) at Evotix, who spoke at Customer Success Summit London 2025 as part of a panel on the next generation of CS with AI at its core.
- Anna Korzeniowska, Lead Value Engineer at Finastra, who also spoke at Customer Success Summit London 2025 as part of the same panel on AI and the future of CS.
- Ben Burkhalter, Senior Director of Customer Engineering at GitHub, who spoke at AI for Customer Success Summit 2025 about using AI for smarter feedback and ticket analysis.
Each approached the problem from a different angle – operations, AI adoption, unstructured data, and machine learning at scale. But together, their insights form a coherent view of what data architecture actually looks like in practice for customer success teams.
The problem with customer data integration
The first challenge is structural.
Most CS organizations operate across dozens of systems – Salesforce, Zendesk, Zoom, Gong, GitHub, Gainsight, and others – each with its own identifiers, data model, and operating logic. The issue is rarely whether data exists. It’s whether those systems can be joined together in a way that preserves meaning.
"They don't all have one single key that I can map everything together with," Phillip Morris explained. "So what we've gone through is actually added a key to everything, so that as the data gets pushed around, everything has a single key and we know that we're either mapping the right account or the right deal."
His team standardized on Salesforce Account ID and Opportunity ID, then pushed those identifiers across the rest of the stack. The logic is practical. Salesforce was already where people were working, so it became the reference point for identity.
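To make that concrete, here's a minimal sketch of how records from different tools can be joined once they all carry the same key. The field names and values are hypothetical, not WordPress VIP's actual schema:

```python
# Minimal sketch: joining records from several tools on a shared
# Salesforce Account ID. All field names and values are hypothetical.

# Each source system keeps its own native identifier, but every record
# also carries the Salesforce Account ID as the shared key.
zendesk_tickets = [
    {"zendesk_org_id": "z-101", "sf_account_id": "0015f00000ABC", "open_tickets": 4},
]
gainsight_scores = [
    {"gs_company_id": "gs-9", "sf_account_id": "0015f00000ABC", "health": "yellow"},
]

def join_on_account(*sources):
    """Merge records from multiple systems keyed on sf_account_id."""
    merged = {}
    for source in sources:
        for record in source:
            key = record["sf_account_id"]
            merged.setdefault(key, {}).update(record)
    return merged

accounts = join_on_account(zendesk_tickets, gainsight_scores)
print(accounts["0015f00000ABC"])
# {'zendesk_org_id': 'z-101', 'sf_account_id': '0015f00000ABC',
#  'open_tickets': 4, 'gs_company_id': 'gs-9', 'health': 'yellow'}
```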
But Phillip is equally clear about the limitation. Contact-level identity remains one of the hardest problems to solve. Different tools track different entities. Human behavior adds another layer of messiness.
"There's no way to map those," he said. "This becomes some of the complications that you have to go with."
That point is easy to gloss over, but it shouldn't be: it matters enormously in practice. The theoretical elegance of a universal key runs into messy reality the moment human beings start using multiple identities across multiple systems, with inconsistent emails, logins, or account relationships.

Data quality and measurement
Phillip’s argument is broader than data cleanliness alone. He spends significant time on measurement error, representation error, and the risks of built-in bias – particularly when teams assume they already know how to segment customers, interpret sentiment, or define adoption. In his framing, the issue isn’t just whether data is connected, but whether teams are measuring the right thing in the first place.
A dataset can be reliable and still be invalid. It can be stable, standardized, and neatly joined together while still answering the wrong question.
That’s why Phillip keeps returning to the need for testing assumptions. Have teams regression-tested the customer groupings they rely on? Have they validated that a metric actually reflects the behavior they think it reflects? Are they measuring what matters, or simply what is easiest to observe?
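One lightweight way to pressure-test a metric is to check it against the outcome it's supposed to predict. This sketch is purely illustrative, with invented numbers rather than anything from Phillip's talk:

```python
import numpy as np

# Hypothetical validity check: does an "adoption score" actually track
# the behavior we care about (renewal), or is it just easy to observe?
adoption_score = np.array([0.9, 0.7, 0.8, 0.3, 0.4, 0.6, 0.2, 0.85])
renewed = np.array([1, 1, 1, 0, 0, 1, 0, 1])  # 1 = account renewed

# A correlation near zero would suggest the metric is measuring
# convenience, not the behavior it claims to reflect.
r = np.corrcoef(adoption_score, renewed)[0, 1]
print(f"correlation between adoption score and renewal: {r:.2f}")
```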
This is an important correction to the way “data architecture” is often discussed. The problem is not only technical. It’s analytical.
AI data needs to be consistent
If Phillip brought rigor around validity and bias, Shipra Nirola added a useful dose of pragmatism to the conversation.
A common assumption in customer success is that data needs to be complete and pristine before AI becomes useful. Her perspective is more grounded than that.
"That's the biggest myth," Shipra said during a panel discussion on AI-first customer success. "In fact, it's quite the opposite. If your data is wrong and you put that through AI to the trial, you would quite clearly realize your data is wrong. We don’t usually delay starting AI because we think the data isn’t ready."
The point is not that bad data is acceptable; it's that waiting for perfection often delays learning. What matters earlier is consistency, structure, and intentional scoping.
Shipra’s advice is to start with a constrained set of inputs and a clear use case. That might mean product usage basics, ticket volume, ticket age, or a small set of lifecycle signals. Not because those inputs tell the whole story, but because they provide enough structure to begin testing what AI can and cannot do well.
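In practice, "constrained" can be as literal as a small, explicit input schema. The sketch below is one hypothetical way to scope it; the fields are illustrative, not Shipra's actual inputs:

```python
from dataclasses import dataclass

# Hypothetical, deliberately constrained input schema for a first AI
# use case: a handful of usage and support signals, nothing more.
@dataclass
class AccountSignals:
    sf_account_id: str
    weekly_active_users: int      # product usage basics
    open_ticket_count: int        # ticket volume
    oldest_ticket_age_days: int   # ticket age
    lifecycle_stage: str          # e.g. "onboarding", "adoption", "renewal"

signals = AccountSignals(
    sf_account_id="0015f00000ABC",
    weekly_active_users=42,
    open_ticket_count=3,
    oldest_ticket_age_days=11,
    lifecycle_stage="adoption",
)
```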
Her warning is equally clear. If teams try to force every tool, every CRM, and every signal into the system at once, the result becomes noisy and increasingly unhelpful.
Unstructured customer data
Anna Korzeniowska pushed the conversation further in the same panel discussion.
Structured data – health scores, NPS, time to onboard – remains important. But in her view, many CS teams are sitting on what is effectively an underused layer of insight in CRM notes, meeting summaries, and open-text fields.
"We did thematic and semantic analysis on the output from several systems to see what patterns emerge," she explained. "We created clusters and basically made a systematic system to pick up on those patterns and start tracking over time."
That matters because it introduces a richer kind of context. Not just what happened, but how customers are talking about what happened. Not just ticket volume, but the influence, sentiment, friction, urgency, and buying intent that can be inferred from the surrounding language.
Anna’s point isn’t simply that unstructured data is useful. It’s that, when treated systematically, it can make outputs far more actionable than structured metrics alone.

Customer data insights from GitHub
Ben Burkhalter showed what this looks like when taken further during his session at the AI for Customer Success Summit 2025.
His team built an internal machine learning intelligence engine that ingests support tickets, escalations, community data, and other sources, then surfaces insights in ways that product, engineering, customer success, and leadership can each use.
The origin story is familiar. At first, the assumption was that teams lacked data. The reality was different.
"The insights lived across all kinds of different areas," he said. "There wasn't an easy way to correlate. There was no context. Every team had its own lens of the data, and every team needed to invent its own views."
The constraint was fragmentation, not scarcity.
Ben’s contribution to this discussion is important because he reframes the problem. The goal isn’t simply to centralize data. It’s to surface patterns in a way that makes sense to different decision-makers, each of whom has a different business lens.
That is why flexibility matters. One team may care about escalations, another about ARR impact, and a third about documentation gaps. The architecture has to support that contextualization.
Feedback signals from support tickets
Ben also made a point that is easy to miss and highly relevant for CS leaders trying to operationalize feedback loops.
A support ticket is not just a single data point. Think about the psychology of a customer complaint: customers rarely raise one issue in isolation.
A frustrated customer wants to get everything off their chest in one message. Ben points out that a single ticket can contain several problems worth addressing quickly: product friction, documentation failure, account risk, feature feedback, and signs of broader dissatisfaction all at once.
Traditional categorization systems tend to force that interaction into a single bucket. Machine learning allows those signals to be surfaced simultaneously.
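Ben didn't walk through GitHub's implementation, but the core idea, letting one ticket light up several labels at once, can be sketched with a standard multi-label classifier. Everything below, labels included, is invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Invented training tickets, each tagged with *multiple* signals.
tickets = [
    "Export keeps failing and the docs on exports are wrong",
    "Billing page errors out, we are considering other vendors",
    "Setup guide is missing steps for the new integration",
    "App crashes on upload and support has not responded in days",
]
labels = [
    ["product_friction", "documentation_failure"],
    ["product_friction", "account_risk"],
    ["documentation_failure", "feature_feedback"],
    ["product_friction", "account_risk"],
]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)

# One binary classifier per signal, so a single ticket can surface
# several categories at the same time instead of one bucket.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(tickets, y)

pred = model.predict(["The upload crashes and the setup docs are outdated"])
print(mlb.inverse_transform(pred))
```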
Of course, that doesn’t mean human judgment disappears altogether. It means the architecture becomes capable of exposing more of what is already there.
Or, to put it more simply: the data was always speaking. The challenge was hearing all of it at once.

Data governance in customer success
Architecture doesn’t maintain itself.
Phillip Morris is particularly direct on this point. At WordPress VIP, his team runs weekly data quality checks to surface missing fields, broken mappings, and anomalies across systems. The rationale is hard to argue with: in any environment where multiple teams touch shared data, drift is inevitable.
"We do have a RevOps team that actually controls some of their revenue fields," he explained. "And when they make changes without telling us – particularly to fields mapped into tools like Gainsight – things break.”
The response isn't glamorous, but it is essential (a sketch of what simple validation rules can look like follows this list):
- Regular cross-functional alignment
- Clear ownership of fields
- Simple validation rules
- Constrained input structures where possible
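A first pass at those checks doesn't need much machinery. Here's a hypothetical sketch using pandas; the fields and rules are illustrative, not WordPress VIP's actual checks:

```python
import pandas as pd

# Hypothetical account export; in practice this would come from the CRM.
accounts = pd.DataFrame([
    {"sf_account_id": "0015f00000ABC", "arr": 120000, "csm_owner": "dana"},
    {"sf_account_id": None, "arr": 80000, "csm_owner": "lee"},
    {"sf_account_id": "0015f00000DEF", "arr": -5, "csm_owner": None},
])

# Simple validation rules: required fields present, values in sane ranges.
issues = []
missing_key = accounts["sf_account_id"].isna()
if missing_key.any():
    issues.append(f"{missing_key.sum()} account(s) missing sf_account_id")
missing_owner = accounts["csm_owner"].isna()
if missing_owner.any():
    issues.append(f"{missing_owner.sum()} account(s) missing csm_owner")
bad_arr = accounts["arr"] < 0
if bad_arr.any():
    issues.append(f"{bad_arr.sum()} account(s) with negative ARR")

for issue in issues:
    print("DATA QUALITY:", issue)
```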
Phillip also emphasizes historical time series data. Point-in-time snapshots provide visibility. Historical data provides trend context. Without that context, even a stable signal is harder to interpret.
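One simple pattern for that is appending a dated snapshot each period instead of overwriting current values. Again, a hypothetical sketch rather than Phillip's actual setup:

```python
import pandas as pd

# Hypothetical weekly snapshots: same account, captured on different dates.
snapshots = pd.DataFrame([
    {"snapshot_date": "2025-11-03", "sf_account_id": "0015f00000ABC", "health": 72},
    {"snapshot_date": "2025-11-10", "sf_account_id": "0015f00000ABC", "health": 65},
    {"snapshot_date": "2025-11-17", "sf_account_id": "0015f00000ABC", "health": 58},
])

# A stable point-in-time score hides the trend; the history reveals it.
trend = snapshots.sort_values("snapshot_date").groupby("sf_account_id")["health"].agg(
    first="first", last="last"
)
trend["change"] = trend["last"] - trend["first"]
print(trend)  # health dropped 14 points over three weeks
```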
Anna reinforces the same principle from a different angle when she talks about knowledge-base integrity. AI layered on top of outdated or poorly maintained knowledge assets produces unreliable outputs.
"This is probably the most common mistake," she said. "If there's no audit that is a reliable system, then we're in trouble."
Governance, in this sense, is not separate from the architecture. It is part of how the architecture remains trustworthy.
Questions of data maturity
What makes these perspectives useful together is that they reflect different stages of maturity.
Phillip is dealing with complex, established environments where reliability, validity, and standardization have to be imposed across sprawl.
Shipra and Anna are closer to the earlier foundation-building stage, where structure, consistency, and thoughtful prompting matter more than perfect completeness.
Ben works with teams that have moved past basic data hygiene and are ready to build intelligence layers on top of their data. His experience at GitHub shows what becomes possible when you invest in architecture that can decompose complex data sources, serve multiple stakeholder perspectives simultaneously, and scale to thousands of datasets.
Different operating contexts, similar lessons. In 2026, data architecture isn't something you deal with later. It's what decides whether anything else holds up. It shapes whether customer success teams can trust what they are seeing, measure what actually matters, and act on signals with confidence.
There’s a simple truth about data management that Phillip really nails: "Your output is only as good as your input, and keeping data clean is really, really hard."
But he’s not the only one who views data architecture as a prerequisite. Shipra framed the same issue in more practical terms: "If the foundation isn’t correct, our structure is not defined, our strategy is not laid out, it's going to be a waste of time and effort."
And Ben’s experience suggests that even organizations rich in data do not automatically become rich in insight unless that data can be connected, interpreted, and contextualized.

What you can take from these lessons
For CS leaders, the implications are immediate:
- Choose an identifier strategy that can survive across systems.
- Test whether your metrics are valid, not just available.
- Start with a constrained dataset before expanding AI use cases.
- Treat unstructured data as a source of signal, not just noise.
- Put governance in place before complexity scales further.
Yes, the tools will evolve and the models will improve. And you can bet that the interfaces will get easier to use. But the underlying constraint will remain the same: the architecture determines how much of that promise becomes usable.
Editor’s note: This article synthesizes perspectives shared by Phillip Morris at Customer Success Summit Denver 2025, Shipra Nirola and Anna Korzeniowska at Customer Success Summit London 2025, and Ben Burkhalter at AI for Customer Success Summit 2025.

Experience the real deal
If you found these perspectives useful, you’ll get even more value from hearing directly from the practitioners shaping the future of customer success. Our Customer Success Summits bring together CS leaders and operators from across the industry to share what’s actually working.
Explore the 2026 events calendar and find a summit near you.

