GTM Vault
GTM 46 | Context Is GTM Infrastructure, Not a Tool Connector

Why connecting Claude to MCP servers fails the moment teams try to scale it, and what changes when context is built upstream of every agent

👋 Hi, it’s Rick Koleta. Welcome to GTM Vault - a breakdown of how high-growth companies design, test, and scale revenue architecture. Join 26,000+ operators building GTM systems that compound.


Someone gave every rep a Claude license. Every team built its own ChatGPT project. Now the same question gets two different answers depending on who you ask. One customer had a rep pitch a product line that did not exist because ChatGPT made up SKUs the company had never sold. The model was not the problem. The agent was not the problem. The architecture underneath was.

Alex Bilmes built Endgame to fix that architecture. Alex was the first hire at Cloudability (acquired), founded Reflect (acquired), and ran one of the early product-led sales CRMs with customers like Figma, Loom, Calendly, and LaunchDarkly. He has raised over $47M for Endgame, with customers including Braze, Scale AI, BetterUp, and Monte Carlo. Monte Carlo reached 80% AI adoption across its GTM team running on the platform.

Endgame is a context graph for go-to-market. It pulls everything (CRM, Gong, email, Slack, methodology, enablement, policy docs) into one source of truth and pre-processes that data upstream of any agent or human asking a question. It runs fact extraction and compression, builds dense one-pagers per account, and serves the graph through a single MCP server that any tool can plug into. The architectural premise is direct. Most AI GTM stacks fail because every agent makes a different set of tool calls at read time. Accuracy is low, cost is high, answers are inconsistent. The fix is not better agents. It is a unified context layer underneath them that does the synthesis once and serves it to everyone.

In GTM 46, Alex breaks down why revenue is the GTM function furthest behind on AI despite being the function that makes the money, why the investment mix on what to do versus doing it is 80-20 in favor of knowing what to do, and the new function that controls how every rep sells. He explains why agentic hellscape is the default state of most enterprises right now, why revenue per employee is the metric every CRO is laddering to, and why the best leaders are now the ones spending nights and weekends building with AI on their own accounts.

This is not a conversation about better AI tools. It is a conversation about the architecture that determines whether every tool you bought produces leverage or sprawl.

Inside this episode

This episode maps the structural gap between how most enterprises adopted AI in GTM and what the architecture has to become when every team is shipping agents, every rep is using ChatGPT and Claude, and the same question is producing different answers across the org.

Alex starts with the pattern he keeps seeing inside enterprise customers. Someone gave every rep a Claude license. Every team built its own ChatGPT project. Marketing has its own context, sales has its own, customer success has its own, and none of them share. The result is what he calls agentic hellscape. The same question gets two different answers depending on which department you ask. One of his customers had a rep pitch a product line the company did not sell because ChatGPT, working with nothing connected to it, fabricated SKUs the rep then pitched to a real customer. The fabrication is not a model bug. It is the predictable output of an agent with no context.

We go deep on why revenue is structurally the hardest GTM function to automate. Engineering is the furthest ahead. Teams are running Cursor and Claude Code with agents writing most of the code. Legal moved fast through Harvey and Legora. Support has Sierra and Decagon running autonomous workflows. Revenue lags every other function because the context to sell to a single account lives in Salesforce, Gong, email, Slack, methodology playbooks, pricing rules, packaging, and policy contradictions all at once. The data is not just fragmented. It is also messy, contradictory, and updated by different people at different cadences. No model can reason its way through that without a synthesis layer underneath it.

We cover why connecting Claude directly to MCP servers does not solve the problem. If the agent has to query Salesforce, Gong, email, and Slack at read time for every question, three things go wrong. Accuracy degrades because the agent is doing needle-in-a-haystack search across fragmented sources. Token costs blow up because every question pre-loads massive context. Consistency collapses because two reps asking the same question get different answers depending on how the agent navigated the tool calls. Endgame’s architectural play is to move all of that upstream. Run fact extraction once. Build a dense, fact-checked one-pager per account. Serve the same compressed context to every agent and every human. The agent stops searching. It starts reading.
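The cost side of that argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the source names, the per-source token sizes, and the one-pager size are made-up assumptions, not figures from the episode or from Endgame. It compares re-reading every source on every question with reading the sources once and serving a compressed summary afterwards.

```python
# Hypothetical payload sizes in tokens per source. Illustrative, not measured.
SOURCE_TOKENS = {"crm": 4_000, "calls": 12_000, "email": 6_000, "chat": 8_000}

# Assumed size of one compressed account one-pager, in tokens.
ONE_PAGER_TOKENS = 800


def read_time_cost(questions: int) -> int:
    """Tokens read when every question re-queries every source."""
    return questions * sum(SOURCE_TOKENS.values())


def upstream_cost(questions: int) -> int:
    """Tokens read when sources are processed once and each question
    reads only the compressed one-pager."""
    build_once = sum(SOURCE_TOKENS.values())
    return build_once + questions * ONE_PAGER_TOKENS


if __name__ == "__main__":
    for n in (1, 100):
        print(n, read_time_cost(n), upstream_cost(n))
```

Under these assumed numbers, a single question is slightly cheaper at read time (30,000 tokens versus 30,800), which matches the point that connect-and-ask works for one rep on one question. At 100 questions the read-time approach reads 3,000,000 tokens against 110,000 for the upstream approach. The sketch ignores the cost of refreshing the one-pager when new data arrives, so treat it as a shape, not a forecast.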

We go into the function Alex calls the most slept-on opportunity in AI for go-to-market. Whoever curates the context layer controls how every rep sells. He half-jokes the title is chief propaganda officer. The joke is funny. The function is real. This is the person who teaches the agent what plays to run, what discounts apply under which conditions, what the methodology is, what the company will not do. Most knowledge bases contain contradictions on basic policy questions. Do we allow this discount? It depends on segment, contract size, and quarter. The function pre-processes those contradictions into directives the agent can act on. Some of his customers run it through enablement. Some through ops. One customer just created a new AI go-to-market leader role and staffed it from solutions architecture. The title is fluid. The function is not. In a world where every rep asks the agent before talking to a customer, whoever feeds the agent is shaping the entire sales motion.
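One way to picture "pre-processing contradictions into directives" is a small rule table where the most specific matching rule wins. This is a minimal sketch under stated assumptions: the `Directive` class, the segments, thresholds, and discount percentages are all hypothetical, and "most specific rule wins" is a resolution strategy chosen for illustration, not a description of how Endgame or any customer does it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Directive:
    segment: Optional[str]   # None means any segment
    min_contract_usd: int    # rule applies at or above this contract size
    quarter: Optional[int]   # None means any quarter
    max_discount_pct: int

    def matches(self, segment: str, contract_usd: int, quarter: int) -> bool:
        return (
            (self.segment is None or self.segment == segment)
            and contract_usd >= self.min_contract_usd
            and (self.quarter is None or self.quarter == quarter)
        )

    def specificity(self) -> int:
        # Count how many conditions this rule actually constrains.
        return (
            int(self.segment is not None)
            + int(self.min_contract_usd > 0)
            + int(self.quarter is not None)
        )


# Hypothetical policy, written once by whoever owns the context layer.
DIRECTIVES = [
    Directive(None, 0, None, 0),                 # default: no discount
    Directive("enterprise", 100_000, None, 10),  # large enterprise deals
    Directive("enterprise", 100_000, 4, 15),     # same, but in Q4
]


def max_discount(segment: str, contract_usd: int, quarter: int) -> int:
    """Return the discount cap from the most specific matching directive."""
    candidates = [d for d in DIRECTIVES if d.matches(segment, contract_usd, quarter)]
    return max(candidates, key=Directive.specificity).max_discount_pct
```

With the rules above, `max_discount("enterprise", 250_000, 4)` resolves to the Q4 rule, while the same deal in Q2 falls back to the general enterprise rule. The agent gets one answer per situation instead of three documents that disagree.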

We cover what AI is actually doing to GTM org design. Smaller teams. Flatter orgs. Fewer specialists, more generalists. Full-cycle AEs replacing land-and-expand handoffs because the agent does the prep and analysis work that used to require a separate role. One of his customers replaced most of their SDR function with a go-to-market engineer, the Endgame MCP, and a handful of approval loops. A few people now review the agent’s output and click send. Alex is direct that this is not a one-to-one headcount swap. It is a structural compression. The metric every CRO is laddering to underneath all of this is revenue per employee. Revenue per employee is the new endgame.

We go into the structural argument underneath the entire AI GTM conversation. Doing stuff is easy with software. Doing the right thing is very, very hard. The current narrative is that agents should do more, autonomously, faster. Alex thinks that is missing the point. The investment mix on what to do versus doing it should be 80-20 in favor of knowing what to do. Most teams have it inverted. They scale autonomy on top of bad instructions and end up shipping faster against the wrong assumptions. His example: you buy an AI SDR, you send 10 million emails, you get no conversion, you burn your domain, and your competitors still win. Doing more is not the same as doing the right thing. Architecture decides which one happens.

We go into build versus buy. The shape of the question changed. Six months ago, building meant standing up a custom data pipeline and a fine-tuned model. Today building means a forward-deployed engineer connecting Claude to a few MCP servers. Alex’s argument is that the connect-and-ask approach works for one rep on one question. It collapses the moment a team tries to scale it because everything happens at read time. His honest answer on what to buy: almost nothing. Maybe payroll. Maybe a database. Maybe a context graph (because building one requires data engineering, applied AI, and domain expertise compounded over years). And maybe a few load-bearing brand names where the customer signal matters. He uses DocuSign as the example. Cheap to replace, but the customer recognizes the logo on a million-dollar contract. Build everything else.

We cover sales cycle compression. Alex closed an enterprise deal in 15 days from first meeting. The buyer was a COO who was problem-aware, somewhat solution-aware, had budget, and was the signer. No nurture sequence. No drawn-out evaluation. The shape of the solution was already decided before Endgame walked in. Other deals take six months. The variable is not the buying committee. It is leadership. Next-generation leaders think in systems. They do not muscle through what Alex calls meat scale, where you hire 50 people and figure it out on the ground. They architect the function. The leaders who think this way close fast. The leaders who do not run RFPs.

We cover what the AI-native GTM stack looks like 12 months out. Alex sees consolidation at the horizontal layer, not the vertical. Sales leaders bought ten point solutions that each promised 3x pipeline and ended up at zero. The next motion is fewer systems sitting between the underlying data sources and the agents doing the work. Go-to-market teams are starting to operate like product and engineering teams. They run sprints. They iterate. They understand the layers of the stack. The orgs that move fastest will be the ones where the GTM team thinks like a product team, ships in cycles, and treats the context graph as core infrastructure.

We close on a sharp note about leadership. Alex’s biggest critique of CROs adopting AI is that they outsource their own intuition. They buy Claude, they buy Copilot, they tell their team they are using AI, and the result is performance theater. The leaders moving fastest are the ones spending nights and weekends building with AI on their own accounts. Intuition is the new leverage. The leaders who build it themselves are the ones designing the architecture. The leaders who delegate it are the ones being sold the architecture by people who do not understand it either.

Listen & subscribe now across:

Apple // Spotify

Discussed in this episode

03:13 - Why revenue is the GTM function furthest behind on AI

05:11 - What goes inside a context graph, and why Claude plus MCP servers fails at scale

13:15 - The chief propaganda officer: whoever curates context controls how every rep sells

16:38 - Revenue per employee is the new endgame

18:49 - The data foundation matters more for agents, not less

22:55 - What the AI-native GTM stack looks like 12 months out

27:08 - Adoption strategy: meet people where they already work

30:11 - Pricing, sales cycles, and why systems thinkers move fastest

36:34 - Rapid fire: AI SDR, the belief about agents that's wrong, build vs. buy, and the 30-day CRO move

Key takeaways

  1. The same question producing different answers is the signal that context is fragmented
     The default state of most enterprises right now is agentic hellscape. Every team has its own ChatGPT project. Every rep has Claude. None of them share context. The same question about a deal, an account, a policy, gets two different answers depending on which agent or rep you ask. The failure mode is not the model. It is that every agent is reasoning from a different fragment of context. The fix is not a better prompt. It is a context layer underneath every agent that synthesizes the answer once and serves it consistently.

  2. Revenue is the GTM function furthest behind on AI because context lives in eight tools at once
     Engineering moved fast because the context to write code lives in the codebase. Legal moved fast because the context to review a contract lives in the contract. Revenue lags because the context to sell to one account lives in Salesforce, Gong, email, Slack, the methodology doc, the pricing sheet, the policy contradictions, and last quarter’s QBR slide. No agent can reason across that without a synthesis layer doing the work upstream. Every team trying to make Claude useful for revenue without unifying the context first runs into the same wall.

  3. Doing stuff is easy. Doing the right thing is very, very hard.
     The current narrative around agents is autonomy, volume, and speed. The investment mix on what to do versus doing it should be 80-20 in favor of knowing what to do. Most teams have it inverted. They scale autonomy on top of bad instructions and end up moving faster against the wrong assumptions. The AI SDR pattern is the canonical version. You buy the agent, you send 10 million emails, you get no conversion, you burn the domain, and the competitor still wins. Architecture decides whether autonomy compounds or accelerates the failure.

  4. Whoever curates the context layer controls how every rep sells
     The most slept-on function in AI-native GTM. The title is fluid (enablement, ops, AI go-to-market leader, sometimes solutions architecture). The function is not. This is the person who teaches the agent the methodology, the plays, the discount logic, the contradictions in the knowledge base, and the directives the agent should enforce. In a world where every rep asks the agent before talking to a customer, the person feeding the agent is shaping every conversation. Companies that recognize this are building the role. Companies that do not are letting the context default to whatever ChatGPT pulls from training data.

  5. AI compresses GTM org design in ways that look like layoffs but are actually architecture changes
     Smaller sales teams. Flatter orgs. Fewer specialists, more full-cycle AEs. One customer replaced most of their SDR team with a go-to-market engineer, the Endgame MCP, and approval loops. The metric every CRO is laddering to is revenue per employee, and revenue per employee is the new endgame. The structural move is to compress handoffs, not just headcount. Every handoff between specialists is where context leaks. Removing the handoff produces compounding leverage that a one-to-one headcount cut cannot.

  6. Adoption is not a behavior change problem when the context layer goes where people already work
     The fastest adoption pattern is making the context graph available inside the tools the team already uses. Claude. ChatGPT. Slack. Zapier. The admin approves Endgame in the Connector Marketplace. Reps flip a switch. They keep working the way they were already working, just with better answers. Behavior change is the hardest part of any AI rollout. The architectural move is to skip it. Meet people where they work, plug the context graph in underneath, and let the agent get smarter without anyone having to learn a new tool.

Frameworks from the episode

  1. The context graph as the operating system for revenue
     The architectural premise of Endgame. Pull everything into one place (CRM, Gong, email, Slack, methodology, enablement, pricing, policy). Run agent loops upstream that fact-check, extract, and compress the data into dense one-pagers per account. Serve the result through a single MCP server that any agent or human queries. The output is a single source of truth that updates in real time, every time a new call drops or an email lands. Every question gets the same consistent answer because every agent is reading from the same compressed context. The graph is not a report. It is the operating system for how the company sells.

  2. The 80-20 inversion: knowing what to do versus doing it
     Most teams allocate effort to autonomy and volume. Alex’s frame inverts the ratio. Eighty percent of the work belongs to deciding what the agent should do (methodology, plays, policy directives, conditions). Twenty percent belongs to having the agent do it. Teams that get this backwards scale broken instructions. Teams that get it right scale instructions that actually compound revenue. The framework is not philosophical. It is a budget allocation question. Where is the team spending its time: on the agent, or on what the agent is told to do?

  3. Build versus buy in the AI-native stack
     The default reverses. Almost everything that was previously a SaaS purchase is now buildable on Claude, an MCP server, and a forward-deployed engineer. The exceptions are infrastructure plus domain expertise compounded over years (databases, payroll, context graphs) and a small number of brand-recognized customer-facing tools where the logo carries trust on a million-dollar contract. DocuSign is the example. Everything else, build. The architectural question is no longer which point solution to add. It is which platforms can absorb the use cases your team will discover next quarter.
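The context graph framework above (extract facts, compress them per account, serve one answer to everyone) can be sketched in a few lines. This is a toy under stated assumptions, not Endgame's implementation: the `Fact` shape, the field names, and the "newest observation wins" conflict rule are all hypothetical, and a real system would also fact-check and serve the result over an MCP server.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Fact:
    account: str
    key: str          # e.g. "stage" or "champion"
    value: str
    source: str       # which system the fact was extracted from
    observed_at: int  # ordinal timestamp; larger means newer


def build_one_pagers(facts: Iterable[Fact]) -> Dict[str, str]:
    """Keep the newest value per (account, key), then render one
    compact text block per account."""
    latest: Dict[Tuple[str, str], Fact] = {}
    for fact in facts:
        slot = (fact.account, fact.key)
        if slot not in latest or fact.observed_at > latest[slot].observed_at:
            latest[slot] = fact

    lines_by_account: Dict[str, list] = {}
    for (account, key), fact in sorted(latest.items()):
        line = f"{key}: {fact.value} (source: {fact.source})"
        lines_by_account.setdefault(account, []).append(line)
    return {acct: "\n".join(lines) for acct, lines in lines_by_account.items()}


def read_context(pages: Dict[str, str], account: str) -> str:
    """What every agent or human reads: the same precomputed page."""
    return pages[account]


if __name__ == "__main__":
    # Made-up facts for a made-up account.
    facts = [
        Fact("acme", "stage", "discovery", "crm", 1),
        Fact("acme", "stage", "negotiation", "call", 3),
        Fact("acme", "champion", "VP Ops", "email", 2),
    ]
    print(read_context(build_one_pagers(facts), "acme"))
```

The design point is that the conflict between the CRM and the call recording is resolved once, upstream, so two agents asking about the same account cannot disagree about its stage.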

What to do this week

Pick the one or two AI use cases that move the needle today. Alex’s exact prescription. Not five. Not ten. The temptation is to spin up agents across every department. The result is sprawl and contradictory output. Pick one or two use cases (account planning, deal review, prep for the meeting, whatever the highest-leverage moment is for your team) and build for those. The discipline of choosing forces a real conversation about where AI actually compounds revenue.

Get the base data structured so you can iterate. The other half of Alex’s 30-day prescription. The mistake is treating data as a six-month CRM clean-up project. The move is the opposite. Get the context graph (or the equivalent layer) stood up against your existing data, no matter how messy. Agents fix data faster than humans do. The structure compounds across every use case you add later.

Audit how many places the context to answer one revenue question lives. Pick a real example: “is this deal stalled?” Count the systems your agent or your team has to query to get a real answer. Salesforce. Gong. Email. Slack. Calendar. Notes. If the answer requires more than one system and there is no synthesis layer compressing them, every agent and every rep is producing inconsistent answers right now. The fix is not buying more agents. It is building or buying the layer that does the synthesis once.
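The audit can start as a list in a notebook. A minimal sketch, assuming you record by hand which systems each question touches: the first question and its systems come from the example above, while the second question and both function names are hypothetical.

```python
from typing import Dict, List


def systems_per_question(question_sources: Dict[str, List[str]]) -> Dict[str, int]:
    """Count the distinct systems needed to answer each question."""
    return {q: len(set(sources)) for q, sources in question_sources.items()}


def needs_synthesis(
    question_sources: Dict[str, List[str]], has_synthesis_layer: bool = False
) -> List[str]:
    """Questions that span more than one system with no layer compressing them."""
    if has_synthesis_layer:
        return []
    counts = systems_per_question(question_sources)
    return sorted(q for q, n in counts.items() if n > 1)


if __name__ == "__main__":
    audit = {
        "is this deal stalled?": [
            "Salesforce", "Gong", "Email", "Slack", "Calendar", "Notes",
        ],
        # Hypothetical second question that lives in a single system.
        "who owns this account?": ["Salesforce"],
    }
    print(systems_per_question(audit))
    print(needs_synthesis(audit))
```

Any question that shows up in the second list is one where two reps, or two agents, can get different answers today.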

Identify who in your org is curating the context the agents are reading. If the answer is “no one” or “every team has their own ChatGPT project,” the company is in agentic hellscape. The first move is naming the function. The title can come later. The function (curating methodology, plays, pricing, and policy into a form the agent can act on) needs an owner this quarter.

Build with AI yourself. Spend a weekend wiring up Claude, an MCP server, and one of your own data sources. The leaders moving fastest are the ones who can describe the architecture from first principles because they have built it. The leaders falling behind are the ones who outsourced the building and now describe AI in platitudes. Intuition is the new leverage. It only comes from the building.

Why this matters

The first wave of AI in GTM was tool sprawl. Every team got Claude. Every department built its own ChatGPT project. Every vendor pitched a 3x lift on whatever the buyer’s pipeline problem was. The buyers bought ten of them and pipeline did not move. The motion was activity, not architecture.

The architectural argument Alex makes is that AI does not produce leverage on a fragmented stack. It produces noise at scale. The same model, the same agent, the same prompt, will give two different answers if the underlying context is not unified. Every rep using the agent without a context layer is making decisions on a different version of reality. The org looks like it is using AI. The output is closer to performance theater.

The deeper argument is that doing stuff is easy and doing the right thing is hard. Autonomy compounds whatever instructions are underneath it. If the instructions are wrong, the agent ships the failure faster. The teams winning with AI right now are not the teams shipping the most agents. They are the teams that invested 80% of the effort in deciding what the agent should do, then let the agent do it. The teams losing inverted the ratio.

The companies that win the next cycle are the ones that built the context layer before they bought the agent. Not because the layer is glamorous. Because every other investment compounds against it or collapses against it. AI on a coherent system produces leverage. AI on a fragmented system produces noise at scale. The orgs that figured this out are running smaller teams, flatter structures, and higher revenue per employee. The orgs that did not are still asking why their agents keep telling reps to pitch products that do not exist.

This is GTM Vault.

If this episode changed how you think about the architecture underneath your AI GTM stack, forward it to one operator still treating ChatGPT and Claude as productivity tools instead of an interface to a context layer they have not built yet.

Connect

Follow Alex Bilmes // Endgame

Follow Rick Koleta // GTM Vault

Thanks for listening. See you in the next episode.

P.S. Annual paid subscribers get a Private GTM Blueprint Session. One working session to identify your primary GTM constraint and design the 90-day architecture to resolve it.
