Key Takeaways

  • Meta's Alexandr Wang told employees the upcoming Watermelon model has reached parity with OpenAI's GPT-5.5.
  • The update comes as Meta accelerates infrastructure spending and talent acquisition to close gaps with OpenAI, Google, and Anthropic.
  • External research shows Meta's models have recently narrowed performance differentials, though OpenAI still leads on several composite benchmarks.

Meta is making progress in the AI model race, with AI chief Alexandr Wang telling employees that the company's upcoming Watermelon model has caught up with OpenAI's flagship GPT-5.5. The comment came during an internal town hall and reinforces Meta's aggressive push to move from trailing competitor to clear contender at the frontier model layer.

Wang described Watermelon as the successor to Avocado, Meta's internal codename for Muse Spark, which launched in April. He based the parity claim on closely followed AI model benchmarks. While it is not clear which specific benchmarks he referenced, the industry typically watches multitask reasoning measures, multimodal tests, and coding evaluations most closely. The signal to internal teams was clear: the company sees significant training momentum building.

Independent research points to narrowing gaps between open models and top proprietary systems. An analysis from Epoch AI showed that open models historically trailed leading closed models by 5 to 22 months. However, external evaluations of Meta's recent models indicate the company is closing this gap across several benchmarks. A 2024 ophthalmology study hosted by the National Library of Medicine found Meta's systems slightly ahead of competing models on specific domain questions, marking a notable data point in specialized performance.

Watermelon is still in training, and Wang told staff it uses an order of magnitude more compute than Avocado. This phrasing suggests Meta is no longer catching up through incremental iteration but is instead scaling toward the sort of frontier training runs associated with GPT-5.5, GPT-5.6, or Anthropic's Claude Opus line. The jump in compute aligns with Meta's massive investments in chips, data centers, and talent as the organization pursues its frontier model roadmap.

Wang has been increasingly public about where he thinks Meta stands. On Thursday, he wrote on X that an update to Muse Spark focused on coding and agentic tasks is coming soon, with the aim of closing the gap with rival models. When a user asked when Meta would land a model comparable to Anthropic's Claude Opus, Wang replied that it would be "pretty soon," adding that users would like what the company has cooking. This indicates Meta sees room to advance capabilities across programming, orchestration, and autonomous workflows, areas that are key decision points for enterprise buyers evaluating long-term AI stack commitments.

Enterprise market perception often lags technical progress. While Muse Spark has advanced Meta's capabilities, the company has historically struggled to convince developers that its models belong at the industry's leading edge alongside OpenAI, Google, and Anthropic. Benchmark aggregators still place GPT-5.5 at the top of composite intelligence indices. That said, parity is rarely universal in model evaluations. The landscape tends to break into clusters where particular systems excel in specific workloads. The critical metric for Meta will be how consistently these pockets of strength appear across enterprise-relevant tasks.

Organizations that evaluate these models systematically often lean on established guidance to compare them. The NIST AI Risk Management Framework serves as a reference point for assessing model reliability, robustness, and downstream impact, which can help enterprises judge which model suits their specific requirements. Similarly, emerging ISO and IEC standards give buyers a shared language for performance, risk, and alignment claims, putting procurement teams on firmer ground when providers make aggressive benchmark assertions.

Mark Zuckerberg's aggressive talent blitz remains a central factor in this acceleration. Combined with massive investments in infrastructure, Meta aims to own the full stack, from silicon allocation to model inference. Whether this push is enough to beat rivals outright remains uncertain, though Wang's comments suggest internal confidence is higher today than it was twelve months ago.

Timing remains a critical factor. GPT-5.5 became available in April, and OpenAI debuted GPT-5.6 late last month, though the latter is not yet generally released due to US government requests. If Meta's Watermelon is already on par with GPT-5.5, that points to accelerating training velocity. Matching a previous frontier model is not the same as overtaking the current leader, yet for many B2B buyers, parity with a model like GPT-5.5 is sufficient for deploying new enterprise workflows and autonomous services.

Meta must still prove that Watermelon's benchmark parity holds up in external evaluations. Competitors like Google and Anthropic are advancing rapidly, and OpenAI's move toward GPT-5.6 continues to define the pace at the frontier. However, Meta's investments represent a long-term commitment to large-scale training runs, reinforced by infrastructure scaling and an assertive talent strategy.

If Wang's internal claims hold, Meta might soon operate in a tier where its frontier models no longer trail the proprietary leaders. For enterprise buyers, that introduces genuine multi-sided competition at the high end. The stakes are rising, and the next few release cycles from the major labs will show whether this parity narrative translates into dominant enterprise deployment.