Inside Meta’s AI strategy: Zuckerberg stresses compute, open source and training data

All of the Big Tech earnings calls this week offered insights into each company’s AI efforts. Google focused on its generative AI efforts in search and cloud; Microsoft delved into detail about integrating AI across its tech stack; and Amazon talked chips, Bedrock and, oh yeah, Rufus — a new AI-powered shopping assistant. But I think Meta had them all beat in terms of offering the deepest dive into its AI strategy.

In many ways, the Meta AI playbook is unique, thanks to its consistent focus on open source AI and a massive, ever-growing well of AI training data from public posts and comments on Facebook and Instagram.

So it was interesting that on Meta's Q4 2023 earnings call yesterday, CEO Mark Zuckerberg led by touting the company's enviable position in one of the most competitive areas of AI development: compute.

Meta has a clear long-term playbook for becoming a leader in building the most popular and most advanced AI products and services, Zuckerberg said, and ultimately the "full general intelligence" he maintained that effort will require. The first key ingredient, he said, is "world-class compute infrastructure."

Zuckerberg went on to repeat what he had recently disclosed in an Instagram Reel: that by the end of this year Meta will have about 350,000 Nvidia H100 GPUs, and roughly 600,000 H100 equivalents of compute once its other GPUs are counted. The reason Meta has all that? Surprise, surprise: Instagram Reels.

“We’re well-positioned now because of the lessons that we learned from Reels,” he explained. “We initially under-built our GPU clusters for Reels, and when we were going through that I decided that we should build enough capacity to support both Reels and another Reels-sized AI service that we expected to emerge so we wouldn’t be in that situation again.”

Meta is “playing to win,” added Zuckerberg, pointing out that training and operating future models will be even more compute intensive.

“We don’t have a clear expectation for exactly how much this will be yet, but the trend has been that state-of-the-art large language models have been trained on roughly 10x the amount of compute each year,” he said. “Our training clusters are only part of our overall infrastructure and the rest obviously isn’t growing as quickly.” The company plans to continue investing aggressively in this area, he explained: “In order to build the most advanced clusters, we’re also designing novel data centers and designing our own custom silicon specialized for our workloads.”

Open source AI strategy was front and center

Next, Zuckerberg zoomed in on Meta’s never-wavering open source strategy — even though Meta has been criticized and even chastised by legislators and regulators on this issue over the past year, including over the initial leak of the first version of Llama, which was meant to be available only to researchers.

“Our long-standing strategy has been to build and open source general infrastructure while keeping our specific product implementations proprietary,” he said. “In the case of AI, the general infrastructure includes our Llama models, including Llama 3 which is training now and is looking great so far, as well as industry-standard tools like PyTorch that we’ve developed. This approach to open source has unlocked a lot of innovation across the industry and it’s something that we believe in deeply.”

Zuckerberg also offered significant detail about how open source fits into Meta's business, in statements that have already been widely shared on social media:

“There are several strategic benefits. First, open source software is typically safer and more secure, as well as more compute efficient to operate due to all the ongoing feedback, scrutiny, and development from the community. This is a big deal because safety is one of the most important issues in AI. Efficiency improvements and lowering the compute costs also benefit everyone including us. Second, open source software often becomes an industry standard, and when companies standardize on building with our stack, that then becomes easier to integrate new innovations into our products.

That’s subtle, but the ability to learn and improve quickly is a huge advantage and being an industry standard enables that. Third, open source is hugely popular with developers and researchers. We know that people want to work on open systems that will be widely adopted, so this helps us recruit the best people at Meta, which is a very big deal for leading in any new technology area. And again, we typically have unique data and build unique product integrations anyway, so providing infrastructure like Llama as open source doesn’t reduce our main advantages. This is why our long-standing strategy has been to open source general infrastructure and why I expect it to continue to be the right approach for us going forward.”

Finally, I was fascinated by Zuckerberg's highlighting of the "unique data and feedback loops" in Meta's products.

When it comes to the massive corpus that trains models upfront, Zuckerberg pointed out that on Facebook and Instagram there are “hundreds of billions of publicly shared images and tens of billions of public videos, which we estimate is greater than the Common Crawl dataset and people share large numbers of public text posts in comments across our services as well.”

The Common Crawl dataset contains petabytes of web data collected regularly since 2008: raw web page data, metadata extracts and text extracts. It's huge. So the idea that Meta holds in-house corpora that may be even larger is, quite literally, a big deal.

But Zuckerberg went further: “Even more important than the upfront training corpus is the ability to establish the right feedback loops with hundreds of millions of people interacting with AI services across our products. And this feedback is a big part of how we’ve improved our AI systems so quickly with Reels and ads, especially over the last couple of years when we had to rearchitect it around new rules.”

A Bloomberg story yesterday highlighted the fact that the success of Meta’s Llama model has led to actual llamas becoming the unofficial mascot of open source AI events.

But if Meta's earnings report is anything to go by, the company is willing to go much farther than a cute, fuzzy camelid mascot. Judging from its capital expenditure hints for 2024, Meta will spend many billions of dollars to win a highly competitive, ever-faster AI race.

VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.
