Scribd, Inc. boosts content discovery and engagement with Claude
Scribd, Inc. uses Claude to generate high-quality metadata for millions of user-uploaded documents, improving content discoverability and driving user engagement across its global platforms.
- Helped address the roughly 70% of user-uploaded content that had missing or low-quality metadata
- Richer content descriptions drove a 7% increase in viewers who sign up and become subscribers
- Scaled to process over 100 million documents
Enhancing a platform of user-generated content
Scribd, Inc., a multinational technology company focused on the written and spoken word, offers a massive collection of user-generated content across two of its products: Scribd and SlideShare. With hundreds of millions of documents and more, these platforms faced a significant challenge: a large portion of the content lacked quality metadata, making it difficult for users to discover relevant material.
"When a creator uploads a document to Scribd or a deck to SlideShare, they've already done the work of creating the content," explains Steve Neola, Senior Director of Product at Scribd. "Attaching metadata like titles, descriptions, or categories is extra work for them."
The company found that approximately 70% of user-uploaded content had low-quality or missing metadata, primarily descriptions. This gap presented an opportunity to improve the user experience and content discoverability across these two Scribd, Inc. platforms.
Leveraging AI to scale metadata generation
To address this challenge, Scribd, Inc. turned to generative AI, specifically Claude, to automatically create high-quality metadata for its vast user-generated content library. The project began with careful scoping and experimentation, led by John Strenio, a Data Scientist on Scribd's Applied Research team.
"We started by testing language capabilities and summarization abilities across various models," Strenio recalls. "One of the difficulties in that experimentation phase was evaluating the quality of the models and the prompts we were using."
The team conducted extensive human-led evaluations, relying heavily on their content operations department to assess the quality of AI-generated metadata. As they narrowed down their options, the focus shifted to scaling, throughput, and cost considerations.
Choosing Claude for performance and support
While Scribd, Inc. initially explored other options, including open source alternatives, they ultimately chose Claude for its combination of quality, cost-effectiveness, and scalability.
"Claude was the right mix of everything," says Strenio. "It was competitive in our experiments, and from the data scientist's perspective, one of the big factors was the ability to perform a large volume of inference in a very short amount of time."
Scribd, Inc. also appreciated the support they received from Anthropic. "The support interaction was really impressive," Neola notes. "We had a Slack channel where we could ask questions in real time and get feedback, which enabled faster turnarounds and implementation efficiency."
For an enterprise like Scribd, Inc., trust and reliability were crucial factors. "Claude's Constitutional AI and all the ethical considerations that go into the model makes Anthropic a really exciting long-term partner," Neola explains. "We want to make sure we're building trust around our AI choices internally as well as with our customers."
Driving engagement and subscriptions
The impact of implementing Claude-generated metadata has been significant for Scribd, Inc. By adding AI-generated descriptions to document pages, the company saw a substantial increase in users who view content and ultimately sign up as subscribers.
"Having the content right there at first glance gives the user more context about the particular document versus having to read through the whole thing," Neola says. "We saw a 7% increase in the users who are viewing it that ultimately come to the site and then sign up and become subscribers."
Scribd, Inc. is now expanding the use of this enhanced metadata across its user-generated platforms. They've seen similar improvements when incorporating the descriptions into recommendations on their homepage, and are preparing to integrate them into search results.
Looking to the future
As Scribd, Inc. continues to refine and expand its use of AI-generated metadata, the company is developing a playbook that can be applied to other types of metadata and content enhancements. They value Anthropic's commitment to continually improving Claude's capabilities.
"As a company, we want to stay on the bleeding edge of AI technology," Neola emphasizes. "Having a partner that's at the bleeding edge helps us serve customers today and continually bring new AI capabilities to them as they're developed."
With Claude's ability to process and enhance millions of documents efficiently, Scribd, Inc. is well-positioned to continue improving content discovery and user engagement across its global platforms, making it easier than ever for users to find the information they need.