
Passionate sports viewers expect to easily discover and access sports events and their favorite teams, leagues, and players. Providing a robust and intuitive search experience is crucial for the success of Prime Video Sports. With a vast, rapidly growing catalog of live and on-demand sports offerings, a well-designed search architecture allows Prime Video Sports to cater to this engaged audience, streamlining navigation and reducing friction in the user experience. The Prime Video search experience is one of the most clicked-on elements in the global navigation bar. Search enables highly relevant recommendations and drives increased viewership and engagement. By prioritizing a seamless search experience that caters to the needs of sports fans, Prime Video has enhanced the overall customer experience, fostering trust and loyalty that contributes to the platform’s long-term growth and success. In this post, we will walk you through how Prime Video used Amazon OpenSearch Service and its AI and machine learning (AI/ML) capabilities to build a more intuitive and enhanced sports search experience.
Challenges
The Prime Video search experience was originally designed to help customers discover trending movies and TV shows that carry durable stats, including ratings and viewership. As Prime Video began to acquire sports rights, they needed to rethink this approach, which was focused primarily on TV shows and movies, to understand customers’ intent and surface the right content. The approach for TV shows and movies didn’t work as well for live sports because the temporal and seasonal nature of sports content makes every title a cold start. For example, a search for “soccer live” surfaced documentaries such as “This is football: Season 1” and “Ronaldo VS Messi – Face Off!” rather than live soccer matches. While those entertainment options are perfectly fine on their own, they didn’t fulfill the customers’ goal of finding and watching live or upcoming games for their favorite sports. This disconnect between search queries and relevant results made it hard for customers to access the sports content they wanted. By surfacing relevant sports events in search results, Prime Video enhanced the customer experience, helping customers discover the full breadth of sports coverage available on Prime Video and find their favorite sports events. To address these issues and better serve the needs of sports fans, in 2024, Prime Video enhanced its sports-specific search capabilities, incorporating deeper sports understanding and state-of-the-art search techniques to create an improved and intelligent search system.
Solution overview
In 2024, Prime Video Sports Search delivered the first version of enhanced sports search functionality, powering the experience through a two-layer solution composed of coarse retrieval using semantic search and binary search relevance classification. Semantic search is a technique for searching information that goes beyond just matching keywords. It matches queries to data (sports events in this case) based on vector embeddings, which capture the meaning of words, phrases, and sentences. The vectors can have n dimensions; when mapped into an n-dimensional space, data items that are close in semantic meaning (not a direct text match) will be close to each other in the space, as shown in the following diagram of a two-dimensional vector space of sports matches (in yellow) and search queries (in green).
The foundation of using vector search for sports is the creation of vector embeddings for each sports event present in the Prime Video Sports Catalog. As event data is ingested, textual information including the title, sport, team names, leagues, and other event details is used to generate a unique vector representation for each sports event. This allows the system to capture the semantic meaning of and relationships between different events, including the abbreviations, nicknames, and so on that customers often use to search. When a customer searches for something related to sports, their query is also converted into a vector. The system then performs a K-nearest neighbor (KNN) search, comparing the customer’s query vector to the vectors of all sports events in the catalog. The events with vectors that are closest to the query vector are identified as the most relevant matches, even if the searched words were not directly indexed. For example, Thursday Night Football events might be indexed without the abbreviation tnf; however, these games will still be returned by semantic search if a customer searches using “tnf” as their search query.
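To make the flow concrete, the following is a minimal sketch of such a KNN lookup using the opensearch-py Python client. The index name (sports-events), vector field (event_embedding), and embed() helper are illustrative assumptions, not the actual Prime Video implementation.

```python
# Minimal KNN lookup sketch with opensearch-py. Index, field, and helper
# names are illustrative; the query shape is the standard OpenSearch knn query.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

def embed(text: str) -> list[float]:
    """Hypothetical helper that calls a text embedding model."""
    raise NotImplementedError

# The customer's raw query is embedded into the same vector space as the events.
query_vector = embed("tnf")

response = client.search(
    index="sports-events",
    body={
        "size": 10,
        "query": {
            "knn": {
                "event_embedding": {  # knn_vector field holding event embeddings
                    "vector": query_vector,
                    "k": 10,  # number of nearest neighbors to retrieve
                }
            }
        },
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```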
The following figure shows a high-level indexing and query flow for a KNN vector search.
Finding the nearest vectors isn’t enough—the system also runs each of these potentially relevant events through a custom binary relevance classification machine learning (ML) model, trained in-house. This allows the system to filter out any events that might be only tangentially related to the original search, leaving behind a refined list of the most pertinent and relevant results for the customer.
Finally, these highly relevant events are ranked and surfaced to the customer with factors like the event’s current live status and upcoming schedule playing a key role in determining the optimal order to display the results. This combined use of vector semantic search and relevance classification enables Prime Video to provide customers with a sports search experience that accurately surfaces the content they’re looking for, significantly enhancing their ability to discover and access the live, upcoming, and recently ended games that they’re most interested in.
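The two-layer flow can be summarized in a few lines of Python. This is an illustrative sketch only: the is_relevant() classifier stands in for Prime Video’s in-house binary relevance model, and the event fields (is_live, start_time) are assumed for illustration.

```python
# Illustrative second layer: filter KNN candidates with a binary relevance
# classifier, then rank live events first and upcoming events by start time.
from datetime import datetime, timezone

def is_relevant(query: str, event: dict) -> bool:
    """Stand-in for the in-house binary relevance classification model."""
    raise NotImplementedError

def filter_and_rank(query: str, candidates: list[dict]) -> list[dict]:
    relevant = [e for e in candidates if is_relevant(query, e)]
    now = datetime.now(timezone.utc)
    return sorted(
        relevant,
        key=lambda e: (
            not e["is_live"],  # live events sort before non-live ones
            abs((e["start_time"] - now).total_seconds()),  # then by time proximity
        ),
    )
```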
Procedure
The vector semantic search implementation we developed consists of two main components: a KNN search index and an endpoint to invoke the text embedding model. To host these components, we used AWS services: the custom text embedding model was deployed on Amazon SageMaker, while the KNN index was created using OpenSearch Service and hosted on a managed cluster consisting of more than 50 data nodes.
Both of these components are designed to handle real-time customer traffic at a scale of thousands of requests per second. We simplified our system’s application layer by using ready-to-use solutions available in AWS. The Amazon OpenSearch Ingestion pipeline enabled a seamless, code-free integration, allowing us to write sports data from an Amazon DynamoDB table directly into the OpenSearch Service index, eliminating the need for traditional extract, transform, and load (ETL) processes. Furthermore, we used the Neural Search feature of OpenSearch Service instead of directly integrating our application layer with SageMaker for text-to-vector conversion. This approach enables internal text-to-vector transformation, facilitating vector search during both ingestion and search phases. The Neural Search plugin of OpenSearch Service directly communicates with a text embedding model deployed on SageMaker as a real-time inference endpoint using ML connectors.
This architecture—illustrated in the following figure—enabled us to build a scalable and efficient vector search solution, taking advantage of the strengths of various AWS services to simplify the implementation and improve performance.
OpenSearch Ingestion: Zero-ETL data transfer from DynamoDB to an OpenSearch Service index
Before indexing the sports data in OpenSearch Service, the data is first stored in a DynamoDB table. This storage layer allows us to maintain a database of all sports events and the metadata required to enable search, and it acts as a source of truth for sports data that isn’t impacted by the evolution of customer use cases and their respective implementations.
To seamlessly transfer this data from DynamoDB to the OpenSearch Service index, we used an OpenSearch Ingestion pipeline. This allowed us to set up real-time data transfer with a zero-ETL integration, abstracting the data indexing away from the application layer. The OpenSearch Ingestion pipeline configuration enables us to specify a schema mapping between the DynamoDB table and the expected document schema in OpenSearch Service. This configuration also allows us to perform data formatting operations on specific fields and to configure a dead-letter queue (DLQ) if needed. The steps to set up an OpenSearch Ingestion pipeline can be found in this blog post.
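As a rough sketch of what such a pipeline looks like, the following creates an OpenSearch Ingestion pipeline with boto3. The YAML body follows the documented shape of a DynamoDB-source pipeline; all names, ARNs, and endpoints are placeholders rather than Prime Video’s actual configuration.

```python
# Hedged sketch: provision an OpenSearch Ingestion pipeline that streams a
# DynamoDB table into an OpenSearch Service index. Placeholders throughout.
import boto3

pipeline_body = """
version: "2"
dynamodb-pipeline:
  source:
    dynamodb:
      tables:
        - table_arn: "arn:aws:dynamodb:us-east-1:111122223333:table/SportsEvents"
          stream:
            start_position: "LATEST"  # pick up changes from DynamoDB Streams
          export:
            s3_bucket: "sports-events-export"  # initial full-table snapshot
            s3_region: "us-east-1"
      aws:
        sts_role_arn: "arn:aws:iam::111122223333:role/osi-pipeline-role"
        region: "us-east-1"
  sink:
    - opensearch:
        hosts: ["https://search-sports-domain.us-east-1.es.amazonaws.com"]
        index: "sports-events"
        aws:
          sts_role_arn: "arn:aws:iam::111122223333:role/osi-pipeline-role"
          region: "us-east-1"
"""

osis = boto3.client("osis")
osis.create_pipeline(
    PipelineName="sports-events-pipeline",
    MinUnits=1,
    MaxUnits=4,  # Ingestion OpenSearch Compute Units (OCUs) for auto scaling
    PipelineConfigurationBody=pipeline_body,
)
```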
Embedding model setup on SageMaker
At the core of our vector search implementation is the text-embedding model, which plays a crucial role in capturing the semantic meaning of sports-related data. The Sports Search Science team developed this text-embedding model and deployed it on SageMaker as a real-time inference endpoint using AWS Cloud Development Kit (AWS CDK).
The process of creating the SageMaker endpoint requires two key artifacts:
- A container image, stored in Amazon Elastic Container Registry (Amazon ECR), that packages the model’s inference code and runtime dependencies
- The trained model artifacts, stored in Amazon Simple Storage Service (Amazon S3)
With these two components in place, we used the AWS CDK to programmatically provision the SageMaker endpoint, ensuring a seamless and consistent deployment of the text-embedding model. By using the capabilities of AWS services such as SageMaker, Amazon ECR, and Amazon S3, we were able to build a scalable and efficient text-embedding model infrastructure to power the vector search solution.
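A hedged AWS CDK (Python) sketch of this provisioning step is shown below. The ECR image URI, S3 model location, IAM role, and instance settings are placeholders, not the values used in production.

```python
# Illustrative CDK stack that turns the two artifacts (ECR image + S3 model
# data) into a SageMaker real-time inference endpoint.
from aws_cdk import Stack
from aws_cdk import aws_sagemaker as sagemaker
from constructs import Construct

class EmbeddingEndpointStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Model definition: inference container from ECR plus model data in S3.
        model = sagemaker.CfnModel(
            self, "EmbeddingModel",
            execution_role_arn="arn:aws:iam::111122223333:role/sagemaker-execution-role",
            primary_container=sagemaker.CfnModel.ContainerDefinitionProperty(
                image="111122223333.dkr.ecr.us-east-1.amazonaws.com/text-embedding:latest",
                model_data_url="s3://embedding-models-bucket/text-embedding/model.tar.gz",
            ),
        )

        # Endpoint configuration: instance type and count for real-time traffic.
        endpoint_config = sagemaker.CfnEndpointConfig(
            self, "EmbeddingEndpointConfig",
            production_variants=[
                sagemaker.CfnEndpointConfig.ProductionVariantProperty(
                    model_name=model.attr_model_name,
                    variant_name="AllTraffic",
                    initial_instance_count=2,
                    instance_type="ml.g4dn.xlarge",
                    initial_variant_weight=1.0,
                )
            ],
        )

        # The real-time inference endpoint that OpenSearch Service will invoke.
        sagemaker.CfnEndpoint(
            self, "EmbeddingEndpoint",
            endpoint_config_name=endpoint_config.attr_endpoint_config_name,
        )
```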
ML connectors
To facilitate access to machine learning models hosted on platforms such as SageMaker or Amazon Bedrock, OpenSearch Service provides ML connectors. These connectors enable direct integration between OpenSearch Service and external machine learning models.
In our case, the ML connector allows OpenSearch Service to directly invoke the SageMaker endpoint where our custom text-embedding model is deployed. This built-in integration between OpenSearch Service and the SageMaker hosted model simplifies the overall architecture and eliminates the need for the application layer to manage the communication between these two components.
By using the ML connectors provided by the OpenSearch Service ML plugin, we were able to seamlessly integrate our SageMaker-hosted text-embedding model into the OpenSearch-powered vector search solution. This integration streamlines the data ingestion and querying pipeline, making the implementation simpler and more intuitive.
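The following sketch shows what creating such a connector and registering the remote model can look like through the ML Commons REST API, called here with opensearch-py. The endpoint name, role ARN, and request template are placeholders modeled on the published SageMaker connector blueprint.

```python
# Hedged sketch: create a SageMaker ML connector, then register the remote
# model against it. Registration is asynchronous; poll the returned task
# (GET /_plugins/_ml/tasks/<task_id>) to obtain the model_id.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

connector = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/connectors/_create",
    body={
        "name": "sagemaker-text-embedding",
        "description": "Connector to the SageMaker text-embedding endpoint",
        "version": 1,
        "protocol": "aws_sigv4",
        "parameters": {"region": "us-east-1", "service_name": "sagemaker"},
        "credential": {
            "roleArn": "arn:aws:iam::111122223333:role/opensearch-sagemaker-role"
        },
        "actions": [{
            "action_type": "predict",
            "method": "POST",
            "headers": {"content-type": "application/json"},
            "url": "https://runtime.sagemaker.us-east-1.amazonaws.com"
                   "/endpoints/text-embedding/invocations",
            # Request template; the exact shape depends on the model container.
            "request_body": '{"inputs": ${parameters.inputs}}',
        }],
    },
)

task = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/models/_register",
    params={"deploy": "true"},  # register and deploy in one call
    body={
        "name": "sports-text-embedding",
        "function_name": "remote",
        "connector_id": connector["connector_id"],
    },
)
```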
Neural search
To simplify the application layer of our vector search solution, we used the Neural Search capabilities provided by OpenSearch Service. This feature allows us to send only the text data to the index, without the need to explicitly manage vector embedding generation and indexing. Using neural search helped simplify the application layer of the system by abstracting away the generation and management of the vectors required to perform a KNN search. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a vector index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and sports event embeddings, and returns the closest results. This removes the need to integrate with SageMaker in the application layer to generate vector embeddings during ingestion and search.
The process of setting up a neural search index with a SageMaker-hosted inference endpoint involves the following detailed steps:
- Create an ML connector and register your model in OpenSearch Service: This step generates a model ID that you’ll need in the subsequent neural index setup.
- Create a neural ingest pipeline: An ingest pipeline is a sequence of processors that are applied to documents as they’re ingested into an index. To enable neural search, you can define the text_embedding processor in the pipeline. This processor converts the text in a document field to vector embeddings, and the field_map configuration determines the input and output fields for this process.
- Create the neural search index: To use the text embedding processor defined in the ingest pipeline, you can create a KNN index and specify the pipeline created in the previous step as the default pipeline.
- Run a neural query: To verify your neural search setup, run a neural query by providing a search text and evaluate the results.
By following these steps, you can set up a neural search index in OpenSearch Service and run a neural query. The neural query can perform KNN vector search internally, while only requiring the input of text data during both indexing and querying. This simplifies the application layer and uses the built-in vector embedding generation and indexing capabilities provided by the OpenSearch Service Neural Search feature.
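Continuing the Python examples, here is a minimal sketch of steps 2 through 4 (step 1 produced the model_id used below). The pipeline, index, and field names, along with the embedding dimension and KNN method, are assumptions for illustration.

```python
# Hedged sketch of the neural search setup: an ingest pipeline with a
# text_embedding processor, a KNN index using it by default, and a neural query.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
model_id = "<model-id-from-registration>"  # from the connector/register step

# Step 2: ingest pipeline; field_map maps the input text field to the
# output vector field.
client.ingest.put_pipeline(
    id="sports-embedding-pipeline",
    body={
        "description": "Generate embeddings for sports events",
        "processors": [{
            "text_embedding": {
                "model_id": model_id,
                "field_map": {"event_text": "event_embedding"},
            }
        }],
    },
)

# Step 3: KNN index with the ingest pipeline applied by default.
client.indices.create(
    index="sports-events",
    body={
        "settings": {
            "index.knn": True,
            "default_pipeline": "sports-embedding-pipeline",
        },
        "mappings": {
            "properties": {
                "event_text": {"type": "text"},
                "event_embedding": {
                    "type": "knn_vector",
                    "dimension": 768,  # assumed embedding dimension
                    "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
                },
            }
        },
    },
)

# Step 4: neural query; OpenSearch embeds the query text internally.
response = client.search(
    index="sports-events",
    body={
        "query": {
            "neural": {
                "event_embedding": {
                    "query_text": "tnf",
                    "model_id": model_id,
                    "k": 10,
                }
            }
        }
    },
)
```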
Outcomes
The initial launch of this architecture for sports search had a measurably positive impact on customer experience. We observed a statistically significant increase in search-attributed conversions, including streams, purchases, and subscriptions. Offline analysis of the results delivered to customers indicated an improvement in the precision of search results and a reduction in the irrelevance rate of the content shown.
Additionally, we saw that customers engaged with the search feature more frequently, as it was now surfacing results that much more closely aligned with what they were looking for. This increased engagement led to greater discovery of relevant titles on the Prime Video service, including titles that had received little engagement prior to the changes.
Overall, the data clearly demonstrated that by tailoring the search experience to the specific needs of sports fans, we significantly improved their ability to find and access desired content. By developing a smarter search system that better understands sports intent, we have driven more meaningful customer activity and increased conversions directly from search interactions.
Conclusion
By using the innovative AI/ML capabilities of Amazon OpenSearch Service, Prime Video was able to create a cutting-edge search experience that effectively addressed the unique challenges presented by highly dynamic, high-volume sports content. In addition, by overcoming the hurdles that come with such large scale, Prime Video Sports Search was able to contribute valuable improvements and enhancements back to the OpenSearch open source community. These contributions help to pave the way for other developers to more readily use the advanced AI/ML features that OpenSearch Service offers.
This collaboration between Prime Video Sports Search and OpenSearch Service has resulted in a best-in-class search capability that can seamlessly accommodate the unique requirements of live sports content. It’s a partnership that has allowed the products to grow and innovate in tandem, to the benefit of customers seeking exceptional search and discovery experiences.
If you want to build a search experience that understands user intent beyond keyword matching, try semantic search with OpenSearch Service and its AI/ML capabilities. If you have any questions, leave a comment below.
About the authors
Radhika Chandak is a Software Development Engineer at Amazon Prime Video, where she has been working for the past 3 years. Her focus is on creating high-velocity customer experiences, with a particular emphasis on building state-of-the-art search experiences for sports content. Radhika is passionate about developing solutions that solve customer problems and delight users. Her expertise lies in crafting innovative approaches to enhance the Prime Video Sports platform, ensuring seamless and engaging experiences for sports enthusiasts.
Anna Chalupowicz is a Software Development Manager at Amazon Prime Video Sports, with 6 years of diverse experience within Amazon. For the last 3.5 years, Anna has been working in Prime Video Sports, where she focuses on developing high-scale solutions and architectural approaches that directly benefit customers. With a passion for collaborative learning and knowledge sharing, Anna finds joy in tackling complex technical challenges and using data-driven insights to enhance the customer experience.
Yaliang Wu is a Software Engineering Manager at AWS, focusing on OpenSearch projects, machine learning, and generative AI applications.