Build video analysis with Amazon Nova on AWS Bedrock. Production-ready TypeScript code for object detection and S3 processing.

TL;DR: Amazon Nova on AWS Bedrock enables video analysis with object detection, bounding boxes, and content summarization via TypeScript. It supports local videos up to 25MB and S3-hosted videos up to 1GB across formats like MP4, MKV, and WebM.
Amazon Nova is a generative AI service on AWS Bedrock that enables developers to build intelligent video analysis applications -- from content annotation and object detection to automated video summarization. This guide walks through Amazon Nova's multimodal capabilities with production-ready TypeScript examples for processing both local and S3-hosted videos, including bounding box object detection.
Amazon Nova is Amazon's generative AI service specifically designed for multimodal content analysis, with robust capabilities for processing video data. Nova can:
These capabilities make Nova an ideal choice for applications requiring deep video analysis, content moderation, accessibility features, and automated content summarization.
Before diving into implementation, ensure you have:
Full-fledged TypeScript examples are available in the GitHub repository.
First, install the necessary dependencies for working with Nova:
Refer to the package.json below for the dependencies:
You'll also need to set up your AWS credentials either via environment variables or AWS CLI configuration.
One of the most common use cases is analyzing videos stored in S3 buckets. The core difference between processing local videos and S3 videos is the size limit: local videos can be at most 25MB, while S3 videos can be up to 1GB. Only the code relevant to the request is shown here.
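As a minimal sketch of the S3 case, the helper below builds a user message whose video block references an S3 object. It assumes the Bedrock Converse API message shape (`video.format`, `video.source.s3Location`); the function name `buildS3VideoMessage` and the bucket, key, and account ID in the usage note are hypothetical placeholders, not taken from the original repository.

```typescript
// Sketch only: field names follow the Bedrock Converse API video content block.
// buildS3VideoMessage is a hypothetical helper, not part of the AWS SDK.

type VideoFormat = "mp4" | "mkv" | "mov" | "webm";

interface S3VideoBlock {
  video: {
    format: VideoFormat;
    source: { s3Location: { uri: string; bucketOwner?: string } };
  };
}

interface TextBlock {
  text: string;
}

interface UserMessage {
  role: "user";
  content: Array<S3VideoBlock | TextBlock>;
}

/** Builds a user message that points the model at a video stored in S3. */
function buildS3VideoMessage(
  s3Uri: string,
  format: VideoFormat,
  prompt: string,
  bucketOwner?: string,
): UserMessage {
  if (!s3Uri.startsWith("s3://")) {
    throw new Error(`Expected an s3:// URI, got: ${s3Uri}`);
  }
  // Only include bucketOwner when the caller supplied one.
  const s3Location = bucketOwner ? { uri: s3Uri, bucketOwner } : { uri: s3Uri };
  return {
    role: "user",
    content: [{ video: { format, source: { s3Location } } }, { text: prompt }],
  };
}
```

A call such as `buildS3VideoMessage("s3://example-bucket/clip.mp4", "mp4", "Summarize this video.", "123456789012")` yields an object that could be passed as one entry of the `messages` array of a Converse request.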
For development and testing, you might want to process videos stored locally. The core difference in the request schema is the "source" field: for local videos, "source" contains "bytes", while for S3 videos it contains "s3Location", which also carries a "bucketOwner" field.
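The local variant can be sketched the same way. The helper below wraps raw video bytes in a `source.bytes` block and rejects payloads over the 25MB limit before any request is made. It assumes the Converse API, which takes raw bytes (an InvokeModel body would carry a base64 string instead), and it reads "25MB" as 25 MiB; both the helper name and that interpretation are assumptions.

```typescript
// Sketch only: buildInlineVideoBlock is a hypothetical helper.
// Assumption: the 25MB limit is interpreted as 25 * 1024 * 1024 bytes.
const MAX_INLINE_VIDEO_BYTES = 25 * 1024 * 1024;

type InlineVideoFormat = "mp4" | "mkv" | "mov" | "webm";

interface InlineVideoBlock {
  video: {
    format: InlineVideoFormat;
    source: { bytes: Uint8Array };
  };
}

/** Wraps local video bytes in a content block, enforcing the inline size limit. */
function buildInlineVideoBlock(
  bytes: Uint8Array,
  format: InlineVideoFormat,
): InlineVideoBlock {
  if (bytes.byteLength === 0) {
    throw new Error("Video payload is empty");
  }
  if (bytes.byteLength > MAX_INLINE_VIDEO_BYTES) {
    throw new Error(
      `Video is ${bytes.byteLength} bytes; inline uploads are limited to ` +
        `${MAX_INLINE_VIDEO_BYTES} bytes. Upload to S3 and use s3Location instead.`,
    );
  }
  return { video: { format, source: { bytes } } };
}
```

In a Node.js script the bytes would typically come from `fs.readFileSync(path)`, whose `Buffer` result is already a `Uint8Array`.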
You should see a response similar to the one below (under "Results:"), along with logs that help with debugging:
One of Nova's most powerful features is its ability to detect objects in videos and provide bounding box coordinates. This can be used for applications like content moderation, accessibility, or interactive video experiences.
The following example demonstrates how to specify the prompt for object detection and process the image with the Nova API using retry logic.
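One way to handle the detection output is sketched below. It assumes the prompt asks for a JSON array of `{label, box}` entries with coordinates on a 0-1000 scale, then converts them to pixel coordinates for a frame of known size. The prompt wording, the JSON shape, and the helper names are illustrative assumptions, not the article's original code; retry handling is left out here.

```typescript
// Sketch only: the prompt text and the response shape are assumptions.
const DETECTION_PROMPT =
  "Detect the main objects in this image. Return only a JSON array like " +
  '[{"label": "car", "box": [x1, y1, x2, y2]}] with coordinates scaled to 0-1000.';

interface DetectedBox {
  label: string;
  box: [number, number, number, number]; // [x1, y1, x2, y2] on a 0-1000 scale
}

interface PixelBox {
  label: string;
  left: number;
  top: number;
  right: number;
  bottom: number;
}

function isDetectedBox(value: unknown): value is DetectedBox {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { label?: unknown; box?: unknown };
  return (
    typeof candidate.label === "string" &&
    Array.isArray(candidate.box) &&
    candidate.box.length === 4 &&
    candidate.box.every((n) => typeof n === "number" && Number.isFinite(n))
  );
}

/** Extracts well-formed detections from model text, tolerating surrounding prose. */
function parseDetections(modelText: string): DetectedBox[] {
  const start = modelText.indexOf("[");
  const end = modelText.lastIndexOf("]");
  if (start === -1 || end <= start) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelText.slice(start, end + 1));
  } catch {
    return []; // Not valid JSON: treat as "no detections" rather than crash.
  }
  return Array.isArray(parsed) ? parsed.filter(isDetectedBox) : [];
}

/** Converts a 0-1000 scaled box to pixel coordinates for a given frame size. */
function toPixels(detection: DetectedBox, width: number, height: number): PixelBox {
  const [x1, y1, x2, y2] = detection.box;
  return {
    label: detection.label,
    left: Math.round((x1 / 1000) * width),
    top: Math.round((y1 / 1000) * height),
    right: Math.round((x2 / 1000) * width),
    bottom: Math.round((y2 / 1000) * height),
  };
}
```

Validating each entry before use keeps a malformed or partially hallucinated response from breaking the code that draws the boxes.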
[Figures: sample images annotated with the detected bounding boxes]
Based on the examples and documentation, here are some best practices to follow when working with Nova:
Implement exponential backoff with jitter for API calls:
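A minimal sketch of this pattern follows, using the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The names `withRetry` and `backoffDelayMs` are hypothetical. For brevity the sketch retries on every error; real code would likely retry only on throttling or other transient failures.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // ceiling for the first retry delay
  maxDelayMs: number; // upper bound on any single delay
  random?: () => number; // injectable for tests; defaults to Math.random
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

/** Full-jitter delay: uniform in [0, min(maxDelayMs, baseDelayMs * 2^attempt)). */
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

/** Runs `operation`, retrying failed attempts with exponential backoff and jitter. */
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  const random = options.random ?? Math.random;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === options.maxAttempts - 1) break; // out of attempts
      await sleep(
        backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs, random),
      );
    }
  }
  throw lastError;
}
```

Jitter matters because many clients that are throttled at the same moment would otherwise retry at the same moment too; randomizing the delay spreads those retries out.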
Nova's pricing is based on token usage. Monitor and log token consumption:
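As a rough sketch of what such monitoring could look like, the tracker below accumulates the `usage` object returned with each Converse response (`inputTokens`, `outputTokens`, `totalTokens`) and estimates cost. The `UsageTracker` class is hypothetical, and the per-1K-token prices are caller-supplied placeholders, not actual Nova pricing.

```typescript
// Sketch only: UsageTracker is a hypothetical helper, not part of the AWS SDK.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Placeholder prices in your billing currency per 1,000 tokens.
interface TokenPricing {
  inputPer1K: number;
  outputPer1K: number;
}

interface UsageSummary extends TokenUsage {
  requests: number;
  estimatedCost: number;
}

class UsageTracker {
  private totals: TokenUsage = { inputTokens: 0, outputTokens: 0, totalTokens: 0 };
  private requests = 0;

  /** Records the usage from one response; missing fields count as zero. */
  record(usage?: Partial<TokenUsage>): void {
    this.requests += 1;
    this.totals.inputTokens += usage?.inputTokens ?? 0;
    this.totals.outputTokens += usage?.outputTokens ?? 0;
    this.totals.totalTokens += usage?.totalTokens ?? 0;
  }

  /** Returns cumulative totals plus a cost estimate for the given prices. */
  summary(pricing: TokenPricing): UsageSummary {
    const estimatedCost =
      (this.totals.inputTokens / 1000) * pricing.inputPer1K +
      (this.totals.outputTokens / 1000) * pricing.outputPer1K;
    return { ...this.totals, requests: this.requests, estimatedCost };
  }
}
```

Calling `tracker.record(response.usage)` after each request and logging `tracker.summary(pricing)` at the end of a batch gives a per-run view of consumption.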
Amazon Nova's video analysis capabilities can be applied in various domains:
Amazon Nova represents a significant advancement in multimodal AI capabilities, particularly for video analysis. By providing developers with powerful tools to extract meaning from video content, Nova enables a wide range of applications that were previously challenging to implement.
As multimodal AI continues to evolve, services like Nova will become increasingly important for developers looking to build sophisticated applications that can understand and process visual content. By following the guidelines and best practices outlined in this article, you can effectively leverage Nova's capabilities to enhance your applications with intelligent video analysis.
Amazon Nova is a generative AI service on AWS Bedrock that processes video content to extract semantic understanding, detect objects with bounding boxes, and generate detailed descriptions across formats like MP4, MKV, MOV, and WebM.
Amazon Nova integrates natively with AWS services like S3 and supports videos up to 1GB via S3 URIs, making it ideal for AWS-centric architectures. It offers multimodal understanding including object detection, scene analysis, and content summarization through a unified API.
Developers building content moderation systems, intelligent video search, accessibility features, security surveillance, or automated video summarization on AWS infrastructure will benefit most from Amazon Nova's capabilities.
Aaron is an engineering leader, software architect, and founder with 18 years of experience building distributed systems and cloud infrastructure. He now focuses on LLM-powered platforms, agent orchestration, and production AI, and shares hands-on technical guides and framework comparisons at fp8.co.