Build video analysis with Amazon Nova on AWS Bedrock. Production-ready TypeScript code for object detection, bounding boxes, and S3 video processing included.

Amazon Nova is a family of generative AI models on Amazon Bedrock that enables developers to build intelligent video analysis applications, from content annotation and object detection to automated video summarization. This guide walks through Nova's multimodal capabilities with production-ready TypeScript examples for processing both local and S3-hosted videos, including bounding-box object detection.
Nova's models are designed for multimodal content analysis, with robust capabilities for processing video data. Nova can:

- Analyze and summarize video content
- Detect objects and return bounding box coordinates
- Annotate content for moderation and accessibility workflows
- Process both local videos (up to 25MB inline) and S3-hosted videos (up to 1GB)
These capabilities make Nova an ideal choice for applications requiring deep video analysis, content moderation, accessibility features, and automated content summarization.
Before diving into implementation, ensure you have:

- An AWS account with Amazon Bedrock access and the Nova models enabled in your region
- Node.js and TypeScript installed
- AWS credentials configured (covered below)
Full-fledged TypeScript examples are available in the GitHub repository.
First, install the necessary dependencies for working with Nova:
Refer to the package.json below for the dependencies:
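A minimal setup might look like the following (package versions are illustrative; `@aws-sdk/client-bedrock-runtime` is the package that exposes the Bedrock Runtime client):

```json
{
  "name": "nova-video-analysis",
  "type": "module",
  "dependencies": {
    "@aws-sdk/client-bedrock-runtime": "^3.700.0"
  },
  "devDependencies": {
    "typescript": "^5.0.0",
    "@types/node": "^20.0.0"
  }
}
```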
You'll also need to set up your AWS credentials either via environment variables or AWS CLI configuration.
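For the environment-variable route, a minimal setup looks like this (the values below are placeholders, not real credentials):

```shell
# Placeholder credentials; replace with values from the IAM console.
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_REGION="us-east-1"   # a region where the Nova models are available
```

Alternatively, run `aws configure` and the SDK will pick up the credentials automatically.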
One of the most common use cases is analyzing videos stored in S3 buckets. A core difference between processing local videos and S3 videos is the size limit: local videos may be at most 25MB, while S3-hosted videos may be up to 1GB. Here's an example; only the relevant code is shown.
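As a sketch of the request shape: the model ID, bucket URI, and account ID below are placeholders, and the builder is kept SDK-free so the actual `ConverseCommand` call can be shown in comments.

```typescript
// Builds a Converse-style request for a video stored in S3. The model ID,
// bucket URI, and bucket owner account ID used here are placeholders.
function buildS3VideoRequest(s3Uri: string, bucketOwner: string, prompt: string) {
  return {
    modelId: "us.amazon.nova-lite-v1:0", // assumed Nova model/inference profile ID
    messages: [
      {
        role: "user" as const,
        content: [
          {
            video: {
              format: "mp4",
              // S3 videos may be up to 1GB; "bucketOwner" is the AWS account
              // ID that owns the bucket.
              source: { s3Location: { uri: s3Uri, bucketOwner } },
            },
          },
          { text: prompt },
        ],
      },
    ],
  };
}

// With @aws-sdk/client-bedrock-runtime installed, send it like:
//   import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";
//   const client = new BedrockRuntimeClient({ region: "us-east-1" });
//   const response = await client.send(
//     new ConverseCommand(
//       buildS3VideoRequest("s3://my-bucket/demo.mp4", "111122223333", "Summarize this video."),
//     ),
//   );
```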
For development and testing, you may want to process videos stored locally. The key difference in the request schema is the "source" field: for local videos it is "bytes", while for S3 videos it is "s3Location" together with a "bucketOwner" field.
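A sketch of the local-video variant (the model ID is again a placeholder), including a guard for the 25MB inline limit mentioned above:

```typescript
const MAX_LOCAL_VIDEO_BYTES = 25 * 1024 * 1024; // 25MB limit for inline video bytes

// Builds a request whose video source is raw bytes rather than an S3 location.
function buildLocalVideoRequest(bytes: Uint8Array, prompt: string) {
  if (bytes.byteLength > MAX_LOCAL_VIDEO_BYTES) {
    throw new Error(
      `Local video exceeds 25MB (${bytes.byteLength} bytes); upload it to S3 instead.`,
    );
  }
  return {
    modelId: "us.amazon.nova-lite-v1:0", // assumed Nova model/inference profile ID
    messages: [
      {
        role: "user" as const,
        content: [
          { video: { format: "mp4", source: { bytes } } },
          { text: prompt },
        ],
      },
    ],
  };
}

// Usage (path is a placeholder):
//   import { readFileSync } from "node:fs";
//   const request = buildLocalVideoRequest(readFileSync("./clips/demo.mp4"), "Describe this clip.");
```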
You should see a response similar to the one below (under "Results:"), along with logs useful for debugging:
One of Nova's most powerful features is its ability to detect objects in videos and provide bounding box coordinates. This can be used for applications like content moderation, accessibility, or interactive video experiences.
The following example demonstrates how to specify the prompt for object detection and process the image with the Nova API using retry logic.
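A sketch of the prompt and response handling; the 0-1000 coordinate scale and the JSON reply format are assumptions you should verify against actual model output, and the retry logic itself is covered under best practices below.

```typescript
interface BoundingBox {
  label: string;
  box: [number, number, number, number]; // [x1, y1, x2, y2]
}

// Prompt asking the model to reply with machine-readable JSON only.
const DETECTION_PROMPT = `Detect every distinct object in the frame.
Respond with only a JSON array, e.g. [{"label": "dog", "box": [x1, y1, x2, y2]}],
with coordinates on a 0-1000 scale.`;

// Parses the model's text reply and scales assumed 0-1000 normalized
// coordinates to pixel values for a frame of the given dimensions.
function parseBoundingBoxes(reply: string, width: number, height: number): BoundingBox[] {
  const start = reply.indexOf("[");
  const end = reply.lastIndexOf("]");
  if (start === -1 || end === -1) throw new Error("No JSON array found in model reply");
  const raw = JSON.parse(reply.slice(start, end + 1)) as BoundingBox[];
  return raw.map(({ label, box: [x1, y1, x2, y2] }) => ({
    label,
    box: [
      (x1 / 1000) * width,
      (y1 / 1000) * height,
      (x2 / 1000) * width,
      (y2 / 1000) * height,
    ] as [number, number, number, number],
  }));
}
```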
Check the sample images below with bounding boxes detected:





Based on the examples and documentation, here are some best practices to follow when working with Nova:
Implement exponential backoff with jitter for API calls:
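A minimal sketch of the pattern; the base delay, cap, and attempt count are illustrative defaults, not tuned values:

```typescript
// Exponential backoff with "full jitter": each delay is drawn uniformly
// from [0, min(cap, base * 2^attempt)], so concurrent clients don't retry
// in lockstep.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 10_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // In production you would only retry throttling/transient errors;
      // retrying on any error keeps this sketch short.
      lastError = err;
      if (attempt === maxAttempts - 1) break;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```

Wrap each Nova call in `withRetries(() => client.send(command))` so throttled requests are retried with increasing, jittered delays.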
Nova's pricing is based on token usage. Monitor and log token consumption:
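A sketch, assuming the `usage` block shape returned by the Converse API; the per-token prices passed in are placeholders, not actual Nova rates:

```typescript
// Shape of the `usage` block in a Converse API response.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Logs token counts and an estimated cost. Pass in your current per-token
// prices; the rates are parameters because pricing varies by model and region.
function logTokenUsage(
  usage: TokenUsage,
  perInputTokenUsd: number,
  perOutputTokenUsd: number,
): number {
  const cost =
    usage.inputTokens * perInputTokenUsd + usage.outputTokens * perOutputTokenUsd;
  console.log(
    `tokens in=${usage.inputTokens} out=${usage.outputTokens} ` +
      `total=${usage.totalTokens} est. cost=$${cost.toFixed(6)}`,
  );
  return cost;
}
```

Call it after each request, e.g. `logTokenUsage(response.usage, inputRate, outputRate)`, and ship the numbers to your metrics pipeline.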
Amazon Nova's video analysis capabilities can be applied in various domains:

- Content moderation and policy enforcement
- Accessibility, such as generating descriptions of visual content
- Automated summarization and annotation for media libraries
- Interactive video experiences built on object detection
Amazon Nova represents a significant advancement in multimodal AI capabilities, particularly for video analysis. By providing developers with powerful tools to extract meaning from video content, Nova enables a wide range of applications that were previously challenging to implement.
As multimodal AI continues to evolve, services like Nova will become increasingly important for developers looking to build sophisticated applications that can understand and process visual content. By following the guidelines and best practices outlined in this article, you can effectively leverage Nova's capabilities to enhance your applications with intelligent video analysis.
Aaron is an engineering leader, software architect, and founder with 18 years building distributed systems and cloud infrastructure. Now focused on LLM-powered platforms, agent orchestration, and production AI. He shares hands-on technical guides and framework comparisons at fp8.co.