Amazon DocumentDB

Amazon DocumentDB (with MongoDB compatibility) is designed from the ground up to give users the performance, scalability, and availability they need when operating mission-critical MongoDB workloads at scale. Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs by emulating the responses that a MongoDB client expects from a MongoDB server, allowing existing MongoDB drivers and tools to be used with Amazon DocumentDB. Developers can use the same MongoDB application code, drivers, and tools as they do today to run, manage, and scale workloads on Amazon DocumentDB and enjoy improved performance, scalability, and availability without having to worry about managing the underlying infrastructure.

  • Customers can use AWS Database Migration Service (DMS) for free (for six months) to easily migrate their on-premises or Amazon Elastic Compute Cloud (EC2) MongoDB non-relational databases to Amazon DocumentDB with virtually no downtime.
  • When developing modern applications, document databases like MongoDB are a popular choice for storing semi-structured data for use cases like product catalogs, user profiles, mobile applications, and content management. 
  • Amazon DocumentDB automatically grows the size of storage volume as database storage needs grow. The storage volume grows in increments of 10 GB, up to a maximum of 64 TB.
  • Amazon DocumentDB has a database architecture that was designed for the cloud. It is built on a distributed, fault-tolerant, self-healing storage system that provides the performance, scalability, and availability necessary when operating mission-critical MongoDB workloads at scale.

Amazon DocumentDB Benefits

Amazon DocumentDB increases read throughput to support high-volume application requests by creating up to 15 replica instances. Amazon DocumentDB replicas share the same underlying storage, lowering costs and avoiding the need to perform writes at the replica nodes. This capability frees up more processing power to serve read requests and reduces the replica lag time—often down to single digit milliseconds. Users can add replicas in minutes regardless of the storage volume size. Amazon DocumentDB also provides a reader endpoint, so the application can connect without having to track replicas as they are added and removed.

With Amazon DocumentDB, users don’t need to worry about database management tasks such as hardware provisioning, patching, setup, configuration, backups, or scaling. Amazon DocumentDB automatically and continuously monitors the cluster and backs it up to Amazon S3, enabling point-in-time recovery. This feature allows users to restore the cluster to any second during the retention period, up to the last five minutes. Users can configure the automatic backup retention period for up to 35 days. Amazon DocumentDB backups are automatic, incremental, and continuous, and they have no impact on cluster performance.

Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs. As a result, users can use the same MongoDB drivers, applications, and tools with Amazon DocumentDB with little or no change. By emulating the responses that a MongoDB client expects from a MongoDB server, Amazon DocumentDB allows customers to keep their existing MongoDB drivers and tools. Updating the application is as easy as changing the database endpoint to the new Amazon DocumentDB cluster.

Amazon DocumentDB continuously monitors the health of each cluster. On an instance failure, Amazon DocumentDB automatically restarts the instance and associated processes. Amazon DocumentDB doesn’t require a crash recovery replay of database redo logs, which greatly reduces restart times. Amazon DocumentDB also isolates the database cache from the database process, enabling the cache to survive an instance restart. Amazon DocumentDB runs in Amazon Virtual Private Cloud (Amazon VPC), so that users can isolate the database in their own virtual network.

Amazon DocumentDB Features

MongoDB-compatible
A vast majority of the applications, drivers, and tools that customers already use today with their MongoDB non-relational database can be used with Amazon DocumentDB with little or no change. Amazon DocumentDB emulates the responses that a client expects from a MongoDB server by implementing the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs on a purpose-built, distributed, fault-tolerant, self-healing storage system that gives customers the performance, scalability, and availability they need when operating mission-critical MongoDB workloads at scale.
 
  • ACID Transactions: With the launch of support for MongoDB 4.0 compatibility, Amazon DocumentDB supports the ability to perform ACID transactions across multiple documents, statements, collections, and databases.
  • Migration support: Customers can easily migrate their MongoDB databases on-premises or on Amazon EC2 to Amazon DocumentDB with virtually no downtime using the AWS Database Migration Service (DMS). With DMS, users can migrate from a MongoDB replica set or from a sharded cluster to Amazon DocumentDB.
Fully Managed
Automatic Provisioning and setup: Getting started with Amazon DocumentDB is easy. Just launch a new Amazon DocumentDB cluster using the AWS Management Console. Amazon DocumentDB instances are pre-configured with parameters and settings appropriate for the instance class selected. Users can launch a cluster and connect the application within minutes without additional configuration.
 
Monitoring and Metrics: Amazon DocumentDB provides Amazon CloudWatch metrics for cloud database instances. Users can use the AWS Management Console to view over 40 key operational metrics for their cluster, including compute, memory, storage, query throughput, MongoDB opcounters, and active connections.
 
Automatic Software Patching: Amazon DocumentDB will keep the database up-to-date with the latest patches. Users can control if and when the cluster is patched via Database Engine Version Management.
Performance at scale
High Throughput, Low Latency for Document Queries: Amazon DocumentDB has a flexible JSON document model, data types, and efficient indexing, and it uses a scale-up, in-memory optimized architecture to allow for fast query evaluation over large sets of documents. 
 
Easy Scaling of Database Compute Resources: With a few clicks in the AWS Management Console, users can scale the compute and memory resources that power their cluster up or down by creating new replica instances of the desired size or by removing instances. Compute scaling operations typically complete in a few minutes.
 
Storage that Automatically Scales: Amazon DocumentDB will automatically grow the size of the storage volume as the cluster storage needs grow. The storage volume will grow in increments of 10 GB up to a maximum of 64 TB. Users don’t have to provision excess storage for their NoSQL database to handle future growth.
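The growth rule above can be illustrated with a little arithmetic. The following is a minimal sketch, not the service's actual allocation logic: the function name is hypothetical, 1 TB is assumed to be 1024 GB, and a minimum of one 10 GB increment is an assumption made for illustration.

```python
import math

INCREMENT_GB = 10           # growth step stated in the text
MAX_VOLUME_GB = 64 * 1024   # 64 TB maximum, assuming 1 TB = 1024 GB


def volume_size_gb(used_gb: float) -> int:
    """Smallest multiple of 10 GB that holds used_gb, capped at 64 TB.

    Hypothetical helper: it models only the "grows in 10 GB increments"
    rule and assumes at least one increment is always allocated.
    """
    if used_gb < 0:
        raise ValueError("used_gb must be non-negative")
    if used_gb > MAX_VOLUME_GB:
        raise ValueError("data exceeds the 64 TB maximum volume size")
    steps = max(1, math.ceil(used_gb / INCREMENT_GB))
    return min(steps * INCREMENT_GB, MAX_VOLUME_GB)
```

Under these assumptions, a cluster holding 25 GB of data would sit on a 30 GB volume, and the volume would grow by one more increment once the data passes 30 GB.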
 
Low Latency Read Replicas: Increase read throughput to support high volume application requests by creating up to 15 database read replicas. Amazon DocumentDB replicas share the same underlying storage as the source instance, lowering costs and avoiding the need to perform writes at the replica nodes. This frees up more processing power to serve read requests and reduces the replica lag time–often down to single digit milliseconds. Amazon DocumentDB also provides a single endpoint for read queries, so the application can connect without having to keep track of replicas as they are added and removed.
Highly Secure and Compliant
Network Isolation: Amazon DocumentDB runs in Amazon VPC, which allows users to isolate the cluster in their own virtual network and connect to on-premises IT infrastructure using industry-standard encrypted IPsec VPNs. 
 
Authorization: Amazon DocumentDB supports role-based access control (RBAC) with built-in roles. RBAC enables you to enforce least privilege as a best practice by restricting the actions that users are authorized to perform. Amazon DocumentDB is integrated with AWS Identity and Access Management (IAM) and provides users the ability to control the actions that the AWS IAM users and groups can take on specific Amazon DocumentDB resources, including clusters, instances, snapshots, and parameter groups. 
 
Encryption: Amazon DocumentDB allows users to encrypt their databases using keys they create and control through AWS Key Management Service (KMS). On a cluster running with Amazon DocumentDB encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.
 
Compliance Certifications: Amazon DocumentDB was designed to meet the highest security standards and to make it easy for users to verify AWS security and meet their own regulatory and compliance obligations. Amazon DocumentDB has been assessed to comply with PCI DSS; ISO 9001, 27001, 27017, and 27018; SOC 1, 2, and 3; and Health Information Trust Alliance Common Security Framework certification, in addition to being HIPAA eligible.
Highly Available
Instance Monitoring and Repair: The health of the Amazon DocumentDB cluster and its instances is continuously monitored. If the instance powering the database fails, the instance and associated processes are automatically restarted. Amazon DocumentDB recovery does not require the potentially lengthy replay of database redo logs, so instance restart times are typically 30 seconds or less. It also isolates the database cache from database processes, allowing the cache to survive a database restart.
 
Multi-AZ Deployments with Read Replicas: On instance failure, Amazon DocumentDB automates failover to one of up to 15 Amazon DocumentDB replicas that users have created in any of three Availability Zones. If no Amazon DocumentDB replicas have been provisioned, Amazon DocumentDB will attempt to create a new instance automatically when a failure occurs.
 
Fault-tolerant and Self-healing Storage: Each 10 GB portion of the storage volume is replicated six ways, across three Availability Zones. Amazon DocumentDB uses fault-tolerant storage that transparently handles the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Amazon DocumentDB’s storage is also self-healing; data blocks and disks are continuously scanned for errors and replaced automatically.
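The tolerance figures above can be written down as a small check. This is a sketch that only encodes the numbers stated in the text (six copies, writes survive the loss of two, reads survive the loss of three); the function name is hypothetical and it does not model how the storage layer actually decides availability.

```python
COPIES = 6                # each segment is replicated six ways
MAX_LOST_FOR_WRITES = 2   # writes tolerate the loss of up to two copies
MAX_LOST_FOR_READS = 3    # reads tolerate the loss of up to three copies


def segment_availability(lost_copies: int) -> dict:
    """Report read/write availability for one segment, per the stated limits."""
    if not 0 <= lost_copies <= COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return {
        "writes": lost_copies <= MAX_LOST_FOR_WRITES,
        "reads": lost_copies <= MAX_LOST_FOR_READS,
    }
```

Because each Availability Zone holds two of the six copies, losing an entire Availability Zone corresponds to `segment_availability(2)`, which still reports both reads and writes as available.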
 
Automatic, Continuous, Incremental Backups and Point-in-time Restore: Amazon DocumentDB’s simple database backup capability enables point-in-time recovery for your clusters. This allows users to restore their cluster to any second during the retention period, up until the last five minutes. The automatic backup retention period can be configured up to thirty-five days. Automated backups are stored in Amazon S3, which is designed for 99.999999999% durability. Amazon DocumentDB backups are automatic, incremental, and continuous and have no impact on cluster performance.
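To make the restore window concrete, the sketch below computes the earliest and latest restorable times from the retention period and the five-minute lag described above. The function names are hypothetical, and the sketch assumes the cluster has existed for the whole retention period and that retention is between 1 and 35 days.

```python
from datetime import datetime, timedelta, timezone

RESTORE_LAG = timedelta(minutes=5)   # "up until the last five minutes"


def restorable_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable times for a given retention period."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - RESTORE_LAG
    return earliest, latest


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if target falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest
```

For example, with a seven-day retention period, a restore to one hour ago is inside the window, while a restore to the current second is not.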
 
Cluster Snapshots: Cluster snapshots are user-initiated backups of the cluster stored in Amazon S3 that are kept until users explicitly delete them. They leverage the automated incremental snapshots to reduce the time and storage required. Users can create a new cluster from a cluster snapshot whenever they desire.

DocumentDB Cloud Native Architecture

Decoupled Compute and Storage

The compute and storage layers are decoupled in Amazon DocumentDB, and can be scaled independently. The primary instance and replicas share the same cluster volume. Adding a read replica or replacing a failed instance does not require copying any data, and can be performed in a few minutes regardless of the size of the data.

Amazon DocumentDB Decoupled compute and storage
Fault-Tolerant Design

In Amazon DocumentDB, the durability is handled at the storage layer. Whether the cluster contains a single instance or 16 instances, users have the same level of durability for their data. Amazon DocumentDB divides its database volume into 10-GB segments, each distributed across the cluster, thus isolating the blast radius of disk failures. Each segment is replicated six ways across three Availability Zones.

Amazon DocumentDB storage is also self-healing; data blocks and disks are continuously scanned for errors and replaced automatically. Amazon DocumentDB monitors disks and storage nodes for failures and automatically replaces or repairs the disks and storage nodes without the need to interrupt read or write processing from the database.

 

Data replicated six ways across three Availability Zones
Low-Latency Read Replicas

Users can create up to 15 Amazon DocumentDB replicas across multiple Availability Zones to scale the read traffic. Amazon DocumentDB replicas share the same underlying storage as the source instance, avoiding the need to copy data to replicas to keep them in sync. This approach frees up more processing power to serve read requests and reduces the replica lag time—typically under 100 milliseconds. As the primary instance and replicas share the same storage, adding a replica does not require any data to be copied. Users can add a replica within minutes regardless of the size of their data.

Add up to 15 replicas in minutes regardless of size of data
AWS Regions and Availability Zones

The AWS Global Infrastructure comprises AWS Regions and Availability Zones. AWS Regions are separate geographic areas. AWS Regions consist of multiple, physically separated and isolated Availability Zones that are connected with low latency, high throughput, highly redundant networking. Availability Zones consist of one or more discrete data centers, each with redundant power, networking, and connectivity, and housed in separate facilities. 

  • These Availability Zones enable users to operate production applications and databases that are more highly available, fault tolerant, and scalable than possible when using a single data center.
  • Users can deploy applications and databases across multiple Availability Zones. In the unlikely event of a failure of one Availability Zone, user requests are routed to their application instances in the second Availability Zone. This approach ensures the application continues to remain available at all times.
AWS Regions and Availability Zones
Limitations of Traditional Architectures

Traditional databases have monolithic architectures—the compute and storage layers are tightly coupled and cannot be scaled independently. Scalability is handled by adding more nodes, each with its own compute and storage. Adding an extra node for scaling or replacing a failed node requires that users copy or replicate the existing data to the new node; this process can take hours, days, or even weeks for large databases.

Architecture of traditional databases

DocumentDB Use Cases

Document databases are useful for workloads that require a flexible schema for fast, iterative development. Users may need to use a document database or some other type of database for managing their data. The following are some examples of use cases for which document databases can provide significant advantages:

User Profiles

Because document databases have a flexible schema, they can store documents that have different attributes and data values. Document databases are a practical solution to online profiles in which different users provide different types of information. Using a document database, users can store each user’s profile efficiently by storing only the attributes that are specific to each user.

  • Suppose that a user elects to add or remove information from their profile. In this case, their document could be easily replaced with an updated version that contains any recently added attributes and data or omits any newly omitted attributes and data. Document databases easily manage this level of individuality and fluidity.
Real-Time Big Data

Historically, the ability to extract information from operational data was hampered by the fact that operational databases and analytical databases were maintained in different environments—operational and business/reporting respectively. Being able to extract operational information in real time is critical in a highly competitive business environment.

  • By using document databases, a business can store and manage operational data from any source and concurrently feed the data to the BI engine of choice for analysis. There is no requirement to have two environments.
Content Management

To effectively manage content, users must be able to collect and aggregate content from a variety of sources, and then deliver it to the customer. Due to their flexible schema, document databases are perfect for collecting and storing any type of data. Users can use them to create and incorporate new types of content, including user-generated content, such as images, comments, and videos.

Content and catalog management

Shopping sites, online publications, digital archives, point-of-sale terminals, and self-service kiosks rely on content and catalog management systems to serve their customers. These systems need fast and reliable access to user reviews, images, ratings, product information, comments, etc.

  • With Amazon DocumentDB’s flexible document model, data types, and indexing, users can store and query content (e.g., user reviews and demo videos for shopping sites) and catalogs (e.g., inventory lists for point-of-sale terminals and financial trades for trading platforms) quickly and intuitively using a simple database service.
Profile management

User profile management enables online transactions, user preferences, and user authentication. With the growth in users, increasingly complex user profile data, and growing user experience expectations, the demand for scalability, data flexibility, and performance too has grown.

  • With Amazon DocumentDB’s document data model, users can manage the profiles and preferences of millions of users and scale to process millions of user requests per second with millisecond latency, using a fully managed non-relational database service.
Mobile and web applications

Build high-performance mobile and web applications that scale to process millions of user requests per second with millisecond latency. As a website database or database for mobile applications, Amazon DocumentDB lowers the operational burden, allowing users to focus on building unique experiences for their customers.

  • Amazon DocumentDB’s flexible document model, data types, and indexing allow users to adapt and iterate on their applications quickly, cutting down development time.

Managing DocumentDB Clusters 

Migrating To DocumentDB

 
 

Users can migrate data from any MongoDB database, either on-premises or in the cloud (e.g., a MongoDB database running on Amazon EC2), to Amazon DocumentDB. There are three primary approaches for migrating data from the source MongoDB database to Amazon DocumentDB: offline, online, and hybrid.

Offline Migration

The simplest approach is to do an offline migration. Because Amazon DocumentDB is compatible with the MongoDB API, users can use the mongodump tool to export the data from MongoDB and the mongorestore tool to restore the data into Amazon DocumentDB. The offline migration method results in downtime while the dump and restore operations are running. This method is suitable for migrating non-production workloads or non-critical databases where users can afford the downtime. The basic process for offline migration is as follows:

  1. Quiesce writes to your MongoDB source.
  2. Dump collection data and indexes from the source MongoDB deployment.
  3. Restore indexes to the Amazon DocumentDB cluster.
  4. Restore collection data to the Amazon DocumentDB cluster.
  5. Change your application endpoint to write to the Amazon DocumentDB cluster.
Online Migration

For migration of production workloads with minimal downtime, users can use the online approach or the hybrid approach. With the online migration approach, users can use AWS Database Migration Service (DMS) to migrate the data from MongoDB to Amazon DocumentDB. DMS performs an initial full load of the data from the MongoDB source to Amazon DocumentDB. During the full load, the source database remains available for operations.

Once the full load is completed, DMS switches to change data capture (CDC) mode to keep the source (MongoDB) and destination (Amazon DocumentDB) in sync. When the databases are in sync, users can switch the applications to point to Amazon DocumentDB with near zero downtime. The basic process for online migration is as follows:

  1. The application uses the source DB normally.
  2. Optionally, pre-create indexes in the Amazon DocumentDB cluster.
  3. Create an AWS DMS task to perform a full load, and then enable CDC from the source MongoDB deployment to the Amazon DocumentDB cluster.
  4. After the AWS DMS task has completed a full load and is replicating changes to the Amazon DocumentDB, switch the application’s endpoint to the Amazon DocumentDB cluster.
 
Hybrid Approach

The hybrid approach is a combination of the offline and online migration approaches. It is useful when users need minimal downtime during migration, but the source database is large or sufficient bandwidth is not available to migrate the data in a reasonable amount of time. The hybrid approach uses the mongodump and mongorestore tools to migrate the bulk of the data from the source MongoDB deployment to the Amazon DocumentDB cluster, and then uses DMS to replicate subsequent changes. The hybrid approach has two phases.

  • In the first phase, users export the data from the source MongoDB using the mongodump tool, transfer it to AWS (if the source is on-premises), and restore it to Amazon DocumentDB. Users can use AWS Direct Connect or AWS Snowball to transfer the export dump to AWS. During this phase, the source (MongoDB) is available for operations and the data restored to Amazon DocumentDB does not contain the latest changes.
  • In the second phase, users use DMS in CDC mode to copy the changes from the source (MongoDB) to Amazon DocumentDB and keep them in sync. Once the databases are in sync, users can switch their applications to point to Amazon DocumentDB with near zero downtime.

The basic process for hybrid migration is as follows:

  1. The application uses the source MongoDB deployment normally.
  2. Dump collection data and indexes from the source MongoDB deployment.
  3. Restore indexes to the Amazon DocumentDB cluster.
  4. Restore collection data to the Amazon DocumentDB cluster.
  5. Create an AWS DMS task to enable CDC from the source MongoDB deployment to the Amazon DocumentDB cluster.
  6. When the AWS DMS task is replicating changes within an acceptable window, change the application endpoint to write to the Amazon DocumentDB cluster.
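The three approaches above differ mainly in whether downtime is acceptable and in how the bulk of the data is moved. The following sketch summarizes that decision as a small function. The class, function, and field names are hypothetical, and the DMS mode strings are descriptive labels rather than API values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MigrationPlan:
    approach: str              # "offline", "online", or "hybrid"
    uses_dump_restore: bool    # mongodump/mongorestore move the bulk data
    dms_mode: Optional[str]    # descriptive label only, not a DMS API value


def choose_plan(downtime_acceptable: bool,
                large_or_bandwidth_limited: bool) -> MigrationPlan:
    """Pick a migration approach following the criteria in the text."""
    if downtime_acceptable:
        # Non-production or non-critical databases: dump and restore only.
        return MigrationPlan("offline", True, None)
    if large_or_bandwidth_limited:
        # Bulk copy with dump/restore, then DMS replicates changes.
        return MigrationPlan("hybrid", True, "CDC only")
    # DMS performs the full load and then switches to CDC.
    return MigrationPlan("online", False, "full load, then CDC")
```

For a production workload with a very large source database, `choose_plan(False, True)` returns the hybrid plan.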
 
 

Amazon DocumentDB separates compute and storage, and offloads data replication and backup to the cluster volume. A cluster volume provides a durable, reliable, and highly available storage layer that replicates data six ways across three Availability Zones. Replicas enable higher data availability and read scaling. Each cluster can scale up to 15 replicas.

A cluster snapshot is a complete backup of the data in an Amazon DocumentDB cluster. When the snapshot is being created, Amazon DocumentDB reads the data directly from the cluster volume. Because of this, users can create a snapshot even if the cluster doesn’t have any instances running at the time. The amount of time it takes to create a snapshot depends on the size of the cluster volume. Amazon DocumentDB also supports automatic backups, which occur daily during the preferred backup window, a 30-minute period of time during the day.

Cluster: Consists of one or more instances and a cluster storage volume that manages the data for those instances.

  • Instance: Reading and writing data to the cluster storage volume is done via instances. In a given cluster, there are two types of instances: primary and replica. A cluster always has one primary instance and can have 0–15 replicas.
  • Primary instance: Supports both read and write operations, and performs all data modifications to the cluster volume. Each cluster has one primary instance.
  • Replica instance: Supports only read operations. Each Amazon DocumentDB cluster can have up to 15 replica instances in addition to the primary instance. Multiple replicas distribute the read workload. By locating replicas in separate Availability Zones, users can also increase database availability.
  • Cluster volume: A virtual database storage volume that spans three Availability Zones, with each Availability Zone having two copies of the cluster data.

Replica Set Mode: When connecting in replica set mode, the Amazon DocumentDB cluster appears to drivers and clients as a replica set. Connecting to the cluster endpoint in replica set mode is the recommended approach for general use. Replica set mode is advantageous for high availability and for effectively balancing client requests in the cluster. Instances added to or removed from the Amazon DocumentDB cluster are reflected automatically in the replica set configuration. Users can connect to their Amazon DocumentDB cluster endpoint in replica set mode by specifying the replica set name rs0. Connecting in replica set mode enables the database client to specify Read Concern, Write Concern, and Read Preference options.

  • Cluster Endpoint: The cluster endpoint connects to the cluster’s current primary instance. The cluster endpoint can be used for read and write operations. The cluster endpoint provides failover support. If the cluster’s current primary instance fails, the cluster endpoint automatically redirects connection requests to a new primary instance.
  • Reader Endpoint: The reader endpoint load balances read-only connections across all available replicas in the cluster, including the primary instance. When a replica instance is added to the Amazon DocumentDB cluster, it is made available for load balancing read connections using the reader endpoint. This means that users do not have to make any application changes while adding or removing read replicas in the cluster.
  • Instance Endpoint: Users can connect to a specific instance in their cluster using its instance endpoint. The recommended way to connect to the cluster is to use the cluster endpoint for read/write operations and the reader endpoint for read operations. Instance endpoints are useful in specific scenarios: for example, if one replica has been provisioned with a larger instance class for analytical workloads, users can connect to its instance endpoint and run analytical queries against the larger instance without affecting other instances in the cluster.
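As an illustration of connecting in replica set mode, the sketch below builds a MongoDB connection URI that names the replica set rs0 and sets a read preference. The function name and the example host are hypothetical, port 27017 is assumed, and TLS and other driver options are deliberately left out, so treat it as a starting point rather than a complete connection string.

```python
from urllib.parse import quote_plus, urlencode


def docdb_uri(cluster_endpoint: str, username: str, password: str,
              port: int = 27017,
              read_preference: str = "secondaryPreferred") -> str:
    """Build a replica-set-mode URI for a cluster endpoint.

    replicaSet=rs0 comes from the text; the read preference value is one
    of the standard MongoDB options and is chosen here only as an example.
    """
    options = urlencode({
        "replicaSet": "rs0",
        "readPreference": read_preference,
    })
    # Percent-encode credentials so characters such as '@' do not break the URI.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mongodb://{user}:{secret}@{cluster_endpoint}:{port}/?{options}"
```

With this URI, writes go to the primary through the cluster endpoint, and a driver that honors the read preference can send reads to replicas as they are added or removed.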

Connecting DocumentDB Clusters 

 
 

 

Backing Up and Restoring

 
 

Amazon DocumentDB (with MongoDB compatibility) continuously backs up data to Amazon Simple Storage Service (Amazon S3) for 1–35 days so that users can quickly restore to any point within the backup retention period. Amazon DocumentDB also takes automatic snapshots of the data as part of this continuous backup process. Users can also retain backup data beyond the backup retention period by creating a manual snapshot of their cluster’s data. The backup process does not impact the cluster’s performance. Users can use the mongodump, mongorestore, mongoexport, and mongoimport utilities to move data in and out of an Amazon DocumentDB cluster.

mongodump

The mongodump utility creates a binary (BSON) backup of a MongoDB database. The mongodump tool is the preferred method of dumping data from a source MongoDB deployment when the goal is to restore it into an Amazon DocumentDB cluster, because of the size efficiencies achieved by storing the data in a binary format. Depending on the resources available on the instance or machine used to run the command, users can speed up mongodump by increasing the number of collections dumped in parallel from the default 1 using the --numParallelCollections option.

  • A good rule of thumb is to start with one worker per vCPU on your Amazon DocumentDB cluster’s primary instance.
mongorestore

The mongorestore utility enables users to restore a binary (BSON) backup of a database that was created with the mongodump utility. Users can improve restore performance by increasing the number of workers for each collection during the restore with the --numInsertionWorkersPerCollection option (the default is 1).

  • A good rule of thumb is to start with one worker per vCPU on your Amazon DocumentDB cluster’s primary instance.
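The rule of thumb above can be applied when assembling the dump and restore commands. The sketch below only builds argument lists for mongodump and mongorestore and does not run them. The function names are hypothetical, and authentication and TLS options are omitted for brevity.

```python
from typing import List


def mongodump_args(source_host: str, out_dir: str, vcpus: int) -> List[str]:
    """Arguments for dumping a source deployment, one parallel collection per vCPU."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return [
        "mongodump",
        "--host", source_host,
        "--out", out_dir,
        f"--numParallelCollections={vcpus}",
    ]


def mongorestore_args(target_host: str, dump_dir: str, vcpus: int) -> List[str]:
    """Arguments for restoring a dump, one insertion worker per vCPU."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return [
        "mongorestore",
        "--host", target_host,
        f"--numInsertionWorkersPerCollection={vcpus}",
        "--dir", dump_dir,
    ]
```

The resulting lists can be passed to `subprocess.run` once credentials and TLS settings have been added. Starting at one worker per vCPU and adjusting from measured throughput follows the guidance in the text.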
mongoexport

The mongoexport tool exports data in Amazon DocumentDB to JSON, CSV, or TSV file formats. The mongoexport tool is the preferred method of exporting data that needs to be human or machine readable.

  • mongoexport does not directly support parallel exports. However, it is possible to increase performance by executing multiple mongoexport jobs concurrently for different collections.
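Running several mongoexport jobs at once, one per collection, can be sketched with a thread pool. The function names are hypothetical, the output path layout is an assumption, and authentication and TLS options are omitted. The `run` parameter defaults to `subprocess.run` but can be replaced, which makes the sketch easy to try without a database.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def mongoexport_args(host: str, db: str, collection: str, out_dir: str) -> List[str]:
    """Arguments for exporting one collection to a JSON file."""
    return [
        "mongoexport",
        "--host", host,
        "--db", db,
        "--collection", collection,
        "--out", f"{out_dir}/{collection}.json",
    ]


def export_concurrently(host: str, db: str, collections: Sequence[str],
                        out_dir: str,
                        run: Callable = subprocess.run,
                        max_workers: int = 4) -> list:
    """Run one mongoexport job per collection, several at a time."""
    commands = [mongoexport_args(host, db, c, out_dir) for c in collections]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input collections.
        return list(pool.map(run, commands))
```

Each job is an independent process, so the parallelism comes from running separate exports side by side rather than from mongoexport itself.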
mongoimport

The mongoimport tool imports the contents of JSON, CSV, or TSV files into an Amazon DocumentDB cluster. Users can use the --numInsertionWorkers parameter to parallelize and speed up the import (the default is 1).

Restoring from a Cluster Snapshot

Amazon DocumentDB (with MongoDB compatibility) creates a cluster snapshot of the storage volume. Users can create a new cluster by restoring from a cluster snapshot. When restoring the cluster, users provide the name of the cluster snapshot to restore from and a name for the new cluster that is created by the restore. Users can’t restore from a snapshot to an existing cluster because a restore always creates a new cluster. When restoring a cluster from a cluster snapshot:

  • This action restores only the cluster, and not the instances for that cluster. Users need to invoke the create-db-instance action to create instances for the restored cluster, specifying the identifier of the restored cluster in --db-cluster-identifier. Users can create instances only after the cluster is available.
  • Users cannot restore an encrypted snapshot to an unencrypted cluster. However, they can restore an unencrypted snapshot to an encrypted cluster by specifying the AWS KMS key.
  • To restore a cluster from an encrypted snapshot, users need to have access to the AWS KMS key.
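The two-step restore described above (restore the cluster, then create its instances) can be sketched as AWS CLI argument lists. The function names are hypothetical and all identifiers in the usage note are placeholders. The sketch only assembles commands; it does not run them or wait for the cluster to become available.

```python
from typing import List, Optional


def restore_cluster_args(new_cluster_id: str, snapshot_id: str,
                         kms_key_id: Optional[str] = None) -> List[str]:
    """AWS CLI arguments to create a new cluster from a cluster snapshot."""
    args = [
        "aws", "docdb", "restore-db-cluster-from-snapshot",
        "--db-cluster-identifier", new_cluster_id,
        "--snapshot-identifier", snapshot_id,
        "--engine", "docdb",
    ]
    if kms_key_id is not None:
        # Supplying a key restores an unencrypted snapshot to an encrypted cluster.
        args += ["--kms-key-id", kms_key_id]
    return args


def create_instance_args(cluster_id: str, instance_id: str,
                         instance_class: str) -> List[str]:
    """AWS CLI arguments to add an instance to the restored cluster."""
    return [
        "aws", "docdb", "create-db-instance",
        "--db-cluster-identifier", cluster_id,
        "--db-instance-identifier", instance_id,
        "--db-instance-class", instance_class,
        "--engine", "docdb",
    ]
```

A typical sequence would run the restore command, wait until the new cluster reports as available, and then run the create-instance command with the same cluster identifier and an instance class such as db.r5.large (used here purely as an example).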
 

Monitoring Amazon DocumentDB

 

Monitoring AWS services is an important part of keeping systems healthy and functioning optimally. It’s wise to collect monitoring data from all parts of an AWS solution so that users can more easily debug and fix failures or degradations, should they occur. To understand current performance patterns, identify performance anomalies, and formulate methods to address issues, users should establish baseline performance metrics at various times and under differing load conditions.

The following is advice about specific types of metrics:

  • High CPU or RAM use: High values for CPU or RAM use might be appropriate, provided that they are expected and in keeping with the goals for the application (like throughput or concurrency).
  • Storage volume consumption: Investigate storage consumption (VolumeBytesUsed) if the space used is consistently at or above 85 percent of the total storage volume space. Determine whether data can be deleted from the storage volume or archived to a different system to free up space.
  • Network traffic: For network traffic, talk with the system administrator to understand what the expected throughput is for the domain network and internet connection. Investigate network traffic if throughput is consistently lower than expected.
  • Database connections: Consider constraining database connections if high numbers of user connections coincide with decreases in instance performance and response time. The best number of user connections for an instance varies based on the instance class and the complexity of the operations being performed.
  • IOPS metrics: The expected values for IOPS metrics depend on disk specification and server configuration, so use the baseline to know what is typical. Investigate if values are consistently different from the baseline. For best IOPS performance, make sure the typical working set fits into memory to minimize read and write operations.

Amazon DocumentDB (with MongoDB compatibility) integrates with Amazon CloudWatch so that users can gather and analyze operational metrics for their clusters. They can monitor these metrics using the CloudWatch console, the Amazon DocumentDB console, the AWS Command Line Interface (AWS CLI), or the CloudWatch API.

  • CloudWatch allows users to set alarms so that they are notified if a metric value breaches a threshold that they specify.
  • Users can set up Amazon CloudWatch Events to take corrective action if a breach occurs.
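
As an illustration of such an alarm, the sketch below builds the arguments for a CloudWatch alarm on `VolumeBytesUsed` at the 85 percent level mentioned earlier. It rests on several assumptions: that the 64 TB volume limit is counted in binary terabytes, that the metric is published in the `AWS/DocDB` namespace with a `DBClusterIdentifier` dimension, and that the alarm name, period, evaluation count, and SNS topic are hypothetical values to adjust.

```python
# Assumption: the 64 TB maximum is treated as 64 * 1024**4 bytes.
MAX_VOLUME_BYTES = 64 * 1024**4


def storage_alarm_params(cluster_id, sns_topic_arn, fraction=0.85):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"{cluster_id}-volume-bytes-used",  # hypothetical name
        "Namespace": "AWS/DocDB",
        "MetricName": "VolumeBytesUsed",
        "Dimensions": [
            {"Name": "DBClusterIdentifier", "Value": cluster_id},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 3,   # consecutive breaches required (assumed)
        "Threshold": fraction * MAX_VOLUME_BYTES,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # where notifications are sent
    }
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**storage_alarm_params(cluster_id, topic_arn))`.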

Amazon DocumentDB is integrated with AWS CloudTrail, a service that provides a record of actions taken by IAM users, IAM roles, or an AWS service in Amazon DocumentDB (with MongoDB compatibility). CloudTrail captures all API calls for Amazon DocumentDB as events, including calls from the Amazon DocumentDB console, the AWS CLI, and code that uses the Amazon DocumentDB SDK.

  • By creating a trail, users can continuously deliver CloudTrail events, including events for Amazon DocumentDB, to an Amazon S3 bucket.
  • Using the information collected by CloudTrail, users can determine the request that was made to Amazon DocumentDB, the IP address from which the request was made, who made the request, when it was made, and other details.
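
A short sketch of reading those details from one CloudTrail event record follows. The field names (`eventName`, `eventTime`, `sourceIPAddress`, `userIdentity.arn`) are standard CloudTrail record fields; the sample record itself is hypothetical and trimmed to the fields used here, and the function name is an assumption of this sketch.

```python
def summarize_event(record):
    """Pull the request, caller, time, and source IP from one record."""
    identity = record.get("userIdentity", {})
    return {
        "request": record.get("eventName"),          # which API was called
        "who": identity.get("arn"),                  # who made the request
        "when": record.get("eventTime"),             # when it was made
        "source_ip": record.get("sourceIPAddress"),  # where it came from
    }


# Hypothetical record, not taken from a real trail.
sample = {
    "eventTime": "2019-01-14T16:42:38Z",
    "eventName": "CreateDBCluster",
    "sourceIPAddress": "192.0.2.0",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/Alice"},
}

print(summarize_event(sample))
```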

Amazon DocumentDB  Pricing

 

With Amazon DocumentDB, users pay only for what they use; there are no upfront costs and no minimum fee. Amazon DocumentDB is priced in four dimensions:

  1. On-demand instances: On-demand instances let users pay by the second, with no long-term commitments or upfront fees. Pricing is per instance-hour consumed, from the time an instance is launched until it is stopped or deleted. Partial instance-hours are billed in one-second increments, with a 10-minute minimum charge following a billable status change such as creating, modifying, or deleting an instance.
  2. Database I/O: The amount of I/O used when reading and writing data to the cluster’s storage volume (pricing per million I/Os).
  3. Database storage: The amount of data stored in the cluster’s storage volume (pricing per GB/month).
  4. Backup storage: The amount of backup storage used in excess of the cluster’s database storage usage (pricing per GB/month).
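
The four dimensions can be combined into a rough monthly estimate, sketched below. This is a simplified model under stated assumptions: every rate is a caller-supplied parameter rather than an actual AWS price, the 10-minute minimum is applied once per run interval, and the function and parameter names are invented for this example.

```python
MIN_BILLED_SECONDS = 10 * 60  # 10-minute minimum after a billable change


def instance_cost(run_seconds, hourly_rate):
    """Per-second billing with a 10-minute minimum for one run interval."""
    billed_seconds = max(run_seconds, MIN_BILLED_SECONDS)
    return billed_seconds / 3600 * hourly_rate


def monthly_cost(instance_run_seconds, hourly_rate,
                 io_requests, io_rate_per_million,
                 storage_gb, storage_rate_per_gb,
                 backup_gb, backup_rate_per_gb):
    """Estimate one month's bill across the four pricing dimensions."""
    compute = sum(instance_cost(s, hourly_rate)
                  for s in instance_run_seconds)
    io = io_requests / 1_000_000 * io_rate_per_million
    storage = storage_gb * storage_rate_per_gb
    # Only backup storage in excess of database storage is charged.
    backup = max(backup_gb - storage_gb, 0) * backup_rate_per_gb
    return {
        "compute": compute,
        "io": io,
        "storage": storage,
        "backup": backup,
        "total": compute + io + storage + backup,
    }
```

For example, `instance_cost(60, hourly_rate)` bills a one-minute run as 600 seconds, and a cluster with 100 GB of data and 150 GB of backups is charged for 50 GB of backup storage.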

Amazon DocumentDB offers the following features to help users optimize costs:

  • Amazon DocumentDB provides per-second billing for instances, with a 10-minute minimum billing period.
  • Users can temporarily stop compute instances for up to 7 days when they don’t need to access the cluster (great for pausing test clusters over the weekend), and restart the instances when needed.
  • Amazon DocumentDB instances are not data bearing, so users can provision a highly durable cluster with just a single instance, a popular approach for development clusters.
  • Users get backup storage equivalent to 100% of their cluster’s data storage for free each month. Additional backup storage beyond the free allotment is priced as low as $0.02 per GB/month (prices may vary across AWS regions).
  • Amazon DocumentDB’s storage and I/O automatically scale with the user's workload, so users pay only for the resources used, without needing to pre-provision.
  • Amazon DocumentDB storage is highly durable and available, replicating data six ways across three Availability Zones. While Amazon DocumentDB maintains six copies of the data, users pay for only a single copy, with pricing as low as $0.10 per GB/month (prices may vary across AWS regions).
  • Features like encryption at rest with AWS KMS, encryption in transit with TLS, and monitoring with Amazon CloudWatch are available for all clusters at no additional cost.
  • Users can choose from AWS premium support plans with transparent pricing to match their needs.
  • Data transferred across Availability Zones between cluster instances is free.
