AWS examples in C# - create a service working with DynamoDB

Mar 8, 2020

This post is part of the AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS series. The code used for this series of blog posts is located in the aws.examples.csharp GitHub repository. In the current post, I give an overview of DynamoDB and what it can be used for.

NoSQL database

A NoSQL database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases (RDBMS). There are several types of NoSQL databases:

Key-value stores - every single item in the database is stored as an attribute name (or 'key'), together with its value.
Document databases - pair each key with a complex data structure known as a document, usually a JSON document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.
Graph stores - used to store information about networks of data. Data is organized in the form of nodes and connections between the nodes.
Wide-column stores - store columns of data together, instead of rows. For queries that touch only a few columns across large data volumes, they can be faster than conventional relational databases.

A very good article on the NoSQL topic is NoSQL Databases Explained.

AWS DynamoDB

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It's a fully managed, multi-region, multi-master, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications.

DynamoDB tables

DynamoDB stores data in tables. The data is represented as items, which have attributes. When a table is created, a primary key must be provided along with the table name. The primary key can consist of a partition key (HASH) alone; the partition key is mandatory. DynamoDB uses an internal hash function to evenly distribute data items across partitions, based on their partition key values. The primary key can also consist of the partition key and an optional sort key (RANGE). DynamoDB stores items with the same partition key physically close together, in sorted order by the sort key value.
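As a minimal sketch of such a table definition with the low-level API of the AWS SDK for .NET (AWSSDK.DynamoDBv2 package): the table name Movies, the attribute names Title and ReleaseYear, and the capacity values are assumptions made for illustration, not necessarily what the repository uses.

```csharp
using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

public static class MoviesTableDefinition
{
    // Builds a request for a hypothetical "Movies" table with a composite
    // primary key: Title as partition key (HASH), ReleaseYear as sort key (RANGE).
    public static CreateTableRequest Build()
    {
        return new CreateTableRequest
        {
            TableName = "Movies",
            // Only attributes that take part in a key are declared up front.
            AttributeDefinitions = new List<AttributeDefinition>
            {
                new AttributeDefinition { AttributeName = "Title", AttributeType = ScalarAttributeType.S },
                new AttributeDefinition { AttributeName = "ReleaseYear", AttributeType = ScalarAttributeType.N }
            },
            KeySchema = new List<KeySchemaElement>
            {
                new KeySchemaElement { AttributeName = "Title", KeyType = KeyType.HASH },
                new KeySchemaElement { AttributeName = "ReleaseYear", KeyType = KeyType.RANGE }
            },
            // Illustrative values for provisioned mode.
            ProvisionedThroughput = new ProvisionedThroughput
            {
                ReadCapacityUnits = 5,
                WriteCapacityUnits = 5
            }
        };
    }
}
```

Given an AmazonDynamoDBClient instance named client, the table would then be created with await client.CreateTableAsync(MoviesTableDefinition.Build()).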

Secondary indexes

DynamoDB offers the possibility to define so-called secondary indexes. There are two types: global and local. A global secondary index is one whose partition (HASH) key can differ from the HASH key of the table; each table has a default limit of 20 global indexes. A local secondary index has the same partition key as the table but a different sort key. Up to 5 local secondary indexes per table are allowed. Properly managing those indexes is the key to using DynamoDB efficiently as a storage unit.

Streams

DynamoDB Streams is an optional feature that captures data modification events in DynamoDB tables. The data about these events appears in the stream in near-real-time, in the order in which the events occurred. Each add, update, or delete of an item is represented by a stream record. Stream records can be configured to hold the old and the new item, only one of them, or only the keys. Stream records have a lifetime of 24 hours; after that, they are automatically removed from the stream. Streams can be used together with AWS Lambda to create trigger code that executes automatically whenever an event appears in a stream.
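A Lambda function triggered by a stream could look roughly like the sketch below. It assumes the Amazon.Lambda.Core and Amazon.Lambda.DynamoDBEvents packages; the class name StreamHandler and the logging are hypothetical and are not the code of the Lambda functions in the repository.

```csharp
using Amazon.Lambda.Core;
using Amazon.Lambda.DynamoDBEvents;

public class StreamHandler
{
    // Lambda invokes this method with a batch of stream records.
    public void Handle(DynamoDBEvent dynamoEvent, ILambdaContext context)
    {
        foreach (var record in dynamoEvent.Records)
        {
            // EventName is INSERT, MODIFY or REMOVE. Keys always holds the
            // primary key of the affected item; the old and new images are
            // present only if the stream is configured to include them.
            int keyCount = record.Dynamodb.Keys.Count;
            context.Logger.LogLine($"{record.EventName}: item with {keyCount} key attribute(s)");
        }
    }
}
```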

Read/Write Capacity Mode

Amazon DynamoDB has two read/write capacity modes for processing reads and writes on your tables: on-demand and provisioned. The read/write capacity mode controls how charges are applied to read and write throughput and how capacity is managed. The capacity mode is set when the table is created and can be changed later. Provisioned mode is the default, free-tier eligible mode and is recommended for known, predictable workloads. On-demand mode is recommended for unpredictable or unknown workloads. For provisioned tables, DynamoDB provides auto-scaling so that the table’s provisioned capacity is adjusted automatically in response to traffic changes.

The concept of read and write capacity units is tricky. One write capacity unit covers one write per second for an item of up to 1 KB. A transactional write doubles the count. For example, writing a 2 KB item once per second requires 2 write capacity units, or 4 if the write is done in a transaction.

Read capacity units work similarly, except that there are two flavors of reading: strongly consistent and eventually consistent. An eventually consistent read may return data that is not up to date, because a recent write might not yet be reflected in it. If the data must reflect all writes that completed before the read, a strongly consistent read is needed. One read capacity unit gives one strongly consistent read per second, or two eventually consistent reads per second, for an item of up to 4 KB. Transactions double the number of read units needed, so two units are required for a transactional read of up to 4 KB. For example, if the item to be read is 8 KB, then 2 read capacity units are required to sustain one strongly consistent read per second, 1 read capacity unit for eventually consistent reads, or 4 read capacity units for transactional reads.
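The arithmetic above can be sketched as a small helper. The names CapacityUnits and ReadKind are made up for this illustration and are not part of the AWS SDK; the helper returns the units needed for one operation per second on an item of the given size.

```csharp
using System;

public enum ReadKind { EventuallyConsistent, StronglyConsistent, Transactional }

public static class CapacityUnits
{
    private const int WriteBlockBytes = 1024; // writes are counted in 1 KB blocks
    private const int ReadBlockBytes = 4096;  // reads are counted in 4 KB blocks

    // Write capacity units needed for one write per second.
    public static long ForWrite(long itemSizeBytes, bool transactional = false)
    {
        long blocks = Blocks(itemSizeBytes, WriteBlockBytes);
        return transactional ? blocks * 2 : blocks;
    }

    // Read capacity units needed for one read per second.
    public static double ForRead(long itemSizeBytes, ReadKind kind)
    {
        long blocks = Blocks(itemSizeBytes, ReadBlockBytes);
        switch (kind)
        {
            case ReadKind.Transactional: return blocks * 2;
            case ReadKind.StronglyConsistent: return blocks;
            default: return blocks / 2.0; // eventually consistent reads cost half
        }
    }

    // Item size rounded up to whole blocks; even a tiny item uses one block.
    private static long Blocks(long sizeBytes, int blockBytes)
    {
        return Math.Max(1, (sizeBytes + blockBytes - 1) / blockBytes);
    }
}
```

For the 8 KB item from the text, CapacityUnits.ForRead(8 * 1024, ReadKind.StronglyConsistent) gives 2, the eventually consistent variant gives 1, and the transactional one gives 4.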

If the application exceeds the provisioned throughput capacity on a table or index, it is subject to request throttling. Throttling prevents the application from consuming too many capacity units. When a request is throttled, it fails with an HTTP 400 code (Bad Request) and a ProvisionedThroughputExceededException. The AWS SDKs have built-in support for retrying throttled requests, so custom retry logic is usually not needed.

Different programmatic interfaces

Every AWS SDK provides one or more programmatic interfaces for working with Amazon DynamoDB. These interfaces range from simple low-level DynamoDB wrappers to object-oriented persistence layers. The available interfaces vary depending on the AWS SDK and programming language that you use. For C#, the available interfaces are the low-level interface, the document interface, and the object persistence interface. In the AWS examples in C# – basic DynamoDB operations post, I have given detailed code examples of all of them.

Low-level interface

The low-level interface lets the consumer manage all the details and do the data mapping. Data is mapped manually to its proper data type. Supported data types are:

B - binary value, a MemoryStream
BS - binary set, a list of MemoryStream objects
S - string
SS - string set, a list of string objects
N - number converted into a string
NS - number set, a list of number strings
BOOL - boolean
L - list of AttributeValue objects
M - map, dictionary of AttributeValue objects
NULL - if set to true, then this is a null value
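A minimal sketch of this manual mapping, assuming the AWSSDK.DynamoDBv2 package; the class MovieItemMapper and the attributes Title, Genre and Watched are hypothetical.

```csharp
using System.Collections.Generic;
using System.Globalization;
using Amazon.DynamoDBv2.Model;

public static class MovieItemMapper
{
    // Maps plain values to the AttributeValue dictionary that the
    // low-level API expects, choosing the data type for each attribute.
    public static Dictionary<string, AttributeValue> ToItem(string title, int genre, bool watched)
    {
        return new Dictionary<string, AttributeValue>
        {
            ["Title"] = new AttributeValue { S = title },
            // Numbers are sent as strings; format them culture-independently.
            ["Genre"] = new AttributeValue { N = genre.ToString(CultureInfo.InvariantCulture) },
            ["Watched"] = new AttributeValue { BOOL = watched }
        };
    }
}
```

The resulting dictionary can be written with a PutItemRequest, for example await client.PutItemAsync(new PutItemRequest { TableName = "Movies", Item = item }), where client is an AmazonDynamoDBClient and Movies is an assumed table name.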

If the low-level interface is used for querying, then a KeyConditionExpression is used to select the data. It is called a query, but it is not actually a query in the RDBMS sense, as the HASH key can only be used with an equality operator. For the RANGE key, there is a variety of operators to be used, such as:

sortKeyName = :sortkeyval - true if the sort key value is equal to :sortkeyval
sortKeyName < :sortkeyval - true if the sort key value is less than :sortkeyval
sortKeyName <= :sortkeyval - true if the sort key value is less than or equal to :sortkeyval
sortKeyName > :sortkeyval - true if the sort key value is greater than :sortkeyval
sortKeyName >= :sortkeyval - true if the sort key value is greater than or equal to :sortkeyval
sortKeyName BETWEEN :sortkeyval1 AND :sortkeyval2 - true if the sort key value is greater than or equal to :sortkeyval1, and less than or equal to :sortkeyval2
begins_with ( sortKeyName, :sortkeyval ) - true if the sort key value begins with a particular operand. (You cannot use this function with a sort key that is of type Number.) Note that the function name begins_with is case-sensitive.
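The operators above can be combined with the mandatory equality condition on the HASH key, as in the following sketch. It assumes the AWSSDK.DynamoDBv2 package and a hypothetical Movies table with Title as HASH key and a numeric ReleaseYear as RANGE key.

```csharp
using System.Collections.Generic;
using System.Globalization;
using Amazon.DynamoDBv2.Model;

public static class MovieQueries
{
    // Builds a query for all items with the given title whose
    // release year lies in the inclusive range [fromYear, toYear].
    public static QueryRequest ByTitleAndYears(string title, int fromYear, int toYear)
    {
        return new QueryRequest
        {
            TableName = "Movies",
            // Equality on the HASH key, BETWEEN on the RANGE key.
            KeyConditionExpression = "#title = :title AND #year BETWEEN :from AND :to",
            // Name placeholders avoid clashes with DynamoDB reserved words.
            ExpressionAttributeNames = new Dictionary<string, string>
            {
                ["#title"] = "Title",
                ["#year"] = "ReleaseYear"
            },
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                [":title"] = new AttributeValue { S = title },
                [":from"] = new AttributeValue { N = fromYear.ToString(CultureInfo.InvariantCulture) },
                [":to"] = new AttributeValue { N = toYear.ToString(CultureInfo.InvariantCulture) }
            }
        };
    }
}
```

The request is executed with await client.QueryAsync(request), and the matching items are returned in the Items property of the response as dictionaries of AttributeValue objects.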

Document interface

The document programming interface returns the full document by its unique HASH key. The document is essentially a JSON object.

{
    "Title": {
        "Value": "Die Hard",
        "Type": 0
    },
    "Genre": {
        "Value": "0",
        "Type": 1
    }
}
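A rough sketch of building such a document in code, assuming the AWSSDK.DynamoDBv2 package; the helper class MovieDocuments is hypothetical.

```csharp
using Amazon.DynamoDBv2.DocumentModel;

public static class MovieDocuments
{
    // Creates a document with two attributes. Plain .NET values are
    // converted implicitly into DynamoDB entries by the SDK.
    public static Document Create(string title, int genre)
    {
        var document = new Document();
        document["Title"] = title; // stored as a string entry
        document["Genre"] = genre; // stored as a numeric entry
        return document;
    }
}
```

In version 3 of the SDK, Table.LoadTable(client, "Movies") returns a Table object for an assumed Movies table. Its PutItemAsync and GetItemAsync methods accept and return such documents, and document.ToJson() serializes one to JSON.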

Object persistence interface

With object persistence, client classes are mapped to DynamoDB tables. There are several attributes that can be applied to database model classes, such as DynamoDBTable, DynamoDBHashKey, DynamoDBRangeKey, DynamoDBProperty, DynamoDBIgnore, etc. To save the client-side objects to the tables, the object persistence model provides the DynamoDBContext class, an entry point to DynamoDB. This class provides a connection to DynamoDB and enables you to access tables and perform various CRUD operations.
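A minimal sketch of such a mapping, assuming the AWSSDK.DynamoDBv2 package. The Movie class, its properties and the MovieRepository helper are hypothetical and only illustrate the attributes named above.

```csharp
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;

[DynamoDBTable("Movies")]
public class Movie
{
    [DynamoDBHashKey]
    public string Title { get; set; }

    [DynamoDBRangeKey]
    public int ReleaseYear { get; set; }

    // Stored under the attribute name "Genre".
    [DynamoDBProperty("Genre")]
    public int Genre { get; set; }

    // Computed on the client side only; never written to the table.
    [DynamoDBIgnore]
    public string DisplayName => $"{Title} ({ReleaseYear})";
}

public static class MovieRepository
{
    // Saves the movie and reads it back by its composite primary key.
    public static async Task<Movie> SaveAndLoadAsync(IAmazonDynamoDB client, Movie movie)
    {
        using (var context = new DynamoDBContext(client))
        {
            await context.SaveAsync(movie);
            return await context.LoadAsync<Movie>(movie.Title, movie.ReleaseYear);
        }
    }
}
```

The context derives the table name and the key attributes from the annotations, so no manual mapping to AttributeValue objects is needed.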

Architectural constraints

Understanding the nature of DynamoDB is important in order to design a service that works with it. It is important to define the table capacity cost-efficiently. If too little capacity is defined, consumers can get 400 responses; the other extreme is to generate way too much cost. Another aspect is reading the data. DynamoDB does not provide an efficient way to search for data by arbitrary attributes; a scan has to read the whole table. In any case, the application that uses DynamoDB has to have a proper way to access the data by key.

Using DynamoDB in a service

DynamoDB can be used in a service in a straightforward way, such as in SqsReader or the ActorsServerlessLambda and MoviesServerlessLambda functions; see the bigger picture in the AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS post. An AmazonDynamoDBClient is instantiated and used with one of the programming interfaces described above.

Another important usage is to subscribe to and process stream events. This is done in both ActorsLambdaFunction and MoviessLambdaFunction. See more details about Lambda usage in AWS examples in C# – working with Lambda functions post.

More information on how to run the solution can be found in AWS examples in C# – run the solution post.

Conclusion

In the current post, I have given a basic overview of DynamoDB. It is important to understand its specifics in order to use it efficiently.
