Types of Primary Keys

1. Partition Key
2. Composite Key (Partition key + Sort Key)

Access control to Data using IAM

You can give fine-grained access to users in DynamoDB.

e.g. users can access only items whose partition key value matches their user ID
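
This kind of fine-grained access is expressed with the `dynamodb:LeadingKeys` IAM condition key. A minimal policy sketch, built here as a Python dict (the table ARN is a placeholder, and the Cognito identity variable assumes a web-identity-federated user):

```python
# Hypothetical fine-grained access policy: the caller may only perform the
# listed actions on items whose partition key equals their Cognito identity.
# The account ID and table name in the ARN are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable",
        "Condition": {
            "ForAllValues:StringEquals": {
                "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
            }
        },
    }],
}
```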

Indexes

A secondary index allows you to perform more flexible querying on DynamoDB: you can query on an attribute that is not part of the primary key. This can be done using

  • Global Secondary Index
  • Local Secondary Index

You select the attributes you want to include in the index and run your queries against the index instead of the entire table.

Local Secondary Index

Limitation: 
    Can only be created when you're creating the table.

Has the same partition key as your original table, but a different sort key.

Global Secondary index

Advantage:
    Can be created even after the table has been created.
    Allows you to use a different partition key as well as a different sort key than the base table.
    Because of the above point, it gives a completely different view of the data than the base table.
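
As a sketch, the request parameters for adding a GSI to an existing table might look like the following (table, index, and attribute names are illustrative, not from any real schema):

```python
# Hypothetical UpdateTable parameters that add a GSI keyed on a different
# partition key (Email) than the base table. All names are made up.
gsi_update = {
    "TableName": "Users",
    "AttributeDefinitions": [
        {"AttributeName": "Email", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [{
        "Create": {
            "IndexName": "EmailIndex",
            "KeySchema": [{"AttributeName": "Email", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},  # copy all attributes into the index
            "ProvisionedThroughput": {
                "ReadCapacityUnits": 5,
                "WriteCapacityUnits": 5,
            },
        }
    }],
}
# Would be passed to boto3 as: client.update_table(**gsi_update)
```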

Scan and Query API calls

QUERY

A query operation finds items in your table based on the partition key value. A query can be refined by using an optional sort key condition. By default a query returns all of an item's attributes. You can limit the result to specific attributes by using the ProjectionExpression parameter.

Results are always sorted by the sort key.

You can reverse the order of the results by setting the ScanIndexForward parameter to False.

REMEMBER: ScanIndexForward applies only to queries and NOT to scans, despite its name.

By default all queries are eventually consistent. Set the ConsistentRead parameter to True for a strongly consistent read.
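
Putting the query parameters together, a sketch (table and attribute names are illustrative):

```python
# Hypothetical Query parameters: fetch one user's orders, newest first,
# returning only two attributes. Table/attribute names are made up.
query_params = {
    "TableName": "Orders",
    "KeyConditionExpression": "UserId = :uid",
    "ExpressionAttributeValues": {":uid": {"S": "user-123"}},
    "ProjectionExpression": "OrderId, Total",  # limit the returned attributes
    "ScanIndexForward": False,                 # descending sort-key order
    "ConsistentRead": True,                    # opt out of the eventually consistent default
}
# Would be passed to boto3 as: client.query(**query_params)
```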

SCAN

A scan operation examines every item in the table and, by default, returns all attributes. Again, you can use the ProjectionExpression parameter to limit the attributes returned.

A query is far more efficient than a scan.

Improving performance

- Set a smaller page size (each request then consumes fewer read operations)
- Avoid scans if possible
- Use parallel scans. By default a scan operates sequentially, but you can configure DynamoDB to use parallel scans by logically dividing a table or index into segments and scanning each segment in parallel.
- Isolate scan operations to specific tables and segregate them from mission-critical data
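
The parallel-scan idea above can be sketched as follows: build one Scan request per segment and hand each to its own worker (table name and segment count are illustrative):

```python
# Sketch of a parallel scan: logically split the table into N segments
# and issue one Scan request per segment, e.g. from separate threads.
def segment_scan_params(table, total_segments):
    return [
        {"TableName": table, "Segment": s, "TotalSegments": total_segments}
        for s in range(total_segments)
    ]

params = segment_scan_params("EventLog", 4)
# Each dict would be passed to boto3 by its own worker: client.scan(**p)
```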

Provisioned Throughput

Provisioned Throughput is measured in capacity units.

  • Write capacity units
    • 1 x 1KB write per second
  • Strongly consistent reads
    • 1 x 4KB read per second
  • Eventually consistent reads (default option)
    • 2 x 4KB read per second

Example 1:

Imagine a table that has 5 read capacity units and 5 write capacity units

This configuration will be able to perform

  • 5 x 4KB strongly consistent reads = 20KB per second
  • Twice as many eventually consistent reads = 40KB per second
  • 5 x 1KB writes = 5KB writes per second

Example 2:

Imagine you have an application that needs to read 80 items per second. Each item is 3KB in size and you need strongly consistent reads. How many read capacity units will you need?

Ans:

Number of 4KB read capacity units needed per item = 3KB / 4KB = 0.75, rounded up to 1

Number of items = 80

Therefore read capacity units needed = 80 x 1 = 80 for STRONGLY CONSISTENT

What if you need Eventually consistent reads?

    = #Strongly Consistent /  2
    = 80/2
    = 40 for EVENTUALLY CONSISTENT

Example 3:

Imagine you need to write 100 items per second to your DynamoDB table. Each item is 512 bytes in size. Each write capacity unit gives 1 x 1KB write per second. How many write capacity units will you need to provision?

Ans:

- Number of 1KB/sec write capacity units needed per item = 512 bytes / 1024 bytes = 0.5, rounded up to 1 write capacity unit per item
- Number of writes required per second = 100
- Therefore, number of write capacity units required = 100 x 1 write capacity unit per item = 100
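
The rules used in the three examples above can be captured in two small helper functions (a sketch; the rounding-up-per-item behaviour is the key point):

```python
import math

def read_capacity_units(items_per_sec, item_kb, strongly_consistent=True):
    # One RCU = one strongly consistent 4KB read/sec, or two eventually
    # consistent 4KB reads/sec. Each item rounds UP to the nearest 4KB.
    units = items_per_sec * math.ceil(item_kb / 4)
    return units if strongly_consistent else math.ceil(units / 2)

def write_capacity_units(items_per_sec, item_kb):
    # One WCU = one 1KB write/sec. Each item rounds UP to the nearest 1KB.
    return items_per_sec * math.ceil(item_kb / 1)
```

Checking against the worked examples: 80 strongly consistent 3KB reads/sec needs 80 RCUs (40 if eventually consistent), and 100 writes/sec of 512-byte items needs 100 WCUs.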

Provisioned Throughput Exceeded Exception

Occurs when your request rate is too high for the read/write capacity provisioned for your table.

Exponential Backoff

The requester uses progressively longer waits between consecutive retries. The AWS SDKs implement exponential backoff with retries automatically.
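
The idea can be sketched as follows ("full jitter" is a common variant; the base delay and cap below are illustrative constants, not SDK defaults):

```python
import random

def backoff_delays(retries, base=0.05, cap=20.0):
    # Full-jitter exponential backoff: the nth retry waits a random time
    # in [0, min(cap, base * 2**n)] seconds. Constants are illustrative.
    return [random.uniform(0, min(cap, base * (2 ** n))) for n in range(retries)]

delays = backoff_delays(5)
```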

DynamoDB On Demand Capacity

A pricing model for DynamoDB: you are charged per request based on actual activity, and read/write capacity scales automatically as needed.

Great for:

- Unpredictable workload
- New applications where the usage pattern is not known yet
- When you want to pay for only what you use

DynamoDB Accelerator (DAX)

It's a fully managed, clustered, in-memory cache for DynamoDB.

But it improves read performance only, so it's ideal for read-heavy or bursty-read applications.
DAX is a write-through caching service:
data is written to the cache and the backend store at the same time.
This allows you to point your DynamoDB API calls at the DAX cluster.

*Limitations*

1. It caters to applications that can tolerate eventual consistency; it is not suitable for strongly consistent read requirements
2. Not suitable for write-intensive applications

DynamoDB TTL

Defines an expiry time for your data, stored as an attribute holding an epoch timestamp. Expired items are marked for deletion and removed in the background.

Really good for applications whose data becomes irrelevant with age, e.g. session data, event logs, or any temporary data.
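
A sketch of the session-data use case: stamp each item with an epoch expiry attribute, and enable TTL on that attribute once per table (the attribute name `expires_at`, table name, and lifetime are illustrative):

```python
import time

SESSION_LIFETIME_SECONDS = 3600  # illustrative: sessions live one hour

def session_item(session_id, now=None):
    # Build a DynamoDB item whose "expires_at" attribute holds the epoch
    # second after which TTL may delete it. Attribute names are made up.
    now = int(now if now is not None else time.time())
    return {
        "session_id": {"S": session_id},
        "expires_at": {"N": str(now + SESSION_LIFETIME_SECONDS)},
    }

# TTL itself is enabled once per table, e.g. via boto3:
# client.update_time_to_live(
#     TableName="Sessions",
#     TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
# )
```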

DynamoDB Streams

It's an ordered sequence of item-level modifications (inserts, updates, and deletes) to a table.

These are stored as logs. These logs are encrypted at rest and stored for 24 hours only.

They can be used for triggering events based on certain transactions. Great for serverless architectures.

They can also be used for replicating data across multiple tables.
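
The trigger/replication use case can be sketched as a Lambda-style handler that pulls the new image out of each insert or update in a Streams event (the event shape follows the Streams record format; handler and sample data are illustrative):

```python
# Minimal sketch of a handler consuming a DynamoDB Streams event:
# collect the new image of every inserted/modified item so it could be
# written to a replica table. REMOVE records carry no NewImage, so they
# are skipped here.
def handler(event):
    new_images = []
    for record in event.get("Records", []):
        if record["eventName"] in ("INSERT", "MODIFY"):
            new_images.append(record["dynamodb"]["NewImage"])
    return new_images

# Illustrative sample event with one insert and one delete:
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"NewImage": {"id": {"S": "1"}}}},
        {"eventName": "REMOVE", "dynamodb": {}},
    ]
}
result = handler(sample_event)
```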