AWS DynamoDB indexes

Working with AWS DynamoDB indexes

Data retrieval in AWS DynamoDB can be done using:

  • Query operation
  • Scan operation

A Query operation searches a DynamoDB table or an index based on a key condition. Because queries take advantage of keys and indexes, they read only a subset of the table's data, which makes them an efficient and recommended way to retrieve data from DynamoDB.

A Scan operation reads every item (record) in the table and lets you apply filters to the result. Because the filters are applied only after the items have been read, scans are expensive operations on large tables.
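To make the difference concrete, here is a minimal sketch of the request parameters for each operation, in the shape expected by the low-level DynamoDB client. The table name `Thread`, the attributes `ForumName` and `Views`, and both helper names are hypothetical. The objects are only built here; in a real application they would be passed to the client's `query` or `scan` method.

```javascript
// Hypothetical helper: parameters for a Query on the partition key.
// Only items whose ForumName matches the key condition are read.
function buildQueryParams (tableName, forumName) {
  return {
    TableName: tableName,
    KeyConditionExpression: 'ForumName = :forum',
    ExpressionAttributeValues: {
      ':forum': { S: forumName }
    }
  }
}

// Hypothetical helper: parameters for a Scan with a filter.
// Every item in the table is read; the filter only trims the result.
function buildScanParams (tableName, minViews) {
  return {
    TableName: tableName,
    FilterExpression: '#views >= :min',
    // A name placeholder avoids clashes with DynamoDB reserved words
    ExpressionAttributeNames: { '#views': 'Views' },
    ExpressionAttributeValues: {
      // Numbers are sent as strings by the low-level client
      ':min': { N: String(minViews) }
    }
  }
}

const queryParams = buildQueryParams('Thread', 'Amazon DynamoDB')
const scanParams = buildScanParams('Thread', 100)

console.log(JSON.stringify(queryParams, null, 2))
console.log(JSON.stringify(scanParams, null, 2))
```

Note that only the Query carries a key condition. The Scan has nothing that narrows down which items are read.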

Primary Keys

DynamoDB supports two types of primary key on a table:

  • partition key
  • partition key with sort key (aka composite key)

When you create a DynamoDB table, you must supply a few mandatory parameters, including the attribute to use as the partition key.

Here is an example:

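// Assumes the AWS SDK for JavaScript v2; the region and table name are placeholders.
const AWS = require('aws-sdk')

const dbClient = new AWS.DynamoDB({ region: 'us-east-1' })
const collectionName = 'my-table'

const params = {
  TableName: collectionName,
  // Data type of every attribute used in KeySchema ('S' = string)
  AttributeDefinitions: [
    {
      AttributeName: 'id',
      AttributeType: 'S'
    }
  ],
  // 'id' is the partition key of the table
  KeySchema: [
    {
      AttributeName: 'id',
      KeyType: 'HASH'
    }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 1,
    WriteCapacityUnits: 1
  }
}

// Callback invoked once the createTable request completes
function onInsert (err, data) {
  if (err) {
    console.error('Unable to create table:', err)
    return
  }
  console.log('Created table:', data.TableDescription.TableName)
}

dbClient.createTable(params, onInsert)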

KeySchema specifies the attributes that make up the primary key of a table or an index. For each entry you specify the attribute name along with a key type (KeyType), which can be either:

  • HASH
  • RANGE

If you specify the key type as HASH, that attribute becomes the partition key.

A second attribute can be added with the key type RANGE; it becomes the sort key for the table or index.
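As an illustration, here is a sketch of table parameters with a composite key. The table name `Thread` and the attributes `ForumName` and `Subject` are assumptions chosen for the example, and `keyOfType` is a hypothetical helper that reads a key back out of a KeySchema.

```javascript
// Hypothetical table: ForumName is the partition key, Subject the sort key.
const compositeKeyParams = {
  TableName: 'Thread',
  AttributeDefinitions: [
    { AttributeName: 'ForumName', AttributeType: 'S' },
    { AttributeName: 'Subject', AttributeType: 'S' }
  ],
  KeySchema: [
    { AttributeName: 'ForumName', KeyType: 'HASH' },
    { AttributeName: 'Subject', KeyType: 'RANGE' }
  ],
  ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
}

// Hypothetical helper: name of the attribute with the given key type,
// or undefined if the schema has no such key.
function keyOfType (keySchema, keyType) {
  const entry = keySchema.find(k => k.KeyType === keyType)
  return entry ? entry.AttributeName : undefined
}

console.log('partition key:', keyOfType(compositeKeyParams.KeySchema, 'HASH'))
console.log('sort key:', keyOfType(compositeKeyParams.KeySchema, 'RANGE'))
```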

It is also mandatory to define the data type of each attribute listed under KeySchema using AttributeDefinitions. If you fail to do so, DynamoDB rejects the request with a ValidationException.
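A mismatch between the two lists can be caught locally before the request is sent. The following is a minimal sketch of such a pre-check; `findUndefinedKeyAttributes` is a hypothetical helper, not part of the AWS SDK, and it only looks at the table's own KeySchema, not at the key schemas of any indexes.

```javascript
// Hypothetical helper: returns the names of KeySchema attributes
// that have no matching entry in AttributeDefinitions.
function findUndefinedKeyAttributes (params) {
  const defined = new Set(
    (params.AttributeDefinitions || []).map(d => d.AttributeName)
  )
  return (params.KeySchema || [])
    .map(k => k.AttributeName)
    .filter(name => !defined.has(name))
}

// 'Subject' is used as a sort key but never given a type.
const incompleteParams = {
  TableName: 'Thread',
  AttributeDefinitions: [
    { AttributeName: 'ForumName', AttributeType: 'S' }
  ],
  KeySchema: [
    { AttributeName: 'ForumName', KeyType: 'HASH' },
    { AttributeName: 'Subject', KeyType: 'RANGE' }
  ]
}

const missing = findUndefinedKeyAttributes(incompleteParams)
console.log(missing) // [ 'Subject' ]
```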

Secondary Indexes

Secondary indexes allow you to query a DynamoDB table on attributes that are not part of its primary key.

DynamoDB supports two different kinds of indexes:

  • Local secondary indexes – The partition key of the index must be the same as the partition key of its table. However, the sort key can be any other attribute.
  • Global secondary indexes – The primary key of the index can be any attribute from its table.
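Querying an index works like querying the table, except that the request also names the index. Below is a minimal sketch of such request parameters for the low-level client; the helper name and the table, index and attribute names in the usage line are illustrative assumptions, and the key value is assumed to be a string.

```javascript
// Hypothetical helper: Query parameters that target a secondary index
// instead of the table's own primary key.
function buildIndexQueryParams (tableName, indexName, keyName, keyValue) {
  return {
    TableName: tableName,
    IndexName: indexName, // tells DynamoDB to use the index's key schema
    KeyConditionExpression: '#k = :v',
    // Placeholders keep the expression valid for any attribute name
    ExpressionAttributeNames: { '#k': keyName },
    ExpressionAttributeValues: { ':v': { S: keyValue } }
  }
}

const indexQuery = buildIndexQueryParams(
  'Thread',
  'LastPostDateIndex',
  'LastPostDateTime',
  '2024-01-01T00:00:00Z'
)

console.log(JSON.stringify(indexQuery, null, 2))
```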

Local Secondary Indexes (LSI)

These are created while you create the table and cannot be added later.

Here is the syntax for defining an LSI within the params used to create a table:

"LocalSecondaryIndexes": [
  {
      "IndexName": "string",
      "KeySchema": [
        {
            "AttributeName": "string",
            "KeyType": "string"
        }
      ],
      "Projection": {
        "NonKeyAttributes": [ "string" ],
        "ProjectionType": "string"
      }
  }
]

The partition key in the KeySchema under LocalSecondaryIndexes must match the partition key in the table's top-level KeySchema, while the index's sort key is a different attribute. Here is an example:

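{
  "AttributeDefinitions": [
    {
      "AttributeName": "ForumName",
      "AttributeType": "S"
    },
    {
      "AttributeName": "Subject",
      "AttributeType": "S"
    },
    {
      "AttributeName": "LastPostDateTime",
      "AttributeType": "S"
    }
  ],
  "TableName": "Thread",
  "KeySchema": [
    {
      "AttributeName": "ForumName",
      "KeyType": "HASH"
    },
    {
      "AttributeName": "Subject",
      "KeyType": "RANGE"
    }
  ],
  "LocalSecondaryIndexes": [
    {
      "IndexName": "LastPostIndex",
      "KeySchema": [
        {
          "AttributeName": "ForumName",
          "KeyType": "HASH"
        },
        {
          "AttributeName": "LastPostDateTime",
          "KeyType": "RANGE"
        }
      ],
      "Projection": {
        "ProjectionType": "KEYS_ONLY"
      }
    }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 5,
    "WriteCapacityUnits": 5
  }
}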

The ProjectionType property defines which attributes are copied (projected) from the table into the index. The possible options are:

  • ALL
  • KEYS_ONLY
  • INCLUDE (the key attributes plus the non-key attributes listed in NonKeyAttributes)
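The three options can be sketched with a small builder for the Projection object. `buildProjection` is a hypothetical helper, and its local checks are a convention of this example rather than DynamoDB's own validation: it only emits NonKeyAttributes together with INCLUDE.

```javascript
const PROJECTION_TYPES = ['ALL', 'KEYS_ONLY', 'INCLUDE']

// Hypothetical helper: builds the Projection part of an index definition.
function buildProjection (projectionType, nonKeyAttributes = []) {
  if (!PROJECTION_TYPES.includes(projectionType)) {
    throw new Error('Unknown ProjectionType: ' + projectionType)
  }
  if (projectionType === 'INCLUDE') {
    if (nonKeyAttributes.length === 0) {
      throw new Error('INCLUDE needs at least one non-key attribute')
    }
    return { ProjectionType: 'INCLUDE', NonKeyAttributes: nonKeyAttributes }
  }
  // ALL and KEYS_ONLY take no attribute list
  return { ProjectionType: projectionType }
}

const keysOnly = buildProjection('KEYS_ONLY')
const include = buildProjection('INCLUDE', ['Subject', 'Replies'])

console.log(keysOnly)
console.log(include)
```

Projecting fewer attributes keeps the index smaller, but a query on the index can then only return the projected attributes directly.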

Global Secondary Indexes (GSI)

A GSI is very similar to an LSI in terms of syntax. Unlike an LSI, however, a GSI can use a partition key that is different from the partition key the table was created with. Here is an example:

{
  "AttributeDefinitions": [
    {
      "AttributeName": "ForumName",
      "AttributeType": "S"
    },
    {
      "AttributeName": "LastPostDateTime",
      "AttributeType": "S"
    }
  ],
  "TableName": "Thread",
  "KeySchema": [
    {
      "AttributeName": "ForumName",
      "KeyType": "HASH"
    }
  ],
  "GlobalSecondaryIndexes": [
    {
      "IndexName": "LastPostDateIndex",
      "KeySchema": [
        {
          "AttributeName": "LastPostDateTime",
          "KeyType": "HASH"
        },
        {
          "AttributeName": "ForumName",
          "KeyType": "RANGE"
        }
      ],
      "Projection": {
        "ProjectionType": "KEYS_ONLY"
      },
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5
      }
    }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 5,
    "WriteCapacityUnits": 5
  }
}

Just like an LSI, a GSI can be created during table creation. Unlike an LSI, however, a GSI can also be added after the table has been created.
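Adding a GSI to an existing table goes through the UpdateTable operation. The sketch below only builds the request parameters; `buildAddGsiParams` is a hypothetical helper, the index key is assumed to be a string attribute, and the table is assumed to use provisioned capacity. The resulting object would be passed to the client's `updateTable` method.

```javascript
// Hypothetical helper: UpdateTable parameters that add one GSI
// with a single string partition key to an existing table.
function buildAddGsiParams (tableName, indexName, partitionKey, readUnits, writeUnits) {
  return {
    TableName: tableName,
    // The new index key must be typed here, just as in createTable
    AttributeDefinitions: [
      { AttributeName: partitionKey, AttributeType: 'S' }
    ],
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: indexName,
          KeySchema: [
            { AttributeName: partitionKey, KeyType: 'HASH' }
          ],
          Projection: { ProjectionType: 'KEYS_ONLY' },
          // The index gets its own capacity allocation
          ProvisionedThroughput: {
            ReadCapacityUnits: readUnits,
            WriteCapacityUnits: writeUnits
          }
        }
      }
    ]
  }
}

const addGsiParams = buildAddGsiParams('Thread', 'LastPostDateIndex', 'LastPostDateTime', 5, 5)
console.log(JSON.stringify(addGsiParams, null, 2))
```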

Considerations

An LSI shares the read and write capacity units (RCU and WCU) of its table, if the table uses provisioned capacity.

A GSI has its own RCU and WCU allocation, if the table uses provisioned capacity.

Limitations

Global secondary indexes

  • Each table can have a maximum of 20 global secondary indexes (the default quota)

Local secondary indexes

  • Each table can have a maximum of 5 local secondary indexes
  • LSI cannot be added after the table is created.

Nested attributes

Indexes on nested attributes are not supported by AWS DynamoDB: the key attributes of an index must be top-level attributes of the item.
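One possible workaround, offered here as an assumption rather than an official recommendation, is to copy the nested value into a top-level attribute before the item is written, and to build the index on that copy. The helper `promoteNestedAttribute`, the sample item, and the attribute name `authorCountry` are all hypothetical.

```javascript
// Hypothetical helper: returns a copy of the item with the value found
// at `path` duplicated into a top-level attribute that can be indexed.
function promoteNestedAttribute (item, path, topLevelName) {
  const value = path.reduce(
    (node, key) => (node == null ? undefined : node[key]),
    item
  )
  // Leave the item unchanged when the nested value does not exist
  if (value === undefined) return { ...item }
  return { ...item, [topLevelName]: value }
}

const post = { id: '42', author: { name: 'alice', country: 'DE' } }
const indexable = promoteNestedAttribute(post, ['author', 'country'], 'authorCountry')

console.log(indexable)
```

The cost of this approach is that the copy has to be kept in sync whenever the nested value changes.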
