DynamoDB Bulk Insert: An Easy Tutorial

Walker Rowe

In this article, we’ll show how to do bulk inserts in DynamoDB.

(This tutorial is part of our DynamoDB Guide.)

Bulk inserts and deletes

DynamoDB can handle bulk inserts and bulk deletes through the batch-write-item operation. We use the CLI since it’s language agnostic. A single request can carry up to 16 MB of data but cannot contain more than 25 request operations.

Request operations can be:

  • PutRequest
  • DeleteRequest

The bulk request does not handle updates. A PutRequest replaces any existing item that has the same key.
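The 25-operation limit means a larger data set has to be split across several requests. The following is a minimal Python sketch of that split, not part of the original tutorial: the helper names put_request, delete_request, and build_batches are hypothetical, and the sample keys are made up. It assumes the items are already in AttributeValue format. Note that a DeleteRequest carries a Key holding only the primary key attributes, where a PutRequest carries a full Item.

```python
import json

MAX_REQUESTS_PER_BATCH = 25  # limit on request operations per batch-write-item call


def put_request(item):
    """Wrap an item (already in AttributeValue format) in a PutRequest."""
    return {"PutRequest": {"Item": item}}


def delete_request(key):
    """Wrap a primary key (in AttributeValue format) in a DeleteRequest."""
    return {"DeleteRequest": {"Key": key}}


def build_batches(table_name, requests, batch_size=MAX_REQUESTS_PER_BATCH):
    """Split a list of requests into request documents of at most batch_size each."""
    return [
        {table_name: requests[start:start + batch_size]}
        for start in range(0, len(requests), batch_size)
    ]


# Hypothetical data: 59 puts and one delete, all keyed by tconst.
requests = [put_request({"tconst": {"S": f"tt{n:07d}"}}) for n in range(59)]
requests.append(delete_request({"tconst": {"S": "tt9999999"}}))

batches = build_batches("title", requests)
print([len(batch["title"]) for batch in batches])
print(json.dumps(batches[-1]["title"][-1]))
```

Each element of batches could then be saved to its own file and passed to --request-items in a separate batch-write-item call.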

Data from IMDB

To illustrate, we have pulled 24 items from the IMDB (Internet Movie Database) and put them into JSON format. You can download that data from here.

The format for the bulk operation is:

{
"table name": [{
"request operation": {
"Item": {
(put your item here in AttributeValue format)
}
}
}]
}

Here is an example:

{
"title": [{
"PutRequest": {
"Item": {
"tconst": {
"S": "tt0276132"
},
"titleType": {
"S": "movie"
},
"primaryTitle": {
"S": "The Fetishist"
},
"originalTitle": {
"S": "The Fetishist"
},
"isAdult": {
"S": "0"
},
"startYear": {
"S": "2019"
},
"endYear": {
"S": "\\N"
},
"runtimeMinutes": {
"S": "\\N"
},
"genres": {
"S": "Animation"
}
}
}
}]
}
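Writing this format by hand gets tedious for more than a few items. The sketch below shows one way to generate it in Python, under a stated assumption: the source data is tab-separated with a header row whose column names match the attribute names used above. The sample row and the function names are illustrative only. Every column is stored as a string (S), as in the example.

```python
import csv
import io
import json

# Hypothetical sample: a header row plus one data row, tab-separated.
SAMPLE_TSV = (
    "tconst\ttitleType\tprimaryTitle\toriginalTitle\tisAdult\t"
    "startYear\tendYear\truntimeMinutes\tgenres\n"
    "tt0276132\tmovie\tThe Fetishist\tThe Fetishist\t0\t"
    "2019\t\\N\t\\N\tAnimation\n"
)


def row_to_put_request(row):
    """Wrap every column of a row as a string (S) attribute inside a PutRequest."""
    return {"PutRequest": {"Item": {name: {"S": value} for name, value in row.items()}}}


def tsv_to_request_items(table_name, tsv_text):
    """Convert tab-separated text into the request format for one table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return {table_name: [row_to_put_request(row) for row in reader]}


request_items = tsv_to_request_items("title", SAMPLE_TSV)
print(json.dumps(request_items, indent=2))
```

The \N marker for missing values is carried through as an ordinary string, which json.dumps escapes as "\\N" just as in the example file.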

If you are running DynamoDB locally then start it like this:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

Create a table like this:

aws dynamodb create-table \
--table-name title \
--attribute-definitions AttributeName=tconst,AttributeType=S \
--key-schema AttributeName=tconst,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
--endpoint-url http://localhost:8000

Then load the data like this, having saved the IMDB data in the file 100.basics.json.

aws dynamodb batch-write-item \
--endpoint-url http://localhost:8000 \
--request-items file:///Users/walkerrowe/Documents/imdb/100.basics.json \
--return-consumed-capacity TOTAL \
--return-item-collection-metrics SIZE

It responds:

{
"UnprocessedItems": {}, 
"ConsumedCapacity": [
{
"CapacityUnits": 23.0, 
"TableName": "title"
}
]
}

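An empty UnprocessedItems map means every request in the batch was written. ConsumedCapacity reports the write capacity units used, not a record count. You can verify that the data was loaded by querying for one of the items like this: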

aws dynamodb query  \
--endpoint-url http://localhost:8000 \
--table-name title    \
--key-condition-expression "tconst = :tconst"     \
--expression-attribute-values '{ ":tconst":{"S":"tt0276132"}}'
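The UnprocessedItems field in the batch-write-item response is worth checking as well, because DynamoDB can return some requests unprocessed, for example when the table is throttled. The field uses the same format as the request file, so it can be saved and submitted again. The following Python sketch assumes the CLI response was captured as text. The sample response is hypothetical, and so is the function name.

```python
import json

# Hypothetical response in which one PutRequest was not processed.
SAMPLE_RESPONSE = """
{
  "UnprocessedItems": {
    "title": [
      {"PutRequest": {"Item": {"tconst": {"S": "tt0276132"}}}}
    ]
  },
  "ConsumedCapacity": [{"CapacityUnits": 23.0, "TableName": "title"}]
}
"""


def unprocessed_request_items(response_text):
    """Return the requests to retry, or None when everything was written."""
    response = json.loads(response_text)
    return response.get("UnprocessedItems") or None


retry = unprocessed_request_items(SAMPLE_RESPONSE)
if retry is None:
    print("All requests were processed.")
else:
    # The retry document can be written to a file and passed to --request-items again.
    print(json.dumps(retry, indent=2))
```

In practice a retry loop would wait a little longer between attempts rather than resubmitting immediately.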

Attribute Types and AttributeValue

Here we show some of the AttributeValues, meaning the data types that DynamoDB supports for attributes. They include:

  • S
  • BOOL
  • L
  • M
  • etc.

Note: Numeric values are wrapped in quotes too, as in {"N": "35"}.

Attribute type and description:

  • S: String. Notice that a date is stored as a string in ISO 8601 format, like this:

"currentTime": {"S": "2020-07-24T09:25:49+0000"}

  • BOOL: Boolean. Use true or false.

  • L: List. A list of attribute values whose elements have no attribute names:

"other": {
"L": [{"S": "Paris"},
{"N": "13000000"}]
}

  • M: Map, containing attribute values. This is like a JSON object, except that each value is an attribute value. So, it’s like a list of named attributes:

"map": {
"M": {"Name": {"S": "Joe"},
"Age": {"N": "35"}}
}

Here is an example showing how to use those DynamoDB attribute types.

{
"title": [{
"PutRequest": {
"Item": {
"tconst": {
"S": "tt9276132"
},
"titleType": {
"S": "movie"
},
"primaryTitle": {
"S": "Zorba"
},
"isAdult": {
"BOOL": true
},
"Years": {
"NS": ["2019","2020"]
},
"actors": {
"SS": ["Anthony Quinn", "Marcel Marciano", "David Niven", "Peter Sellers"]
},
"currentTime": {
"S": "2020-07-24T09:25:49+0000"
},
"other": {
"L": [{"S": "Paris"}, 
{"N": "13000000"}]
},
"map": {
"M": {"Name": {"S": "Joe"}, "Age": {"N": "35"}}
}
}
}
}]
}
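To show how these types relate to ordinary values, here is a hand-rolled Python sketch that converts native Python values into AttributeValue format. It is an illustration, not an official serializer: the function name to_attribute_value is hypothetical, and it covers only the S, N, BOOL, L, M, SS, and NS types used in this tutorial.

```python
def to_attribute_value(value):
    """Convert a Python value into DynamoDB AttributeValue format."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("DynamoDB sets cannot be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        raise TypeError("sets must contain only strings or only numbers")
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {str(k): to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical item mirroring the example above.
item = {
    "tconst": "tt9276132",
    "isAdult": True,
    "Years": {2019, 2020},
    "other": ["Paris", 13000000],
    "map": {"Name": "Joe", "Age": 35},
}
put_request = {"PutRequest": {"Item": {k: to_attribute_value(v) for k, v in item.items()}}}
print(put_request)
```

Set members are sorted here only to make the output repeatable; DynamoDB sets are unordered.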


These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

About the author

Walker Rowe

Walker Rowe is an American freelance tech writer and programmer living in Cyprus. He writes tutorials on analytics and big data and specializes in documenting SDKs and APIs. He is the founder of the Hypatia Academy Cyprus, an online school to teach secondary school children programming.