NoSQL
data warehouse
NoSQL is better suited to archival products, where data is stored, processed, analyzed in the background. Real-time activities like customer transactions may be better served by an SQL.
NoSQL is hierarchical. SQL is relational. ~= NoSQL will include multiple copies of the same string (like a JSON file). SQL will (?) link to an instance of the string.
NoSQL pre-computes projections of data in the form of indexes. NoSQL
Scaling SQL up means more CPU. Scaling NoSQL up means making multiple copies of data, each with light CPU load.
Data types
There are a lot more data types in SQL than in NoSQL. DynamoDB has:
- String
- Number
- Buffer and
- Boolean
- Null Only the first three can be used in the PK.
schema
Take an SQL schema. Pick a pivotal table. That's the root of the NoSQL object. FKs don't link to other tables, they are nested objects or arrays of objects. Data that cannot be nested for logical or physical (10GB) reasons goes into another table.
aws-sdk methods
scan
for backfilling data when schema is expanded. for updating attributes when schema changes. for data fetching during development.
query
good for getting lists of things. works with base table and indexes. usually paginated.
Development concepts
Variables
Method params accept placeholder / variable names of attributes in the #name format and and placeholder / variable values in the :value format.
Response object
ScannedCount
is the total number of matches (up to 1MB).Count
isScannedCount
after filtration, if any.LastEvaluatedKey
means there are more results.ScannedCount
needs to be ignored or tallied.ExclusiveStartKey
is next query's start point, the start of the second installment of data.ReturnConsumedCapacity
allows monitoring resource usage during development.
versioning
Instead of creating an Item create a { v0: Item }. Then on update, do { v0: Current, v1: Previous }. v0 is current, the 1-n versions are a stack with most recent on top / last.
Database
Consists of the base table, up to 5 local indexes (same PK) and up to 20 global indexes (different PK). A table, including local indexes, is max 10GB.
Primary keys
PK consists of a hash key, and an optional sort key. No foreign keys, just query the other table by the FK. Or nest the FK entity, if it doesn't have multiple relationships.
Indexes
A local secondary index has the same primary key but a sort key different to that of the base table. Local indexes can fetch unprojected attributes during a query, global indexes are unable to do that. Projecting ALL in an index doubles the storage & write cost. Slides 25-26 illustrate LSIs and GSIs nicely
Transactions
Similar to Promise.all([transaction, transaction, transaction]).
Consistency
Eventual = data will replicate, just don't assume it will replicate immediately. Indexes are updated about every 6 hours. Immediate = data will be identical in all copies at all times. Reads cost twice as much. Writes probably also cost more.
Streams
Streams emit events. Stream events are triggered by database activity. They in turn can trigger lambdas and others. Good for online / real-time cross-device syncs. Streams are 24 hours long.
Cost
A read is worth 5 writes. 1 write is max 1KB - so keep your items small?