* Relational
— Hard to map to the way we code
* Complex ORM frameworks
— Hard to evolve quickly
* Rigid schema is hard to change, necessitates migrations
— Hard to scale horizontally
* Joins, transactions make scaling by adding servers hard
RDBMS Limitations
[Figure: project timeline from project start through launch (+30 days, +90 days, +6 months), with workarounds added along the way: denormalize data model, custom caching layer, custom sharding]
MongoDB
Built from the start to solve the
scaling problem
Consistency, Availability, Partitioning
- (can’t have it all)
Configurable to fit requirements
Theory of NoSQL: CAP
Many nodes
Nodes contain replicas of
partitions of data
Consistency
= all replicas contain the same
version of data
Availability
— system remains operational on
failing nodes
Partition tolerance
— multiple entry points
— system remains operational on
system split
CAP Theorem:
satisfying all three at the
same time is impossible
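The trade-off above can be sketched with a toy model. This is a minimal illustration, not how MongoDB or any real database implements replication: the `ToyCluster` class, its `mode` flag, and the two-replica setup are all hypothetical names chosen for the example. During a simulated network split, a "CP" cluster rejects writes (it gives up availability), while an "AP" cluster accepts them and lets the replicas diverge (it gives up consistency).

```python
class ToyCluster:
    """Two replicas of one partition of data; purely illustrative."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.replicas = [{}, {}]   # each dict is one node's copy of the data
        self.partitioned = False   # True when the nodes cannot reach each other

    def write(self, node, key, value):
        """Write through `node`; return True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: every replica receives the same version.
            for replica in self.replicas:
                replica[key] = value
            return True
        if self.mode == "CP":
            # Prefer consistency: refuse the write while the system is split.
            return False
        # Prefer availability: accept the write on the reachable node only.
        self.replicas[node][key] = value
        return True

    def consistent(self):
        """Consistency = all replicas contain the same version of data."""
        return self.replicas[0] == self.replicas[1]


cp = ToyCluster("CP")
cp.write(0, "x", 1)
cp.partitioned = True
cp_accepted = cp.write(0, "x", 2)   # rejected: not available during the split

ap = ToyCluster("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap_accepted = ap.write(0, "x", 2)   # accepted, but the replicas now disagree
```

In both cases the system tolerates the partition in the sense that it keeps running; what differs is which of the other two properties it sacrifices while the split lasts.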
Supported languages
* Python
* Scala
[Other language logos on the slide are not recoverable]
http://www.mongodb.org/display/DOCS/Drivers
ACID
* Atomicity
* Consistency
* Isolation
* Durability
BASE
* Basically Available (CP)
* Soft-state
* Eventually consistent (AP)
MongoDB is easy to use
[Figure: SQL example; only fragments (NULL, COMMIT;) survive]
As simple as possible,
but no simpler
[Figure: chart with axes "Scalability & Performance" and "Depth of functionality"]
Representing & Querying Data
Schema design
* RDBMS: join
[Figure: MySQL join example, not recoverable]
Schema design
* MongoDB: embed and link
* Embedding is the nesting of objects and arrays inside
a BSON document (prejoined). Links are references
between documents (client-side follow-up query).
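The difference between embedding and linking can be sketched with plain Python dictionaries standing in for BSON documents. No MongoDB driver is used here; the field names (`comments`, `author_id`), the `authors` dictionary acting as a collection, and the `find_author` helper are hypothetical and chosen only for illustration.

```python
# Embedding: comments are nested inside the post document ("prejoined"),
# so one read returns the post together with its comments.
post_embedded = {
    "_id": 1,
    "title": "Schema design",
    "comments": [
        {"author": "ann", "text": "Nice overview"},
        {"author": "bob", "text": "Thanks"},
    ],
}

# Linking: the post stores only a reference (the author's _id).
# `authors` stands in for a separate collection keyed by _id.
authors = {10: {"_id": 10, "name": "Ann"}}
post_linked = {"_id": 2, "title": "Embed and link", "author_id": 10}


def find_author(post, authors_collection):
    """Client-side follow-up query: resolve the link with a second lookup."""
    return authors_collection[post["author_id"]]


comment_authors = [c["author"] for c in post_embedded["comments"]]
author_name = find_author(post_linked, authors)["name"]
```

With embedding, the related data arrives in the same document; with linking, the application issues the second lookup itself, which plays the role a join would play in an RDBMS.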
* "contains" relationships, one to many; duplication of