Mainstream NoSQL Databases: MongoDB

Uploaded by candlewhynot · Data Management · 31 Jan 2013

NoSQL Database: MongoDB

Yonyou Software Co., Ltd.

Wang Xiaoming

2012-10-10







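MongoDB Data Synchronization

Red arrows show writes: a write goes to the Primary and is then replicated asynchronously to the Secondaries.

Blue arrows show reads: a read can be served by the Primary or by any Secondary.

The Primary and the Secondaries continuously exchange heartbeats, which are used to determine the state of the replica set.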
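As a rough illustration of how heartbeats can be turned into a view of the set's state, here is a minimal Python sketch. The function names and the 10-second timeout are assumptions made for this example, not values taken from the slides or from MongoDB's source code.

```python
HEARTBEAT_TIMEOUT_S = 10.0  # assumed timeout, chosen only for illustration


def member_states(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT_S):
    """Classify each member as "up" or "down" from its last heartbeat time."""
    return {
        name: "up" if now - seen <= timeout else "down"
        for name, seen in last_heartbeat.items()
    }


def has_majority(states):
    """True when more than half of the members are currently reachable."""
    up = sum(1 for state in states.values() if state == "up")
    return up * 2 > len(states)
```

A member that sees only a minority of the set as "up" has no basis for acting as primary, which is why the heartbeat view matters for failover.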







If the primary becomes unavailable, how does a secondary take over?







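MongoDB Cluster (Replica Set)

A MongoDB replica set is a cluster of mongod instances that provides automated failover.

If the primary becomes unavailable, the secondaries automatically elect a new primary.

A replica set can have at most 12 members, of which at most 7 can vote (the limits in the MongoDB 2.x releases current when this deck was written).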
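The two limits can be expressed as a small check. This is only a sketch: `validate_replica_set` and the list-of-dicts layout are hypothetical names for this example, loosely modeled on a replica set configuration document, and the constants are the figures quoted on the slide.

```python
MAX_MEMBERS = 12        # limit quoted on the slide
MAX_VOTING_MEMBERS = 7  # limit quoted on the slide


def validate_replica_set(members):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(members) > MAX_MEMBERS:
        problems.append(f"too many members: {len(members)} > {MAX_MEMBERS}")
    # A member votes unless its "votes" field is explicitly set to 0.
    voters = sum(1 for m in members if m.get("votes", 1) > 0)
    if voters > MAX_VOTING_MEMBERS:
        problems.append(f"too many voters: {voters} > {MAX_VOTING_MEMBERS}")
    return problems
```

Larger sets stay within the limits by marking the extra members as non-voting.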







MongoDB Server Architecture

[Figure: a Client App writes to the Primary. The replica set also contains voting secondaries, an arbiter (votes only), secondary-only secondaries (cannot be elected), non-voting secondaries, and hidden secondaries (real-time, delayed, and backup). A hidden secondary can serve as an online test server.]







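Secondary Types

Secondary-Only
A secondary-only member can never become primary, even when the primary is unavailable.

Non-Voting
A non-voting secondary does not vote in elections.

Hidden
A hidden member is invisible to client applications. It can be used, for example, as a live online test server.

Delayed
A delayed member applies the primary's operations after a fixed delay. It can act as a special kind of backup server and is usually hidden as well.

Arbiters
An arbiter takes part in elections only. Unlike the other members, it does not copy the primary's data.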
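These member types correspond to per-member settings in the replica set configuration. The sketch below writes such a configuration as a Python dictionary. The field names `priority`, `votes`, `hidden`, `slaveDelay` and `arbiterOnly` are the ones used by MongoDB 2.x (`slaveDelay` was renamed in later releases); the set name, the host names and the helper `member_roles` are hypothetical.

```python
config = {
    "_id": "rs0",  # hypothetical set name
    "members": [
        {"_id": 0, "host": "db0.example.net:27017"},
        {"_id": 1, "host": "db1.example.net:27017", "priority": 0},
        {"_id": 2, "host": "db2.example.net:27017", "votes": 0},
        {"_id": 3, "host": "db3.example.net:27017", "priority": 0, "hidden": True},
        {"_id": 4, "host": "db4.example.net:27017", "priority": 0, "hidden": True,
         "slaveDelay": 3600},
        {"_id": 5, "host": "db5.example.net:27017", "arbiterOnly": True},
    ],
}


def member_roles(member):
    """Name the special roles a member's settings imply (empty list = ordinary)."""
    if member.get("arbiterOnly", False):
        return ["arbiter"]
    roles = []
    if member.get("priority", 1) == 0:
        roles.append("secondary-only")
    if member.get("votes", 1) == 0:
        roles.append("non-voting")
    if member.get("hidden", False):
        roles.append("hidden")
    if member.get("slaveDelay", 0) > 0:
        roles.append("delayed")
    return roles
```

Note that hidden and delayed members also carry `priority: 0`, so a member that clients cannot see, or whose data lags on purpose, is never elected primary.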







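Elections (Master Election)

Any member that is eligible for election can become primary, provided it stays in heartbeat contact with the other members (in practice, a majority of the voting members). The member with the highest priority is preferred.

Members with voting rights (votes = 1) can vote in elections.

Elections are an automated mechanism and reduce the administrator's workload.

Members must be able to connect to one another during an election.

When an election is triggered, the server closes all client connections.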
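The rules on this slide can be sketched as a toy election. This is a deliberately simplified model under stated assumptions: the `Member` class and `elect_primary` are hypothetical, and a real election also weighs how up to date each candidate's data is, which is ignored here.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    priority: float = 1.0   # 0 means the member can never become primary
    votes: int = 1          # 0 means the member does not vote
    reachable: bool = True  # result of the heartbeat check
    arbiter: bool = False   # arbiters vote but hold no data


def elect_primary(members):
    """Return the new primary's name, or None when no primary can be elected."""
    total_votes = sum(m.votes for m in members)
    live_votes = sum(m.votes for m in members if m.reachable)
    if live_votes * 2 <= total_votes:
        return None  # the reachable members do not hold a majority of votes
    candidates = [
        m for m in members
        if m.reachable and not m.arbiter and m.priority > 0
    ]
    if not candidates:
        return None
    # The highest-priority candidate is preferred.
    return max(candidates, key=lambda m: m.priority).name
```

In this model an arbiter only contributes a vote, which is enough to let a two-data-node set keep a majority when one data node fails.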







What happens if data has been written to the primary, but the primary becomes unavailable before the data is replicated to the secondaries?







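Data Issues After the Primary Becomes Unavailable: Rollback

Data is written to the primary, and the primary then becomes unavailable before the data has been replicated to the secondaries. This is rare and is mostly caused by network problems.

When the former primary reconnects, it rejoins the set as a secondary and has to roll back the writes that were never replicated.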
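A toy model of that rollback, assuming each member's operation log is just a Python list: the rejoining member keeps the prefix it shares with the new primary and discards everything after the point where the two logs diverge. The function name `roll_back` is hypothetical, and the real mechanism is considerably more involved.

```python
def roll_back(old_primary_oplog, new_primary_oplog):
    """Split the old primary's log into (kept, rolled_back) entries."""
    common = 0
    for mine, theirs in zip(old_primary_oplog, new_primary_oplog):
        if mine != theirs:
            break  # first divergent entry: everything from here is rolled back
        common += 1
    return old_primary_oplog[:common], old_primary_oplog[common:]
```

After the rollback the member replays the new primary's log from the common point, so both logs end up identical.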







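How Is Data Kept Consistent Between the Primary and the Secondaries?

Asynchronous writes
When a client sends a write operation to the database server, MongoDB by default does not wait for the operation to succeed or complete before returning. Use getLastError to check whether the write succeeded.

Eventual consistency
If a client reads from a secondary, the data it gets may be stale because the secondary has not yet caught up. This is described as eventual consistency: the secondary will eventually reflect the state of the primary.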
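A stale read is easy to reproduce in a small simulation. The `Primary` and `Secondary` classes below are hypothetical stand-ins that exchange a list of operations; they are meant only to show why a secondary can lag and then catch up.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.oplog = []  # ordered record of every write

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


class Secondary:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of oplog entries already applied

    def sync_from(self, primary):
        """Apply every operation the primary has logged since the last sync."""
        for key, value in primary.oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.oplog)
```

Between two calls to `sync_from`, a read from the secondary returns whatever was current at the previous sync; once the next sync runs, it matches the primary again.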







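Write Concern

getLastError supports the following options.

No options
When your application receives the response, the mongod instance has completed the write in memory.
This gives simple, low-latency writes while still letting your application detect errors such as an unavailable mongod instance or a duplicate primary key.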







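Write Concern

j ("journal") option
Besides confirming that the data is in memory, the mongod instance also confirms that it has written the data to disk. This ensures that the data is durable if mongod or the server itself crashes or shuts down unexpectedly.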







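Write Concern

w option
The w option confirms that the write has been replicated to a specified number of replica set members (the count includes the primary). You can give a specific number of members, or "majority" to ensure that the write reaches a majority of the members. The default value of w is 1.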
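The three variants differ only in the fields sent with the getLastError command. The sketch below builds that command document as a Python dictionary; `getLastError`, `j`, `w` and `wtimeout` are the command's own field names, while the two helper functions are hypothetical and how the document is sent depends on the driver, which is not shown.

```python
def get_last_error_command(w=None, j=False, wtimeout=None):
    """Build a getLastError command document for the requested guarantees."""
    command = {"getLastError": 1}  # the command name must be the first key
    if j:
        command["j"] = True            # wait for the journal write
    if w is not None:
        command["w"] = w               # a member count or "majority"
    if wtimeout is not None:
        command["wtimeout"] = wtimeout # milliseconds to wait for replication
    return command


def write_concern_satisfied(w, acknowledged, total_members):
    """Check whether enough members (primary included) acknowledged a write."""
    if w == "majority":
        return acknowledged * 2 > total_members
    return acknowledged >= w
```

Stronger settings trade latency for safety: with `w="majority"` a write that has been acknowledged cannot be lost to the rollback scenario described earlier, as long as a majority of members survives.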







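Read Consistency

By default, applications read directly from the primary, which guarantees that the data is up to date.

If an application does not need fully up-to-date data, you can improve read throughput by distributing some or all reads to the secondary members of the replica set.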







When to Read from a Secondary

Backups and reports
Running systems operations that do not affect the front-end application, such as backups and reports.

Geography
If one secondary is closer to an application server than the primary, you may see better performance for that application if you use secondary reads.

Primary unavailable
Providing graceful degradation in failover situations where a set has no primary for 10 seconds or more. In this use case, give the application the primaryPreferred read preference, which lets it keep reading from secondaries while the set has no primary.







Read Preference Modes

Read from the primary only (primary)
All read operations use only the current replica set primary. This is the default.

Prefer the primary (primaryPreferred)
Reads normally go to the primary. If the primary is unavailable, as is the case during failover situations, operations read from secondary members.

Read from secondaries only (secondary)
Operations read only from the secondary members of the set.

Prefer secondaries (secondaryPreferred)
Reads normally go to secondary members. In situations where the set consists of a single primary (and no other members), the read operation will use the set's primary.

Read from the nearest member (nearest)
The driver reads from the nearest member of the set according to the member selection process. Reads in nearest mode do not consider the member's type and may be served by either the primary or a secondary.
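The five modes can be summarized as a member-selection function. This is a simplified sketch: `choose_member` and the member dictionaries are hypothetical, and it always picks the single lowest-latency member, whereas real drivers apply their own member selection process.

```python
def choose_member(mode, members):
    """Pick a member name for a read, or None if the mode cannot be served.

    Each member is a dict with "name", "state" ("PRIMARY" or "SECONDARY")
    and "ping_ms" (measured round-trip time).
    """
    primaries = [m for m in members if m["state"] == "PRIMARY"]
    secondaries = [m for m in members if m["state"] == "SECONDARY"]
    if mode == "primary":
        pool = primaries
    elif mode == "primaryPreferred":
        pool = primaries or secondaries   # fall back when there is no primary
    elif mode == "secondary":
        pool = secondaries
    elif mode == "secondaryPreferred":
        pool = secondaries or primaries   # fall back when there are no secondaries
    elif mode == "nearest":
        pool = primaries + secondaries    # member type is ignored
    else:
        raise ValueError(f"unknown read preference mode: {mode!r}")
    if not pool:
        return None
    return min(pool, key=lambda m: m["ping_ms"])["name"]
```

The two "preferred" modes are the ones that keep reads working during failover, at the cost of possibly returning stale data.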







Why Is MongoDB Faster?







Why Is MongoDB Faster?

No rollbacks
Your code must function without rollbacks. Check all programmatic conditions before performing the first database write operation. Order your write operations such that the most important operation occurs last.
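One way to follow that advice is sketched below, with plain dictionaries standing in for two collections. The function `place_order` and the `orders` / `order_items` names are hypothetical; the point is only the ordering of checks and writes.

```python
def place_order(db, order_id, items):
    """Validate first, then write; the record that matters most goes last."""
    # Check every condition before the first write, since nothing can be undone.
    if not items:
        raise ValueError("an order needs at least one item")
    if order_id in db["orders"]:
        raise ValueError(f"order {order_id!r} already exists")

    # Supporting writes first: a crash here leaves only orphaned items behind.
    for index, item in enumerate(items):
        db["order_items"][(order_id, index)] = item

    # Most important write last: readers see the order only once it is complete.
    db["orders"][order_id] = {"item_count": len(items), "status": "placed"}
```

If the process fails midway, no order record exists, so the application never sees a half-written order; the leftover items can be cleaned up later.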







Why Is MongoDB Faster?

Explicit locking
Your code may explicitly lock objects when performing operations. Thus, the application programmer has the capability to ensure "serializability" when required. Locking functionality will be available in late alpha / early beta release of MongoDB.







Why Is MongoDB Faster?

Single access by primary key
Reads and writes with MongoDB are like single reads and writes by primary key on a table with no non-clustered indexes in an RDBMS.







MySQL vs. MongoDB: Fetching a Single Row

Here is what happens when a client retrieves a single row/document by primary key. I'll annotate the differences between both systems:

1. Client builds a binary command (same)
2. Client sends it over TCP (same)
3. Server parses the command (same)
4. Server accesses query plan from cache (SQL only, not MongoDB, not HandlerSocket)
5. Server asks B-Tree component to access the row (same)
6. Server takes a physical read-only lock on the B-Tree path leading to the row (same)
7. Server takes a logical lock on the row (SQL only, not MongoDB, not HandlerSocket)
8. Server serializes the row and sends it over TCP (same)
9. Client deserializes it (same)

There are only two additional steps for typical SQL-based RDBMSes. That's why there isn't really a difference.







SQL Schema or NoSQL Document Structure?

What data structure do I really want?

MongoDB's object-oriented data storage makes it much more flexible when the documents stored are not uniform, which is often the case in web apps. MongoDB's various APIs can take in JSON directly.

If you have designed your SQL schema for fast data retrieval, you are avoiding large numbers of documents (high millions), or you are partitioning the tables and avoiding joins to tables of any large size. NoSQL data structures encourage you to do this naturally, having no real provisions for joins.







Scalability for Large Data Volumes

How much is this going to cost?

If your app is already built on SQL, or your engineering expertise is much stronger in SQL, your engineers can probably get strong scaling performance out of MySQL or Oracle.

As far as scaling MongoDB goes, there is a lot of engineering you don't have to do. It shards and scales effectively using built-in features, without the extensive setup required for clustering RDBMS systems.







NoSQL Database Comparison







MongoDB

Written in: C++
Main point: Retains some friendly properties of SQL (queries, indexes)
License: AGPL (drivers: Apache)
Protocol: Custom, binary (BSON)

Master/Secondary replication (auto failover with replica sets)
Sharding built in
Queries are JavaScript expressions
Run arbitrary JavaScript functions server-side
Better update-in-place than CouchDB
Uses memory-mapped files for data storage
Performance over features
Journaling (with --journal) is best turned on
On 32-bit systems, limited to ~2.5 GB
An empty database takes up 192 MB
GridFS to store big data + metadata (not actually an FS)
Has geospatial indexing

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.







Redis (V2.4)

Written in: C/C++
Main point: Blazing fast
License: BSD
Protocol: Telnet-like

Disk-backed in-memory database
Currently without disk-swap (VM and Diskstore were abandoned)
Master-Secondary replication
Simple values or hash tables by keys, but complex operations like ZREVRANGEBYSCORE
INCR & co (good for rate limiting or statistics)
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets (high score table, good for range queries)
Redis has transactions (!)
Values can be set to expire (as in a cache)
Pub/Sub lets one implement messaging (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.







CouchDB (V1.1.1)

Written in: Erlang
Main point: DB consistency, ease of use
License: Apache
Protocol: HTTP/REST

Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication (!)
MVCC: write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling, thus, CouchApps (standalone JS apps)
jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.







Riak (V1.0)

Written in: Erlang & C, some JavaScript
Main point: Fault tolerance
License: Apache
Protocol: HTTP/REST or custom binary

Tunable trade-offs for distribution and replication (N, R, W)
Pre- and post-commit hooks in JavaScript or Erlang, for validation and security
Map/reduce in JavaScript or Erlang
Links & link walking: use it as a graph database
Secondary indices: but only one at once
Large object support (Luwak)
Comes in "open source" and "enterprise" editions
Full-text search, indexing, querying with Riak Search server (beta)
In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want something Cassandra-like (Dynamo-like), but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
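The "(N, R, W)" trade-off refers to Dynamo-style quorums: each value is stored on N replicas, a read consults R of them and a write waits for W. A common rule of thumb, sketched here with a hypothetical helper, is that a read is guaranteed to overlap the latest acknowledged write when R + W > N.

```python
def quorum_overlaps(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n
```

Lower values of R and W reduce latency and raise availability, but they give up the overlap guarantee.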







HBase (V0.92.0)

Written in: Java
Main point: Billions of rows X millions of columns
License: Apache
Protocol: HTTP/REST (also Thrift)

Modeled after Google's BigTable
Uses Hadoop's HDFS as storage
Map/reduce with Hadoop
Query predicate push down via server side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP supports XML, Protobuf, and binary
Cascading, hive, and pig source and sink modules
JRuby-based (JIRB) shell
Rolling restart for configuration changes and minor upgrades
Random access performance is like MySQL
A cluster consists of several different types of nodes

Best used: Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.

For example: Analysing log data.







Neo4j (V1.5M02)

Written in: Java
Main point: Graph database, connected data
License: GPL, some features AGPL/commercial
Protocol: HTTP/REST (or embedding in Java)

Standalone, or embeddable into Java applications
Full ACID conformity (including durable data)
Both nodes and relationships can have metadata
Integrated pattern-matching-based query language ("Cypher")
Also the "Gremlin" graph traversal language can be used
Indexing of nodes and relationships
Nice self-contained web admin
Advanced path-finding with multiple algorithms
Indexing of keys and relationships
Optimized for reads
Has transactions (in the Java API)
Scriptable in Groovy
Online backup, advanced monitoring and High Availability are AGPL/commercial licensed

Best used: For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.

For example: Social relations, public transport links, road maps, network topologies.

Cassandra

Written in: Java
Main point: Best of BigTable and Dynamo
License: Apache
Protocol: Custom, binary (Thrift)

Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys
BigTable-like features: columns, column families
Has secondary indices
Writes are much faster than reads (!)
Map/reduce possible with Apache Hadoop
All nodes are similar, as opposed to Hadoop/HBase

Best used: When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that). Writes are faster than reads, so one natural niche is real-time data analysis.

Membase

Written in: Erlang & C
Main point: Memcache compatible, but with persistence and clustering
License: Apache 2.0
Protocol: memcached plus extensions

Very fast (200k+/sec) access of data by key
Persistence to disk
All nodes are identical (master-master replication)
Provides memcached-style in-memory caching buckets, too
Write de-duplication to reduce IO
Very nice cluster-management web GUI
Software upgrades without taking the DB offline
Connection proxy for connection pooling and multiplexing (Moxi)

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.

For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).

CouchDB vs. MongoDB vs. MySQL