2ndQuadrant 10 reasons why MySQL Users should move to ...

disturbedoctopusΔιαχείριση Δεδομένων

27 Νοε 2012 (πριν από 4 χρόνια και 4 μήνες)

285 εμφανίσεις

2ndQuadrant
10 reasons why MySQL Users should
move to PostgreSQL
15 June 2011
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
2
Copyright © 2011, 2ndQuadrant Limited. All rights reserved.
This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free,
nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a
particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document.
This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.
The PostgreSQL elephant logo "Slonik" ® is a registered trademark of the PostgreSQL Global Development Group.
MySQL ® is a registered trademark of Oracle Corporation and/or its affiliates.
Table of Contents
Introduction ................................................................................................................................ 3
Comparing MySQL and PostgreSQL at a glance ............................................................................... 3
Reason 1: Enterprise class query planner ........................................................................................ 3
Reason 2: Crash safety, reliability and the Write-ahead log ................................................................. 4
Reason 3: Streaming replication ..................................................................................................... 4
Reason 4: Excellent third-party/satellite tools .................................................................................... 5
Reason 5: Advanced Features ....................................................................................................... 5
Reason 6: Quality and data-integrity first .......................................................................................... 6
Reason 7: Licensing .................................................................................................................... 6
Reason 8: Strong community ......................................................................................................... 6
Reason 9: Procedural languages .................................................................................................... 7
Reason 10: SQL compliance and integration .................................................................................... 7
Advanced, ISO conforming SQL dialect ................................................................................... 7
Support for popular ORMs .................................................................................................... 8
About 2ndQuadrant ...................................................................................................................... 8
Business and Contact Information .................................................................................................. 8
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
3
Introduction
Over the last decade, the two main open-source database solutions were MySQL and PostgreSQL.
Among ICT professionals, there is a traditional perception that MySQL is faster and easier to use than PostgreSQL.
On the other hand, PostgreSQL has always been acknowledged to be more feature-rich and robust than
MySQL. Over the years, both products have probably learnt from each other, and both have certainly improved
considerably.
PostgreSQL, with 5 major releases in the last five years, has focused on improving performance, high availability
and replication as well as making life easier for DBAs and system administrators.
The intended audience of this whitepaper is MySQL users and developers who are interested in PostgreSQL.
These 10 reasons cover both the product (the PostgreSQL object-relational DBMS) and the open source project
(licensing, community, related tools, the PostgreSQL ecosystem, etc.).
Comparing MySQL and PostgreSQL at a glance
Product
MySQL
Community
Edition
MySQL
Classic
Edition
MySQL
Standard
Edition
MySQL
Enterprise
Edition
MySQL
Cluster CGE
PostgreSQL 9
License
GPL
a
Commercial
Commercial
Commercial
Commercial
TPL
b
License costs
c
Free
N/A
USD 2,000
USD 5,000
USD 10,000
Free
Commercial embedded
systems
under GPL
Contact Oracle Sales Team

Connectors






Triggers
Row triggers only
Row &
Statement
d
Standard authentication
methods
-
-
-
-
-
LDAP,
GSSAPI,
SSPI, SSL
Certificate,
Radius, etc.
ACID transactions

-




Online Hot Backup
-
-
-



Master-Slave Replication







e
Partitioning

-
-



Point-In-Time Recovery
f
-
-
-



Administration GUI

-




Monitoring tools
-
-
-




g
Bug fixes, security and
maintenance releases
-





Professional 24x7 support
-




Contact
2ndQuadrant
Sales Team
a
GNU General Public License: http://www.gnu.org/licenses/gpl.html
b
The PostgreSQL License is a highly permissive BSD-style license: http://www.postgresql.org/about/licence
c
MySQL license costs are per 1-4 socket server per year. Pricing on 5+ socket servers requires contacting Oracle Sales Team
d
PostgreSQL 9.1 will support triggers on views
e
PostgreSQL 9.1 will support synchronous replication
f
Ability to recover the database situation to a particular point in time or transaction
g
Integration with standard monitoring and alerting tools such as Cacti, Munin and Nagios
Reason 1: Enterprise class query planner
Perhaps the single biggest factor that differentiates PostgreSQL from MySQL is its extremely sophisticated query
planner.
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
4
PostgreSQL uses cost-based query optimization methods in order to get good performance for many different
types of joins. Query costs are assessed based on planner statistics collected when tables are analyzed combined
with adjustable planner cost constants. Advanced features such as the Genetic Query Optimizer allow optimizing
even very complicated joins efficiently.
MySQL’s query optimizer estimates cost by heuristically calculating the number of disk blocks that will most likely
need to be read. Because the query optimizer is implemented in the server, not the storage engines, not all the
salient information is available to the optimizer. When the optimizer chooses badly, there are only a few tunables
for Controlling Query Optimizer Performance. Developers must do things like explicitly provide index hints to make
sure joins are executed correctly. To make this task smoother, MySQL provides a Query Profiler that is easier to
work with than typical EXPLAIN data.
Finding order in execution provides several comparisons of how the two databases handle queries differently.
Because of its more robust automatic optimization, PostgreSQL usually does a better job of handling complicated
joins than MySQL.
Slow queries on PostgreSQL can easily have their explains automatically logged using the auto_explain
module
1
. This approach is useful for DBAs who like to pro-actively tune slow-performing queries.
Reason 2: Crash safety, reliability and the Write-ahead log
A continual focus for 2ndQuadrant’s community work on PostgreSQL has been the write-ahead log. Write-Ahead
Logging (WAL) is a standard method for ensuring data integrity and crash safety
2
of database transactions.
Essentially, WAL’s central concept is that changes to data files (where tables and indexes reside) must be written
only after those changes have been logged, that is, after log records describing the changes have been flushed
to permanent storage.
PostgreSQL disaster recovery features, including Continuous Archiving and Point-In-Time Recovery (PITR),
are based on Write-Ahead Logging
3
.
On the other hand, MyIsam lacks transactions. In the event of a crash, it must rebuild indexes and tables. This
takes an amount of time that is rises at a rate that is roughly linear to the size of the database. This can have
negative implications for the availability of your database. The MySQL documentation has an extensive guide to
repairing MyIsam tables. While MySQL has gone someway towards moving away from MyIsam, particularly in the
recent 5.5 release where it is no longer the default storage engine, many production databases still use MyIsam.
The InnoDB equivalent of the write-ahead log is the binary log.
Reason 3: Streaming replication
PostgreSQL’s built-in asynchronous replication (Hot standby and Streaming replication) is just a logical
extension of write-ahead logging. The database is replicated with a tiny delay, typically one transaction or a few
seconds. The primary database (also known as "master") and the standby databases (sometimes referred to as
"slaves") are identical at the binary level.
In the upcoming 9.1 release, due out in Autumn 2011, replication is further simplified. A new utility,
pg_basebackup, can be used to take base backups of a running PostgreSQL database cluster. These are taken
without affecting other clients to the database, and can be used both for point-in-time recovery and as the starting
point for streaming replication standby servers.
In PostgreSQL 9.1, 2ndQuadrant have added a new type of replication: synchronous replication. This affords
an extra level of durability, by having transactions commit only when they are safely stored on one standby. It
is possible to change the service level at the transaction boundary, and designate only certain transactions as
needing the extra durability as appropriate.
1
http://www.postgresql.org/docs/current/static/auto-explain.html
2
http://www.postgresql.org/docs/current/static/wal.html
3
http://www.postgresql.org/docs/current/static/continuous-archiving.html
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
5
2ndQuadrant has developed repmgr
4
, an open-source tool for managing PostgreSQL replication at a very high
level. Features include the ability to:
• Create new standbys using PostgreSQL Streaming Replication;
• Analyse replication lag that impacts both high availability and Hot Standby queries;
• Perform maintenance tasks required to handle failover when a node is removed from the standby cluster, either
intentionally or via failure.
Reason 4: Excellent third-party/satellite tools
There are a large number of PostgreSQL satellite projects. For example, PostGIS
5
is considered to be the most
mature open source Geographical Information System - the first version was released in 2001. It allows users to
perform geospatial queries, transforming PostgreSQL into a geodatabase. The OpenStreetMap project
6
moved
to PostgreSQL in 2009
7
from MySQL.
MySQL’s geospatial support is limited to 2D MBR (minimum bounding rectangle) functions. PostgreSQL/PostGIS,
in contrast, has 300 functions and operators and some MM support of circular strings and compound curves.
The extensibility of Postgres makes this sort of development quite natural, particularly its support for user defined
datatypes and indexing schemes. Postgres supports two types of generalised indexes - GiST (Generalised
Search Tree) and GIN (Generalised inverted index). The main advantage of GiST and GIN is that they allow the
development of custom data types with appropriate access methods, by an expert in the domain of the data type,
rather than a database expert.
Apart from geospatial applications, these generalised indexes have been used to define custom indexing schemes
for anything from soundex support to indexing discrete elements of arrays in a table with an array column very
efficiently.
With PostgresQL 9.1, the addition of the K-nearest neighbour (KNN) algorithm allows for ORDER BY clauses to
use GiST indexes directly. This support will make for world-beating GIS in the upcoming 2.0 release of PostGIS,
as the feature is not available from any of the proprietary databases.
There are also satellite projects for every conceivable supporting task - mature connection poolers,
instrumentation/monitoring tools (including support for popular monitoring frameworks like Nagios and Cacti), log
analysers, GUI administration tools (pgAdmin, phpPgAdmin) and more.
Reason 5: Advanced Features
PostgreSQL has a number of features that set it apart from even the expensive proprietary databases.
One such advanced feature is transactional DDL. In PostgreSQL, when you are inside a transaction almost
any operation can be undone. There are some irreversible operations (like creating or destroying a database
or tablespace), but normal table modifications, including creating and dropping tables, can be backed out of by
issuing a ROLLBACK, due to its Write-ahead log design.
MySQL doesn’t support any sort of rollback when using MyISAM, because it is not transactional. Even InnoDB
and Oracle don’t provide transactional DDL, because DDL operations cause the currently open transaction to
implicitly commit.
Experienced PostgreSQL DBAs know to take advantage of its features here to protect themselves when doing
complicated work like schema upgrades. If you put all such changes into a transaction block, you can make sure
they are applied atomically – all changes are applied or none are. This drastically lowers the possibility that
the database will be corrupted by a typo or other such error in the schema change, which is particularly important
when you’re modifying multiple related tables where a mistake might destroy the relational key.
4
http://projects.2ndquadrant.com/repmgr
5
http://postgis.refractions.net/
6
http://www.openstreetmap.org/
7
http://blog.cleverelephant.ca/2009/04/openstreetmap-moves-to-postgresql.html
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
6
Another advanced feature is exclusion constraints. Exclusion constraints are described as “a generalisation of
unique constraints” across all operators. This feature is often used to solve the “double-booking problem” (ensure
that each room, each of which can be in use for some period of time, isn’t double booked), without race conditions,
and without heavy, explicit table-level locks.
PostgreSQL has been the subject of numerous academic papers and research projects, and is considered a good
basis for cutting-edge database research. For example, it will soon have the first production implementation of
Serializable snapshot isolation (SSI), which is based on a 2008 academic paper by Cahill, et al, which itself won
a best paper award from ACM SIGMOD.
Reason 6: Quality and data-integrity first
PostgreSQL has traditionally focused on quality and data integrity.
A simple example of this trend is the date datatype: in MySQL, it is possible to set a date to February 31st.
PostgreSQL won’t allow you to set the date to any day in February past the 28th - except the 29th on leap years.
There has never been a PostgreSQL release with known bugs at release time. In contrast, MySQL has shipped
with known bugs as a matter of course, and certain bugs are known to have been in MySQL for years. MySQL’s
criteria for General Availability does not require that the release contain no bugs; rather, it requires that it contains
no “serious” bugs.
The MySQL project continues to introduce new features in minor releases of MySQL, exposing their users to
potential incompatibilities (See the “Functionality added or changed” section of many MySQL release notes,
including, for example, 5.1.38, 5.1.41 and 5.5.6. The 5.5.6 release notes note an “Incompatible change” ).
The PostgreSQL project has a simple policy: point releases are limited to bug fixes. Major versions are supported
for at least 5 years. We don’t ship with known bugs, and require all patches to pass extensive regression tests
before they are accepted.
Reason 7: Licensing
While 2ndQuadrant employees are very well represented among the developers of PostgreSQL, 2ndQuadrant do
not dominate or otherwise control PostgreSQL’s development, nor does any other commercial entity. Copyright of
the source code is implicitly assigned to the PostgreSQL Global Development Group, a unincorporated loose
association of PostgreSQL developers. Given the extremely liberal terms of the BSD-style PostgreSQL licence, it
would not make any practical difference if a commercial entity were assigned the copyright.
In contrast, Oracle ® owns the source code of MySQL ®, and vigorously pursues violators of its licensing terms.
MySQL AB ® and its successors have a long history of ensuring that all available database APIs were GPL
licensed, to force proprietary application developers to either make their client applications GPL open source, or to
purchase commercial licences (when an application links to a library, it is generally considered to be a derivative
work; when it connects over a network, it is generally not).
Reason 8: Strong community
Since Sun and their MySQL intellectual property were acquired by Oracle, the MySQL community has become
more fractured than ever.
There are now a number of popular forks of the MySQL project. Aside from Oracle MySQL, there is also
considerable interest in Monty Widenius' MariaDB project. While that project strives for binary compatibility with
Oracle MySQL, it is far from clear how viable that is in the long term. These sorts of “shadow forks” can be difficult
to maintain; for example, Redhat, inc. have recently started to frustrate Oracle’s use of Redhat source code to
produce a simple re-branding fork, Oracle Unbreakable Linux. There is also considerable interest in Drizzle, which
is a project that seeks to return MySQL to its roots by adapting a micro kernel architecture, and concentrating
on read mostly web applications. It may put considerable onus on the developer to rewrite their application. For
example, Drizzle does not support stored procedures, and triggers must be written in C++.
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
7
MySQL suffers from a proliferation of storage engines, including MyISAM, Percona, OurDelta, InnoDB and Falcon
(discontinued after Oracle acquisition).
The PostgreSQL community revolves around community PostgreSQL. While there is a small number of proprietary
forks that serve niche markets, such as EMC’s leading data warehousing solution Greenplum, there is a singular
focus on improving PostgreSQL as a general database that performs well with varied types of work.
PostgreSQL treats documentation as part of the project proper – in other words, all patches written must have
accompanying documentation, and the documentation is stored as SGML in the same source repository as the
source code to the server itself. Errors in the documentation are considered to be bugs, and are fixed with the
same expediency as any other sort of bug. The documentation is extensive, organised, and of high quality.
Finally, the PostgreSQL community is extended across the world; It has representatives and volunteers in many
countries of all continents that have embraced the open-source spirit.
Reason 9: Procedural languages
PostgreSQL supports numerous loadable procedural languages – PL/PgSQL is similar to Oracle’s PL/SQL, and
can be graphically debugged from within the pgAdmin administration/development tool. Other official procedural
languagees include PL/Python, PL/Perl, and PL/TCL. PL/Java and PL/R are available as separate open-source
projects. In most cases, these powerful procedural languages can be safely sandboxed, so that non-superusers
can securely define procedural functions and triggers.
These languages can be used to define arbitrary constraints on database-level datatypes (domains) and tables.
This is in rather stark contrast to MySQL, where functions and triggers are generally written in a limited procedural
variant of SQL. There is support for triggers written in compiled languages like C and C++.
Reason 10: SQL compliance and integration
Advanced, ISO conforming SQL dialect
PostgreSQL supports a dialect of SQL that is generally considered one of the most ISO standard conforming. For
example, it serves as a reference model of standards conformance for the SQLite developers.
It has support for enterprise class SQL features that were previously only available in the proprietary databases.
These include Window Functions and WITH Queries
8
.
Window Functions, which are of particular use in data-warehousing applications, can summarise a part of the total
result set within a single row of that result set. So, for example, they could be used to show each employee, their
salary, the department of that employee, and, using a window, the average salary within that department. This is
all accomplished within a single, simple query.
WITH queries are a way of creating simple temporary tables that only persist for the length of a single query,
and are typically used to break down complicated queries into simpler parts. They can be reused and even be
recursive, which allows the user to define queries that summarise hierarchical data.
2ndQuadrant consultant Marko Tiikkaja has extended WITH queries to support containing data-modifying
statements. This support will be available in the upcoming 9.1 release.
Apart from supporting none of these advanced features, the SQL dialect of MySQL differs significantly from that of
the SQL standard. These extensive differences are described in a section of the MySQL documentation, “MySQL
Differences from Standard SQL”
9
.
8
Also known as *Common Table Expressions (CTE)
9
http://dev.mysql.com/doc/mysql-reslimits-excerpt/5.5/en/differences-from-ansi.html
2ndQuadrant: 10 reasons why MySQL Users should move to
PostgreSQL
Copyright © 2011, 2ndQuadrant Italia (Devise.IT S.r.l.)
8
Support for popular ORMs
PostgreSQL is supported by all popular ORMs, including Hibernate, ADO.NET (the npgsql project,
PostgreSQL’s .NET driver, has an ADO.NET provider), Django, SQLAlchemy and Active Record.
About 2ndQuadrant
2ndQuadrant are the leading experts in PostgreSQL administration and performance tuning, specialising in critical
operational databases for customers worldwide. We provide hands-on services covering the following areas:
• Operational DBA
• Operational Support
• Development/Consulting
• Custom PostgreSQL development
2ndQuadrant’s customers include BBC, Comodo, Enova Financial, Ericsson, F-Secure, McAfee, Skype and many
others. 2ndQuadrant has more than 20 staff supporting customers in more than 16 countries from offices in US,
UK, France, Italy and Germany, with staff located worldwide. 2ndQuadrant’s headquarters are in the UK.
2ndQuadrant’s differentiator is our ability to deploy top-level skills when and where they are needed. Our strategy
is to employ leading solution experts – people who have contributed directly to the core technologies around
PostgreSQL and at the same time have extensive experience on real world solutions.
2ndQuadrant are also a close partner of EMC, having worked with Greenplum closely since 2005, as a result of our
skills and experience with Data Warehousing database systems. 2ndQuadrant are the only company worldwide
able to offer both PostgreSQL and Greenplum services, with a team of 6 staff experienced with Greenplum
technology.
Business and Contact Information
2nd Quadrant Limited
UK Company Number 4075460
7200 The Quorum
Oxford Business Park North
Oxford OX4 2JZ
United Kingdom
Phone: +44 870 766 7756
Website: http://www.2ndQuadrant.com/
E-mail: info@2ndQuadrant.com