Ataxo Social Insider is an application used to monitor and evaluate
communication on social networks such as Twitter and Facebook, as well as blogs and online
media in Central Europe. It is used by big brands, social media professionals,
public relations agencies, political parties and anyone else who wants to keep
track of online chatter.
When designing the architecture, the Ataxo Interactive team needed a scalable,
robust and 100% reliable database. A previous prototype version of the
application used MySQL, and while it certainly “did the job”, they were pretty
sure they would outgrow it quickly, and that the traditional schema-based data
model would be too tight for their use.
CouchDB was chosen because it matched Ataxo’s internal design philosophy,
and let them apply their extensive web application design and scaling
experience to their advantage.
The application uses CouchDB as its primary datastore, making clever use of
features such as the ability to store rich documents without a predefined
schema, an HTTP-based architecture, easily pluggable full-text engines,
easy replication and a crash-resistant design.
The application is split into two parts: the backend and the frontend. The
backend system retrieves, stores and indexes data from various online services
(Twitter, Facebook, YouTube, online discussions, blogs, etc.). The frontend
system, a Ruby on Rails application, fetches these “mentions” from the
backend via a Solr search interface, based on user preferences (“keywords”),
periodically or on demand. It stores them in a CouchDB database and fires off
a series of hooks, which “enrich” the data with information about the number of
“retweets”, Facebook “likes”, URL ranking, and such.
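
The store-then-enrich flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Ataxo's actual code: the hook names, the mention fields and the stubbed counts are hypothetical, and an in-memory dict stands in for the CouchDB database.

```python
from typing import Callable, Dict, List

Mention = Dict[str, object]
Hook = Callable[[Mention], Mention]


def add_retweet_count(mention: Mention) -> Mention:
    # Hypothetical hook: a real one would ask Twitter for the count.
    return {**mention, "retweets": 0}


def add_like_count(mention: Mention) -> Mention:
    # Hypothetical hook: a real one would ask Facebook for the count.
    return {**mention, "likes": 0}


def store_and_enrich(db: Dict[str, Mention], mention: Mention,
                     hooks: List[Hook]) -> Mention:
    """Save the raw mention, then run each hook and save its result."""
    doc_id = str(mention["_id"])
    db[doc_id] = mention               # stand-in for saving to CouchDB
    for hook in hooks:
        db[doc_id] = hook(db[doc_id])  # each hook returns an enriched copy
    return db[doc_id]


db: Dict[str, Mention] = {}
raw = {"_id": "mention-1", "source": "twitter", "text": "example"}
enriched = store_and_enrich(db, raw, [add_retweet_count, add_like_count])
```

Keeping each hook as a small function that returns a new document means hooks can be added, removed or reordered without touching the storage step.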
Ataxo stores the mentions in separate databases: one for every account, making
it possible to open up the “raw data” to every customer by mapping the URL
to their database via the webserver. Database permissions are based on the user
credentials corresponding to the account so customers with proper
infrastructure can interface with the database directly, not only through the
read-focused application API. Customers can even define their own views,
using the application as a software platform for custom reporting tasks. The
team was inspired to implement this feature by the open nature of CouchDB’s
HTTP-based architecture and powerful replication features.
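
A minimal sketch of the database-per-account idea and a customer-defined view might look like this. The `mentions_<account>` naming scheme, the `reports` design document and the `source` field are assumptions made for illustration; the case study does not describe Ataxo's actual names. The code only builds the URL and the design document and does not contact a server.

```python
import json
import re
from urllib.parse import quote

# CouchDB database names must start with a lowercase letter and may contain
# only lowercase letters, digits and the characters _ $ ( ) + - /
DB_NAME_RE = re.compile(r"^[a-z][a-z0-9_$()+/-]*$")


def account_db_url(base_url: str, account: str) -> str:
    """Map an account name to the URL of its own database."""
    db_name = f"mentions_{account}"  # hypothetical naming scheme
    if not DB_NAME_RE.match(db_name):
        raise ValueError(f"invalid CouchDB database name: {db_name!r}")
    return f"{base_url.rstrip('/')}/{quote(db_name, safe='')}"


def mentions_by_source_design_doc() -> dict:
    """A design document with one view that counts mentions per source."""
    return {
        "_id": "_design/reports",
        "views": {
            "by_source": {
                # View functions are JavaScript, stored as strings.
                "map": "function (doc) { if (doc.source) { emit(doc.source, 1); } }",
                "reduce": "_count",
            }
        },
    }


url = account_db_url("http://localhost:5984", "acme")
body = json.dumps(mentions_by_source_design_doc())
```

A customer with write access could PUT `body` to `<url>/_design/reports` and then read the totals from `<url>/_design/reports/_view/by_source?group=true`.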

Ataxo Interactive is a division of Ataxo, one of the largest providers of
services in online advertising, search engine marketing and optimization,
social networks marketing and PR for clients in Central Europe. Ataxo
Interactive provides technological support, builds B2B tools for effective
campaign management and reporting, as well as a number of internal tools.
Ataxo Social Insider is an online tool for complex media monitoring of
Czech and Slovak

Because CouchDB is fully based on HTTP, they were also able to scale the data
store from the start. As Rails developers, they live and breathe HTTP, using
ETags and Expires all the time. They knew how to scale HTTP stacks. So they
knew they could easily split the server read load just by tying together separate
CouchDB instances with well-understood tools like Nginx or HAProxy and
starting replication between them. They could use a dedicated cache such as
Squid or Varnish to alleviate the read load in a snap. Most importantly, with a
system based on HTTP, every part of the stack is transparent, easy to monitor,
enable, disable or multiply.
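
Because everything is plain HTTP, both replication and cache-friendly reads come down to ordinary requests. The sketch below builds (but does not send) a request to CouchDB's `/_replicate` endpoint and a conditional GET using `If-None-Match`. The host names, database name and revision value are hypothetical.

```python
import json
import urllib.request


def replication_request(server: str, source: str,
                        target: str) -> urllib.request.Request:
    """Build a POST to /_replicate that starts continuous replication."""
    payload = {"source": source, "target": target, "continuous": True}
    return urllib.request.Request(
        f"{server.rstrip('/')}/_replicate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def conditional_get(doc_url: str, etag: str) -> urllib.request.Request:
    """Build a GET that a server or cache can answer with 304 Not Modified
    when the document's ETag has not changed."""
    return urllib.request.Request(
        doc_url, headers={"If-None-Match": etag}, method="GET"
    )


# Hypothetical hosts, database and revision.
rep = replication_request(
    "http://couch-a.example:5984",
    "http://couch-a.example:5984/mentions_acme",
    "http://couch-b.example:5984/mentions_acme",
)
get = conditional_get(
    "http://couch-b.example:5984/mentions_acme/mention-1", '"1-abc"'
)
```

Either request could be sent with `urllib.request.urlopen`. A proxy or cache such as Nginx, HAProxy, Squid or Varnish placed in front of the instances sees nothing but standard HTTP.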
CouchDB's crash-resistant design turned out to be one of its biggest advantages
for Ataxo. When you show your operations crew that you can kill -9 a running
database, relaunch it, and it continues to run from the state it was in when brought
down, without any repair or maintenance, they are sold on the spot. When you
show them you can scp the raw data to another server with more power to build
view indexes quickly, and it continues the build process where it ended on the
slow machine, with the ability to scp it back without any data corruption
whatsoever, the team thinks they just saw some black magic. Karel Minarik,
Lead Developer at Ataxo Interactive, says, “To sum up, CouchDB is absolute
bliss for operations and server support: it requires zero maintenance, recovers
from crashes without problems, local backup is done just by making a snapshot
of the data directory and offsite backups are done via easy to set-up continuous

Karel Minarik, Lead Developer at Ataxo Interactive, says, “We had our eye on
CouchDB from the start; its minimalist aesthetics and thorough use of HTTP
intrigued us. We evaluated other stores, but CouchDB has earned our trust.
We thoroughly enjoyed the new ways of looking at data modeling and the
possibilities CouchDB has opened for us.”
