Archive for June, 2010

What is the “Cassandra” database?

30 Jun

“Cassandra” is the name of a highly distributable “nosql” database written in Java and originally developed at Facebook. Facebook released it as open source in July 2008; it later moved to the Apache Software Foundation as “Apache Cassandra” and is one of the more popular on-premises (or cloud) distributed databases today.

Like many other “nosql” databases, Cassandra offers simple name-value pair storage, although there are both row-like and column-like concepts in play.
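To picture the row-like and column-like concepts, here is a minimal sketch in Python. It is only an in-memory illustration of the layout (row key, then column name, then value); the names `store`, `put` and `get` are made up for this example and are not part of any Cassandra client API.

```python
from collections import defaultdict

# Hypothetical in-memory model: row key -> column name -> value.
# This is not the Cassandra API, only a picture of the layout.
store = defaultdict(dict)

def put(row_key, column, value):
    """Store one name-value pair under the given row."""
    store[row_key][column] = value

def get(row_key, column):
    """Return the value for a column, or None if the row lacks it."""
    return store[row_key].get(column)

put("user:42", "name", "Ada")
put("user:42", "email", "ada@example.com")
put("user:99", "name", "Grace")  # rows need not share the same columns

print(get("user:42", "email"))
print(get("user:99", "email"))
```

Each row is just a bag of name-value pairs, so one row can carry columns that another row does not have.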

Like many other distributed databases, Cassandra makes use of the concept of “eventual consistency”. The general concept is that, in a quiet state, all nodes will “eventually” get all updates from all other nodes and the entire dataset will be “consistent” across all nodes.

However, it is the behavior of this “eventually consistent” database when things are NOT quiet that gives it its scalable power: applications built on top of individual nodes of this database must continue to function and must respond to later information gracefully enough to prevent interruptions of end user service (which would otherwise be caused by waiting for a single master table to receive all updates).

Cassandra uses timestamps to reconcile distributed commits – another concept common in distributed databases but one that obviously depends on good timekeeping on far-flung nodes.
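The idea of timestamp-based reconciliation can be sketched in a few lines of Python. This is a simplified "newest timestamp wins" illustration under my own assumptions, not Cassandra's actual implementation; the replicas are plain dictionaries and the names `write` and `reconcile` are hypothetical.

```python
# Each replica maps key -> (timestamp, value). All names are hypothetical.

def write(replica, key, value, timestamp):
    """Accept a write locally, keeping it only if it is newer than what we hold."""
    current = replica.get(key)
    if current is None or timestamp > current[0]:
        replica[key] = (timestamp, value)

def reconcile(a, b):
    """Exchange updates between two replicas; the newest timestamp wins per key."""
    for key in set(a) | set(b):
        candidates = [r[key] for r in (a, b) if key in r]
        winner = max(candidates)  # tuples compare by timestamp first
        a[key] = winner
        b[key] = winner

node_a, node_b = {}, {}
write(node_a, "status", "online", timestamp=100)
write(node_b, "status", "away", timestamp=105)  # later write lands on another node

assert node_a["status"] != node_b["status"]     # temporarily inconsistent
reconcile(node_a, node_b)
print(node_a["status"], node_b["status"])       # both now hold (105, 'away')
```

The sketch also shows why timekeeping matters: if a node with a fast clock stamps an older write with a larger timestamp, that older write wins the comparison and the genuinely newer value is discarded.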

Cassandra is frequently wrapped by a data type encapsulation layer or an object serialization layer (such as Thrift) to provide applications with a richer data storage experience than simple name-value pairs.

For more information about WHY and WHEN to use Cassandra or another nosql database, please see Andy White’s “Why Cassandra” article.

To learn how to install and use Cassandra on your own system, please see my “Installing and Running the Cassandra Database” article.

 

Posted in Cassandra, Introduction, nosql, Thrift

 

Welcome to “DivConq”!

29 Jun

Welcome to “DivConq”, the site where we discuss how to divide and conquer today’s computing challenges with secure, distributed architecture.

We will be discussing a variety of technologies and architectures here, but all will be considered with an eye toward highly distributed, heterogeneous systems that may span:

  • Different versions of the same operating systems
  • Different hardware
  • Different operating systems
  • Different deployment models (e.g., on-premises datacenter vs. cloud)
  • Different geographic locations: across states/provinces or even across countries

In 2010 much of the technology to span these environments already exists, but it is difficult to assemble into a coherent architecture for general-purpose applications. This site dedicates itself to bridging the gap between low-level technology and application. Let’s begin!

 

Posted in DivConq, Introduction