Author Archives: Mercy Beckham

eXtreme Transaction Processing

Massimo Pezzini, a distinguished Gartner analyst, coined the term eXtreme Transaction Processing, or XTP, in 2007. XTP relates to a class of applications that require collecting, correlating, and operating on large volumes of data to deliver meaningful insights into business use … Continue reading

Posted in General | Tagged | Leave a comment

When Big Data meets Fast Data

A 2001 Gartner research article by Doug Laney titled “3-D Data Management: Controlling Data Volume, Velocity, and Variety” now serves as the construct for big data. Fast data is often related to data velocity, which is defined as the rate … Continue reading

Posted in General | Tagged | 3 Comments

Hadoop Invades My Desk

I gave the Hadoop elephant an Indian makeover in red and gold, using acrylics on paper.

Posted in General, Hadoop | Tagged | Leave a comment

Spring for Apache Hadoop

Hadoop has a poor out-of-the-box programming model. Applications often become spaghetti code in the form of scripts calling Hadoop command-line applications. Spring aims to simplify Hadoop applications by leveraging several Spring ecosystem projects. Spring for Apache … Continue reading

Posted in Hadoop, Spring | Tagged | Leave a comment

Pivotal Platform – where Cloud meets Big Data

The major shift towards cloud computing transforms both infrastructure and the way applications are developed and used. These changes require new methods. The Pivotal platform helps organizations through these transformations. Pivotal enables organizations to build a new … Continue reading

Posted in General | Tagged | 2 Comments

HAWQ Soars Higher

HAWQ is a modern distributed and parallel query processor on top of HDFS that gives enterprises the best of both worlds: high-performance query processing with SQL, and scalable open storage. When the data is directly stored on HDFS, it provides … Continue reading

Posted in Hadoop | Tagged | Leave a comment

Delta of Hadoop Distributions

Greenplum introduced its first Hadoop distribution, GPHD (Greenplum Hadoop Distribution), in 2011, removing the need to build out a Hadoop cluster from scratch. In February this year, Pivotal – Greenplum announced the first product, Pivotal HD, to expand the capabilities of … Continue reading

Posted in Hadoop, Hadoop Distribution | Tagged | Leave a comment

Introduction to Pivotal HD

Pivotal HD is a full Apache Hadoop distribution with Pivotal add-ons and native integration with the Greenplum database, bringing together both NoSQL and SQL access layers to multi-structured data stored within the Pivotal HDFS. This distribution is the … Continue reading

Posted in Hadoop, Hadoop Distribution | Tagged | Leave a comment

Hadoop Install on Windows Server 2012

My installation notes for Cygwin and Hadoop on Windows Server 2012: https://github.com/mercyp/Hadoop

Posted in Hadoop | Tagged | Leave a comment

Arrow of Time in Big Data – Understanding the Interconnectedness

Arrow of Time is a term coined by British astronomer Arthur Eddington to describe how time flows inexorably in one direction, or the “asymmetry” of time. We experience this arrow of time in our everyday lives. Certain conditions, developments, or processes … Continue reading

Posted in Conceptual, Philosophy | Tagged | 1 Comment