Disclaimer
The postings on this site are my own and do not necessarily represent the opinions of CapTech Ventures, Inc.
Author Archives: Mercy Beckham
eXtreme Transaction Processing
Massimo Pezzini, a distinguished Gartner analyst, coined the term eXtreme Transaction Processing, or XTP, in 2007. XTP relates to a class of applications that require collecting, correlating, and operating on large volumes of data to deliver meaningful insights into business use … Continue reading
When Big Data meets Fast Data
A 2001 Gartner research article by Doug Laney, titled “3-D Data Management: Controlling Data Volume, Velocity, and Variety”, now serves as the construct for big data. Fast data is often related to data velocity, which is defined as the rate … Continue reading
Hadoop Invades My Desk
I gave the Hadoop elephant an Indian makeover in red and gold – acrylics on paper.
Spring for Apache Hadoop
Hadoop has a poor out-of-the-box programming model. Applications often become spaghetti code in the form of scripts calling Hadoop command-line applications. Spring aims to simplify Hadoop applications by leveraging several Spring ecosystem projects. Spring for Apache … Continue reading
Pivotal Platform – where Cloud meets Big Data
There is a major shift towards cloud computing that includes both infrastructure transformation and a transformation of application development and usage. These require new methods. The Pivotal platform helps organizations with these transformations. Pivotal enables organizations to build a new … Continue reading
HAWQ Soars Higher
HAWQ is a modern distributed and parallel query processor on top of HDFS that gives enterprises the best of both worlds: high-performance query processing with SQL, and scalable open storage. When the data is directly stored on HDFS, it provides … Continue reading
Delta of Hadoop Distributions
Greenplum introduced its first Hadoop distribution, GPHD (Greenplum Hadoop Distribution), in 2011, removing the need to build out a Hadoop cluster from scratch. In February this year, Pivotal – Greenplum announced the first product, Pivotal HD, to expand the capabilities of … Continue reading
Introduction to Pivotal HD
Pivotal HD is a full Apache Hadoop distribution with Pivotal add-ons and native integration with the Greenplum database, bringing together both NoSQL and SQL access layers to multi-structured data stored within the Pivotal HDFS. This distribution is the … Continue reading
Hadoop Install on Windows Server 2012
My installation notes for Cygwin and Hadoop on Windows Server 2012: https://github.com/mercyp/Hadoop
Arrow of Time in Big Data – Understanding the Interconnectedness
Arrow of Time is a term coined by British astronomer Arthur Eddington to describe how time flows inexorably in one direction, or the “asymmetry” of time. We can experience this arrow of time in our everyday lives. Certain conditions, developments, or processes … Continue reading
