Author Archives: Mercy Beckham

Measuring Databases

Posted on December 16, 2013 by Mercy Beckham

Measuring the first V of big data, Volume, becomes critical and essential. Here are some samples from the technologies I’ve worked on. SQL Server: sp_helpdb 'database_name' – returns the size of the data and log files of a database; sp_spaceused … Continue reading →

Posted in Code Snippet | Tagged Code Snippet | Leave a comment

Hosting Big Data

Posted on December 11, 2013 by Mercy Beckham

Rackspace recently introduced its new Big Data hosting options – customize your configuration for managing a big data platform, run Hadoop on the public cloud, or configure your own private cloud. Rackspace eliminates the complex process of building and maintaining a … Continue reading →

Posted in Hadoop | Tagged Hadoop, Isilon | Leave a comment

PivotalR

Posted on November 23, 2013 by Mercy Beckham

The PivotalR package is an R front end to PostgreSQL and the Pivotal (Greenplum) database, and a wrapper for the open-source machine learning library MADlib. It also interacts with Pivotal HD/HAWQ for Big Data analytics by providing an interface to the operations on tables/views in … Continue reading →

Posted in R | Tagged Pivotal, R | Leave a comment

Difference between MapReduce 1.0 and MapReduce 2.0

Posted on November 11, 2013 by Mercy Beckham

Apache Hadoop, introduced in 2005, has a core MapReduce processing engine to support distributed processing of large-scale data workloads. Several years later, there are major changes to the core MapReduce so that the Hadoop framework supports not just MapReduce but other … Continue reading →

Posted in Hadoop | Tagged Hadoop | 2 Comments

Self-Service Data Access – Pivotal DD

Posted on November 1, 2013 by Mercy Beckham

Enterprise data resides in heterogeneous systems and comes in different data types. IT faces challenges in consolidating data at the right time. Also, it is often difficult to know which data sources are required to access data. Pivotal DD … Continue reading →

Posted in Big Data, Hadoop | Tagged Big Data, Hadoop, Pivotal | Leave a comment

Virtualizing Hadoop

Posted on October 23, 2013 by Mercy Beckham

HDFS, the “storage”, and MapReduce, the “compute”, are combined in the traditional Hadoop model. If this Hadoop model is directly translated into a VM, it will affect the ability to scale up and down as the lifecycle of the VM is tightly … Continue reading →

Posted in Hadoop | Tagged Hadoop, Isilon | Leave a comment

Run Splunk with EMC

Posted on October 11, 2013 by Mercy Beckham

Splunk is a powerful data analytics platform that collects, indexes, and analyzes data from virtually any source, including application and machine-generated data in a searchable repository from which it can generate meaningful insights. Splunk makes this data available and usable … Continue reading →

Posted in General | Tagged EMC, Splunk | Leave a comment

Prepping for Data Scientist Associate

Posted on September 15, 2013 by Mercy Beckham

I come from a content management background, handling terabytes of content. The content lifecycle starts with capture/create, continues through versioning, managing, and publishing, and ends with archival and retention. Content passes through information rights, compliance, governance, and retention either at the organization level or … Continue reading →

Posted in Big Data | Tagged Big Data, EMC | Leave a comment

EMC Kazeon – Dark Data Explorer

Posted on September 14, 2013 by Mercy Beckham

Dark matter in astronomy and cosmology is a type of matter that hypothetically accounts for a large part of the total mass in the universe. It neither emits nor absorbs light or other electromagnetic radiation, so it cannot be observed … Continue reading →

Posted in Big Data, Conceptual | Tagged Big Data, EMC | Leave a comment

Hadoopable?

Posted on September 8, 2013 by Mercy Beckham

Recently I heard “moving content into Hadoop” – although I did not further question their motive, I was wondering seriously about “effective solutions” on Hadoop for the day-to-day business problems. Hadoop is not a magic wand to wipe away all … Continue reading →

Posted in Conceptual, Hadoop | Tagged Hadoop | Leave a comment
