Preparing for Data Scientist Associate

I come from a content management background, handling terabytes of content. The content lifecycle starts with capture or creation, continues through versioning, management, and publishing, and ends with archival and retention. Along the way, content is subject to information rights, compliance, governance, and retention policies, whether at the organization level or at the World Wide Web level. I soon moved to content management on the cloud, where establishing trust in managing content, both in the cloud and on premises, is essential. Being part of EMC, it was natural to cover all three: “cloud, big data, and trust”. So I started exploring big data. Here is my little journey exploring big data…


Best place to start – These resources are good for understanding big data concepts and business drivers:

  • Big Ideas by EMC TV – Patricia Florissi, EMC VP and Global Sales CTO, walks through the concepts of big data, Hadoop, and associated solutions in a creative video. This is a good way to get acquainted with the terminology and concepts.
  • InFocus Blogs – EMC leaders April Reeve, Bill Schmarzo, David Dietrich, Frank Coleman, Laddie Suk, and Scott Burgess bring great insights in their posts related to big data. These posts offer an industry perspective and leadership thinking on big data.

Get Organized – It is easy to get lost in all the reading and web surfing. I prefer a disciplined, structured approach to learning. That’s where EMC Education comes to the rescue.

  • Business Transformation Course – Introduced later, this course is for data-savvy business leaders who can identify opportunities to solve business problems using advanced analytics.
  • Data Science and Big Data Analytics Course – My personal favorite. It extensively covers data science, advanced analytics, the big data project lifecycle, and the solutions available from EMC.

Dive Deeper – If you already have a statistical or analytical background, you can skip this step. Since I had worked in content management for a long time, I wanted to brush up on the analytics basics I learned in college. Coursera MOOCs are very helpful for getting the basics straight.

Dig More – Once the fundamentals are in place, it is easy to learn the solutions built on them:

  • Greenplum Unified Analytics Platform – EMC Education offers a comprehensive approach to learning the massively parallel processing (MPP) architecture for big data analytics. I took the Greenplum Architecture and Administration course.
  • Hadoop – I received this training through EMC Academic Alliance, have been practicing hands-on with Greenplum Hadoop, and have now started working with Pivotal HD.

The EMC Education Services Data Scientist course and Greenplum Analytics Labs help you get started on your big data projects. Hope this helps! If you have any questions, I’ll be more than happy to answer.
