Becoming a Data Scientist

Few would debate the value of data-driven insights in public relations, but many question what this actually means. In April 2015, at the Page Society Spring Seminar, Jack Welch said data is a tool, and like any other tool, it “has the capacity to be valuable only if it is applied with strategic purpose toward a meaningful outcome.”

Over the past year, I have heard more and more references to “data scientists.” Business Insider listed data scientist as the sixth “best job of 2015.” Then, after hearing CCOs from top companies like Facebook, GE and MasterCard mention the increased need to use technology to transform their departments to bring meaningful, individualized engagement with stakeholders, I knew I needed to know more.

First, I set out to identify what a data scientist is. What I found is that, similar to the term “big data,” there isn’t a widely agreed-upon definition. But there seems to be a pretty clear description of a data scientist as someone who finds patterns in data and connects them to real-world decisions and business strategies. At first I thought, well, I’m a data scientist. I love playing with data in my research and consulting, and I often connect my findings to suggestions for the profession or actionable items for clients. Then I dug a little deeper and found that most job postings for data scientists asked candidates to have:

  1. An ability to solve business-related problems using data-driven techniques.
  2. Strong written and verbal communication skills for communicating and collaborating with both IT and other business units.
  3. Ability to advise senior management in clear language about the implications from the analysis.
  4. Curiosity to collect large amounts of data and transform it into a more usable format.
  5. Desire to look for order and patterns in data, as well as to spot trends that can help a business’s bottom line.
  6. Comfort level in working with a variety of programming languages, including SAS, R and Python.
  7. Solid grasp of statistics, including statistical tests and distributions.
  8. Commitment to staying on top of analytical techniques such as machine learning, deep learning and text analytics.

I think I have many of these skills, but programming languages, machine learning, and text analytics were all skills I lacked to be a real data scientist. To rectify this, I registered for The Data Scientist’s Toolbox, a MOOC (Massive Open Online Course) and the first course in the Data Science Specialization taught by biostatistics professors at the Bloomberg School of Public Health at Johns Hopkins University. This course dug into the idea of turning data into actionable knowledge through the use of tools like version control, Markdown, Git, GitHub, and RStudio. These tools all deal with coding, and they were introduced for use in collaborating and analyzing data. Taking this course was fun and exciting, like learning a new language. Once I completed the course, I realized I knew many concepts on a surface level but needed to dig in more. I am now taking the second course in the Specialization, on R programming. While I’m getting experience with the tools, the instructors are also encouraging us to “think like a hacker.” This will hopefully lead to my becoming a data hacker, an analyst, and a communicator, all skills identified as necessary for a successful data scientist.

Marcia W. DiStaso, Ph.D., is the Director of the Institute for Public Relations Social Science of Social Media Research Center and an associate professor at Penn State University.

