What do you think being a data scientist is about?

In a very informal sense, data science is what statisticians have long practiced, just on a far greater scale. The field has advanced with the combination of cloud computing and big data. Data scientists collect and analyze enormous sets of structured and unstructured data, and their work combines computer science, statistics, and mathematics. They evaluate, analyze, and model data to produce actionable plans for businesses and other organizations.

What do you see as the major duties and/or knowledge areas?

On a daily basis, a data scientist might carry out the following tasks:

  • Find patterns in the data to produce insights
  • Model the data and create algorithms to forecast outcomes
  • Use machine learning to improve the quality of the outcomes
  • Deploy data tools in Python, R, or Scala

Knowledge areas required to be a data scientist:

  • A higher degree in Statistics, Data Science, or Computer Science
  • Strong knowledge of Python, SAS, R, or Scala
  • SQL database coding
  • Ability to Google (Yes, this is an important one and more complex than it sounds!)

What differences/similarities do you see between data scientists and statisticians?

To quote a line that I recently read and loved: “A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.” It has become abundantly evident that while the two fields can exist independently, each is weaker without the other.

Statisticians work with concepts such as t-distributions, standard errors, and hypothesis testing. They must be experts in statistical analysis and skilled at spotting patterns and irregularities in data. Data scientists, on the other hand, follow a process: data ingestion, data transformation, exploratory data analysis, and model selection and evaluation. Although many of these steps use statistical techniques under the hood, they are wrapped up in a more interesting package, so data science can be embraced by many more people.
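To make the "under the hood" point concrete, here is a minimal sketch of the statistician's side: a standard error and a one-sample t statistic computed with only the Python standard library. The function name `one_sample_t`, the sample values, and the null value `mu0` are all made up for illustration. In practice a data scientist would more likely call a library routine that wraps this arithmetic.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (mean, standard_error, t_statistic) for H0: population mean == mu0.

    Hypothetical helper written for this illustration only.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)         # standard error of the mean
    if se == 0:
        raise ValueError("all observations are identical; t is undefined")
    t = (mean - mu0) / se          # compare against a t-distribution with n - 1 df
    return mean, se, t


# Made-up measurements, used only to exercise the function.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, se, t = one_sample_t(sample, mu0=4.0)
print(f"mean={mean:.3f}  standard error={se:.3f}  t={t:.3f}")
```

The hypothesis test itself is the last step: compare `t` against a t-distribution with `n - 1` degrees of freedom to decide whether the data are consistent with `mu0`.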

How do you view yourself in relation to these two areas?

I am a data scientist with a background in computer science. But as I’ve already said, neither data science nor statistics is sufficient on its own today. My current position in the corporate world closely combines data science and software engineering, but I hope that as I gain more knowledge and expertise, it will eventually become that of a data scientist.

