Chapter 4 Data Scientist Job

In general, the data scientist role can be divided into the decision scientist and the machine scientist; see "The Kinds of Data Scientist" (hbr.org) and "4 Types of Data Science Jobs" (Udacity).

The entry level (i.e., data analyst) relies on basic analysis tools such as Excel, plus SQL to pull data. The middle level (i.e., data scientist I) uses more advanced analytics tools such as R and Python. The senior level uses the same analytics tools but can also write or modify published packages and libraries. As a side note, data product management deals with data products and services, and is akin to product management in general.

In practice, the substantive job content can be differentiated into strategy, consumer behavior, and optimization tracks. The strategy track produces analyses that managers use to make decisions. The consumer behavior track produces analyses that elucidate psychological mechanisms and the mental models consumers use. The optimization track focuses on making things more efficient at scale or on using machine learning to automate analysis.

I list the specific duties of each track below, along with the resources to develop the corresponding skills or business sense.

4.1 Business strategy Track (a.k.a. Marketing Analytics)

4.1.1 Database marketing

  • RFM targeting, offer design, discount offer optimization. See the Database marketing page.
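
As a rough illustration of RFM targeting, the sketch below scores customers on recency, frequency, and monetary value from a transaction log. The transactions, the snapshot date, the three-bin scoring, and the helper names `rfm_table` and `score` are all assumptions made for this example, not a prescribed method; real projects often use quintiles and a data frame library instead.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount)
transactions = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 20), 60.0),
    ("c2", date(2023, 11, 2), 15.0),
    ("c3", date(2024, 3, 28), 200.0),
    ("c3", date(2024, 2, 14), 120.0),
    ("c3", date(2024, 1, 30), 80.0),
    ("c4", date(2023, 12, 24), 30.0),
]
snapshot = date(2024, 4, 1)  # assumed "as of" date for measuring recency


def rfm_table(rows, as_of):
    """Aggregate transactions into recency (days), frequency, and monetary value."""
    table = {}
    for cust, day, amount in rows:
        rec = table.setdefault(cust, {"recency": None, "frequency": 0, "monetary": 0.0})
        days_ago = (as_of - day).days
        # Recency is the time since the most recent purchase.
        if rec["recency"] is None or days_ago < rec["recency"]:
            rec["recency"] = days_ago
        rec["frequency"] += 1
        rec["monetary"] += amount
    return table


def score(values, higher_is_better=True, bins=3):
    """Rank-based score from 1 (worst) to `bins` (best)."""
    order = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(order)
    return {cust: 1 + (rank * bins) // n for rank, cust in enumerate(order)}


table = rfm_table(transactions, snapshot)
r = score({c: v["recency"] for c, v in table.items()}, higher_is_better=False)
f = score({c: v["frequency"] for c, v in table.items()})
m = score({c: v["monetary"] for c, v in table.items()})
rfm = {c: (r[c], f[c], m[c]) for c in table}

# Example targeting rule: customers with the top score on all three dimensions.
best = [c for c, s in rfm.items() if s == (3, 3, 3)]
print(rfm, best)
```

A campaign could then send its best offer to the customers in `best` and a reactivation discount to low-recency segments; where to draw those cutoffs is a business decision rather than something the scoring itself settles.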

4.1.2 Programming

4.1.3 Statistics

  • Statistical tests for differences:

    • Independent and paired t-tests, F-tests, Chi-square tests (used for A/B testing, incrementality testing)
    • Incrementality
  • Regression

    • Independent variables: dummy variables, variable transformation, exploratory/descriptive analysis
    • Dependent variable: binary (logit, probit regression), count (Poisson, negative binomial regression), censored (Tobit, survival regression)
  • Advanced: Causal inference

    • Control for observables
    • Mixed model / Hierarchical linear model
    • Difference-in-differences
    • Regression Discontinuity
    • Modeling process
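
Two of the items above reduce to short calculations, sketched here with the standard library only: a Pearson chi-square test on a 2x2 A/B table, and the difference-in-differences point estimate from four group means. All numbers are hypothetical, and the function names `chi_square_2x2` and `diff_in_diff` are assumptions for this example. The chi-square version omits the continuity correction, and the difference-in-differences function returns only a point estimate (it leans on the parallel-trends assumption and gives no standard error); in practice one would more likely fit a regression with a group-by-period interaction.

```python
import math


def chi_square_2x2(conv_a, n_a, conv_b, n_b):
    """Pearson chi-square test (no continuity correction) for an A/B conversion table."""
    observed = [
        [conv_a, n_a - conv_a],  # variant A: converted, not converted
        [conv_b, n_b - conv_b],  # variant B: converted, not converted
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, total - conv_a - conv_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical A/B test: 120/1000 conversions for A versus 150/1000 for B.
chi2, p = chi_square_2x2(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")

# Hypothetical average weekly sales before and after a campaign.
effect = diff_in_diff(treated_pre=100, treated_post=130, control_pre=90, control_post=105)
print("DiD estimate:", effect)
```

The A/B test answers whether two groups differ when assignment was randomized; difference-in-differences is the fallback when it was not, using the control group's change over time as the counterfactual trend.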

4.1.4 Visualization

  • Drawing interaction plots
  • Decile plots
  • Written report on the analysis
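
The numbers behind a decile plot can be sketched as follows: customers are sorted by a model's predicted score, cut into ten equal-size groups, and the response rate and lift of each group are reported. The scores, responses, and the function name `decile_table` are hypothetical, and the code assumes at least as many customers as bins and at least one responder.

```python
def decile_table(scores, responses, n_bins=10):
    """Sort by predicted score (descending), bin, and report response rate and lift per bin."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    overall_rate = sum(responses) / n
    rows = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        rate = sum(resp for _, resp in chunk) / len(chunk)
        rows.append({
            "decile": b + 1,  # decile 1 holds the highest-scored customers
            "n": len(chunk),
            "response_rate": rate,
            "lift": rate / overall_rate,  # rate relative to the overall average
        })
    return rows


# Hypothetical data: 20 customers with model scores 0.00, 0.05, ..., 0.95.
scores = [i / 20 for i in range(20)]
responses = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

table = decile_table(scores, responses)
for row in table:
    bar = "#" * int(row["response_rate"] * 20)  # crude text bar in place of a chart
    print(f"{row['decile']:>2} {row['response_rate']:.2f} {bar}")
```

A model that ranks customers well shows response rates falling from the first decile to the last; a flat profile suggests the score carries little information for targeting.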

4.1.5 Automation

  • Function building
  • Analysis library

4.2 Consumer insight Track (a.k.a. Marketing Research)

  • Qualitative study design

  • Quantitative study design (i.e., survey)

    • Qualtrics for questionnaire or experiment design
    • Generating research questions
  • Data analysis

    • Mediation, path model
    • Measurement model
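
For the mediation item, a minimal sketch of the product-of-coefficients approach is given below, using closed-form least squares so that it needs only the standard library. The simulated data, the true coefficients (0.5, 0.8, 0.2), and the function names are assumptions for illustration. The sketch returns point estimates only; inference on the indirect effect (for example, by bootstrapping) is left out.

```python
import random


def centered_sums(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation(x, m, y):
    """Estimate X -> M -> Y paths from two least-squares fits: M ~ X and Y ~ X + M."""
    sxx, smm = centered_sums(x, x), centered_sums(m, m)
    sxm, sxy, smy = centered_sums(x, m), centered_sums(x, y), centered_sums(m, y)
    a = sxm / sxx                           # slope of M on X
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det       # coefficient on M in Y ~ X + M
    direct = (smm * sxy - sxm * smy) / det  # coefficient on X in Y ~ X + M
    total = sxy / sxx                       # slope of Y on X alone
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}


# Simulated data with assumed true paths: a = 0.5, b = 0.8, direct = 0.2.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(2000)]
m = [0.5 * xi + random.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.8 * mi + random.gauss(0, 1) for xi, mi in zip(x, m)]

est = mediation(x, m, y)
print({k: round(v, 3) for k, v in est.items()})
```

With linear least squares, the total effect equals the direct effect plus the indirect effect `a * b`, which gives a quick sanity check on any implementation.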

4.3 Optimization Track (a.k.a. Operational Research)

4.3.1 Model optimization

4.3.3 Academic Paper implementation