  1. Before Machine Learning
    1. Rule #1: Don’t be afraid to launch a product without machine learning.
    2. Rule #2: Make metrics design and implementation a priority
    3. Rule #3: Choose machine learning over a complex heuristic.
  2. ML Phase I: First Pipeline
    1. Rule #4: Keep the first model simple and get the infrastructure right.
    2. Rule #5: Test the infrastructure independently from the machine learning.
    3. Rule #6: Be careful about dropped data when copying pipelines.
    4. Rule #7: Turn heuristics into features, or handle them externally.
    5. Monitoring
      1. Rule #8: Know the freshness requirements of your system.
      2. Rule #9: Detect problems before exporting models.
      3. Rule #10: Watch for silent failures.
      4. Rule #11: Give feature sets owners and documentation.
    6. First Objective
      1. Rule #12: Don’t overthink which objective you choose to directly optimize.
      2. Rule #13: Choose a simple, observable and attributable metric for your first objective.
      3. Rule #14: Starting with an interpretable model makes debugging easier.
      4. Rule #15: Separate Spam Filtering and Quality Ranking in a Policy Layer.
  3. ML Phase II: Feature Engineering
    1. Rule #16: Plan to launch and iterate.
    2. Rule #17: Start with directly observed and reported features as opposed to learned features.
    3. Rule #18: Explore with features of content that generalize across contexts.
    4. Rule #19: Use very specific features when you can.
    5. Rule #20: Combine and modify existing features to create new features in human-understandable ways.
    6. Rule #21: The number of feature weights you can learn in a linear model is roughly proportional to the amount of data you have.
    7. Rule #22: Clean up features you are no longer using.
    8. Human Analysis of the System
      1. Rule #23: You are not a typical end user.
      2. Rule #24: Measure the delta between models.
      3. Rule #25: When choosing models, utilitarian performance trumps predictive power.
      4. Rule #26: Look for patterns in the measured errors, and create new features.
      5. Rule #27: Try to quantify observed undesirable behavior.
      6. Rule #28: Be aware that identical short-term behavior does not imply identical long-term behavior.
    9. Training-Serving Skew
      1. Rule #29: The best way to make sure that you train like you serve is to save the set of features used at serving time, and then pipe those features to a log to use them at training time.
      2. Rule #30: Importance weight sampled data, don’t arbitrarily drop it!
      3. Rule #31: Beware that if you join data from a table at training and serving time, the data in the table may change.
      4. Rule #32: Reuse code between your training pipeline and your serving pipeline whenever possible.
      5. Rule #33: If you produce a model based on the data until January 5th, test the model on the data from January 6th and after.
      6. Rule #34: In binary classification for filtering (such as spam detection or determining interesting emails), make small short-term sacrifices in performance for very clean data.
      7. Rule #35: Beware of the inherent skew in ranking problems.
      8. Rule #36: Avoid feedback loops with positional features.
      9. Rule #37: Measure Training/Serving Skew.
  4. ML Phase III: Slowed Growth, Optimization Refinement, and Complex Models
    1. Rule #38: Don’t waste time on new features if unaligned objectives have become the issue.
    2. Rule #39: Launch decisions will depend upon more than one metric.
    3. Rule #40: Keep ensembles simple.
    4. Rule #41: When performance plateaus, look for qualitatively new sources of information to add rather than refining existing signals.
    5. Rule #42: Don’t expect diversity, personalization, or relevance to be as correlated with popularity as you think they are.
    6. Rule #43: Your friends tend to be the same across different products. Your interests tend not to be.
  5. Deploying the initial pipeline
  6. Is now the right time to build a machine learning system?
  7. Launching and iterating while adding new features; evaluating training-serving skew.
  8. Overcoming Plateaus
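As a concrete illustration of the training-serving skew rules above (Rule #29 in particular), here is a minimal sketch of logging the exact features used at serving time so the training pipeline can consume them later. The `StubModel`, in-memory `LOG`, and `training_examples` helper are hypothetical stand-ins, not part of the original document; a real system would write to a durable log and join in labels.

```python
import json

LOG = []  # stand-in for a durable serving-time feature log


class StubModel:
    """Hypothetical model; anything with a predict() method works here."""

    def predict(self, features):
        return 0.9 if features.get("has_link") else 0.1


def serve(features, model):
    """Score a request and log the exact features used (Rule #29)."""
    score = model.predict(features)
    # Persist the serving-time features verbatim. Training later reads
    # this log instead of recomputing features, removing one common
    # source of training/serving skew.
    LOG.append(json.dumps({"features": features, "score": score}))
    return score


def training_examples(label_for):
    """Rebuild training examples from the logged serving-time features."""
    for line in LOG:
        entry = json.loads(line)
        yield entry["features"], label_for(entry)
```

The key design choice is that training never re-derives features from raw inputs; it replays what serving actually saw.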