  1. Data extraction
    1. Vital data
    2. Education
    3. Positions
      1. Clean data
        1. MCA
      2. Clustering
    4. Relations
      1. Clean data
    5. Summary
      1. Clean data
      2. Analysis
      3. Paper sections
      4. Mapping
  2. Two Points of Entry
    1. X-Boorman Text
      1. Pinyin
        1. Institution names
          1. e.g. (kao-teng shih-fan hsüeh-hsiao [Gaodeng shifan xuexiao])
      2. Pinyin/Chinese
        1. Place names
          1. [pinyin - Chinese]
          2. e.g. Anking [Anqing - 安慶]
        2. Person names
          1. Main figures
            1. Biography entry
              1. [pinyin - Chinese]
              2. e.g. Ch'en Kung-po [Chen Gongbo - 陳公博]
            2. Biography text
              1. [pinyin]
              2. e.g. Ch'en Kung-po [Chen Gongbo]
              3. e.g. Ch'en [Chen]
          2. Secondary figures
            1. First mention in a biography
              1. [pinyin - Chinese]
              2. e.g. Kuo T'ing-liang [Guo Tingliang - 郭廷亮]
            2. Other mention in same document
              1. [pinyin]
              2. e.g. [Guo Tingliang or Guo]
      3. OCR corrections
        1. OCR error detection
        2. OCR correction
        3. OCR substitution
      4. Index
        1. Name index
          1. Pinyin substitution
          2. e.g. Chen Gongbo [Ch'en Kung-po]
        2. Location index
          1. Pinyin - Chinese
          2. e.g. Anqing - 安慶
        3. Institution index
          1. Based on original English name in text
    2. Padagraph exploration
      1. By Name
      2. By Location
      3. By Institution
      4. By Position
      5. By Education [harder to define entries]
  3. SolrDB
    1. OCR Base Text
      1. NLP Processing
        1. Indexed/Segmented Text
  4. Blog posts
    1. X-Boorman (I): A Digital Revival
    2. X-Boorman (II): The Boorman Factory
    3. X-Boorman (III): Birth, Mobility, and Death
    4. X-Boorman (IV): Links and relations
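
The person-name rule in the X-Boorman Text section (pinyin plus Chinese characters where a name first appears, pinyin only afterwards) could be sketched roughly as below. This is only an illustration, not the project's actual pipeline: the `NAMES` table and the `annotate` function are hypothetical, and treating the biography entry heading as the "first mention" is an assumption made to keep the example short.

```python
import re

# Hypothetical lookup table: original romanization -> (pinyin, Chinese).
# The single entry is taken from the example in the outline; a real table
# would presumably be derived from the name index.
NAMES = {
    "Ch'en Kung-po": ("Chen Gongbo", "陳公博"),
}


def annotate(text, names=NAMES):
    """Append [pinyin - Chinese] to the first mention of each known name
    and [pinyin] to every later mention in the same document."""
    seen = set()
    # Try longer names first so a longer name wins over a shorter one it contains.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))

    def add_brackets(match):
        name = match.group(0)
        pinyin, chinese = names[name]
        if name in seen:
            return f"{name} [{pinyin}]"
        seen.add(name)
        return f"{name} [{pinyin} - {chinese}]"

    return pattern.sub(add_brackets, text)


sample = "Ch'en Kung-po\nCh'en Kung-po returned."
print(annotate(sample))
```

Short forms such as Ch'en [Chen] are not handled here; they would need a separate surname table and some way to decide which person a bare surname refers to.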