  1. Free Generation
    1. [1]. GPT/GPT-2
    2. [2]. BERT for NLG
  2. Conditional Generation
    1. [3]. Microsoft: MASS
    2. [4]. Microsoft: UniLM
    3. [5]. AI2: GROVER
    4. [6]. Pretraining for Conditional Generation with Pseudo Self Attention
  3. Decoding Method
    1. Autoregressive
      1. seq2seq, Transformer
    2. Non-Autoregressive
      1. [8]. Non-Autoregressive NMT
      2. [9]. InDIGO
      3. [10]. Insertion Transformer
      4. [11]. Levenshtein Transformer
      5. [12]. based on masked LM
  4. RL+GAN
    1. [13, 14]. SeqGAN / RAML
    2. [15, 16]. ScratchGAN / CoT
  5. Metric
    1. F-score/accuracy
    2. BLEU for MT, perplexity (PPL)
    3. [7]. SJTU: Texygen
  6. [13]. Train-test skewness, exposure bias
  7. Pre-training for NLG