merlin-misc

questions (question sets)
1. questions-mandarin.hed
  1. Standard Mandarin question set
2. questions-radio_dnn_416.hed
  1. English question set for the radio phone set, 416 questions
3. questions-unilex_dnn_600.hed
  1. English question set for the unilex phone set, 600 questions
recipes
1. blstm
  1. blstm.conf
  2. hybrid_blstm.conf
  3. hybrid_blstm_no_dynamic.conf
  4. hybrid_blstm_WORLD.conf
  5. hybrid_blstm_WORLD_no_dynamic.conf
2. dnn
  1. feed_forward_dnn.conf
  2. feed_forward_dnn_ossian.conf
  3. feed_forward_dnn_ossian_DUR.conf
  4. feed_forward_dnn_WORLD.conf
  5. feed_forward_dnn_WORLD_bc16.conf
3. general_config
  1. logging_config.conf
4. lstm
  1. deep_lstm.conf
  2. deep_lstm_WORLD.conf
  3. hybrid_lstm.conf
  4. hybrid_lstm_WORLD.conf
5. lstm_variants
  1. deep_gru.conf
  2. deep_lstm.conf
  3. deep_lstm_nfg.conf
  4. deep_lstm_nig.conf
  5. deep_lstm_nog.conf
  6. deep_lstm_nph.conf
  7. deep_sgru.conf
6. MGE
  1. feed_forward_dnn_cm.conf
  2. feed_forward_dnn_cm_MGE.conf
  3. run_dnn_cm.py
  4. run_mge_dnn.py
7. acoustic_demo.conf
8. duration_demo.conf
scripts
1. alignment
  1. phone_align
    1. run_aligner.sh
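      1. Runs forced alignment with the festvox clustergen tools
      2. Converts the Festival utts to lab files
    2. setup.sh
      1. Downloads the CMU ARCTIC dataset
      2. Prepares config.cfg with the directories of Merlin, Festival, the front-end tools, etc.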
  2. state_align
    1. binary_io.py
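      1. Converts between numpy arrays and binary files
    2. forced_alignment.py
      1. Trains HMM models and performs alignment with the HTK tools
    3. htk_io.py
      1. Reads and writes HTK-format files
    4. htkmfc.py
      1. Reads and writes the acoustic feature files used by HTK
    5. mean_variance_norm.py
      1. Applies mean-variance normalization to the data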
    6. prepare_labels_from_txt.sh
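      1. Converts text to lab files:
        1. Generate a Scheme file from the txt text with the front-end tools
        2. Generate the utt files from the Scheme file
        3. Convert the Festival utts to lab files
        4. Normalize the lab files for state_align or phone_align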
    7. run_aligner.sh
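      1. Runs state-level alignment:
        1. Forced state alignment is done with HTK's HVite
        2. Full-context lab files without timestamps must first be prepared with the Festival front end
    8. setup.sh
      1. Downloads the CMU ARCTIC dataset
      2. Prepares config.cfg with the directories of Merlin, Festival, the front-end tools, etc.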
2. frontend
  1. festival_utt_to_lab
    1. extra_feats.scm
    2. label.feats
    3. label-full.awk
    4. label-mono.awk
    5. make_labels
  2. utils
    1. genScmFile.py
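      1. Reads all the text under a text directory and generates the Scheme (.scm) file used to create the utt files
    2. normalize_lab_for_merlin.py
      1. Normalizes lab files according to the alignment type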
    3. prepare_txt_done_data_file.py
3. hybrid_voice
  1. compute_tcoef_features.py
    1. Computes the tcoef features
  2. convert_hts_label_format_to_festival.py
    1. Converts HTS-format lab files to Festival format
  3. processHybridInfo.py
4. vocoder
  1. magphase
    1. extract_features_for_merlin.py
      1. Extracts low-dimensional acoustic features from a batch of wav files. The extracted features are:
      2. .mag: Mel-scaled log-magnitude
      3. .real: Mel-scaled real part
      4. .imag: Mel-scaled imaginary part
      5. .lf0: log-F0
  2. straight
    1. copy_synthesis.sh
    2. extract_features_for_merlin.py
      1. Extracts features with STRAIGHT
      2. raw wav
      3. sp
      4. ap
      5. bapd
      6. f0
      7. lf0
      8. mgc
      9. bap
    3. extract_features_for_merlin.sh
  3. world
    1. copy_synthesis.sh
    2. extract_features_for_merlin.py
      1. Extracts features with WORLD
      2. raw wav
      3. sp
      4. ap
      5. bapd
      6. f0
      7. lf0
      8. mgc
      9. bap
    3. extract_features_for_merlin.sh
    4. synthesis.py
      1. Synthesizes speech with WORLD from Merlin-format features:
        1. Convert lf0 to f0
        2. Post-filter the mgc
        3. Convert mgc to sp
        4. Convert bapd to bap
        5. Synthesize the wav
  4. world_v2
    1. copy_synthesis.sh
      1. Synthesizes speech with WORLD_V2 from Merlin-format features:
        1. Convert lf0 to f0
        2. Post-filter the mgc
        3. Convert mgc to sp
        4. Convert bapd to bap
        5. Synthesize the wav
    2. extract_features_for_merlin.py
      1. Extracts features with WORLD
      2. raw wav
      3. sp
      4. ap
      5. bapd
      6. f0
      7. lf0
      8. mgc
      9. bap
5. voice_conversion
  1. binary_io.py
    1. Reads and writes binary files stored with numpy, loads DTW files, etc.
  2. align_feats.py
  3. dtw_aligner.py
  4. dtw_aligner_festvox.py
  5. dtw_aligner_festvox_magphase.py
  6. dtw_aligner_magphase.py
  7. transform_f0.py
  8. compute_lf0_stats.py
    1. Computes the mean and variance of all lf0 values
  9. Alignment is done on features from different vocoders, using DTW (dynamic time warping)
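
The DTW alignment mentioned for voice_conversion can be illustrated with a minimal pure-Python sketch. It is not the code in the dtw_aligner scripts: the function name `dtw_align`, the Euclidean frame distance, and the three-way step pattern are assumptions chosen for illustration, and both input sequences are assumed to be non-empty.

```python
import math


def dtw_align(src, tgt):
    """Align two feature sequences (lists of equal-length frames) with DTW.

    Hypothetical helper for illustration only. Returns (total_cost, path),
    where path is a list of (src_frame_index, tgt_frame_index) pairs.
    """
    n, m = len(src), len(tgt)
    # acc[i][j] = cheapest cost of aligning src[:i] with tgt[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(src[i - 1], tgt[j - 1])  # Euclidean frame distance
            acc[i][j] = d + min(acc[i - 1][j - 1],  # advance both sequences
                                acc[i - 1][j],      # advance source only
                                acc[i][j - 1])      # advance target only

    # Backtrack from the last cell to recover the frame pairing.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = [(acc[i - 1][j - 1], i - 1, j - 1),
                 (acc[i - 1][j], i - 1, j),
                 (acc[i][j - 1], i, j - 1)]
        _, i, j = min(moves, key=lambda mv: mv[0])  # ties prefer the diagonal
    path.reverse()
    return acc[n][m], path


# Toy example: the target repeats its middle frame once.
source = [[0.0], [1.0], [2.0]]
target = [[0.0], [1.0], [1.0], [2.0]]
cost, path = dtw_align(source, target)
print(cost, path)  # 0.0 [(0, 0), (1, 1), (1, 2), (2, 3)]
```

In a voice-conversion setting the frames would be per-frame acoustic feature vectors (for example mgc) from the source and target speakers, and the returned path would be used to pair frames of unequal-length utterances.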