ISTFTHead model optimization #77

@wzy3650

Description

https://github.com/jishengpeng/WavTokenizer/blob/5cf440d91ac420ca338f117b7003a77450d64730/decoder/heads.py#L54C1-L55C1

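According to classical signal-processing theory, for a real-valued signal the FFT coefficients at bin 0 (DC) and at the highest bin (the Nyquist frequency) are always real, i.e. the imaginary part at these two bins is 0. The out_dim of self.out = torch.nn.Linear(dim, out_dim) could therefore be reduced to 1280 (it is currently 1282): rather than predicting the phase at these two bins, fix it to 0.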
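A minimal sketch of the property this relies on, using only the Python standard library rather than the repository's code. The helper names `dft` and `head_out_dims` are hypothetical, and the frame length `n_fft = 1280` is an assumption inferred from the current out_dim of 1282 (two values, magnitude and phase, for each of n_fft // 2 + 1 bins).

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A random real-valued frame; a small even length keeps the naive DFT fast.
random.seed(0)
small_n_fft = 16
frame = [random.uniform(-1.0, 1.0) for _ in range(small_n_fft)]
spec = dft(frame)

# Bin 0 is DC, bin N/2 is the Nyquist frequency (N must be even).
dc, nyquist = spec[0], spec[small_n_fft // 2]

# For a real input both imaginary parts vanish up to rounding error.
dc_is_real = abs(dc.imag) < 1e-9
nyquist_is_real = abs(nyquist.imag) < 1e-9


def head_out_dims(n_fft):
    """Hypothetical helper: output sizes of a magnitude+phase head.

    Assumes the head predicts one magnitude and one phase per one-sided
    frequency bin, which matches out_dim = 1282 for n_fft = 1280.
    """
    n_bins = n_fft // 2 + 1   # one-sided spectrum, DC through Nyquist
    full = 2 * n_bins         # magnitude + phase for every bin
    reduced = full - 2        # drop the phase of the DC and Nyquist bins
    return n_bins, full, reduced


print(dc_is_real, nyquist_is_real)
print(head_out_dims(1280))
```

Under these assumptions `head_out_dims(1280)` gives 641 bins, 1282 outputs for the current head and 1280 for the reduced one. Note that a real coefficient can be negative, so the true phase at these two bins is either 0 or π; pinning it to 0 alongside a non-negative predicted magnitude would restrict those bins to non-negative values.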
