Unable to use multithreading for librosa melspectrogram

Question

I have over 1000 audio files (this is just initial development; there will be even more in the future), and I would like to convert them to mel spectrograms.

My workstation has an Intel® Xeon® Processor E5-2698 v3 with 32 threads, so I would like to use multithreading for this job.

import os
import librosa
from librosa.display import specshow
from natsort import natsorted
import numpy as np
import sys 
# Libraries for multi thread
from multiprocessing.dummy import Pool as ThreadPool
import subprocess
pool = ThreadPool(20) 

songlist = os.listdir('../opensmile/devset_2015/')
songlist= natsorted(songlist)

def get_spectrogram(song):
    print("start")
    y, sr = librosa.load('../opensmile/devset_2015/' + song)

    ## Add some function to cut y
    y_list = y
    ##

    for i, y_i in enumerate([y_list]): # can remove for loop if no audio is cut
        S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128,fmax=8000)
        # Create the per-song output directory if needed, then save.
        os.makedirs('./Test/' + song, exist_ok=True)
        np.save('./Test/' + song + '/' + str(i), S)
        print("done saving")

pool.map(get_spectrogram, songlist)

My Problem

However, my script freezes after finishing the first conversion.

To debug what's going on, I commented out S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128,fmax=8000) and replaced it with S=0. With that change, the multithreaded code works fine.

What's wrong with the librosa.feature.melspectrogram function? Does it not support multithreading? Or is it a problem with ffmpeg? (librosa asked me to install ffmpeg before I could use it.)

Recommended Answer

I recommend using joblib for parallel processing with librosa. I believe librosa uses it internally, so this might avoid some conflicts. Below is a working example, based on code that I regularly use to process about 10k files.

import os.path
import joblib
import librosa
import numpy

def compute(inpath, outpath):
    y, sr = librosa.load(inpath)
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
    numpy.save(outpath, S)
    return outpath

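out_dir = 'temp/'
n_jobs=8
verbose=1

# numpy.save does not create missing directories, so make sure out_dir exists
os.makedirs(out_dir, exist_ok=True)

# As a reproducible example, process the same input file several times,
# making sure to give each job a unique output name.
# Note: librosa.util.example_audio_file() is gone from newer librosa releases;
# there, librosa.example('trumpet') returns an example file path instead.
inputs = [ librosa.util.example_audio_file() ] * 10
outputs = [ os.path.join(out_dir, '{}.npy'.format(n)) for n in range(len(inputs)) ]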

jobs = [ joblib.delayed(compute)(i, o) for i,o in zip(inputs, outputs) ]
out = joblib.Parallel(n_jobs=n_jobs, verbose=verbose)(jobs)

print(out)

Output

[Parallel(n_jobs=8)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done   6 out of  10 | elapsed:   10.4s remaining:    6.9s
[Parallel(n_jobs=8)]: Done  10 out of  10 | elapsed:   13.2s finished
['temp/0.npy', 'temp/1.npy', 'temp/2.npy', 'temp/3.npy', 'temp/4.npy', 'temp/5.npy', 'temp/6.npy', 'temp/7.npy', 'temp/8.npy', 'temp/9.npy']
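
To apply the same pattern to a whole directory, as in the question, something like the following might work. It is only a sketch under a few assumptions: the helper names plan_jobs and convert_all are hypothetical, plain sorted() stands in for the natsorted call from the question, and n_jobs=8 is simply carried over from the example above rather than tuned for a 32-thread machine.

```python
import os


def plan_jobs(in_dir, out_dir, filenames):
    """Pair every input file with its own .npy output path (hypothetical helper)."""
    # Appending '.npy' to the full file name keeps outputs unique even when
    # two inputs share a stem, e.g. 'a.mp3' and 'a.wav'.
    return [(os.path.join(in_dir, name), os.path.join(out_dir, name + '.npy'))
            for name in filenames]


def compute(inpath, outpath):
    # Imported here so that plan_jobs can be used (and tested) without the
    # audio libraries installed.
    import librosa
    import numpy

    y, sr = librosa.load(inpath)
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
    numpy.save(outpath, S)
    return outpath


def convert_all(in_dir, out_dir, n_jobs=8, verbose=1):
    """Convert every file in in_dir to a mel spectrogram saved in out_dir."""
    import joblib

    filenames = sorted(os.listdir(in_dir))  # assumption: plain sort, not natsorted
    os.makedirs(out_dir, exist_ok=True)     # numpy.save will not create it
    jobs = [joblib.delayed(compute)(i, o)
            for i, o in plan_jobs(in_dir, out_dir, filenames)]
    return joblib.Parallel(n_jobs=n_jobs, verbose=verbose)(jobs)
```

With the paths from the question, the call would be convert_all('../opensmile/devset_2015/', './Test/'), which returns the list of written .npy paths in the same way as the example above.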
