Do you have any code snippets showing how to load a stereo audio file into MLMultiArray object?
It depends on the desired buffer layout of the PCM data (interleaved or separate channels, Int16 or Float32, and so on). Good starting points are:
- https://developer.apple.com/documentation/soundanalysis/snclassifysoundrequest
- https://apple.github.io/turicreate/docs/userguide/sound_classifier/export-coreml.html
If you already have a buffer with the desired layout, you can use MLMultiArray.init(dataPointer:shape:dataType:strides:deallocator:) or MLShapedArray.init(bytesNoCopy:shape:strides:deallocator:). If you need to copy with some data type conversion (e.g. Int16 to Float32), MLShapedArray.init(unsafeUninitializedShape:initializingWith:) would work.
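For example, here is a minimal sketch of the copy-with-conversion path; the Int16 source buffer is a hypothetical stand-in for whatever your decoder hands you:

import CoreML

// Hypothetical input: one second of interleaved stereo Int16 PCM.
let int16Samples = [Int16](repeating: 0, count: 2 * 44100)

// Copy into a Float32 shaped array, converting each sample to [-1.0, 1.0].
let floatSamples = MLShapedArray<Float32>(
    unsafeUninitializedShape: [int16Samples.count],
    initializingWith: { buffer, _ in
        for (i, sample) in int16Samples.enumerated() {
            buffer[i] = Float32(sample) / Float32(Int16.max)
        }
    }
)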
AAC audio, on the other hand, is typically decoded into a sequence of audio chunks that are not aligned to one-second boundaries, so we need to do some (tedious) buffer munging. The code below loads an AAC file into an MLShapedArray one second at a time and writes each chunk back out to a new AAC file.
MLShapedArray is a Swift-y cousin of MLMultiArray and, if you are using Swift, it is preferred over MLMultiArray. Core ML accepts either type, and MLMultiArray(_ shapedArray:) and MLShapedArray(_ multiArray:) convert between them.
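For instance, a quick sketch of the round trip (the shape here is arbitrary):

import CoreML

// Round-trip between the two container types.
let shaped = MLShapedArray<Float32>(repeating: 0, shape: [2, 44100])
let multi = MLMultiArray(shaped)                 // MLShapedArray -> MLMultiArray
let roundTripped = MLShapedArray<Float32>(multi) // MLMultiArray -> MLShapedArray

And here is the full one-second chunking example: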
import AVFoundation
import CoreML
// Decoded PCM format: 44.1 kHz stereo, deinterleaved Float32.
let audioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                sampleRate: 44100,
                                channels: 2,
                                interleaved: false)!

// One second of audio per chunk.
let frameCount = AVAudioFrameCount(audioFormat.sampleRate)
let inputURL = URL(filePath: "/Users/apple/sample.aac")
let sourceAudioFile = try! AVAudioFile(forReading: inputURL)
let sourceAudioBuffer = AVAudioPCMBuffer(
    pcmFormat: audioFormat,
    frameCapacity: frameCount
)!
// The settings dictionary is heterogeneous, so it needs an explicit type.
let aacSettings: [String : Any] = [AVFormatIDKey : kAudioFormatMPEG4AAC,
                                   AVSampleRateKey : 44100,
                                   AVNumberOfChannelsKey : 2]
let outputURL = URL(filePath: "/Users/apple/output.aac")
let outputAudioFile = try! AVAudioFile(forWriting: outputURL,
                                       settings: aacSettings)
// Loop to read and decode the source audio file one chunk at a time.
while sourceAudioFile.framePosition < sourceAudioFile.length {
    try! sourceAudioFile.read(into: sourceAudioBuffer)
    let frameLength = Int(sourceAudioBuffer.frameLength)
    // Make MLShapedArray from the audio buffer: wrap each deinterleaved
    // channel without copying, then concatenate into a [2, frameLength] array.
    let leftChannels = MLShapedArray<Float32>(
        bytesNoCopy: sourceAudioBuffer.floatChannelData![0],
        shape: [1, frameLength],
        strides: [frameLength, 1],
        deallocator: .none
    )
    let rightChannels = MLShapedArray<Float32>(
        bytesNoCopy: sourceAudioBuffer.floatChannelData![1],
        shape: [1, frameLength],
        strides: [frameLength, 1],
        deallocator: .none
    )
    let audioShapedArr = MLShapedArray(
        concatenating: [leftChannels, rightChannels],
        alongAxis: 0
    )
    // Write the MLShapedArray back into an audio buffer, channel by channel.
    let outputAudioBuffer = AVAudioPCMBuffer(
        pcmFormat: audioFormat,
        frameCapacity: sourceAudioBuffer.frameLength
    )!
    audioShapedArr[0].withUnsafeShapedBufferPointer { ptr, _, _ in
        outputAudioBuffer.floatChannelData![0].initialize(
            from: ptr.baseAddress!,
            count: frameLength
        )
    }
    audioShapedArr[1].withUnsafeShapedBufferPointer { ptr, _, _ in
        outputAudioBuffer.floatChannelData![1].initialize(
            from: ptr.baseAddress!,
            count: frameLength
        )
    }
    outputAudioBuffer.frameLength = sourceAudioBuffer.frameLength

    // And encode and write to an AAC file.
    try! outputAudioFile.write(from: outputAudioBuffer)
}
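Inside the loop, each one-second audioShapedArr has shape [2, frameLength], so it can be fed to Core ML as-is or wrapped as a feature value; the input name below is a hypothetical placeholder for your model's actual input:

// Hypothetical: wrap a chunk for a model whose input is named "audio".
let featureValue = MLFeatureValue(multiArray: MLMultiArray(audioShapedArr))
let inputs = try! MLDictionaryFeatureProvider(dictionary: ["audio" : featureValue])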