Overview
The VoxtaAudioUtility module handles real-time audio capture, and encodes and decodes the audio data sent to and received from VoxtaServer.
- Audio capture from microphone input
- WAV data decoding
- Real-time audio streaming
- WebSocket-based audio transmission
Module Structure
Sending audio to VoxtaServer
AudioCaptureHandler: Manages microphone input capture and provides audio data to the voice system.
- Real-time audio capture from system microphones
- Configurable capture settings (sample rate, channels, etc.)
- Audio buffer management and queueing (triggered by VoiceRunnerThread)
AudioWebSocket: Manages transmitting audio data through Unreal's FWebSocketsModule.
- Connection state management
- Broadcast notifications (connection, authentication, etc.)
- Error handling
VoiceRunnerThread: A dedicated non-blocking thread for processing voice data.
- Every 0.2 seconds:
  - Fill a buffer with all available voice data
  - Calculate the decibel level from the buffer
  - If the buffer contains data, send it to VoxtaServer
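The decibel step in the loop above can be sketched in plain C++. This is a minimal illustration, assuming 16-bit PCM samples and an RMS level expressed in dBFS; the formula VoiceRunnerThread actually uses is not shown in this document, and `ComputeDecibels` is a hypothetical name.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper, not part of VoxtaAudioUtility.
// Returns the RMS level of 16-bit PCM samples in dBFS:
// 0 dB is a full-scale signal, silence or an empty buffer is -infinity.
inline double ComputeDecibels(const std::vector<std::int16_t>& samples)
{
    const double silence = -std::numeric_limits<double>::infinity();
    if (samples.empty())
    {
        return silence;
    }

    double sumOfSquares = 0.0;
    for (const std::int16_t sample : samples)
    {
        // Normalize to the range [-1.0, 1.0) before squaring.
        const double normalized = static_cast<double>(sample) / 32768.0;
        sumOfSquares += normalized * normalized;
    }

    const double rms = std::sqrt(sumOfSquares / static_cast<double>(samples.size()));
    return rms > 0.0 ? 20.0 * std::log10(rms) : silence;
}
```

A buffer at half of full scale comes out at roughly -6 dB, which gives a feel for where a silence threshold might sit.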
Receiving audio from VoxtaServer
FBaseRuntimeCodec: Generic base codec to support more formats in the future.
WAV_RuntimeCodec: WAV codec with dr_wav library integration.
RuntimeAudioImporterLibrary: Main interface for audio import operations.
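To illustrate what WAV decoding involves, here is a hand-rolled sketch that walks the RIFF chunks of an in-memory buffer and reads the format fields. The module itself delegates this to dr_wav, so nothing below is its actual code; `WavFormat` and `ParseWavHeader` are hypothetical names, and the sketch only locates the `fmt ` and `data` chunks without converting samples.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical types and helpers, not part of VoxtaAudioUtility.
struct WavFormat
{
    std::uint16_t audioFormat = 0;   // 1 means uncompressed PCM
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::uint16_t bitsPerSample = 0;
    std::size_t dataOffset = 0;      // byte offset of the first sample
    std::size_t dataSize = 0;        // sample bytes available in the buffer
};

// WAV fields are little-endian.
inline std::uint16_t ReadU16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t ReadU32(const std::uint8_t* p)
{
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8)
        | (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

inline std::optional<WavFormat> ParseWavHeader(const std::vector<std::uint8_t>& bytes)
{
    if (bytes.size() < 12 || std::memcmp(bytes.data(), "RIFF", 4) != 0
        || std::memcmp(bytes.data() + 8, "WAVE", 4) != 0)
    {
        return std::nullopt;
    }

    WavFormat format;
    bool hasFormat = false;
    std::size_t offset = 12;
    while (offset + 8 <= bytes.size())
    {
        const std::uint8_t* chunk = bytes.data() + offset;
        const std::uint32_t chunkSize = ReadU32(chunk + 4);
        const std::size_t payload = offset + 8;

        if (std::memcmp(chunk, "fmt ", 4) == 0 && chunkSize >= 16 && payload + 16 <= bytes.size())
        {
            format.audioFormat = ReadU16(chunk + 8);
            format.channels = ReadU16(chunk + 10);
            format.sampleRate = ReadU32(chunk + 12);
            format.bitsPerSample = ReadU16(chunk + 22);
            hasFormat = true;
        }
        else if (std::memcmp(chunk, "data", 4) == 0 && hasFormat)
        {
            format.dataOffset = payload;
            // Clamp: streamed files may declare more data than the buffer holds.
            format.dataSize = std::min<std::size_t>(chunkSize, bytes.size() - payload);
            return format;
        }

        // Chunks are padded to an even number of bytes.
        offset = payload + chunkSize + (chunkSize & 1u);
    }
    return std::nullopt;
}
```

Scanning chunk by chunk, rather than assuming a fixed 44-byte header, keeps the sketch tolerant of files that place extra chunks before the sample data.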
Platform Support
The module supports:
- Android*
- iOS*
- Mac*
- Windows
*untested
Usage Examples
Initializing Audio Capture
// Create the socket and subscribe to its connection events.
TSharedPtr<AudioWebSocket> audioWebSocket = MakeShared<AudioWebSocket>(serverAddress, serverPort);
audioWebSocket->OnConnectedEvent.AddUObject(this, &YourClass::OnSocketConnected);
audioWebSocket->OnConnectionErrorEvent.AddUObject(this, &YourClass::OnSocketConnectionError);
audioWebSocket->OnClosedEvent.AddUObject(this, &YourClass::OnSocketClosed);
// Open the connection for the given chat session.
audioWebSocket->Connect(chatSessionId);
// Once the connection is open (for example, from OnSocketConnected), send the audio format header.
audioWebSocket->Send(FString::Format(TEXT("{\"contentType\":\"audio/wav\","
    "\"sampleRate\":{0},\"channels\":{1},\"bitsPerSample\": 16,\"bufferMilliseconds\":{2}}"),
    { sampleRate, inputChannels, bufferMs }));
AudioCaptureHandler (defined at AudioCaptureHandler.h:19)
- bool TryInitializeVoiceCapture(int sampleRate=16000, int numChannels=1): Tries to create a voice capture instance (IVoiceCapture). Defined at AudioCaptureHandler.cpp:18.
- void RegisterSocket(TWeakPtr<AudioWebSocket> socket, int bufferMillisecondSize): Stores a pointer to the websocket that will receive the parsed data from the FVoiceModule. Defined at AudioCaptureHandler.cpp:12.
- bool TryStartVoiceCapture(): Tries to start that IVoiceCapture and also start the background thread that will forward any captured... Defined at AudioCaptureHandler.cpp:82.
- void ConfigureSilenceThresholds(float micNoiseGateThreshold, float silenceDetectionThreshold, float micInputGain): Configures values that help the microphone pick up voice without background noise. Defined at AudioCaptureHandler.cpp:63.
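To make the audio format header from the capture example concrete, the sketch below builds the same JSON string in standard C++ and works out how many bytes one buffer holds. `BuildAudioHeaderJson` and `BytesPerBuffer` are hypothetical helpers rather than module functions, and the 200 ms buffer length used in the note that follows is only an illustrative value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of VoxtaAudioUtility.
// Produces the same header text as the FString::Format call in the capture example.
inline std::string BuildAudioHeaderJson(int sampleRate, int channels, int bufferMs)
{
    return "{\"contentType\":\"audio/wav\",\"sampleRate\":" + std::to_string(sampleRate)
        + ",\"channels\":" + std::to_string(channels)
        + ",\"bitsPerSample\": 16,\"bufferMilliseconds\":" + std::to_string(bufferMs) + "}";
}

// Hypothetical helper: bytes of 16-bit PCM audio covered by one buffer.
inline std::size_t BytesPerBuffer(int sampleRate, int channels, int bufferMs)
{
    constexpr std::size_t bytesPerSample = 2; // 16 bits per sample, as stated in the header
    return static_cast<std::size_t>(sampleRate) * static_cast<std::size_t>(channels)
        * bytesPerSample * static_cast<std::size_t>(bufferMs) / 1000;
}
```

With the default 16000 Hz mono capture and a 200 ms buffer, `BytesPerBuffer(16000, 1, 200)` gives 6400 bytes per buffer.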
Decoding WAV data
URuntimeAudioImporterLibrary::ImportAudioFromBuffer(TArray64<uint8>(rawAudioData),
    [Self = TWeakPtr<YourClass>(AsShared())] (UImportedSoundWave* soundWave)
    {
        if (soundWave)
        {
            // Only touch the owning object if it is still alive when the import completes.
            if (TSharedPtr<YourClass> sharedSelf = Self.Pin())
            {
                // sharedSelf is valid here; use soundWave as needed.
            }
        }
        else
        {
            UE_LOGFMT(VoxtaLog, Error, "Failed to process raw audio data into UImportedSoundWave.");
        }
    });
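The callback above captures a weak pointer so that an import finishing after its owner has been destroyed is ignored instead of touching freed memory. The same pattern can be sketched with the standard library equivalents of TWeakPtr and TSharedPtr; `AudioOwner` and `MakeImportCallback` are hypothetical names used only for illustration.

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for a class that starts an asynchronous import.
class AudioOwner : public std::enable_shared_from_this<AudioOwner>
{
public:
    // Returns a callback that only touches this object if it is still alive.
    std::function<bool(const std::string&)> MakeImportCallback()
    {
        std::weak_ptr<AudioOwner> weakSelf = shared_from_this();
        return [weakSelf](const std::string& importedName)
        {
            if (std::shared_ptr<AudioOwner> self = weakSelf.lock())
            {
                self->lastImported = importedName;
                return true;
            }
            // The owner was destroyed before the import finished.
            return false;
        };
    }

    std::string lastImported;
};
```

Capturing a shared pointer instead would keep the owner alive until the callback runs, which is usually not what is wanted for audio that may no longer be needed.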
Dependencies
- UnrealEngine
  - AudioCaptureCore
  - AudioExtensions
  - AudioPlatformConfiguration
  - Core
  - Engine
  - Voice
  - WebSockets
- UnrealVoxta
- UnrealEngine (platform-specific)
  - AVFoundation
  - AndroidPermission
  - AudioCaptureAndroid
  - AudioCaptureAudioUnit
  - AudioCaptureRtAudio
  - AudioToolbox
  - CoreAudio
Licensing
RuntimeAudioImporter:
MIT license - copyright (c) 2024 Georgy Treshchev. See VoxtaAudioUtility/Public/RuntimeAudioImporter/LICENSE for details.
Other code:
MIT license - copyright (c) 2025 grrimgrriefer & DZnnah. See LICENSE in root for details.