UnrealVoxta 0.1.1
 
VoxtaAudioUtility Module

Overview

The VoxtaAudioUtility module handles real-time audio capture, as well as encoding and decoding the audio data sent to and received from VoxtaServer.

  • Audio capture from microphone input
  • WAV data decoding
  • Real-time audio streaming
  • WebSocket-based audio transmission

Module Structure

Sending audio to VoxtaServer

  • AudioCaptureHandler: Manages microphone input capture and provides audio data to the voice system.
    • Real-time audio capture from system microphones
    • Configurable capture settings (sample rate, channels, etc.)
    • Audio buffer management and queueing (triggered by VoiceRunnerThread)
  • AudioWebSocket: Manages transmitting audio data through Unreal's FWebSocketsModule.
    • Connection state management
    • Broadcast notifications (connection, authentication, etc.)
    • Error handling
  • VoiceRunnerThread: A dedicated non-blocking thread for processing voice data.
    • Every 0.2 seconds:
      • Fill a buffer with all available voice data
      • Calculate decibel level from the buffer
      • If the buffer contains data, send it to VoxtaServer
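
The per-tick steps listed for VoiceRunnerThread can be sketched in standard C++, outside Unreal, roughly as follows. This is a minimal illustration under assumptions, not the module's implementation: it assumes 16-bit signed PCM samples and an RMS level expressed in dBFS (0 dB = full scale), and the names `ComputeDecibels` and `ProcessTick` are hypothetical. The actual thread may measure the level differently.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical helper: RMS level of 16-bit PCM samples in dBFS.
// Returns -infinity for an empty or all-zero buffer.
double ComputeDecibels(const std::vector<int16_t>& samples)
{
    if (samples.empty())
    {
        return -std::numeric_limits<double>::infinity();
    }
    double sumSquares = 0.0;
    for (int16_t s : samples)
    {
        const double normalized = static_cast<double>(s) / 32768.0;
        sumSquares += normalized * normalized;
    }
    const double rms = std::sqrt(sumSquares / static_cast<double>(samples.size()));
    return rms > 0.0 ? 20.0 * std::log10(rms)
                     : -std::numeric_limits<double>::infinity();
}

// Hypothetical per-tick step: measure the level, then forward the buffer
// only if it contains data. Returns true when something was sent.
bool ProcessTick(const std::vector<int16_t>& buffer,
                 const std::function<void(const std::vector<int16_t>&)>& send,
                 double& outDecibels)
{
    outDecibels = ComputeDecibels(buffer);
    if (buffer.empty())
    {
        return false;
    }
    send(buffer);
    return true;
}
```

In this sketch a caller would invoke `ProcessTick` once per 0.2 second interval, passing whatever samples have accumulated since the previous tick.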

Figure: SequenceDiagramAudioUtility_send (sequence diagram for sending audio)

Receiving audio from VoxtaServer

  • FBaseRuntimeCodec: Generic base codec to support more formats in the future.
  • WAV_RuntimeCodec: WAV codec with dr_wav library integration
  • RuntimeAudioImporterLibrary: Main interface for audio import operations

Figure: SequenceDiagramAudioUtility_receive (sequence diagram for receiving audio)
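
To give a feel for what a WAV codec has to read before it can decode samples, here is a small standard C++ sketch that extracts the format fields from a canonical PCM WAV header. It is illustrative only: it assumes the "fmt " chunk immediately follows the RIFF header, which is common but not guaranteed (a full decoder such as dr_wav handles other chunk layouts). The `WavFormat` struct and `TryParseWavHeader` function are hypothetical and not part of the module.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for the fields of the WAV "fmt " chunk.
struct WavFormat
{
    uint16_t audioFormat = 0;   // 1 = uncompressed PCM
    uint16_t channels = 0;
    uint32_t sampleRate = 0;
    uint16_t bitsPerSample = 0;
};

// WAV stores integers in little-endian byte order.
static uint16_t ReadU16LE(const uint8_t* p)
{
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t ReadU32LE(const uint8_t* p)
{
    return static_cast<uint32_t>(p[0]) |
           (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}

// Reads the format fields from a canonical header layout:
// "RIFF" <size> "WAVE" "fmt " <size> <format fields...>
// Returns false if the buffer is too short or the tags do not match.
bool TryParseWavHeader(const std::vector<uint8_t>& data, WavFormat& outFormat)
{
    if (data.size() < 36)
    {
        return false;
    }
    const uint8_t* bytes = data.data();
    if (std::memcmp(bytes, "RIFF", 4) != 0 ||
        std::memcmp(bytes + 8, "WAVE", 4) != 0 ||
        std::memcmp(bytes + 12, "fmt ", 4) != 0)
    {
        return false;
    }
    outFormat.audioFormat = ReadU16LE(bytes + 20);
    outFormat.channels = ReadU16LE(bytes + 22);
    outFormat.sampleRate = ReadU32LE(bytes + 24);
    outFormat.bitsPerSample = ReadU16LE(bytes + 34);
    return true;
}
```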

Platform Support

The module supports:

  • Android*
  • iOS*
  • Mac*
  • Windows

*untested

Usage Examples

Initializing Audio Capture

TSharedPtr<AudioWebSocket> audioWebSocket = MakeShared<AudioWebSocket>(serverAddress, serverPort);
// Listen to connection-related broadcasts
audioWebSocket->OnConnectedEvent.AddUObject(this, &YourClass::OnSocketConnected);
audioWebSocket->OnConnectionErrorEvent.AddUObject(this, &YourClass::OnSocketConnectionError);
audioWebSocket->OnClosedEvent.AddUObject(this, &YourClass::OnSocketClosed);
// Configure audio capture (audioCaptureDevice is an AudioCaptureHandler instance)
audioCaptureDevice.ConfigureSilenceThresholds(0.001f, 0.001f, 6.0f); // micNoiseGateThreshold, silenceDetectionThreshold, micInputGain
// Inform server of audio data format we will send
// (sampleRate, inputChannels and bufferMs should match the values used below: 16000, 1 and 200)
audioWebSocket->Send(FString::Format(TEXT("{\"contentType\":\"audio/wav\","
    "\"sampleRate\":{0},\"channels\":{1},\"bitsPerSample\": 16,\"bufferMilliseconds\":{2}}"),
    { sampleRate, inputChannels, bufferMs }));
audioCaptureDevice.RegisterSocket(audioWebSocket, 200); // audioWebSocket, bufferMs
audioCaptureDevice.TryInitializeVoiceCapture(16000, 1); // 16kHz, mono
// Connect to the current ongoing chat
audioWebSocket->Connect(chatSessionId);
// Start capturing
audioCaptureDevice.TryStartVoiceCapture();

Functions referenced above:

  • AudioCaptureHandler
    • Definition: AudioCaptureHandler.h:19
  • bool TryInitializeVoiceCapture(int sampleRate=16000, int numChannels=1)
    • Tries to create a voice capture instance (IVoiceCapture).
    • Definition: AudioCaptureHandler.cpp:18
  • void RegisterSocket(TWeakPtr< AudioWebSocket > socket, int bufferMillisecondSize)
    • Store a pointer to the websocket that will receive the parsed data from the FVoiceModule.
    • Definition: AudioCaptureHandler.cpp:12
  • bool TryStartVoiceCapture()
    • Tries to start that IVoiceCapture and also start the background thread that will forward any captured...
    • Definition: AudioCaptureHandler.cpp:82
  • void ConfigureSilenceThresholds(float micNoiseGateThreshold, float silenceDetectionThreshold, float micInputGain)
    • Configure values to help the microphone to pick up voice without background noise.
    • Definition: AudioCaptureHandler.cpp:63
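
The format message sent in the example above can be reproduced outside Unreal to inspect what the server receives. The following standard C++ sketch builds the same JSON string and estimates how many bytes one buffer holds. The JSON field names come from the example; the function names `BuildAudioFormatMessage` and `BytesPerBuffer` are hypothetical, and the byte count assumes raw 16-bit PCM with no per-buffer header, which the source does not confirm.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds the JSON text that describes the audio format.
std::string BuildAudioFormatMessage(int sampleRate, int channels, int bufferMs)
{
    return "{\"contentType\":\"audio/wav\",\"sampleRate\":" + std::to_string(sampleRate)
        + ",\"channels\":" + std::to_string(channels)
        + ",\"bitsPerSample\": 16,\"bufferMilliseconds\":" + std::to_string(bufferMs)
        + "}";
}

// Hypothetical helper: bytes of 16-bit PCM audio in one buffer of bufferMs milliseconds.
std::size_t BytesPerBuffer(int sampleRate, int channels, int bufferMs)
{
    const std::size_t bytesPerSample = 2; // 16 bits per sample
    return static_cast<std::size_t>(sampleRate) * static_cast<std::size_t>(channels)
        * bytesPerSample * static_cast<std::size_t>(bufferMs) / 1000;
}
```

With the values from the example (16000 Hz, mono, 200 ms), `BytesPerBuffer` gives 6400 bytes per buffer under these assumptions.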

Decoding WAV data

URuntimeAudioImporterLibrary::ImportAudioFromBuffer(TArray64<uint8>(rawAudioData),
    [Self = TWeakPtr<YourClass>(AsShared())] (UImportedSoundWave* soundWave)
    {
        if (soundWave)
        {
            // Only continue if the owning object is still alive
            if (TSharedPtr<YourClass> sharedSelf = Self.Pin())
            {
                // soundWave->AddToRoot();
                // ...
            }
        }
        else
        {
            UE_LOGFMT(VoxtaLog, Error, "Failed to process raw audio data into UImportedSoundWave.");
        }
    });
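
The callback above captures a weak pointer so that an import finishing late does not touch an object that has already been destroyed. The same pattern can be sketched in standard C++, with `std::weak_ptr` and `std::enable_shared_from_this` standing in for Unreal's `TWeakPtr` and `AsShared()`. The class `AudioOwner` and its members are hypothetical and exist only for illustration.

```cpp
#include <functional>
#include <memory>

// Hypothetical owner of imported audio, standing in for YourClass.
class AudioOwner : public std::enable_shared_from_this<AudioOwner>
{
public:
    // Returns a callback that stays safe to invoke after this object is destroyed.
    std::function<void(bool)> MakeImportCallback()
    {
        std::weak_ptr<AudioOwner> weakSelf = shared_from_this();
        return [weakSelf](bool importSucceeded)
        {
            if (!importSucceeded)
            {
                return; // the original example logs an error here
            }
            // lock() yields an empty pointer if the owner no longer exists
            if (std::shared_ptr<AudioOwner> self = weakSelf.lock())
            {
                ++self->handledImports;
            }
        };
    }

    int handledImports = 0;
};
```

Capturing a shared pointer instead would keep the owner alive until the callback is released, which is usually not what an asynchronous import wants.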

Dependencies

  • UnrealEngine
    • AudioCaptureCore
    • AudioExtensions
    • AudioPlatformConfiguration
    • Core
    • Engine
    • Voice
    • WebSockets
  • UnrealVoxta
  • UnrealEngine (platform-specific)
    • AVFoundation
    • AndroidPermission
    • AudioCaptureAndroid
    • AudioCaptureAudioUnit
    • AudioCaptureRtAudio
    • AudioToolbox
    • CoreAudio

Licensing

RuntimeAudioImporter:
MIT license - copyright (c) 2024 Georgy Treshchev. See VoxtaAudioUtility/Public/RuntimeAudioImporter/LICENSE for details.

Other code:
MIT license - copyright (c) 2025 grrimgrriefer & DZnnah. See LICENSE in root for details.