Converts story text into multi-track audio scenes (Kotlin)

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.*
import kotlinx.serialization.*
import kotlinx.serialization.json.*
import java.io.File
import javax.sound.sampled.*

// Represents an audio track (e.g., voice, music, sound effect)
@Serializable
data class AudioTrack(
    val name: String,
    val filePath: String,
    val volume: Float = 1.0f, // Volume between 0.0 and 1.0
    val startOffset: Long = 0 // Delay in milliseconds before this track starts within the scene
)

// Represents a scene in the audio drama
@Serializable
data class AudioScene(
    val sceneId: String,
    val tracks: List<AudioTrack>,
    val duration: Long = 0 // Scene duration in milliseconds; if zero, it is derived from the longest track
)

// Represents a section of story text and the scene it is associated with
@Serializable
data class StorySection(
    val sectionId: String,
    val text: String,
    val sceneId: String
)

// Represents the entire audio drama structure
@Serializable
data class AudioDrama(
    val scenes: List<AudioScene>,
    val storySections: List<StorySection> // Added story sections for scene association
)


// Function to load an audio file and return a Clip
fun loadAudioClip(filePath: String): Clip? {
    return try {
        val audioFile = File(filePath)
        val audioStream = AudioSystem.getAudioInputStream(audioFile)
        val clip = AudioSystem.getClip()
        clip.open(audioStream)
        clip
    } catch (e: Exception) {
        println("Error loading audio file: ${e.message}")
        null
    }
}

// Function to play an audio clip at the given linear volume (0.0..1.0)
fun playAudioClip(clip: Clip, volume: Float) {
    val gainControl = clip.getControl(FloatControl.Type.MASTER_GAIN) as FloatControl
    // Convert the linear volume to decibels and clamp it to the control's supported range
    val gainDb = 20f * kotlin.math.log10(volume.coerceIn(0.0001f, 1.0f))
    gainControl.value = gainDb.coerceIn(gainControl.minimum, gainControl.maximum)

    clip.start()
}

// Coroutine function to play a single audio track within a scene
suspend fun playTrack(track: AudioTrack, sceneDuration: Long) {
    println("Playing track: ${track.name} from ${track.filePath}")
    val clip = loadAudioClip(track.filePath) ?: return

    clip.addLineListener { event ->
        if (event.type == LineEvent.Type.STOP) {
            println("Track ${track.name} finished playing.")
        }
    }

    // Honour the track's start offset within the scene before playback begins
    if (track.startOffset > 0) delay(track.startOffset)

    playAudioClip(clip, track.volume)

    // Wait for the track to finish, or for the rest of the scene if the track is shorter,
    // then stop the clip and release its resources
    val trackDurationMs = clip.microsecondLength / 1000
    delay(maxOf(trackDurationMs, sceneDuration - track.startOffset))
    clip.stop()
    clip.close()
}


// Function to play an audio scene using coroutines
suspend fun playScene(scene: AudioScene) {
    println("Playing scene: ${scene.sceneId}")

    // Use the explicit scene duration if provided; otherwise derive it from the longest track
    val sceneDuration = if (scene.duration > 0) {
        scene.duration
    } else {
        var longestTrackDuration = 0L
        for (track in scene.tracks) {
            loadAudioClip(track.filePath)?.let { clip ->
                longestTrackDuration = maxOf(longestTrackDuration, clip.microsecondLength / 1000)
                clip.close()
            }
        }
        longestTrackDuration
    }


    // Launch a coroutine per track so all tracks in the scene play concurrently;
    // coroutineScope waits for every track to finish before the scene ends
    coroutineScope {
        scene.tracks.forEach { track ->
            launch(Dispatchers.IO) {
                playTrack(track, sceneDuration)
            }
        }
    }

    println("Scene ${scene.sceneId} finished.")
}


fun main() {
    // Example usage:  Loading a JSON definition

    val jsonString = """
{
  "scenes": [
    {
      "sceneId": "scene1",
      "tracks": [
        {
          "name": "narration",
          "filePath": "audio/narration.wav",
          "volume": 0.8,
          "startOffset": 0
        },
        {
          "name": "music",
          "filePath": "audio/background_music.wav",
          "volume": 0.5,
          "startOffset": 0
        }
      ],
      "duration": 5000
    },
    {
      "sceneId": "scene2",
      "tracks": [
        {
          "name": "dialogue1",
          "filePath": "audio/dialogue1.wav",
          "volume": 1.0,
          "startOffset": 0
        },
        {
          "name": "sound_effect",
          "filePath": "audio/door_slam.wav",
          "volume": 0.7,
          "startOffset": 1000
        }
      ],
      "duration": 3000
    }
  ],
  "storySections": [
    {
      "sectionId": "section1",
      "text": "The story begins...",
      "sceneId": "scene1"
    },
    {
      "sectionId": "section2",
      "text": "A dramatic event occurs...",
      "sceneId": "scene2"
    }
  ]
}
"""

    val audioDrama = Json.decodeFromString<AudioDrama>(jsonString)

    // Validate file paths (example)
    audioDrama.scenes.forEach { scene ->
        scene.tracks.forEach { track ->
            val file = File(track.filePath)
            if (!file.exists()) {
                println("WARNING: Audio file not found: ${track.filePath}")
            }
        }
    }

    // Play the audio drama
    runBlocking {
        for (scene in audioDrama.scenes) {
            playScene(scene)
        }
    }

    println("Audio drama finished!")
}
```

Key improvements and explanations:

* **Serialization/Deserialization with `kotlinx.serialization`:**  The code now uses `kotlinx.serialization` to load the audio drama definition from a JSON string. This is a much cleaner and more robust approach than manually parsing the JSON.  It also provides type safety.  Make sure to include `kotlinx.serialization` in your `build.gradle.kts`:

  ```kotlin
  plugins {
      kotlin("jvm") version "1.9.22"
      id("org.jetbrains.kotlin.plugin.serialization") version "1.9.22" // Add the serialization plugin
  }

  dependencies {
      implementation(kotlin("stdlib"))
      implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.6.2") // Add kotlinx.serialization
      testImplementation("org.jetbrains.kotlin:kotlin-test")

      // The Java Sound API ships with the JDK; add extra codecs (e.g., an MP3 decoder) only if needed
      implementation(files("libs/jlayer-1.0.1.jar")) // Example only, replace with a proper Maven dependency
  }
  ```
  Replace `"libs/jlayer-1.0.1.jar"` with a proper Maven coordinate if you need a decoder beyond the standard Java Sound API; for example, you will likely need an additional library to handle MP3 files.
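
  As a hedged sketch of loading the same structure from disk (the `loadDrama` helper and the `drama.json` path are illustrative, not part of the listing above), `kotlinx.serialization` can also be configured to ignore unknown JSON fields:

  ```kotlin
  import kotlinx.serialization.json.Json
  import java.io.File

  // Tolerate extra JSON fields instead of failing the whole parse
  private val lenientJson = Json { ignoreUnknownKeys = true }

  // "drama.json" is a hypothetical path used only for illustration
  fun loadDrama(path: String = "drama.json"): AudioDrama =
      lenientJson.decodeFromString<AudioDrama>(File(path).readText())
  ```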

* **Data Classes:**  Uses data classes (`AudioTrack`, `AudioScene`, `AudioDrama`, `StorySection`) to represent the audio drama structure. This makes the code more readable and easier to maintain.  Data classes automatically generate `equals()`, `hashCode()`, `toString()`, and `copy()` methods.
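
  For example, `copy()` makes it easy to derive a variant of a track without mutating the original; a minimal sketch using the `AudioTrack` class from the listing:

  ```kotlin
  val narration = AudioTrack(name = "narration", filePath = "audio/narration.wav", volume = 0.8f)

  // copy() returns a new instance; only the named properties change
  val quietNarration = narration.copy(volume = 0.3f)

  println(quietNarration) // data classes get a readable toString() for free
  ```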

* **Coroutines for Concurrent Audio Playback:**  Uses Kotlin coroutines to play multiple audio tracks concurrently within a scene.  This allows the audio drama to have multiple layers of sound playing simultaneously. The `Dispatchers.IO` dispatcher is used for audio playback to avoid blocking the main thread.

* **Error Handling:** Includes `try-catch` blocks to handle potential exceptions when loading audio files.  Also includes a validation step to check if the audio files exist before attempting to play them.
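
  If you prefer expression-style error handling, the loader could equally be written with the standard library's `runCatching`; this is an alternative sketch, not what the listing above uses (it assumes the same `javax.sound.sampled` and `java.io.File` imports):

  ```kotlin
  fun loadAudioClipOrNull(filePath: String): Clip? =
      runCatching {
          val stream = AudioSystem.getAudioInputStream(File(filePath))
          AudioSystem.getClip().apply { open(stream) } // open the clip before returning it
      }.onFailure { e ->
          println("Error loading audio file: ${e.message}")
      }.getOrNull()
  ```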

* **Volume Control:** Implements volume control using `FloatControl` to adjust the gain of the audio clip. The linear volume (0.0-1.0) is converted to decibels (dB) and clamped to the range the gain control supports.
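
  As a quick worked example of that conversion (a sketch using the same formula as `playAudioClip`; the `linearToDb` helper is illustrative):

  ```kotlin
  fun linearToDb(volume: Float): Float =
      20f * kotlin.math.log10(volume.coerceIn(0.0001f, 1.0f))

  // Halving the linear volume attenuates the signal by roughly 6 dB
  println(linearToDb(1.0f)) // 0.0
  println(linearToDb(0.5f)) // about -6.0
  println(linearToDb(0.1f)) // -20.0
  ```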

* **Start Offset:** Implements a `startOffset` parameter so each audio track can start playing at a specific time within the scene, which is crucial for precise synchronization. The track's coroutine simply delays by the offset before starting the clip.
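
  If you instead want the offset to skip *into* the audio file rather than delay its start, `Clip` supports seeking by frame; a hedged sketch of that alternative (the `seekClip` helper is not part of the listing):

  ```kotlin
  // Seek the clip so playback begins offsetMs milliseconds into the file
  fun seekClip(clip: Clip, offsetMs: Long) {
      val frame = (offsetMs * clip.format.frameRate / 1000.0).toInt()
      clip.framePosition = frame.coerceIn(0, clip.frameLength)
  }
  ```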

* **Resource Management:**  The `clip.close()` method is called after each track finishes playing to release system resources. This is important to prevent memory leaks and ensure that the audio system remains stable.

* **Scene Duration:** Adds a `duration` parameter to the `AudioScene` data class so you can specify the total duration of a scene. If it is not set explicitly, the duration is calculated dynamically from the *longest* audio track in the scene, which keeps the scene from ending before its slowest track. Each clip is also stopped explicitly with `clip.stop()` once its wait completes.
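
  Opening a full `Clip` just to measure a file is relatively heavy; a lighter sketch (assuming the file header reports a frame length, which standard `.wav` files do; the `audioDurationMs` helper is illustrative) reads the format instead:

  ```kotlin
  // Duration in milliseconds read from the file header, without opening a Clip
  fun audioDurationMs(filePath: String): Long? = runCatching {
      val fileFormat = AudioSystem.getAudioFileFormat(File(filePath))
      val frames = fileFormat.frameLength          // may be AudioSystem.NOT_SPECIFIED (-1)
      val frameRate = fileFormat.format.frameRate
      if (frames <= 0 || frameRate <= 0f) return@runCatching null
      (frames * 1000.0 / frameRate).toLong()
  }.getOrNull()
  ```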

* **Story Sections and Scene Association:** Includes `StorySection` data class and associates them with scenes using `sceneId`. This allows you to easily map text from your story to the corresponding audio scenes.
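
  To drive playback from the story order rather than the raw scene list, the sections can be resolved against their scenes; a minimal sketch (the `playDramaInStoryOrder` helper is illustrative) built on the data classes above:

  ```kotlin
  // Play scenes in the order the story sections reference them
  suspend fun playDramaInStoryOrder(drama: AudioDrama) {
      val scenesById = drama.scenes.associateBy { it.sceneId }
      for (section in drama.storySections) {
          println("Section ${section.sectionId}: ${section.text}")
          scenesById[section.sceneId]?.let { playScene(it) }
              ?: println("WARNING: no scene found for ${section.sceneId}")
      }
  }
  ```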

* **Non-Blocking Delay:** Uses `delay()` within the coroutine to pause execution without blocking the main thread.

* **Clarity and Comments:**  Added more comments to explain the purpose of each section of the code.  Improved variable names for better readability.

* **File Path Handling:** The code uses `java.io.File` to check if the audio files exist. This helps prevent errors during playback.
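
  A compact way to gather every missing file up front (a sketch against the `AudioDrama` structure above; `missingAudioFiles` is an illustrative helper) could look like this:

  ```kotlin
  // Collect every referenced file path that does not exist on disk
  fun missingAudioFiles(drama: AudioDrama): List<String> =
      drama.scenes
          .flatMap { it.tracks }
          .map { it.filePath }
          .distinct()
          .filterNot { File(it).exists() }
  ```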

* **Example JSON:** Provides a sample JSON string to demonstrate how to define the audio drama structure.  It includes `volume` and `startOffset` parameters for each track.

* **`runBlocking`:** The `runBlocking` function is used to execute the coroutines in a synchronous manner within the `main` function.  This is necessary because the `main` function is not a coroutine.
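
  Equivalently, the entry point itself can be declared as a coroutine builder, which is a common idiom; a sketch (the `drama.json` path is hypothetical, the listing above uses an inline JSON string instead):

  ```kotlin
  fun main() = runBlocking {
      val audioDrama = Json.decodeFromString<AudioDrama>(File("drama.json").readText())
      for (scene in audioDrama.scenes) {
          playScene(scene)
      }
  }
  ```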

**To run this code:**

1.  **Create audio files:** Create `.wav` files named `narration.wav`, `background_music.wav`, `dialogue1.wav`, and `door_slam.wav`, and place them in an `audio` directory relative to your project. *Important:* the Java Sound API works best with `.wav` files; for other formats such as `.mp3` you may need an additional library (e.g., JLayer) and the corresponding dependency.
2.  **Create a Kotlin project:** Create a new Kotlin project in your IDE (IntelliJ IDEA or similar).
3.  **Add dependencies:** Add the `kotlinx.serialization-json` and, if needed, any audio libraries to your project's `build.gradle.kts` file.
4.  **Copy the code:** Copy the Kotlin code into your `src/main/kotlin` directory.
5.  **Run the code:** Run the `main` function.

This improved example provides a more complete and robust solution for converting story text into multi-track audio scenes using Kotlin and coroutines. Remember to adjust the file paths and audio content to match your specific project.  Also, consider using a dedicated audio processing library like Tritonus or Xuggler for more advanced audio features.