Skip to content

Commit

Permalink
Audio Refactor 1st attempt
Browse files Browse the repository at this point in the history
Refactor audio to handle mid frame APU register updates. Note: Currently there
is problems with the web build and this code
  • Loading branch information
skylersaleh committed Jul 3, 2021
1 parent 3f23efd commit 4477ab1
Show file tree
Hide file tree
Showing 4 changed files with 707 additions and 362 deletions.
4 changes: 2 additions & 2 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@ cmake_minimum_required(VERSION 3.11) # FetchContent is available in 3.11+
project(SkyBoy)

if (EMSCRIPTEN)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -s USE_GLFW=3 -s ENVIRONMENT=web -s ASSERTIONS=0 -s WASM=1 -DPLATFORM_WEB --shell-file ${PROJECT_SOURCE_DIR}/src/shell.html")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -s USE_GLFW=3 -s ENVIRONMENT=web -s ASSERTIONS=0 -s WASM=1 -DPLATFORM_WEB --shell-file ${PROJECT_SOURCE_DIR}/src/shell.html")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -s ASYNCIFY -s USE_GLFW=3 -s ENVIRONMENT=web -s ASSERTIONS=0 -s WASM=1 -DPLATFORM_WEB --shell-file ${PROJECT_SOURCE_DIR}/src/shell.html")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -s ASYNCIFY -s USE_GLFW=3 -s ENVIRONMENT=web -s ASSERTIONS=0 -s WASM=1 -DPLATFORM_WEB --shell-file ${PROJECT_SOURCE_DIR}/src/shell.html")
set(CMAKE_EXECUTABLE_SUFFIX ".html") # This line is used to set your executable to build with the emscripten html template so taht you can directly open it.
endif ()

Expand Down
223 changes: 223 additions & 0 deletions docs/SameBoy Audio:Video Sync.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@

Sky — Today at 2:38 AM
@LIJI I was wondering how you are handling time, audio and video synchronization in SameBoy. Specifically in the case where there is a persistent bias in the FPS (ie. when running at an average of 58 FPS instead of 60 FPS). Synchronizing time to the video would cause you to have to stretch out the audio, significantly changing pitch and causing gameplay to not be at its true native speed, synchronizing time to the audio will cause dropped frames on the video.(edited)
[2:40 AM]
I noticed SameBoy seems to handle this case pretty gracefully, but I couldn't easily figure out how it was doing this.

LIJI — Today at 2:40 AM
This is actually done by the frontend and not by the core itself, so it varies between Cocoa, SDL and libretro

Sky — Today at 2:40 AM
I was testing the Cocoa frontend.

LIJI — Today at 2:41 AM
In Cocoa I'm basically asking Metal to render at the exact 59.72 frame rate
[2:41 AM]
Obviously it can't on most displays
[2:41 AM]
but it handles frames duplication for me
[2:41 AM]
Thing is that I never block on flushing a frame
[2:42 AM]
Audio on the other hand is a bit of a mess
[2:44 AM]
And it goes by several techniques, including blocking the audio thread if I need to run a bit more to get enough audio frames, repeating the last sample on edge cases, and dynamically adjusting audio lag. TBH I changed this thing so many times I hardly remember what the current state is ><
[2:44 AM]
But I'll tell you this – debugging the Cocoa audio frontend is a nightmare
[2:44 AM]
Too many threads, too many locks
[2:44 AM]
but It Works™

Sky — Today at 2:44 AM
So, is your scheme basically to sync on the audio and have duplicate frames occasionally?

LIJI — Today at 2:44 AM
Not sure if you could call that syncing to audio
[2:44 AM]
I try to sync to both
[2:45 AM]
But both audio and video sometimes get altered

@LIJI
But I'll tell you this – debugging the Cocoa audio frontend is a nightmare

Sky — Today at 2:45 AM
I am feeling this right now...

LIJI — Today at 2:47 AM
Oh, I'll be more specific

Sky — Today at 2:47 AM
I currently designed my APU to do direct audio synthesis at the hosts audio sample rate because I thought it would be easier that way. But, I am running into a lot of edge cases right now when it comes to getting everything to sync together.

LIJI — Today at 2:47 AM
The core itself does time syncing (except in libretro)
[2:47 AM]
It does its best to match the actual CPU speed
[2:47 AM]
so unless turbo is used, it should be fairly accurate as an overall system
[2:48 AM]
but sometimes the outputs need to be altered slightly in timing due to this
[2:48 AM]
Syncing to the CPU speed it actually important in the Game Boy compared to other systems

Sky — Today at 2:48 AM
I have a couple of PIDs that try to automatically adjust the lag of all the buffers by artificially speeding/slowing simulation speed, and that works generally to ensure that samples get there in time. But now I have fluttering issues on frontends that don't have reliable frame timings (like emscripten).

LIJI — Today at 2:48 AM
Because the LCD can be turned on and off, so the actual FPS can potentially change
[2:49 AM]
Hmm yes, you'd need god timing APIs for this

Sky — Today at 2:51 AM
So, what heuristics(is there even heuristics?) do you use to determine when to stretch the audio or when to duplicate frames?(edited)

LIJI — Today at 2:52 AM
As for video, it easy
[2:52 AM]
I tell Metal to try matching 59.72FPS, which in most cases it'll just round to 60FPS
[2:52 AM]
And I push frames as soon as I get them from the core
[2:53 AM]
(i.e. real time 59.72 FPS)
[2:53 AM]
if I "missed" a 60FPS frame, it'll simply repeat the previous one
[2:54 AM]
As for audio, let me actually see ><
[2:54 AM]
The core itself pushes audio sample by sample
[2:55 AM]
if you have an audio engine that happily accepts audio being pushed sample-by-sample (and actually handles it good!), you're in luck
[2:55 AM]
In Cocoa I push by chunks though
[2:55 AM]
Whenever CoreAudio wants a buffer, I check if I got enough samples since the last request
[2:55 AM]
If so, I push them
[2:56 AM]
Otherwise, I'm blocking the CoreAudio thread until I get enough samples
[2:57 AM]
Let's see what else is in the logic
[2:57 AM]
If my audio queue gets too long I drop earlier frames
[2:57 AM]
to avoid lag
[2:58 AM]
It only happens on exceptional cases where CoreAudio screws up for some reason
[2:59 AM]
Because the Cocoa frontend has 3 "important" threads for emulation, things get a bit messy
[2:59 AM]
One thread is the UI thread, which among others handles the rendering

Sky — Today at 3:00 AM
So, you have a core simulation thread, an audio thread, and a graphics thread? The core simulation thread runs in "real" time, and pushes frames/samples to the audio and graphics threads, which are going to handle matching this back up with the hosts rate (by either duplicating frames, resampling/ dropping chunks of samples)?

LIJI — Today at 3:00 AM
Yes
[3:00 AM]
(I never resample audio though, skipping samples is more than enough considering it's highly exceptional)
[3:01 AM]
But this is due to the emulation core very carefully matching the original CPU clock rate

Sky — Today at 3:03 AM
How do you drive your core simulation thread? Is it just running on a fixed 59.72hz timer, or does it get permission to proceed from either the audio or graphics/ui thread?(edited)

LIJI — Today at 3:03 AM
from the Cocoa frontend:
[3:03 AM]
while (running) {
if (rewind) {
rewind = false;
GB_rewind_pop(&gb);
if (!GB_rewind_pop(&gb)) {
rewind = self.view.isRewinding;
}
}
else {
GB_run(&gb);
}
}
[3:04 AM]
GB_run runs (usually) for one opcode(edited)
[3:04 AM]
It communicates with audio/video using callbacks
[3:05 AM]
(For example, there's a callback for encoding a color into uint32_t, a callback per vblank, and callback for an audio sample)

Sky — Today at 3:06 AM
I was more thinking about how the simulation thread is stalled to keep the 59.72 hz
[3:07 AM]
Do you run for a fixed number of cycles and then sleep the thread?
[3:07 AM]
Or is there something more sophisticated going on?

LIJI — Today at 3:07 AM
Ahhh
[3:07 AM]
Mmm yes
[3:08 AM]
Several key events in the core call GB_timing_sync
[3:08 AM]
which eventually caps the emulaiton to 4MHz (or 8MHz in double speed mode)
[3:08 AM]
The logic in there isn't too complicated
[3:08 AM]
void GB_timing_sync(GB_gameboy_t *gb)
{
if (gb->turbo) {
gb->cycles_since_last_sync = 0;
return;
}
/* Prevent syncing if not enough time has passed.*/
if (gb->cycles_since_last_sync < LCDC_PERIOD / 3) return;

uint64_t target_nanoseconds = gb->cycles_since_last_sync * 1000000000LL / 2 / GB_get_clock_rate(gb); /* / 2 because we use 8MHz units */
int64_t nanoseconds = get_nanoseconds();
int64_t time_to_sleep = target_nanoseconds + gb->last_sync - nanoseconds;
if (time_to_sleep > 0 && time_to_sleep < LCDC_PERIOD * 1200000000LL / GB_get_clock_rate(gb)) { // +20% to be more forgiving
nsleep(time_to_sleep);
gb->last_sync += target_nanoseconds;
}
else {
if (time_to_sleep < 0 && -time_to_sleep < LCDC_PERIOD * 1200000000LL / GB_get_clock_rate(gb)) {
// We're running a bit too slow, but the difference is small enough,
// just skip this sync and let it even out
return;
}
gb->last_sync = nanoseconds;
}

gb->cycles_since_last_sync = 0;
if (gb->update_input_hint_callback) {
gb->update_input_hint_callback(gb);
}
}
[3:09 AM]
In a nutshell, it measures how many nanoseconds passed since the last sync
[3:10 AM]
and how many cycles did it run
[3:10 AM]
and sleeps those extra nanoseconds

Sky — Today at 3:11 AM
Thanks!(edited)
[3:11 AM]
I think I've got a good idea of how you have this setup now.

LIJI — Today at 3:11 AM

[3:15 AM]
Places where I call timing_sync:
1. Vblanks
2. reads of JOYP (minimizes input lag)
3. Turning the LCD on or off
4. Every cycle of stop mode (since it reads JOYP, but keep in mind there's a LCDC_PERIOD / 3 cap inside timing_sync itself
5. Every cycle where the CPU is halted or has IME on, and the JOYP interrupt is enabled (for the same reason)
[3:16 AM]
2, 4 and 5 help minimizing input lag and enable sub-frame inputs
[3:17 AM]
Unfortunately, on the libretro core, since timing_sync isn't used, and libretro is constant-FPS driven, you don't enjoy these benefits (and correct timing emulation when turning the LCD on and off)

Sky — Today at 3:23 AM
That is a neat trick to sync on JOYP reads to reduce input lag
Loading

0 comments on commit 4477ab1

Please sign in to comment.