This is actually an issue with shairport-sync, as it does not fully comply with the AirPlay protocol.
You'll notice that when playing to AirPlay devices using audio|acacia, playback starts almost instantly (well, after 420ms to be exact) while with iTunes or iOS playback starts after 2000ms.
The developers of shairport-sync have hardcoded playback to start 88,200 samples (2000ms) after receiving the initial Play command, rather than deriving the playback start time from the first sync packet as native AirPlay devices do.
As documented here, to fix this issue you should run shairport-sync with the following parameter:
--latency=18522
18,522 samples = 420ms
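If you want to check the arithmetic or work out a different value, here is a minimal sketch of the conversion. It assumes a 44,100 Hz sample rate, which is what the 88,200 samples = 2000ms figure above implies; the function names are just illustrative and are not part of shairport-sync or audio|acacia.

```python
# Assumption: 44,100 samples per second, as implied by 88,200 samples = 2000 ms.
SAMPLE_RATE_HZ = 44_100


def ms_to_samples(ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a latency in milliseconds to a whole number of samples."""
    return round(ms * sample_rate_hz / 1000)


def samples_to_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a latency in samples back to milliseconds."""
    return samples * 1000 / sample_rate_hz


if __name__ == "__main__":
    print(ms_to_samples(420))    # value used for the latency parameter above
    print(ms_to_samples(2000))   # the hardcoded shairport-sync default
    print(samples_to_ms(18522))  # back to milliseconds
```

So 420ms comes out as 18,522 samples and 2000ms as 88,200 samples, matching the numbers above.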
This should work in all scenarios except when adding shairport-sync devices to a session on the fly (i.e. while the session is already playing). That case will be resolved once shairport-sync correctly follows the protocol, or when audio|acacia adds its "fast start" feature for devices joining on the fly.