Confusion converting updated SPS and PPS H.264 NALUs into AVCDecoderConfigurationRecords
I have code that mostly works: it allows a WebRTC connection to publish video and allows viewers to watch the feed via RTMP.
When the WebRTC setup starts sending H.264 packets, I wait until I receive at least one SPS and one PPS NALU, convert those into an AVCDecoderConfigurationRecord, and then forward the H.264 data to RTMP clients as it comes in (converting it to AVC packet types).
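For reference, the record layout comes from ISO/IEC 14496-15 and is straightforward to serialize. This is a minimal sketch (function name is illustrative; it assumes the SPS/PPS NALUs are passed as raw bytes without Annex B start codes or length prefixes):

```python
import struct

def build_avc_decoder_config(sps_list, pps_list, nalu_len_size=4):
    """Serialize an AVCDecoderConfigurationRecord (ISO/IEC 14496-15)
    from raw SPS and PPS NALUs."""
    sps0 = sps_list[0]
    out = bytearray()
    out.append(1)                              # configurationVersion
    out.append(sps0[1])                        # AVCProfileIndication (from SPS)
    out.append(sps0[2])                        # profile_compatibility
    out.append(sps0[3])                        # AVCLevelIndication
    out.append(0xFC | (nalu_len_size - 1))     # reserved + lengthSizeMinusOne
    out.append(0xE0 | (len(sps_list) & 0x1F))  # reserved + numOfSequenceParameterSets (5 bits)
    for sps in sps_list:
        out += struct.pack(">H", len(sps)) + sps   # 16-bit length + NALU
    out.append(len(pps_list) & 0xFF)           # numOfPictureParameterSets (8 bits)
    for pps in pps_list:
        out += struct.pack(">H", len(pps)) + pps
    return bytes(out)
```

Note that the SPS count field is only 5 bits (max 31) while the PPS count gets a full byte (max 255), which matters for the overflow question below.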
When an RTMP client joins, I send them the cached sequence header/avc configuration record and send them video packets as I receive them. This works, and if an RTMP client is connected prior to the publisher sending video everything works fine and ffmpeg reports no errors.
The problem with this setup is that anyone who joins mid-stream has to wait for an IDR frame before they can start seeing video. That's a problem because web browsers encoding webcam video seem to send IDRs very infrequently (I've heard as rarely as every 3,000 frames).
In most RTMP scenarios the suggestion is to set the encoder's keyframe interval to around 2 seconds to get around this issue. To accomplish the same thing with my media server, I've set a timer that sends a Picture Loss Indication (PLI) message to the publisher every 3 seconds (I don't use FIR, as most browsers don't seem to support it).
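For context, a PLI is a fixed 12-byte RTCP payload-specific feedback packet (RFC 4585 §6.3.1: PT=206, FMT=1). A minimal sketch of constructing one (function name and SSRC values are illustrative):

```python
import struct

def build_pli(sender_ssrc, media_ssrc):
    """Build an RTCP Picture Loss Indication packet (RFC 4585 6.3.1).
    First byte: version=2, padding=0, FMT=1 -> 0x81.
    PT=206 (payload-specific feedback); length=2 (32-bit words minus one)."""
    header = struct.pack(">BBH", 0x81, 206, 2)
    return header + struct.pack(">II", sender_ssrc, media_ssrc)
```

Sending this periodically effectively asks the browser's encoder for a new keyframe, which is why it works as a stand-in for a keyframe interval.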
This works: an RTMP client now waits at most 3 seconds before it can start viewing the live stream. Video comes through clearly, and all seemed good.
However, I noticed that the PLI request triggers Firefox/my webcam to send down a new SPS and PPS. Sometimes the SPS even changes (which, from my reading, I didn't think was valid, since an SPS shouldn't change mid-stream), but every 3 seconds a brand-new PPS comes down the pipe.
While video still displays properly via ffplay, ffplay spams `non-existing PPS X referenced` messages, and every 3 seconds X increments by 1.
Since the AVCDecoderConfigurationRecord supports multiple SPS and PPS records, I updated my code to cache them: any time an SPS or PPS NALU is received, I add it to a collection of cached parameter sets, then create a new AVCDecoderConfigurationRecord containing every SPS and PPS seen so far.
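The caching scheme described above looks roughly like this (a sketch, not my exact code; class and method names are illustrative, and NALUs are assumed to arrive without start codes):

```python
class ParameterSetCache:
    """Accumulate distinct SPS/PPS NALUs so the
    AVCDecoderConfigurationRecord can be rebuilt when a new one arrives."""

    def __init__(self):
        self.sps = []  # ordered, de-duplicated raw SPS NALUs
        self.pps = []  # ordered, de-duplicated raw PPS NALUs

    def add_nalu(self, nalu):
        """Return True if the NALU is a new parameter set,
        i.e. the configuration record needs rebuilding."""
        nalu_type = nalu[0] & 0x1F  # 7 = SPS, 8 = PPS
        if nalu_type == 7 and nalu not in self.sps:
            self.sps.append(nalu)
            return True
        if nalu_type == 8 and nalu not in self.pps:
            self.pps.append(nalu)
            return True
        return False
```

De-duplicating by raw bytes keeps identical retransmitted parameter sets from inflating the record, but genuinely changed PPS payloads still accumulate.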
This seems to function correctly, with video still showing in ffplay, but ffplay still spams the output with the non-existing PPS messages.
So my questions are:
- Why is ffplay complaining about a non-existing PPS when I can verify I'm sending an AVCDecoderConfigurationRecord containing every PPS I've received (and this is localhost, so no packet loss)?
- What is the proper way to handle updated SPS and PPS records? Should I cache every one I see and re-send an AVCDecoderConfigurationRecord with all of them? Do decoders ignore AVCDecoderConfigurationRecords that arrive after the first?
- What happens after 93 seconds, when the 5-bit numOfSequenceParameterSets field overflows at 31 cached SPS records, and likewise after I've received 255 PPS records and exhausted the 8-bit numOfPictureParameterSets field?
- Why does my video still display correctly despite the missing PPS spam?
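To check what the `X` in ffplay's message actually refers to, note that `pic_parameter_set_id` is the first Exp-Golomb field of the PPS RBSP, so it can be read with a few lines of bit twiddling. A sketch (names illustrative; it ignores emulation-prevention bytes, which don't affect the first few bits):

```python
class BitReader:
    """Minimal MSB-first bit reader for NALU payloads."""

    def __init__(self, data):
        self.data, self.pos = data, 0

    def bit(self):
        b = (self.data[self.pos >> 3] >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return b

    def ue(self):
        """Unsigned Exp-Golomb: count leading zero bits,
        then read that many more bits and subtract one."""
        zeros = 0
        while self.bit() == 0:
            zeros += 1
        val = 1
        for _ in range(zeros):
            val = (val << 1) | self.bit()
        return val - 1

def pps_id(pps_nalu):
    """Return pic_parameter_set_id, the first ue(v) field after
    the one-byte NAL header of a PPS NALU."""
    return BitReader(pps_nalu[1:]).ue()
```

If each PLI-triggered PPS carries a new id, that would match the incrementing X in ffplay's log: the slices reference the newest id, and a decoder that only consumed the first configuration record has never seen it.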
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow