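I have been studying ffmpeg recently and found that source-code analyses of its demuxing are scarce and incomplete online, so here is my own summary of how ffmpeg demuxes the mov/mp4 format. It is mainly about mov.c: how the AVStream and AVPacket structures of an MP4 stream get their values, and how fields such as pts and dts are obtained during demuxing, which helps with secondary development on top of ffmpeg. The MP4 format itself is described in detail elsewhere online, so I will not repeat it here and will go straight to the annotated code.
I have tried to annotate every function and data structure in detail. It took a while to study, and writing it all up was tedious and tiring. If it helps you, give me a follow, hahaha.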
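Key tips:
1. Many ffmpeg structs (AVStream, URLContext, AVFormatContext) carry a void *priv_data member. It points to a module-specific struct: priv_data in AVStream holds the mov demuxer's MOVStreamContext, and priv_data in URLContext holds the file protocol's FileContext. This separates the protocol- or format-specific functions and data from the generic interface, which keeps the whole library extensible. That is why nearly every format or protocol function begins by assigning the generic priv_data to a pointer of its own struct type. In mov_read_stsd, for example, MOVStreamContext *sc = st->priv_data; is a convenience: sc is a typed local alias, so the rest of the function does not depend on the name priv_data, and a rename on the outside would barely affect the internal code. ffmpeg uses this pattern mainly at module boundaries, especially for external protocols and formats such as rtmp streaming, file, and mov.
2. Context naming, as in URLContext, FileContext and AVFormatContext: my personal understanding is that a context bundles the data and the methods (interfaces) needed to do one job. Through URLContext, for instance, you reach the file protocol's open, close and read callbacks and, via priv_data, the FileContext that holds the state of the opened file. Storage is layered level by level so that the code is easier to extend, which matters for a library written by many people. I am not sure I have explained this clearly, hahaha.
3. Names ending in Internal, such as AVStreamInternal, generally hold library-internal data that is passed between functions and is not meant for callers of the public API.
4. Multimedia files are byte streams, so the readers come in both byte orders: avio_rb32 (and the AV_RB32 macro) reads 4 bytes as big-endian, while avio_rl32 (and the AV_RL32 macro) reads them as little-endian.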
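The two byte orders from tip 4 can be sketched in plain C. The helper names read_be32 and read_le32 are hypothetical stand-ins for what the rb32 and rl32 readers do; they are not FFmpeg functions.

```c
#include <stdint.h>

/* Hypothetical helper: interpret 4 bytes as a big-endian 32-bit value
 * (most significant byte first), the order used for MP4 sizes and counts. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: interpret the same 4 bytes as little-endian
 * (least significant byte first), the order mov.c uses for fourcc tags. */
static uint32_t read_le32(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

Reading the bytes 00 00 00 18 gives 24 with read_be32 but 0x18000000 with read_le32, which is why picking the wrong reader for a size field produces absurd values.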
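This article analyses the functions mov_read_stsd, mov_read_stts, mov_read_stss, mov_read_ctts, mov_read_stsc, mov_read_stsz and mov_read_stco, which belong to the trak box (atom), the most important part of the MP4 format. From the information in the stsd, stts, stss, ctts, stsc, stsz and stco boxes the demuxer derives the information about every sample (one frame of audio or video). With that information it can locate the encoded audio/video data inside the file, read it through AVIOContext, and store it in an AVPacket.
In the code, an atom is what the MP4 specification calls a box, and the sample that keeps appearing in both the code and the specification is one frame of audio or video. Keep these two terms in mind.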
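// The mov stream context holds the per-sample information (size, index, key frame, etc.) of an audio or video track.
// Audio and video tracks use only a handful of atoms. I focus on video here, so every field comment below comes from
// analysing a video track; the fields left uncommented are rarely used for MP4.
// The struct likes to represent sequences as pointers to arrays, e.g. int *keyframes is the key-frame sequence.
// Entry is a concept from the MP4 format: think of it as a record, in the same way that MP4 uses the box (atom) as its unit of storage.
typedef struct MOVStreamContext {
AVIOContext *pb;
int pb_is_copied;
int ffindex; ///< AVStream index (usually 0 or 1, i.e. the audio or the video stream)
int next_chunk;
unsigned int chunk_count; // total number of chunks (each chunk holds one or more samples)
int64_t *chunk_offsets; // stco: absolute offset of each chunk from the start of the file, so a chunk can be found without any other parameter
unsigned int stts_count; // number of stts entries (sample dts information)
MOVStts *stts_data; // stts entries (sample dts information)
// typedef struct MOVStts {
// unsigned int count; // number of consecutive samples with the same duration
// int duration; // dts delta between consecutive samples
// } MOVStts;
unsigned int ctts_count; // number of ctts entries (offset between dts and pts)
// Right after the ctts atom is read, ctts_count is the number of ctts entries.
// After mov_build_index, ctts_data is rebuilt with one element per sample (one ctts entry can cover
// several samples, so the entry count is smaller than the sample count) and ctts_count becomes the sample count.
unsigned int ctts_allocated_size; // allocated size of ctts_data
MOVStts *ctts_data; // ctts entries (offset between dts and pts)
unsigned int stsc_count; // number of stsc entries (note: not the number of chunks; an stsc entry only stores a chunk number)
MOVStsc *stsc_data; // stsc entries (how many samples each chunk holds)
// typedef struct MOVStsc {
// int first; // number of the first chunk this entry applies to (1-based)
// int count; // number of samples in each of those chunks
// int id; // sample description index, usually 1
// } MOVStsc;
unsigned int stsc_index; // index into the stsc_data array
int stsc_sample;
unsigned int stps_count;
unsigned *stps_data; ///< partial sync sample for mpeg-2 open gop
MOVElst *elst_data; // edit list data; determines the dts of the first sample
// typedef struct MOVElst {
// int64_t duration; // segment duration of the edit
// int64_t time; // media time at which the edit starts (its negative gives the first dts value)
// float rate; // media rate (playback speed), usually 1
// } MOVElst;
unsigned int elst_count; // number of elst entries (usually 1)
int ctts_index; // index into the ctts_data array
int ctts_sample;
unsigned int sample_size; ///< common size if all samples have the same size, otherwise 0; may contain value calculated from stsd or value from stsz atom
unsigned int stsz_sample_size; ///< common size if all samples have the same size, otherwise 0; always contains sample size from stsz atom
unsigned int sample_count; // total number of samples (frames)
int *sample_sizes; // size of each sample
int keyframe_absent; // set when the stss atom has zero entries (no key frames listed)
unsigned int keyframe_count; // number of key frames
int *keyframes; // key-frame array (sample numbers of the key frames, stored as an int array)
int time_scale; // time scale from the mdhd box
int64_t time_offset; // start offset taken from the edit list (its negative gives the first dts value)
int64_t min_corrected_pts; ///< minimum Composition time shown by the edits excluding empty edits.
int current_sample; // number of the current sample
int64_t current_index; // current position in the index
MOVIndexRange* index_ranges;
MOVIndexRange* current_index_range;
unsigned int bytes_per_frame; // audio-only data (usually unused for AAC)
unsigned int samples_per_frame; // audio-only data (usually unused for AAC)
int dv_audio_container;
int pseudo_stream_id; ///< index of the stsd entry in use (usually 0, since there is usually one entry); -1 means demux all ids
int16_t audio_cid; ///< stsd audio compression id
unsigned drefs_count;
MOVDref *drefs; // data references from the dref atom (the media data may live outside this file)
int dref_id;
int timecode_track;
int width; ///< tkhd width
int height; ///< tkhd height
int dts_shift; ///< usually 0; dts shift when ctts is negative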
uint32_t palette[256];
int has_palette;
int64_t data_size; // sum of all sample sizes
uint32_t tmcd_flags; ///< tmcd track flags
int64_t track_end; ///< end position of the track (the total duration); used for dts generation in fragmented movie files
int start_pad; ///< amount of samples to skip due to enc-dec delay
unsigned int rap_group_count;
MOVSbgp *rap_group;
int nb_frames_for_fps; // total number of frames
int64_t duration_for_fps; // sum of the durations of all frames
/** extradata array (and size) for multiple stsd */
uint8_t **extradata;
int *extradata_size;
int last_stsd_index;
int stsd_count; // number of stsd entries
int stsd_version; // stsd version
int32_t *display_matrix; // display (rotation) matrix of the video
AVStereo3D *stereo3d;
AVSphericalMapping *spherical;
size_t spherical_size;
AVMasteringDisplayMetadata *mastering;
AVContentLightMetadata *coll;
size_t coll_size;
uint32_t format; // codec format (fourcc)
int has_sidx; // If there is an sidx entry for this stream.
struct {
struct AVAESCTR* aes_ctr;
unsigned int per_sample_iv_size; // Either 0, 8, or 16.
AVEncryptionInfo *default_encrypted_sample;
MOVEncryptionIndex *encryption_index;
} cenc;
} MOVStreamContext;
Every one of these functions starts with:
AVStream *st;
MOVStreamContext *sc;
int ret, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams - 1];
sc = st->priv_data;
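The benefit of writing it this way is extensibility: every such function takes the same parameters (MOVContext *c, AVIOContext *pb, MOVAtom atom). As mentioned above, the priv_data member of the outer AVStream is assigned to a pointer to the mov module's own MOVStreamContext, which is convenient to write and to extend. In the same way, the outermost AVFormatContext is stored inside MOVContext (as c->fc) and passed in with it. Even if something on the outside changes, such as a name, the internal functions are barely affected. Most ffmpeg module interfaces use this pattern, especially those for external protocols and formats such as rtmp streaming, file, and mov.
Most of the sample-table atoms share the same layout: version + flags + entry count, followed by the entries (stsz is a slight exception, with a sample_size field before the count). About the entry concept: as I understand it, an entry is a kind of record, a bit like the atom or box concept, that stores the actual metadata. Each atom has one or more entries, and the array index of the MOVStsc and MOVStts arrays and of the stsz size array is the entry number.
stsd atom: reading the sample description (codec metadata) of the track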
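The priv_data idiom can be sketched with two simplified structs. DemoStream and DemoMovStreamContext are hypothetical stand-ins, far smaller than the real AVStream and MOVStreamContext, and the three functions are illustrative only.

```c
#include <stdlib.h>

/* Generic layer: knows nothing about mov, only carries an opaque pointer. */
typedef struct DemoStream {
    int   index;
    void *priv_data;   /* owned by the format-specific module */
} DemoStream;

/* Format-specific layer: the data the mov-like module needs per stream. */
typedef struct DemoMovStreamContext {
    unsigned int sample_count;
    int          time_scale;
} DemoMovStreamContext;

/* Allocate the private context and hang it on the generic stream. */
static int demo_mov_attach(DemoStream *st)
{
    DemoMovStreamContext *sc = calloc(1, sizeof(*sc));
    if (!sc)
        return -1;
    st->priv_data = sc;
    return 0;
}

/* Same opening idiom as the mov.c readers: recover the typed pointer first. */
static void demo_mov_set_sample_count(DemoStream *st, unsigned int count)
{
    DemoMovStreamContext *sc = st->priv_data;
    sc->sample_count = count;
}

/* Release the private context again. */
static void demo_mov_detach(DemoStream *st)
{
    free(st->priv_data);
    st->priv_data = NULL;
}
```

The generic struct never needs to change when the private struct gains a field, which is the extensibility argument made above.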
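The common version + flags + entry count prefix can be parsed from a byte buffer as follows. This is a minimal sketch: DemoFullBoxHeader and demo_parse_header are hypothetical names, and the buffer is assumed to start right after the atom's size and type fields.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoFullBoxHeader {
    uint8_t  version;      /* 1 byte */
    uint32_t flags;        /* 3 bytes, big-endian */
    uint32_t entry_count;  /* 4 bytes, big-endian */
} DemoFullBoxHeader;

/* Returns the number of bytes consumed (8), or 0 if the buffer is too short. */
static size_t demo_parse_header(const uint8_t *buf, size_t len,
                                DemoFullBoxHeader *h)
{
    if (len < 8)
        return 0;
    h->version     = buf[0];
    h->flags       = ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) |
                      (uint32_t)buf[3];
    h->entry_count = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                     ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    return 8;
}
```

This mirrors the avio_r8, avio_rb24, avio_rb32 sequence that opens each reader in this article.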
//stsd atom
static int mov_read_stsd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
int ret, entries;
// Fetch the current stream and its mov-specific context; convenient to write and to extend
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams - 1];
sc = st->priv_data;
sc->stsd_version = avio_r8(pb); // version
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count (usually 1: audio, video or subtitle)
/* Each entry contains a size (4 bytes) and format (4 bytes). */
if (entries <= 0 || entries > atom.size / 8) {
av_log(c->fc, AV_LOG_ERROR, "invalid STSD entries %d\n", entries);
return AVERROR_INVALIDDATA;
}
if (sc->extradata) {
av_log(c->fc, AV_LOG_ERROR,
"Duplicate stsd found in this track.\n");
return AVERROR_INVALIDDATA;
}
/* Prepare space for hosting multiple extradata. */
sc->extradata = av_mallocz_array(entries, sizeof(*sc->extradata));
if (!sc->extradata)
return AVERROR(ENOMEM);
sc->extradata_size = av_mallocz_array(entries, sizeof(*sc->extradata_size));
if (!sc->extradata_size) {
ret = AVERROR(ENOMEM);
goto fail;
}
// Parse the stsd entries
ret = ff_mov_read_stsd_entries(c, pb, entries);
if (ret < 0)
goto fail;
/* Restore back the primary extradata. */
av_freep(&st->codecpar->extradata);
st->codecpar->extradata_size = sc->extradata_size[0];
if (sc->extradata_size[0]) {
st->codecpar->extradata = av_mallocz(sc->extradata_size[0] + AV_INPUT_BUFFER_PADDING_SIZE);
if (!st->codecpar->extradata)
return AVERROR(ENOMEM);
memcpy(st->codecpar->extradata, sc->extradata[0], sc->extradata_size[0]);
}
return mov_finalize_stsd_codec(c, pb, st, sc); // final codec fix-ups, mostly relevant to audio
fail:
...
return ret;
}
// Parse each stsd entry (one sample description per entry)
int ff_mov_read_stsd_entries(MOVContext *c, AVIOContext *pb, int entries)
{
AVStream *st;
MOVStreamContext *sc;
int pseudo_stream_id;
av_assert0 (c->fc->nb_streams >= 1);
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
for (pseudo_stream_id = 0;
pseudo_stream_id < entries && !pb->eof_reached;
pseudo_stream_id++) {
//Parsing Sample description table
enum AVCodecID id;
int ret, dref_id = 1;
MOVAtom a = { AV_RL32("stsd") };
int64_t start_pos = avio_tell(pb);
int64_t size = avio_rb32(pb); /* size of this entry */
uint32_t format = avio_rl32(pb); /* data format fourcc, e.g. mp4a (AAC) or avc1 (H.264) */
if (size >= 16) {
avio_rb32(pb); /* reserved */
avio_rb16(pb); /* reserved */
dref_id = avio_rb16(pb);
} else if (size <= 7) {
av_log(c->fc, AV_LOG_ERROR,
"invalid size %"PRId64" in stsd\n", size);
return AVERROR_INVALIDDATA;
}
if (mov_skip_multiple_stsd(c, pb, st->codecpar->codec_tag, format,
size - (avio_tell(pb) - start_pos))) {
sc->stsd_count++;
continue;
}
// sc->pseudo_stream_id: index of the stsd entry in use (there is usually only one entry); -1 means demux all ids
sc->pseudo_stream_id = st->codecpar->codec_tag ? -1 : pseudo_stream_id;
sc->dref_id= dref_id;
sc->format = format;
// Map the format fourcc to a codec id (this id later selects the matching decoder)
id = mov_codec_id(st, format);
av_log(c->fc, AV_LOG_TRACE,
"size=%"PRId64" 4CC=%s codec_type=%d\n", size,
av_fourcc2str(format), st->codecpar->codec_type);
// Store the codec id (mov_codec_id() may also have set codecpar->codec_type)
st->codecpar->codec_id = id;
// video
if (st->codecpar->codec_type==AVMEDIA_TYPE_VIDEO) {
mov_parse_stsd_video(c, pb, st, sc);
}
// audio
else if (st->codecpar->codec_type==AVMEDIA_TYPE_AUDIO) {
mov_parse_stsd_audio(c, pb, st, sc);
if (st->codecpar->sample_rate < 0) {
av_log(c->fc, AV_LOG_ERROR, "Invalid sample rate %d\n", st->codecpar->sample_rate);
return AVERROR_INVALIDDATA;
}
}
// subtitle
else if (st->codecpar->codec_type==AVMEDIA_TYPE_SUBTITLE){
mov_parse_stsd_subtitle(c, pb, st, sc,
size - (avio_tell(pb) - start_pos));
} else {
ret = mov_parse_stsd_data(c, pb, st, sc,
size - (avio_tell(pb) - start_pos));
if (ret < 0)
return ret;
}
/* this will read extra atoms at the end (wave, alac, damr, avcC, hvcC, SMI ...) */
a.size = size - (avio_tell(pb) - start_pos);
if (a.size > 8) {
if ((ret = mov_read_default(c, pb, a)) < 0) // the stsd entry still has data left, so keep parsing; these should be extension atoms
return ret;
} else if (a.size > 0)
avio_skip(pb, a.size);
if (sc->extradata && st->codecpar->extradata) {
int extra_size = st->codecpar->extradata_size;
/* Move the current stream extradata to the stream context one. */
sc->extradata_size[pseudo_stream_id] = extra_size;
sc->extradata[pseudo_stream_id] = st->codecpar->extradata;
st->codecpar->extradata = NULL;
st->codecpar->extradata_size = 0;
}
sc->stsd_count++;
}
if (pb->eof_reached) { // hit end of file before all entries were read
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSD atom\n");
return AVERROR_EOF;
}
return 0;
}
// Parse a video entry; the values read are written straight into the outer AVStream
static void mov_parse_stsd_video(MOVContext *c, AVIOContext *pb, AVStream *st, MOVStreamContext *sc)
{
uint8_t codec_name[32] = { 0 };
int64_t stsd_start;
unsigned int len;
/* The first 16 bytes of the video sample description are already
* read in ff_mov_read_stsd_entries() */
stsd_start = avio_tell(pb) - 16;
avio_rb16(pb); /* version */
avio_rb16(pb); /* revision level */
avio_rb32(pb); /* vendor */
avio_rb32(pb); /* temporal quality */
avio_rb32(pb); /* spatial quality */
st->codecpar->width = avio_rb16(pb); /* width */
st->codecpar->height = avio_rb16(pb); /* height */
avio_rb32(pb); /* horiz resolution */
avio_rb32(pb); /* vert resolution */
avio_rb32(pb); /* data size, always 0 */
avio_rb16(pb); /* frames per samples */
len = avio_r8(pb); /* codec name, pascal string */
if (len > 31)
len = 31;
mov_read_mac_string(c, pb, len, codec_name, sizeof(codec_name));
if (len < 31)
avio_skip(pb, 31 - len);
if (codec_name[0])
av_dict_set(&st->metadata, "encoder", codec_name, 0);
/* codec_tag YV12 triggers an UV swap in rawdec.c */
if (!strncmp(codec_name, "Planar Y'CbCr 8-bit 4:2:0", 25)) {
st->codecpar->codec_tag = MKTAG('I', '4', '2', '0');
st->codecpar->width &= ~1;
st->codecpar->height &= ~1;
}
/* Flash Media Server uses tag H.263 with Sorenson Spark */
if (st->codecpar->codec_tag == MKTAG('H','2','6','3') &&
!strncmp(codec_name, "Sorenson H263", 13))
st->codecpar->codec_id = AV_CODEC_ID_FLV1;
st->codecpar->bits_per_coded_sample = avio_rb16(pb); /* depth */
avio_seek(pb, stsd_start, SEEK_SET);
// QuickTime formats may need a palette to be set up here; H.264 does not
if (ff_get_qtpalette(st->codecpar->codec_id, pb, sc->palette)) {
st->codecpar->bits_per_coded_sample &= 0x1F;
sc->has_palette = 1;
}
}
// Parse an audio entry; the values read are written straight into the outer AVStream
static void mov_parse_stsd_audio(MOVContext *c, AVIOContext *pb, AVStream *st, MOVStreamContext *sc)
{
int bits_per_sample, flags;
uint16_t version = avio_rb16(pb);
AVDictionaryEntry *compatible_brands = av_dict_get(c->fc->metadata, "compatible_brands", NULL, AV_DICT_MATCH_CASE);
avio_rb16(pb); /* revision level */
avio_rb32(pb); /* vendor */
st->codecpar->channels = avio_rb16(pb); /* channel count */
st->codecpar->bits_per_coded_sample = avio_rb16(pb); /* sample size */
av_log(c->fc, AV_LOG_TRACE, "audio channels %d\n", st->codecpar->channels);
sc->audio_cid = avio_rb16(pb);
avio_rb16(pb); /* packet size = 0 */
st->codecpar->sample_rate = ((avio_rb32(pb) >> 16));
// Read the QuickTime-style fields (sound sample description versions 1 and 2)
if (!c->isom ||
(compatible_brands && strstr(compatible_brands->value, "qt ")) ||
(sc->stsd_version == 0 && version > 0)) {
if (version == 1) {
sc->samples_per_frame = avio_rb32(pb);
avio_rb32(pb); /* bytes per packet */
sc->bytes_per_frame = avio_rb32(pb);
avio_rb32(pb); /* bytes per sample */
} else if (version == 2) {
avio_rb32(pb); /* sizeof struct only */
st->codecpar->sample_rate = av_int2double(avio_rb64(pb));
st->codecpar->channels = avio_rb32(pb);
avio_rb32(pb); /* always 0x7F000000 */
st->codecpar->bits_per_coded_sample = avio_rb32(pb);
flags = avio_rb32(pb); /* lpcm format specific flag */
sc->bytes_per_frame = avio_rb32(pb);
sc->samples_per_frame = avio_rb32(pb);
if (st->codecpar->codec_tag == MKTAG('l','p','c','m'))
st->codecpar->codec_id =
ff_mov_get_lpcm_codec_id(st->codecpar->bits_per_coded_sample,
flags);
}
if (version == 0 || (version == 1 && sc->audio_cid != -2)) {
/* can't correctly handle variable sized packet as audio unit */
switch (st->codecpar->codec_id) {
case AV_CODEC_ID_MP2:
case AV_CODEC_ID_MP3:
st->need_parsing = AVSTREAM_PARSE_FULL;
break;
}
}
}
...
switch (st->codecpar->codec_id) {
... /* the code here is rarely needed */
default:
break;
}
bits_per_sample = av_get_bits_per_sample(st->codecpar->codec_id);
if (bits_per_sample) {
st->codecpar->bits_per_coded_sample = bits_per_sample;
sc->sample_size = (bits_per_sample >> 3) * st->codecpar->channels;
}
}
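stts atom: reading the dts information of the samples. Tip: one important point here. The dts of each sample derived from stts, plus ctts_data[i].duration, gives the pts, i.e. dts + duration (the offset) = pts. (This is computed in mov_read_packet and then assigned directly to the AVPacket.) If there is no ctts, then dts == pts.
typedef struct MOVStts {
unsigned int count; // number of consecutive samples with the same duration
int duration; // dts delta between consecutive samples
} MOVStts;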
static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, alloc_size = 0;
int64_t duration = 0; // total duration
int64_t total_sample_count = 0; // total number of samples (frames)
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count (usually 1, e.g. for a constant frame rate)
av_log(c->fc, AV_LOG_TRACE, "track[%u].stts.entries = %u\n",
c->fc->nb_streams-1, entries);
if (sc->stts_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STTS atom\n");
av_freep(&sc->stts_data);
sc->stts_count = 0; // entry count, reset before reading
if (entries >= INT_MAX / sizeof(*sc->stts_data))
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
int sample_duration;
unsigned int sample_count;
unsigned int min_entries = FFMIN(FFMAX(i + 1, 1024 * 1024), entries);
// allocate (or grow) the array
MOVStts *stts_data = av_fast_realloc(sc->stts_data, &alloc_size,
min_entries * sizeof(*sc->stts_data));
if (!stts_data) {
av_freep(&sc->stts_data);
sc->stts_count = 0;
return AVERROR(ENOMEM);
}
sc->stts_count = min_entries;
sc->stts_data = stts_data;
sample_count = avio_rb32(pb);
sample_duration = avio_rb32(pb);
// i is the entry number
sc->stts_data[i].count= sample_count; // number of consecutive samples with the same duration
sc->stts_data[i].duration= sample_duration; // dts delta between consecutive samples
av_log(c->fc, AV_LOG_TRACE, "sample_count=%d, sample_duration=%d\n",
sample_count, sample_duration);
duration+=(int64_t)sample_duration*(uint64_t)sample_count; // total duration
total_sample_count+=sample_count; // total number of samples
}
sc->stts_count = i; // number of stts entries actually read
if (duration > 0 &&
duration <= INT64_MAX - sc->duration_for_fps &&
total_sample_count <= INT_MAX - sc->nb_frames_for_fps) {
sc->duration_for_fps += duration;
sc->nb_frames_for_fps += total_sample_count;
}
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STTS atom\n");
return AVERROR_EOF;
}
st->nb_frames= total_sample_count;
if (duration)
st->duration= FFMIN(st->duration, duration); // total duration
sc->track_end = duration; // end position of the track (the total duration)
return 0;
}
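How the run-length stts entries turn into one dts per sample can be sketched like this. DemoStts mirrors the two fields of MOVStts, while demo_stts_to_dts and its first_dts parameter are hypothetical; this is not the code mov.c itself uses.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoStts {
    unsigned int count;    /* consecutive samples sharing one duration */
    int          duration; /* dts delta between those samples */
} DemoStts;

/* Expand the entries into dts[0..max_samples); returns the samples written. */
static size_t demo_stts_to_dts(const DemoStts *stts, size_t nb_entries,
                               int64_t first_dts,
                               int64_t *dts, size_t max_samples)
{
    size_t  n   = 0;
    int64_t cur = first_dts;

    for (size_t i = 0; i < nb_entries; i++) {
        for (unsigned int j = 0; j < stts[i].count; j++) {
            if (n == max_samples)
                return n;
            dts[n++] = cur;          /* this sample starts at cur */
            cur += stts[i].duration; /* the next one starts one delta later */
        }
    }
    return n;
}
```

With the entries {3, 1000} and {2, 500} and a first dts of 0, the five samples get the dts values 0, 1000, 2000, 3000 and 3500.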
stss atom: reading the key-frame (sync sample) numbers
static int mov_read_stss(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count (the number of key frames)
av_log(c->fc, AV_LOG_TRACE, "keyframe_count = %u\n", entries);
if (!entries) {
sc->keyframe_absent = 1;
if (!st->need_parsing && st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
st->need_parsing = AVSTREAM_PARSE_HEADERS;
return 0;
}
if (sc->keyframes)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSS atom\n");
if (entries >= UINT_MAX / sizeof(int))
return AVERROR_INVALIDDATA;
av_freep(&sc->keyframes);
sc->keyframe_count = 0; // number of key frames
sc->keyframes = av_malloc_array(entries, sizeof(*sc->keyframes)); // allocate memory; key frames are stored as an int array
if (!sc->keyframes)
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->keyframes[i] = avio_rb32(pb); // fill keyframes (each element is the sample number of a key frame)
}
sc->keyframe_count = i; // number of key frames
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSS atom\n");
return AVERROR_EOF;
}
return 0;
}
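Once the keyframes array is filled, checking whether a sample is a key frame is a lookup. The sketch below assumes the sample numbers are 1-based and stored in ascending order, as the MP4 specification requires for stss; demo_is_keyframe is a hypothetical helper, not an FFmpeg function.

```c
/* Binary search for a 1-based sample number in an ascending key-frame list.
 * Returns 1 if the sample is listed as a key frame, 0 otherwise. */
static int demo_is_keyframe(const int *keyframes, unsigned int keyframe_count,
                            int sample_number)
{
    unsigned int lo = 0, hi = keyframe_count;

    while (lo < hi) {
        unsigned int mid = lo + (hi - lo) / 2;
        if (keyframes[mid] == sample_number)
            return 1;
        if (keyframes[mid] < sample_number)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

The helper only searches the list; deciding what an empty or missing stss atom means for a track is left to the caller.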
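ctts atom: reading the offset between the dts and the pts of each sample. Tip: one important point here. The dts of each sample derived from stts, plus ctts_data[i].duration, gives the pts, i.e. dts + duration (the offset) = pts. (This is computed in mov_read_packet and then assigned directly to the AVPacket.) If there is no ctts, then dts == pts.
typedef struct MOVStts {
unsigned int count; // number of consecutive samples with the same offset
int duration; // for ctts: the offset between dts and pts (pts - dts)
} MOVStts;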
static int mov_read_ctts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, ctts_count = 0;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count
av_log(c->fc, AV_LOG_TRACE, "track[%u].ctts.entries = %u\n", c->fc->nb_streams - 1, entries);
if (!entries)
return 0;
if (entries >= UINT_MAX / sizeof(*sc->ctts_data))
return AVERROR_INVALIDDATA;
av_freep(&sc->ctts_data);
// allocate memory
sc->ctts_data = av_fast_realloc(NULL, &sc->ctts_allocated_size, entries * sizeof(*sc->ctts_data));
if (!sc->ctts_data)
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
int count = avio_rb32(pb); // number of consecutive samples with the same offset
int duration = avio_rb32(pb); // offset between dts and pts for those samples
if (count <= 0) {
av_log(c->fc, AV_LOG_TRACE,
"ignoring CTTS entry with count=%d duration=%d\n",
count, duration);
continue;
}
// append the entry to ctts_data
add_ctts_entry(&sc->ctts_data, &ctts_count, &sc->ctts_allocated_size,
count, duration);
av_log(c->fc, AV_LOG_TRACE, "count=%d, duration=%d\n",
count, duration);
if (FFNABS(duration) < -(1<<28) && i+2<entries) {
av_log(c->fc, AV_LOG_WARNING, "CTTS invalid\n");
av_freep(&sc->ctts_data);
sc->ctts_count = 0;
return 0;
}
if (i+2<entries)
// sc->dts_shift is usually 0
mov_update_dts_shift(sc, duration, c->fc);
}
sc->ctts_count = ctts_count; // number of ctts entries
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted CTTS atom\n");
return AVERROR_EOF;
}
av_log(c->fc, AV_LOG_TRACE, "dts shift %d\n", sc->dts_shift);
return 0;
}
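The rule dts + offset = pts can be sketched on top of a per-sample dts array. DemoCtts and demo_apply_ctts are hypothetical names; treating samples that no ctts entry covers as pts == dts is an assumption of this sketch, in line with the "no ctts means dts == pts" rule above.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoCtts {
    unsigned int count;  /* consecutive samples sharing one offset */
    int          offset; /* pts - dts for those samples */
} DemoCtts;

/* Fill pts[0..nb_samples) from dts[] and the run-length ctts entries. */
static void demo_apply_ctts(const DemoCtts *ctts, size_t nb_entries,
                            const int64_t *dts, int64_t *pts,
                            size_t nb_samples)
{
    size_t n = 0;

    for (size_t i = 0; i < nb_entries && n < nb_samples; i++)
        for (unsigned int j = 0; j < ctts[i].count && n < nb_samples; j++, n++)
            pts[n] = dts[n] + ctts[i].offset;

    /* Assumption: samples without a ctts entry are shown at their dts. */
    for (; n < nb_samples; n++)
        pts[n] = dts[n];
}
```

For a stream with B-frames the offsets differ per sample, which is why the pts order can differ from the dts order.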
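stsc atom: reading the sample-to-chunk mapping. Tip: mov_build_index combines the size of every sample (stsz), the number of samples in every chunk (stsc) and the absolute file offset of every chunk (stco) to obtain the absolute file offset of every sample.
typedef struct MOVStsc {
int first; // number of the first chunk this entry applies to (1-based; a chunk holds one or more samples)
int count; // number of samples in each of those chunks
int id; // sample description index, usually 1
} MOVStsc;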
static int mov_read_stsc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count
if ((uint64_t)entries * 12 + 4 > atom.size)
return AVERROR_INVALIDDATA;
av_log(c->fc, AV_LOG_TRACE, "track[%u].stsc.entries = %u\n", c->fc->nb_streams - 1, entries);
if (!entries)
return 0;
if (sc->stsc_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSC atom\n");
av_free(sc->stsc_data);
sc->stsc_count = 0;
// allocate memory
sc->stsc_data = av_malloc_array(entries, sizeof(*sc->stsc_data));
if (!sc->stsc_data)
return AVERROR(ENOMEM);
// fill sc->stsc_data
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->stsc_data[i].first = avio_rb32(pb);
sc->stsc_data[i].count = avio_rb32(pb);
sc->stsc_data[i].id = avio_rb32(pb);
}
sc->stsc_count = i; // number of entries actually read
for (i = sc->stsc_count - 1; i < UINT_MAX; i--) {
int64_t first_min = i + 1;
if ((i+1 < sc->stsc_count && sc->stsc_data[i].first >= sc->stsc_data[i+1].first) ||
(i > 0 && sc->stsc_data[i].first <= sc->stsc_data[i-1].first) ||
sc->stsc_data[i].first < first_min ||
sc->stsc_data[i].count < 1 ||
sc->stsc_data[i].id < 1) {
av_log(c->fc, AV_LOG_WARNING, "STSC entry %d is invalid (first=%d count=%d id=%d)\n", i, sc->stsc_data[i].first, sc->stsc_data[i].count, sc->stsc_data[i].id);
if (i+1 >= sc->stsc_count) {
if (sc->stsc_data[i].count == 0 && i > 0) {
sc->stsc_count --;
continue;
}
sc->stsc_data[i].first = FFMAX(sc->stsc_data[i].first, first_min);
if (i > 0 && sc->stsc_data[i].first <= sc->stsc_data[i-1].first)
sc->stsc_data[i].first = FFMIN(sc->stsc_data[i-1].first + 1LL, INT_MAX);
sc->stsc_data[i].count = FFMAX(sc->stsc_data[i].count, 1);
sc->stsc_data[i].id = FFMAX(sc->stsc_data[i].id, 1);
continue;
}
av_assert0(sc->stsc_data[i+1].first >= 2);
// We replace this entry by the next valid
sc->stsc_data[i].first = sc->stsc_data[i+1].first - 1;
sc->stsc_data[i].count = sc->stsc_data[i+1].count;
sc->stsc_data[i].id = sc->stsc_data[i+1].id;
}
}
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSC atom\n");
return AVERROR_EOF;
}
return 0;
}
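Because an stsc entry applies from its first chunk up to the chunk before the next entry's first, finding the sample count of a given chunk means finding the last entry whose first is not greater than the chunk number. A minimal sketch, with DemoStsc mirroring MOVStsc and demo_samples_in_chunk as a hypothetical helper:

```c
typedef struct DemoStsc {
    int first; /* first chunk (1-based) this entry applies to */
    int count; /* samples per chunk for those chunks */
    int id;    /* sample description index */
} DemoStsc;

/* Returns the number of samples in the given 1-based chunk,
 * or 0 if no entry covers it. Entries must be sorted by first. */
static int demo_samples_in_chunk(const DemoStsc *stsc, unsigned int nb_entries,
                                 int chunk)
{
    int count = 0;

    for (unsigned int i = 0; i < nb_entries && stsc[i].first <= chunk; i++)
        count = stsc[i].count; /* a later entry overrides an earlier one */
    return count;
}
```

With the entries {1, 3, 1} and {4, 2, 1}, chunks 1 to 3 hold three samples each and every chunk from 4 onward holds two.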
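stsz atom: reading the size of every sample. Tip: mov_build_index combines the size of every sample (stsz), the number of samples in every chunk (stsc) and the absolute file offset of every chunk (stco) to obtain the absolute file offset of every sample.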
static int mov_read_stsz(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, sample_size, field_size, num_bytes;
GetBitContext gb;
unsigned char* buf;
int ret;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
if (atom.type == MKTAG('s','t','s','z')) {
sample_size = avio_rb32(pb); // common sample size; 0 means the sizes differ and are listed per sample
if (!sc->sample_size) /* do not overwrite value computed in stsd */
sc->sample_size = sample_size;
sc->stsz_sample_size = sample_size;
field_size = 32;
} else {
sample_size = 0;
avio_rb24(pb); /* reserved */
field_size = avio_r8(pb);
}
entries = avio_rb32(pb); // entry count (normally the total number of samples)
av_log(c->fc, AV_LOG_TRACE, "sample_size = %u sample_count = %u\n", sc->sample_size, entries);
sc->sample_count = entries; // entry count (normally the total number of samples)
if (sample_size)
return 0;
if (field_size != 4 && field_size != 8 && field_size != 16 && field_size != 32) {
av_log(c->fc, AV_LOG_ERROR, "Invalid sample field size %u\n", field_size);
return AVERROR_INVALIDDATA;
}
if (!entries)
return 0;
if (entries >= (UINT_MAX - 4) / field_size)
return AVERROR_INVALIDDATA;
if (sc->sample_sizes)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSZ atom\n");
av_free(sc->sample_sizes);
sc->sample_count = 0;
// allocate memory
sc->sample_sizes = av_malloc_array(entries, sizeof(*sc->sample_sizes));
if (!sc->sample_sizes)
return AVERROR(ENOMEM);
num_bytes = (entries*field_size+4)>>3;
buf = av_malloc(num_bytes+AV_INPUT_BUFFER_PADDING_SIZE);
if (!buf) {
av_freep(&sc->sample_sizes);
return AVERROR(ENOMEM);
}
ret = ffio_read_size(pb, buf, num_bytes);
if (ret < 0) {
av_freep(&sc->sample_sizes);
av_free(buf);
av_log(c->fc, AV_LOG_WARNING, "STSZ atom truncated\n");
return 0;
}
init_get_bits(&gb, buf, 8*num_bytes);
// fill sc->sample_sizes, an int array holding the size of each sample
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->sample_sizes[i] = get_bits_long(&gb, field_size);
if (sc->sample_sizes[i] < 0) {
av_free(buf);
av_log(c->fc, AV_LOG_ERROR, "Invalid sample size %d\n", sc->sample_sizes[i]);
return AVERROR_INVALIDDATA;
}
sc->data_size += sc->sample_sizes[i]; // running total of all sample sizes
}
sc->sample_count = i;
av_free(buf);
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSZ atom\n");
return AVERROR_EOF;
}
return 0;
}
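The reader above accepts field sizes of 4, 8, 16 and 32 bits and unpacks them with a bit reader. The same unpacking can be sketched without a bit reader. demo_read_size_field is a hypothetical helper; it assumes the buffer is long enough and that fields are stored most significant bit first, so with 4-bit fields the first size sits in the high nibble.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the index-th size field from a packed big-endian table.
 * field_size is the width in bits: 4, 8, 16 or 32. Returns 0 otherwise. */
static uint32_t demo_read_size_field(const uint8_t *buf, size_t index,
                                     unsigned int field_size)
{
    switch (field_size) {
    case 4: {
        uint8_t b = buf[index / 2];
        /* even index: high nibble, odd index: low nibble */
        return (index & 1) ? (uint32_t)(b & 0x0F) : (uint32_t)(b >> 4);
    }
    case 8:
        return buf[index];
    case 16:
        return ((uint32_t)buf[2 * index] << 8) | buf[2 * index + 1];
    case 32:
        return ((uint32_t)buf[4 * index]     << 24) |
               ((uint32_t)buf[4 * index + 1] << 16) |
               ((uint32_t)buf[4 * index + 2] << 8)  |
                (uint32_t)buf[4 * index + 3];
    default:
        return 0;
    }
}
```

For a plain stsz atom the field size is always 32, so only the last case is exercised; the narrower widths matter for the compact variant handled by the else branch above.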
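stco atom: reading the absolute file offset of every chunk (so that a chunk can be located without relying on other parameters). Tip: mov_build_index combines the size of every sample (stsz), the number of samples in every chunk (stsc) and the absolute file offset of every chunk (stco) to obtain the absolute file offset of every sample.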
static int mov_read_stco(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->trak_index < 0) {
av_log(c->fc, AV_LOG_WARNING, "STCO outside TRAK\n");
return 0;
}
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // entry count (the total number of chunks)
if (!entries)
return 0;
if (sc->chunk_offsets)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STCO atom\n");
av_free(sc->chunk_offsets);
sc->chunk_count = 0;
// allocate memory
sc->chunk_offsets = av_malloc_array(entries, sizeof(*sc->chunk_offsets));
if (!sc->chunk_offsets)
return AVERROR(ENOMEM);
sc->chunk_count = entries;
// fill chunk_offsets, an int64_t array (32-bit values for stco, 64-bit values for co64)
if (atom.type == MKTAG('s','t','c','o'))
for (i = 0; i < entries && !pb->eof_reached; i++)
sc->chunk_offsets[i] = avio_rb32(pb);
else if (atom.type == MKTAG('c','o','6','4'))
for (i = 0; i < entries && !pb->eof_reached; i++)
sc->chunk_offsets[i] = avio_rb64(pb);
else
return AVERROR_INVALIDDATA;
sc->chunk_count = i; // number of entries actually read (the total number of chunks)
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STCO atom\n");
return AVERROR_EOF;
}
return 0;
}
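The combination of stsz, stsc and stco described in the tips can be sketched as follows. This is a simplified illustration of the idea, not the actual mov_build_index code; DemoStsc mirrors MOVStsc and demo_build_offsets is a hypothetical name.

```c
#include <stdint.h>

typedef struct DemoStsc {
    int first; /* first chunk (1-based) this entry applies to */
    int count; /* samples per chunk */
    int id;    /* sample description index */
} DemoStsc;

/* Compute the absolute file offset of every sample.
 * Returns the number of offsets written to sample_offsets[]. */
static unsigned int demo_build_offsets(const int64_t *chunk_offsets,
                                       unsigned int chunk_count,
                                       const DemoStsc *stsc,
                                       unsigned int stsc_count,
                                       const int *sample_sizes,
                                       unsigned int sample_count,
                                       int64_t *sample_offsets)
{
    unsigned int sample = 0, stsc_index = 0;

    if (stsc_count == 0)
        return 0;

    for (unsigned int chunk = 0;
         chunk < chunk_count && sample < sample_count; chunk++) {
        int64_t pos = chunk_offsets[chunk]; /* stco: where the chunk starts */

        /* stsc: move to the entry covering this chunk (first is 1-based). */
        while (stsc_index + 1 < stsc_count &&
               (unsigned int)stsc[stsc_index + 1].first <= chunk + 1)
            stsc_index++;

        /* stsz: samples lie back to back inside the chunk. */
        for (int j = 0;
             j < stsc[stsc_index].count && sample < sample_count; j++) {
            sample_offsets[sample] = pos;
            pos += sample_sizes[sample];
            sample++;
        }
    }
    return sample;
}
```

With two chunks at offsets 100 and 1000, two samples per chunk and sizes 10, 20, 30 and 40, the samples start at 100, 110, 1000 and 1030.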
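elst atom: reading the dts of the first sample. Tip: mov_build_index takes the negative of time in MOVElst as the first dts value, which is why the dts of the first sample is usually negative. The delta in stts then gives the dts of every following sample, and the offset in ctts in turn gives the pts.
typedef struct MOVElst {
int64_t duration; // segment duration of the edit
int64_t time; // media time at which the edit starts (its negative gives the first dts value)
float rate; // media rate (playback speed), usually 1
} MOVElst;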
static int mov_read_elst(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
MOVStreamContext *sc;
int i, edit_count, version;
int64_t elst_entry_size;
if (c->fc->nb_streams < 1 || c->ignore_editlist)
return 0;
sc = c->fc->streams[c->fc->nb_streams-1]->priv_data;
version = avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
edit_count = avio_rb32(pb); /* entries, usually 1 */
atom.size -= 8;
elst_entry_size = version == 1 ? 20 : 12;
if (atom.size != edit_count * elst_entry_size) {
if (c->fc->strict_std_compliance >= FF_COMPLIANCE_STRICT) {
av_log(c->fc, AV_LOG_ERROR, "Invalid edit list entry_count: %d for elst atom of size: %"PRId64" bytes.\n",
edit_count, atom.size + 8);
return AVERROR_INVALIDDATA;
} else {
edit_count = atom.size / elst_entry_size;
if (edit_count * elst_entry_size != atom.size) {
av_log(c->fc, AV_LOG_WARNING, "ELST atom of %"PRId64" bytes, bigger than %d entries.\n", atom.size, edit_count);
}
}
}
if (!edit_count)
return 0;
if (sc->elst_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated ELST atom\n");
av_free(sc->elst_data);
sc->elst_count = 0;
// allocate memory
sc->elst_data = av_malloc_array(edit_count, sizeof(*sc->elst_data));
if (!sc->elst_data)
return AVERROR(ENOMEM);
// fill elst_data
for (i = 0; i < edit_count && atom.size > 0 && !pb->eof_reached; i++) {
MOVElst *e = &sc->elst_data[i];
if (version == 1) {
e->duration = avio_rb64(pb);
e->time = avio_rb64(pb);
atom.size -= 16;
} else {
e->duration = avio_rb32(pb); /* segment duration */
e->time = (int32_t)avio_rb32(pb); /* media time */
atom.size -= 8;
}
e->rate = avio_rb32(pb) / 65536.0;
atom.size -= 4;
av_log(c->fc, AV_LOG_TRACE, "duration=%"PRId64" time=%"PRId64" rate=%f\n",
e->duration, e->time, e->rate);
if (e->time < 0 && e->time != -1 &&
c->fc->strict_std_compliance >= FF_COMPLIANCE_STRICT) {
av_log(c->fc, AV_LOG_ERROR, "Track %d, edit %d: Invalid edit list media time=%"PRId64"\n",
c->fc->nb_streams-1, i, e->time);
return AVERROR_INVALIDDATA;
}
}
sc->elst_count = i;
return 0;
}
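Two details of the edit list can be sketched in isolation: the rate is a 16.16 fixed-point number (hence the division by 65536.0 above), and, following the simplified description in the tip, the first dts is the negative of the media time. DemoElst mirrors MOVElst; both helpers are hypothetical, and the second one deliberately ignores empty edits (time == -1) and multiple edits, which the real code handles.

```c
#include <stdint.h>

typedef struct DemoElst {
    int64_t duration; /* segment duration */
    int64_t time;     /* media time, -1 marks an empty edit */
    float   rate;     /* media rate */
} DemoElst;

/* Convert a 16.16 fixed-point value to float: 0x00010000 means 1.0. */
static float demo_fixed16_16_to_float(uint32_t raw)
{
    return raw / 65536.0f;
}

/* Simplified assumption: a single edit with a non-negative media time
 * makes the first dts equal to -time; anything else is treated as 0. */
static int64_t demo_first_dts(const DemoElst *e)
{
    return e->time >= 0 ? -e->time : 0;
}
```

A media time of 1024 therefore yields a first dts of -1024 in this sketch.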
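That covers the main atom readers. With this metadata, mov_build_index fills in the AVIndexEntry array, and mov_read_packet then uses the absolute position and size of each sample to read the audio/video data from the file into an AVPacket, which is finally handed to the decoder.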
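The last step, fetching one sample by position and size, can be sketched against an in-memory copy of the file. DemoIndexEntry is a hypothetical, reduced stand-in for AVIndexEntry, and demo_read_sample stands in for the seek-and-read that the demuxer performs through AVIOContext.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct DemoIndexEntry {
    int64_t pos;       /* absolute offset of the sample in the file */
    int     size;      /* sample size in bytes */
    int64_t timestamp; /* dts of the sample */
    int     keyframe;  /* 1 if the sample is a key frame */
} DemoIndexEntry;

/* Copy one sample out of the file image.
 * Returns the number of bytes copied, or -1 if the entry is out of range. */
static int demo_read_sample(const uint8_t *file, size_t file_size,
                            const DemoIndexEntry *e,
                            uint8_t *out, size_t out_size)
{
    if (e->pos < 0 || e->size < 0)
        return -1;
    if ((uint64_t)e->pos > file_size ||
        (size_t)e->size > file_size - (size_t)e->pos)
        return -1; /* sample would run past the end of the file */
    if ((size_t)e->size > out_size)
        return -1; /* destination buffer too small */
    memcpy(out, file + e->pos, (size_t)e->size);
    return e->size;
}
```

The bounds checks matter in practice because offsets and sizes come straight from the file and cannot be trusted.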