Writing Vorbis Comment in C

Not too long ago (back in 2008) I wrote tagutil, a small CLI tool to display and edit music file tags. At the time I needed something easy to script, without a GUI, but most of all I wanted a tool with complete support for Vorbis Comment.
While both Ogg/Vorbis and FLAC use Vorbis Comment, libFLAC has the courtesy to provide a simple API to read and write them, whereas libogg/libvorbis only provide a way to read them. OK, that's not entirely true: it is possible to write them, it's just damn hard compared to libFLAC because you need a good understanding of Ogg/Vorbis's design.
Theory time!
In order to modify Vorbis Comment you need to understand a bit about Ogg/Vorbis. In an Ogg/Vorbis stream, Ogg is the container format and Vorbis is the audio codec. So far so good.
Ogg
An Ogg stream is divided into pages containing packets. Each page contains up to 255 segments, and a packet can be formed by any number of segments. Without going too deep, what you need to understand is:
- A packet, which is a meaningful unit of data for the decoder, is contained in one or more pages.
- A page can contain many packets, a single packet, or only part of a packet (that is, several segments of said packet but not all of them).
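To make the segment idea concrete, here is a minimal sketch of how the Ogg container splits one packet into segments. It assumes the lacing rule from the Ogg specification: a packet is stored as a run of 255-byte segments followed by one final segment shorter than 255 bytes (possibly empty). The helper name packet_lacing() is made up for this illustration and is not part of libogg.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libogg): fill `lacing` with the segment
 * sizes needed to store a packet of `len` bytes, and return how many
 * segments were written, or 0 if `max` entries are not enough.
 */
size_t
packet_lacing(size_t len, unsigned char *lacing, size_t max)
{
	size_t n = len / 255 + 1;	/* full segments + the terminating one */
	size_t i;

	if (n > max)
		return 0;
	for (i = 0; i + 1 < n; i++)
		lacing[i] = 255;
	/* a value below 255 marks the end of the packet */
	lacing[n - 1] = (unsigned char)(len % 255);
	return n;
}
```

A 600-byte packet gives the lacing values 255, 255, 90, and a 255-byte packet gives 255, 0. Since a page holds at most 255 segments, a packet that needs more than that has to continue on the next page.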
If you want to know more, you'll find plenty of documentation on the web (the Ogg page on Wikipedia is a good start).
Vorbis
A Vorbis stream is logically divided into packets, and the first three packets are the header packets. More importantly for us, the Vorbis Comment header is the second packet. Thus, the general algorithm to replace Vorbis Comment is, as described in the "Metadata workflow" section of the libvorbis documentation:
- copy the first packet
- insert a packet with your own Vorbis Comment
- copy the third and all following packets.
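For the second step, libvorbis builds the packet for you with vorbis_commentheader_out(). To show what ends up inside it, here is a sketch that lays the bytes out by hand. It assumes the layout from the Vorbis I specification: packet type 0x03, the string "vorbis", the vendor string, the comment list (every length stored as a 32-bit little-endian integer), and a final framing byte. The names put_le32() and build_comment_packet() are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* store v as a 32-bit little-endian integer and return the next position */
static unsigned char *
put_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
	return p + 4;
}

/*
 * Hypothetical helper: write a Vorbis Comment header into buf and return
 * its size in bytes, or 0 if buf (of size cap) is too small.
 */
size_t
build_comment_packet(unsigned char *buf, size_t cap, const char *vendor,
    const char **comments, size_t ncomments)
{
	size_t vlen = strlen(vendor);
	size_t need = 7 + 4 + vlen + 4 + 1;
	size_t i;
	unsigned char *p = buf;

	for (i = 0; i < ncomments; i++)
		need += 4 + strlen(comments[i]);
	if (need > cap)
		return 0;

	memcpy(p, "\x03vorbis", 7);	/* packet type 3 + codec identifier */
	p += 7;
	p = put_le32(p, (uint32_t)vlen);
	memcpy(p, vendor, vlen);
	p += vlen;
	p = put_le32(p, (uint32_t)ncomments);
	for (i = 0; i < ncomments; i++) {
		size_t len = strlen(comments[i]);

		p = put_le32(p, (uint32_t)len);
		memcpy(p, comments[i], len);	/* "NAME=value", no NUL */
		p += len;
	}
	*p++ = 1;			/* framing bit */
	return (size_t)(p - buf);
}
```

With the vendor string "x" and the single comment "test=42", the packet is 28 bytes long. In a real program, prefer vorbis_commentheader_out(), which is what the full example at the end of this post uses.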
There's always a trick (that's where the pain comes in)
For each audio packet, we need to compute the granulepos. The granulepos is a codec-specific value that allows seeking through the stream. In Vorbis the granulepos is based on the number of samples elapsed. The sample count for a packet can be derived using the vorbis_packet_blocksize() routine.
An interesting discussion on the vorbis-dev mailing list about granulepos explains why we need to compute it for each packet and hints that vorbis_packet_blocksize() is poorly named and actually computes the number of samples that a packet would hold.
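The computation itself is short once you have the block sizes. The sketch below assumes the formula used in vcedit: each audio packet adds (previous blocksize + current blocksize) / 4 samples, and the first audio packet adds nothing. The block sizes would come from vorbis_packet_blocksize(); here they are passed in as a plain array so the example stands alone, and the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the block size of each audio packet (as
 * vorbis_packet_blocksize() would report them), store the running
 * granulepos after each packet into out[].
 */
void
granulepos_from_blocksizes(const long *bs, size_t n, int64_t *out)
{
	int64_t granulepos = 0;
	long lastbs = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* the very first audio packet produces no samples */
		if (lastbs != 0)
			granulepos += (lastbs + bs[i]) / 4;
		lastbs = bs[i];
		out[i] = granulepos;
	}
}
```

With the block sizes 256, 256, 2048, 2048 the running granulepos is 0, 128, 704, 1728.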
It is important to note that although the granulepos value is only meaningful for the codec, it is stored at the Ogg page layer.
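To see where that value lives, here is a sketch that reads the granule position straight out of a raw page header. It assumes the page layout from the Ogg specification (RFC 3533): the capture pattern "OggS", one version byte, one header-type byte, then the granule position as a 64-bit little-endian integer. In real code ogg_page_granulepos() from libogg does this for you; page_granulepos() below is only an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: extract the granule position from the first bytes
 * of an Ogg page. Return 0 on success, -1 if hdr does not look like a page.
 */
int
page_granulepos(const unsigned char *hdr, size_t len, int64_t *granulepos)
{
	uint64_t v = 0;
	int i;

	if (len < 14 || memcmp(hdr, "OggS", 4) != 0)
		return -1;
	/* bytes 6..13, least significant byte first */
	for (i = 7; i >= 0; i--)
		v = (v << 8) | hdr[6 + i];
	/* all bits set reads back as -1: no packet ends on this page */
	*granulepos = (int64_t)v;
	return 0;
}
```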
More about the granulepos:
- In general: the GStreamer README about Ogg
- Ogg: How to seek using granulepos from the Xiph wiki
- Vorbis: the Encapsulation section from the Vorbis specification
C examples
Most examples are tools to repair the granulepos. Although not obvious at first, they come very close to doing what we need. The only difference is that we want to replace the second header packet. Since the granulepos has to be computed only for audio packets, there is no need to compute it for the header packets.
vorbiscomment
The reference example is vorbiscomment from vorbis-tools. Unlike the other examples, this program is able to alter Vorbis Comment. Although it does what we want, two problems prevented me from making good use of it:
- It's licensed under the GPLv2 (and my project is under a 2-clause BSD).
- It's not as easy to read as you would like an example to be.
The magic happens in the vcedit_write() routine from vcedit.c.
revorb.cpp (it's actually C)
This program, written by Jiri Hruska, is a granulepos fixer. The header packet processing is easy to read and is the purpose of the copy_headers() routine. The granulepos computation uses the same formula as in vcedit.
oggfix_granulepos.c
Gunter Königsmann wrote a small program that "recalculates the granulepos of a complete Ogg/Vorbis stream without altering the rest of it". He kindly sent it to the vorbis-dev mailing list (I also host a copy here). It is very well commented and its granulepos computation is unlike every other example: it actually decodes the Vorbis stream to get the sample count "by hand" rather than using vorbis_packet_blocksize().
Conclusion
It takes a lot of work and understanding to get the job done. Ideally there should be simple ways of doing simple tasks. I hope that this post will help anyone trying to modify Vorbis Comment!
The #vorbis IRC channel on freenode was kind, responsive and precise. Looking back, I should have gone there sooner. Kudos to them!
Finally, here is my program to write Vorbis Comment. Note that in this example you cannot use the Ogg "source" file as the destination file. You can find another of my examples, which creates a temporary file and uses rename(2), here (but it uses non-standard stuff).
Writing Vorbis Comment in C
/*
 * vorbis_comment.c
 *
 * A simple example on how to modify Vorbis Comments with libogg and libvorbis.
 * Compile with:
 * cc -I/include/path vorbis_comment.c -L/lib/path -logg -lvorbis -lvorbisfile
 */
#include <stdio.h>
#include <stdlib.h>

#include "ogg/ogg.h"
#include "vorbis/codec.h"
#define OV_EXCLUDE_STATIC_CALLBACKS 1
#include "vorbis/vorbisfile.h"
#undef OV_EXCLUDE_STATIC_CALLBACKS

/*
 * write the page p into the given file pointer fp.
 *
 * return 0 on success and -1 on error.
 */
int
write_page(ogg_page *p, FILE *fp)
{
	if (fwrite(p->header, 1, p->header_len, fp) != (size_t)p->header_len)
		return -1;
	if (fwrite(p->body, 1, p->body_len, fp) != (size_t)p->body_len)
		return -1;
	return 0;
}

/*
 * copy a ogg/vorbis file from path_in to path_out, using the given Vorbis
 * Comments vc_out for the new file. If path_out is NULL, stdout is used.
 *
 * return 0 on success and -1 on error.
 */
int
save_it(const char *path_in, struct vorbis_comment *vc_out, const char *path_out)
{
	FILE *fp_in = NULL;		/* input file pointer */
	FILE *fp_out = NULL;		/* output file pointer */
	ogg_sync_state oy_in;		/* sync and verify incoming physical bitstream */
	ogg_stream_state os_in;		/* take physical pages, weld into a logical stream of packets */
	ogg_stream_state os_out;	/* take physical pages, weld into a logical stream of packets */
	ogg_page og_in;			/* one Ogg bitstream page. Vorbis packets are inside */
	ogg_page og_out;		/* one Ogg bitstream page. Vorbis packets are inside */
	ogg_packet op_in;		/* one raw packet of data for decode */
	ogg_packet my_vc_packet;	/* our custom packet containing vc_out */
	vorbis_info vi_in;		/* struct that stores all the static vorbis bitstream settings */
	vorbis_comment vc_in;		/* struct that stores all the bitstream user comment */
	unsigned long nstream_in;	/* stream counter */
	unsigned long npage_in;		/* page counter */
	unsigned long npacket_in;	/* packet counter */
	unsigned long bs;		/* blocksize of the current packet */
	unsigned long lastbs;		/* blocksize of the last packet */
	ogg_int64_t granulepos;		/* granulepos of the current page */
	/* the order matters: cleanup_label compares states with < and >= */
	enum {
		BUILDING_VC_PACKET,
		SETUP,
		B_O_S,
		START_READING,
		STREAMS_INITIALIZED,
		READING_HEADERS,
		READING_DATA,
		READING_DATA_NEED_FLUSH,
		READING_DATA_NEED_PAGEOUT,
		E_O_S,
		WRITE_FINISH,
		DONE_SUCCESS,
	} state;

	/*
	 * In order to modify the file's tag, we have to rewrite the entire
	 * file. The Ogg container is divided into "pages" and "packets" and the
	 * algorithm is to replace the SECOND ogg packet (which contains vorbis
	 * comments) and copy ALL THE OTHERS. See "Metadata workflow":
	 * https://xiph.org/vorbis/doc/libvorbis/overview.html
	 */

	state = BUILDING_VC_PACKET;
	/* create the packet holding our vorbis_comment */
	if (vorbis_commentheader_out(vc_out, &my_vc_packet) != 0)
		goto cleanup_label;

	state = SETUP;
	/* open files & stuff */
	(void)ogg_sync_init(&oy_in); /* always return 0 */
	/* binary mode: it makes no difference on Unix but matters elsewhere */
	if ((fp_in = fopen(path_in, "rb")) == NULL)
		goto cleanup_label;
	if ((fp_out = (path_out == NULL ? stdout : fopen(path_out, "wb"))) == NULL)
		goto cleanup_label;
	lastbs = granulepos = 0;

	nstream_in = 0;
bos_label: /* beginning of a stream */
	state = B_O_S; /* never read, but that's fine */
	nstream_in += 1;
	npage_in = npacket_in = 0;
	vorbis_info_init(&vi_in);
	vorbis_comment_init(&vc_in);

	state = START_READING;
	/* main loop: read the input file into buf in order to sync pages out */
	while (state != E_O_S) {
		switch (ogg_sync_pageout(&oy_in, &og_in)) {
		case 0:  /* more data needed or an internal error occurred. */
		case -1: /* stream has not yet captured sync (bytes were skipped). */
			if (feof(fp_in)) {
				if (state < READING_DATA)
					goto cleanup_label;
				/* There is no more data to read and we could
				   not get a page so we're done here. */
				state = E_O_S;
			} else {
				/* read more data and try again to get a page. */
				char *buf;
				size_t s;
				/* get a buffer */
				if ((buf = ogg_sync_buffer(&oy_in, BUFSIZ)) == NULL)
					goto cleanup_label;
				/* read a part of the file (fread(3) returns a
				   short count on error, never -1) */
				s = fread(buf, sizeof(char), BUFSIZ, fp_in);
				if (s == 0 && ferror(fp_in))
					goto cleanup_label;
				/* tell ogg how much was read */
				if (ogg_sync_wrote(&oy_in, (long)s) == -1)
					goto cleanup_label;
			}
			continue;
		}
		/* here ogg_sync_pageout() returned 1 and a page was sync'ed. */

		if (++npage_in == 1) {
			/* init both input and output streams with the serialno
			   of the first page */
			if (ogg_stream_init(&os_in, ogg_page_serialno(&og_in)) == -1)
				goto cleanup_label;
			if (ogg_stream_init(&os_out, ogg_page_serialno(&og_in)) == -1) {
				ogg_stream_clear(&os_in);
				goto cleanup_label;
			}
			state = STREAMS_INITIALIZED;
		}

		/* put the page in input stream, and then loop through each of
		   its packets */
		if (ogg_stream_pagein(&os_in, &og_in) == -1)
			goto cleanup_label;

		while (ogg_stream_packetout(&os_in, &op_in) == 1) {
			ogg_packet *target;
			/*
			 * This is where we really do what we mean to do: the
			 * second packet is the commentheader packet, we replace
			 * it with my_vc_packet if we're on the first stream.
			 */
			if (++npacket_in == 2 && nstream_in == 1)
				target = &my_vc_packet;
			else
				target = &op_in;

			if (npacket_in <= 3) {
				/*
				 * The first three packets are header packets.
				 * We use them to get the vorbis_info which
				 * will be used later. vc_in is filled too but
				 * is not used afterwards.
				 */
				if (vorbis_synthesis_headerin(&vi_in, &vc_in, &op_in) != 0)
					goto cleanup_label;
				/* force a flush after the third ogg_packet */
				state = (npacket_in == 3 ? READING_DATA_NEED_FLUSH : READING_HEADERS);
			} else {
				/*
				 * granulepos computation.
				 *
				 * The granulepos is stored into the *pages* and
				 * is used by the codec to seek through the
				 * bitstream. Its value is codec dependent (in
				 * the Vorbis case it is the number of samples
				 * elapsed).
				 *
				 * The vorbis_packet_blocksize() actually
				 * computes the number of samples that would be
				 * stored by the packet (without decoding it).
				 * This is the same formula as in vcedit example
				 * from vorbis-tools.
				 *
				 * We use here the vorbis_info previously filled
				 * when reading header packets.
				 *
				 * XXX: check if this is not a vorbis stream ?
				 */
				bs = vorbis_packet_blocksize(&vi_in, &op_in);
				granulepos += (lastbs == 0 ? 0 : (bs + lastbs) / 4);
				lastbs = bs;

				/* write page(s) if needed */
				if (state == READING_DATA_NEED_FLUSH) {
					while (ogg_stream_flush(&os_out, &og_out)) {
						if (write_page(&og_out, fp_out) == -1)
							goto cleanup_label;
					}
				} else if (state == READING_DATA_NEED_PAGEOUT) {
					while (ogg_stream_pageout(&os_out, &og_out)) {
						if (write_page(&og_out, fp_out) == -1)
							goto cleanup_label;
					}
				}

				/*
				 * Decide whether we need to write a page based
				 * on our granulepos computation. The -1 case is
				 * very common because only the last packet of a
				 * page has its granulepos set by the ogg layer
				 * (which only stores a granulepos per page), so
				 * all the others have a value of -1 (we need to
				 * set the granulepos for each packet though).
				 *
				 * The logic of the other cases is borrowed from
				 * vcedit and I fail to understand how
				 * granulepos could mismatch because we don't
				 * change the data packet.
				 */
				state = READING_DATA;
				if (op_in.granulepos == -1) {
					op_in.granulepos = granulepos;
				} else if (granulepos <= op_in.granulepos) {
					state = READING_DATA_NEED_PAGEOUT;
				} else /* if granulepos > op_in.granulepos */ {
					state = READING_DATA_NEED_FLUSH;
					granulepos = op_in.granulepos;
				}
			}
			/* insert the target packet into the output stream */
			if (ogg_stream_packetin(&os_out, target) == -1)
				goto cleanup_label;
		}
		if (ogg_page_eos(&og_in)) {
			/* og_in was the last page of the stream */
			state = E_O_S;
		}
	}

	/* forces remaining packets into a last page */
	os_out.e_o_s = 1;
	while (ogg_stream_flush(&os_out, &og_out)) {
		if (write_page(&og_out, fp_out) == -1)
			goto cleanup_label;
	}
	/* ogg_page and ogg_packet structs always point to storage in
	   libvorbis. They're never freed or manipulated directly */

	/* check if we need to read another stream */
	if (!feof(fp_in)) {
		ogg_stream_clear(&os_in);
		ogg_stream_clear(&os_out);
		vorbis_comment_clear(&vc_in);
		vorbis_info_clear(&vi_in);
		goto bos_label;
	} else {
		(void)fclose(fp_in);
		fp_in = NULL;
	}

	state = WRITE_FINISH;
	if (fp_out != stdout && fclose(fp_out) != 0)
		goto cleanup_label;
	fp_out = NULL;

	state = DONE_SUCCESS;
	/* FALLTHROUGH */
cleanup_label:
	if (state >= STREAMS_INITIALIZED) {
		ogg_stream_clear(&os_in);
		ogg_stream_clear(&os_out);
	}
	if (state >= START_READING) {
		vorbis_comment_clear(&vc_in);
		vorbis_info_clear(&vi_in);
	}
	/* oy_in and my_vc_packet are only valid once SETUP is reached */
	if (state >= SETUP) {
		ogg_sync_clear(&oy_in);
		ogg_packet_clear(&my_vc_packet);
	}
	if (fp_out != stdout && fp_out != NULL)
		(void)fclose(fp_out);
	if (fp_in != NULL)
		(void)fclose(fp_in);

	return (state == DONE_SUCCESS ? 0 : -1);
}

int
main(int argc, char **argv)
{
	const char *path_in, *path_out;
	struct OggVorbis_File vf;
	struct vorbis_comment *vc;
	int i;

	if (argc == 2) {
		path_in = argv[1];
		path_out = NULL;
	} else if (argc == 3) {
		path_in = argv[1];
		path_out = argv[2];
	} else {
		(void)fprintf(stderr, "usage: %s file [output]\n", argv[0]);
		return (EXIT_FAILURE);
	}

	/* try to read path_in as an ogg/vorbis file */
	if (ov_fopen(path_in, &vf) != 0) {
		(void)fprintf(stderr, "%s: can't open as ogg/vorbis file.\n", path_in);
		return (EXIT_FAILURE);
	}

	/* get the vorbis comments */
	if ((vc = ov_comment(&vf, -1)) == NULL) {
		(void)fprintf(stderr, "ov_comment\n");
		ov_clear(&vf);
		return (EXIT_FAILURE);
	}

	/* change something */
	vorbis_comment_add(vc, "test=42");

	/* display vorbis comments to stderr (in case stdout is used as output) */
	for (i = 0; i < vc->comments; i++)
		(void)fprintf(stderr, "%s\n", vc->user_comments[i]);

	/* now save the modified comments (and copy audio data) into path_out */
	if (save_it(path_in, vc, path_out) == -1) {
		(void)fprintf(stderr, "save_it failed.\n");
		ov_clear(&vf);
		return (EXIT_FAILURE);
	}

	/* cleanup */
	ov_clear(&vf);
	return (EXIT_SUCCESS);
}
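About the limitation mentioned earlier (the source file cannot be the destination): one common way around it is to write to a temporary file next to the target and rename(2) it over the original once everything has succeeded. The sketch below assumes a POSIX system (mkstemp(3) is not standard C) and is a fresh illustration, not the code from the other example linked above. write_via_tempfile() is a made-up name, and it writes a plain buffer where a real tool would call save_it() on the temporary path.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: atomically replace `path` with `len` bytes from
 * `data`. Return 0 on success and -1 on error.
 */
int
write_via_tempfile(const char *path, const void *data, size_t len)
{
	char tmp[1024];
	FILE *fp;
	int fd;

	/* same directory as path, so rename(2) stays on one filesystem */
	if (snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path) >= (int)sizeof(tmp))
		return -1;
	if ((fd = mkstemp(tmp)) == -1)
		return -1;
	if ((fp = fdopen(fd, "wb")) == NULL) {
		(void)close(fd);
		(void)unlink(tmp);
		return -1;
	}
	if (fwrite(data, 1, len, fp) != len) {
		(void)fclose(fp);
		(void)unlink(tmp);
		return -1;
	}
	/* only replace the original once the new content is fully written */
	if (fclose(fp) != 0 || rename(tmp, path) != 0) {
		(void)unlink(tmp);
		return -1;
	}
	return 0;
}
```

Note that mkstemp(3) creates the file with mode 0600, so a real tool would probably want to copy the permissions of the original file before the rename.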