Writing ID3v2 Tag parsing code, need Good examples to test

I am writing software to parse ID3v2 tags in Java. I need to find some files with good examples of the tag with lots of different frames. Ideally the tags will contain an embedded picture because that is what is kicking my butt right now.
Does anyone know where I can find some good free (legal) ID3v2 tagged files (ID3v2.2 and ID3v2.3)?

You can create the example files yourself with a tagger. I'm the author of the Windows freeware tagger Mp3tag, which can write ID3v2.3 tags (UTF-16 and ISO-8859-1) and ID3v2.4 tags in UTF-8, both with APIC frames. You can find a list of supported frames here.
To create ID3v2.2 tags, I think the only program out there is iTunes, which interprets the ID3 spec in its very own way and writes numerous iTunes-specific frames that are not in the spec.
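
When checking a parser against files from different taggers, it can help to have a tiny reference walk over the tag to compare against. The following is only a minimal sketch in Python (chosen for brevity; it is not code from Mp3tag or iTunes, and the names `synchsafe_to_int`, `iter_frames` and `split_apic` are made up for illustration). It shows the frame header differences between ID3v2.2, v2.3 and v2.4 and how an APIC payload splits up. It assumes the whole tag is in memory, refuses tags that set the unsynchronisation or extended-header flags, and ignores frame-level flags such as compression.

```python
def synchsafe_to_int(data: bytes) -> int:
    """Decode a synchsafe integer: 7 payload bits per byte, top bit unused."""
    value = 0
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
    return value


def iter_frames(tag: bytes):
    """Yield (frame_id, payload) for each frame in a raw ID3v2.2/2.3/2.4 tag."""
    if len(tag) < 10 or tag[:3] != b"ID3":
        raise ValueError("no ID3v2 header")
    major, flags = tag[3], tag[5]
    if major not in (2, 3, 4):
        raise ValueError(f"unsupported ID3v2 major version {major}")
    if flags & 0xC0:
        # 0x80 = unsynchronisation; 0x40 = compression (v2.2) or extended header (v2.3+)
        raise NotImplementedError("tag-level flags are not handled in this sketch")
    # The size in the tag header is always synchsafe and excludes the 10 header bytes.
    end = min(10 + synchsafe_to_int(tag[6:10]), len(tag))
    # v2.2: 3-char ID + 3-byte size. v2.3/v2.4: 4-char ID + 4-byte size + 2 flag bytes.
    id_len, size_len, flag_len = (3, 3, 0) if major == 2 else (4, 4, 2)
    header_len = id_len + size_len + flag_len
    pos = 10
    while pos + header_len <= end:
        frame_id = tag[pos:pos + id_len]
        if frame_id[0] == 0:  # zero bytes after the last frame are padding
            break
        size_bytes = tag[pos + id_len:pos + id_len + size_len]
        # Only v2.4 stores frame sizes as synchsafe; v2.2/v2.3 use plain big-endian.
        if major == 4:
            size = synchsafe_to_int(size_bytes)
        else:
            size = int.from_bytes(size_bytes, "big")
        start = pos + header_len
        yield frame_id.decode("latin-1"), tag[start:start + size]
        pos = start + size


def split_apic(payload: bytes):
    """Split a v2.3/v2.4 APIC payload into (mime, picture_type, description, image)."""
    codecs = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}
    encoding = payload[0]
    mime_end = payload.index(b"\x00", 1)  # the MIME type is always Latin-1
    mime = payload[1:mime_end].decode("latin-1")
    picture_type = payload[mime_end + 1]
    pos = mime_end + 2
    # UTF-16 descriptions end with two zero bytes; scan in 2-byte steps so a
    # character such as b"b\x00" followed by the terminator is not misread.
    width = 2 if encoding in (1, 2) else 1
    end = pos
    while end + width <= len(payload) and payload[end:end + width] != b"\x00" * width:
        end += width
    if end + width > len(payload):
        raise ValueError("unterminated picture description")
    description = payload[pos:end].decode(codecs[encoding])
    return mime, picture_type, description, payload[end + width:]
```

Feeding `iter_frames` the first bytes of a file and passing the `APIC` payload to `split_apic` gives the raw image bytes, which can be written to disk and opened in a viewer as a sanity check. Note that the ID3v2.2 picture frame is called `PIC` and stores a 3-character image format instead of a MIME string, so it needs its own small variant of `split_apic`.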

This may be obvious and not what you are seeking, but what about ripping some of your legally obtained CDs and editing the files using iTunes? iTunes would also allow you to add an embedded picture. There are of course many open source programs that will also do this.

Related

What makes some EXIF tags non-writable?

Certain EXIF tags, for example many of the QuickTime tags listed here, are non-writable by common EXIF editors.
This list of writable vs. non-writable is maintained by Phil Harvey's exiftool, but I have found similar results attempting to edit the same tags with other tools such as MetaClean. My edits to these tags do not persist and the original values return when I reload the file.
Why is that? What about a certain tag makes it uneditable, and is there any manual way to override this?

First, the tags you linked are not EXIF tags; they are QuickTime tags. EXIF is just a common but narrow subset of all types of metadata. Sorry for being a pedantic ass about it.
In the case of exiftool, and especially video files, the standards and formats for such tags are, as Phil Harvey (the exiftool author) has put it, a complete mess. There are apparently a lot of differences in how various programs and cameras implement such metadata. Phil doesn't feel he has time to troubleshoot all the various differences and edge cases. To give an example, he recently started adding read support for GPS tracks in video files. This ended up ballooning into having to support over 20 different variations of geotracks. And that's just for reading.
Followup: As of exiftool ver 11.39, a number of the more useful tags have been made writable. Exiftool now lists them under the group ItemList instead of Quicktime, though they are still part of the Quicktime group.

Reverse-engineer Cubase .cpr format

I don't have an opportunity to buy Cubase, but my partner uses it a lot. I wanted to simplify his life and provide him with cpr projects instead of plain wav files, but no other software can open/save this format.
I looked at a sample cpr he sent me and it seems like the file does not contain audio data itself, it rather contains the mark-up and effects.
I wanted to know the following things:
Is it legal to try to reverse-engineer cpr files?
Is it difficult, and has anyone tried?
Does anyone know other ways to transfer project files between Audacity/Rosegarden and Cubase? The main thing is support for several tracks and their timing in one project, nothing fancy.

CPR files use a proprietary format. You can have a look at this question.
I suppose it is pretty hard... and I haven't tried!
To my knowledge, there is no way to export or import a project between Cubase and Audacity or Rosegarden. The OMF format, which could be a good candidate, is not supported by Audacity or Rosegarden for now. You can still import and export the audio mix, the separate tracks, and the MIDI files individually. This method is really tedious, but it probably has the advantage of letting you play and edit your projects for decades to come, which is not guaranteed with project files.

How can I see data from all different ID3 versions on a file?

I'm attempting to track down the source of a problem in Clementine (an audio player) that I think stems from having differing ID3v1 and ID3v2 tags on files. My problem is that I can't find an application that displays both sets of data.
I'll take either an application or a library. Runnable on Linux is preferred, but Windows is acceptable.

Bulk ID3 will do what you're looking for. Use the --print-only flag. Keep in mind the current release is an Alpha. Be sure to read the README.txt file in the download.
http://sourceforge.net/projects/bulkid3/
Andrew
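
If no ready-made tool is at hand, the two tags are easy to tell apart by position: ID3v1 is a fixed 128-byte block at the very end of the file starting with `TAG`, while ID3v2 normally sits at the very start and begins with `ID3`. Here is a rough Python sketch of that check (this is not how Bulk ID3 works internally; the function names `read_id3v1` and `id3v2_version` are invented for this example, and it only reports the ID3v2 version rather than decoding its frames).

```python
def read_id3v1(data: bytes):
    """Return the ID3v1 fields from the last 128 bytes, or None if there is no tag."""
    block = data[-128:]
    if len(block) < 128 or block[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with zero bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
        "comment": text(block[97:127]),
        "genre": block[127],  # numeric index into the ID3v1 genre list
    }


def id3v2_version(data: bytes):
    """Return (major, revision) if the data starts with an ID3v2 header, else None."""
    if len(data) >= 10 and data[:3] == b"ID3":
        return data[3], data[4]
    return None
```

Reading a file with `open(path, "rb").read()` and printing both results side by side shows right away whether a file carries one tag, both, or neither, which should be enough to spot files where the two disagree.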

Libraries of audio samples (spoken text)

For a project we're currently working on, we need a library of spoken words in many different languages.
Two options seem possible: text-to-speech or "real" recordings by native speakers. As the quality is important to us, we're thinking about going the latter path.
In order to create a prototype for our application, we're looking for libraries that contain as many words in different languages as possible. To get a feeling for the quality of our approach, this library should not be made up of synthesized speech.
Do you know of any available/accessible libraries?

A co-worker just found this community based library, which is nice, but rather small in size:
Forvo.com

I've just found this on the Audacity wiki: VoxForge. From their site:
VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as Sphinx, ISIP, Julius and HTK (note: HTK has distribution restrictions).

There is also Old time radio, not sure if this is the sort of spoken word you're after though.

My guess is that you won't find a library anywhere that consists of just individual words. Whatever you find, you're going to have to open the audio up in an editor (like Pro Tools or Cool Edit) and chop it up into individual words.
You would probably be better off creating a list of all the words you need for each language, and then finding native speakers to read them while you record. You can have them read slowly, so that you'll have an easy time chopping up each individual word.

One I used to use a lot: http://shtooka.net/index.php
Easy access to the recordings.

Writing Color Calibration Data to a TIFF or PNG file

My custom homebrew photography processing software, running on 64 bit Linux/GNU, writes out PNG and TIFF files. These are to be sent to a quality printing shop to be made into fine art. Working with interior designers - it's important to get the colors just right!
The print shops usually have no trouble with TIFFs and PNGs made with commercial software such as Photoshop. Even though I have the TIFF 6.0 spec, the PNG spec, and other info in hand, it is not clear how to include color calibration data or implement a color management system on Linux. My files are often rejected as faulty, without enough detail in the error reports to make fixes.
This has been a nasty problem for many people for a while. Even my contacts at the Hollywood postproduction studios are struggling with this issue. One studio even wanted to hire me to take care of their color calibration, thinking I was the expert, but no, I am just as blind and lost as everyone!
Does anyone know of good code examples, detailed technical information, or have any other enlightenment? Or time to switch to pure Apple?

Take a look at LittleCMS
http://www.littlecms.com/
This page has the code for applying it to TIFF
http://www.littlecms.com/newutils.htm
The basic thing you need to know is that Color profile data is something you need to store in the meta-data of the file itself.
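
To make "store it in the meta-data" a bit more concrete: in PNG the profile goes into an `iCCP` chunk placed before the image data, and in TIFF it goes into tag 34675. Below is a hedged Python sketch of the PNG side only (it does not use LittleCMS; `make_chunk` and `embed_icc_profile` are illustrative names, and the default profile name is just a placeholder). It assumes the input is a well-formed PNG that does not already carry an `iCCP` or `sRGB` chunk, and it does not validate the ICC data itself.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def embed_icc_profile(png: bytes, icc_profile: bytes, name: str = "ICC profile") -> bytes:
    """Return a copy of the PNG with an iCCP chunk inserted right after IHDR."""
    if not png.startswith(PNG_SIGNATURE) or png[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    if not 1 <= len(name) <= 79:
        raise ValueError("profile name must be 1 to 79 characters")
    # IHDR is always first: 4 length + 4 type + 13 data + 4 CRC = 25 bytes.
    ihdr_end = len(PNG_SIGNATURE) + 25
    # iCCP payload: Latin-1 name, zero terminator, compression method 0
    # (deflate), then the zlib-compressed profile.
    payload = name.encode("latin-1") + b"\x00" + b"\x00" + zlib.compress(icc_profile)
    return png[:ihdr_end] + make_chunk(b"iCCP", payload) + png[ihdr_end:]
```

Calling `embed_icc_profile(png_bytes, open("printer.icc", "rb").read())` (with `printer.icc` standing in for whatever profile the print shop expects) gives bytes that can be written straight back to disk. Placing the chunk directly after `IHDR` keeps it ahead of `PLTE` and `IDAT`, which the PNG spec requires.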

There is a consultant called Charles Poynton who specialises in this area. I work for one of the post production studios you mention (albeit in London, not Hollywood), and have seen him speak on the subject a couple of times. His website contains a lot of the material he presents and you might find something of use there. He also has a book called Digital Video and HDTV Algorithms and Interfaces, which is not as heavy as the title might suggest! While these resources might not answer your question directly, they might provide a springboard to other solutions.
More specifically, which libraries are you using to write the png and tif files - you mention they are homebrew, but how custom are they exactly? Postprocessing the images in an image manipulation program (such as ImageMagick or dcraw) might allow you to inject this information into the header more successfully.
Sorry, I don't have any specific answers, but maybe something that will point you a bit further in the right direction...

As a GNU/Linux user, you’ll want to consider dispcalGUI (http://dispcalgui.hoech.net/), a GNOME-based GUI that centralizes color management, ICC profile management, and (crucially for your case) device calibration. It can talk to well-known pro- and mid-level hardware, e.g., i1, X-Rite, Spyder, etc.
But before you get into that – you say you are generating your files to spec; are you validating your output using a test suite specific to the format in question? If not, here are three to get you started:
imagetestsuite supports the well-known formats: https://code.google.com/p/imagetestsuite/w/list?can=1&q=
The Luminous* test suite is a JIRA plugin, if that’s your thing: https://marketplace.atlassian.com/plugins/com.luminouslead.plugin.jira.testsuite.LuminousTestSuite
FLOSS decoder implementations often have one you can use, e.g. OpenJPEG: https://code.google.com/p/openjpeg/wiki/TestSuiteDocumentation
But even barring all of those, it seems like your problem is with embedded ICC data – which is two specs in one. First, there’s the host image-file format, and they all handle embedding differently (meaning the ICC data will likely look totally different when embedded in a TIFF than, say, a JPEG or WebP file). Second, there is the ICC spec itself. It is documented here: http://color.org/v4spec.xalter – and you may also want to look at the source for the aforementioned dispcalGUI, which includes a very legible and hackable ICC profile class in Python: http://sourceforge.net/p/dispcalgui/code/HEAD/tree/trunk/dispcalGUI/ICCProfile.py
Full disclosure: I have contributed to the ICC profile class linked in the previous paragraph.
That’s the basics (many of which you have no doubt covered)... beyond that, if you post more information about what exactly is going wrong, I’d be interested to look it over. Good luck with it either way.
* NB. This project is unrelated to the long-standing photography website, “the Luminous Landscape”
