Information about JPEG-LS is easily found on Google and in a lot of DICOM chapters.
However, there are also links, pages, and readings that mention JPEG-LL. I have taken a deeper look at the DICOM standard, and not one chapter ever mentions JPEG-LL, yet it comes up in a lot of conformance statements, forums, and articles...
So, what is the difference between JPEG-LS and JPEG-LL?
I suppose JPEG-LL is the lossless version of JPEG (usually called JPEG Lossless), with transfer syntax 1.2.840.10008.1.2.4.70, and it's the most common lossless compression in DICOM. However, it's not the best option available (it does not support signed pixel values, and its compression ratios are only about 2-2.5).
JPEG-LS is a completely different standard, with transfer syntax 1.2.840.10008.1.2.4.80, developed by HP; it is faster and achieves better compression ratios than the original lossless JPEG. However, it's not widely used.
I suppose JPEG-LL refers to ITU-T T.81 / ISO/IEC IS 10918-1, a.k.a. JPEG lossless, while JPEG-LS refers to ITU-T T.87 / ISO/IEC IS 14495-1.
Technically speaking, JPEG-LL is defined in exactly the same standard as the usual 8-bit lossy JPEG found on websites; it is simply a different (lossless) coding process within T.81. JPEG-LS is much, much less common.
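For anyone trying to match the jargon in conformance statements to actual transfer syntaxes, here is a minimal lookup sketch (the UIDs and names come from the DICOM registry in PS3.6; the helper function itself is just my illustration):

#include <map>
#include <string>

// (Near-)lossless JPEG transfer syntaxes registered in DICOM PS3.6.
const std::map<std::string, std::string> kLosslessJpegSyntaxes = {
    {"1.2.840.10008.1.2.4.57", "JPEG Lossless, Non-Hierarchical (Process 14)"},
    {"1.2.840.10008.1.2.4.70", "JPEG Lossless, Process 14, Selection Value 1 (usually what is meant by JPEG-LL)"},
    {"1.2.840.10008.1.2.4.80", "JPEG-LS Lossless"},
    {"1.2.840.10008.1.2.4.81", "JPEG-LS Lossy (Near-Lossless)"},
};

// Illustrative helper: describe a transfer syntax UID found in a DICOM file.
std::string DescribeTransferSyntax(const std::string& uid) {
    auto it = kLosslessJpegSyntaxes.find(uid);
    return it != kLosslessJpegSyntaxes.end() ? it->second
                                             : "not a (near-)lossless JPEG transfer syntax";
}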
As the title states: Both the jpeg_compress_struct and the jpeg_decompress_struct in libjpeg have a field defined like this:
boolean CCIR601_sampling; /* TRUE=first samples are cosited */
I am having a hard time figuring out what this means, or how it's supposed to be used. If you try to set this flag to true, either for compression or decompression, libjpeg will simply trigger a fatal error with this message:
JMESSAGE(JERR_CCIR601_NOTIMPL, "CCIR601 sampling not implemented yet")
The "yet" is amusing because it's been this way for 20+ years now, at least back to libjpeg62.
So, what is CCIR601_sampling supposed to do? Is it meant as a user-settable parameter for compression, decompression, or both? Is it stored as part of the file format? And why has it never actually been implemented?
I have asked the libjpeg-turbo maintainer about this on the mailing list (https://groups.google.com/g/libjpeg-turbo-users/c/Aeacg_cq5ms). Here is part of the response:
To the best of my understanding, the libjpeg API and algorithms adhere to the RGB-to-YCbCr conversion formulae specified in CCIR 601 (now ITU-R Recommendation BT.601). The "CCIR601_sampling" field in the libjpeg API is meant to allow for future support of co-sited Cb and Cr samples-- that is, to allow for the sample arrangement used in MPEG-2. That sample arrangement is non-planar and specifies a row of Y samples, then a row of packed Cb/Cr samples, then another row of Y samples, etc.
... Thus, the fact that Rec. 601 sampling isn't implemented in libjpeg v6b means that JPEG files with that sampling arrangement are basically non-existent "in the wild." The JPEG specification supports other features, including a lossless mode, but ultimately, the de facto definition of the "JPEG format" converged to the subset of features implemented by libjpeg v6b (per Tom Lane's original goal.) To this day, that same chicken-and-egg phenomenon means that web browsers don't support arithmetic-coded JPEG files, even though the patent on arithmetic coding expired long ago and libjpeg-turbo supports those files.
... The "CCIR601_sampling" field remains in the API because the API structures are exposed. Thus, removing the field would break backward ABI compatibility, and backward ABI compatibility is one of the primary reasons (performance is the other) why libjpeg-turbo became the preferred open source JPEG library.
In conclusion: CCIR601_sampling was intended as a user-settable parameter for JPEG compression, which would have produced a JPEG file containing "co-sited" CbCr components (both components stored packed together as one "component", instead of remaining two separate Cb and Cr planes). On decompression, jpeg_read_header() should set the field in the structure to indicate that the JPEG is CCIR601 formatted (it is not a user-settable decompression parameter, but rather an indicator).
Of course, since libjpeg never supported this mode, no JPEGs exist that use it, so there has never been a practical need to implement it.
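For the curious, here is a minimal sketch (my own, not from libjpeg's documentation) of what setting the flag looks like on the compression side; with a stock libjpeg or libjpeg-turbo, the default error manager aborts with the JERR_CCIR601_NOTIMPL message as soon as compression starts:

#include <stdio.h>
#include <jpeglib.h>

int main(void) {
    struct jpeg_compress_struct cinfo;
    struct jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);   // default error manager: prints the message and exits
    jpeg_create_compress(&cinfo);

    FILE *out = fopen("test.jpg", "wb");
    jpeg_stdio_dest(&cinfo, out);

    cinfo.image_width = 16;
    cinfo.image_height = 16;
    cinfo.input_components = 3;
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);           // this resets CCIR601_sampling to FALSE...

    cinfo.CCIR601_sampling = TRUE;       // ...so set the never-implemented flag afterwards

    // jpeg_start_compress() initialises the colour converter, which is where
    // "CCIR601 sampling not implemented yet" is raised as a fatal error.
    jpeg_start_compress(&cinfo, TRUE);

    // Never reached with current libjpeg releases.
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    fclose(out);
    return 0;
}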
Could anyone please briefly explain the difference between JPEG and EZW? And why is JPEG more popular? Is JPEG always better than EZW, or just in most cases?
Thank you very much!
EZW (embedded zerotree wavelet coding) is a theoretical technique that can be used as one step in a wavelet-based compressor. It's not a complete image encoder and can't be used on its own. As best I can tell, no image format has ever been proposed that depends on EZW, so nothing uses it.
(As an aside, wavelet image compression techniques have generally failed to gain widespread adoption.)
JPEG, by contrast, is a standard which encompasses all layers of an image compressor, including the DCT as well as everything surrounding it: color space, entropy coding, file format, metadata, etc. Unlike EZW, it's been a complete, usable standard since 1992.
It just seems to me that when writing code for dynamic data visualization, I end up doing the same things over and over in different languages/platforms. If I had a cross-platform language (which I do) and something like a binary version of SVG, I could make my code target that and use/create interpreters for whatever platform I currently need.
The reason I don't want SVG itself is that the plain-text format makes it too slow for my purposes. I could of course create my own intermediate format, but if something already exists that is implemented by various tools, that's less work for me!
Depending on what you mean by “too slow”, the answer varies:
Filesize too large
Officially, the closest thing SVG has to a binary format is SVGZ, which is a gzipped SVG file with the .svgz extension. All conforming SVG viewers should be able to open it. Making one is simple on *nix systems:
gzip yourfile.svg && mv yourfile.svg.gz yourfile.svgz
You could also try Brotli compression (for example as HTTP content-encoding), which tends to produce smaller files at the cost of more compression time.
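If you're generating the SVG programmatically, you can also skip the external gzip step; a small sketch using zlib's gzip file API (the zlib calls are real, the file names are placeholders):

#include <string>
#include <zlib.h>

// Write an SVG string straight out as a .svgz (gzip-wrapped SVG) file.
bool write_svgz(const std::string& path, const std::string& svg_text) {
    gzFile f = gzopen(path.c_str(), "wb9");   // "9" = maximum compression level
    if (!f) return false;
    int written = gzwrite(f, svg_text.data(), static_cast<unsigned>(svg_text.size()));
    gzclose(f);
    return written == static_cast<int>(svg_text.size());
}

// Usage: write_svgz("chart.svgz", generated_svg_markup);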
Including other assets is inefficient
SVG can only bundle bitmaps and other binary data through base64 encoding, which adds roughly 33% overhead.
PDF can include “streams” of raw binary data, and is surprisingly efficient when programmatically generated.
Parsing the text data takes too long
This is tricky. PDF and its close relative, Encapsulated PostScript, are also old, well-supported vector graphics formats. Unfortunately, they too are text at their core, with optional compression.
You could try Computer Graphics Metafiles, which can be compiled ahead of time. But I’m unsure how well-supported they are across consumer devices.
From a comment:
Almost nothing about the performance of SVG other than the transmission cost of sending it over a network is down to it being plaintext
No, that's completely wrong. I worked at CSIRO using XML for massive 3D models, and GeoScience Australia did a formal study of parsing speed: parsing floating-point numbers from text is relatively expensive for big data sets, compared to reading a 4- or 8-byte binary representation.
I've spent a lot of time optimising my internal binary formats for Touchgram and am now looking at vector art.
One of the techniques you can use is a combination of
variable-length integer coding and
normalising your points to a scale represented by integers, then storing paths as sequences of deltas
That can yield paths where often only 1 or 2 bytes are used per step, as opposed to the typical 12.
Consider a basic line
<polyline class="Connect" points="100,200 100,100" />
I could represent that with 4 bytes instead of 53.
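Here is a rough sketch of that delta-plus-varint idea (my own illustration, not Touchgram's actual format). With the polyline above it comes to 7 bytes rather than 53 characters of XML, and a coarser integer scale would get most steps down to a byte or two:

#include <cstdint>
#include <utility>
#include <vector>

// Zigzag-map a signed delta so small magnitudes become small unsigned codes,
// then emit it as a base-128 varint (7 data bits plus a continuation bit per byte).
static void put_varint(std::vector<uint8_t>& out, int32_t v) {
    uint32_t zz = (static_cast<uint32_t>(v) << 1) ^ static_cast<uint32_t>(v >> 31);
    do {
        uint8_t byte = zz & 0x7F;
        zz >>= 7;
        if (zz != 0) byte |= 0x80;   // more bytes follow
        out.push_back(byte);
    } while (zz != 0);
}

// Encode a polyline as deltas from the previous (already integer-quantised) point.
std::vector<uint8_t> encode_polyline(const std::vector<std::pair<int32_t, int32_t>>& pts) {
    std::vector<uint8_t> out;
    int32_t px = 0, py = 0;
    for (const auto& p : pts) {
        put_varint(out, p.first - px);
        put_varint(out, p.second - py);
        px = p.first;
        py = p.second;
    }
    return out;
}

// encode_polyline({{100, 200}, {100, 100}}) -> 7 bytes; the second point costs
// only 3 of them because its deltas (0, -100) are small.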
So far, all I've been able to find in the way of binary SVG is this post about a Go project, linking to the project description and repo.
Adobe Flash SWF files may work. Due to Flash's former ubiquity, 'players' and libraries were written for many platforms, and the specification is open with a permissive license. For simple 2D graphics, the earlier, more widely compatible versions would do fine.
The files are binary and extraordinarily small.
[Note: This is a rewrite of an earlier question that was considered inappropriate and closed.]
I need to do some pixel-level analysis of television (TV) video. The exact nature of this analysis is not pertinent, but it basically involves looking at every pixel of every frame of TV video, starting from an MPEG-2 transport stream. The host platform will be server-class, multiprocessor 64-bit Linux machines.
I need a library that can handle the decoding of the transport stream and present me with the image data in real-time. OpenCV and ffmpeg are two libraries that I am considering for this work. OpenCV is appealing because I have heard it has easy to use APIs and rich image analysis support, but I have no experience using it. I have used ffmpeg in the past for extracting video frame data from files for analysis, but it lacks image analysis support (though Intel's IPP can supplement).
In addition to general recommendations for approaches to this problem (excluding the actual image analysis), I have some more specific questions that would help me get started:
1. Are ffmpeg or OpenCV commonly used in industry as a foundation for real-time video analysis, or is there something else I should be looking at?
2. Can OpenCV decode video frames in real time, and still leave enough CPU left over to do nontrivial image analysis, also in real time?
3. Is it sufficient to use ffmpeg for MPEG-2 transport stream decoding, or is it preferable to use an MPEG-2 decoding library directly (and if so, which one)?
4. Are there particular pixel formats for the output frames that ffmpeg or OpenCV is particularly efficient at producing (RGB, YUV, YUV422, etc.)?
1.
I would definitely recommend OpenCV for "real-time" image analysis. I assume by real-time you mean the ability to keep up with TV frame rates (e.g., NTSC at 29.97 fps or PAL at 25 fps). Of course, as mentioned in the comments, it also depends on the hardware you have available as well as the image size, SD (480p) vs. HD (720p or 1080p). FFmpeg certainly has its quirks, but you would be hard pressed to find a better free alternative. Its power and flexibility are quite impressive; I'm sure that is one of the reasons the OpenCV developers decided to use it as the back end for video decoding/encoding in OpenCV.
2.
I have not seen issues with high latency while using OpenCV for decoding. How much latency can your system tolerate? If you need to increase performance, consider using separate threads for capture/decoding and for image analysis. Since you mentioned having multi-processor systems, this will let you take greater advantage of your processing capabilities. I would definitely recommend using the latest Intel Core i7 (or possibly the Xeon equivalent) architecture, as this will give you the best performance available today.
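A minimal sketch of that capture/analysis split (assuming cv::VideoCapture can open your stream and a simple mutex-protected queue is enough; a production pipeline would bound the queue and handle errors):

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <opencv2/opencv.hpp>

std::queue<cv::Mat> frames;
std::mutex mtx;
std::condition_variable frame_ready;
bool done = false;

void capture_loop(const std::string& source) {
    cv::VideoCapture cap(source);            // anything the ffmpeg backend can open
    cv::Mat frame;
    while (cap.read(frame)) {
        std::lock_guard<std::mutex> lock(mtx);
        frames.push(frame.clone());          // clone: cap.read() reuses its internal buffer
        frame_ready.notify_one();
    }
    std::lock_guard<std::mutex> lock(mtx);
    done = true;
    frame_ready.notify_one();
}

void analysis_loop() {
    for (;;) {
        std::unique_lock<std::mutex> lock(mtx);
        frame_ready.wait(lock, [] { return !frames.empty() || done; });
        if (frames.empty()) break;
        cv::Mat frame = frames.front();
        frames.pop();
        lock.unlock();
        // ... per-frame pixel analysis goes here ...
    }
}

int main() {
    std::thread cap(capture_loop, "input.ts");   // "input.ts" is a placeholder path
    std::thread ana(analysis_loop);
    cap.join();
    ana.join();
}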
I have used OpenCV on several embedded systems, so I'm quite familiar with your desire for peak performance. I have found many times that it was unnecessary to process a full-frame image (especially when trying to determine masks). I would highly recommend down-sampling the images if you are having difficulty processing your acquired video streams. This can sometimes give you an instant 4-8x speedup (depending on the down-sample factor). Also on the performance front, I would definitely recommend using Intel's IPP. Since OpenCV was originally an Intel project, IPP and OpenCV blend very well together.
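For instance, a tiny down-sampling sketch (cv::pyrDown halves each dimension, so the analysis touches a quarter of the pixels; cv::resize gives arbitrary factors):

#include <opencv2/opencv.hpp>

// Analyse a half-resolution copy; any coordinates found scale back up by 2.
void analyse_downsampled(const cv::Mat& frame) {
    cv::Mat half;
    cv::pyrDown(frame, half);   // Gaussian blur + 2x decimation in each dimension
    // or: cv::resize(frame, half, cv::Size(), 0.25, 0.25, cv::INTER_AREA);
    // ... run the expensive analysis on 'half' instead of 'frame' ...
}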
Finally, because image processing is one of those "embarrassingly parallel" problem domains, don't forget about the possibility of using GPUs as hardware accelerators if needed. OpenCV has been doing a lot of work in this area as of late, so those tools should be available to you if needed.
3.
I think FFmpeg would be a good starting point; most of the alternatives I can think of (Handbrake, mencoder, etc.) tend to use ffmpeg as a backend, but it looks like you could probably roll your own with IPP's Video Coding library if you wanted to.
4.
OpenCV's internal representation of colors is BGR unless you use something like cvtColor to convert it. If you would like to see a list of the pixel formats that are supported by FFmpeg, you can run
ffmpeg -pix_fmts
to see what it can input and output.
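For example, a trivial sketch of the cvtColor conversions mentioned above (the constants are OpenCV's own; whether you actually want these conversions depends on the point about conversion cost made in the next answer):

#include <opencv2/opencv.hpp>

void convert_examples(const cv::Mat& bgr_frame) {
    cv::Mat gray, yuv;
    cv::cvtColor(bgr_frame, gray, cv::COLOR_BGR2GRAY);  // single-channel 8-bit
    cv::cvtColor(bgr_frame, yuv,  cv::COLOR_BGR2YUV);   // 3-channel YUV, no chroma subsampling
}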
For the 4th question only:
Video streams are encoded in a chroma-subsampled YUV format (YUV 4:2:0 or 4:2:2, i.e. YCbCr). Converting them to BGR and back (for re-encoding) eats up lots of time. So if you can write your algorithms to run on YUV, you'll get an instant performance boost.
Note 1. While OpenCV natively works with BGR images, you can make it process YUV, with some care and knowledge about its internals.
For example, if you want to detect people in the video, just take the Y (luma) plane at the start of the decoded video buffer (it is the grayscale representation of the image) and process it.
Note 2. If you want to access the YUV image in OpenCV, you must use the ffmpeg API directly in your app; OpenCV forces the conversion from YUV to BGR in its VideoCapture API.
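A small sketch of that approach, assuming you decode with the ffmpeg libraries (libavcodec) into a planar YUV 4:2:0 AVFrame: the Y plane can be wrapped in a cv::Mat header without copying and processed as a grayscale image.

#include <opencv2/opencv.hpp>
extern "C" {
#include <libavutil/frame.h>
}

// Wrap the luma (Y) plane of a decoded planar YUV frame as an 8-bit grayscale Mat.
// No pixel data is copied, so the Mat is only valid while 'frame' is.
cv::Mat luma_view(const AVFrame* frame) {
    return cv::Mat(frame->height, frame->width, CV_8UC1,
                   frame->data[0], static_cast<size_t>(frame->linesize[0]));
}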
We have some raw voice audio that we need to distribute over the internet. We need decent quality, but it doesn't need to be of musical quality. Our main concern is usability by the consumer (i.e. what and where they can play it) and size of the download. My experience has shown that mp3s do not produce the best compression numbers for voice audio, but I am at a loss for what the best alternatives are. Ultimately we would like to automate the conversion process to allow the consumer to choose the quality vs. size level that they would like.
You should give Opus a try. Example compression command line:
ffmpeg -i x.wav -b:a 32k x.opus
Start here.
As you rightly point out, voice compression is different from general audio compression. You'll find many codecs dedicated to telephony applications, ranging from PCM and ADPCM through later CELP-family encodings such as those used on GSM cellular networks.
Still, VoIP voice encoding is slightly different again, due to the medium used. You can find a good, free (unencumbered, BSD-licensed open source) library for speech encoding/decoding in the Speex software library.
Again, which you choose depends on the speech you're encoding and the medium it's being transmitted over. Also note that many libraries have several algorithms they can use depending on the circumstances, and some will even switch on the fly based on conditions of the sound and network.
To get more help, narrow your question down.
-Adam
The compression formats most frequently used for live voice audio (like VoIP telephony) are µ-law (mu-law/u-law, used in North America and Japan) and A-law (used in Europe and most of the rest of the world). Unlike uncompressed linear PCM, they use logarithmic companding: each sample is stored in 8 bits that cover a wide dynamic range at reduced precision, so a voice signal needs half the space of 16-bit PCM while still sounding fine for speech.
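For reference, here is a sketch of the classic G.711 µ-law encoder (the algorithm is standard; this particular implementation is my own and only lightly checked). Each 16-bit linear sample becomes one logarithmically companded byte:

#include <cstdint>

// Convert one 16-bit linear PCM sample to an 8-bit mu-law byte (G.711 style).
uint8_t linear_to_mulaw(int16_t sample) {
    const int kBias = 0x84;   // 132; biases the value so the segment search works
    const int kClip = 32635;

    int pcm = sample;
    int sign = (pcm >> 8) & 0x80;   // keep the sign bit
    if (sign) pcm = -pcm;
    if (pcm > kClip) pcm = kClip;
    pcm += kBias;

    // Find the segment (exponent): the highest set bit among bits 7..14.
    int exponent = 7;
    for (int mask = 0x4000; (pcm & mask) == 0 && exponent > 0; mask >>= 1)
        exponent--;

    int mantissa = (pcm >> (exponent + 3)) & 0x0F;
    return static_cast<uint8_t>(~(sign | (exponent << 4) | mantissa));
}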
For usability's sake it is easiest to use MPEG compression (MP2/MP3/MP4 audio) for streaming to standard media players: the algorithms are readily available, typically quite fast, and almost all media players support them. For voice you might specify a lower bitrate, or do your conversion from a lower-quality source file in the first place (WAV can use several sampling rates, and voice requires a much lower sampling rate than music or effects; it's roughly analogous to frames per second in video). Alternatively, you can use RealMedia, WMA, or other proprietary formats, but this would limit usability, since users would need specific third-party software for playback; that said, WMA has an excellent compression ratio as well as compression options specific to voice audio.
Assuming your users will be running Windows, there is a WMA speech compression codec that you can use with the Windows Media Encoder SDK. Failing that, you can use ACM to encode with something like G.723/G.728, ADPCM, mu-law, or A-law, some of which are installed as standard on Windows XP and above. These can be packaged inside WAV files. You'll need to experiment a little to find the right bitrate/quality trade-off (probably don't bother with mu-law or A-law). With voice data you can get away with quite low sample rates, e.g. 16000 or 8000 Hz, as there isn't much above 4 kHz in the human spoken voice.
I think AMR is one of the best speech codecs. I was using it about a year ago, and I remember that the quality was very good and the file sizes were rather small.
One drawback, especially in your case, is that as far as I know it isn't supported by a wide range of media players. QuickTime and RealPlayer are the two I know of that play .amr files.
Try Speex ... it's unencumbered by patents and performs well both size-wise and CPU-wise. I've been having good luck using it on the iPhone.