Got an error decoding JPEG2000 in C++ with OpenJPEG

I am trying to decode JPEG2000 format with OpenJPEG.
(Decoding just one raw tile buffer from an SVS file.)
But I got an error on opj_read_header().
Is there something I forgot to set before calling opj_read_header()?
opj_image_t *image = NULL;
int32_t datalen = tileByteCounts[test_num];

opj_stream_t *stream = opj_stream_create(datalen, true);
struct buffer_state state =
{
    .data = buffer,
    .length = datalen,
};
opj_stream_set_user_data(stream, &state, NULL);
opj_stream_set_user_data_length(stream, datalen);
opj_stream_set_read_function(stream, read_callback);
opj_stream_set_skip_function(stream, skip_callback);
opj_stream_set_seek_function(stream, seek_callback);

opj_codec_t *codec = opj_create_decompress(OPJ_CODEC_JP2);

opj_dparameters_t parameters;
opj_set_default_decoder_parameters(&parameters);
parameters.decod_format = 1;
parameters.cod_format = 2;
parameters.DA_x0 = 0;
parameters.DA_y0 = 0;
parameters.DA_x1 = tileWidth;
parameters.DA_y1 = tileHeight;
opj_setup_decoder(codec, &parameters);

// enable error handlers
opj_set_warning_handler(codec, warning_callback, NULL);
opj_set_error_handler(codec, error_callback, NULL);

// read header
if (!opj_read_header(stream, codec, &image)) // It's calling error_callback!
{
    printf("error on reading header");
    return 1;
}
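The callbacks themselves are not shown above; OpenJPEG expects them to match its opj_stream_read_fn/opj_stream_skip_fn/opj_stream_seek_fn signatures. A minimal sketch of a buffer-backed read callback, assuming buffer_state also tracks a current offset field (the struct above only shows data and length):

static OPJ_SIZE_T read_callback(void *p_buffer, OPJ_SIZE_T p_nb_bytes, void *p_user_data)
{
    struct buffer_state *state = (struct buffer_state *)p_user_data;
    size_t remaining = state->length - state->offset; // 'offset' is an assumed field
    if (remaining == 0)
        return (OPJ_SIZE_T)-1; // OpenJPEG's end-of-stream value
    size_t n = p_nb_bytes < remaining ? p_nb_bytes : remaining;
    memcpy(p_buffer, state->data + state->offset, n);
    state->offset += n;
    return (OPJ_SIZE_T)n;
}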

I would use libvips to read from SVS images. It has a good openslide import operation written by the openslide devs, and handles this kind of thing well. It's free and cross-platform, and most Linux distributions have it in their package manager.
At the command-line you can write:
$ vipsheader CMU-1.svs
CMU-1.svs: 46000x32914 uchar, 4 bands, rgb, openslideload
$ time vips crop CMU-1.svs x.jpg 10 10 100 100
real 0m0.058s
user 0m0.050s
sys 0m0.008s
Be aware that jp2k is rather slow to decode. To convert the entire image, I see:
$ time vips copy CMU-1.svs x.jpg --vips-progress
vips temp-5: 46000 x 32914 pixels, 8 threads, 128 x 128 tiles, 256 lines in buffer
vips temp-5: done in 14.6s
real 0m14.720s
user 1m10.978s
sys 0m1.179s
In C++:
// compile with
// g++ crop.cpp `pkg-config vips-cpp --cflags --libs`

#include <vips/vips8>

int
main (int argc, char **argv)
{
  if (VIPS_INIT (argv[0]))
    vips_error_exit (NULL);

  vips::VImage image = vips::VImage::new_from_file (argv[1]);
  image.write_to_file (argv[2]);
}
There are C, Python, Ruby, JavaScript, PHP, C#, Go, Rust etc. bindings too.
You can write other formats by changing the output filename, or by setting optional arguments in the C++ code. For example, you can run that C++ program with:
$ ./a.out CMU-1.svs x.tif[compression=jpeg,tile,pyramid]
Or you could change the write_to_file to be:
image.tiffsave (argv[2], VImage::option()
    ->set ("compression", "jpeg")
    ->set ("tile", TRUE)
    ->set ("pyramid", TRUE));
The docs have all the details.


Using Lame function hip_decode in Android NDK to decode mp3 returns 0

I am using Lame's mpglib to decode mp3 to PCM in Android NDK for playing. But when I called hip_decode(), it returned 0, meaning "need more data before we can complete the decode". I have no idea how to solve it. Can someone help me? Here is my code:
void CBufferWrapper::ConvertMp3toPCM (AAssetManager* mgr, const char *filename)
{
    Print ("ConvertMp3toPCM:file:%s", filename);
    AAsset* asset = AAssetManager_open (mgr, filename, AASSET_MODE_UNKNOWN);

    // the asset might not be found
    assert (asset != NULL);

    // open asset as file descriptor
    off_t start, length;
    int fd = AAsset_openFileDescriptor (asset, &start, &length);
    assert (0 <= fd);

    long size = AAsset_getLength (asset);
    char* buffer = (char*)malloc (sizeof(char)*size);
    memset (buffer, 0, size*sizeof(char));
    AAsset_read (asset, buffer, size);
    AAsset_close (asset);

    hip_t ht = hip_decode_init ();
    int count = hip_decode (ht, (unsigned char*)buffer, size, pcm_l, pcm_r);
    free (buffer);

    Print ("ConvertMp3toPCM: length:%ld,pcmcount=%d", length, count);
}
I used the macro HAVE_MPGLIB to compile Lame in the NDK, so I think decoding itself should work.
Yesterday I had the same problem, only using lame_enc.dll instead. I did not know how to resolve the 0 return value either, which is the reason for this post.
Create a buffer to hold the mp3 data: unsigned char mp3Data[4096]
Create two buffers for the pcm data, but bigger than the mp3 one:
unsigned short lpcm[4096 * 100], rpcm[4096 * 100];
Open the mp3 file and initialize hip.
Now, enter a do/while loop until the number of bytes read is 0 (the end of file).
Inside the loop read 4096 bytes into mp3Data and call hip_decode with
hip_decode(ht, mp3Data, bytesRead, lpcm, rpcm);
You are right, it returns 0. It is asking you for more data.
You need to repeat the reading of 4096 bytes and the call to hip_decode until it returns a valid samples number.
Here is the important part of my program:
int total = 0;
int hecho = 0;
int leido = 0;
int lon = 0;
int x;
do
{
total = fread(mp3b, 1, MAXIMO, fich);
leido += total;
x = hip_decode(hgf, mp3b, total, izquierda, derecha);
if(x > 0)
{
int tamanio;
int y;
tamanio = 1.45 * x + 9200;
unsigned char * bu = (unsigned char *) malloc(tamanio);
y = lame_encode_buffer(lamglofla, izquierda, derecha, x, bu, tamanio);
fwrite(bu, 1, y, fichs);
free(bu);
}
}while(total > 0);
My program decodes an mp3 file and encodes the output into another mp3 file.
I hope this is useful.

Generating a comprehensive callgraph using GCC & Egypt

I am trying to generate a comprehensive callgraph (complete with low level calls to Linux, runtime, the lot).
I have statically compiled my source files with "-fdump-rtl-expand" and created RTL files, which I passed to a PERL script called Egypt (which I believe is Graphviz/Dot) and generated a PDF file of the callgraph. This works perfectly, no problems at all.
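For reference, the workflow looks roughly like this (the exact dump-file suffix varies by GCC version):

$ gcc -fdump-rtl-expand -c fib.c
$ egypt fib.c.*.expand > fib.dot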
Except, there are calls being made into some libraries that are getting shown as <built-in>. I was looking to see if there is a way for the callgraph not to print these as <built-in> and instead show the real calls made into the libraries.
Please let me know if the question is unclear.
http://i.imgur.com/sp58v.jpg
Basically, I am trying to stop the callgraph from showing <built-in> nodes.
Is there a way to do that?
-------- CODE ---------
#include <cilk/cilk.h>
#include <stdio.h>
#include <stdlib.h>

unsigned long int t0, t5;
unsigned int NOSPAWN_THRESHOLD = 32;

int fib_nospawn(int n)
{
    if (n < 2)
        return n;
    else
    {
        int x = fib_nospawn(n-1);
        int y = fib_nospawn(n-2);
        return x + y;
    }
}

// spawning fibonacci function
int fib(long int n)
{
    long int x, y;
    if (n < 2)
        return n;
    else if (n <= NOSPAWN_THRESHOLD)
    {
        x = fib_nospawn(n-1);
        y = fib_nospawn(n-2);
        return x + y;
    }
    else
    {
        x = cilk_spawn fib(n-1);
        y = cilk_spawn fib(n-2);
        cilk_sync;
        return x + y;
    }
}

int main(int argc, char *argv[])
{
    int n;
    long int result;
    long int exec_time;

    n = atoi(argv[1]);
    NOSPAWN_THRESHOLD = atoi(argv[2]);

    result = fib(n);
    printf("%ld\n", result);

    return 0;
}
I compiled the Cilk Library from source.
I might have found the partial solution to the problem:
You need to pass the following option to egypt
--include-external
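For example (file names are illustrative; the .expand files are what -fdump-rtl-expand emits):

$ egypt --include-external *.expand | dot -Tpdf -o callgraph.pdf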
This produced a slightly more comprehensive callgraph, although the "<built-in>" node is still visible:
http://i.imgur.com/GWPJO.jpg?1
Can anyone suggest how I can get more depth in the callgraph?
You can use the GCC VCG Plugin: A gcc plugin, which can be loaded when debugging gcc, to show internal structures graphically.
gcc -fplugin=/path/to/vcg_plugin.so -fplugin-arg-vcg_plugin-cgraph foo.c
The call-graph is the place to store data needed for inter-procedural optimization. All data structures are divided into three components: local_info, which is produced while analyzing the function; global_info, which is the result of a global walk of the call-graph at the end of compilation; and rtl_info, which is used by the RTL back-end to propagate data from already compiled functions to their callers.

atoi on a character array with lots of integers

I have code in which a character array is populated with integers (converted to char arrays) and read by another function which converts them back to integers. I used the following to convert to a char array:
char data[64];
int a = 10;
std::string str = boost::lexical_cast<std::string>(a);
memcpy(data + 8*k,str.c_str(),sizeof(str.c_str())); //k varies from 0 to 7
and the reconversion back to characters is done using:
char temp[8];
memcpy(temp,data+8*k,8);
int a = atoi(temp);
This works fine in general, but when I do it as part of a project involving Qt (version 4.7), it compiles fine but gives me segmentation faults when it tries to read using memcpy(). Note that the segmentation fault happens only in the reading loop, not while writing data. I don't know why this happens, but I want to get it done by any method.
So, are there any other functions I can use which take the character array, the first bit and the last bit, and convert it into an integer? Then I wouldn't have to use memcpy() at all. What I am trying to do is something like this:
new_atoi(data,8*k,8*(k+1)); // k varies from 0 to 7
Thanks in advance.
You are copying only 4 characters (dependent on your system's pointer width). This will leave numbers of 4+ characters non-null-terminated, leading to runaway strings in the input to atoi:
sizeof(str.c_str()) // i.e. sizeof(char*) = 4 (32-bit systems)
should be
str.length() + 1
or the characters will not be null-terminated.
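A sketch of the corrected write side, under the question's layout (8-byte slots, k from 0 to 7; values are assumed to stay under 8 characters including the terminator):

std::string str = boost::lexical_cast<std::string>(a);
// copy the digits plus the terminating '\0' into the 8-byte slot
memcpy(data + 8*k, str.c_str(), str.length() + 1);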
STL Only:
make_testdata(): see all the way down
Why don't you use streams...?
#include <sstream>
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

std::vector<int> make_testdata(); // defined all the way down

int main()
{
    std::vector<int> data = make_testdata();

    std::ostringstream oss;
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(oss, "\t"));

    std::stringstream iss(oss.str());
    std::vector<int> clone;
    std::copy(std::istream_iterator<int>(iss), std::istream_iterator<int>(),
              std::back_inserter(clone));

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
You could do it a lot faster in plain C with atoi/itoa and some tweaks, but I reckon you should be using binary transmission (see Boost Spirit Karma and protobuf for good libraries) if you need the speed.
Boost Karma/Qi:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>

#include <iterator>
#include <string>
#include <vector>

namespace qi=::boost::spirit::qi;
namespace karma=::boost::spirit::karma;

static const char delimiter = '\0';

std::vector<int> make_testdata(); // defined all the way down

int main()
{
    std::vector<int> data = make_testdata();

    std::string astext;
    // astext.reserve(3 * sizeof(data[0]) * data.size()); // heuristic pre-alloc
    std::back_insert_iterator<std::string> out(astext);

    {
        using namespace karma;
        generate(out, delimit(delimiter) [ *int_ ], data);
        // generate_delimited(out, *int_, delimiter, data); // equivalent
        // generate(out, int_ % delimiter, data);           // somehow much slower!
    }

    std::string::const_iterator begin(astext.begin()), end(astext.end());
    std::vector<int> clone;
    qi::parse(begin, end, qi::int_ % delimiter, clone);

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
If you wanted to do architecture independent binary serialization instead, you'd use this tiny adaptation making things a zillion times faster (see benchmark below...):
karma::generate(out, *karma::big_dword, data);
// ...
qi::parse(begin, end, *qi::big_dword, clone);
Boost Serialization
The best performance can be reached when using Boost Serialization in binary mode:
#include <sstream>
#include <vector>

#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/vector.hpp>

std::vector<int> make_testdata(); // defined all the way down

int main()
{
    std::vector<int> data = make_testdata();

    std::stringstream ss;
    {
        boost::archive::binary_oarchive oa(ss);
        oa << data;
    }

    std::vector<int> clone;
    {
        boost::archive::binary_iarchive ia(ss);
        ia >> clone;
    }

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
Testdata
(common to all versions above)
#include <algorithm>
#include <vector>

#include <boost/random.hpp>

// generates a deterministic pseudo-random vector of 32Mio ints
std::vector<int> make_testdata()
{
    std::vector<int> testdata;
    testdata.resize(2 << 24);
    std::generate(testdata.begin(), testdata.end(), boost::mt19937(0));
    return testdata;
}
Benchmarks
I benchmarked it by
using input data of 2<<24 (33554432) random integers
not displaying output (we don't want to measure the scrolling performance of our terminal)
The rough timings were:
the STL-only version isn't too bad actually, at 12.6s
the Karma/Qi text version ran in 5.1s (down from my originally reported 18s, thanks to Arlen's hint at generate_delimited :))
the Karma/Qi binary version (big_dword) in only 1.4s (roughly 3-4x as fast)
Boost Serialization takes the cake with around 0.8s (or, when substituting text archives for the binary ones, around 13s)
There is absolutely no reason for the Karma/Qi text version to be any slower than the STL version. I improved @sehe's implementation of the Karma/Qi text version to reflect that claim.
The following Boost Karma/Qi text version is more than twice as fast as the STL version:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/random.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

namespace ascii = boost::spirit::ascii;
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phoenix = boost::phoenix;

template <typename OutputIterator>
void generate_numbers(OutputIterator& sink, const std::vector<int>& v)
{
    using karma::int_;
    using karma::generate_delimited;
    using ascii::space;

    generate_delimited(sink, *int_, space, v);
}

template <typename Iterator>
void parse_numbers(Iterator first, Iterator last, std::vector<int>& v)
{
    using qi::int_;
    using qi::phrase_parse;
    using ascii::space;
    using qi::_1;
    using phoenix::push_back;
    using phoenix::ref;

    phrase_parse(first, last, *int_[push_back(ref(v), _1)], space);
}

int main(int argc, char* argv[])
{
    static boost::mt19937 rng(0); // make test deterministic

    std::vector<int> data;
    data.resize(2 << 24);
    std::generate(data.begin(), data.end(), rng);

    std::string astext;
    std::back_insert_iterator<std::string> out(astext);
    generate_numbers(out, data);
    //std::cout << astext << std::endl;

    std::string::const_iterator begin(astext.begin()), end(astext.end());
    std::vector<int> clone;
    parse_numbers(begin, end, clone);

    // verify that clone now contains the original random data:
    //std::copy(clone.begin(), clone.end(), std::ostream_iterator<int>(std::cout, ","));

    return 0;
}

How does one find the start of the "Central Directory" in zip files?

Wikipedia has an excellent description of the ZIP file format, but the "central directory" structure is confusing to me. Specifically this:
This ordering allows a ZIP file to be created in one pass, but it is usually decompressed by first reading the central directory at the end.
The problem is that even the trailing header for the central directory is variable length. How then, can someone get the start of the central directory to parse?
(Oh, and I did spend some time looking at APPNOTE.TXT in vain before coming here and asking :P)
My condolences, reading the wikipedia description gives me the very strong impression that you need to do a fair amount of guess + check work:
Hunt backwards from the end for the 0x06054b50 end-of-directory tag, look forward 16 bytes to find the offset for the start-of-directory tag 0x02014b50, and hope that is it. You could do some sanity checks like looking for the comment length and comment string tags after the end-of-directory tag, but it sure feels like Zip decoders work because people don't put funny characters into their zip comments, filenames, and so forth. Based entirely on the wikipedia page, anyhow.
I was implementing zip archive support some time ago, and I search the last few kilobytes for the end-of-central-directory signature (4 bytes). That works pretty well, until somebody puts 50 KB of text into the comment (which is unlikely to happen). To be absolutely sure, you can search the last 64 KB plus a few bytes, since the comment size is a 16-bit field.
After that, I look for the ZIP64 end-of-central-directory locator; that's easier since it has a fixed structure.
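For reference, that locator is a fixed 20-byte record; a sketch of its layout (needs <cstdint>; packing pragma is compiler-specific):

#pragma pack(push, 1)
struct Zip64EOCDLocator {
    uint32_t signature;     // 0x07064b50
    uint32_t eocd64_disk;   // number of the disk with the start of the zip64 EOCD
    uint64_t eocd64_offset; // relative offset of the zip64 EOCD record
    uint32_t total_disks;   // total number of disks
};
#pragma pack(pop)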
Here is a solution I have just had to roll out, in case anybody needs this. It involves grabbing the central directory.
In my case I did not want any of the compression features offered in the usual zip solutions; I just wanted to know about the contents. The following code will return a ZipArchive with a listing of every entry in the zip.
It also uses a minimal amount of file access and memory allocation.
TinyZip.cpp
#include "TinyZip.h"
#include <cstdio>
namespace TinyZip
{
#define VALID_ZIP_SIGNATURE 0x04034b50
#define CENTRAL_DIRECTORY_EOCD 0x06054b50 //signature
#define CENTRAL_DIRECTORY_ENTRY_SIGNATURE 0x02014b50
#define PTR_OFFS(type, mem, offs) *((type*)(mem + offs)) //SHOULD BE OK
typedef struct {
unsigned int signature : 32;
unsigned int number_of_disk : 16;
unsigned int disk_where_cd_starts : 16;
unsigned int number_of_cd_records : 16;
unsigned int total_number_of_cd_records : 16;
unsigned int size_of_cd : 32;
unsigned int offset_of_start : 32;
unsigned int comment_length : 16;
} ZipEOCD;
ZipArchive* ZipArchive::GetArchive(const char *filepath)
{
FILE *pFile = nullptr;
#ifdef WIN32
errno_t err;
if ((err = fopen_s(&pFile, filepath, "rb")) == 0)
#else
if ((pFile = fopen(filepath, "rb")) == NULL)
#endif
{
int fileSignature = 0;
//Seek to start and read zip header
fread(&fileSignature, sizeof(int), 1, pFile);
if (fileSignature != VALID_ZIP_SIGNATURE) return false;
//Grab the file size
long fileSize = 0;
long currPos = 0;
fseek(pFile, 0L, SEEK_END);
fileSize = ftell(pFile);
fseek(pFile, 0L, SEEK_SET);
//Step back the size of the ZipEOCD
//If it doesn't have any comments, should get an instant signature match
currPos = fileSize;
int signature = 0;
while (currPos > 0)
{
fseek(pFile, currPos, SEEK_SET);
fread(&signature, sizeof(int), 1, pFile);
if (signature == CENTRAL_DIRECTORY_EOCD)
{
break;
}
currPos -= sizeof(char); //step back one byte
}
if (currPos != 0)
{
ZipEOCD zipOECD;
fseek(pFile, currPos, SEEK_SET);
fread(&zipOECD, sizeof(ZipEOCD), 1, pFile);
long memBlockSize = fileSize - zipOECD.offset_of_start;
//Allocate zip archive of size
ZipArchive *pArchive = new ZipArchive(memBlockSize);
//Read in the whole central directory (also includes the ZipEOCD...)
fseek(pFile, zipOECD.offset_of_start, SEEK_SET);
fread((void*)pArchive->m_MemBlock, memBlockSize - 10, 1, pFile);
long currMemBlockPos = 0;
long currNullTerminatorPos = -1;
while (currMemBlockPos < memBlockSize)
{
int sig = PTR_OFFS(int, pArchive->m_MemBlock, currMemBlockPos);
if (sig != CENTRAL_DIRECTORY_ENTRY_SIGNATURE)
{
if (sig == CENTRAL_DIRECTORY_EOCD) return pArchive;
return nullptr; //something went wrong
}
if (currNullTerminatorPos > 0)
{
pArchive->m_MemBlock[currNullTerminatorPos] = '\0';
currNullTerminatorPos = -1;
}
const long offsToFilenameLen = 28;
const long offsToFieldLen = 30;
const long offsetToFilename = 46;
int filenameLength = PTR_OFFS(int, pArchive->m_MemBlock, currMemBlockPos + offsToFilenameLen);
int extraFieldLen = PTR_OFFS(int, pArchive->m_MemBlock, currMemBlockPos + offsToFieldLen);
const char *pFilepath = &pArchive->m_MemBlock[currMemBlockPos + offsetToFilename];
currNullTerminatorPos = (currMemBlockPos + offsetToFilename) + filenameLength;
pArchive->m_Entries.push_back(pFilepath);
currMemBlockPos += (offsetToFilename + filenameLength + extraFieldLen);
}
return pArchive;
}
}
return nullptr;
}
ZipArchive::ZipArchive(long size)
{
m_MemBlock = new char[size];
}
ZipArchive::~ZipArchive()
{
delete[] m_MemBlock;
}
const std::vector<const char*> &ZipArchive::GetEntries()
{
return m_Entries;
}
}
TinyZip.h
#ifndef __TinyZip__
#define __TinyZip__

#include <vector>
#include <string>

namespace TinyZip
{
    class ZipArchive
    {
    public:
        ZipArchive(long memBlockSize);
        ~ZipArchive();

        static ZipArchive* GetArchive(const char *filepath);
        const std::vector<const char*> &GetEntries();

    private:
        std::vector<const char*> m_Entries;
        char *m_MemBlock;
    };
}

#endif
Usage:
TinyZip::ZipArchive *pArchive = TinyZip::ZipArchive::GetArchive("Scripts_unencrypt.pak");
if (pArchive != nullptr)
{
    const std::vector<const char*> entries = pArchive->GetEntries();
    for (auto entry : entries)
    {
        // do stuff
    }
}
In case someone out there is still struggling with this problem - have a look at the repository I hosted on GitHub containing my project that could answer your questions.
Zip file reader
Basically what it does is download the central directory part of the .zip file, which resides at the end of the file.
Then it reads out every file and folder name, with its path, from the bytes and prints it to the console.
I have made comments about the more complicated steps in my source code.
The program only works for .zip files up to about 4GB. After that you will have to make some changes to the VM size, and maybe more.
Enjoy :)
I recently encountered a similar use-case and figured I would share my solution for posterity since this post helped send me in the right direction.
Using the Zip file central directory offsets detailed on Wikipedia here, we can take the following approach to parse the central directory and retrieve a list of the contained files:
STEPS:
Find the end of central directory record (EOCDR) by scanning the zip file in binary format for the EOCDR signature (0x06054b50), beginning at the end of the file (i.e. read the file in reverse, using std::ios::ate if using an ifstream)
Use the offset located in the EOCDR (16 bytes from the EOCDR) to position the stream reader at the beginning of the central directory
Use the offset (46 bytes from the CD start) to position the stream reader at the file name and track its position start point
Scan until either another central directory header is found (0x02014b50) or the EOCDR is found, and track the position
Reset the reader to the start of the file name and read until the end
Position the reader over the next header, or terminate if the EOCDR is found
The key point here is that the EOCDR is uniquely identified by a signature (0x06054b50) that occurs only one time. Using the 16 byte offset, we can position ourselves to the first occurrence of the central directory header (0x02014b50). Each record will have the same 0x02014b50 header signature, so you just need to loop through occurrences of the header signatures until you hit the EOCDR ending signature (0x06054b50) again.
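A minimal sketch of that backwards scan for the EOCDR signature (error handling omitted; the 22-byte minimum record size plus a comment of up to 65535 bytes bounds the search):

#include <algorithm>
#include <cstdint>
#include <fstream>

// Returns the offset of the EOCDR signature (0x06054b50), or -1 if not found.
std::int64_t find_eocdr(std::ifstream &file)
{
    file.seekg(0, std::ios::end);
    std::int64_t fileSize = file.tellg();
    std::int64_t scanLimit = std::min<std::int64_t>(fileSize, 22 + 65535);
    for (std::int64_t back = 22; back <= scanLimit; ++back) {
        file.seekg(fileSize - back, std::ios::beg);
        unsigned char sig[4];
        file.read(reinterpret_cast<char *>(sig), 4);
        // the signature is stored little-endian on disk: "PK\x05\x06"
        if (sig[0] == 0x50 && sig[1] == 0x4b && sig[2] == 0x05 && sig[3] == 0x06)
            return fileSize - back;
    }
    return -1;
}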
SUMMARY:
If you want to see a working example of the above steps, you can check out my minimal implementation (ZipReader) on GitHub here. The implementation can be used like this:
ZipReader zr;

if (zr.SetInput("blah.zip") == ZipReaderStatus::S_FAIL)
    std::cout << "set input error" << std::endl;

std::vector<std::string> entries;
if (zr.GetEntries(entries) == ZipReaderStatus::S_FAIL)
    std::cout << "get entries error" << std::endl;

for (auto entry : entries)
    std::cout << entry << std::endl;

How to get the size of a gunzipped file in vim

When viewing (or editing) a .gz file, vim knows to locate gunzip and display the file properly.
In such cases, getfsize(expand("%")) would be the size of the gzipped file.
Is there a way to get the size of the expanded file?
[EDIT]
Another way to solve this might be to get the size of the current buffer, but there seems to be no such function in vim. Am I missing something?
There's no easy way to get the uncompressed size of a gzipped file, short of uncompressing it and using the getfsize() function. That might not be what you want. I took a look at RFC 1952 - GZIP File Format Specification, and the only thing that might be useful is the ISIZE field, which contains "...the size of the original (uncompressed) input data modulo 2^32".
EDIT:
I don't know if this helps, but here's some proof-of-concept C code I threw together that retrieves the value of the ISIZE field in a gzip'd file. It works for me using Linux and gcc, but your mileage may vary. If you compile the code, and then pass in a gzip'd filename as a parameter, it will tell you the uncompressed size of the original file.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

int main(int argc, char *argv[])
{
    FILE *fp = NULL;
    int i = 0;

    if ( argc != 2 ) {
        fprintf(stderr, "Must specify file to process.\n" );
        return -1;
    }

    // Open the file for reading
    if (( fp = fopen( argv[1], "r" )) == NULL ) {
        fprintf( stderr, "Unable to open %s for reading: %s\n", argv[1], strerror(errno));
        return -1;
    }

    // Look at the first two bytes and make sure it's a gzip file
    int c1 = fgetc(fp);
    int c2 = fgetc(fp);
    if ( c1 != 0x1f || c2 != 0x8b ) {
        fprintf( stderr, "File is not a gzipped file.\n" );
        return -1;
    }

    // Seek to four bytes from the end of the file
    fseek(fp, -4L, SEEK_END);

    // Array containing the last four bytes
    unsigned char read[4];

    for (i=0; i<4; ++i ) {
        int charRead = 0;
        if ((charRead = fgetc(fp)) == EOF ) {
            // This shouldn't happen
            fprintf( stderr, "Read end-of-file" );
            exit(1);
        }
        else
            read[i] = (unsigned char)charRead;
    }

    // Copy the last four bytes into an int. This could also be done
    // using a union. (Note: ISIZE is stored little-endian, so this
    // assumes a little-endian host.)
    int intval = 0;
    memcpy( &intval, &read, 4 );

    printf( "The uncompressed filesize was %d bytes (0x%02x hex)\n", intval, intval );

    fclose(fp);
    return 0;
}
This appears to work for getting the byte count of a buffer
(line2byte(line("$")+1)-1)
If you're on Unix/linux, try
:%!wc -c
That's in bytes. (It works on Windows if you have e.g. cygwin installed.) Then hit u to get your content back.
HTH
From within vim editor, try this:
<Esc>:!wc -c my_zip_file.gz
That will display the number of bytes in the file.
