This question is similar to what is asked in "image data as source in gstreamer".
Here is my requirement:
1. There is a binary file [with any extension] which contains multiple images [say 10 images in one binary file].
2. The binary file is extracted and the images are saved to a folder on Windows as .jpg files.
3. For video display I used the pipeline below:
gst-launch-1.0 multifilesrc location=":/Images/%d.jpg" caps="image/jpeg,framerate=10/5" ! jpegdec ! autovideosink
** Under the Images folder, the .jpg files are stored with names 1.jpg, 2.jpg, etc. The %d pattern takes the images from that path, starting from index 1 up to the highest number [the images are numbered continuously].
Everything looks fine up to here. Now the requirement is to skip step #2, i.e. we don't want to store the image data on a physical drive as image files. Instead, we are looking for a way to pass the binary data extracted from the binary file directly to the GStreamer pipeline as its source. Is this possible? If so, how should I write the source for the pipeline?
Reason: There is as much as 32 GB of data in the binary file [these are webcam images packed into a binary file], so saving the data again as image files requires another 32 GB of drive space [and this goes on and on]. Since we already have the image data in binary form, we need a mechanism to pass it [as a buffer in C] directly to the GStreamer pipeline source.
Note: GStreamer 1.0 is used on Windows.
First question would be why you are not saving the video data in a video file format, which exists for that exact purpose.
You can write a GStreamer application with an appsrc. You would then have to parse the data as you do in step 2) in your application and feed the image chunks via appsrc to the jpegdec.
You could write a source element, deriving from BaseSrc.
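A minimal sketch of the appsrc route, written in Python for brevity (the same calls exist in the C API). It assumes a hypothetical container layout in which every image is stored as a 4-byte little-endian length followed by the JPEG bytes; the real file format is not given in the question, so iter_frames() is only a placeholder for the parsing from step 2. The names iter_frames and play, the videoconvert element and the block=true setting are my additions, not something taken from the question.

```python
import struct


def iter_frames(path):
    """Yield each JPEG blob from a file of [4-byte LE length][JPEG bytes] records.

    The record layout is hypothetical; replace it with the real container format.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                return  # end of file
            (size,) = struct.unpack("<I", header)
            data = f.read(size)
            if len(data) < size:
                return  # truncated record
            yield data


def play(path, fps=2):
    # Imported lazily so iter_frames() stays usable without GStreamer installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    # block=true makes push-buffer wait while appsrc's queue is full, so the
    # whole file is never held in memory at once.
    pipeline = Gst.parse_launch(
        f'appsrc name=src format=time block=true caps="image/jpeg,framerate={fps}/1" '
        "! jpegdec ! videoconvert ! autovideosink"
    )
    src = pipeline.get_by_name("src")
    pipeline.set_state(Gst.State.PLAYING)

    duration = Gst.SECOND // fps
    for i, jpeg in enumerate(iter_frames(path)):
        buf = Gst.Buffer.new_wrapped(jpeg)
        buf.pts = i * duration          # timestamp each image as one video frame
        buf.duration = duration
        src.emit("push-buffer", buf)
    src.emit("end-of-stream")

    # Wait until playback finishes or an error is posted on the bus.
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
    )
    pipeline.set_state(Gst.State.NULL)
```

The default of 2 frames per second matches the 10/5 framerate in the original pipeline. Nothing is written back to disk: each image goes straight from the binary file into the decoder.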
Related
I'm reading a tiff file using OpenSlide. Due to its large size, I'm planning to read the image in regions of 4k x 4k using the read_region() function. After getting a region, I want to run the same process I have planned for the complete tiff file. To continue that process, I need the region as an OpenSlide object, so that I can use the OpenSlide parameters.
I tried to read the selected region using read_region with Openslide again as follows.
wsi = wsi.read_region((0,0),0,(4000,4000))
wsi = openslide.OpenSlide(wsi)
The issue was that I could not use the parameters I usually get when reading a tiff file using OpenSlide. Does anyone know a way to solve this issue?
I have a client that created several large PDFs, each containing hundreds of images within them. The images were created with a program that adds unique info to each file; random binary data was placed in some file headers, some files have data disguised as image artifacts, and general metadata in each image. While I'm unfamiliar with the program, I understand that it's a marketing software suite of some sort so I assume the data is used for tracking online distribution and analytics.
I have the source files used to create the PDFs and while I could open each image, clone its visual data, strip metadata and re-compress the images to remove the identification data, I would much rather automate the process using Pillow. The problem is, I'm worried I could miss something. The client is hoping to release the files from behind an online username, and he doesn't want the username tied to this program or its analytical tracking mechanisms.
So my question is this: how would I clone an image with Pillow in a way that would strip all identifying metadata? The image files are massive, ranging from 128MB to 2GB. All of the images are either PNG uncompressed or JPEG files with very mild compression. I'm not married to Pillow, so if there's a better software library (or standalone software) that better suits this, I'll use it instead.
Just use ImageMagick as installed on most Linux distros and available for macOS and Windows. So, in Terminal, strip the metadata from an input file and resave:
magick input.jpg -strip result.jpg
If you want to do all JPEGs in current directory:
magick mogrify -strip *.jpg
Or maybe you want to change the quality a little too:
magick mogrify -quality 82 -strip *.jpg
Copying the pixel data to a new image should remove all metadata and compressing the image slightly as a jpeg should remove steganographic tracking data.
You may have to modify the load/copy/save methods to deal with large files. Also pay attention to the PIL file size limitations. Opacity in png files isn't handled here.
import os
from PIL import Image

picture_dir = ''  # directory to scan
for subdir, dirs, files in os.walk(picture_dir):
    for f in files:
        root, ext = os.path.splitext(f)
        if ext.lower() in ('.jpg', '.jpeg', '.png'):
            full_path = os.path.join(subdir, f)
            im = Image.open(full_path).convert('RGB')  # not handling opacity
            data = list(im.getdata())
            no_exif = Image.new('RGB', im.size)
            no_exif.putdata(data)  # should strip exif
            out_path = os.path.join(subdir, root + 'clean.jpg')
            no_exif.save(out_path, 'JPEG', quality=95)  # compressing should remove steganography
I want to add DICOM tags to a series of DICOM images and save the modified batch.
I have written a simple Python script using pydicom which can edit and add DICOM tags in a single DICOM image, but I want to apply the same procedure to a complete image set (say 20 or 30 images).
Can anybody suggest a way to do this using pydicom or Python?
Just collect your filenames in a list and process each filename (read the file, edit contents, save as new or maybe use the same name).
Have a look at Python's os module. For instance, os.listdir('path') returns a list of the filenames found in the given path. If that path points to a directory that contains only DICOM images, you now have a list of DICOM filenames. Next use os.path.join('path', filename) to get a full path that you can use as input for reading a DICOM file with pydicom.
Also you might want to use a for loop.
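A rough sketch of those steps put together. The function names, the output directory and the PatientComments value are made up for illustration; the directory is assumed to hold only DICOM files, as described above.

```python
import os


def list_dicom_paths(directory):
    """Return full paths of the regular files in `directory`, sorted by name."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


def tag_series(directory, out_directory, tags):
    """Read every file, set the given DICOM keywords, and save a copy."""
    # Imported here so list_dicom_paths() works even without pydicom installed.
    import pydicom

    os.makedirs(out_directory, exist_ok=True)
    for path in list_dicom_paths(directory):
        ds = pydicom.dcmread(path)
        for keyword, value in tags.items():
            setattr(ds, keyword, value)   # e.g. ds.PatientComments = "..."
        ds.save_as(os.path.join(out_directory, os.path.basename(path)))


# Hypothetical usage:
# tag_series("series_in", "series_out", {"PatientComments": "batch edited"})
```

Writing to a separate output directory keeps the originals untouched; pass the same directory twice only if overwriting is really what you want.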
Let's suppose you have a list of dicom image file paths in an array named dicom_paths. Then:
import pydicom

dicom_paths = []  # fill in your list of image paths here
dicom_data = [pydicom.dcmread(s) for s in dicom_paths]  # dcmread replaces the deprecated read_file
for dicom_data_item in dicom_data:
    # do what you want here
    pass
Hope it helps
I want to create an M4A file from an MP4. I want to attempt this from scratch, without using other libraries, working only with the raw data.
So far I am able to locate the moov atom and parse it. As a result I can pull the audio data from the mdat. I then create my own M4A file with the right ftyp (M4A isomiso2), add a new mdat with just the audio data I previously recovered, and finally add the moov with the same mvhd and only the audio trak, but with an updated stco to reflect the changed offsets of the audio chunks (they now sit one after another). I am sure I am doing all of this right.
However, the M4A file just plays silence. I believe it is because I have to edit more in the moov, but I am not sure what. I ran it through FFmpeg to check for corruption and got:
"Sample rate index in program config element does not match the sample rate index configured by the container."
"Too large remapped id is not implemented."
So I think it has something to do with the stsd atom, but I am not sure how to change it.
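For reference, the atom layout described above (a 4-byte big-endian size, a 4-byte type, then the payload) can be walked with a few lines of Python. This is only an inspection sketch, not a fix for the silence: it lists the atoms so that the stsd and stco of the original and the generated file can be compared. The function name and the set of container types are my own choices, and the set is not exhaustive.

```python
import struct

# Atoms whose payload is simply a sequence of child atoms (not exhaustive).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}


def walk_boxes(data, start=0, end=None, depth=0):
    """Yield (depth, type, offset, size) for every atom in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:                      # 64-bit size follows the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif size == 0:                    # atom runs to the end of the range
            size = end - pos
        if size < header or pos + size > end:
            break                          # malformed atom; stop instead of looping
        yield depth, kind, pos, size
        if kind in CONTAINERS:
            yield from walk_boxes(data, pos + header, pos + size, depth + 1)
        pos += size


# Hypothetical usage on the bytes of a moov atom read from the file:
# for depth, kind, offset, size in walk_boxes(moov_bytes):
#     print("  " * depth + kind.decode("latin-1"), offset, size)
```

For a large file it is enough to pass in the bytes of the moov atom rather than the whole file, since the mdat payload is not needed for this comparison.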
I have a folder which contains a lot of MP3 files, some of which are encoded using mp3PRO.
Since this format is now obsolete, I'd like to convert them back to MP3 (converters can be found easily).
Is there a way to detect programmatically whether a file is encoded using the mp3PRO format (e.g. by looking at the file header or at specific signatures using a hex editor)?
The official player is able to detect whether a file is encoded using mp3PRO (the logo is highlighted or not), so I suppose this is technically possible.
What I have found so far is that the bitrate of an mp3PRO file appears to be quite low (50% of the non-encoded file): e.g. a 128 kbps file will appear as 64 kbps. However, a 320 kbps file will appear as 160 kbps (which is pretty common), so this cannot be used as a rule.
Here is what I found out and how I fixed it. I am writing it here in case somebody needs it:
mp3PRO files do not contain any special flag in the MP3 header that would help to recognize them.
They are technically very similar to usual MP3 files, except that they are encoded at half the bitrate and sample rate (e.g. a 128 kbps 44100 Hz file will be encoded as a 64 kbps 22050 Hz file, resulting in an mp3PRO file of approximately half the size of the original file).
This was done for compatibility, so default players can play them without any change.
They also contain some SBR data, which allows the lost part of the audio (the high frequencies) to be rebuilt synthetically, so that they play as they did before the mp3PRO conversion.
Detecting the SBR data seems very hard, if not impossible: it would require decoding the actual MP3 frames. Also, there is no documentation to be found about the mp3PRO format.
What I did (which works but required some manual effort): I added all the files to be checked to the playlist of an MP3 player (foobar2000 in my case), then sorted the files on the sample rate column. Most 22050 Hz MP3 files were indeed mp3PRO files.
They were converted back to MP3 using Winamp plus the mp3PRO plugin made for it, available here: http://www.wav-mp3.com/mp3pro-to-mp3.htm
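The sample-rate sorting could probably be automated instead of using a player. Below is a small sketch that reads the sample rate from the first MPEG audio frame header, skipping an ID3v2 tag if one is present. The function names are my own, and a 22050 Hz result only marks a file as a candidate, exactly as with the manual sort; it does not prove the file is mp3PRO.

```python
# Sample rates by MPEG version bits (header byte 1, bits 4-3).
SAMPLE_RATES = {
    0b11: (44100, 48000, 32000),   # MPEG-1
    0b10: (22050, 24000, 16000),   # MPEG-2
    0b00: (11025, 12000, 8000),    # MPEG-2.5
}


def first_frame_sample_rate(data):
    """Return the sample rate in Hz of the first Layer III frame, or None."""
    pos = 0
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for b in data[6:10]:               # tag size is a 28-bit "syncsafe" integer
            size = (size << 7) | (b & 0x7F)
        pos = 10 + size + (10 if data[5] & 0x10 else 0)   # optional footer
    while pos + 4 <= len(data):
        b1, b2 = data[pos + 1], data[pos + 2]
        if data[pos] == 0xFF and (b1 & 0xE0) == 0xE0:     # 11-bit frame sync
            version = (b1 >> 3) & 0x03
            layer = (b1 >> 1) & 0x03
            bitrate_index = b2 >> 4
            rate_index = (b2 >> 2) & 0x03
            # Skip reserved values and free-format/invalid bitrates to avoid false syncs.
            if (version != 0b01 and layer == 0b01
                    and rate_index != 0b11 and bitrate_index not in (0, 15)):
                return SAMPLE_RATES[version][rate_index]
        pos += 1
    return None


def is_mp3pro_candidate(path, rate=22050):
    """Heuristic only: True if the first frame uses the given (halved) sample rate."""
    with open(path, "rb") as f:
        return first_frame_sample_rate(f.read()) == rate
```

Ordinary low-quality MP3s also use 22050 Hz, so the candidates still need to be checked by ear or in the official player before converting.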