How to write a binary stream as an opus file (python, python-3.x)

I'm trying to write an opus file out of a binary stream using the code below, but the resulting files are unplayable. It seems that simply dumping the bytes into the container won't work. Any ideas as to how to go about writing these audio files?
audio_start.append(start_audio + name[2])
audio_end.append(start_audio + name[2] + name[3])
count = 0
for i, j in zip(audio_start, audio_end):
    in_file.seek(0)
    count += 1
    audio_output_path = input[:-4] + "audio_" + str(count) + ".opus"
    with open(audio_output_path, 'wb') as out_audio:
        out_audio.write(in_file.read()[i:j])
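For reference, a leaner way to pull each byte range (a sketch, assuming the audio_start/audio_end offsets themselves are correct) is to seek straight to each slice instead of re-reading the whole file on every pass:

count = 0
for i, j in zip(audio_start, audio_end):
    count += 1
    audio_output_path = input[:-4] + "audio_" + str(count) + ".opus"
    in_file.seek(i)                           # jump to the start of this slice
    with open(audio_output_path, 'wb') as out_audio:
        out_audio.write(in_file.read(j - i))  # read exactly the slice's bytes

Note that if the extracted ranges are raw Opus packets rather than complete Ogg pages, no amount of byte slicing will produce a playable file: Opus audio has to be wrapped in an Ogg container first (ffmpeg, for one, can do that muxing).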

Related

Python code to concat images and ts files using ffmpeg

I have a folder with multiple .ts files in it, and I want to join the files by inserting an image between videos for a given duration. Below is the list of durations for which an image needs to be inserted.
['00:00:06:17', '00:00:00:16', '00:00:01:05', '00:00:00:31', '00:00:01:01']
For example, if the folder has 5 ts files (this number might change so the folder needs to be iterable) then,
video1 + image for 00:00:06:17 + video2 + image for 00:00:00:16 + video 3, etc...
Any pointers will be much appreciated.
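One way to sketch that interleaving in Python (hypothetical names: ts_files is the list of videos in play order, and img_0.ts, img_1.ts, ... are image clips pre-rendered for each duration):

durations = ['00:00:06:17', '00:00:00:16', '00:00:01:05', '00:00:00:31', '00:00:01:01']

order = []
for idx, video in enumerate(ts_files):          # ts_files: the .ts videos, in play order
    order.append(video)
    if idx < len(durations):
        order.append('img_{}.ts'.format(idx))   # image clip rendered for durations[idx]
# order can then be fed to ffmpeg for concatenation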
UPDATE:
for i in new_ts3:
    for m in filename[:-1]:
        p1 = subprocess.Popen(['ffmpeg', '-loop', '1', '-i', sys.argv[2], '-c:v', 'libx264',
                               '-profile:v', 'high', '-t', i, '-pix_fmt', 'yuvj420p',
                               '-vf', 'scale=1280:720', '-f', 'mpegts',
                               '{}{}_.ts'.format(os.path.splitext(sys.argv[1])[0], m)],
                              stdout=subprocess.PIPE)
        out1 = p1.communicate()
        break
where new_ts3 is ['00:00:06:17', '00:00:00:16', '00:00:01:05', '00:00:00:31', '00:00:01:01'] and
filename is ['file1', 'file2', 'file3', 'file4', 'file5', 'file6']
With the above, I am getting 5 files with different filenames, but each file has a duration of 00:00:06:17.
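A sketch of one possible fix, keeping the same ffmpeg command: pair each duration with its file name in a single loop instead of nesting and breaking:

for i, m in zip(new_ts3, filename):
    p1 = subprocess.Popen(['ffmpeg', '-loop', '1', '-i', sys.argv[2], '-c:v', 'libx264',
                           '-profile:v', 'high', '-t', i, '-pix_fmt', 'yuvj420p',
                           '-vf', 'scale=1280:720', '-f', 'mpegts',
                           '{}{}_.ts'.format(os.path.splitext(sys.argv[1])[0], m)],
                          stdout=subprocess.PIPE)
    out1 = p1.communicate()  # one clip per (duration, name) pair

zip stops at the shorter list, so the sixth entry in filename is skipped, matching the filename[:-1] slice in the original.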
import os

image_list = ['img1.jpeg', 'img2.jpeg']
duration = ['1200', '1300']
path = '.'  # directory holding the generated files (assumed; not defined in the original)

## First we turn the images into videos of the given durations
for i in range(len(image_list)):
    # change the name according to whatever order you want the files saved in
    outname = image_list[i] + str(i) + '.mp4'
    dur = duration[i]
    os.system(f'ffmpeg -loop 1 -i {image_list[i]} -c:v libx264 -t {dur} -pix_fmt yuv420p -vf scale=320:240 {outname}')

## Then we convert the .mp4 files to .ts if not already in that format, as .mp4 files have headers
for file_name in os.listdir(path):
    if file_name.endswith(".mp4"):
        out_name = file_name.split(".")[0] + ".ts"
        os.system(f'ffmpeg -i {file_name} -c copy -bsf:v h264_mp4toannexb -f mpegts {out_name}')

## create a list of the order in which you want to combine files here
order_list = []  # create with your own logic
file_list = os.listdir(path)
for i in range(len(file_list) - 1):
    new_out = 'out' + str(i) + '.ts'
    os.system(f'ffmpeg -i "concat:{file_list[i]}|{file_list[i+1]}" -c copy {new_out}')
    if i + 1 < len(file_list):
        file_list[i+1] = new_out

## finally convert it to mp4 at the last step
The flags used in the commands are mostly self-explanatory, such as the scale of the output.
In the last loop, we simply update the output file name on every pass, write it back into the file list, and combine it with the next item in the list.
You can convert the final .ts file to any format of your choice. See the FFmpeg documentation; to execute the commands from Python, use os.system.

Files are saved repeatedly under a single name: no looping, no ranging

My code runs well, but it has one flaw: the files are not saved in order. For example, say I scraped 3 jpeg files; when I ran the code, it saved 3 times to slot 1, 3 times to slot 2, and 3 times to slot 3, so I ended up with 3 identical files.
I think there is something wrong with my looping logic?
If I change for n in range(len(soup_imgs)): to for n in range(len(src)):, the operation saves the last jpeg file endlessly.
soup_imgs = soup.find(name='div', attrs={'class': 't_msgfont'}).find_all('img', alt="", src=re.compile(".jpg"))
for i in soup_imgs:
    src = i['src']
    print(src)

dirPath = "C:\\__SPublication__\\"
img_folder = dirPath + '/' + soup_title + '/'
if not os.path.exists(img_folder):
    os.mkdir(img_folder)

for n in range(len(src)):
    n += 1
    img_name = dirPath + '/' + soup_title + '/' + str({}).format(n) + '.jpg'
    img_files = open(img_name, 'wb')
    img_files.write(requests.get(src).content)
    print("Outputs:" + img_name)
I am an amateur coder; I just started not long ago as a hobby of mine. Please give me some guidance, chiefs.
Try this when you are writing your image files:
from os import path

for i, img in enumerate(soup_imgs):
    src = img['src']
    img_name = path.join(dirPath, soup_title, "{}.jpg".format(i))
    with open(img_name, 'wb') as f:
        f.write(requests.get(src).content)
    print("Outputs:{}".format(img_name))
You need to loop over all image sources, rather than using the last src value from a previous for block.
I've also added a safer method for joining directory and file paths that should be OS independent. Finally, when opening a file, always use the with open() as f: construct - this way Python will automatically close the filehandle for you.

image processing commands in python using a for loop

I have this piece of code, and I don't understand the for loop part.
Also, what does file_path hold, and how is it different from all_file_path?
all_files_path = glob.glob("ADNI1_Screening_1.5T/ADNI/*/*/*/*/*.nii")
for file_path in all_file_path:
    print(os.system("./runROBEX.sh " + file_path + " stripped/" + file_path.split("/")[-1]))
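To unpack what it does, here is the same loop rewritten with intermediate variables and comments (a sketch; it assumes the all_files_path/all_file_path mismatch is a typo for one and the same list):

import glob
import os

# glob collects every .nii file nested four directory levels below ADNI1_Screening_1.5T/ADNI
all_file_path = glob.glob("ADNI1_Screening_1.5T/ADNI/*/*/*/*/*.nii")

for file_path in all_file_path:
    # file_path holds one matched path per iteration, while all_file_path is the whole list
    base_name = file_path.split("/")[-1]   # keep only the file name, e.g. "scan.nii"
    # run the ROBEX skull-stripping script on this scan, writing the result into stripped/
    exit_code = os.system("./runROBEX.sh " + file_path + " stripped/" + base_name)
    print(exit_code)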

Selective download and extraction of data (CAB)

So I have a specific need to download and extract a cab file, but each cab file is huge (> 200MB). I want to selectively download files from the cab, as the rest of the data is useless to me.
Done so far:
Request 1% of the file from the server. Get the headers and parse them.
Get the file list and their offsets, according to This CAB Link.
Send a GET request to the server with the Range header set to the file offset and offset+size.
I am able to get the response, but it is in a way "unreadable", because it is compressed (LZX:21, according to 7-Zip).
I am unable to decompress it using zlib; it throws an invalid header error.
I also could not quite understand or trace the CFFOLDER or CFDATA structures shown in the example, because the example is uncompressed.
import os
import struct
import requests

totalByteArray = b''
eofiles = 0

def GetCabMetaData(stream):
    global eofiles
    cabMetaData = {}
    try:
        cabMetaData["CabFormat"] = stream[0:4].decode('ANSI')
        cabMetaData["CabSize"] = struct.unpack("<L", stream[8:12])[0]
        cabMetaData["FilesOffset"] = struct.unpack("<L", stream[16:20])[0]
        cabMetaData["NoOfFolders"] = struct.unpack("<H", stream[26:28])[0]
        cabMetaData["NoOfFiles"] = struct.unpack("<H", stream[28:30])[0]
        # skip 30, 32, 34, 35
        cabMetaData["Files"] = {}
        cabMetaData["Folders"] = {}
        baseOffset = cabMetaData["FilesOffset"]
        internalOffset = 0
        for i in range(0, cabMetaData["NoOfFiles"]):
            fileDetails = {}
            fileDetails["Size"] = struct.unpack("<L", stream[baseOffset+internalOffset:][:4])[0]
            fileDetails["UnpackedStartOffset"] = struct.unpack("<L", stream[baseOffset+internalOffset+4:][:4])[0]
            fileDetails["FolderIndex"] = struct.unpack("<H", stream[baseOffset+internalOffset+8:][:2])[0]
            fileDetails["Date"] = struct.unpack("<H", stream[baseOffset+internalOffset+10:][:2])[0]
            fileDetails["Time"] = struct.unpack("<H", stream[baseOffset+internalOffset+12:][:2])[0]
            fileDetails["Attrib"] = struct.unpack("<H", stream[baseOffset+internalOffset+14:][:2])[0]
            # the file name is a null-terminated string after the fixed fields
            fileName = ''
            for j in range(0, len(stream)):
                if chr(stream[baseOffset+internalOffset+16+j]) != '\x00':
                    fileName += chr(stream[baseOffset+internalOffset+16+j])
                else:
                    break
            internalOffset += 16 + j + 1
            cabMetaData["Files"][fileName] = fileDetails.copy()
            eofiles = baseOffset + internalOffset
    except Exception as e:
        print(e)
    print(cabMetaData["CabSize"])
    return cabMetaData

def GetFileSize(url):
    resp = requests.head(url)
    return int(resp.headers["Content-Length"])

def GetCABHeader(url):
    global totalByteArray
    size = GetFileSize(url)
    newSize = "bytes=0-" + str(int(0.01 * size))
    totalByteArray = b''
    cabHeader = requests.get(url, headers={"Range": newSize}, stream=True)
    for chunk in cabHeader.iter_content(chunk_size=1024):
        totalByteArray += chunk

def DownloadInfFile(baseUrl, InfFileData, InfFileName):
    global totalByteArray, eofiles
    if not os.path.exists("infs"):
        os.mkdir("infs")
    baseCabName = baseUrl[baseUrl.rfind("/"):]
    baseCabName = baseCabName.replace(".", "_")
    if not os.path.exists("infs\\" + baseCabName):
        os.mkdir("infs\\" + baseCabName)
    fileBytes = b''
    newRange = "bytes=" + str(eofiles + InfFileData["UnpackedStartOffset"]) + "-" + str(eofiles + InfFileData["UnpackedStartOffset"] + InfFileData["Size"])
    data = requests.get(baseUrl, headers={"Range": newRange}, stream=True)
    with open("infs\\" + baseCabName + "\\" + InfFileName, "wb") as f:
        for chunk in data.iter_content(chunk_size=1024):
            fileBytes += chunk
        f.write(fileBytes)
        f.flush()
    print("Saved File " + InfFileName)

def main(url):
    GetCABHeader(url)
    cabMetaData = GetCabMetaData(totalByteArray)
    for fileName, data in cabMetaData["Files"].items():
        if fileName.endswith(".txt"):
            DownloadInfFile(url, data, fileName)

main("http://path-to-some-cabinet.cab")
All the file details are correct; I have verified them.
Any guidance will be appreciated. Am I doing it wrong? Is there another way, perhaps?
P.S.: I have already looked into This Post.
First, the data in the CAB is raw deflate, not zlib-wrapped deflate. So you need to ask zlib's inflate() to decode raw deflate with a negative windowBits value on initialization.
Second, the CAB format does not exactly use standard deflate, in that the 32K sliding window dictionary carries from one block to the next. You'd need to use inflateSetDictionary() to set the dictionary at the start of each block using the last 32K decompressed from the last block.
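In Python terms, both points map onto zlib.decompressobj (a sketch of the deflate case this answer describes; cfdata_blocks, an iterable of the raw compressed block payloads, is a placeholder you would have to populate yourself):

import zlib

history = b''   # last 32K of decompressed output, carried across blocks
plain = b''
for block in cfdata_blocks:
    # negative wbits = raw deflate, i.e. no zlib header or trailer
    if history:
        # seed the 32K sliding-window dictionary with the previous output
        d = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=history)
    else:
        d = zlib.decompressobj(wbits=-zlib.MAX_WBITS)  # first block: no dictionary yet
    plain += d.decompress(block)
    history = plain[-32768:]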

Pydub audio export has silence at the start?

I wrote a python script to take audio in 30-minute mp3s and slice it into unix-timestamped, second-long files. The source audio files are 192kbps, 44100Hz, stereo mp3 files.
I want it that way for a service that archives audio from a radio station (where I work) and can deliver it to a user over a given start and end time, to the second. We had the server shut down for an hour for maintenance (we try not to, but it happens), and we recorded audio over that time using a different computer that saved it in 30-minute chunks. Normally this archive server saves the second-long chunks itself without issue.
The function that does the conversion, given a 30 minute input audio file, the directory to save the output chunks in, and the start time of the file as a unix timestamp:
def slice_file(infile, workingdir, start):
    # find the duration of the input clip in milliseconds
    duration_in_milliseconds = len(infile)
    print("Converting " + working_file + " (", end="", flush=True)
    song = infile
    # grab each one-second slice and save it, from the first second to the last whole second in the file
    for i in range(0, duration_in_milliseconds, 1 * one_second):
        # get the folder where this second goes:
        arr = datefolderfromtimestamp(int(start) + int(i / 1000))
        #print("Second number: %s \n" % int(i / 1000))
        offset = i + one_second
        current_second = song[i:offset]
        ensure_dir(working_directory + "/" + arr[0] + "/" + arr[1] + "/" + arr[2] + "/")
        filename = os.path.normpath(working_directory + "/" + arr[0] + "/" + arr[1] + "/" + arr[2] + "/" + str(int(start) + int(i / 1000)) + "-second.mp3")
        current_second.export(filename, format="mp3")
        # indicate progress by printing a dot for every three minutes processed
        if i % (3 * 60 * one_second) == 0:
            print('.', end="", flush=True)
    print(")")
My issue is that all the second-long files converted by this script seem to be longer than a second, with on average 70ms of silence at the start of them. When I download files from my archive server, it gives me all the files concatenated together, so it sounds terrible and glitchy.
Can someone help me out? I'm not sure where this error is coming from.
My full script if you're curious:
http://pastebin.com/fy8EkVSz
Update: Found out the source of this - LAME adds buffers to the start of the file.
See: http://lame.sourceforge.net/tech-FAQ.txt
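If the slices have to stay as individual mp3s, one possible workaround (a sketch; the timestamp range and the -50dBFS threshold are assumptions to adapt) is to trim the leading padding with pydub when the pieces are stitched back together:

from pydub import AudioSegment
from pydub.silence import detect_leading_silence

# rebuild a gapless stream by cutting the LAME padding off each decoded slice
combined = AudioSegment.empty()
for ts in range(start_ts, end_ts):                       # hypothetical unix-timestamp range
    piece = AudioSegment.from_mp3(str(ts) + "-second.mp3")
    pad_ms = detect_leading_silence(piece, silence_threshold=-50.0)
    combined += piece[pad_ms:]
combined.export("joined.wav", format="wav")              # wav, so no new padding is added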
