How to shorten silence in audio file using ffmpeg? - audio

I'm trying to shorten excess silence in audio recordings using ffmpeg (shorten it, not cut it out entirely). The current command I use:
ffmpeg -hide_banner -i file_name.m4a -af silenceremove=0:0:0:-1:0.7:-30dB file_name_short.m4a
is not working: it detects silence longer than 0.7 seconds and removes it completely, which is not what I want. Does anyone know how to truncate silence, say, shorten any silence longer than 1 second down to 0.5 seconds?

The parameters of ffmpeg's silenceremove filter only seem to let you delete silence in multiples of a fixed length. This means that if you pass in stop_duration=0.5 and there is a block of silence that is 2.2 seconds long, you'll end up with 0.2 seconds of silence remaining (2.2 - 0.5 - 0.5 - 0.5 - 0.5 = 0.2).
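To sanity check that arithmetic: the silence that survives is just the remainder left after removing stop_duration-sized chunks (a throwaway illustration, not an ffmpeg call):

block_of_silence = 2.2      # seconds of silence in the input
stop_duration = 0.5         # the chunk size silenceremove works in
remaining = block_of_silence % stop_duration
print(round(remaining, 6))  # -> 0.2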
If you don't mind converting back and forth to .wav format, you can use this Python script that I cooked up. It has quite a few options, and even though it's in Python it uses NumPy, so it handles short files in well under a second and a 2-hour .wav in about 5.7 seconds, which is decent. For serious speed, it could be rewritten in C++. For videos, it may be possible to extend the solution using OpenCV.
Pluses:
Can automatically determine threshold, with adjustable aggressiveness
Can specify the maximum silence duration
Can specify minimum non-silence duration to avoid momentary blips between silence
Can use it just to detect periods of silence (much faster; processes 2 hours in 1.7 seconds)
Avoids overwriting files unless told to do so
It's limited by the modules it uses. Catches are:
it reads the whole file into memory
it only works with wave files and doesn't keep the metadata. (see below for workaround)
it can handle the common WAVE formats as long as SciPy is installed; otherwise it falls back to Python's wave module, which only works well with 16-bit PCM
Usage in your case:
Convert m4a to wav: ffmpeg -i myfile.m4a myfile.wav
Run Silence Remover: python3 trim_silence.py --input=myfile.wav
Convert back with metadata: ffmpeg -i result.wav -i myfile.m4a -map_metadata 1 myfile_trimmed.m4a
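If you do this for many files, the three steps are easy to wrap in a small helper. The sketch below is only an illustration and assumes ffmpeg is on the PATH and trim_silence.py sits in the working directory; the file names are placeholders:

import subprocess
import sys

def shorten_silence(src_m4a, dst_m4a, max_silence=0.5):
    wav_in = "tmp_in.wav"    # intermediate file; name is arbitrary
    wav_out = "result.wav"   # default output name used by trim_silence.py
    # 1. m4a -> wav
    subprocess.run(["ffmpeg", "-y", "-i", src_m4a, wav_in], check=True)
    # 2. shorten the silence
    subprocess.run([sys.executable, "trim_silence.py",
                    f"--input={wav_in}", f"--silence-dur={max_silence}", "--overwrite"],
                   check=True)
    # 3. wav -> m4a, copying the metadata from the original file
    subprocess.run(["ffmpeg", "-y", "-i", wav_out, "-i", src_m4a,
                    "-map_metadata", "1", dst_m4a], check=True)

shorten_silence("myfile.m4a", "myfile_trimmed.m4a")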
Full usage notes:
usage: trim_silence.py [-h] --input INPUT [--output OUTPUT] [--threshold THRESHOLD] [--silence-dur SILENCE_DUR] [--non-silence-dur NON_SILENCE_DUR]
[--mode MODE] [--auto-threshold] [--auto-aggressiveness AUTO_AGGRESSIVENESS] [--detect-only] [--verbose] [--show-silence] [--time-it]
[--overwrite]
optional arguments:
-h, --help show this help message and exit
--input INPUT (REQUIRED) name of input wav file (default: None)
--output OUTPUT name of output wave file (default: result.wav)
--threshold THRESHOLD
silence threshold - can be expressed in dB, e.g. --threshold=-25.5dB (default: -25dB)
--silence-dur SILENCE_DUR
maximum silence duration desired in output (default: 0.5)
--non-silence-dur NON_SILENCE_DUR
minimum non-silence duration between periods of silence of at least --silence-dur length (default: 0.1)
--mode MODE silence detection mode - can be 'any' or 'all' (default: all)
--auto-threshold automatically determine silence threshold (default: False)
--auto-aggressiveness AUTO_AGGRESSIVENESS
aggressiveness of the auto-threshold algorithm. Integer between [-20,20] (default: 3)
--detect-only don't trim, just detect periods of silence (default: False)
--verbose print general information to the screen (default: False)
--show-silence print locations of silence (always true if --detect-only is used) (default: False)
--time-it show steps and time to complete them (default: False)
--overwrite overwrite existing output file, if applicable (default: False)
Contents of trim_silence.py:
import numpy as np
import argparse
import time
import sys
import os
def testmode(mode):
mode = mode.lower()
valid_modes = ["all","any"]
if mode not in valid_modes:
raise Exception("mode '{mode}' is not valid - must be one of {valid_modes}")
return mode
def testaggr(aggr):
try:
aggr = min(20,max(-20,int(aggr)))
return aggr
except:
raise Exception("auto-aggressiveness '{aggr}' is not valid - see usage")
parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("--input", type=str, help="(REQUIRED) name of input wav file", required=True)
parser.add_argument("--output", default="result.wav", type=str, help="name of output wave file")
parser.add_argument("--threshold", default="-25dB", type=str, help="silence threshold - can be expressed in dB, e.g. --threshold=-25.5dB")
parser.add_argument("--silence-dur", default=0.5, type=float, help="maximum silence duration desired in output")
parser.add_argument("--non-silence-dur", default=0.1, type=float, help="minimum non-silence duration between periods of silence of at least --silence-dur length")
parser.add_argument("--mode", default="all", type=testmode, help="silence detection mode - can be 'any' or 'all'")
parser.add_argument("--auto-threshold",action="store_true", help="automatically determine silence threshold")
parser.add_argument("--auto-aggressiveness",default=3,type=testaggr, help="aggressiveness of the auto-threshold algorithm. Integer between [-20,20]")
parser.add_argument("--detect-only", action="store_true", help="don't trim, just detect periods of silence")
parser.add_argument("--verbose", action="store_true", help="print general information to the screen")
parser.add_argument("--show-silence", action="store_true", help="print locations of silence (always true if --detect-only is used)")
parser.add_argument("--time-it", action="store_true", help="show steps and time to complete them")
parser.add_argument("--overwrite", action="store_true", help="overwrite existing output file, if applicable")
args = parser.parse_args()
args.show_silence = args.show_silence or args.detect_only
if not args.detect_only and not args.overwrite:
if os.path.isfile(args.output):
print(f"Output file ({args.output}) already exists. Use --overwrite to overwrite the existing file.")
sys.exit(1)
if (args.silence_dur < 0): raise Exception("Maximum silence duration must be >= 0.0")
if (args.non_silence_dur < 0): raise Exception("Minimum non-silence duration must be >= 0.0")
try:
from scipy.io import wavfile
using_scipy = True
except:
if args.verbose: print("Failure using 'import scipy.io.wavfile'. Using 'import wave' instead.")
import wave
using_scipy = False
if args.verbose: print(f"Inputs:\n Input File: {args.input}\n Output File: {args.output}\n Max. Silence Duration: {args.silence_dur}\n Min. Non-silence Duration: {args.non_silence_dur}")
from matplotlib import pyplot as plt
def plot(x):
plt.figure()
plt.plot(x,'o')
plt.show()
def threshold_for_channel(ch):
global data
nbins = 100
max_len = min(1024*1024*100,data.shape[0]) # limit to first 100 MiB
if len(data.shape) > 1:
x = np.abs(data[:max_len,ch]*1.0)
else:
x = np.abs(data[:max_len]*1.0)
if data.dtype==np.uint8: x -= 127
hist,edges = np.histogram(x,bins=nbins,density=True)
slope = np.abs(hist[1:] - hist[:-1])
argmax = np.argmax(slope < 0.00002)
argmax = max(0,min(argmax + args.auto_aggressiveness, len(edges)-1))
thresh = edges[argmax] + (127 if data.dtype==np.uint8 else 0)
return thresh
def auto_threshold():
global data
max_thresh = 0
channel_count = 1 if len(data.shape)==1 else data.shape[1]
for ch in range(channel_count):
max_thresh = max(max_thresh,threshold_for_channel(ch))
return max_thresh
silence_threshold = str(args.threshold).lower().strip()
if args.auto_threshold:
if args.verbose: print (f" Silence Threshold: AUTO (aggressiveness={args.auto_aggressiveness})")
else:
if "db" in silence_threshold:
silence_threshold_db = float(silence_threshold.replace("db",""))
silence_threshold = np.round(10**(silence_threshold_db/20.),6)
else:
silence_threshold = float(silence_threshold)
silence_threshold_db = 20*np.log10(silence_threshold)
if args.verbose: print (f" Silence Threshold: {silence_threshold} ({np.round(silence_threshold_db,2)} dB)")
if args.verbose: print (f" Silence Mode: {args.mode.upper()}")
if args.verbose: print("")
if args.time_it: print(f"Reading in data from {args.input}... ",end="",flush=True)
start = time.time()
if using_scipy:
sample_rate, data = wavfile.read(args.input)
input_dtype = data.dtype
Ts = 1./sample_rate
if args.auto_threshold:
silence_threshold = auto_threshold()
else:
if data.dtype != np.float32:
sampwidth = data.dtype.itemsize
if (data.dtype==np.uint8): silence_threshold += 0.5 # 8-bit unsigned PCM
scale_factor = (256**sampwidth)/2.
silence_threshold *= scale_factor
else:
handled_sampwidths = [2]
with wave.open(args.input,"rb") as wavin:
params = wavin.getparams()
if params.sampwidth in handled_sampwidths:
raw_data = wavin.readframes(params.nframes)
if params.sampwidth not in handled_sampwidths:
print(f"Unable to handle a sample width of {params.sampwidth}")
sys.exit(1)
end = time.time()
if args.time_it: print(f"complete (took {np.round(end-start,6)} seconds)")
if not using_scipy:
if args.time_it: print(f"Unpacking data... ",end="",flush=True)
start = time.time()
Ts = 1.0/params.framerate
if params.sampwidth==2: # 16-bit PCM
format_ = 'h'
data = np.frombuffer(raw_data,dtype=np.int16)
elif params.sampwidth==3: # 24-bit PCM
format_ = 'i'
print(len(raw_data))
data = np.frombuffer(raw_data,dtype=np.int32)
data = data.reshape(-1,params.nchannels) # reshape into channels
if args.auto_threshold:
silence_threshold = auto_threshold()
else:
scale_factor = (256**params.sampwidth)/2. # scale to [-1:1)
silence_threshold *= scale_factor
data = 1.0*data # convert to np.float64
end = time.time()
if args.time_it: print(f"complete (took {np.round(end-start,6)} seconds)")
silence_duration_samples = args.silence_dur / Ts
if args.verbose: print(f"Input File Duration = {np.round(data.shape[0]*Ts,6)}\n")
combined_channel_silences = None
def detect_silence_in_channels():
global combined_channel_silences
if len(data.shape) > 1:
if args.mode=="any":
combined_channel_silences = np.min(np.abs(data),axis=1) <= silence_threshold
else:
combined_channel_silences = np.max(np.abs(data),axis=1) <= silence_threshold
else:
combined_channel_silences = np.abs(data) <= silence_threshold
combined_channel_silences = np.pad(combined_channel_silences, pad_width=1,mode='constant',constant_values=0)
def get_silence_locations():
global combined_channel_silences
starts = combined_channel_silences[1:] & ~combined_channel_silences[0:-1]
ends = ~combined_channel_silences[1:] & combined_channel_silences[0:-1]
start_locs = np.nonzero(starts)[0]
end_locs = np.nonzero(ends)[0]
durations = end_locs - start_locs
long_durations = (durations > silence_duration_samples)
long_duration_indexes = np.nonzero(long_durations)[0]
if len(long_duration_indexes) > 1:
non_silence_gaps = start_locs[long_duration_indexes[1:]] - end_locs[long_duration_indexes[:-1]]
short_non_silence_gap_locs = np.nonzero(non_silence_gaps <= (args.non_silence_dur/Ts))[0]
for loc in short_non_silence_gap_locs:
if args.verbose and args.show_silence:
ns_gap_start = end_locs[long_duration_indexes[loc]] * Ts
ns_gap_end = start_locs[long_duration_indexes[loc+1]] * Ts
ns_gap_dur = ns_gap_end - ns_gap_start
print(f"Removing non-silence gap at {np.round(ns_gap_start,6)} seconds with duration {np.round(ns_gap_dur,6)} seconds")
end_locs[long_duration_indexes[loc]] = end_locs[long_duration_indexes[loc+1]]
long_duration_indexes = np.delete(long_duration_indexes, short_non_silence_gap_locs + 1)
if args.show_silence:
if len(long_duration_indexes)==0:
if args.verbose: print("No periods of silence found")
else:
if args.verbose: print("Periods of silence shown below")
fmt_str = "%-12s %-12s %-12s"
print(fmt_str % ("start","end","duration"))
for idx in long_duration_indexes:
start = start_locs[idx]
end = end_locs[idx]
duration = end - start
print(fmt_str % (np.round(start*Ts,6),np.round(end*Ts,6),np.round(duration*Ts,6)))
if args.verbose: print("")
return start_locs[long_duration_indexes], end_locs[long_duration_indexes]
def trim_data(start_locs,end_locs):
global data
if len(start_locs)==0: return
keep_at_start = int(silence_duration_samples / 2)
keep_at_end = int(silence_duration_samples - keep_at_start)
start_locs = start_locs + keep_at_start
end_locs = end_locs - keep_at_end
delete_locs = np.concatenate([np.arange(start_locs[idx],end_locs[idx]) for idx in range(len(start_locs))])
data = np.delete(data, delete_locs, axis=0)
def output_data(start_locs,end_locs):
global data
if args.verbose: print(f"Output File Duration = {np.round(data.shape[0]*Ts,6)}\n")
if args.time_it: print(f"Writing out data to {args.output}... ",end="",flush=True)
if using_scipy:
wavfile.write(args.output, sample_rate, data)
else:
packed_buf = data.astype(format_).tobytes()
with wave.open(args.output,"wb") as wavout:
wavout.setparams(params) # same params as input
wavout.writeframes(packed_buf)
start = time.time()
if not args.verbose and args.time_it: print("Detecting silence... ",end="",flush=True)
detect_silence_in_channels()
(start_locs,end_locs) = get_silence_locations()
end = time.time()
if not args.verbose and args.time_it: print(f"complete (took {np.round(end-start,6)} seconds)")
if args.detect_only:
if args.verbose: print("Not trimming, because 'detect only' flag was set")
else:
if args.time_it: print("Trimming data... ",end="",flush=True)
start = time.time()
trim_data(start_locs,end_locs)
end = time.time()
if args.time_it: print(f"complete (took {np.round(end-start,6)} seconds)")
start = time.time()
output_data(start_locs, end_locs)
end = time.time()
if args.time_it: print(f"complete (took {np.round(end-start,6)} seconds)")
If you want a stripped-down script that assumes 16-bit PCM, without all the extra print statements:
import numpy as np
from scipy.io import wavfile
# Params
(infile,outfile,threshold_db,silence_dur,non_silence_dur,mode) = ("test_stereo.wav","result.wav",-25,0.5,0.1,"all")
silence_threshold = np.round(10**(threshold_db/20.),6) * 32768 # Convert from dB to linear units and scale, assuming 16-bit PCM input
# Read data
Fs, data = wavfile.read(infile)
silence_duration_samples = silence_dur * Fs
if len(data.shape)==1: data = np.expand_dims(data,axis=1)
# Find silence
find_func = np.min if mode=="any" else np.max
combined_channel_silences = find_func(np.abs(data),axis=1) <= silence_threshold
combined_channel_silences = np.pad(combined_channel_silences, pad_width=1,mode='constant',constant_values=0)
# Get start and stop locations
starts = combined_channel_silences[1:] & ~combined_channel_silences[0:-1]
ends = ~combined_channel_silences[1:] & combined_channel_silences[0:-1]
start_locs = np.nonzero(starts)[0]
end_locs = np.nonzero(ends)[0]
durations = end_locs - start_locs
long_durations = (durations > silence_duration_samples)
long_duration_indexes = np.nonzero(long_durations)[0]
# Cut out short non-silence between silence
if len(long_duration_indexes) > 1:
non_silence_gaps = start_locs[long_duration_indexes[1:]] - end_locs[long_duration_indexes[:-1]]
short_non_silence_gap_locs = np.nonzero(non_silence_gaps <= (non_silence_dur * Fs))[0]
for loc in short_non_silence_gap_locs:
end_locs[long_duration_indexes[loc]] = end_locs[long_duration_indexes[loc+1]]
long_duration_indexes = np.delete(long_duration_indexes, short_non_silence_gap_locs + 1)
(start_locs,end_locs) = (start_locs[long_duration_indexes], end_locs[long_duration_indexes])
# Trim data
if len(long_duration_indexes) > 0:
if len(start_locs) > 0:
keep_at_start = int(silence_duration_samples / 2)
keep_at_end = int(silence_duration_samples - keep_at_start)
start_locs = start_locs + keep_at_start
end_locs = end_locs - keep_at_end
delete_locs = np.concatenate([np.arange(start_locs[idx],end_locs[idx]) for idx in range(len(start_locs))])
data = np.delete(data, delete_locs, axis=0)
# Output data
wavfile.write(outfile, Fs, data)

Related

cv2.VideoWriter made the code slower compared to cv2.imshow (multiple kinect V2 sensors)

I've been trying to record multi-source video (color, depth, IR) from two Kinect V2 sensors. I'm using libfreenect2, the driver that enables multiple Kinect V2 sensors, and its Python wrapper, pylibfreenect2.
The example code is for a single sensor. It supports multi-source video streaming for one Kinect sensor using cv2.imshow. Since my purpose is to record, I simply changed cv2.imshow to cv2.VideoWriter to save the video. However, I found that the code becomes slower when using VideoWriter, since the number of frames captured in a fixed period of time decreases.
For example, running the code for one minute with a single Kinect sensor:
single_stream: 1875 frames = 31.25 Hz
single_record: 1036 frames = 17.26 Hz
Question 1:
We can see there is a significant drop in the sampling rate. Is there any way to avoid it?
Also, since I'm trying to use multiple Kinect sensors, I simply instantiate everything twice for the two different sensors. Compared to the single-sensor case, the dual-sensor version also shows a significant drop in sampling rate:
dualstream: 1500 frames = 25 Hz
dualsave: 660 frames = 11 Hz
Question 2:
How can I optimize the code using multi-thread instead of just instantiating everything twice?
Here is my code:
# coding: utf-8
# An example using startStreams
import numpy as np
import cv2
import sys
from pylibfreenect2 import Freenect2, SyncMultiFrameListener
from pylibfreenect2 import FrameType, Registration, Frame
import argparse
import datetime
import keyboard
parser = argparse.ArgumentParser()
parser.add_argument('--out_name', type=str, help="the output names", default=None)
parser.add_argument('--out_dir', type=str, help="the output directory", default='data/')
args = parser.parse_args()
try:
from pylibfreenect2 import OpenGLPacketPipeline
pipeline0 = OpenGLPacketPipeline()
pipeline1 = OpenGLPacketPipeline()
except:
try:
from pylibfreenect2 import OpenCLPacketPipeline
pipeline0 = OpenCLPacketPipeline()
pipeline1 = OpenCLPacketPipeline()
except:
from pylibfreenect2 import CpuPacketPipeline
pipeline0 = CpuPacketPipeline()
pipeline1 = CpuPacketPipeline()
print("Packet pipeline:", type(pipeline0).__name__)
enable_rgb = True
enable_depth = True
fn = Freenect2()
num_devices = fn.enumerateDevices()
if num_devices == 0:
print("No device connected!")
sys.exit(1)
serial0 = fn.getDeviceSerialNumber(0)
serial1 = fn.getDeviceSerialNumber(1)
device0 = fn.openDevice(serial0, pipeline=pipeline0)
device1 = fn.openDevice(serial1, pipeline=pipeline1)
types = 0
if enable_rgb:
types |= FrameType.Color
if enable_depth:
types |= (FrameType.Ir | FrameType.Depth)
listener0 = SyncMultiFrameListener(types)
listener1 = SyncMultiFrameListener(types)
# Register listeners
device0.setColorFrameListener(listener0)
device0.setIrAndDepthFrameListener(listener0)
device1.setColorFrameListener(listener1)
device1.setIrAndDepthFrameListener(listener1)
if enable_rgb and enable_depth:
device0.start()
device1.start()
else:
device0.startStreams(rgb=enable_rgb, depth=enable_depth)
device1.startStreams(rgb=enable_rgb, depth=enable_depth)
# NOTE: must be called after device.start()
if enable_depth:
registration0 = Registration(device0.getIrCameraParams(),
device0.getColorCameraParams())
registration1 = Registration(device1.getIrCameraParams(),
device1.getColorCameraParams())
undistorted0 = Frame(512, 424, 4)
registered0 = Frame(512, 424, 4)
undistorted1 = Frame(512, 424, 4)
registered1 = Frame(512, 424, 4)
time_list = []
#specify output folder
out_dir = args.out_dir
out_name = args.out_name
# height = 540
# width = 960
height = 1080//2
width = 1920//2
channel = 3
fps = 30
cnt = 0
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
# cv2.VideoWriter_fourcc(*'mp4v')
# cv2.VideoWriter_fourcc(*'MP42')
color_wrapper0 = cv2.VideoWriter(out_dir + out_name + '_color0.mp4', fourcc, float(fps), (width, height), True)
depth_wrapper0 = cv2.VideoWriter(out_dir + out_name + '_depth0.mp4', fourcc, float(fps), (512, 424), False)
ir_wrapper0 = cv2.VideoWriter(out_dir + out_name + '_ir0.mp4', fourcc, float(fps), (512, 424), False)
register_wrapper0 = cv2.VideoWriter(out_dir + out_name + '_register0.mp4', fourcc, float(fps), (512, 424), True)
color_wrapper1 = cv2.VideoWriter(out_dir + out_name + '_color1.mp4', fourcc, float(fps), (width, height), True)
depth_wrapper1 = cv2.VideoWriter(out_dir + out_name + '_depth1.mp4', fourcc, float(fps), (512, 424), False)
ir_wrapper1 = cv2.VideoWriter(out_dir + out_name + '_ir1.mp4', fourcc, float(fps), (512, 424), False)
register_wrapper1 = cv2.VideoWriter(out_dir + out_name + '_register1.mp4', fourcc, float(fps), (512, 424), True)
start_time = datetime.datetime.now()
while True:
frames0 = listener0.waitForNewFrame()
frames1 = listener1.waitForNewFrame()
if enable_rgb:
color0 = frames0["color"]
color1 = frames1["color"]
if enable_depth:
ir0 = frames0["ir"]
depth0 = frames0["depth"]
ir1 = frames1["ir"]
depth1 = frames1["depth"]
if enable_rgb and enable_depth:
registration0.apply(color0, depth0, undistorted0, registered0)
registration1.apply(color1, depth1, undistorted1, registered1)
elif enable_depth:
registration0.undistortDepth(depth0, undistorted0)
registration1.undistortDepth(depth1, undistorted1)
# if enable_depth:
# cv2.imshow("ir0", ir0.asarray() / 65535.)
# cv2.imshow("depth0", depth0.asarray() / 4500.)
# cv2.imshow("undistorted0", undistorted0.asarray(np.float32) / 4500.)
# cv2.imshow("ir1", ir1.asarray() / 65535.)
# cv2.imshow("depth1", depth1.asarray() / 4500.)
# cv2.imshow("undistorted1", undistorted1.asarray(np.float32) / 4500.)
# if enable_rgb:
# cv2.imshow("color0", cv2.resize(color0.asarray(),
# (int(1920 / 3), int(1080 / 3))))
# cv2.imshow("color1", cv2.resize(color1.asarray(),
# (int(1920 / 3), int(1080 / 3))))
# if enable_rgb and enable_depth:
# cv2.imshow("registered0", registered0.asarray(np.uint8))
# cv2.imshow("registered1", registered1.asarray(np.uint8))
color_wrapper0.write(cv2.resize(color0.asarray(),(width, height))[:,:,:-1])
    color_wrapper1.write(cv2.resize(color1.asarray(),(width, height))[:,:,:-1])
color_wrapper0.write(color0.asarray()[:,:,:-1])
color_wrapper1.write(color1.asarray()[:,:,:-1])
depth_wrapper0.write((depth0.asarray() * (255.0/4500.0)).clip(0, 255).astype(np.uint8))
depth_wrapper1.write((depth1.asarray() * (255.0/4500.0)).clip(0, 255).astype(np.uint8))
ir_wrapper0.write((ir0.asarray() * (255.0/65535.0)).clip(0, 255).astype(np.uint8))
ir_wrapper1.write((ir1.asarray() * (255.0/65535.0)).clip(0, 255).astype(np.uint8))
register_wrapper0.write(registered0.asarray(np.uint8)[:,:,:-1])
register_wrapper1.write(registered1.asarray(np.uint8)[:,:,:-1])
time_list.append(datetime.datetime.now())
listener0.release(frames0)
listener1.release(frames1)
cnt += 1
# key = cv2.waitKey(delay=1)
# if key == ord('q'):
# break
if keyboard.is_pressed("q"): # if key 'q' is pressed
print('finishing the loop')
break # finishing the loop
key = cv2.waitKey(delay=1)
if key == ord('q'):
break
ir_wrapper0.release()
ir_wrapper1.release()
print("save ir video successfully")
depth_wrapper0.release()
depth_wrapper1.release()
print("save depth video successfully")
color_wrapper0.release()
color_wrapper1.release()
print("save color video successfully")
register_wrapper0.release()
register_wrapper1.release()
print("save registered video successfully")
np.save(out_dir + out_name + '_time.npy', time_list)
print("save time array successfully")
print("total frame is: ", cnt)
end_time = datetime.datetime.now()
print("total time is: ", end_time - start_time)
device0.stop()
device0.close()
device1.stop()
device1.close()
sys.exit(0)
The commented-out part is the streaming code using cv2.imshow. Below that is what I modified to save the video using cv2.VideoWriter.
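(Not from the original post.) For Question 2, one common pattern is to decouple capture from disk I/O: the capture loop only puts frames on a queue, and a single background thread performs all the VideoWriter.write() calls. A rough sketch, where the writer names and frame keys are my own assumptions:

# Sketch only: push frames onto a queue in the capture loop and let one
# background thread do the writes, so capture is not blocked by encoding/disk I/O.
import queue
import threading

frame_queue = queue.Queue(maxsize=64)   # bounded so memory cannot grow unchecked
STOP = object()                         # sentinel used to shut the worker down

def writer_worker(writers):
    """writers: dict mapping a name (e.g. 'color0') to an opened cv2.VideoWriter."""
    while True:
        item = frame_queue.get()
        if item is STOP:
            break
        name, frame = item
        writers[name].write(frame)

# Before the while-loop:
#   writers = {"color0": color_wrapper0, "depth0": depth_wrapper0, ...}
#   threading.Thread(target=writer_worker, args=(writers,), daemon=True).start()
# Inside the loop, instead of color_wrapper0.write(...):
#   frame_queue.put(("color0", cv2.resize(color0.asarray(), (width, height))[:, :, :-1]))
# After the loop, before releasing the writers:
#   frame_queue.put(STOP)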

Pitch and Yaw from a Wiimote Motion Plus using Cwiid Python

I'm trying to extract the orientation of a Wiimote using cwiid in Python. I've managed to get the accelerometer values, but there don't seem to be any object attributes relating to the purely gyroscopic data.
This guy managed to do it in Python, but to the best of my knowledge there's no example Python code online.
https://www.youtube.com/watch?v=cUjh0xQO6eY
There is information on wiibrew about the controller data, but again this seems to be excluded from any Python library.
Has anyone got any suggestions? This link has an example of getting gyro data, but the packages used don't seem to be available.
I was actually looking for this a few days ago, found this post: https://ofalcao.pt/blog/2014/controlling-the-sbrick-with-a-wiimote. More specifically, I think the code you're looking for is:
# roll = accelerometer[0], standby ~125
# pitch = accelerometer[1], standby ~125
...
roll=(wm.state['acc'][0]-125)
pitch=(wm.state['acc'][1]-125)
I'm assuming you can use the z-axis (index 2) for the yaw.
So this question has a few parts. Firstly, how to extract the gyro data from the Motion Plus sensor: to do this, the Motion Plus will first need to be enabled.
The gyro provides the angular rotation rates, but due to drift caused by integration errors you can't simply integrate them to get Euler angles. The second part of the question is how to use this data to give the orientation, and that is done with either a Kalman filter (a highly complex matrix sequence) or a complementary filter (a much less complex mathematical operation). Both of these filters essentially combine the gyro and accelerometer data, so, as mentioned in a comment above, the result is more stable measurements, less drift, and a system that isn't prone to breaking when the remote is shaken.
Kalman filter:
http://blog.tkjelectronics.dk/2012/09/a-practical-approach-to-kalman-filter-and-how-to-implement-it/
Using PyKalman on Raw Acceleration Data to Calculate Position
Complementary filter
https://www.instructables.com/Angle-measurement-using-gyro-accelerometer-and-Ar/
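As a rough illustration of the complementary-filter idea (my own sketch with made-up constants, not code taken from the links above):

import math

ALPHA = 0.98  # trust the gyro short-term, the accelerometer long-term

def complementary_filter(angle, gyro_rate, accel, dt):
    # accel = (ax, ay, az); pitch estimated from the gravity direction
    ax, ay, az = accel
    accel_angle = math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))
    # integrate the gyro rate, then nudge the result towards the accelerometer angle
    return ALPHA * (angle + gyro_rate * dt) + (1.0 - ALPHA) * accel_angle

# example step: previous angle 10 deg, gyro reads 5 deg/s, dt = 0.01 s
angle = complementary_filter(10.0, 5.0, (0.1, 0.0, 0.98), 0.01)
print(angle)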
I'm still developing the core code, but I'll post it when I'm finished, hopefully tomorrow.
The foundation code I am using to test the measurements is found here:
http://andrew-j-norman.blogspot.com/2010/12/more-code.html. Very handy, as it plots the sensor readings automatically after recording. You can see from this that, even when the remote is held still, estimating position by simple integration of the angular velocities results in drift in the position vector.
EDIT:
Testing this shows that the gyro sensor can accurately calculate the angle changed over time; however, there is still drift in the acceleration, which I believe is unavoidable.
Here is an image demonstrating the gyro motion sensor:
Just finished up the code:
#!/usr/bin/python
import cwiid
from time import time, asctime, sleep, perf_counter
from numpy import *
from pylab import *
import math
import numpy as np
from operator import add
HPF = 0.98
LPF = 0.02
def calibrate(wiimote):
print("Keep the remote still")
sleep(3)
print("Calibrating")
messages = wiimote.get_mesg()
i=0
accel_init = []
angle_init = []
while (i<1000):
sleep(0.01)
messages = wiimote.get_mesg()
for mesg in messages:
# Motion plus:
if mesg[0] == cwiid.MESG_MOTIONPLUS:
if record:
angle_init.append(mesg[1]['angle_rate'])
# Accelerometer:
elif mesg[0] == cwiid.MESG_ACC:
if record:
accel_init.append(list(mesg[1]))
i+=1
accel_init_avg = list(np.mean(np.array(accel_init), axis=0))
print(accel_init_avg)
angle_init_avg = sum(angle_init)/len(angle_init)
print("Finished Calibrating")
return (accel_init_avg, angle_init_avg)
def plotter(plot_title, timevector, data, position, n_graphs):
subplot(n_graphs, 1, position)
plot(timevector, data[0], "r",
timevector, data[1], "g",
timevector, data[2], "b")
xlabel("time (s)")
ylabel(plot_title)
print("Press 1+2 on the Wiimote now")
wiimote = cwiid.Wiimote()
# Rumble to indicate a connection
wiimote.rumble = 1
print("Connection established - release buttons")
sleep(0.2)
wiimote.rumble = 0
sleep(1.0)
wiimote.enable(cwiid.FLAG_MESG_IFC | cwiid.FLAG_MOTIONPLUS)
wiimote.rpt_mode = cwiid.RPT_BTN | cwiid.RPT_ACC | cwiid.RPT_MOTIONPLUS
accel_init, angle_init = calibrate(wiimote)
str = ""
print("Press plus to start recording, minus to end recording")
loop = True
record = False
accel_data = []
angle_data = []
messages = wiimote.get_mesg()
while (loop):
sleep(0.01)
messages = wiimote.get_mesg()
for mesg in messages:
# Motion plus:
if mesg[0] == cwiid.MESG_MOTIONPLUS:
if record:
angle_data.append({"Time" : perf_counter(), \
"Rate" : mesg[1]['angle_rate']})
# Accelerometer:
elif mesg[0] == cwiid.MESG_ACC:
if record:
accel_data.append({"Time" : perf_counter(), "Acc" : [mesg[1][i] - accel_init[i] for i in range(len(accel_init))]})
# Button:
elif mesg[0] == cwiid.MESG_BTN:
if mesg[1] & cwiid.BTN_PLUS and not record:
print("Recording - press minus button to stop")
record = True
start_time = perf_counter()
if mesg[1] & cwiid.BTN_MINUS and record:
if len(accel_data) == 0:
print("No data recorded")
else:
print("End recording")
print("{0} data points in {1} seconds".format(
len(accel_data), perf_counter() - accel_data[0]["Time"]))
record = False
loop = False
else:
pass
wiimote.disable(cwiid.FLAG_MESG_IFC | cwiid.FLAG_MOTIONPLUS)
if len(accel_data) == 0:
sys.exit()
timevector = []
a = [[],[],[]]
v = [[],[],[]]
p = [[],[],[]]
last_time = 0
velocity = [0,0,0]
position = [0,0,0]
for n, x in enumerate(accel_data):
if (n == 0):
origin = x
else:
elapsed = x["Time"] - origin["Time"]
delta_t = x["Time"] - last_time
timevector.append(elapsed)
for i in range(3):
acceleration = x["Acc"][i] - origin["Acc"][i]
velocity[i] = velocity[i] + delta_t * acceleration
position[i] = position[i] + delta_t * velocity[i]
a[i].append(acceleration)
v[i].append(velocity[i])
p[i].append(position[i])
last_time = x["Time"]
n_graphs = 3
if len(angle_data) == len(accel_data):
n_graphs = 5
angle_accel = [(math.pi)/2 if (j**2 + k**2)==0 else math.atan(i/math.sqrt(j**2 + k**2)) for i,j,k in zip(a[0],a[1],a[2])]
ar = [[],[],[]] # Angle rates
aa = [[],[],[]] # Angles
angle = [0,0,0]
for n, x in enumerate(angle_data):
if (n == 0):
origin = x
else:
delta_t = x["Time"] - last_time
for i in range(3):
rate = x["Rate"][i] - origin["Rate"][i]
angle[i] = HPF*(np.array(angle[i]) + delta_t * rate) + LPF*np.array(angle_accel)
ar[i].append(rate)
aa[i].append(angle[i])
last_time = x["Time"]
plotter("Acceleration", timevector, a, 1, n_graphs)
if n_graphs == 5:
plotter("Angle Rate", timevector, ar, 4, n_graphs)
plotter("Angle", timevector, aa, 5, n_graphs)
show()

Python Image Compression

I am using the Pillow library of Python to read in image files. How can I compress and decompress using Huffman encoding? Here is the instruction:
You have been given a set of example images, and your goal is to compress them as much as possible without losing any perceptible information – upon decompression they should appear identical to the original images. Images are essentially stored as a series of points of color, where each point is represented as a combination of red, green, and blue (RGB). Each component of the RGB value ranges from 0 to 255, so for example (100, 0, 200) would represent a shade of purple. Using a fixed-length encoding, each component of the RGB value requires 8 bits to encode (2^8 = 256), meaning that the entire RGB value requires 24 bits to encode. You could use a compression algorithm like Huffman encoding to reduce the number of bits needed for more common values and thereby reduce the total number of bits needed to encode your image.
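As a toy illustration of the idea (not part of the assignment code): with a prefix code, frequent component values get short codes, so the average bits per component drops well below the fixed 8 bits:

# Toy illustration: average bits per symbol with a hand-built prefix code
# versus the fixed 8-bit encoding.
freqs = {0: 600, 255: 300, 128: 100}      # imaginary pixel-component counts
codes = {0: "0", 255: "10", 128: "11"}    # a valid prefix code for these 3 symbols
total = sum(freqs.values())
avg_bits = sum(freqs[v] * len(codes[v]) for v in freqs) / total
print(avg_bits)  # 1.4 bits per component here, versus 8 bits fixed-length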
# For my current code I just read the image, get all the rgb and build the tree
from PIL import Image
import sys, string
import copy
codes = {}
def sortFreq(freqs):
letters = freqs.keys()
tuples = []
for let in letters:
        tuples.append((freqs[let], let))
tuples.sort()
return tuples
def buildTree(tuples):
while len (tuples) > 1:
leastTwo = tuple (tuples[0:2]) # get the 2 to combine
theRest = tuples[2:] # all the others
combFreq = leastTwo[0][0] + leastTwo[1][0] # the branch points freq
tuples = theRest + [(combFreq, leastTwo)] # add branch point to the end
tuples.sort() # sort it into place
return tuples[0] # Return the single tree inside the list
def trimTree(tree):
# Trim the freq counters off, leaving just the letters
p = tree[1] # ignore freq count in [0]
if type (p) == type (""):
return p # if just a leaf, return it
else:
        return (trimTree(p[0]), trimTree(p[1]))  # trim left then right and recombine
def assignCodes(node, pat=''):
global codes
if type (node) == type (""):
codes[node] = pat # A leaf. Set its code
else:
assignCodes(node[0], pat+"0") # Branch point. Do the left branch
assignCodes(node[1], pat+"1") # then do the right branch.
dictionary = {}
table = {}
image = Image.open('fall.bmp')
#image.show()
width, height = image.size
px = image.load()
totalpixel = width*height
print ("Total pixel: "+ str(totalpixel))
for x in range (width):
for y in range (height):
# print (px[x, y])
for i in range (3):
if dictionary.get(str(px[x, y][i])) is None:
dictionary[str(px[x, y][i])] = 1
else:
dictionary[str(px[x, y][i])] = dictionary[str(px[x, y][i])] +1
table = copy.deepcopy(dictionary)
#combination = len(dictionary)
#for value in table:
# table[value] = table[value] / (totalpixel * combination) * 100
#print(table)
print(dictionary)
sortdic = sortFreq(dictionary)
tree = buildTree(sortdic)
trim = trimTree(tree)
print(trim)
assignCodes(trim)
print(codes)
The class HuffmanCoding takes complete path of the text file to be compressed as parameter. (as its data members store data specific to the input file).
The compress() function returns the path of the output compressed file.
The function decompress() requires path of the file to be decompressed. (and decompress() is to be called from the same object created for compression, so as to get code mapping from its data members)
import heapq
import os
class HeapNode:
def __init__(self, char, freq):
self.char = char
self.freq = freq
self.left = None
self.right = None
    def __lt__(self, other):
        # heapq on Python 3 orders items with '<', so define __lt__ (the old __cmp__ is ignored)
        if other is None or not isinstance(other, HeapNode):
            return NotImplemented
        return self.freq < other.freq
class HuffmanCoding:
def __init__(self, path):
self.path = path
self.heap = []
self.codes = {}
self.reverse_mapping = {}
# functions for compression:
def make_frequency_dict(self, text):
frequency = {}
for character in text:
if not character in frequency:
frequency[character] = 0
frequency[character] += 1
return frequency
def make_heap(self, frequency):
for key in frequency:
node = HeapNode(key, frequency[key])
heapq.heappush(self.heap, node)
def merge_nodes(self):
while(len(self.heap)>1):
node1 = heapq.heappop(self.heap)
node2 = heapq.heappop(self.heap)
merged = HeapNode(None, node1.freq + node2.freq)
merged.left = node1
merged.right = node2
heapq.heappush(self.heap, merged)
def make_codes_helper(self, root, current_code):
if(root == None):
return
if(root.char != None):
self.codes[root.char] = current_code
self.reverse_mapping[current_code] = root.char
return
self.make_codes_helper(root.left, current_code + "0")
self.make_codes_helper(root.right, current_code + "1")
def make_codes(self):
root = heapq.heappop(self.heap)
current_code = ""
self.make_codes_helper(root, current_code)
def get_encoded_text(self, text):
encoded_text = ""
for character in text:
encoded_text += self.codes[character]
return encoded_text
def pad_encoded_text(self, encoded_text):
extra_padding = 8 - len(encoded_text) % 8
for i in range(extra_padding):
encoded_text += "0"
padded_info = "{0:08b}".format(extra_padding)
encoded_text = padded_info + encoded_text
return encoded_text
def get_byte_array(self, padded_encoded_text):
if(len(padded_encoded_text) % 8 != 0):
print("Encoded text not padded properly")
exit(0)
b = bytearray()
for i in range(0, len(padded_encoded_text), 8):
byte = padded_encoded_text[i:i+8]
b.append(int(byte, 2))
return b
def compress(self):
filename, file_extension = os.path.splitext(self.path)
output_path = filename + ".bin"
with open(self.path, 'r+') as file, open(output_path, 'wb') as output:
text = file.read()
text = text.rstrip()
frequency = self.make_frequency_dict(text)
self.make_heap(frequency)
self.merge_nodes()
self.make_codes()
encoded_text = self.get_encoded_text(text)
padded_encoded_text = self.pad_encoded_text(encoded_text)
b = self.get_byte_array(padded_encoded_text)
output.write(bytes(b))
print("Compressed")
return output_path
""" functions for decompression: """
def remove_padding(self, padded_encoded_text):
padded_info = padded_encoded_text[:8]
extra_padding = int(padded_info, 2)
padded_encoded_text = padded_encoded_text[8:]
encoded_text = padded_encoded_text[:-1*extra_padding]
return encoded_text
def decode_text(self, encoded_text):
current_code = ""
decoded_text = ""
for bit in encoded_text:
current_code += bit
if(current_code in self.reverse_mapping):
character = self.reverse_mapping[current_code]
decoded_text += character
current_code = ""
return decoded_text
def decompress(self, input_path):
filename, file_extension = os.path.splitext(self.path)
output_path = filename + "_decompressed" + ".txt"
with open(input_path, 'rb') as file, open(output_path, 'w') as output:
bit_string = ""
byte = file.read(1)
while(byte != ""):
byte = ord(byte)
bits = bin(byte)[2:].rjust(8, '0')
bit_string += bits
byte = file.read(1)
encoded_text = self.remove_padding(bit_string)
decompressed_text = self.decode_text(encoded_text)
output.write(decompressed_text)
print("Decompressed")
return output_path
Running the program:
Save the above code in a file huffman.py.
Create a sample text file, or download a sample file from sample.txt (right click, save as).
Save the code below in the same directory as the above code, and run it (edit the path variable below before running; initialize it to your text file's path).
UseHuffman.py
from huffman import HuffmanCoding
#input file path
path = "/home/ubuntu/Downloads/sample.txt"
h = HuffmanCoding(path)
output_path = h.compress()
h.decompress(output_path)
The compressed .bin file and the decompressed file are both saved in the same directory as the input file.
Result
On running on the above linked sample text file:
Initial Size: 715.3 kB
Compressed file Size: 394.0 kB
Plus, the decompressed file comes out to be exactly the same as the original file, without any data loss.
And that is all for Huffman Coding implementation, with compression and decompression. This was fun to code.
The above program requires the decompression function to be run using the same object that created the compressed file (because the code mapping is stored in its data members). We can also make the compression and decompression functions run independently if, during compression, we store the mapping info in the compressed file as well (at the beginning). Then, during decompression, we first read the mapping info from the file and use it to decompress the rest of the file.
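One possible way to do that (a sketch of the idea, not code from the implementation above) is to serialize the reverse_mapping table into a small header at the front of the .bin file and read it back before decoding:

import pickle

def write_with_header(output_path, reverse_mapping, byte_array):
    # prepend the code table so a fresh process can decompress the file
    header = pickle.dumps(reverse_mapping)
    with open(output_path, "wb") as out:
        out.write(len(header).to_bytes(4, "big"))  # 4-byte header length
        out.write(header)
        out.write(bytes(byte_array))

def read_with_header(input_path):
    with open(input_path, "rb") as f:
        header_len = int.from_bytes(f.read(4), "big")
        reverse_mapping = pickle.loads(f.read(header_len))
        payload = f.read()
    return reverse_mapping, payload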

improvement in slicing data?

BACKGROUND
I have a piece of hardware that returns an ascii stream & the user can choose
the number of timesteps
the number of channels
One packet = [ADDRESS][MSB][LSB][CHECKSUM]
One timestamp = [TIME_PACKET][CH1_PACKET][CH2_PACKET][CH3_PACKET][CH4_PACKET][CH5_PACKET][CH6_PACKET][CH7_PACKET]
Below is some stripped-down code that I use to create the 7 channel arrays plus a time array, which I then plot via matplotlib. The issue is that it takes a fair amount of time to process: 2.7 seconds for 30,000 samples and 7 channels. I have used Cython and compiled this block as a C extension, and it isn't that much faster...
The complete run takes 2.7 seconds.
With the checksum check disabled it takes 1.8 seconds (so that check alone costs about 0.9 seconds).
With the sign-extend check disabled it takes 1.8 seconds (so that check alone costs about 0.9 seconds).
With all checks disabled it takes 1.0 seconds.
So roughly half the time goes into splitting the data, a quarter into the checksum check on each packet, and a quarter into the sign-extend check on each packet.
Those checks are quite important, as they help identify framing errors w.r.t. the USB link and also any noise pickup on the piece of test equipment (the top 4 bits then start being set incorrectly...).
Is the below code reasonably efficient and should I live with it, or is it really bad?
import numpy as np
import time
BIT_LENGTH=16 # payload bitlength
num_samples=30000 # number of samples
packet = 4 # 4 bytes
chT = b'\x01\x00\x05\xfb' # timestep of 5 # 1
ch1 = b'\x03\x00\x11\xed' # data 17 # 3
ch2 = b'\x04\x00\x07\xfc' # data 7 # 4
ch3 = b'\x08\x00\x17\xe0' # data 23 # 8
ch4 = b'\x0c\x00e\x96' # data 101 # 12
ch5 = b'\x0e\x00\x01\xf0' # data 1 # 14
ch6 = b'\x13\x04\x00\xe8' # data 1024 # 19
ch7 = b'\x14\x04\xd2=' # data 1234 # 20
stream = chT+ch1+ch2+ch3+ch4+ch5+ch6+ch7
stream = stream*num_samples
req = [1,3,4,8,12,14,19,20] # array of channel addresses.
req_array = [np.zeros(num_samples) for i in req] # init np arrays.
fpga_error_codes = {
'\x80\x00\x01':'ERR 1',
'\x80\x00\x02':'ERR 2',
'\x80\x00\x04':'ERR 3',
'\x80\x00\x08':'ERR 4',
'\x80\x00\x10':'ERR 5',
'\x80\x00\x20':'ERR 6',
'\x80\x00\x40':'ERR 7',
}
class PacketError(Exception):
def __init__(self, value):
self.parameter = value
def __str__(self):
return repr(self.parameter)
### FUNCTIONS USED
def checksum(data):
return (~(data[0] ^ data[1] ^ data[2]) & 0xff).to_bytes(1,byteorder='big',signed=False)
def check_packet_bits(data, s=None, nbits=BIT_LENGTH):
''' additional method to assist with error detection'''
int_data = int.from_bytes(data[1:-1],byteorder='big',signed=True)
if s=='unsigned':
if int_data > (2**nbits)-1:
return True # ERROR!
else:
return False
elif s == 'signed':
hi = 2**(nbits -1) -1
lo = -1 * 2**(nbits-1)
return True if int_data < lo else True if int_data > hi else False
def check_packet(addr, data, num_bits=BIT_LENGTH, sign_type='signed'):
'''Method to determine whether an ITI packet is valid'''
if checksum(data[:-1]) != data[-1].to_bytes(1,byteorder='big',signed=False):
msg = 'Checksum Error: %s' % ( ' '.join([hex(i) for i in data]) )
raise PacketError(msg)
elif data[0] == '\x80':
msg = "%sd %s" %(fpga_error_codes.get(data[:3],'Unspecified Error'),' '.join([hex(i) for i in data]) )
raise PacketError(msg)
elif data[0] != addr:
msg = "Wrong addr. Got: %d Want: %d %s" % (data[0],addr,' '.join([hex(i) for i in data]) )
raise PacketError(msg)
elif check_packet_bits(data,s=sign_type, nbits=num_bits):
msg = "Target FPGA sent a corrupted payload addr: %d, page: %d Data: 0x%02x%02x" %(addr, self.present_page,data[1],data[2])
raise PacketError(msg)
else:
return int.from_bytes(data[1:-1],byteorder='big',signed=True)
t0= time.time()
### CODE IN QUESTION
timestep_iter = zip(*(iter(stream),)*packet*len(req))
for entry,time_window in enumerate(timestep_iter):
packet_iter = zip(*(iter(bytes(time_window)),)*packet)
tmp = [check_packet(i,bytes(next(packet_iter)),sign_type=['signed','unsigned'][0],num_bits=12,) for i in req]
for index,addr in enumerate(req):
req_array[index][entry] = tmp[index]
req_array[0] = req_array[0].cumsum() - req_array[0]
### CODE IN QUESTION
print(time.time()-t0)
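For comparison, here is a sketch (not a drop-in replacement) of how the same splitting and checks could be vectorised with NumPy over the whole stream at once, using the [ADDRESS][MSB][LSB][CHECKSUM] layout described above; stream, req, packet and PacketError are the names defined earlier:

import numpy as np

# reshape to (num_samples, channels, bytes_per_packet)
raw = np.frombuffer(stream, dtype=np.uint8).reshape(-1, len(req), packet)

# checksum = ~(addr ^ msb ^ lsb) & 0xff, checked for every packet at once
calc = (~(raw[..., 0] ^ raw[..., 1] ^ raw[..., 2])) & 0xFF
if not np.array_equal(calc, raw[..., 3]):
    raise PacketError("Checksum error somewhere in the stream")

# address/framing check for every timestep at once
if not np.array_equal(raw[..., 0], np.tile(req, (raw.shape[0], 1))):
    raise PacketError("Address/framing error somewhere in the stream")

# assemble the 16-bit payloads and sign-extend them
payload = ((raw[..., 1].astype(np.int32) << 8) | raw[..., 2]).astype(np.int16)

# optional 12-bit range check, analogous to check_packet_bits(..., nbits=12)
if payload.max() > 2**11 - 1 or payload.min() < -2**11:
    raise PacketError("Payload out of 12-bit range somewhere in the stream")

values = payload.T.astype(np.float64)        # one row per requested channel
values[0] = values[0].cumsum() - values[0]   # same timestep handling as above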

AttributeError: 'NoneType' object has no attribute 'get_width'

This is a simple script for an ASCII art generator from an image, and I get this error:
I run it from the command line, and I am using the Windows 7 operating system.
Traceback (most recent call last):
File "C:\Python33\mbwiga.py", line 251, in <module>
converter.convertImage(sys.argv[-1])
File "C:\Python33\mbwiga.py", line 228, in convertImage
self.getBlobs()
File "C:\Python33\mbwiga.py", line 190, in getBlobs
width, height = self.cat.get_width(), self.cat.get_height()
AttributeError: 'NoneType' object has no attribute 'get_width'
What am I missing here? Can someone help?
Here is the full source code, since someone asked for it:
import sys
import pygame
NAME = sys.argv[0]
VERSION = "0.1.0" # The current version number.
HELP = """ {0} : An ASCII art generator. Version {1}
Usage:
{0} [-b BLOB_SIZE] [-p FONT_WIDTH:HEIGHT] [-c] image_filename
Commands:
-b | --blob Change the blob size used for grouping pixels. This is the width of the blob; the height is calculated by multiplying the blob size by the aspect ratio.
-p | --pixel-aspect Change the font character aspect ratio. By default this is 11:5, which seems to look nice. Change it based on the size of your font. Argument is specified in the format "WIDTH:HEIGHT". The colon is important.
-c | --colour Use colour codes in the output. {0} uses VT100 codes by default, limiting it to 8 colours, but this might be changed later.
-h | --help Shows this help.""".format(NAME, VERSION)
NO_IMAGE = \
""" Usage: %s [-b BLOB_SIZE] [-p FONT_WIDTH:HEIGHT] image_filename """ % (NAME)
import math
CAN_HAS_PYGAME = False
try:
import pygame
except ImportError:
sys.stderr.write("Can't use Pygame's image handling! Unable to proceed, sorry D:\n")
exit(-1)
VT100_COLOURS = {"000": "[0;30;40m",
"001": "[0;30;41m",
"010": "[0;30;42m",
"011": "[0;30;43m",
"100": "[0;30;44m",
"101": "[0;30;45m",
"110": "[0;30;46m",
"111": "[0;30;47m",
"blank": "[0m"}
VT100_COLOURS_I = {"000": "[0;40;30m",
"001": "[0;40;31m",
"010": "[0;40;32m",
"011": "[0;40;33m",
"100": "[0;40;34m",
"101": "[0;40;35m",
"110": "[0;40;36m",
"111": "[0;40;37m",
"blank": "[0m"}
# Convenient debug function.
DO_DEBUG = True
def debug(*args):
if not DO_DEBUG: return # Abort early, (but not often).
strrep = ""
for ii in args:
strrep += str(ii)
sys.stderr.write(strrep + "\n") # Write it to stderr. Niiicce.
# System init.
def init():
""" Start the necessary subsystems. """
pygame.init() # This is the only one at the moment...
# Get a section of the surface.
def getSubsurface(surf, x, y, w, h):
try:
return surf.subsurface(pygame.Rect(x, y, w, h))
except ValueError as er:
return getSubsurface(surf, x, y, w - 2, h - 2)
# The main class.
class AAGen:
""" A class to turn pictures into ASCII "art". """
def __init__(self):
""" Set things up for a default conversion. """
# Various blob settings.
self.aspectRatio = 11.0 / 5.0 # The default on my terminal.
self.blobW = 12 # The width. Also, the baseline for aspect ratio.
self.blobH = self.aspectRatio * self.blobW # The height.
self.blobList = []
self.cat = None # The currently open file.
self.chars = """##%H(ks+i,. """ # The characters to use.
self.colour = False # Do we use colour?
def processArgs(self):
""" Process the command line arguments, and remove any pertinent ones. """
cc = 0
for ii in sys.argv[1:]:
cc += 1
if ii == "-b" or ii == "--blob":
self.setBlob(int(sys.argv[cc + 1]))
elif ii == "-p" or ii == "--pixel-aspect":
jj = sys.argv[cc + 1]
self.setAspect(float(jj.split(":")[1]) / float(jj.split(":")[0]))
elif ii == "-c" or ii == "--colour":
self.colour = True
elif ii == "-h" or ii == "--help":
print(HELP)
exit(0)
if len(sys.argv) == 1:
print(NO_IMAGE)
exit(0)
def setBlob(self, blobW):
""" Set the blob size. """
self.blobW = blobW
self.blobH = int(math.ceil(self.aspectRatio * self.blobW))
def setAspect(self, aspect):
""" Set the aspect ratio. Also adjust the blob height. """
self.aspectRatio = aspect
self.blobH = int(math.ceil(self.blobW * self.aspectRatio))
def loadImg(self, fname):
""" Loads an image into the store. """
try:
tmpSurf = pygame.image.load(fname)
except:
print("Either this is an unsupported format, or we had problems loading the file.")
return None
self.cat = tmpSurf.convert(32)
if self.cat == None:
sys.stderr.write("Problem loading the image %s. Can't convert it!\n"
% fname)
return None
def makeBlob(self, section):
""" Blob a section into a single ASCII character."""
pxArr = pygame.surfarray.pixels3d(section)
colour = [0, 0, 0]
size = 0 # The number of pixels.
# Get the density/colours.
for i in pxArr:
for j in i:
size += 1
# Add to the colour.
colour[0] += j[0]
colour[1] += j[1]
colour[2] += j[2]
# Get just the greyscale.
grey = apply(lambda x, y, z: (x + y + z) / 3 / size,
colour)
if self.colour:
# Get the 3 bit colour.
threshold = 128
nearest = ""
nearest += "1" if (colour[0] / size > threshold) else "0"
nearest += "1" if (colour[1] / size > threshold) else "0"
nearest += "1" if (colour[2] / size > threshold) else "0"
return VT100_COLOURS[nearest], grey
return grey
# We just use a nasty mean function to find the average value.
# total = 0
# for pix in pxArr.flat:
# total += pix # flat is the array as a single-dimension one.
# return total / pxArr.size # This is a bad way to do it, it loses huge amounts of precision with large blob size. However, with ASCII art...
def getBlobs(self):
""" Get a list of blob locations. """
self.blobList = [] # Null it out.
width, height = self.cat.get_width(), self.cat.get_height()
# If the image is the wrong size for blobs, add extra space.
if height % self.blobH != 0 or width % self.blobW != 0:
oldimg = self.cat
newW = width - (width % self.blobW) + self.blobW
newH = height - (height % self.blobH) + self.blobH
self.cat = pygame.Surface((newW, newH))
self.cat.fill((255, 255, 255))
self.cat.blit(oldimg, pygame.Rect(0, 0, newW, newH))
# Loop over subsections.
for row in range(0, height, int(self.blobH)):
rowItem = []
for column in range(0, width, self.blobW):
# Construct a Rect to use.
src = pygame.Rect(column, row, self.blobW, self.blobH)
# Now, append the reference.
rowItem.append(self.cat.subsurface(src))
self.blobList.append(rowItem)
return self.blobList
def getCharacter(self, value, colour = False):
""" Get the correct character for a pixel value. """
col = value[0] if colour else ""
value = value[1] if colour else value
if not 0 <= value <= 256:
sys.stderr.write("Incorrect pixel data provided! (given %d)\n"
% value)
return "E"
char = self.chars[int(math.ceil(value / len(self.chars))) % len(self.chars)]
return char + col
def convertImage(self, fname):
""" Convert an image, and print it. """
self.loadImg(fname)
self.getBlobs()
pval = "" # The output value.
# Loop and add characters.
for ii in converter.blobList:
for jj in ii:
ch = self.makeBlob(jj)
pval += self.getCharacter(ch, self.colour) # Get the character.
# Reset the colour at the end of the line.
if self.colour: pval += VT100_COLOURS["blank"]
pval += "\n" # Split it up by line.
pval = pval[:-1] # Cut out the final newline.
print(pval) # Print it.
# Main program execution.
if __name__ == "__main__":
init()
converter = AAGen()
converter.processArgs()
converter.convertImage(sys.argv[-1])
sys.exit(1)
The problem is hidden somewhere in loadImg. The error says that self.cat is None. self.cat could still hold the None it was given when initialised at line 97, or it could have been assigned the result of tmpSurf.convert(32) with that call returning None. In the first case you should observe the message Either this is an unsupported format..., in the latter case you should see the message Problem loading the image..., as you are testing self.cat against None:
def loadImg(self, fname):
""" Loads an image into the store. """
try:
tmpSurf = pygame.image.load(fname)
except:
print("Either this is an unsupported format, or we had problems loading the file.")
return None
self.cat = tmpSurf.convert(32)
if self.cat == None:
sys.stderr.write("Problem loading the image %s. Can't convert it!\n"
% fname)
return None
By the way, return None is exactly the same as a plain return with no argument. Also, the last return None can be removed completely, because any function implicitly returns None when the end of its body is reached.
For testing against None, the is operator is recommended -- i.e. if self.cat is None:.
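As a small follow-up sketch (my own suggestion, not the original author's code), convertImage can bail out early when loading fails, so getBlobs() never sees self.cat being None:

def convertImage(self, fname):
    """ Convert an image, and print it. """
    self.loadImg(fname)
    if self.cat is None:   # loading or converting failed; give up on this file
        sys.stderr.write("Giving up on %s.\n" % fname)
        return
    self.getBlobs()
    # ... the rest of the original method stays unchanged ...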
Update based on the comment from May 31.
If you want to take it a step further, you should really learn Python a bit. Have a look at the end of the original script (indentation fixed):
# Main program execution.
if __name__ == "__main__":
init() # pygame is initialized here
converter = AAGen() # you need the converter object
converter.processArgs() # the command-line arguments are
# converted to the object attributes
converter.convertImage(sys.argv[-1]) # here the conversion happens
sys.exit(1) # this is unneccessary for the conversion
If the original script is saved as mbwiga.py, then you can either run it as a script or use it as a module. In the latter case, the body below the if __name__ == "__main__": is not executed, and you have to do the equivalent in the calling script yourself. Say you have a test.py that tries to do that, located in the same directory. It must import mbwiga. Then mbwiga. becomes the prefix for the functionality from inside the module. Your code may look like this:
import mbwiga
mbwiga.init() # pygame is initialized here
converter = mbwiga.AAGen() # you need the converter object
# Now the converter is your own object name. It does not take the mbwiga. prefix.
# The original converter.processArgs() took the argumens from the command-line
# when mbwiga.py was called as a script. If you want to use some of the arguments
# you can set the converter object's attributes the way that is shown
# in the .processArgs() method definition. Or you can call it the same way to
# extract the information from the command line passed when you called the test.py
#
converter.processArgs()
# Now the conversion
converter.convertImage('myImageFilename.xxx') # here the conversion happens
