PyOpenCl Kernel in Loop Crashes GPU


I am writing a brute-force neighbor lookup routine using pyopencl. Later on it will fit into my smoothed particle hydrodynamics code. Brute force certainly is not efficient, but it's simple and it's a starting point. I have been testing my lookup kernel and I find that it crashes when I run it in a loop. I don't get any error messages in Python, but the screen flickers off, then comes back on with a note that the graphics drivers failed but have been recovered. The odd thing is that if the number of particles searched over is small (~1000 or fewer) it does just fine. If I increase the count (~10k) it crashes. I tried adding barriers, wait commands, and a finish command, to no avail. I checked for an array overrun but cannot find one. I am including the relevant code and apologize up front for its size, but I wanted to give everything so people can look at it. I am hoping someone can run this and reproduce the error, or tell me where I am going wrong. My setup is Python 3.5 using Spyder, with pyopencl 2016.1 installed.
Thanks,
Seth
First, the main file:
import numpy as np
import gpuParameters as gpuParameters
import pyopencl as cl
import pyopencl.array as ar
from BruteForceSearch import BruteForceSearch
import time as time
dim = 3 # dimensions of the problem
n = 15000 # number of particles
nbs = 50 # number of neighbors
x = np.random.rand(n) # randomly choose some x
y = np.random.rand(n) # randomly choose some y
z = np.random.rand(n) # randomly choose some z
h = np.ones(n) # smoothing parameter for the b spline
# setup gpu context
gpu = gpuParameters.gpuParameters()
# neighbor list
nlist = -1*np.ones(n*nbs, dtype=np.int32)
# data to gpu
xg = ar.to_device(gpu.queue, x) # x pos on gpu
yg = ar.to_device(gpu.queue, y) # y pos on gpu
zg = ar.to_device(gpu.queue, z) # z pos on gpu
hg = ar.to_device(gpu.queue, h) # h pos on gpu
num_p = ar.to_device(gpu.queue, np.array(n, dtype=np.int32)) # num of particles
nb = ar.to_device(gpu.queue, np.array(nbs, dtype=np.int32)) # num of neighbors
nlst = ar.to_device(gpu.queue, nlist) # neighbor list on gpu
dg = ar.to_device(gpu.queue, np.array(dim, dtype=np.int32)) # dimension on gpu
out = ar.zeros(gpu.queue, n, np.float64) # debug parameter
# call the Brute force neighbor search and h parameter set class
srch = BruteForceSearch(gpu) # instatiate
s = time.time() # timer start
for ii in range(100):
    # set a marker I really didn't think this would be necessary
    mark = cl.enqueue_marker(gpu.queue) # set a marker for kernel complete
    srch.search.search(gpu.queue, x.shape, None,
                       num_p.data, nb.data, dg.data, xg.data, yg.data, zg.data,
                       hg.data, nlst.data, out.data) # run the kernel
    cl.Event.wait(mark) # wait for complete run of kernel before next iteration
    # gpu.queue.finish()
    print('iteration: ', ii) # print iteration time to show me its running
e = time.time() # end the timer
cs = time.time() # clock the time it takes to return the array
nlist = nlst.get()
ce = time.time()
# output the times
print('time to calculate: ', e-s)
print('time to copy back: ', ce - cs)
GPU Context Class
import pyopencl as cl
class gpuParameters:
    def __init__(self, dType = []):
        #will setup the proper context based on given device preference
        #if no device perference given will default to first value
        if dType == []:
            pltfrms = cl.get_platforms()[0]
            devices = pltfrms.get_devices(cl.device_type.GPU)
            context = cl.Context(devices) #create a device context
            print(context)
            print(devices)
        self.cntxt = context#keep this context in motion
        self.queue = cl.CommandQueue(self.cntxt) #create a command que for this context
        self.mF = cl.mem_flags
Neighbor lookup
import numpy as np
import pyopencl as cl
import gpu_sph_assistance_functions as gsaf
class BruteForceSearch:
    def __init__(self, gpu):
        # instantiation of the search routine primarilly for pre compiling of
        # the function
        self.gpu = gpu # save the gpu context
        # setup and compile the search
        self.bruteSearch()
    def bruteSearch(self):
        W = gsaf.gpu_sph_kernel()
        self.search = cl.Program(
            self.gpu.cntxt,
W + '''__kernel void search(__global int *nP, __global int *nN,
__global int *dim,
__global double *x, __global double *y,
__global double *z, __global double *h,
__global int *nlist, __global double *out)
{
// indices
int gid = get_global_id(0); // current particle
int idv = 0; // unrolled array id
int count = 0; // count
int dm = *dim; // problem dimension
int itr = 0; // start iteration
int mxitr = 25; // max number of iterations
// calculate variables
double dms = 1.0/(*dim); // 1 over dimension for pow
double xi = x[gid]; // current x position
double yi = y[gid]; // current y position
double zi = z[gid]; // current z position
double dx = 0; // difference in x
double dy = 0; // difference in y
double dz = 0; // difference in z
double r = 0; // radius
double hg = h[gid]; // smoothing parametre
double Wsum = 0; // sum of weights
double W = 0; // current weight
double dwdx = 0; // derivative of weight in x direction
double dwdy = 0; // derivative of weight in y direction
double dwdz = 0; // derivative of weight in z direction
double dwdr = 0; // derivative of weight in r direction
double V = 0; // Volume of particle
double hn = 0; // holding value for comparison
double err = 10; // error
double tol = 1e-7; // tolerance
double diff = 0; // difference
// first clean the array of neighbors
for (int ii = 0; ii < *nN; ii++) // length of num of neighbors
{
idv = *nN*gid + ii; // unrolled index
nlist[idv] = -1; // this is a trigger for excluding values
}
// Next calculate the h parameter
while (err > tol)
{
Wsum = 0; // clean summation
for (int jj = 0; jj < *nP; jj++) // loop over all particles
{
dx = xi - x[jj];
dy = yi - y[jj];
dz = zi - z[jj];
// spline for weights
quintic_spline(dm, hg, dx, dy, dz, &W,
&dwdx, &dwdy, &dwdz, &dwdr);
Wsum += W; // add to store
}
V = 1.0/Wsum; /// volume
hn = pow(V, dms); // new h parameter
diff = hn - hg; // difference
err = fabs(diff); // error
out[gid] = err; // store error for debug
hg = hn; // reset h
itr ++; // update iter
if (itr > mxitr) // break out
{ break; }
}
h[gid] = hg; // store h
/* // get all neighbors in vicinity of particle not
// currently assessed
for(int ii = 0; ii < *nP; ii++)
{
dx = xi - x[ii];
dy = yi - y[ii];
dz = zi - z[ii];
r = sqrt(dx*dx + dy*dy + dz*dz);
if (r < 3.25*hg & count < *nN)
{
idv = *nN*gid + count;
nlist[idv] = ii;
count++;
}
}
*/
}
''').build()
The Spline function for weighting
W = '''void quintic_spline(
int dim, double h, double dx, double dy, double dz, double *W,
double *dWdx, double *dWdy, double *dWdz, double *dWdrO)
{
double pi = 3.141592654; // pi
double m3q = 0; // prefix values
double m2q = 0; // prefix values
double m1q = 0; // prefix values
double T1 = 0; // prefix values
double T2 = 0; // prefix values
double T3 = 0; // prefix values
double D1 = 0; // prefix values
double D2 = 0; // prefix values
double D3 = 0; // prefix values
double Ch = 0; // normalizing parameter for kernel
double C = 0; // normalizing prior to h
double r = sqrt(dx*dx + dy*dy + dz*dz);
double q = r/h; // normalized radius
double dqdr = 1.0/h; // intermediate derivative
double dWdq = 0; // intermediate derivative
double dWdr = 0; // intermediate derivative
double drdx = dx/r; // intermediate derivative
double drdy = dy/r; // intermediate derivative
double drdz = dz/r; // intermediate derivative
if (dim == 1)
{
C = 1.0/120.0;
}
else if (dim == 2)
{
C = 7.0/(pi*478.0);
}
else if (dim == 3)
{
C = 1.0/(120.0*pi);
}
Ch = C/pow(h, dim);
if (r <= 0)
{
drdx = 0.0;
drdy = 0.0;
drdz = 0.0;
}
// local prefix constants
m1q = 1.0 - q;
m2q = 2.0 - q;
m3q = 3.0 - q;
// smoothing parameter constants
T1 = Ch*pow(m3q, 5);
T2 = -6.0*Ch*pow(m2q, 5);
T3 = 15.0*Ch*pow(m1q, 5);
//derivative of spline coefficients
D1 = -5.0*Ch*pow(m3q,4);
D2 = 30.0*Ch*pow(m2q,4);
D3 = -75.0*Ch*pow(m1q,4);
// W calculation
if (q < 1.0)
{
*W = T1 + T2 + T3;
dWdq = D1 + D2 + D3;
}
else if (q >= 1.0 && q < 2.0)
{
*W = T1 + T2;
dWdq = D1 + D2;
}
else if (q >= 2.0 && q < 3.0)
{
*W = T1;
dWdq = D1;
}
else
{
*W = 0.0;
dWdq = 0.0;
}
dWdr = dWdq*dqdr;
// assign the derivatives
*dWdx = dWdr*drdx;
*dWdy = dWdr*drdy;
*dWdz = dWdr*drdz;
*dWdrO = dWdr;
}'''

I tested the code on an Intel i7-4790K CPU with AMD Accelerated Parallel Processing. It does not crash at n=150000 (I ran only one iteration). The only odd thing I noticed on a quick look through the code is that the kernel both reads from and writes to the array h. This should not be a problem, but I usually try to avoid it.
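
For checking the kernel at small n, a CPU reference can be handy. The sketch below mirrors the quintic_spline weight and the h iteration from the question in NumPy, and it writes the result to a separate array instead of back into h. The function names quintic_spline_w and update_h_reference are hypothetical, and np.pi is used in place of the kernel's truncated constant, so tiny differences from the GPU output are to be expected.

```python
import numpy as np

def quintic_spline_w(r, h, dim=3):
    """Quintic spline weight W(r, h), following the OpenCL quintic_spline."""
    # normalising constants copied from the kernel source in the question
    c = {1: 1.0 / 120.0,
         2: 7.0 / (478.0 * np.pi),
         3: 1.0 / (120.0 * np.pi)}[dim]
    ch = c / h**dim
    q = np.asarray(r, dtype=np.float64) / h
    # each term only contributes inside its own support (q < 3, q < 2, q < 1)
    t1 = np.where(q < 3.0, (3.0 - q) ** 5, 0.0)
    t2 = np.where(q < 2.0, -6.0 * (2.0 - q) ** 5, 0.0)
    t3 = np.where(q < 1.0, 15.0 * (1.0 - q) ** 5, 0.0)
    return ch * (t1 + t2 + t3)

def update_h_reference(pos, h_in, dim=3, max_iter=25, tol=1e-7):
    """Brute-force h update for every particle; h_in is left untouched."""
    pos = np.asarray(pos, dtype=np.float64)
    h_out = np.array(h_in, dtype=np.float64)      # separate output array
    for i in range(len(pos)):
        r = np.linalg.norm(pos - pos[i], axis=1)  # distances to all particles
        hg = h_out[i]
        for _ in range(max_iter + 1):             # the kernel stops once itr > mxitr
            w_sum = quintic_spline_w(r, hg, dim).sum()
            hn = (1.0 / w_sum) ** (1.0 / dim)     # V = 1/Wsum, h = V^(1/dim)
            err = abs(hn - hg)
            hg = hn
            if err <= tol:
                break
        h_out[i] = hg
    return h_out

# small random test case, far below the sizes that trigger the crash
rng = np.random.default_rng(0)
pos = rng.random((20, 3))
h_in = np.ones(20)
h_out = update_h_reference(pos, h_in)
```

Comparing h_out with the h array read back from the device for the same positions gives a quick check that the kernel computes what is intended before scaling n up.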

Related

Plotting solution 2nd ODE using Euler

I have applied the equation of motion (Newton's law) to a simple spring-and-mass system, which gives the second-order ODE y" + (k/m)x = 0; y(0) = 3; y'(0) = 0.
Using the Euler method and the exact solution, I have been able to run the code and get reasonable results. However, when I plot the results I get a diagonal line across the oscillating curves that I am after.
[Figure: current plot output with diagonal line]
Can anyone help point out what is causing this issue, and how I can fix it?
My code:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from sympy import Function, dsolve, Eq, Derivative, sin, cos, symbols
from sympy.abc import x, i
import math
# Given is y" + (k/m)x = 0; y(0) = 3; y'(0) = 0
# Parameters
h = 0.01; #Step Size
t = 50.0; #Time(sec)
k = 1; #Spring Stiffness
m = 1; #Mass
x0 = 3;
v0 = 0;
# Exact Analytical Solution
x_exact = x0*cos(math.sqrt(k/m)*t);
v_exact = -x0*math.sqrt(k/m)*sin(math.sqrt(k/m)*t);
# Eulers Method
x = np.zeros( int( t/h ) );
v = np.zeros( int( t/h ) );
x[1] = x0;
v[1] = v0;
x_exact = np.zeros( int( t/h ) );
v_exact = np.zeros( int( t/h ) );
te = np.zeros( int( t/h ) );
x_exact[1] = x0;
v_exact[1] = v0;
#print(len(x));
for i in range(1, int(t/h) - 1): #MAIN LOOP
    x[i+1] = x[i] + h*v[i];
    v[i+1] = v[i] - h*k/m*x[i];
    te[i] = i * h
    x_exact[i] = x0*cos(math.sqrt(k/m)* te[i]);
    v_exact[i] = -x0*math.sqrt(k/m)*sin(math.sqrt(k/m)* te[i]);
    # print(x_exact[i], '\t'*2, x[i]);
#plot
%config InlineBackend.figure_format = 'svg'
plt.plot(te, x_exact, te ,v_exact)
plt.title("DISPLACEMENT")
plt.xlabel("Time (s)")
plt.ylabel("Displacement (m)")
plt.grid(linewidth=0.3)

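The diagonal line comes from array entries that the loop never fills: the loop stops at index int(t/h) - 2, so the last entries of te, x_exact and v_exact keep their initial value 0, and the plot ends with a segment back to the origin. A more direct computation that fills every entry is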
te = np.arange(0,t,h)
N = len(te)
w = (k/m)**0.5
x_exact = x0*np.cos(w*te);
v_exact = -x0*w*np.sin(w*te);
plt.plot(te, x_exact, te ,v_exact)
which plots both exact curves without the diagonal line.
Note that arrays in Python start at index zero, so the initial values belong at index 0:
x = np.empty(N)
v = np.empty(N)
x[0] = x0;
v[0] = v0;
for i in range(N - 1): #MAIN LOOP
    x[i+1] = x[i] + h*v[i];
    v[i+1] = v[i] - h*k/m*x[i];
plt.plot(te, x, te ,v)
This then gives the plot with the expected increasing amplitude.
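
The increasing amplitude is a property of the explicit Euler scheme itself, not of the plotting. For this oscillator each step multiplies the energy 0.5*m*v^2 + 0.5*k*x^2 by exactly (1 + h^2*k/m). The sketch below checks that numerically with the parameters from the question; the function name explicit_euler is a hypothetical helper introduced for illustration.

```python
import numpy as np

def explicit_euler(x0, v0, k, m, h, n_steps):
    """Integrate x'' + (k/m) x = 0 with the explicit Euler method."""
    x = np.empty(n_steps)
    v = np.empty(n_steps)
    x[0], v[0] = x0, v0
    for i in range(n_steps - 1):
        x[i + 1] = x[i] + h * v[i]
        v[i + 1] = v[i] - h * k / m * x[i]
    return x, v

# parameters taken from the question
k, m, h, x0, v0 = 1.0, 1.0, 0.01, 3.0, 0.0
n_steps = int(50.0 / h)

x, v = explicit_euler(x0, v0, k, m, h, n_steps)

# total energy of the discrete solution at every step
energy = 0.5 * m * v**2 + 0.5 * k * x**2

# ratio of consecutive energies; constant and slightly larger than 1
growth = energy[1:] / energy[:-1]
expected_growth = 1.0 + h * h * k / m
```

Because the growth factor depends on h^2, halving the step size reduces the per-step growth by a factor of four, but the drift never vanishes for this method.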

How to decode raw_outputs/box_encodings from Tensorflow Object detection ssd-mobilenet without nms

In order to deploy my own SSD-MobileNet model on Android and use NNAPI acceleration, I retrained the model without NMS post-processing, following the TensorFlow Object Detection API.
Without NMS, the output raw_outputs/box_encodings holds the encoded box locations. I decode it as follows, but it does not work:
for(int j =0; j < 5; j++)
{
float sk = (float)(0.2 + (0.949 - 0.200) *j * 1.0 / 5*1.0);
float width_a = (float)(sk * Math.sqrt(aspectra[j]));
float height_a = (float)(sk * 1.0 / Math.sqrt(aspectra[j]));
for(int k = 0; k < featuresize[j] ; k++)
{
float center_x_a = (float)((k + 0.5) * 1.0/ featuresize[j]);
float center_y_a = (float)((k + 0.5) * 1.0/ featuresize[j]);
float ty = (float)(outputBox[0][i][0] / 10.);
float tx = (float)(outputBox[0][i][1] / 10.);
float th = (float)(outputBox[0][i][2] / 5.);
float tw = (float)(outputBox[0][i][3] / 5.);
float w =(float)(Math.exp(tw) * width_a);
float h = (float)(Math.exp(th) * height_a);
float y_center = ty * height_a + center_y_a;
float x_ceneter = tx * width_a + center_x_a;
float ymin = (float)((y_center - h ) / 2.);
float xmin = (float)((x_ceneter - w ) / 2.);
float ymax = (float)((y_center + h ) / 2.);
float xmax = (float)((x_ceneter + w ) / 2.);

In order to decode raw_outputs/box_encodings you also need anchors as the box_encodings are encoded with respect to anchors.
Following is my implementation of decoding raw_outputs/box_encodings:
private float[][][] decodeBoxEncodings(final float[][][] boxEncoding, final float[][] anchor, final int numBoxes) {
final float[][][] decodedBoxes = new float[1][numBoxes][4];
for (int i = 0; i < numBoxes; ++i) {
final double ycenter = boxEncoding[0][i][0] / y_scale * anchor[i][2] + anchor[i][0];
final double xcenter = boxEncoding[0][i][1] / x_scale * anchor[i][3] + anchor[i][1];
final double half_h = 0.5 * Math.exp((boxEncoding[0][i][2] / h_scale)) * anchor[i][2];
final double half_w = 0.5 * Math.exp((boxEncoding[0][i][3] / w_scale)) * anchor[i][3];
decodedBoxes[0][i][0] = (float)(ycenter - half_h); //ymin
decodedBoxes[0][i][1] = (float)(xcenter - half_w); //xmin
decodedBoxes[0][i][2] = (float)(ycenter + half_h); //ymax
decodedBoxes[0][i][3] = (float)(xcenter + half_w); //xmax
}
return decodedBoxes;
}
This decoding technique is from TFLite detection_postprocess operation.
Edit: scale values are:
float y_scale = 10.0f;
float x_scale = 10.0f;
float h_scale = 5.0f;
float w_scale = 5.0f;
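
For prototyping off-device, the same decoding can be written in a few vectorised lines. This is a sketch that mirrors the Java method above, assuming the anchors are stored as rows of [ycenter, xcenter, height, width] as the indexing there implies; decode_box_encodings is a hypothetical name.

```python
import numpy as np

def decode_box_encodings(box_encodings, anchors,
                         y_scale=10.0, x_scale=10.0,
                         h_scale=5.0, w_scale=5.0):
    """Decode (num_boxes, 4) encodings [ty, tx, th, tw] against anchors.

    anchors: (num_boxes, 4) rows of [ycenter, xcenter, height, width].
    Returns (num_boxes, 4) rows of [ymin, xmin, ymax, xmax].
    """
    enc = np.asarray(box_encodings, dtype=np.float64)
    anc = np.asarray(anchors, dtype=np.float64)

    # centre offsets are scaled by the anchor size, then shifted by its centre
    ycenter = enc[:, 0] / y_scale * anc[:, 2] + anc[:, 0]
    xcenter = enc[:, 1] / x_scale * anc[:, 3] + anc[:, 1]

    # sizes are encoded as log-ratios relative to the anchor size
    half_h = 0.5 * np.exp(enc[:, 2] / h_scale) * anc[:, 2]
    half_w = 0.5 * np.exp(enc[:, 3] / w_scale) * anc[:, 3]

    return np.stack([ycenter - half_h, xcenter - half_w,
                     ycenter + half_h, xcenter + half_w], axis=1)

# an all-zero encoding should reproduce the anchor box itself
anchors = np.array([[0.5, 0.5, 0.2, 0.4]])
boxes = decode_box_encodings(np.zeros((1, 4)), anchors)
```

The all-zero case is a convenient sanity check: the decoded box is just the anchor converted from centre/size form to corner form.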

https://actcast.hatenablog.com/entry/2021/08/06/085134
This worked for me.
My pb model: SSD MobileNet V1 0.75 Depth Quantized (tflite_graph.pb)
TF version: 1.15
Outputs: raw_outputs/box_encodings and raw_outputs/class_predictions
Start from step 3 (Create Anchor) in the blog post linked above (my model was already trained, so I skipped the training part).
Step 4 is not required if your model is ready; just load your own model (they load nnoir_model; load your model instead).
Corresponding Google Colab: https://colab.research.google.com/github/Idein/tensorflow-object-detection-api-to-nnoir/blob/master/notebook/ssd_mobilenet_v1_coco_2018_01_28_to_nnoir.ipynb#scrollTo=51d0jACEslWu

How to get the average color of a specific area in a webcam feed (Processing/JavaScript)?

I'm using Processing to get a webcam feed from my laptop. In the top left corner, I have drawn a rectangle over the displayed feed. I'm trying to get the average color of the webcam, but only in the region contained by that rectangle.
I keep getting color (0, 0, 0), black, as the result.
Thank you all!
PS sorry if my code seems messy..I'm new at Processing and so I don't know if this might be hard to read or contain bad practices. Thank you.
import processing.video.*;
Capture webcam;
Capture cap;
PImage bg_img;
color bgColor = color(0, 0, 0);
int rMargin = 50;
int rWidth = 100;
color input = color(0, 0, 0);
color background = color(255, 255, 255);
color current;
int bgTolerance = 5;
void setup() {
size(1280,720);
// start the webcam
String[] inputs = Capture.list();
if (inputs.length == 0) {
println("Couldn't detect any webcams connected!");
exit();
}
webcam = new Capture(this, inputs[0]);
webcam.start();
}
void draw() {
if (webcam.available()) {
// read from the webcam
webcam.read();
image(webcam, 0,0);
webcam.loadPixels();
noFill();
strokeWeight(2);
stroke(255,255, 255);
rect(rMargin, rMargin, rWidth, rWidth);
int yCenter = (rWidth/2) + rMargin;
int xCenter = (rWidth/2) + rMargin;
// rectMode(CENTER);
int rectCenterIndex = (width* yCenter) + xCenter;
int r = 0, g = 0, b = 0;
//for whole image:
//for (int i=0; i<bg_img.pixels.length; i++) {
// color c = bg_img.pixels[i];
// r += c>>16&0xFF;
// g += c>>8&0xFF;
// b += c&0xFF;
//}
//r /= bg_img.pixels.length;
//g /= bg_img.pixels.length;
//b /= bg_img.pixels.length;
//CALCULATE AVG COLOR:
int i;
for(int x = 50; x <= 150; x++){
for(int y = 50; y <= 150; y++){
i = (width*y) + x;
color c = webcam.pixels[i];
r += c>>16&0xFF;
g += c>>8&0xFF;
b += c&0xFF;
}
}
r /= webcam.pixels.length;
g /= webcam.pixels.length;
b /= webcam.pixels.length;
println(r + " " + g + " " + b);
}
}

You're so close, but you're missing one important aspect: the number of pixels you're sampling.
Notice that the commented-out example for the full image divides by the full number of pixels (pixels.length).
In your adapted version, however, you want the average colour of only a subsection of the image, which means a much smaller number of pixels.
You're only sampling an area of roughly 100x100 pixels (the inclusive bounds 50..150 actually cover 101x101), so you need to divide by the number of sampled pixels rather than by webcam.pixels.length (the whole frame). Because this is integer division of a comparatively small sum by a very large count, you get 0.
This is what I mean in code:
int totalSampledPixels = rWidth * rWidth;
r /= totalSampledPixels;
g /= totalSampledPixels;
b /= totalSampledPixels;
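
The effect of the divisor can also be checked outside Processing. The NumPy sketch below is only an illustration: region_average is a hypothetical helper, and the frame is a synthetic solid-colour image rather than a real webcam capture.

```python
import numpy as np

def region_average(frame, left, top, width, height):
    """Integer average RGB of a rectangle in an (H, W, 3) uint8 frame."""
    region = frame[top:top + height, left:left + width].astype(np.int64)
    sampled = region.shape[0] * region.shape[1]   # pixels actually summed
    sums = region.reshape(-1, 3).sum(axis=0)
    return tuple(int(s) // sampled for s in sums)

# synthetic 1280x720 frame filled with one colour
frame = np.empty((720, 1280, 3), dtype=np.uint8)
frame[:] = (200, 100, 50)

# dividing by the number of sampled pixels recovers the colour
correct = region_average(frame, 50, 50, 100, 100)

# dividing the same sums by the size of the whole frame does not
region_sums = frame[50:150, 50:150].astype(np.int64).reshape(-1, 3).sum(axis=0)
total_pixels = frame.shape[0] * frame.shape[1]
wrong = tuple(int(s) // total_pixels for s in region_sums)
```

With the whole-frame divisor the channel values collapse toward zero, which matches the near-black result described in the question.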
Full tweaked sketch:
import processing.video.*;
Capture webcam;
Capture cap;
PImage bg_img;
color bgColor = color(0, 0, 0);
int rMargin = 50;
int rWidth = 100;
int rHeight = 100;
color input = color(0, 0, 0);
color background = color(255, 255, 255);
color current;
int bgTolerance = 5;
void setup() {
size(1280,720);
// start the webcam
String[] inputs = Capture.list();
if (inputs.length == 0) {
println("Couldn't detect any webcams connected!");
exit();
}
webcam = new Capture(this, inputs[0]);
webcam.start();
}
void draw() {
if (webcam.available()) {
// read from the webcam
webcam.read();
image(webcam, 0,0);
webcam.loadPixels();
noFill();
strokeWeight(2);
stroke(255,255, 255);
rect(rMargin, rMargin, rWidth, rHeight);
int yCenter = (rWidth/2) + rMargin;
int xCenter = (rWidth/2) + rMargin;
// rectMode(CENTER);
int rectCenterIndex = (width* yCenter) + xCenter;
int r = 0, g = 0, b = 0;
//for whole image:
//for (int i=0; i<bg_img.pixels.length; i++) {
// color c = bg_img.pixels[i];
// r += c>>16&0xFF;
// g += c>>8&0xFF;
// b += c&0xFF;
//}
//r /= bg_img.pixels.length;
//g /= bg_img.pixels.length;
//b /= bg_img.pixels.length;
//CALCULATE AVG COLOR:
int i;
for (int x = rMargin; x < rMargin + rWidth; x++) {
  for (int y = rMargin; y < rMargin + rHeight; y++) {
    // index into the webcam frame, which may differ in size from the sketch
    i = (webcam.width * y) + x;
    color c = webcam.pixels[i];
    r += c>>16&0xFF;
    g += c>>8&0xFF;
    b += c&0xFF;
  }
}
//divide by just the number of pixels sampled (rWidth x rHeight, here 100x100)
int totalSampledPixels = rWidth * rHeight;
r /= totalSampledPixels;
g /= totalSampledPixels;
b /= totalSampledPixels;
fill(r,g,b);
rect(rMargin + rWidth, rMargin, rWidth, rHeight);
println(r + " " + g + " " + b);
}
}
Bear in mind this is averaging in the RGB colour space, which is not the same as a perceptual colour space. For example, if you average red and yellow you'd expect orange, but in RGB, a bit of red and green makes yellow.
Hopefully the RGB average is good enough for what you need, otherwise you may need to convert from RGB to CIE XYZ colour space then to Lab colour space to compute the perceptual average (then convert back to XYZ and RGB to display on screen). If that is something you're interested in trying, you can find an older answer demonstrating this in openFrameworks (which you'll notice can be similar to Processing in simple scenarios).

PyOpenCL how to modify a matrix locally within the kernel function

I am trying to modify a matrix (Pbis) locally within a pyOpenCL kernel function and when filling up this matrix with 0 it alters the result matrix R. When executing this code we obtain weird values in the R matrix. It is probably due to memory allocation but we cannot figure out how to fix it. Normally R should be exclusively composed of the init value.
program = cl.Program(context, """
__kernel void generate_paths(__global float *P, ushort const n,
ushort N, ushort init, __global float *R){
int i = get_global_id(0);
__private float* Pbis;
for (int k=0; k<n; k++){
Pbis[k] = 0;
}
for (int j=0; j<n; j++)
{
R[i*(n+1) + j] = init;
}
R[i*(n+1) + n] = init;
}
""").build()
The parameters for the generation are:
program.generate_paths(queue, res_np.shape, None, P_buf, np.uint16(n), np.uint16(N), np.uint16(init), res_buf)
Here is the entire code for reproducibility:
import numpy as np
import pyopencl as cl
import numpy.linalg as la
import os
os.environ['PYOPENCL_COMPILER_OUTPUT'] = '1'
os.environ['PYOPENCL_CTX'] = '1'
(n, N) = (3,6)
U = np.random.uniform(0,1, size=(n+1)*N)
U = U.astype(np.float32)
P = np.matrix([[0, 1/3, 1/3, 1/3], [1/3, 0, 1/3, 1/3], [1/3, 1/3, 0, 1/3], [1/3, 1/3, 1/3, 0]])
P = P.astype(np.float32)
res_np = np.zeros((N, n+1),dtype = np.float32)
platform = cl.get_platforms()[0]
device = platform.get_devices()[0]
context = cl.Context([device])
queue = cl.CommandQueue(context)
mf = cl.mem_flags
U_buf = cl.Buffer(context, mf.COPY_HOST_PTR | mf.COPY_HOST_PTR, hostbuf=U)
P_buf = cl.Buffer(context, mf.COPY_HOST_PTR | mf.COPY_HOST_PTR, hostbuf=P)
res_buf = cl.Buffer(context, mf.WRITE_ONLY, res_np.nbytes)
init = 0
program = cl.Program(context, """
__kernel void generate_paths(__global const float *U, __global float *P, ushort const n,
ushort N, ushort init, __global float *R){
int i = get_global_id(0);
int current = init;
__private float* Pbis;
for (int k=0; k<n; k++){
Pbis[k] = 0;
}
for (int j=0; j<n; j++)
{
R[i*(n+1) + j] = current;
}
R[i*(n+1) + n] = init;
}
""").build()
#prg.multiply(queue, c.shape, None,
# np.uint16(n), np.uint16(m), np.uint16(p),
# a_buf, b_buf, c_buf)
# a_mul_b = np.empty_like(c)
# cl.enqueue_copy(queue, a_mul_b, c_buf)
program.generate_paths(queue, res_np.shape, None, U_buf, P_buf, np.uint16(n), np.uint16(N), np.uint16(init), res_buf)
chem_gen = np.empty_like(res_np)
cl.enqueue_copy(queue, chem_gen, res_buf)
print("Platform Selected = %s" %platform.name)
print("Device Selected = %s" %device.name)
print("Generated Paths:")
print (chem_gen)
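
One thing that stands out in the kernel as posted: Pbis is declared as a pointer (__private float* Pbis;) but is never pointed at any storage, so writing Pbis[k] is undefined behaviour and may well be what corrupts R. A minimal sketch of one way around it, assuming an upper bound on n is known when the kernel is built, is to declare Pbis as a fixed-size private array. The helper make_kernel_source and the macro MAX_N are hypothetical names, not part of the original code.

```python
def make_kernel_source(max_n):
    """Return OpenCL C source in which Pbis is a fixed-size private array.

    max_n is an assumed compile-time upper bound on the kernel argument n.
    """
    if max_n < 1:
        raise ValueError("max_n must be positive")
    return f"""
#define MAX_N {int(max_n)}
__kernel void generate_paths(__global const float *U, __global float *P,
                             ushort const n, ushort N, ushort init,
                             __global float *R)
{{
    int i = get_global_id(0);
    int current = init;
    float Pbis[MAX_N];   // private storage with a size known at compile time
    for (int k = 0; k < n && k < MAX_N; k++) {{
        Pbis[k] = 0;
    }}
    for (int j = 0; j < n; j++) {{
        R[i * (n + 1) + j] = current;
    }}
    R[i * (n + 1) + n] = init;
}}
"""

# source for the case in the question, where n = 3
source = make_kernel_source(4)
```

The returned string can be passed to cl.Program(context, source).build() exactly as in the question, with max_n at least as large as the n supplied at launch.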

Error using dde23 (line 224) Derivative and history vectors have different lengths

I am trying to solve a coupled system of delay differential equations using dde23. When I run the following code, I get an annoying error: "Derivative and history vectors have different lengths"
function sol = prob1
clf
global Lembda alpha u1 u2 p q c d k a T b zeta1 zeta2 A1 A2
Lembda=2; b=0.07; d=0.0123; a=0.6; k=50; q=13; c=40; p=30; alpha = 0.4476; T=1; B=0.4; A1 =200; A2=100; zeta1=10; zeta2=30;
lags = [ 10; 0.2; 2; 10; 0.2; 10; 0.2; 2; 10; 0.2; 15; 0.9; 0.17; 0.01; 0.5; 0.000010; 0.00002];
sol = dde23(@prob2f,T,lags,[0,10], u1, u2);
function yp = prob2f(t,y,Z,B)
global Lembda alpha p b d c q T a k zeta1 zeta2 A1 A2
x2 = y(1);
y2 = y(2);
z2 = y(3);
v = y(4);
w = y(5);
xlag = Z(1,1);
vlag = Z(2,1);
%%%%%%%%%%%%%%%%
x1 = y(6);
y1 = y(7);
z1 = y(8);
v1 = y(9);
w1 = y(10);
x1lag = Z(1,1);
v1lag = Z(2,1);
%%%%%%%%%%%%%%%%%%%
lambda1 = y(11);
lambda2 = y(12);
lambda3 = y(13);
lambda4 = y(14);
lambda5 = y(15);
u1 = y(16);
u2= y(17);
lambda1lag = Z(1,1);
lambda4lag = Z(2,1);
%%%%%%%%%
dxdt=Lembda-d*x2-B*x2*v;
dydt=B*exp(-a*T)*xlag*vlag-a*y2 - alpha*y2*w;
dzdt=alpha*y2*w - b*z2;
dvdt=k*y2-p*v;
dwdt=c*z2-q*w;
%%%%%%%%%
dx1dt=Lembda-d*x1-(1-u1)*B*x1*v1;
dy1dt=(1-u1)*B*exp(-a*T)*x1lag*v1lag-a*y1 - alpha*y1*w1;
dz1dt=alpha*y1*w1 - b*z1;
dv1dt=(1-u2)*k*y1-p*v1;
dw1dt=c*z1-q*w1;
%%%%%%%%%%
dlambda1dt= A1+lambda1*d+(1-u1)*lambda1*B*v1-(1-u1)*lambda2*B*v1lag*exp(-a*T)*lambda2*(T);
dlambda2dt= a*lambda2+(lambda2-lambda3)*alpha*w1-lambda4*k*(u2-1);
dlambda3dt= b*lambda3-c*lambda5;
dlambda4dt= A2+(1-u1)*lambda1*B*x1+lambda4*p+lambda4*(T)*lambda2*x1lag*(u2-1)*exp(-a*T);
dlambda5dt=alpha*lambda2*z1-alpha*lambda3*z1+lambda5*q;
du1dt = ( lambda2*x1lag*v1lag - lambda1*x1*v1)*(B/zeta1);
du2dt =(lambda4*k*y2)/zeta2;
yp = [ dxdt; dydt; dzdt; dvdt;dwdt; dx1dt; dy1dt; dz1dt; dv1dt;dw1dt; dlambda1dt; dlambda2dt; dlambda3dt ;dlambda4dt ;dlambda5dt; du1dt; du2dt ];
Can anyone guide me, to be able to resolve this issue?
Thanks

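The error occurs because the vector yp returned by prob2f is not the same length as the vector lags, which your call passes to dde23 in the history position (the signature is dde23(ddefun,lags,history,tspan,...)).
The lags vector has length 17, but yp comes out with length 10. Even though yp lists 17 entries, seven of them evaluate to [] and vanish on concatenation. Each of those seven contains B, which is most likely empty inside prob2f because the extra arguments u1 and u2 passed to dde23 are globals that were never assigned a value: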
yp = [ dxdt; dydt; dzdt; dvdt;dwdt; dx1dt; dy1dt; dz1dt; dv1dt;dw1dt;
dlambda1dt; dlambda2dt; dlambda3dt ;dlambda4dt ;dlambda5dt; du1dt; du2dt ];
K>> dxdt
dxdt =
[]
K>> length(yp)
10
lags = [ 10; 0.2; 2; 10; 0.2; 10; 0.2; 2; 10; 0.2; 15; 0.9; 0.17; 0.01;
0.5; 0.000010; 0.00002];
sol = dde23(@prob2f,T,lags,[0,10], u1, u2);
K>> length(lags)
17
The vector returned by your prob2f() must have the same length as the history vector (here, lags). This is the check inside dde23 that raises the error:
f0 = feval(ddefun,t0,y0,Z0,varargin{:});
nfevals = nfevals + 1;
[m,n] = size(f0);
if n > 1
error(message('MATLAB:dde23:DDEOutputNotCol'))
elseif m ~= neq
error(message('MATLAB:dde23:DDELengthMismatchHistory')); <========
end
You need to check your prob2f function and make sure yp has the same length as lags.
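
The way empty terms shorten the derivative vector can be reproduced outside MATLAB. The NumPy sketch below is only an analogy, not dde23 itself: stack_terms and check_derivative_length are hypothetical helpers, and an empty array stands in for an unassigned MATLAB variable.

```python
import numpy as np

def stack_terms(terms):
    """Concatenate scalar terms into one vector, like MATLAB's [a; b; c]."""
    return np.concatenate(
        [np.atleast_1d(np.asarray(t, dtype=float)) for t in terms])

def check_derivative_length(yp, history):
    """Raise if the derivative and history vectors differ in length."""
    if len(yp) != len(history):
        raise ValueError(
            f"derivative has length {len(yp)}, history has length {len(history)}")

B_empty = np.array([])            # stands in for an unassigned variable ([])
x, v = 2.0, 3.0

terms = [1.0 - B_empty * x * v,   # empty: arithmetic with [] stays empty
         0.5 * x,                 # ordinary scalar
         B_empty * x,             # empty
         v]                       # ordinary scalar

yp = stack_terms(terms)           # only the non-empty terms survive
history = np.zeros(4)             # one history value per intended equation

try:
    check_derivative_length(yp, history)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
```

Four terms go in but the stacked vector is shorter than the history, which is the same kind of mismatch dde23 reports.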
