Although there are many resources about how to calculate the receptive field (RF) of CNNs (ex: http://fomoro.com/tools/receptive-fields), I didn't find anything regarding skip connections. In [1] they mention that skip connections make the effective RF smaller, but what happens to the theoretical RF?
At the end of the day, I would like to know how to calculate the receptive field of a network comprising many residual blocks.
Thanks,
Daniel
TL;DR compute the receptive field ignoring all skip connections.
First, in the general case, suppose we have two branches of data flow, A and B. You can compute the receptive field for branches A and B independently, and then simply take the maximum when the branches merge. (The reason you can take the max is that branches typically merge via channel concatenation or element-wise addition, so the merged output depends on inputs from both branches.)
Now, when one branch is a skip connection and the other is not, the non-skip branch gives the larger receptive field. If you have many skip connections, the longest route through the network (the one that takes no skip connections) gives the maximum receptive field. Hence the result in the TL;DR.
Getting the maximum among branches becomes more complicated if instead of a simple skip connection you have something like an inception block.
In those cases, you may want to compute the receptive field directly by definition.
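To make the "longest route" rule concrete, here is a minimal sketch in plain Python (the layer configurations and helper names are made up for illustration, not taken from any library) that computes the theoretical RF of a sequential stack and treats a residual block as just its convolutional branch:

    def receptive_field(layers):
        """Theoretical RF of a sequential stack of conv/pool layers.

        `layers` is a list of (kernel_size, stride) pairs. Uses the usual
        recurrences r <- r + (k - 1) * j and j <- j * s, with r = j = 1 at the input.
        """
        r, j = 1, 1
        for k, s in layers:
            r += (k - 1) * j
            j *= s
        return r

    # A residual block contributes only through its convolutional branch:
    # the identity skip adds nothing, and max(skip, conv branch) = conv branch.
    block = [(3, 1), (3, 1)]      # two 3x3 stride-1 convs inside the block
    stem = [(7, 2), (3, 2)]       # 7x7/2 conv followed by 3x3/2 pooling

    print(receptive_field(stem + block + block))  # stem + two residual blocks

For an inception-style block you would instead compute receptive_field for each parallel branch and carry the maximum forward before continuing down the stack.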
I'm struggling with a resource-optimisation task, and I wonder if anyone knows an efficient way to find the optimal solution for it using only Excel. Here is what I'm trying to achieve:
Suppose you are managing an assembly factory for metal tubes. Your raw material is standard size tubes, and then in your factory you need to cut these tubes according to a list of requests from clients, with very specific sizes. All tubes are of the same type, so we can reuse leftovers from each cut, if the length of that leftover is sufficient to satisfy any tube request.
We can also group several small requests to be cut from one single tube; for example, on the attached list we could use one 8-metre tube to deliver the last four entries (1.615 + 1.62 + 1.625 + 1.67), with 1.47 m of leftover wasted.
Assuming a long list of requests, and that the tubes supplied are 8 metres each, do you know of any way of calculating how many tubes I have to order to satisfy the list of requests while minimising the losses from the cuts?
Example of request list, each entry is in metres
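For illustration only (this is plain Python rather than an Excel formula, the request lengths other than the four quoted above are invented, and a greedy heuristic is not guaranteed to find the true optimum, which would need a cutting-stock / integer-programming solver such as Excel's Solver), a first-fit decreasing pass captures the kind of grouping described above:

    def pack_requests(requests, tube_length=8.0):
        """First-fit decreasing heuristic for the cutting problem:
        sort requests longest-first, place each in the first tube with
        enough leftover, and open a new tube when none fits."""
        tubes = []  # remaining length of each opened tube
        for length in sorted(requests, reverse=True):
            for i, remaining in enumerate(tubes):
                if length <= remaining:
                    tubes[i] -= length
                    break
            else:
                tubes.append(tube_length - length)  # open a new tube
        return len(tubes), sum(tubes)  # tubes needed, total leftover

    # Invented example that includes the four short requests from above.
    requests = [6.2, 3.1, 2.5, 1.615, 1.62, 1.625, 1.67]
    print(pack_requests(requests))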
We are senior-year students designing an FPGA-based Convolutional Neural Network accelerator.
We built a pipelined architecture (convolution, pooling, convolution, pooling). For each of these 4 stages, we need to multiply one particular window by its filter. In the 2nd convolution layer we have a (5*5)*6*16 window and filter.
I admit this is not a clear explanation so far, but the main problem is that we need to access the 5*5*6*16 filter coefficients, which are stored sequentially in block RAM, all at the same time. At every clock cycle, however, I can only reach one particular address of the ROM.
What approach can we take?
You don't want to hear this, but the only solution is:
Go back to the start and change your architecture/code (or run very slowly).
You can NOT access all 2400 coefficients in one system clock cycle from a single sequential memory unless you run that memory at 2400 times the clock frequency of your main system. So let's say with a 100 MHz RAM/ROM operating frequency, your main system would have to run at ~42 kHz.
This is a recurrent theme I encounter on these forums: you have made a wrong decision and now want a solution, preferably an easy one. Sorry, but there is none.
I am having the same issue. For some layers we want to access multiple kernels for parallel operations; however, with a BRAM implementation you can have at most 2 accesses per cycle. The solution I came up with is to create a ROM array, implemented in either BRAM style or distributed style.
Unlike a RAM array, a ROM array can't be implemented quite as easily, so you need a script/software layer that generates the RTL for your module.
I chose the distributed approach; however, I can't estimate the resources needed, and the utilization reports give me unclear results. I'm still investigating this.
For future reference, you could look into HLS pragmas to help use the FPGA's resources. What you could do is use the array-partition pragma with the cyclic setting. This makes it so that each subsequent element of an array is stored in a different sub-array.
For example, with a factor of 4 there'd be four smaller arrays created from the original array. The first elements of the sub-arrays would be arr[0], arr[1], arr[2], and arr[3] respectively.
That’s how you would distribute an array across Block RAMs to have more parallel access at a time.
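To illustrate what the cyclic setting does to the data layout (sketched here in plain Python purely as an illustration; in the real design the partitioning is done by the HLS pragma on the C array), element i of the original array ends up in sub-array i % factor:

    def cyclic_partition(arr, factor=4):
        """Split `arr` into `factor` sub-arrays, with element i going
        to sub-array i % factor (the layout a cyclic partition produces)."""
        return [arr[k::factor] for k in range(factor)]

    coeffs = list(range(12))             # stand-in for filter coefficients
    banks = cyclic_partition(coeffs, 4)
    # banks[0] = [0, 4, 8], banks[1] = [1, 5, 9], banks[2] = [2, 6, 10], ...
    print(banks)

With each sub-array mapped to its own BRAM, consecutive coefficients live in different memories and can be read in the same cycle.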
I have an object with many fields. Each field has a different range of values. I want to use Hypothesis to generate different instances of this object.
Is there a limit to the number of combinations of field values Hypothesis can handle? And what does the search tree Hypothesis creates look like? I don't need all the combinations, but I want to make sure that I get a fair number of them, testing many different values for each field. I want to make sure Hypothesis is not doing a DFS until it hits the max number of examples to generate.
TLDR: don't worry, this is a common use-case and even a naive strategy works very well.
The actual search process used by Hypothesis is complicated (as in, "lead author's PhD topic"), but it's definitely not a depth-first search! Briefly, it's a uniform distribution layered on a pseudo-random number generator, with a coverage-guided fuzzer biasing that towards less-explored code paths, and strategy-specific heuristics on top of that.
In general, I trust this process to pick good examples far more than I trust my own judgement, or that of anyone without years of experience in QA or testing research!
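To illustrate what a "naive strategy" for an object with many fields can look like (the class and field names below are made up for the example), something like st.builds is usually enough, and Hypothesis takes care of exploring the combinations:

    from dataclasses import dataclass
    from hypothesis import given, strategies as st

    @dataclass
    class Config:                 # hypothetical object with several fields
        width: int
        height: int
        ratio: float
        name: str

    # Naive strategy: describe each field's range and let Hypothesis
    # pick which combinations of values to try.
    configs = st.builds(
        Config,
        width=st.integers(min_value=1, max_value=1024),
        height=st.integers(min_value=1, max_value=1024),
        ratio=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
        name=st.text(min_size=1, max_size=8),
    )

    @given(configs)
    def test_config_property(cfg):
        # Replace with a real property of your code; every run gets a
        # freshly drawn combination of field values.
        assert 1 <= cfg.width <= 1024

Each example draws every field from its own range, and the process described above decides which combinations are worth trying next.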
Is there any way to test multiple algorithms at once, rather than running each algorithm separately and then checking the result? There are a lot of times where I don't really know which one to use, so I would like to test several and get the results (error rates) fairly quickly in Azure Machine Learning Studio.
You could connect the scores of multiple algorithms to an 'Evaluate Model' module to evaluate the algorithms against each other.
Hope this helps.
The module you are looking for is the one called “Cross-Validate Model”. It splits whatever comes in from the input port (the dataset) into 10 folds, holds one fold out as the “answer”, trains a model on the other nine, repeats this for each fold, and returns a set of accuracy statistics measured against the held-out data. What you would look at is the column called “Mean absolute error”, which is the average error across the trained models. You can connect whatever algorithm you want to one of the ports, and you will then get the result for that particular algorithm by right-clicking the port that gives the score.
After that you can assess which algorithm did best. As a pro tip, you could use Filter-Based Feature Selection to see which columns had a significant impact on the result.
You can check section 6.2.4 of hands-on-lab at GitHub https://github.com/Azure-Readiness/hol-azure-machine-learning/blob/master/006-lab-model-evaluation.md which focuses on the evaluation of multiple algorithms etc.
I am looking for an appropriate formalism (i.e. a temporal logic) to model the following kind of situation:
There can be events happening at discrete points in time (subject to conditions to be detailed below).
There is state. This state cannot be expressed by a fixed number of variables. However, it is possible to express it with a linear list/array, where each entry consists of a finite number of variables.
Before any events have happened, the state is fixed.
At any point in time, events are possible. They have a fixed structure (with a few variables). The possible events are constrained by the current state.
Events will cause an immediate change of the state.
Events can also cause continuous state changes. For example, a variable (of one of the entries of the array mentioned above) changes its value from 0 to 1 over some time (either immediately or after a specified delay).
It should also be possible to specify discrete points in time in the form "the earliest point in time after event E where some condition C holds", and to start a continuous state change at such a point.
Is there an existing temporal logic to model something like this?
It should also be possible to express desired conditions, like the following:
Referring to a certain point in time: the sum of a specific variable over all the entries of the array may not exceed a certain threshold.
Referring to change over time: for all possible time intervals, the value of a certain variable (again, from each entry of said array) [realistically, rather some arithmetic expression computed for each entry] must not change faster than a given threshold.
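As a rough illustration only (the symbols below are mine, and this is generic temporal-logic notation rather than the input language of any particular tool), the two conditions might be written as

    \square \Big( \sum_{i=1}^{n} x_i(t) \le T \Big)
    \qquad\text{and}\qquad
    \forall t_1 < t_2 : \; \big| f(x_i(t_2)) - f(x_i(t_1)) \big| \le R \cdot (t_2 - t_1) \quad \text{for all } i,

where x_i is the relevant variable of the i-th array entry, f is the per-entry arithmetic expression, \square is the "always/globally" operator, and T, R are the thresholds.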
There should exist a model checker that can check whether, for all possible scenarios, all the conditions are met. If this is not the case, it should print one possible scenario and tell me which condition is not met. In other words, it should distinguish between conditions describing the possible scenarios and conditions that have to be fulfilled in those scenarios, and not just tell me "not possible".
You need a model checker with a more flexible modelling language. Technically speaking, model checking of systems with an infinite state space is an open research problem and, in the general case, algorithmically undecidable. A temporal logic is more typically used for the properties in question.
Considering the limited info you shared about your project, why don't you try Spin/Promela? It is loosely inspired by C and has 'buffers' which can be considered arrays. At the very least you might be able to simulate your system.